Foundations of Digital Libraries

The document discusses the concept of digital libraries, their definitions, types, and the challenges faced in knowledge dissemination and communication. It outlines the various types of digital libraries, including standalone, federated, and harvested libraries, as well as the processes involved in acquiring and managing digital collections. Additionally, it highlights the importance of staff, materials, and digitization methods in the creation and maintenance of digital libraries.

Digital Libraries:

What are the foundations?


Vannevar Bush
Some day there will
be an easy way to
store, disseminate,
and preserve all of
“man’s” knowledge,
without leaving your
desk.
Knowledge Building Process Problems
• Too much information
• Scientists can’t communicate across disciplines
• Overspecialization: no one can be a generalist anymore
• Research is slowing because of the amount of time
required to know the literature
• Knowledge dissemination is too slow
• Too many repetitive activities: reading, analysis,
etc.
Solutions
• Mini-camera—fit on head, size of walnut
– Google maps (what are some others?)
• “Dry photography” ?
• Compression is a key term
• Computers that are “Mind like”
• How can we do “math” with letters?
– OCR, Natural Language Processing, Text Mining,
Speech recognition (à la Apple’s Siri)
Another DL Definition
• A digital library is a networked collection of
digital objects – text, still images, moving
images, sound, data – with arrangement,
search features, and metadata that allow for
discovery and presentation, supporting
research and teaching, and with attention paid
to architecture, persistence, longevity, and
digital preservation. (Jenn Riley, IU)
Another DL Definition
• A digital library is a special library with a focused
collection of digital objects that can include text, visual
material, audio material, video material, stored as
electronic media formats (as opposed to print,
microform, or other media), along with means for
organizing, storing, and retrieving the files and media
contained in the library collection. Digital libraries can
vary immensely in size and scope, and can be
maintained by individuals, organizations, or affiliated
with established physical library buildings or institutions,
or with academic institutions. (Wikipedia)
Another DL Definition
• Digital libraries are organizations that provide
the resources, including the specialized staff,
to select, structure, offer intellectual access to,
interpret, distribute, preserve the integrity of,
and ensure the persistence over time of
collections of digital works so that they are
readily and economically available for use by a
defined community or set of communities.
(Don Waters, DLF)
Collections of Digital Works
• "Collections of digital works...." Distinctions
among libraries commonly focus on the
subject matter that defines the collections
(e.g., medical, art, science, music, and such),
or on the communities interested in the
collected materials (e.g., research, college,
public).
Aspects of a Collection
• Is built following a collection policy
• Is described so a user can discover its characteristics
• Contains actively managed resources
• Respects intellectual property rights
• Is interoperable
• Integrates into the user’s workflow
• Is sustainable
Types of Digital Libraries
• Stand-alone Digital Library (SDL)
– also self-contained, several collections
• Federated Digital Library (FDL)
– also confederated, networked
• Harvested Digital Library (HDL)
– also distributed
Standalone DL
• A “typical” Digital Library
• Usually installed on a web server
• Self-contained material:
– born digital
– scanned or digitized
– purchased or licensed
• Single or Several digital collections
Federated DL
• Contains many separate digital libraries
• Usually heterogeneous repositories
• Uses a search layer (“federated search”)
• Connected via a network
• Forms a virtual library
• Unified/Transparent user interface
• The major problem is interoperability (does the
metadata crosswalk properly? Does it render well?)
• Example: Brown University Digital Repository
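The federated search layer and the crosswalk problem can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two repositories, their field names, the sample records, and the shared schema are hypothetical, and a real federated DL would query remote systems over a network rather than in-memory lists.

```python
# Hypothetical repositories with heterogeneous metadata schemas (made-up data).
REPO_A = [{"dc_title": "Campus Maps, 1920", "dc_creator": "Facilities Office"}]
REPO_B = [{"name": "Map of the Harbor", "author": "J. Smith"}]

# Crosswalks map each repository's field names onto one shared schema.
CROSSWALKS = {
    "repo_a": {"dc_title": "title", "dc_creator": "creator"},
    "repo_b": {"name": "title", "author": "creator"},
}


def crosswalk(record, mapping):
    """Rename a record's fields to the shared schema, dropping unmapped fields."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def federated_search(query, repositories):
    """Search every repository and return the hits in the shared schema."""
    hits = []
    for repo_name, records in repositories.items():
        for record in records:
            unified = crosswalk(record, CROSSWALKS[repo_name])
            if query.lower() in unified.get("title", "").lower():
                hits.append({"source": repo_name, **unified})
    return hits


results = federated_search("map", {"repo_a": REPO_A, "repo_b": REPO_B})
for hit in results:
    print(hit)
```

The user sees one result list with uniform fields, even though each repository describes its items differently. A field missing from a crosswalk table is silently dropped here, which is one way interoperability problems show up in practice.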
Harvested DL
• Harvests digital objects, not full DLs.
• Metadata about the objects is harvested (using the Open Archives
Initiative Protocol for Metadata Harvesting, OAI-PMH).
• Does not have to contain objects, just
metadata/summaries.
• But has regular DL characteristics
• They contain summaries about the objects, and
typically direct you to the home DL if you want to
see or hear the digital object
• Example: Digital Public Library of America
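As a rough sketch of what harvesting metadata looks like, the example below parses a small OAI-PMH-style ListRecords response and builds a catalog of summaries that point back to the home DL. The sample XML, the example.org identifiers, and the harvest function are made up for illustration; a real harvester would fetch such responses from a repository over HTTP and handle paging and errors.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH responses carrying simple Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up response for illustration; example.org is a placeholder repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Harbor photograph</dc:title>
          <dc:identifier>https://example.org/items/1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def harvest(xml_text):
    """Return one metadata summary per record; the objects stay in the home DL."""
    root = ET.fromstring(xml_text)
    catalog = []
    for record in root.iterfind(".//oai:record", NS):
        catalog.append({
            "oai_id": record.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "title": record.findtext(".//dc:title", default="", namespaces=NS),
            "home_url": record.findtext(".//dc:identifier", default="", namespaces=NS),
        })
    return catalog


catalog = harvest(SAMPLE_RESPONSE)
print(catalog)
```

The harvested catalog holds only the title and a link, so a user who wants the object itself follows home_url back to the originating library.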
Breakdown
                     Single Digital Library     Harvested Digital Library
                     & Federated
Items Origin         Purchased/Digitized        Gathered
Items Location       Local/Networked            Scattered
Material             Items + Catalog            Catalog
Repository Size      Large                      Small
Update               Medium                     Fast/Dynamic
Composition Method   Interoperability           Inherent
Examples             College or University DL   Digital Public Library of America (DPLA)
Acquiring: Digital Collections
• The digital acquisition continuum, from LESS to MORE
amount of responsibility:
linking → mirroring → hosting → archiving
• New procedures and workflows are required
– tape loading, scanning, format conversion, etc.
Staffing
• Every DL will require staff
– Some designated titles, some part time or cross
trained
– Grant funded
– Cataloger, metadata specialist
– Digital curator
– Systems Librarian
• Maybe one designated unit or a blend of staff
across departments.
Types of Materials
• Library/archives flavored items
– Audio, video, books, images, documents
• Scientific materials
– Datasets, raw image data, GIS, Architecture etc.
– raw materials that would have never gone on a
book shelf
– Gene banks, phonology (speech)
Born Digital
• Born-digital resources are items created and
managed in digital form.
• Types: Images, Audio, Documents, Video,
data-centric materials, websites
• “Electronic records”: data or information that
has been captured and fixed for storage and
manipulation in an automated system and
that requires the use of the system to render
it intelligible by a person.
Digitization
• Digitization is the process of converting
information into a digital format. In this format,
information is organized into discrete units of data
(called bits) that can be separately addressed
(usually in multiple-bit groups called bytes).
• Digitize: The process of transforming analog
material into binary electronic (digital) form,
especially for storage and use in a computer. (SAA)
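The two definitions above can be made concrete with a small sketch that turns an "analog" signal into discrete bytes. The sine wave, the sample count, and the 8-bit quantization are assumptions chosen for illustration, not a description of any particular digitization workflow.

```python
import math

LEVELS = 256  # 8 bits per sample gives 2**8 discrete values


def digitize(signal, sample_count):
    """Sample a continuous function on [0, 1) and quantize each sample to one byte."""
    samples = []
    for i in range(sample_count):
        value = signal(i / sample_count)   # analog value, assumed to lie in [-1.0, 1.0]
        scaled = (value + 1.0) / 2.0       # map to [0.0, 1.0]
        level = int(scaled * LEVELS)       # pick a discrete level
        samples.append(max(0, min(LEVELS - 1, level)))
    return bytes(samples)


def sine_wave(t):
    """Stand-in for an analog source such as an audio signal."""
    return math.sin(2 * math.pi * t)


data = digitize(sine_wave, 8)
print(list(data))
print([format(byte, "08b") for byte in data])  # the same samples as bit patterns
```

Each sample is a separately addressable byte, and taking more samples or using more bits per sample captures the original more faithfully at the cost of more storage.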
Acquiring: Image File Formats
• Archival version: high-resolution TIFF
• Online versions:
– Preview: low-resolution GIF
– Full: medium-resolution JPEG
– High: med./high-resolution JPEG or TIFF
• Up-and-coming: MrSID, Flashpix, PNG
• Formats
– Lossy and lossless
– .jpg, .tiff, .raw
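The difference between lossless and lossy storage can be shown with a toy sketch. The pixel values are made up, zlib stands in for a lossless codec, and dropping the low-order bits of each pixel is only a crude stand-in for the kind of information loss a format such as JPEG introduces; it is not how JPEG actually works.

```python
import zlib

# Made-up 8-bit grayscale pixel values, repeated to give the compressor a pattern.
pixels = bytes([10, 10, 10, 11, 12, 200, 201, 203] * 32)

# Lossless: the compressed data can be expanded back to the exact original.
packed = zlib.compress(pixels)
restored = zlib.decompress(packed)

# Lossy (toy example): discard the 4 low-order bits of every pixel.
quantized = bytes(p & 0xF0 for p in pixels)

print("lossless round trip identical:", restored == pixels)
print("lossy copy identical:", quantized == pixels)
print("original bytes:", len(pixels), "compressed bytes:", len(packed))
```

This is why an archival master is usually kept uncompressed or in a lossless format, while smaller lossy derivatives are served online: detail discarded by a lossy step cannot be recovered later.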
Scanners
• Flatbed Scanner
– Cheap and relatively easy to operate.
– High resolution, but slow
• Overhead Camera
– Fragile items
– More difficult
– Lighting
Digital Images
• Digital Images are electronic snapshots taken of a scene or
scanned from documents, such as photographs, manuscripts,
printed texts, and artwork.
• The digital image is sampled and mapped as a grid of dots or
picture elements (pixels). Each pixel is assigned a tonal value
(black, white, shades of gray or color), which is represented in
binary code (zeros and ones).
• The binary digits ("bits") for each pixel are stored in a
sequence by a computer and often reduced to a mathematical
representation (compressed). The bits are then interpreted
and read by the computer to produce an analog version for
display or printing.
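A minimal sketch of the grid-of-pixels idea: a tiny bitonal image is stored as bits and then read back. The 8x4 image and the convention of 0 for black and 1 for white are assumptions made for this illustration, as are the helper names.

```python
# A made-up 8x4 bitonal image; assumed convention: 0 = black, 1 = white.
GRID = [
    [1, 1, 0, 0, 0, 0, 1, 1],
    [1, 0, 1, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1, 1, 0, 1],
    [1, 1, 0, 0, 0, 0, 1, 1],
]


def pack_rows(grid):
    """Store each 8-pixel row as one byte, most significant bit first."""
    packed = bytearray()
    for row in grid:
        byte = 0
        for pixel in row:
            byte = (byte << 1) | pixel
        packed.append(byte)
    return bytes(packed)


def unpack_rows(data, width=8):
    """Interpret the stored bits again as rows of pixel values."""
    return [[(byte >> (width - 1 - i)) & 1 for i in range(width)] for byte in data]


stored = pack_rows(GRID)
print([format(byte, "08b") for byte in stored])
for row in unpack_rows(stored):
    print("".join("#" if pixel == 0 else "." for pixel in row))  # crude display
```

The 32 pixels occupy only 4 bytes, and reading the bits back in the same order reproduces the grid for display.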
Digital Image
• Pixel: the smallest controllable element on a screen
• Image Resolution: the number of pixels per inch (ppi)
• Bit Depth: determined by the number of bits used to define each
pixel. The greater the bit depth, the greater the number of tones
(grayscale or color) that can be represented.
• Digital images may be produced in black and white (bitonal, 1 bit),
grayscale (8 bit), or color (24 bit).
• Pixel Values: As shown in this bitonal image, each pixel is
assigned a tonal value, in this example 0 for black and 1 for white.
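The relationship between bit depth, tone count, and file size can be worked through in a short sketch. It assumes resolution is expressed in pixels per inch (ppi) along each dimension and that the image is stored uncompressed; the 8 x 10 inch page at 300 ppi is a made-up example, and the function names are hypothetical.

```python
def tone_count(bit_depth):
    """Number of distinct tones a pixel can take at a given bit depth."""
    return 2 ** bit_depth


def uncompressed_bytes(width_in, height_in, ppi, bit_depth):
    """Approximate size of an uncompressed scan, ignoring file headers."""
    pixels = round(width_in * ppi) * round(height_in * ppi)
    return pixels * bit_depth // 8


for label, depth in [("bitonal", 1), ("grayscale", 8), ("color", 24)]:
    size = uncompressed_bytes(8, 10, 300, depth)
    print(f"{label}: {tone_count(depth)} tones, {size:,} bytes")
```

A 1-bit pixel can hold 2 tones, an 8-bit pixel 256, and a 24-bit pixel about 16.7 million, so the same page grows from roughly 0.9 MB to 21.6 MB as the bit depth rises from 1 to 24.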
