
Final Fundamentals of Metadata (1)

The document is a tutorial on the fundamentals of metadata, presented at a seminar aimed at training Nigerian cataloguers on transitioning from MARC 21 to a Linked Data environment. It discusses the evolution of cataloguing standards, the importance of metadata types and standards, and the methods for creating and transforming metadata. The tutorial emphasizes the need for libraries to adapt to technological advancements and user expectations in the digital age.


FUNDAMENTALS OF METADATA

A Tutorial Presented at the 40th Annual Cataloguing, Classification and Indexing
Seminar/Workshop held at the Center for Advanced Library and Information Management,
Enugu, on 27th-29th October, 2021

By

Victoria Sokari
University Library, Bayero University, Kano
[email protected]
Introduction

In the past, libraries catalogued using standards such as ISBD, AACR2 and, later, MARC 21.
Today, however, libraries and other cultural institutions are required to change their approach
drastically. Standards that have been in use for decades have come under increasing pressure
either to adapt to new circumstances or to give way entirely to different standards, because
they fall short of serving current purposes (ALA, 2012).

One unavoidable reason for this transformation, in the words of Coyle (2010), is that library
data, despite being stored and accessed via computers, is designed mostly for use and
consumption by humans. Nonetheless, people are not the only users of data produced in the name
of bibliographic control; machine applications also interact with those data in a variety of
ways (Library of Congress, 2008). Moreover, rapid technological advancement may create the need
to migrate to a new system, update to a different schema, send metadata to an aggregator (e.g.
a library), adopt a change to an input standard, or refresh the underlying philosophy entirely.
Technological advancement has also changed how users seek information, that is, their research
methods and their expectations. All of these reasons point to the need for web-sharable data.
Unfortunately, library data is not integrated with the web, because most of it is encoded in
natural language rather than as data (Baker et al., 2011).

The older standards (such as AACR2 and MARC 21) are too library-centric and are not sufficient
to meet the present and future needs of the web era. For this reason, the World Wide Web
Consortium (W3C) offers a way forward: building the Semantic Web using Linked Data principles,
to implement standards that seem to fit well with the legacy metadata produced and maintained
by libraries and other cultural institutions (ALA, 2012). Libraries are in a chaotic period at
the moment as these changes take their toll on us. However, we are not relenting; we must stay
on our toes and work rigorously to make the changes required to keep us in the business of
information service delivery.

This tutorial on the fundamentals of metadata is meant to train Nigerian cataloguers in some
theoretical and practical aspects of metadata, as a basis for the transition from MARC 21 to a
Linked Data environment. Practical lessons will also be conducted on how to catalogue, or
create metadata for, information resources using schemas such as MARC 21, Dublin Core, the
Metadata Object Description Schema (MODS) and Visual Resources Association (VRA) Core.

Cataloging and metadata in a library context

• Generally working with a curated set of resources
• Commitment to standards with an eye towards interoperability
• Diversity of users, except for some specialized libraries
• Strongly value community of practice
• Often using a unified discovery tool
• Driven by Functional Requirements for Bibliographic Records (FRBR) user tasks:
  find, identify, select, obtain

What makes cataloging and metadata similar?

• Describing resources
• Community-supported standards for structure and input

What are the differences between cataloging and metadata?

• Scale
    • Cataloging typically involves consideration/creation of a description on a per-item basis
    • Metadata typically involves consideration/creation of a description on a bulk scale
    • Metadata work also seeks to transform existing metadata
• Diversity
    • Cataloging is widely understood to entail input governed by AACR/AACR2/RDA
      into the MARC format
    • Metadata entails input governed by any number of standards into a variety of schema
    • Cataloging involves describing resources, typically on an item-by-item basis, to
      conform to a known universe of description and discovery, e.g. MARC in an ILS/OPAC
    • Metadata frequently requires creating the universe of description and discovery to
      approach description at bulk scale, e.g. any schema in a variety of management
      systems/discovery layers
    • Evaluation of schema recalls FRBR user tasks: find, identify, select, obtain
    • This approach is philosophically aligned with a linked data world, wherein the
      metadata may be repurposed and repackaged in many contexts

Types of metadata

• Administrative
• Descriptive
• Access/Use
• Preservation
• Structural
• Technical

Administrative • Information pertinent to management of the resource • Examples: Persistent
identifier; Date of metadata creation; Institution or individual responsible for metadata
creation

Descriptive • Information provided for the discoverability of the resource • Examples: Title;
Date; Subjects

Access/Use • Information pertinent to finding and accessing the resource • Examples: Conditions
governing access (e.g., a collection is restricted); Conditions governing use (e.g., copyright
restrictions may apply)

Preservation • Information about versions, migration, and interventions • Examples:
Preservation event date or time; Persons, organizations, or software responsible for
preservation events

Structural • Information about a complex digital resource to assist machine presentation •
Examples: Schemas to package and transmit a grouping of administrative, technical, and
descriptive metadata about a resource

Technical • Details of the digital object and its creation • Examples: Size; File type; Date
digitized
Standards

Key to Interoperability: There must be “consistency”. Cataloging has long involved
standardized usage of terminology to achieve human readability and both human and machine
discoverability. Users can understand meaning within the context. Both collocation and
presentation of resources are derived from consistency.

Standards for Metadata include:

• Schemas
• Input/content standards
• Controlled vocabularies

Schema: Metadata schemes (also called schema) are sets of metadata elements designed for a
specific purpose, such as describing a particular type of information resource. The definition or
meaning of the elements themselves is known as the semantics of the scheme. The values given
to metadata elements are the content. (NISO, “Understanding Metadata”, 2004)

• Schemas define a set of elements within a context
• Define rules for machine validation
    • Validation only ensures that required elements are present and that they are
      presented as outlined in the schema
    • Validation does not find poor inputs such as typos or incorrect information
• Provide structure for presentation
    • A discovery interface can be programmed to display metadata in prescribed elements
    • Can provide information for indexing purposes, such as non-filing characters
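
The two points about validation can be illustrated with a minimal sketch. It assumes a
hypothetical schema in which `title` and `date` are required and `subject` is optional; the
element names and the `validate` helper are assumptions made for illustration, not part of any
real schema or tool.

```python
REQUIRED = ("title", "date")        # hypothetical required elements
ALLOWED = REQUIRED + ("subject",)   # hypothetical full element set


def validate(record):
    """Return a list of structural problems; an empty list means the record is valid."""
    problems = [f"missing required element: {name}" for name in REQUIRED if name not in record]
    problems += [f"element not in schema: {name}" for name in record if name not in ALLOWED]
    return problems


good = {"title": "Fundamentals of Metadata", "date": "2021"}
typo = {"title": "Fundementals of Metdata", "date": "2021"}      # misspelled value
broken = {"titel": "Fundamentals of Metadata", "date": "2021"}   # misspelled element name

print(validate(good))    # no structural problems
print(validate(typo))    # also passes: validation cannot see the typo in the value
print(validate(broken))  # reports the missing and the unknown element
```

The record with the misspelled title passes because the check looks only at structure, which is
why validation does not replace review of the inputs themselves.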

Examples of Library Metadata Schema:

• Dublin Core Metadata Element Set
• Metadata Encoding and Transmission Standard (METS)
• Machine-Readable Cataloging (MARC) 21
• Preservation Metadata: Implementation Strategies (PREMIS)
• Visual Resources Association (VRA) Core
• European Broadcasting Union (EBU) Core
• Metadata Object Description Schema (MODS)
• Encoded Archival Description (EAD)
• Public Broadcasting (PB) Core
• Music Encoding Initiative (MEI) Schema
• Online Information eXchange (ONIX) metadata schema
• Text Encoding Initiative (TEI) Schema

Input/Content Standards

• Controls quality of human-readable metadata
• Provides for consistent use of terminology
• Defines minimum standard for adequate description

Examples of Library Input Standards:

• Resource Description and Access (RDA)
• International Standard Bibliographic Description (ISBD)
• Anglo-American Cataloguing Rules (AACR)
• Cataloging Cultural Objects: A Guide to Describing Cultural Works and Their Images (CCO)
• Archives, Personal Papers, and Manuscripts (APPM)
• Describing Archives: A Content Standard (DACS)

Controlled vocabularies: A controlled vocabulary is an organized arrangement of words and
phrases used to index content and/or to retrieve content through browsing or searching. It
typically includes preferred and variant terms and has a defined scope or describes a specific
domain. (Patricia Harpring, “Introduction to Controlled Vocabularies”, 2010)

• Controlled vocabularies identify preferred terms among synonyms or comparable topics
• Define terms and appropriate usage
• Not mutually exclusive
• May be created for a specific community or intended for a general audience
• Allow for indexing and display of resources that have something in common
• The most widely known example is subject headings
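
The idea of a preferred term standing in for its synonyms can be sketched in a few lines. The
mini-vocabulary below is hypothetical and is not copied from any published vocabulary; the
`preferred_term` helper is likewise an assumption for illustration.

```python
# Hypothetical mini-vocabulary: preferred term -> variant terms.
VOCABULARY = {
    "Automobiles": ["Cars", "Motor cars"],
    "Motion pictures": ["Films", "Movies"],
}

# Build a lookup from every known form (preferred or variant) to the preferred term.
LOOKUP = {}
for preferred, variants in VOCABULARY.items():
    for term in [preferred, *variants]:
        LOOKUP[term.casefold()] = preferred


def preferred_term(term):
    """Return the preferred form of a term, or None if it is not in the vocabulary."""
    return LOOKUP.get(term.strip().casefold())


print(preferred_term("cars"))
print(preferred_term("Bicycles"))
```

Indexing every resource under the preferred term is what lets a discovery tool bring together
items that cataloguers or users would otherwise describe with different words.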

Examples of Controlled Vocabularies

• Library of Congress Subject Headings (LCSH)
• Faceted Application of Subject Terminology (FAST)
• Sears List of Subject Headings (SEARS)
• Book Industry Standards and Communications (BISAC) thesauri
• Getty Thesaurus of Geographic Names (TGN)
• Thesaurus for Graphic Materials (TGM)
• Medical Subject Headings (MeSH)
• Art & Architecture Thesaurus (AAT)
• Rare Books and Manuscripts Section (RBMS) controlled vocabularies
• Library of Congress Name Authority File (LCNAF)
• Union List of Artist Names (ULAN)

Community Creation & Maintenance

Pros

• Responsibility for maintenance and documentation is spread across a larger group
• External perspectives on proposals can lead to a stronger end product that is usable
  across a larger group
• Supports interoperability across the community

Cons

• Customization may be limited or more arduous to enact

Local Creation & Maintenance

Pros

• Ability to customize as and when desired

Cons

• Full responsibility for workload and documentation
• Absence of external perspectives and feedback
• Limits interoperability

Methods of Creating Metadata

Preparation for creation

• Select schema
• Select input standard
• Select controlled vocabularies

Creation: Management Systems

• Provides a way to manage metadata in a holistic environment
• Metadata creation/maintenance is one module among a variety of other functions
• Often limited flexibility with schema; may offer enhanced functionality with the
  schema that are covered
• May offer enhanced functionality with frequently used controlled vocabularies

Types of Management Systems:

• Integrated Library System (ILS)
    • The earliest ILS was created in the 1970s, after the advent of MARC
    • Comprehensive computer system that stores information about resources, patrons,
      billing and receiving, and circulation
    • Includes interfaces for staff and patron interaction with metadata, as appropriate
    • Typically structures resource description into MARC records
    • Often features interaction with OCLC and the LCNAF
• Archival Information Management System
    • Widely adopted in research libraries in the late 2000s
    • Comprehensive computer system that stores information about acquisition,
      description and digital objects related to primary resource collections
    • Includes interfaces for staff and patron interaction with metadata, as appropriate
    • Structured as a relational database that can export metadata in EAD, MARCXML,
      Dublin Core and other schema
    • Often features interaction with OCLC and the LCNAF
• LibraryThing
    • Created in 2005 as a tool for describing and categorizing personal libraries
    • Cloud-based software that allows for bibliographic description and subject
      identification as well as community reviews and tags
    • Includes a primary web-based interface and a mobile app for data entry
    • Users can keep their library private or make metadata available for community viewing
    • Connects to libraries via the Z39.50 protocol and to six Amazon.com stores, allowing
      users to import existing metadata in MARC or Dublin Core
• iTunes
    • Proprietary software created as SoundJam MP in 1998; it was purchased by Apple and
      released as iTunes 1.0 in 2001
    • In addition to usage as a media player, iTunes also serves as a library for digital
      media, with a variety of metadata inputs
    • Metadata is referred to as “attributes”
    • Users can elect to accept metadata provided when the file is loaded into the
      library, edit metadata by hand, or activate a feature that dynamically updates
      metadata when information in the iTunes store is updated. The only attribute that
      cannot be edited within the iTunes application is a tag indicating the presence or
      absence of potentially objectionable, explicit content
    • Users can keep their library private or make it available for community viewing

Creation: Independent Tools

• Provides a way to manage metadata independent of a system
• Metadata creation and editing is the only function
• Often great flexibility with schema; some may offer validation
• Limited to no ability to preview metadata in a discovery interface
• Limited to no ability to validate inputs

Examples of Independent Tools:

• CSV file
    • Comma Separated Values: a comma is used as a delimiter to identify where a
      machine should parse values into separate fields
    • Format in use with computers since 1972
    • Relatively software agnostic: created in a plain text file, with each line
      representing a data record and each field within the record separated by a comma
    • Common data exchange format; many software applications can export/import CSV
    • Best used for very simple data that is highly consistent and does not contain
      commas within the input value
• Spreadsheet
    • Computer software initially implemented in 1962 that allows a user to create and
      store data in an interactive table form
    • Relatively software agnostic: most computers are loaded with a basic spreadsheet
      program, and there are web-based products as well
    • Data is separated into columns and rows, with an expectation of consistency in
      type of input between columns or rows. Data can be sorted in columns to bring
      together identical input
• XML editor
    • Computer software created to facilitate creation of well-formed XML
    • Extensible Markup Language (XML), created in the mid to late 1990s, is derived from
      Standard Generalized Markup Language (SGML)
    • XML editors typically require the user to read and create native XML syntax
    • When a valid schema is named at the beginning of an XML document, many XML
      editors can provide real-time feedback regarding any structural issues with the
      metadata
• Google Forms
    • Conceived as a cloud-based survey application integrated with Google
      spreadsheet, released in 2007
    • Includes options such as drop-down menus, radio buttons and checkboxes;
      does not require user knowledge of metadata schema and can be used to
      avoid typos and other common consistency errors
    • Responses populate a Google spreadsheet
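
The caution about commas within input values can be seen in a short sketch using Python's
standard `csv` module. Splitting a line on every comma breaks a value such as a personal name
in inverted order, whereas a CSV reader that honours quoting recovers it. The two-column layout
is an assumption made for illustration.

```python
import csv
import io

rows = [
    ["title", "creator"],
    ["Things Fall Apart", "Achebe, Chinua"],  # the creator value contains a comma
]

# Write the rows as CSV text; the writer quotes any value that contains a comma.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
text = buffer.getvalue()

# Naive parsing: split the data line on every comma.
naive = text.splitlines()[1].split(",")

# Quote-aware parsing with the csv module.
parsed = list(csv.reader(io.StringIO(text)))

print(len(naive), "fields from naive splitting")
print(len(parsed[1]), "fields from csv.reader")
```

Tools that export and import CSV generally rely on this kind of quoting, so data with embedded
commas is workable, but only when every tool in the workflow reads and writes quotes
consistently.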

Transformation

What is transformation?

Transformation is an intentional, automated change to existing metadata. It can be a one-time
process or a dynamic automated workflow. A successful transformation requires an understanding
of the goals of the transformation and of the constraints on resources or process, together
with analysis and planning.

Goals of transformation

Understanding the goal greatly informs decisions around a metadata transformation.

Common goals include:

• A system migration
• Update to schema in use
• Send to an aggregator
• Change to input standard
• Refresh to philosophy
• User research methods change
• User expectations change

Transformation examples

• MODS to Dublin Core
• MARC to MODS
• MARC to BIBFRAME
• LCSH to FAST
• Spreadsheet to Dublin Core

Tools to support automated transformation:

• MarcEdit
• Stylesheets (XSLT: Extensible Stylesheet Language Transformations)

Caveat emptor: metadata transformations are unlikely to be lossless. Always read the fine
print on crosswalk documentation.
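
As a minimal sketch of the “Spreadsheet to Dublin Core” case, the code below maps the columns
of one spreadsheet row (exported as CSV) to Dublin Core elements and reports the columns that
have no mapping. The column names, the `CROSSWALK` table, the `record` wrapper element and the
`row_to_dc` helper are all assumptions made for illustration; a real project would follow a
documented crosswalk rather than this invented one.

```python
import csv
import io
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)

# Hypothetical crosswalk: spreadsheet column -> Dublin Core element.
CROSSWALK = {"Title": "title", "Author": "creator", "Year": "date"}


def row_to_dc(row):
    """Return (XML string, list of dropped columns) for one spreadsheet row."""
    record = ET.Element("record")  # hypothetical wrapper element
    dropped = []
    for column, value in row.items():
        element = CROSSWALK.get(column)
        if element is None:
            dropped.append(column)  # no mapping: this value is lost
            continue
        ET.SubElement(record, f"{{{DC_NS}}}{element}").text = value
    return ET.tostring(record, encoding="unicode"), dropped


sheet = 'Title,Author,Year,Shelf\nThings Fall Apart,"Achebe, Chinua",1958,A12\n'
row = next(csv.DictReader(io.StringIO(sheet)))
xml_text, dropped = row_to_dc(row)

print(xml_text)
print("Dropped columns:", dropped)
```

Here the hypothetical `Shelf` column has no counterpart in the crosswalk, so its value does not
appear in the output. Reporting dropped columns is one simple way to make that loss visible
before the transformation is run at bulk scale.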

Practical Lessons

Practical steps for creating metadata involve:

• selecting a schema, input standard and controlled vocabularies
• determining a method of metadata creation, using a management system or independent tools
• adequately documenting context, criteria and methodology

Creating metadata using the following schemas (practical sessions to be guided by resource
persons):

MARC 21 http://www.loc.gov/marc/bibliographic/

Dublin Core http://dublincore.org/documents/dces/

MODS http://www.loc.gov/standards/mods/mods-outline-3-6.html

VRA Core http://www.loc.gov/standards/vracore/VRA_Core4_Element_Description.pdf

Conclusion

To create metadata in a linked data environment, cataloguers, especially in Nigeria, are
expected to re-tool and re-skill in the new ways of working covered in this tutorial. There is
much to learn and to unlearn. We need to brush up on our meta-competencies now more than ever,
because that is what is required to have a competitive edge in the information world.

References:

American Library Association (2012). “Transforming Library Metadata into Linked Library
Data.” http://www.ala.org/alcts/resources/org/cat/research/linked-data (accessed October
16, 2021). Document ID: fc4b530d-00bb-4aef-b6d4-21d40f4ae19c

Baker, T., E. Bermès, K. Coyle, G. Dunsire, A. Isaac, P. Murray, M. Panzer, J. Schneider, R.
Singer, E. Summers, W. Waites, J. Young and M. Zeng (2011). Library Linked Data
Incubator Group Final Report. World Wide Web Consortium.
www.w3.org/2005/Incubator/lld/XGR-lld-20111025 (accessed October 18, 2021).

Course: Fundamentals of Metadata Session 3.0 (ala.org)

Coyle, K. (2010). “RDA Vocabularies for a Twenty-First-Century Data Environment.” Library
Technology Reports 46(2): 1–39.

Library of Congress (2008). On the Record: Report of The Library of Congress Working Group
on the Future of Bibliographic Control.
www.loc.gov/bibliographic-future/news/lcwg-ontherecord-jan08-final.pdf (accessed October 16,
2021).

NISO (2004). Understanding Metadata. Bethesda, MD: NISO Press, National Information
Standards Organization. http://www.niso.org/publications/press/UnderstandingMetadata.pdf
(accessed October 16, 2021).

Exercise
1) Select what characteristic(s) cataloging and metadata have in common:
Select one:
a. Each has a commitment to standards geared toward interoperability
b. Usually resources are available to a diverse set of users, excepting special libraries
c. Each strongly values a community of practice
d. All of the above

2) True or false: cataloging and metadata are similar in that each involves describing resources
using community supported standards for structure and input.
Select one:
True
False

3) Select what makes metadata different from cataloging:
Select one:
a. Metadata creation typically involves item-by-item description according to one input standard
and one schema
b. Metadata involves consideration of description on a bulk scale
c. Metadata involves consideration of a variety of schema and input standards
d. B and C

4) Select the six types of metadata:
Select one:
a. Descriptive, technical, migrational, preservation, structural, informational
b. Administrative, descriptive, access/use, preservation, structural, technical
c. Descriptive, copyright, technical, preservation, structural, administrative
d. Administrative, descriptive, preservation, functional, technical, structural

5) What is the key to interoperability for metadata standards?
Select one:
a. Granular description
b. Local customization
c. Consistency
d. Library of Congress Subject Headings

6) Metadata schemas are made up of ________ designed for a specific purpose.
Select one:
a. Controlled vocabularies
b. Element sets
c. Input standards
d. Local workflows

7) Metadata schema:
Select one:
a. Define a set of elements within a context
b. Define rules for machine validation
c. Provide structure for presentation
d. All of the above

8) Which of the following are common library metadata schemas?
Select one:
a. MODS, VRACore, MARC21, DublinCore
b. MARC21, MODS, MusicCore, DublinCore
c. MODS, VRACore, DublinCore, CopyrightCore
d. MARC21, ImageCore, MODS, VRACore

9) True or false: input standards provide for consistent use of terminology and define the
minimum standard for adequate description.
Select one:
True
False

10) Which of the following are common library input standards?
Select one:
a. RDA
b. DACS
c. CCO
d. All of the above

11) Which of the following is NOT true of controlled vocabularies?
Select one:
a. They may be created for both general audiences and specific communities
b. Vocabularies are mutually exclusive from one another
c. The most visible practice for controlled vocabularies is subject headings
d. They link similar resources through index and display

12) Which of the following is NOT a common library controlled vocabulary?
Select one:
a. Library of Congress Subject Headings
b. Medical Subject Headings
c. Getty Thesaurus for Music Performer Names
d. Thesaurus for Graphic Materials

13) Which of the following is an advantage to local creation and maintenance of metadata
standards?
Select one:
a. Organizations bear full responsibility for workload and documentation
b. Ability exists for customization when desired
c. Absence of external perspectives and feedback
d. Limited interoperability

14) Which of the following are advantages to community creation and maintenance of metadata
standards?
Select one:
a. Supports interoperability
b. Responsibility for maintenance and documentation is spread across a larger group
c. External perspectives on proposals can lead to stronger end product that is usable across a
larger group
d. All of the above
