Note: This report was originally published in D-Lib Magazine, July 1995, by Stuart Weibel (OCLC Office of Research).
Original publication: https://www.dlib.org/dlib/July95/07weibel.html

Metadata: The Foundations of Resource Description

Stuart Weibel, Office of Research, OCLC Online Computer Library Center, Inc.

Introduction

Valuable collections of texts, images, and sounds from many scholarly communities now exist only in electronic form. Information resources that are born digital or converted to digital form represent a growing body of knowledge that requires effective methods for description, organization, discovery, and access.

While search services such as Lycos provide useful indexing of networked resources, indexes are most useful in small collections within a given domain. A metadata record — more informative than an index entry but less complete than a formal cataloging record — can serve as a practical middle ground for resource description.

The Workshop

The OCLC/NCSA Metadata Workshop convened March 1–3, 1995, in Dublin, Ohio. Fifty-two librarians, archivists, humanities scholars, geographers, and standards makers from the Internet, Z39.50, and SGML communities gathered to identify the scope of the problem and achieve consensus on a list of metadata elements for describing networked resources.

The workshop limited its scope to document-like objects (DLOs): resources whose intellectual content is primarily textual, such as electronic newspaper articles or dictionaries. It addressed metadata for resource discovery rather than complete resource description.

The Dublin Core Metadata Element Set

The workshop produced a set of thirteen core elements for describing networked resources:

Subject — The topic addressed by the work
Title — The name of the object
Author — The person(s) primarily responsible for the intellectual content of the object
Publisher — The agent or agency responsible for making the object available
OtherAgent — The person(s), such as editors and transcribers, who have made other significant intellectual contributions to the work
Date — The date of publication
ObjectType — The genre of the object, such as novel, poem, or dictionary
Form — The physical manifestation of the object, such as PostScript file or Windows executable
Identifier — String or number used to uniquely identify the object
Relation — Relationship to other objects
Source — Objects, either print or electronic, from which this object is derived, if applicable
Language — Language of the intellectual content
Coverage — The spatial locations and temporal durations characteristic of the object
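
As a rough illustration of how the thirteen elements might be held in software, the sketch below stores a record as a mapping from element name to a list of values. The workshop did not prescribe any syntax or implementation; the names DUBLIN_CORE_ELEMENTS and make_record, the date format, and the language code are assumptions made for this example only.

```python
# Hypothetical sketch: the workshop defined the elements, not this code.
DUBLIN_CORE_ELEMENTS = (
    "Subject", "Title", "Author", "Publisher", "OtherAgent", "Date",
    "ObjectType", "Form", "Identifier", "Relation", "Source",
    "Language", "Coverage",
)


def make_record(**fields):
    """Build a record, rejecting names outside the thirteen core elements."""
    record = {}
    for name, value in fields.items():
        if name not in DUBLIN_CORE_ELEMENTS:
            raise ValueError(f"not a core element: {name}")
        # Store every value as a list so an element can hold several values.
        if isinstance(value, (list, tuple)):
            record[name] = list(value)
        else:
            record[name] = [value]
    return record


# Sample values describe this report; the date format and the language
# code are assumptions, since the element set does not mandate either.
example = make_record(
    Title="Metadata: The Foundations of Resource Description",
    Author="Stuart Weibel",
    Date="1995-07",
    Language="en",
)
```

Elements that are not supplied are simply absent from the record, and a name outside the core set is rejected here only to keep the example small.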

Underlying Principles

Six core principles guided the development of the Dublin Core:

Intrinsicality. The elements focus on properties that can be discovered from examination of the work itself, distinguishing intrinsic data (intellectual content, physical form) from extrinsic data (cost, access considerations).

Extensibility. The element set allows inclusion of additional descriptive material for site-specific purposes and specialized fields while maintaining backward compatibility.

Syntax Independence. The element set is not bound to any particular syntax, so that it can be used across disciplines and by a variety of application programs.

Optionality. All elements are optional, recognizing that some elements lack meaning for certain resources.

Repeatability. Every element may be repeated, for example to record multiple authors.

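Modifiability. Elements can carry optional qualifiers, so that casual users can work with the plain element while specialists supply more precise information. For example, "Subject (scheme=LCSH)" indicates that the subject is expressed using Library of Congress Subject Headings.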
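
The optionality, repeatability, and modifiability principles can be sketched together in a few lines. The example below assumes labels written in the "Subject (scheme=LCSH)" style quoted above; the workshop did not fix a qualifier syntax, so the regular expression, the function names parse_label and build_record, and the placeholder values are all hypothetical.

```python
import re

# Hypothetical sketch: parses labels written like "Subject (scheme=LCSH)".
# The workshop did not define a concrete syntax for qualifiers.
_LABEL = re.compile(r"^\s*(\w+)\s*(?:\(\s*(\w+)\s*=\s*([^)]*?)\s*\))?\s*$")


def parse_label(label):
    """Split a label into (element, qualifiers); qualifiers may be empty."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized label: {label!r}")
    element, key, value = match.groups()
    qualifiers = {key: value} if key else {}
    return element, qualifiers


def build_record(pairs):
    """Collect (label, value) pairs; elements may repeat or be absent."""
    record = {}
    for label, value in pairs:
        element, qualifiers = parse_label(label)
        record.setdefault(element, []).append((value, qualifiers))
    return record


# Placeholder values, chosen only to exercise the three principles.
record = build_record([
    ("Author", "First Author"),              # repeated element
    ("Author", "Second Author"),
    ("Subject (scheme=LCSH)", "Metadata"),   # qualified element
    ("Subject", "resource description"),     # unqualified element
])
```

In this sketch a consumer that does not understand a qualifier can ignore it and still use the plain value, which is one way to read the intent of the modifiability principle.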

Implementation Projects

Nine prototype projects were initiated based on the workshop results:

The OCLC Spectrum Project (Diane Vizine-Goetz)
The OCLC Internet Resources Cataloging Project (Erik Jul)
Library of Congress (Rebecca Guenther)
O'Reilly & Associates (Terry Allen)
Los Alamos National Laboratory and Indiana University (Ron Daniel Jr. and Pete Percival)
Bunyip Systems (Chris Weider)
Georgia Institute of Technology (Michael Mealling)
SoftQuad (Yuri Rubinsky)
Concordia University (Bipin Desai)

Next Steps

Planned extensions included describing other object types (such as services and collections), supporting functions beyond resource discovery (archival control, authentication, charging mechanisms), and refining the element set through practical experience.

OCLC and NCSA established a workshop series, with a Metadata Workshop Steering Committee to define topics and design groups to prepare discussion papers. Coordination was planned with the IETF working group on Uniform Resource Identifiers, the Machine-Readable Bibliographic Information Committee (MARBI), and stakeholders from publishing, SGML, GIS, government information, and business communities.
