Note: This report was originally published in D-Lib Magazine, January 1997, by Stuart Weibel and Eric Miller (OCLC Office of Research).
Original publication: https://www.dlib.org/dlib/january97/oclc/01weibel.html

Image Description on the Internet: A Summary of the CNI/OCLC Image Metadata Workshop

Stuart Weibel and Eric Miller, Office of Research, OCLC Online Computer Library Center, Inc.

Introduction

Seventy practitioners in the area of networked image description gathered on September 24–25, 1996, in Dublin, Ohio, for the third in a series of metadata workshops sponsored by the Coalition for Networked Information (CNI) and OCLC Online Computer Library Center.

The first two workshops had focused on the semantic content and deployment strategies for a simple resource description record — the Dublin Core — primarily in the context of textual resources. This workshop addressed the application of the Dublin Core to image resource description.

The expectation that a set of image-specific elements (an "Image-Core") would emerge gave way to the recognition that the Dublin Core, within the context of the Warwick Framework, affords a foundation for a simple resource description model to support network-based discovery of images.

Is an Image a Document-Like Object?

The workshop reached consensus that images function similarly to text for discovery purposes. The key finding was that "document-like" status depends on whether resources are bounded or fixed — appearing the same to all users — rather than on content type (textual versus graphical).

Resources classified as document-like include images, movies, and musical performances. Non-document-like objects include databases, virtual experiences, business graphics, and interactive applications that generate different content for different users.

A Model for Metadata

The workshop developed a conceptual framework describing the research process through five interactive stages:

Discovery — locating relevant resources
Retrieval — obtaining access to the resource
Collation — gathering and organizing related resources
Analysis — examining and interpreting resources
Re-presentation — creating new works from existing resources

This model acknowledged that metadata requirements vary by stage and that metadata is created by multiple agents at different times throughout an object's lifecycle.

How Are Images Different?

Several challenges specific to image metadata were identified:

Indexing. Text can be indexed automatically to support full-text search; image descriptors are mostly extrinsic and must be supplied manually.

Encoding schemes. Encoding is critical for images, which come in many formats: TIFF, GIF, JFIF, PICT, PCD, Photoshop, EPS, CGM, TGA, and others.

Rendering requirements. Information needed includes type (bit-mapped, vector, video), compression schemes (JPEG, LZW, QuickTime), dimensions, dynamic range, and color lookup tables.

Capture metadata. Light source, resolution, scanner type, scan date, audit trails, and digital signatures are all relevant to evaluating image quality.

Versioning. Source images, different views, different scans, various resolutions, and details all require distinct metadata treatment.
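The rendering, capture, and versioning items above can be pictured as a small record kept alongside the basic description of each image version. The following is a minimal Python sketch, not a structure defined by the workshop: the class names, field names, units, and sample identifiers are all hypothetical, and audit trails and digital signatures are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RenderingInfo:
    """What a viewer needs to know to display the image (hypothetical fields)."""
    image_type: str                        # "bit-mapped", "vector", or "video"
    compression: Optional[str] = None      # e.g. "JPEG" or "LZW"
    width: Optional[int] = None            # pixels
    height: Optional[int] = None           # pixels
    bits_per_sample: Optional[int] = None  # assumed stand-in for dynamic range
    has_color_lookup_table: bool = False


@dataclass
class CaptureInfo:
    """How the digital image was produced (hypothetical fields)."""
    light_source: Optional[str] = None
    resolution_dpi: Optional[int] = None
    scanner_type: Optional[str] = None
    scan_date: Optional[str] = None


@dataclass
class ImageVersion:
    """One version of an image; `source` names the version it was derived from."""
    identifier: str
    rendering: RenderingInfo
    capture: CaptureInfo = field(default_factory=CaptureInfo)
    source: Optional[str] = None


def derived_from(versions: List[ImageVersion], identifier: str) -> List[str]:
    """Return identifiers of versions directly derived from `identifier`."""
    return [v.identifier for v in versions if v.source == identifier]


# Illustrative data only: a master scan and a thumbnail derived from it.
master = ImageVersion(
    identifier="scan-001-master",
    rendering=RenderingInfo("bit-mapped", width=4000, height=3000, bits_per_sample=8),
    capture=CaptureInfo(scanner_type="flatbed", resolution_dpi=600, scan_date="1996-09-24"),
)
thumbnail = ImageVersion(
    identifier="scan-001-thumb",
    rendering=RenderingInfo("bit-mapped", compression="JPEG", width=200, height=150),
    source="scan-001-master",
)

print(derived_from([master, thumbnail], "scan-001-master"))
```

Keeping each version as its own record, linked by a source pointer, is one way to give different scans and resolutions the distinct metadata treatment the list calls for.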

Modifications to the Dublin Core

Two significant changes were made to the element set:

Subject and Description separated. These were previously combined in a single element. Subject now encompasses keywords and controlled vocabulary terms, while Description contains descriptive prose or content descriptions (including visual content descriptions for images).

Rights Management field added. This new element addresses intellectual property concerns and has three proposed values: null (restrictions status unknown), "No Restrictions on Reuse", or a URI or pointer to restrictions information.
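To make the three proposed Rights Management values concrete, here is a minimal Python sketch of how a client might interpret the field. The function name and the returned labels are hypothetical, and treating an empty string as null is an assumption made for illustration.

```python
from typing import Optional

NO_RESTRICTIONS = "No Restrictions on Reuse"


def interpret_rights(value: Optional[str]) -> str:
    """Map a Rights Management value onto one of the three proposed cases."""
    # Case 1: null value, so the restrictions status is unknown.
    # (Assumption: a blank string is treated the same as a missing value.)
    if value is None or value.strip() == "":
        return "unknown"
    # Case 2: the literal statement that reuse is unrestricted.
    if value.strip() == NO_RESTRICTIONS:
        return "unrestricted"
    # Case 3: anything else is taken to be a URI or other pointer
    # to a fuller statement of the restrictions.
    return "see-pointer"


print(interpret_rights(None))
print(interpret_rights("No Restrictions on Reuse"))
print(interpret_rights("http://example.org/rights/statement.html"))
```

Note that a null value does not mean the image is free to reuse; it only means the record says nothing about restrictions.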

The Dublin Core Elements (15 Total)

Separating Subject from Description and adding Rights Management expanded the original 13 elements to the 15-element Dublin Core:

Title — Name given by creator or publisher
Author or Creator — Person(s) or organization(s) responsible for intellectual content
Subject and Keywords — Topic, keywords, phrases, or classification descriptors
Description — Textual content description, abstracts, or visual content descriptions
Publisher — Entity responsible for resource availability
Other Contributors — Secondary intellectual contributors
Date — Date resource became available in present form
Resource Type — Category (home page, novel, poem, working paper, technical report, essay, dictionary)
Format — Data representation (text/html, ASCII, PostScript, executable, JPEG image, etc.)
Resource Identifier — Unique identifier (URLs, URNs)
Source — Originating work if applicable
Language — Intellectual content language(s)
Relation — Relationship to other resources
Coverage — Spatial locations and temporal durations
Rights Management — Link to copyright notice or rights statement
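As an illustration of how the 15 elements might describe an image, the sketch below builds a simple record in Python. It is only one possible representation: the `make_record` helper, the choice to store every element as a list of strings (assuming elements may be omitted or repeated), and all sample values are hypothetical.

```python
# The 15 element names as listed above.
DUBLIN_CORE_ELEMENTS = (
    "Title",
    "Author or Creator",
    "Subject and Keywords",
    "Description",
    "Publisher",
    "Other Contributors",
    "Date",
    "Resource Type",
    "Format",
    "Resource Identifier",
    "Source",
    "Language",
    "Relation",
    "Coverage",
    "Rights Management",
)


def make_record(fields):
    """Build a record with every element present, rejecting unknown names.

    `fields` maps element names to lists of string values; elements that
    are not supplied are stored as empty lists.
    """
    unknown = set(fields) - set(DUBLIN_CORE_ELEMENTS)
    if unknown:
        raise ValueError(f"Not Dublin Core elements: {sorted(unknown)}")
    return {name: list(fields.get(name, [])) for name in DUBLIN_CORE_ELEMENTS}


# Illustrative values only; this does not describe a real resource.
record = make_record({
    "Title": ["Harbor at dusk"],
    "Author or Creator": ["Example Photographer"],
    "Subject and Keywords": ["harbors", "boats", "twilight"],
    "Description": ["Three fishing boats moored at a wooden pier under an orange sky."],
    "Format": ["JPEG image"],
    "Resource Identifier": ["http://example.org/images/harbor.jpg"],
    "Rights Management": ["No Restrictions on Reuse"],
})

for name in DUBLIN_CORE_ELEMENTS:
    if record[name]:
        print(f"{name}: {'; '.join(record[name])}")
```

The example keeps keywords in Subject and Keywords and the prose account of what the picture shows in Description, reflecting the separation of those two elements.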

Open Issues

Several issues remained unresolved:

Surrogates vs. objects: Distinguishing descriptions of original objects from descriptions of their digital versions
Collection vs. item description: Determining appropriate aggregation representation
Source recursion: Managing complex object-surrogate-derivative relationships
Mapping to other standards: Coordinating Dublin Core with MARC, FRBR, and image-specific standards
Viewing requirements: Standardizing how the Format element indicates bandwidth and usability requirements

Future Directions

The Dublin Core was framed as a high-level reference model requiring integration of detail, elaboration, extension, and community expansion. A fourth Dublin Core workshop was planned for March 1997 in Canberra, Australia.
