Image Description on the Internet
Workshop Report
Note: This report was originally published in D-Lib Magazine, January 1997, by Stuart Weibel and Eric Miller (OCLC Office of Research).
Original publication: https://www.dlib.org/dlib/january97/oclc/01weibel.html
Image Description on the Internet: A Summary of the CNI/OCLC Image Metadata Workshop
Stuart Weibel and Eric Miller, Office of Research, OCLC Online Computer Library Center, Inc.
Introduction
Seventy practitioners in the area of networked image description gathered on September 24–25, 1996, in Dublin, Ohio, for the third in a series of metadata workshops sponsored by the Coalition for Networked Information (CNI) and OCLC Online Computer Library Center.
The first two workshops had focused on the semantic content and deployment strategies for a simple resource description record — the Dublin Core — primarily in the context of textual resources. This workshop addressed the application of the Dublin Core to image resource description.
The expectation that a set of image-specific elements (an "Image-Core") would emerge gave way to the recognition that the Dublin Core, within the context of the Warwick Framework, affords a foundation for a simple resource description model to support network-based discovery of images.
Is an Image a Document-Like Object?
The workshop reached consensus that images function similarly to text for discovery purposes. The key finding was that "document-like" status depends on whether resources are bounded or fixed — appearing the same to all users — rather than on content type (textual versus graphical).
Resources classified as document-like include images, movies, and musical performances. Non-document-like objects include databases, virtual experiences, business graphics, and interactive applications that generate different content for different users.
A Model for Metadata
The workshop developed a conceptual framework describing the research process through five interactive stages:
- Discovery — locating relevant resources
- Retrieval — obtaining access to the resource
- Collation — gathering and organizing related resources
- Analysis — examining and interpreting resources
- Re-presentation — creating new works from existing resources
This model acknowledged that metadata requirements vary by stage and that metadata is created by multiple agents at different times throughout an object's lifecycle.
How Are Images Different?
Several challenges specific to image metadata were identified:
Indexing. Text can be indexed automatically from its own words (full-text indexing); an image carries no such intrinsic index terms, so its descriptors are mostly extrinsic and must be supplied manually.
Encoding schemes. Format information is critical for images, which come in many encodings: TIFF, GIF, JFIF, PICT, PCD, Photoshop, EPS, CGM, TGA, and others.
Rendering requirements. Information needed includes type (bit-mapped, vector, video), compression schemes (JPEG, LZW, QuickTime), dimensions, dynamic range, and color lookup tables.
Capture metadata. Light source, resolution, scanner type, scan date, audit trails, and digital signatures are all relevant to evaluating image quality.
Versioning. Source images, different views, different scans, various resolutions, and details all require distinct metadata treatment.
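The rendering and capture items above can be pictured as two small records. This is only a sketch: the class names, field names, and units are hypothetical and were not defined by the workshop. The size calculation shows one reason dimensions and dynamic range matter to a client before retrieval; it ignores row padding and compression.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RenderingInfo:
    """Information a client needs before it can display an image."""
    image_type: str                      # e.g. "bit-mapped", "vector", "video"
    width_px: int
    height_px: int
    bits_per_pixel: int                  # dynamic range, e.g. 8 or 24
    compression: Optional[str] = None    # e.g. "JPEG", "LZW"
    has_color_lookup_table: bool = False

    def uncompressed_bytes(self) -> int:
        """Size of the raw pixel data, rounded up to whole bytes."""
        total_bits = self.width_px * self.height_px * self.bits_per_pixel
        return (total_bits + 7) // 8


@dataclass
class CaptureInfo:
    """How a digital image was produced from its source."""
    scanner_type: Optional[str] = None
    resolution_dpi: Optional[int] = None
    light_source: Optional[str] = None
    scan_date: Optional[str] = None      # kept as text; no date format is assumed
```

For example, `RenderingInfo("bit-mapped", 640, 480, 24).uncompressed_bytes()` gives 921600, a figure a user on a slow connection might want to know before requesting the image.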
Modifications to the Dublin Core
Two significant changes were made to the element set:
Subject and Description separated. These were previously combined in a single element. Subject now encompasses keywords and controlled vocabulary terms, while Description contains descriptive prose or content descriptions (including visual content descriptions for images).
Rights Management field added. A new element addressing intellectual property concerns, with three proposed values: null (restrictions status unknown), "No Restrictions on Reuse", or a URI or pointer to restrictions information.
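The three proposed Rights Management values can be told apart mechanically. The following is a minimal sketch, not something specified in the report: the function name `classify_rights` and the labels it returns are hypothetical, and the code assumes that any value other than an empty one or the literal phrase is a pointer to rights information.

```python
from typing import Optional

# Literal phrase proposed at the workshop for unrestricted resources.
NO_RESTRICTIONS = "No Restrictions on Reuse"


def classify_rights(value: Optional[str]) -> str:
    """Map a Rights Management value to one of the three proposed cases.

    The returned labels ("unknown", "unrestricted", "pointer") are
    illustrative names, not terms defined by the Dublin Core.
    """
    if value is None or not value.strip():
        # Null value: the restriction status is unknown, not "free to use".
        return "unknown"
    if value.strip() == NO_RESTRICTIONS:
        return "unrestricted"
    # Anything else is assumed to be a URI or other pointer.
    return "pointer"
```

The point of the sketch is that a missing value must not be read as permission: a null element says only that nothing is known about restrictions.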
The Dublin Core Elements (15 Total)
The workshop's changes brought the original 13 elements to 15: Description was split out from Subject, and Rights Management was added. The resulting element set:
- Title — Name given by creator or publisher
- Author or Creator — Person(s) or organization(s) responsible for intellectual content
- Subject and Keywords — Topic, keywords, phrases, or classification descriptors
- Description — Textual content description, abstracts, or visual content descriptions
- Publisher — Entity responsible for resource availability
- Other Contributors — Secondary intellectual contributors
- Date — Date resource became available in present form
- Resource Type — Category (home page, novel, poem, working paper, technical report, essay, dictionary)
- Format — Data representation (text/html, ASCII, PostScript, executable, JPEG image, etc.)
- Resource Identifier — Unique identifier (URLs, URNs)
- Source — Originating work if applicable
- Language — Intellectual content language(s)
- Relation — Relationship to other resources
- Coverage — Spatial locations and temporal durations
- Rights Management — Link to copyright notice or rights statement
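One way to picture a record built from the elements listed above is as a mapping rendered into HTML META tags. This is a sketch under several assumptions: the short lowercase element labels, the `DC.` name prefix, and the function `to_meta_tags` are illustrative choices rather than anything fixed by the report, and the sketch treats every element as optional and repeatable.

```python
from html import escape

# The 15 elements in the order listed above. The short lowercase
# labels are assumptions made for this sketch.
ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributors", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_meta_tags(record: dict) -> list:
    """Render a record as HTML META tags, one tag per value.

    Each value may be a single string or a list of strings.
    Keys outside ELEMENTS raise ValueError.
    """
    unknown = set(record) - set(ELEMENTS)
    if unknown:
        raise ValueError(f"not in the element list: {sorted(unknown)}")
    tags = []
    for name in ELEMENTS:  # emit in list order, skipping absent elements
        values = record.get(name, [])
        if isinstance(values, str):
            values = [values]
        for value in values:
            content = escape(value, quote=True)
            tags.append(f'<meta name="DC.{name}" content="{content}">')
    return tags
```

A hypothetical image record such as `{"title": "View of the Harbour", "subject": ["harbours", "ships"], "format": "image/jpeg"}` yields four tags: one for the title, two for the subject terms, and one for the format.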
Open Issues
Several issues remained unresolved:
- Surrogates vs. objects: Distinguishing a description of an original object from a description of its digital version
- Collection vs. item description: Deciding when to describe a collection as a whole and when to describe its individual items
- Source recursion: Managing complex object-surrogate-derivative relationships
- Mapping to other standards: Coordinating Dublin Core with MARC, FRBR, and image-specific standards
- Viewing requirements: Standardizing how the Format element indicates bandwidth and usability requirements
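The source-recursion issue in the list above can be made concrete with a small sketch. It assumes, purely for illustration, that each record names at most one Source by identifier; the function `source_chain` and the record layout are hypothetical, and the workshop did not settle on any such model.

```python
def source_chain(identifier: str, records: dict) -> list:
    """Follow Source links from a resource back toward its origin.

    `records` maps an identifier to a dict that may contain a "source"
    key naming the identifier it was derived from. Returns the
    identifiers visited, starting with `identifier`. The walk stops at
    a resource with no record or no source, or at a repeated identifier.
    """
    chain = []
    seen = set()
    current = identifier
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = records.get(current, {}).get("source")
    return chain
```

With hypothetical records in which `thumb.gif` is derived from `scan.tif`, `scan.tif` from `slide-35mm`, and `slide-35mm` from `painting`, the chain for `thumb.gif` runs through all four. Even this toy case shows the difficulty: each link is a different kind of relationship (derivative, surrogate, original), and a single Source element does not say which.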
Future Directions
The Dublin Core was framed as a high-level reference model requiring integration of detail, elaboration, extension, and community expansion. A fourth Dublin Core workshop was planned for March 1997 in Canberra, Australia.
Workshop Details
- Dates: September 24–25, 1996
- Location: Dublin, Ohio, USA
- Hosts: Coalition for Networked Information (CNI); OCLC Online Computer Library Center
- Attendees: 70 from multiple countries
- Conveners: Stuart Weibel and Eric Miller, OCLC Office of Research