DCMI Usage Board - Meeting Agenda

DCMI USAGE BOARD: MISSION AND PRINCIPLES

Thomas Baker
Version: Mon May 7 18:38:03 MET DST 2001

MISSION

The mission of the DCMI Usage Board is to ensure an orderly evolution
of metadata vocabularies grounded in grammatical principle. The Usage
Board evaluates proposed vocabulary terms (or changes to existing
terms) in light of grammatical principle, semantic clarity, and overlap
with existing terms. To proposals that are accepted it assigns a
specific status, such as Recommended, Conforming, or Obsolete. The
Usage Board strives for consensus, justifying its decisions and
interpretations in terms of both principle and empirical practice.

PUBLICATION POLICY

The Usage Board publishes its decisions in the form of text documents
on the Web. In the medium term, the Usage Board intends to maintain
the canonical definitions and Usage Board annotations for these terms
in a formal registry that follows emerging guidelines of good practice
for machine-readable schemas on the Web.

PROCESS MODEL

The general model is that DCMI working groups submit well-formed
proposals for vocabulary terms to the Usage Board; the working
groups, in effect, "recommend" the terms. The Usage Board evaluates
the proposals in light of grammatical principle (dumb-down rule,
refine not extend, etc.). Proposals found to be in conformance are
certified as such and added to the registry with two annotations:
"Recommended" by the working group, and certified as "Conforming" or
"Recommended" by the Usage Board.

SCOPE

The scope of the Usage Board as of April 2001 is "the Dublin Core"
-- the fifteen core elements and their qualifiers -- plus additional
non-core elements and qualifiers deemed useful for discovering
"information resources" across domains. If a DCMI working group were
to formulate a core element set for a different class of objects, such
as "agents," the Usage Board would evaluate that proposal in the light
of the same grammatical categories and principles that govern the
Dublin Core for information resources (e.g., elements and qualifiers,
dumb-down principle).

GRAMMAR

Dublin Core may be seen as a small language for making a particular
class of statements about resources. Like natural languages, it has a
vocabulary of word-like terms, the two classes of which -- elements and
qualifiers -- function within statements like nouns and adjectives; and
it has a syntax for arranging elements and qualifiers into statements
according to a simple pattern. Dublin Core statements are of the fixed
pattern "Resource has property X," where "resource" is the implied
subject; followed by an implied verb ("has"); followed by one of
fifteen properties from the Dublin Core element set; followed by a
property value -- an appropriate literal such as a person's name, a
date, some words, or a URL. For example: "Resource has dc:creator
'Karl Marx'," and "Resource has dc:date '2000-06-13'." Optional
qualifiers may make the meaning of a property more definite, as in
"Resource has dc:date dcq:revised '2000-06-13'." This grammar is
described more fully in [5].
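
The fixed statement pattern can be sketched as a small data structure.
This is only an illustration of the grammar described above, not a
DCMI-defined model: the class name DublinCoreStatement and its render
method are hypothetical, and the prefixes dc: and dcq: are taken from
the examples in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DublinCoreStatement:
    """One statement of the form "Resource has <element> [<qualifier>] <value>".

    Hypothetical illustration only; the subject ("Resource") and the verb
    ("has") are implied by the grammar and therefore not stored.
    """
    element: str                     # one of the fifteen elements, e.g. "dc:creator"
    value: str                       # an appropriate literal
    qualifier: Optional[str] = None  # optional qualifier, e.g. "dcq:revised"

    def render(self) -> str:
        """Spell the statement out with its implied subject and verb."""
        parts = ["Resource has", self.element]
        if self.qualifier is not None:
            parts.append(self.qualifier)
        parts.append(f"'{self.value}'")
        return " ".join(parts)


creator = DublinCoreStatement("dc:creator", "Karl Marx")
revised = DublinCoreStatement("dc:date", "2000-06-13", qualifier="dcq:revised")
print(creator.render())
print(revised.render())
```

Keeping the qualifier optional mirrors the grammar: a statement is
complete without one, and a qualifier only narrows what is already said.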

VOCABULARY TERMS IN GENERAL

Strictly speaking, a Dublin Core element or qualifier is a unique
identifier formed by a name (e.g., title) prefixed by the URI of the
namespace in which it is defined, as in
http://dublincore.org/2000/03/13/dces#title. In this context, a
namespace is a vocabulary that has been formally published, usually on
the Web; it describes elements and qualifiers with natural-language
labels, definitions, and other relevant documentation.
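
As a minimal sketch of this naming rule, the two helpers below build
and take apart a term identifier. The function names are hypothetical,
and treating everything up to and including the "#" as the namespace
URI is an assumption read off the single example above; other
namespaces may delimit names differently.

```python
from typing import Tuple

# Namespace URI inferred from the example identifier in the text (assumption).
DCES_NAMESPACE = "http://dublincore.org/2000/03/13/dces#"


def term_uri(namespace: str, name: str) -> str:
    """Form a term identifier by prefixing the name with the namespace URI."""
    return namespace + name


def split_term_uri(uri: str) -> Tuple[str, str]:
    """Split an identifier into (namespace, name) at the last '#'.

    Assumes a '#'-delimited namespace, as in the example; raises
    ValueError if the identifier does not follow that shape.
    """
    namespace, separator, name = uri.rpartition("#")
    if not separator or not name:
        raise ValueError(f"not a '#'-delimited term identifier: {uri!r}")
    return namespace + separator, name


title = term_uri(DCES_NAMESPACE, "title")
print(title)
print(split_term_uri(title))
```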

ELEMENTS

An element is a property of a class of objects. In the case of the
Dublin Core, an element is a property of an "information resource"
(especially of "document-like objects"). The fifteen elements of the
Dublin Core Element Set are the defining feature of the Dublin Core.
These elements are fifteen broadly defined properties of resources
considered to be of general use for discovering information across
multiple domains. If classes themselves are defined by their
properties, then the scope of the Dublin Core Element Set is any object
that can be described with its elements.

QUALIFIERS

Qualifiers modify the properties of Dublin Core statements by
specifying, in the manner of natural-language adjectives, "what kind"
of subject, date, or relation. Qualifiers currently fall into two
classes: 

 -- Element Refinement. An element refinement is a qualifier that makes
    the meaning of an element narrower or more specific. A refined
    element shares the meaning of the unqualified element, but with a
    more restricted scope. A client that does not understand a specific
    element refinement term should be able to ignore the qualifier and
    treat the metadata value as if it were an unqualified (broader)
    element. The definitions of element refinement terms for qualifiers
    must be publicly available.

 -- Encoding Scheme. Encoding schemes are pointers to contextual
    information or parsing rules that aid in the interpretation of
    an element value. These schemes include controlled
    vocabularies and formal notations or parsing rules. A value
    expressed using an encoding scheme will thus be a token
    selected from a controlled vocabulary (e.g., a term from a
    classification system or set of subject headings) or a string
    formatted in accordance with a formal notation (e.g.,
    "2000-01-01" as the standard expression of a date). If an
    encoding scheme is not understood by a client or agent, the
    value may still be useful to a human reader. The definitive
    description of an encoding scheme for qualifiers must be
    clearly identified and available for public use.

DUMB-DOWN PRINCIPLE

The qualification of Dublin Core properties is guided by a rule known
colloquially as the Dumb-Down Principle. According to this rule, a
client should be able to ignore any qualifier and use the description
as if it were unqualified. While this may result in some loss of
specificity, the remaining element value (minus the qualifier) must
continue to be generally correct and useful for discovery.
Qualification is therefore supposed only to refine, not extend the
semantic scope of a property. In borderline cases, qualification
should not result in a literal that could be misleading.
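
What a client does when it dumbs down a description can be sketched as
follows. The type QualifiedStatement and the function dumb_down are
hypothetical names used for illustration; the sketch assumes a
statement carries at most one element refinement and one encoding
scheme.

```python
from typing import NamedTuple, Optional


class QualifiedStatement(NamedTuple):
    """A property/value pair with optional qualifiers (hypothetical model)."""
    element: str                      # e.g. "dc:date"
    value: str                        # the literal property value
    refinement: Optional[str] = None  # element refinement, e.g. "dcq:revised"
    scheme: Optional[str] = None      # encoding scheme identifier, if any


def dumb_down(statement: QualifiedStatement) -> QualifiedStatement:
    """Ignore all qualifiers and keep only the element and its value.

    The value is passed through unchanged: the principle requires that it
    remain generally correct and useful once the qualifiers are dropped.
    """
    return QualifiedStatement(element=statement.element, value=statement.value)


revised = QualifiedStatement("dc:date", "2000-06-13", refinement="dcq:revised")
plain = dumb_down(revised)
print(plain)
```

Here the dumbed-down statement still says something true, if less
specific: the resource has a date of 2000-06-13. A qualifier that
extended rather than refined the element would fail exactly this test,
because the remaining value would no longer be a date of the resource.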

APPROPRIATE LITERALS

Whether a property value is "useful for discovery" is at the heart of
the notion of appropriateness. A property value should be a string of
an expected type -- usually, for example, some sort of name for
dc:creator, dc:contributor, dc:publisher, or dc:title; a URL for
dc:relation, dc:identifier, or dc:source; full-text sentences for
dc:description; short text strings or keywords for dc:subject, dc:type,
dc:format, and dc:language; and a recognizable combination of years,
months, and days for dc:date. Both in theory and in practice, the range
of expected data types varies from property to property; which types
are appropriate for a given property is open to interpretation and
debate. One task of the Usage Board is to provide guidance in this
respect.
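
A rough sketch of how a client might screen values against these
expectations follows. Everything in it is an assumption made for
illustration: the function name looks_appropriate, the reading of "a
recognizable combination of years, months, and days" as YYYY, YYYY-MM,
or YYYY-MM-DD, and the decision to accept any non-empty string for the
remaining elements. Since the appropriate types are open to
interpretation, the result is a hint, not a ruling.

```python
import re
from urllib.parse import urlparse

# Assumed date shapes: YYYY, YYYY-MM, or YYYY-MM-DD (digits only, no range check).
DATE_PATTERN = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")

# Elements for which the text says a URL is the usual kind of value.
URL_ELEMENTS = {"dc:relation", "dc:identifier", "dc:source"}


def looks_appropriate(element: str, value: str) -> bool:
    """Return True if the value has the shape usually expected for the element."""
    value = value.strip()
    if not value:
        return False
    if element == "dc:date":
        return DATE_PATTERN.match(value) is not None
    if element in URL_ELEMENTS:
        parsed = urlparse(value)
        # Require both a scheme and a host, e.g. "http://dublincore.org/".
        return bool(parsed.scheme and parsed.netloc)
    # Names, descriptions, and keywords: any non-empty text is accepted here.
    return True


print(looks_appropriate("dc:date", "2000-06-13"))
print(looks_appropriate("dc:date", "last Tuesday"))
print(looks_appropriate("dc:identifier", "http://dublincore.org/"))
```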

CRITERIA FOR EVALUATING ELEMENT AND QUALIFIER PROPOSALS 

(Source:
http://www.ischool.washington.edu/sasutton/dc-ed/Dc-ac/DC-Education_Report.html)

1. Can "it" be clearly described? Can the semantics of the proposed
   element or element qualifier be expressed precisely, unambiguously,
   and briefly?

2. Is there a clear requirement for "it" in support of resource
   discovery in the education domain? Is there a demonstrated need for
   the proposed element, element qualifier, or value qualifier?

3. Does "it" support interoperability? Does it, to the maximum extent
   possible, support interoperability?

4. Is "it" practical? How difficult would it be for people creating
   metadata to comprehend the semantics of the proposed element or
   element qualifier and to apply it reasonably in the description of
   resources?

5. Does "it" refine an existing element? If "it" is a proposal for a
   new element, can it rationally be handled as effectively as an
   element or value qualifier for an existing element?

6. Are there alternative ways of implementing "it"? Within the
   conceptual framework of the Dublin Core Element Set (i.e.,
   element/element qualifiers and value/value qualifiers), are there
   alternative ways to achieve the ends sought?

7. Are there existing implementations or controlled vocabularies, etc.,
   supporting "it"? Somewhat akin to number 2 above, are there existing
   implementations for which this solution (element or element
   qualifier or value qualifier) is needed in support of resource
   discovery? In similar fashion, are there existing value qualifiers
   (i.e., controlled vocabularies, thesauri, etc.) that support "it"?

QUESTIONS FOR DISCUSSION

1) Is the Usage Board in the business of reviewing and certifying
   metadata terms in external (non-DCMI) namespaces? By publishing an
   application profile or namespace on the Web, a metadata-using
   project or service can in effect assert that its vocabulary terms
   are based on Dublin Core. DCMI could decide to acknowledge this
   assertion with a Usage Board review and publish its annotation. If
   such a term were approved by the Usage Board, would it thereby
   become part of a DCMI namespace?

2) Should the Usage Board be in the business of reviewing entire
   application profiles or specifications based on Dublin Core? Under
   what circumstances, for example, might we want to review an entire
   Dublin-Core-based standard, such as PRISM
   (http://www.prismstandard.org/news/2001/0401.asp) or Musicbrainz
   (http://musicbrainz.org/MM/)?

3) Is the Usage Board in the business more of "certifying conformance" or of
   "recommending"? Many people wish that in the interest of global
   interoperability, DCMI would provide more guidance in the form of
   recommending, for example, "Last name, first name."

4) Should terms that have (merely) been proposed, such as
   http://www.dublincore.org/2000/03/13-dcagent, already be posted in a
   DCMI registry directory, and do we rely on Usage Board annotations
   to clarify their status? Or should the Usage Board act as
   gatekeeper of the registry? Are such questions even within the
   scope of the Usage Board?