Eduserv Foundation, UK
KMR Group, CID, NADA, KTH (Royal Institute of Technology), Sweden
KMR Group, CID, NADA, KTH (Royal Institute of Technology), Sweden
Eduserv Foundation, UK
|Is Replaced By:||Not applicable|
|Status of Document:||This is a DCMI Proposed Recommendation|
|Description of Document:||This document describes an abstract model for Dublin Core™ metadata.|
This document specifies an abstract model for Dublin Core™ metadata. The primary purpose of this document is to specify the components and constructs used in Dublin Core™ metadata. It defines the nature of the components used and describes how those components are combined to create information structures. It provides a reference model which is independent of any particular encoding syntax. Such a reference model allows us to gain a better understanding of the kinds of descriptions that we are trying to encode and facilitates the development of better mappings and cross-syntax translations.
This document is primarily aimed at the developers of software applications that support Dublin Core™ metadata, people involved in developing new syntax encoding guidelines for Dublin Core™ metadata and people developing metadata application profiles based on DCMI vocabularies or on other compatible vocabularies.
The DCMI Abstract Model builds on work undertaken by the World Wide Web Consortium (W3C) on the Resource Description Framework (RDF) [RDF, RDFS]. The use of concepts from RDF is summarized below in Section 5, on DCMI Abstract Model semantics.
The DCMI Abstract Model is represented here using UML class diagrams [UML]. Readers who are not familiar with UML class diagrams should note that lines ending in a block-arrow should be read as 'is' or 'is a' (for example, "a value is a resource") and that lines starting with a block-diamond should be read as 'contains a' or 'has a' (for example, "a statement contains a property URI"). Other relationships are labeled appropriately. Note that the UML modeling used here shows the abstract model but is not intended to form a suitable basis for the development of software applications. In this document, words and phrases in italics are defined in Section 7, Terminology.
The abstract model of the resources described by descriptions is as follows:
Each described resource may be described using one or more property-value pairs.
Each property-value pair is made up of one property and one value.
Each value is a resource - the physical or conceptual entity that is associated with a property when a property-value pair is used to describe a resource.
Figure 1 - the DCMI resource model
The abstract model of DC metadata descriptions is as follows:
A description set is a set of one or more descriptions, each of which describes a single resource.
A description is made up of one or more statements (about one, and only one, described resource) and zero or one resource URI (a URI that identifies the described resource).
Each statement instantiates a property-value pair and is made up of a property URI (a URI that identifies a property), zero or one value URI (a URI that identifies the value associated with the property), zero or one vocabulary encoding scheme URI (a URI that identifies the vocabulary encoding scheme of which the value is a member), and zero or more value representations.
The value representation may take the form of a value string or a rich representation.
Each value string is a string which represents the resource. Value strings are intended to be human-readable.
Each value string may have either an associated syntax encoding scheme URI that identifies a syntax encoding scheme or an associated value string language that is an ISO language tag (for example en-GB) but not both.
Each rich representation is a sequence of octets that represents the value (a resource) - for example, some marked-up text, an image, a video, some audio, or some combination thereof.
Each rich representation must have an associated media type (a MIME Media Type).
Figure 2 - the DCMI description model
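The description model above can be illustrated with a few Python dataclasses. This is a minimal sketch for readers who find code easier to follow than prose, not a recommended software design: the class and field names and the example.org URIs are assumptions made for this illustration, and only the cardinality rules stated above are checked.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ValueString:
    """A human-readable string that represents the value."""
    text: str
    syntax_encoding_scheme_uri: Optional[str] = None
    language: Optional[str] = None  # language tag, for example "en-GB"

    def __post_init__(self):
        # Either a syntax encoding scheme URI or a language, but not both.
        if self.syntax_encoding_scheme_uri and self.language:
            raise ValueError("syntax encoding scheme URI and language are mutually exclusive")


@dataclass
class RichRepresentation:
    """A sequence of octets that represents the value."""
    octets: bytes
    media_type: str  # a MIME media type, for example "image/png"

    def __post_init__(self):
        if not self.media_type:
            raise ValueError("a rich representation must have a media type")


@dataclass
class Statement:
    """One property URI, optional value URI and scheme URI, zero or more representations."""
    property_uri: str
    value_uri: Optional[str] = None
    vocabulary_encoding_scheme_uri: Optional[str] = None
    value_strings: List[ValueString] = field(default_factory=list)
    rich_representations: List[RichRepresentation] = field(default_factory=list)


@dataclass
class Description:
    """One or more statements about a single resource, plus zero or one resource URI."""
    statements: List[Statement]
    resource_uri: Optional[str] = None

    def __post_init__(self):
        if not self.statements:
            raise ValueError("a description needs at least one statement")


@dataclass
class DescriptionSet:
    """A set of one or more descriptions."""
    descriptions: List[Description]

    def __post_init__(self):
        if not self.descriptions:
            raise ValueError("a description set needs at least one description")


# Usage with a hypothetical described resource.
title = Statement(
    property_uri="http://purl.org/dc/elements/1.1/title",
    value_strings=[ValueString("An Example Title", language="en-GB")],
)
description = Description(statements=[title], resource_uri="http://example.org/doc/1")
description_set = DescriptionSet(descriptions=[description])
```

Placing the checks in `__post_init__` means an object that breaks one of the stated constraints cannot be constructed, which keeps the sketch close to the wording of the model.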
The abstract model of the vocabularies used in DC metadata descriptions is as follows:
Each property may be related to one or more classes by a has domain relationship. Where it is stated that a property has such a relationship with a class and a described resource is related to a value by that property, it follows that the described resource is an instance of that class.
Each property may be related to one or more classes by a has range relationship. Where it is stated that a property has such a relationship with a class and a described resource is related to a value by that property, it follows that the value is an instance of that class.
Each resource may be an instance of one or more classes.
Each resource may be a member of one or more vocabulary encoding schemes.
Each class may be related to one or more other classes by a sub-class of relationship (where the two classes are defined such that all resources that are instances of the sub-class are also instances of the related class).
Each property may be related to one or more other properties by a sub-property of relationship. Where it is stated that such a relationship exists, the two properties are defined such that whenever a resource is related to a value by the sub-property, it follows that the resource is also related to that same value by the property.
Each syntax encoding scheme is a class (of strings).
A vocabulary is a set of one or more terms. Each term is a member of one or more vocabularies.
Figure 3 - the DCMI vocabulary model
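The sub-property of and sub-class of relationships described above support simple inference. The sketch below shows one way to apply them repeatedly; the `ex:` terms, the dictionary layout and the function names are hypothetical and are not drawn from any DCMI vocabulary.

```python
from typing import Dict, Set, Tuple

# Hypothetical vocabulary: each key is directly related to the listed terms.
SUB_PROPERTY_OF: Dict[str, Set[str]] = {
    "ex:illustrator": {"ex:contributor"},
    "ex:contributor": {"ex:agent"},
}
SUB_CLASS_OF: Dict[str, Set[str]] = {
    "ex:Novel": {"ex:Book"},
    "ex:Book": {"ex:Document"},
}


def ancestors(term: str, relation: Dict[str, Set[str]]) -> Set[str]:
    """Return every term reachable from `term` by following `relation`."""
    found: Set[str] = set()
    pending = list(relation.get(term, ()))
    while pending:
        current = pending.pop()
        if current not in found:  # also guards against cycles
            found.add(current)
            pending.extend(relation.get(current, ()))
    return found


def implied_pairs(prop: str, value: str) -> Set[Tuple[str, str]]:
    """A pair using a sub-property also holds for each related property."""
    return {(prop, value)} | {(p, value) for p in ancestors(prop, SUB_PROPERTY_OF)}


def implied_classes(cls: str) -> Set[str]:
    """An instance of a sub-class is also an instance of each related class."""
    return {cls} | ancestors(cls, SUB_CLASS_OF)
```

For example, `implied_pairs("ex:illustrator", "ex:alice")` yields pairs for `ex:illustrator`, `ex:contributor` and `ex:agent`, because the rule for `ex:contributor` applies in turn to the pair it inherits from `ex:illustrator`.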
A number of things about the model are worth noting:
Each value may be the described resource in a separate description within the same description set - for example, a separate description may provide metadata about the person that is the creator of the described resource.
The description model does not provide an explicit mechanism for indicating the classes of the described resource or the classes of any given value. Classes of the described resource can either be indicated explicitly using one or more statements in the description or be inferred from the domains of the properties used in the description. Classes of any given value can either be indicated explicitly using one or more statements in a separate description about that value or be inferred from the range of the property.
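The two routes for determining classes can be sketched as follows. The model does not say which property carries an explicit class statement; the use of `rdf:type` here is an assumption of this example, as are the `ex:` terms and the function names.

```python
from typing import Dict, List, Set, Tuple

# Assumption: explicit class statements use rdf:type.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical has domain and has range relationships.
HAS_DOMAIN: Dict[str, Set[str]] = {"ex:author": {"ex:Document"}}
HAS_RANGE: Dict[str, Set[str]] = {"ex:author": {"ex:Person"}}


def classes_of_described_resource(statements: List[Tuple[str, str]]) -> Set[str]:
    """Collect classes stated explicitly or inferred from property domains.

    `statements` holds (property URI, value URI) pairs from one description.
    """
    classes: Set[str] = set()
    for property_uri, value_uri in statements:
        if property_uri == RDF_TYPE:
            classes.add(value_uri)  # explicit statement
        classes |= HAS_DOMAIN.get(property_uri, set())  # inferred from domain
    return classes


def classes_of_value(property_uri: str) -> Set[str]:
    """Classes of a value inferred from the range of the property."""
    return set(HAS_RANGE.get(property_uri, set()))


statements = [
    ("ex:author", "http://example.org/person/1"),
    (RDF_TYPE, "ex:Report"),
]
```

With these statements, the described resource is an `ex:Report` by explicit statement and an `ex:Document` by inference from the domain of `ex:author`, while the value of `ex:author` is inferred to be an `ex:Person`.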
The abstract model presented above indicates that each DC metadata description describes one, and only one, described resource. This is commonly referred to as the one-to-one principle.
However, real-world metadata applications tend to be based on loosely grouped sets of descriptions (where the described resources are typically related in some way), known here as description sets. For example, a description set might comprise descriptions of both a painting and the artist. Furthermore, it is often the case that a description set will also contain a description about the description set itself (sometimes referred to as 'admin metadata' or 'meta-metadata').
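The painting and artist example can be sketched with plain Python dictionaries. All URIs under example.org, the dictionary keys and the helper `description_of` are hypothetical; the sketch links the two descriptions through a value URI and does not cover values that lack one.

```python
from typing import Dict, List, Optional

# A description set with two descriptions: a painting and its artist.
description_set: List[Dict] = [
    {
        "resource_uri": "http://example.org/painting/1",
        "statements": [
            {
                "property_uri": "http://purl.org/dc/elements/1.1/creator",
                "value_uri": "http://example.org/person/1",
            },
        ],
    },
    {
        "resource_uri": "http://example.org/person/1",
        "statements": [
            {
                "property_uri": "http://example.org/terms/name",
                "value_strings": ["An Example Artist"],
            },
        ],
    },
]


def description_of(value_uri: str, descriptions: List[Dict]) -> Optional[Dict]:
    """Find the description whose described resource is the given value."""
    for description in descriptions:
        if description.get("resource_uri") == value_uri:
            return description
    return None


creator_uri = description_set[0]["statements"][0]["value_uri"]
artist = description_of(creator_uri, description_set)
```

Each description still describes exactly one resource, so the one-to-one principle holds even though the two descriptions travel together.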
Description sets are instantiated, for the purposes of exchange between software applications, in the form of metadata records, according to one of the DCMI encoding guidelines (for example, XHTML meta tags, XML and RDF/XML) [DCMI-ENCODINGS].
A DC metadata value is the physical or conceptual entity that is associated with a property when a property-value pair is used to describe a resource. For example, a value associated with the Dublin Core™ Creator property is a person, organization or service - a physical entity. A value associated with the Dublin Core™ Date property is a point (or range) in time - a conceptual entity. A value associated with the Dublin Core™ Coverage property is a geographic region or country - a physical entity. A value associated with the Dublin Core™ Subject property is a concept (a conceptual entity) or a physical object or person (a physical entity). Each of these entities is a resource.
The value may be identified using a value URI. The value may be represented by one or more value strings and/or rich representations. The value may be described by a separate description. In each case, the value is a resource.
Some of the concepts in the DCMI Abstract Model are taken from the Resource Description Framework (RDF) and RDF Schema (RDFS) as follows:
|DCMI Abstract Model||RDF/RDFS|
|property or element||Class: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property|
|syntax encoding scheme||Class: http://www.w3.org/2000/01/rdf-schema#Datatype|
|has domain relationship||Property: http://www.w3.org/2000/01/rdf-schema#domain|
|has range relationship||Property: http://www.w3.org/2000/01/rdf-schema#range|
|sub-property of relationship||Property: http://www.w3.org/2000/01/rdf-schema#subPropertyOf|
|sub-class of relationship||Property: http://www.w3.org/2000/01/rdf-schema#subClassOf|
Table 1 - DCMI Abstract Model semantics
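As an illustration of Table 1, the relationship rows can be held in a lookup table and used to write a vocabulary relationship as an RDF-style triple. The dictionary keys, the function name and the example.org term are assumptions of this sketch; the RDFS URIs are those listed in the table.

```python
from typing import Dict, Tuple

# Relationship rows of Table 1, keyed by a short label chosen for this sketch.
DCAM_TO_RDFS: Dict[str, str] = {
    "has domain": "http://www.w3.org/2000/01/rdf-schema#domain",
    "has range": "http://www.w3.org/2000/01/rdf-schema#range",
    "sub-property of": "http://www.w3.org/2000/01/rdf-schema#subPropertyOf",
    "sub-class of": "http://www.w3.org/2000/01/rdf-schema#subClassOf",
}


def as_triple(subject_uri: str, relationship: str, object_uri: str) -> Tuple[str, str, str]:
    """Express a DCMI Abstract Model relationship as a (subject, predicate, object) triple."""
    return (subject_uri, DCAM_TO_RDFS[relationship], object_uri)


triple = as_triple(
    "http://example.org/terms/illustrator",  # hypothetical property
    "sub-property of",
    "http://purl.org/dc/elements/1.1/contributor",
)
```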
Particular encoding guidelines (HTML meta tags, XML, RDF/XML, etc.) [DCMI-ENCODINGS] do not need to encode all aspects of the abstract model described above. However, they should refer to the DCMI Abstract Model and indicate which parts of the model are encoded and which are not.
Encoding guidelines should indicate how a value can be treated as a described resource in a separate description in those cases where there is no value URI.
This document uses the following terms:
The underlying model for Dublin Core™ metadata has evolved since the first formalisms were proposed in the late 1990s. The following table presents rough terminological equivalences between earlier versions of DCMI grammatical principles [DCMI-GRAM-PRIN] and the current DCMI Abstract Model.
|DCMI Grammatical Principles||DCMI Abstract Model|
|element||property or element|
|element refinement||property with sub-property of relation|
|encoding scheme||syntax encoding scheme or vocabulary encoding scheme|
|syntax encoding scheme||syntax encoding scheme|
|qualifier||property with sub-property of relation, syntax encoding scheme, or vocabulary encoding scheme|
|vocabulary encoding scheme||vocabulary encoding scheme|
Table 2 - DCMI Grammatical Principles and DCMI Abstract Model
Dublin Core™ Metadata Initiative
DCMI Usage Board. DCMI Grammatical Principles. November 2003.
DCMI Encoding Guidelines
Duerst, M., M. Suignard. RFC 3987: Internationalized Resource Identifiers (IRIs). Internet Engineering Task Force (IETF). January 2005.
Freed, N. and N. Borenstein. RFC 2045: Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies. Internet Engineering Task Force (IETF). November 1996.
Freed, N. and N. Borenstein. RFC 2046: Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types. Internet Engineering Task Force (IETF). November 1996.
Klyne, Graham and Jeremy Carroll, editors. Resource Description Framework: Concepts and Abstract Syntax. W3C Recommendation. 10 February 2004.
Brickley, Dan and R.V. Guha, editors. RDF Vocabulary Description Language 1.0: RDF Schema. W3C Recommendation. 10 February 2004.
Booch, Grady, James Rumbaugh and Ivar Jacobson. The Unified Modeling Language User Guide. Addison-Wesley, 1998.
Berners-Lee, T., R. Fielding, L. Masinter. RFC 3986: Uniform Resource Identifier (URI): Generic Syntax. Internet Engineering Task Force (IETF). January 2005.
Thanks to Dan Brickley, Rachel Heery, Alistair Miles, Sarah Pulis, the members of the DC Usage Board and the members of the DCMI Architecture Community for their comments on previous versions of this document.