Expressing Dublin Core™ metadata using the DC-Text format
Creator: | Pete Johnston, Eduserv Foundation, UK |
---|---|
Date Issued: | 2007-12-03 |
Identifier: | http://dublincore.org/specifications/dublin-core/dc-text/2007-12-03/ |
Replaces: | Not applicable |
Is Replaced By: | Not applicable |
Latest Version: | http://dublincore.org/specifications/dublin-core/dc-text/ |
Status of Document: | This is a DCMI Recommended Resource |
Description of Document: | This document specifies a simple text format for representing a Dublin Core™ metadata description set. The format is known as "DC-Text". |
Table of contents
- Introduction
- The DCMI Abstract Model and DC-Text
- Some Features of the DC-Text Syntax
- The DC-Text Syntax
- References
- Acknowledgements
1. Introduction
The "Description Set Model" of the DCMI Abstract Model [DCAM] describes the constructs that make up a DC metadata description set. This document specifies a syntax for serialising, or representing, a DC metadata description set in plain text. The format is referred to as "DC-Text". A plain text format for serialisation of such description sets is useful as a means of presenting examples in a human-readable form which highlights the constructs of the DCMI Abstract Model, and also as a means of comparing the information represented in other machine-processable formats.
2. The DCMI Abstract Model and DC-Text
According to the "Description Set Model" of the DCMI Abstract Model [DCAM], a DC description set has the following structure:
- a description set is made up of one or more descriptions
- a description is made up of
  - zero or one described resource URI and
  - one or more statements
- a statement is made up of
  - exactly one property URI and
  - exactly one value surrogate
- a value surrogate is either a literal value surrogate or a non-literal value surrogate
  - a literal value surrogate is made up of
    - exactly one value string
  - a non-literal value surrogate is made up of
    - zero or one value URIs
    - zero or one vocabulary encoding scheme URIs
    - zero or more value strings
- a value string is either a plain value string or a typed value string
  - a plain value string may be associated with a value string language
  - a typed value string is associated with a syntax encoding scheme URI
- a non-literal value may be described by another description
The format described in this document supports the full DCAM "description set model".
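The cardinalities listed above can be sketched as a small data model. This is only an illustrative reading of the Description Set Model, not part of DC-Text itself: the class and field names are hypothetical, and the rule that a value string cannot carry both a language and a syntax encoding scheme URI is an interpretation of the "plain or typed" distinction.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class ValueString:
    text: str
    language: Optional[str] = None                    # plain value strings only
    syntax_encoding_scheme_uri: Optional[str] = None  # typed value strings only

    def __post_init__(self):
        # Assumption: "plain" and "typed" are mutually exclusive.
        if self.language is not None and self.syntax_encoding_scheme_uri is not None:
            raise ValueError("a value string is either plain or typed, not both")


@dataclass
class LiteralValueSurrogate:
    value_string: ValueString                         # exactly one


@dataclass
class NonLiteralValueSurrogate:
    value_uri: Optional[str] = None                   # zero or one
    vocabulary_encoding_scheme_uri: Optional[str] = None  # zero or one
    value_strings: List[ValueString] = field(default_factory=list)  # zero or more


@dataclass
class Statement:
    property_uri: str                                 # exactly one
    value_surrogate: Union[LiteralValueSurrogate, NonLiteralValueSurrogate]


@dataclass
class Description:
    statements: List[Statement]                       # one or more
    resource_uri: Optional[str] = None                # zero or one

    def __post_init__(self):
        if not self.statements:
            raise ValueError("a description needs at least one statement")


@dataclass
class DescriptionSet:
    descriptions: List[Description]                   # one or more

    def __post_init__(self):
        if not self.descriptions:
            raise ValueError("a description set needs at least one description")


title = Statement(
    "http://purl.org/dc/terms/title",
    LiteralValueSurrogate(ValueString("DCMI Home Page")),
)
description_set = DescriptionSet(
    [Description([title], resource_uri="http://dublincore.org/pages/home")]
)
```

Constructing an empty `Description` or `DescriptionSet` raises `ValueError`, which mirrors the "one or more" constraints in the list above.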
3. Some Features of the DC-Text Syntax
This section presents an overview of some features of the syntax.
3.1 The Structure of a DC-Text Document
The general structure of a DC-Text document is as follows:
namespace declaration
label (
  label ( content )
  label (
    label ( [...] )
    [...]
  )
)
Each of the primary components of a DC metadata description set defined by the DCMI Abstract Model is represented in DC-Text by a syntactic structure of the form:
label ( content )
where label is replaced by one of the following strings:
DescriptionSet, Description, ResourceURI, ResourceId, Statement, PropertyURI, VocabularyEncodingSchemeURI, ValueURI, ValueId, ValueString, Language, SyntaxEncodingSchemeURI, LiteralValueString
and content is one of:
- a sequence of one or more syntactic structures of the form label ( content ) (i.e. these structures are "nested"); or
- a string of the form "literal", which represents that Unicode literal; or
- a string of the form <uri>, which represents a URI; or
- a string of the form prefix:name, which represents a "qualified name" used as an abbreviation for a URI; or
- a string which represents a language tag; or
- a string which is a locally-scoped identifier used to establish relationships between values and their descriptions
For each label value in the list above, the permitted form of content is determined by the syntax rules specified in Section 4 below.
The DC-Text syntax supports the representation of a single DC description set, so a DC-Text document consists of zero or more namespace declarations followed by a single label ( content ) syntactic structure with a label of DescriptionSet, and as content, one or more nested label ( content ) structures with a label of Description, i.e. a DC-Text document has the following outline form:
@prefix prefix: <uri> .
DescriptionSet (
  Description (
    Statement ( ... )
    Statement ( ... )
  )
  Description (
    Statement ( ... )
    Statement ( ... )
  )
)
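As a rough illustration of the nested label ( content ) structure, the following Python sketch reads one such structure into nested tuples. It is not a conforming DC-Text parser: the function name and the token pattern are assumptions made for this example, namespace declarations and comments are expected to have been removed beforehand, and labels are not checked against the list in Section 3.1.

```python
import re

# Tokens: a quoted literal, a <uri>, a parenthesis, or a bare word
# (label, qualified name, language tag or local identifier).
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|<[^>]*>|[()]|[^\s()]+')


def parse_structure(text):
    """Parse 'label ( content )' into (label, children); leaf children are token strings."""
    tokens = TOKEN.findall(text)
    pos = 0

    def parse_node():
        nonlocal pos
        label = tokens[pos]
        if pos + 1 >= len(tokens) or tokens[pos + 1] != "(":
            raise SyntaxError(f"expected '(' after {label!r}")
        pos += 2  # consume the label and '('
        children = []
        while pos < len(tokens) and tokens[pos] != ")":
            if pos + 1 < len(tokens) and tokens[pos + 1] == "(":
                children.append(parse_node())  # nested label ( content )
            else:
                children.append(tokens[pos])   # literal, URI, qualified name, tag or id
                pos += 1
        if pos >= len(tokens):
            raise SyntaxError(f"missing ')' for {label!r}")
        pos += 1  # consume ')'
        return (label, children)

    node = parse_node()
    if pos != len(tokens):
        raise SyntaxError("unexpected content after the top-level structure")
    return node


tree = parse_structure(
    'DescriptionSet ( Description ( Statement ( '
    'PropertyURI ( <http://purl.org/dc/terms/subject> ) '
    'ValueString ( "Metadata" Language ( en ) ) ) ) )'
)
print(tree[0])
```

Because whitespace between tokens is ignored, the same tree results whether the input is written on one line or indented over several.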
3.2 URIs in DC-Text
The DCAM uses Uniform Resource Identifiers (URIs) [URI] to refer both to the resources described and to metadata terms (properties, classes, vocabulary encoding schemes and syntax encoding schemes).
In the DC-Text syntax, URIs may be represented in full or may be represented as "qualified names".
3.2.1 URIs
A URI may be represented in full. The following example shows a property URI as the content of a label ( content ) syntactic structure:
Example 1: URI represented in full
DescriptionSet (
  Description (
    Statement (
      PropertyURI ( <http://purl.org/dc/terms/subject> )
      ValueString ( "Metadata" )
    )
  )
)
Note that the DC-Text format does not support URI references in the form of relative references.
3.2.2 URIs, Qualified Names and Namespace Declarations
A URI may be represented as a "qualified name".
A "qualified name" is an abbreviation for a URI used in the DC-Text format. A "qualified name" is made up of two parts, a prefix and a name, separated by a colon (:). In DC-Text, wherever a "qualified name" is used, it is used to represent a URI.
The "prefix" in a "qualified name" is associated with a "namespace URI" using a namespace declaration. The URI represented by the "qualified name" is determined by concatenating the "namespace URI" with which the prefix is associated and the "name".
Namespace declarations occur at the start of a DC-Text document, and have the following form:
@prefix prefix: <uri> .
For example, the following declarations associate the prefix dcterms with the URI http://purl.org/dc/terms/ and the prefix ex with the URI http://example.org/resources/:
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex: <http://example.org/resources/> .
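The concatenation rule can be sketched in a few lines of Python. The function name and the dictionary holding the declarations are hypothetical, and ex:doc1 is an invented qualified name used only to show the second prefix.

```python
def expand_qname(qname, namespaces):
    """Return the URI for prefix:name by concatenating the namespace URI and the name."""
    prefix, _, name = qname.partition(":")
    return namespaces[prefix] + name


# The two declarations from the example above, held as a prefix -> namespace URI mapping.
namespaces = {
    "dcterms": "http://purl.org/dc/terms/",
    "ex": "http://example.org/resources/",
}

print(expand_qname("dcterms:title", namespaces))
print(expand_qname("ex:doc1", namespaces))
```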
When "encoding" a description set by generating a DC-Text instance, a "qualified name" to represent a URI is determined by
- dividing the URI into a pair consisting of a local name (the trailing characters of the URI, subject to the lexical constraints described above) and a namespace URI (the preceding part of the URI), and
- providing a Namespace Declaration element for this namespace URI (using a prefix in the namespace declaration and in the "qualified name").
Note that this means for a single URI there is more than one possible "qualified name" representation. For example, the URI http://purl.org/dc/terms/title might be represented using any of the following (namespace URI, local name) pairs:
- {http://purl.org/dc/terms/}, title
- {http://purl.org/dc/terms/t}, itle
- {http://purl.org/dc/terms/ti}, tle
- {http://purl.org/dc/terms/tit}, le
- {http://purl.org/dc/terms/titl}, e
Communities typically decide on a convention for the "qualified name" to be used for a URI, particularly for the URIs of terms (properties, classes, vocabulary encoding schemes and syntax encoding schemes), but in theory any of these five forms could be deployed without changing the interpretation of the instance. For all DCMI terms, the convention used by the DCMI community is to split the term URI into an expanded name at the right-most '/' (forward slash) character (as per the first example above). Also, the characters used for the prefix in a "qualified name" are not significant, but communities often adopt a convention on the common use of a prefix to facilitate human readability.
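When generating DC-Text, the DCMI convention of splitting at the right-most '/' could be applied as in the sketch below. The function names and the mapping from namespace URI to prefix are hypothetical. Vocabularies whose namespace URI ends in another character, such as '#', would need a different split rule.

```python
def split_uri(uri):
    """Split a URI into (namespace URI, local name) at the right-most '/'."""
    index = uri.rfind("/")
    if index == -1 or index == len(uri) - 1:
        raise ValueError(f"no local name after the last '/' in {uri!r}")
    return uri[: index + 1], uri[index + 1:]


def to_qname(uri, prefixes):
    """Return prefix:name for a URI; `prefixes` maps namespace URI -> prefix."""
    namespace_uri, name = split_uri(uri)
    return f"{prefixes[namespace_uri]}:{name}"


print(split_uri("http://purl.org/dc/terms/title"))
print(to_qname("http://purl.org/dc/terms/title", {"http://purl.org/dc/terms/": "dcterms"}))
```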
The following example shows a namespace declaration and the use of a "qualified name" for the property URI http://purl.org/dc/terms/title:
Example 2: URI represented as "qualified name"
@prefix dcterms: <http://purl.org/dc/terms/> .
DescriptionSet (
  Description (
    Statement (
      PropertyURI ( dcterms:title )
      LiteralValueString ( "DCMI Home Page" )
    )
  )
)
If the prefix used in a "qualified name" has not been associated with a URI in a namespace declaration, it is an error. If the prefix has been associated with multiple URIs (through multiple namespace declarations) then the prefix is associated with the namespace URI specified in the latest declaration.
In the following example the prefix "xx" is used in a "qualified name", and there are two namespace declarations for that prefix. The second namespace declaration is used to generate the URI http://your.example.org/terms/approved from the "qualified name":
Example 3: URI represented as "qualified name", multiple namespace declarations
@prefix xx: <http://my.example.org/terms/> .
@prefix xx: <http://your.example.org/terms/> .
DescriptionSet (
  Description (
    Statement (
      PropertyURI ( xx:approved )
      LiteralValueString ( "2007-12-03" )
    )
  )
)
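The two rules above (an undeclared prefix is an error, and the latest declaration wins) might be implemented along the following lines. The regular expression is an assumption about what a declaration looks like, not a grammar taken from this specification, and the function names are hypothetical.

```python
import re

# Assumed shape of a declaration: @prefix prefix: <uri> .
PREFIX_DECLARATION = re.compile(r'@prefix\s+([^\s:]+):\s*<([^>]*)>\s*\.')


def read_namespaces(text):
    """Collect prefix -> namespace URI; a later declaration replaces an earlier one."""
    namespaces = {}
    for prefix, uri in PREFIX_DECLARATION.findall(text):
        namespaces[prefix] = uri
    return namespaces


def resolve(qname, namespaces):
    """Expand prefix:name, treating an undeclared prefix as an error."""
    prefix, _, name = qname.partition(":")
    if prefix not in namespaces:
        raise ValueError(f"prefix {prefix!r} has no namespace declaration")
    return namespaces[prefix] + name


declarations = """
@prefix xx: <http://my.example.org/terms/> .
@prefix xx: <http://your.example.org/terms/> .
"""
namespaces = read_namespaces(declarations)
print(resolve("xx:approved", namespaces))
```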
3.3 Comments
Comments can be inserted anywhere in a DC-Text document. A comment starts with a # and ends with a newline.
# A comment at the start of the document
@prefix prefix: <uri> .
DescriptionSet (
  Description (
    # A comment at the start of a description
    Statement ( ... )
    # A comment following a statement
    Statement ( ... )
  )
  Description (
    Statement ( ... )
    Statement ( ... )
  )
)
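A pre-processing step that removes comments could look like the sketch below. One point is an assumption rather than something stated here: a # inside a quoted literal or inside a <uri> (for example a namespace URI ending in #) is treated as ordinary content, not as the start of a comment. The function name is hypothetical.

```python
def strip_comments(text):
    """Remove '#' comments up to the end of the line, leaving literals and <uri>s intact."""
    out = []
    closer = None  # '"' or '>' while inside a literal or a <uri>, else None
    i = 0
    while i < len(text):
        ch = text[i]
        if closer is not None:
            out.append(ch)
            if closer == '"' and ch == "\\" and i + 1 < len(text):
                out.append(text[i + 1])  # keep the escaped character, e.g. \"
                i += 1
            elif ch == closer:
                closer = None
        elif ch == "#":
            # Skip to the end of the line; the newline itself is kept.
            while i < len(text) and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
            if ch == '"':
                closer = '"'
            elif ch == "<":
                closer = ">"
        i += 1
    return "".join(out)


sample = '# header comment\n@prefix xs: <http://www.w3.org/2001/XMLSchema#> . # trailing\n'
print(strip_comments(sample))
```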
3.4 String Escapes
Strings representing literals may contain escape characters to encode non-printable characters and characters that are used in the DC-Text syntax as terminators. The escape characters are:
- \t = U+0009, tab
- \n = U+000A, line feed
- \r = U+000D, carriage return
- \" = U+0022, double quote
- \\ = U+005C, backslash
Example 4: Literal containing Escape Characters
@prefix dcterms: <http://purl.org/dc/terms/> .
DescriptionSet (
  Description (
    Statement (
      PropertyURI ( dcterms:title )
      LiteralValueString ( "Things that go \"bump\" in the night" )
    )
  )
)
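The five escapes can be handled with a pair of helper functions, sketched here in Python. The function names are hypothetical, and rejecting any other backslash sequence is an assumption, since this section does not say how unknown escapes should be treated.

```python
# Escape letter -> character, for the five escapes: \t \n \r \" \\
ESCAPES = {"t": "\t", "n": "\n", "r": "\r", '"': '"', "\\": "\\"}


def unescape(body):
    """Decode the text between the quotes of a DC-Text literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body) or body[i + 1] not in ESCAPES:
                raise ValueError(f"unsupported escape at position {i}")
            out.append(ESCAPES[body[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def escape(value):
    """Encode a string so that it can be placed between the quotes of a literal."""
    reverse = {char: "\\" + letter for letter, char in ESCAPES.items()}
    return "".join(reverse.get(ch, ch) for ch in value)


print(unescape(r'Things that go \"bump\" in the night'))
```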
4. The DC-Text Syntax
This section describes how each of the constructs of the DCAM Description Set Model is represented using the DC-Text syntax.
4.1 Encoding a Description Set
A description set is made up of one or more descriptions.
A DC-Text document supports the representation of a single DC description set. A description set is represented using a DescriptionSet ( ) syntactic structure.
Example 5: Description Sets
DescriptionSet (
  Description (
    Statement (
      PropertyURI ( <http://purl.org/dc/terms/subject> )
      ValueString ( "Metadata" )
    )
  )
)
4.2 Encoding a Description
A description is a set of one or more statements about a resource.
A description is represented using a Description ( ) syntactic structure.
The following example represents a description set consisting of a single description.
Example 6: Description
DescriptionSet (
  Description (
    Statement (
      PropertyURI ( <http://purl.org/dc/terms/subject> )
      ValueString ( "Metadata" )
    )
  )
)
A description set may contain multiple descriptions.
Each description is represented using a separate Description ( ) syntactic structure.
The following example represents a description set consisting of two descriptions. The order of the Description ( ) syntactic structures is not significant.
Example 7: Multiple Descriptions
DescriptionSet (
  Description (
    Statement (
      PropertyURI ( <http://purl.org/dc/terms/subject> )
      ValueString ( "Metadata" )
    )
  )
  Description (
    Statement (
      PropertyURI ( <http://xmlns.com/foaf/0.1/name> )
      LiteralValueString ( "Dublin Core™ Metadata Initiative" )
    )
  )
)
4.2.1 The Described Resource URI
A description may have an associated described resource URI.
A described resource URI is represented using a ResourceURI ( <uri> ) syntactic structure:
Example 8: Described Resource URI
DescriptionSet (
  Description (
    ResourceURI ( <http://dublincore.org/pages/home> )
    Statement (
      PropertyURI ( <http://purl.org/dc/terms/subject> )
      ValueString ( "Metadata" )
    )
  )
)
By introducing namespace declarations, the "qualified name" mechanism can be used to abbreviate the described resource URI. The same description set as in the previous example might be encoded as follows.
Example 9: Described Resource URI abbreviated using "qualified name"
@prefix page: <http://dublincore.org/pages/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( <http://purl.org/dc/terms/subject> )
      ValueString ( "Metadata" )
    )
  )
)
Note: from this point in this document, all the examples will show URIs abbreviated as "qualified names", but in each case they could be represented as URIs in full.
4.3 Encoding a Statement
A description is made up of one or more statements.
A statement is represented using a Statement ( ) syntactic structure.
The following example represents a description consisting of a single statement.
Example 10: Statements
@prefix page: <http://dublincore.org/pages/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dcterms:subject )
      ValueString ( "Metadata" )
    )
  )
)
A description may contain multiple statements.
Each statement is represented using a separate Statement ( ) syntactic structure.
The following example represents a description consisting of two statements. The order of the Statement ( ) syntactic structures is not significant.
Example 11: Multiple Statements
@prefix page: <http://dublincore.org/pages/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dcterms:subject )
      ValueString ( "Metadata" )
    )
    Statement (
      PropertyURI ( dcterms:title )
      LiteralValueString ( "DCMI Home Page" )
    )
  )
)
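For generating such nested structures programmatically, one possible approach is to hold each structure as a (label, children) tuple and render it with indentation. The tuple representation and the function name are assumptions made for this sketch. The indentation is purely cosmetic, in line with the outline forms shown in Section 3.1.

```python
def render(node, indent=0):
    """Render a (label, children) tuple as indented DC-Text.

    Children are either nested (label, children) tuples or ready-made token
    strings such as 'dcterms:title' or '"DCMI Home Page"'.
    """
    label, children = node
    pad = "  " * indent
    if all(isinstance(child, str) for child in children):
        # Only simple tokens inside: keep the structure on one line.
        return f"{pad}{label} ( {' '.join(children)} )"
    lines = [f"{pad}{label} ("]
    for child in children:
        if isinstance(child, str):
            lines.append("  " * (indent + 1) + child)
        else:
            lines.append(render(child, indent + 1))
    lines.append(f"{pad})")
    return "\n".join(lines)


statement = (
    "Statement",
    [
        ("PropertyURI", ["dcterms:title"]),
        ("LiteralValueString", ['"DCMI Home Page"']),
    ],
)
print(render(("Description", [("ResourceURI", ["page:home"]), statement])))
```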
4.3.1 Encoding a Property URI
A statement must contain exactly one property URI.
A property URI is represented using a PropertyURI ( <uri> ) syntactic structure.
The following example represents a description consisting of a single statement where the property URI is http://purl.org/dc/terms/subject.
Example 12: Property URI
@prefix page: <http://dublincore.org/pages/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dcterms:subject )
      ValueString ( "Metadata" )
    )
  )
)
4.4 Encoding a Value Surrogate
A statement must contain exactly one value surrogate. A value surrogate is either a literal value surrogate or a non-literal value surrogate.
4.4.1 Encoding a Literal Value Surrogate
A literal value surrogate is made up of exactly one value string.
4.4.1.1 Encoding a Literal Value Surrogate Value String
A value string within a literal value surrogate is represented using a LiteralValueString ( ) syntactic structure.
The following example represents a description consisting of a single statement with a literal value surrogate containing a value string.
Example 13: Literal Value Surrogate: Value String
@prefix page: <http://dublincore.org/pages/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dcterms:title )
      LiteralValueString ( "DCMI Home Page" )
    )
  )
)
4.4.2 Encoding a Non-Literal Value Surrogate
A non-literal value surrogate is made up of:
- zero or one value URIs
- zero or one vocabulary encoding scheme URIs
- zero or more value strings
4.4.2.1 Value URI
A value URI is represented using a ValueURI ( <uri> ) syntactic structure.
The following example represents a description consisting of a single statement with a non-literal value surrogate containing a value URI.
Example 14: Non-Literal Value Surrogate: Value URI
@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dcterms:creator )
      ValueURI ( agent:DCMI )
    )
  )
)
4.4.2.2 Vocabulary Encoding Scheme URI
A vocabulary encoding scheme URI is represented using a VocabularyEncodingSchemeURI ( <uri> ) syntactic structure.
The following example represents a description consisting of a single statement with a non-literal value surrogate containing a value URI and a vocabulary encoding scheme URI.
Example 15: Non-Literal Value Surrogate: Vocabulary Encoding Scheme URI
@prefix page: <http://dublincore.org/pages/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix exterms: <http://example.org/terms/> .
@prefix exsh: <http://example.org/sh/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dcterms:subject )
      ValueURI ( exsh:metadata )
      VocabularyEncodingSchemeURI ( exterms:EXSH )
    )
  )
)
4.4.2.3 Encoding a Non-Literal Value Surrogate Value String
A value string within a non-literal value surrogate is represented using a ValueString ( ) syntactic structure.
The following example represents a description consisting of a single statement with a non-literal value surrogate containing a value URI, a vocabulary encoding scheme URI and a value string.
Example 16: Non-Literal Value Surrogate: Value String
@prefix page: <http://dublincore.org/pages/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix exterms: <http://example.org/terms/> .
@prefix exsh: <http://example.org/sh/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dcterms:subject )
      ValueURI ( exsh:metadata )
      VocabularyEncodingSchemeURI ( exterms:EXSH )
      ValueString ( "Metadata" )
    )
  )
)
A non-literal value surrogate may contain multiple value strings.
The following example represents a description consisting of a single statement with a non-literal value surrogate containing a value URI, a vocabulary encoding scheme URI and two value strings.
Example 17: Non-Literal Value Surrogate: Multiple Value Strings
@prefix page: <http://dublincore.org/pages/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix exterms: <http://example.org/terms/> .
@prefix exsh: <http://example.org/sh/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dcterms:subject )
      ValueURI ( exsh:metadata )
      VocabularyEncodingSchemeURI ( exterms:EXSH )
      ValueString ( "Metadata" )
      ValueString ( "Métadonnées" )
    )
  )
)
4.5 Encoding a Value String
A value string is either a plain value string or a typed value string.
4.5.1 Encoding a Plain Value String
A plain value string may be associated with a value string language.
4.5.1.1 Encoding a Value String Language
A value string language is represented using a Language ( tag ) syntactic structure.
The following example represents a description consisting of a single statement with a non-literal value surrogate containing a value URI, a vocabulary encoding scheme URI and two plain value strings, each associated with a value string language.
Example 18: Value String Languages
@prefix page: <http://dublincore.org/pages/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix exterms: <http://example.org/terms/> .
@prefix exsh: <http://example.org/sh/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dcterms:subject )
      ValueURI ( exsh:metadata )
      VocabularyEncodingSchemeURI ( exterms:EXSH )
      ValueString ( "Metadata"
        Language ( en )
      )
      ValueString ( "Métadonnées"
        Language ( fr )
      )
    )
  )
)
4.5.2 Encoding a Typed Value String
A typed value string is associated with a syntax encoding scheme URI.
4.5.2.1 Encoding a Syntax Encoding Scheme URI
A syntax encoding scheme URI is represented using the SyntaxEncodingSchemeURI ( <uri> ) syntactic structure.
The following example represents a description consisting of a single statement with a non-literal value surrogate containing a typed value string, that is, a value string associated with a syntax encoding scheme URI.
Example 19: Syntax Encoding Scheme URI
@prefix page: <http://dublincore.org/pages/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xs: <http://www.w3.org/2001/XMLSchema#> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dcterms:modified )
      ValueString ( "2006-02-14"
        SyntaxEncodingSchemeURI ( xs:date )
      )
    )
  )
)
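The two kinds of value string can be serialised by one small helper, sketched below. The function and parameter names are hypothetical. The syntax encoding scheme is passed as an already formatted token (a qualified name or a <uri>), and supplying both a language and a syntax encoding scheme is rejected on the reading that a value string is either plain or typed.

```python
def format_value_string(text, language=None, ses_uri=None, label="ValueString"):
    """Serialise a plain or typed value string as a DC-Text structure."""
    if language is not None and ses_uri is not None:
        raise ValueError("a value string is either plain or typed, not both")
    # Apply the string escapes; the backslash must be handled first.
    for raw, escaped in (("\\", "\\\\"), ('"', '\\"'), ("\t", "\\t"), ("\n", "\\n"), ("\r", "\\r")):
        text = text.replace(raw, escaped)
    parts = [f'"{text}"']
    if language is not None:
        parts.append(f"Language ( {language} )")
    if ses_uri is not None:
        parts.append(f"SyntaxEncodingSchemeURI ( {ses_uri} )")
    return f"{label} ( {' '.join(parts)} )"


print(format_value_string("Métadonnées", language="fr"))
print(format_value_string("2006-02-14", ses_uri="xs:date"))
```

Passing label="LiteralValueString" produces the form used inside a literal value surrogate.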
4.6 Descriptions of Non-Literal Values
A description set may contain multiple descriptions, each represented by a Description ( content ) syntactic structure. The order of the structures has no significance.
Example 20: Multiple Descriptions
@prefix page: <http://dublincore.org/pages/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dcterms:subject )
      ValueString ( "Metadata" )
    )
  )
  Description (
    Statement (
      PropertyURI ( foaf:name )
      LiteralValueString ( "Dublin Core™ Metadata Initiative" )
    )
  )
)
A resource which is referred to as a non-literal value in a statement in one description may be the described resource of another description within the description set. If that resource has been assigned a URI, then that URI appears as the value URI in the statement where the resource is referred to as the non-literal value and as a described resource URI in the description of that resource, as shown below:
Example 21: Non-Literal Value as Described Resource
@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dcterms:creator )
      ValueURI ( agent:DCMI )
    )
  )
  Description (
    ResourceURI ( agent:DCMI )
    Statement (
      PropertyURI ( foaf:name )
      LiteralValueString ( "Dublin Core™ Metadata Initiative" )
    )
  )
)
In some cases a resource will not have a URI assigned, or the URI will not be known. Such a resource may still be referred to as a non-literal value in a statement in one description and be the described resource of another description in the same description set.
In DC-Text, the association between the statement in the first description and the second description is made by using an identifier for the resource which is local to a DC-Text instance. This local identifier is used in a ValueId ( id ) syntactic construct within one or more Statement ( ) constructs where the resource is referred to as a non-literal value, and in a ResourceId ( id ) construct within a Description ( ) construct for which the resource is the described resource. The content of a ValueId ( id ) construct must match the content of a ResourceId ( id ) construct in the same DC-Text instance.
Note that this is a syntactic mechanism for linking references to values in statements to descriptions of those values: the local identifier itself does not appear in the description set.
Example 22: Non-Literal Value as Described Resource
@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dcterms:creator )
      ValueId ( agentDCMI )
    )
  )
  Description (
    ResourceId ( agentDCMI )
    Statement (
      PropertyURI ( foaf:name )
      LiteralValueString ( "Dublin Core™ Metadata Initiative" )
    )
  )
)
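The requirement that every ValueId matches a ResourceId in the same instance can be checked with a simple scan, sketched below. This is a simplification: the regular expressions are assumptions about the form of a local identifier, the function name is hypothetical, and text inside quoted literals is not excluded from the scan.

```python
import re

# Assumed identifier shape: any run of characters without whitespace or parentheses.
VALUE_ID = re.compile(r'\bValueId\s*\(\s*([^\s()]+)\s*\)')
RESOURCE_ID = re.compile(r'\bResourceId\s*\(\s*([^\s()]+)\s*\)')


def unmatched_value_ids(text):
    """Return the local identifiers used in ValueId ( ) that no ResourceId ( ) declares."""
    value_ids = set(VALUE_ID.findall(text))
    resource_ids = set(RESOURCE_ID.findall(text))
    return value_ids - resource_ids


document = """
DescriptionSet (
  Description (
    Statement ( PropertyURI ( dcterms:creator ) ValueId ( agentDCMI ) )
  )
  Description (
    ResourceId ( agentDCMI )
    Statement ( PropertyURI ( foaf:name ) LiteralValueString ( "DCMI" ) )
  )
)
"""
print(unmatched_value_ids(document))
```

An empty result means every reference is linked to a description. Any identifiers returned point at values whose descriptions are missing from the instance.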
References
[DCAM]
DCMI Abstract Model DCMI Recommendation. 2007-06-04
http://dublincore.org/specifications/dublin-core/abstract-model/2007-06-04/
[URI]
Berners-Lee, T., R. Fielding, L. Masinter. RFC 3986: Uniform Resource Identifier (URI): Generic Syntax. Internet Engineering Task Force (IETF). January 2005.
http://www.ietf.org/rfc/rfc3986.txt
Changes to this document
2008-01-14. Removed note saying: As of 2007-12-03, a proposal to "replicate" the fifteen properties of the http://purl.org/dc/elements/1.1/ ("dc:") namespace in the http://purl.org/dc/terms/ ("dcterms:") namespace has not been approved or implemented. Until this is approved, references herein to names such as dcterms:subject and dcterms:title should be understood as having the status of "proposed" as well.
2008-03-31. Changed captions of examples 14, 15, 16 & 17 from "Literal Value Surrogate..." to "Non-Literal Value Surrogate...". Removed profile URI.