Guidelines for Dublin Core™ Application Profiles (Working Draft)

Creator:	Karen Coyle Consultant
Creator:	Thomas Baker DCMI
Date Issued:	2008-11-03
Identifier:	http://dublincore.org/specifications/dublin-core/profile-guidelines/2008-11-03/
Replaces:	Not applicable
Is Replaced By:	Not applicable
Latest Version:	http://dublincore.org/specifications/dublin-core/profile-guidelines/
Status of Document:	This is a DCMI Working Draft
Description of Document:	This document provides guidelines for the creation of Dublin Core™ Application Profiles. The document explains the key components of a Dublin Core™ Application Profile and walks through the process of developing a profile. The document is aimed at designers of application profiles -- people who will bring together metadata terms for use in a specific context. It does not address the creation of machine-readable implementations of an application profile nor the design of metadata applications in a broader sense. For additional technical detail the reader is pointed to further sources. This document represents work in progress.

Introduction
Framework for Dublin Core™ Application Profiles
Defining Functional Requirements
Selecting or Developing a Domain Model
Selecting or Defining Metadata Terms
Designing the Metadata Record with a Description Set Profile
Usage Guidelines
Syntax Guidelines
Appendix A: Description Set Model (from DCMI Abstract Model)
Appendix B: MyBookCase Description Set Profile
Appendix C: Using RDF properties in profiles: a technical primer

1. Introduction

When it comes to metadata, one size does not fit all. In fact, one size often does not even fit many. The metadata needs of particular communities and applications are very diverse. The result is a great proliferation of metadata formats, even across applications that have metadata needs in common. The Dublin Core™ Metadata Initiative has addressed this by providing a framework for designing a Dublin Core™ Application Profile (DCAP) that meets specific application needs while providing semantic interoperability with other applications on the basis of globally defined vocabularies and models.

Note that a DCAP is a generic construct that does not require metadata terms defined by DCMI [DCMI-MT]. A DCAP can use any terms that are defined in accordance with the Resource Description Framework of the World Wide Web Consortium, or RDF [RDF] -- a generalized language for data integration -- combining terms from multiple namespaces as needed.

Although creating an application profile takes effort, that effort results in data designed to fit well with other data in semantic webs of "linked data" [LINKED]. The effort also yields better guidance for metadata creators and clear specifications for metadata developers. By articulating what is intended and can be expected from data, application profiles promote the sharing and linking of data within and between communities.

It is recommended that application profiles be developed as team projects involving, at a minimum, both:

data content specialists, who are knowledgeable in the resources that need to be described and in the metadata used in the description of those resources, and
data engineers or architects, who understand how to structure the underlying data for interoperability in a linked data environment.

2. Framework for Dublin Core™ Application Profiles

A DCAP is a document (or set of documents) that specifies and describes the metadata used in a particular application. To accomplish this, a profile:

describes what a community wants to accomplish with its application (Functional Requirements);
characterizes the types of things described by the metadata and their relationships (Domain Model);
enumerates the metadata terms to be used and the rules for their use (Description Set Profile and Usage Guidelines); and
defines the machine syntax that will be used to encode the data (Syntax Guidelines and Data Formats).

The interoperability of DCAP-based metadata in linked data environments derives from its basis in standards: community domain models (characterizing which things are described in a particular field or application), metadata vocabularies (from which the terms used in the DCAP are chosen), the Dublin Core™ Abstract Model (a generic syntax for metadata records) [DCAM], and DCMI syntax guidelines (which use the generic syntax for concrete implementation encodings) [DCMI-ENCODINGS]. The foundation standard on which these domain standards rest is RDF [RDF].

Singapore Framework

How these standards fit together is shown in the Singapore Framework for Dublin Core™ Application Profiles [DCMI-SF]. The bottom tier, RDF, provides the foundation standards on which domain standards are built. The upper tier holds the design and documentation components of a metadata application. Taking this upper tier as a roadmap, the sections that follow walk through the process of creating a DCAP, with side trips dipping into some technical details when needed. As an illustration, we create a simple application profile that describes books and authors. We call this example MyBookCase.

3. Defining Functional Requirements

The purpose of any metadata is to support an activity. Defining clear goals for the application used in that activity is an essential first step.

Functional requirements guide the development of the application profile by providing goals and boundaries and are an essential component of a successful application profile development process. This development is often a broad community task and may involve managers of services, experts in the materials being used, technical application developers, and potential end-users of the services. In addition to data content specialists, this process should involve at least one expert with a deep understanding of the foundation standards.

There are many methodologies to help in the creation of functional requirements, such as business process modeling, and methods for visualizing requirements, such as the Unified Modeling Language [UML]. Many find that the definition of use cases and scenarios for a particular application helps elicit functional requirements that might otherwise be overlooked.

Functional requirements answer questions such as:

What do you want to accomplish with your application?
What are the limits of your application? What will it not attempt to do?
How do you want the application you create to serve your users?
Will your application need to perform specific actions, such as sorting alphabetically or downloading data in particular formats?
What are the key characteristics of your resources, and how does this affect your selection of data elements? For example, do you need to handle a variety of character sets?
What are the key characteristics of your users? Are they associated with a particular institution or are you serving a general public? Do they all speak the same language? How expert are they in relation to the data your application will manage?
Are there existing community standards that need to be considered?

Functional requirements can include general goals as well as specific tasks that you need to address. Ideally, functional requirements should address the needs of metadata creators, resource users, and application developers so that the resulting application fully supports the needs of the community.

These are some sample requirements from the Scholarly Works Application Profile (SWAP) [SWAP]:

Facilitate identification of open access materials.
Enable identification of the research funder and project code.

A set of functional requirements may include user tasks that must be supported such as the following from the Functional Requirements for Bibliographic Records (FRBR) [FRBR]:

Use the data to find materials that correspond to the user's stated search criteria.
Use the data retrieved to identify an entity.

For the MyBookCase DCAP our functional requirements are:

Use the data to retrieve books with a title search.
Limit a search to a particular language.
Sort retrieved items by publication date.
Find items about a given subject.
Describe the author as a person with a name and email address.
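
As a rough illustration, the first four requirements can be read as operations over a collection of records, and the fifth as a constraint on the shape of each record. The Python sketch below assumes a hypothetical in-memory list of dictionaries; the key names, the function names, and the sample records are invented for this example and are not part of the profile.

```python
# Invented sample records; the dictionary keys are illustrative assumptions.
BOOKS = [
    {"title": "Sample Atlas of Stars", "language": "eng", "date": "1995-06-01",
     "subjects": ["Astronomy"],
     "author": {"name": "A. Author", "email": "a.author@example.org"}},
    {"title": "Exemple de livre", "language": "fra", "date": "1988-01-15",
     "subjects": ["Botany"],
     "author": {"name": "B. Auteur", "email": "b.auteur@example.org"}},
    {"title": "Sample Guide to Plants", "language": "eng", "date": "1972-03-10",
     "subjects": ["Botany"],
     "author": {"name": "C. Writer", "email": "c.writer@example.org"}},
]


def search_title(books, text):
    """Retrieve books whose title contains the text (case-insensitive)."""
    needle = text.lower()
    return [b for b in books if needle in b["title"].lower()]


def limit_language(books, code):
    """Limit a result list to one language code."""
    return [b for b in books if b["language"] == code]


def sort_by_date(books):
    """Sort by publication date.

    Plain string comparison is enough here only because every sample date
    uses the same YYYY-MM-DD pattern.
    """
    return sorted(books, key=lambda b: b["date"])


def find_by_subject(books, subject):
    """Find books that carry the given subject term."""
    return [b for b in books if subject in b["subjects"]]
```

Even in this toy form, the sketch shows why the requirements matter for later design decisions: sorting depends on dates being recorded in a uniform pattern, and the language filter depends on languages being entered consistently.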

4. Selecting or Developing a Domain Model

After defining functional requirements, the next step is to select or develop a domain model. A domain model is a description of what things your metadata will describe and of the relationships between those things. The domain model is the basic blueprint for the construction of the application profile.

In the MyBookCase DCAP, our things are books and persons, who can be the authors of the books. We will see below how to describe the book using elements such as title and language, and how to describe the person with a name and an email address. For now, the domain model for our MyBookCase is simply:
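
A minimal Python sketch of this two-entity model, with Book linked to Person through an author relationship, might look as follows. The class and attribute names are assumptions made for this illustration only; a domain model is a conceptual description and does not prescribe any particular implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    """A person, who can be the author of a book."""
    name: str
    email: str


@dataclass
class Book:
    """A book, linked to zero or more persons through the author relationship."""
    title: str
    authors: List[Person] = field(default_factory=list)


# Invented example instances.
writer = Person(name="A. Author", email="a.author@example.org")
book = Book(title="Sample Book", authors=[writer])
```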

Models can be even simpler than this (e.g., just a book), or they can be more complex. The domain model for the Scholarly Works Dublin Core™ Application Profile, for example, is based on the library-community domain model Functional Requirements for Bibliographic Records (FRBR) [FRBR]. SWAP defines "Scholarly Work" in place of FRBR's more general entity "Work", and introduces new agent relationships beyond those in the FRBR, such as "isFundedBy" and "isSupervisedBy." In this way, SWAP makes use of FRBR but customizes the FRBR model to meet its specific needs:

5. Selecting or Defining Metadata Terms

As explained above, the entities in the domain model -- whether Book and Author, Manifestation and Copy, or just a generic Resource -- are types of things to be described in our metadata. The next step is to choose properties for describing these things. For example, a book has a title and author, and a person has a name; title, author, and name are properties.

The next step, then, is to scan available RDF vocabularies to see whether the properties needed already exist. DCMI Metadata Terms [DCMI-MT] is a good source of properties for describing intellectual resources like documents and web pages; the "Friend of a Friend" vocabulary has useful properties for describing people [FOAF]. If the properties one needs are not already available, it is possible to declare one's own (see Appendix C).

For the MyBookCase application profile, we concluded from our functional requirements that a book should have a title, date, language, subject, and author.

The first consideration in evaluating terms from existing vocabularies is their definition. The Dublin Core™ property "title", for example, is defined as "a name given to the resource". If the definition fits, this property is a candidate for use in your profile. However, the suitability of a property for use in a particular application also depends on data-engineering aspects that may require closer scrutiny. To enable this closer scrutiny, it is useful to ask the following questions about the values you intend to use with each of the properties needed in the profile. Note that for any given property, there may be more than one "yes" answer.

For the value associated with this property:

Do you want to use free text?
Will the free text ever need to follow a pre-defined format such as the W3C format for dates ("YYYY-MM-DD")?
Will you want to select valid values from a controlled list?
- If so, is that list already available somewhere, or will you need to create it?
- Do you want to limit the valid values to a selection from a list, or can unlisted values be used?
Will single value strings suffice (e.g., "1989" or "John Adams"), or is there a need (or potential need) for a more complex structure with multiple components (as when an author has a name, email address, and affiliation)?
Might you ever want to use a URI to identify the value or point to a description of the value?
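
Two of these questions, whether a string must follow a pre-defined format and whether values come from a closed or an open list, translate directly into checks that an application can run. The Python sketch below is one possible reading, not a prescribed method. The regular expression covers only the date-only forms of the W3C format (YYYY, YYYY-MM, YYYY-MM-DD) and checks syntax rather than calendar validity, the language list is a tiny illustrative subset rather than a real code list, and the function names are hypothetical.

```python
import re

# Date-only granularities of the W3C date format: YYYY, YYYY-MM, YYYY-MM-DD.
# Time-of-day forms are left out of this sketch.
_W3C_DATE = re.compile(r"\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?")


def is_w3c_date(value):
    """Return True if the whole string matches one of the date-only forms."""
    return _W3C_DATE.fullmatch(value) is not None


# Illustrative subset of three-letter language codes (assumption: a real
# application would load the complete controlled list).
LANGUAGE_CODES = {"eng", "fra", "deu"}


def is_acceptable(value, controlled_list, allow_unlisted=False):
    """Check a value against a controlled list.

    With allow_unlisted=False the list is closed: only listed values pass.
    With allow_unlisted=True the list is a recommendation: any non-empty
    value passes.
    """
    if value in controlled_list:
        return True
    return allow_unlisted and bool(value.strip())
```

For example, `is_w3c_date("03/11/2008")` is rejected, while `is_acceptable("xyz", LANGUAGE_CODES, allow_unlisted=True)` is accepted because the list is treated as open.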

Answers to these questions inform the metadata engineering step that determines which properties may be used from existing vocabularies and how the corresponding metadata values can be expressed interoperably in the machine-readable format for exchange in a linked data environment. Looking again at our MyBookCase, this is how we might answer the questions about the properties for book:

The title will be transcribed from the book itself. It will be a free text string.
We want to use the date property in various ways in our application, such as sorting a set of retrieved bibliographic records, so we want to be sure that dates are presented in a uniform way as a structured string.
We want to indicate the language of the book so that users can limit their searches by language. To make sure that languages are always input in the same way, we want to use a controlled list of languages.
We want to record the subject. In our case, we want to select terms from the Library of Congress Subject Headings.
We know that our author is not a single text string but will be described with several pieces of information, such as email address.

The process of metadata engineering will use these decisions to model the data elements as described in Appendix C. This analysis results in a data model that is coherent with RDF and the DCMI Abstract Model:

title: For title we can use the Dublin Core™ property dcterms:title, which has a "literal" range. Properties with literal ranges are often used when single, stand-alone text strings are all that is needed.
date: Because we want to perform automated operations like sorting on the date, we can select the Dublin Core™ property dcterms:date. This property has a "literal" range. We can indicate that the value string is formatted in accordance with the W3C Date and Time Formats specification by using the syntax encoding scheme dcterms:W3CDTF.
language: The language needs to be selected from a controlled list. We achieve this by requiring the use of three-letter codes listed in the international standard ISO 639-3 for the representation of names of languages (such as "eng" for "English") together with the syntax encoding scheme dcterms:ISO639-3 as a datatype. For this, we can use the DCMI property dcterms:language.
subject: We want to record the subject using the Library of Congress Subject Headings. Typically, we would indicate the subject with a string (e.g., "Islam and Science") together with the vocabulary encoding scheme dcterms:LCSH, which identifies the heading "Islam and Science" as a member of the Library of Congress Subject Headings. Alternatively, if the individual terms of a controlled vocabulary have already been given URIs as part of the work to express existing vocabularies using the RDF vocabulary Simple Knowledge Organization System [SKOS], then that URI may be used. In this case, the Library of Congress subject heading "Islam and Science" has been assigned the URI http://lcsh.info/sh85068424#concept. The DCMI property dcterms:subject has a "non-literal" range, which means that it can support the use of value strings, value URIs, and vocabulary encoding scheme URIs as needed.
author: Because our author needs to be described with multiple components, such as email address, the author property will need to have a non-literal range so that a separate but linked description can be created in the metadata record. The Dublin Core™ property dcterms:creator is defined with a non-literal range, so we will use this in MyBookCase.
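
To show how these decisions fit together, the following Python sketch models one book and its author as two linked descriptions in a single description set. It is a simplified illustration using plain dictionaries, not an implementation of the DCMI Abstract Model. The `statement` and `linked_description` helpers, the description identifiers, the sample title, date, and email address are all assumptions made for this example, as is the choice of foaf:mbox to carry the email address. The subject string and URI are the ones quoted above.

```python
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"


def statement(prop, value_string=None, ses_uri=None,
              value_uri=None, ves_uri=None, described_by=None):
    """Build one statement as a plain dict; unused components stay None."""
    return {
        "property": prop,
        "value_string": value_string,
        "ses_uri": ses_uri,            # syntax encoding scheme, if any
        "value_uri": value_uri,
        "ves_uri": ves_uri,            # vocabulary encoding scheme, if any
        "described_by": described_by,  # id of a linked description, if any
    }


# Two descriptions in one description set: the book and its author.
description_set = {
    "book": [
        statement(DCTERMS + "title", value_string="Sample Book"),
        statement(DCTERMS + "date", value_string="2008-11-03",
                  ses_uri=DCTERMS + "W3CDTF"),
        statement(DCTERMS + "language", value_string="eng",
                  ses_uri=DCTERMS + "ISO639-3"),
        statement(DCTERMS + "subject", value_string="Islam and Science",
                  value_uri="http://lcsh.info/sh85068424#concept",
                  ves_uri=DCTERMS + "LCSH"),
        # Non-literal value: points at the separate author description.
        statement(DCTERMS + "creator", described_by="author"),
    ],
    "author": [
        statement(FOAF + "mbox", value_uri="mailto:a.author@example.org"),
    ],
}


def linked_description(ds, description_id, prop):
    """Follow a property to the description of its value, if there is one."""
    for st in ds[description_id]:
        if st["property"] == prop and st["described_by"] is not None:
            return ds[st["described_by"]]
    return None
```

Calling `linked_description(description_set, "book", DCTERMS + "creator")` returns the author's statements, which is the "separate but linked description" that a non-literal range makes possible; the same call for the title property returns `None`, because a literal value cannot be described further.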

Property	Range	Value String	SES URI	Value URI	VES URI	Related description
dcterms:title	literal	YES	no	not applicable [1]	not applicable	not applicable
dcterms:created	literal	YES	YES [2]	not applicable	not applicable	not applicable
dcterms:language	non-literal	YES	YES [3]	no	no	no
dcterms:subject	non-literal	YES	no	YES	YES [4]	no
dcterms:creator	non-literal	YES	no	no	no	YES
foaf:firstName	literal	YES	no	not applicable	not applicable	not applicable
foaf:family_name	literal	YES	no	not applicable	not applicable	not applicable
foaf:mbox	non-literal	no	no	YES [5]	no	no
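
A table like this can also be read as a set of machine-checkable constraints. The Python sketch below transcribes the YES cells into a lookup and reports any component that a statement uses but the table does not mark YES. It rests on two assumptions that the table itself does not settle: YES is interpreted as "allowed" rather than "required", and both "no" and "not applicable" are treated as "not allowed". The names `PROFILE` and `check_statement` are hypothetical.

```python
# For each property: its range and the components marked YES in the table.
PROFILE = {
    "dcterms:title":    ("literal",     {"value_string"}),
    "dcterms:created":  ("literal",     {"value_string", "ses_uri"}),
    "dcterms:language": ("non-literal", {"value_string", "ses_uri"}),
    "dcterms:subject":  ("non-literal", {"value_string", "value_uri", "ves_uri"}),
    "dcterms:creator":  ("non-literal", {"value_string", "related_description"}),
    "foaf:firstName":   ("literal",     {"value_string"}),
    "foaf:family_name": ("literal",     {"value_string"}),
    "foaf:mbox":        ("non-literal", {"value_uri"}),
}


def check_statement(prop, **components):
    """Return a list of problems; an empty list means the statement passes.

    Components are passed by name (value_string, ses_uri, value_uri,
    ves_uri, related_description); a value of None counts as unused.
    """
    if prop not in PROFILE:
        return [f"{prop}: property not in profile"]
    _range, allowed = PROFILE[prop]
    used = {name for name, value in components.items() if value is not None}
    return [f"{prop}: {name} not allowed" for name in sorted(used - allowed)]
```

For instance, `check_statement("foaf:mbox", value_string="a.author@example.org")` reports a problem, because the table allows only a value URI for that property.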
