DCMI Registry Community

Title:

Draft DCMI Open Metadata Registry Functional Requirements

Editor: Rachel Heery
Date Issued: 2001-10-31
Identifier: http://dublincore.org/groups/registry/fun_req_ph1-20011031.shtml
Previous version: http://www.ukoln.ac.uk/~lisrmh/DCMI-registry/funreq-20010819.html
Latest version:
Status of document:
This is a DCMI Working Draft.
Description of document:

The DCMI registry is designed to provide an added-value 'registry service' that helps humans and software obtain reliable, trusted information about DCMI terms. It is expected and encouraged that other registry services will add value in other ways, for example by providing domain-specific approaches to DCMI terms. Readers should be aware that the DCMI Registry exists alongside the straightforward 'registry service' provided by the web, i.e. it will be possible to link over the web directly to DCMI schemas by means of resolvable DCMI namespace URIs (using HTTP GET on the namespace URI).


DCMI Registry Functional requirements

Important note to readers who may be able to offer input:

In particular I would welcome volunteers to elaborate

  • Functionality required by 'RDF experts'. I am rather assuming this might be very similar to what is there now as the 'RDF Interface' at http://wip.dublincore.org:8080/registry/registry1 . In other words the same functionality as for information seekers but by-passing any XSLT re-labelling of properties/classes and ensuring transparency of any 'canned searches'.

  • Functionality required by software. I hope that the DCMI registry will be able to 'add value' to a straight 'HTTP GET schema URI'. Presumably there would be functionality concerned with multilingual context, but also added 'richness' from additional information expressed by means of EOR schemas. For example, EOR schema vocabulary might enable additional comment to be associated with terms over and above that expressible in RDFS... Such functionality might be associated with a particular application such as a metadata creation tool, or a 'metadata-aware' harvester.

  • Confirmation that functionality listed is that really required by 'information seekers', real people, real implementors!

  • Following the DC-2001 workshop in Tokyo, the Usage Board has agreed to define any additional functionality, or changes in functionality, that they require in order to 'manage' the DCMI vocabulary. In addition, following the workshop we are considering how to take an 'operational Registry' implementation forward, and will be considering alternative software solutions.

1. Background

This document is a more detailed elaboration of requirements for the DCMI Registry. It follows on from the high-level overview 'Purpose and scope of DCMI Registry', 2001-05-11. It is intended that these detailed requirements will inform further development of the prototype over the next few weeks.

The DCMI registry is designed to provide an added-value 'registry service' that helps humans and software obtain reliable, trusted information about DCMI terms. It is expected and encouraged that other registry services will add value in other ways, for example by providing domain-specific approaches to DCMI terms. Readers should be aware that the DCMI Registry exists alongside the straightforward 'registry service' provided by the web, i.e. it has been assumed that it will be possible to link over the web directly to DCMI schemas by means of resolvable DCMI namespace URIs (using HTTP GET on the namespace URI).
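As an illustration of the plain 'HTTP GET namespace URI' route mentioned above, the following is a minimal sketch of how a client might prepare such a request. The function name and the Accept header value are assumptions made for this example, not part of any DCMI specification, and the request is only built here, not sent.

```python
from urllib.request import Request

# One of the DCMI namespace URIs listed later in this document.
DC_ELEMENTS_NS = "http://purl.org/dc/elements/1.1/"


def build_schema_request(namespace_uri: str) -> Request:
    """Build (but do not send) an HTTP GET request for a namespace URI.

    The Accept header is an assumption: it asks for RDF/XML, but whether
    a server honours it depends on how the namespace URI is served.
    """
    return Request(
        namespace_uri,
        headers={"Accept": "application/rdf+xml"},
        method="GET",
    )


request = build_schema_request(DC_ELEMENTS_NS)
# To actually fetch the schema one could call:
#   urllib.request.urlopen(request).read()
```

The registry is intended to add value on top of this kind of direct retrieval, not to replace it.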

The specification of functionality has been informed by extremely useful prototyping of a DCMI Registry carried out by Harry Wagner of OCLC, see http://wip.dublincore.org:8080/registry/Registry.

2. The purpose and scope of the DCMI Registry

The purpose and scope is amended from 'Purpose and scope of DCMI Registry', 2001-05-11:

  • To assist DCMI to manage the evolution of DC vocabularies
  • To provide authoritative definitions of recommended DC elements, qualifiers and controlled vocabularies
  • To identify DCMI recommended names for schemes
  • To express these 'controlled metadata sets' and the relationships between them in machine readable schema language and in human readable mode.
  • To provide a user friendly interface to the registered metadata (e.g. search and browse facility, browseable element set lists, links to annotations and guidance on use of DC elements and qualifiers)
  • To manage multilingual aspects of DC.

This functional requirements specification has been developed on the assumption that the DCMI Registry will be based on an 'infusion' from proposed RDFS encoded schemas for the DCMI terms, as well as infusion from additional 'EOR schemas' to facilitate structuring and accessing terms in the Registry. The prototypes have been based on this assumption too. These terms and the relationships between them are defined within the following RDFS encoded schemas:

http://purl.org/dc/elements/1.1/

http://purl.org/dc/terms/

http://purl.org/dc/dcmitype/

EOR schema..... ??

3. Users

We have identified four categories of users of the registry:

  • information seekers : those looking for up to date information on the semantics of DCMI terms. These might typically be metadata creators, information specialists implementing new systems and looking for appropriate terms for their schemas, librarians, etc
  • computer specialists : developers using RDF/XML, software engineers etc
  • applications : software using the registry to navigate information about DCMI terms
  • administrators : who may add, edit and delete entries

The priority is to define a minimum functionality that gives the first category, information seekers, a user friendly impression of the registry so that they can evaluate it. Because many information seekers will not have English as their native language, we need to prioritise multilingual functionality in the first phase. Implementing multilinguality in the first phase will also help to ensure correct design decisions. We accept that there may only be a limited implementation of multilinguality to begin with, as it requires a lot of additional data to be entered. We also want to benefit from additional work on multilingual registries going on in parallel in a different forum (ULIS in Japan).
 

4. Phase 1 requirements

4.1 Content of registry

The registry will contain information about all DCMI terms including

  • DCMI elements (identified by namespace ...
  • DCMI element refinements and new elements (identified by namespace
  • DCMI controlled vocabularies (identified by namespace ....
  • DCMI recognised and registered schemes

Initially we will include for all DCMI elements and qualifiers:

  • Name - The unique token assigned to the term
  • Label - The human-readable label assigned to the term
  • Registration Authority - The entity authorised to register the data element (i.e. DCMI)
  • Language - The language of the term name, definition and comment
  • Definition - A statement that clearly represents the concept and essential nature of the term
  • Obligation - Indicates if the term is required to always or sometimes be present (i.e. contain a value)
  • Datatype - Indicates the type of data that can be represented in the value of the term
  • Maximum Occurrence - Indicates any limit to the repeatability of the term
  • Comment - A remark concerning the application of the term
  • Relation - Indicating relation to other terms, e.g. element refinement of a particular element, controlled vocabulary term related to a specific scheme

All this information is given explicitly or implied for all elements and qualifiers in existing web documents [1] and [2]. Information for terms within controlled vocabularies is given in the documents referenced at [4].

Please note that within [2] 'Name' has been used in place of 'Identifier' in [1], and also in [2] Label has been used in place of Name in [1]. The usage in [2] has been agreed as preferred by the DCMI approval process rather than that defined in ISO 11179 [3].
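The attributes listed in this section could be held in a simple record structure. The sketch below is one hypothetical way to model them, following the Name/Label usage of [2]. The class name, field names and types are assumptions for illustration, as is the convention that a missing maximum occurrence means 'no limit'.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TermRecord:
    """One registry entry, using the attribute list from section 4.1."""

    name: str                      # unique token, e.g. "title"
    label: str                     # human-readable label, e.g. "Title"
    registration_authority: str    # e.g. "DCMI"
    language: str                  # language of name, definition and comment
    definition: str
    obligation: Optional[str] = None          # assumption: free-text value
    datatype: Optional[str] = None
    maximum_occurrence: Optional[int] = None  # assumption: None = unlimited
    comment: Optional[str] = None
    relations: List[str] = field(default_factory=list)  # related term names


# Illustrative entry for the Title element.
title = TermRecord(
    name="title",
    label="Title",
    registration_authority="DCMI",
    language="en",
    definition="A name given to the resource.",
)
```

Keeping the language on the record itself means a translated description of the same term would be a second record sharing the same name.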

4.2 Requirements for information seeker interface

Users must be able within displays to distinguish whether terms are elements or element refinements or schemes or from controlled vocabularies. Users must be able to browse and search for these categories of terms separately.

In addition users must be able to identify relationships between terms e.g. to see that Alternative is a sub-property of Title.
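The relationship requirement above amounts to navigating sub-property links in both directions. A minimal sketch follows, assuming the relationships are available as simple (refinement, element) pairs; the data structure and function names are hypothetical, and the sample pairs are a small hand-picked subset of the qualifiers described in [2].

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# (refinement, refined element) pairs; an illustrative sample only.
SUB_PROPERTY_OF: List[Tuple[str, str]] = [
    ("alternative", "title"),
    ("abstract", "description"),
    ("tableOfContents", "description"),
    ("created", "date"),
    ("issued", "date"),
]


def build_index(
    pairs: List[Tuple[str, str]],
) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Index the pairs so they can be followed in either direction."""
    parent_of: Dict[str, str] = {}
    children_of: Dict[str, List[str]] = defaultdict(list)
    for child, parent in pairs:
        parent_of[child] = parent
        children_of[parent].append(child)
    return parent_of, dict(children_of)


PARENT_OF, CHILDREN_OF = build_index(SUB_PROPERTY_OF)


def refined_element(term: str) -> Optional[str]:
    """Return the element a refinement qualifies, or None."""
    return PARENT_OF.get(term)


def refinements_of(element: str) -> List[str]:
    """Return all refinements of an element, sorted by name."""
    return sorted(CHILDREN_OF.get(element, []))
```

With this index, a display for Alternative can link to Title, and a display for Title can list Alternative among its refinements.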

4.2.1 Browse through lists of terms

Users will want to browse separate summary lists of the following

  • all DCMI terms that have been registered (i.e. elements, element refinements, schemes, terms from controlled vocabularies). Note this may not be the same thing as 'all schemas available in the registry' as the registry will include schemas used for 'managing' the data. Users will want to link from the list of categories to terms within those categories e.g. from 'DCMI elements' to a list of summarised displays of information about those elements

  • all DCMI elements in summarised display

  • all DCMI element refinements

  • all schemes for which DCMI have recommended names

  • all controlled vocabulary terms [4]

Summary displays of terms might include name, identifier, relationship to other terms. From summary displays users can link to full display of all information for that term. From the initial list users may also want to link to appropriate related terms e.g. from display of an element refinement to all element refinements for the appropriate element, to all element refinements, to the element refined.

Listings should be sorted by term label (e.g. contributor, subject, title etc) in alphabetical order.
Users will want to select their preferred language for translation of name, definition etc in the list.
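The sorting and language-selection requirements can be sketched as follows. This assumes labels are stored per language and that English is used as the fallback when no translation exists; the fallback rule, the function names and the non-English labels are illustrative assumptions, not official translations.

```python
from typing import Dict, List, Tuple

# name -> {language code -> label}; the "de" labels are placeholders
# made up for this example.
LABELS: Dict[str, Dict[str, str]] = {
    "title": {"en": "Title", "de": "Titel"},
    "subject": {"en": "Subject"},
    "contributor": {"en": "Contributor", "de": "Mitwirkender"},
}


def label_for(name: str, language: str, fallback: str = "en") -> str:
    """Return the label in the preferred language, else the fallback."""
    labels = LABELS[name]
    return labels.get(language, labels[fallback])


def browse_list(language: str) -> List[Tuple[str, str]]:
    """Return (label, name) rows sorted alphabetically by label."""
    rows = [(label_for(name, language), name) for name in LABELS]
    return sorted(rows, key=lambda row: row[0].casefold())
```

Sorting on the displayed label, not on the term name, means the order of the list can differ between languages.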

4.2.2 Search contents of registry

Users should have some choice in what fields (properties and classes) are searched. For example, users might not be interested in searching the EOR schema properties that exist for the description of terms, i.e. it must be possible to limit the search to schemas about DCMI terms and exclude schemas that exist for managing the registry.

Users must be able to

  • enter a string to match label of DCMI terms (e.g. title, alternative title etc)
  • enter a string which will be used as free text search of values describing DCMI terms (e.g. any mention of creator, subject, Dewey anywhere in text of definitions, comments, term labels)

On the top screen one needs to offer this browse facility. The 'information seeker interface' needs to replace RDF jargon such as 'properties/classes' with 'names of terms/scheme/schema/element' etc, whatever is appropriate to the particular display.

Indicate on the top screen search box which properties within a schema are being searched, e.g.

  • Search all values associated with DCMI terms (which in effect might include rdf:ID? rdfs:label? eor:comment? rdfs:comment?) to find the string "title"
  • Search on rdfs:subPropertyOf contains "title" (to get properties which qualify title etc)

In all searches users must be able to specify which language of name/definition/comment they wish to search.
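The two search modes described in this section, restricted to a chosen language, might look like the sketch below. It assumes case-insensitive substring matching, which the requirements do not specify. The record structure and function names are hypothetical, and the sample definitions are illustrative and may not match the official wording exactly.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Description:
    """A description of one term in one language."""

    name: str
    language: str
    label: str
    definition: str
    comment: str = ""


RECORDS = [
    Description("title", "en", "Title", "A name given to the resource."),
    Description(
        "alternative", "en", "Alternative",
        "Any form of the title used as a substitute or alternative "
        "to the formal title of the resource.",
    ),
    Description(
        "creator", "en", "Creator",
        "An entity primarily responsible for making the content "
        "of the resource.",
    ),
]


def search_labels(records: List[Description], text: str, language: str) -> List[str]:
    """Names of terms whose label contains the string."""
    needle = text.casefold()
    return [
        r.name for r in records
        if r.language == language and needle in r.label.casefold()
    ]


def search_free_text(records: List[Description], text: str, language: str) -> List[str]:
    """Names of terms mentioning the string in label, definition or comment."""
    needle = text.casefold()
    return [
        r.name for r in records
        if r.language == language
        and any(needle in f.casefold() for f in (r.label, r.definition, r.comment))
    ]
```

Here a label search for "title" would return only the Title element, while the free-text search would also return Alternative because its definition mentions the title.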

4.2.3 Display of retrieved data, input screens etc

All RDF jargon needs to be replaced with more accessible terminology. The terminology decided on will need to be available in user's language of preference.

Within displays all 'property' and 'class' labelling must be replaced with easily understood terminology. Displays should be of human readable information with RDF syntax stripped away. For example:

Suggested display of information about a schema (RDFS tags need to be replaced with plain English, some information stripped, only text in bold needs to be displayed )

Example: Full display of information about a schema

  RDF property    Display label            Displayed value
                  Schema                   DCMI elements (defining v1.1 elements)
  rdf:type                                 (LINK to human readable version of http://purl.org/dc/elements/1.1/)
  dc:title        Title                    The Dublin Core Element Set v1.1
  dc:description  Description              The Dublin Core metadata vocabulary is a simple vocabulary intended to facilitate discovery of resources.
  dc:publisher    Registration Authority   The Dublin Core Metadata Initiative
  dc:date                                  2000-07-02
  dc:language     Language                 English
  dc:relation     Relation                 Related to other schema xxx, xxx
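The relabelling shown in the example display above could be driven by a simple lookup table. The sketch below uses the property-to-label pairs from that example. The function name is hypothetical, and dropping properties that have no label is an assumption about what 'some information stripped' means.

```python
from typing import Dict, List, Tuple

# RDF property -> plain-English label, taken from the example display.
DISPLAY_LABELS: Dict[str, str] = {
    "dc:title": "Title",
    "dc:description": "Description",
    "dc:publisher": "Registration Authority",
    "dc:language": "Language",
    "dc:relation": "Relation",
}


def render_display(statements: List[Tuple[str, str]]) -> List[str]:
    """Turn (property, value) pairs into labelled display lines."""
    lines = []
    for prop, value in statements:
        label = DISPLAY_LABELS.get(prop)
        if label is None:
            # Assumption: properties without a display label are stripped.
            continue
        lines.append(f"{label}: {value}")
    return lines


schema_statements = [
    ("dc:title", "The Dublin Core Element Set v1.1"),
    ("dc:publisher", "The Dublin Core Metadata Initiative"),
    ("rdf:type", "http://www.w3.org/2000/01/rdf-schema#Class"),
    ("dc:language", "English"),
]
display_lines = render_display(schema_statements)
```

A multilingual interface would need one such table per display language, so that the replacement terminology is available in the user's language of preference.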

Example: Full display of information about a term

  • Name - The label assigned to the term
  • Identifier - The unique identifier assigned to the term
  • Registration Authority - The entity authorised to register the data element (i.e. DCMI)
  • Definition - A statement that clearly represents the concept and essential nature of the data element
  • Obligation - Indicates if the data element is required to always or sometimes be present (contain a value)
  • Datatype - Indicates the type of data that can be represented in the value of the data element
  • Maximum Occurrence - Indicates any limit to the repeatability of the data element
  • Comment - A remark concerning the application of the data element
  • Relation - Indicate whether element has qualifiers (sub-properties) or to which element(s) a qualifier relates (LINK to appropriate terms)
  • Category - whether term is element, element refinement, encoding scheme, term from controlled vocabulary

Summary display of information about a term:

  • Name - The label assigned to the term (LINK to detailed information)
  • Identifier - The unique identifier assigned to the term
  • Registration Authority - The entity authorised to register the data element (i.e. DCMI)
  • Relation - Indicate whether element has qualifiers (sub-properties) or to which element(s) a qualifier relates
  • Category - whether term is element, element refinement, encoding scheme, term from controlled vocabulary

4.3 RDF 'expert interface' to registry

Implementors of systems using RDF/XML encoding may wish to see

  • Examples of RDF/XML encoding of terms and relationship between terms as would be required in instance metadata
  • Output of parts or whole RDFS schema relating to particular categories of terms
  • Output of parts or whole XML schema relating to particular categories of terms
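As an illustration of the kind of RDFS output an RDF expert might request, the following sketch generates an RDF/XML fragment declaring one element refinement and the element it refines. It uses only Python's standard XML library. The function name is hypothetical, and the exact shape of the DCMI schemas is not confirmed by this document, so the output should be read as an example of the encoding style only.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Use the conventional prefixes in the serialised output.
ET.register_namespace("rdf", RDF)
ET.register_namespace("rdfs", RDFS)


def property_schema(term_uri: str, label: str, refines_uri: str) -> str:
    """Serialise one property and its rdfs:subPropertyOf link as RDF/XML."""
    root = ET.Element(f"{{{RDF}}}RDF")
    prop = ET.SubElement(
        root, f"{{{RDF}}}Property", {f"{{{RDF}}}about": term_uri}
    )
    ET.SubElement(prop, f"{{{RDFS}}}label").text = label
    ET.SubElement(
        prop, f"{{{RDFS}}}subPropertyOf", {f"{{{RDF}}}resource": refines_uri}
    )
    return ET.tostring(root, encoding="unicode")


# Alternative as a refinement of Title, using the namespaces listed
# in section 2.
xml_text = property_schema(
    "http://purl.org/dc/terms/alternative",
    "Alternative",
    "http://purl.org/dc/elements/1.1/title",
)
```

The same function could be called once per term to output the part of a schema relating to a particular category of terms.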

5. Phase 2 requirements:

5.1 The following requirement was originally recommended for implementation

  • To enable DC elements and qualifiers to be annotated with a status such as 'proposed', 'recommended', 'deprecated' (these 'status' terms to be provided by the usage committee)
  • To enable implementors to submit proposed extensions and application profiles

However my understanding is that the Usage Board does not want implementors to register 'proposed' terms nor proposed application profiles within the registry. Also the Usage Board sees no immediate requirement for 'deprecated' status. This means this requirement can be shelved.

5.2 Other requirements to be phased in, priority as indicated:

High priority:

  • To provide access to DCMI approved domain specific 'application profiles' e.g. the DCMI Education group application profile

Priority to be established:

  • To register authoritative mappings and crosswalks between DC and other metadata sets (e.g. ONIX, MARC etc.)
  • To provide information on deployment (e.g. which services are using particular domain specific extensions)
  • To provide links to best practice, guidelines for use (perhaps link into the user guide?)

5.3 Machine readable access to registry, use by software application

?????????????????

Use by metadata editors, metadata conversion tools, for infusing schemas into other applications etc etc

6. Constraints/Assumptions

  • We will use RDF schema language in the first instance as this is supported in the prototype software.
  • We have been advised by the usage committee that there will be no requirement for versioning DCMI terms.
  • The Working Group on Dublin Core in Multiple Languages takes the position that the reference language of the international Dublin Core community is English inasmuch as all of its outputs are discussed and approved in English [5]. Accordingly, the English version of the Dublin Core has a special status as the agreed result of an international process.

7. Dependencies

In order to express DC elements and qualifiers in RDFS there needs to be a decision on the namespace model for DCMI; in particular we would like to use the correct URIs within linked schemas. We are working with draft schemas at present.

Development work on the DCMI Registry software is currently undertaken at OCLC. Work on multilingual aspects is also being taken forward at ULIS, Japan.

Definitions of terms in this document

Name : Single or multi word designation assigned to a term. Serves as human readable label, can change within an implementation or in multilingual context.

Identifier: A language independent unique identifier of a term, a machine readable 'tag', cannot change.

DCMI term : A DCMI term is a metadata element or qualifier defined in a DCMI recommendation. Each DCMI term is identified by a Uniform Resource Identifier (URI) within a DCMI namespace.

DCMI namespace : A means of uniquely identifying a DCMI term.  Each DCMI namespace is identified by a URI.

DCMI recommendation : A DCMI recommendation is a human-readable document that defines one or more DCMI terms.

Application profile : A schema describing a set of terms (terms already identified by a unique namespace which may or may not be a DCMI namespace). These terms will be selected from already existing schemas as optimal for use within a particular implementation or domain.

References

[1] Dublin Core Metadata Element Set, Version 1.1: Reference Description
http://www.dublincore.org/documents/dces/

[2] Dublin Core Qualifiers
http://www.dublincore.org/documents/dcmes-qualifiers/

[3] ISO/IEC 11179-1 Specification and standardization of data elements. Parts 1-6
http://www.sdct.itl.nist.gov/~ftp/x3l8/other/coalition/Ovr11179.html

[4] Controlled vocabularies:
DCMI Box Encoding Scheme: Specification of the spatial limits of a place, and methods for encoding this in a text string.
http://dublincore.org/documents/2000/07/28/dcmi-box/

[5] Plans for a distributed registry of Dublin Core in multiple languages. Thomas Baker. 1998-10-28
http://www.dublincore.org/documents/1998/10/28/distributed-registry/

Other relevant documents:

DCMI Period Encoding Scheme: specification of the limits of a time interval, and methods for encoding this in a text string.
http://dublincore.org/documents/2000/07/28/dcmi-period/

DCMI Point Encoding Scheme: a point location in space, and methods for encoding this in a text string
http://dublincore.org/documents/2000/07/28/dcmi-point/

[x] Draft schemas:

http://homes.ukoln.ac.uk/~lisap/dcmi-ns/dcmes-rdfs.xml

http://dublincore.org/2000/03/13/

http://purl.org/dc/elements/1.1/

http://purl.org/dc/terms/

http://purl.org/dc/dcmitype/

http://dublincore.org/documents/2000/07/11/dcmi-type-vocabulary/