Names in Dublin Core

Creators: Diane Hillmann
Date Issued: 1998-10-27
This Version: http://dublincore.org/documents/1998/10/27/names-in-dc/
Latest Version: https://www.dublincore.org/specifications/dublin-core/names-in-dc/
Replaces:
  • https://www.dublincore.org/specifications/dublin-core/names-in-dc/1998-10-27/
Status: note
Description: There has seemed to be an irresistible inclination to incorporate into the Dublin Core elements information associated with names that might be useful to users of metadata records. I would like to propose that we adopt instead the general strategy for names that has worked well in libraries for many decades.

    There has seemed to be an irresistible inclination to incorporate into the Dublin Core elements information associated with names that might be useful to users of metadata records. I would like to propose that we adopt instead the general strategy for names that has worked well in libraries for many decades. Libraries store only authorized forms of names in metadata records for library resources--any other information on the name resides in a separate record for the name itself. This avoids just the sort of muddle that we have been struggling with in the DC data model, whereby we find ourselves endlessly arguing about whether a subelement modifies a "real" resource or a name (which may also be a resource in its own right but is nonetheless primarily a name in the context of the "real" resource). Thankfully, RDF supports this kind of structure, so we need not invent some additional box to accomplish our goal.

    There are ways to accomplish this "disintegration" of name information from resource information within the context of RDF. Ideally, we can link to outside resources such as VCard or LCNAF and avoid altogether the overhead of maintaining name information within our DC data. Alternatively, we can embed name information in our DC records using qualifiers from other namespaces specializing in names, and configure our searches to recognize those namespace conventions. Using these methods, we take advantage of work done by others, do not succumb to the temptation to reinvent the wheel, and allow the option of picking and choosing amongst the variety of name sources available, depending on our needs. The disadvantage to the linking option is that there are not yet clear paths to accomplish the task, but the option to embed may assist us in making that transition.

    Adopting such a strategy assists us in several ways:

    • using other existing standards can provide us with functionality which we cannot easily replicate (alternate forms of names, contact information)
    • RDF allows us to link to the most relevant kind of name record, whether it be VCard or LCNAF (or something else), each of which has its strengths and weaknesses for certain kinds of metadata
    • linking, rather than reinventing, allows us the luxury of not having to maintain repetitive and volatile data over time
    • by always providing an RDF:value string, we can also accommodate dumb applications, with little overhead for DC
    • providing both a link and embedding selected data from a record might allow providers a way to make a transition from text based searching to full use of linked information.

    If we were to adopt such a structural model, we could potentially use it for all kinds of situations where a need for an "authority" record could be envisioned: names of persons or organizations, and events (whether named conferences, currently also handled in LCNAF, or other kinds, such as performances). One could also envision such an approach for geographic names (where coverage data could be stored once and referred to as needed), or subjects, where access to classification numbers as well as subject strings and alternate terms might be desirable.

    The Personal/Corporate conundrum

    One area relating to names that seems to be a continuing source of problems is that of identifying the relevant category of the name, be it personal, corporate, or some other category. Some possible sources of name information include this information routinely, either as part of the internal coding (LCNAF) or as fielded data (possibly VCard). Where this information is desired but not available as part of the chosen namespace, a provider has the option of including it as part of a domain-specific namespace, or of seeking an external name resource that includes the desired categorization.

    Doing it good in RDF

    In a paper written for the Data Model group (colloquially known as the "Book of Charles"), sect. 3.4, "XML Namespace," Charles Wickstead suggests that "Users of Dublin Core should not use the DCQ namespace for property types that are not defined in this document. Such extensions should use a namespace which is associated with the person or organization defining the extension, even if they are for use with a Dublin Core element." This seems exactly right to me, and should help us avoid the muddles that we seem to step in regularly in our discussions.

    Charles offers the following example of this approach:

    [Resource] -----DC:Creator-----> [#node001]

    [#node001] --+--RDF:Value----> [#node002]
                 +--XX:Creator.Importance-> "minor"

    [#node002] --+--VC:FN---------> "Mr. John Q. Public, Esq."
                 +--VC:N-----------> "Public;John;Quinlan;Mr.;Esq."
                 +--VC:Email------> "jqpublic@xyz.doml.com"
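
    To make the node-and-arc picture concrete, here is a minimal sketch of how an application might walk from the resource to the name string. The triple list, the `objects` helper, and the `creator_names` function are illustrative assumptions of mine, not part of any DC or RDF specification; the property names are copied from the diagram.

```python
# Hypothetical in-memory triples: (subject, predicate, object).
# The values mirror the diagram; the storage layout is an assumption.
TRIPLES = [
    ("Resource", "DC:Creator", "#node001"),
    ("#node001", "RDF:Value", "#node002"),
    ("#node001", "XX:Creator.Importance", "minor"),
    ("#node002", "VC:FN", "Mr. John Q. Public, Esq."),
    ("#node002", "VC:N", "Public;John;Quinlan;Mr.;Esq."),
    ("#node002", "VC:Email", "jqpublic@xyz.doml.com"),
]


def objects(triples, subject, predicate):
    """Return every object attached to `subject` by `predicate`, in order."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def creator_names(triples, resource):
    """Follow DC:Creator -> RDF:Value -> VC:FN and collect the name strings."""
    names = []
    for creator in objects(triples, resource, "DC:Creator"):
        for value in objects(triples, creator, "RDF:Value"):
            names.extend(objects(triples, value, "VC:FN"))
    return names


print(creator_names(TRIPLES, "Resource"))
```

    The qualifier on #node001 (XX:Creator.Importance) is simply never visited on this path, which is the sense in which the name information stays separate from statements about the resource.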

    An equivalent example, using LCNAF, might look like this:

    [Resource] -----DC:Creator-----> [#node001]

    [#node001] --+--RDF:Value----> [#node002]
                 +--XX:Creator.Importance-> "minor"

    [#node002] --+--LCNAF:100---------> "Public, John Q. (John Quinlan), 1933-"
                 +--LCNAF:400---------> "Public, John Quinlan, 1933-"
                 +--LCNAF:400---------> "A Disgruntled Voter"
                 +--LCNAF:010---------> "n 89099111"

    Note that by adopting the USMARC coding conventions (100 = authorized form, personal name), information on form and name category ("Personal") is retained.
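
    As a rough sketch of how an application could recover form and category from the tag alone, the following assumes the usual reading of USMARC authority tags: a leading 1 marks the authorized heading, a leading 4 a variant ("see from") form, and the last two digits (00, 10, 11) mark personal, corporate, and meeting names. The function and table names are my own, and only the tags that appear in the example are exercised.

```python
# Assumed reading of USMARC authority tags (not a complete tag table):
# first digit = form of the heading, last two digits = name category.
FORM_BY_FIRST_DIGIT = {"1": "authorized", "4": "variant"}
CATEGORY_BY_SUFFIX = {"00": "personal", "10": "corporate", "11": "meeting"}

# The #node002 properties from the example, keyed by their numeric tag.
RECORD = [
    ("100", "Public, John Q. (John Quinlan), 1933-"),
    ("400", "Public, John Quinlan, 1933-"),
    ("400", "A Disgruntled Voter"),
    ("010", "n 89099111"),
]


def classify(tag):
    """Return (form, category) for a name-heading tag, else None."""
    form = FORM_BY_FIRST_DIGIT.get(tag[0])
    category = CATEGORY_BY_SUFFIX.get(tag[1:])
    if form is None or category is None:
        return None
    return form, category


def summarize(record):
    """Split a record into its authorized form, variants, and category."""
    summary = {"authorized": None, "variants": [], "category": None}
    for tag, value in record:
        kind = classify(tag)
        if kind is None:
            continue  # e.g. 010, the control number, is not a heading
        form, category = kind
        if form == "authorized":
            summary["authorized"] = value
            summary["category"] = category
        else:
            summary["variants"].append(value)
    return summary


print(summarize(RECORD)["category"])
```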

    Presumably, one could also do the following:

    [Resource] -----DC:Creator-----> [#node001]

    [#node001] --+--RDF:Value----> [#node002]
                 +--XX:Creator.Importance-> "minor"

    [#node002] --+--LCNAF:100---------> "Public, John Q. (John Quinlan), 1933-"
                 +--LCNAF:URL---------> "http://www.loc.gov/naf/n 89099111"

    [NOTE: no doubt there's a better way to link directly through the URL, but I don't know how to do it properly--my larger point is that there exists in this example both a direct link to an LCNAF record *and* a text string.]
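
    One way to read "both a direct link and a text string" is as a fallback rule: an application that can follow the link uses the linked record, and one that cannot uses the embedded string. The sketch below assumes a caller-supplied `fetch_record` function that returns a dictionary for a URL or None; that function, and `display_name` itself, are hypothetical and stand in for whatever lookup mechanism eventually exists.

```python
# The #node002 properties from the example: a text string plus a link.
NODE = {
    "LCNAF:100": "Public, John Q. (John Quinlan), 1933-",
    "LCNAF:URL": "http://www.loc.gov/naf/n 89099111",
}


def display_name(node, fetch_record):
    """Prefer the linked record's heading; fall back to the embedded string.

    `fetch_record` is a hypothetical resolver: it takes a URL and returns
    a dict of properties, or None when the link cannot be followed.
    """
    url = node.get("LCNAF:URL")
    record = fetch_record(url) if url else None
    if record and "LCNAF:100" in record:
        return record["LCNAF:100"]
    return node["LCNAF:100"]


def no_network(url):
    """Stand-in resolver for an application that cannot follow links."""
    return None


print(display_name(NODE, no_network))
```

    With a resolver that succeeds, the same call would return whatever heading the linked record currently carries, so the embedded string only has to be good enough for text-based searching.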

    Some questions arise:

    • How would this work within the current thinking on the "dumb down rule?" If we use the explicit coding for the namespace (desirable if we wish to retain the functionality), we may lose the clear RDF:value path in the process.

    • Do we care about the fact that some names will be in the form "Surname, Forename" and others in direct order? Clearly the different forms of name in the two illustrated "recommended" options are not particularly compatible, though a sophisticated application might be able to relate them effectively. Other name sources may follow one convention or the other. (NOTE: the guidelines for simple DC suggest the "Surname, Forename" order--do we want to continue to recommend that?)
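
    On the second question, a small sketch shows how far a simple application could get in relating the two forms. It assumes the VCard N value is five semicolon-separated parts (family, given, additional, prefix, suffix), as in the example above, and it handles only plain "Surname, Forename" strings; LCNAF headings with dates or parenthetical qualifiers would need more work than this. Both function names are my own.

```python
def vcard_n_to_inverted(n_value):
    """Turn 'Family;Given;Additional;Prefix;Suffix' into 'Family, Given Additional'.

    Prefix and suffix are dropped; missing trailing parts are treated as empty.
    """
    parts = (n_value.split(";") + [""] * 5)[:5]
    family, given, additional, _prefix, _suffix = (p.strip() for p in parts)
    forenames = " ".join(p for p in (given, additional) if p)
    if family and forenames:
        return f"{family}, {forenames}"
    return family or forenames


def inverted_to_direct(name):
    """Turn a plain 'Surname, Forename' string into 'Forename Surname'.

    Strings without a comma are returned unchanged (apart from trimming).
    """
    surname, comma, rest = name.partition(",")
    if not comma:
        return name.strip()
    return f"{rest.strip()} {surname.strip()}"


inverted = vcard_n_to_inverted("Public;John;Quinlan;Mr.;Esq.")
print(inverted)
print(inverted_to_direct(inverted))
```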

    An "Authorized" Approach to Subjects

    A similar approach might be used for subject terms and classification systems. Particularly in the case of classifications, where numeric or alphanumeric strings may provide useful entry for browsing but not necessarily be the search term of choice, having access to both via a structured link could be very helpful.

    The Book of Charles discusses subjects and subject schemes, using LCSH as the example scheme:

    [Resource] -----DC:Subject-----> [#node001]

    [#node001] --+--RDF:Value----> "Cookies"
                 +--DCQ:Scheme--> "LCSH"

    One might also accomplish the same thing thusly:

    [Resource] -----DC:Subject-----> [#node001]

    [#node001] --+--RDF:Value----> [#node002]
                 +--DCQ:Scheme--> "LCSH"

    [#node002] --+--LCSH:150---------> "Cookies"
                 +--LCSH:URL--------> "http://www.loc.gov/lcsh/sh 82556900"

    This particular structure could make possible a better way to link to DDC, for which the classification number and the caption string may be equally weighted:

    [Resource] -----DC:Subject-----> [#node001]

    [#node001] --+--RDF:Value----> [#node002]
                 +--DCQ:Scheme--> "DDC"

    [#node002] --+--DDC:153a---------> "306.36"
                 +--DDC:153j----------> "Systems of labor"
                 +--DDC:URL-----------> "http://www.loc.gov/ddc/2348766"

    [NOTE: The Book of Charles suggests in section 3.3 "Degrading to Unqualified Dublin Core," that the first example would degrade to the unqualified version thusly:

    [Resource]----->DC:Subject------> "LCSH Cookies"

    I disagree strongly with this interpretation. In my view, it should be degraded as:

    [Resource]----->DC:Subject------> "Cookies"

    Just as Type A qualifiers are not included as part of a text string for "dumbed down" browsing, neither should Type B qualifiers be included in similar situations.]
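
    The difference between the two readings of the dumb-down rule is small enough to show side by side. The sketch below models the first subject example as a dictionary; the two function names are mine, and neither is offered as the official degradation procedure.

```python
# The first LCSH example: a value string plus a scheme qualifier.
QUALIFIED_SUBJECT = {"RDF:Value": "Cookies", "DCQ:Scheme": "LCSH"}


def degrade_with_scheme(node):
    """Reading attributed to the Book of Charles: prefix the scheme name."""
    return f'{node["DCQ:Scheme"]} {node["RDF:Value"]}'


def degrade_value_only(node):
    """Reading argued for here: keep the value string, drop the qualifier."""
    return node["RDF:Value"]


print(degrade_with_scheme(QUALIFIED_SUBJECT))
print(degrade_value_only(QUALIFIED_SUBJECT))
```

    A text search for "Cookies" matches either result, but only the second leaves a string that a browse list can sort and display without stripping a scheme label first.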

    An "Authorized" Approach to Geographic Names

    Using the same linking mechanism might well provide some functionality for users desiring methods for accessing GIS data via Dublin Core. Since Coverage information is currently one of the real headaches for qualified DC, it might be helpful to consider how linking to external GIS systems might take some of the burden from the DC namespace.

    Some possible geographic name systems are the Getty Thesaurus of Geographic Names (TGN) and the USGS Geographic Names Information System (GNIS, for US names). Both of these supply latitude and longitude, variant names, and categories (which seem not to be standardized as yet).

    An example from the Getty TGN:

    [Resource] -----DC:Coverage-----> [#node001]

    [#node001] --+--RDF:Value----> [#node002]
                 +--XX:Coverage--> "spatial"

    [#node002] --+--TGN:Place---------> "Tallinn"
                 +--TGN:Lat------------> "59 26 N"
                 +--TGN:Long---------> "024 43 E"
                 +--TGN:PlaceType---> "inhabited place"
                 +--TGN:PlaceType---> "city"
                 +--TGN:PlaceType---> "national capital"
                 +--URL-->http://www.ahip.getty.edu/tgn_browser/file=7006629

    And from USGS/GNIS:

    [Resource] -----DC:Coverage-----> [#node001]

    [#node001] --+--RDF:Value----> [#node002]
                 +--XX:Coverage--> "spatial"

    [#node002] --+--GNIS:Place---------> "Trenton"
                 +--GNIS:Lat------------> "401301N"
                 +--GNIS:Long---------> "0744436W"
                 +--GNIS:State---------> "New Jersey"
                 +--GNIS:FeatureType---> "populated place"
                 +--GNIS:NameVar---> "Trents Town"
                 +--URL--->http://mapping.usgs.gov:8888/gnis/owa/id=884540
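
    The two sources write coordinates differently, so an application drawing on both would need to normalize them. The sketch below infers the formats from the two examples only: TGN as "degrees minutes hemisphere" separated by spaces, and GNIS as packed degrees, minutes, and seconds followed by a hemisphere letter. Those format assumptions and both function names are mine, not taken from either system's documentation.

```python
import re


def tgn_to_decimal(value):
    """Convert a 'DD MM H' string such as '59 26 N' to signed decimal degrees."""
    degrees, minutes, hemisphere = value.split()
    decimal = int(degrees) + int(minutes) / 60
    return -decimal if hemisphere in ("S", "W") else decimal


def gnis_to_decimal(value):
    """Convert packed 'DDMMSSH' or 'DDDMMSSH' (e.g. '401301N') to decimal degrees."""
    match = re.fullmatch(r"(\d{2,3})(\d{2})(\d{2})([NSEW])", value)
    if match is None:
        raise ValueError(f"unrecognized coordinate: {value!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    decimal = int(degrees) + int(minutes) / 60 + int(seconds) / 3600
    return -decimal if hemisphere in ("S", "W") else decimal


# Tallinn (TGN) and Trenton (GNIS), using the values from the examples.
print(round(tgn_to_decimal("59 26 N"), 4), round(tgn_to_decimal("024 43 E"), 4))
print(round(gnis_to_decimal("401301N"), 4), round(gnis_to_decimal("0744436W"), 4))
```

    Southern and western hemispheres come out negative, which is the convention most mapping software expects.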