DCMI Citations Working Group

Bath Citation Meeting Comments

Dublin Core Metadata Initiative - Citation Working Group

16 May 2001

Editor: Ann Apps <ann.apps@man.ac.uk>

Status of this document: Working Draft

This document includes comments, for clarification, on the minutes of the meeting held at The University of Bath, UK, on 15 March 2001 to discuss a proposal for capturing journal article citation information in Dublin Core and a way forward for this proposal. This document also includes a proposed structure for a proposal paper, and the expected process required to progress this proposal to become a DCMI recommendation.

Comments and feedback should be sent to the working group mailing list, <dc-citation@jiscmail.ac.uk>, the archives for which may be browsed at <http://jiscmail.ac.uk/lists/dc-citation.html> (NOTE: you must be a member of the WG to post messages to the WG) or, alternatively, send your feedback to the Editor of this Working Draft.


Citation Meeting, University of Bath, 15 March 2001

A meeting of a small group of interested people was held at The University of Bath, UK (hosted by UKOLN) on 15 March 2001 to discuss a proposal for capturing journal article citation information in Dublin Core and a way forward for this proposal. The minutes of this meeting may be read at http://www.ukoln.ac.uk/metadata/resources/dc/dc-citation-wg/meetings/2001-03-15.html.

Comments on Minutes

Following are some comments on the meeting minutes for clarification, resulting from some email discussion following the meeting. Note that the minutes record the general, and fairly informal, discussion at the meeting. Not all the suggestions made during the meeting are expected to be taken forward into a proposal for a DC recommendation. The suggested proposal from the meeting is in the following section.

  • Section 1, first bullet point. "volume, issue and part" should read "volume, issue and page(s)".
  • Section 1, first bullet point. "optional and repeatable". Although in DC every element is optional and repeatable, that is not necessarily the case for all of the dc-citation set. Making everything optional is probably OK in theory, on the grounds that in practice people will put in whatever's necessary. Possibly the only repeatable elements in this list are "journal abbreviated title" and "journal issue number". Issue/part/number, or whatever it's called for a particular journal, can be repeatable, e.g. Vol 6, Issue 2, Part 5, with many variations on the theme. Probably anything is optional, though it's difficult to see a journal article citation without a journal title/identifier. Maybe the report needs to say something about mandatory items - a citation profile?
  • Table: (i) Number (journalIssueNumber) maps to "issue" and "part" in OpenURL, not "volume?". We viewed 'issue' as a generic term to cover 'part', 'number', etc., i.e. the concept of part of a volume rather than the specific name.
  • Table: (ii) The identifier for an issue is not "an article level SICI"; it's an issue level SICI (i.e. a SICI with an empty Contribution Segment).
  • Table: (addition) Chronology (journalIssueDate) also maps to OpenURL 'quarter'.
  • Discussion following table: (1st bullet point) Can JournalTitleAbbreviated be "seen as a kind of identifier" in any DC sense? It certainly doesn't identify an article, and it doesn't *identify* a journal - it is a variation on its *title*. *Together with* the other dc-citation elements the abbreviated title can help to identify an article (although it's only there for its use in linking), but it is not in itself an identifier, any more than the other dc-citation elements taken individually. We assumed that 'journal abbreviated title' would identify the journal if that is the only journal-level element present. In reality it probably doesn't, but it is generally the abbreviated title which appears in references at the end of articles.
  • Discussion following table: (4th bullet point) Although it is true that most journals do not require the issue number if you have volume and page, there are journals which start each issue's pagination at "1", and for these the issue number is essential. This was looking at having separate metadata for each issue, as opposed to volume. It reflects the discussion at the time, rather than any recommendation.
  • Discussion following table: (5th bullet point) It is suggested that we make a distinction between the date that's used in JournalIssueDate and the date used in dc:date at the article level. JournalIssueDate specifically equals Chronology, in the SICI sense of the date that goes on the cover (including seasonal and quarter dates). This may not be appropriate for article date, where DCQ values are more likely to be dc:date.created and dc:date.issued, where "issued" means the true publication date, not the cover date. (This is particularly important with the trend towards article-by-article publishing before issue publication. The issue is increasingly being seen as just a convenient print archive - the true publication date of any article is the date it's posted on the Web, which may be months before the issue is printed. So, you might have an article with a dc:date.issued of 1 May 2001 and a JournalIssueDate of August 2001, for example.)
  • Discussion following table: (6th bullet point) "(logically part of DC.Format)" should be removed.
  • Section 2 - 1st para, 2nd sentence. The statement that "the one-to-one principle would mean that it would not be acceptable to encode all of this [citation] information in a single DC record for an article." is misleading. The whole point of dc-citation was precisely to say how you *could* do this. That's one of the reasons why we ended up recommending dc:identifier - we were basically saying that a citation/bibliographic record *is* a unique identifier for an article; as an article's identifier, it belongs perfectly well in that article's DC record - no one-to-one principle is broken. This is just an abbreviated bit of the discussion. We discussed the one-to-one principle because this is one of the objections the purists have to encoding citation information in the article's metadata. They would argue for hierarchical metadata, with separate metadata for journal, volume (or issue) and article with pointers upward. Some of this relates to the 'keep DC simple' view. I don't think anyone in the meeting really shared this view - we were just trying to answer expected criticism. In fact, I think that the discoverer of an article's metadata wants to know all about the article in one chunk, and not have to link somewhere else to find some of it out (though it is conceivable that intermediate software could do this for them).
  • Section 2, 1st bullet point. dc:relation is not "a type of identifier". The previous dc-citation WG recommended that dc:relation be used to relate the article to its "container" (such as the issue); the value for the element could be an identifier (such as an issue SICI). This sentence isn't clear. We didn't suggest changing the recommendation for using dc:relation to hold SICI or ISSN.
  • The previous dc-citation WG didn't propose eight elements for an article's DC citation. Of the eight in the table, they didn't propose journalIdentifier or journalIssueIdentifier - these belong in the metadata for the journal and issue respectively, but not in the dc-citation record for the article. These 8 elements reflect the discussion at the meeting. We were just identifying all the information possible about journals, issues and volumes. We weren't going to carry all of these forward into a recommendation.
  • Defining new elements - DC.Host. This was more discussion at the meeting and represents an alternative approach. I don't think DC.Host was suggested though - more likely citation-specific elements (eg. CITE.Host). The term 'host' is used by some to refer to a containing journal, etc - see Elsevier's (I think it's theirs) journal article DTD for capturing references.
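
The OpenURL mappings noted in the table comments above can be sketched in Python as follows. This is only an illustrative sketch: the keys "issue", "part" and "quarter" come from the comments above, while the "volume" key, the function name, its parameters and the sample values are assumptions made for the example and are not part of any recommendation.

```python
from urllib.parse import urlencode


def citation_to_openurl_query(journal_volume=None, issue_numbers=(), issue_quarter=None):
    """Build an OpenURL-style query string from dc-citation values.

    issue_numbers is a sequence of (key, value) pairs, where key is
    "issue" or "part": journalIssueNumber is repeatable and maps to
    either OpenURL key. The "volume" key is an assumption.
    """
    pairs = []
    if journal_volume is not None:
        pairs.append(("volume", journal_volume))  # assumed key name
    for key, value in issue_numbers:
        if key not in ("issue", "part"):
            raise ValueError("journalIssueNumber maps to 'issue' or 'part' only")
        pairs.append((key, value))
    if issue_quarter is not None:
        # journalIssueDate (Chronology) may also map to 'quarter'
        pairs.append(("quarter", issue_quarter))
    return urlencode(pairs)


# Vol 6, Issue 2, Part 5 (the quarter value is made up for illustration)
query = citation_to_openurl_query("6", [("issue", "2"), ("part", "5")], "3")
print(query)
```

Passing the issue numbers as (key, value) pairs keeps the repeatable journalIssueNumber generic, in line with the view that 'issue' covers 'part', 'number', etc.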

Recommendation

Briefly the recommendation from the meeting was:

  • That the 'citation' information should be captured within the dc:identifier element.
  • That it should consist of:
    • one or more of: journalTitle, journalTitleAbbreviated, (and maybe journalIdentifier, eg ISSN if neither of the other 2 are known).
    • journalVolume, journalIssueNumber, journalIssueDate (all optional, journalIssueNumber repeatable) (a SICI, if wanted, should be in dc:relation, isPartOf).
    • pagination
  • All other information about the article, eg its title and authors, can be included in the other DC elements (with maybe some recommendations as to what, where, optional or mandatory, etc)
  • And that the semantics of these elements should be defined. This isn't really much different from the previous dc-citation proposal, except for some names.

Given this recommendation, we would then suggest 3 possible syntaxes to capture these elements, with examples: DCSV, OpenURL, RDF.
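
As a rough sketch of what the DCSV option might look like, the Python below serialises the recommended elements into a single structured string suitable for a dc:identifier value. It assumes the usual DCSV conventions (name=value components separated by semicolons, with backslash escaping of the punctuation characters); the function names, the check on journal-level elements, and the sample journal title and page range are hypothetical, chosen only for illustration.

```python
JOURNAL_LEVEL = ("journalTitle", "journalTitleAbbreviated", "journalIdentifier")


def dcsv_escape(text):
    """Backslash-escape the characters DCSV uses as punctuation."""
    # Escape the backslash first so later escapes are not doubled.
    for ch in ("\\", ";", "="):
        text = text.replace(ch, "\\" + ch)
    return text


def citation_to_dcsv(components):
    """Serialise (name, value) pairs as 'name=value; name=value'.

    components is a list of pairs so that a repeatable element such as
    journalIssueNumber can appear more than once.
    """
    names = [name for name, _ in components]
    if not any(name in JOURNAL_LEVEL for name in names):
        raise ValueError("need at least one journal title or identifier")
    return "; ".join(f"{name}={dcsv_escape(value)}" for name, value in components)


# Hypothetical citation: the title and page range are made up.
citation = citation_to_dcsv([
    ("journalTitle", "Journal of Examples"),
    ("journalVolume", "6"),
    ("journalIssueNumber", "2"),
    ("journalIssueNumber", "5"),
    ("pagination", "1-10"),
])
print(citation)
```

Whether a repeated component name is the best way to express a repeatable journalIssueNumber in DCSV is an open design choice here, not something the meeting decided.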

The difference between this and the previous Citation WG's charter is:

  • we also considered metadata requirements for reference linking (maybe more of a hot topic than it was 2 years ago)
  • we didn't consider the version issue (which was always a completely separate issue).

Proposal Structure

This is an outline draft structure for the proposal. To a large extent the document can utilise the structure and content of the meeting minutes report.

  • 1. Identification of the Problem
    • how to capture a journal article citation in DC
    • both to support citation linking, and to capture the complete bibliographic record of an article.
    • [I don't think we need to say anything about the version issue, just make it clear which problem this proposal attempts to resolve.]
  • 2. Possible Solutions
    • this should cover the discussion at the meeting
    • what information elements need to be captured
    • various options for structuring/grouping the information
    • maybe it should include previous discussions and recommendations from the WG, certainly the final recommendation (but not too much)
    • it should probably include something about the voting at Ottawa (I'll look this up)
  • 3. Proposal
    • our recommendation for which extra fields are needed, and what we called them
    • maybe brief examples would be useful here but they shouldn't degenerate into syntax.
  • 4. Recommended Encodings
    • 4.1 DCSV (including as a structured string)
    • 4.2 OpenURL (the Ariadne paper could be referenced here)
    • 4.3 RDF
    • this section needs some examples, which I think I said I would look out. I'll find some from the previous WG discussion, and maybe some real ones from the n-million zetoc records.

DCMI Process

Following the Bath meeting, Rachel Heery sent the following email to Stu Weibel and Makx Dekkers asking for advice on how to submit and progress a proposal from the group.

We hosted a meeting here at UKOLN yesterday of the group looking at progressing a DCMI recommendation on handling metadata for journal articles. We had a productive meeting; unfortunately Cliff had to cancel at the last moment, but fortunately we had sufficient background and collective memory to proceed! An account of the meeting will be available next week; in the meantime I have been actioned to contact you now and ask for advice on how to proceed.

In the meeting we agreed the outline of a recommendation on best practice for describing journal articles using data elements based on the DC element set. The recommendation closely follows the previous DC Citation WG proposal with some small variance. We would also like to include in the recommendation an explanation of three encoding mechanisms (OpenURL, DC structured value, RDF).

The questions we would like you to answer are:

  • in what form should we write up our recommendation (what 'template' do we follow)?
  • what will be the process for getting the recommendation we produce approved?
  • who do we send it to?

We need reassurance (before spending time writing up) that the process is in place to ensure our recommendation has a good chance of successfully becoming an 'official DC recommendation'! Note that we have limited the scope somewhat from that of the original DC Citation WG, which also considered issues related to version.

Following is the draft recommended process taken from Stu's reply with some additional information from Makx.

Form for the proposal

We do not currently have a template for such proposals, other than basic header material, and Beth Marsh can assist you in pouring your proposal into such a template.

As for the content, what is most important is that the document include:

  • a. Clear identification of the problem
  • b. Outlines possible solutions, including, if appropriate, a history of the discussion and the Pros and Cons of alternatives. It is hard to say how detailed this should be. Concise is better, but not if it leads to re-asking of the objections that have marked previous discussion.
  • c. Proposes one or more solutions, with pros and cons identified.

The more clearly this can be done, of course, the greater the likelihood that objections to the proposal can be resolved quickly. The proposal needs to be concise, but needs to guide an informed reader through the options.

It is my belief that the success of the proposal will depend as much as anything on the clarity of the solution and *why* it is an effective solution. I trust there is confidence that the objections raised in Ottawa are taken into account?

Who to send it to and how will it be approved

The following is a summary of the process for approving such proposals. Makx and I are working on a formal policy document for this issue, and there are details still being worked out, but it will look a bit like this:

  • a. Submit the draft to Working Group Chair (and to Beth Marsh for formatting) and it will be put up on the Citation WG page on the web site. The WG Chair should 'introduce the document' to the dc-citation list, and one would hope, be able to speak in its defense and encourage quick consideration.
  • b. The WG chair makes a judgement that the draft has had sufficient discussion, and the editors may make changes in response to comments (or decide not to do so... in any case, substantive issues raised in the discussion should be addressed or acknowledged in the proposal). Assuming that discussion is relatively brief and there are not major conflicts on substance, this could be as little as a couple of weeks I should think.
  • c. Revised Working Draft becomes a DCMI Working Group Proposal and is forwarded to the DC Directorate by the WG Chair. The Directorate will also forward the Proposal to a review committee for discussion and approval. This review committee may be the Usage Board or an ad-hoc group with potentially people from outside. Current plans are for the Usage Board to meet face to face twice a year to approve proposals. The first meeting will take place in May, the next in Tokyo in October. Given that May is close, and there is a significant agenda for them now, it may be that this proposal would be dealt with in October. It is possible that it could receive consideration in May however. (Note this email was sent at the end of March.)
  • d. If the review committee recommends approval, the document becomes a Proposed Recommendation, is made available for public comment, via dc-general, and the Executive Director makes the final call, taking into account the advice of the Usage Board and any public comment received.
  • These procedures are still being worked out... Makx and I expect to have a draft of the policy in the next week or so, but it will work something like what I have described.