Title:          "Vocabulary Encoding Scheme Registration" issue
Modified:       2004-03-22 09:41, Monday
Maintainer:     Tom Baker
Latest version: http://dublincore.org/usage/meetings/2004/03/ISSUES/registration/
See also:       http://dublincore.org/usage/meetings/2004/03/ISSUES/
Description:    Evolving summary of the VESR issue from a UB point of view.
                Past meeting actions are summarized in Appendix A, followed
                by a Bibliography.
Shepherd:       Traugott Koch

SUMMARY (Tom)

In Bath, we will start by discussing the possible implications of InfoURI
for the very idea of setting up a DCMI registry. The following two
documents have therefore been put into the main meeting packet:

    http://www2.elsevier.co.uk/~tony/info/info.html
    http://dublincore.org/usage/meetings/2004/03/Weibel.InfoURI.Registry.pdf

Only if we conclude that InfoURI will not meet the need we have long
recognized should we proceed to a discussion of the steps to be
undertaken and the processes by which a DCMI registration service would
function. The main sources for that discussion are the "Overview of Next
Steps" (below) and the following two documents in the supplementary
packet:

    http://www.lub.lu.se/~traugott/drafts/vocab-scheme-Jan04.html
    http://www.lub.lu.se/~traugott/drafts/vocab-guide6.html

------------------------------------------------------------------------
OVERVIEW OF NEXT STEPS
------------------------------------------------------------------------

In Seattle, the UB agreed to proceed with the fast-track system to
"register" controlled vocabularies as Vocabulary Encoding Schemes (VESes)
(see Appendix point A.8). This entails:

1. Finalize the Web tool
    1.1 Work with Harry to improve interface and functionality
2.
Put into place the necessary documentation
    2.1 Update "Guidelines for Registration"
    2.2 DCMI namespace policy
        2.2.1 Modify policy, adding "http://purl.org/dc/schemes"
        2.2.2 Duplication of legacy URIs in the new namespace
    2.3 Process for merging of "registry" output into raw UB data
    2.4 Formulate and document a "good-neighbor" policy
    2.5 Reflect the "good-neighbor" policy in documentation and schemas
    2.6 Set up and clarify use of JISCMAIL archive for audit trail
    2.7 Clarify criteria and processes for vetting proposals
3. Approve an initial set of encoding schemes
4. Manage the above tasks and represent the project to the public
    4.1 Create a one-stop Web page for the Registration project
    4.2 Clarify who will actually do what
    4.3 Assume overall project-management responsibility
5. Plan a workshop (out of UB scope per se)

NEXT STEPS IN DETAIL

1. Finalize the Web tool

1.1 TASK (Traugott): As of 2004-01-04, Traugott has updated his summary
of development work needed on the Web-based registration tool and will
follow up with Harry and Stu to complete the work, hopefully in
January-February. See
http://www.lub.lu.se/~traugott/drafts/vocab-scheme-Jan04.html and
http://wip.dublincore.org/schemes/index.html. (See also Appendix points
A.6 and A.14.)

2. Put into place the necessary documentation

2.1 "Guidelines for registration of Vocabulary Encoding Schemes"

TASK (Traugott): As of 2004-01-10, Traugott has updated [GUIDELINES] and
will post a draft to DC-USAGE for comment. See
http://www.lub.lu.se/~traugott/drafts/vocab-guide6.html.

NOTE: We should look closely at the guidelines it provides on forming and
proposing a "Name" for a vocabulary -- for example, appending a language
suffix such as "-fr". I believe there are some unresolved issues here
with regard to the use of "date-stamped" URIs and to reflecting the
version numbers of vocabularies in their URIs.
After talking with Traugott, I believe this was the intention behind
Seattle Action Item 13, for Traugott to draft "a document setting out
guidelines for the creation of URIs for encoding schemes" (see A.12
below).

2.2 DCMI Namespace Policy

According to the policy as it currently stands, all new Encoding Schemes
go into the http://purl.org/dc/terms/ namespace. However, now that we
expect many new encoding schemes to be created rather quickly under a
new fast-track procedure, we need a new namespace for encoding schemes
(i.e., http://purl.org/dc/schemes/). This change to the namespace policy
is on the critical path: it must be in place not only before we go into
production to register encoding schemes, but even before we can finalize
the related texts and guidelines. This breaks down into the following
tasks:

2.2.1 TASK (who?): Modify the Namespace Policy [NAMESPACE-POLICY]. This
involves the following steps:

    -- add a name for the new namespace;
    -- clarify whether any aspect of the existing policy needs to be
       modified with respect to encoding schemes that are approved
       according to the fast-track procedure;
    -- shepherd the draft through online DC-USAGE discussion;
    -- liaise with Makx for Directorate approval;
    -- present the results (hopefully completed?) in Bath.

2.2.2 TASK (who?): Duplication of legacy URIs under
"http://purl.org/dc/terms" in the new namespace
"http://purl.org/dc/schemes".
This decision (and its implications) needs to be implemented and
documented:

    -- gather from the Directorate, Tom, Roland, and mailing lists any
       existing documentation of past discussions and edit it into a
       one-page clarification;
    -- include documentation of how the equivalence would be declared in
       the formal RDF term declarations;
    -- liaise with Tom (for the raw UB data in XML) and Harry (for the
       generated RDF schemas and Web pages) about declaring and
       documenting the equivalence;
    -- after consulting with the Directorate on process, shepherd the
       one-pager through list discussion, approval, and posting on the
       Web, ideally in time for the Bath meeting.

2.3 TASK (Tom): Merging of "registry" output into raw UB data (see also
A.7).

According to our current model, information about VESes will be recorded
in two places:

    -- in the back-end database of the Web tool. This database will
       include administrative information -- e.g., who submitted a
       proposal and when. The database will periodically output a
       listing of new VESes in a form that can _automatically_ be merged
       into the raw UB data (currently in XML).

       NOTE: This functionality is on the critical path to using the Web
       tool. If we cannot automatically merge data from the Web tool
       into the system of XSLT scripts currently used to generate
       updated Web pages and RDF schemas of the DCMI terms, I do not
       currently see a way to manage updates to our documentation with a
       reasonable and sustainable level of effort. If we cannot automate
       the workflow from registration through to final publication so as
       to sustain a reasonable and efficient throughput, we should not
       embark on this adventure to begin with.

    -- in the formal RDF term declarations and related Web pages
       generated from the raw UB data (which, in turn, is automatically
       generated from the database above).
This entails the following:

    -- liaison with Traugott to clarify whether any attributes need to
       be exported beyond those already used to describe existing
       encoding schemes;
    -- decide in discussion on DC-USAGE or in Bath at what frequency
       terms documents should be updated to show new VESes (this entails
       liaison with maintainers of the DCMI Registry about the
       availability of new terms in the registry database);
    -- liaise with Harry and the Web Team to clarify how often and by
       what workflow descriptions of VESes will be exported from the
       Web-tool database and incorporated into the raw UB term data;
    -- verify that the workflow is completed and functions as intended
       for generating updated term documentation;
    -- update the "schema" of attributes used to describe encoding
       schemes.

2.4 TASK (who???): formulate a policy for pointing to non-DCMI URIs
created for vocabularies to which DCMI URIs have already been assigned
(sometimes called a "good-neighbor" policy) (see also A.11). This
entails the following:

    -- describing a "DCMI philosophy" (or etiquette) for pointing to
       non-DCMI URIs (e.g., do we "recommend" one over the other or
       simply point?);
    -- clarifying exactly where the non-DCMI URI will be recorded, and
       how that URI will be reflected in the DCMI Registry and exported
       for merging into the raw UB data used for generating RDF schemas
       and Web pages (see also 2.3).

2.5 TASK (who???): clarify and document exactly how the "good-neighbor"
policy (2.4) will be reflected in the RDF schema and in the terms Web
pages. (See also A.4 and A.5.) This entails:

    -- clarifying with Roland -- who I believe had a solution for this
       that was discussed and for which notes exist somewhere...
       -- exactly how the cross-reference would be expressed in RDF;
    -- clarifying with Tom exactly what relevant field or fields will be
       automatically exported as a basis for generating Web pages and
       RDF schemas;
    -- clarifying with Tom exactly how that additional information
       should appear in the Web documents;
    -- liaising with Harry to ensure that the additional RDF assertions
       will be generated from the raw UB data.

2.6 TASK (Traugott): set up a JISCMAIL list to use as an archive of
(all?) actions taken in the fast-track procedure (see also A.3 below).
This entails:

    -- clarifying exactly who needs to do what to verify and evaluate a
       submission, what follow-up actions they need to take, what needs
       to be documented and where;
    -- providing a user-friendly list of actions and responsibilities
       for inclusion in the DCMI Usage Board process [UB-PROCESS] or in
       another appropriate document.

2.7 TASK (Diane and Stuart?): Documentation of process for reviewers of
fast-track proposals. This should expand on Section 5 of [UB-PROCESS],
clarifying exactly who is expected to do what in order to reach a
fast-track decision about a proposed VES. It could take the list of
actions and responsibilities from 2.6 as input. (See also point A.1
below.)

3. Approve an initial set of encoding schemes

When all of the above is in place, we somehow need to move this forward
to the actual creation of VESes, especially for the initial set of known
important vocabularies in the pipeline (see also point A.10).

Note that the restriction that proposals be accepted only from the
owners and maintainers of vocabularies is slightly at odds with the
notion that we would begin with "known important vocabularies" that have
long been in the pipeline.
In other words, we should decide whether to go ahead with the
registration of some vocabularies on our own initiative and, if so, who
will take that initiative and to what extent those volunteers will be
expected to obtain permission from the owners or maintainers of the
vocabularies in question. Deciding how to proceed on this would be the
responsibility of the overall project manager for Vocabulary
Registration, in consultation with the Usage Board.

4. Manage the above tasks and represent the project to the public

4.1 TASK (Traugott?): Create a Web page describing DCMI's project for
registering VESes along with a one-stop annotated set of pointers to all
relevant resources (such as [GUIDELINES], [UB-PROCESS],
[NAMESPACE-POLICY], DCMI terms documentation, and the DCMI Registry).
This entails:

    -- taking as input from 2.4 a "good-neighbor policy";
    -- explaining overall philosophy, policy, and intentions (perhaps
       this should be where we explain that in the first instance,
       registration will be on the initiative of scheme owners -- i.e.,
       the maintainer of the vocabulary in question does the registering
       by proposing an acronym for use in a DCMI-maintained URI and
       optionally supplying an owner-maintained URI for the same);
    -- creating short versions of the above for posting as announcements
       to DC-GENERAL.

Such explanatory text could be folded into the start page for the Web
tool [WEB-TOOL], which currently provides user guidance on using the
tool; this would require coordination with Harry on editing a single Web
page. Or the information could be split out into a separate document --
whatever seems friendliest for users. Either way, we should ensure that
all Web pages of relevance to the project of registering encoding
schemes are fully cross-referenced with all of the other relevant Web
pages.

4.2 TASK (Traugott?): Clarify who will actually do what.
This task involves defining what needs to be done (see also 2.7), but
also who is going to actually do it and the process for managing the
people who are doing it. Note that there has been considerable
discussion between Ithaca and Seattle of a "reasonable" role for Usage
Board members in processing proposals. However, experience in last
year's trial run and with AskDCMI this year strongly suggests that
asking UB members to claim proposals to vet will be problematic.

4.3 TASK (Traugott): In Seattle, Traugott volunteered to assume overall
responsibility for the Registration issue. As I see it, this involves,
among other things:

    -- coordinating and motivating the people who will check, evaluate,
       and approve the proposals for VESes;
    -- making announcements to DC-GENERAL;
    -- coordinating with Tom about responsibilities to avoid unnecessary
       redundancies in tracking issues.

5. Plan a workshop

Note: As of December 2003, Tom, Traugott, Stu Weibel, Diane, and Stuart
Sutton are discussing the possibility of holding a workshop to
coordinate better between DCMI and other communities interested in the
general problem of identifying and citing controlled vocabularies. While
it is related to the process of Vocabulary Encoding Scheme Registration
discussed here, the workshop issue is not further covered below. Note
that this venue would be the appropriate place to discuss a possible use
of IETF's InfoURI (Appendix point A.9).

------------------------------------------------------------------------
APPENDIX A: Recent decisions and action items
------------------------------------------------------------------------

2003-06-17: Ithaca meeting
2003-09-28: Seattle meeting

A.1 ITHACA ACTION ITEM (Diane and Stuart): Make necessary updates to the
UB Process document.

A.2 ITHACA ACTION ITEM (Tom): Ask Directorate to advise UB on their
position in regard to the legal issues surrounding encoding scheme
registration.
[Note: this has been done, and the opinion is that we can go ahead as
long as we articulate our policies clearly.]

A.3 ITHACA ACTION ITEM (Traugott): In the interest of maintaining an
audit trail, all emails between UB/DCMI and the scheme owner/maintainer
are to be sent to a closed Jiscmail DC list for permanent retention;
Traugott to ask Paul Miller.

A.4 ITHACA DECISION (according to Traugott): DCMI assigns a URI and
lists URIs created by vocabulary owners. Where an owner-created URI
exists, a "same as" relationship between the two is declared.

A.5 ITHACA ACTION ITEM (Tom and Diane): Draft new document that explains
such things as 'good neighbour' policy, what the process involves, the
aims of the registration service, registration help, etc.

A.6 ITHACA ACTION ITEM (Traugott): List of priorities for
enhancements/changes to the scheme registration tool to be submitted to
Makx.

A.7 ITHACA ACTION ITEM (Tom): Document the XML output formats that are
wanted from the registration tool.

A.8 SEATTLE DECISION: UB agrees that it must proceed with Vocabulary
Encoding Scheme registration.

A.9 SEATTLE DECISION: UB will consider adopting IETF's InfoURI if and
when this is finalized.

A.10 SEATTLE DECISION: The UB is to aim for implementation of the
registry by January or February 2004.

A.11 SEATTLE DECISION: Where schemes have existing URIs, such schemes
should be registered at the request of implementers.

A.12 SEATTLE ACTION ITEM 13 (Traugott): Draft a document setting out
guidelines for the creation of URIs for encoding schemes.

A.13 SEATTLE ACTION ITEM 14 (Tom): Tom to remind Stu to seek advice from
OCLC lawyers regarding legal issues surrounding encoding scheme
registration.

A.14 SEATTLE ACTION ITEM 15 (Traugott): Questions to be posed to Harry,
as part of request for updates and changes, regarding authentication and
whether existing authentication facility is robust enough.
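
ILLUSTRATION (not a decision): The "same as" relationship in A.4, and
the duplication of legacy URIs described in 2.2.2, might be written down
roughly as in the sketch below. It emits N-Triples statements linking a
URI in the proposed http://purl.org/dc/schemes/ namespace to its legacy
counterpart and to an owner-maintained URI. The use of owl:sameAs as the
property, the function name same_as_triples, and the owner URI under
example.org are assumptions made for illustration only; nothing above
fixes which RDF property or serialization would actually be used.

```python
from typing import List, Optional

LEGACY_NS = "http://purl.org/dc/terms/"    # current namespace (see 2.2)
SCHEMES_NS = "http://purl.org/dc/schemes/"  # proposed namespace (see 2.2.1)
# Assumption: owl:sameAs stands in for the "same as" relationship of A.4.
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triples(name: str,
                    legacy: bool = False,
                    owner_uri: Optional[str] = None) -> List[str]:
    """Return N-Triples lines declaring equivalences for one scheme.

    name      -- the registered scheme name, e.g. "LCSH"
    legacy    -- True if the scheme already has a URI in LEGACY_NS
    owner_uri -- hypothetical URI maintained by the vocabulary owner
    """
    new_uri = SCHEMES_NS + name
    triples = []
    if legacy:
        # Duplication of a legacy URI in the new namespace (2.2.2).
        triples.append(f"<{new_uri}> <{SAME_AS}> <{LEGACY_NS}{name}> .")
    if owner_uri is not None:
        # Good-neighbor pointer to the owner's own URI (2.4, A.4).
        triples.append(f"<{new_uri}> <{SAME_AS}> <{owner_uri}> .")
    return triples


if __name__ == "__main__":
    # The owner URI below is a placeholder, not a real identifier.
    for line in same_as_triples("LCSH", legacy=True,
                                owner_uri="http://example.org/lcsh"):
        print(line)
```

A scheme with neither a legacy URI nor an owner URI yields no
statements, which matches the reading of A.4 that an equivalence is
declared only where a second URI exists.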

------------------------------------------------------------------------
BIBLIOGRAPHY
------------------------------------------------------------------------

[GUIDELINES]
    Guidelines for Vocabulary and Encoding Scheme Qualifiers,
    http://dublincore.org/usage/documents/vocabulary-guidelines/

[NAMESPACE-POLICY]
    DCMI Namespace Policy,
    http://dublincore.org/documents/dcmi-namespace/

[DCMI-PRINCIPLES]
    DCMI Grammatical Principles,
    http://dublincore.org/usage/documents/principles/

[UB-PROCESS]
    DCMI Usage Board Process,
    http://dublincore.org/usage/documents/process/

[WEB-TOOL]
    Vocabulary Scheme Registration [a Web-based tool],
    http://wip.dublincore.org/schemes/index.html