innovation in metadata design, implementation & best practice

DCMI 2019: Metadata Best Practice Day

The DCMI 2019 Best Practice Day is a new event, which will take place in Seoul (hosted by the National Library of Korea) on September 26th 2019, immediately after the DCMI 2019 conference and workshops.

The aim of the Best Practice Day is to hear from institutions that have implemented sound and innovative metadata systems, so that they can share their experiences and insights with the DCMI community.

Moderator: Sam Oh, Chair of 2019 DC Conference

09:00 - 09:20
Jihae Jeon [National Library of Korea]
Title: Dublin Core Usage for Linked Open Data Service: NLK LOD
Abstract: The National Library of Korea (NLK) has been publishing LOD by converting bibliographic, name authority, and subject authority records from KORMARC and MODS to RDF format. To achieve interoperability and openness of data, NLK uses properties and classes of existing RDF vocabularies and ontologies (e.g. Dublin Core, BIBO, SKOS, FOAF), while defining its own local vocabulary named NLON (National Library ONtology). This presentation will show the development of the NLK LOD service, focusing on the process of modelling and converting data with these reusable standards. To offer its data, NLK runs a separate platform for the LOD service (lod.nl.go.kr) where data can be queried through a SPARQL endpoint. Bulk download of datasets and some applied services are also available. The platform will be briefly introduced.
09:20 - 09:40
Cuijuan Xia [Shanghai Library, China]
Title: Modeling Different Metadata Schema(s) as One Ontology for Knowledge Consilience.
Abstract: There are different kinds of LOM resources with different metadata schemas. When transforming all those metadata records into Linked Data, we need a unified data model to integrate the data. The Digital Humanities team of Shanghai Library has designed an ontology as an abstract data model to define the relationships among the different concepts, classes and properties extracted from the metadata schemas of different resources such as archives, genealogy documents, ancient books, old photos, old movies and so on. The team then transformed the metadata records in different formats into RDF data and linked all the resources as Linked Open Data by building relationships among the entities extracted from the metadata records, including persons, organizations, places, etc.
09:40 - 10:00
Nuno Freire and Antoine Isaac [Europeana]
Title: The Europeana Data Model – Principles, Community and Innovation.
Abstract: In the Europeana aggregation process, the Europeana Data Model (EDM) is the data model that allows Europeana to maintain a sustainable aggregation of metadata about digital representations of cultural artefacts, together with rich contextualization data and support for multilinguality in metadata. EDM supports several of the core processes of Europeana’s operations and also contributes to the access layer of the Europeana platform, supporting the sharing of data with third parties in accordance with best practices for publishing data on the web. EDM is a community-based effort, involving representatives from all the domains represented in Europeana: libraries, museums, archives, and galleries. It was initially defined in 2010 and has been under continuous improvement since, under the coordination and maintenance of Europeana. While Europeana maintains a core EDM, it also provides guidance on the creation of new extensions of the model within the community. We will present the story and principles that guided the design of EDM, recent progress and examples of extensions. In the light of a report from a recent task force on governance of EDM, we will reflect on how a community model can live up to initial expectations. Further, we will discuss the role that EDM plays in the activities that Europeana's Network carries out to enhance metadata quality across the board, especially through Europeana's Data Quality Committee. Finally, we will offer to discuss our recent work on innovation aspects related to metadata flows in Europeana, such as the usage of Schema.org for description of cultural artefacts.
10:00 - 10:20
Sachiko Inoue [The National Diet Library of Japan]
Title: Japan Search: National cross-sectoral portal for digital cultural heritage.
Abstract: Japan Search, as its name suggests, is a portal site for searching a wide variety of Japanese content, ranging from published works, paintings, and cultural assets to broadcast programs and movies. It enhances the discoverability of this content through aggregation of metadata created by holding agencies such as libraries, galleries, museums, and archives. In this presentation, the National Diet Library of Japan, which developed the Japan Search system, will give an explanation of it focusing on three major points: how it provides users with a cross-sectoral search without determining a single data format when aggregating metadata; how it achieves interoperability with other cultural heritage portals by converting aggregated metadata into a standardized RDF-based data model; and what additional functions it provides to invigorate the user community. A brief demonstration on how Japan Search works will also be provided.
10:30 - 11:00
Coffee break
11:00 - 11:20
Marie-Claude Côté [Library and Archives Canada]
Title: Metadata Success Stories & Other Stories in the Government of Canada
Abstract: Marie-Claude Côté will cover selected successful metadata applications in the Government of Canada (GC), explain their success factors, and draw lessons learned from not so successful metadata endeavours. Stories will include metadata supporting the federal geospatial Platform, recordkeeping systems at the GC-wide level, statistical data, web metadata, and e-learning objects.
11:20 - 11:40
Xiaoguang Wang, Tan Xu, Ningyuan Song, David Clarke, and Xiaoxi Luo [Wuhan University]
Title: Deep Semantic Annotation of Cultural Heritage Images.
Abstract: With the rapid growth of information resources for cultural heritage images and the development of Digital Humanities research, Deep Semantic Annotation (DSA) of images, which aims at semantic annotation of cultural heritage images, has attracted increasing attention. Current methods for semantically indexing cultural heritage images by features and themes, however, are insufficient, as they mostly focus on the metadata level and fail to address the content of these images well. Unlike existing methods, the DSA proposed in this paper not only indexes the cultural image metadata but also annotates the fine-grained semantic elements in these images, and thus supports the integration of image resources and automatic knowledge discovery. The DSA approach proposed in the study, named DSA-CH (Deep Semantic Annotation for Cultural Heritage Images), contains a series of image content organization models and was designed using an annotation and experimentation process. A multi-layer organizational structure was key to the study. Using Panofsky’s Iconology theory (Panofsky, 1939) as reference, this study proposed a multi-level, fine-grained Annotation Model of Cultural Heritage Images, which includes a metadata level, an element level, a semantic annotation level and a semantic organization level. In this study, two separate annotation experiments were conducted on two different narrative paintings from the Dunhuang murals: (1) the Nine-Colored Deer Jataka in Mogao Grottoes Cave 257, as the example used to implement the DSA process; and (2) the Starving Tiger Jataka in Mogao Grottoes Cave 428, as the example used to test the annotation method. In these experiments, objects in the images are marked with a tool (Synaptic) and mapped to domain ontologies for structured organization. The relationships between these objects are extracted and annotated using the DSA-CH method. The annotation experiments verified the feasibility of DSA-CH.
11:40 - 12:00
Marcia Zeng [iSchool at Kent State University, USA], Imma Subirats Coll [FAO of UN]
Title: AGRIS’ Step of Integrating Research Datasets Metadata.
Abstract: AGRIS is the International System for Agricultural Science and Technology of the Food and Agriculture Organization (FAO) of the United Nations. As a global public multilingual bibliographic database and service of the Agricultural Information Management Standards (AIMS) of FAO, AGRIS has been providing access to bibliographic information resources on agricultural science and technology worldwide since 1975 and is one of the earliest to embrace Linked Open Data with advanced semantic technologies. Its AGRIS AP metadata element set is an application profile of Dublin Core. This presentation reports a new effort by AGRIS to extend its metadata spectrum so that it not only continues to cover bibliographic metadata of publications but also includes research data resources. The pilot project has resulted in a new function that has enabled users to search for thousands of open datasets through AGRIS since May 2019. The presentation will share the processes, research findings, and best practices for achieving metadata interoperability within the AGRIS Framework for the management of and access to bibliographic and research data.
12:00 - 12:20
Sophy Shu-jiun Chen and Lu-Yen Lu [Academia Sinica Center for Digital Cultures, Taiwan]
Title: Linked Open Data and Possible Approaches for Constructing a Semantic Aggregation Platform: The Taiwan Digital Archives Case
Abstract: In the era of the Semantic Web, one of the major issues in data integration is how to deal with data on different topics and from heterogeneous resources in a digital aggregation platform. This presentation takes as its example the Taiwan Digital Archive (TaiUC), a union catalog maintained by the Academia Sinica Center for Digital Cultures (ASCDC), which contains more than 5.6 million digitized objects and covers 17 diverse themes such as biology, anthropology, and arts and artifacts. ASCDC is working to demonstrate an approach for integrating heterogeneous linked data based on generic and specific semantic data models, and its possible application (EDM, BIBFRAME) in data visualization in different modes (charts, GIS maps, SNA), cross-domain data querying, and linking to external resources (VIAF, ULAN).
12:30 - 14:00
Lunch
14:00 - 14:20
Kosuke Tanabe [National Institute for Materials Science, Japan]
Title: Collaborative vocabulary service for FAIR materials data
Abstract: At the National Institute for Materials Science, we are developing a service platform, called the Materials Data Platform (MDPF), to support data-driven materials science. This project, which aggregates data from many sub-domains, called for a flexible metadata management scheme that stores different types of information such as chemical substances, characterization methods, instruments, and units. Given the complexity and the amount of data to be managed, we have decided to follow a collaborative approach in which the platform benefits from contributions from the user base (also called crowd-sourcing). For this reason, we are building a vocabulary management service based on Wikibase, which allows us to incorporate input from various sub-domains and areas of expertise. This vocabulary not only supports flexibility and findability in materials data, but we also expect it to be a reusable knowledge base in itself. In this contribution, we present the details of the service and its integration with other applications and systems within the MDPF, such as the Materials Data Repository and the Text and Data Mining Platform.
14:20 - 14:40
Myung-Ja Han and Jackie Shieh [University of Illinois, Smithsonian Institution]
Title: Program for Cooperative Cataloging (PCC) on Linked Data
Abstract: The Program for Cooperative Cataloging (PCC) works “to promote the discovery and use of the world’s knowledge by supporting metadata producers in library and other cultural heritage communities and by forging alliances with partners who share common goals.”[1] Since 2016, the PCC has actively worked on developing linked data best practices, notably by establishing task groups and working groups to provide guidance on library linked data, and by publishing best practices documents, after reviewing and assessing linked data tools and standards, to guide linked data practitioners. This presentation will share the goals and accomplishments of three PCC groups: the Linked Data Advisory Committee (LDAC), which provides guidance on linked data; the Linked Data Best Practice Group, which establishes linked data practices in MARC environments; and the Application Profile Task Group, which defines guidelines for developing community-based application profiles. While these three groups have specific charges, they share the same goal, namely to establish best practices for the library community on linked data.
14:40 - 15:00
Yoonkyung Choi [National Library of Korea]
Title: Metadata Application and Future Direction in National Library of Korea.
Abstract: The National Library of Korea (NLK) is responsible for the preparation and standardization of the national bibliography. The NLK has adopted various metadata schemes such as KORMARC (Korean MAchine Readable Catalogue), MODS (Metadata Object Description Schema) and Dublin Core, depending on the resources and services involved. The metadata created under these different schemes have been shared with other libraries in Korea. In KOLIS (Korean Library Information System), an in-house developed system, we adopted KORMARC for offline resources and MODS for online resources. Indexes of those metadata are generated for integrated search. The adoption of various metadata schemes has caused some problems in data management and usage. To settle those issues, the NLK has applied ‘work clustering’ to offline and online resources in the national union catalog ‘KOLIS-NET’ since April 2019. From a long-term perspective, the NLK has set future goals for the national bibliographic data: “faster and richer”, “more precise and with high quality” and “well-used in various fields even outside of the library”. To achieve those goals, the NLK is looking for ways to use various external data other than MARC library data. We are also planning to apply BIBFRAME to integrate many different types of metadata by 2023. Furthermore, the NLK will develop strategies for a next-generation national bibliography to provide special services for data consumers such as libraries, publishers/vendors, end users, and researchers.
15:00 - 15:20
David Clarke [Synaptica]
Title: Linked Data KOS – The Space between Taxonomy and Ontology.
Abstract: What does one do if axioms and full OWL are too complex for one’s business use-case, but SKOS taxonomies are restrictively simplistic? With reference to the Zeng-Mayr 2018 paper on KOS in the Semantic Web (1), and with the use of live demonstrations in Synaptica’s KOS modelling tool Graphite, Clarke will explore some practical design, build, and governance methods for curating Knowledge Organization Systems as Linked Data in the space between taxonomy and ontology.
(1) https://link.springer.com/article/10.1007/s00799-018-0241-2
15:30 - 16:00
Coffee break
16:00 - 16:20
Guojian Xian and Li Jiao [Agricultural Information Institution of the Chinese Academy of Agricultural Sciences (CAAS), China]
Title: Open, Interlink and Discover Multilingual Agricultural Data Semantically Based on KOSs.
Abstract: In order to explore an efficient way to open, interlink and discover multilingual data internationally, this presentation first gives a conceptual analysis of open linked data and knowledge organization systems (KOSs). It then illustrates the architecture and key technologies for semantically opening and integrating Chinese agricultural literature data as a RESTful API, which are mainly based on KOSs (the Chinese Agricultural Thesaurus, AGROVOC, and their mappings). In addition, millions of named entities (such as people, organizations, flora, fauna, etc.) are used to support the cognitive search process. Finally, the prospects for merging KOSs with Big Data and Artificial Intelligence are discussed.
16:20 - 16:40
Haiqing Lin [UC Berkeley]
Title: Assigning LC Name Authorities to Movie-Star Photo Collection by using Facial Recognition Technology: An Experiment Report.
Abstract: Identifying people in old photos is a challenge for libraries managing historical photo collections. An experiment has been designed to explore how Amazon’s face recognition can be deployed to identify the movie stars in historical stage photos. The presentation outlines the framework of the experiment, which consists of three basic components: an application interface, the Amazon Rekognition API, and a facial collection and metadata set. Discussion will focus on developing a facial collection metadata solution to meet local needs. As a local practice, we examined connecting facial recognition identities to the Library of Congress name authority files, as we intend to assign Library of Congress name authority headings to the stage photos. At the end of the presentation, the ethical concerns involved in the use of facial recognition technology will also be discussed.
16:40 - 17:00
Tae-Sul Seo and Mihwan Hyun [KISTI, Korea]
Title: Metadata for Information Sharing between Open Access Repositories
Abstract: With the recent advances in open access, there are now numerous repositories and aggregation services, so data needs to be exchanged and integrated between repositories. Metadata standardization is important because the data structure of each repository may vary slightly. We therefore present an example of the exchange of open access journal information between the repositories of KISTI and OpenAIRE, as well as a metadata mapping method between repositories. A data model of scholarly information for the future is also presented.
17:00 - 17:20
Dr. Nisachol Chamnongsri [School of Information Technology, Suranaree University of Technology]
Title: Metadata Standards for Palm Leaf Manuscript in Thailand and Asia
Abstract: The goal of palm leaf manuscript (PLM) metadata is to facilitate, as effectively as possible, user access to and use of the knowledge recorded on PLMs. At the same time, the schema should serve as a standard information structure to be used in the management of PLMs and other digitized ancient documents. This will also make the linking of Asian cultural heritage and wisdom with those of countries in this region possible via the internet. This paper presents the current state of PLM management in Thailand and Asia, the current use of PLM metadata schemas in working projects for the long-term preservation of PLMs, the challenges and constructive solutions, and the first draft of PLM metadata core elements from the PLM preservation workshop held by IFLA at the National Library and Documentation Services Board of Sri Lanka on September 6-7, 2017.
17:20 - 17:40
Dr. Ahmad Zam Hariro Samsudin
Title: Innovative Use of Metadata: Some Insights from Malaysia
Abstract: Metadata is a vital element in information retrieval, ensuring that relevant information is retrieved promptly. There are many metadata initiatives at the international and local levels, including contributions by information agencies, particularly libraries, in Malaysia. The paper highlights the current development of metadata initiatives in Malaysia. The state of development, issues and challenges will be discussed.