Panel: Metadata and Knowledge Graphs

Starts at
05 Oct 22 17:30 UTC
Finishes at
05 Oct 22 19:00 UTC
Venue
Virtual Conference Room A
Moderator
Jian Qin
Metadata, as the vehicle for representing resources of all kinds, digital or otherwise, has evolved from a record-centric style to networks or graphs built on entities and their semantic relations. Record-centric metadata focuses on information objects in which entities are represented by strings of text. This tradition is being challenged by newer concepts and techniques that are becoming a reality in metadata research and development. Ontologies, linked data, entity management, and knowledge graphs have been applied in many domains that use metadata to represent resources. These developments are exciting and promise to make metadata more intelligent, efficient, and interoperable. Yet they also raise questions for the metadata community to explore and address: What implications does the transition from record-based metadata to knowledge networks/graphs have for legacy bibliographic metadata? How will the metadata-deduced knowledge graphs impact current and future resource representation and discovery? How can knowledge graph techniques be applied to domain resource representation to facilitate efficient discovery? This panel invited experts from research, industry, and academia to provide insights into these questions.

Moderator

  • Jian Qin

    Syracuse University, USA

    Jian Qin is a Professor at the iSchool at Syracuse University. She conducts research in the areas of metadata, knowledge and data modeling, scientific communication, research collaboration networks, and research data management. Her research has received funding from IMLS to develop an eScience librarianship curriculum and from NSF for the Science Data Literacy project. Jian Qin directs a Metadata Lab that focuses on big metadata analytics and metadata modeling and linking.

Presentations

Creating, managing, and using cultural heritage linked data

CONTENTdm is a digital asset management service from OCLC. Working closely with product and technology colleagues, OCLC Research partnered with five CONTENTdm institutions to investigate how to convert current CONTENTdm metadata into linked structured data, manage that data, create new data, and leverage the structured data to improve the end-user experience of CONTENTdm. Wikibase, the platform used to support Wikidata, was selected for this project because it provides a large number of services and features that are critical to this work. It also has a large and engaged community of support. This talk will discuss the framework for the project, provide a technical overview of the data and services, and share participant feedback from the project.

From Code Repositories to Searchable Knowledge Graphs of Research Software Metadata

Research software is key to understanding, reproducing, and reusing existing scientific results in many disciplines, ranging from Geosciences to Astronomy or Artificial Intelligence. However, research software is usually difficult to find, reuse, compare, and understand because its documentation is disconnected (dispersed across manuals, readme files, web sites, and code comments) and because it lacks structured metadata to describe it. In addition, research software is no longer isolated code, but a connected network of scripts, libraries, notebooks, and containers that represent different aspects and dimensions of a research software project. These problems affect not only researchers, but also students who aim to compare published findings and policy makers seeking clarity on a scientific result. In this talk I will present an overview of the main challenges in creating Knowledge Graphs of Research Software metadata, together with our latest efforts to capture this knowledge from software documentation and code to make it useful to researchers.

  • Daniel Garijo

    Information Sciences Institute, University of Southern California

    Daniel Garijo is a postdoctoral researcher at the Information Sciences Institute of the University of Southern California. He also collaborates with the Ontology Engineering Group at the Artificial Intelligence Department of the Computer Science Faculty of Universidad Politécnica de Madrid. His research activities focus on e-Science and the Semantic Web, specifically on how to increase the understandability of scientific workflows by using their outputs, inputs, provenance, metadata, and intermediate results and by exposing them as Linked Data.

H-Graph: Driving Cross-Product Innovation at Elsevier Health

Elsevier Health provides a suite of offerings to clinicians and patients both at the point of care and throughout their professional education and training. Our products are backed by a multitude of datasets managed by different teams across the company. The Healthcare Knowledge Graph (H-Graph) was launched in 2017 in an effort to facilitate linking data in the health domain across products and datasets, and to remove barriers to cross-product innovation. H-Graph contains high-quality medical knowledge sourced through a combinatorial approach of expert-curated and machine-learning-extracted relations between medical concepts, together with their links to source material and associated metadata. In this talk, we will review the progress of H-Graph over the past 5 years, the challenges encountered, and our proposed path forward.

  • Jessica Cox

    Elsevier Health

    Jess Cox is the Director of Software Engineering for Search and Knowledge Management. In this role, Jess oversees the Search and H-Graph groups responsible for development of the knowledge base and search stack used by the Health Markets division of Elsevier. Jess has been with Elsevier since 2017, where she started as a Technology Researcher in the Labs group working on data science, text mining and NLP methods. Before joining Elsevier, Jess completed her PhD in biomedical research and a post-doctoral fellowship in public health.