The Cross-Domain Interoperability Framework: Coordinating Standards for Scalable, Practical FAIR Sharing

Starts at
04 Oct 22 17:15 UTC
Finishes at
04 Oct 22 18:45 UTC
Venue
Virtual Conference Room B
Moderator
Simon Hodson
We are now witnessing the emergence of FAIR data-sharing mechanisms in many areas, with the focus in many organizations having shifted from the "what" to the "how". In many domains, there are a number of common standards – some of which apply equally across domains, and some of which are specific to the data, processes, and practices within that domain. The challenge of FAIR data sharing – ubiquitous, automated reuse of data and metadata – is particularly acute across domain and infrastructure boundaries, demanding a change in how data are described. To meet this challenge, it is important to first understand how the different standards and models used to describe data can be employed, so that they speak not only to traditional users, but also to users coming from other domains.

One major development in this area is the idea of a FAIR Digital Object Framework (FDOF), where information – both data and metadata – of interest for the discovery and reuse of data can be identified and obtained. The FDOF represents an initial step, but does not address many of the practical issues of interoperability. We must look at the intersection of standards of different types and how they fit into this picture: the idea that every FAIR resource is implemented according to an entirely new set of technical standards is not realistic. The FDOF serves as an agreed way to obtain needed FAIR resources and to learn enough about them to understand some related resources (e.g., metadata schemas) at the level of a protocol. It is not sufficient on its own to produce interoperability, which will require an ability to actually understand the metadata schemas being used.

When it comes to standards, some parts of FAIR are better supported than others. Discovery of FAIR resources increasingly relies on standards and approaches which are widely adopted, and often much the same across domains and institutional boundaries.
DCAT, Schema.org, and Dublin-Core-based cataloguing metadata are commonly found in many areas. For other aspects of FAIR, however, this degree of domain-agnostic standardization does not exist. Semantics and vocabularies are often deeply domain-dependent, and other important types of metadata needed for effective reuse – structural metadata, provenance, etc. – are also seen in many different forms, reflecting domain practice. Within any given domain, the standards requiring support may be well understood and limited in number. The same cannot typically be said when data from other domains are the target of reuse.

If we are to make use of the FDOF as intended, we need a second tier of domain-agnostic standards which makes this profusion of models, schemas, etc. tractable. Such a second tier should be developed as a mechanism for domain-specific standards to be more easily exchanged and transformed. Technical standards such as RDF, JSON, and XML may provide a useful foundation, but they are not themselves sufficient. Standard vocabularies and models which are understandable across domains provide an additional, needed layer of interoperability. One good example of this is SKOS: many domains use concept systems of different types. If they are described in SKOS, they can at least be exchanged and processed in a coherent way across domain boundaries, even if the specifics of the concepts themselves need further attention. The EOSC Interoperability Framework introduced this idea of a leveled hierarchy of standards, and it is a useful way to understand what a practical approach to interoperability looks like as we progress from the universal toward the domain- and community-specific.

This session presents the requirements which lead us to a middle tier of domain-agnostic standards in support of the FDOF, and proposes some candidates for consideration based on implementations and explorations to date.
Some examples of such standards are provided, showing how they can work together to provide the complete information set needed to reuse data in a FAIR data-sharing scenario across domain and institutional boundaries. The focus of the session is on the "interoperability" and "reuse" elements of FAIR, but the session will touch on all aspects of FAIR data sharing, and how it might practically be realized. In particular, we aim to present these ideas to the DCMI community, to get feedback and to understand how this approach may intersect with current activities and thinking in the DCMI community and with related initiatives.
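
The SKOS point made in the abstract can be illustrated with a minimal sketch. It assumes two hypothetical concept schemes (the example.org URIs, labels, and the function name summarize_scheme are invented for illustration) expressed as plain subject-predicate-object triples that use the real SKOS and RDF property URIs. Because both schemes use the same SKOS properties, one generic function can process either of them without knowing anything about the domain:

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical concept scheme from a soil-science domain (illustrative URIs).
soil_triples = [
    ("http://example.org/soil/mineral-soil", RDF_TYPE, SKOS + "Concept"),
    ("http://example.org/soil/mineral-soil", SKOS + "prefLabel", "Mineral soil"),
    ("http://example.org/soil/clay", RDF_TYPE, SKOS + "Concept"),
    ("http://example.org/soil/clay", SKOS + "prefLabel", "Clay"),
    ("http://example.org/soil/clay", SKOS + "broader",
     "http://example.org/soil/mineral-soil"),
]

# Hypothetical concept scheme from a labour-statistics domain (illustrative URIs).
occupation_triples = [
    ("http://example.org/occ/professionals", RDF_TYPE, SKOS + "Concept"),
    ("http://example.org/occ/professionals", SKOS + "prefLabel", "Professionals"),
    ("http://example.org/occ/statistician", RDF_TYPE, SKOS + "Concept"),
    ("http://example.org/occ/statistician", SKOS + "prefLabel", "Statistician"),
    ("http://example.org/occ/statistician", SKOS + "broader",
     "http://example.org/occ/professionals"),
]


def summarize_scheme(triples):
    """Map each SKOS concept URI to (preferred label, broader concept URI or None)."""
    concepts = {s for s, p, o in triples if p == RDF_TYPE and o == SKOS + "Concept"}
    labels = {s: o for s, p, o in triples if p == SKOS + "prefLabel"}
    broader = {s: o for s, p, o in triples if p == SKOS + "broader"}
    return {c: (labels.get(c), broader.get(c)) for c in sorted(concepts)}


# The same domain-agnostic code handles both schemes.
for scheme in (soil_triples, occupation_triples):
    for uri, (label, parent) in summarize_scheme(scheme).items():
        print(label, "->", parent)
```

The sketch only shows that the structure of the concept systems becomes processable across domains; as the abstract notes, the meaning of the individual concepts still needs domain attention.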

Moderator

  • Simon Hodson

    CODATA, the Committee on Data of the International Science Council

    Simon Hodson is the Executive Director of CODATA and is also a member of the DDI Scientific Board. He is an expert on data policy issues and research data management. Simon has directed and contributed to a number of influential reports: he chaired the European Commission’s Expert Group on FAIR Data, which produced the report Turning FAIR into Reality (https://doi.org/10.2777/1524). He was also vice-chair of the UNESCO Open Science Advisory Committee, with an influential role in drafting the UNESCO Recommendation on Open Science, which was adopted in November 2021.

Presentations

Overview of Requirements and Potential Candidate Standards for a Cross-Domain Interoperability Framework

This presentation addresses the specific requirements and a possible slate of candidate standards and models for a "Cross-Domain Interoperability Framework" (CDIF) in support of the FDOF.

  • Arofan Gregory

    CODATA and DDI Alliance

    Arofan Gregory works as a standards expert at CODATA, focusing on the Decadal Programme and WorldFAIR projects. He has been involved in the development and implementation of many metadata standards over the past two decades, including the Data Documentation Initiative (DDI), the Statistical Data and Metadata Exchange (SDMX), the Generic Statistical Information Model (GSIM), and others. He is currently the chair of the DDI Cross-Domain Integration Working Group.

The Structural Description of Data: DDI-CDI and the Variable Cascade

The Data Documentation Initiative Cross-Domain Integration (DDI-CDI) specification provides a model for describing variables as they are used in different ways, and supports their packaging in a variety of ways for reuse and connection to process descriptions and concept systems. This presentation covers the main features of the standard for these purposes.
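
As a rough illustration of what describing variables "as they are used in different ways" can mean, the sketch below models a three-level cascade of conceptual, represented, and instance variables. The class names echo the cascade levels, but every field name, the relate function, and the sample survey files are assumptions made for this example; they are not the properties defined by the DDI-CDI model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualVariable:
    """What is measured, independent of any representation (illustrative fields)."""
    name: str
    concept: str


@dataclass(frozen=True)
class RepresentedVariable:
    """A conceptual variable plus a chosen datatype and unit (illustrative fields)."""
    conceptual: ConceptualVariable
    datatype: str
    unit: str


@dataclass(frozen=True)
class InstanceVariable:
    """A represented variable as it appears in one concrete dataset."""
    represented: RepresentedVariable
    dataset: str
    column: str


def relate(a: InstanceVariable, b: InstanceVariable) -> str:
    """Report the highest cascade level that two instance variables share."""
    if a.represented == b.represented:
        return "same representation"
    if a.represented.conceptual == b.represented.conceptual:
        return "same concept, different representation"
    return "unrelated"


# Hypothetical variables and datasets, invented for this sketch.
income = ConceptualVariable("income", "Annual personal income")
income_eur = RepresentedVariable(income, "decimal", "EUR")
income_cad = RepresentedVariable(income, "decimal", "CAD")
age_years = RepresentedVariable(
    ConceptualVariable("age", "Age at last birthday"), "integer", "years"
)

survey_a = InstanceVariable(income_eur, "survey_a.csv", "INC")
survey_b = InstanceVariable(income_eur, "survey_b.csv", "income_total")
survey_c = InstanceVariable(income_cad, "survey_c.csv", "REVENU")
survey_d = InstanceVariable(age_years, "survey_d.csv", "AGE")

print(relate(survey_a, survey_b))
print(relate(survey_a, survey_c))
print(relate(survey_a, survey_d))
```

Separating the levels in this way lets a reuser see whether two columns can be combined directly, need a conversion first, or measure different things altogether.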

  • Flavio Rizzolo

    Statistics Canada and DDI Alliance

    Flavio Rizzolo is a systems architect at Statistics Canada. He has been active in the development of many standards both within the domain of official statistics and in research more broadly. He is active in the UN/ECE efforts for the Modernization of Statistics, and is a member of the DDI Technical Committee and the DDI-CDI Working Group.

Examples: The Coordinated Use of Standards for Data Production and Dissemination

The CDIF concept is based on a number of real-world case studies and prototypes. This session looks at the range of standards which are employed together to support dissemination of data coming from a variety of domains, to users of many different types. Both domain-specific and domain-independent standards are involved, implemented with a variety of different technologies.

  • Franck Cotton

    Institut National de la Statistique et des Études Économiques

    Franck is a technology advisor at INSEE, the French National Statistical Institute, where he started as a business statistician, then became project manager for the development of a classifications management system before taking responsibility for IT infrastructure and IT security. He is particularly active in the fields of metadata standards, linked data, and international cooperation.

Looking Forward: Planned and Needed Work (short presentation and then discussion)

There are currently many different initiatives focused on the exploration and establishment of a practical approach to scalable FAIR data sharing across domains. This presentation looks at these and aims to situate further discussion. A path forward will be suggested, comprising, inter alia, the exploration of the idea of the cross-domain interoperability framework in the WorldFAIR project, the collaborations around the Global Open Science Cloud initiative, and, more generally, the CODATA Decadal Programme 'Making Data Work for Cross-Domain Grand Challenges'. The session will close with discussion of the ideas presented. We will seek to identify in advance appropriate respondents from the DCMI community, who will be asked to give feedback and to suggest how this approach may intersect with current activities and thinking in the DCMI community and with related initiatives.

  • Simon Hodson

    CODATA, the Committee on Data of the International Science Council

    Simon Hodson is the Executive Director of CODATA and is also a member of the DDI Scientific Board. He is an expert on data policy issues and research data management. Simon has directed and contributed to a number of influential reports: he chaired the European Commission’s Expert Group on FAIR Data, which produced the report Turning FAIR into Reality (https://doi.org/10.2777/1524). He was also vice-chair of the UNESCO Open Science Advisory Committee, with an influential role in drafting the UNESCO Recommendation on Open Science, which was adopted in November 2021.