Panel: Metadata for Statistical Data

Starts at
03 Oct 22 20:00 UTC
Finishes at
03 Oct 22 21:30 UTC
Venue
Virtual Conference Room B
Moderator
Marie-Claude Côté
This panel will present the life cycle of the production of statistical data, from its definition to the metadata that support, among other things, its management, FAIR use, interoperability, and integration. A number of metadata standards will be examined, including the Statistical Data and Metadata Exchange (SDMX) standard, and illustrated with case studies.

Moderator

  • Marie-Claude Côté

    Library and Archives Canada

    Marie-Claude Côté is an expert advisor in information management (IM) and project management (PM) at Library and Archives Canada (LAC). In her new role, she continues to advise Government of Canada (GC) departments and agencies on IM practices while developing LAC's PM capacity. Marie-Claude has held IM- and metadata-related management and analyst positions at the Treasury Board of Canada Secretariat, the Department of Canadian Heritage, the Canadian International Development Agency, and Industry Canada. After obtaining her Master’s degree in Library and Information Science (MLIS), she worked in municipal and private sector libraries before joining the federal public service. For the last 25 years, she has contributed to the development of the IM domain in the GC. Marie-Claude has also taught the courses of the IM Curriculum at the Canada School of Public Service. She is a certified Project Management Professional (PMP) and is extending her knowledge and experience in this field.

Presentations

What Is Statistical Metadata?

Consider the number 3.6. Without context (what does it mean, how was it produced, how accurate is it, and so on), the number is unusable. The answers to these questions (and others) provide the information needed to understand what the number represents. As a hint, this number was recently released by the US Bureau of Labor Statistics; this fact and the answers to the questions are statistical metadata. They are metadata because they describe the number and the designs and processing leading to its creation, and they are statistical because the number was produced by a process that uses probability and statistics. This talk will describe the activities that lead to the production of the number, providing an introduction to many aspects of statistical metadata. Along the way, we will mention the many standards that address the metadata needs of statistical data production.

  • Daniel Gillman

    US Bureau of Labor Statistics

    Dan Gillman works in the Office of Survey Methods Research at the US Bureau of Labor Statistics. His research interests include metadata, standards, classification, terminology, and transparency. He has decades of experience developing international metadata standards in ISO, ANSI, DDI Alliance, UNECE, OMG, W3C, and others, and applying these to metadata management needs at BLS and other US federal statistical agencies. The DDI (Data Documentation Initiative) and UNECE (UN Economic Commission for Europe) standards are built for statistical data and production, and Dan has had a central role in their development. His talk will describe what statistical metadata is and some of the details of the many standards.

An overview of the SDMX standard and how the OECD is using it to improve accessibility and interoperability

First released in 2002, the Statistical Data and Metadata eXchange standard (SDMX) started as an initiative to foster standards. The standard has since been enhanced to cover other statistical data and metadata use cases, for example making authoring, validation and exchange easier, faster, and of higher quality. SDMX is not only a set of technical standards but also includes content-oriented guidelines that aim to harmonise metadata across statistical domains and agencies. One example is a guideline on how to perform structural modelling of datasets.

This presentation will provide an overview of SDMX. It will also describe how the OECD is using SDMX to harmonise its structural metadata and to support metadata governance and collaboration on tools, both within the OECD and with other agencies.

  • David Barraclough

    OECD

    David Barraclough is a “Smart Data Practices Manager” at the OECD, leading the SDMX Structural Modelling team that remodels the OECD’s structural metadata using SDMX. He has been involved in IT system development for 30 years, most of that time in official statistics.

    David is chair of the SDMX Statistical Working Group which maintains the “Content-Oriented Guidelines” to harmonise the implementation of structural metadata, and is a member of the ModernStats standards working group that maintains statistical modernisation models for business and technical architecture.

Statistics Canada’s approach in implementing its virtual metadata integrated platform

Metadata plays a key role in statistical production: from concepts, classifications and variables to retention and provenance information, metadata is created, used and shared across all phases of the data lifecycle. As we move towards a FAIRer (Findable, Accessible, Interoperable, Reusable) infrastructure, metadata management needs to leave behind its silo-centric, tool-specific history and embrace a new, integrated and cross-domain approach. Statistics Canada is in the early stages of implementing a virtual metadata integration platform, the Metadata Hub, at the core of which stands a semantic infrastructure anchored in an RDF triple store. This triple store integrates content from a number of metadata repositories (e.g., Colectica, Ariā, SDMX .Stat Toolkit, GeoNetwork and CKAN) across a number of standards (e.g., DCAT, DDI, SDMX, and S/XKOS). We will discuss our experience so far, our use cases, and the way forward.

  • Gabriel Gellner

    Statistics Canada

    Lead Data Scientist, Centre for Statistical and Data Standards at Statistics Canada