Invited Talk: Putting standards into practice: pathogen genomics contextual data (“metadata”) standards in public health and food safety
Faculty of Health Sciences, Simon Fraser University, Vancouver, Canada
Emma Griffiths received her PhD from McMaster University in Ontario, Canada, where she studied how molecular markers known as insertions and deletions (indels) can be used to augment traditional approaches to understanding bacterial phylogeny. Her postdoctoral work focused on how contextual data standards, such as ontologies, improve data harmonization and integration in public health and food safety genomics. She is a member of the Standards Council of Canada and is engaged in many Canadian and international contextual data (“metadata”) harmonization initiatives, including the development of an ISO standard for the use of whole genome sequencing and contextual data in the typing and genomic characterization of foodborne bacteria, and the development of an international public health data standard for CanCOGeN, Canada’s SARS-CoV-2 genomic surveillance initiative. She is currently a research associate at Simon Fraser University in Vancouver, Canada.
Whole genome sequencing (WGS) is a powerful tool for tracking and understanding the spread of pathogens impacting environmental, animal and human health. Contextual data (“metadata”) consists of laboratory (e.g. date and location of testing, cycle threshold (CT) values), clinical (e.g. hospitalization, outcomes), epidemiological (e.g. age, gender, exposures) and methods (e.g. sampling, sequencing, bioinformatics) information that enables the interpretation of sequence data and the production of actionable results for public health and food safety programs. Contextual data is often collected on a project-specific basis according to local needs and reporting requirements, which results in different data types being collected at different levels of granularity, with different meanings and implicit biases attached to variables and attributes. Furthermore, the information is often collected as free text or, if structured, according to organization- or initiative-specific data dictionaries that use different fields, terms, formats, abbreviations, and jargon. The variability in how information is encoded in private databases tends to propagate to public repositories, which makes the information more difficult to interpret and to use. Our work focuses on the development and implementation of contextual data standards that can improve data harmonization and integration in different Canadian and international initiatives. Examples of our work include two ontologies, the Food Ontology (FoodOn) and the Genomic Epidemiology Ontology (GenEpiO), developed as part of the IRIDA project (Canada’s public health Integrated Rapid Infectious Disease Analysis bioinformatics platform); an ISO standard for genomic characterization of foodborne bacteria; and an international standard for SARS-CoV-2 pandemic genomic surveillance.
For more information about Dr. Griffiths’ work on metadata, see Genome Canada’s “Leveraging contextual data in the fight against COVID-19 - Q&A with Dr. Emma Griffiths”.