Full Papers

The programme is still being finalized and is subject to ongoing updates as sessions are scheduled. Please check back regularly for the latest changes.

A Workflow-Based Approach for Metadata Interoperability via Domain-Specific Schema Mapping to DataCite

Authors: Yan CONG, Masao TAKAKU, Yasuyuki MINAMIYAMA, Shigeki MATSUBARA

As research becomes increasingly data-driven and interdisciplinary, metadata interoperability has become essential for efficient data discovery and reuse. However, many domain-specific metadata schemas lack clear specifications and standardized structures, making cross-domain integration difficult. To address this issue, this study proposes a systematic three-phase workflow for cross-domain metadata mapping. The workflow consists of Phase (i) preliminary assessment of metadata schemas, Phase (ii) mapping relationship analysis, and Phase (iii) XSLT-based metadata transformation. To evaluate the effectiveness of the proposed workflow, a case study is conducted using a set of 125 metadata schemas. The results show that only 10 out of 125 schemas satisfy the requirements for mapping to the six mandatory DataCite properties, while the remaining schemas are limited by incomplete documentation or structural heterogeneity. In particular, challenges in representing properties such as “Identifier” and “ResourceType” highlight persistent semantic and structural mismatches across metadata standards. XSLT transformation files were developed and released, enabling practical implementation of the proposed mapping approach. This study contributes a systematic workflow for metadata mapping and provides empirical evidence on the limitations of current metadata standardization practices, supporting future efforts toward improved cross-domain metadata interoperability.
  • YAN CONG

    Nagoya University

    I obtained my Ph.D. in Library and Information Science, with a research focus on metadata standardization, Linked Open Data (LOD), and TEI markup. I also explored the application of AI in education, particularly methods for evaluating and validating its effectiveness in learning contexts. After graduation, I joined my current position, which focuses on metadata and interoperability. My work centers on metadata standardization and Persistent Identifiers (PIDs), with the aim of improving data consistency, system integration, and long-term accessibility across heterogeneous information systems.

AI-Guided Metadata Construction for Meaning-Driven Digital Knowledge Systems: A Framework for Automated Metadata Generation and Semantic Discovery

Authors: Wirapong Chansanam, Umawadee Detthamrong, Chunqiu Li, Abdul Rahman Ahmad, and Avshalom Elmalech

The rapid expansion of digital repositories and scholarly resources has increased the demand for scalable and intelligent metadata management systems. Traditional metadata creation methods, which rely on manual cataloguing by information professionals, struggle to keep pace with the growing volume and heterogeneity of digital content. This study develops and evaluates an AI-guided framework, Metadata-Building-AI-Guidance V.1.0, that combines document ingestion, chunking, OpenAI text-embedding-3-small embeddings, and the gpt-4o-mini large language model into a single Streamlit application supporting both automated Dublin Core-aligned metadata extraction and embedding-based semantic retrieval. We position the system as a design-science artefact in which retrieval is not a side feature but a feedback loop: the same vector index that powers similarity search is also used to surface the contextual chunks from which structured metadata are extracted and to support retrieval-augmented question answering over uploaded collections. The system was evaluated on a purposively sampled corpus of 30 open-access academic documents and 20 expert queries, using (i) field-level F1 against librarian-curated ground truth with Cohen's κ for inter-annotator agreement and (ii) Precision@5 and Mean Reciprocal Rank with binary relevance judgements from two librarians. Field-level F1 ranged from 0.67 to 0.73 for dc:title, dc:creator, dc:subject, and dc:description, with overall κ = 0.83 (almost perfect agreement); the lowest F1 was 0.57 for dc:type, traced to a fixed generic prompt output rather than to a model-capability limitation. Semantic retrieval reached Precision@5 = 0.61 and MRR = 0.69 with κ = 0.66 (substantial agreement) across the 20 queries. We discuss the limits of LLM-only evaluation, including the absence of a head-to-head comparison with established non-LLM extractors such as GROBID, and identify controlled baseline comparison together with a refined dc:type prompt as the immediate next steps. Prompts, JSON schema, library versions, sampling log, and evaluation queries are released to support replication. The contribution is a reproducible reference implementation that aligns AI-assisted metadata extraction with Dublin Core Terms and the FAIR principles for digital libraries, archives, and cultural-heritage repositories.
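For readers unfamiliar with the evaluation measures named in this abstract, the following is a minimal sketch of how Precision@5, Mean Reciprocal Rank, and Cohen's κ are conventionally computed from binary judgements. It is not the authors' released code; the function names and the toy judgement lists are assumptions made purely for illustration.

```python
from typing import Sequence


def precision_at_k(relevance: Sequence[int], k: int = 5) -> float:
    """Fraction of the top-k ranked results judged relevant (1 = relevant)."""
    return sum(relevance[:k]) / k


def reciprocal_rank(relevance: Sequence[int]) -> float:
    """1 / rank of the first relevant result, or 0.0 if nothing is relevant."""
    for rank, label in enumerate(relevance, start=1):
        if label:
            return 1.0 / rank
    return 0.0


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two annotators giving binary labels to the same items.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)


# Hypothetical judgements: one ranked list of binary labels per query.
runs = [
    [1, 0, 1, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
mean_p5 = sum(precision_at_k(r) for r in runs) / len(runs)
mrr = sum(reciprocal_rank(r) for r in runs) / len(runs)

# Hypothetical labels from two annotators over the same ten items.
annotator_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
annotator_b = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
kappa = cohen_kappa(annotator_a, annotator_b)

print(f"P@5={mean_p5:.2f} MRR={mrr:.2f} kappa={kappa:.2f}")
```

Averaging per-query scores, as done here, is the usual convention; how the study resolved disagreements between the two librarians before scoring is not stated in the abstract.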
  • Avshalom Elmalech

    Bar-Ilan University

    Avshalom Elmalech is a researcher at Bar-Ilan University with a PhD in Computer Science, working at the intersection of applied artificial intelligence and digital humanities. His research bridges information science and AI by examining how deep learning methods can be effectively applied to humanities data. He has contributed practical frameworks for guiding digital humanities scholars in choosing and adapting NLP and deep learning approaches under constraints such as limited training data and domain specificity.
  • Wirapong Chansanam

    Khon Kaen University

    Wirapong Chansanam is an Associate Professor of Information Science at Khon Kaen University, Thailand. He earned his Ph.D. in Information Science in 2014 and currently serves as Head of the Information Science Department and Chair of the Digital Humanities Research Group. His research focuses on information science, ontology, knowledge organization systems, linked open data, and data analytics. He actively contributes to advancing digital knowledge management and innovation.

An Exploratory Study on Genre Labeling of Online Comic Reading Platforms in Taiwan

Authors: Tzu-Yun Chien, Li-Min Huang

As online comic reading continues to grow in Taiwan, online platforms have become important sites for both comic consumption and discovery. This exploratory study examines genre-labeling practices on three major online comic platforms in Taiwan. We collected genre terms from the platforms’ Traditional Chinese and English interfaces and generated a list of 35 unique English genre terms. These terms were then mapped onto a facet framework drawn from previous literature. Our preliminary findings show that platform genre labels function as multidimensional access points rather than simple genre categories, representing aspects such as setting, mood, plot or narrative, and production context. Cross-platform differences in labeling granularity and cross-language differences in semantic scope were also observed. The findings may inform the development of more consistent and user-friendly genre labels for digital comic environments.
  • Tzu-Yun Chien

    National Taiwan University

    Tzu-Yun Chien is a Master’s student in the Department of Library and Information Science at National Taiwan University. Her research interests focus on human–computer interaction and information behavior. Her prior research on user behaviors in AI-assisted tasks has been published as a full conference paper. Her ongoing master's thesis focuses on the differences between existing genre categorization frameworks, platform labeling practices, and user interpretations within online comics.

Beyond Mapping: A Semantic Transformation Approach from KORMARC and MODS to BIBFRAME

Authors: Seungmin Lee

MARC- and MODS-based bibliographic data, due to their record-based structure, have limitations in supporting entity identification and semantic relationships in linked data environments. Existing approaches to converting these formats into BIBFRAME primarily rely on field-level mappings, which fail to capture bibliographic meaning and relationships adequately. To address this issue, this study proposes a semantic-based transformation framework that reinterprets bibliographic data at the level of semantic units. The framework includes a normalization model and a transformation module consisting of semantic analysis, entity extraction, relationship generation, and RDF transformation. This approach enables the reconstruction of bibliographic data into an entity–relationship structure, facilitating their conversion into BIBFRAME while preserving semantic integrity and enhancing interoperability and reusability.
  • Seungmin Lee

    Chung-Ang University, Seoul, South Korea

    Seungmin Lee is a Professor in the Department of Library and Information Science at Chung-Ang University, Seoul, South Korea. He received his Ph.D. in Information Science from Indiana University Bloomington. He has served as Chair of the Cataloging Committee, Chair of the Librarian Certification Committee, and Chair of the Planning and Policy Committee of the Korean Library Association (KLA). He is currently Vice President of the Korean Biblia Society for Library and Information Science and Editor-in-Chief of the Journal of the Korean Library and Information Science Society. His research interests include library classification, metadata, bibliographic ontologies, and knowledge organization. His recent research focuses on AI-driven metadata generation and AI literacy.

Beyond Metadata Completeness: A Multidimensional Interoperability Readiness Framework for National Web-Scale Discovery Services

Authors: Dwi Fajar Saputra, Taufik Asmiyanto, Nina Mayesti

National web-scale discovery services (WSDS) depend on the sustained interoperability of institutional repositories to deliver reliable access to scholarly content. Existing evaluations predominantly assess interoperability through metadata completeness at registration, overlooking the operational dimensions that determine long-term integration. This paper introduces the Multidimensional Interoperability Readiness (MIR) framework, which integrates three analytically distinct dimensions: metadata completeness, harvesting sustainability, and metadata capacity. The framework is validated empirically through analysis of the Indonesia One Search (IOS) registry and OAI-PMH harvesting dataset comprising 29 registration fields across four functional categories. Findings reveal a structural decoupling between metadata completeness and harvesting sustainability: most repositories register adequate descriptive metadata but fail to sustain active harvesting over time. Journal repositories demonstrate substantially higher interoperability readiness than dataset and ETD repositories. The MIR framework offers a principled basis for evaluating national discovery infrastructure, with concrete governance implications for repository onboarding, monitoring, and differentiated intervention strategies. This study contributes to the DCMI 2026 theme of Data Integrity and Reliability, arguing that trustworthy discovery infrastructure requires verified, sustained metadata flow—not merely administrative registration.
  • Dwi Fajar Saputra

    Faculty of Humanities, Universitas Indonesia

    Dwi Fajar Saputra is a doctoral candidate in Information Studies at the Faculty of Humanities, Universitas Indonesia. His research focuses on digital library systems, repository interoperability, metadata quality, and web-scale discovery services. His doctoral research examines the sustainability of national aggregation infrastructure, with particular emphasis on metadata readiness and harvesting continuity in Indonesia One Search as a national web-scale discovery service.

Decolonizing Metadata: Lessons from Stolen Relations’ Controlled Vocabulary Development

Authors: Mairelys Lemus-Rojas, Patrick Rashleigh, Khanh Vo

Metadata should be understood as an interpretive practice and not just as a technical framework for describing digital objects. It holds power and facilitates community engagement. Within the digital humanities arena, metadata plays a pivotal role in narrating and recovering stories that have remained obscured or misrepresented in historical records. This raises a fundamental question: how can we more humanely describe Indigenous communities whose identities and relationships to kinship and culture have been misrepresented in colonial records? This paper examines the role of controlled vocabularies in shaping the representation of Indigenous histories in Stolen Relations’ digital humanities project, positioning metadata as a form of archival intervention. It demonstrates how iterative feedback informs the refinement, implementation, or creation of controlled vocabularies and positions metadata as a space where descriptive practices are examined and reshaped.
  • Mairelys Lemus-Rojas

    University of Central Florida

    Mairelys Lemus-Rojas is the Head of Digital Scholarship at the University of Central Florida Libraries. She oversees Digital Initiatives, Open Scholarship, and the Digital Exploration Center, a digital scholarship hub to learn, engage, and collaborate on digital projects. Previously, she worked as the Head of Open Metadata Production and Initiatives at Brown University. As a strong advocate for open knowledge and an active contributor to Wikimedia projects, Mairelys is committed to democratizing access to information by amplifying the visibility of underrepresented communities.

Does AI-Encoded Meaning Align with Human Meaning?

Authors: Zhenhua Wang, Aixin Yao and Ming Ren

AI systems are increasingly used to support metadata processing and investigation, and the value of that support depends on whether AI-encoded meaning aligns with human meaning. However, AI encodes word meaning through distributional and contextual representations, and it remains unclear whether such representations preserve the meaning values of the human language system. We answer this question through Zipf’s meaning law, which links word frequency to the number of word meanings. We compare multiple AI-induced meaning estimates with human-measured meaning. To quantify alignment, we propose Meaning-Zipf Deviation (MZD), which covers continuous meaning distributions and measures their divergence with reliability adjustment. Extensive experiments show that human words consistently follow Zipf’s meaning law. AI-encoded meanings also exhibit Zipfian regularities, inheriting part of the statistical structure of human language. However, AI meaning distributions remain flatter than human distributions, with lower scaling exponents and non-negligible MZD values. Larger models do not reduce this gap. AI tends to bind words to context-conditioned senses rather than preserve their broader polysemous potential.
  • Ming Ren

    School of Information Resource Management, Renmin University of China

    Ming Ren is a Professor, Doctoral Supervisor, and Vice Dean at the School of Information Resource Management, Renmin University of China. Her research focuses on big data analytics and applications, AI, and data element markets. She has led multiple national-level research projects, published in leading journals such as JASIST, JOI, and TOIS, authored two academic monographs, and led the annual Data Element Marketization Promotion Index report. She serves as a committee member in national and professional societies related to information technology and knowledge organization.

From Multi-Notation Assignment to Faceted Classmark Synthesis in K-KOS: An Exploratory Application of the Integrative Levels Classification with a Classmark Builder

Authors: Ziyoung Park, Claudio Gnoli, Daniele Morelli

KOS registry entries often cover multidimensional topics that resist representation by a single classmark, making multi-notation assignment a common but semantically limited approach. This study explores how selected K-KOS entries can be reclassified under the developing version of the Integrative Levels Classification (ILC) and synthesized into structured classmarks using the ILC Classmark Builder, with multi-notation assignment serving as the baseline representation. Based on three representative cases, candidate classmarks were manually constructed and examined through baseline assignment, free-facet combination, and facet synthesis, with support from the Builder for retrieval, combination, and syntactic validation, and followed by expert review of the resulting classmarks. The findings show that notation synthesis makes semantic relations more explicit and enhances the structural expressiveness of KOS representation, while also revealing that successful synthesis depends on human judgment in syntactic disambiguation, conceptual interpretation, and evaluation among alternative formulations. The study demonstrates both the feasibility and the practical challenges of this transition, and confirms the value of tool support — while underscoring that human judgment remains indispensable.
  • Ziyoung Park

    Hansung University, South Korea

    Ziyoung Park is a professor of Library and Information Science at Hansung University, Seoul, Republic of Korea, where she also serves as Director of the University Library. Her research focuses on knowledge organization systems (KOSs), including the design of classification systems and metadata modeling. She serves as a member of ISKO Italy, an editor for BARTOC, and a program committee member of NKOS. She leads research projects on designing and building registries for Korean KOSs. Her additional work includes developing a bibliographic database of German literature translated into Korean.

Grounding AI Subject Cataloguing in Standards and Policy: An MCP Server for Live LC Authority Lookup and a DITA-Encoded SHM for RAG

Authors: May Chan

Large language models (LLMs) applied to subject cataloguing using Library of Congress Subject Headings (LCSH) tend to generate headings and strings that are syntactically plausible but policy-invalid, bypassing the controlled vocabularies and governing rules that subject collocation depends on. This paper describes a two-track experiment to address the lack of grounding in standards and policy. The first track is lc-vocabularies-mcp, a Model Context Protocol (MCP) server connecting an LLM to live Library of Congress (LC) linked data APIs, enabling real-time authority validation for LCSH and related controlled vocabularies. The second track is a conversion of the Subject Headings Manual (SHM) from PDF to structured DITA (Darwin Information Typing Architecture), designed as a machine-actionable retrieval-augmented generation (RAG) corpus. Together, the two tracks support a five-part subject cataloguing workflow in which non-parametric knowledge is supplied to Claude at each part where parametric knowledge alone is insufficient. The paper reports on the architecture of lc-vocabularies-mcp, the DITA conversion methodology, and an evaluation design that tests whether policy-grounded retrieval improves AI-assisted subject cataloguing.
  • May Chan

    University of Toronto

    May Chan is Head, Metadata Services at the University of Toronto Libraries, with 17 years of prior experience in public libraries at Vancouver and Burnaby, British Columbia. A Carpentries Instructor Trainer, she is committed to building computational and technical literacy among library practitioners, and has been active in cataloguing training and professional development in a variety of roles throughout her career. May currently serves as co-chair of the PCC Standing Committee on Training and the SCT Linked Data Training Task Group.

Reconstructing Metadata Literacy in the AI Era: A Conceptual Framework and Educational Reflections for LIS Education

Authors: Ba Xi, Nurussobah Hussin and Hanis Diyana Kamarudin

This paper calls for a rethinking of metadata literacy in the age of AI. Previous discussions often defined metadata literacy as knowledge of descriptive structures or skills in creating and using metadata records. But this understanding is no longer enough – though still necessary – when AI systems create, transform, rank, summarise and recommend information at scale. In such environments, metadata is not simply about post hoc description of resources; it structures provenance, visibility, accountability, cultural representation, and the conditions under which machine outputs can be interpreted and trusted. Drawing on scholarship in metadata, metadata instruction, AI literacy, and information literacy, this conceptual paper presents a reconstructed model of metadata literacy for LIS education. The model is built on five dimensions: understanding of infrastructure, contextual description and representation, provenance and disclosure, evaluation of algorithmically mediated outputs, and intervention in terms of ethics and governance. The paper also gives examples of learning tasks and assessment evidence to illustrate how the model can be applied. It argues for viewing metadata literacy not as a narrow technical specialisation but as a fundamental educational response to AI-mediated knowledge environments.
  • Ba Xi

    Faculty of Information Management, Universiti Teknologi MARA, Shah Alam, Malaysia; Hebei University of Economics and Business, Shijiazhuang, China

    I am a PhD candidate in Information Management at Universiti Teknologi MARA, Malaysia and a researcher at Hebei University of Economics and Business, China. My research focuses on AI literacy and community libraries. I specialize in qualitative approaches, including in-depth interviews and ethnography. I have published multiple academic papers in both Chinese and English and have led a research project in this field.

Together in Practice: Comparing LCC and DDC Assignment Across Library of Congress Bibliographic Records

Authors: Kai Li, Inkyung Choi, Jessica Yi-Yun Cheng, Zach Jenkins, and Brian Dobreski

Library of Congress Classification (LCC) and Dewey Decimal Classification (DDC) are two of the most widely used knowledge organization systems in libraries, yet empirical understanding of how they align and diverge in cataloging practice at scale remains limited. This paper examines co-assignment patterns between LCC and DDC classes using 4,042,962 dual-classified bibliographic records drawn from the Library of Congress's book catalog. Through descriptive quantitative analysis and bipartite network analysis, we identify areas of strong and weak structural correspondence between the two systems. Results reveal that well-defined humanities disciplines — including law, fine arts, religion, literature, and history — exhibit high one-to-one alignment, while broader and more applied domains such as social sciences, technology, and computer and information science show markedly dispersed cross-system mappings. A structural asymmetry is also evident: LCC classes tend to map more sharply to single DDC counterparts than vice versa. Network analysis identifies seven disciplinary communities and highlights Social Sciences and Technology as key interdisciplinary hubs, while second-level classes reveal contrasting topologies — a hub-centric star structure for LCC:G and a fragmented, multi-polar constellation for DDC:6XX. These findings carry practical implications for library reclassification projects, cataloging workflows, and the reuse of bibliographic metadata in emerging technological environments.
  • Inkyung Choi

    Sungkyunkwan University

    Dr. Inkyung Choi (she/her) is an Assistant Professor at Sungkyunkwan University (SKKU), Seoul, South Korea. Her research focuses on metadata semantics and schema design, scalable metadata aggregation, data provenance modeling, and ontology engineering for data integration and interoperability. She holds a Ph.D. in Information Studies from the University of Wisconsin-Milwaukee and an M.S. from Syracuse University. Prior to joining SKKU, she was an Associate Research Scientist at OCLC and a Teaching Assistant Professor at the University of Illinois Urbana-Champaign.