"जिबन पर्यन्त शिक्षाका लागि पुस्तकालय (Library for lifelong education)"

Monday, November 17, 2025

Unlocking the Power of Controlled Vocabularies: A Practical Guide for Knowledge Managers in the Digital and Semantic Web Era

Introduction

As libraries and knowledge institutions evolve into digital-first environments, the need for smarter, more consistent, and better-connected information systems has never been greater. Whether managing a national collection, a university repository, or a specialized research archive, professionals today face the challenge of organizing knowledge in ways that are both human-friendly and machine-readable. This is where controlled vocabularies, thesauri, ontologies, and semantic-web technologies step in.

These tools form the backbone of modern knowledge organization. They help standardize terminology, link related concepts, support multilingual access, and allow systems across the world to communicate seamlessly. With the rise of Linked Data, libraries are no longer isolated information silos—they are active nodes in a global web of knowledge.

This article introduces the key technologies shaping this landscape—from standards like SKOS and formats like JSON-LD, to platforms such as Skosmos and Finto, and query languages like SPARQL. It also explores essential international vocabularies including LCSH, MeSH, AAT, AGROVOC, and EuroVoc, explaining how they enrich cataloging and support advanced discovery.

For knowledge managers in Nepal and around the world, understanding these tools is more than a technical exercise. It is a strategic step toward building interoperable, inclusive, and future-ready information services. By embracing semantic-web practices, libraries can unlock new opportunities for collaboration, innovation, and global visibility—strengthening their role at the heart of the knowledge ecosystem.

 

1. Controlled Vocabulary

A controlled vocabulary is a curated list of terms used to ensure consistency in how information is described.
In libraries, it helps with:

  • Standardizing subject headings
  • Improving search accuracy
  • Linking related concepts

Examples you already know: LCSH, MeSH, AAT, AGROVOC, etc.
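
To make the idea concrete, here is a minimal sketch of how a controlled vocabulary enforces consistency: variant terms typed by catalogers or users are mapped to a single preferred term. The tiny vocabulary and the function names are hypothetical, invented only for illustration; real vocabularies such as LCSH or AGROVOC are far larger and richer.

```python
# Hypothetical mini-vocabulary: each preferred term lists the variant
# (non-preferred) terms that should be mapped to it.
VOCABULARY = {
    "Agriculture": ["farming", "agronomy"],
    "Libraries": ["library", "library services"],
}


def build_lookup(vocabulary):
    """Map every preferred and variant term (lower-cased) to its preferred term."""
    lookup = {}
    for preferred, variants in vocabulary.items():
        lookup[preferred.lower()] = preferred
        for variant in variants:
            lookup[variant.lower()] = preferred
    return lookup


def normalize(term, lookup):
    """Return the preferred term, or None when the term is not in the vocabulary."""
    return lookup.get(term.strip().lower())


LOOKUP = build_lookup(VOCABULARY)
print(normalize("Farming", LOOKUP))
print(normalize(" library ", LOOKUP))
print(normalize("astrology", LOOKUP))
```

Returning None for an unknown term, rather than guessing, leaves the decision about new terms to the vocabulary's editors.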


2. Thesaurus & Ontology Service: Finto

Finto is the Finnish thesaurus and ontology service, maintained by the National Library of Finland. It hosts and publishes thesauri, ontologies, and other vocabularies openly on the web.

Why it matters for libraries:

  • Lets you browse and search thesauri online
  • Helps you reuse standard vocabularies in cataloging
  • Supports linked-data technology (SKOS, RDF)
  • Good model for national-level vocabulary services

3. Publishing Vocabularies using Skosmos

Skosmos is an open-source web application used to publish thesauri and vocabularies online.

Libraries use Skosmos for:

  • Hosting their controlled vocabularies
  • Giving users a friendly interface to browse terms
  • Providing APIs for integrating vocabularies into library systems
  • Exporting data in SKOS, a standard for thesauri on the semantic web

Think of it as:
“A website tool to publish and provide access to thesauri in a structured way.”


4. REST API

A REST API allows software systems to communicate over the web using ordinary HTTP requests, usually exchanging data as JSON.

For library vocabulary services:

  • You can search terms programmatically
  • Fetch definitions, broader/narrower terms
  • Integrate vocabularies into OPACs, digital libraries, or IR systems

Example use:
Instead of typing subject terms by hand, a digital library can send a search request to a thesaurus's REST API and let catalogers choose from the matching terms.
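
As a rough sketch of what such a call involves, the Python below builds a search URL and reads terms out of a JSON response. The base URL is hypothetical. The `/search` path and the `query`, `vocab`, and `lang` parameters follow the pattern of the Skosmos REST API, but they are assumptions here and should be checked against the documentation of the service you use. The sample response is invented to show the assumed shape; no request is actually sent.

```python
from urllib.parse import urlencode

# Assumed base URL of a Skosmos-style REST API; replace with your own service.
BASE_URL = "https://vocab.example.org/rest/v1"


def build_search_url(query, vocab, lang="en", base_url=BASE_URL):
    """Build the URL for a term search request."""
    params = {"query": query, "vocab": vocab, "lang": lang}
    return f"{base_url}/search?{urlencode(params)}"


def extract_terms(response):
    """Pull (label, URI) pairs out of a decoded JSON search response."""
    return [(hit["prefLabel"], hit["uri"]) for hit in response.get("results", [])]


# Illustrative response in the assumed shape (not real data).
SAMPLE_RESPONSE = {
    "results": [
        {"uri": "http://example.org/vocab/c1", "prefLabel": "Agriculture"},
        {"uri": "http://example.org/vocab/c2", "prefLabel": "Agricultural policy"},
    ]
}

print(build_search_url("rice farming", "demo"))
print(extract_terms(SAMPLE_RESPONSE))
```

In a live system the URL would be fetched with an HTTP client and the JSON body passed to `extract_terms`. Storing the URI alongside the label is what later allows the record to be linked to the vocabulary.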


5. JSON-LD

JSON-LD (JavaScript Object Notation for Linked Data) is a lightweight format used for sharing linked data.

Why librarians care:

  • Makes metadata readable by machines (Google, Wikidata, etc.)
  • Connects library data with the wider semantic web
  • Works well with vocabularies in SKOS and RDF

Example:
A book record expressed in JSON-LD can point to AGROVOC or LCSH concepts by their URIs, so other systems can follow those links automatically.
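
A minimal sketch of such a record, built in Python and printed as JSON-LD, might look like the following. It uses the schema.org vocabulary for the book description, which is one common choice rather than the only one. The title, author, and concept URI are placeholders; in practice the URI would be the real identifier of an AGROVOC or LCSH concept.

```python
import json


def book_record(title, author, subject_uris):
    """Build a JSON-LD description of a book using schema.org terms."""
    return {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": title,
        "author": {"@type": "Person", "name": author},
        # Each subject is a link (URI) to a vocabulary concept, not a text string.
        "about": [{"@id": uri} for uri in subject_uris],
    }


record = book_record(
    "Rice Farming in the Hills",           # hypothetical title
    "A. Author",                           # hypothetical author
    ["http://example.org/vocab/c_rice"],   # placeholder concept URI
)
print(json.dumps(record, indent=2))
```

Because the subject is a URI rather than a typed word, any system that understands the vocabulary can look up the concept's labels in other languages and its broader or narrower terms.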


6. How Do We Put a Thesaurus on the Web?

The process has four steps:

Step 1 — Create the thesaurus in SKOS format

(SKOS = Simple Knowledge Organization System; the standard model for thesauri on the web)
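
To show what SKOS data looks like, the sketch below writes one concept in Turtle, a common text syntax for RDF. It uses only the Python standard library so that the structure stays visible; a real project would more likely use an RDF library such as rdflib or a vocabulary editor. The example.org URIs and the function names are hypothetical.

```python
def turtle_literal(text, lang):
    """Format a language-tagged Turtle string, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def concept_to_turtle(uri, labels, broader=None):
    """Serialize one SKOS concept as Turtle.

    labels maps a language code to the preferred label in that language.
    """
    lines = [f"<{uri}> a skos:Concept"]
    for lang, label in labels.items():
        lines.append(f"    skos:prefLabel {turtle_literal(label, lang)}")
    if broader:
        lines.append(f"    skos:broader <{broader}>")
    return " ;\n".join(lines) + " .\n"


PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n\n"

turtle = PREFIX + concept_to_turtle(
    "http://example.org/vocab/rice",              # placeholder concept URI
    {"en": "Rice", "ne": "धान"},
    broader="http://example.org/vocab/cereals",   # placeholder broader concept
)
print(turtle)
```

Each concept gets its own URI, one preferred label per language, and a `skos:broader` link to its parent. Those three elements are enough to express the hierarchy and the multilingual labels of a basic thesaurus.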

Step 2 — Store the data in a triple store (RDF database)

Examples:

  • Fuseki
  • Virtuoso
  • GraphDB

Step 3 — Publish using a tool like:

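  • Skosmos (open-source publishing application; it also powers Finto)
  • VocBench (primarily a collaborative editing platform for vocabularies)
  • Finto (a hosted service, built on Skosmos)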

Step 4 — Provide web access

  • Human interface (browsing)
  • Machine access (REST API, SPARQL endpoint)

7. Major International Thesauri (What They Are and Why They Matter)

LCSH – Library of Congress Subject Headings

  • Widely used in libraries worldwide
  • Very broad subject coverage
  • Ideal for general academic and public libraries

MeSH – Medical Subject Headings

  • Used in medicine, health sciences, and biomedical research
  • Highly structured; excellent for medical libraries

STW – Economics Thesaurus (Germany)

  • Covers economics, finance, business
  • Good for academic research institutions

Iconclass

  • Used for art and iconography
  • Helps describe visual content, paintings, images

TheSoz – Thesaurus for the Social Sciences

  • Useful for social science libraries, NGOs, think tanks

EuroVoc

  • A multilingual EU thesaurus
  • Useful for legal, policy, governance, development studies

GND / SWD

  • German integrated authority file (GND) for persons, subjects, and corporate bodies; the older subject authority file (SWD) was merged into it in 2012
  • High-quality linked-data model
  • Often used for authority control work

AGROVOC

  • FAO’s multilingual agricultural thesaurus
  • Useful for agriculture, food security, environment

AAT – Art & Architecture Thesaurus (Getty)

  • Covers art, design, architecture
  • Widely used in museums and heritage institutions

8. Semantic Web in Libraries

Semantic web technologies allow library data to connect with global knowledge systems.

Benefits:

  • Better discovery
  • More accurate linking
  • Reuse of global vocabularies
  • Smarter search and knowledge services

Linked data transforms libraries into part of “a web of knowledge,” not isolated silos.


9. SPARQL Access

SPARQL is a query language for retrieving and filtering linked data stored in RDF format.

What you can do with SPARQL:

  • Search all terms related to a concept
  • Retrieve broader/narrower terms
  • Find SKOS concepts connected to records
  • Integrate vocabularies with machine-learning tools

Libraries use SPARQL endpoints to:

  • Build advanced search tools
  • Run analytics on vocabularies
  • Connect catalogs with external linked data resources

Example SPARQL questions:

  • “Give me all concepts narrower than ‘Agriculture’ from AGROVOC.”
  • “Find all terms in Nepali with English equivalents.”
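
The two questions above can be sketched as SPARQL queries, here assembled as strings in Python. This assumes the vocabulary is modelled in SKOS (`skos:broader`, `skos:prefLabel`) and that labels carry language tags such as "ne" and "en". The concept URI and endpoint address passed in are placeholders, and the function names are hypothetical.

```python
from urllib.parse import urlencode

SKOS = "http://www.w3.org/2004/02/skos/core#"


def narrower_query(concept_uri, lang="en"):
    """SPARQL query: concepts whose skos:broader is the given concept."""
    return f"""PREFIX skos: <{SKOS}>
SELECT ?concept ?label WHERE {{
  ?concept skos:broader <{concept_uri}> ;
           skos:prefLabel ?label .
  FILTER (lang(?label) = "{lang}")
}}"""


def bilingual_query(lang_a="ne", lang_b="en"):
    """SPARQL query: concepts that have preferred labels in both languages."""
    return f"""PREFIX skos: <{SKOS}>
SELECT ?concept ?labelA ?labelB WHERE {{
  ?concept skos:prefLabel ?labelA , ?labelB .
  FILTER (lang(?labelA) = "{lang_a}" && lang(?labelB) = "{lang_b}")
}}"""


def request_url(endpoint, query):
    """Encode a query for an HTTP GET request, following the SPARQL protocol."""
    params = urlencode({"query": query})
    return f"{endpoint}?{params}"


print(narrower_query("http://example.org/vocab/agriculture"))
print(bilingual_query())
```

The first query returns only direct children. Writing `skos:broader+` instead of `skos:broader` would follow the hierarchy to any depth. `request_url` shows how the query text is sent to an endpoint as the `query` parameter of a GET request.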

Bringing It All Together for Your Profession

If you are a library leader or knowledge director, these technologies help you:

1. Build a national-level thesaurus service (Nepal can model Finto/Skosmos).

Useful for national digital libraries, knowledge hubs, and educational platforms.

2. Improve cataloging quality & discoverability

By reusing LCSH, MeSH, AGROVOC, etc.

3. Integrate semantic web tools into Nepalese library systems

Using JSON-LD, REST APIs, SPARQL.

4. Enable multilingual and cross-institutional interoperability

Essential for Nepal's multilingual context.

5. Connect Nepalese library data to global networks

Making Nepal visible in the global linked-data ecosystem.

Summary

In today’s data-driven world, libraries and knowledge institutions rely heavily on structured, consistent, and interoperable metadata. Controlled vocabularies and semantic-web technologies are becoming essential tools for organizing information, improving discovery, and connecting local knowledge to global networks. This article introduces key concepts—such as controlled vocabularies, thesauri, ontologies, SKOS, Skosmos, Finto, REST APIs, JSON-LD, and SPARQL—and explains how they shape modern knowledge management.

Readers will find clear explanations of major international vocabularies including LCSH, MeSH, AAT, EuroVoc, AGROVOC, STW, and more. The article highlights how these resources strengthen cataloging, enhance multilingual access, and support linked-data integration across libraries, archives, and digital repositories. It also outlines the technical pathway for publishing a thesaurus on the web using the SKOS standard, RDF triple stores, and tools like Skosmos.

Whether you are working in a national library, a university, a research center, or a heritage institution, this guide shows how semantic-web practices can elevate metadata quality, promote interoperability, and help build smarter knowledge ecosystems. For knowledge managers in Nepal and around the world, adopting these technologies opens the door to more connected, discoverable, and future-ready information services.

 
