Hedden Information Management

Taxonomies, Thesauri,
and Controlled Vocabularies

Consulting Services
Training Services
Taxonomy Types and Definitions
Resources
Portfolio of Taxonomy Projects

Taxonomy Consulting Services

  • Taxonomy design and creation
    • For websites
    • For intranets
    • For content management systems
    • For SharePoint
  • Taxonomy review and evaluation
  • Metadata design
  • Taxonomy web user interface design
  • Faceted navigation/search design
  • Taxonomy merging, integration, and "mapping" (metadata "crosswalks")
  • Autoclassification or machine-aided indexing term rule writing and term weighting
  • Tagging/indexing guidelines and policy writing
  • Taxonomy maintenance guidelines and policy writing
  • Taxonomy/thesaurus software testing and evaluation

The above contract taxonomy consulting and development services are offered on an hourly basis for clients in all industries. For larger or more complex projects that require additional support (project management, technical, or subject matter expertise), Heather Hedden is affiliated with the Knowledge Management practice of the consulting firm Project Performance Corporation. If you think your project would be better suited to the capabilities of Project Performance Corporation, contact Heather at heather.hedden@ppc.com.

Training in Taxonomy Creation

Heather Hedden teaches a 5-week online workshop, "Taxonomies and Controlled Vocabularies," through the Simmons College Graduate School of Library and Information Science Continuing Education Program. Simmons online course information

An independent version of this online course is also available to corporate groups of two or more at any time and can be taken on a self-paced schedule. Corporate online course information

Heather Hedden also offers a full-day workshop on creating taxonomies and controlled vocabularies at conferences. It has also occasionally been offered as an onsite workshop through Simmons College Graduate School of Library and Information Science Continuing Education Program. Workshop description

Heather Hedden also provides customized onsite corporate training and workshops on request. This is a modified version

Taxonomy Types and Definitions

We often use the single word "taxonomy" to cover all of the following kinds of knowledge organization systems. The services and training offered by Hedden Information Management cover all of these.

Controlled Vocabularies
A controlled vocabulary is a restricted list of words or terms used for labeling, indexing, or categorizing. It is controlled because only terms from the list may be used for the subject area covered by the controlled vocabulary. It is also controlled because, if it is used by more than one person, there is control over who adds terms to the list, when, and how. The list could grow, but only under defined policies. Most controlled vocabularies also have some form of cross-references pointing from one or more “non-preferred” terms to the designated “preferred” term. Only if a controlled vocabulary is very small and easily browsed, such as on a single page, might such synonyms be excluded.
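As a rough illustration of the idea, the sketch below models a controlled vocabulary as a set of preferred terms plus cross-references from non-preferred terms. The sample terms and the function names (resolve_term, tag_document) are hypothetical, chosen only for this example.

```python
# Hypothetical sample vocabulary; the terms are illustrative only.
PREFERRED_TERMS = {"Automobiles", "Motorcycles", "Bicycles"}

# Cross-references from non-preferred terms to the designated preferred term.
USE_REFERENCES = {
    "Cars": "Automobiles",
    "Autos": "Automobiles",
    "Motorbikes": "Motorcycles",
    "Bikes": "Bicycles",
}


def resolve_term(entry):
    """Return the preferred term for an entry, or None if it is not in the vocabulary."""
    if entry in PREFERRED_TERMS:
        return entry
    return USE_REFERENCES.get(entry)


def tag_document(candidate_terms):
    """Keep only terms the vocabulary allows, normalized to preferred terms."""
    tags = []
    for entry in candidate_terms:
        preferred = resolve_term(entry)
        if preferred is not None and preferred not in tags:
            tags.append(preferred)
    return tags
```

Here an indexer who enters "Cars" ends up applying "Automobiles", while a term that is not on the list at all (say, "Scooters") is rejected until someone with authority adds it under the vocabulary's policies.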

Thesauri
A thesaurus is a more structured kind of controlled vocabulary. It provides information about each term and its relationships to other terms within the same thesaurus. In addition to clearly specifying which terms can be used as synonyms (called “used for”), a thesaurus also indicates which terms are more specific (narrower terms), which are broader, and which are related terms. National and international standards have been developed to provide guidance on creating such thesauri, including ISO 2788, ISO 5964, and ANSI/NISO Z39.19. The standards explain the relationships in great detail and group them into three types: hierarchical (Broader Term/Narrower Term), associative (Related Term), and equivalence (Use/Used for).
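The three relationship types can be sketched in a few lines of code. This is a minimal model, not an implementation of any standard: the Thesaurus class and the sample terms are hypothetical, and BT, NT, RT, USE, and UF are the customary abbreviations for broader term, narrower term, related term, use, and used for.

```python
from collections import defaultdict

# Reciprocal of each relationship code:
# BT/NT (hierarchical), RT (associative), USE/UF (equivalence).
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT", "USE": "UF", "UF": "USE"}


class Thesaurus:
    def __init__(self):
        # term -> relationship code -> set of target terms
        self.relations = defaultdict(lambda: defaultdict(set))

    def add(self, term, code, target):
        """Record a relationship and its reciprocal in one step."""
        self.relations[term][code].add(target)
        self.relations[target][RECIPROCAL[code]].add(term)

    def get(self, term, code):
        """Return the targets of one relationship type for a term, sorted."""
        return sorted(self.relations[term][code])


thesaurus = Thesaurus()
thesaurus.add("Dogs", "BT", "Mammals")              # hierarchical
thesaurus.add("Dogs", "RT", "Veterinary medicine")  # associative
thesaurus.add("Canines", "USE", "Dogs")             # equivalence
```

Storing the reciprocal automatically keeps the two directions of each relationship from drifting apart, which is one of the consistency checks that dedicated thesaurus software typically performs.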

A literature retrieval thesaurus, like a dictionary-thesaurus (such as Roget's), lists similar terms at each controlled vocabulary term entry. The difference is that in a dictionary-thesaurus the associated terms may substitute for the entry term only in particular contexts, so the user must judge each case, and in some contexts some of the terms are not appropriate. The literature retrieval thesaurus, on the other hand, is designed to be used in all contexts, regardless of a specific term usage or document. Its synonyms or near-synonyms must therefore be suitably equivalent in all circumstances.

Taxonomies
The word taxonomy means the science of classifying things, and traditionally the classification of plants and animals, as in the Linnaean classification system. It has become a popular term now for any hierarchical classification or categorization system. Thus, we no longer speak of “taxonomy” as a science but rather “a taxonomy” (plural: taxonomies) as a kind of controlled vocabulary that has a hierarchy (broader term/narrower terms), but not necessarily the related-term relationships and other features of a standard thesaurus.

Unlike a thesaurus, where a given term may or may not have broader or narrower terms, in a taxonomy all terms belong to a single, large hierarchy that encompasses all concepts of a certain class, category, or facet. The structure is sometimes referred to as a “tree” and the terms as “nodes” in the tree. Sometimes "a taxonomy" refers to a single hierarchical tree, and sometimes "a taxonomy" means the collection of term hierarchies available in combination for searching or browsing a given content repository.
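The tree-and-nodes structure can be sketched as follows. The TaxonomyNode class and the sample labels are hypothetical, and the sketch assumes the simplest case in which each term has exactly one broader term.

```python
class TaxonomyNode:
    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Labels from the top of the hierarchy down to this node."""
        node, labels = self, []
        while node is not None:
            labels.append(node.label)
            node = node.parent
        return list(reversed(labels))

    def descendants(self):
        """All narrower terms below this node, depth-first."""
        found = []
        for child in self.children:
            found.append(child.label)
            found.extend(child.descendants())
        return found


animals = TaxonomyNode("Animals")
mammals = TaxonomyNode("Mammals", animals)
dogs = TaxonomyNode("Dogs", mammals)
birds = TaxonomyNode("Birds", animals)
```

The path of a node is what a website typically displays as a breadcrumb trail, and the descendants of a node are the terms a search on that node would usually include.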

A variation on the collection of hierarchies is a faceted taxonomy. Each facet is typically its own hierarchy of terms, although the terms within a facet do not have to be hierarchical and may be a flat list under the facet category label. What distinguishes facets is that the user may select multiple terms, one from each facet, in combination to execute a complex search. Furthermore, facets must represent different aspects or dimensions of a query, such as location, topic, source, or type.
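A faceted search of this kind can be sketched as a filter that requires a match on every selected facet. The sample documents, facet values, and the faceted_search function are hypothetical, and the sketch assumes flat facets with one value per document.

```python
# Hypothetical content items, each tagged with one term per facet.
DOCUMENTS = [
    {"title": "Boston housing report", "location": "Boston", "topic": "Housing", "type": "Report"},
    {"title": "Boston transit map", "location": "Boston", "topic": "Transportation", "type": "Map"},
    {"title": "Chicago housing report", "location": "Chicago", "topic": "Housing", "type": "Report"},
]


def faceted_search(documents, selections):
    """Return titles of documents matching the selected term in every chosen facet."""
    return [
        doc["title"]
        for doc in documents
        if all(doc.get(facet) == term for facet, term in selections.items())
    ]


housing_reports = faceted_search(DOCUMENTS, {"topic": "Housing", "type": "Report"})
boston_reports = faceted_search(DOCUMENTS, {"location": "Boston", "type": "Report"})
```

Each added facet selection narrows the result set, which is why the facets need to describe different dimensions of the content rather than overlapping ones.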

Ontologies
An ontology is a set of concepts, with attributes and with relationships between the concepts that carry specific meanings, which together define a domain of knowledge and are expressed in a machine-readable format. Certain applications of ontologies, as in artificial intelligence or biomedical informatics, may define a domain of knowledge through terms and relationships as the end goal, rather than being used for any tagging. In the area of taxonomies and information science, however, an ontology can be seen as a more complex type of thesaurus, in which instead of having simply "related term" relationships, there are various customized relationship pairs that contain specific meaning, such as "owns" and a reciprocal "is owned by."
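The customized relationship pairs can be sketched as subject-relation-object statements, each stored with its reciprocal. This is only an illustration of the idea, not a real ontology language: the Ontology class and the sample company names are hypothetical, and the "employs"/"is employed by" pair is an assumed example alongside the "owns"/"is owned by" pair mentioned above.

```python
# Each customized relationship is defined together with its reciprocal.
INVERSE = {
    "owns": "is owned by",
    "is owned by": "owns",
    "employs": "is employed by",   # assumed pair, for illustration only
    "is employed by": "employs",
}


class Ontology:
    def __init__(self):
        # Set of (subject, relation, object) statements.
        self.triples = set()

    def relate(self, subject, relation, obj):
        """Store a statement and its reciprocal statement."""
        self.triples.add((subject, relation, obj))
        self.triples.add((obj, INVERSE[relation], subject))

    def query(self, subject, relation):
        """Return all objects linked to the subject by the given relation, sorted."""
        return sorted(o for s, r, o in self.triples if s == subject and r == relation)


ontology = Ontology()
ontology.relate("Acme Holdings", "owns", "Acme Foods")
ontology.relate("Acme Holdings", "owns", "Acme Tools")
ontology.relate("Acme Foods", "employs", "Jane Doe")
```

Unlike a generic "related term" link, each relation here says how the two concepts are related, so a query can ask specifically what a company owns or who owns it.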

Resources on Taxonomies and Thesauri

American Society for Indexing - Taxonomies & Controlled Vocabularies Special Interest Group

SLA Taxonomy Division

Taxonomy Community of Practice Wikispace

Taxonomy Warehouse
Directory of taxonomies and controlled vocabularies

Thesaurus principles and practice
Willpower Information

Managing taxonomies strategically
Montague Institute article

Content Classification
EncycloZine article

Taxonomy Community of Practice Yahoo Group
Discussion group dedicated to taxonomies