Hedden Information Management

Taxonomies, Thesauri,
and Controlled Vocabularies

Taxonomy Types and Definitions

In business use, the single word "taxonomy" may cover any and all of the following variations of knowledge organization systems.

Controlled Vocabularies
A controlled vocabulary is a restricted list of words or terms used for labeling, indexing, or categorizing. It is controlled because only terms from the list may be used for the subject area covered by the controlled vocabulary. It is also controlled because, if it is used by more than one person, there is control over who adds terms to the list, when, and how. The list can grow, but only under defined policies. Most controlled vocabularies also have some form of cross-reference pointing from one or more “non-preferred” terms to the designated “preferred” term. Only if a controlled vocabulary is very small and easily browsed, such as on a single page, might such cross-references be omitted.
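As a rough illustration of the preferred/non-preferred mechanism described above, the following Python sketch models a tiny controlled vocabulary. The class name, method names, and sample terms are all hypothetical and chosen only for this example; real vocabulary tools offer far more (governance workflows, scope notes, and so on).

```python
class ControlledVocabulary:
    """A restricted term list with cross-references to preferred terms."""

    def __init__(self, preferred_terms):
        # Map a case-insensitive key to the preferred term as displayed.
        self._preferred = {t.lower(): t for t in preferred_terms}
        # Map a non-preferred term (lowercased) to its preferred term.
        self._cross_refs = {}

    def add_cross_reference(self, non_preferred, preferred):
        # A cross-reference may only point at a term already in the list.
        key = preferred.lower()
        if key not in self._preferred:
            raise ValueError(f"{preferred!r} is not a preferred term")
        self._cross_refs[non_preferred.lower()] = self._preferred[key]

    def resolve(self, term):
        """Return the preferred term, or None if the term is not allowed."""
        key = term.lower()
        if key in self._preferred:
            return self._preferred[key]
        return self._cross_refs.get(key)


# Hypothetical sample data.
vocab = ControlledVocabulary(["Automobiles", "Motorcycles"])
vocab.add_cross_reference("Cars", "Automobiles")
```

Here `vocab.resolve("cars")` leads an indexer or searcher to "Automobiles", while a term outside the list, such as "Trucks", resolves to nothing and therefore cannot be used for tagging.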

Thesauri
A thesaurus is a more structured kind of controlled vocabulary. It provides information about each term and its relationships to other terms within the same thesaurus. In addition to clearly specifying which terms are treated as synonyms of a preferred term (the “used for” terms), a thesaurus also indicates which terms are more specific (narrower terms), which are broader, and which are related terms. National and international standards have been developed to provide guidance on creating such thesauri, including ISO 2788, ISO 5964, and ANSI/NISO Z39.19. The standards explain in great detail the relationships, which fall into three types: hierarchical (Broader Term/Narrower Term), associative (Related Term), and equivalence (Use/Used for).
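The three relationship types can be sketched in Python as reciprocal pairs, so that recording one direction automatically records the other. This is a minimal illustration, not an implementation of any standard: the `Thesaurus` class, its methods, and the sample terms are assumptions made for the example, and the abbreviations BT, NT, RT, USE, and UF are the conventional short forms for Broader Term, Narrower Term, Related Term, Use, and Used For.

```python
from collections import defaultdict

# Each relationship type mapped to its reciprocal.
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT", "USE": "UF", "UF": "USE"}


class Thesaurus:
    def __init__(self):
        # term -> relationship type -> set of target terms
        self._rels = defaultdict(lambda: defaultdict(set))

    def relate(self, term, rel, other):
        """Record `term rel other` and the reciprocal relationship."""
        if rel not in RECIPROCAL:
            raise ValueError(f"unknown relationship type: {rel!r}")
        self._rels[term][rel].add(other)
        self._rels[other][RECIPROCAL[rel]].add(term)

    def related(self, term, rel):
        """Return the sorted targets of `rel` for `term` (empty if none)."""
        return sorted(self._rels.get(term, {}).get(rel, ()))


# Hypothetical sample entries.
thesaurus = Thesaurus()
thesaurus.relate("Dogs", "BT", "Mammals")              # hierarchical
thesaurus.relate("Dogs", "RT", "Veterinary medicine")  # associative
thesaurus.relate("Canines", "USE", "Dogs")             # equivalence
```

Because every relationship is stored with its reciprocal, looking up the narrower terms of "Mammals" returns "Dogs" even though only the broader-term direction was entered.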

A literature retrieval thesaurus, like a dictionary-thesaurus (such as Roget's), lists similar terms at each controlled vocabulary term entry. The difference is that in a dictionary-thesaurus any of the associated terms might be used in place of the entry term depending on the specific context, which the user needs to consider in each case, because in certain contexts some of these terms are not appropriate. The literature retrieval thesaurus, on the other hand, is designed to be used in all contexts, regardless of a specific term usage or document. Its synonyms or near synonyms must therefore be suitably equivalent in all circumstances.

Taxonomies
The word taxonomy means the science of classifying things, traditionally the classification of plants and animals, as in the Linnaean classification system. It has now become a popular term for any hierarchical classification or categorization system. Thus, we no longer speak of “taxonomy” as a science but rather of “a taxonomy” (plural: taxonomies) as a kind of controlled vocabulary that has a hierarchy (broader term/narrower term), but not necessarily the related-term relationships and other features of a standard thesaurus.

Unlike a thesaurus, where a given term may or may not have broader or narrower terms, in a taxonomy all terms belong to a single, large hierarchy that encompasses all concepts of a certain class, category, or facet. The structure is sometimes referred to as a “tree” and the terms as “nodes” in the tree. Sometimes "a taxonomy" refers to a single hierarchical tree, and sometimes "a taxonomy" means the collection of term hierarchies available in combination for searching or browsing a given content repository.
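The “tree” and “node” vocabulary maps directly onto a simple data structure. The sketch below is one possible way to represent a single taxonomy hierarchy in Python; the `Node` class and the vehicle terms are hypothetical and serve only to show the shape of the structure.

```python
class Node:
    """One term in a taxonomy tree, linked to its parent and children."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Return the labels from the root down to this node."""
        labels = []
        node = self
        while node is not None:
            labels.append(node.label)
            node = node.parent
        return " > ".join(reversed(labels))

    def descendants(self):
        """Yield every narrower term beneath this node, depth first."""
        for child in self.children:
            yield child
            yield from child.descendants()


# Hypothetical sample hierarchy: every term hangs off one root.
root = Node("Vehicles")
cars = Node("Cars", root)
sedans = Node("Sedans", cars)
trucks = Node("Trucks", root)
```

Calling `sedans.path()` gives the full route from the root, and iterating over `root.descendants()` visits every term in the hierarchy, which is what a browse interface or an "include narrower terms" search would rely on.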

A variation on a collection of hierarchies is a faceted taxonomy. Each facet is typically its own hierarchy of terms, although the terms within a facet do not have to be in a hierarchy and may instead be a flat list under the facet category label. What distinguishes facets is that the user may select multiple terms, one from each facet, in combination to execute a complex search. Furthermore, facets must represent different aspects or dimensions of a query, such as location, topic, source, or type.
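Combining one term from each of several facets amounts to a logical AND across facets. The following Python sketch shows that idea with flat facets and a handful of made-up documents; the function name, facet names, and document records are assumptions for illustration only.

```python
# Hypothetical documents, each tagged with one term per facet.
documents = [
    {"title": "Q1 market report", "location": "Europe", "topic": "Finance", "type": "Report"},
    {"title": "Hiring guide", "location": "Europe", "topic": "Human resources", "type": "Guide"},
    {"title": "Q1 budget memo", "location": "Asia", "topic": "Finance", "type": "Memo"},
    {"title": "Annual audit", "location": "Europe", "topic": "Finance", "type": "Report"},
]


def faceted_search(docs, selections):
    """Return titles of documents matching every selected facet term.

    `selections` maps a facet name to the single term chosen in that facet.
    Facets the user left unselected place no restriction on the results.
    """
    return [
        doc["title"]
        for doc in docs
        if all(doc.get(facet) == term for facet, term in selections.items())
    ]


results = faceted_search(documents, {"location": "Europe", "topic": "Finance"})
```

Each added selection narrows the result set further, and an empty selection returns everything, which mirrors how faceted browse interfaces typically behave.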

Ontologies
An ontology is a set of concepts, with attributes and with relationships between the concepts that carry specific meanings, which together define a domain of knowledge and are expressed in a machine-readable format. Certain applications of ontologies, as used in artificial intelligence or biomedical informatics, may define a domain of knowledge through terms and relationships as the end goal, rather than being used for any tagging. In the area of taxonomies and information science, however, an ontology can be seen as a more complex type of thesaurus, in which instead of having simply "related term" relationships, there are various customized relationship pairs that contain specific meaning, such as "owns" and a reciprocal "is owned by."
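The idea of customized, reciprocal relationship pairs can be sketched with subject-predicate-object statements. This Python example is a deliberately small illustration and not a substitute for ontology languages or tools; the `Ontology` class, the second relationship pair ("employs" / "is employed by"), and the entity names are all hypothetical.

```python
# Custom relationship types, each paired with its reciprocal.
INVERSE = {
    "owns": "is owned by",
    "is owned by": "owns",
    "employs": "is employed by",
    "is employed by": "employs",
}


class Ontology:
    def __init__(self):
        # Each statement is a (subject, predicate, object) tuple.
        self.triples = set()

    def assert_relation(self, subject, predicate, obj):
        """Store a statement and, where defined, its reciprocal."""
        self.triples.add((subject, predicate, obj))
        inverse = INVERSE.get(predicate)
        if inverse is not None:
            self.triples.add((obj, inverse, subject))

    def objects(self, subject, predicate):
        """Return everything `subject` is linked to via `predicate`."""
        return sorted(
            o for s, p, o in self.triples if s == subject and p == predicate
        )


# Hypothetical sample statements.
onto = Ontology()
onto.assert_relation("Example Holdings", "owns", "Example Labs")
onto.assert_relation("Example Labs", "employs", "Jane Doe")
```

Unlike a generic "related term" link, each statement here says how two concepts are connected, so a query can ask specifically what "Example Holdings" owns or who "Example Labs" is owned by.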

 

Past Taxonomy Projects

Heather Hedden engaged in the following projects between 2004 and 2013:

TechTarget

  • Recruited and coordinated the work of four freelance taxonomists in developing multiple information technology (IT) taxonomies for the purpose of autoclassifying technical articles.
  • Consulted on the development of taxonomy guidelines.

Wyndham Hotels Group

  • Developed a new faceted taxonomy from scratch for web content management, for implementation in Adobe Experience Manager for the public websites of hotels and the hotel franchisee intranet.

KAPS Group, LLC

  • Developed and refined topical taxonomies in economic, financial, and social development subjects to be used in a new knowledge management portal for the retrieval of research and project reports.

Pearson Education

  • Developed taxonomies for 8 world languages and language methods, based on textbook tables of contents for Pearson Education’s higher education digitized content project.

Avrio Knowledge

  • Developed a medical terminology taxonomy based on comparative textbook content for the publisher F.A. Davis to manage digital content.

Project Performance Corporation

  • For a major multinational investment banking and securities firm, designed a new faceted taxonomy structure for classifying documents required for new client onboarding. Interviewed stakeholders, developed taxonomy terms, mapped legacy codes to the new taxonomy, and wrote taxonomy maintenance guidelines.
  • For a mutual fund company, designed a new faceted taxonomy to support search refinement of enterprise-wide internal documents that were being migrated from shared drives to a new SharePoint-based intranet. Interviewed stakeholders, designed facets, developed taxonomy terms for the initial project of legal department documents, and created keyword “clues” for terms to support automated indexing with ConceptSearching.
  • For an investment company (mutual funds and retirement planning), reviewed the proposed new taxonomy structure and made recommendations.
  • For a leading national retailer, revised the top two levels of product categories taxonomy to reflect product category changes and absorb new product areas, developed the taxonomy maintenance processes, and for 7 months responded to requests for additions and changes to the product taxonomy.
  • For an athletic wear company, developed a new hierarchical sports taxonomy as part of the Unified Taxonomy with general and sports-specific vocabulary for content indexing. Interviewed stakeholders, researched terms, and built the taxonomy in Excel and MultiTes.
  • For a leading international educational publisher, developed subject discipline taxonomies based on comparing and analyzing the detailed tables of contents in multiple textbooks for multiple courses.

Earley and Associates

  • For the Inter-American Development Bank, took interview notes, conducted content repository analysis, designed a new set of taxonomy facets, and mapped legacy terms to new terms for multiple taxonomy implementations covering project and publication resources. (See example of Projects taxonomy.)
  • For the research organization Westat, designed an ontology for indexing research reports for implementation in the Smartlogic Semaphore Ontology Manager tool.
  • For Jackson Laboratory, took interview notes, conducted content repository analysis, and created an initial taxonomy for the web site and intranet of this biomedical research organization.
  • For Motorola, conducted term extraction and content analysis of its web site to support taxonomy development.
  • For a financial services company, created a starter taxonomy for the life goals section of a corporate intranet.
  • For the public web sites of a major insurance company, conducted term extraction and content analysis to support taxonomy development.
  • For the intranet of a large manufacturing company, conducted term extraction and content analysis to support taxonomy development.

Bain & Company

  • Reviewed the multi-faceted hierarchical taxonomy for usability and conformance with best practices. Analyzed retrieval statistics for taxonomy terms. Wrote recommendations for how to improve the taxonomy, and delivered a visual presentation of the recommendations to stakeholders.

Viziant Corporation

  • Developed base taxonomies (totaling 1,464 node terms) in Geographies, Actions/Events, Occupations & Roles, Cultures & Languages, and Facilities & Infrastructure; and vertical market taxonomies (totaling 1,643 node terms) in Business & Finance, Politics & Government, Military & Defense, Terrorism, and Information Security; along with multiple synonyms/cross-references for each node term. (See interface screenshot)

Answerbag (Demand Media)

  • Edited categories of questions & answers in areas of Recreation & Sports, Electronics, Finance, Legal, Business & Careers, Animals, Pets, and Internet, by merging categories, breaking out subcategories, and renaming categories, ensuring that no category had too many questions or sat more than three levels deep in the hierarchy. Re-categorized questions & answers in the newly created categories.

50Lessons (Truman Company)

  • Redesigned and integrated legacy web site taxonomies for content searching of the “50 Lessons” database of executive interview videos. Specified metadata for search and retrieval of video lessons and of speakers and wrote tagging guidelines. (See the Lesson Theme categories)

Factiva (Dow Jones & Reuters)

  • Mapped thousands of logged search phrases to the controlled vocabulary of a Web commercial products and services directory (Superpages.com).

Banana Pages

  • Reviewed and edited a hierarchical taxonomy in home and garden categories for consumer products and services for a Web-based yellow pages directory. Created additional narrower terms to expand the depth of select subject areas.

PlaceLinks

  • Edited controlled vocabulary of products and services for web-based yellow pages.

Ministry of Education of Saudi Arabia (through the consulting company SEED Group)

  • Provided expert review and comments to the taxonomy design plan: "Enjaz Office Management & Content Management ENJAZ Project: Content Analysis and Design Document"