Taxonomy Types & Definitions

In business use, the single word “taxonomy” may cover any and all of the following variations of knowledge organization systems.

Controlled Vocabularies
A controlled vocabulary is a restricted list of words or terms used for labeling, indexing, or categorizing. It is controlled because only terms from the list may be used for the subject area covered by the controlled vocabulary. It is also controlled because, if it is used by more than one person, there is control over who adds terms to the list, when, and how. The list can grow, but only under defined policies. Most controlled vocabularies also have some form of cross-references pointing from one or more “non-preferred” terms to the designated “preferred” term. Only if a controlled vocabulary is very small and easily browsed, such as on a single page, might such non-preferred synonyms be omitted.
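The cross-reference mechanism described above can be sketched in a few lines of Python. This is a minimal illustration, not a standard implementation: the terms, the `USE_REFERENCES` mapping, and the `resolve` function are all hypothetical names chosen for the example.

```python
# Hypothetical example vocabulary; the terms are illustrative only.
PREFERRED_TERMS = {"Automobiles", "Motorcycles", "Bicycles"}

# Cross-references: non-preferred term (lowercase) -> preferred term.
USE_REFERENCES = {
    "cars": "Automobiles",
    "autos": "Automobiles",
    "motorbikes": "Motorcycles",
    "bikes": "Bicycles",
}


def resolve(label: str) -> str:
    """Return the preferred term for a label, or raise KeyError if the
    label is neither a preferred term nor a known non-preferred term."""
    key = label.strip().lower()
    for preferred in PREFERRED_TERMS:
        if preferred.lower() == key:
            return preferred
    if key in USE_REFERENCES:
        return USE_REFERENCES[key]
    raise KeyError(f"{label!r} is not in the controlled vocabulary")
```

Here `resolve("Cars")` returns `"Automobiles"`, while `resolve("Trucks")` raises `KeyError`, which reflects the rule that only terms from the list may be used.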

A thesaurus is a more structured kind of controlled vocabulary. It provides information about each term and its relationships to other terms within the same thesaurus. In addition to clearly specifying which terms are treated as synonyms of a preferred term (the “used for” relationship), a thesaurus also indicates which terms are more specific (narrower terms), which are broader, and which are related. National and international standards provide guidance on creating such thesauri, including ISO 25964 and ANSI/NISO Z39.19. The standards explain in great detail the relationships, which fall into three types: hierarchical (Broader Term/Narrower Term), associative (Related Term), and equivalence (Use/Used for).
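The three relationship types can be modeled as a small data structure. The sketch below is one possible representation under simplifying assumptions; the `Term` and `Thesaurus` classes and their method names are hypothetical and are not taken from ISO 25964 or ANSI/NISO Z39.19. The main point it shows is that each relationship is recorded reciprocally.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    label: str
    broader: set = field(default_factory=set)   # BT
    narrower: set = field(default_factory=set)  # NT
    related: set = field(default_factory=set)   # RT
    used_for: set = field(default_factory=set)  # UF (non-preferred terms)


class Thesaurus:
    def __init__(self):
        self.terms = {}  # preferred label -> Term
        self.use = {}    # non-preferred label -> preferred label (USE)

    def term(self, label):
        """Fetch a preferred term, creating it on first use."""
        if label not in self.terms:
            self.terms[label] = Term(label)
        return self.terms[label]

    def add_hierarchical(self, broader, narrower):
        # BT and NT are reciprocal: record both directions.
        self.term(broader).narrower.add(narrower)
        self.term(narrower).broader.add(broader)

    def add_associative(self, a, b):
        # RT is symmetric.
        self.term(a).related.add(b)
        self.term(b).related.add(a)

    def add_equivalence(self, preferred, non_preferred):
        # UF on the preferred term, USE from the non-preferred term.
        self.term(preferred).used_for.add(non_preferred)
        self.use[non_preferred] = preferred
```

For example, after `t = Thesaurus()`, `t.add_hierarchical("Vehicles", "Automobiles")`, and `t.add_equivalence("Automobiles", "Cars")`, the term `"Automobiles"` has `"Vehicles"` as a broader term and `t.use["Cars"]` points to `"Automobiles"`.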

A literature retrieval thesaurus, like a dictionary-thesaurus (such as Roget’s), lists similar terms at each controlled vocabulary term entry. The difference is that in a dictionary-thesaurus the associated terms may be used in place of the entry term only in some contexts, which the user needs to consider in each case; in other contexts some of these terms are not appropriate. The literature retrieval thesaurus, on the other hand, is designed to be used in all contexts, regardless of a specific term usage or document. The synonyms or near-synonyms must therefore be suitably equivalent in all circumstances.

The word taxonomy means the science of classifying things, and it traditionally referred to the classification of plants and animals, as in the Linnaean classification system. It has now become a popular term for any hierarchical classification or categorization system. Thus, we no longer speak of “taxonomy” as a science but rather of “a taxonomy” (plural: taxonomies) as a kind of controlled vocabulary that has a hierarchy (broader term/narrower term), but not necessarily the related-term relationships and other features of a standard thesaurus.

Unlike a thesaurus, where a given term may or may not have broader or narrower terms, in a taxonomy all terms belong to a single, large hierarchy that encompasses all concepts of a certain class, category, or facet. The structure is sometimes referred to as a “tree” and the terms as “nodes” in the tree. Sometimes “a taxonomy” refers to a single hierarchical tree, and sometimes “a taxonomy” means the collection of term hierarchies available in combination for searching or browsing a given content repository.
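The “tree” and “node” vocabulary maps directly onto a simple data structure. The following is a minimal sketch, assuming each node has at most one parent; the `Node` class and the example labels are hypothetical.

```python
class Node:
    """A taxonomy term; every node except the root has exactly one parent."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Return the labels from the root down to this node."""
        node, labels = self, []
        while node is not None:
            labels.append(node.label)
            node = node.parent
        return list(reversed(labels))


# Illustrative hierarchy only.
root = Node("Animals")
mammals = Node("Mammals", parent=root)
dogs = Node("Dogs", parent=mammals)
```

With this hierarchy, `dogs.path()` returns `["Animals", "Mammals", "Dogs"]`, the kind of breadcrumb a browsing interface might display.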

A variation on the collection of hierarchies is a faceted taxonomy. Each facet is typically its own hierarchy of terms, although the terms within a facet do not have to be hierarchical and may simply be a flat list under the facet category label. What distinguishes facets is that the user may select multiple terms, one from each facet, in combination to execute a complex search. Furthermore, facets must represent different aspects or dimensions of a query, such as location, topic, source, or type.
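Selecting one term from each of several facets amounts to a conjunctive filter over tagged content. The sketch below assumes flat facets with a single value per document; the `DOCUMENTS` records and the `faceted_search` function are hypothetical examples.

```python
# Illustrative content records, each tagged with one value per facet.
DOCUMENTS = [
    {"title": "Q3 sales report", "location": "Germany", "topic": "Sales", "type": "Report"},
    {"title": "Hiring policy", "location": "Germany", "topic": "HR", "type": "Policy"},
    {"title": "Q3 sales deck", "location": "France", "topic": "Sales", "type": "Presentation"},
]


def faceted_search(documents, **selected):
    """Return documents that match every selected facet value (logical AND)."""
    return [
        doc for doc in documents
        if all(doc.get(facet) == value for facet, value in selected.items())
    ]
```

For instance, `faceted_search(DOCUMENTS, location="Germany", topic="Sales")` returns only the "Q3 sales report" record, while selecting `topic="Sales"` alone returns both sales documents.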

An ontology is a set of concepts, with attributes and with relationships between the concepts that carry specific meanings, which together define a domain of knowledge and are expressed in a machine-readable format. Certain applications of ontologies, as used in artificial intelligence or biomedical informatics, may define a domain of knowledge through terms and relationships as the end goal, rather than being used for any tagging. In the area of taxonomies and information science, however, an ontology can be seen as a more complex type of thesaurus, in which instead of having simply “related term” relationships, there are various customized relationship pairs that contain specific meaning, such as “owns” and a reciprocal “is owned by.”
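Customized relationship pairs can be sketched as subject-relation-object triples in which each relation has a declared reciprocal. This is a simplified illustration, not a substitute for an ontology language; the `Ontology` class, the `INVERSES` table, and the entity names are hypothetical.

```python
# Each custom relation is declared together with its reciprocal.
INVERSES = {"owns": "is owned by", "is owned by": "owns"}


class Ontology:
    def __init__(self, inverses):
        self.inverses = dict(inverses)
        self.triples = set()  # (subject, relation, object)

    def relate(self, subject, relation, obj):
        """Record a relationship and its reciprocal."""
        if relation not in self.inverses:
            raise ValueError(f"unknown relation: {relation!r}")
        self.triples.add((subject, relation, obj))
        self.triples.add((obj, self.inverses[relation], subject))

    def objects(self, subject, relation):
        """Return, sorted, everything the subject is linked to by the relation."""
        return sorted(o for s, r, o in self.triples if s == subject and r == relation)


# Illustrative entities only.
onto = Ontology(INVERSES)
onto.relate("Acme Corp", "owns", "Warehouse 7")
```

After the single `relate` call, `onto.objects("Warehouse 7", "is owned by")` returns `["Acme Corp"]`, because the reciprocal relationship was stored automatically.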