Taxonomies and Attribute Data

In the past (such as in my 2021 blog post “Attributes in Taxonomies”), I have explained that “attributes” serve as filters to refine search results on content, results that have already been narrowed by a hierarchical taxonomy concept or category. As such, the attributes available for filtering can vary based on the taxonomy concept or category that has been selected. To the end user, high-level taxonomy facets and attributes both function similarly as filters, and the distinction between them may not be apparent. If the distinction is not noticeable to end users, then facets and attributes may be confused. It’s best to describe attributes for what they are, and not merely by what they can do. That’s what this blog post aims to do.

Attributes

Data is information in the form of specific values that are relevant to something such as an asset, object, product, person, event, or transaction. Since data is relevant to something else, we can refer to data as an “attribute” of something. When attributes are standardized and used in information/data management, then attributes are metadata. Metadata schemas are structures to organize data.

Examples of attribute metadata are:

  • for people: birth date, gender, occupation, nationality, phone number
  • for products: brand, price, color, size, SKU number
  • for documents: title, author, publication date, language, word count, publication status, file type

Almost all metadata, both descriptive and administrative, are attributes of something. (Only structural metadata, that which is used to mark up text, would not be an attribute.) Attributes, as metadata, can serve various purposes, including identification, comparison, sorting, filtering, and finding something based on its attributes.

Attribute values may be of different types: text, numbers, dates, or yes/no (also called “Boolean”). As text strings, attribute values may be uncontrolled free text or terms from a controlled list.
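
The value types above can be illustrated with a minimal sketch in Python. The `AttributeDef` class, the `NUMBER` alias, and the sample values (such as the "draft" and "published" statuses) are assumptions made up for illustration; they do not come from any particular metadata standard or tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical alias: a numeric attribute may hold whole or decimal numbers.
NUMBER = (int, float)


@dataclass(frozen=True)
class AttributeDef:
    """Illustrative definition of one attribute and the values it accepts."""
    name: str
    value_type: object                           # a type, or a tuple of types
    allowed_values: Optional[frozenset] = None   # set only for controlled lists

    def accepts(self, value) -> bool:
        # In Python, bool is a subclass of int, so keep yes/no values
        # out of numeric attributes explicitly.
        if isinstance(value, bool) and self.value_type is not bool:
            return False
        if not isinstance(value, self.value_type):
            return False
        # A controlled list restricts text values to its terms;
        # free text has no such restriction.
        if self.allowed_values is not None and value not in self.allowed_values:
            return False
        return True


# Document attributes taken from the examples earlier in this post.
title = AttributeDef("title", str)                        # uncontrolled free text
word_count = AttributeDef("word count", int)              # number
price = AttributeDef("price", NUMBER)                     # number
publication_date = AttributeDef("publication date", date)  # date
# The status terms below are made-up sample values.
publication_status = AttributeDef(
    "publication status", str, frozenset({"draft", "published"})
)
```

Here a controlled list is simply a fixed set of permitted strings, which is the only difference from free text in this sketch.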

Taxonomies

Taxonomies are structures of concepts, which are used primarily for tagging and retrieval of content, although there are secondary uses. The concepts include subjects and named entities. In all cases, the concepts belong to controlled vocabularies. The structures may be primarily hierarchical or primarily faceted, although a combination, such as limited hierarchies within a facet, is also possible. The structure of the taxonomy provides context for tagging and supports interaction by users.

When a taxonomy is structured into facets, typically each facet also serves as a metadata property. A hierarchical topical taxonomy can also provide values for a metadata property. Taxonomies are structures to organize controlled vocabulary concepts.

Examples of taxonomy facets include:

  • Topics
  • Activities
  • Industries
  • Product/service types
  • Brand names
  • Companies
  • Organizations
  • Names of people
  • Types of people/Roles
  • Events/Occasions

Thus, the types of things that are facets are usually not the same types of things that are considered attributes.

Metadata schemas are structures to organize data, whereas taxonomies are structures to organize controlled vocabulary concepts.

Where Attributes and Taxonomies Overlap

Considering again the examples of different types of attributes for different things, there are some attributes that could be managed in a “taxonomy” instead of merely as “attributes”:

  • For people: Name
  • For products: Product type/category
  • For documents: Subject/topic

Technically, each of these characteristics is also an attribute, but it is usually more practical to manage them as taxonomies so that they can support the implemented benefits of a taxonomy, such as semantic tagging, searching (including type-ahead search suggest), and browsing.

Thus, when we talk about “attributes” in the context of taxonomies, we mean those characteristics of something that are better managed as attributes and not managed as taxonomies. The decision is one of knowledge modeling.

For example, to support the refinement of searches, a taxonomy of expert people for an organization may have the following taxonomy facets:

  • Name
  • Subject of expertise
  • Organizational unit
  • Location

Then in addition to the facets, the taxonomy may have the following attributes associated with each record of a person:

  • Job title
  • Academic degree
  • Email address
  • Phone number
  • URL of headshot image

This is selected data of interest, but not values that are used in initial search or browsing for finding and retrieving content. Attributes are metadata, and taxonomy facets are also metadata, but that does not mean that they are the same, because different metadata can have different functions or purposes.
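
As a rough sketch of this modeling decision, the Python below keeps facet values in controlled vocabularies that drive search refinement, while attributes are plain data carried on each record and shown once a person is found. The vocabulary terms, the sample person, and the names `FACET_VOCABULARIES`, `ExpertRecord`, and `refine` are all hypothetical. For brevity, the Name facet is held in the `name` field rather than in its own vocabulary.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabularies, one per taxonomy facet.
FACET_VOCABULARIES = {
    "Subject of expertise": {"Taxonomies", "Ontologies", "Metadata"},
    "Organizational unit": {"Research", "Engineering"},
    "Location": {"Boston", "London"},
}


@dataclass
class ExpertRecord:
    name: str
    facet_tags: dict                                 # facet name -> set of concepts
    attributes: dict = field(default_factory=dict)   # display data, not searched

    def __post_init__(self):
        # Facet tags must come from the controlled vocabulary of their facet.
        for facet, tags in self.facet_tags.items():
            unknown = set(tags) - FACET_VOCABULARIES.get(facet, set())
            if unknown:
                raise ValueError(f"Not in the {facet!r} vocabulary: {sorted(unknown)}")


def refine(records, facet, value):
    """Return the records tagged with the given concept in the given facet."""
    if value not in FACET_VOCABULARIES[facet]:
        raise ValueError(f"{value!r} is not a concept in facet {facet!r}")
    return [r for r in records if value in r.facet_tags.get(facet, set())]


# A made-up sample record: facets for finding, attributes for display.
experts = [
    ExpertRecord(
        name="Jane Example",
        facet_tags={"Subject of expertise": {"Taxonomies"}, "Location": {"Boston"}},
        attributes={"Job title": "Senior Taxonomist", "Email address": "jane@example.org"},
    ),
]

matches = refine(experts, "Subject of expertise", "Taxonomies")
```

Only the facets are validated and searchable in this sketch; the attributes travel with the record but play no part in `refine`.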

Ontologies: Bridging Taxonomies and Attributes

When we enrich a taxonomy with features of an ontology, not only can we add semantic relationships, but we can also add attributes to taxonomy concepts. Usually, when taxonomists first learn about ontologies, they think primarily of the addition of customized relationships between concepts, and they might not be aware of the importance of the addition of attributes.

In ontologies, semantic relationships are formally called “object properties,” and attributes are called “datatype properties.” Both are equally important. Meanwhile, the “classes” feature of an ontology typically corresponds to taxonomy concept schemes or facets.

The best way to add attributes to a taxonomy is by adding an ontology, which may be very simple and need not even include semantic relationships. Since the attributes available may vary by hierarchical branch of concepts, this variation can be managed by creating classes, which are assigned to hierarchical branches, facets, or concept schemes. Then, attributes (datatype properties) are applied to and used with concepts based on the class each concept belongs to.
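
The class-based approach can be sketched in plain Python, without any ontology tooling. This is only an illustration under stated assumptions: the class names, the attribute names assigned to each class, and the `Concept` structure are hypothetical, and a real implementation would normally express the same idea in OWL within a taxonomy/ontology management tool.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical ontology classes and the datatype properties (attributes)
# made available to concepts of each class.
CLASS_ATTRIBUTES = {
    "Person": {"job title", "email address"},
    "Product": {"price", "SKU number"},
}


@dataclass
class Concept:
    label: str
    broader: Optional["Concept"] = None
    ontology_class: Optional[str] = None   # assigned at the top of a branch
    attributes: dict = field(default_factory=dict)

    def resolved_class(self) -> Optional[str]:
        # Walk up the hierarchy until a concept with an assigned class is found,
        # so a whole branch shares the class assigned to its top concept.
        node = self
        while node is not None:
            if node.ontology_class is not None:
                return node.ontology_class
            node = node.broader
        return None

    def set_attribute(self, name: str, value) -> None:
        # Only attributes defined for the concept's class may be used.
        allowed = CLASS_ATTRIBUTES.get(self.resolved_class(), set())
        if name not in allowed:
            raise ValueError(f"{name!r} is not available for {self.label!r}")
        self.attributes[name] = value


# A made-up product branch: the class is assigned once, at the branch top.
products = Concept("Products", ontology_class="Product")
laptops = Concept("Laptops", broader=products)
laptops.set_attribute("price", 999.0)
```

Assigning the class once at the top of a branch keeps the set of available attributes consistent for every narrower concept beneath it.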

Conclusion

The following table summarizes the differences between taxonomy facets and attributes.

Taxonomy Facets                                         Attributes
Basic structure of many taxonomies                      Additional data added to taxonomies
Controlled vocabularies                                 Controlled or uncontrolled terms, text, numbers, dates, Boolean options, etc.
Concepts as nouns or noun phrases                       If text, any kind of text string
Top organizational level of a taxonomy                  Values relevant to any taxonomy concept
Concept Schemes in SKOS, or Classes in an OWL ontology  Metadata on a concept, or datatype properties in an OWL ontology
