A question was recently posted to a group: “I’m wondering if anyone knows of a standard for designing taxonomies for industrial components (widgets).” So far, no one has replied.
To clarify, taxonomies for different subject areas and different content don’t have different standards. Standards, whether for interoperability, such as SKOS, or for structural design, such as ANSI/NISO Z39.19 or ISO 25964, are the same and just as relevant for taxonomies and thesauri in all subject areas. Taxonomies for different subject areas and content may have different design best practices, though. The published standards don’t spell out everything; there is room for design and style differences for different taxonomies, including those that differ in their subject domain and content.
Areas of taxonomy design best practices that may differ include:
- Degree of term specificity or granularity
- Depth of hierarchical levels
- Number of terms at the same level (i.e. the number of narrower terms a term has)
- Length of terms
- Use of parenthetical modifiers and other term label fields
- Additional attribute details for terms (notes or controlled value fields)
There are also issues of relationships between terms (whether a term may have more than one broader term, and whether there should be associative/related-term relationships) and how extensive alternative labels/synonyms should be. Best practices for these issues, however, depend more upon the implementation and user interface for the taxonomy than on the subject area of the taxonomy.
In the case of an industrial component taxonomy, best practices for the aforementioned points would likely be as follows:
- There should be a relatively high level of specificity of terms to include all components.
- The depth of the hierarchy should accurately reflect standard component categories and subcategories, so it could be deeper than for other taxonomies, such as business taxonomies. Also, the levels of depth may vary in different parts of the taxonomy.
- The number of terms at the same level should also accurately reflect standard component categories and subcategories, so there could be a large number of terms at the same level.
- Term names should be as long as needed to be complete and unambiguous, but any component number should be managed in a separate field.
- It may be desired to use some additional numeric or alphanumeric classification system. If so, the classification code would be another field or component of the term, separate from the term name, for the purpose of sorting.
- Additional attribute details for each term would be desired and expected. These may include a component number, size, price, and other specifications. (Attribute fields may or may not be searchable. They are not for filtering, though, as facets are.)
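As a rough illustration of the points above, here is a minimal Python sketch of a term record that keeps the term name, classification code, component number, and other attributes in separate fields. The class name, field names, codes, and component data are all made up for the example; they are not drawn from any standard or taxonomy tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ComponentTerm:
    """One taxonomy term. Every field name here is a hypothetical choice."""
    label: str                                 # complete, unambiguous term name
    classification_code: Optional[str] = None  # separate from the label, used for sorting
    component_number: Optional[str] = None     # managed in its own field, not in the label
    attributes: Dict[str, str] = field(default_factory=dict)  # size, price, other specs
    narrower: List["ComponentTerm"] = field(default_factory=list)

    def add_narrower(self, term: "ComponentTerm") -> "ComponentTerm":
        self.narrower.append(term)
        return term

    def sorted_narrower(self) -> List["ComponentTerm"]:
        # Sort by classification code where present, falling back to the label.
        return sorted(self.narrower, key=lambda t: (t.classification_code or "", t.label))


def depth(term: ComponentTerm) -> int:
    """Number of levels from this term down to its deepest descendant."""
    return 1 + max((depth(child) for child in term.narrower), default=0)


# Invented sample data: one branch is deeper than the other.
fasteners = ComponentTerm("Fasteners", classification_code="10")
washers = fasteners.add_narrower(ComponentTerm("Washers", classification_code="10.30"))
bolts = fasteners.add_narrower(ComponentTerm("Bolts", classification_code="10.10"))
bolts.add_narrower(ComponentTerm(
    "Hex head bolts, zinc-plated steel",
    classification_code="10.10.05",
    component_number="HB-0001",
    attributes={"thread": "M8", "length_mm": "40"},
))

print([t.label for t in fasteners.sorted_narrower()])  # ['Bolts', 'Washers']
print(depth(fasteners))                                # 3
```

Because the classification code lives in its own field, the display order can follow the code while the term name stays readable, and the component number never has to be squeezed into the label.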
In contrast, a consumer products ecommerce taxonomy would follow different best practices:
- Terms should not be too specific: no more specific than what users would be familiar with. Specificity should reflect the number of units (SKUs) covered by the term category. A term that refers to only 1-5 products is probably too specific. If there are additional refinement filters, then a category term may be broad enough to include 10-50 items.
- Hierarchy should not be too deep, probably no more than 3 levels.
- Terms per level should be limited, such as 3-12 terms per hierarchy level.
- Term names should be concise, for easy browsing, yet unambiguous, usually 1-3 words.
- Terms should probably not have any other fields/components or parenthetical qualifiers.
- Attribute details would include at minimum product number/SKU and description. Price would be managed as a separate filter, rather than as merely an attribute.
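Guidelines like these are simple enough to check automatically. The following Python sketch, offered only as one possible approach, walks a taxonomy stored as nested dictionaries and reports terms that fall outside the rough limits listed above (3 levels, 3-12 terms per level, 1-3 words, no parentheticals). The function name, the nested-dictionary layout, and the sample catalog are assumptions made for the example, and the thresholds are the approximate figures from the list, not fixed rules. The SKU-count guideline is left out because it needs product data.

```python
from typing import Dict, List

# Rough limits taken from the guidelines above; adjust to local policy.
MAX_DEPTH = 3
MIN_CHILDREN, MAX_CHILDREN = 3, 12
MAX_WORDS = 3


def check_taxonomy(tree: Dict[str, dict], level: int = 1, path: str = "") -> List[str]:
    """Return a warning for each guideline a nested-dict taxonomy breaks."""
    warnings = []
    # An empty dict means a leaf term, so there is no sibling count to check.
    if tree and not (MIN_CHILDREN <= len(tree) <= MAX_CHILDREN):
        warnings.append(f"{path or 'top level'}: {len(tree)} terms at one level")
    for label, children in tree.items():
        here = f"{path} > {label}" if path else label
        if level > MAX_DEPTH:
            warnings.append(f"{here}: deeper than {MAX_DEPTH} levels")
        if len(label.split()) > MAX_WORDS:
            warnings.append(f"{here}: term name longer than {MAX_WORDS} words")
        if "(" in label:
            warnings.append(f"{here}: parenthetical qualifier in term name")
        warnings.extend(check_taxonomy(children, level + 1, here))
    return warnings


# Invented sample catalog with one problematic term.
catalog = {
    "Kitchen": {
        "Cookware": {},
        "Small Appliances": {},
        "Stainless Steel Mixing Bowls (Nested)": {},
    },
    "Bedding": {},
    "Lighting": {},
}

for warning in check_taxonomy(catalog):
    print(warning)
```

A report like this is a prompt for editorial review rather than a hard gate, since the guidelines themselves are stated as approximate.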
These best practices are not “standards” because they tend not to be shared outside of an organization. Each organization comes up with its own policies and guidelines, just as it has its own taxonomies. The best practices could be considered internal standards, though. Regardless of what they are called, these guidelines should be documented and overseen as part of a taxonomy governance plan.