Evaluating Taxonomies – Hedden Information Management

In my last blog post, “Taxonomy Management Consulting,” I mentioned that more organizations now have taxonomies, so the need is shifting somewhat from designing and building new taxonomies to managing existing taxonomies. It might not be that simple, however, if the existing taxonomy was created and never used, created for a slightly different purpose or different content, or created by those not sufficiently knowledgeable in taxonomy design best practices. I often find that an organization that has taxonomy consulting needs typically has some pre-existing taxonomies, but they are not adequate for one reason or another.

Any pre-existing taxonomies are important as part of a taxonomy development or redesign process and should be carefully considered. Whether pre-existing taxonomies will be only a source of terms for a new taxonomy or actually the basis of a new taxonomy with some editing depends on how structured, comprehensive, and sound these pre-existing taxonomies are.

Structure: Pre-existing taxonomies may be of the type that is a simple flat list of terms with no hierarchy. These are good sources of taxonomy terms but are rarely the basis for the taxonomy.

Comprehensiveness: Often existing taxonomies cover only part of the scope of a desired full or enterprise-wide taxonomy, in which case they will serve as part of the new taxonomy.

Soundness: This concerns the extent to which the taxonomy conforms to standards (such as ANSI/NISO Z39.19) and general best practices, so that it ought to work well with the content it is intended to reference. This is where taxonomy experts can come in and make such determinations.

Evaluation Criteria

Evaluating a taxonomy for soundness typically involves checking off or rating the taxonomy against a set of pre-defined criteria regarding terms, inter-term relationships, and overall structure and design. Some of the most important criteria include the following:

Terms should be unambiguous and clear, yet not too wordy and long. If the taxonomy will be displayed for browsing, then terms should begin with key words and those that come under the same broader term should be in a somewhat consistent grammatical format.
Hierarchical relationships should conform to the ANSI/NISO Z39.19 standard, under which each relationship is one of only three types: generic-specific, instance, or whole-part. A corporate taxonomy may have limited exceptions, provided they are intuitively logical and justified. (See my blog post “Deviating from Taxonomy Standards.”)
Overall structure and design issues include the number of narrower terms for a broader term being neither too few nor too many (such as 3-20), and the depth of the taxonomy being somewhat balanced and not too deep. For example, three levels deep in some places and four levels deep in others is OK, but two levels in some areas and five levels deep in others is not a well-balanced design.
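
The structural criteria are the part of an evaluation that lends itself most easily to a quick automated check. The following is a minimal Python sketch, not a standard tool. The Term class, the function names, and the sample terms are all hypothetical. The 3-20 range is taken from the example figure above, and the allowed depth spread of one level is an assumption based on the three-versus-four-levels example. A check like this only flags places to look at; it does not replace expert judgment about the terms or the relationship types.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Term:
    """A taxonomy term with its narrower terms (hypothetical structure)."""
    label: str
    narrower: List["Term"] = field(default_factory=list)


def fanout_problems(term: Term, low: int = 3, high: int = 20) -> List[Tuple[str, int]]:
    """Return (label, count) for every broader term whose number of
    narrower terms falls outside the low..high range."""
    problems = []
    count = len(term.narrower)
    if count and not (low <= count <= high):
        problems.append((term.label, count))
    for child in term.narrower:
        problems.extend(fanout_problems(child, low, high))
    return problems


def leaf_depths(term: Term, depth: int = 1) -> List[int]:
    """Return the level at which each branch under this term ends."""
    if not term.narrower:
        return [depth]
    depths = []
    for child in term.narrower:
        depths.extend(leaf_depths(child, depth + 1))
    return depths


def is_balanced(top_terms: List[Term], max_spread: int = 1) -> bool:
    """True if the deepest and shallowest branches differ by at most
    max_spread levels (the threshold of 1 is an assumption)."""
    depths = [d for term in top_terms for d in leaf_depths(term)]
    return max(depths) - min(depths) <= max_spread


# Sample taxonomy with invented terms, for illustration only.
taxonomy = [
    Term("Products", [
        Term("Hardware", [Term("Laptops"), Term("Desktops"), Term("Tablets")]),
        Term("Software"),
        Term("Services"),
    ]),
    Term("Locations", [Term("Europe")]),  # only one narrower term
]

for top_term in taxonomy:
    for label, count in fanout_problems(top_term):
        print(f"{label}: {count} narrower term(s), outside the 3-20 range")

print("Depth balanced:", is_balanced(taxonomy))
```

Flagged terms are candidates for review, not automatic errors. A broader term with only one or two narrower terms may still be justified by the content it has to cover.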

Evaluation vs. Testing

Evaluating a taxonomy is not the same as testing a taxonomy. Testing a taxonomy involves using sample content and sample users in a controlled manner and can take considerable time and effort, so it should not be done until after a taxonomy is determined to be generally sound. Evaluating a taxonomy, on the other hand, determines whether it is well constructed regardless of the content or users. Testing focuses on the specific application and use of the taxonomy and will be the topic of a future blog post.

Taxonomy vs. Web Usability Heuristic Evaluation

Even if a numeric rating scale is used, the process is still more judgmental than scientific, and as such may be referred to as a “heuristic” analysis or evaluation. A “heuristic method” generally means evaluation, experimentation, or a trial-and-error method to find something out. The designation of heuristic evaluation has been used in website usability evaluation and from there has been carried over into taxonomy evaluation. User experience expert Jakob Nielsen first introduced the idea of heuristic evaluation to usability design back in 1990 and described it in his blog post of 1995: “How to Conduct a Heuristic Evaluation.”

There are several differences, though, between taxonomy evaluation and web user interface evaluation. Although user testing of websites is not that much different from the testing of taxonomies, evaluation of taxonomies requires a more critical and analytical understanding and approach. Website usability evaluation does not require usability design experts, but taxonomy evaluation does require a level of expertise. Nielsen refers to “evaluators,” not experts, and these evaluators are not much different from user testers. (What differs between usability evaluation and usability testing is the procedure, not the people.)

Another difference between website evaluation and taxonomy evaluation is that a website, even if it is only a test dummy site, will have content, even if just mock-up pages with partial filler text, because navigation and content are integrally combined on websites. A taxonomy at the evaluation stage, on the other hand, is not yet implemented or linked to content, which makes it more difficult for the non-expert to evaluate. It might look good on paper but not function well when implemented.

Nielsen wrote: “Heuristic evaluation involves having a small set of evaluators examine the interface and judge its compliance with recognized usability principles.” If the evaluators are not experts, then it’s easier and more affordable to have multiple evaluators. When a taxonomy requires evaluation, typically just one taxonomy expert is hired, but if you can afford two independent expert evaluations of your taxonomy, that’s all the better.