Taxonomies and Sitemaps

I was recently asked whether a company’s website sitemap could serve as the start of a taxonomy for the organization. The sitemap, after all, includes all the relevant topics pertaining to an organization’s business offerings, and they are arranged in a hierarchy. I have previously blogged on why a website’s navigation is not a taxonomy in Navigation Schemes and Taxonomies. A sitemap is similar to a website’s navigation, but it goes deeper by including the titles or topics of web pages that are not included in the website’s menu, and it is not necessarily intended for user browsing. A sitemap may go five or six levels deep, whereas website navigation menus are usually only two levels deep. Therefore, a sitemap may look like a taxonomy. However, being as large and detailed as a taxonomy needs to be does not make a sitemap suitable as a taxonomy.

Different purposes

We need to understand what a taxonomy is for. It’s to aid users in locating desired content by topic terms, which reflect the terminology of both the users and the content. Taxonomy terms are tagged/indexed to content that is relevant to the term. The starting point when creating a taxonomy is to identify the topics of the content and the topics of user interest or search, and then merge those topics into a taxonomy by bringing together different names for the same concept. The concepts are then structurally arranged to show the relationships between the terms, especially hierarchical relationships. The primary purpose of the hierarchy of terms in a taxonomy is to aid users in finding the appropriate term. When browsing the taxonomy, they may find a broader term or narrower term that better describes their search goals. Then they can select that term to retrieve content that was tagged with the term.
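The steps above (merging different names for one concept, arranging terms hierarchically, and tagging content) can be sketched in a few lines of Python. This is only an illustration under invented assumptions: the term names, the synonym table, the hierarchy, and the page identifiers are all hypothetical, and a real taxonomy would be managed in a dedicated tool.

```python
from collections import defaultdict

# Hypothetical variant names mapped to one preferred term (invented for illustration).
SYNONYMS = {
    "invoices": "Billing",
    "bills": "Billing",
    "billing": "Billing",
    "payment plans": "Payment plans",
}

# Hypothetical hierarchy: preferred term -> broader term (None for a top term).
BROADER = {
    "Customer accounts": None,
    "Billing": "Customer accounts",
    "Payment plans": "Billing",
}

def preferred_term(name):
    """Merge different names for the same concept into one preferred term."""
    return SYNONYMS.get(name.strip().lower())

def narrower_terms(term):
    """Return the terms directly below `term` in the hierarchy."""
    return sorted(t for t, parent in BROADER.items() if parent == term)

# Tagging: each term may point to many content items,
# and each item may carry many terms.
tagged = defaultdict(set)

def tag(item, name):
    term = preferred_term(name)
    if term is None:
        raise KeyError(f"no taxonomy term for {name!r}")
    tagged[term].add(item)

tag("page-12", "invoices")
tag("page-40", "Bills")
tag("page-40", "payment plans")

print(sorted(tagged["Billing"]))   # ['page-12', 'page-40']
print(narrower_terms("Billing"))   # ['Payment plans']
```

Selecting the term Billing retrieves both pages even though they were described with different words, and browsing from Billing surfaces the narrower term Payment plans.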

A sitemap, on the other hand, lists all or most pages of a website, usually by page title and organized in the hierarchical structure of the website. The hierarchical structure of the website was designed to organize information in a logical manner for users to browse and explore, as considered by the information architect who designed the website. The sitemap thus reflects pages, which are often topics but not always. A page may have multiple topics of interest that a user might want to look up. A page is sometimes for performing a function or activity and not necessarily just a topic of information.

A sitemap is typically generated automatically from the page titles, and its primary purpose is not for users but for machines: it tells search engines which pages are available for crawling on a website and can thus support search engine optimization (SEO). Sitemaps are also useful in planning the further development or organizational improvement of a website. Whether a sitemap should even be displayed to end users as a tool to find information on a website is questionable. If automatically generated, it’s not designed for that purpose, but users could find it helpful, especially users who understand that it is merely the aggregation of page titles organized in the file structure of the website. Some websites make it available, and some do not. Some websites instead display a simplified sitemap that is designed to be a guide for users, but then it does not include all pages.
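To make the machine-oriented nature of a sitemap concrete, here is a small Python sketch that reads a sitemap and nests its entries by URL path. It assumes the sitemap is an XML file following the sitemaps.org protocol, which is one common format and not something this post specifies; the example.com URLs are placeholders invented for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by the sitemaps.org XML protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Invented sample data; example.com and its paths are placeholders.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/support/pay-a-bill</loc></url>
  <url><loc>https://example.com/support/help-with-login</loc></url>
  <url><loc>https://example.com/products/widgets</loc></url>
</urlset>"""

def sitemap_tree(xml_text):
    """Nest the sitemap's URLs by path segment, mirroring the site's structure."""
    root = ET.fromstring(xml_text)
    tree = {}
    for loc in root.findall("sm:url/sm:loc", NS):
        path = urlparse(loc.text.strip()).path
        node = tree
        # Walk down one level per non-empty path segment, creating nodes as needed.
        for segment in filter(None, path.split("/")):
            node = node.setdefault(segment, {})
    return tree

print(sitemap_tree(SAMPLE))
```

The resulting structure is a hierarchy of pages derived from file paths. Nothing in it records which topics a page covers or which names users search by, which is the information a taxonomy needs.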

Different labels

The titles of pages, and thus of sitemap entries, often do not correspond to taxonomy terms. They could start with a verb for an activity, they could be commands or questions, or they could be complete sentences. Taxonomy terms, by contrast, are topics or names represented only by nouns, noun phrases, or proper nouns. Examples of sitemap entries that are not good taxonomy terms may include:

How to use…
Get started with…
Help with…
Pay a bill
Shop for…
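When reviewing a long sitemap for candidate terms, a rough first pass can flag entries like these automatically. The Python sketch below is a simple heuristic under stated assumptions: the word lists are hypothetical and far from complete, and a label that passes still needs human review.

```python
import re

# Hypothetical word lists; a real check would need a fuller vocabulary
# or a part-of-speech tagger.
LEADING_VERBS = {"get", "pay", "shop", "help", "find", "learn", "contact"}
QUESTION_WORDS = {"how", "what", "why", "where", "when", "who"}

def label_problem(label):
    """Return a reason the label is a poor taxonomy term, or None if no rule fires."""
    words = re.findall(r"[A-Za-z']+", label.lower())
    if not words:
        return "empty label"
    if label.rstrip().endswith("?") or words[0] in QUESTION_WORDS:
        return "question or how-to phrasing"
    if words[0] in LEADING_VERBS:
        return "starts with a verb"
    return None

for entry in ["How to use…", "Pay a bill", "Shop for…", "Billing"]:
    print(entry, "->", label_problem(entry))
```

A flagged entry is not discarded; its underlying topic (for example, billing behind "Pay a bill") may still be a useful candidate once it is renamed as a noun phrase.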

As with navigation, the entries of a sitemap reflect pages in a one-to-one relationship, in contrast to taxonomy terms, each of which may retrieve multiple pages or content sources, and each page or content item can be tagged with multiple taxonomy terms. As such, entries in a sitemap may actually be more specific than would be needed in a taxonomy.  The user’s selection of multiple taxonomy terms in combination, through filters/refinements, achieves the result of obtaining an appropriate list of relevant content.
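The effect of combining terms can be shown with a short Python sketch. The content titles and tags are invented for illustration; the point is only that each item carries several terms and that selecting more terms narrows the results.

```python
# Invented content items and tags, for illustration only.
CONTENT_TAGS = {
    "Pay your bill online": {"Billing", "Online services"},
    "Set up automatic payments": {"Billing", "Online services", "Payment plans"},
    "Paper statement options": {"Billing"},
    "Reset your password": {"Online services", "Account security"},
}

def filter_content(selected_terms):
    """Return titles of items tagged with every selected term."""
    selected = set(selected_terms)
    # `selected <= tags` is a subset test: the item must carry all selected terms.
    return sorted(title for title, tags in CONTENT_TAGS.items() if selected <= tags)

print(filter_content(["Billing"]))
print(filter_content(["Billing", "Online services"]))
```

Selecting Billing alone returns three items, and adding Online services narrows the list to two. Neither term needs to be as specific as a single page title for the combination to reach the right content.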

Conclusions

Sitemaps should not be used as taxonomies, but their topics (not their labels) may be considered a good source for a taxonomy. A sitemap might not even be suitable as a basis or starting point for a taxonomy, but only as one source for developing taxonomy terms. It is recommended that a taxonomy be created separately from a sitemap, based on a review of content, search log data, and stakeholder and user interviews, with the sitemap as one more source to consider when developing taxonomy terms. The hierarchy of the sitemap should also not be followed too closely, although parts of its hierarchical structure may be taken into consideration when creating taxonomy relationships.
