The terms taxonomy, ontology,
directory, cataloging, categorization and
classification are often confused and used interchangeably.
These are all ways of organizing information (or things or animals)
into categories.
Categorization is the process of
associating an object with one or more subject categories. So the
entry for a page on cross trainer shoes could go into Running,
Manufacturing, Sports Medicine, or Rushkoff, Douglas! All of these
are legitimate, depending on the context.
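
The many-to-many nature of categorization can be sketched in a few
lines of Python. This is only an illustration: the page identifier and
the helper name categorize are hypothetical, and the category names are
the ones from the example above.

```python
from collections import defaultdict

# Two indexes so lookups work in both directions.
categories_by_item = defaultdict(set)
items_by_category = defaultdict(set)

def categorize(item, *categories):
    """Associate one item with one or more subject categories."""
    for category in categories:
        categories_by_item[item].add(category)
        items_by_category[category].add(item)

# Hypothetical identifier for the page on cross trainer shoes.
categorize("cross-trainer-shoes-page",
           "Running", "Manufacturing", "Sports Medicine")

print(sorted(categories_by_item["cross-trainer-shoes-page"]))
print(sorted(items_by_category["Running"]))
```

Keeping both indexes means a reader can browse from a category to its
pages, or ask which categories a given page was filed under.
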
Cataloging and Classification come
from libraries, where specialists enter the metadata (such as
author, date, title and edition) for a document, apply subject
categories to it, and place it into a class (such as a call number)
for later retrieval. These terms tend to be used interchangeably
with Categorization.
Clustering is the process of grouping
documents based on similarity of words, or the concepts in the
documents as interpreted by an analytical engine. These engines use
complex algorithms including Natural Language Processing, Latent
Semantic Analysis, Bayesian statistical analysis, and so on.
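
As a rough illustration of grouping by word similarity, the sketch
below clusters documents by the share of words they have in common
(Jaccard similarity). It is a deliberately simple stand-in, not one of
the techniques named above; the function names, the sample documents
and the 0.25 threshold are all assumptions made for the example.

```python
def tokens(text):
    """Lowercase a document and split it into a set of words."""
    return set(text.lower().split())

def jaccard(a, b):
    """Share of words two token sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def cluster(documents, threshold=0.25):
    """Greedy single-pass clustering.

    Each document joins the first cluster whose seed (first) document
    is similar enough; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of document indices
    for index, text in enumerate(documents):
        for members in clusters:
            seed = documents[members[0]]
            if jaccard(tokens(text), tokens(seed)) >= threshold:
                members.append(index)
                break
        else:
            # No existing cluster was close enough.
            clusters.append([index])
    return clusters

docs = [
    "running shoes for trail running",
    "trail running shoes review",
    "library catalog call number",
    "call number in a library catalog",
]
print(cluster(docs))  # [[0, 1], [2, 3]]
```

Unlike categorization, nobody assigns the groups in advance here: they
emerge from the documents themselves.
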
A Thesaurus is a set of related terms
describing a set of documents. This is not hierarchical: it
describes the standard terms for concepts in a controlled
vocabulary. Thesauri include synonyms and more complex
relationships, such as broader or narrower terms, related terms and
other forms of words.
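
One possible way to hold such a controlled vocabulary in code is shown
below. The field names follow the relationships listed above; the
class name, the sample terms and the preferred_term helper are
hypothetical and are not taken from any thesaurus standard.

```python
from dataclasses import dataclass, field

@dataclass
class ThesaurusEntry:
    preferred: str                               # the standard term
    synonyms: set = field(default_factory=set)   # non-preferred terms
    broader: set = field(default_factory=set)
    narrower: set = field(default_factory=set)
    related: set = field(default_factory=set)

# Illustrative entries only.
entries = [
    ThesaurusEntry("Footwear", narrower={"Athletic shoes"}),
    ThesaurusEntry("Athletic shoes",
                   synonyms={"sneakers", "trainers"},
                   broader={"Footwear"},
                   related={"Running"}),
]

# Index every preferred term and synonym, case-insensitively.
lookup = {}
for entry in entries:
    lookup[entry.preferred.lower()] = entry
    for synonym in entry.synonyms:
        lookup[synonym.lower()] = entry

def preferred_term(word):
    """Map any known word to the standard term, or None if unknown."""
    entry = lookup.get(word.lower())
    return entry.preferred if entry else None

print(preferred_term("Sneakers"))  # Athletic shoes
```
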
Taxonomy is the organization of a
particular set of information for a particular purpose. It comes
from biology, where it's used to define the single location for a
species within a complex hierarchy. Biologists have arguments
about where various species belong, although DNA analysis can
resolve most of the questions. In informational taxonomies, by
contrast, items can fit into several taxonomic categories.
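
The difference between the two kinds of taxonomy can be sketched as a
difference in data structure. In the biological style each node has
exactly one parent, so a species has a single path up the hierarchy;
in the informational style an item may be placed under several nodes.
The variable and function names are hypothetical, and only a short
slice of the wolf's lineage is shown.

```python
# Biological style: every node has exactly one parent.
parent = {
    "Canis lupus": "Canis",
    "Canis": "Canidae",
    "Canidae": "Carnivora",
    "Carnivora": "Mammalia",
}

def lineage(node):
    """Walk from a node up to the top of the hierarchy."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path

# Informational style: one item may be filed under several nodes.
placements = {
    "cross-trainer-shoes-page": ["Running", "Manufacturing",
                                 "Sports Medicine"],
}

print(lineage("Canis lupus"))
print(placements["cross-trainer-shoes-page"])
```
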
Ontology is the study of the categories of
things within a domain. It comes from philosophy and provides a
logical framework for academic research on knowledge
representation. Work on ontologies involves schemas and diagrams,
such as Venn diagrams, trees and lattices, for showing
relationships.