Language is Always Hard


Data architects often throw around words such as “taxonomy” and “ontology” as if the terms were interchangeable. The two ideas are related, but they are not synonymous.

The casual listener might not suspect it, but one of these two terms is simple and straightforward, while the other is deep, dark, and mysterious. Taxonomy is the more familiar and, yes, the simpler one. A taxonomy is a hierarchical classification that organizes ideas, subjects, or categories into tree-like parent-child relationships. It is meant to be a plain categorization of your data or ideas. The navigation pane in most tools is an example of a taxonomy. Taxonomies are the organizational framework. There is no right or wrong way to build one; there are only more and less effective ways. If one is storing accounting information over time, yearly folders containing sub-folders for incoming payments and outgoing payments would be functional. But if someone else set up an incoming payments folder and an outgoing payments folder with yearly sub-folders inside each, that would be an alternate and equally valid taxonomy. The most famous taxonomy is the biological one, which categorizes all life into domains, kingdoms, phyla, classes, orders, and so on.
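The two accounting layouts described above can be sketched as nested dictionaries. This is only an illustration: the folder names, the years, and the `leaf_paths` helper are assumptions made for the example, not part of any particular tool.

```python
# Two hypothetical folder taxonomies for the same accounting records.
# Folder names and years are illustrative assumptions.
by_year = {
    "2023": {"incoming": {}, "outgoing": {}},
    "2024": {"incoming": {}, "outgoing": {}},
}
by_direction = {
    "incoming": {"2023": {}, "2024": {}},
    "outgoing": {"2023": {}, "2024": {}},
}


def leaf_paths(tree, prefix=()):
    """Return every root-to-leaf path in a nested-dict tree."""
    paths = []
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            paths.extend(leaf_paths(children, path))
        else:
            paths.append(path)
    return paths


# Ignoring the order of the levels, both trees end in the same four
# year/direction combinations; only the route to each one differs.
categories_by_year = {frozenset(p) for p in leaf_paths(by_year)}
categories_by_direction = {frozenset(p) for p in leaf_paths(by_direction)}
same_categories = categories_by_year == categories_by_direction
```

Either tree gives every record exactly one home, which is what makes each of them a taxonomy. Which one is "more effective" depends on whether people tend to ask for a year first or a payment direction first.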

Ontologies dip into the bigger and more expansive stream of trying to describe a universe. Formally, an ontology is a representation of knowledge that defines concepts, their properties, and their relationships, often using logic to support reasoning. The word comes from philosophy, where ontology is the branch of metaphysics dealing with the nature of being. With this elaborate descriptive intent, we can easily understand that any given ontology is more challenging than a parent-child hierarchy. Rather than simply grouping one idea within another, an ontology is allowed to express multiple kinds of relationships. One item may be related to another without having to be a formal part or child of that “other.” Ontologies enable more than just a simple “stepping through”; they enable thought and reasoning to be applied. One’s conceptual, logical, or physical data models can qualify as an ontology of a subject area.

There are an infinite number of ways to express an ontology. There is no single expected format or notation, although a couple of standardized notations (such as OWL and RDF) have arisen with hopes of becoming “standard.” Any device that flags or implies meaning and interrelationship is fine, as long as the people using it agree that it expresses the expected knowledge. For example, adding hashtags to documents enables each document to have as many relationships as it has hashtags, converting what had been a simple, navigational folder hierarchy into an ontology that may be accessed across many dimensions. Magic, enabled by a simple octothorpe: the “pound sign” (#) that now marks hashtags.
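To make the hashtag example concrete, here is a minimal sketch that records both the folder location and the hashtags as (subject, relationship, object) triples. The file names, tags, relationship names, and the `subjects` helper are all hypothetical, and the triples are only loosely in the spirit of RDF; this is not RDF or OWL syntax.

```python
# Hypothetical documents: each lives in exactly one folder (the taxonomy)
# but may carry any number of hashtags (additional relationships).
documents = {
    "invoice_0042.pdf": {"folder": "2024/incoming", "tags": {"#acme", "#overdue"}},
    "invoice_0043.pdf": {"folder": "2024/outgoing", "tags": {"#acme"}},
    "receipt_0007.pdf": {"folder": "2023/outgoing", "tags": {"#travel", "#overdue"}},
}

# Express every fact as a (subject, relationship, object) triple.
triples = []
for name, info in documents.items():
    triples.append((name, "stored_in", info["folder"]))
    for tag in sorted(info["tags"]):
        triples.append((name, "tagged", tag))


def subjects(relationship, obj):
    """Return, in sorted order, every subject linked to obj by relationship."""
    return sorted(s for s, r, o in triples if r == relationship and o == obj)


# The folder tree alone cannot answer "everything about #acme", because
# those documents sit in different folders; the tag relationship can.
acme_docs = subjects("tagged", "#acme")
overdue_docs = subjects("tagged", "#overdue")
```

The folder is now just one relationship among several. The same lookup that finds everything in a folder also finds everything sharing a tag, which is the "many dimensions" of access the hashtags provide.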

Each element placed within one’s ontology must be carefully considered as to its appropriateness and clarity. Is every item defined at the best level of abstraction for revealing what we desire in our informational landscape? Of course, should one’s ontology resolve into a simple hierarchy, then the taxonomy and the ontology are equivalent. But that convergence is a rare circumstance.

Whether a data model is normalized, dimensional, or Data Vault, the designer is always organizing data concepts into structures and relationships meant to be efficient and useful to those working with the data. Taxonomies and ontologies lurk beneath our data designs whether we acknowledge them or not. Our data models succeed to the extent that they convey knowledge about the content in focus. Now that you understand the differences between a taxonomy and an ontology, you also know that if you wish to impress, confuse, or otherwise put folks on the defensive, you can start going on and on about your wonderful ontology.

