Two Paths Toward Establishing a Design: Multidimensional and Normal

Do not allow well-meaning but confused proponents to obscure concepts related to normalization and dimensional design.  Under a normalized approach one usually would not expect for numeric data items and textual data items to fall into different logical relations when connected to the same entity object. Yet within a multidimensional approach that is exactly what happens. Multidimensional design and normal design are not the same, and one should not expect to claim that both approaches were used and that they resulted in the same data model.  Normal design is by its nature a logical construct. Multidimensional can be considered a specific technique of physical design that is completely divorced from a logical layer. As such, normal and multidimensional designs are not even at a compatible level of construction.

Normalized Design

The goal of a normalized design is to express the true nature of the data in focus. However, the term “true nature” usually does not mean some independent universal truth. Very few universal truths are out there for us to use and leverage. Rather, the truth expressed is that contained within the confines of the organization and those who work there. The shared signals and symbols used within the microculture defining the organization, reflecting how one organization sees the world as unique and different than that within another organization, even when both organizations are in the same industry. Certainly, organizations within the same industry have commonalities and similarities. But there are differences among these organizations in thought, in approach, and across a great many subtle perspectives that make each organization distinctive and unique.

Dimensional Modeling

Unlike normalized design, dimensional modeling is not attempting to express the true nature of the data in focus. Dimensional modeling is working toward other goals. This point of diverse goals seems to confuse some who see multidimensional and normalized forms as relatively equivalent. The approaches are not the same. A multidimensional approach is attempting to standardize the breakdown of data items based on data type and generalized role, arranging individual data items in a skewed fashion. These arrangements are biased into forms meant to support the answer of expected questions. These forms are effectively optimized for slicing and dicing, e.g., group by this value and filter by that value. Every dimension group connects into the fact in exactly the same expected fashion.

And when wishing to ultimately create a star schema for use in reporting and OLAP tasks, before one can successfully establish a multidimensional design, one needs to have a normalized design covering the same universe of discourse. Ideally, these normalized designs were already created during the initiatives that selected the source operational solutions. If those designs do not exist, that is not entirely unusual. Every initiative is challenged to establish and maintain current database designs. However, the time should be taken to establish a fairly useful understanding of the normalized design. Because even though the targeted tables will be components of a multidimensional approach, normal knowledge is important. Fact tables, dimension tables and even the occasional evil bridge table must all be maintained. And in establishing the rules to properly understand value changes and their impact across these multidimensional structures, the normalized relationships will direct how to embrace those changes. For simple aspects, such as properly grouping a hierarchy within a single dimension or to something more subtle, such as understanding when inserts to a dimension will drive inserts within a bridge table, understanding the true nature of data within an organization keeps everything in order.

Multidimensional approaches and normalized approaches establish different designs and have differing goals. However, understanding the nature of one’s data is always important. Therefore, it is always in one’s best interest to understand how all the data reacts in a normalized fashion because this knowledge will come into play sooner or later.