
The Mind of a Data Modeler


Years ago, before the rise of the gig economy, people developed their careers over time, often more slowly than desired. Folks who became DBAs, Data Modelers, or Data Architects had a rule-following gene they were either born with or grew. That mindset was leveraged heavily in how procedures and change progressed within their areas of influence. Such perspectives created a serious consistency and constancy of purpose that fit well with the demands of the job. The rules were guardrails on how tasks within the framework were done.

The consistency that evolved from such rule enforcement enhanced the data user’s ability to navigate and grow in their use of data. But such devotion to standardization seems to be losing steam. Either those who perform data modeling tasks have never learned the rules related to the various data modeling styles, or they simply do not care. It has always been true that relational DBMSs can do many things beyond what a strictly dimensional model should do. And now we have Modern Data Platforms that do many things outside of what a normalized design specifically allows for. (For example, primary keys that exist only as documentation and are not actually enforced.) Generally, the data modeler is responsible for consistency in applying whatever “the rules” may be. Architecturally, if we say this schema is dimensional, or that schema is normalized, or data vault, then we are saying our designs follow specific rules and even a specific philosophy. So, when designs for a dimensional model extend into any other structure that SQL and RDBMSs can support, the lines are blurred. Star schemas become more normalized than even the concept of “snowflaking” can account for.
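The unenforced-key problem is easy to demonstrate. Here is a minimal sketch in Python using SQLite (the table and column names are hypothetical): a table whose “primary key” exists only as documentation silently accepts duplicates, while a declared and enforced key rejects them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# "Documentation-only" key: no constraint is declared, mimicking platforms
# where a PRIMARY KEY is recorded as metadata but never enforced.
cur.execute("CREATE TABLE dim_customer_doc (customer_id INTEGER, name TEXT)")
cur.execute("INSERT INTO dim_customer_doc VALUES (1, 'Ada')")
cur.execute("INSERT INTO dim_customer_doc VALUES (1, 'Ada again')")  # silently accepted
dupes = cur.execute(
    "SELECT COUNT(*) FROM dim_customer_doc WHERE customer_id = 1"
).fetchone()[0]
print(dupes)  # 2 -- two rows now share the "key"

# Enforced key: the same duplicate insert is rejected by the database.
cur.execute("CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("INSERT INTO dim_customer VALUES (1, 'Ada')")
rejected = False
try:
    cur.execute("INSERT INTO dim_customer VALUES (1, 'Ada again')")
except sqlite3.IntegrityError:
    rejected = True
print("duplicate rejected:", rejected)
```

The consequence for users is concrete: a fact table joined to a dimension with duplicated keys fans out, silently double-counting measures, and no amount of documentation prevents it the way an enforced constraint does.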

Randomly throwing in something that is supported by the platform but is not part of the rules or philosophy framing one’s design ultimately creates chaos. How can people learn the “rules” when some facts occasionally must join into a dimension using an “alternate key” for no good reason? Or when primary keys are not always defined? Users can no longer just learn the rules and be safe, because now there may be many arbitrary exceptions that need to be filed away as well. And worse for our poor users, those exceptions may or may not be documented in the place where users know to look. Overstepping and ignoring boundaries causes confusion about the rules and their purpose. Just because a platform can perform a specific query does not mean that kind of query should be incorporated into every style of data modeling. Slapping the word “fact” or “dimension” onto a data structure does not make the table a fact or a dimension.

The data modeler must be the one arguing for a clear and consistent future, rather than agreeing to change things at the drop of a hat. Setting a low bar for agreeing not to follow the rules only produces inconsistency. Ignoring the rules specifically intended for the data modeling approach in use ultimately leads to a sloppy data model. And sloppy modeling leaves one with no real rules, which eventually may mean no real architecture. What remains is a lot of data and a slow, painful process for users trying to become proficient with it. Attention to detail seemingly is becoming a lost art. “Let the AI do it” appears to be the preferred response. But as one becomes less proficient at spotting things that are wrong with a data modeling approach, will one even recognize a hallucination? I'm sure everything will be fine; I am just a worrier.

