Not Everything is Important

Bookmark and Share

A database design may occasionally show evidence that it lacks proper prioritization.  Data models should express truths about the business, or about the universe of discourse.  But in expressing business truth this does not mean a data model should express absolutely every truth that anyone might conceive.  Some relationships are significant while other relationships are not.  And as a general rule, database design is not an exercise in trivial pursuit.  Insignificant truths only clutter up a design, increasing complexity, causing users’ eyes to glaze over more quickly, and adding no real value towards the endeavors of the enterprise.  The Entity-Relationship Diagram is filled with relationships, appearing as not so much a database design but as a possible circuit-board configuration.  Under the worst scenarios, extraneous relationships and their associated foreign keys can force your DBMS to do extra checking and verifying that could harm implementation performance.

Certainly, a data model can suffer from its own version of an obsessive compulsive disorder; and when encountered, these excessive obsessions may not seem completely obvious.  Data models are often busy enough on their own, even when done well.  A frequently encountered excessiveness symptom occurs with data models having foreign keys connected to absolutely every attribute defined as a date, or worse, every flag or indicator limited to a “Y’ or an “N”.  These petty relationships are of no value to anyone.  Foreign keys like these on dates or flags are not defining business truths, but instead they exist only to serve a mysterious god of extravagance.  Attributes that need restriction to limited values such as “Y” or “N” are more easily enforced via a value constraint instead of a foreign key, and date data types are enforced by the DBMS without a need to rely on a foreign key simply to determine if a date is valid.  The relationships themselves are associated with entities only defined in the model because the DBMS does not support a true user-defined domain.  Under a dimensional approach there may be exceptions.  For example, if one has a date dimension and uses the date values as the dimension’s key, foreign keys may be defined for each date value within a fact table that may be involved in analysis.  But even under this scenario, if a foreign key from the data dimension is defined into a fact table’s load date, then the belief should be that the load date normally will be used for analysis.  Another aspect in this situation comes with overthought designs which may exhibit additional associations that are more truly derivable across multiple entities.

Superfluous entities and relationships are even more inappropriate on the conceptual side, and placing such things inside an enterprise conceptual model should be avoided.  Generally, enterprise models should have a significant amount of abstraction.  Care should be focused on those items of the greatest importance to an organization.  Being buried under an infinite amount of minutia only provides an enterprise model that ultimately will not be used by business personnel.

A good normalized database design follows Einstein’s famous quote about being as simple as possible, but no simpler.  Truly, the lines are fuzzy; there is a not a test that can be applied to determine what must stay and what must go with a design.  Choices are a judgment call, or perhaps it is an artistic decision.  Making those decisions requires deciding what is important and what is useful, versus what might be trivial or insignificant.  And making the best choices in those selections is what results in either a good data model or a weak one.  If this means that data modeling is still more of an art than a science, so be it.