Newsletters




Which Data Modeling Pattern is Better: “Discrete Attributes” or “Going Vertical”?


Data modelers face a choice when encountering multiple variations of a data item.  The item itself is unimportant; it could be a name, a physical address, a phone number, an email address, a shoe size, or almost anything.  For now, let’s assume a name is being addressed.  If an organization uses multiple names for an object, whether a “formal name” versus an “informal name,” or a “short name” versus a “long name,” or anything else, the simple existence of more than one name in and of itself  does not demand the creation of a “Name Table” inside the data design.   

Creating a “Name Table” structure is an approach that can be called “Going Vertical” as each of the types of names becomes its own distinct row in such a table.  Tables like this Name Table will generally have columns for the name value, the name-type value, perhaps a start and stop date, perhaps a source-name.  The natural key likely would be the name and name-type (plus the start date, if present).  Then a secondary name-type reference table also would be added into the design.

Alternatively, a “Discrete Attributes” approach could be used to contain this data, meaning stringing data elements out horizontally within a single structure, having individualized buckets for each name - a short-name column, then a long-name column, and so forth.  The strung-out items become additional descriptors functionally dependent upon the entity they are contained within. 

One could look at these distinctions as the differences between using a normalized versus a multidimensional approach; and as a default view, that is a valid perspective, albeit not the only perspective.  Under the proper circumstances, even a normalized approach might consider using a Discrete Attributes style.  If an organization has a standardized set of names that have not changed and likely will not change in the foreseeable future, then Discrete Attributes could be used.  Additionally, if the existing rules for when each name is used are clear and well followed within the organization, then breaking these multiple names apart into separate rows in a separate set of tables to follow a Going Vertical style is really just overkill.  Going Vertical structures will be creating added complexity that does not exist within the target organization, and that complexity is not needed.

On the other hand, Going Vertical is appropriate when brand new name types are created here and there, or when there is fluidity in which name is used when, or how.  And Going Vertical is a fine choice when the circumstances are simply unknown and subject-matter experts are uncertain about what the future may hold.  Going Vertical allows anything to be added at any time; simply add in the new name-type to the reference and start adding content into the Name Table.  The goal is fitness for use.  Part of that use-fitness means not making it unnecessarily complicated for people, i.e., the system users.  Depending on exact circumstances, those users may be reporting end-users, report developers, or application developers.  Easing the perceived burden of the DBAs, if called upon to add a column sometime in the future, is inconsequential. 

Those who query or populate the data in one’s solution, on a day-after-day  and year-after-year basis, are the important ones to consider. Designers must focus on the longer term appropriateness of their decisions when choosing how their designs will play out; and going vertical or horizontal does have an impact over time.  Therefore, when choosing to go one way or the other, such decisions should be given conscious thought and not simply done on an automatic reflex.  Both Discrete Attributes and Going Vertical can be appropriate under the proper situations.


Sponsors