Don’t Be Parsimonious with Abbreviations

Data modeling has an intimate relationship with abbreviations. Ever since the first data model was created, there have been circumstances where fully worded names for tables or columns were simply too long to implement within one tool or another, so words within names were shortened to make things fit. Even without character-count limits, developers and even users have wished to spare their fingers the extra keystrokes when composing queries, and have requested that names be as short as possible.

Occasionally one runs into an individual who cannot conceive that anyone on the planet would abbreviate something in a different fashion than they do; but more often data modelers tend to enjoy consistency, and when possible, employ rules to support consistent outcomes. Abbreviations are no exception.

Formalizing a data modeling practice usually starts with the creation of a naming standard. The naming standard will likely include an official listing of words and their approved abbreviations, and a set of guidelines on how to create new abbreviations. Abbreviation rules could include such things as: remove repeated letters, remove vowels. Controversies may arise: if only some vowels are removed, which vowels, and when, can become heated issues. Which objects should be abbreviated? Which objects should never be abbreviated? Other controversies an organization may encounter revolve around the idea of doubling up. Can multiple words be allowed to resolve into the same abbreviation? Or a wilder idea: Can a single word be allowed more than one abbreviation?
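As a rough illustration of how such rules behave, the sketch below applies two of them mechanically: drop every vowel after the first letter, then collapse repeated letters. The function name and the decision to always keep the first letter are assumptions made for this example, not part of any particular naming standard.

```python
import re

VOWELS = set("aeiou")


def rule_based_abbreviation(word: str) -> str:
    """Abbreviate one word with two illustrative rules (hypothetical helper).

    Rule 1: keep the first letter, then remove every later vowel.
    Rule 2: collapse any run of the same letter into a single letter.
    """
    word = word.lower()
    if not word:
        return word
    # Rule 1: the first letter is kept even when it is a vowel.
    skeleton = word[0] + "".join(ch for ch in word[1:] if ch not in VOWELS)
    # Rule 2: "addrss" becomes "adrs".
    return re.sub(r"(.)\1+", r"\1", skeleton)
```

Under these assumed rules, "customer" becomes "cstmr" and "address" becomes "adrs", but "form" and "forum" both become "frm". Purely mechanical rules run straight into the doubling-up question, which is one reason an official listing is usually kept alongside the guidelines.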

To varying degrees, data modeling tools support the usage of abbreviations. Lists of words and their associated abbreviations can be easily imported and referenced within the tool. This support can be leveraged to help automate parts of the process for building logical and physical data models. But some IT shops make their jobs harder by approaching abbreviations in a piecemeal fashion. A rule such as, “Always use full words, unless a limit of X letters is exceeded, then use abbreviations” may sound reasonable.

Such a rule may feel justified to those who would prefer to deal only in fully spelled-out words. But when the approach to abbreviations is based on exceptions, abbreviations can only be applied manually. When abbreviation automation is used, growing lists of abbreviations can become another source of heartache. If a circumstance arises where a previously unabbreviated word must now be abbreviated, columns or tables that are not part of the requested change may be altered unexpectedly. Data modeling tools often allow previously existing names to be hardened, so that older columns can remain unchanged as newer standards are enforced on new columns. But the steps to harden names are usually not performed.
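The unexpected alteration can be shown with a small sketch. It assumes, for illustration only, that the automation replaces every word found on the listing and passes any other word through unchanged; the function name, the listing contents, and the underscore separator are all hypothetical.

```python
def exception_based_name(logical_name: str, abbreviations: dict) -> str:
    """Build a physical name; words missing from the listing stay spelled out."""
    words = logical_name.lower().split()
    return "_".join(abbreviations.get(word, word) for word in words)


# Hypothetical listing: "status" has never needed an abbreviation so far.
listing = {"customer": "cust", "number": "nbr"}
before = exception_based_name("Customer Status Code", listing)

# A new, unrelated column forces "status" onto the listing...
listing["status"] = "stat"

# ...and regenerating the old column's name now yields a different result.
after = exception_based_name("Customer Status Code", listing)
```

Here `before` is "cust_status_code" and `after` is "cust_stat_code", even though nobody asked for that column to change.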

A simple and consistent approach is to use full words for all the logical object names, then always abbreviate every word for the physical name. “Always abbreviate” means every single word used within a table or column name exists on the official abbreviations listing, even when the abbreviation is exactly the same as the word itself. In this fashion, growth of the abbreviations listing will not start a cascade of changes to previously existing names.
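A minimal sketch of the “always abbreviate” rule follows. It assumes a listing held as a simple dictionary and treats any word missing from that listing as an error rather than letting it pass through; the names and sample entries are hypothetical.

```python
# Hypothetical listing: every usable word has an entry, even if it maps to itself.
OFFICIAL_LISTING = {
    "customer": "cust",
    "status": "status",  # identity entry: the abbreviation equals the word
    "code": "cd",
}


def always_abbreviated_name(logical_name: str, abbreviations: dict = OFFICIAL_LISTING) -> str:
    """Build a physical name, refusing any word that is not on the listing."""
    words = logical_name.lower().split()
    missing = [word for word in words if word not in abbreviations]
    if missing:
        raise KeyError(f"words not on the abbreviations listing: {missing}")
    return "_".join(abbreviations[word] for word in words)
```

With this listing, "Customer Status Code" resolves to "cust_status_cd". Adding a new word later only adds a new key, so names built from the existing entries come out the same every time they are regenerated.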

When circumstances occur that require a previously defined word to have a more drastic abbreviation, rather than changing the pre-existing word, add a larger component, such as a two- or three-word phrase that includes the word (addressing the new circumstance), and then abbreviate that phrase more harshly than the original word by itself. And lastly, if desired, expand abbreviations back into full words for names within views. In this fashion, only the views will require manual governance; the rest is fairly automated.
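One way to picture phrase-level entries is a lookup that tries the longest phrase first and falls back to single words. The sketch below is only an illustration: the listing contents, the function names, and the three-word limit are assumptions, and the reverse expansion further assumes that every abbreviation is unique and contains no underscore.

```python
# Hypothetical listing keyed by word tuples, so phrases and single words coexist.
PHRASE_LISTING = {
    ("customer",): "cust",
    ("account",): "acct",
    ("identifier",): "id",
    ("customer", "account"): "cstac",  # harsher phrase-level entry added later
}


def phrase_aware_name(logical_name, abbreviations=PHRASE_LISTING, max_phrase=3):
    """Abbreviate a logical name, preferring the longest matching phrase."""
    words = logical_name.lower().split()
    parts, i = [], 0
    while i < len(words):
        # Try a three-word phrase, then two words, then the single word.
        for size in range(min(max_phrase, len(words) - i), 0, -1):
            key = tuple(words[i:i + size])
            if key in abbreviations:
                parts.append(abbreviations[key])
                i += size
                break
        else:
            raise KeyError(f"not on the abbreviations listing: {words[i]}")
    return "_".join(parts)


def expand_for_view(physical_name, abbreviations=PHRASE_LISTING):
    """Expand a physical name back into full words, e.g. for a view column."""
    reverse = {abbr: "_".join(key) for key, abbr in abbreviations.items()}
    return "_".join(reverse[part] for part in physical_name.split("_"))
```

In this sketch "Customer Identifier" still resolves to "cust_id", because the single-word entry for "customer" was never touched, while "Customer Account Identifier" picks up the phrase entry and becomes "cstac_id". Expanding "cstac_id" for a view gives "customer_account_identifier".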