A proper database design cannot be thrown together quickly by novices. A practiced and formal approach to gathering data requirements and modeling data is mandatory, and that approach must extend to the discovery and identification of entities and data elements. Data normalization is a big part of data modeling and database design. A normalized data model reduces data redundancy and inconsistencies by ensuring that the data elements are designed appropriately.
It is actually quite simple to learn the basics of data modeling, but it can take a lifetime to master all of its nuances. But what if you do not practice data modeling and database design? Or what if you’d like to but are forced to start building before designing because of bad management, “internet time,” or any number of other bad practices?
Well, the answer, of course, is “it depends!” The best advice I can give you is to be aware of design failures that can result in a hostile database. A hostile database is difficult to understand, hard to query, and takes an enormous amount of effort to change. Of course, it is impossible to list every type of database design flaw that could be introduced to create a hostile database. But let’s examine some common database design failures.
Assigning inappropriate table and column names is a common design error made by novices. The names of database objects that store data should be as descriptive as possible so that the tables and columns are self-documenting, at least to some extent. Application programmers are notorious for creating database naming problems, such as using screen variable names for columns or coded jumbles of letters and numbers for table names.
When rushed for time, some DBAs resort to designing the database with output in mind. This can lead to flaws such as storing numbers in character columns because leading zeroes need to be displayed on reports. This is usually a bad idea with a relational database. It is better to let the database system perform the edit-checking to ensure that only numbers are stored in the column. If the column is created as a character column then the developer will need to program edit checks to validate that only numeric data is stored in the column. It is better in terms of integrity and efficiency to store the data based on its domain. Users and programmers can format the data for display instead of forcing the data into display mode for storage in the database.
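As a rough illustration of storing data by its domain and formatting it only for display, here is a minimal sketch using Python's built-in sqlite3 module. The invoice table, the account_no column, and the format_account helper are hypothetical names invented for this example. Because SQLite is loosely typed, the sketch uses a CHECK constraint to enforce the numeric domain; other database systems would typically enforce it through the column's data type alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the account number is stored as a number, not as text.
# SQLite is loosely typed, so a CHECK constraint enforces the numeric domain.
conn.execute(
    """
    CREATE TABLE invoice (
        invoice_id INTEGER PRIMARY KEY,
        account_no INTEGER NOT NULL CHECK (typeof(account_no) = 'integer')
    )
    """
)


def try_insert(value):
    """Return True if the database accepts the value, False if it rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO invoice (account_no) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False


def format_account(number, width=6):
    """Add leading zeroes for display only; the stored value stays numeric."""
    return f"{number:0{width}d}"


accepted = try_insert(42)    # numeric data is stored
rejected = try_insert("4A")  # non-numeric data is refused by the database

stored = conn.execute("SELECT account_no FROM invoice").fetchone()[0]
display = format_account(stored)
print(stored, display)
```

The database does the edit checking, and the leading zeroes appear only when a report or screen needs them.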
Another common database design problem is overstuffing columns. This actually is a normalization issue. Sometimes, a single column is used for convenience to store what should be two or three columns. Such design flaws are introduced when the DBA does not analyze the data for patterns and relationships. An example of overstuffing would be storing a person’s name in a single column instead of capturing first name, middle initial, and last name as individual columns.
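To make the name example concrete, here is a small sketch using Python's built-in sqlite3 module. The person table, its column names, the sample rows, and the display_name helper are all assumptions made up for illustration. With the name parts held in separate columns, sorting or searching by last name is a plain query, and the full name can still be assembled whenever it is needed for display.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: one column per name part instead of one stuffed column.
conn.execute(
    """
    CREATE TABLE person (
        person_id      INTEGER PRIMARY KEY,
        first_name     TEXT NOT NULL,
        middle_initial TEXT,
        last_name      TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO person (first_name, middle_initial, last_name) VALUES (?, ?, ?)",
    [("Ada", "K", "Young"), ("Ben", None, "Adams"), ("Cara", "M", "Lopez")],
)

# Sorting by last name needs no string parsing.
by_last_name = [
    row[0]
    for row in conn.execute(
        "SELECT last_name FROM person ORDER BY last_name, first_name"
    )
]


def display_name(first, middle, last):
    """Assemble a full name for display, skipping a missing middle initial."""
    parts = [first, f"{middle}." if middle else None, last]
    return " ".join(part for part in parts if part)


full_names = [
    display_name(*row)
    for row in conn.execute(
        "SELECT first_name, middle_initial, last_name FROM person ORDER BY person_id"
    )
]
print(by_last_name)
print(full_names)
```

Going the other way, reliably splitting a single free-form name column into its parts, is much harder, which is why the separation is best made at design time.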
Poorly designed keys can wreck the usability of a database. A primary key should be nonvolatile because changing the value of the primary key can be very expensive. When you change a primary key value you have to ripple through the foreign keys to cascade the change into every child table. A common design flaw is using a Social Security number for the primary key of a personnel or customer table. This is a flaw for several reasons, two of which are that Social Security numbers are not necessarily unique, and that if your business expands outside the U.S., no one there will have a Social Security number to use. So then what do you store as the primary key?
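One common alternative, sketched below with Python's built-in sqlite3 module, is a system-assigned surrogate key, with the Social Security number kept as an ordinary, optional attribute. This is only one possible approach, not a prescription from the text above. The employee and assignment tables, their columns, and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Hypothetical schema: emp_id is a nonvolatile surrogate key; national_id is
# an ordinary attribute that may be missing, corrected, or even duplicated.
conn.executescript(
    """
    CREATE TABLE employee (
        emp_id      INTEGER PRIMARY KEY,
        national_id TEXT,
        last_name   TEXT NOT NULL
    );
    CREATE TABLE assignment (
        assignment_id INTEGER PRIMARY KEY,
        emp_id        INTEGER NOT NULL REFERENCES employee (emp_id),
        project       TEXT NOT NULL
    );
    """
)

conn.execute(
    "INSERT INTO employee (emp_id, national_id, last_name) VALUES (1, '000-00-0001', 'Rivera')"
)
# An employee outside the U.S. simply has no value in this column.
conn.execute(
    "INSERT INTO employee (emp_id, national_id, last_name) VALUES (2, NULL, 'Okafor')"
)
conn.execute("INSERT INTO assignment (emp_id, project) VALUES (1, 'Payroll')")

child_before = conn.execute("SELECT emp_id, project FROM assignment").fetchall()

# Correcting the national identifier touches one row in one table.
cursor = conn.execute(
    "UPDATE employee SET national_id = '000-00-0002' WHERE emp_id = 1"
)
rows_changed = cursor.rowcount

child_after = conn.execute("SELECT emp_id, project FROM assignment").fetchall()
print(rows_changed, child_before == child_after)
```

Because the child table refers to emp_id, correcting the identifier never ripples through the foreign keys.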
Actually, failing to account for international issues can have greater repercussions. For example, when storing addresses, how do you define ZIP code? ZIP code is a U.S. code, and although many countries have similar codes, they are not necessarily numeric. And state is a U.S. concept too. Of course, some other countries have states or similar concepts (Canadian provinces, for example). So just how do you create all of the address columns to ensure that you capture all of the information for every person to be stored in the table, regardless of country? The answer, of course, is to conduct proper data modeling and database design.
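The right address design depends on the data requirements you gather, so the following is only a sketch of one possible shape, written with Python's built-in sqlite3 module. The address table, its column names, the two-letter country code, and the sample rows are all assumptions made for illustration. The postal code is stored as character data because its domain is an identifier that is not necessarily numeric, and the state or province becomes an optional, generically named region.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: generic column names instead of U.S.-specific ones.
conn.execute(
    """
    CREATE TABLE address (
        address_id   INTEGER PRIMARY KEY,
        street       TEXT NOT NULL,
        city         TEXT NOT NULL,
        region       TEXT,           -- state, province, or similar; optional
        postal_code  TEXT,           -- character data: not every code is numeric
        country_code TEXT NOT NULL CHECK (length(country_code) = 2)
    )
    """
)

# Illustrative sample rows only.
sample_rows = [
    ("1 Main St", "Springfield", "IL", "62701", "US"),
    ("2 Rue Principale", "Montreal", "QC", "H2Y 1C6", "CA"),
    ("3 High Street", "London", None, "SW1A 1AA", "GB"),
]
conn.executemany(
    "INSERT INTO address (street, city, region, postal_code, country_code) "
    "VALUES (?, ?, ?, ?, ?)",
    sample_rows,
)

codes_by_country = dict(
    conn.execute("SELECT country_code, postal_code FROM address").fetchall()
)
without_region = conn.execute(
    "SELECT COUNT(*) FROM address WHERE region IS NULL"
).fetchone()[0]
print(codes_by_country, without_region)
```

A real model would likely need more than this, such as multiple address lines or country-specific validation rules, which is exactly what the up-front analysis is meant to uncover.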
Denormalization of the physical database is a design option but it can only be done if the design was first normalized. How do you denormalize something that was not first normalized? Actually, a more fundamental problem with database design is improper normalization. By focusing on normalization, data modeling and database design, you can avoid creating a hostile database.
Without proper up-front analysis and design, the database is unlikely to be flexible enough to easily support the changing requirements of the user. With sufficient preparation, flexibility can be designed into the database to support the user’s anticipated changes. Of course, if time is not taken during the design phase to ask the users about their anticipated future needs, you cannot create the database with those needs in mind.
Taming the Hostile Database
Data is the heart of modern business, and database design is the armor that protects that heart. If proper design is not a component of the database creation process, you will wind up with a confusing mess of a database that may work fine for the first application but not for subsequent applications. And heaven help the developer or DBA who has to make changes to the database or application because of changing business requirements. That DBA will have to try to tame the hostile database!