Fourth and fifth normal forms send us down a pretty, yellow brick road leaving behind functional dependencies to gather up join dependencies and their special child, multi-valued dependencies.
Explanations for these extra dependencies can be conflicting, which makes them perfectly at home in Wonderland with Alice. We’ll try and explain them a little here, but be prepared, because others will claim this is wrong.
Fourth normal form involves multi-valued dependencies. Multi-valued dependencies are a special case of a join dependency where the table consists of all primary key attributes. These special cases most often are some three-column combinations, where obviously columns A, B, and C are a unique grouping; but we could break this entity apart into two entities. One entity consists of columns A and B, and the second entity has columns A and C. Entity ABC is simply entity AB joined with entity AC on the column A. The most sited example is a messy table of Teacher/Book/Author that apparently no one noticed should be a Teacher/Book table and a Book/Author table. These should have started as independent lists, but for unknown reasons everything was lumped together and considered perfectly fine until we get to this point in stepping through the normalization process.
Join dependency evaluation, which is what 5th normal form is about, asks the data modeler to look at a table structure and try to imagine two or more structures containing subsets of the attributes that the starting entity included. Can those same attributes be configured into other “smaller” tables that when joined together would give you the starting structure and the exact same content? There are rules—one cannot simply create tables with the same primary key and have one non-primary key column on each of those tables. (We save that kind of drastic entity chopping for 6th normal form.)
Fortunately, an entity/table with a single primary key column is immune from troubles with this type of issue. One should start with a multi-column primary key entity, and the smaller tables proposed would each have a subset of the primary keys from the original, spread across the new tables in some combination. Often this gets very confusing because entities that are smaller in terms of columns are always better. This better/smaller desire means that a configuration of three two-column entities is more desirable than one three-column entity. An often-used example of this situation involves an entity containing subject/class/teacher that can be rebuilt by joining entity subject/class to entity class/teacher and then to entity subject/teacher. It is because of this ability to bring things back together via joins that 5th normal form is also called “Project-Join Normal Form.”
While all this may seem mysterious, in reality, such join and multi-value dependency sightings occur because of a combination of events; perhaps the data modeler is unfamiliar with the subject area being modeled, and potentially the user community doesn’t relate to the “double-talk” from the data modeler when they speak of “business rules.” Because of these “blind spots,” business objects that should have been identified and split into separate objects earlier in the process of reviewing data, have been left instead in a mushed-together state. Having been completely ignored and overlooked previously, 4th and even 5th normal forms provide a space for an intrepid data modeler to notice that these things should have been split up, and to do so now. And with this change, suddenly, voila!, normalization magic is performed. But then again, others may have drastically different explanations for 4th and 5th normal forms. So be cautious about listening to me. If you too, much like the Red Queen, try to believe up to six impossible things before breakfast, then you may now proclaim dominion over 4th and 5th normal forms.