Foreign Keys to the Kingdom

Bookmark and Share

In a relational database approach, regardless of whether one is considering persisted data or transitory query results, almost everything is viewed as a logical table. Associations between these envisioned tables are based on data values, versus the alternative to the pre-relational idea of linking multiple data structures via "hidden" pointers maintained by the system. Relationships among objects are ultimately derived from the semantics of a situation. "A customer order contains one or more ordered items" would be an example of an association. There are two objects, which could be considered as two tables: the CustomerOrderTable and the OrderedItemTable, and these objects have a relationship. The relational data associations articulating a relationship are straight-forward-the column, or columns, comprising the primary key of one table are repeated on the associated table. The nature of the relationship between the objects represented by tables determines which table's primary key migrates over to which one of the other tables. When a definable one-to-many relationship exists between objects, the primary key from the "one-side" migrates onto the "many-sides" table. In the current example, a single customer order may have many ordered items and presents an expectation that the primary key from the CustomerOrderTable would exist as column(s) on the OrderedItemTable. On the associated table, those repeated column(s) would be labeled as the "foreign key" into the referenced table.

Beyond the simple linkage between tables, a foreign key can serve even greater purposes. There are times when a foreign key becomes embedded within a table's primary key. In the previous example given above, the CustomerOrderTable has a column named OrderID serving as its primary key. The OrderedItemTable would have an OrderID column serving as a foreign key to the CustomerOrderTable within the structure. If the primary key from the OrderedItemTable was defined as a composite (of the OrderID column plus another column named OrderLineNo), the design's configuration states that an ordered item cannot be identified without first creating a customer order with its OrderID value. This identity-driving dependence within the design demonstrates what is referred to as a weak entity. The entity is weak because its existence is only allowed after something else exists first; from our earlier example, a customer order must exist before an ordered item can exist. Objects that can exist without such dependence are strong entities. If the OrderedItemTable were defined with a completely unique number (such as the UniqueOrderLineNo value, meaning the OrderID was not part of the primary key), then the design semantics suggest that ordered items represent a strong entity whose members can exist without a customer order.

Some database designers try to avoid such semantic issues by defining OrderID as non-nullable so that it must have a value, even when not part of the primary key. These designers then try to propose that the simple non-optionality of a foreign key sufficiently defines an entity as weak. Functionally, such mandatory relationships that are not part of a composite primary key are often meant for reference tables providing supporting details and so do not define an entity as weak. And in actual usage such dependencies drive the creation of "unknown" or "not applicable" rows in the reference table. Semantically, when not part of the primary key, a foreign key only implies the objects are related, not that one object is completely dependent on the other for its very existence. Properly expressing foreign keys (which means understanding the possible semantic subtleties involved in various alternatives), singles out mastery of the data modeling process. And truly understanding data, by implementing accurate designs, can often lead to simpler and more useful databases.