Design Options, Varying Implementations Foster Varying Implications

Sep 14, 2009

By Todd Schraml

Composite keys are an implementation of business rules within the database. As an example, a table named INVOICE has a composite primary key consisting of Account_Number and Invoice_Date. In this example several possible rules are being expressed. The Account_Number which partially identifies an INVOICE instance implies that an account must exist before an invoice can exist. In addition to this INVOICE table, one expects to see a table named ACCOUNT with only the Account_Number as its primary key. Likewise, a referential integrity constraint would be defined between the INVOICE and the ACCOUNT tables based on that Account_Number value. In this manner, the DBMS would prevent Account_Number values from insertion into the INVOICE table unless they already existed in the ACCOUNT table.

The second element of the primary key for our INVOICE table is a column named Invoice_Date. Functionally, only one invoice can exist for a given account for each calendar date. If this column were a full timestamp data-type, containing both the day and the time, then the designed data structure would support an invoice creation once for any account per the smallest unit of time dictated by the timestamp, be it milliseconds, microseconds, etc. Invoice_Date values could be restricted further by the existence of a distinct INVOICE-RUN-DATE reference table which could, via another referential integrity constraint, limit Invoice_Date values to a very specific list, for instance Tuesdays. Through the mechanism of the referential integrity constraints and referred tables' value lists, the DBMS responds as if it were a business rules engine. Having these business rules imposed by the DBMS means that such imperatives are enforced regardless of whether a user works through an operational application or executes a SQL query to change data content.

A variation on the INVOICE table design might have a generated Invoice_Number column as the primary key. The earlier mentioned Account_Number and Invoice_Date columns may still exist on this new version of the table, but neither column is serving as a component of the primary key. By using this alternative design, some fluctuations can be allowed. Perhaps an account has not been created, but an invoice still exists. Maybe more than one invoice is generated for the same Account on the very same Invoice_Date value. While this looseness of control in relation to the Account_Number and Invoice_Date columns can occur, the imposition of the same restrictions from the original design may also be permitted. This restriction enforcement would start with defining the same referential integrity rules mentioned above, plus neither of these two columns would allow nulls, and lastly, a unique index would be defined across the Account_Number and Invoice_Date columns. Then from a rule-wise perspective we would have just as tightly controlled an arrangement as in our first example.

Even with the ability to tighten or loosen control over the two columns (Account_Number and Invoice_Date), such variety does not mean that the second case is better than the first. In the first example, the primary key proposes that an account must exist in order to generate an invoice, and if true, that is an important business rule. Conveying such rules easily and quickly is useful to those who develop and maintain business solutions. The latter example serves best when there is reason to suspect that a business will desire these specific flexibilities. If no such flexibility is necessary and we still opt for the second approach, then we have created a design that is less clear to read and allows for options that may be used inappropriately. Good designers focus intently on understanding current and future data usage in order to work toward the goal of a data model that is fit for use.

Newsletters

Design Options, Varying Implementations Foster Varying Implications

White Papers

Sponsors