Not Everything Ends

Jun 8, 2011

By Todd Schraml

Occasionally, one sees a data structure abomination. This atrocity involves an object of almost any type, in almost any database wherein the object has a start date but no end date. It is not that the finish date currently has no value and is null; it is that the end date does not even exist on the table structure. The stop date was never intended to exist. The object in question starts, but it doesn't ever end.

Completion Timeframe

Actually, the object does end, but that ending is implicit not explicit. The completion timeframe is implied by the existence of the same object identifier within another row having a start date that is after the start date of the previous row. In this manner, the start date of the second row for that object is the effective stop date for the previously existing row.

The object could be something as simple as a price in a list where one has an identifier, a price and a start date, and then another price and a start date; or it could be something more complex, such as eligibility for some kind of coverage. Even without a termination date, one can query such tables to find a point-in-time value, but the syntax put into use becomes quiet lengthy.

The Two-Query Approach

Effectively, two queries are generated and joined to provide a point-in-time perspective. The first part is a subquery that identifies the objects within scope and uses the MAX() function to return the highest start date value which is less than a specific point-in-time across the rows associated with each identified object. The second query also pulls back the same object identifiers that are within scope as the first query, and then joins into the first query results in order to select only the row for each object identifier having that returned highest value begin date. Once joined together in this fashion, the second query may now also bring back the other attributes that are desired by the requester.

Such a two-query approach is very complex when compared to the alternative of simply selecting objects whose rows have a start date less than a specific point-in-time and an end date after the specific point-in-time.

Excuses and Reasons

The excuse for the missing end date technique is that updating the rows when they expire would be too costly. Or, those tables in the environment must as a general and conscientious rule never, ever be updated, only inserts should ever be allowed. However, the general reason for missing an end date is that the logic to find the matching record and update it, is more complicated than what the developer desires to deal with during the initial development stage. The developer may feel the pressure of tight deadlines and sees the start-date-only approach as a corner to cut, or it may be that the matching logic involved is fairly complex and being avoided.

Longer Queries

If one truly has a circumstance where the volume of records absolutely precludes updating and also prohibits drop-and-replace logic, then one may be forced to have start dates on a record without stop dates. Under all other circumstances, the no-end-date approach is simply a way for the initial developer to avoid coding extra logic. And in this case, that lack of initial effort generates a steady stream of much longer queries created by everyone else who may ever need to use this data.

The Long-Term Costs

The long-term costs continue to grow with every new query written. This absent-end-date is often a situation where the ends do not justify the means. If one can generate a finish date, one should. Not populating a completion date is more often simply rude instead of being a practical tactic.

Newsletters

Not Everything Ends

White Papers

Sponsors