Entities I Have Known and Loved

In science fiction movies, the word “entity” pops up when discussing an as-yet-unseen but at least suspected alien presence, presumed evil (or sometimes evil presence, presumed alien). This entity is not named until half the cast has vanished; then we might hear, “It’s Cthulhu’s daughter and she’s angry!” But not all entities are evil or angry. A tamer understanding of entity is simply a separate and existing object or, dare I say, thing. Any organization, person, or place can be considered an entity. In data modeling practice, the entity is foundational and a key element of the Entity-Relationship Diagram, or ERD. The first step in attempting a logical data model is to list the business concepts about which data needs to be managed, manipulated, reported, or in some way retained. Purchase, Product, Customer, Inventory . . . this list is a first draft of one’s logical entities.

Logical Entities

Quite often people are introduced to logical entities by statements claiming that entities are the logical equivalent of a physical database table. And while an equivalence can be drawn, people still get confused when they assume tables and entities are truly the same. The ideas are equivalent within their respective contexts (logical vs. physical), but the two follow differing rules on how they are implemented. Tables and entities are exactly equivalent only if one builds tables that map one-for-one to the specified entities. There is no requirement that physical implementation and logical design be in lockstep unless one makes that requirement a rule in the process. The number of database and database-like tools is large, and many of them may require that multiple logical concepts be bound together into a single physical structure. Newcomers to the world of data modeling often have a very hard time assimilating the distinction between logical and physical. If a database table looks a certain way, their first guess is that the logical entity must be “exactly the same.” But physical tables may be denormalized and otherwise abused, meaning one table may be a mangled version of many logical entities and include other non-logical doodads. The idea that some items may end up logical only or physical only adds to the confusion.
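To make the distinction concrete, here is a minimal sketch in Python rather than in any particular database tool. It assumes three hypothetical logical entities (Customer, Product, and Purchase, borrowed from the list above) and one denormalized physical structure that binds them together. The attribute names, the `to_physical_row` helper, and the `load_batch` column are all invented for illustration; they stand in for whatever a real design would specify.

```python
from dataclasses import dataclass


# Three hypothetical logical entities. Each one is a separate business concept.
@dataclass(frozen=True)
class Customer:
    customer_id: int
    name: str


@dataclass(frozen=True)
class Product:
    product_id: int
    description: str


@dataclass(frozen=True)
class Purchase:
    purchase_id: int
    customer: Customer
    product: Product
    quantity: int


def to_physical_row(purchase: Purchase) -> dict:
    """Flatten one Purchase and its related entities into a single wide row."""
    return {
        "purchase_id": purchase.purchase_id,
        "customer_id": purchase.customer.customer_id,
        "customer_name": purchase.customer.name,
        "product_id": purchase.product.product_id,
        "product_description": purchase.product.description,
        "quantity": purchase.quantity,
        # A physical-only "doodad" with no logical counterpart (assumed here).
        "load_batch": 1,
    }
```

The single row carries pieces of three entities plus one physical-only column, so reading the logical model straight off the table would be a mistake: there is no "purchase-with-customer-name-and-load-batch" entity in the business.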

In efforts to increase speed, developers often jump straight into defining tables and other physical structures, ignoring the establishment of a logical design. Not creating a logical data model first leads to increased development times, as business rules that would have been clearly laid out in a logical design are missed, causing what may be drastic rework later in the process. The logical data model forces an examination of what the data in scope means, how it interrelates and interacts with the rest of the data, and ultimately clears a path for a more focused and error-minimized development effort. One can state that if the logical data model has not been attempted, one has not really defined the business requirements of the solution one is building. And if the requirements are unidentified, one is hunting bear, or Cthulhu’s daughter, with a bow and arrow while blindfolded.

Determining the logical entities in scope for a design can often be done quickly. Most entities are obvious; ask yourself, “What concepts is the business thinking of?” as developers work through whatever process they are defining. What concepts will the business want to report on from this data? Those questions help determine which entities your solution will revolve around; those concepts will get the attention and love from what is built. The physical structures may need to mash ideas together to provide proper performance, hiding the logical entities away. But the logical objects and their related business rules must still be enforced by the logic within the processing. Hidden or not, the entities prevail.
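As one illustration of enforcing a logical rule in the processing, consider a denormalized table that repeats the customer name on every purchase row. The logical Customer entity implies that one customer has one name, but nothing in the physical structure guarantees it. The sketch below assumes rows arrive as dictionaries with hypothetical `customer_id` and `customer_name` keys; the function name is likewise invented for this example.

```python
from collections import defaultdict


def customers_with_conflicting_names(rows):
    """Return the customer ids that appear with more than one name."""
    names_by_customer = defaultdict(set)
    for row in rows:
        names_by_customer[row["customer_id"]].add(row["customer_name"])
    # A logical Customer has exactly one name; more than one is a violation.
    return {cid for cid, names in names_by_customer.items() if len(names) > 1}
```

A load process could run a check like this before writing to the wide table and reject or repair any offending rows. The Customer entity is hidden in the physical design, yet its rule is still being policed.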