Digging Deeper into Many-To-Many Relationships

Bookmark and Share

An inherent awkwardness exists in every many-to-many relationship between entities. Ambiguity causes this persistent awkwardness, primarily because a many-to-many relationship is such a fuzzy thing. In data discovery, encountering many-to-many relationships may actually expose a level of disregard about details by the subject matter experts. For example, if one must model parents and children, initially a many-to-many relationship exists between a Parent entity and a Child entity. One Parent instance could be "John" and another "Mary." Both "John" and "Mary" are the parents of "Sam," "Suzy," and "Mike," where each represent Child instances. While the possibility to do so does exist, rarely should one find it sufficient to simply make these statements: "John is parent to Sam"; "Mary is parent to Sam"; "John is parent to Suzy"; "Mary is parent to Suzy"; "John is parent to Mike"; and "Mary is parent to Mike." A more nuanced and detailed description of these interrelationships supports informational needs, and how these descriptions evolve affects the manner in which each data model focuses on an organizational set of needs and differs from other such needs.

Worlds of options exist when looking at more detailed relationships for this Parent/Child example. One could say that "John" is Father to "Sam," and "Mary" is Mother to "Sam," and that in this view both Father and Mother constitute separate entities. By incorporating this split, we have made the many-to-many relationship a one-to-many: "John" may Father many "Child" instances, but each "Child" is "Fathered" only once. If rearing versus biological relationships is in focus, then Step-Father/Step-Mother is included. Adding in date ranges and possibly flagging biological versus legal guardian status, creates a circumstance that can bring order with several one-to-many relationships, including the establishment of a Father entity, a Mother entity, and a Child entity. Alternately, a Parent entity, a Child Entity, a Parenting Relationship entity, and a Parenting Relationship Type entity might all be created within a design. Or a Parent entity (having Mother, Father, Step-Father, and Step-Mother sub-types) with a Child Entity and a Parenting Relationship entity, might also be envisioned. In a different focus on the biological aspect, one might have each child as the offspring of a singular Union or Coupling, instantiating a "Coupling" entity, where the coupling includes a relationship between a single "Male" and a single "Female" entity instance.

The data modeler has responsibility for diving in and obtaining a level of certainty about every many-to-many relationship encountered. Regardless of a final model design, many-to-many relationships open the door to varying options, and the designer should understand why one option arises over another. This choice usually means the database designer must "annoy" subject matter experts with "trivia" that those experts would rather ignore. (If the expert has dealt with data modelers before, such annoyances are likely expected.) When subject matter experts get involved in assisting database design initiatives, it is not unusual for those experts to feel like witnesses under cross-examination by the word-parsing, semantic-game-playing data modeler.

Every circumstance of many-to-many relationships is unique. However, digging into the deeper level of descriptive granularity, in order to successfully break down a many-to-many relationship into various one-to-one and one-to-many relationship components thereby augmenting the semantic veracity of the data modeled, should always be possible. In dismantling these relationships, the data modeler can ease the creation and maintainability of the processes that manage and use the detailed data within the scope of a given project. The designer's selections should be based on understanding the data and belief in the likelihood of future organizational needs. Ideally, a well-designed model results in a happier user community, because as their processes grow the data model easily embraces that growth.