Think of Solutions as Independent Enterprises

As APIs and microservices have spread, every organization has more data stores to manage and more sources of data to consider. Sorting through the data structures of operational solutions can be mind-numbing because of their variety, or frustrating because many vendors provide so little detail. Source systems are no longer the monoliths they once were. A cloud-based solution offered to enterprises may well have multiple physical data stores spread across its internal microservices. This can preclude a simple download from a centralized database for a customer organization to ingest into its analytics platform. Increasingly, developers of cloud-based solutions respond as if they were hearing an unknown language when customers ask for a “dump” of their database. It is entirely possible that the solution does not have a database to dump. The persisted data could be JSON documents that may or may not be stored within a DBMS. Or, even more likely, the components or services that make up the solution each have their own separate database, with each microservice using a completely different tool. As a result, one part of the data may sit in a document store, another in a key-value store, and a third in a graph database.

An API for logically unloading data may be the only option for obtaining data for use in, or integration into, an organization’s analytics platform. An organization should therefore vet how this data acquisition process will be handled. Ignoring the need for this function until after a cloud-based tool is deployed can lead to very unpleasant surprises that come far too late to avoid. Some vendors have anticipated their customers’ needs and have a clean API in place for providing data. However, not every vendor is so forward-thinking. Some may provide only an inflexible one-time snapshot, with follow-on delta files. The customer must then code the delta logic for applying updates and storing history within its data warehouse or data lake, and problems may arise if additional one-time snapshots are ever needed. Other vendors may take the stance that customers must pay as they go: a customer that wishes to actually have a copy of its own data must pay a fee for developing an extraction process and again for each execution of it. And yet others may provide nothing but tap dancing and double-talk as they try to keep customers confused until their internal development advances far enough to address such questions. Exposing this aspect of a prospective tool is vital before an organization commits itself to a solution.
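To make the snapshot-plus-delta burden concrete, here is a minimal sketch of the delta logic a customer might have to write. Everything in it is an assumption for illustration: the record layout (`key`, `op`, `data`), the two operations (`upsert` and `delete`), and the `HistoryRow` structure are hypothetical, and a real vendor's delta files may look quite different.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class HistoryRow:
    """One version of a record, kept so that history is never overwritten."""
    key: str
    data: dict
    valid_from: date
    valid_to: Optional[date] = None  # None marks the currently open version


def apply_delta(history: List[HistoryRow], delta: List[dict], as_of: date) -> None:
    """Apply one delta file (hypothetical layout) to the history in place."""
    # Index the open version of each key so lookups are cheap.
    current: Dict[str, HistoryRow] = {
        row.key: row for row in history if row.valid_to is None
    }
    for record in delta:
        key, op = record["key"], record["op"]
        open_row = current.get(key)
        if op == "delete":
            # Close the open version; earlier versions stay as history.
            if open_row is not None:
                open_row.valid_to = as_of
                del current[key]
        elif op == "upsert":
            if open_row is not None and open_row.data == record["data"]:
                continue  # nothing changed, keep the open version
            if open_row is not None:
                open_row.valid_to = as_of  # close the superseded version
            new_row = HistoryRow(key, record["data"], as_of)
            history.append(new_row)
            current[key] = new_row
        else:
            raise ValueError(f"unknown delta operation: {op!r}")
```

Loading the initial snapshot amounts to calling `apply_delta` with every record marked as an `upsert`. A second full snapshot is harder, because records missing from it must be inferred as deletes, which is one reason additional snapshots can cause trouble.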

What this signifies is that, more than ever, solutions need clarity about the meaning of the data shepherded within them. If a solution were compared to an independent enterprise, there would be a screaming-loud need for an “enterprise data model” for the solution, that is, a solution data model. With such a model, clarity and understanding are enhanced, and conversations with the solution’s data users need not frustrate everyone involved. The data model, and an understanding of its composition, provides a framework for defining how the elements of an extraction API are broken down. Having this clarity exposes where the rubber meets the road, because physical structures are not logical structures. Across the industry, many people have always struggled to conceptualize this distinction. Data architects supporting this kind of environment cannot afford any ambiguity about the differences between physical and logical. Even when differing data structures and DBMS platforms are used for separate microservices, those implementations should be able to be rationalized against the master solution logical data model. Confusing the logical with the physical results in data models that are much less helpful than they might otherwise be. The bottom line is that data modeling practices are more necessary now than ever before.
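One way to picture rationalizing physical implementations against a solution logical data model is a simple mapping table that can be checked for gaps. The sketch below is illustrative only: the entity, the attribute names, the service names, and the path notation are all invented for the example and do not come from any particular product.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class PhysicalField:
    service: str     # hypothetical microservice name
    store_type: str  # e.g. "document", "key-value", "graph"
    path: str        # where the value lives in that store (notation is illustrative)


# Logical model: entity name -> attribute names (invented for this example).
LOGICAL_MODEL: Dict[str, Set[str]] = {
    "Customer": {"customer_id", "name", "referred_by"},
}

# Physical-to-logical mapping: (entity, attribute) -> physical locations.
MAPPING: Dict[Tuple[str, str], List[PhysicalField]] = {
    ("Customer", "customer_id"): [
        PhysicalField("profile-service", "document", "$.customer.id"),
        PhysicalField("session-service", "key-value", "cust:{id}"),
    ],
    ("Customer", "name"): [
        PhysicalField("profile-service", "document", "$.customer.name"),
    ],
}


def unmapped_attributes(
    model: Dict[str, Set[str]],
    mapping: Dict[Tuple[str, str], List[PhysicalField]],
) -> List[Tuple[str, str]]:
    """Return logical attributes that no physical structure accounts for."""
    return sorted(
        (entity, attr)
        for entity, attrs in model.items()
        for attr in attrs
        if not mapping.get((entity, attr))
    )
```

Here `unmapped_attributes(LOGICAL_MODEL, MAPPING)` reports `("Customer", "referred_by")`, the kind of gap worth raising with a vendor before an extraction API is designed around the model.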