Everybody Needs a Data Warehouse

Every organization needs a data warehouse. No exceptions. Every company needs correlated data about its common business objects, drawn from each of its functional applications. They need to identify trends by seeing current and historical activity. They need different people who ask the same question about the same point in time to get the same answer, every time. They need to see what is now, and what is changing, across what they know as “the enterprise.”

Ideally, organizations also need externally available details about their customers, their vendors, and their business partners. Such informational needs are simple common sense. This set of common needs does not mean every organization must build the same set of structures, use the same tools or approaches, spend the same amount of capital, or end up with a data warehouse implementation that looks much like everyone else’s. No, like people, every corporate entity is unique.

Some organizations may function best with an elaborate and expansive business intelligence ecosystem composed of a data lake, data warehouse, data marts, operational data stores, exploration data warehouses, and more. Other organizations may have a few spreadsheets, updated each day with new activity, serving as their source of all knowledge, and that single source may be sufficient, for today. A data warehouse has never been a one-size-fits-all kind of solution. Such variations exist and should be accepted.

It is said that for data to be useful, it must first be understood. Organizing, cleansing, standardizing, and integrating business data is the function of a data warehouse, data hub, repository, or whatever term one wishes to apply. Taken together, these activities succeed as the organization grows into understanding its data. And again, in some organizations, the volume of data, the number of source systems, the complexity of operational rules, the kinds of platforms implemented, the change management practices (or lack thereof), and a million other details may make even the thought of a “data warehouse” seem daunting at best.

Even at extremes of complexity, data warehousing solutions are needed, and that complexity is best addressed by taking a first step: divide and conquer. Choose the area of most pain; choose the area that offers a quick win or low-hanging fruit; just choose something and start.

A data lake can be a useful component, but in and of itself a data lake is not a replacement for a data warehouse. When successfully implemented, a data lake may reduce the number of elements that need to be polished and refined for the data warehouse. Again, each enterprise is unique. How many components are necessary for an enterprise’s data ecosystem? How much cleansing is necessary? Does such cleanup require consideration of a CRM solution, or an MDM solution? How are reference tables managed and maintained? Should the cloud be considered? Does open source fit into the organization’s approaches? These are the start of a long list of questions to be considered before settling on a design and strategy for an organization’s business intelligence solution.

Once understood, the data needs to be put to use and made to work for the goals of the organization. Putting data to use means being able to leverage that data to make appropriate and meaningful business decisions. Key metrics should be identified, defined, and refined, then appropriately framed across dashboards to keep executives informed on how the organization is performing. These dashboards and metrics can lead the way in providing an understanding of what is happening within the organization, and may even provide insight into why it is happening. All these elements are enabled and supported by a data warehouse collecting, integrating, and standardizing the key elements from various operational sources. Without useful data, decisions are better called guesses.