Big data analytics, touted as today’s path to competitive advantage, can only be effective if it presents a clear and consistent picture of what’s happening in and around the enterprise. There’s the rub: Big data is inherently messy; it is an agglomeration of unstructured files and structured records, often scattered across domains, with a high likelihood of unclear lineage and ownership.
For many organizations, then, the challenge is to figure out ways to bring varied sources of data into a common context, and to check and verify that the data provides a consistent picture of what decision makers need to know, in real or near-real time. It means identifying the nuggets of information that are worth extracting from the avalanche of data now rushing through organizations, and deciding what can be ignored. As Tony Fisher, vice president of data collaboration and integration at Progress Software, puts it, the issues with big data evoke the ramblings of the king at the trial in Lewis Carroll’s Alice’s Adventures in Wonderland: “This is very important… Unimportant, of course, I meant—important, unimportant, unimportant, important.”
Even when relevant data is identified, the trustworthiness of the information plucked from such huge volumes is often suspect. “As data continues to grow exponentially, it has become increasingly difficult for business leaders to ensure that their source of information is trustworthy,” says Nancy Kopp-Hensley, director of strategy, database & systems for IBM. “Manual methods of discovering, integrating, governing and correcting information are no longer possible in today’s era of big data.” The key, she says, is to build a data integration capability right from the start that helps the business get at those key pieces of data that are needed. “If data integration isn’t implemented to support applications and data warehouses from the very beginning, a business can find itself very far behind in making big data actionable.”
The Catch-22 of Big Data Analytics
There is actually a “Catch-22” situation within enterprises when it comes to big data analytics, adds Rob Fox, vice president of application development at Liaison Technologies. “The paradox is that IT needs requirements from the business to build a system to analyze data,” he says. “But business cannot provide these requirements until they understand the data, which they may not be able to do until IT builds something for the business to use.”
Thus, IT and data managers are left in a quandary, since it takes time and resources to roll out big data analytics solutions—even with the least expensive open source solutions. “The challenge of identifying, collecting, retaining, and providing access to all relevant data for the business at an acceptable cost, and within the management and maintenance capabilities of the IT organization, is huge,” Fox points out. “Big data opens a whole other can of worms that is difficult to rationalize, including the ‘How do I get started?’ question.”
Without Solid Data Practices, Big Data Will Be Frustrating
Big data technology can be cheap, even free, Progress’ Fisher notes. “But, if you do not provide the proper budget, personnel, training, and business insight, then big data is just another technology.” What is required is an understanding of the business issues being addressed, and the outline of a precise role for big data in moving the business forward. “If you haven’t had solid data practices with ‘small data,’ you will find big data frustrating and, potentially, a waste of time and money. Remember, big data technology does not cleanse your data, it is not an island, and it needs to integrate with the rest of your infrastructure.”