Data Integration in the Era of Big Data: How Businesses Are Leveraging Value From Diverse Data Sources

Big data analytics, touted as today’s path to competitive advantage, can only be effective if it presents a clear and consistent picture of what’s happening in and around the enterprise. There’s the rub: Big data is inherently messy; it is an agglomeration of unstructured files and structured records, often scattered across domains, with a high likelihood of unclear lineage and ownership.

For many organizations, then, the challenge is to figure out ways to bring varied sources of data into a common context, checked and verified so that it provides a consistent picture of what decision makers need to know, in real or near-real time. It means identifying the nuggets of information that are worth extracting from the avalanche of data now rushing through organizations, and deciding what can be ignored. As Tony Fisher, vice president of data collaboration and integration at Progress Software, puts it, the issues with big data evoke the ramblings of the king at the trial in Lewis Carroll’s Alice’s Adventures in Wonderland: “This is very important… Unimportant, of course, I meant—important, unimportant, unimportant, important.”

Even when relevant data is identified, the trustworthiness of the information plucked from such huge volumes is often suspect. “As data continues to grow exponentially, it has become increasingly difficult for business leaders to ensure that their source of information is trustworthy,” says Nancy Kopp-Hensley, director of strategy, database & systems for IBM. “Manual methods of discovering, integrating, governing and correcting information are no longer possible in today’s era of big data.” The key, she says, is to build a data integration capability right from the start that helps the business get at the key pieces of data it needs. “If data integration isn’t implemented to support applications and data warehouses from the very beginning, a business can find itself very far behind in making big data actionable.”

The Catch-22 of Big Data Analytics

There is actually a “Catch-22” situation within enterprises when it comes to big data analytics, adds Rob Fox, vice president of application development at Liaison Technologies. “The paradox is that IT needs requirements from the business to build a system to analyze data,” he says. “But business cannot provide these requirements until they understand the data, which they may not be able to do until IT builds something for the business to use.”

Thus, IT and data managers are left in a quandary, since it takes time and resources to roll out big data analytics solutions—even with the least expensive open source solutions. “The challenge of identifying, collecting, retaining, and providing access to all relevant data for the business at an acceptable cost, and within the management and maintenance capabilities of the IT organization, is huge,” Fox points out. “Big data opens a whole other can of worms that is difficult to rationalize, including the ‘How do I get started?’ question.”

Without Solid Data Practices, Big Data Will Be Frustrating

Big data technology can be cheap, even free, Progress’ Fisher notes. “But, if you do not provide the proper budget, personnel, training, and business insight, then big data is just another technology.” What is required is an understanding of the business issues being addressed, and the outline of a precise role for big data in moving the business forward. “If you haven’t had solid data practices with ‘small data,’ you will find big data frustrating and, potentially, a waste of time and money. Remember, big data technology does not cleanse your data, it is not an island, and it needs to integrate with the rest of your infrastructure.”

