Data Integration in the Era of Big Data: How Businesses Are Leveraging Value From Diverse Data Sources

<< back Page 3 of 4 next >>

Olsson goes so far as to predict that more traditional relational databases will come roaring back as the preferred solution for all levels of big data challenges. “Every company will always need a way to store and access structured data, and relational databases are really the only way to do it,” he says.

Plus, relational databases and data warehouses coming on the market do address big data analytics with advanced tools, and are being configured to more easily integrate with the newer big data-centric platforms and frameworks on the market, such as Hadoop. These new databases and data warehouses accommodate both structured and unstructured data within their file systems.

“Organizations of all sizes need data integration capabilities that fully support their business requirements, including ETL, data federation, replication, synchronization, changed data capture, data quality and governance, master data management, natural language processing, business-to-business data exchange, and more,” says Kopp-Hensley. “Many types of data architecture are in use today, but an ideal strategy for addressing all these business needs is to use a combination of methodologies.”

It Is Time to Accept Database Diversity

Olsson agrees, noting that it’s time “to stop fighting database diversity, and to accept the fact that you have to live with several database platforms. Good execution can be a great cost saver, so focus instead on being good at supporting each of these platforms.”

There are initiatives to bring newer platforms, such as Hadoop, more in line with traditional platforms such as data warehouses. “Most organizations have concluded that Hadoop plays a key role in data warehouse infrastructure,” Haddad points out. “Now, people who are managing this hybrid infrastructure are struggling with questions such as, ‘what type of data should I use, and when should I use Hadoop versus the data warehouse?’” Haddad advises data managers to consult the numerous reference architectures that now exist to provide guidance on where data should be deployed and managed.

In many instances, the attraction of new-generation data platforms will be economic. “Enterprises are not in a hurry to discard their traditional relational and SQL platforms which have worked well for them for certain specialized applications, such as payroll or HR,” says Jim Vogt, president and CEO of Zettaset. “However, there is clearly a shift toward adoption of Hadoop as a more cost-effective database option for certain types of data, especially the unstructured data that is being gathered from sensors or log files. So for some time, expect new platforms to coexist with legacy platforms.”

New Approaches to Enterprise Decision Making

Still, the confluence of new database types and new frameworks designed for big data processing and analysis has elevated the process of managing business intelligence and analytics to new levels. Along with the variety, volume, and velocity of data these new environments are handling, there is an even more profound transformation afoot. New data environments are also reinventing the way enterprises approach decision making, and are even creating new job roles for people working with information.

“In the old days of an RDBMS-centric world, we were forced to think about the questions we wanted to ask of our data before we even put them into relational tables,” says Will Hayes, chief product officer for LucidWorks. In addition, organizations required a great deal of staff expertise in building databases or data warehouses, and constructing the necessary schemas, tables and SQL queries. “Developers were left with little ability to construct the necessary schemas, tables and SQL queries required, especially as these applications went to production,” he says. “Effectively scaling RDBMSs required expensive technology, as well as a Ph.D. in database replication.”

Today’s new generation of solutions takes away much of the pain, and the specialized skills requirements, associated with database management, Hayes says. “NoSQL data stores and index technologies mean more flexibility in the questions you could ask with simple loading of semi-structured data,” he explains. “Developers find that tools like MongoDB and Cassandra offer easy-to-understand transport mechanisms, such as REST APIs and JSON for loading and retrieving data. These technologies, most of which are open source, scale horizontally, requiring little knowledge of complex replication schemes.”

Linking Structured and Unstructured Worlds

It pays to link the structured and unstructured worlds within a comprehensive data integration framework, says Sid Probstein, chief technology officer for Attivio. “You can gain huge insights from analyzing three types of information at the same time: transactional, or what happened; CRM or sensor systems; and human-generated content from things like open-ended survey questions, social media, or email,” he points out.
