Data Integration in the Modern Enterprise

Sep 4, 2018

By Joe McKendrick

Everyone wants to be part of a data-driven enterprise, and for good reason. Data analytics, when applied in a meaningful way, provides an enormous competitive advantage. There’s a catch, though, that frequently gets overlooked amid the glowing analyst projections and keynote speeches about a limitless future in which systems and machines do all the heavy lifting and thinking for businesses. Data (the right kind, in the right sequence, in the right context) doesn’t just magically drop out of the cloud. It needs to be discovered, identified, transformed, and brought together for analysis, management, and eventual storage.

Not surprisingly, then, data integration is becoming more of a hot-button issue for enterprises moving into the digital realm. All of the new initiatives now being developed, including artificial intelligence (AI), machine learning, the Internet of Things (IoT), real-time responsiveness, and digital transformation, depend on data and, most importantly, on the ability to trust that data. While data integration as a concept has been around for decades, as part of continuous efforts to bring data together from silos of incompatible systems, it has often been restrained by additional layers of processes and systems, such as extract-transform-load (ETL) functions or the requirement for separate management within data warehouses or data marts.

However, with the high volume of data now flowing into enterprises from environments such as IoT, previous mechanisms to manage data integration have become untenable.

These new developments make it critical to take in many data feeds, pull out the data points that are of material importance, and act on them in real time. There is now a variety of tools, platforms, and frameworks available to help enterprises better manage their data. A recent survey of more than 200 data professionals, conducted by Unisphere Research, a division of Information Today, Inc., and sponsored by Oracle, uncovered a growing embrace of machine learning, data lakes, and Spark.

The following are ways to ensure that digital and AI-centric organizations get the right data at the right time for the right purpose:

Dive into data lakes

Data analytics requirements are changing rapidly. As end users keep innovating and coming up with new approaches for looking at data, they are going to need access to the raw data collected by their organizations. Data lakes fill this role, providing a measure of future-proofing as new approaches emerge. The use of data lakes, which provide a place to store diverse datasets without having to build a model first, continues to rise as data managers seek to develop ways to rapidly capture and store data from a multitude of sources in various formats. Overall, 38% of organizations in the Unisphere-Oracle survey are employing data lakes as part of their data architecture, up from 20% in the 2016 survey. Another 15% are currently considering adoption.
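To make "store diverse datasets without having to build a model first" concrete, here is a minimal Python sketch of the schema-on-read idea often associated with data lakes. It is an illustration only, not a description of any product covered in the survey: the in-memory `lake` list, the `ingest` and `read_records` helpers, and the sample sensor and order data are all hypothetical.

```python
import csv
import io
import json

# Hypothetical in-memory "lake": raw payloads are kept exactly as they
# arrived, tagged only with their source and format. No model is built first.
lake = []

def ingest(source, fmt, payload):
    """Store the raw payload untouched; no schema is applied on write."""
    lake.append({"source": source, "format": fmt, "payload": payload})

def parse_json_lines(text):
    """Interpret a payload as one JSON object per line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

def parse_csv(text):
    """Interpret a payload as CSV with a header row (values stay strings)."""
    return list(csv.DictReader(io.StringIO(text)))

def read_records(parsers):
    """Apply a parser only at read time, yielding (source, record) pairs."""
    for entry in lake:
        for record in parsers[entry["format"]](entry["payload"]):
            yield entry["source"], record

# Two made-up feeds in different formats land in the same store.
ingest("sensors", "jsonl",
       '{"device": "t-1", "temp_c": 21.5}\n{"device": "t-2", "temp_c": 19.0}')
ingest("orders", "csv", "order_id,amount\n1001,25.00\n1002,40.50")

records = list(read_records({"jsonl": parse_json_lines, "csv": parse_csv}))
```

Because structure is imposed only in `read_records`, a new way of looking at the same raw data would mean adding a parser rather than reloading the data, which is the flexibility the paragraph above describes.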

Move to self-service-style data integration

It’s virtually impossible for IT or data managers to know what angles end users will take with their data. Pre-formatted or pre-built reports, even spreadsheets, won’t cut it anymore. End users need one-click data integration so they can ensure the viability of any data sources they specify.

Implement Data as a Service and Database as a Service

The cloud has become a powerful data integration tool. With data as a service, data from multiple sources across the enterprise is abstracted into a uniform service layer that is universally accessible to users and applications. Database as a service means entire databases are run in the cloud, freeing data managers from the headaches associated with database management.
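The "uniform service layer" idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a real data-as-a-service API: the `Customer` type, the `CRM_ROWS` and `BILLING_ROWS` sample backends, and the `CustomerService` class are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional

@dataclass(frozen=True)
class Customer:
    """The single shape that consumers of the service layer see."""
    customer_id: str
    name: str

# Two hypothetical backends with incompatible record layouts.
CRM_ROWS = [{"CustID": "c-1", "FullName": "Ada Lopez"}]
BILLING_ROWS = [("c-2", "Sam", "Okoro")]

def from_crm() -> Iterable[Customer]:
    """Adapter: map CRM dictionaries onto the uniform Customer shape."""
    for row in CRM_ROWS:
        yield Customer(row["CustID"], row["FullName"])

def from_billing() -> Iterable[Customer]:
    """Adapter: map billing tuples onto the uniform Customer shape."""
    for cust_id, first, last in BILLING_ROWS:
        yield Customer(cust_id, f"{first} {last}")

class CustomerService:
    """Uniform access point; callers never touch the backends directly."""

    def __init__(self, adapters: Dict[str, Callable[[], Iterable[Customer]]]):
        self._adapters = adapters

    def all_customers(self) -> List[Customer]:
        return [c for adapter in self._adapters.values() for c in adapter()]

    def find(self, customer_id: str) -> Optional[Customer]:
        for customer in self.all_customers():
            if customer.customer_id == customer_id:
                return customer
        return None

service = CustomerService({"crm": from_crm, "billing": from_billing})
customers = service.all_customers()
```

In a real deployment the adapters would call remote systems and the service would sit behind a network API; the point here is only that users and applications depend on one abstraction rather than on each source.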

Bring content management and relational data management into greater alignment

Much of the data that will drive decision making in the months and years to come will be unstructured data, such as video clips or documents. Typically, these have been maintained separately within content management systems. The focus needs to shift to enterprise information management systems and strategies that support and enable processing of all data and content types.

Think hybrid

It’s likely that no single environment is suitable for all data integration: cloud may be applicable for some situations, while on-premises resources may be a better solution for others. With hybrid cloud and integration, current on-premises data and applications can work with cloud solutions. This can even serve as a legacy migration strategy, gradually and incrementally moving assets from on-premises environments to cloud-based services.

Expand your universe

IoT adds numerous new data sources, essentially from any type of system or device within any organization or home across the globe. The implications of IoT are profound and just starting to be felt, involving not only the re-orienting and expansion of databases but also of networks, supply chains, and physical facilities. As such, the data is incredibly varied, employed for different purposes, and created at varying times. This demands advanced integration tools and platforms that can integrate such data with existing backend systems.