Moving Beyond Relational With Data Integration

Latency is common in traditional BI systems, where the data warehouse contains data that is not up-to-date with the production data. For historical querying, this is satisfactory, but more and more organizations are adopting real-time analytics to speed up their analysis and decision-making processes. Outdated data is simply not sufficient for making up-to-the-minute decisions.

Impact of Analytics on Data Integration

Continuous data integration has become a requirement for modern organizations. ETL is becoming passé as organizations adopt real-time analytics. Because ETL requires a lot of up-front work to transform and move data, it is not entirely compatible with real-time analytics. The time lag and data latency associated with ETL are rapidly becoming unacceptable. Of course, transformation may still be required to convert coded data into usable values, and many uses remain for moving large amounts of data, so ETL is far from dead.

Nevertheless, real-time analytics demands different processes from the traditional, batch-oriented ETL of yore. Instead, REST APIs and services coupled with business rules that can be integrated into applications are becoming more prevalent.

For enterprise data replication (EDR), things are not quite as bad. EDR using change data capture (CDC) technology is more event-driven, but latency can still be several minutes or more for completely up-to-date data. So, for true real-time analytics, even CDC can result in data that is too outdated to be useful.
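To make the event-driven nature of CDC concrete, here is a minimal sketch in Python of applying a stream of captured changes to a replica. The `ChangeEvent` type, the `apply_changes` function, and the in-memory `change_log` are hypothetical names invented for illustration; a real CDC product reads changes from the database log and has its own event format. The replica is only as current as the last batch of events applied, which is where the latency described above comes from.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ChangeEvent:
    """One captured change (hypothetical format, for illustration only)."""
    op: str                               # "insert", "update", or "delete"
    key: int                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new row image; None for deletes


def apply_changes(replica: Dict[int, Dict[str, Any]],
                  events: List[ChangeEvent]) -> None:
    """Apply captured changes to the replica in the order they occurred."""
    for ev in events:
        if ev.op in ("insert", "update"):
            replica[ev.key] = dict(ev.row or {})
        elif ev.op == "delete":
            replica.pop(ev.key, None)
        else:
            raise ValueError(f"unknown operation: {ev.op}")


# A stand-in for changes captured from a production database.
change_log = [
    ChangeEvent("insert", 1, {"name": "Ada", "status": "A"}),
    ChangeEvent("insert", 2, {"name": "Bo", "status": "A"}),
    ChangeEvent("update", 1, {"name": "Ada", "status": "I"}),
    ChangeEvent("delete", 2),
]

replica: Dict[int, Dict[str, Any]] = {}
apply_changes(replica, change_log)
print(replica)
```

Only the changed rows move, rather than a full batch extract, but any delay between a change being captured and being applied shows up as stale data in the replica.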

Users are looking for modern data integration platforms that can support big data at various velocities, ranging from batch to streaming data. Integrating data across such a wide range of velocities can be challenging to implement properly. It is desirable for tools to be able to trigger integration activities using event-based processing rather than by time of day or on a preset schedule.
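The difference between schedule-driven and event-driven integration can be sketched with a small publish/subscribe dispatcher in Python. Everything here is an assumption made for illustration: the `EventBus` class, the `order_created` event name, and the `load_order` handler are hypothetical and do not represent any particular product's API. The point is only that the integration step runs when the event arrives, not when a clock says so.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[Dict], None]


class EventBus:
    """Minimal in-process dispatcher (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: Dict) -> int:
        """Run every handler registered for the event; return how many ran."""
        handlers = self._handlers.get(event_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


# Stand-in for an analytics target such as a warehouse table.
warehouse: List[Dict] = []


def load_order(payload: Dict) -> None:
    """Transform and load a single order as soon as it is created."""
    warehouse.append({
        "order_id": payload["order_id"],
        "total": payload["qty"] * payload["unit_price"],
    })


bus = EventBus()
bus.subscribe("order_created", load_order)

# The integration activity fires on the event, not on a nightly schedule.
bus.publish("order_created", {"order_id": 17, "qty": 3, "unit_price": 2.5})
print(warehouse)
```

A batch job would instead collect a day's worth of orders and load them together, so the same handler logic could serve both velocities if the tool can invoke it either way.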

Of course, traditional data integration still has a place in the data infrastructure for populating data warehouses, which remain a part of many organizations' business intelligence applications. But the decades-old technology in traditional ETL, EDR, and EAI tools comes up short for organizations on the cutting edge with real-time analytics and a wide variety of data sources.

Embracing Technological Change

Other technological trends are changing the way applications are built and accessed, and therefore affect the requirements of data integration. At the top of this list of changes is the widespread use of mobile computing devices. Mobile devices (smartphones and tablets) have overtaken traditional computing devices such as desktop and laptop computers in terms of how we interact with service providers and retailers. And these mobile users expect accurate, up-to-the-minute data wherever and whenever they access it. This further heightens the need for real-time data integration with little to no latency.

Cloud computing, another significant trend, is the practice of using a network of remote servers hosted on the internet to store, manage, and process data, rather than a local host. The cloud enables more types and sizes of organizations than ever before to deploy and make use of computing resources without having to own those resources. And that means that more robust and efficient data integration to and from the cloud is a requirement for many organizations.

More service providers are offering data integration in the cloud, often referred to as integration as a service. Such capabilities can make it easier to adopt data integration because the heavy lifting of setting up and managing data integration tasks is offloaded to the service provider. Integration as a service can be particularly attractive to organizations that rely heavily on the cloud; the data resides in the cloud, and the data integration gets performed in the cloud. This type of co-location is sometimes referred to as data gravity, the idea that data should be managed where it is created or resides.

Tools and services are expected to be elastically scalable in the modern IT infrastructure. Elasticity is the degree to which a system can adapt to workload changes by provisioning and deprovisioning resources in an on-demand manner, such that at each point in time the available resources match the current demand as closely as possible. Scalability refers to the capability of a system to handle a growing amount of work, or its potential to perform more total work in the same elapsed time when processing power is expanded to accommodate growth. Modern data integration tools must be able to manage big data at multiple velocities and in various formats without downtime and without consuming all of a system’s resources.
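The definition of elasticity above, matching available resources to current demand, can be illustrated with a short Python calculation. This is a sketch under stated assumptions rather than how any specific platform scales: the `target_workers` function, its parameters, and the idea of sizing by pending record count are all hypothetical choices made for the example.

```python
import math


def target_workers(pending_records: int,
                   records_per_worker: int,
                   min_workers: int = 1,
                   max_workers: int = 10) -> int:
    """Return how many workers to run so capacity tracks current demand.

    Hypothetical sizing rule: each worker handles `records_per_worker`
    records per interval, and the result is clamped to the allowed range.
    """
    if records_per_worker <= 0:
        raise ValueError("records_per_worker must be positive")
    needed = math.ceil(pending_records / records_per_worker)
    return max(min_workers, min(max_workers, needed))


# Demand rises and falls; the provisioned worker count follows it.
for backlog in (0, 2500, 50000, 800):
    print(backlog, target_workers(backlog, records_per_worker=1000))
```

Calling the function repeatedly as the backlog changes provisions workers when demand grows and deprovisions them when it shrinks, while the upper bound keeps integration work from consuming all of a system's resources.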

What’s Ahead

The bottom line is that modern data integration offerings require different capabilities and technology to support the business and IT needs of today. Data integration tools that still work the same way as when they were first introduced in the 1980s are no longer capable of supporting the rich IT tapestry of 2018 and beyond. Be sure that your data integration technology meets your modern data requirements.

Subscribe to Big Data Quarterly E-Edition