Enabling the Real Time Enterprise with Data Lakes, Streaming Data, and the Cloud

Today, it is not enough to understand what happened; organizations want a view of what is happening now so they can shape what happens next.

At Data Summit, Dan Potter, vice president of product management and marketing, Attunity, looked at what it takes to become a real-time enterprise and the role of change data capture in enabling the transformation.

As organizations embrace AI and machine learning rather than relying solely on historical views of the past, they are also moving from batch processing to real-time computing, said Potter.

Traditional approaches included business reporting, batch analysis of data at rest, and use of transactional sources; modern approaches also incorporate data science and advanced analytics, real-time processing of data in flight, and new data sources alongside transactional ones.

Attunity offers a platform for delivering data efficiently and in real time to data lakes, streaming, and cloud architectures.

According to Potter, modern analytics, such as AI, ML, and IoT analytics, requires scale to use data from thousands of sources with minimal development resources and impact; stream analytics requires real-time data to create real-time streams from database transactions; cloud analytics requires efficiency to transfer large data volumes from multiple data centers over limited network bandwidth; and agile deployment requires self-service access so that non-developers can rapidly deploy solutions.

Cloud is a key piece of the new real-time infrastructure, and change data capture (CDC), which replicates only incremental changes rather than re-copying full datasets, is a critical piece of enabling rapid data movement.
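The idea behind change data capture can be sketched in a few lines: rather than reloading a whole source table, a replica stays in sync by replaying only the change events recorded in a transaction log. The event shape and field names below are illustrative, not any specific product's API.

```python
def apply_change(replica, event):
    """Apply one CDC event ({'op', 'key', 'row'}) to a replica dict."""
    op = event["op"]
    if op in ("insert", "update"):
        replica[event["key"]] = event["row"]
    elif op == "delete":
        replica.pop(event["key"], None)
    return replica

# The replica starts from an initial full load...
replica = {1: {"name": "alice", "balance": 100}}

# ...then stays current by consuming only the incremental changes.
log = [
    {"op": "update", "key": 1, "row": {"name": "alice", "balance": 75}},
    {"op": "insert", "key": 2, "row": {"name": "bob", "balance": 50}},
    {"op": "delete", "key": 1, "row": None},
]
for event in log:
    apply_change(replica, event)

print(replica)  # {2: {'name': 'bob', 'balance': 50}}
```

Because only the deltas cross the network, this pattern is what makes continuous, low-impact movement of data into cloud targets practical.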

But as organizations move to the cloud and to data lakes, there are many challenges in getting the data they need, said Potter, who cited a Gartner estimate that 9 out of 10 data lake strategies have failed. It is easy to get data in, but hard to get meaningful insights out, he observed. Automating the end-to-end data lake pipeline enables continuous updating and merging of data.

Potter highlighted a financial services use case where there was a need to efficiently roll out a cloud-based microservices platform with minimal latency and security risk while synchronizing massive transaction volumes globally. The organization was able to copy live transactions without touching production data and securely transfer them for client usage on a global AWS microservices platform.

In another case, a healthcare organization needed efficient, scalable delivery of clinical data but lacked tools for low-impact data capture. The solution uses CDC to feed Kafka into a Lambda architecture, enabling multi-pronged analysis of clinical data at scale with minimal administrative burden and no production impact.
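The CDC-to-Kafka-to-Lambda flow described above can be sketched as follows, with an in-memory queue standing in for a Kafka topic so the example is self-contained; the topic name, event fields, and view logic are invented for illustration.

```python
from collections import Counter, deque

# Stand-in for a Kafka topic (e.g. a hypothetical "clinical-events" topic)
# fed by CDC from the clinical source systems.
topic = deque()

def produce(event):
    topic.append(event)

# Lambda architecture, two paths over the same event stream:
# - speed layer: incremental, low-latency view (running event counts)
# - batch layer: immutable master dataset that batch views recompute from
speed_view = Counter()
master_dataset = []

def consume_all():
    while topic:
        event = topic.popleft()
        master_dataset.append(event)          # batch layer: append-only
        speed_view[event["event_type"]] += 1  # speed layer: update in place

for e in [
    {"patient_id": 1, "event_type": "admission"},
    {"patient_id": 2, "event_type": "lab_result"},
    {"patient_id": 1, "event_type": "lab_result"},
]:
    produce(e)

consume_all()
print(speed_view["lab_result"])  # 2
print(len(master_dataset))       # 3
```

The point of the split is that the speed layer answers "what is happening now" immediately, while the batch layer preserves every event for the deeper, multi-pronged analysis the use case calls for.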

Many Data Summit 2018 presentations, including Potter's, have been made available for review at