Strategies and Tools to Effectively Leverage Real-Time Data and Analytics

The emphasis on real-time data and analytics has grown increasingly relevant to a multitude of industries. The ideal is relevant data combined with simple, fast access and consumption. However, legacy systems, data silos, and data latency pose significant challenges to organizations looking to operate closer to real time.

Speakers Anais Dotis-Georgiou, developer advocate at InfluxData; Fred Patton, developer evangelist at Swim; Chad Meley, CMO of Kinetica; and Atanas Popov, general manager for Yellowfin, gathered for a DBTA webinar to address the various ways enterprises can overcome real-time roadblocks through some of the latest technologies and strategies.

Dotis-Georgiou launched the conversation by discussing time series databases and data analytics, defining time series data in terms of both time and source: it is any data with a timestamp, collected from a source such as sensors, networks, infrastructure, or applications. Typically, time series data is leveraged for forecasting or anomaly detection.
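
As a rough illustration of that definition, the Python sketch below models a time series point as a value with a timestamp and a source, then flags anomalies with a simple z-score rule. The `Point` type, the `find_anomalies` helper, the threshold, and the sample readings are all hypothetical; none of them comes from the webinar or from InfluxDB.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, pstdev


@dataclass(frozen=True)
class Point:
    """One time series observation: when it was taken, where from, and the value."""
    timestamp: datetime
    source: str
    value: float


def find_anomalies(points, threshold=2.0):
    """Return points whose value lies more than `threshold` population
    standard deviations away from the mean of all values (assumed rule)."""
    values = [p.value for p in points]
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [p for p in points if abs(p.value - mu) / sigma > threshold]


# Hypothetical readings from one sensor, one per minute.
start = datetime(2022, 1, 1, 0, 0, 0)
readings = [20.1, 20.3, 19.9, 20.0, 20.2, 35.0, 20.1, 19.8]
points = [
    Point(start + timedelta(minutes=i), "sensor-1", v)
    for i, v in enumerate(readings)
]
anomalies = find_anomalies(points)
```

A fixed z-score threshold is only one of many possible detection rules; it is used here because it is short enough to read at a glance.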

Transitioning the conversation to a deep dive into InfluxDB’s time series capabilities, Dotis-Georgiou explained, “with InfluxDB, you can query your data but also analyze your data, create dashboards, and visualize your data so you can gain insights, as well as create tasks that automatically transform or prepare your data.”

InfluxDB positions itself as a time series data lake, able to ingest volumes of data that other platforms cannot handle. Once the data has been ingested, Dotis-Georgiou recommended using a client library to move subsets of it to a data warehouse for deep analytics and processing, which allows data scientists to employ tools and languages familiar to them while also giving them access to other tools. Further, InfluxDB can be applied to MLOps monitoring, where it is capable of monitoring data arrival and model deployment.

Flux, the functional language for querying, analyzing, and acting on data, was created by InfluxData and provides a wide range of functions. The language provides:

  • Transformation functions for statistical time series analysis
  • Transformation functions for dynamic statistical and fundamental time series analysis
  • Technical momentum indicators for financial analysis
  • Writing and consolidation of data from the edge to cloud
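
To give a feel for the kind of calculation behind the statistical and momentum bullets above, here is a plain-Python sketch of a simple moving average and a rate-of-change momentum value. It is not Flux and does not claim to reproduce the behavior of any Flux function; the function names, window sizes, and sample prices are illustrative assumptions.

```python
def moving_average(values, n):
    """Simple moving average: the mean of each sliding window of n values."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [
        sum(values[i - n + 1:i + 1]) / n
        for i in range(n - 1, len(values))
    ]


def rate_of_change(values, n):
    """Momentum as the percentage change versus the value n steps earlier."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [
        (values[i] - values[i - n]) / values[i - n] * 100
        for i in range(n, len(values))
    ]


# Hypothetical closing prices.
prices = [10.0, 11.0, 12.0, 11.0, 13.0]
sma = moving_average(prices, 3)   # one value per full 3-point window
roc = rate_of_change(prices, 2)   # one value per point with a 2-step lookback
```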

Patton highlighted the need for actionable business context in streaming applications. Patton pointed to a gap in data technology between real-time data and the analytics ecosystem, one that perpetuates the challenges of becoming real-time.

“We are in the age where streaming data is hugely important,” said Patton, as streaming data allows us to know what’s happening now, as opposed to a database that can only tell us what happened. However, extracting the most value from streaming data is a critical challenge; Patton explained that this is an application problem, noting that the data must be applied to the business in order to say what it means. While analytics model the data and provide relevant insights, how can we use real-time data in business context without leaving real-time?

Oftentimes, we look toward building event-driven applications in response. That only transforms the problem: we have events, but they are concise summaries that lack context. Because the context is streamed separately, the massive streaming joins needed to bring the two together are slow and computationally wasteful.

Patton presented Swim’s streaming app solution architecture, which provides end-to-end streaming with repeatable use case templates that maintain real-time state replication on a per-entity basis, all in-memory and all in-stream. The architecture comes with real-time UIs, streaming APIs, stateful microservices imagined as web agents, continuous data orchestration, and enhanced data ingestion, with the aim of eradicating pipelines and databases that hold stale data.
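
The contrast between joining separately streamed context and keeping state on a per-entity basis can be sketched in a few lines of Python. This is an illustration of the general idea only, not Swim's API; `EntityAgent`, `AgentRegistry`, the field names, and the sample events are all hypothetical.

```python
class EntityAgent:
    """Keeps the latest context and state for a single entity in memory."""

    def __init__(self, entity_id):
        self.entity_id = entity_id
        self.context = {}       # slowly changing business context
        self.last_value = None  # most recent event value
        self.event_count = 0

    def update_context(self, **fields):
        """Merge new context fields into what the agent already holds."""
        self.context.update(fields)

    def on_event(self, value):
        """Apply an event and return it enriched with the held context,
        so no join against a separate context stream is needed."""
        self.last_value = value
        self.event_count += 1
        return {"entity_id": self.entity_id, "value": value, **self.context}


class AgentRegistry:
    """Creates one agent per entity on first use and returns it afterwards."""

    def __init__(self):
        self._agents = {}

    def agent(self, entity_id):
        if entity_id not in self._agents:
            self._agents[entity_id] = EntityAgent(entity_id)
        return self._agents[entity_id]


registry = AgentRegistry()
registry.agent("truck-7").update_context(route="north", driver="A. Example")
enriched = registry.agent("truck-7").on_event(62.5)
registry.agent("truck-9").on_event(48.0)  # a different entity, separate state
```

Because each event is handled by the agent that already holds its entity's context, enrichment becomes a dictionary merge rather than a join across streams.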

Meley provided observations of the current data management climate, noting the extreme data volumes and velocity, readily perishable value, and the presence of time series and geo-coded data. This points to the need for real-time analytics able to examine spatial relationships over time, at high performance and, ideally, low cost. Most remedies have proven ill-equipped to meet all of these needs at once, Meley explained.

“Databases offer a compelling solution for real-time processing, but if you look at conventional databases, there is a potential to reduce latency,” said Meley. “Modern, real-time databases are compressing that latency.”

With Kinetica, a real-time data analytics database, clients can keep data fresh by reducing both data latency and query latency. Data latency is mitigated with Kinetica’s native streaming connections, distributed headless ingest, and lockless architecture. To address query latency, Kinetica employs a vectorized query engine while minimizing the use of indexes and summaries.

Concluding the roundtable, Popov offered the Yellowfin perspective on real-time data and analytics, emphasizing the changing demand for BI and analytics. Previously, data analytics worked within a BI tool that was disjointed from decision workflows, highly technical, and required manual analysis.

The new, augmented business user works within business applications that are embedded into application workflows, supply analytics at the time of decision-making, lower complexity through contextual analytics, and automate insight discovery.

Contextual analytics, as Popov defined it, means delivering insight when required and guiding users to assist decision-making in real time, at the point of consumption. Unlike other strategies, contextual analytics maintains the ability to handle massive volumes of real-time data and operationalize at a reasonable cost.

“We feel that new applications should have the decision-enabling analytics within the context of the application,” said Popov.

Popov concluded by pointing to critical elements that can make or break an enterprise’s analytics capabilities. The time it takes to deploy the tool, its scalability and reliability, its ability to meet your enterprise’s security requirements, and its cost are all significant in moving toward real-time analytics. As Popov explained, “when you’re choosing a BI and analytics tool to enable your real-time data applications, these are a lot of straightforward, but frequently overlooked, factors to take into consideration.”

For detailed use cases and elaboration on these real-time analytics tools and strategies, you can view an archived version of the webinar here.