The Machinations of Data Observability at Data Summit 2025


Data is the cornerstone of becoming insights-driven, and data observability holds the key to business success amid increasingly complex, distributed data ecosystems.

Nilanjan Chatterjee, senior staff data architect, AMD, led the Data Summit session, "Becoming an Insights-Driven Enterprise," guiding attendees through the ways that data observability can be effectively harnessed to power the highest standards of business data, for AI use cases and beyond.

The annual Data Summit conference returned to Boston, May 14-15, 2025, with pre-conference workshops on May 13.

Chatterjee, centering his session on data observability and reliability engineering, explained that these areas of expertise are rapidly becoming foundational to modern data engineering and MLOps. Transcending traditional monitoring, data observability should deliver a holistic, proactive approach that spots and resolves issues before they impact downstream systems, according to Chatterjee.

Chatterjee began by examining different architecture models, beginning with Lambda. As a dual-processing approach that combines batch and stream processing, Lambda is best for complex analytics requiring historical context. The batch layer handles historical data, the speed layer handles real-time processing, and the serving layer merges both views.
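The three Lambda layers can be illustrated with a minimal sketch. This is not code from the session; the event records and the aggregation (per-user totals) are hypothetical, chosen only to show how a batch view and a speed view are merged at the serving layer.

```python
from datetime import datetime

# Hypothetical events: (timestamp, user, amount) -- illustrative only.
historical = [(datetime(2025, 5, 1), "alice", 10.0),
              (datetime(2025, 5, 2), "bob", 5.0)]
recent = [(datetime(2025, 5, 14), "alice", 7.5)]

def batch_view(events):
    """Batch layer: full recompute over all historical events."""
    totals = {}
    for ts, user, amount in events:
        totals[user] = totals.get(user, 0.0) + amount
    return totals

def speed_view(events):
    """Speed layer: same aggregation, applied only to events
    that arrived after the last batch run."""
    return batch_view(events)

def serving_layer(batch, speed):
    """Serving layer: merge the batch and real-time views into one answer."""
    merged = dict(batch)
    for user, amount in speed.items():
        merged[user] = merged.get(user, 0.0) + amount
    return merged

totals = serving_layer(batch_view(historical), speed_view(recent))
print(totals)  # {'alice': 17.5, 'bob': 5.0}
```

The key design point is that the same question is answered twice, once slowly and completely, once quickly and incrementally, and the serving layer reconciles the two.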

Kappa, on the other hand, is a stream-processing-centric data pipeline design with a single stream-processing layer, immutable event logs, and reprocessing capabilities. Uber, for example, uses a Kappa-like approach to process ride events in real time, then reprocesses data for billing adjustments.
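A hedged sketch of the Kappa idea, using a made-up ride-billing example (the field names and surcharge logic are assumptions, not Uber's actual pipeline): there is one append-only event log and one processor, and "reprocessing" simply means replaying the log through an updated version of that processor.

```python
# Append-only, immutable event log -- never mutated after writing.
event_log = [
    {"ride_id": 1, "fare": 12.0},
    {"ride_id": 2, "fare": 8.0},
]

def process(events, surcharge=0.0):
    """Single stream-processing layer: compute billed fares per ride."""
    return {e["ride_id"]: e["fare"] + surcharge for e in events}

bills = process(event_log)                     # real-time pass
adjusted = process(event_log, surcharge=1.5)   # replay the log for a billing adjustment
print(adjusted)  # {1: 13.5, 2: 9.5}
```

Unlike Lambda, there is no separate batch layer to keep in sync; historical corrections come from replaying the same log through the same code path.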

The Delta/Medallion architecture unifies batch and stream processing, offering ACID compliance and support for schema enforcement and evolution. A use case for this architecture, according to Chatterjee, could involve stream fraud detection alerting in real time while backfilling historical data and providing data lineage.
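The Medallion pattern is commonly described as bronze (raw), silver (validated), and gold (business-level) layers. The sketch below is a loose illustration of that layering with hypothetical order records, not the session's example; in a real Delta Lake deployment, schema enforcement and ACID guarantees come from the storage layer rather than hand-written checks.

```python
# Bronze: raw ingested rows, including a malformed record.
bronze = [{"order_id": "1", "amount": "10.0"},
          {"order_id": "2", "amount": "oops"},   # bad record
          {"order_id": "3", "amount": "5.5"}]

def to_silver(rows):
    """Silver: enforce the schema (amount must parse as float)."""
    clean = []
    for row in rows:
        try:
            clean.append({"order_id": row["order_id"],
                          "amount": float(row["amount"])})
        except ValueError:
            pass  # in practice, quarantine and alert rather than drop silently
    return clean

def to_gold(rows):
    """Gold: business-level aggregate over validated data."""
    return {"order_count": len(rows),
            "revenue": sum(r["amount"] for r in rows)}

gold = to_gold(to_silver(bronze))
print(gold)  # {'order_count': 2, 'revenue': 15.5}
```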

Each of these architectures becomes critical when examining the evolution of data reliability, according to Chatterjee. Traditional data quality—which focused on accuracy and completeness—differs from today’s notions of data observability, which offers more comprehensive visibility into the health of data ecosystems.

Driven by the introduction of real-time streaming and increasingly distributed systems, data observability focuses on five core pillars:

  • Data freshness
  • Data volume
  • Data schema
  • Data lineage
  • Data quality 
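The pillars above can be sketched as simple automated checks over a dataset snapshot. Everything here is an assumption for illustration (the field names, thresholds, and expected values are invented, not from the talk), and lineage is omitted because it requires cross-dataset metadata rather than a single-snapshot check.

```python
from datetime import datetime, timedelta

# Hypothetical dataset snapshot -- field names are illustrative only.
snapshot = {
    "last_updated": datetime(2025, 5, 14, 9, 0),
    "row_count": 980,
    "columns": {"id": int, "amount": float},
    "null_fraction": 0.01,
}

def check_freshness(snap, now, max_age=timedelta(hours=24)):
    """Freshness: data must have been updated recently enough."""
    return now - snap["last_updated"] <= max_age

def check_volume(snap, expected=1000, tolerance=0.1):
    """Volume: row count within an expected band."""
    return abs(snap["row_count"] - expected) <= expected * tolerance

def check_schema(snap, expected_cols={"id": int, "amount": float}):
    """Schema: columns and types match what consumers expect."""
    return snap["columns"] == expected_cols

def check_quality(snap, max_nulls=0.05):
    """Quality: null rate stays below a threshold."""
    return snap["null_fraction"] <= max_nulls

now = datetime(2025, 5, 14, 12, 0)
report = {
    "freshness": check_freshness(snapshot, now),
    "volume": check_volume(snapshot),
    "schema": check_schema(snapshot),
    "quality": check_quality(snapshot),
}
print(report)  # all checks pass for this snapshot
```

In production these checks would run continuously against pipeline metadata and feed alerting, rather than being evaluated once by hand.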

Ultimately, tracking each of these aspects informs the viability of an enterprise’s data ecosystem, regardless of the chosen architecture. According to Chatterjee, “Bad data doesn't just mean bad insights…It causes lost reputation, lost revenue, anything you can think of.”

Many Data Summit 2025 presentations are available for review at https://www.dbta.com/datasummit/2025/presentations.aspx.
