CelerData 3 Bolsters Data Lake Analytics with Centralized, High-Performance Updates

CelerData, provider of a unified analytics platform for the modern, real-time enterprise, is releasing CelerData Version 3, the latest iteration of that platform, which is designed to deliver high-performance analytics in the enterprise.

Built on StarRocks, the open source massively parallel processing (MPP) SQL database, CelerData 3 allows lakehouse users to run high-performance analytics without ingesting data into a central data warehouse. Integrations with open table formats, including Delta Lake, Iceberg, and Hudi, eliminate the need for that ingestion.

Additionally, users can query across streaming and historical data in real time, which streamlines the data architecture and reduces the time spent on lakehouse analytics. Enterprises no longer need a separate platform for streaming analytics; CelerData 3 brings streaming data analytics and data lake analytics onto a single platform, according to the company.

“The data lakehouse has added critical capabilities to the data lake architecture by introducing ACID control, table formats, and data governance,” said James Li, CEO of CelerData. “However, analytics capabilities on the lakehouse are still limited and cost prohibitive. Most query engines struggle to support interactive ad-hoc queries, are not able to support real-time analytics, and fall apart when facing a large number of concurrent users.”

Compared with other common query engines, CelerData 3 improves query performance by at least 3x while reducing infrastructure costs and supporting thousands of concurrent users at 10,000 QPS (queries per second), according to the vendor.

CelerData also allows users to bring data into its own storage format on the lake, and it offers multi-table materialized views and a local caching layer to optimize performance and streamline data pipelines. Raw data can be ingested and transformed within CelerData, further simplifying the data processing pipeline.

This update's cloud-native architecture, resource and workload isolation, and multi-availability-zone (multi-AZ) deployment in the cloud open the data lakehouse to a wide range of use cases, while improving reliability and limiting storage costs.

“Though several challenges exist when it comes to the underlying infrastructure supporting data lakes, organizations continue to look for solutions and approaches that can address those challenges head-on. They understand the value an organization can achieve when implementing a data lake the right way,” said Mike Leone, principal analyst at ESG. “Of all the data lake environment challenges organizations experience today, our research shows the greatest challenge is the management, optimization, and automation of data placement. With CelerData’s support for a lakehouse architecture through the integration with common table formats such as Iceberg and Hudi, a data lakehouse can now have the option to conduct high-performance analytics without ingesting data into a central data warehouse.”

The latest iteration of CelerData will be made generally available in early April 2023.

To learn more about CelerData 3, please visit