Starburst Strengthens Support for Building Interactive Applications on the Data Lake

Starburst, the data lake analytics platform, is offering new capabilities that enable organizations to build and scale game-changing data applications without compromising on performance or cost.

According to the company, new features in Starburst Galaxy will help customers simplify development on the data lake by unifying data ingestion, data governance, and data sharing on a single platform.

Interactive applications often require the scalability and cost-efficiency of a data lake, but building and maintaining that data lake is complex and time-consuming for data teams. To overcome these challenges, Starburst has added support for:

  • Near real-time analytics with streaming ingestion: With streaming ingestion, customers can use Kafka to hydrate their data lake in near real time, ensuring applications have the most up-to-date insights for their users. Support for fully managed solutions, such as Confluent Cloud, is also planned.
  • Automated data governance: As new data lands in the lake, machine learning models in Gravity—a universal discovery, governance, and sharing layer in Starburst Galaxy—will automatically apply classifications for certain categories. Depending on the class, Gravity will apply policies granting or restricting access. This automation is particularly useful for teams handling sensitive data like personally identifiable information (PII), according to the company. Now, as soon as PII lands in the lake, Gravity will be smart enough to identify and restrict access to that data.
  • Automated data maintenance: New automations make it easy for customers to optimize their data lake by abstracting away common management tasks like data compaction and data vacuuming. Users can now maintain warehouse-like performance as the volume and complexity of data in their data lake grow, without adding brittle manual processes.
  • Universal data sharing with built-in observability: With Gravity, users can easily package data sets into shareable data products to power end-user applications, regardless of source, format, or cloud provider. New functionality will allow users to securely share these high-quality data products with third parties, such as partners, suppliers, or customers.
  • Self-service analytics powered by AI: Not only are data lakes notoriously hard to manage, but the majority of data teams are understaffed, according to the company. New AI-powered experiences in Galaxy, like text-to-SQL processing, will enable data teams to offload basic exploratory analytics to business users, freeing up their time to build and scale data pipelines.
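
The automated governance flow described above (classify newly landed data, then grant or restrict access based on the class) can be illustrated with a minimal sketch. This is not Gravity's API or its machine learning models: the regular expressions stand in for the classifiers, and the names `PII_PATTERNS`, `POLICIES`, `classify_column`, `can_read`, and `govern`, along with the roles, are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical regex patterns standing in for ML classifiers (an assumption,
# not how Gravity actually detects sensitive data).
PII_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}

# Hypothetical policy table: classification -> roles allowed to read.
POLICIES = {
    "email": {"privacy_officer"},
    "us_ssn": {"privacy_officer"},
    "unclassified": {"privacy_officer", "analyst"},
}


def classify_column(sample_values):
    """Return the first class whose pattern matches every non-empty sample."""
    values = [v for v in sample_values if v]
    for label, pattern in PII_PATTERNS.items():
        if values and all(pattern.match(v) for v in values):
            return label
    return "unclassified"


def can_read(role, column_class):
    """Check the policy table for the given role and classification."""
    return role in POLICIES[column_class]


def govern(table):
    """Classify each column of a newly landed table (column -> sample values)."""
    return {name: classify_column(samples) for name, samples in table.items()}


if __name__ == "__main__":
    landed = {
        "contact": ["ana@example.com", "li@example.org"],
        "country": ["BR", "CN"],
    }
    for column, label in govern(landed).items():
        print(column, label, "analyst can read:", can_read("analyst", label))
```

In this sketch the policy is keyed on the classification rather than on individual tables, so any column classified as sensitive is restricted as soon as it is seen, which mirrors the behavior the company describes for PII.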

“Data-intensive initiatives like AI require a solid data foundation to be successful,” said Justin Borgman, co-founder and CEO of Starburst. “We provide that foundation, giving our customers the ability to quickly access and analyze all their data in order to scale applications from the first hundred users to the first thousand and beyond. We ensure optimal performance even with high concurrency and exponentially growing data volumes. The new streaming ingest, data maintenance and governance automations, and data sharing capabilities in Starburst make it remarkably easy for teams to build, deploy, and scale applications on top of the data lake.”

Starburst’s position as an Amazon Web Services (AWS) Data and Analytics Competency Partner means that AWS customers can rest assured that these features will be made available on the fastest hardware AWS offers, including AWS Graviton3 and the newly launched Amazon Simple Storage Service (Amazon S3) Zonal storage class, and will integrate seamlessly with core tools like Amazon QuickSight and new tools like Amazon Bedrock.

For more information about this news, visit