Databricks Introduces LakeFlow: A Unified, Intelligent Platform for Data Engineering

Databricks, the Data and AI company, is introducing Databricks LakeFlow, a new solution that unifies and simplifies all aspects of data engineering—from data ingestion to transformation and orchestration.

According to Databricks, with LakeFlow, data teams can now simply and efficiently ingest data at scale from databases such as MySQL, Postgres, and Oracle, and enterprise applications such as Salesforce, Dynamics, SharePoint, Workday, NetSuite, and Google Analytics. Databricks is also introducing Real Time Mode for Apache Spark, which allows stream processing at ultra-low latency.

LakeFlow automates the deployment, operation, and monitoring of pipelines at scale in production, with built-in support for CI/CD and advanced workflows that support triggering, branching, and conditional execution.
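To make the triggering, branching, and conditional-execution idea concrete, here is a sketch of a two-task workflow defined with the Databricks Python SDK. The job name, notebook path, and the parameter-based condition are illustrative assumptions, not details from the announcement, and running it requires a configured Databricks workspace:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

# Reads workspace URL and credentials from the environment or config profile.
w = WorkspaceClient()

# A two-step workflow: run an ingest notebook, then branch on a job parameter.
job = w.jobs.create(
    name="nightly-orders-pipeline",  # hypothetical job name
    tasks=[
        jobs.Task(
            task_key="ingest",
            notebook_task=jobs.NotebookTask(
                notebook_path="/Pipelines/ingest_orders"  # hypothetical notebook
            ),
        ),
        # Conditional execution: downstream tasks can depend on this check.
        jobs.Task(
            task_key="is_prod",
            depends_on=[jobs.TaskDependency(task_key="ingest")],
            condition_task=jobs.ConditionTask(
                op=jobs.ConditionTaskOp.EQUAL_TO,
                left="{{job.parameters.env}}",
                right="prod",
            ),
        ),
    ],
)
```

Tasks downstream of `is_prod` would declare a dependency on its true or false outcome, which is how branching is expressed in a Databricks workflow.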

Data quality checks and health monitoring are built-in and integrated with alerting systems such as PagerDuty. LakeFlow makes building and operating production-grade data pipelines simple and efficient while still addressing the most complex data engineering use cases, enabling even the busiest data teams to meet the growing demand for reliable data and AI, according to the company.

LakeFlow simplifies all aspects of data engineering via a single, unified experience built on the Databricks Data Intelligence Platform, with deep integrations with Unity Catalog for end-to-end governance and serverless compute enabling highly efficient and scalable execution.

Key features of LakeFlow include:

  • LakeFlow Connect: Simple and scalable data ingestion from every data source. LakeFlow Connect provides a breadth of native, scalable connectors for databases such as MySQL, Postgres, SQL Server, and Oracle, as well as enterprise applications like Salesforce, Dynamics, SharePoint, Workday, and NetSuite.
  • LakeFlow Pipelines: Simplifying and automating real-time data pipelines. Built on Databricks’ highly scalable Delta Live Tables technology, LakeFlow Pipelines allows data teams to implement data transformation and ETL in SQL or Python.
  • LakeFlow Jobs: Orchestrating workflows across the Data Intelligence Platform. LakeFlow Jobs provides automated orchestration, data health monitoring, and delivery for workloads ranging from scheduled notebooks and SQL queries to ML training and automatic dashboard updates.
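As an illustration of the SQL-or-Python authoring model described above, a pipeline built on Delta Live Tables might look like the following sketch. The table names, source path, and quality expectation are hypothetical, and the `dlt` module is only available inside a Databricks pipeline runtime (where `spark` is predefined), so this is a declarative pipeline fragment rather than a standalone script:

```python
import dlt  # available only inside a Databricks Delta Live Tables pipeline
from pyspark.sql.functions import col

# Bronze layer: incrementally ingest raw files from cloud storage.
@dlt.table(comment="Raw orders loaded incrementally from cloud storage")
def orders_raw():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/demo/orders/raw/")  # hypothetical source location
    )

# Silver layer: cleaned table with a built-in data quality expectation.
@dlt.table(comment="Orders with valid IDs and positive amounts")
@dlt.expect_or_drop("positive_amount", "amount > 0")  # failing rows are dropped
def orders_clean():
    return dlt.read_stream("orders_raw").where(col("order_id").isNotNull())
```

When this file is attached to a pipeline, the platform infers the dependency between the two tables, keeps both continuously up to date, and tracks how many rows the `positive_amount` expectation drops, which is the kind of built-in quality monitoring the announcement describes.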

With LakeFlow, the future of data engineering is unified and intelligent, the company said. LakeFlow is entering preview soon, starting with LakeFlow Connect. Customers can join the waitlist.

For more information about this news, visit