Informatica Unveils Innovations for Spark-Based Big Data Clouds

Informatica, an enterprise cloud data management provider, is introducing a new solution for Apache Spark-based big data cloud environments. The innovations in this release, powered by the CLAIRE engine, enable organizations to stream, ingest, process, cleanse, protect, and govern more big data with less effort.

The new AI-driven hybrid big data management solution delivers more trusted information assets and accelerates self-service analytics using machine learning for hybrid and multi-cloud environments. 

The solution will help organizations overcome the challenges of managing and governing large data volumes flowing into and through data lake environments on-premises and in the cloud. 

The innovations include:

  • Increasing data engineering productivity with broader support for big data clouds such as Google Cloud Dataproc and new advanced serverless Spark-based integrations with Qubole and Azure Databricks. Additionally, users will benefit from rapid development of IoT data pipelines with machine-learning-driven structure discovery of semi-structured datasets (e.g., machine data).
  • Empowering data analyst and data science teams with advanced self-service data discovery and data preparation, including 50+ new functions. Examples include statistical and windowing functions, fuzzy clustering, and matching rules, along with more controlled access to data through data masking and the ability to ingest logical models and business terms into a data catalog.
  • Optimizing data operations through improved monitoring of data infrastructure, with machine-learning-driven operational management and proactive actions and recommendations.

“The new Informatica innovations empower all levels of data users to interact with huge data sets to glean insights,” said Ronen Schwartz, senior vice president and general manager, Cloud, Big Data and Data Integration, Informatica. “For example, data engineers can now build serverless data pipelines running on Apache Spark in the Cloud and provide data scientists with advanced self-service data prep, powered by AI and machine learning.”

For more information about this news, visit