Pentaho Data Catalog Update Fuels AI with Trustworthy, High-Quality, Easily Accessible Data


Pentaho, an industry-leading data intelligence and integration platform used by 73% of the Fortune 100, is unveiling a series of enhancements to Pentaho Data Catalog, a solution that provides a single source of truth through trusted data for core operations and AI. The latest iteration of Pentaho Data Catalog focuses on empowering organizations to deliver AI with increased data quality, observability, and trust.

Born from both customer requests and the ongoing, industry-wide pressure to deliver data to AI, Pentaho’s enhancements to its data catalog solution expand upon its 20-year legacy as a category leader in data management. The innovations further enable customers to cultivate foundational, AI-ready data and data intelligence without having to deal with infrastructural burdens or slow time-to-value, according to Pentaho.

This update to Pentaho Data Catalog focuses on the fact that, “if you are delivering data to AI, you need to know where the data is. You need to know the quality of the data. You need to know if the data is okay to be used for an AI application,” explained Kunju Kashalikar, senior director of product management at Pentaho. This demand resulted in a variety of new features, including the new, enhanced Data Marketplace.

The Data Marketplace centralizes the search for trusted, curated datasets for both daily and strategic initiatives. Cultivating a seamless data experience, the Data Marketplace offers:

  • Deeper integrations with Okta and Active Directory, improving policy access and security measures for AI models
  • Creation of data products paired with prescribed quality and sensitivity characteristics
  • Data delivery to points of use, including Python IDEs and ML test and deployment tools

“We always had the ability to know information about the data, but we never had presented a marketplace kind of view,” said Kashalikar. “We added… [the Data Marketplace] to be able to make that experience very seamless, to…find the data, check the trustworthiness of the data, see what peers are doing with the data, and then be able to deliver [that] data to wherever I'm using it.”

Additionally, “We have the notion of a data product as part of our data marketplace. What that means is, your data scientists, your business users can not only find the data, shape it to the way they wanted, but also create a data product out of it to be shared with their peers…the sharing of data in an easy manner as part of the Data Marketplace is a key capability that differentiates us,” Kashalikar noted.

Another notable feature addresses AI model governance, offering an integration that increases visibility into how and where models are accessing data. This not only supports appropriate use but also enables proactive governance.

This governance capability exists because, “primarily when you are bringing data to use for AI, you want to make sure you are applying the right data governance guardrails, so while you are making as much data available as possible, you are doing it within the confines of your regulatory and corporate governance policies,” explained Kashalikar.

With Pentaho Data Catalog’s ML enhancements for data classification, customers can automate and scale how data is managed, including unstructured data. Further enhancements to data optimization, including re-tiering for structured and unstructured data, drive support for archiving, migration, and policy-driven lifecycle management use cases.

Ultimately, this release makes the “journey of…[encountering] a problem to finding the data, to actually getting my hands on the data, to do what I need to do with data…a pretty quick, satisfying shopping experience,” Kashalikar said.

To learn more about the latest version of Pentaho Data Catalog, please visit https://pentaho.com/.

