Cloudera Introduces New Machine Learning and Analytics PaaS

Cloudera has announced Cloudera Altus with SDX, a machine learning and analytics platform-as-a-service (PaaS) built on a shared data catalog that provides business context for the data it manages.

According to Cloudera, Altus supports a variety of high-value business use cases that require applying multiple data analysis capabilities and approaches together. SDX makes it possible for those analytic functions to work together to combine data from different sources into a single coherent and actionable picture.

SDX enables Altus cloud services – including Data Engineering, Analytic Database (beta) and soon Data Science – to securely access data through a reliable shared data experience. There is one trusted source of metadata for all machine learning and analytics services and users. Altus brings the simplicity and scale of the cloud to big data analytics, enabling people to confidently utilize multiple analytics services to unlock the value in their business data. Altus delivers IT control through simplified workload management, governance, and security, while catering to the end user by offering curated self-service access to data and their preferred tools.

The disparate cloud services that different teams spin up as “shadow IT” create IT and organizational challenges because their discrete models and fragmented approaches are usually too narrow and do not scale well enough to be managed across the company. They also increase the cost, effort, and compliance challenges associated with ungoverned data replication and access.

“Cloudera Altus with SDX enables businesses to build and manage multi-function analytics use cases in the cloud, integrating data engineering, IoT, customer and operations analytics, with machine learning,” said Vikram Makhija, general manager, Cloud Business Unit, at Cloudera. “Cloudera offers a proven solution for businesses to capitalize on the value of their data, avoiding the analytics cloud sprawl problem through the simplicity and scale of Cloudera’s modern cloud platform for machine learning and analytics.”

According to Cloudera, SDX, introduced in September 2017, makes multi-function data use cases easier to develop, less expensive to deploy, and more consistently secure. SDX is a modular software framework that provides a centralized, consistent approach to schema, security, governance, data ingest and more, making it possible for dozens of different customer applications to run against shared or overlapping sets of data. SDX, currently available as a self-managed reference architecture for Cloudera Enterprise, will now also be available in Cloudera Altus, making it easier for organizations to build high-value multi-function data use cases.

Altus runs on Amazon Web Services (AWS) infrastructure, with support for Microsoft Azure infrastructure in beta.

The Altus cloud service offerings include:

Altus Data Engineering is a jobs-focused platform that facilitates ETL and data preparation for analytics and data science in the cloud. It simplifies resource allocation, job creation, and troubleshooting for users. It is part of a more tightly integrated horizontal PaaS that supports diverse analytics and data science use cases. The Altus SDK for Java gives users a programmatic way to run data engineering workloads on the platform-as-a-service.

Altus Analytic DB (beta) is the first data warehouse cloud service that brings the warehouse to the data through a unique cloud-scale architecture that eliminates complex and costly data movement. It delivers instant self-service BI and SQL analytics to anyone, easily, reliably, and securely. Furthermore, by leveraging SDX, the same data and catalog are accessible to analysts, data scientists, data engineers, and others using the tools they prefer (SQL, Python, R) without data movement.

Altus Data Science (beta coming soon) provides data science teams with on-demand Python and R services for advanced analytics and machine learning. With a serverless user experience, data science teams spend more time delivering value and less time on DevOps. It runs on a common framework of services for security, governance, data ingest, and data cataloging, which makes it easier to integrate data science with analytic database and data engineering functions for a complete analytic solution.

For more information, or to review the reference architecture, visit