Cloudera Advances Hadoop Security with RecordService

Cloudera has launched a public beta release of RecordService, a new high-performance security layer for Apache Hadoop that centrally enforces role-based access control policies across the platform. Complementing Apache Sentry (incubating), which provides unified policy definition, RecordService delivers complete row- and column-based security, and dynamic data masking, for every Hadoop access engine. 

The announcement was made at Strata + Hadoop World in New York City.

A public beta of RecordService is available under the Apache open source license, and will be transitioned to the Apache Software Foundation in the future.

Cloudera has also announced a public beta of Kudu, a new columnar storage engine for Hadoop that enables fast analytics on fast data. Complementing the existing Hadoop storage options, HDFS and Apache HBase, Kudu is a native Hadoop storage engine that supports both low-latency random access and high-throughput analytics, dramatically simplifying Hadoop architectures for increasingly common real-time use cases.

In regulated industries, advanced security is critical for protecting sensitive data without limiting the analytic agility necessary for competitive advantage. Capital One is one of the first users of the RecordService beta and actively contributes to the project. RecordService is also being developed in collaboration with Cloudera partners, including Datameer and Platfora, and other Hadoop vendors.

According to Cloudera, to ensure that sensitive data cannot fall into the wrong hands, a comprehensive security approach must include fine-grained access controls that define what data users can see and what they can do with it. However, for enterprises to take full advantage of the power of Hadoop, these access controls cannot limit the agility of the platform or its users. In addition, the access permissions must be the same for each user, regardless of the access engine they choose, down to the row and column level.

Sentry, the standard for unified policy definition in Hadoop, helps address this issue by applying consistent policies across all the different access paths. However, some access paths support more granular restrictions than others, and, as the Hadoop ecosystem has expanded to include diverse access engines such as Apache Spark, Impala, and Apache Solr, it has been challenging to enforce these policies consistently without limiting access to the data itself.

As a new core security layer, native to the Hadoop ecosystem, RecordService enables the continued expansion of Hadoop for the enterprise, so that organizations can innovate without compromise.

According to Cloudera, RecordService complements the policy definition of Sentry as a new layer that provides a single point of enforcement, simplifying security with unified row- and column-level controls for all access paths, including Spark and MapReduce. This fine-grained policy enforcement allows enterprises to take advantage of the full capabilities of Hadoop while eliminating the complex security workarounds that can often lead to security errors. With RecordService, all users can gain insights from their data, securely, using their tool of choice.
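
The idea of a single enforcement point can be illustrated with a minimal Python sketch. This is not the RecordService or Sentry API: the `Policy` class, the `enforce` and `read` functions, the `analyst` role, and the sample columns are all hypothetical names chosen for illustration. The sketch only shows how one shared layer could apply a row filter, a column restriction, and a data mask before any access engine sees the records.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


@dataclass
class Policy:
    """Hypothetical per-role policy (illustrative only, not a Sentry object)."""
    columns: List[str]                       # columns the role may see
    row_filter: Callable[[Row], bool]        # rows the role may see
    masks: Dict[str, Callable[[object], object]] = field(default_factory=dict)


def enforce(policy: Policy, rows: Iterable[Row]) -> Iterator[Row]:
    """Apply row filtering, column projection, and masking in one place."""
    for row in rows:
        if not policy.row_filter(row):
            continue  # row-level control: drop rows the role may not see
        visible: Row = {}
        for col in policy.columns:  # column-level control: project allowed columns
            value = row[col]
            mask = policy.masks.get(col)
            visible[col] = mask(value) if mask else value  # dynamic masking
        yield visible


# Assumed example policy: analysts see New York customers, with SSNs masked.
POLICIES = {
    "analyst": Policy(
        columns=["name", "state", "ssn"],
        row_filter=lambda r: r["state"] == "NY",
        masks={"ssn": lambda v: "***-**-" + str(v)[-4:]},
    )
}


def read(role: str, rows: Iterable[Row]) -> List[Row]:
    """Every engine would call this same function, so results cannot diverge."""
    return list(enforce(POLICIES[role], rows))


SAMPLE = [
    {"name": "Ann", "state": "NY", "ssn": "123-45-6789", "balance": 100},
    {"name": "Bob", "state": "CA", "ssn": "987-65-4321", "balance": 250},
]
```

Because every caller goes through `read`, a query issued from one engine and a job run from another would receive the same filtered and masked records, which is the property the article attributes to a shared enforcement layer.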

“With RecordService, the Hadoop community fulfills the vision of unified fine-grained access controls across every Hadoop access path,” said Eddie Garcia, chief security architect, Cloudera. “Together, with leading security advances such as Sentry, RecordService enables enterprises to gain powerful insights from Hadoop with the confidence that a powerful, core security layer is protecting their most sensitive data.”

By enabling native, granular controls for Apache Spark specifically, RecordService plays a key role in advancing the development of Spark to meet enterprise requirements. With security a core focus area of Spark development under Cloudera’s recently announced One Platform Initiative, RecordService is a key project in helping Spark progress towards becoming the next default processing engine for Hadoop.

For more information, go to