Cloudera Introduces New Version of its Big Data Platform

Cloudera has unveiled the fourth generation of its flagship Apache Hadoop data management platform, Cloudera Enterprise. Cloudera Enterprise 4.0 combines the company’s Cloudera Manager software with expert technical support to provide a turnkey system for deploying and managing Hadoop in production. The company also announced the general availability of CDH4 (Cloudera’s Distribution Including Apache Hadoop, version 4), following the successful completion of a beta program among its enterprise customers and partner ecosystem, with contributions from Cloudera’s engineering team and the greater Apache open source community.

According to Cloudera, when used together, Cloudera Enterprise 4.0 and CDH4 enable enterprises to integrate Hadoop with their existing enterprise data management systems for mission-critical applications, with new features that support high availability, increased security, improved extensibility and automation for management of large-scale Hadoop clusters.

“CDH has been tried and tested in over 65% of the world’s largest commercial Hadoop deployments. With the new enhancements we have made, CDH4 and Cloudera Manager 4 together deliver a proven Hadoop platform that is hardened for enterprise use with deep consideration for high availability, scalability, performance, ease of use and other things enterprises have come to expect from any solution deployed to run mission critical processes,” says Charles Zedlewski, vice president of product at Cloudera.


According to Cloudera, CDH4 marks a major step forward for CDH. CDH4 combines Apache Hadoop with other open source applications within the Hadoop stack to deliver advanced, enterprise-grade features, including:

  • High Availability: Offers increased usability for mission-critical use cases and applications with a highly available NameNode that eliminates the only remaining single point of failure in HDFS. Support for heterogeneous clusters minimizes downtime by allowing users to run different nodes on different versions of Hadoop.
  • Increased Security: Allows for more sensitive data to be stored in CDH with more granular access control to support multi-tenancy. HBase table and column permissions control which users and groups have access to HBase columns and tables, and Fair Scheduler ACLs control which groups can administer or submit jobs into different Fair Scheduler pools.
  • Improved Extensibility: Helps solve a broader range of scenarios through coprocessors that enable more sophisticated applications in real time and open resource management (a.k.a. MR2) that allows multiple data processing frameworks to run on the same Hadoop cluster, ultimately reducing storage costs.
  • Other new features from the Hadoop stack: Common compression codec (Snappy), common file format (Apache Avro), REST over HTTP access to HDFS, web shell (for Apache Pig and Apache HBase), slot-less resource manager, faster and easier user web access to Hadoop systems, 100% gain in filesystem I/O performance, 100% speedup in HBase random reads, 200% improvement in Apache Flume data ingest rate and a 30% faster Apache MapReduce shuffle.

Cloudera Manager 4

Offering greater ease of use, Cloudera Manager 4, a management application for Apache Hadoop, enables enterprises to store, process and analyze all their data with improved time to value. Cloudera Manager 4’s new features include:

  • Easier Deployment and Management: 3-Step HA Configuration guides setup for the NameNode in three simple steps, Multi-Cluster Management allows for management of multiple clusters from a single instance of Cloudera Manager, and Backwards Compatibility offers flexibility in management with support for both CDH3 and CDH4.
  • Rich Visualizations and Sophisticated Automations for Large-Scale Clusters: Heatmaps enable administrators to quickly identify problem nodes within large clusters and take action, while Federated NameNode Management simplifies the process of growing CDH clusters to billions of files across thousands of nodes.
  • Seamless Integration: The Cloudera Manager API integrates smoothly with existing enterprise management and monitoring tools. Cloudera Manager also includes support for LDAP authentication, supports additional databases, including Oracle and PostgreSQL, and adds new packages for Ubuntu and Debian.
  • Other new features: Comprehensive host monitoring, client configuration management and extensive Hadoop setup readiness checks.

“Big data is something that the enterprise can no longer afford to ignore, so Cloudera’s products and support empower organizations to maximize the value of data with confidence and ease,” says Zedlewski.

 To learn more about Cloudera Enterprise 4, visit

To learn more about CDH4, visit or download CDH4 for free at