EMC and Pivotal Partner on Data Lake Hadoop Bundle 2.0

EMC and Pivotal today announced the Data Lake Hadoop Bundle 2.0, a turn-key offering that combines compute, analytics and storage for customers building scale-out data lakes for predictive analytics. The new release includes EMC's Data Computing Appliance (DCA) and now enables predictive analytics for customers. Data Lake Hadoop Bundle 2.0 is generally available today.

"Big data is top of mind in organizations worldwide—but for so many businesses, this means they're concerned about how to store it and harness its value," said Jeremy Burton, president, Products and Marketing, EMC. "This bundled offering from EMC and Pivotal addresses the needs of customers today as they're building their scale-out data lakes and offers a high performance system for leveraging Hadoop for big data analytics."

The term "data lake" describes a scalable Hadoop repository for aggregating data generated by traditional and next-generation workloads. According to EMC, its data lake has enterprise-ready characteristics and is designed to help organizations rapidly gain business value from big data.

Earlier this year, EMC and Pivotal announced the first release of the Data Lake Hadoop Bundle, linking enterprise Hadoop and predictive analytics with enterprise scale-out storage.

The new Data Lake Hadoop Bundle 2.0 includes EMC's Data Computing Appliance (DCA), a high-performance big data computing appliance for deploying and scaling Hadoop and advanced analytics; Isilon scale-out NAS (network-attached storage); and the Pivotal HD Hadoop distribution and the Pivotal HAWQ parallel SQL query engine to support enterprise-class predictive analytics.

DCA is optimized for big data workloads to provide the user with a streamlined experience, maximizing analytics performance and shortening time to value.

The Data Lake Hadoop Bundle 2.0 is designed to help organizations accelerate the value of Hadoop Big Data initiatives in the enterprise while keeping acquisition and management costs lower than solutions assembled from disparate parts. The bundle comprises a pre-tested, high-performance Big Data analytic system featuring world-class EMC storage and advanced analytics on Hadoop through enterprise appliances, delivered as a single, focused and easy-to-implement solution.

In related news, EMC partner Pivotal also announced the implementation of an architecture that builds upon disk-based storage with memory-centric processing frameworks. Pivotal is now actively dedicating resources to an open source project called Tachyon. Led by UC Berkeley PhD candidate Haoyuan Li, Tachyon is a memory-centric, fault-tolerant distributed file system that enables data exchange at in-memory speed across cluster frameworks.

More information about EMC is available at

Related Articles

Hadoop heavyweight Pivotal is open sourcing components of its Big Data Suite, including Pivotal HD, HAWQ, Greenplum Database, and GemFire; forming the Open Data Platform (ODP), a new industry foundation, along with founding members GE, Hortonworks, IBM, Infosys, SAS, and other big data leaders; and forging a new business and technology partnership with Hortonworks.

Posted February 17, 2015