BlueData Releases Summer Update that Powers AI and Machine Learning Enhancements

BlueData, provider of a Big-Data-as-a-Service (BDaaS) software platform, is releasing updates for its BlueData EPIC platform, building upon innovations for large-scale distributed analytics and machine learning (ML) workloads on Docker containers.

This summer release is the result of collaboration with BlueData’s enterprise customers to develop new functionality that supports their Big Data and AI initiatives – as those initiatives extend well beyond Hadoop and Spark to a range of different ML / DL and data science workloads, and beyond on-premises infrastructure to public cloud and hybrid architectures.

These customer-driven innovations provide the agility of containers and elasticity of cloud computing, while ensuring enterprise-class security and reducing costs with automation.

Now BlueData customers can benefit from AI-as-a-Service and ML-as-a-Service capability for their enterprise deployments – whether on-premises, in multiple public clouds, or in a hybrid model.

One of the key concepts underpinning this new release is the separation of compute and storage for Big Data and ML / DL workloads. This is a fundamental tenet of the BlueData EPIC architecture, and it allows organizations to deploy multiple containerized compute clusters for different workloads (e.g., Spark, Kafka, TensorFlow) while sharing access to a common data lake.

This also enables hybrid and multi-cloud implementations, with the ability to mix and match compute/storage resources (whether on- or off-premises) depending upon the nature of the workload.

It further provides the ability to scale and optimize compute resources independently of data storage – delivering greater flexibility, improving resource efficiency, eliminating data duplication, and reducing cost through the reuse of existing storage investments.

Several other new features and benefits include:

  • Hadoop Cluster Optimization: The summer release includes a new “External Host” innovation that allows containers running on a private non-routable network with BlueData EPIC to act as an extension of an enterprise’s on-premises physical Hadoop clusters. This allows enterprises to easily scale and expand their existing Hadoop clusters – and reduce overall TCO – by adding new containerized compute-only nodes (for Spark, BI / ETL, ML / DL, etc.).
  • Automation for Hybrid and Multi-Cloud: BlueData EPIC now features a new extensible CLI software development kit (SDK) for automated deployments supporting multiple public cloud services. It includes fine-grained automation hooks to enable creation of auto-scaling schemes as well as the use of spot instances in public cloud environments. This same functionality can be used in hybrid cloud deployments.
  • Container Migration: With the new summer release, BlueData EPIC allows for the migration of stateful containers between hosts to support maintenance and disaster recovery scenarios. This feature includes a purpose-built external storage plug-in, and provides the unique ability to move relevant container-specific directories in addition to the data directories – thereby supporting seamless container migration for stateful services.
  • Service Provider Support: BlueData EPIC now has a new RESTful API version that supports sophisticated workflows spanning authentication, authorization, resource allocation, cluster creation, pre/post configurations, storage connectivity, and running analytics jobs. With this new functionality, organizations can map each of their tenants to a unique instance of an Active Directory and/or SAML service – for automated onboarding of multiple lines of business or separate external customers in a service provider model.

“At BlueData, we’ve been in the fortunate position of helping many of the world’s largest and most well-respected enterprises on their Big Data journeys,” said Kumar Sreekanti, co-founder and CEO of BlueData. “Now we’re working with these and other new customers to advance their AI implementations. This new release provides these customers with innovative functionality to deliver faster time-to-value, greater agility, and lower TCO for their Big Data and AI deployments.”

For more information about these updates, visit