MapR Adds Spark to Hadoop

Hadoop distribution provider MapR Technologies has announced a strategic partnership with Databricks and the addition of the complete Apache Spark technology stack to the MapR Distribution.

Databricks was founded by the creators of Apache Spark, and a key goal of Databricks is to drive broad adoption of Spark, an in-memory processing framework that provides speed, programming ease, and advantages for real-time processing.

With the new partnership, the MapR Distribution for Hadoop now includes the Spark stack, which supports rapid application development by allowing code to be reused across batch, interactive, and streaming applications. Spark also provides a general-purpose execution framework with in-memory pipelining to speed up end-to-end application performance.

According to the vendors, because MapR allows streaming writes directly to the data platform, customers that combine Spark with the MapR Distribution not only get low-latency Spark applications, but those applications also operate on data that is closer to real time. This can facilitate advantages such as faster fraud detection, better personalization of media, higher quality from manufacturing processes, and other operational analytic use cases.

“The open source community is developing tremendous technology innovations at a rapid pace,” said John Schroeder, CEO and cofounder, MapR Technologies. “MapR provides a future-proof investment for our customers with the most open distribution to give them flexibility to pick the right solution with the widest range of compute frameworks and libraries.”

With the introduction of the complete Spark stack, the MapR Distribution now includes more than 20 Apache open source projects for batch, streaming, graph, machine learning, and other categories.