Redis Labs Integrates with Spark to Accelerate Data Processing

Redis Labs is offering integration with Spark SQL and releasing a Spark-Redis connector package that promises to accelerate data processing.

Redis Labs is a commercial provider of Redis, the open source in-memory data store used as a database, a caching layer, or a message broker. “We provide an enterprise version of Redis which allows you to scale and cluster Redis seamlessly without any performance impact or downtime,” said Leena Joshi, vice president of product marketing at Redis Labs.

In a recent benchmark using time-series data, Redis Labs reported that Spark using Redis as a data store processed data 135 times faster than Spark using HDFS, and 45 times faster than Spark using Tachyon as an off-heap data store or Spark storing the data on-heap. “Redis can now provide that shared distributed memory infrastructure for Spark at very high performance and the data structures make processing of the data extremely fast,” Joshi said.

The Spark-Redis connector package is open source and provides a library for writing to and reading from a Redis cluster, with access to all of Redis' data structures – String, Hash, List, Set, Sorted Set, bitmaps, hyperloglogs – from Spark as RDDs. The package also ensures close alignment between Spark and Redis clusters, reducing network overhead and ensuring optimal processing times.

According to Redis Labs, users who combine Redis with Spark can expect Spark performance to accelerate by more than 100 times, along with the ability to access individual data elements rapidly, which minimizes serialization/deserialization overhead and avoids the need to transfer large chunks of data.
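
The point about per-element access can be illustrated with a small sketch. It uses plain Python dictionaries as hypothetical stand-ins for a data store (no Redis or Spark calls are made), and the names `blob_store`, `hash_store`, `read_field_from_blob`, and `read_field_from_hash` are illustrative assumptions, not part of the Spark-Redis package. It contrasts a record stored as one serialized blob, which must be fetched and deserialized whole to read a single field, with the same record stored field by field, as a hash-like structure allows.

```python
import json

# Hypothetical sample record; the values are made up for illustration.
record = {"sensor": "s-17", "ts": 1455000000, "temp": 21.5, "humidity": 0.43}

# Layout 1: the whole record serialized as a single opaque value.
blob_store = {"reading:1": json.dumps(record)}

# Layout 2: one serialized entry per field, similar in spirit to a hash.
hash_store = {"reading:1": {k: json.dumps(v) for k, v in record.items()}}


def read_field_from_blob(key, field):
    """Fetch and deserialize the entire blob, then pick one field."""
    payload = blob_store[key]
    return json.loads(payload)[field], len(payload)


def read_field_from_hash(key, field):
    """Fetch and deserialize only the requested field."""
    payload = hash_store[key][field]
    return json.loads(payload), len(payload)


blob_value, blob_chars = read_field_from_blob("reading:1", "temp")
hash_value, hash_chars = read_field_from_hash("reading:1", "temp")

print(blob_value == hash_value)  # both layouts return the same field value
print(hash_chars < blob_chars)   # the field-level read handles less data
```

In this toy case the field-level read touches only a few characters rather than the whole record. Any real-world gain would depend on record size, network conditions, and the serialization format in use.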

The solution exposes Redis data structures through the Spark RDD and Dataset APIs, provides Spark SQL support as a standard query interface, and permits the use of Redis Cluster as a distributed memory infrastructure for Spark. Future enhancements will include using the combination of Spark and Redis for other popular use cases such as graph computation and machine learning.

Redis Labs offers Redis as a service (in public clouds, private clouds or public PaaS) or as software (downloadable, in containers or in private PaaS).

For more information about Redis Labs and Spark, visit


