Splice Machine Creates Connector to Strengthen IoT and Machine Learning

Mar 19, 2018

Splice Machine is launching a connector that aims to boost IoT and machine learning applications.

The solution, Apache Spark DataSource, provides a fast, native, ACID-compliant datastore for Spark and also opens up Splice Machine’s underlying Apache Spark engine to directly use advanced capabilities such as Spark SQL, Spark Streaming, and MLlib or R (for machine learning).

The connector enables data engineers, data scientists, and developers to directly use Spark without excessive data transfers in and out of Splice Machine.

The connector is now a part of Splice Machine’s community edition, with a simple query example and a streaming example also available on GitHub.

Apache Zeppelin notebooks with streaming and machine learning examples of the native Spark DataSource are also available on Splice Machine’s Cloud Service.

New functions include the ability to:

Create Table - create a Splice Machine table from the schema of a Spark DataFrame
Insert - insert the rows of a DataFrame into a Splice Machine table
Update - update the rows of a Splice Machine table specified by a DataFrame
Upsert - update or insert the rows of a Splice Machine table specified by a DataFrame
Delete - delete the rows of a Splice Machine table specified by a DataFrame
Query - issue a SQL query and return the result set as a DataFrame
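
To make the difference between these operations concrete, here is a minimal sketch in plain Python. It is not the Splice Machine API: the `ToyTable` class, its method names, and the `id`/`temp` columns are hypothetical, a list of dicts stands in for a DataFrame, and details such as silently skipping unknown keys on update are assumptions of this toy model rather than documented connector behavior.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


class ToyTable:
    """Hypothetical in-memory table keyed by a single primary-key column."""

    def __init__(self, key: str) -> None:
        self.key = key
        self.rows: Dict[object, Row] = {}

    def insert(self, batch: Iterable[Row]) -> None:
        # Insert: every key must be new; otherwise reject the whole batch.
        batch = list(batch)
        keys = [r[self.key] for r in batch]
        if len(set(keys)) != len(keys) or any(k in self.rows for k in keys):
            raise KeyError("duplicate primary key")
        for r in batch:
            self.rows[r[self.key]] = dict(r)

    def update(self, batch: Iterable[Row]) -> None:
        # Update: change only rows that already exist (toy-model assumption:
        # rows with unknown keys are skipped).
        for r in batch:
            if r[self.key] in self.rows:
                self.rows[r[self.key]].update(r)

    def upsert(self, batch: Iterable[Row]) -> None:
        # Upsert: update the row if its key exists, insert it otherwise.
        for r in batch:
            self.rows.setdefault(r[self.key], {}).update(r)

    def delete(self, batch: Iterable[Row]) -> None:
        # Delete: remove the rows whose keys appear in the batch.
        for r in batch:
            self.rows.pop(r[self.key], None)

    def query(self, predicate: Callable[[Row], bool]) -> List[Row]:
        # Query: return copies of the matching rows.
        return [dict(r) for r in self.rows.values() if predicate(r)]


table = ToyTable(key="id")
table.insert([{"id": 1, "temp": 20.5}, {"id": 2, "temp": 19.0}])
table.update([{"id": 2, "temp": 21.0}, {"id": 9, "temp": 0.0}])   # id 9 is skipped
table.upsert([{"id": 1, "temp": 22.0}, {"id": 3, "temp": 18.0}])  # 1 updated, 3 inserted
table.delete([{"id": 2}])
hot = table.query(lambda r: r["temp"] > 20)
```

In this sketch the practical distinction is that update touches only existing rows, while upsert also adds rows whose keys are not yet present.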

Other features include ACID transactions on all CRUD operations, automatic preservation of ACID properties on secondary indexes during CRUD operations, updates that can change any number of columns simultaneously, and result sets returned as lazily evaluated Spark DataFrames, with instructions pipelined through Spark’s RDD structures.

Splice Machine provides the Native Spark DataSource API in Java, Scala, and Python.

For more information about this news, visit www.splicemachine.com.