Oracle Joins the NoSQL Movement

Given Oracle's position as the leading provider of relational database software, it's hardly surprising that the company initially gave little or no credence to the NoSQL movement that emerged in 2009. Indeed, an Oracle white paper from May 2011 concluded with the recommendation to "Go for the tried and true path" and avoid NoSQL databases.

Those of us who have been following Oracle for many years have learned to expect both business and technical savvy from the venerable corporation. So, it's not unexpected to see a company such as Oracle initially dismiss a new technology, only to embrace it as the momentum grows.

NoSQL databases currently represent only a tiny fraction of enterprise database deployments. Nevertheless, Oracle has seen key customers adopt NoSQL solutions and clearly has strong motivation to provide its own NoSQL solution. Consequently, Oracle announced "Oracle NoSQL" at Oracle OpenWorld in October.

Oracle acquired an embedded NoSQL solution, Berkeley DB, from Sleepycat Software in 2006. Ironically, some considered that acquisition a response to another disruptive threat: open source databases such as MySQL. Conveniently, though, Berkeley DB gave Oracle a good foundation on which to build a more scalable nonrelational datastore.

Oracle NoSQL is a distributed key-value store: values are written to nodes in a cluster based on a hash of the key value. Unlike some NoSQL databases, there is no support for a partitioning scheme that allows adjacent keys to be located on the same node. However, Oracle NoSQL supports the concept of major and minor key paths: a major key may have sub-keys that are all stored on the same node. These may be used to optimize the retrieval of master-detail records.
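To make the major and minor key idea concrete, here is a minimal Python sketch. It is an illustration only, not Oracle's API or implementation: the class name `KeyValueCluster`, the function `node_for`, the use of MD5, and the four-node cluster are all assumptions of mine. The one point it tries to capture is that only the major key path is hashed, so every record sharing a major key lands on the same node.

```python
import hashlib


def node_for(major_key, num_nodes):
    """Choose a node from a hash of the major key path only (hypothetical scheme)."""
    digest = hashlib.md5("/".join(major_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


class KeyValueCluster:
    """Toy in-memory cluster; each node is modeled as a plain dict."""

    def __init__(self, num_nodes=4):
        self.nodes = [dict() for _ in range(num_nodes)]

    def put(self, major_key, minor_key, value):
        # The minor key is part of the record's identity but not of its placement.
        node = self.nodes[node_for(major_key, len(self.nodes))]
        node[(major_key, minor_key)] = value

    def get_all(self, major_key):
        # One node holds every sub-key of a major key, so one lookup suffices.
        node = self.nodes[node_for(major_key, len(self.nodes))]
        return {minor: value
                for (major, minor), value in node.items()
                if major == major_key}


cluster = KeyValueCluster(num_nodes=4)
customer = ("customer", "42")
cluster.put(customer, ("profile",), "Ada")
cluster.put(customer, ("orders", "1"), "widget")
cluster.put(customer, ("orders", "2"), "gadget")
records = cluster.get_all(customer)  # master and detail rows from a single node
```

In this sketch a master record (the profile) and its detail records (the orders) are fetched with a single-node read, which is the optimization the major and minor key paths are meant to enable.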

When a write occurs, copies of the data will also be written to other nodes to protect against data loss. This replication can be synchronous or asynchronous; synchronous replication increases write latency but guarantees no data loss in the event of a node failure.
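The trade-off between the two replication modes can be shown with a small simulation. This is a sketch under my own assumptions (the class `ReplicatedStore`, its methods, and the pending queue are hypothetical and do not reflect Oracle NoSQL's internals): a synchronous write is copied to the replicas before the call returns, while an asynchronous write is only queued, so it is lost if the node fails before the queue is flushed.

```python
from collections import deque


class ReplicatedStore:
    """Toy model of one primary node plus replicas (all names are hypothetical)."""

    def __init__(self, num_replicas=2):
        self.primary = {}
        self.replicas = [dict() for _ in range(num_replicas)]
        self.pending = deque()  # asynchronous writes not yet copied to replicas

    def write(self, key, value, synchronous=True):
        self.primary[key] = value
        if synchronous:
            # Drain older queued writes first so replicas see writes in order,
            # then copy this write before returning (the extra latency).
            self.flush()
            for replica in self.replicas:
                replica[key] = value
        else:
            # Return immediately; replication happens later in flush().
            self.pending.append((key, value))

    def flush(self):
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica[key] = value

    def fail_primary(self):
        """Simulate losing the primary and return what survives on a replica."""
        self.primary = {}
        self.pending.clear()  # queued writes die with the failed node
        return dict(self.replicas[0])


store = ReplicatedStore()
store.write("order:1", "paid", synchronous=True)
store.write("order:2", "paid", synchronous=False)
survivors = store.fail_primary()  # only the synchronously replicated write remains
```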

In some NoSQL databases, the location of the latest value may be nondeterministic: the database may need to read multiple nodes to determine the most recent value. Oracle NoSQL, instead, nominates a master node that always holds the most up-to-date value. As a result, the database is immediately rather than eventually consistent.
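The difference between the two read paths can be sketched in a few lines of Python. Everything here is assumed for illustration: the `Versioned` record, the integer version counter, and both function names are mine, and real systems use more elaborate mechanisms to decide which copy is newest. The contrast is simply that a masterless read must poll several nodes and compare versions, whereas a master read consults one node.

```python
from dataclasses import dataclass


@dataclass
class Versioned:
    version: int  # hypothetical monotonically increasing write counter
    value: str


def read_latest_by_polling(nodes, key):
    """Masterless style: ask every node and keep the highest version seen."""
    candidates = [node[key] for node in nodes if key in node]
    return max(candidates, key=lambda record: record.version).value


def read_from_master(master, key):
    """Master style: the master is defined to hold the newest value."""
    return master[key].value


master = {"cart:7": Versioned(version=2, value="3 items")}
lagging_replica = {"cart:7": Versioned(version=1, value="2 items")}

polled = read_latest_by_polling([lagging_replica, master], "cart:7")
direct = read_from_master(master, "cart:7")
stale = lagging_replica["cart:7"].value  # what a single replica read could return
```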

Oracle offers Oracle NoSQL as stand-alone software, or in the upcoming "big data appliance." The big data appliance is delivered as a rack of 18 servers, each with 12 cores, 12x2TB of disk, and 48GB of RAM. To save you the math, that's a total of 432TB of storage across 216 disk spindles, 216 cores, and 864GB of RAM. The big data appliance also includes a copy of Apache Hadoop, and I suspect that Hadoop, not Oracle NoSQL, will be the main market driver for the appliance.

With NoSQL at a fairly early level of maturity, it's typical for some key features to be absent in an initial release, and Oracle NoSQL is no exception. The most surprising limitation in the initial release is undoubtedly the fixed cluster size: unlike virtually every other distributed key-value store, Oracle NoSQL currently supports only a fixed number of nodes. You cannot add or remove nodes from your cluster.
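Returning to the big data appliance for a moment: the rack totals follow from simple multiplication of the per-server figures quoted above. A quick sanity check in Python, with constant and function names of my own choosing:

```python
# Per-server figures as quoted for the big data appliance rack.
SERVERS = 18
CORES_PER_SERVER = 12
DISKS_PER_SERVER = 12
TB_PER_DISK = 2
RAM_GB_PER_SERVER = 48


def rack_totals():
    """Multiply the per-server figures out to whole-rack totals."""
    disks = SERVERS * DISKS_PER_SERVER
    return {
        "cores": SERVERS * CORES_PER_SERVER,
        "disk_spindles": disks,
        "storage_tb": disks * TB_PER_DISK,
        "ram_gb": SERVERS * RAM_GB_PER_SERVER,
    }


totals = rack_totals()
# 216 cores, 216 disk spindles, 432 TB of raw storage, 864 GB of RAM
```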

This lack of elastic scalability in the initial release will probably be a significant barrier to widespread adoption. However, Oracle's community and customer base is massive. For many Oracle customers, Oracle's endorsement of NoSQL will provide the impetus to consider nonrelational solutions. Almost by definition, the largest database vendor in the world introducing a NoSQL solution legitimizes NoSQL as an enterprise technology.