Why NoSQL?

NoSQL - probably the hottest term in database technology today - was unheard of only a year ago.  And yet, today, there are literally dozens of database systems described as "NoSQL."  How did all of this happen so quickly?

Although the term "NoSQL" is barely a year old, in reality, most of the databases described as NoSQL have been around a lot longer than the term itself.  Many databases described as NoSQL arose over the past few years as reactions to strains placed on traditional relational databases by two other significant trends affecting our industry:  big data and cloud computing.

Of course, database volumes have grown continuously since the earliest days of computing, but that growth has intensified dramatically over the past decade as databases have been tasked with accepting data feeds from customers, the public, point-of-sale devices, GPS, mobile devices, RFID readers and so on.

Cloud computing has also placed new demands on the database.  The economic vision for cloud computing is to provide computing resources on demand with a "pay-as-you-go" model.  A shared pool of computing resources can exploit economies of scale and level out variable demand by adding or removing resources as the workload changes.  The traditional RDBMS has been unable to provide these types of elastic services.

The demands of big data and elastic provisioning call for a database that can be distributed across large numbers of hosts spread over a widely dispersed network.  While commercial relational databases - such as Oracle's RAC - have taken steps to meet this challenge, it has become apparent that some of the fundamental characteristics of relational databases are incompatible with the demands of elasticity and big data.

Ironically, the demand for NoSQL did not come about because of problems with the SQL language.  The demand arose because of the strong consistency and transactional integrity of the relational database.  In a transactional relational database, all users see an identical view of data.  In 2000, however, Eric Brewer outlined the now famous CAP theorem, which states that Consistency and high Availability cannot both be maintained when a database is Partitioned across a fallible wide area network.
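
The trade-off can be made concrete with a toy model.  The sketch below is an illustration only, not how any real database is built: it assumes just two replicas, and the names TwoNodeStore, "CP" and "AP" are hypothetical labels invented for this example.  While the network is healthy, every write reaches both replicas.  Once the replicas are partitioned, the store must either refuse the write (keeping consistency, losing availability) or accept it on one side only (keeping availability, losing consistency).

```python
class Replica:
    """One copy of the data, held on a single host."""

    def __init__(self):
        self.data = {}


class TwoNodeStore:
    """Toy two-replica store. `mode` is "CP" or "AP" (hypothetical labels)."""

    def __init__(self, mode):
        self.mode = mode
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False  # True when the replicas cannot reach each other

    def write(self, replica, key, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent by refusing the write: availability is lost.
                raise RuntimeError("unavailable: cannot reach the other replica")
            # Stay available by writing locally: the replicas now disagree.
            replica.data[key] = value
            return
        # Healthy network: apply the write to both copies.
        replica.data[key] = value
        other.data[key] = value

    def read(self, replica, key):
        return replica.data.get(key)


if __name__ == "__main__":
    store = TwoNodeStore("AP")
    store.write(store.a, "balance", 100)
    store.partitioned = True
    store.write(store.a, "balance", 80)
    # The two replicas now return different answers for the same key.
    print(store.read(store.a, "balance"), store.read(store.b, "balance"))
```

Real systems sit between these extremes, for example by reconciling divergent copies once the partition heals, but the basic choice is the one shown here.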

Google, Facebook, Amazon and other huge web sites, therefore, developed non-relational databases that sacrificed consistency for availability and scalability.   It just so happened that these databases didn't support the SQL language either, and, when a group of developers organized a meeting in June 2009 to discuss these non-relational databases, the term "NoSQL" seemed convenient.  Perhaps unfortunately, the term NoSQL caught on beyond expectations, and now is used as shorthand for any non-relational database.

Within the NoSQL zoo, there are several distinct family trees.  Some NoSQL databases are pure key-value stores without an explicit data model, with many based on Amazon's Dynamo key-value store.  Others are heavily influenced by Google's BigTable database, which supports Google products such as Google Maps and Google Reader.  Document databases store highly structured self-describing objects, usually in JSON, a lightweight text format that fills a role similar to XML.  Finally, graph databases store complex relationships such as those found in social networks.
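
To give a rough feel for how these four families differ, the sketch below models the same small piece of data in each style using nothing but ordinary Python structures.  It is a simplification under stated assumptions: the keys, field names and the neighbours helper are all made up for illustration, and no real product's API or storage layout is being shown.

```python
import json

# Key-value: the store sees only an opaque value, fetched by its key.
kv_store = {"user:42": b'{"name": "Ada", "city": "London"}'}

# BigTable-style: row key -> column family -> column -> value.
wide_rows = {
    "user:42": {
        "profile": {"name": "Ada", "city": "London"},
        "activity": {"last_login": "2010-06-01"},
    }
}

# Document: a self-describing, nested object, here parsed from JSON.
doc = json.loads('{"_id": 42, "name": "Ada", "addresses": [{"city": "London"}]}')

# Graph: relationships are stored directly as (source, relation, target) edges.
edges = [
    ("Ada", "FOLLOWS", "Grace"),
    ("Grace", "FOLLOWS", "Ada"),
    ("Ada", "FOLLOWS", "Linus"),
]


def neighbours(node, relation):
    """Return the targets of every edge leaving `node` with the given relation."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]


if __name__ == "__main__":
    print(json.loads(kv_store["user:42"])["city"])    # application decodes the value
    print(wide_rows["user:42"]["profile"]["name"])    # addressed by family and column
    print(doc["addresses"][0]["city"])                # nested fields are reachable
    print(neighbours("Ada", "FOLLOWS"))               # traversal follows edges
```

The point of the comparison is where the structure lives: in a key-value store the application must decode the value itself, while the other three families expose progressively more of the data's shape to the database.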

Within these four NoSQL families are at least a dozen database systems of significance.   Some probably will disappear as the NoSQL segment matures, and, right now, it's anyone's guess as to which ones will win, and which will lose.  

NoSQL is a fairly imprecise term - it defines what the databases are not, rather than what they are, and rejects SQL rather than the more relevant strict consistency of the relational model.  As imprecise as the term may be, however, there's no doubt that NoSQL databases represent an important direction in database technology.