Eventual Consistency

Bookmark and Share

The NoSQL acronym suggests it's the SQL language that is the key difference between traditional relational and newer non-relational data stores.  However, an equally significant divergence is in the NoSQL consistency and transaction models.  Indeed, some have suggested that NoSQL databases would be better described as "NoACID" databases - since they avoid the "ACID" transactions of the relational world.

RDBMS transactions are typically described as ACID - Atomic, Consistent, Isolated and Durable.  In a traditional RDBMS, all users see a consistent view of the data.  Once a user commits a transaction, all subsequent queries will include that transaction and nobody will ever see partial results from a transaction.  It's not always easy to maintain this ACID consistency - especially as databases become widely distributed and replicated across multiple data centers.

In 2000, Eric Brewer outlined the "CAP" theorem, which says that a distributed database system cannot simultaneously provide Consistency, Availability and Partition Tolerance, but can satisfy, at most, only two of these guarantees at any one time.   Simplistically speaking, this means that a highly available, widely distributed database must make some compromises when it comes to strict consistency.

One compromise between strict consistency and weak (no guarantees) consistency is Eventual Consistency.  The core of the eventual consistency concept is that while a database may have some inconsistencies at any point in time, it will eventually become consistent when all updates cease: eventually all nodes will receive the latest consistent updates.

Various NoSQL databases treat consistency differently, and many allow the application or administrator to select the degree of consistency. NRW notation describes, at a high level, how a distributed database will trade off consistency, read performance and write performance: 

  • N is the number of copies of each data item that the database will maintain.
  • R is the number of copies that the application will access when reading the data item
  • W is the number of copies of the data item that must be written before the write can complete.

When N=W, the database will always write every copy before returning control to the client - this is more or less what traditional databases do when implementing synchronous replication. If you are more concerned about write performance than read performance, you can set W=1, R=N.  Then each read must access all copies to determine which is correct, but each write only has to touch a single copy of the data.

Most NoSQL databases support N>W>1:  more than one write must complete, but not all nodes need to be updated immediately. Another common setting is where W+R>N, which means a read will always retrieve the latest value, even if it mixed in with "older" values.  This is sometimes referred to as quorum assembly

NoSQL databases generally try hard to be as consistent as possible, even when configured for weaker consistency.  For instance, the read repair algorithm is often implemented to improve consistency when R=1.  Although the application does not wait for all the copies of a data item to be read, the database will read all known copies in the background after responding to the initial request.  If the application asks for the data item again, it will then see the latest version. 

A lot of the eventually consistent concepts are best articulated by Amazon in Werner Vogels' "Eventually Consistent" paper, and in "Dynamo: Amazon's Highly Available Key-value Store," the paper on the Dynamo eventually consistent key-value store.   Dynamo implements most of the ideas above, and is the inspiration for several well known NoSQL datastores, including Voldemort and - together with Google's BigTable specification - Cassandra.

At first glance, NoSQL databases may seem to take a more relaxed attitude toward consistency than their relational counterparts.  However, the underlying implementations are quite sophisticated, and can deliver an acceptable alternative to the RDBMS ACID transaction model for many applications.