Will Transactions Return to NoSQL?

The non-relational database explosion of the late 2000s was motivated by several disenchantments with the relational database model, which had been dominant until then. A key concern for many was the restrictiveness of transactions in the relational model. In relational systems, transactions must be ACID: Atomic, Consistent, Isolated, and Durable. While no one really disputed that ACID transactions represented a desirable ideal, many non-relational database advocates argued that other requirements, particularly for highly available global applications, were incompatible with strict transactional consistency.

The CAP theorem (also known as Brewer’s theorem) states that a distributed system cannot simultaneously be highly available, be strictly consistent, and provide “partition tolerance,” which is the ability to continue operating when a network partition splits the nodes in the system into multiple segments. Because a globally distributed system has to tolerate partitions, the real choice is between availability and strict consistency. Most non-relational databases chose availability and opted for a more relaxed consistency model, widely known as “eventual consistency.” In an eventually consistent system, different users may see slightly different versions of data at any moment, but, eventually, all changes will propagate to all users.
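
The behavior of an eventually consistent system can be illustrated with a small in-memory sketch. Everything here is hypothetical: the Replica class, the sync function, and the "last write wins" rule are assumptions chosen for illustration, and real systems use a variety of reconciliation strategies.

```python
from typing import Dict, Optional, Tuple


class Replica:
    """A hypothetical replica that stores key -> (timestamp, value)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, Tuple[int, str]] = {}

    def write(self, key: str, value: str, timestamp: int) -> None:
        # Assumed rule: keep the value with the newest timestamp.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key: str) -> Optional[str]:
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(a: Replica, b: Replica) -> None:
    """Exchange entries in both directions so the replicas converge."""
    for key, (ts, value) in list(a.data.items()):
        b.write(key, value, ts)
    for key, (ts, value) in list(b.data.items()):
        a.write(key, value, ts)


us = Replica("us")
eu = Replica("eu")
us.write("cart", "book", timestamp=1)
eu.write("cart", "book,pen", timestamp=2)

before = (us.read("cart"), eu.read("cart"))  # replicas disagree
sync(us, eu)
after = (us.read("cart"), eu.read("cart"))   # replicas agree
```

Between the two writes and the call to sync, a user reading from the "us" replica sees a different cart than a user reading from the "eu" replica. That window is what the application developer has to reason about.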

Writing applications that perform correctly when using the various relaxed transaction models places a large burden on the developer, so it’s not surprising that many non-relational systems have introduced some degree of stronger transactional support.

Google, whose BigTable model inspired many non-relational systems including Cassandra and HBase, needed a massively distributed system that provided more robust transactional support. The result was Spanner. Transactions in Spanner are based on a system called TrueTime, in which consistency is achieved by using transaction timestamps synchronized through GPS receivers and atomic clocks. The downside is that each data center must incorporate this specialized time hardware (although atomic clocks have become relatively cheap).

Cassandra and some other non-relational systems allow for a sort of “lightweight” transaction system, which employs some of the algorithms used by Spanner, specifically the “Paxos” consensus protocol. These transactions, though, are severely limited compared to the multi-key and multi-statement ACID transactions of the relational database. In Cassandra, for instance, they amount to conditional (compare-and-set) updates within a single partition.
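
To make the limitation concrete, the following sketch mimics the compare-and-set semantics of a lightweight transaction on a single key. It is an in-memory illustration only: the SingleKeyStore class and its method names are hypothetical, and a local lock stands in for the consensus round that a distributed database would run among its replicas.

```python
import threading
from typing import Dict, Optional


class SingleKeyStore:
    """Hypothetical store offering only single-key conditional writes."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        # Assumption: a local lock stands in for a Paxos round.
        self._lock = threading.Lock()

    def insert_if_not_exists(self, key: str, value: str) -> bool:
        with self._lock:
            if key in self._data:
                return False  # condition failed, nothing written
            self._data[key] = value
            return True

    def update_if_equals(self, key: str, expected: str, new: str) -> bool:
        with self._lock:
            if self._data.get(key) != expected:
                return False  # someone else changed the value first
            self._data[key] = new
            return True

    def get(self, key: str) -> Optional[str]:
        with self._lock:
            return self._data.get(key)


store = SingleKeyStore()
first = store.insert_if_not_exists("user:42", "alice")         # succeeds
second = store.insert_if_not_exists("user:42", "bob")          # rejected
swapped = store.update_if_equals("user:42", "alice", "carol")  # succeeds
stale = store.update_if_equals("user:42", "alice", "dave")     # rejected
```

Each call is atomic for one key, but nothing here lets the application change two keys together or group several statements into one unit, which is the gap the relational transaction fills.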

For the majority of non-relational databases that provide little inherent support for multi-statement transactions, the responsibility for coding business transaction logic falls to the application developer. The “compensating transaction” pattern involves coding logic that, if a step in a logical business transaction fails, undoes all intermediate changes already applied during that transaction. This requires the developer to essentially duplicate the rollback and read consistency logic that the RDBMS handles transparently, which is an onerous and potentially error-prone activity.
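
A minimal sketch of the compensating transaction pattern follows. The function run_with_compensation, the accounts dictionary, and the debit and credit helpers are all hypothetical names introduced for illustration, and the simulated failure stands in for whatever error a real store might raise.

```python
from typing import Callable, Dict, List, Tuple

# Each step pairs an action with the compensation that undoes it.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_with_compensation(steps: List[Step]) -> bool:
    """Run actions in order; on failure, undo completed ones in reverse."""
    completed: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


# Hypothetical data living in a store without multi-key transactions.
accounts: Dict[str, int] = {"alice": 100, "bob": 50}


def debit(name: str, amount: int) -> None:
    if accounts[name] < amount:
        raise ValueError("insufficient funds")
    accounts[name] -= amount


def credit(name: str, amount: int) -> None:
    accounts[name] += amount


def failing_credit(name: str, amount: int) -> None:
    # Simulated outage on the second step of the transfer.
    raise ConnectionError("node unavailable")


ok = run_with_compensation([
    (lambda: debit("alice", 30), lambda: credit("alice", 30)),
    (lambda: failing_credit("bob", 30), lambda: debit("bob", 30)),
])
```

Even this small example shows why the pattern is error-prone. It does nothing to hide the intermediate state from other readers, and it does not address what happens if a compensation step itself fails, both of which an RDBMS would handle on the developer's behalf.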

Given these issues, it’s not surprising to find non-relational systems emerging that provide full support for ACID transactions. For example, FoundationDB is a key-value store that supports multi-key atomic transactions, together with RDBMS-level read consistency. In another twist, Splice Machine is a DBMS that provides ACID transactions and SQL language support while using sharded Hadoop HBase as a scalable storage layer.

The introduction of increased transactional capability into non-relational databases makes sense—in the same way that providing SQL layers on top of Hadoop and many other non-relational stores makes sense. But it does raise the possibility of convergence of relational and non-relational systems. After all, if I take a non-relational database and add SQL and ACID transactions, have I still got a non-relational database, or have I come full circle back to the relational model?

It’s really about choice. The relational model served us well for decades, but its insistence upon ACID transactions made it unable to meet the requirements of today’s massive web applications and the emerging demands of the Internet of Things. Application developers should not be required to manually code all the logic for transactional integrity or for consistent reads, but neither should they be forced to accept strict consistency if it conflicts with the demands for scalability and availability. The databases of the future are going to need to fully support the level of consistency and isolation that is appropriate for the application. The introduction of ACID transactions into non-relational systems might be the first step toward that goal.