How WiredTiger Revolutionized MongoDB

Unlike many other “next-generation” databases, MongoDB was not created by refugees from one of the existing database companies or architects of large-scale cloud infrastructures. Rather, the MongoDB founders were first and foremost developers who aspired to create a more productive database system for fellow developers.

The developer community responded enthusiastically to this developer-centric approach, and MongoDB rapidly won the battle for developer mindshare. Initial versions of MongoDB were easy to deploy and easy to integrate into existing and emerging programming paradigms. But under the hood, the original MongoDB architecture was simplistic when compared to alternatives.

Simplicity in software architecture is mostly a good thing. In particular, developers had come to distrust and avoid the overly complicated, difficult-to-manage enterprise databases. However, the decades of software engineering that went into these enterprise databases were not without merit. At scale, these traditional databases were able to perform at a level that challenged the early releases of MongoDB.

One of the core limitations of the initial MongoDB architecture was a relatively unsophisticated storage engine. A storage engine serves as the interface between the database API—which exposes data as tables or documents—and the underlying disk system. In the first versions of MongoDB, the “MMAP” storage engine allowed the files which contained MongoDB documents to be accessed in memory using operating system “mmap” calls. This implementation was robust and reliable but had significant performance limitations. For instance, MMAP could not provide document-level locking, which meant that simultaneous updates to a collection could create performance bottlenecks.
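To make the idea of memory-mapped file access concrete, here is a minimal Python sketch. It is not MongoDB's implementation; the function name and the temporary file are hypothetical and exist only for illustration. It simply shows how an operating system "mmap" call lets a program read and update a file's contents as if they were an in-memory byte array.

```python
import mmap
import os
import tempfile


def write_and_read_via_mmap(payload: bytes) -> bytes:
    """Write payload to a temp file, then modify and read it through a memory map.

    Hypothetical helper for illustration only; payload must be non-empty,
    because an empty file cannot be mapped.
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "r+b") as f:
            f.write(payload)
            f.flush()
            # Length 0 maps the whole file into the process address space.
            with mmap.mmap(f.fileno(), 0) as view:
                # Slice assignment writes directly into the mapped pages;
                # the replacement must be the same length as the slice.
                view[0:1] = view[0:1].upper()
                # Ask the OS to write the modified pages back to the file.
                view.flush()
                return bytes(view[:])
    finally:
        os.remove(path)
```

Notice that nothing in the mapping itself coordinates concurrent writers. Any locking has to be layered on top by the database, which is where the granularity of that locking becomes a performance concern.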

In 2014, MongoDB announced a “pluggable” storage engine API, which allowed new storage engines to be integrated with the MongoDB database. A variety of alternative storage engines emerged. Facebook provided a RocksDB-based storage engine, and Percona released its TokuMX engine. However, at the end of 2014, MongoDB acquired WiredTiger, which had adapted its storage engine to the MongoDB API. WiredTiger shipped with MongoDB 3.0 in 2015 and became the default storage engine in MongoDB 3.2. Although alternative storage engines are still available for MongoDB, the vast majority of current MongoDB implementations use WiredTiger.

Keith Bostic and Michael Cahill founded WiredTiger in 2010. Keith and Michael were open-source database veterans, having pioneered the open-source BerkeleyDB database, which was acquired by Oracle in 2005. After serving their time at Oracle, they founded WiredTiger to build a next-generation, lock-free, key-value database that could be used as a storage engine both for established database systems such as MySQL and for emerging distributed systems such as Riak. When MongoDB announced its storage engine API, the WiredTiger team immediately saw the opportunity and raced to provide the best solution. The rest, as they say, is history.

It’s hard to overstate the impact WiredTiger technology has had on MongoDB. Within a year, WiredTiger became the basis for MongoDB encryption, compression and an in-memory storage engine. The emergence of transactions in MongoDB 4.0 was made possible by WiredTiger’s internal transaction and snapshot management facilities. Just as importantly, WiredTiger removed many bottlenecks within the MongoDB architecture, allowing MongoDB to succeed in high-end deployments that are increasingly critical to MongoDB’s continuing growth.

As a side effect, the injection of database expertise into MongoDB as a company has helped balance developer-centricity with technical depth. Arguably, MongoDB succeeded by jettisoning decades of relational database baggage. However, with WiredTiger, MongoDB has integrated a core engine that leverages many decades of database research and development.

Cahill now manages MongoDB Labs—a full-time research group looking at the future of MongoDB technology. We’ll be following the adventures of MongoDB Labs with great interest.