Streamlining Business With a NoSQL Document Database

NoSQL databases were born out of the need to scale transactional persistence stores more efficiently. In a world where the relational database management system (RDBMS) was king, this was easier said than done. Many transactional systems have since migrated to NoSQL databases for greater performance at a lower cost, but analytics has not followed: it remains predominantly rooted in the RDBMS and in the tools and technologies that support OLAP.

Over the last five years, NoSQL databases have grown rapidly in popularity, and their capability to support new use cases has grown with them. However, with that capability comes complexity when operating NoSQL databases in support of business-critical operations. Challenges include working out the data access patterns and deciding how to distribute and encode the data so that developers can consume it easily and without much overhead.

As in the early Hadoop days, when people wanted Hadoop to be the tool for every use case, the same pattern was attempted with NoSQL databases. The most popular NoSQL model has generally been key-value, but storing heavily structured documents in a key-value database was never an ideal fit. Key-value databases are extremely fast when looking values up directly by key, but they are not the right database for every use case. Range scans over keys can also be fast; when seeking data in any way that is orthogonal to key lookup, however, finding the relevant data is much more expensive.
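The difference between a lookup by key and a query that is orthogonal to the key can be sketched in a few lines of Python. This is a toy illustration, not any product's API: the `store` dictionary, the `user:` key scheme, and the two helper functions are all hypothetical.

```python
import json

# Hypothetical key-value store: the database sees each value only as an
# opaque string, so it cannot index or filter on anything inside it.
store = {
    "user:1001": json.dumps({"name": "Ada", "city": "London"}),
    "user:1002": json.dumps({"name": "Grace", "city": "New York"}),
    "user:1003": json.dumps({"name": "Alan", "city": "London"}),
}


def get_by_key(key):
    """Direct lookup: a single hash probe, regardless of store size."""
    return json.loads(store[key])


def find_by_city(city):
    """Query orthogonal to the key: every value is decoded and inspected."""
    return [
        key
        for key, raw in store.items()
        if json.loads(raw).get("city") == city
    ]


grace = get_by_key("user:1002")
londoners = find_by_city("London")
```

The first call touches one entry; the second has to walk the entire store, and that cost grows with the amount of data held.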

As NoSQL databases have cemented their place in the industry, document databases have become a dominant choice due to their flexibility to support use cases where the key-value databases are not a perfect fit. These document databases have also shown that they can support many use cases where key-value databases excel.

Document databases do not have a strict definition they must follow for how they work or the APIs they expose. The more popular document databases support JSON or a similar format as their native storage format. Natively integrating JSON capabilities allows users to modify documents at the storage level. This is very different from non-native implementations, where the data can only be modified outside of the storage layer. With native support, you can modify data without first retrieving it, and you can update it in place. These capabilities range from incrementing an integer field, to appending new fields, to performing a global search and replace across all documents.
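A minimal sketch of those three kinds of in-place mutation follows. `TinyDocumentStore` is a hypothetical in-memory stand-in written for this illustration; its method names are assumptions and do not correspond to MongoDB, OJAI, or any other real API. The point is only that the caller describes the change and the store applies it, without the document being fetched first.

```python
import copy


class TinyDocumentStore:
    """Toy stand-in for a document database (hypothetical, in-memory only)."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc_id, doc):
        self._docs[doc_id] = copy.deepcopy(doc)

    def get(self, doc_id):
        return copy.deepcopy(self._docs[doc_id])

    def increment(self, doc_id, field, amount=1):
        # Applied inside the store: the caller never reads the document.
        doc = self._docs[doc_id]
        doc[field] = doc.get(field, 0) + amount

    def set_field(self, doc_id, field, value):
        # Appends the field if it is missing, overwrites it otherwise.
        self._docs[doc_id][field] = value

    def replace_everywhere(self, field, old, new):
        """Search/replace across all documents; returns how many changed."""
        changed = 0
        for doc in self._docs.values():
            if doc.get(field) == old:
                doc[field] = new
                changed += 1
        return changed


db = TinyDocumentStore()
db.insert("p1", {"name": "widget", "stock": 5, "status": "new"})
db.insert("p2", {"name": "gadget", "stock": 2, "status": "new"})

db.increment("p1", "stock", 3)            # increment an integer field
db.set_field("p2", "color", "red")        # append a new field
updated = db.replace_everywhere("status", "new", "active")  # global replace
```

In a non-native implementation, each of these calls would instead be a read, a client-side edit, and a full write of the document back to storage.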

MongoDB has become one of the most popular document databases, in large part due to its easy-to-use API. More recently, the release of the Open JSON Application Interface (OJAI) has given developers a standard, open source API for working with JSON data. The key details to watch for when evaluating a document database are its ability to scale, its data protection, and its integration into the ecosystem for both querying and analytics.

When a developer uses a document database to support transactional workloads and persistence, storing and retrieving data takes far less work. Transactional applications can store or retrieve data with as little as one line of code in each case, because in-memory data structures are easy to serialize into complex JSON documents and to deserialize back again. This is a compelling proposition when compared to an RDBMS, where most developers use a framework such as Hibernate for object-relational mapping, which takes a lot of code to support. With fewer moving parts, the effort required from QA is also substantially reduced, which translates to faster time to market.
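As a rough illustration of the one-line store and one-line retrieve, the sketch below serializes a nested object to JSON using only the Python standard library. The `Order` and `LineItem` classes and the `documents` dictionary (standing in for a document collection) are hypothetical; a real document database client would replace the dictionary, and its exact calls would differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class LineItem:
    sku: str
    quantity: int


@dataclass
class Order:
    order_id: str
    customer: str
    items: list = field(default_factory=list)


# Hypothetical stand-in for a document collection, keyed by document ID.
documents = {}

order = Order("o-42", "Ada", [LineItem("sku-1", 2), LineItem("sku-2", 1)])

# Store: one line turns the nested object graph into a JSON document.
documents[order.order_id] = json.dumps(asdict(order))

# Retrieve: one line turns the JSON document back into nested structures.
loaded = json.loads(documents["o-42"])
```

No mapping layer or schema definition is involved: the nested list of line items travels with the order as part of the same document.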

Beyond the development benefits of reduced time to deployment and simplified testing, developers also benefit from being able to run operations and analytics on the same data. When leveraging OJAI, the data is stored in JSON, so there’s no need to transform it to another model. Not all document databases support performing operations and analytics in the same place. MongoDB, for example, relies on a connector that duplicates all of its data into a system such as Hadoop to support analytics. OJAI, by contrast, supports running distributed computations such as Hadoop MapReduce, Spark, and Drill in place, without the need to duplicate the data.

Document databases are now part of the Hadoop ecosystem and are worth a close look for your business use cases. They offer many benefits over a traditional RDBMS. While they may not be the solution to every use case, they should not be overlooked, as they can save you time and money. Document databases are the future of database technology.


