Getting the Most out of Document-Based Databases

Document-based databases are growing in relevance thanks to their optimizations for performance and scalability, but getting real value from them depends on adopting practices suited to your enterprise’s particular needs. Document databases are powerful on their own; solid architecture practices make them more so.

DBTA hosted a webinar, titled “Architecture Best Practices: Scaling Native JSON Workloads With Amazon DocumentDB,” sponsored by AWS, featuring Cody Allen, senior DocumentDB specialist solutions architect at Amazon. Allen led an in-depth discussion of the origins of document databases and their place in modern enterprise architectures. Walking through the methods and use cases for Amazon DocumentDB (with MongoDB compatibility), he explained why using document databases effectively is critical to achieving good business outcomes.

Driven by MongoDB’s JSON-centered innovation in August 2009, document databases gave developers an easier way to store and query data, using the same document model format that they use in their application code. Making JSON the native format of the database allowed for rich manipulation within the database itself, and its flexible, semi-structured nature let the data model evolve along with an application’s needs rather than in spite of them. Because the JSON document model does not enforce a fixed schema, developers no longer had to modify an entire database or reformat existing data to add a field, relieving significant pressure on developer workloads.
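As a rough illustration of that flexibility, the sketch below models a collection as a plain Python list of dictionaries rather than using a real database driver. The collection, the field names, and the `preferred_language` helper are all hypothetical and were not shown in the webinar; the point is only that a new field can appear on new documents without touching the older ones.

```python
import json

# Hypothetical "profiles" collection, modeled as a plain list of dicts
# (no database driver involved; this only illustrates the document model).
profiles = [
    {"_id": 1, "name": "Ana", "email": "ana@example.com"},
    {"_id": 2, "name": "Ben", "email": "ben@example.com"},
]

# A new requirement arrives: record a preferred language for new users.
# Only the new document carries the field; existing documents are untouched.
profiles.append(
    {"_id": 3, "name": "Chen", "email": "chen@example.com", "language": "zh"}
)


def preferred_language(doc, default="en"):
    """Read an optional field, falling back when older documents lack it."""
    return doc.get("language", default)


for doc in profiles:
    print(json.dumps(doc), "->", preferred_language(doc))
```

The trade-off is that the application, not the database, becomes responsible for handling documents that lack the newer field.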

Enter Amazon DocumentDB, a cloud-native document database built from the ground up and backed by Amazon’s extensive work on database platforms. Allen focused on the ways in which Amazon DocumentDB embraces flexibility and change, noting that the platform’s key strategy is the separation of compute and storage. Customers are no longer forced to scale the entire database as a single unit; compute and storage can be scaled independently. Amazon DocumentDB automatically grows or shrinks storage volume in accordance with your enterprise’s needs, with the storage layer being fully managed. This separation, Allen says, brings improved performance, availability, and scalability, without having to worry about managing the underlying infrastructure. There are no upfront investments, and users pay only for what they use, which reduces resource waste.

So, if JSON integration is built for efficiency and Amazon DocumentDB takes an adaptive approach to database management, is there still a reason to adopt better architecture practices? According to Allen, absolutely. While Amazon DocumentDB can accommodate a variety of use cases, ranging from gaming user profiles to object tracking, how a document database is designed and used determines whether it delivers optimal performance. For Allen, these methods break down into a few general practices:

  • Design your schema based on read/write access patterns particular to your application.
  • Favor embedding, unless there’s a reason not to. If an object in a document is needed on its own, then there’s a good reason not to embed it.
  • Avoid joins/$lookups, if possible. Use them if they can provide a better schema for your use case.
  • Arrays should not grow without bound.
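The practices above can be sketched with plain Python dictionaries standing in for documents. Everything here is an assumption made for illustration: the order, customer, and device documents, their field names, and the cap of three events per bucket are hypothetical, and splitting a growing array into fixed-size "bucket" documents is one common way to keep arrays bounded, not necessarily the approach Allen described.

```python
# Embedding: the shipping address and line items are always read together
# with the order, so they live inside the order document.
order = {
    "_id": "order-1001",
    "customer_id": "cust-42",  # reference, because customers are needed on their own
    "shipping_address": {"city": "Seattle", "postal_code": "98101"},
    "items": [{"sku": "A-1", "qty": 2}, {"sku": "B-7", "qty": 1}],
}

# Referencing: the customer is read and updated independently of any order,
# which is a good reason not to embed it.
customer = {"_id": "cust-42", "name": "Ana"}

MAX_EVENTS_PER_BUCKET = 3  # assumed cap, chosen small for the example


def append_event(buckets, device_id, event, cap=MAX_EVENTS_PER_BUCKET):
    """Append an event to the newest bucket, opening a new one when it is full."""
    last = buckets[-1] if buckets else None
    if last is None or len(last["events"]) >= cap:
        last = {"device_id": device_id, "seq": len(buckets), "events": []}
        buckets.append(last)
    last["events"].append(event)
    return buckets


# Seven readings for one device end up spread over three bounded documents.
buckets = []
for reading in range(7):
    append_event(buckets, "device-9", {"reading": reading})

print([len(b["events"]) for b in buckets])  # prints [3, 3, 1]
```

No single document's array grows past the cap, so document size stays predictable while the address and items that are always read with an order remain a single read away.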

Ultimately, these practices rely on effective schema design to set your document database up for success. They focus on addressing the specific needs of the enterprise and on avoiding the resource costs incurred by unnecessary use of tools. The reality is that any platform or system employed by an enterprise can only be as effective as the people using it; staying constantly aware of the needs of your enterprise is critical to getting the most out of document databases, according to Allen.

You can view an archived version of this webinar here.