



Sponsored Content: Migrating from MongoDB to Amazon DocumentDB


Organizations focus the majority of their database migration efforts on a single task: synchronizing data from production to their new target database. The migration goal is to have a perfect copy of the production data in the replacement database so that the cutover fits in as small a maintenance window as possible.

While data migration is a critical step in the overall migration project plan, it shouldn’t consume the majority of an organization’s resources. There are four key areas to consider when planning a migration from MongoDB to Amazon DocumentDB (with MongoDB compatibility):

  • Compatibility—Does the application work as-is or are changes needed?
  • Sizing—How large does the Amazon DocumentDB cluster need to be?
  • Testing—Does the application run correctly? Is it fast enough? How does it behave under load?
  • Data Migration—Can the data be moved without disrupting the production system? Does the time to cutover fit an acceptable maintenance window?



These tasks can be performed in parallel, and the results of the testing phase inform the compatibility and sizing phases, possibly forcing the Amazon DocumentDB cluster to be resized and retested.

Compatibility

Migrating to a new database platform can require changes to your application. It is imperative to understand which changes are required and what resources are needed to develop and test them. Perform this step first, before spending time in the other areas. If you prioritize other areas first, you could enter a circular dependency, where additional testing requires a copy of your data. In that situation, focus on copying the data quickly rather than on the final data migration architecture.

Amazon DocumentDB is compatible with MongoDB 3.6 and 4.0 drivers and tools. The vast majority of the applications, drivers, and tools that organizations already use with MongoDB can be used with Amazon DocumentDB with little or no change. Amazon DocumentDB emulates the responses that a client expects from a MongoDB server by implementing the Apache 2.0 open source MongoDB 3.6 and 4.0 APIs on a purpose-built, distributed, fault-tolerant, self-healing storage system. That design gives organizations the performance, scalability, and availability they need when operating mission-critical MongoDB workloads at scale.

A compatibility tool is available to help determine whether your existing MongoDB workload is compatible with Amazon DocumentDB, and specific compatibility details are documented here.
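As a rough illustration of what a compatibility review involves, the sketch below walks a query or aggregation pipeline and reports any `$`-prefixed operators that are not in a caller-supplied allow list. It is not the AWS compatibility tool, and the function names and the `supported` set are hypothetical; the real list of supported operators should come from the Amazon DocumentDB compatibility documentation.

```python
from typing import Any, Iterable, Set


def find_operators(query: Any) -> Set[str]:
    """Recursively collect every dictionary key that starts with '$'."""
    found: Set[str] = set()
    if isinstance(query, dict):
        for key, value in query.items():
            if isinstance(key, str) and key.startswith("$"):
                found.add(key)
            found |= find_operators(value)
    elif isinstance(query, (list, tuple)):
        for item in query:
            found |= find_operators(item)
    return found


def unsupported_operators(query: Any, supported: Iterable[str]) -> Set[str]:
    """Return the operators used in `query` that are missing from `supported`."""
    return find_operators(query) - set(supported)


if __name__ == "__main__":
    # Placeholder allow list for illustration only; it does NOT reflect
    # what Amazon DocumentDB actually supports.
    supported = {"$match", "$gt", "$group", "$sum"}
    pipeline = [
        {"$match": {"age": {"$gt": 30}}},
        {"$hypotheticalOp": {"field": "value"}},  # made-up operator
    ]
    print(sorted(unsupported_operators(pipeline, supported)))
```

Running a check like this over the queries your application issues gives an early list of code paths to review before any data is copied.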

Sizing

There is no perfect way to size a new Amazon DocumentDB cluster based solely on metrics from the existing MongoDB deployment. However, a reasonable starting point can be calculated given enough data and the lessons learned from past sizing attempts. In the end, sizing still requires extensive testing to ensure a proper deployment. Sizing determines not only the optimal infrastructure for the deployment but also how much it will cost.

Sizing is informed by the following activities:

  • Existing Database Infrastructure—MongoDB server specifics (CPU, RAM, Networking, IO) and utilization of those resources (CPU, free memory, receive/transmit bandwidth, average and peak read/write IO). This information provides a starting point for your Amazon DocumentDB cluster size.
  • Data Details—By collection, the number of documents, average document size, number of indexes, index sizes, insert/update/delete/query per second or minute.
  • Application Service Level Agreements (SLAs)—Application critical data change or query response time requirements.

The Amazon DocumentDB sizing calculator is available here to assist with a starting point for your migration.
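The per-collection data details listed above can be rolled up into a few totals before they are fed into a sizing exercise. The following is a minimal sketch under stated assumptions: the `CollectionStats` fields and the `summarize` function are hypothetical names, the inputs are gathered by you from the existing MongoDB deployment, and the output is only raw totals, not an instance recommendation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CollectionStats:
    """Per-collection inputs gathered from the existing deployment."""
    name: str
    document_count: int
    avg_document_bytes: int
    total_index_bytes: int
    writes_per_second: float
    reads_per_second: float


def summarize(collections: List[CollectionStats]) -> Dict[str, float]:
    """Aggregate data size, index size, and operation rates across collections."""
    data_bytes = sum(c.document_count * c.avg_document_bytes for c in collections)
    index_bytes = sum(c.total_index_bytes for c in collections)
    gib = 2 ** 30
    return {
        "data_gib": data_bytes / gib,
        "index_gib": index_bytes / gib,
        "total_gib": (data_bytes + index_bytes) / gib,
        "writes_per_second": sum(c.writes_per_second for c in collections),
        "reads_per_second": sum(c.reads_per_second for c in collections),
    }


if __name__ == "__main__":
    # Illustrative numbers only.
    stats = [
        CollectionStats("orders", 5_000_000, 2_048, 3 * 2 ** 30, 250.0, 1_200.0),
        CollectionStats("customers", 800_000, 1_024, 2 ** 29, 20.0, 400.0),
    ]
    for key, value in summarize(stats).items():
        print(f"{key}: {value:.2f}")
```

These totals are a starting point only; as noted above, the cluster size still has to be confirmed through testing.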

Testing

After reviewing compatibility and sizing the cluster, perform a one-off data migration and begin testing. While there are several different testing methodologies (for example, automated versus manual), the testing stage breaks down into three major focus areas:

  • Correctness—Ensure that the application behaves correctly on the new database platform. Always start here, because your application needs to pass all of your correctness tests before performance matters.
  • Performance—Your application likely has KPIs and metrics around the number of CRUD operations per second, and this step ensures the new database platform meets or exceeds those requirements. If Amazon DocumentDB performs well above or below your KPIs, resize the cluster (down or up, respectively) and repeat your performance testing.
  • Load—If Amazon DocumentDB passes all your performance testing, it is beneficial to know how it will behave under worst-case peak loads while still in your test environment. The results of these tests won’t necessarily mean that you need to increase the size of your cluster: Amazon DocumentDB can scale vertically by increasing the size of your primary node or horizontally by adding up to 15 read replicas in your primary AWS Region. Amazon DocumentDB Global Clusters support up to five additional regional read-replica clusters, each with up to 16 instances.
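The performance step can be made repeatable with a small check that compares one test run against your KPIs. This is a minimal sketch: `check_kpis`, its parameters, and the thresholds are hypothetical, and it assumes your test harness records one latency sample per completed operation along with the run duration. The percentile uses the nearest-rank method.

```python
import math
from typing import Dict, List


def percentile(samples: List[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_kpis(
    latencies_ms: List[float],
    duration_seconds: float,
    min_ops_per_second: float,
    max_p95_ms: float,
) -> Dict[str, object]:
    """Compare measured throughput and p95 latency with KPI thresholds."""
    ops_per_second = len(latencies_ms) / duration_seconds
    p95_ms = percentile(latencies_ms, 95)
    return {
        "ops_per_second": ops_per_second,
        "p95_ms": p95_ms,
        "passed": ops_per_second >= min_ops_per_second and p95_ms <= max_p95_ms,
    }


if __name__ == "__main__":
    # Illustrative samples; real runs would have many more operations.
    samples = [12.0, 15.0, 9.0, 30.0, 11.0, 14.0, 13.0, 10.0, 45.0, 16.0]
    print(check_kpis(samples, duration_seconds=2.0,
                     min_ops_per_second=4.0, max_p95_ms=50.0))
```

Running the same check after each resize makes it easier to see whether a change in cluster size moved the results toward or away from the KPIs.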

Data Migration

All migrations from MongoDB to Amazon DocumentDB require a proven mechanism to accurately synchronize data and allow the cutover to Amazon DocumentDB in a small maintenance window. AWS enables this migration path, and many other sources and targets, via AWS Database Migration Service (DMS). AWS DMS is an easy-to-use tool for one-time and ongoing database migrations with two key components:

  • DMS Full Load—You can migrate one or more MongoDB databases or specific collections via the DMS console. To achieve the best possible migration performance, the full load can synchronize multiple collections simultaneously and parallelize the load of a single collection using the DMS Segmenting functionality. Full documentation on MongoDB as a DMS source is available here.
  • DMS Change Data Capture (CDC)—After the initial point-in-time synchronization of the data is complete, DMS keeps the Amazon DocumentDB cluster in sync with the MongoDB deployment. This CDC process uses the Replica Set Oplog on MongoDB 3.x and Change Streams on MongoDB 4.x to ensure all changes are mirrored to Amazon DocumentDB.
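Selecting specific collections for a DMS task is expressed as a table-mapping document made up of selection rules. The sketch below generates such a document for a list of collections; it assumes the usual DMS convention that a MongoDB database maps to `schema-name` and a collection maps to `table-name`. The helper name `build_table_mappings` and the sample database and collection names are hypothetical, and segmenting settings are left out.

```python
import json
from typing import List


def build_table_mappings(database: str, collections: List[str]) -> str:
    """Build a DMS table-mapping JSON document with one include rule per collection."""
    rules = []
    for rule_id, collection in enumerate(collections, start=1):
        rules.append({
            "rule-type": "selection",
            "rule-id": str(rule_id),      # must be unique within the document
            "rule-name": str(rule_id),
            "object-locator": {
                "schema-name": database,   # MongoDB database
                "table-name": collection,  # MongoDB collection
            },
            "rule-action": "include",
        })
    return json.dumps({"rules": rules}, indent=2)


if __name__ == "__main__":
    # Hypothetical database and collection names.
    print(build_table_mappings("shop", ["orders", "customers"]))
```

The resulting JSON can be pasted into the table-mappings editor when creating the task, and the same document can be reused for the full-load and CDC phases.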

You can begin the data synchronization process at any time prior to your final cutover window and monitor the progress of the full load and CDC tasks using the DMS console.

Considering a Migration?

All migrations from MongoDB to Amazon DocumentDB should follow the compatibility, sizing, testing, and data migration pattern. The time investment will pay dividends, including a performant application, a properly sized cluster, cost-efficient infrastructure, and a predictable cutover.

A good starting point to learn more about Amazon DocumentDB is the Getting Started Guide. Additionally, AWS DMS is free when migrating to Amazon DocumentDB. Details are available here.

