By Mahee Gunturu, Sr. Solutions Architect, AWS In-Memory Databases
The world is increasingly online, and online is increasingly real-time. This presents interesting performance challenges for databases, and Amazon MemoryDB for Redis is purpose-built to address them. According to a recent statistic, more than 5.47 billion people have access to the Internet, which is approximately 69% of the world’s population. This number has increased by a staggering 1.1 billion since 2019, meaning that approximately 12% of the world’s population has come online for the first time in that period. Even more striking, 90% of them access the Internet from smartphones, tablets, or other mobile devices, and this share is expected to remain high as mobile broadband speeds increase along with 5G penetration globally.
This has led to a rapid proliferation of real-time applications as businesses adapt their strategies to changing market conditions and customer demands. Modern real-time applications require extreme performance, global scale, and constant availability. The cloud has catered to these needs by offering affordable and flexible options for enterprises. The cost of storage and compute has come down dramatically, and as a result businesses can spend less while collecting more data than ever before. Enterprises are leveraging real-time data from financial transactions at the point of sale, Internet of Things (IoT) devices, mobile devices, edge devices, and social media to better understand their customers and personalize their service. Business leaders are eager to gain this real-time responsiveness from their IT infrastructure to keep up with the industry and gain a competitive advantage. This has created the need for advanced system architectures, as traditional databases fall short of the responsiveness, throughput, concurrency, and price/performance required for real-time applications.
In-memory Speed With Strong Consistency and Durability
To address the need for real-time applications, enterprises are innovating rapidly using cloud-native architectures with loosely coupled microservices, fully managed databases, and DevOps practices for continuous deployment. Developers are increasingly employing microservices to break down application functionality into bite-sized chunks, and instead of relying on a one-size-fits-all database, each microservice adopts the database that is best suited to its needs. As Werner Vogels, CTO of Amazon.com, said, “Seldom can one database fit the needs of multiple distinct use cases,” and developers are now building highly distributed applications using a multitude of purpose-built databases.
Amazon Web Services (AWS) has a broad portfolio of fully managed, purpose-built databases, and among these is our in-memory database offering, MemoryDB. It is a primary database for applications that require ultra-fast performance, strong consistency, high availability, and durable storage. MemoryDB is built on open source (OSS) Redis, providing a fast, flexible, and developer-friendly database that enterprises can use to quickly operationalize their data at in-memory speeds and address their customers’ needs. The database can process up to 13 trillion requests per day, an average of roughly 150 million requests per second, with peaks of over 160 million requests per second. It is fully compatible with open source Redis APIs, so customers can leverage the Redis data structures and commands they may already be familiar with. While storing and retrieving data from an in-memory system like OSS Redis can be extremely fast, typically a microsecond-scale operation, OSS Redis on its own does not provide strong durability and consistency guarantees. In this blog post, we explore how MemoryDB, an in-memory database with a unique architecture, achieves strong consistency and durability for Redis workloads.
In MemoryDB, we use a separate Multi-AZ distributed transaction log to provide durability instead of writing to the local storage of the Redis nodes. This achieves durability, replication, and strong consistency while also separating these concerns from the in-memory system. The primary node acknowledges the execution of a command only after it has been committed to the Multi-AZ transaction log. Once a commit is acknowledged, the corresponding updates are propagated asynchronously to the replicas. In this architecture, replicas help both with scaling read throughput and with providing high availability in the case of a failover. Reads are served from local memory at microsecond speed and writes complete in as little as 3-4 ms, making it the fastest database in the AWS portfolio. Customers can choose to execute strongly consistent read commands against primary nodes or eventually consistent read commands against replicas.
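To make the write and read paths described above more concrete, here is a minimal Python sketch. It is a toy model, not MemoryDB's implementation: the class names `TransactionLog`, `Primary`, and `Replica`, and the `catch_up` method, are hypothetical and introduced purely for illustration. The durable Multi-AZ log is stood in for by an in-memory list, and asynchronous propagation is modeled by calling `catch_up` explicitly.

```python
class TransactionLog:
    """Hypothetical stand-in for the durable Multi-AZ transaction log."""

    def __init__(self):
        self.entries = []

    def append(self, entry):
        # A real log would return only once the entry is durable across
        # Availability Zones; in this toy model it is a plain list append.
        self.entries.append(entry)
        return len(self.entries) - 1  # offset of the committed entry


class Primary:
    """Serves writes and strongly consistent reads."""

    def __init__(self, log):
        self.log = log
        self.data = {}

    def set(self, key, value):
        self.log.append((key, value))  # 1. commit to the log
        self.data[key] = value         # 2. apply to local memory
        return "OK"                    # 3. acknowledge only after the commit

    def get(self, key):
        return self.data.get(key)


class Replica:
    """Serves eventually consistent reads; may lag behind the primary."""

    def __init__(self, log):
        self.log = log
        self.data = {}
        self.applied = 0  # number of log entries applied so far

    def catch_up(self):
        # Models asynchronous propagation: apply any entries not yet seen.
        for key, value in self.log.entries[self.applied:]:
            self.data[key] = value
        self.applied = len(self.log.entries)

    def get(self, key):
        return self.data.get(key)


log = TransactionLog()
primary = Primary(log)
replica = Replica(log)

primary.set("cart:42", "3 items")
stale = replica.get("cart:42")      # replica has not caught up yet
fresh = primary.get("cart:42")      # primary reflects the acknowledged write
replica.catch_up()
converged = replica.get("cart:42")  # replica now agrees with the primary
```

In this sketch, a read against the replica immediately after the write returns nothing, while the same read against the primary returns the new value. That is the trade-off between eventually consistent and strongly consistent reads described above.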
MemoryDB offloads one additional responsibility from the core execution engine: snapshots. Open source Redis supports snapshots so that state can be recovered from storage when needed. Taking snapshots of open source Redis clusters is usually very memory-intensive, because it involves creating a background write process that needs enough memory and CPU to accommodate the process overhead. As a result, it is important to reserve a significant amount of memory on open source Redis clusters for snapshots. For a MemoryDB cluster, snapshot functionality is offloaded from the Redis nodes using the Multi-AZ transaction log. Although system state can be restored directly from the transaction log, doing so would be slow and would consume additional storage. We therefore designed a process that compacts the Multi-AZ transaction log by generating a snapshot offline. Generating snapshots this way also removes the burden from the running Redis clusters, dedicating more RAM to customer workloads and maintaining steady performance without CPU spikes.
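The idea of compacting a log into a snapshot can be illustrated with a short Python sketch. This is an assumption-laden toy, not the actual MemoryDB process: the functions `compact` and `restore`, the `(key, value)` entry format, and the use of `None` to mark a deletion are all hypothetical choices made for this example.

```python
def compact(snapshot, snapshot_offset, log_entries):
    """Fold log entries from snapshot_offset onward into a new snapshot.

    Returns (new_snapshot, new_offset). It reads only the previous snapshot
    and the log, so it could run away from the nodes serving traffic.
    """
    state = dict(snapshot)
    for key, value in log_entries[snapshot_offset:]:
        if value is None:          # hypothetical convention: None = delete
            state.pop(key, None)
        else:
            state[key] = value
    return state, len(log_entries)


def restore(snapshot, snapshot_offset, log_entries):
    """Rebuild node state: load the snapshot, then replay the log suffix."""
    state, _ = compact(snapshot, snapshot_offset, log_entries)
    return state


# Four writes, two of which are superseded and one of which is a delete.
log = [("a", 1), ("b", 2), ("a", 3), ("b", None)]
snapshot, offset = compact({}, 0, log)

# A new write arrives after the snapshot was taken.
log.append(("c", 5))

from_snapshot = restore(snapshot, offset, log)  # replays one entry
from_scratch = restore({}, 0, log)              # replays all five entries
```

Both restore paths arrive at the same state, but restoring from the snapshot replays only the entries written after it. This is why replaying the full log is the slower option, and why entries already covered by a snapshot no longer need to be retained.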