MinIO Releases MemKV, a Purpose-Built Context Memory Store Delivering Microsecond Context Retrieval at Petascale


MinIO, the data foundation for enterprise AI and analytics, is releasing MemKV, a context memory store that delivers microsecond context retrieval at petabyte scale for agentic AI inference workloads.

MemKV joins AIStor as the second pillar of MinIO's product portfolio, extending the company's data foundation into the memory tier where inference runs. According to MinIO, MemKV delivers persistent, shared context across GPU clusters at a scale that existing memory and storage tiers cannot.

“The industry has been papering over context loss for years because at small scale you may be able to absorb the recompute tax and move on. At the GPU density hyperscalers and neoclouds are building toward, that is no longer true. A GPU recomputing context it has already generated is burning power without return, and at a thousand GPUs that is not inefficiency, it is structural drag,” said AB Periasamy, co-founder and CEO, MinIO. “Yield economics at this scale demand something purpose-built for the inference data path. MemKV was designed for exactly this.”

Designed to run on the NVIDIA BlueField-4 STX architecture, with native support for NVIDIA Dynamo and NVIDIA NIXL, MemKV gives enterprises, cloud providers, and AI platforms a shared memory tier that combines microsecond responsiveness with petabyte-scale capacity. For the first time, an entire GPU cluster can access a common pool of context at speeds that keep pace with inference, rather than waiting on storage.

Designed exclusively for AI inference and built from the ground up for the G3.5 layer of the GPU memory hierarchy, MemKV delivers petabytes of shared context memory at SSD economics, replacing the cost and capacity constraints of GPU HBM and DRAM with a tier that scales independently of the compute cluster, the company said.

The architecture reflects how GPUs actually consume data at inference time:

  • Native support for NVIDIA BlueField-4 STX: Runs directly within NVIDIA STX infrastructure as a single ARM64-native binary, embedded in the storage tier rather than deployed on separate x86 storage servers connected over the network.
  • End-to-end RDMA transport: KV cache moves from GPU memory to NVMe over RDMA, bypassing file-system or object-storage protocols entirely.
  • GPU-native block sizes: Operates in 2-16 MB blocks optimized for throughput-oriented GPU access patterns, not the 4 KB blocks designed for legacy storage workloads.
  • Wire-speed fabric performance: Built for NVIDIA Spectrum-X Ethernet networking and PCIe Gen6, driving throughput to near wire speed across the physical fabric.
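The block-size point above can be illustrated with simple arithmetic (a back-of-the-envelope sketch, not MemKV code; the 2-16 MB and 4 KB block sizes come from the announcement, while the 8 GiB cache size is a hypothetical example): moving a large KV cache in multi-megabyte blocks requires orders of magnitude fewer I/O operations, and thus far less per-operation overhead, than moving it in 4 KB blocks.

```python
# Back-of-the-envelope sketch: why throughput-oriented block sizes matter.
# Block sizes (2-16 MB vs. 4 KB) are cited in the article; the KV cache
# size is a hypothetical figure chosen for illustration.

def ops_required(payload_bytes: int, block_bytes: int) -> int:
    """Number of I/O operations needed to move a payload at a given block size."""
    return -(-payload_bytes // block_bytes)  # ceiling division

KV_CACHE = 8 * 1024**3     # hypothetical 8 GiB KV cache for one long context
LEGACY_BLOCK = 4 * 1024    # 4 KB blocks, typical of legacy storage workloads
GPU_BLOCK = 8 * 1024**2    # 8 MB, the midpoint of the 2-16 MB range cited

legacy_ops = ops_required(KV_CACHE, LEGACY_BLOCK)
gpu_ops = ops_required(KV_CACHE, GPU_BLOCK)

print(f"4 KB blocks: {legacy_ops:,} operations")  # 2,097,152
print(f"8 MB blocks: {gpu_ops:,} operations")     # 1,024
```

Each operation carries fixed costs (queuing, interrupts, protocol handling), so a roughly 2,000x reduction in operation count is what lets a transfer run at near wire speed rather than being bound by per-request overhead.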

MinIO MemKV is available now.

For more information, visit www.min.io.
