What does it take to deliver superior data services and capabilities in today’s enterprises? With the rise of AI and machine learning (ML)—and their data-heavy workloads—data managers are actively seeking greater scalability and adaptability in their environments. “Organizations are leaving behind traditional, monolithic databases and adopting AI-native or vector-search-capable architectures,” said Skip Levens, product marketing manager for AI strategy at Quantum. They seek architectures “that can make it easier to slice through huge volumes of unstructured data throughout its lifecycle.”
Still, transforming legacy environments into modern data stacks is a gradual evolution—not an overnight sprint. “Even after years of effort, success rates remain low, and any transition to a modern database typically results in a stripped-down version of the original,” cautioned Kellyn Gorman, multi-platform database and AI advocate at Redgate. “Performance suffers because the original system benefited from years, sometimes decades, of optimization that can’t be replicated within a migration project’s time frame.”
AI’S OUTSIZED IMPACT
Behind the drive to build modern and adaptable data stacks is the looming force of AI, industry experts concurred. “It’s the biggest shift in years to a conservative industry—AI and ML are transforming how businesses approach data storage and management,” said Levens. “Data isn’t a cumbersome byproduct to be managed or thrown away; it’s a potentially massively valuable resource. As AI and ML adoption grows, organizations are learning that they can manage vast amounts of unstructured data while ensuring high-performance processing, security, and accessibility.”
AI “is definitely reshaping data priorities,” agreed Jonas Bonér, founder, CTO, and chairman of Akka. “It’s on every company’s mind. Organizations now need vector storage for embeddings in RAG [retrieval-augmented generation] workflows, efficient data pipelines for training, more sophisticated query capabilities, and integration between traditional and AI-specialized databases.”
As a result, “AI can quickly become very costly, so staying efficient is paramount,” Bonér continued. “Also, agentic AI usage demands platforms that can scale to millions of agents long term, while staying highly available, responsive, and efficient. It calls for a new stateful architecture that is quite different from how many have historically approached microservices. Traditional three-tier architecture and stateless services just won’t scale well.”
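To make the vector-storage piece of that shift concrete, the retrieval step in a RAG workflow boils down to embedding documents as vectors, storing them, and ranking them by similarity to an embedded query before the best matches are handed to the model. The sketch below shows that flow in plain Python with NumPy; the embed() function is a stand-in for a real embedding model, and the in-memory array stands in for a vector database or vector-capable engine.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in for a real embedding model: deterministic pseudo-random vectors,
    # so the example runs without calling an external service.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

# A toy "vector store": document texts alongside their embeddings.
docs = ["refund policy details", "shipping and delivery times", "warranty terms"]
doc_vectors = np.stack([embed(d) for d in docs])

def retrieve(query: str, k: int = 2) -> list[str]:
    # Rank stored documents by cosine similarity to the embedded query.
    q = embed(query)
    sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]

# The retrieved passages are then appended to the prompt sent to the model,
# which is the "augmented" part of retrieval-augmented generation.
print(retrieve("How long does delivery take?"))
```

At production scale, the brute-force similarity scan above gives way to an approximate nearest-neighbor index, which is exactly the capability vector-native and vector-capable databases provide.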
In addition, AI is only as good as the data that goes into it. “AI is driving so many outcomes, and people want to be sure that the data used to arrive at those outcomes is both correct and governed correctly,” said Mark Porter, chief technology officer at dbt Labs. “The adage ‘garbage in, garbage out’ is especially relevant when it comes to AI—poor data quality leads to unreliable outcomes,” said Gorman. “To ensure your AI delivers accurate and meaningful results, whether it’s leveraging retrieval-augmented generation or the latest agentic AI models, it’s critical to feed it high-quality, well-governed data. The success of your AI initiatives hinges on this foundation.”
CONSOLIDATION AND CONVERGENCE
Consolidation is now a visible trend in today’s environments, as enterprises seek a clearer picture of their existing assets and prepare for the data-driven AI world that awaits. According to Redgate’s 2025 State of the Database Landscape survey, the adoption of multiple database platforms had been on an upward trajectory since 2020 as organizations grappled with the relentless growth in the volume and variety of data. Concerns around training, data integration complexity, and monitoring and troubleshooting are now beginning to prompt consolidation: The number of organizations using only one platform rose from 21% in 2023 to 26% this year, and nearly 75% have scaled back to three platforms or fewer.
The current generation of platforms aids in developing more responsive and adaptable data stacks. “We see blurring distinctions between specialized technologies—such as SQL versus NoSQL databases,” said Alok Shankar, senior software development manager at Adobe. These databases “are not that distinct. Most of the databases can offer all forms of transactions, with and without schemas.”
The distinction between transactional and analytical systems “has also been converging,” Shankar added. “Typically, organizations maintained separate OLTP and OLAP systems. But the newer hybrid transactional/analytical processing (HTAP) architectures, along with real-time data lakes and advancements in vector databases driven by AI, are blurring those boundaries.”
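The SQL-versus-NoSQL blurring Shankar describes is easy to see in miniature: most mainstream relational engines can now store and query schemaless JSON alongside fixed columns inside ordinary transactions. The following sketch uses SQLite from Python’s standard library purely as an illustration; the table and column names are invented, and it assumes a SQLite build with the JSON functions available (standard in recent releases).

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,   -- fixed, relational columns
        total REAL NOT NULL,
        attrs TEXT                -- free-form JSON document per row
    )
""")

with conn:  # one transaction covering both inserts
    conn.execute(
        "INSERT INTO orders (total, attrs) VALUES (?, ?)",
        (42.0, json.dumps({"channel": "web", "coupon": "SPRING"})),
    )
    conn.execute(
        "INSERT INTO orders (total, attrs) VALUES (?, ?)",
        (17.5, json.dumps({"channel": "store"})),
    )

# A relational filter combined with a path expression into the schemaless part.
rows = conn.execute(
    "SELECT id, total, json_extract(attrs, '$.channel') FROM orders WHERE total > 20"
).fetchall()
print(rows)  # [(1, 42.0, 'web')]
```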
Still, not every organization has put newer generations of databases into production—resulting in a hodgepodge of environments with varying degrees of features and capabilities. “One of the biggest challenges in database migrations is recognizing that not all platforms are alike,” said Gorman. “Greenfield projects often thrive on open source databases like PostgreSQL, MySQL, or SQLite, while document databases excel with unstructured data. However, migrating long-established relational systems, especially those reliant on advanced features of enterprise databases like Oracle or SQL Server, often fails due to insufficient analysis of the original platform’s technical investments.”
Change keeps on coming. Databases and data warehouses “are stones that still aren’t gathering any moss, 50 years after they were introduced,” Porter agreed. “Databases are now including RAG—retrieval-augmented generation—and vector search directly into their engines as an access type—as they should. Many applications can benefit from this built-in technology and don’t need standalone vector databases.”
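As one concrete example of the trend Porter describes, PostgreSQL picks up a vector column type, distance operators, and approximate-search indexes through the pgvector extension, so similarity search runs inside the same engine that holds the rest of the application’s data. The sketch below is illustrative only: the connection string, table name, and tiny three-dimensional embeddings are assumptions, and a real deployment would typically add an HNSW or IVFFlat index for approximate search.

```python
import psycopg  # psycopg 3; assumes a PostgreSQL server with pgvector installed

# Connection parameters here are placeholders.
with psycopg.connect("dbname=appdb user=app") as conn, conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS doc_chunks (
            id bigserial PRIMARY KEY,
            body text,
            embedding vector(3)  -- tiny dimension for illustration only
        )
    """)
    cur.execute(
        "INSERT INTO doc_chunks (body, embedding) VALUES (%s, %s::vector)",
        ("example passage", "[0.10, 0.20, 0.30]"),
    )
    # Nearest neighbors by cosine distance, ranked by the database engine itself.
    cur.execute(
        "SELECT body FROM doc_chunks ORDER BY embedding <=> %s::vector LIMIT 5",
        ("[0.10, 0.20, 0.25]",),
    )
    print(cur.fetchall())
```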
In another key area of the data environment—data warehouses—“I see more and more focus on data governance and lineage—required for data privacy, governance, and AI safety,” Porter added. He also predicted that in the year ahead, “AI agents, informed with data and opinionated on quality, will start improving how we think about our data significantly.”
OPEN SOURCE
The push is on to further enterprise-wide data virtualization—“a knowledge layer” that incorporates “a common vocabulary and ability to discover related data,” said Kunju Kashalikar, senior director of product management for Pentaho. As a result, data quality becomes even more critical; it needs to be democratized and performed continuously, and the system needs to “react automatically to a drop or shift in quality.”
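What “reacting automatically to a drop or shift in quality” can look like in practice is a continuously applied check that blocks or quarantines a batch when a metric crosses a threshold. The sketch below, written with pandas, is a generic illustration rather than any vendor’s feature; the column names, data, and threshold are invented.

```python
import pandas as pd

def null_rate(df: pd.DataFrame, column: str) -> float:
    # Share of missing values in one column of the incoming batch.
    return float(df[column].isna().mean())

def check_batch(df: pd.DataFrame, column: str, max_null_rate: float) -> None:
    rate = null_rate(df, column)
    if rate > max_null_rate:
        raise ValueError(
            f"{column}: null rate {rate:.0%} exceeds threshold {max_null_rate:.0%}"
        )

# A small incoming batch with one missing customer_id (invented data).
batch = pd.DataFrame(
    {"customer_id": [101, 102, None, 104], "amount": [10.0, 12.5, 9.9, 11.2]}
)

try:
    check_batch(batch, "customer_id", max_null_rate=0.05)
except ValueError as exc:
    # The "react automatically" step: quarantine the batch, alert the data
    # owners, or block downstream AI workloads from consuming it.
    print(f"Quarantining batch: {exc}")
```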
Data stacks are also increasingly relying on open source technology, Porter said. “Iceberg and other open storage formats are taking the world by storm, disintermediating proprietary formats, allowing engine vendors to compete to be the best engine, and also allowing customers to make the best choice of how and where to process their data. This is opening up an entire new arena of customer choice, which I think is amazing.”
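Part of Iceberg’s appeal is that a table is just open data and metadata files on shared storage, so any compatible engine can read or write it. The PySpark sketch below follows the pattern in Iceberg’s Spark quickstart; the catalog name, warehouse path, and runtime version are assumptions that would change per environment.

```python
from pyspark.sql import SparkSession

# The "local" catalog, warehouse path, and Iceberg runtime version are
# illustrative assumptions.
spark = (
    SparkSession.builder
    .appName("iceberg-sketch")
    .config("spark.jars.packages",
            "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.0")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.local.type", "hadoop")
    .config("spark.sql.catalog.local.warehouse", "/tmp/iceberg-warehouse")
    .getOrCreate()
)

# The table lives as open files plus metadata under the warehouse path.
spark.sql("CREATE TABLE IF NOT EXISTS local.db.events (id BIGINT, kind STRING) USING iceberg")
spark.sql("INSERT INTO local.db.events VALUES (1, 'click'), (2, 'view'), (3, 'click')")
spark.sql("SELECT kind, COUNT(*) AS n FROM local.db.events GROUP BY kind").show()
```

Because the resulting files and metadata are in open formats, another engine such as Trino or Flink could query the same table without going through Spark, which is the kind of customer choice Porter describes.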
There are a number of companies “changing licenses from open source to source available and even some changing back,” said Peter Zaitsev, co-founder and CEO of Percona. “I expect a continuing split, with corporate-owned open source focusing on monetization and using more restrictive licenses or keeping some features available only in the cloud.”