Central to a successful AI implementation is a robust tech stack that can support its demands. When it comes to the database layer, the choice between traditional relational databases, vector databases, time-series databases, and NoSQL databases becomes a crucial one. Compounding this choice is the fact that database technologies are continuously evolving, as are the tools meant to support them.
Experts joined DBTA’s latest webinar, New Database Technologies and Strategies for the AI Era, to offer their perspectives on the AI-database relationship, examining best practices and new solutions.
According to Stephane Castellani, SVP marketing, CrateDB, a database built for AI requires a range of connectors, optimized and flexible storage, and speed and scalability—without costing a fortune.
Diving deeper, for AI to be effective, it must collect data from many systems, such as IoT sensors, CRM and ERP systems, and external APIs. Connectors enable AI to access that data as well as derive value from it through integrations with tools such as Tableau, Power BI, TensorFlow, LangChain, and more. After all, “AI data is only valuable when it’s put to work,” noted Castellani.
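As a rough illustration of the connector idea, the sketch below pulls operational data through a standard Python connector stack (SQLAlchemy plus pandas) so it can feed a downstream BI dashboard or ML job. The connection string, table names, and columns are hypothetical and not taken from any of the vendors mentioned here.

```python
# Minimal sketch: reading operational data through a generic connector
# (SQLAlchemy + pandas) so it can feed a downstream BI or ML tool.
# The connection string, tables, and columns are hypothetical.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@db-host:5432/operations")

# Join sensor readings with CRM records so the AI pipeline sees both sources.
df = pd.read_sql(
    """
    SELECT s.device_id, s.reading_ts, s.temperature, c.account_tier
    FROM sensor_readings s
    JOIN crm_accounts c ON c.device_id = s.device_id
    WHERE s.reading_ts > NOW() - INTERVAL '1 day'
    """,
    engine,
)

# Hand the frame to whatever consumes it next: a dashboard export,
# a feature store, or an embedding/training job.
print(df.describe())
```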
Optimized and flexible storage is essential to enabling AI to learn from data as it truly exists in the real world: complex, multi-modal, and continuously evolving. Therefore, as illustrated in the sketch following this list, database storage should:
- Handle multiple data formats, including traditional tabular, time-series, JSON, geospatial, full-text, and vector data
- Offer schema flexibility, adjusting to the unpredictable and rapidly changing data sources in AI pipelines without requiring costly migrations or downtime
- Index all fields by default, including those buried in deeply nested JSON structures, since it is not known in advance which data fields AI will need
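To make those storage requirements concrete, here is an illustrative record (deliberately not tied to any specific database's syntax) combining the data shapes listed above. The field names and values are invented for the example.

```python
# Illustrative sketch of a single multi-modal record that AI-oriented storage
# is expected to handle: tabular fields, a timestamp, nested JSON, geospatial
# data, full text, and a vector embedding. All names and values are made up.
from datetime import datetime, timezone

record = {
    "device_id": "pump-042",                      # traditional tabular field
    "observed_at": datetime.now(timezone.utc),    # time-series timestamp
    "metadata": {                                 # nested JSON, shape may change
        "firmware": {"version": "2.3.1", "channel": "beta"},
        "tags": ["coolant", "line-7"],
    },
    "location": {"lat": 52.52, "lon": 13.40},     # geospatial point
    "operator_note": "Bearing noise increased after maintenance.",  # full text
    "embedding": [0.12, -0.87, 0.33, 0.05],       # vector (truncated for brevity)
}

# The bullets above imply the database should accept a new nested key tomorrow
# without a migration, and should index every field (including metadata.*),
# because it is not known in advance which fields AI workloads will query.
```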
Regarding costs, Castellani emphasized the following as key factors to consider:
- Resource usage (compute vs. storage)
- Data storage (including compression)
- Whether high availability is needed
- Licensing models (open-source vs. closed-source; node-based models vs. consumption-based models)
- The human resources and skillset needed to operate the database
Amid the chaos of selecting the right technology, one thing is constant: AI disruption will transform every application. According to Matthew Groves, DevRel engineer, Couchbase, “AI agents will become the primary way we interact with computers in the future,” and supporting agentic AI should be a top priority when selecting a database.
However, agentic AI comes with a broad set of challenges and requirements, such as expansive access to data, a deep set of different tools and functions, safe model access, and more. On top of agentic AI’s extensive needs, yesterday’s data architecture is not ready to support it, bogged down by separate platforms and multiple integrations.
Couchbase’s developer data platform is designed for the critical applications of the AI world, offering a variety of data access services, a robust performance foundation, and several enterprise deployment options, including Couchbase-managed and customer-managed models. Couchbase’s Capella AI services also offer the building blocks for agentic applications, providing a model service, retrieval-augmented generation (RAG) pipelines, vector search, data intelligence, and an agent catalog.
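For readers new to the RAG pattern mentioned above, the following is a minimal, generic sketch of the idea: embed the question, retrieve the closest stored documents by vector similarity, and assemble a grounded prompt. The embed() function is a stand-in for whatever embedding model and vector index a platform provides; none of this reflects Couchbase's actual API.

```python
# Generic RAG sketch: embed the query, find the nearest stored vectors,
# and build a context-grounded prompt for the model. embed() is a toy
# stand-in for a real embedding model, used only to keep the example runnable.
import math

def embed(text: str) -> list[float]:
    # Toy stand-in: real systems call an embedding model here.
    vec = [0.0] * 8
    for i, ch in enumerate(text.lower()):
        vec[i % 8] += ord(ch) / 1000.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def similarity(a: list[float], b: list[float]) -> float:
    # Cosine similarity; vectors from embed() are already normalized.
    return sum(x * y for x, y in zip(a, b))

documents = [
    "Capella AI services include a model service and an agent catalog.",
    "Vector search returns the nearest stored embeddings to a query vector.",
    "RAG pipelines retrieve relevant context before calling the model.",
]
index = [(doc, embed(doc)) for doc in documents]

question = "How does retrieval-augmented generation use vector search?"
q_vec = embed(question)
top = sorted(index, key=lambda item: similarity(q_vec, item[1]), reverse=True)[:2]

context = "\n".join(doc for doc, _ in top)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)  # This prompt would then be sent to the model service.
```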
Anil Inamdar, head of professional services, NetApp, offered an answer to the question: Why are database technologies and strategies evolving now?
Between the data and AI imperative, rising infrastructure complexity, and the pressure of business innovation, the database space requires reimagining, largely due to AI’s disruption. Based on current innovations, Inamdar predicted that future database technologies will:
- Be autonomous, self-managing across diverse workflows and automating provisioning, scaling, security, and optimization with minimal human input
- Have AI-integrated pipelines, incorporating AI throughout the data lifecycle from intelligent ingestion to inference
- Maintain explainable AIOps, where transparency will become increasingly critical as automation grows
- Support cross-functional operations teams, blurring the traditional boundaries between DBAs, MLOps and DevOps specialists, and cultivating cross-functional expertise and shared responsibility
With evolution being the only constant, Inamdar emphasized the use of open source technologies, backed by NetApp Instaclustr, the trusted enterprise partner for open source. Ensuring performance, reliability, and scalability for the open source tech stack, NetApp Instaclustr draws on NetApp’s 30-plus years of experience delivering for enterprises at scale while offering up to a 100% availability SLA for Cassandra and up to 99.999% for most other products.
Vivin Nath, director of product management, AI, Informatica, explained that although data is AI’s foundation, data quality, governance, integration, and access continue to be challenges for generative AI (GenAI). A modern data architecture that acknowledges and corrects each of these obstacles will be vital for supporting AI.
Informatica’s Intelligent Data Management Cloud (IDMC) provides a singular platform for all architecture patterns, including data lakehouse, data mesh, and data fabric. IDMC centralizes data governance and privacy, data quality and observability, DataOps and security, data ingestion, data catalogs, and AI/ML intelligence and automation within one location, delivering a robust, comprehensive foundation for AI.
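To illustrate the kind of data quality concern Nath described, here is a generic sketch of lightweight checks a team might run before records reach a GenAI pipeline. The field names and rules are hypothetical and are not drawn from IDMC, which addresses this class of problem at platform scale.

```python
# Hypothetical pre-ingestion quality checks for records headed to a GenAI
# pipeline: required fields, minimal usable text, and a governance flag.
REQUIRED_FIELDS = {"customer_id", "document_text", "source_system"}

def validate(record: dict) -> list[str]:
    """Return a list of quality issues for one record; an empty list means clean."""
    issues = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        issues.append(f"missing fields: {sorted(missing)}")
    text = record.get("document_text", "")
    if not isinstance(text, str) or len(text.strip()) < 20:
        issues.append("document_text too short to be useful for retrieval")
    if record.get("contains_pii"):
        issues.append("PII flagged: route through governance review before indexing")
    return issues

records = [
    {"customer_id": "c-101", "document_text": "Renewal terms for the 2024 contract include...", "source_system": "crm"},
    {"customer_id": "c-102", "document_text": "tbd", "source_system": "crm"},
]
clean = [r for r in records if not validate(r)]
rejected = [(r, validate(r)) for r in records if validate(r)]
print(f"{len(clean)} clean, {len(rejected)} rejected")
```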
This is only a snippet of the full New Database Technologies and Strategies for the AI Era webinar. For more detailed examinations, a Q&A, and more, you can view an archived version of the webinar here.