Supporting ChatGPT with Vector Databases for Optimized Efficiency and Accuracy

With ChatGPT dominating conversational AI through its rapid, helpful responses, and with OpenAI’s open source retrieval plugins available for the tool, ChatGPT is poised to permeate a variety of solutions that bring people and information closer together. These chat solutions, however, require a robust foundation of data infrastructure that gives the tool the accuracy needed for optimal ML outcomes.

Frank Liu, ML architect at Zilliz, joined DBTA’s webinar, “Vector Databases Have Entered the Chat—How ChatGPT Is Fueling the Need for Specialized Vector Storage,” to explore how purpose-built vector databases are the key to successfully integrating with chat solutions, as well as present explanatory information on how autoregressive LMs, unstructured data, vectors, and vector databases intersect.

Liu clarified the relationship between unstructured data, vectors, and vector databases. The surge of unstructured data (any data that does not conform to a predefined data model) necessitated a system to categorize and represent that data for simpler consumption and management. Enter vectors: a vector is a geometric representation of a data object used in ML that may indicate a variety of attributes, including origin, direction, and magnitude.

A vector database, then, is purpose-built to store, index, and query large quantities of embeddings, which are vectors in a relatively low-dimensional space into which high-dimensional data can be translated for simplified ML operations.
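To make the store-and-query idea concrete, here is a minimal sketch in Python. It is not how Zilliz or any production vector database is implemented: the `ToyVectorStore` class, its method names, and the three-dimensional sample vectors are all hypothetical, and the search is an exhaustive cosine-similarity scan, whereas real systems typically rely on approximate nearest-neighbor indexes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store: exhaustive search, no index, no persistence."""

    def __init__(self):
        self._items = {}  # item id -> embedding vector

    def insert(self, item_id, vector):
        self._items[item_id] = list(vector)

    def delete(self, item_id):
        self._items.pop(item_id, None)

    def search(self, query, top_k=3):
        """Return up to top_k (item_id, score) pairs, most similar first."""
        scored = [
            (cosine_similarity(query, vector), item_id)
            for item_id, vector in self._items.items()
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(item_id, score) for score, item_id in scored[:top_k]]


# Made-up embeddings for three unstructured objects.
store = ToyVectorStore()
store.insert("cat_photo", [0.9, 0.1, 0.0])
store.insert("dog_photo", [0.8, 0.3, 0.1])
store.insert("invoice_pdf", [0.0, 0.2, 0.9])

results = store.search([1.0, 0.0, 0.0], top_k=2)
for item_id, score in results:
    print(item_id, round(score, 3))
```

The scan compares the query against every stored vector, which works for a handful of items but grows linearly with collection size. That cost is one reason indexing and scalability matter once collections reach millions of vectors.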

Liu stressed the importance of a purpose-built vector database, as opposed to workaround solutions such as a vector search library, because of its database functions. He pointed to benefits that a typical vector search library could not deliver, including high-performance vector search, replication and failover capabilities, horizontal and vertical scalability, automatic indexing, and backup and recovery for large volumes of vectors.

“Once you reach beyond 10 million vectors—which a lot of organizations have that volume of unstructured data today—you need a way to be able to easily store, scale, and replicate that data,” said Liu. “You need a lot of traditional database features; that’s really why purpose-built is so important.”

Ultimately, Liu explained that building a purpose-built vector database is difficult. Traditional databases are largely concerned with managing data at scale, but vector databases sit at the intersection of AI, ML, and data infrastructure, so large compute requirements must also be reckoned with. Handling high query loads and high insertion and deletion rates, delivering full precision and recall, supporting accelerators (GPU, FPGA), and storing vectors at billion scale together pose a significant obstacle for organizations seeking to manage unstructured data.

Autoregressive large language models (LLMs), such as ChatGPT, Claude, and Bard, went viral within a short time of their public release. LLMs such as ChatGPT come with a critical downside, however: hallucinations, or plausible-sounding but incorrect responses. Using LLMs to answer questions with both domain specificity and accuracy can be difficult; small, incorrect semantic details produced by an autoregressive LLM can completely ruin the viability of its answers.

The solution, Liu posited, is to inject domain knowledge into autoregressive LMs. Key pieces of domain knowledge are stored in a vector database, bridging the gap between correctness, autoregressive LMs, and the data that already exists within an organization’s infrastructure. This approach limits hallucinations by supplying the model with relevant context drawn from large-scale vector data, filling in the blanks that a tool like ChatGPT might otherwise fill with inaccurate but plausible information.
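The injection step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the method shown in the webinar: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the sample documents are made up, the function names are hypothetical, and no actual LLM is called. The sketch stops at building the prompt that would be sent to a model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, top_k=1):
    """Return the top_k documents most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(
        documents,
        key=lambda doc: similarity(query_vector, embed(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Place retrieved domain knowledge ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, top_k))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )


# Made-up domain knowledge that a general-purpose model would not know.
documents = [
    "The refund window for hardware purchases is 30 days.",
    "Support tickets are answered within two business days.",
    "The office cafeteria closes at 3 pm on Fridays.",
]

prompt = build_prompt("How many days is the refund window?", documents)
print(prompt)
```

In a real deployment, the documents would be embedded once and stored in a vector database, and only the question would be embedded at query time. The retrieved passages give the model concrete facts to draw on instead of leaving it to guess.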

For an in-depth discussion and demo of vector database technology and its significance for autoregressive LMs like ChatGPT, you can view an archived version of the webinar here.