Vector Databases Are Having a Moment — A Chat with Pinecone

Ever since ChatGPT launched, it's been quite a ride for Pinecone. We discuss its top use cases and why every org will need a vector database.
May 17th, 2023 8:58am by

We first profiled Pinecone in early 2021, just after it launched its vector database solution. Since that time, the rise of generative AI has caused a massive increase in interest in vector databases — with Pinecone now viewed among the leading vendors.

To find out how Pinecone’s business has evolved over the past couple of years, I spoke with Elan Dekel, VP of Product at Pinecone. Prior to joining Pinecone last year, Dekel worked for more than 15 years (in two separate stints) at Google.

First of all, what is a vector database? Microsoft defines it as “a type of database that stores data as high-dimensional vectors, which are mathematical representations of features or attributes.” The data is stored as a vector via a technique called “embedding.”

In a recent post on The New Stack, TriggerMesh co-founder Mark Hinkle used the analogy of a warehouse to explain the use case for vector databases. “Imagine a vector database as a vast warehouse and the AI as the skilled warehouse manager,” Hinkle wrote. “In this warehouse, every item (data) is stored in a box (vector), organized neatly on shelves in a multidimensional space.” The AI can then retrieve or compare items based on their similarities. According to Hinkle, a vector database is “ideal for applications like recommendation systems, anomaly detection and natural language processing.”
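To make the warehouse analogy concrete, here is a minimal sketch of similarity search over a handful of vectors. The item names and three-dimensional vectors are invented for illustration (real embeddings come from an ML model and have hundreds or thousands of dimensions), and the brute-force scan stands in for the indexing a real vector database would do.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def nearest(query, items, k=2):
    """Return the names of the k items most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in items.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

# Hypothetical "boxes on shelves": made-up items with made-up vectors.
ITEMS = {
    "running shoes": [0.9, 0.1, 0.0],
    "hiking boots": [0.8, 0.3, 0.1],
    "espresso machine": [0.0, 0.2, 0.9],
}

print(nearest([1.0, 0.2, 0.0], ITEMS, k=2))
```

With these made-up numbers, the two footwear items rank above the espresso machine because their vectors point in nearly the same direction as the query.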

What Is Pinecone Being Used for?

As you might imagine, it’s been quite a ride for Pinecone ever since generative AI burst into prominence last year. Dekel told me that Pinecone launched its paid product in May of last year and has experienced fantastic growth since that time. The initial use case was semantic search, he said, applied across a wide range of areas such as FinTech and enterprise search.

In December, after the announcement of ChatGPT, Pinecone’s growth accelerated even further. The power of large language models became evident to people, Dekel said, and they recognized that vector databases and embeddings were crucial for implementing this technology in real production systems. Pinecone began seeing a shift in the use cases after that.

Pinecone workflow; graphic via Pinecone

“Now,” said Dekel, “everybody’s asking us, how do I do ‘retrieval-augmented generation’? How do I build a chatbot play? You know, how do we utilize large language models in production — that sort of thing.”

Retrieval-augmented generation (RAG) is a technique, introduced by researchers at Meta AI and since adopted by others, for grounding a language model’s output in retrieved documents. According to Dekel, it’s a process of augmenting a large language model (such as the one behind ChatGPT) with an external dataset, so that it can generate more accurate responses about information it was not trained on.

He gave an example of a pharmaceutical company using RAG within its intranet, where they have proprietary research documents and domain-specific knowledge. The RAG process involves embedding the internal dataset, creating vectors from it, and storing them in a vector database. When a query is made, the intranet first interacts with the vector database, retrieving relevant content related to the query. This retrieved information serves as context for the large language model. The model is then prompted to answer the question using the provided context, generating a well-written English response.
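The flow Dekel describes can be sketched in a few lines of Python. This is an illustration under loud assumptions, not Pinecone’s API: `embed` is a toy bag-of-words stand-in for a real embedding model, an in-memory list stands in for the vector database, the sample documents are invented placeholders, and the final call to a language model is left out, so the sketch stops at building the prompt.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def build_index(documents):
    """Embed every document once; the list stands in for a vector database."""
    return [(doc, embed(doc)) for doc in documents]

def retrieve(question, index, k=1):
    """Return the k documents whose vectors are closest to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question, context_docs):
    """Assemble the prompt that would be sent to a large language model."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )

# Invented placeholder documents, standing in for an intranet corpus.
DOCS = [
    "Compound A reduced inflammation in the phase 2 trial.",
    "The cafeteria menu changes every Monday.",
    "Compound B showed no effect on blood pressure.",
]

index = build_index(DOCS)
question = "What did the trial show about compound A and inflammation?"
prompt = build_prompt(question, retrieve(question, index, k=1))
print(prompt)
```

In a production setup, the prompt would then go to the language model, and the retrieval step would be a query against a vector database rather than a sort over a Python list.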

Enticing New Users and Competing Against Big Players

I asked Dekel who the primary users of Pinecone are.

He replied that Pinecone has a diverse user base, including hobbyists interested in vector databases and embeddings, who often use the free offering. Its enterprise users fall into several groups, such as ML engineers, data scientists, and systems and production engineers.

Of course, given the popularity of generative AI, there are now many other options for vector databases in the market — including from existing database companies that are bolting this functionality on. For example, Redis offers vector database functionality in its Redis Enterprise product.

Dekel claims that Pinecone’s advantage is its ability to deal with scale. For small-scale use cases, running vector retrieval on a laptop with sample code found online can suffice. As usage grows, solutions like Redis and PostgreSQL with vector plugins can be adequate. For large-scale usage, he said, a custom-designed system becomes necessary. He noted that Pinecone’s solution allows large companies to store and query billions of vectors across hundreds of machines or more.

It’s not just existing vendors, though. Multiple specialist vector database products have emerged recently, such as the open source Chroma. How does Pinecone differentiate itself from them?

One way to differentiate, he replied, is by considering open source versus closed-source managed services. Pinecone believes that a managed service is what companies truly need, especially as they scale. Open source solutions can become challenging to manage and optimize when running on a large number of machines, he claimed.

He pointed out the considerations involved in building a production vector database — including data management, metadata handling, scalability, real-time updates, backups, ecosystem integrations, security measures, compliance (such as HIPAA), and more.

Dekel added that Pinecone can also be integrated with data lake providers, like Databricks. The data usually resides elsewhere and needs to be transformed into embeddings by running it through an ML model, he said. After processing and chunking the data, the resulting vectors are sent to Pinecone. Companies like Databricks manage this pipeline by handling data, running models, and hosting them effectively, he explained. Pinecone offers a connector with Databricks, to ensure synchronization throughout the entire process.
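As a rough sketch of the chunk-and-embed step in that pipeline, the code below splits a document into overlapping word windows and turns each chunk into a record ready to send to a vector database. Everything here is an assumption for illustration: `toy_embed` is a hashing trick standing in for a real ML model, the chunk sizes are arbitrary, and the `id` / `values` / `metadata` record layout is a generic shape, so check your vendor’s documentation for the exact format it expects.

```python
import zlib

def chunk_words(text, size=8, overlap=2):
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks

def toy_embed(text, dim=8):
    """Hypothetical stand-in for an embedding model: hash words into `dim` buckets."""
    vec = [0.0] * dim
    for word in text.lower().split():
        # crc32 is stable across runs, unlike Python's built-in hash() for strings
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    return vec

def to_records(doc_id, text, size=8, overlap=2):
    """Build one record per chunk: an id, the vector, and the original text."""
    return [
        {
            "id": f"{doc_id}-{i}",
            "values": toy_embed(chunk),
            "metadata": {"source": doc_id, "text": chunk},
        }
        for i, chunk in enumerate(chunk_words(text, size, overlap))
    ]

sample_text = " ".join(f"w{i}" for i in range(20))  # 20 placeholder words
records = to_records("doc1", sample_text)
print([record["id"] for record in records])
```

Keeping the chunk text in the metadata means a later query can return the passage itself, not just its vector.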

Will Every Company Need a Vector Database?

Given the highly promising future of generative AI, I wondered whether every enterprise company will need to eventually adopt vector databases.

Dekel replied that he witnessed the power and importance of vector embeddings over several years while integrating them into Google’s infrastructure, during his second stint with Google (which ended last year). So he believes that vector databases represent a paradigm shift in data utilization, especially as the use of unstructured data — such as images, videos, audio files and webpages — continues to grow exponentially. Vector embeddings are crucial for retrieving and working with this type of data, he said.

There’s no doubt vector databases are all the rage currently, similar perhaps to the massive shift from SQL to NoSQL in the enterprise market a decade ago. So if you’re a developer working with generative AI (and who isn’t these days), it’ll be worth your time learning how to use a vector database.
