Milvus in 2023: Open Source Vector Database Year in Review

As we enter the new year, it's a good time to reflect on the entire vector database industry, with a special focus on open source Milvus.
Jan 19th, 2024 8:45am by

Last year marked a pivotal turning point in artificial intelligence (AI). Large language models (LLMs) have taken center stage, gaining widespread recognition for their exceptional natural language-processing capabilities. This surge in popularity has substantially expanded the possibilities of machine learning applications, enabling developers to construct more intelligent and interactive applications.

Amid this revolution, vector databases have emerged as a crucial component, acting as the long-term memory for LLMs. The rise of retrieval augmented generation (RAG) models, intelligent agents and multimodal retrieval apps has demonstrated the vast potential of vector databases in enhancing multimodal data retrieval efficiency, reducing hallucinations in LLMs and supplementing domain knowledge.

The LLM evolution has also catalyzed significant advancements in embedding technologies. According to the Massive Text Embedding Benchmark (MTEB) Leaderboard on Hugging Face, leading embedding models such as UAE, VoyageAI, CohereV3 and Bge were all released in 2023. These advancements have bolstered the vector retrieval effectiveness of various vector search technologies like Milvus, providing more precise and efficient data-processing capabilities for AI applications.

However, with the growing popularity of vector databases, debates arose about the necessity of specialized solutions. New startups have entered the vector database arena. Many traditional relational and NoSQL databases have started treating vectors as a significant data type, and many claim to be capable of substituting specialized vector databases in every situation.

As we enter 2024, it’s a good time to reflect on the entire vector database industry, with a special focus on open source Milvus.

Milvus in 2023: Numbers Don’t Lie

First launched in 2019, Milvus has pioneered the concept of vector databases and consistently maintained a reputation for high reliability, scalability, search quality and performance. In 2023, Milvus achieved impressive results and underwent significant shifts, primarily driven by the rapid advancement of LLMs and the boom of AI-generated content (AIGC) applications. Here are some key figures that best represent Milvus’s progress in 2023.

Zero Downtime During Rolling Upgrades

Newcomers to vector databases tend to focus on functionality rather than operational maintenance. Many application developers also pay less attention to the stability of their vector databases than to that of their transactional databases, since their applications are often in the early stages of exploration. However, stability becomes indispensable if you aim to deploy your AIGC application in a production environment and deliver the best user experience.

Milvus prioritizes operational stability as well as functionality, and it has supported rolling upgrades since version 2.2.3. After continuous refinement, this feature can ensure zero downtime during upgrades without interrupting business processes.

3x Performance Improvement in Production Environments

Boosting vector search performance needs to be a primary goal for vector databases. To get to market quickly, many vector search solutions simply adapted the Hierarchical Navigable Small World (HNSW) algorithm. Unfortunately, that means they face significant challenges in real-world production environments, especially with highly filtered searches (over 90%) and frequent data deletions.

Milvus has focused on optimizing performance throughout development, particularly for production environments, and has achieved a threefold improvement in search performance, most notably in filtered search and streaming insert/search situations.

We also introduced VectorDBBench, an open source benchmarking tool, to make it easy for developers to evaluate vector databases across different conditions. Unlike traditional evaluation methods, VectorDBBench assesses databases using real-world data, including super-large data sets or those closely resembling data from actual embedding models, providing users with more insightful information for informed decision-making.

5% Recall Improvement on the Beir Data Set

While dense embeddings have proven effective in vector search, they fall short when searching for names, objects, abbreviations and short query contexts. To address these limitations, Milvus has introduced a hybrid query approach that combines dense embeddings with sparse embeddings to improve the quality of search results. In our tests, this hybrid solution, paired with a reranking model, improved the recall rate on the Beir data set by 5%.
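
One common way to merge a dense result list with a sparse one is reciprocal rank fusion (RRF). The sketch below is illustrative only: RRF is used here as a simple, well-known fusion method and is not necessarily what Milvus uses, and the function name, document IDs and the constant k=60 are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents ranked highly in any list receive a larger share
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense search and a sparse search
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents that appear in both lists rise to the top, which is the behavior a reranking stage would then refine.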

Milvus has also unveiled a graph-based retrieval solution tailored for sparse embeddings, surpassing the performance of conventional search algorithms like WAND.

At the 2023 NeurIPS BigANN competition, Zilliz engineer Zihao Wang presented Pyanns, a search algorithm that demonstrated significant superiority over other entries in the sparse embedding search track. It’s a precursor to our sparse embedding search algorithms for production environments.

10x Memory Saved on Large Data Sets

Retrieval augmented generation (RAG) was the most popular use case for vector databases in 2023. However, the growth in vector data volumes that comes with RAG applications presents a storage challenge. The challenge is sharpest when the transformed vectors take up more space than the original document chunks, which drives up memory costs. For example, a 1536-dimensional float32 vector (1536 × 4 bytes, roughly 6KB) generated from a 500-token chunk (about 1KB) is several times larger than the chunk itself.
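
The size comparison comes down to simple arithmetic: dimensions times bytes per value. Here is a minimal Python sketch, assuming 4 bytes per float32 value and taking the figure of about 1KB for a 500-token chunk as given. The function name is illustrative, and the float16 line is included only for comparison.

```python
def vector_size_bytes(dim, bytes_per_value=4):
    """Raw storage for one dense vector, excluding any index overhead."""
    return dim * bytes_per_value


CHUNK_BYTES = 1024  # assumption: about 1KB of text per 500-token chunk

float32_bytes = vector_size_bytes(1536)     # 4 bytes per float32 value
float16_bytes = vector_size_bytes(1536, 2)  # 2 bytes per float16 value

print(float32_bytes, float32_bytes / CHUNK_BYTES)  # 6144 6.0
print(float16_bytes, float16_bytes / CHUNK_BYTES)  # 3072 3.0
```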

Milvus is the first open source vector database to support disk-based indexing, resulting in a 5x memory savings. By the end of 2023, Milvus 2.3.4 introduced the capability to load scalar and vector data/indexes onto the disk using memory-mapped files (MMap). This advancement offers more than a 10x reduction in memory usage compared to traditional in-memory indexing.
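
To show the general idea behind memory-mapped files, here is a small sketch using Python's standard `mmap` module. It is not Milvus code: the file layout, the tiny dimension and the helper names are assumptions chosen for illustration. The point is that a vector can be read straight from a file on disk without loading the whole file into process memory first.

```python
import mmap
import os
import struct
import tempfile

DIM = 4             # illustrative dimension, far smaller than real embeddings
NUM_VECTORS = 1000
RECORD = struct.Struct(f"<{DIM}f")  # one little-endian float32 vector


def write_vectors(path):
    """Write NUM_VECTORS fixed-size vectors; vector i holds the value i."""
    with open(path, "wb") as f:
        for i in range(NUM_VECTORS):
            values = [float(i)] * DIM
            f.write(RECORD.pack(*values))


def read_vector(mm, index):
    """Decode one vector; the OS pages in only the bytes that are touched."""
    return RECORD.unpack_from(mm, index * RECORD.size)


path = os.path.join(tempfile.mkdtemp(), "vectors.bin")
write_vectors(path)

with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        vec = read_vector(mm, 42)

print(vec)  # (42.0, 42.0, 42.0, 42.0)
```

The operating system decides which pages stay resident, so memory use tracks the working set rather than the full data size, at the cost of slower access when a page has to be fetched from disk.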

20 Milvus Releases

In 2023, we launched 20 releases, a testament to the dedication of over 300 community developers and our commitment to a user-driven approach in development.

To illustrate, Milvus 2.2.9 introduced dynamic schema, marking a crucial shift from prioritizing performance to enhancing usability. Milvus 2.3 introduced critical features such as upsert, range search, cosine metrics and more, all driven by our user community’s specific needs and feedback.

Million Tenants in a Single Cluster

Implementing multitenancy is crucial for developing RAG systems, AI agents and other LLM applications, meeting the heightened user demands for data isolation. For business-to-customer (B2C) businesses, tenant numbers can skyrocket into the millions, making physical isolation of user data impractical (as an example, it’s unlikely that anyone would create millions of tables in a relational database). Milvus introduced the Partition Key feature, allowing for efficient, logical isolation and data filtering based on partition keys, which is handy at a large scale.
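
The difference between logical and physical isolation can be sketched in a few lines of Python. This is a conceptual illustration, not the Milvus implementation: the class, the partition count and the hash function are all assumptions. Millions of tenants share a small, fixed number of partitions, and every query is restricted to the caller's tenant key.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 16  # hypothetical fixed partition count


def partition_for(tenant_id):
    """Map a tenant to a partition; crc32 is stable across runs."""
    return zlib.crc32(tenant_id.encode("utf-8")) % NUM_PARTITIONS


class TenantStore:
    """Toy store that isolates tenants logically rather than physically."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, tenant_id, item):
        self.partitions[partition_for(tenant_id)].append((tenant_id, item))

    def query(self, tenant_id):
        # Look only in the tenant's partition, then filter by the key
        bucket = self.partitions.get(partition_for(tenant_id), [])
        return [item for owner, item in bucket if owner == tenant_id]


store = TenantStore()
for n in range(10_000):
    store.insert(f"tenant-{n}", f"doc-for-{n}")

print(store.query("tenant-123"))  # ['doc-for-123']
print(len(store.partitions) <= NUM_PARTITIONS)  # True
```

Ten thousand tenants end up in at most 16 partitions, yet each query returns only the caller's own data.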

Conversely, business-to-business (B2B) enterprises, accustomed to dealing with tens of thousands of tenants, benefit from a more nuanced strategy involving physical resource isolation. The latest Milvus 2.3.4 brings enhanced memory management, coroutine handling and CPU optimization, making it easier to create tens of thousands of tables within a single cluster. This enhancement gives B2B businesses greater efficiency and control.

10 Million Docker Image Pulls

As 2023 drew to a close, Milvus had reached 10 million Docker image pulls. This accomplishment signals increased developer interest in Milvus and emphasizes its rising significance within the vector database domain. As a cloud native vector database, Milvus integrates seamlessly with Kubernetes and the broader container ecosystem.

10 Billion Entities in a Single Collection

While scalability might not currently steal the spotlight in the AI phenomenon, it plays a pivotal role in its long-term success. The Milvus vector database can seamlessly scale out to accommodate billions of vectors. Milvus helped one LLM customer store, process and retrieve an astounding 10 billion data points, just one example of Milvus’s ability to handle massive volumes of data.

Beyond the Numbers: New Insights into Vector Databases

Beyond the numerical milestones, 2023 has enriched us with valuable insights into the subtle nuances and evolving dynamics of vector search technology.

LLM Apps Are Still in the Early Stages

Reflecting on the early days of the mobile internet boom, many developers created simple apps like flashlights or weather forecasts that eventually were integrated into smartphone operating systems. Last year, most AI native applications, like AutoGPT, which rapidly hit 100,000 stars on GitHub, didn’t deliver practical value but represented meaningful experiments. For vector database applications, the current use cases may just be the first wave of AI native transformations.

Vector Databases Go toward Diversification

Similar to the evolution of databases into categories like online transaction processing (OLTP), online analytical processing (OLAP) and NoSQL, vector databases show a clear trend toward diversification. Departing from the conventional focus on online services, offline analysis has gained significant traction. Another notable instance of this shift is the introduction of GPTCache, an open source semantic cache released in 2023. It enhances the efficiency and speed of GPT-based applications by storing and retrieving responses generated by language models.
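
The idea behind a semantic cache can be sketched as follows. This is not GPTCache's API: the class, the similarity threshold and the toy two-dimensional embeddings are assumptions for illustration. A real system would obtain embeddings from a model and look them up in a vector index rather than scanning a list.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Return a stored response when a new query is similar enough."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def put(self, embedding, response):
        self.entries.append((embedding, response))

    def get(self, embedding):
        best = max(
            self.entries,
            key=lambda entry: cosine(embedding, entry[0]),
            default=None,
        )
        if best is not None and cosine(embedding, best[0]) >= self.threshold:
            return best[1]
        return None  # cache miss: the caller would query the LLM


cache = SemanticCache()
cache.put([1.0, 0.0], "Paris")

print(cache.get([0.99, 0.1]))  # Paris
print(cache.get([0.0, 1.0]))   # None
```

A hit avoids a call to the language model entirely, which is where the savings in latency and cost come from.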

Vector Operations Are Becoming More Complicated

While supporting approximate nearest neighbor (ANN) search is a defining feature of vector databases, it doesn’t stand alone. The common belief that merely supporting nearest neighbor search is sufficient to classify a database as a vector or AI native database oversimplifies the intricacies of vector operations. Beyond the basic capabilities of hybrid scalar filtering and vector search, databases tailored for AI native applications should support more sophisticated semantic capabilities like neural network (NN) filtering, K-nearest neighbors (KNN) join and cluster querying.
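
As a baseline for what hybrid scalar filtering and vector search means, here is an exact, brute-force version in Python. It is a sketch under stated assumptions: the item layout, field names and sample data are hypothetical, and a production database would use an approximate index instead of scanning every item.

```python
import heapq
import math


def filtered_knn(query, items, predicate, k=2):
    """Apply a scalar filter first, then return the k nearest survivors."""
    candidates = (item for item in items if predicate(item["meta"]))
    return heapq.nsmallest(
        k, candidates, key=lambda item: math.dist(query, item["vector"])
    )


# Hypothetical items: a vector plus scalar metadata
items = [
    {"id": 1, "vector": [0.0, 0.0], "meta": {"lang": "en"}},
    {"id": 2, "vector": [0.1, 0.0], "meta": {"lang": "de"}},
    {"id": 3, "vector": [0.5, 0.5], "meta": {"lang": "en"}},
    {"id": 4, "vector": [2.0, 2.0], "meta": {"lang": "en"}},
]

hits = filtered_knn([0.0, 0.0], items, lambda meta: meta["lang"] == "en")
print([hit["id"] for hit in hits])  # [1, 3]
```

Item 2 is the second-closest vector overall but is excluded by the filter, which is why filtering and search have to be handled together rather than as separate steps.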

Elastic Scalability Is Essential for AI Native Applications

The exponential growth of AI applications, exemplified by ChatGPT amassing over 100 million monthly active users in two months, likely surpasses any prior business trajectory. Swiftly scaling from 1 million to 1 billion data points becomes paramount once businesses hit their stride in growth. AI application developers benefit from the pay-as-you-go service model set by LLM providers, leading to substantial reductions in operational costs. Similarly, storing data that aligns with this pricing model proves advantageous for developers, allowing them to channel more attention toward core business.

Unlike LLMs and various other technological systems, vector databases operate in a stateful manner, requiring persistent data storage. Consequently, when selecting vector databases, it is crucial to prioritize elasticity and scalability to ensure alignment with the dynamic demands of evolving AI applications.

Machine Learning in Vector Databases Can Yield Extraordinary Results

In 2023, our substantial investment in the AI4DB (AI for database) projects yielded remarkable success. As part of our endeavors, we introduced two pivotal capabilities to Zilliz Cloud, the fully managed Milvus solution: 1) AutoIndex, an auto-parameter-tuning index rooted in machine learning and 2) a data-partitioning strategy based on data clustering. Both played a crucial role in significantly enhancing the search performance of Zilliz Cloud.

Open Source vs. Closed Source

Leading closed source LLMs like OpenAI’s GPT series and Claude put the open source community, which lacks comparable computational and data resources, at a disadvantage.

However, within vector databases, open source eventually will become the favored choice for users. Opting for open source introduces many advantages, including more diverse use cases, expedited iteration and cultivating a more robust ecosystem.

Furthermore, database systems are so intricate that they cannot afford the opacity often associated with LLMs. Users must thoroughly understand the database before choosing the most reasonable approach for its use. Moreover, the transparency ingrained in open source empowers users to customize the database according to their needs.

Epilogue and a New Beginning

It is exciting to see the innovation from the many AI startups founded in 2023. It reminds me why I got into VectorDB development in the first place. In 2024, all these innovative applications will gain real traction, attracting not just funding but real paying customers, which will bring different requirements for these developers.

We are hopeful and excited to witness even more diversified applications and system designs in vector databases in the coming year.

Let’s make extraordinary things happen in 2024!

TNS owner Insight Partners is an investor in: Docker.