October 19, 2024

Many Vector Databases…

Many vector databases are now openly available, and it is quite hard to know the traits of each one.

But I found an interesting article, Vector Database Benchmarks, on the Qdrant site.

https://qdrant.tech/benchmarks/

They compare Elasticsearch, Milvus, Qdrant, Redis, and Weaviate in terms of computation time and search capability.

Which one to choose?

As they note themselves, they are probably biased. But the benchmark results for Qdrant look quite strong. Qdrant notably achieves high requests-per-second (RPS) rates and low latencies across multiple scenarios, outperforming competitors like Elasticsearch, Milvus, Redis, and Weaviate in specific aspects. The benchmarks emphasize the trade-offs between speed, precision, and resource utilization in vector database performance.

The differences Qdrant claims between vector databases

  • Qdrant achieves highest RPS and lowest latencies in almost all the scenarios, no matter the precision threshold and the metric we choose. It has also shown 4x RPS gains on one of the datasets.
  • Elasticsearch has become considerably fast for many cases but it’s very slow in terms of indexing time. It can be 10x slower when storing 10M+ vectors of 96 dimensions! (32mins vs 5.5 hrs)
  • Milvus is the fastest when it comes to indexing time and maintains good precision. However, it’s not on-par with others when it comes to RPS or latency when you have higher dimension embeddings or more number of vectors.
  • Redis is able to achieve good RPS but mostly for lower precision. It also achieved low latency with single thread, however its latency goes up quickly with more parallel requests. Part of this speed gain comes from their custom protocol.
  • Weaviate has improved the least since our last run. Because of relative improvements in other engines, it has become one of the slowest in terms of RPS as well as latency.
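To make the three quantities in these claims concrete (RPS, latency, and precision), here is a minimal sketch of how they could be measured for any engine. It is not the Qdrant benchmark code. The names `run_benchmark`, `precision_at_k`, and `search_fn` are hypothetical, and `search_fn` stands in for whatever client call sends one query to the database.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def precision_at_k(returned_ids, true_ids):
    """Fraction of the exact top-k neighbours that the engine returned."""
    if not true_ids:
        return 0.0
    return len(set(returned_ids) & set(true_ids)) / len(true_ids)


def run_benchmark(search_fn, queries, parallel=1):
    """Return (rps, mean_latency_seconds, results) for a batch of queries.

    search_fn is a hypothetical callable that sends one query to the engine
    under test; parallel is the number of concurrent client threads.
    """
    def timed(query):
        start = time.perf_counter()
        result = search_fn(query)
        return time.perf_counter() - start, result

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        timings = list(pool.map(timed, queries))  # keeps query order
    wall = time.perf_counter() - wall_start

    latencies = [elapsed for elapsed, _ in timings]
    results = [result for _, result in timings]
    return len(queries) / wall, mean(latencies), results
```

Running the same queries with `parallel=1` and then with a larger value shows the kind of effect described for Redis: throughput may rise while per-request latency also goes up.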

Other differences?

Qdrant's benchmarks focus primarily on search performance. Naturally, other factors should also be considered. Here are my thoughts:

  • Elasticsearch, along with the Elastic Stack, offers numerous features, including functionality that helps with hybrid search in languages other than English, such as Japanese. It can also be operated as a cluster.
  • Similarly, Redis Stack offers a beautiful interface and the appeal of being an in-memory database. When used through LlamaIndex, Qdrant currently cannot manage documents directly, whereas Redis can manage Documents and VectorIndexes together, which has the advantage of not complicating the service.
  • Milvus is built with cluster operations in mind. Furthermore, through the Role-Based Access Control (RBAC) feature, it allows for setting different data access rights for each user. RBAC achieves detailed access control by assigning permissions to roles first and then assigning those roles to users, instead of directly assigning permissions to users.
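The role-based model described for Milvus can be illustrated with a small, generic sketch. This is not the Milvus API. The class `AccessControl` and its methods are hypothetical names, used only to show permissions being attached to roles and roles being attached to users.

```python
class AccessControl:
    """Toy RBAC model: permissions belong to roles, roles belong to users."""

    def __init__(self):
        self.role_permissions = {}  # role -> set of (resource, action)
        self.user_roles = {}        # user -> set of roles

    def grant(self, role, resource, action):
        """Give a role permission to perform an action on a resource."""
        self.role_permissions.setdefault(role, set()).add((resource, action))

    def assign(self, user, role):
        """Give a user a role; the user never receives permissions directly."""
        self.user_roles.setdefault(user, set()).add(role)

    def is_allowed(self, user, resource, action):
        """A user is allowed if any of their roles holds the permission."""
        return any(
            (resource, action) in self.role_permissions.get(role, set())
            for role in self.user_roles.get(user, set())
        )


acl = AccessControl()
acl.grant("reader", "articles", "search")
acl.grant("writer", "articles", "insert")
acl.assign("alice", "reader")
acl.assign("bob", "reader")
acl.assign("bob", "writer")
```

Here `alice` can only search the hypothetical `articles` collection, while `bob` can both search and insert. Changing what readers may do means editing one role rather than every user.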