Chat-agent apps have grown in significance, and so has the practice of giving large language models ample context, because it improves the quality of responses to user queries. Retrieval-Augmented Generation (RAG) enhances an LLM's capabilities by supplying additional context, leading to more accurate answers. It relies on a vector database to locate the information relevant to a given query, so choosing a reliable and scalable database matters, and it is not an easy choice.
I spent a week exploring various vector databases, including open-source and managed options. Each had its unique advantages and drawbacks. I tested their functionality by creating a simple database, inserting vectors, and experimenting with their configurations and querying processes.
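The insert-then-query loop I ran against each database can be sketched with a toy in-memory store. This is only an illustration of the idea, not how any of the databases below is implemented: real systems use approximate nearest-neighbor indexes rather than a full scan, and the `ToyVectorStore` class, the three-dimensional vectors, and the document texts are all made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical brute-force store: keeps every vector in a dict."""

    def __init__(self):
        self.records = {}

    def insert(self, record_id, vector, text):
        self.records[record_id] = (vector, text)

    def query(self, vector, top_k=2):
        # Score every stored vector against the query, highest first.
        scored = [
            (cosine_similarity(vector, stored), record_id, text)
            for record_id, (stored, text) in self.records.items()
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]


store = ToyVectorStore()
# Made-up embeddings; a real app would get these from an embedding model.
store.insert("doc-1", [0.9, 0.1, 0.0], "How to reset a password")
store.insert("doc-2", [0.1, 0.9, 0.0], "Pricing and billing FAQ")
store.insert("doc-3", [0.8, 0.2, 0.1], "Account recovery steps")

results = store.query([1.0, 0.0, 0.0], top_k=2)
for score, record_id, text in results:
    print(f"{record_id}  {score:.3f}  {text}")
```

In a RAG app, the texts returned by `query` are what gets added to the LLM prompt as extra context.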
The ongoing debate between self-hosted and managed vector databases centers on whether you are willing to maintain an additional stack in your application. Self-hosted options require you to monitor the system yourself, keep the servers performing well, and set up alerts for when load peaks. You may also need to handle scalability by adjusting server specifications and distributing the database across multiple servers. Managed vector databases, on the other hand, take just a few clicks to set up and run, and they include built-in monitoring dashboards and scalability management.
Let’s dive in.
Milvus
Milvus is an open-source vector database. It is the first name that comes to mind when I hear about vector databases. Many people in the data community talk about Milvus, and only now have I had a chance to try it myself. Besides the self-hosted option, which can be installed using Docker, there is also a managed Milvus offering from Zilliz. The installation guide is easy to follow: you download a docker-compose YAML file and run the components with Docker. There are three services: standalone, etcd, and minio.
Chroma
Chroma is another popular open-source vector database. I found it when I was looking for an alternative to Milvus. Installation is straightforward, and the tutorial documentation is easy to follow. It uses SQLite to store the vectors. It also ships with an embedding model, so you can save unstructured data directly and have it embedded for you. It offers two modes, flat files and server. Server mode is like running a database server that you access over HTTP.
Qdrant
Qdrant is the first managed vector database I tried. Their advertising is everywhere. The documentation is easy to follow and gives basic tutorials and examples. It is easy to set up the server and connect to it using the Python client library.
Qdrant's free tier provides 4GB of disk, 1GB of RAM, and 0.5 vCPU, which is quite generous for a free tier, and you can build a simple RAG application on top of it.
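To get a feel for what 1GB of RAM means in practice, here is a rough back-of-the-envelope estimate of how many vectors fit in memory. It is a sketch under stated assumptions, not an official sizing formula: vectors are stored as 4-byte floats, 1GB is treated as 1024³ bytes, and `overhead_factor` is an assumed multiplier for index and metadata overhead. The function name and the dimension values are mine, chosen for illustration.

```python
def estimate_vector_capacity(ram_bytes, dimensions,
                             bytes_per_value=4, overhead_factor=1.5):
    """Rough count of vectors that fit in RAM.

    overhead_factor is an assumption covering index structures and
    metadata; set it to 1.0 to count the raw vector data only.
    """
    bytes_per_vector = dimensions * bytes_per_value * overhead_factor
    return int(ram_bytes // bytes_per_vector)


ONE_GB = 1024 ** 3  # the free tier's RAM, taken as 1 GiB

# A few common embedding sizes, used here only as examples.
for dims in (384, 768, 1536):
    print(dims, estimate_vector_capacity(ONE_GB, dims))
```

Even with the assumed overhead, that is room for a six-figure number of vectors at these dimensions, which is plenty for a small RAG prototype.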
Pinecone
Pinecone is a closed-source vector database offered as a managed service. You can set up a project and its index within seconds. The hierarchy is very straightforward: Project > Index > Records. Other vector databases tend to have a deeper structure, such as Project > Database > Collection > Records.
Pinecone offers a free tier that consists of a single project, a single pod, and a single zone. My experiment used gcp-starter, located in Iowa (us-central1), with pod spec x1. However, the pod specifications are quite hard to understand. I found documentation explaining that the pod spec you need depends on the number and size of your vectors. The larger the records, the higher the pod spec has to be, and it appears to scale somewhat automatically with your data size.
MongoDB
MongoDB is a NoSQL database that supports vector search. Unlike the previous databases, which specialize in handling vectors, MongoDB provides vector search as one feature of a general-purpose database. I tried MongoDB Atlas and was able to create a vector search directly in it. The query is similar to the way you normally query a NoSQL database. The developer experience was not easy for me, since I do not use NoSQL daily.
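To show what "similar to the way you query NoSQL" looks like, here is a sketch of a vector search expressed as an Atlas aggregation pipeline using the `$vectorSearch` stage. This is not the exact query from my experiment: the index name `vector_index`, the field names `embedding` and `text`, and the helper function are hypothetical, and the pipeline is only built here, not sent to a cluster.

```python
def build_vector_search_pipeline(query_vector, index_name="vector_index",
                                 path="embedding", limit=3,
                                 num_candidates=50):
    """Build an aggregation pipeline for Atlas Vector Search.

    index_name and path are placeholders; they must match the vector
    search index and the document field defined in your own cluster.
    """
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": query_vector,
                # Candidates considered before the top `limit` are kept.
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            # Return only the text and the similarity score.
            "$project": {
                "_id": 0,
                "text": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3])
print(pipeline)

# With pymongo and a real collection, the pipeline would be run as:
# results = collection.aggregate(pipeline)
```

If you already write aggregation pipelines, this will feel familiar. If you do not, the nested stage syntax is the part that takes getting used to.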
Conclusion
Overall, vector databases are here to stay. Their capabilities open up new possibilities and shape the way we store and access unstructured data. I think almost all vector databases offer the same features; I haven’t seen any significant innovation that sets one apart from another. The pricing schemes and scaling capabilities do differ, though. Some charge by total records and API calls, while others charge based on the instance type. Which one fits best depends on your application and traffic.