The comparison matrix
Eleven databases across the four deployment shapes, compared on the five dimensions that decide fit and cost. Each yes or no is read from the vendor's own repository, documentation, or pricing page; partial denotes availability through an integration or full-text add-on rather than a single native call. At scale, the pricing model is the column to read, not the headline price. The full machine-readable matrix is in data.json.
| Database | Deployment shape | Open-source license | Hybrid search | Metadata filtering | Pricing model | Best for |
|---|---|---|---|---|---|---|
| pgvector | Extension (in Postgres) | PostgreSQL License | partial (Postgres FTS) | yes (SQL) | free (pay your Postgres) | apps already on Postgres, under ~10M vectors |
| Qdrant | Self-Hosted + Managed | Apache-2.0 | yes (sparse+dense) | yes (strong, Rust) | usage / free tier | fast filtered search at low self-host cost |
| Weaviate | Self-Hosted + Managed | BSD-3-Clause | yes (native BM25) | yes | resource-based | native hybrid search and module ecosystem |
| Milvus / Zilliz | Self-Hosted + Managed | Apache-2.0 | yes (2.4+) | yes | free OSS / usage (Zilliz) | billion-scale distributed workloads |
| Pinecone | Serverless (Managed) | proprietary | yes (sparse-dense) | yes | usage (serverless) | zero-ops fully-managed RAG |
| Chroma | Embedded + Cloud | Apache-2.0 | partial (integration) | yes | free OSS / cloud usage | prototyping and local-first RAG |
| LanceDB | Embedded + Cloud | Apache-2.0 | partial (FTS) | yes | free OSS / cloud | multimodal and edge / embedded |
| turbopuffer | Serverless (Managed) | proprietary | yes (BM25+vector) | yes | usage (object-storage) | storage-heavy, cost-optimized workloads |
| Vespa | Self-Hosted + Managed | Apache-2.0 | yes (native ranking) | yes | free OSS / resource (Cloud) | complex ML-driven ranking over text+vectors |
| Marqo | Self-Hosted + Cloud | Apache-2.0 | yes | yes | free OSS / cloud | end-to-end embedding generation + storage |
| Redis (Query Engine) | Self-Hosted + Managed | RSALv2 / SSPLv1 / AGPLv3 | yes (vector + text) | yes | free OSS / Redis Cloud | low-latency vectors where Redis is already in the stack |
Deployment shapes, licenses, hybrid support, and pricing models reflect each vendor's public repository and documentation as of June 2026 and change often; verify on the vendor's own page before a purchase decision. Redis is treated as only partially open source because version 8 ships under a tri-license (RSALv2/SSPLv1/AGPLv3) rather than a single permissive license; the license column lists all three. The narrative companion to this dataset is Best Vector Databases for RAG 2026.
Methodology
Each cell is read from the vendor's own public repository, documentation, or pricing page as of June 2026. Open-source licenses are verified from the project's GitHub repository (Redis confirmed as the Redis 8 tri-license RSALv2/SSPLv1/AGPLv3, not a single permissive license). Hybrid search is marked yes only where a single native query path is documented; partial denotes availability through an integration or full-text add-on. Metadata filtering is near-universal across the category. Vendor performance and cost claims are attributed, not asserted; no first-party recall or latency benchmark is encoded here.
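The yes / partial / no convention described above can be separated from its parenthetical qualifier before any filtering is done. The following is a minimal sketch, assuming cells are written the way they appear in the table (for example `partial (Postgres FTS)`); `Cell` and `parse_cell` are hypothetical names introduced for illustration, not part of the dataset.

```python
import re
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    status: str            # "yes", "partial", or "no"
    note: Optional[str]    # parenthetical qualifier, or None if absent


# Assumed cell shape: a status word, optionally followed by "(qualifier)".
_CELL = re.compile(r"^(yes|partial|no)\b\s*(?:\((.*)\))?\s*$", re.IGNORECASE)


def parse_cell(text: str) -> Cell:
    """Split a matrix cell such as 'yes (sparse+dense)' into status and note."""
    match = _CELL.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized cell value: {text!r}")
    status, note = match.group(1).lower(), match.group(2)
    return Cell(status, note)


if __name__ == "__main__":
    for raw in ["partial (Postgres FTS)", "yes (sparse+dense)", "yes"]:
        print(raw, "->", parse_cell(raw))
```

Keeping the status and the note apart means a query for native hybrid search can compare against `"yes"` exactly, while the qualifier stays available for display.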
The dataset tracks five dimensions: deployment_shape, open_source_license, hybrid_search, metadata_filtering, and pricing_model, plus each tool's "best for" fit. The four-shape reference model (Embedded, Self-Hosted, Managed, Serverless) and the rankings were fixed before any monetization check; there is no paid placement and no sponsorship from any database listed.
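As a rough illustration of how the five dimensions might be queried, the sketch below loads a few rows and shortlists by license and hybrid support. The layout of data.json is not documented here, so the structure is an assumption: a JSON array of objects keyed by the field names above, plus hypothetical `database` and `best_for` keys. The sample rows are copied from the table; `Entry`, `load_entries`, and `shortlist` are illustrative names only.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Assumed shape of data.json: a list of flat objects. Rows copied from the table.
SAMPLE = """
[
  {"database": "pgvector", "deployment_shape": "Extension (in Postgres)",
   "open_source_license": "PostgreSQL License", "hybrid_search": "partial (Postgres FTS)",
   "metadata_filtering": "yes (SQL)", "pricing_model": "free (pay your Postgres)",
   "best_for": "apps already on Postgres, under ~10M vectors"},
  {"database": "Qdrant", "deployment_shape": "Self-Hosted + Managed",
   "open_source_license": "Apache-2.0", "hybrid_search": "yes (sparse+dense)",
   "metadata_filtering": "yes (strong, Rust)", "pricing_model": "usage / free tier",
   "best_for": "fast filtered search at low self-host cost"},
  {"database": "Pinecone", "deployment_shape": "Serverless (Managed)",
   "open_source_license": "proprietary", "hybrid_search": "yes (sparse-dense)",
   "metadata_filtering": "yes", "pricing_model": "usage (serverless)",
   "best_for": "zero-ops fully-managed RAG"},
  {"database": "Chroma", "deployment_shape": "Embedded + Cloud",
   "open_source_license": "Apache-2.0", "hybrid_search": "partial (integration)",
   "metadata_filtering": "yes", "pricing_model": "free OSS / cloud usage",
   "best_for": "prototyping and local-first RAG"}
]
"""


@dataclass(frozen=True)
class Entry:
    database: str
    deployment_shape: str
    open_source_license: str
    hybrid_search: str
    metadata_filtering: str
    pricing_model: str
    best_for: str


def load_entries(raw: str) -> list[Entry]:
    """Parse a JSON array of row objects into Entry records."""
    return [Entry(**row) for row in json.loads(raw)]


def shortlist(entries: list[Entry], *, license: Optional[str] = None,
              native_hybrid: bool = False) -> list[str]:
    """Return database names matching an exact license and/or native hybrid search."""
    names = []
    for entry in entries:
        if license is not None and entry.open_source_license != license:
            continue
        # "partial" cells are excluded when native hybrid search is required.
        if native_hybrid and not entry.hybrid_search.startswith("yes"):
            continue
        names.append(entry.database)
    return names


if __name__ == "__main__":
    entries = load_entries(SAMPLE)
    print(shortlist(entries, license="Apache-2.0", native_hybrid=True))
```

Against the real file, `SAMPLE` would be replaced by the contents of data.json, with the key names adjusted if the actual schema differs.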