Picking a Vector DB in 2026: A Decision Framework

By OrionAI Build Editorial · Published 2026-05-10

I've helped four teams pick a vector DB this year, and each pick was different. Here is the framework that gets to the right answer in 15 minutes.

The four questions

How many vectors, today and at 12-month projection?
What's your latency budget — P95, under your real load?
What's your filtering pattern — heavy metadata filtering, geo filters, or pure-vector?
How much ops burden are you willing to carry?
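
The first two questions are easier to answer with numbers in hand. A minimal sketch, assuming a fixed monthly growth rate and a list of latency samples captured under your real load; the function names and the nearest-rank P95 method are illustrative choices, not something a particular database prescribes:

```python
def project_vectors(current: int, monthly_growth: float, months: int = 12) -> int:
    """Compound the current vector count forward at a fixed monthly growth rate."""
    if current < 0 or monthly_growth < 0:
        raise ValueError("current and monthly_growth must be non-negative")
    return round(current * (1 + monthly_growth) ** months)


def p95(latencies_ms: list[float]) -> float:
    """Nearest-rank P95: the smallest sample that at least 95% of samples do not exceed."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Integer ceiling of 0.95 * n, which avoids floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]
```

Feed `p95` with samples taken while the system is serving realistic traffic. A P95 measured on an idle box says little about the budget you will actually hit.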

The answer matrix

Under 1M vectors, light filtering

pgvector. It's an extension for the Postgres you probably already run. Don't add new infrastructure for <1M vectors.

1M to 10M vectors, light filtering

Qdrant or Weaviate self-hosted on a $20-$60/month box. pgvector starts to slow on similarity search around the 5M-vector mark unless you tune carefully.

1M to 10M vectors, heavy metadata filtering

Qdrant. Its filterable HNSW index handles the pre-filter cases that pgvector struggles with.

10M+ vectors

Managed Qdrant Cloud, Pinecone, or Weaviate Cloud. The ops burden of self-hosting at scale is real and grows.

Multi-tenant SaaS pattern

Pinecone or Qdrant Cloud, namespaces per tenant. Self-hosting multi-tenant vector search at scale is a full-time job for someone.

Geo-filtered search (location-aware)

pgvector with PostGIS, or Qdrant with geo filters. Not Pinecone — geo support is weaker.
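
The matrix above can be written down as a small lookup. This is a sketch, not a definitive rule engine: the `Workload` fields and the `recommend` function are hypothetical names, and the precedence is an assumption (geo first, then multi-tenant, then size), since the matrix does not say which row wins when several apply. Exactly 10M vectors is treated as the 10M+ row.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical input shape: your answers to the four questions.
    vectors: int  # the larger of today's count and the 12-month projection
    heavy_metadata_filtering: bool = False
    geo_filtering: bool = False
    multi_tenant: bool = False


def recommend(w: Workload) -> str:
    """Return the matrix row that applies (row precedence is an assumption)."""
    if w.geo_filtering:
        return "pgvector with PostGIS, or Qdrant with geo filters"
    if w.multi_tenant:
        return "Pinecone or Qdrant Cloud, namespaces per tenant"
    if w.vectors >= 10_000_000:
        return "Managed Qdrant Cloud, Pinecone, or Weaviate Cloud"
    if w.vectors >= 1_000_000:
        if w.heavy_metadata_filtering:
            return "Qdrant"
        return "Qdrant or Weaviate, self-hosted"
    if w.heavy_metadata_filtering:
        # The matrix has no row for this case, so defer to the 15-minute test.
        return "not covered by the matrix: run the 15-minute test"
    return "pgvector"
```

For example, `recommend(Workload(vectors=4_000_000, heavy_metadata_filtering=True))` returns `"Qdrant"`.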

What people optimise for that doesn't matter

Recall@1 on standard benchmarks. Differences are small in real workloads. Test on your data.
Raw query latency. A vector lookup in the ~10ms range is dwarfed by the LLM call that follows it.
"Latest paper algorithm." HNSW is fine. IVFPQ is fine. New is not always better.
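
The latency point is simple arithmetic. A small sketch, where the 800 ms LLM call is a made-up illustrative figure rather than a measurement:

```python
def lookup_share(vector_ms: float, llm_ms: float) -> float:
    """Fraction of end-to-end latency spent in the vector lookup."""
    total = vector_ms + llm_ms
    if vector_ms < 0 or llm_ms < 0 or total == 0:
        raise ValueError("latencies must be non-negative and not both zero")
    return vector_ms / total


# Hypothetical numbers: a 10 ms lookup in front of an 800 ms LLM call.
share = lookup_share(10, 800)
print(f"vector lookup is {share:.1%} of the request")
```

With those assumed numbers the lookup is a little over 1% of the request, so halving it would save about 5 ms out of roughly 810 ms.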

What does matter and is under-measured

Reindex time when you change embedding model — it will happen.
Backup and restore story.
How easily you can run it locally for dev.
Metadata schema flexibility (adding a field shouldn't be a migration).
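
Reindex time is one of the few items here you can estimate before committing. A back-of-the-envelope sketch, assuming the re-embed and upsert phases run one after the other; the throughput figures in the example are placeholders to replace with rates you measure yourself:

```python
def reindex_hours(num_vectors: int, embed_per_sec: float, upsert_per_sec: float) -> float:
    """Serial estimate: re-embed every document, then upsert every vector."""
    if embed_per_sec <= 0 or upsert_per_sec <= 0:
        raise ValueError("throughput rates must be positive")
    seconds = num_vectors / embed_per_sec + num_vectors / upsert_per_sec
    return seconds / 3600


# Hypothetical rates: 5M vectors, 500 embeddings/s, 2,000 upserts/s.
print(f"{reindex_hours(5_000_000, 500, 2_000):.1f} hours")
```

If the two phases are pipelined, the slower one sets the total, so this serial figure works as an upper bound.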

The 15-minute test

Take 1,000 of your real vectors. Index them in three candidate databases. Run 100 of your real queries against each. If two options give similar quality, pick the one with the smaller ops surface. If one option clearly fails, you've saved yourself months.
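
One way to put a number on "similar quality" is to compare each candidate's top-k results with an exact brute-force search over the same vectors. The harness below is a sketch under that assumption. `SearchFn` is a hypothetical adapter you would write per database, taking a query and `k` and returning the positions of the matching vectors in your list; cosine similarity is also an assumed metric, so swap in whatever your embeddings use.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]
SearchFn = Callable[[Vector, int], list[int]]  # (query, k) -> indices into `vectors`


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, with 0.0 for zero-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def exact_top_k(vectors: Sequence[Vector], query: Vector, k: int) -> list[int]:
    """Brute-force ground truth: indices of the k most similar vectors."""
    order = sorted(range(len(vectors)), key=lambda i: cosine(vectors[i], query), reverse=True)
    return order[:k]


def mean_overlap(search: SearchFn, vectors: Sequence[Vector],
                 queries: Sequence[Vector], k: int = 10) -> float:
    """Average fraction of the exact top-k that a candidate's search returns."""
    if not vectors or not queries or k < 1:
        raise ValueError("need vectors, queries, and k >= 1")
    total = 0.0
    for q in queries:
        truth = set(exact_top_k(vectors, q, k))
        got = set(search(q, k))
        total += len(truth & got) / len(truth)
    return total / len(queries)
```

At 1,000 vectors and 100 queries the brute-force pass is cheap enough to run in plain Python. Overlap with exact search only measures index quality, so it is still worth reading a sample of the actual results for your real queries.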
