The pricing page for self-hosted Qdrant says $0. Milvus says $0. pgvector says $0. That's the advertised cost. Then you actually run it in production and the bills start showing up in places you didn't expect.
This post prices out the real TCO of self-hosting a vector database. Not the compute cost, which everyone already knows. The stuff that shows up on month three when you realize "open source" doesn't mean "free to operate."
## The Quoted Cost
Every self-hosted vector database pitch starts the same way. "Skip the $500/month Pinecone bill. Self-host for $100 on a single VM." The math looks obvious.
For a 10M vector workload:
| Option | Monthly Cost |
|---|---|
| Pinecone (managed) | $500-1,500 |
| Qdrant self-hosted (1 node) | $100-300 |
| Milvus self-hosted (1 node) | $150-400 |
| pgvector on existing Postgres | $0 (marginal) |
Those self-hosted numbers are correct. They're also incomplete. What they measure is the VM rental cost. Not the cost of running a vector database on that VM.
## Hidden Cost #1: High Availability
A single-node setup is fine until it's not. When that node reboots, your search is down. When the disk fills up, your search is down. When a kernel security patch forces a reboot, your search is down.
The fix is a three-node cluster. Which means the $100 VM becomes three $100 VMs, plus a load balancer, plus whatever you use to coordinate them. Suddenly your $100 pricing-page number is $350-500/month.
And that's before you've thought about:
- Multi-AZ deployment for cloud-provider availability zone failures
- Backups that actually restore (not just snapshot, but tested)
- Read replicas if your query volume saturates a single node
Pinecone does all of this invisibly. You pay $500/month and you get HA. Self-hosting, you pay for the VMs and do the HA engineering yourself.
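The jump from the pricing-page number to the HA number is simple arithmetic. A minimal sketch, using illustrative prices (the VM, load balancer, and coordination figures below are assumptions, not quotes from any provider):

```python
# Single node vs 3-node HA cluster. All prices are illustrative
# assumptions in $/month, not quotes from any cloud provider.
VM_MONTHLY = 100       # one mid-size VM
LB_MONTHLY = 25        # managed load balancer
COORD_MONTHLY = 30     # coordination overhead (e.g. consensus traffic)

single_node = VM_MONTHLY
ha_cluster = 3 * VM_MONTHLY + LB_MONTHLY + COORD_MONTHLY

print(f"single node: ${single_node}/mo")  # $100/mo
print(f"3-node HA:   ${ha_cluster}/mo")   # $355/mo
```

Multi-AZ placement, tested backups, and read replicas all add on top of that base.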
## Hidden Cost #2: Engineering Time
This is the big one. And it's the one the pricing comparisons always skip.
Setting up a production vector database cluster is not a one-weekend job. It includes:
- Choosing the right VM types, storage tiers, and network config
- Configuring replication, sharding, and consistency settings
- Writing Terraform or Pulumi for reproducible deployments
- Setting up backup/restore (and testing it)
- Monitoring, alerting, and dashboards
- Load testing to find breakpoints before production does
- Documentation so the next engineer can maintain it
Realistic timeline: 3-6 weeks for an engineer with prior Kubernetes/infra experience. Longer if they're learning as they go.
At a $150K/year total comp (a conservative US figure), that's $9,000-$18,000 in upfront engineering cost before a single query runs. Even at EU rates, you're in the $5K-$10K range.
This isn't an argument against self-hosting. It's the argument that the real cost isn't free software plus hardware. It's free software plus hardware plus the labor to operate it.
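The upfront figures above come from a simple rate-times-weeks calculation. A sketch, assuming roughly 50 working weeks per year (the comp numbers are the post's examples, not benchmarks):

```python
# Upfront build cost: annual comp prorated over the build duration.
# Assumes ~50 working weeks/year; comp figures are illustrative.
def build_cost(annual_comp: float, weeks: float) -> float:
    return annual_comp / 50 * weeks

print(build_cost(150_000, 3))  # 9000.0  (low end, US comp)
print(build_cost(150_000, 6))  # 18000.0 (high end, US comp)
```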
## Hidden Cost #3: The On-Call Tax
Once your vector database is in production, it's part of your on-call rotation. Every alert, every 3am page, every "search is slow" ticket — somebody has to debug it. And that somebody needs to understand:
- How HNSW indexing works, and what makes it slow
- Why p99 latency spiked when p50 is fine
- Whether the memory pressure is from the index, the cache, or a leak
- How to safely resize a cluster without downtime
- What to do when a node's disk fills up and the service crashes
None of this is in the Qdrant docs. You learn it in production, usually at the worst time.
Budget: 0.25 FTE for ongoing operations of a small cluster. 0.5-1 FTE for a big one. That's $30K-$150K/year of engineering time, depending on headcount and comp.
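The FTE budget translates to dollars the same way as the build cost. A sketch with assumed comp figures bracketing the range above:

```python
# Ongoing ops cost as a fraction of an engineer's annual comp.
# FTE fractions match the text; comp figures are assumptions.
def ops_cost(annual_comp: float, fte: float) -> float:
    return annual_comp * fte

print(ops_cost(120_000, 0.25))  # 30000.0  (small cluster, low end)
print(ops_cost(150_000, 1.0))   # 150000.0 (big cluster, high end)
```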
## Hidden Cost #4: Upgrades
Vector databases are young. New versions come out every month. Every version has:
- New features you want
- Performance improvements you want
- Bug fixes you need
- Breaking changes you don't want
Upgrading is not `apt-get update`. You need to test the new version against your workload, schedule a maintenance window (or do a rolling upgrade, which is more complex), and watch for regressions in the days after.
If you skip upgrades, you fall behind. Eventually you're 10 versions back and the upgrade path is a migration project. Teams that don't plan for upgrades end up doing a painful rewrite every 18 months.
## Hidden Cost #5: Re-Indexing
At some point you're going to want to change something. A new embedding model. A different vector dimension. A better index type. A new distance metric.
Any of these means re-embedding and re-indexing your entire dataset. For 10M vectors:
- Compute to re-embed: GPU cost to run every document through the new model. See our 1M image cost breakdown — roughly $30-100 for 10M items on a spot g6.xlarge.
- Double-capacity window: you need the old and new indexes online simultaneously during cutover
- Engineering time: a couple weeks to build the pipeline, test, and migrate
Managed services handle some of this for you. If Pinecone upgrades its underlying index, you don't notice. If you self-host, you're the one scheduling the re-indexing job and watching it run for three days.
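The GPU line item above can be estimated back-of-envelope. A sketch assuming a $0.50/hr spot price and 50 items/sec of embedding throughput on one GPU (both are assumptions; measure your own model's throughput before trusting the number):

```python
# Back-of-envelope re-embedding cost for 10M items.
# Throughput and spot price are assumed inputs, not measurements.
ITEMS = 10_000_000
ITEMS_PER_SEC = 50      # assumed embedding throughput on one GPU
SPOT_PRICE_HR = 0.50    # assumed $/hour for a spot GPU instance

hours = ITEMS / ITEMS_PER_SEC / 3600
cost = hours * SPOT_PRICE_HR
print(f"{hours:.1f} GPU-hours, ~${cost:.0f}")  # 55.6 GPU-hours, ~$28
```

That lands near the low end of the $30-100 range; a slower model or on-demand pricing pushes it toward the high end.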
## Hidden Cost #6: Observability
A production vector database needs metrics. Not just "is the server up" metrics. The kind of metrics that tell you:
- p50/p95/p99 query latency
- Index memory pressure
- Queue depths for inserts
- Cache hit rates
- Replication lag
The vector database itself exports some of these. Hooking them into Prometheus/Grafana, setting up alerts, and tuning the alert thresholds so you get paged for real problems and not noise — that's a project. A week of engineering time minimum, with ongoing tuning.
Plus the infra cost. A dedicated monitoring stack runs $50-200/month depending on retention. Datadog or a managed equivalent runs $300-1,000/month for a modest workload.
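To see why you need the full percentile spread and not just an average, consider the p99-spikes-while-p50-is-fine pattern from the on-call section. A synthetic sketch using only the standard library:

```python
# Why p99 can spike while p50 stays flat: mix 99% fast queries
# with 1% slow ones and look at the percentiles. Data is synthetic.
import statistics

fast = [10.0] * 990   # baseline queries at 10 ms
slow = [500.0] * 10   # 1% hitting disk, GC, or a cold cache

qs = statistics.quantiles(fast + slow, n=100)
# p50 stays at the 10 ms baseline; p99 jumps toward the slow tail.
print(f"p50={qs[49]}ms p99={qs[98]:.0f}ms")
```

An average over the same data would barely move, which is why latency alerts should key on high percentiles.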
## Hidden Cost #7: Security and Compliance
"We self-host because of compliance" is a common reason. Fair. But self-hosting doesn't automatically make you compliant. You still need:
- Encryption at rest (configured correctly)
- Encryption in transit (with proper cert rotation)
- Network isolation (VPCs, security groups, no public access)
- Access controls (who can query, who can insert, who can drop the collection)
- Audit logs that someone actually reviews
- Regular vulnerability scans and patching
Most managed vector databases are SOC 2 certified, so your compliance story piggybacks on theirs. Self-hosted, you own the entire compliance perimeter. That's weeks of work for initial compliance and ongoing effort to keep it.
## The Real TCO
Let's add it up for a 10M vector workload running on self-hosted Qdrant with HA:
| Cost Category | Monthly | Annual |
|---|---|---|
| 3-node cluster VMs | $350 | $4,200 |
| Load balancer + networking | $30 | $360 |
| Monitoring stack | $100 | $1,200 |
| Backups + storage | $50 | $600 |
| **Infrastructure subtotal** | **$530** | **$6,360** |
| Initial build (amortized over 2 yrs) | $500 | $6,000 |
| Ongoing ops (0.25 FTE) | $2,500 | $30,000 |
| **True TCO** | **$3,530** | **$42,360** |
The infrastructure is $530/month. The real cost is $3,530/month. That's what the pricing comparisons miss.
Compare that to Pinecone at $500-1,500/month for the same workload. Once you include engineering time, the managed option is 2-7x cheaper.
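The table reduces to a few lines of arithmetic, which makes it easy to rerun with your own team's rates:

```python
# The TCO table as code. Swap in your own rates to rerun the math.
infra = {  # $/month, from the table above
    "3-node cluster VMs": 350,
    "load balancer + networking": 30,
    "monitoring stack": 100,
    "backups + storage": 50,
}
build_amortized = 12_000 / 24  # ~$12K build over 2 years -> $500/mo
ops = 30_000 / 12              # 0.25 FTE at $120K -> $2,500/mo

infra_monthly = sum(infra.values())
tco_monthly = infra_monthly + build_amortized + ops
print(f"infra: ${infra_monthly}/mo")        # $530/mo
print(f"true TCO: ${tco_monthly:.0f}/mo")   # $3530/mo
```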
## When Self-Hosting Actually Wins
Self-hosting isn't always a bad choice. It wins in specific scenarios:
**Very large workloads.** Past 100M vectors, the managed pricing gets steep. Self-hosted Qdrant or Milvus at that scale can be 3-5x cheaper, and the engineering overhead amortizes over a bigger bill.
**Existing platform team.** If you already have a platform team running Kubernetes and databases, adding a vector database is marginal. The engineering tax is already paid.
**Hard compliance or data residency.** Some regulated environments require data to never leave your VPC. Self-hosting might be the only option, and the engineering cost is just the cost of doing business.
**Research or experimentation.** For non-production workloads where uptime doesn't matter, self-hosting is fine. Spin up a single Qdrant node, don't worry about HA, move on.
For everyone else — small teams, startups, internal tools, prototypes — managed wins. The math on engineering time is brutal.
## The Alternative
There's a third option that doesn't show up in "self-hosted vs managed" comparisons: not running a vector database at all.
For a lot of use cases, you don't need to manage a vector database. You need search that works. Vecstore is one option in this category — a search API that handles embedding, vector storage, and query serving in a single service. No cluster to run, no index to tune, no engineering team to maintain.
This is the real question to ask before choosing a vector database: do I need a vector database, or do I need search? If the answer is search, the vector database is just infrastructure overhead.
We covered this in more detail in *You Don't Need a Vector Database*.
## Wrapping Up
The quoted cost of self-hosting a vector database is real. It's also the smallest line item on the actual bill. Engineering time, on-call burden, upgrades, and operations dwarf the compute cost in almost every scenario.
Before picking self-hosted based on the pricing page, run the real TCO numbers. Include engineering time at your team's actual rate. Include the on-call tax. Include the re-indexing you'll do in 12 months when you switch embedding models.
If the math still works, self-host. If not, the managed option was probably cheaper all along.
See how Vecstore compares or sign up for the free tier to skip the infra conversation entirely.