Project monthly cost, 3-year TCO, and payback period for your RAG build. Compares pgvector, Pinecone, Weaviate, Qdrant. 100% client-side — your numbers stay in your browser.
AWS RDS db.m5.xlarge + storage. Cheapest at small scale, slower at >10M vectors.
Monthly RAG cost
FTE cost displaced
2 × $80,000 / year ÷ 12 months
Net monthly savings
Payback period: 4.7 months
3-year ROI
3-year TCO: $76,232
Vector DB comparison at this workload
| Database | Monthly | 3-year TCO | vs current |
|---|---|---|---|
| pgvector (self-hosted Postgres), selected | $451 | $76,232 | baseline |
| Pinecone Serverless | $319 | $71,480 | -$132.00/mo |
| Weaviate Cloud | $463 | $76,664 | +$12.00/mo |
| Qdrant (self-hosted) | $529 | $79,040 | +$78.00/mo |
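The "vs current" column appears to be each option's monthly cost minus the monthly cost of the selected option. A minimal Python sketch of that arithmetic, using the rounded monthly figures from the table; the function names `delta_vs_selected` and `format_delta` are hypothetical and are not the calculator's own code:

```python
# Rounded monthly figures copied from the comparison table above.
MONTHLY_COSTS = {
    "pgvector": 451,
    "Pinecone Serverless": 319,
    "Weaviate Cloud": 463,
    "Qdrant": 529,
}


def delta_vs_selected(monthly_costs: dict, selected: str) -> dict:
    """Monthly cost of each alternative minus the selected option's cost."""
    baseline = monthly_costs[selected]
    return {
        name: cost - baseline
        for name, cost in monthly_costs.items()
        if name != selected
    }


def format_delta(delta: float) -> str:
    """Render a delta with the sign in front of the dollar symbol."""
    sign = "-" if delta < 0 else "+"
    return f"{sign}${abs(delta):,.2f}/mo"


for name, delta in delta_vs_selected(MONTHLY_COSTS, "pgvector").items():
    print(f"{name}: {format_delta(delta)}")
```

With pgvector selected, this reproduces the three deltas in the table.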
Want a real ROI audit?
This calculator uses 2026 list pricing. Real production bills typically run 15-25% higher; real cost-optimization wins are typically 30-60% larger than the calculator can model. We do free 30-min audits for shortlists.
The calculator models steady-state monthly cost for a production RAG system using 2026 list pricing. Four cost components:
FTE savings calculation: multiplies the FTE count you enter by the annual loaded cost per FTE (salary + benefits + tooling, typically 1.3× base salary), then divides by 12 to get a monthly figure. Net monthly savings is that figure minus the monthly RAG cost.
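The FTE savings arithmetic can be sketched in a few lines of Python. This is an illustration of the description above, not the calculator's source; the function names are hypothetical, and the inputs are the example values shown on this page (2 FTEs at $80,000 per year, $451 monthly RAG cost):

```python
def monthly_fte_cost(fte_count: float, annual_loaded_cost: float) -> float:
    """Monthly cost of the FTEs the RAG system is expected to displace."""
    return fte_count * annual_loaded_cost / 12


def net_monthly_savings(
    fte_count: float, annual_loaded_cost: float, monthly_rag_cost: float
) -> float:
    """Displaced FTE cost minus the monthly RAG cost (negative means a net loss)."""
    return monthly_fte_cost(fte_count, annual_loaded_cost) - monthly_rag_cost


# Example inputs taken from the page: 2 FTEs, $80,000/year each, $451/month RAG cost.
fte_monthly = monthly_fte_cost(2, 80_000)
savings = net_monthly_savings(2, 80_000, 451)
print(f"FTE cost displaced: ${fte_monthly:,.2f}/mo")   # $13,333.33/mo
print(f"Net monthly savings: ${savings:,.2f}/mo")      # $12,882.33/mo
```

If you only know base salary, multiplying it by roughly 1.3 gives the loaded cost described above before passing it in.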
Payback period: your one-time build/ramp-up cost divided by monthly net savings. If the RAG system costs more than the FTEs it displaces, net savings are negative and payback is N/A (the system needs more scale to justify itself).
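A minimal Python sketch of the payback rule described above, including the N/A case. The function name `payback_months` is hypothetical, and the $60,000 build cost in the example is an assumed input chosen for illustration; the page does not state the build cost behind its displayed figures.

```python
from typing import Optional


def payback_months(
    one_time_cost: float, net_monthly_savings: float
) -> Optional[float]:
    """Months needed for net savings to cover the one-time build/ramp-up cost.

    Returns None (shown as N/A) when net savings are zero or negative,
    because the build cost is never recovered in that case.
    """
    if net_monthly_savings <= 0:
        return None
    return one_time_cost / net_monthly_savings


# Assumed build cost of $60,000 (hypothetical) and the net savings from the
# page's example (2 FTEs at $80,000/year minus a $451/month RAG cost).
months = payback_months(60_000, 12_882.33)
print("N/A" if months is None else f"{months:.1f} months")  # 4.7 months

# A system that costs more than the FTEs it displaces never pays back.
print(payback_months(60_000, -200))  # None
```

Returning `None` rather than a negative number keeps the N/A case explicit for whatever renders the result.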
What it doesn't do: capture spiky workloads, multi-region replication costs, fine-tuning costs (separate from RAG), reranker model costs (we assume single-stage retrieval), or the cost of getting RAG quality high enough to displace humans (which is a bigger gating factor than infrastructure cost). Real production bills are typically 15-25% higher than the calculator's projection.
For a real cost + quality audit on a RAG system you're already running — including the architecture-level optimizations the calculator can't see — book the call below.
Architecture patterns, chunking strategies, retrieval tuning, and production gotchas. The companion guide to this calculator.
Cost model for direct LLM API usage (no retrieval). Compare GPT-4o, Claude 3.5, Gemini, self-hosted Llama.
Score your organization across 7 dimensions before scoping a RAG build. Best taken before the calculators when you're not sure where to start.