SEARCH & AI

What Is a Vector Database (And Do You Actually Need One)?

Giorgi Kenchadze

2026-01-22 · 8 min read

Every few years, a new type of database shows up and suddenly it's "the answer" to everything. Graph databases had their moment. Time-series databases had theirs. Right now, vector databases are in the spotlight—and unlike some hype cycles, this one is grounded in a real shift in how applications handle data.

But "what is a vector database" is a surprisingly loaded question. The short answer: it's a database optimized for storing and searching high-dimensional vectors. The longer answer involves understanding what vectors are, why traditional databases can't handle them well, and—critically—whether you actually need one for your use case.

Vectors and Embeddings: What They Actually Are

A vector embedding is a list of numbers that represents the meaning of a piece of data. Text, images, audio—any unstructured data can be converted into an embedding using a neural network (an embedding model).

Take the sentence "how to train a puppy." An embedding model might convert that into a vector of 768 floating-point numbers. The sentence "tips for teaching a young dog" would produce a different list of numbers, but one that's geometrically close to the first—because the meanings are similar.

This is the key insight: similar meanings produce similar vectors. The distance between two vectors in this high-dimensional space reflects how related their source data is. A sentence about puppy training and a sentence about dog obedience land near each other. A sentence about tax law lands far away.

The same principle applies to images. A photo of a golden retriever and a photo of a labrador produce embeddings that are close together. A photo of a skyscraper does not.

These vectors typically have 256 to 1,536 dimensions, depending on the model. That's where things get computationally interesting.
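The "similar meanings produce similar vectors" idea can be seen with cosine similarity, the most common closeness measure for embeddings. This is a toy sketch: the 4-dimensional vectors below are made up for illustration (real embeddings have hundreds of dimensions and come from a model), but the geometry works the same way.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up 4-dimensional "embeddings" standing in for real model output.
puppy_training = np.array([0.9, 0.8, 0.1, 0.0])
dog_obedience = np.array([0.8, 0.9, 0.2, 0.1])
tax_law = np.array([0.0, 0.1, 0.9, 0.8])

print(cosine_similarity(puppy_training, dog_obedience))  # high, ~0.99
print(cosine_similarity(puppy_training, tax_law))        # low,  ~0.12
```

The two dog-related vectors point in nearly the same direction, so their similarity is close to 1; the tax-law vector points elsewhere and scores near 0.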

What a Vector Database Does Differently

A traditional relational database is built for exact matches. You query for rows where a column equals a value, falls within a range, or matches a pattern. The data structures behind this—B-trees, hash indexes—are optimized for precise lookups.

Vector databases solve a fundamentally different problem: vector similarity search. Given a query vector, find the most similar vectors in a collection of millions or billions. This isn't an exact match—it's a nearest-neighbor search in high-dimensional space.

Why can't you just use PostgreSQL with a vector column and calculate cosine similarity? You can, for small datasets. But brute-force comparison against every vector in a table scales linearly with the number of rows. At 10 million vectors with 768 dimensions each, every query touches ~7.7 billion floating-point numbers. That's seconds, not milliseconds.
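Here is what that brute-force scan looks like in NumPy, scaled down to 100,000 vectors so it runs quickly; the key point is that the cost is one dot product per stored vector, so it grows linearly with the collection size.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dim = 100_000, 768  # the article's example uses 10M; smaller here so it runs fast

# Pre-normalize once so cosine similarity reduces to a plain dot product.
db = rng.standard_normal((n, dim)).astype(np.float32)
db /= np.linalg.norm(db, axis=1, keepdims=True)
query = rng.standard_normal(dim).astype(np.float32)
query /= np.linalg.norm(query)

# Brute force: one dot product per stored vector, cost is O(n * dim) per query.
scores = db @ query
top5 = np.argsort(-scores)[:5]  # indices of the 5 most similar vectors
```

At this scale a single query is fast; multiply n by 100 and the same code is doing billions of multiplications per query, which is the problem ANN indexes exist to avoid.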

Vector databases use approximate nearest neighbor (ANN) algorithms to make this tractable. They trade a small amount of accuracy (returning, say, 95% of the true nearest neighbors rather than all of them) for orders-of-magnitude speed improvements. A well-tuned ANN index can search through 100 million vectors in under 10 milliseconds.

How Vector Search Works, Step by Step

Here's the full pipeline, from raw data to ranked results:

1. Embed your data. Run each document, image, or data point through an embedding model. This produces a vector for each item. For a catalog of 1 million products, you'd generate 1 million vectors.

2. Index the vectors. The vector database ingests these vectors and builds an index—a data structure that organizes them for fast similarity lookup. This is the expensive step. Depending on the algorithm, indexing 1 million 768-dimensional vectors might take minutes to hours.

3. Query. When a user searches, their query is embedded using the same model, producing a query vector. The database searches its index for the nearest neighbors to that query vector.

4. Rank and return. The database returns the top-K most similar vectors, along with their similarity scores and any metadata you stored alongside them. Your application uses these results to show search results, recommendations, or whatever the use case requires.
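The four steps above can be sketched end to end in a few lines. Two loud caveats: the `embed` function below is a bag-of-words stand-in for a real neural embedding model, and a plain matrix with brute-force scoring stands in for the ANN index a real vector database would build.

```python
import numpy as np

docs = [
    "how to train a puppy",
    "tips for teaching a young dog",
    "2024 corporate tax law changes",
]

# Stand-in embedding model: bag-of-words over the corpus vocabulary.
# A real pipeline would call a neural embedding model here instead.
vocab: dict[str, int] = {}
for doc in docs:
    for word in doc.lower().split():
        vocab.setdefault(word, len(vocab))

def embed(text: str) -> np.ndarray:
    v = np.zeros(len(vocab), dtype=np.float32)
    for word in text.lower().split():
        if word in vocab:
            v[vocab[word]] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

# 1. Embed your data. The "index" here is just a matrix of document vectors.
index = np.stack([embed(doc) for doc in docs])

# 2-3. Embed the query with the SAME model, then search for nearest neighbors.
query = embed("puppy training basics")
scores = index @ query  # cosine similarity, since vectors are unit length (or zero)

# 4. Rank and return the top-K results with their similarity scores.
k = 2
top = np.argsort(-scores)[:k]
results = [(docs[i], float(scores[i])) for i in top]
print(results[0])  # ('how to train a puppy', ~0.447): the closest match
```

Using the same model for documents and queries is essential: vectors from different models live in different spaces and their distances are meaningless.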

How Indexing Algorithms Make It Fast

The indexing layer is where vector databases earn their keep. Three families of algorithms dominate:

HNSW (Hierarchical Navigable Small World) builds a multi-layered graph where each node connects to its nearest neighbors. Searching is like navigating a skip list—you start at the top layer with long-range connections and drill down to finer layers. HNSW offers excellent query speed (sub-millisecond for millions of vectors) and high recall, but it requires the entire index to fit in memory. For 100 million 768-dimensional vectors stored as float32, that's roughly 300 GB of RAM.

IVF (Inverted File Index) partitions the vector space into clusters using k-means. At query time, it only searches the clusters closest to the query vector, skipping the rest. IVF uses less memory than HNSW and works well with disk-based storage, but recall degrades if you search too few clusters.

PQ (Product Quantization) compresses vectors by splitting them into sub-vectors and quantizing each sub-vector to its nearest centroid in a learned codebook. This dramatically reduces memory—a 768-dimensional float32 vector (3,072 bytes) can be compressed to 96 bytes. The trade-off is lower accuracy, especially for fine-grained similarity.

In practice, production systems often combine these. IVF+PQ is common for billion-scale datasets where memory is a constraint. HNSW alone works well up to tens of millions of vectors if you have the RAM.
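The memory arithmetic behind these trade-offs is worth doing explicitly. The figures below reproduce the numbers from this section; note that a real HNSW index also stores graph links on top of the raw vectors, so the flat figure is a lower bound for it.

```python
n, dim, bytes_per_float32 = 100_000_000, 768, 4

# Raw float32 vectors: what HNSW needs resident in RAM (plus graph links).
flat_bytes = n * dim * bytes_per_float32

# Product-quantized codes at 96 bytes per vector, as in the PQ example above.
pq_bytes = n * 96

print(f"flat float32: {flat_bytes / 1e9:.1f} GB")  # 307.2 GB
print(f"PQ codes:     {pq_bytes / 1e9:.1f} GB")    # 9.6 GB
```

A 32x compression ratio is the difference between a cluster of high-memory machines and a single commodity server, which is why IVF+PQ shows up so often at billion scale.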

Real Use Cases

The reason vector databases have gotten so much attention is that semantic search and related workloads are showing up everywhere:

  • Semantic search — Search by meaning instead of keywords. A query for "affordable flights to warm destinations" finds results about "cheap tickets to tropical locations." This is the most common use case and the one driving most adoption.

  • RAG (Retrieval-Augmented Generation) — LLM applications retrieve relevant context from a vector database before generating a response. This is how most production chatbots and AI assistants ground their answers in real data instead of hallucinating.

  • Recommendation engines — Embed users and items into the same vector space, then recommend items whose vectors are closest to a user's vector. Spotify and YouTube use variations of this approach at massive scale.

  • Image search — Embed images and text into a shared space (using models like CLIP) so users can search photos with natural language or find visually similar images.

  • Anomaly detection — In fraud detection or security monitoring, normal behavior forms clusters in vector space. Data points far from any cluster are flagged as anomalies.
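The anomaly-detection pattern in the last bullet can be sketched with a simple distance-to-centroid check. This is a minimal illustration on synthetic data, not a production detector: the clusters, threshold, and 8-dimensional vectors are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic "normal behavior": two tight clusters in an 8-dimensional space.
cluster_a = rng.normal(loc=0.0, scale=0.1, size=(100, 8))
cluster_b = rng.normal(loc=5.0, scale=0.1, size=(100, 8))
centroids = np.stack([cluster_a.mean(axis=0), cluster_b.mean(axis=0)])

def is_anomaly(point: np.ndarray, centroids: np.ndarray, threshold: float = 1.0) -> bool:
    """Flag a point whose distance to every known cluster centroid exceeds the threshold."""
    dists = np.linalg.norm(centroids - point, axis=1)
    return bool(dists.min() > threshold)

print(is_anomaly(np.full(8, 0.05), centroids))  # near cluster A -> False
print(is_anomaly(np.full(8, 2.5), centroids))   # far from both  -> True
```

Real systems typically learn the clusters and threshold from historical data, but the core idea is exactly this: points with no nearby neighbors in vector space are suspicious.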

When You DON'T Need a Vector Database

Here's where the industry conversation often goes sideways. Not every application needs a dedicated vector database. Consider skipping one if:

Your data is small. If you have fewer than 100,000 vectors, brute-force search with a library like FAISS or even NumPy is fast enough. You can keep everything in memory on a single machine and get sub-50ms query times without any indexing at all.

You need exact keyword matching. Vector search is fuzzy by nature. If your users search for SKUs, error codes, or legal citations, you need traditional full-text search (Elasticsearch, PostgreSQL), not approximate nearest neighbors.

You don't want to manage embeddings. Running embedding models, building ingestion pipelines, choosing indexing parameters, tuning recall vs. latency—this is real operational overhead. If search is a feature of your product (not the product), that overhead may not be worth it.

pgvector is enough for your scale. PostgreSQL's pgvector extension supports HNSW indexing and handles millions of vectors reasonably well. If you're already running Postgres and your dataset is under 5–10 million vectors, adding a vector column might be all you need. No new infrastructure required.

The Options Landscape

When you do need vector search, you have a spectrum of options:

Self-hosted databases like Milvus, Qdrant, and Weaviate give you full control. You manage deployment, scaling, backups, and tuning. This makes sense when you have strict data residency requirements, need to customize the indexing pipeline, or have a team comfortable with infrastructure operations.

Managed vector databases like Pinecone, managed Weaviate, or Zilliz Cloud handle the infrastructure for you. You get an API, they manage the clusters. Pricing is typically based on storage and queries—expect $70–300/month for a moderately sized workload.

Skip the database entirely. If what you actually need is semantic search or image search in your application, you don't necessarily need to manage vectors at all. Search APIs like Vecstore handle embedding generation, vector storage, and retrieval behind a single REST API—three endpoints, sub-200ms responses, 100+ languages. You send text or images, you get ranked results back. No models to run, no indexes to tune.

This last option is worth considering honestly. A vector database is a means to an end. If the end is "my users need good search," the question isn't "which vector database should I use" but "what's the simplest way to ship this."

Choosing the Right Approach

The decision comes down to how central vector search is to your product:

Scenario                                  Recommended approach
Search is a feature, not the product      Managed search API
5M+ vectors, need full control            Self-hosted vector DB
Already on Postgres, moderate scale       pgvector extension
Building a RAG pipeline for an LLM app    Managed vector DB or search API
Research or prototyping                   FAISS or in-memory brute force

Whether to use a vector database is ultimately a question of scale, control, and how much infrastructure you're willing to own. For a lot of teams, the answer is less infrastructure than they think.

The Bottom Line

Vector databases are a genuinely useful technology solving a real problem: fast similarity search over high-dimensional data. They're not magic, and they're not always necessary. Understanding the mechanics—embeddings, ANN algorithms, indexing trade-offs—helps you make a clear-eyed decision about whether you need one, and if so, which kind.

Start with the problem, not the technology. If your problem is "users need to search by meaning," you have options ranging from a Postgres extension to a fully managed search API. Pick the one that matches your team's capacity and your application's actual requirements.
