Vector Database

A specialized database designed to store and efficiently search high-dimensional vectors, enabling semantic search and AI applications.

What is a vector database?

A vector database is a specialized database optimized for storing and searching vectors—lists of numbers that represent data like text, images, or audio.

Traditional databases excel at exact matches: find all users named "John" in California. Vector databases excel at similarity matches: find all documents similar to this query.

This similarity search enables:

  • Semantic search: Find relevant content by meaning
  • RAG systems: Retrieve context for AI responses
  • Recommendations: Find similar products or content
  • Image search: Find visually similar images
  • Anomaly detection: Find unusual patterns

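Vector databases use approximate nearest neighbor (ANN) indexes so that a query does not have to be compared against every stored vector. With a well-tuned index, searches over very large collections can return in milliseconds.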

How do vector databases work?

Storing vectors:

  1. You generate embeddings from your data (using embedding models)
  2. Each vector is stored with metadata (source document, URL, tags)
  3. The database builds an index structure for fast searching

Searching vectors:

  1. Convert your query to a vector using the same embedding model
  2. The database finds the most similar vectors using distance metrics
  3. Returns the top-k nearest neighbors with their metadata
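
The search steps above can be sketched as a brute-force scan in plain Python. This is only an illustration of the idea, not how any particular database implements it. The toy 3-dimensional vectors stand in for real embeddings, and the names `cosine_similarity` and `top_k` are made up for this example.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for zero vectors in this sketch
    return dot / (norm_a * norm_b)


def top_k(query, records, k=3):
    """Return the k (score, metadata) pairs most similar to query, best first.

    records is a list of (vector, metadata) pairs.
    """
    scored = ((cosine_similarity(query, vec), meta) for vec, meta in records)
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])


# Toy "embeddings"; real ones would come from an embedding model.
records = [
    ([1.0, 0.0, 0.0], {"source": "a.txt"}),
    ([0.9, 0.1, 0.0], {"source": "b.txt"}),
    ([0.0, 1.0, 0.0], {"source": "c.txt"}),
]
results = top_k([1.0, 0.0, 0.0], records, k=2)
```

Here `results` holds the two closest records with their scores and metadata. A real database replaces the full scan with an index so it does not have to score every vector.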

Index types:

The magic is in how vectors are indexed for fast searching:

  • Flat (brute force): Compare query to every vector. Accurate but slow for large datasets.
  • IVF (Inverted File Index): Cluster vectors, search only relevant clusters. Good balance of speed and accuracy.
  • HNSW (Hierarchical Navigable Small World): Graph-based approach. Very fast, high accuracy, more memory.
  • PQ (Product Quantization): Compress vectors to reduce storage and speed up search. Trades some accuracy for efficiency.
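
To make the IVF idea concrete, here is a toy sketch in plain Python. It assumes the cluster centroids have already been trained (for example with k-means), and the `TinyIVF` class and `nprobe` parameter name are illustrative choices rather than any specific database's API.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class TinyIVF:
    """Toy inverted-file index: vectors are bucketed by nearest centroid."""

    def __init__(self, centroids):
        # Assumption: centroids were trained beforehand, e.g. by k-means.
        self.centroids = centroids
        self.buckets = {i: [] for i in range(len(centroids))}

    def _nearest_centroids(self, vec, n):
        order = sorted(
            range(len(self.centroids)),
            key=lambda i: l2(vec, self.centroids[i]),
        )
        return order[:n]

    def add(self, vec_id, vec):
        bucket = self._nearest_centroids(vec, 1)[0]
        self.buckets[bucket].append((vec_id, vec))

    def search(self, query, k=1, nprobe=1):
        """Scan only the nprobe clusters closest to the query."""
        candidates = []
        for bucket in self._nearest_centroids(query, nprobe):
            candidates.extend(self.buckets[bucket])
        candidates.sort(key=lambda item: l2(query, item[1]))
        return [vec_id for vec_id, _ in candidates[:k]]


index = TinyIVF(centroids=[[0.0, 0.0], [10.0, 10.0]])
index.add("a", [0.5, 0.2])
index.add("b", [1.0, 1.0])
index.add("c", [9.5, 10.1])

near_origin = index.search([0.0, 0.0], k=2, nprobe=1)
```

Raising `nprobe` scans more clusters, which costs time but recovers neighbors that fall just across a cluster boundary. That is the speed/accuracy trade-off described above.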

How to choose a vector database

Consider these factors:

Scale

  • < 100K vectors: pgvector, Chroma, or any option
  • 100K - 10M vectors: Any managed or self-hosted option
  • > 10M vectors: Pinecone, Milvus, Qdrant with careful tuning

Operations preferences

  • Want zero ops? → Pinecone, managed Weaviate
  • Already using Postgres? → pgvector
  • Want control/cost savings? → Self-hosted Qdrant, Milvus

Feature needs

  • Hybrid search (vector + keyword)? → Weaviate, Qdrant
  • Multi-tenancy? → Pinecone namespaces, Weaviate multi-tenant
  • Filtering? → All support, but capabilities vary

Cost

  • Self-hosted: Infrastructure costs only
  • Managed: Pay per vector stored + queries
  • Evaluate at your expected scale

Development experience

Try each database's quickstart. Developer experience varies significantly.

Implementing a vector database

1. Design your schema

Decide what metadata to store alongside vectors:

{
  "id": "doc_123",
  "vector": [0.1, 0.2, ...],
  "metadata": {
    "source": "product_manual.pdf",
    "page": 15,
    "section": "Troubleshooting",
    "updated": "2024-01-15"
  }
}
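
One way to keep records in this shape consistent is to model them in code and validate them before upload. The sketch below is an assumption about how you might do that in Python. The `VectorRecord` class and its `validate` method are hypothetical, and the 4-dimensional vector is a toy value.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VectorRecord:
    """One stored item: an ID, its embedding, and filterable metadata."""

    id: str
    vector: List[float]
    metadata: dict = field(default_factory=dict)

    def validate(self, expected_dim):
        """Raise ValueError if the record should not be indexed."""
        if not self.id:
            raise ValueError("record needs a non-empty id")
        if len(self.vector) != expected_dim:
            raise ValueError(
                f"expected {expected_dim} dimensions, got {len(self.vector)}"
            )


record = VectorRecord(
    id="doc_123",
    vector=[0.1, 0.2, 0.3, 0.4],  # toy vector, for illustration only
    metadata={
        "source": "product_manual.pdf",
        "page": 15,
        "section": "Troubleshooting",
        "updated": "2024-01-15",
    },
)
record.validate(expected_dim=4)
```

Checking the dimension up front catches the common mistake of mixing vectors from two different embedding models in one index.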

2. Index your content

  • Chunk documents appropriately
  • Generate embeddings
  • Upload with metadata
  • Build indexes
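
As an illustration of the chunking step, here is a minimal word-based splitter with overlap. The function name `chunk_text` and its default sizes are assumptions for this sketch. Real pipelines often split on sentences, headings, or tokens instead.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into word-based chunks that overlap by `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


chunks = chunk_text("one two three four five six seven", chunk_size=4, overlap=2)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some words twice.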

3. Configure search

  • Choose distance metric (cosine, euclidean, dot product)
  • Set top-k (how many results to return)
  • Configure filters (only search certain document types)
  • Tune relevance thresholds
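
The settings above can be sketched together in one function. This is a plain-Python illustration, not a real client API. The `search` function, its `where` and `min_score` parameters, and the toy 2-dimensional records are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def search(query, records, top_k=5, min_score=0.0, where=None):
    """Filter by metadata, score by cosine similarity, then apply the
    relevance threshold and the top-k cutoff.

    where is an optional dict of metadata key/value pairs that must all match.
    """
    where = where or {}
    hits = []
    for rec in records:
        meta = rec["metadata"]
        if any(meta.get(key) != value for key, value in where.items()):
            continue  # fails the metadata filter, so it is never scored
        score = cosine_similarity(query, rec["vector"])
        if score >= min_score:
            hits.append((score, rec["id"]))
    hits.sort(reverse=True)  # highest score first
    return hits[:top_k]


records = [
    {"id": "faq_1", "vector": [1.0, 0.0], "metadata": {"type": "faq"}},
    {"id": "faq_2", "vector": [0.0, 1.0], "metadata": {"type": "faq"}},
    {"id": "manual_1", "vector": [1.0, 0.1], "metadata": {"type": "manual"}},
]
faq_hits = search([1.0, 0.0], records, top_k=2, min_score=0.5, where={"type": "faq"})
all_hits = search([1.0, 0.0], records, top_k=2)
```

With the filter and threshold, only `faq_1` survives. Without them, `manual_1` also ranks in the top two. A threshold that is too strict returns nothing, so it is worth tuning against real queries.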

4. Optimize for production

  • Implement caching for common queries
  • Monitor latency and accuracy
  • Set up backups and replication
  • Plan for index updates
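
Caching for common queries might look like the following sketch, which uses Python's `functools.lru_cache`. The `run_vector_search` function is a hypothetical stand-in for the real embed-and-search round trip, and the normalization rule is just one possible choice.

```python
from functools import lru_cache

CALLS = {"backend": 0}  # counts how often the expensive path runs


def run_vector_search(query_text, top_k):
    """Hypothetical stand-in for embedding the query and hitting the database."""
    CALLS["backend"] += 1
    return tuple(f"result-{i}-for-{query_text}" for i in range(top_k))


@lru_cache(maxsize=1024)
def _cached(normalized_query, top_k):
    return run_vector_search(normalized_query, top_k)


def cached_search(query_text, top_k=3):
    """Normalize case and whitespace so near-identical queries share an entry."""
    normalized = " ".join(query_text.lower().split())
    return _cached(normalized, top_k)


first = cached_search("Reset  my password")
second = cached_search("reset my password")
```

Both calls return the same results, but the backend runs only once. Call `_cached.cache_clear()` after the index changes, otherwise the cache keeps serving results from the old data.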

Vector database best practices

Keep embeddings and source in sync

When documents change, update their vectors. Stale embeddings return wrong results.
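
One possible way to detect stale embeddings is to store a hash of the source text in the metadata at indexing time and compare it later. The sketch below assumes that approach. The `content_hash` and `find_stale` helpers are hypothetical names.

```python
import hashlib


def content_hash(text):
    """Stable fingerprint of the source text an embedding was built from."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_stale(documents, stored_hashes):
    """Return IDs whose current text no longer matches the indexed hash.

    documents: {doc_id: current_text}
    stored_hashes: {doc_id: hash saved in metadata at indexing time}
    """
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if stored_hashes.get(doc_id) != content_hash(text)
    )


documents = {"doc_1": "Press the reset button.", "doc_2": "Hold for 5 seconds."}
stored_hashes = {"doc_1": content_hash("Press the reset button.")}
documents["doc_1"] = "Press and hold the reset button."  # source was edited

stale = find_stale(documents, stored_hashes)
```

Here `doc_1` is flagged because its text changed and `doc_2` because it was never indexed. Both need a fresh embedding.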

Use metadata filtering

Filter by metadata before vector search to narrow the search space and improve relevance.

Choose appropriate distance metrics

  • Cosine similarity: Best for most text applications
  • Euclidean distance: When magnitude matters
  • Dot product: Cheaper to compute, and gives the same ranking as cosine similarity when vectors are normalized
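
A small worked example may help show how the three metrics differ. The vectors below are toy values chosen so the arithmetic is easy to check by hand.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def cosine_similarity(a, b):
    return dot(a, b) / (norm(a) * norm(b))


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def normalize(a):
    length = norm(a)
    return [x / length for x in a]


a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice the magnitude

same_direction = cosine_similarity(a, b)    # looks at direction only
gap = euclidean_distance(a, b)              # sensitive to magnitude
unit_dot = dot(normalize(a), normalize(b))  # dot product on unit vectors
```

Cosine similarity treats `a` and `b` as identical because they point the same way, while Euclidean distance still sees a gap of 5 between them. After normalization, the plain dot product matches the cosine value without the extra division.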

Benchmark with your data

Run your actual queries against your actual data. Benchmarks on standard datasets don't predict your performance.
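
A benchmark harness for this can be quite small. The sketch below measures latency and recall against exact results you have computed separately (for example with a brute-force scan). The `fake_search` function is a hypothetical stand-in for a real database call.

```python
import statistics
import time


def recall_at_k(exact_ids, approx_ids):
    """Fraction of the true top-k that the approximate search also returned."""
    if not exact_ids:
        return 1.0
    return len(set(exact_ids) & set(approx_ids)) / len(exact_ids)


def benchmark(search_fn, queries, ground_truth):
    """Run every query once, recording latency and recall.

    search_fn: callable taking a query and returning a list of IDs.
    ground_truth: exact top-k ID lists, aligned with queries.
    """
    latencies, recalls = [], []
    for query, exact_ids in zip(queries, ground_truth):
        start = time.perf_counter()
        approx_ids = search_fn(query)
        latencies.append(time.perf_counter() - start)
        recalls.append(recall_at_k(exact_ids, approx_ids))
    return {
        "mean_recall": statistics.mean(recalls),
        "median_latency_s": statistics.median(latencies),
        "max_latency_s": max(latencies),
    }


def fake_search(query):
    """Hypothetical stand-in for a real database client call."""
    return {"q1": ["a", "b", "x"], "q2": ["d", "e", "f"]}[query]


report = benchmark(fake_search, ["q1", "q2"], [["a", "b", "c"], ["d", "e", "f"]])
```

Tracking recall alongside latency matters because an index can be tuned to look fast simply by returning worse results.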

Plan for updates

Vector databases handle inserts well, but bulk updates can be slow. Design your update strategy early.

Monitor query patterns

Track slow queries, popular queries, and queries with poor results. Use this data to optimize.

Test relevance regularly

Periodically evaluate whether returned results are actually relevant. Relevance can degrade as data changes.
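
A periodic relevance check could be as simple as scoring a small hand-labeled set of queries. The sketch below uses mean reciprocal rank (MRR) as the score, which is one common choice among several. The queries, document IDs, and `fake_search` function are made up for illustration.

```python
def reciprocal_rank(returned_ids, relevant_ids):
    """1/rank of the first relevant result, or 0.0 if none was returned."""
    for rank, doc_id in enumerate(returned_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(golden_set, search_fn):
    """Average reciprocal rank over a hand-labeled set of queries.

    golden_set: {query: set of IDs a human judged relevant}
    """
    scores = [
        reciprocal_rank(search_fn(query), relevant)
        for query, relevant in golden_set.items()
    ]
    return sum(scores) / len(scores)


golden_set = {
    "how do I reset the device": {"doc_7"},
    "warranty length": {"doc_2", "doc_9"},
}


def fake_search(query):
    """Hypothetical results; in practice this would call your database."""
    canned = {
        "how do I reset the device": ["doc_7", "doc_1"],
        "warranty length": ["doc_4", "doc_9"],
    }
    return canned[query]


mrr = mean_reciprocal_rank(golden_set, fake_search)
```

Running the same labeled set on a schedule and watching the score over time makes gradual relevance drift visible before users notice it.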