    Vector Databases Explained

    You have embeddings — arrays of numbers representing your text. Now you need somewhere to store them and search through them fast. A regular database is not built for this. You need a vector database.

    What Is a Vector Database?

    A vector database is a database designed specifically to store, index, and search high-dimensional vectors. When you run a query, it finds the vectors most similar to your query vector using algorithms optimized for this purpose.

    Think of it this way:

    Regular Database                  | Vector Database
    Stores rows and columns           | Stores vectors (arrays of numbers)
    Searches by exact match or range  | Searches by similarity
    SQL: WHERE name = 'Alice'         | "Find the 5 vectors closest to this one"
    Indexes: B-tree, hash             | Indexes: HNSW, IVF, PQ
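    The contrast in the table above can be sketched in a few lines of plain Python. The three-dimensional vectors here are made-up toy values (real embeddings have hundreds or thousands of dimensions), but the ranking logic is the same one a vector database performs:

```python
import math

# Toy "database": document IDs mapped to tiny made-up embedding vectors
docs = {
    "doc-1": [0.9, 0.1, 0.0],   # about passwords
    "doc-2": [0.0, 0.8, 0.2],   # about refunds
    "doc-3": [0.7, 0.4, 0.2],   # about credentials
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend embedding of "How do I reset my password?"
query = [0.88, 0.15, 0.05]

# A regular database asks "which row equals X?"; a vector database asks
# "which rows are CLOSEST to X?" -- every document gets ranked by similarity.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranked)  # → ['doc-1', 'doc-3', 'doc-2']
```

    Note that every document gets a score, so "no exact match" is never a dead end: the closest documents come back regardless.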

    How Vector Search Works

    When you store a document in a vector database, you store its embedding vector alongside metadata (the original text, source, etc.). When you search:

    1. Your query text is converted to a vector
    2. The database finds the K nearest vectors (K-Nearest Neighbors)
    3. It returns the associated documents
    Query: "How do I reset my password?" → [0.021, -0.134, 0.891, ...]
                                                  │
                                                  ▼
                                        ┌──────────────────┐
                                        │  Vector Database │
                                        ├──────────────────┤
                                        │  📄 [0.019, ...] │ ← "Go to Settings > Security..."  ✅ closest
                                        │  📄 [-0.52, ...] │ ← "Business hours are..."
                                        │  📄 [0.331, ...] │ ← "Refund policy allows..."
                                        │  📄 [0.025, ...] │ ← "To change credentials..."      ✅ 2nd closest
                                        └──────────────────┘

    The database uses specialized indexing algorithms (like HNSW — Hierarchical Navigable Small World) to search millions of vectors in milliseconds without comparing every single one.
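    To see what those indexes are optimizing away, here is a brute-force K-nearest-neighbor search in NumPy (random vectors stand in for real embeddings). It compares the query against every stored vector, which is exactly the O(n) scan that HNSW-style indexes sidestep:

```python
import numpy as np

rng = np.random.default_rng(42)

# 10,000 stored "embeddings" of dimension 128 (random stand-ins)
vectors = rng.normal(size=(10_000, 128))
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)  # unit-normalize

query = rng.normal(size=128)
query /= np.linalg.norm(query)

# Brute force: one dot product per stored vector -- for unit vectors,
# the dot product IS the cosine similarity.
scores = vectors @ query

k = 5
top_k = np.argsort(scores)[::-1][:k]  # indices of the 5 closest vectors
print(top_k, scores[top_k])
```

    This works fine at ten thousand vectors, but the cost grows linearly with the collection. An approximate index like HNSW builds a navigable graph over the vectors so a query only touches a small fraction of them, trading a little recall for very large speedups at millions of vectors.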

    Popular Vector Databases

    Pinecone — Managed Cloud Vector DB

    Fully managed, no infrastructure to handle. Great for getting started quickly.

    from pinecone import Pinecone, ServerlessSpec
    
    pc = Pinecone(api_key="your-api-key")
    
    # Create an index
    pc.create_index(
        name="my-knowledge-base",
        dimension=1536,           # Must match your embedding model
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1")
    )
    
    index = pc.Index("my-knowledge-base")
    
    # Upsert vectors (store)
    index.upsert(vectors=[
        {
            "id": "doc-1",
            "values": [0.021, -0.134, 0.891, ...],    # 1536-dim vector
            "metadata": {
                "text": "To reset your password, go to Settings...",
                "source": "help-center",
                "category": "account"
            }
        },
        # ... more vectors
    ])
    
    # Query (search)
    results = index.query(
        vector=[0.019, -0.128, 0.887, ...],   # query embedding
        top_k=5,
        include_metadata=True
    )
    
    for match in results.matches:
        print(f"Score: {match.score:.3f} | {match.metadata['text']}")

    ChromaDB — Local/Embedded Vector DB

    Open source, runs locally, perfect for development and small projects. No API key needed.

    import chromadb
    from chromadb.utils import embedding_functions
    
    # Create a client (local, in-memory or persistent)
    client = chromadb.PersistentClient(path="./chroma_data")
    
    # Use OpenAI embeddings (or use Chroma's default)
    openai_ef = embedding_functions.OpenAIEmbeddingFunction(
        api_key="your-openai-key",
        model_name="text-embedding-3-small"
    )
    
    # Create a collection
    collection = client.get_or_create_collection(
        name="knowledge_base",
        embedding_function=openai_ef
    )
    
    # Add documents (Chroma generates embeddings automatically)
    collection.add(
        documents=[
            "To reset your password, go to Settings > Security.",
            "Our refund policy allows returns within 30 days.",
            "Business hours are Monday-Friday, 9 AM to 5 PM.",
        ],
        ids=["doc-1", "doc-2", "doc-3"],
        metadatas=[
            {"source": "help-center", "category": "account"},
            {"source": "help-center", "category": "billing"},
            {"source": "help-center", "category": "general"},
        ]
    )
    
    # Query
    results = collection.query(
        query_texts=["How do I change my password?"],
        n_results=3
    )
    
    for doc, distance in zip(results["documents"][0], results["distances"][0]):
        print(f"Distance: {distance:.3f} | {doc}")

    Weaviate — Feature-Rich Open Source

    Open source with a cloud offering. Supports hybrid search (vector + keyword), built-in vectorization, and GraphQL API.

    import weaviate
    from weaviate.classes.config import Configure, Property, DataType
    
    # Connect to Weaviate Cloud or local instance
    client = weaviate.connect_to_weaviate_cloud(
        cluster_url="https://your-cluster.weaviate.network",
        auth_credentials=weaviate.auth.AuthApiKey("your-key")
    )
    
    # Create a collection with built-in vectorizer
    collection = client.collections.create(
        name="KnowledgeBase",
        vectorizer_config=Configure.Vectorizer.text2vec_openai(model="text-embedding-3-small"),
        properties=[
            Property(name="content", data_type=DataType.TEXT),
            Property(name="source", data_type=DataType.TEXT),
        ]
    )
    
    # Add documents (Weaviate vectorizes automatically)
    collection.data.insert({"content": "To reset your password...", "source": "help-center"})
    
    # Query
    results = collection.query.near_text(query="How do I change my password?", limit=3)
    for obj in results.objects:
        print(obj.properties["content"])
    
    client.close()

    Supabase pgvector — SQL-Based Vector Search

    If you already use PostgreSQL/Supabase, pgvector adds vector search to your existing database. No separate service needed.

    -- Enable the extension
    CREATE EXTENSION IF NOT EXISTS vector;
    
    -- Create a table with a vector column
    CREATE TABLE documents (
      id SERIAL PRIMARY KEY,
      content TEXT,
      source TEXT,
      embedding VECTOR(1536)    -- 1536 dimensions for OpenAI embeddings
    );
    
    -- Insert a document with its embedding
    INSERT INTO documents (content, source, embedding)
    VALUES (
      'To reset your password, go to Settings > Security.',
      'help-center',
      '[0.021, -0.134, 0.891, ...]'   -- 1536-dim vector
    );
    
    -- Search for similar documents
    SELECT content, source,
           1 - (embedding <=> '[0.019, -0.128, ...]') AS similarity
    FROM documents
    ORDER BY embedding <=> '[0.019, -0.128, ...]'
    LIMIT 5;

    Comparison Table

    Feature        | Pinecone        | ChromaDB                | Weaviate           | pgvector
    Type           | Managed cloud   | Local / embedded        | Self-host or cloud | PostgreSQL extension
    Setup          | Minutes         | Minutes                 | Moderate           | Minutes (if you have PG)
    Free tier      | Yes (limited)   | Free (open source)      | Free (open source) | Free (open source)
    Scalability    | Automatic       | Limited                 | Good               | Good
    Hybrid search  | Yes             | Limited                 | Yes                | With pg_trgm
    Best for       | Production apps | Prototyping, small apps | Feature-rich apps  | SQL-first teams

    Choosing the Right Vector Database

    • Just learning or prototyping? Start with ChromaDB — it runs locally with zero config
    • Building a production app? Use Pinecone for simplicity or Weaviate for more features
    • Already on PostgreSQL/Supabase? Add pgvector to your existing database
    • Need maximum control? Self-host Weaviate or Qdrant

    What to ask your AI: "Help me set up [ChromaDB/Pinecone/Weaviate] for my RAG application. I'm using [OpenAI/Cohere/local] embeddings with [Python/TypeScript]."

    What's Next?

    You now understand embeddings and vector databases — the two building blocks of RAG. Next, we will put everything together and build a complete RAG pipeline from loading documents to generating answers.


    🌐 www.genai-mentor.ai