    Vector Databases Explained

    You have embeddings — arrays of numbers representing your text. Now you need somewhere to store them and search through them fast. A regular database is not built for this. You need a vector database.

    What Is a Vector Database?

    A vector database is a database designed specifically to store, index, and search high-dimensional vectors. When you run a query, it finds the vectors most similar to your query vector using algorithms optimized for this purpose.

    Think of it this way:

    Regular Database                  | Vector Database
    Stores rows and columns           | Stores vectors (arrays of numbers)
    Searches by exact match or range  | Searches by similarity
    SQL: WHERE name = 'Alice'         | "Find the 5 vectors closest to this one"
    Indexes: B-tree, hash             | Indexes: HNSW, IVF, PQ
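    The contrast in the table above can be sketched in a few lines of plain Python. The three-dimensional vectors here are made-up toy values (real embeddings have hundreds or thousands of dimensions), but the ranking logic is the same one a vector database performs:

```python
import math

# Toy "database": document IDs mapped to tiny made-up embedding vectors
docs = {
    "doc-1": [0.9, 0.1, 0.0],   # about passwords
    "doc-2": [0.0, 0.8, 0.2],   # about refunds
    "doc-3": [0.7, 0.4, 0.2],   # about credentials
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend embedding of "How do I reset my password?"
query = [0.88, 0.15, 0.05]

# A regular database asks "which row equals X?"; a vector database asks
# "which rows are CLOSEST to X?" -- every document gets ranked by similarity.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranked)  # → ['doc-1', 'doc-3', 'doc-2']
```

    Note that every document gets a score, so "no exact match" is never a dead end: the closest documents come back regardless.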

    How Vector Search Works

    When you store a document in a vector database, you store its embedding vector alongside metadata (the original text, source, etc.). When you search:

    1. Your query text is converted to a vector
    2. The database finds the K nearest vectors (K-Nearest Neighbors)
    3. It returns the associated documents
    Query: "How do I reset my password?" → [0.021, -0.134, 0.891, ...]
                                                  │
                                                  ▼
                                        ┌──────────────────┐
                                        │  Vector Database │
                                        ├──────────────────┤
                                        │  📄 [0.019, ...] │ ← "Go to Settings > Security..."  ✅ closest
                                        │  📄 [-0.52, ...] │ ← "Business hours are..."
                                        │  📄 [0.331, ...] │ ← "Refund policy allows..."
                                        │  📄 [0.025, ...] │ ← "To change credentials..."      ✅ 2nd closest
                                        └──────────────────┘

    The database uses specialized indexing algorithms (like HNSW — Hierarchical Navigable Small World) to search millions of vectors in milliseconds without comparing every single one.
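    To see what those indexes are optimizing away, here is a brute-force K-nearest-neighbor search in NumPy (random vectors stand in for real embeddings). It compares the query against every stored vector, which is exactly the O(n) scan that HNSW-style indexes sidestep:

```python
import numpy as np

rng = np.random.default_rng(42)

# 10,000 stored "embeddings" of dimension 128 (random stand-ins)
vectors = rng.normal(size=(10_000, 128))
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)  # unit-normalize

query = rng.normal(size=128)
query /= np.linalg.norm(query)

# Brute force: one dot product per stored vector -- for unit vectors,
# the dot product IS the cosine similarity.
scores = vectors @ query

k = 5
top_k = np.argsort(scores)[::-1][:k]  # indices of the 5 closest vectors
print(top_k, scores[top_k])
```

    This works fine at ten thousand vectors, but the cost grows linearly with the collection. An approximate index like HNSW builds a navigable graph over the vectors so a query only touches a small fraction of them, trading a little recall for very large speedups at millions of vectors.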

    Popular Vector Databases

    Pinecone — Managed Cloud Vector DB

    Fully managed, no infrastructure to handle. Great for getting started quickly.

    from pinecone import Pinecone, ServerlessSpec
    
    pc = Pinecone(api_key="your-api-key")
    
    # Create an index
    pc.create_index(
        name="my-knowledge-base",
        dimension=1536,           # Must match your embedding model
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1")
    )
    
    index = pc.Index("my-knowledge-base")
    
    # Upsert vectors (store)
    index.upsert(vectors=[
        {
            "id": "doc-1",
            "values": [0.021, -0.134, 0.891, ...],    # 1536-dim vector
            "metadata": {
                "text": "To reset your password, go to Settings...",
                "source": "help-center",
                "category": "account"
            }
        },
        # ... more vectors
    ])
    
    # Query (search)
    results = index.query(
        vector=[0.019, -0.128, 0.887, ...],   # query embedding
        top_k=5,
        include_metadata=True
    )
    
    for match in results.matches:
        print(f"Score: {match.score:.3f} | {match.metadata['text']}")

    ChromaDB — Local/Embedded Vector DB

    Open source, runs locally, perfect for development and small projects. No API key needed.

    import chromadb
    from chromadb.utils import embedding_functions
    
    # Create a client (local, in-memory or persistent)
    client = chromadb.PersistentClient(path="./chroma_data")
    
    # Use OpenAI embeddings (or use Chroma's default)
    openai_ef = embedding_functions.OpenAIEmbeddingFunction(
        api_key="your-openai-key",
        model_name="text-embedding-3-small"
    )
    
    # Create a collection
    collection = client.get_or_create_collection(
        name="knowledge_base",
        embedding_function=openai_ef
    )
    
    # Add documents (Chroma generates embeddings automatically)
    collection.add(
        documents=[
            "To reset your password, go to Settings > Security.",
            "Our refund policy allows returns within 30 days.",
            "Business hours are Monday-Friday, 9 AM to 5 PM.",
        ],
        ids=["doc-1", "doc-2", "doc-3"],
        metadatas=[
            {"source": "help-center", "category": "account"},
            {"source": "help-center", "category": "billing"},
            {"source": "help-center", "category": "general"},
        ]
    )
    
    # Query
    results = collection.query(
        query_texts=["How do I change my password?"],
        n_results=3
    )
    
    for doc, distance in zip(results["documents"][0], results["distances"][0]):
        print(f"Distance: {distance:.3f} | {doc}")

    Weaviate — Feature-Rich Open Source

    Open source with a cloud offering. Supports hybrid search (vector + keyword), built-in vectorization, and GraphQL API.

    import weaviate
    from weaviate.classes.config import Configure, Property, DataType
    
    # Connect to Weaviate Cloud or local instance
    client = weaviate.connect_to_weaviate_cloud(
        cluster_url="https://your-cluster.weaviate.network",
        auth_credentials=weaviate.auth.AuthApiKey("your-key")
    )
    
    # Create a collection with built-in vectorizer
    collection = client.collections.create(
        name="KnowledgeBase",
        vectorizer_config=Configure.Vectorizer.text2vec_openai(model="text-embedding-3-small"),
        properties=[
            Property(name="content", data_type=DataType.TEXT),
            Property(name="source", data_type=DataType.TEXT),
        ]
    )
    
    # Add documents (Weaviate vectorizes automatically)
    collection.data.insert({"content": "To reset your password...", "source": "help-center"})
    
    # Query
    results = collection.query.near_text(query="How do I change my password?", limit=3)
    for obj in results.objects:
        print(obj.properties["content"])
    
    client.close()

    Supabase pgvector — SQL-Based Vector Search

    If you already use PostgreSQL/Supabase, pgvector adds vector search to your existing database. No separate service needed.

    -- Enable the extension
    CREATE EXTENSION IF NOT EXISTS vector;
    
    -- Create a table with a vector column
    CREATE TABLE documents (
      id SERIAL PRIMARY KEY,
      content TEXT,
      source TEXT,
      embedding VECTOR(1536)    -- 1536 dimensions for OpenAI embeddings
    );
    
    -- Insert a document with its embedding
    INSERT INTO documents (content, source, embedding)
    VALUES (
      'To reset your password, go to Settings > Security.',
      'help-center',
      '[0.021, -0.134, 0.891, ...]'   -- 1536-dim vector
    );
    
    -- Search for similar documents
    SELECT content, source,
           1 - (embedding <=> '[0.019, -0.128, ...]') AS similarity
    FROM documents
    ORDER BY embedding <=> '[0.019, -0.128, ...]'
    LIMIT 5;

    Comparison Table

    Feature        | Pinecone        | ChromaDB                | Weaviate           | pgvector
    Type           | Managed cloud   | Local / embedded        | Self-host or cloud | PostgreSQL extension
    Setup          | Minutes         | Minutes                 | Moderate           | Minutes (if you have PG)
    Free tier      | Yes (limited)   | Free (open source)      | Free (open source) | Free (open source)
    Scalability    | Automatic       | Limited                 | Good               | Good
    Hybrid search  | Yes             | Limited                 | Yes                | With pg_trgm
    Best for       | Production apps | Prototyping, small apps | Feature-rich apps  | SQL-first teams

    Choosing the Right Vector Database

    • Just learning or prototyping? Start with ChromaDB — it runs locally with zero config
    • Building a production app? Use Pinecone for simplicity or Weaviate for more features
    • Already on PostgreSQL/Supabase? Add pgvector to your existing database
    • Need maximum control? Self-host Weaviate or Qdrant

    What to ask your AI: "Help me set up [ChromaDB/Pinecone/Weaviate] for my RAG application. I'm using [OpenAI/Cohere/local] embeddings with [Python/TypeScript]."

    What's Next?

    You now understand embeddings and vector databases — the two building blocks of RAG. Next, we will put everything together and build a complete RAG pipeline from loading documents to generating answers.


    🌐 www.genai-mentor.ai