pgvector Tutorial: Adding Vector Search to Your Existing PostgreSQL Database
Learn how to install pgvector, create vector columns, build IVFFlat and HNSW indexes, and run similarity queries directly inside PostgreSQL without adding another database to your stack.
Why pgvector Changes the Game
Most teams building AI applications assume they need a dedicated vector database. They spin up Pinecone or Weaviate alongside their existing PostgreSQL instance, then wrestle with keeping data synchronized across two systems. pgvector eliminates that complexity by bringing vector similarity search directly into PostgreSQL.
pgvector is a PostgreSQL extension that adds a vector data type, distance operators, and approximate nearest neighbor (ANN) indexes. Your embeddings live in the same database as your application data, which means you can JOIN vectors with user tables, filter by metadata columns, and wrap everything in a single transaction.
Installing pgvector
On Ubuntu or Debian with PostgreSQL already installed:
sudo apt-get install postgresql-16-pgvector
On macOS with Homebrew:
brew install pgvector
If you are running PostgreSQL in Docker, use an image that includes pgvector:
docker run -d --name pgvector-db \
-e POSTGRES_PASSWORD=secret \
-p 5432:5432 \
pgvector/pgvector:pg16
Once installed, enable the extension in your database:
CREATE EXTENSION IF NOT EXISTS vector;
Creating a Vector Column
Add an embedding column to any table. The dimension must match your embedding model — OpenAI text-embedding-3-small produces 1536 dimensions:
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    content TEXT NOT NULL,
    embedding vector(1536),
    created_at TIMESTAMPTZ DEFAULT NOW()
);
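PostgreSQL rejects any insert whose vector dimension does not match the column declaration, so it can be worth catching mismatches on the application side before a round trip. A minimal sketch (the helper name and constant are illustrative, not part of pgvector):

```python
EXPECTED_DIM = 1536  # must match vector(1536) in the table definition


def validate_embedding(embedding: list[float], expected_dim: int = EXPECTED_DIM) -> list[float]:
    """Raise early if an embedding would not fit the vector column."""
    if len(embedding) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(embedding)}")
    return embedding
```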
Inserting Embeddings with Python
Use the psycopg driver with pgvector support:
import psycopg
from pgvector.psycopg import register_vector
from openai import OpenAI
client = OpenAI()
conn = psycopg.connect("postgresql://user:secret@localhost/mydb")
register_vector(conn)
def embed_and_store(title: str, content: str):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=content
    )
    embedding = response.data[0].embedding
    conn.execute(
        "INSERT INTO documents (title, content, embedding) VALUES (%s, %s, %s)",
        (title, content, embedding)
    )
    conn.commit()
embed_and_store("pgvector Guide", "pgvector adds vector search to PostgreSQL...")
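If you are not using the pgvector adapter (for example, when pasting into psql or using a driver without `register_vector`), you can pass the embedding as pgvector's text literal instead: a bracketed, comma-separated list of floats. A small hypothetical helper:

```python
def to_vector_literal(embedding: list[float]) -> str:
    """Format a Python list as a pgvector text literal, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"
```

The literal can then be bound as a plain string and cast in SQL, e.g. `VALUES (%s, %s, %s::vector)`.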
Querying by Similarity
pgvector provides three distance operators:
<->  L2 (Euclidean) distance
<=>  cosine distance
<#>  negative inner product
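The math behind each operator is straightforward to verify in plain Python (a sketch for intuition; pgvector computes these in C inside the database):

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """The <-> operator: Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: list[float], b: list[float]) -> float:
    """The <=> operator: 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)


def neg_inner_product(a: list[float], b: list[float]) -> float:
    """The <#> operator: negative inner product."""
    return -sum(x * y for x, y in zip(a, b))
```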
For normalized embeddings, cosine distance is the most common choice:
SELECT id, title, embedding <=> '[0.1, 0.2, ...]'::vector AS distance
FROM documents
ORDER BY distance
LIMIT 10;
In Python:
def search(query: str, limit: int = 5):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    )
    query_vec = response.data[0].embedding
    results = conn.execute(
        "SELECT id, title, embedding <=> %s::vector AS distance "
        "FROM documents ORDER BY distance LIMIT %s",
        (query_vec, limit)
    ).fetchall()
    return results
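Because the vectors live next to your relational data, a similarity query can carry ordinary WHERE clauses. A sketch that restricts results to recently created documents (the helper name and the recency filter are illustrative):

```python
def build_recent_search(query_vec: list[float], limit: int = 5, days: int = 30):
    """Build a similarity query restricted to documents created in the last N days."""
    sql = (
        "SELECT id, title, embedding <=> %s::vector AS distance "
        "FROM documents "
        "WHERE created_at > NOW() - make_interval(days => %s) "
        "ORDER BY distance LIMIT %s"
    )
    return sql, (query_vec, days, limit)
```

The query plan can then use both the vector index and a filter on `created_at`; the tradeoffs of filtering before vs. after the ANN scan are worth checking with EXPLAIN ANALYZE.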
IVFFlat vs HNSW Indexes
Without an index, pgvector performs exact nearest neighbor search — scanning every row. For large datasets, you need an ANN index.
IVFFlat partitions vectors into lists, then searches only the nearest lists at query time:
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
A common starting point, per pgvector's documentation, is lists = rows / 1000 for up to a million rows and sqrt(rows) beyond that. More lists mean smaller partitions, so queries scan fewer vectors, but recall drops unless you also raise probes. The table must contain data before you create an IVFFlat index, because the index clusters the existing vectors.
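A small helper for choosing lists, following pgvector's documented guidance (rows / 1000 up to about a million rows, sqrt(rows) beyond; the function name is illustrative):

```python
import math


def ivfflat_lists(row_count: int) -> int:
    """Starting point for the IVFFlat `lists` parameter, per pgvector's guidance."""
    if row_count <= 0:
        return 1
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return int(math.sqrt(row_count))
```

Treat the result as a starting point and tune from there with recall measurements on your own data.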
HNSW builds a hierarchical navigable small world graph. It is more expensive to build but delivers better recall-speed tradeoffs:
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
HNSW can be created on an empty table and handles inserts incrementally. For most production workloads, HNSW is the better default.
Tuning Query Performance
Increase the search scope at query time for higher recall:
-- IVFFlat: search more lists (default is 1)
SET ivfflat.probes = 10;
-- HNSW: expand candidate list (default is 40)
SET hnsw.ef_search = 100;
Higher values increase recall but slow down queries. Profile with EXPLAIN ANALYZE to find the right balance.
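To pick these values systematically, measure recall: run the same query once through the index and once as an exact scan (for example, on a small sample or with index scans disabled for the session), then compare the returned ids. A small comparison helper (the name is illustrative):

```python
def recall_at_k(ann_ids: list[int], exact_ids: list[int]) -> float:
    """Fraction of the true top-k (from an exact scan) that the ANN index returned."""
    if not exact_ids:
        return 0.0
    return len(set(ann_ids) & set(exact_ids)) / len(exact_ids)
```

Sweeping `hnsw.ef_search` (or `ivfflat.probes`) while tracking recall and latency makes the tradeoff concrete for your workload.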
FAQ
Can I use pgvector with an existing production PostgreSQL database?
Yes. pgvector is a standard PostgreSQL extension. You run CREATE EXTENSION vector, add vector columns with ALTER TABLE, and your existing tables, indexes, and queries continue working unchanged. There is no migration or data export required.
How many vectors can pgvector handle before performance degrades?
With HNSW indexes, pgvector handles millions of vectors with sub-10ms query latency on modest hardware. Teams have reported good performance up to 10-20 million rows on a single instance. Beyond that, consider partitioning or a dedicated vector database.
Should I use IVFFlat or HNSW for my project?
Start with HNSW unless you have a specific reason not to. HNSW provides better recall at the same speed, supports incremental inserts, and can be created on an empty table. IVFFlat is useful when you need faster index build times and can tolerate slightly lower recall.
#Pgvector #PostgreSQL #VectorSearch #Embeddings #Database #AgenticAI #LearnAI #AIEngineering
CallSphere Team
Expert insights on AI voice agents and customer communication automation.