pgvector Tutorial: Adding Vector Search to Your Existing PostgreSQL Database
Learn how to install pgvector, create vector columns, build IVFFlat and HNSW indexes, and run similarity queries directly inside PostgreSQL without adding another database to your stack.
Why pgvector Changes the Game
Most teams building AI applications assume they need a dedicated vector database. They spin up Pinecone or Weaviate alongside their existing PostgreSQL instance, then wrestle with keeping data synchronized across two systems. pgvector eliminates that complexity by bringing vector similarity search directly into PostgreSQL.
pgvector is a PostgreSQL extension that adds a vector data type, distance operators, and approximate nearest neighbor (ANN) indexes. Your embeddings live in the same database as your application data, which means you can JOIN vectors with user tables, filter by metadata columns, and wrap everything in a single transaction.
Installing pgvector
On Ubuntu or Debian with PostgreSQL already installed:
sudo apt-get install postgresql-16-pgvector
On macOS with Homebrew:
brew install pgvector
If you are running PostgreSQL in Docker, use an image that includes pgvector:
docker run -d --name pgvector-db \
-e POSTGRES_PASSWORD=secret \
-p 5432:5432 \
pgvector/pgvector:pg16
Once installed, enable the extension in your database:
CREATE EXTENSION IF NOT EXISTS vector;
Creating a Vector Column
Add an embedding column to any table. The dimension must match your embedding model — OpenAI text-embedding-3-small produces 1536 dimensions:
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    content TEXT NOT NULL,
    embedding vector(1536),
    created_at TIMESTAMPTZ DEFAULT NOW()
);
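PostgreSQL rejects any insert whose vector dimension does not match the column declaration, so it can be worth catching mismatches on the application side before a round trip. A minimal sketch (the helper name and constant are illustrative, not part of pgvector):

```python
EXPECTED_DIM = 1536  # must match vector(1536) in the table definition


def validate_embedding(embedding: list[float], expected_dim: int = EXPECTED_DIM) -> list[float]:
    """Raise early if an embedding would not fit the vector column."""
    if len(embedding) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(embedding)}")
    return embedding
```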
Inserting Embeddings with Python
Use the psycopg driver with pgvector support:
import psycopg
from pgvector.psycopg import register_vector
from openai import OpenAI
client = OpenAI()
conn = psycopg.connect("postgresql://user:secret@localhost/mydb")
register_vector(conn)
def embed_and_store(title: str, content: str):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=content
    )
    embedding = response.data[0].embedding
    conn.execute(
        "INSERT INTO documents (title, content, embedding) VALUES (%s, %s, %s)",
        (title, content, embedding)
    )
    conn.commit()
embed_and_store("pgvector Guide", "pgvector adds vector search to PostgreSQL...")
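If you are not using the pgvector adapter (for example, when pasting into psql or using a driver without `register_vector`), you can pass the embedding as pgvector's text literal instead: a bracketed, comma-separated list of floats. A small hypothetical helper:

```python
def to_vector_literal(embedding: list[float]) -> str:
    """Format a Python list as a pgvector text literal, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"
```

The literal can then be bound as a plain string and cast in SQL, e.g. `VALUES (%s, %s, %s::vector)`.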
Querying by Similarity
pgvector provides three distance operators:
<->  L2 (Euclidean) distance
<=>  cosine distance
<#>  negative inner product
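The math behind each operator is straightforward to verify in plain Python (a sketch for intuition; pgvector computes these in C inside the database):

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """The <-> operator: Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: list[float], b: list[float]) -> float:
    """The <=> operator: 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)


def neg_inner_product(a: list[float], b: list[float]) -> float:
    """The <#> operator: negative inner product."""
    return -sum(x * y for x, y in zip(a, b))
```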
For normalized embeddings, cosine distance is the most common choice:
SELECT id, title, embedding <=> '[0.1, 0.2, ...]'::vector AS distance
FROM documents
ORDER BY distance
LIMIT 10;
In Python:
def search(query: str, limit: int = 5):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    )
    query_vec = response.data[0].embedding
    results = conn.execute(
        "SELECT id, title, embedding <=> %s::vector AS distance "
        "FROM documents ORDER BY distance LIMIT %s",
        (query_vec, limit)
    ).fetchall()
    return results
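Because the vectors live next to your relational data, a similarity query can carry ordinary WHERE clauses. A sketch that restricts results to recently created documents (the helper name and the recency filter are illustrative):

```python
def build_recent_search(query_vec: list[float], limit: int = 5, days: int = 30):
    """Build a similarity query restricted to documents created in the last N days."""
    sql = (
        "SELECT id, title, embedding <=> %s::vector AS distance "
        "FROM documents "
        "WHERE created_at > NOW() - make_interval(days => %s) "
        "ORDER BY distance LIMIT %s"
    )
    return sql, (query_vec, days, limit)
```

The query plan can then use both the vector index and a filter on `created_at`; the tradeoffs of filtering before vs. after the ANN scan are worth checking with EXPLAIN ANALYZE.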
IVFFlat vs HNSW Indexes
Without an index, pgvector performs exact nearest neighbor search — scanning every row. For large datasets, you need an ANN index.
IVFFlat partitions vectors into lists, then searches only the nearest lists at query time:
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
A common starting point, per pgvector's documentation, is lists = rows / 1000 for up to a million rows and sqrt(rows) beyond that. More lists mean smaller partitions, so queries scan fewer vectors, but recall drops unless you also raise probes. The table must contain data before you create an IVFFlat index, because the index clusters the existing vectors.
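A small helper for choosing lists, following pgvector's documented guidance (rows / 1000 up to about a million rows, sqrt(rows) beyond; the function name is illustrative):

```python
import math


def ivfflat_lists(row_count: int) -> int:
    """Starting point for the IVFFlat `lists` parameter, per pgvector's guidance."""
    if row_count <= 0:
        return 1
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return int(math.sqrt(row_count))
```

Treat the result as a starting point and tune from there with recall measurements on your own data.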
HNSW builds a hierarchical navigable small world graph. It is more expensive to build but delivers better recall-speed tradeoffs:
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
HNSW can be created on an empty table and handles inserts incrementally. For most production workloads, HNSW is the better default.
Tuning Query Performance
Increase the search scope at query time for higher recall:
-- IVFFlat: search more lists (default is 1)
SET ivfflat.probes = 10;
-- HNSW: expand candidate list (default is 40)
SET hnsw.ef_search = 100;
Higher values increase recall but slow down queries. Profile with EXPLAIN ANALYZE to find the right balance.
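To pick these values systematically, measure recall: run the same query once through the index and once as an exact scan (for example, on a small sample or with index scans disabled for the session), then compare the returned ids. A small comparison helper (the name is illustrative):

```python
def recall_at_k(ann_ids: list[int], exact_ids: list[int]) -> float:
    """Fraction of the true top-k (from an exact scan) that the ANN index returned."""
    if not exact_ids:
        return 0.0
    return len(set(ann_ids) & set(exact_ids)) / len(exact_ids)
```

Sweeping `hnsw.ef_search` (or `ivfflat.probes`) while tracking recall and latency makes the tradeoff concrete for your workload.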
FAQ
Can I use pgvector with an existing production PostgreSQL database?
Yes. pgvector is a standard PostgreSQL extension. You run CREATE EXTENSION vector, add vector columns with ALTER TABLE, and your existing tables, indexes, and queries continue working unchanged. There is no migration or data export required.
How many vectors can pgvector handle before performance degrades?
With HNSW indexes, pgvector handles millions of vectors with sub-10ms query latency on modest hardware. Teams have reported good performance up to 10-20 million rows on a single instance. Beyond that, consider partitioning or a dedicated vector database.
Should I use IVFFlat or HNSW for my project?
Start with HNSW unless you have a specific reason not to. HNSW provides better recall at the same speed, supports incremental inserts, and can be created on an empty table. IVFFlat is useful when you need faster index build times and can tolerate slightly lower recall.
#Pgvector #PostgreSQL #VectorSearch #Embeddings #Database #AgenticAI #LearnAI #AIEngineering
CallSphere Team
Expert insights on AI voice agents and customer communication automation.