
AI-Powered Search for SaaS Applications: Semantic Search Over Product Data

Build semantic search for your SaaS product using vector embeddings, enabling users to find records by meaning rather than exact keyword matches.

Why Keyword Search Falls Short

Traditional keyword search works by matching exact tokens. When a user in your CRM searches for "companies that are struggling financially," keyword search returns nothing — because no record contains those exact words. Semantic search uses vector embeddings to match by meaning, so that query finds records tagged "at risk," "payment overdue," or "churn likelihood: high."

For SaaS products with rich, structured data, semantic search transforms how users discover and interact with their information.
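Under the hood, "matching by meaning" is nearest-neighbour search over embedding vectors, typically ranked by cosine similarity. A toy sketch with made-up 3-dimensional vectors (real embeddings have ~1,536 dimensions; the numbers below only illustrate the geometry):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: the query vector points the same way as the
# "at risk" record and away from the unrelated one.
query    = [0.9, 0.1, 0.0]   # "struggling financially"
at_risk  = [0.8, 0.3, 0.1]   # record tagged "churn likelihood: high"
new_lead = [0.0, 0.2, 0.9]   # unrelated record

print(cosine_similarity(query, at_risk) > cosine_similarity(query, new_lead))  # True
```

A real system never computes this in Python over all records; the vector index (pgvector below) does the same comparison at the database layer.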

Architecture: Indexing Pipeline

The indexing pipeline converts your product data into searchable vector embeddings. It runs on data changes (inserts, updates, deletes) and keeps the vector index in sync with your primary database.

# Embedding indexer that processes data changes
from openai import OpenAI
import numpy as np
from dataclasses import dataclass

client = OpenAI()

@dataclass
class SearchDocument:
    entity_type: str
    entity_id: str
    tenant_id: str
    text: str
    metadata: dict

def create_embedding(text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding


def build_search_text(entity_type: str, record: dict) -> str:
    """Convert a database record into searchable text."""
    builders = {
        "contact": lambda r: (
            f"Contact: {r['name']}. Company: {r.get('company', 'N/A')}. "
            f"Title: {r.get('title', 'N/A')}. Notes: {r.get('notes', '')}. "
            f"Tags: {', '.join(r.get('tags', []))}."
        ),
        "deal": lambda r: (
            f"Deal: {r['name']}. Value: ${r.get('value', 0):,.2f}. "
            f"Stage: {r.get('stage', 'unknown')}. "
            f"Description: {r.get('description', '')}."
        ),
        "ticket": lambda r: (
            f"Support ticket: {r['subject']}. Status: {r.get('status', 'open')}. "
            f"Priority: {r.get('priority', 'normal')}. Body: {r.get('body', '')}."
        ),
    }
    builder = builders.get(entity_type)
    if not builder:
        raise ValueError(f"Unknown entity type: {entity_type}")
    return builder(record)
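For instance, running the "contact" builder on a sample record flattens structured fields into a single prose string ready for embedding. This standalone copy of that builder uses made-up values:

```python
# Standalone copy of the "contact" branch of build_search_text, for illustration.
def build_contact_text(r: dict) -> str:
    return (
        f"Contact: {r['name']}. Company: {r.get('company', 'N/A')}. "
        f"Title: {r.get('title', 'N/A')}. Notes: {r.get('notes', '')}. "
        f"Tags: {', '.join(r.get('tags', []))}."
    )

record = {"name": "Ada Lovelace", "company": "Acme", "tags": ["vip"]}
print(build_contact_text(record))
# Contact: Ada Lovelace. Company: Acme. Title: N/A. Notes: . Tags: vip.
```

Missing fields degrade to placeholders rather than raising, so partially filled records still produce usable search text.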

Storing Embeddings with pgvector

Use PostgreSQL with pgvector to keep embeddings alongside your existing data, avoiding the operational overhead of a separate vector database.

# pgvector storage and retrieval
import asyncpg
import json

EMBED_DIM = 1536  # text-embedding-3-small dimension

async def setup_vector_table(pool: asyncpg.Pool):
    async with pool.acquire() as conn:
        await conn.execute("CREATE EXTENSION IF NOT EXISTS vector;")
        await conn.execute(f"""
            CREATE TABLE IF NOT EXISTS search_embeddings (
                id SERIAL PRIMARY KEY,
                tenant_id UUID NOT NULL,
                entity_type VARCHAR(50) NOT NULL,
                entity_id UUID NOT NULL,
                content TEXT NOT NULL,
                embedding vector({EMBED_DIM}) NOT NULL,
                metadata JSONB DEFAULT '{{}}',
                updated_at TIMESTAMPTZ DEFAULT NOW(),
                UNIQUE(entity_type, entity_id)
            );
        """)
        await conn.execute("""
            CREATE INDEX IF NOT EXISTS idx_search_embed_tenant
            ON search_embeddings (tenant_id);
        """)
        # At larger row counts, add an approximate-nearest-neighbour index
        # (pgvector >= 0.5), e.g.:
        #   CREATE INDEX ON search_embeddings
        #   USING hnsw (embedding vector_cosine_ops);


async def upsert_embedding(pool: asyncpg.Pool, doc: SearchDocument):
    embedding = create_embedding(doc.text)
    embedding_str = "[" + ",".join(str(x) for x in embedding) + "]"
    async with pool.acquire() as conn:
        await conn.execute("""
            INSERT INTO search_embeddings
                (tenant_id, entity_type, entity_id, content, embedding, metadata)
            VALUES ($1, $2, $3, $4, $5::vector, $6)
            ON CONFLICT (entity_type, entity_id)
            DO UPDATE SET content = $4, embedding = $5::vector,
                          metadata = $6, updated_at = NOW();
        """, doc.tenant_id, doc.entity_type, doc.entity_id,
             doc.text, embedding_str,
             # asyncpg passes JSONB as text unless a type codec is registered
             json.dumps(doc.metadata))
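One refinement worth noting: the embeddings endpoint accepts a list of inputs, so bulk re-indexing jobs typically embed records in batches rather than one API call per document. A small illustrative chunking helper (`batched` is hypothetical, not part of the code above; each chunk would then be passed as the `input` of a single `client.embeddings.create` call):

```python
def batched(items: list, size: int):
    """Yield successive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

texts = ["doc one", "doc two", "doc three", "doc four", "doc five"]
print([len(chunk) for chunk in batched(texts, 2)])  # [2, 2, 1]
```

Batch sizes of 50-100 texts per request are a common starting point, subject to the provider's token limits.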

Search API

The search endpoint accepts a natural language query, embeds it, and performs a cosine similarity search scoped to the user's tenant.


import json

from fastapi import FastAPI, Depends, Query
from pydantic import BaseModel

app = FastAPI()

class SearchResult(BaseModel):
    entity_type: str
    entity_id: str
    content: str
    score: float
    metadata: dict

@app.get("/api/search", response_model=list[SearchResult])
async def semantic_search(
    q: str = Query(..., min_length=2, max_length=500),
    entity_type: str | None = Query(None),
    limit: int = Query(10, ge=1, le=50),
    # get_current_tenant and get_db_pool are app-specific dependencies:
    # the tenant comes from the auth context, the pool from app startup.
    tenant_id: str = Depends(get_current_tenant),
    pool: asyncpg.Pool = Depends(get_db_pool),
):
    query_embedding = create_embedding(q)
    embedding_str = "[" + ",".join(str(x) for x in query_embedding) + "]"

    # entity_type is optional, so its placeholder only exists when supplied;
    # limit is validated as an integer by FastAPI, so interpolating it into
    # the f-string is safe from injection.
    type_filter = "AND entity_type = $3" if entity_type else ""
    params = [tenant_id, embedding_str]
    if entity_type:
        params.append(entity_type)

    async with pool.acquire() as conn:
        rows = await conn.fetch(f"""
            SELECT entity_type, entity_id, content, metadata,
                   1 - (embedding <=> $2::vector) AS score
            FROM search_embeddings
            WHERE tenant_id = $1 {type_filter}
            ORDER BY embedding <=> $2::vector
            LIMIT {limit};
        """, *params)

    return [
        SearchResult(
            entity_type=r["entity_type"],
            entity_id=str(r["entity_id"]),
            content=r["content"],
            score=round(float(r["score"]), 4),
            # asyncpg returns JSONB columns as text by default
            metadata=json.loads(r["metadata"]),
        )
        for r in rows
    ]
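The `score` column is defined as `1 - cosine_distance`, because pgvector's `<=>` operator returns cosine distance rather than similarity. In plain Python:

```python
def distance_to_score(cosine_distance: float) -> float:
    # pgvector's <=> operator returns cosine distance;
    # 1 - distance converts it back to cosine similarity,
    # so identical vectors score 1.0 and orthogonal ones 0.0.
    return 1.0 - cosine_distance

print(distance_to_score(0.0))  # 1.0
print(distance_to_score(1.0))  # 0.0
```

This is why the SQL orders by raw distance ascending while reporting the inverted value as the user-facing score.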

Relevance Tuning

Combine vector similarity with keyword matching and recency boosting for better results.

# Hybrid scoring: vector similarity + keyword full-text rank + recency.
# (ts_rank approximates keyword relevance; core Postgres does not ship BM25.
#  For large tables, back it with a GIN index on to_tsvector('english', content).)
async def hybrid_search(pool: asyncpg.Pool, query: str,
                        tenant_id: str, limit: int = 10):
    query_embedding = create_embedding(query)
    embedding_str = "[" + ",".join(str(x) for x in query_embedding) + "]"

    async with pool.acquire() as conn:
        rows = await conn.fetch("""
            SELECT entity_type, entity_id, content, metadata,
                   1 - (embedding <=> $2::vector) AS vector_score,
                   ts_rank(to_tsvector('english', content),
                           plainto_tsquery('english', $3)) AS keyword_score,
                   EXTRACT(EPOCH FROM (NOW() - updated_at)) AS age_seconds
            FROM search_embeddings
            WHERE tenant_id = $1
            ORDER BY (
                0.7 * (1 - (embedding <=> $2::vector)) +
                0.2 * ts_rank(to_tsvector('english', content),
                              plainto_tsquery('english', $3)) +
                0.1 * (1.0 / (1.0 + EXTRACT(EPOCH FROM (NOW() - updated_at)) / 86400))
            ) DESC
            LIMIT $4;
        """, tenant_id, embedding_str, query, limit)
    return rows
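The weighted formula in the `ORDER BY` can be mirrored in plain Python, which is handy for tuning the 0.7/0.2/0.1 weights offline against labelled queries. The weights below match the SQL above; the helper itself is illustrative:

```python
def hybrid_score(vector_score: float, keyword_score: float,
                 age_seconds: float) -> float:
    # Same weighting as the SQL ORDER BY: 70% semantic similarity,
    # 20% keyword rank, 10% recency (decaying on a scale of days).
    recency = 1.0 / (1.0 + age_seconds / 86400)
    return 0.7 * vector_score + 0.2 * keyword_score + 0.1 * recency

# A fresh record outranks an identical week-old one.
print(hybrid_score(0.9, 0.1, 0) > hybrid_score(0.9, 0.1, 7 * 86400))  # True
```

Because recency contributes at most 0.1, it acts as a tiebreaker between semantically similar results rather than overriding relevance.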

FAQ

How do I keep the vector index in sync with my primary data?

Use database triggers or change data capture (CDC) to detect inserts, updates, and deletes. Queue these changes to a background worker that recomputes embeddings and upserts them. For deletes, remove the corresponding row from the search_embeddings table. A 30-second indexing delay is acceptable for most SaaS applications.
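Whatever the transport (triggers plus NOTIFY, an outbox table, or a CDC stream), the worker side usually coalesces bursts of change events so each entity is re-embedded at most once per batch. A minimal illustrative sketch (the event shape here is assumed, not from the code above):

```python
def coalesce_changes(events: list[dict]) -> list[dict]:
    """Keep only the newest event per (entity_type, entity_id),
    so rapid successive edits trigger a single re-embedding."""
    latest: dict[tuple, dict] = {}
    for ev in events:  # events assumed ordered oldest -> newest
        latest[(ev["entity_type"], ev["entity_id"])] = ev
    return list(latest.values())

events = [
    {"entity_type": "contact", "entity_id": "c1", "op": "update"},
    {"entity_type": "contact", "entity_id": "c1", "op": "update"},
    {"entity_type": "deal", "entity_id": "d1", "op": "delete"},
]
print(len(coalesce_changes(events)))  # 2
```

The surviving "delete" events map to row removals in search_embeddings; the rest go through the upsert path.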

Should I use pgvector or a dedicated vector database?

pgvector is the right choice for most SaaS products under 10 million records. It keeps your stack simple — one database, one backup strategy, one connection pool. Switch to a dedicated vector database like Pinecone or Weaviate only if you need sub-10ms latency at scale or advanced filtering that pgvector does not support.

How do I handle multilingual content?

Use a multilingual embedding model like text-embedding-3-small (which supports 100+ languages natively). Index all content as-is without translation. The embedding model maps semantically similar content to nearby vectors regardless of language, so a query in Spanish will find relevant records written in English.


#SemanticSearch #VectorEmbeddings #SaaS #SearchAPI #Python #Pgvector #AgenticAI #LearnAI #AIEngineering

CallSphere Team

Expert insights on AI voice agents and customer communication automation.
