Semantic Search Using PostgreSQL & pgvector
Introduction
I’ve been playing with semantic search because sometimes keyword-based lookups just don’t cut it, especially when I want to find conceptually similar documents rather than ones that happen to share the same words. By using PostgreSQL’s pgvector extension, I can keep everything in a relational database I already use, while getting fast, approximate nearest-neighbor search on embeddings I generate locally.
Why I Like This Setup
- Unified Stack: No need for a separate vector DB; just extend Postgres.
- SQL Power: I can combine vector similarity with regular SQL filters (dates, tags, user IDs).
- Simplicity: One database to back up, one set of credentials, and everything lives in my existing infra.
High-Level Flow
Here’s how my semantic search pipeline flows:
- Install pgvector – Enable the extension in Postgres.
- Embed Data – Generate embeddings for each document or text chunk with a local model.
- Store Embeddings – Insert embeddings into a vector column in a Postgres table.
- Query by Similarity – Use the <-> operator to find nearest neighbors to a query embedding.
I’ll dive into each step with code samples and tips that helped me avoid common pitfalls.
1. Installing pgvector
Why: pgvector adds a vector type and similarity operators right inside Postgres.
-- Run this once per database
CREATE EXTENSION IF NOT EXISTS vector;
-- Verify it’s available
SELECT * FROM pg_extension WHERE extname = 'vector';
If you’re using Docker, a quick Dockerfile snippet:
FROM postgres:15
RUN apt-get update && apt-get install -y postgresql-server-dev-15 build-essential git && \
git clone https://github.com/pgvector/pgvector.git && \
cd pgvector && \
make && make install
Once pgvector is loaded, I restart my Postgres container or service and I’m good to go.
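If you'd rather not build from source at all, the pgvector project also publishes prebuilt Docker images. I'm assuming the pgvector/pgvector image on Docker Hub here; pick the tag matching your Postgres version:

docker run -d --name pg -e POSTGRES_PASSWORD=pass pgvector/pgvector:pg15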
2. Generating Embeddings
Why: I need consistent embeddings for both my stored data and any query text.
My Model: I stick with all-MiniLM-L6-v2 for embeddings, the same model as my summarization pipeline, which keeps things consistent.
from transformers import AutoTokenizer, AutoModel
import torch
# Load once
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
def get_embedding(text):
    # Tokenize, truncating long inputs to the model's max length
    inputs = tokenizer(text, return_tensors='pt', truncation=True, padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # Mean-pool the token embeddings into a single 384-dim vector
    return outputs.last_hidden_state.mean(dim=1).squeeze().tolist()
# Example (titles are used again in the insert step below)
titles = ["Sample Doc", "Another Doc"]
docs = ["This is a sample document.", "Another text goes here."]
embeddings = [get_embedding(d) for d in docs]
print(f"Generated {len(embeddings)} embeddings.")
I always sanity-check the embedding length (it should be 384 floats) and occasionally eyeball a few values to make sure they're not all zeros or NaNs.
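A minimal version of that check looks something like this (the 384 dimension is specific to all-MiniLM-L6-v2):

import math

for emb in embeddings:
    assert len(emb) == 384, f"unexpected dimension: {len(emb)}"
    assert any(v != 0.0 for v in emb), "embedding is all zeros"
    assert not any(math.isnan(v) for v in emb), "embedding contains NaNs"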
3. Storing Embeddings in Postgres
Why: Getting data into the DB ensures I can query it with SQL tools and integrate with my dashboards.
-- Table creation
DROP TABLE IF EXISTS documents;

CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    title TEXT,
    content TEXT,
    embedding VECTOR(384)
);
Inserting from Python (psycopg2 example):

import psycopg2

conn = psycopg2.connect(dbname='mydb', user='me', password='pass')
cur = conn.cursor()

for title, doc, emb in zip(titles, docs, embeddings):
    cur.execute(
        # str(emb) produces a '[...]' literal that pgvector can parse
        "INSERT INTO documents (title, content, embedding) VALUES (%s, %s, %s::vector);",
        (title, doc, str(emb))
    )
conn.commit()
Make sure the Python list reaches Postgres as a vector literal: here I pass str(emb) and cast it with ::vector in the statement itself.
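Alternatively, the pgvector Python package ships a psycopg2 adapter so you can pass arrays directly; a sketch, assuming you've pip-installed pgvector and numpy:

import numpy as np
from pgvector.psycopg2 import register_vector

register_vector(conn)  # registers the vector type with this connection

cur.execute(
    "INSERT INTO documents (title, content, embedding) VALUES (%s, %s, %s);",
    (title, doc, np.array(emb))  # numpy arrays adapt directly, no cast needed
)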
4. Running Similarity Queries
Why: This is where the magic happens. Similar docs are found simply by ordering on vector distance.
-- Find top 5 docs similar to a query embedding
-- '[0.12, -0.34, ...]' is the output of get_embedding()
SELECT id, title, content
FROM documents
ORDER BY embedding <-> '[0.12, -0.34, ...]'::vector
LIMIT 5;
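Note that <-> is L2 (Euclidean) distance. pgvector also provides <=> for cosine distance and <#> for negative inner product, so pick the operator that matches how your embeddings were trained. The cosine version of the same query:

SELECT id, title, content
FROM documents
ORDER BY embedding <=> '[0.12, -0.34, ...]'::vector
LIMIT 5;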
In Python I wrap that into a helper:
def search_similar(query_text, k=5):
    q_emb = get_embedding(query_text)
    cur.execute(
        "SELECT title, content FROM documents ORDER BY embedding <-> %s::vector LIMIT %s;",
        (str(q_emb), k)  # same '[...]' literal trick as the inserts
    )
    return cur.fetchall()
# Usage
results = search_similar("How to set up a pgvector database?", k=3)
for title, content in results:
    print(f"- {title}")
A couple of gotchas I learned:
- Indexing: For large tables (>10k rows), create an ivfflat index to speed up queries:
  CREATE INDEX ON documents USING ivfflat (embedding vector_l2_ops) WITH (lists = 100);
- Analyze: Remember to run ANALYZE documents; after bulk loads so Postgres knows about the data distribution.
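One related knob: ivfflat search is approximate, and the number of lists probed per query trades speed for recall. Bumping it up from the default of 1 often improves result quality:

-- Session-level setting; higher probes = better recall, slower queries
SET ivfflat.probes = 10;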
Wrapping Up
That’s my go-to setup for semantic search in Postgres. I can combine this with other SQL filters (dates, tags) or even join with metadata tables. All my data stays private, and I get flexible querying without the overhead of a specialized vector DB. Next, I’ll hook this into a simple Flask app so I can play with search in my browser.
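As a sketch of that kind of combined query (created_at here is a hypothetical column, not part of the table above):

SELECT id, title
FROM documents
WHERE created_at > now() - interval '30 days'
ORDER BY embedding <-> '[0.12, -0.34, ...]'::vector
LIMIT 5;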