
14 min read
Feb 05, 2026
Table of contents
01 When All Search Methods Agree on the Wrong Answer
02 Why Does Vector Search Miss Exact Technical Terms?
03 Why Does Text Search Miss Synonyms and Context?
04 How Does Hybrid Search Combine Semantic and Keyword Results?
05 How Does Time-Windowed Search Prevent Stale Results?
06 How Do You Design a Schema for Hybrid Search?
07 How Does Reciprocal Rank Fusion Combine Search Results?
08 Running the Demo
09 How Do Vector, Text, and Hybrid Search Compare?
10 Production Tuning
11 Choosing Your Search Strategy

Vector databases revolutionized semantic search, but they solve only one-third of the retrieval problem. When developers search documentation for “authentication setup,” vector embeddings excel at understanding intent. Yet two critical gaps remain: keyword precision and temporal relevance.
For RAG applications, retrieval quality directly determines generation quality: feed an LLM deprecated documentation, and it will confidently generate outdated answers.
Consider a developer searching for “configure OAuth authentication with environment variables.” Vector search returns a general security guide. Text search returns a three-year-old changelog. Hybrid search surfaces an OAuth guide from 2021, before breaking changes in 2023.
Three methods, three wrong answers. Temporal filtering completes the solution.
This tutorial explores vector similarity, keyword matching, and recency scoring using PostgreSQL full-text search, pgvectorscale, and TimescaleDB temporal partitioning: a unified stack that eliminates the complexity of maintaining separate databases.
Developers expect hybrid search to outperform pure vector or text search, but that’s not always the case. The following demonstration uses a 150-document database emulating documentation for NovaCLI, a fictional CLI tool, with intentionally engineered failure scenarios. You can run these queries yourself; setup instructions follow below.
Query: “How to enable logging in NovaCLI”
We ran four different search methods on the same query and the same 150 documents. Check out the top-ranked document each one returned.

Vector search fails. Returns “Configuring Application Logging in NovaCLI” (v1.0, January 2023, deprecated). The embedding captured semantic similarity to “logging” and “NovaCLI,” but the v1.0 document was comprehensive and well-structured when written. Its embeddings remain strong.

Text search fails. Returns the same deprecated v1.0 document with a perfect keyword match score. The title contains every query term.

Hybrid search fails. RRF combines both rankings. When vector and text search agree, RRF amplifies their consensus. The wrong answer wins with higher confidence.

Hybrid + temporal succeeds. Returns “Structured Logging Configuration in NovaCLI v3.1” (October 2025). Temporal filtering restricted the search to documents published within the last 12 months. The deprecated v1.0 document never entered the ranking.
The v1.0 document wasn’t maliciously designed. It was genuinely comprehensive when published. The v3.1 document introduces JSON-structured logging with different terminology. From a pure relevance perspective, the old document arguably matches the query better.
That's precisely the problem. Relevance without recency may return wrong results. Deprecated APIs, breaking changes, and superseded workflows make old documentation problematic. In RAG pipelines, this stale context propagates to the generated response, producing hallucinations grounded in superseded information.
The logging query above exposed a consensus failure: all methods agreed on incorrect output. Understanding why requires exploring each method's blind spots.
Vector search converts text into high-dimensional embeddings that capture semantic meaning. Documents about “logging” cluster near documents about “monitoring,” “observability,” and “debugging” because the concepts relate.
This semantic understanding becomes a liability for precision queries. Vector search cannot reliably distinguish “User ID 123” from “User ID 132” because their embeddings are mathematically similar. It struggles with:

- Exact identifiers (user IDs, ticket numbers)
- Version strings and release numbers
- Error codes and API method names
Our sample query failed because the deprecated v1.0 document had stronger semantic alignment. Its comprehensive coverage of logging concepts created dense, well-formed embeddings. The newer v3.1 document, with its focus on JSON structure, is embedded differently.
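To see why near-identical strings are indistinguishable to vector search, here is a minimal cosine-similarity sketch in Python. The vectors are toy stand-ins for real embeddings; actual models like MPNet produce 768 dimensions, but the effect is the same.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for the embeddings of "User ID 123" and "User ID 132":
# embeddings of near-identical strings differ only marginally, so their
# cosine similarity sits near 1.0 and vector search cannot tell them apart.
emb_123 = [0.81, 0.12, 0.55, 0.20]
emb_132 = [0.80, 0.13, 0.55, 0.21]

print(round(cosine_similarity(emb_123, emb_132), 4))  # > 0.999
```

A ranking based on this distance has no mechanism for treating the single differing digit as decisive, which is exactly what precision queries require.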
Vector search fails on precision. Text search should handle exact terms, but it has its own blind spot.
PostgreSQL full-text search tokenizes text into lexemes and matches them literally. If your query contains “database replication,” documents using “data mirroring” or “streaming replication” score zero despite describing identical concepts.
This vocabulary mismatch problem compounds with:

- Synonyms and paraphrases (“data mirroring” vs. “database replication”)
- Abbreviations and acronyms
- Terminology that shifts between product versions
The demo query failed because the v1.0 title matched perfectly: “Configuring Application Logging in NovaCLI” contains every query term. The v3.1 title “Structured Logging Configuration” introduces “Structured,” diluting the keyword density. This happens because standard ts_rank without normalization favors keyword-stuffed content.
Timescale's pg_textsearch extension addresses that limitation by providing BM25 ranking, which normalizes for document length. BM25 penalizes keyword density in shorter documents while rewarding natural term distribution.
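The key BM25 property is term-frequency saturation: repeating a keyword yields sharply diminishing returns. Below is a simplified, self-contained sketch of the single-term Okapi BM25 formula (k1=1.2, b=0.75), not pg_textsearch's actual implementation, using document counts loosely modeled on the 150-document demo.

```python
import math

def bm25_score(tf, doc_len, avg_len, n_docs, df, k1=1.2, b=0.75):
    """Simplified single-term BM25 score (one component of the full query sum)."""
    idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
    length_norm = k1 * (1 - b + b * doc_len / avg_len)
    return idf * tf * (k1 + 1) / (tf + length_norm)

# Term-frequency saturation: repeating "logging" 10x instead of 1x in an
# average-length document yields nowhere near 10x the score.
once = bm25_score(tf=1, doc_len=200, avg_len=200, n_docs=150, df=30)
stuffed = bm25_score(tf=10, doc_len=200, avg_len=200, n_docs=150, df=30)
print(round(stuffed / once, 2))  # ~1.96, not 10.0
```

Raw `ts_rank` has no such saturation term, which is why a short, keyword-dense title can dominate it.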
Here's how BM25 search performs on the same query:

BM25 alone still misses the target answer at rank #1, but notice the score distribution: BM25's normalized scores (3.598, 3.564, 3.445) differ significantly from the raw ts_rank scores shown earlier. The ranking changes because BM25 accounts for document length and term frequency saturation. Yet keyword-based ranking alone cannot capture semantic intent.
This brings us to the core question: if BM25 improves text search and vector search captures semantics, how does hybrid search merge these signals, and when does that fusion succeed or fail?
Hybrid search executes vector similarity and text matching in parallel, then merges results using Reciprocal Rank Fusion (RRF). The formula:
RRF_score = 1/(k + rank_vector) + 1/(k + rank_text)

RRF normalizes rankings by position rather than raw scores. A document ranked #1 by vector search and #5 by text search receives a combined score reflecting both placements. The constant k (typically 60) balances the influence between top-ranked and moderately-ranked results.
This approach helps when methods disagree. If vector search prefers Document A and text search prefers Document B, but both rank Document C moderately well, RRF surfaces the consensus candidate.
The logging query failed because both methods agreed. RRF cannot rescue consensus failures. When vector and text search both rank the deprecated document #1, RRF reinforces the error with higher confidence.
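The two behaviors described above can be reproduced with a minimal RRF sketch in Python (the doc ids and rankings are illustrative):

```python
def rrf_fuse(vector_ranking, text_ranking, k=60):
    """Reciprocal Rank Fusion over two rankings (lists of doc ids, best first)."""
    scores = {}
    for ranking in (vector_ranking, text_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Disagreement case: each method prefers a different #1, but both rank "C" second,
# so the consensus candidate wins.
print(rrf_fuse(["A", "C", "D"], ["B", "C", "E"])[0])  # "C"

# Consensus failure: both methods rank the deprecated doc first, so RRF
# amplifies the error instead of correcting it.
print(rrf_fuse(["trap", "target"], ["trap", "target"])[0])  # "trap"
```

Fusion can only re-weight the signals it is given; it cannot inject information neither method provides.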
However, improving the text search component changes what RRF has to work with. When using BM25 instead of standard ts_rank:

The target answer now appears at rank #2 (tied at 0.016), demonstrating how BM25's length normalization prevents keyword-stuffed documents from dominating the text search component. This allows the vector signal to influence the final ranking more effectively.
Nevertheless, even with improved text search, outdated documents may dominate rankings when their content genuinely matches query intent better than current alternatives.
Time-windowed search addresses this by restricting candidates to recent documents before any ranking occurs.
Standard hybrid search ranks by relevance alone. A 2019 configuration guide can outrank the 2024 version if its content better matches the query.
Time-windowed hybrid search restricts scope at the database level, not through post-filtering. TimescaleDB’s hypertable partitioning by timestamp enables the query planner to skip entire chunks of old data:
-- See: src/search.py (search_hybrid_temporal function)
WHERE published_date >= NOW() - INTERVAL '12 months'

This clause filters before ranking. When the predicate targets the partitioning column, TimescaleDB's chunk exclusion skips partitions outside the window entirely, reducing I/O while guaranteeing only current content enters the ranking pipeline.
Chunk exclusion also reduces query latency. On our small 150-document database:

Hybrid + temporal matched hybrid’s speed while returning correct results. On larger databases with millions of documents, partition pruning compounds these gains; queries skip entire chunks rather than filtering rows post-retrieval.
Summing up, the logging query succeeded with temporal filtering because the v1.0 document (January 2023) fell outside the 12-month window. With the trap removed, the v3.1 document rose to #1, demonstrating why time is a non-negotiable constraint in some use cases. Regardless of semantic similarity or keyword precision, documentation searches require current information. Temporal filtering encodes this requirement into the query itself.
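The filtering logic is easy to mirror in application code. This Python sketch approximates the SQL `INTERVAL '12 months'` as 30-day months (calendar-month arithmetic would need `dateutil` or similar), using the two logging documents from the demo:

```python
from datetime import datetime, timedelta, timezone

def within_window(published_date, months=12, now=None):
    """Mirror of WHERE published_date >= NOW() - INTERVAL '12 months'
    (months approximated as 30 days)."""
    now = now or datetime.now(timezone.utc)
    return published_date >= now - timedelta(days=30 * months)

now = datetime(2025, 11, 1, tzinfo=timezone.utc)
docs = [
    ("Configuring Application Logging in NovaCLI (v1.0)", datetime(2023, 1, 15, tzinfo=timezone.utc)),
    ("Structured Logging Configuration in NovaCLI v3.1", datetime(2025, 10, 1, tzinfo=timezone.utc)),
]
candidates = [title for title, date in docs if within_window(date, now=now)]
print(candidates)  # only the v3.1 document enters the ranking pipeline
```

In production you want the database, not the application, to apply this filter, so partition pruning can do the work before any rows are read.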
Understanding why each method fails points directly to the implementation requirements: a schema that supports all three dimensions, indexes optimized for each query pattern, and SQL that combines them efficiently.
The documents table in our demo app combines vector, text, and temporal dimensions in a single schema:
-- See: sql/01_create_schema.sql
-- Note: Simplified schema shown below. Production schema includes additional fields
-- for trap quartet methodology (trap_set, trap_type) and metadata (tags, deprecation_note).
CREATE TABLE documents (
id TEXT NOT NULL,
title TEXT NOT NULL,
body TEXT NOT NULL,
category TEXT,
version TEXT,
-- Dual timestamp architecture
created_at TIMESTAMPTZ NOT NULL, -- For TimescaleDB partitioning
published_date TIMESTAMPTZ, -- For temporal filtering
-- Full-text search (generated column)
search_vector TSVECTOR GENERATED ALWAYS AS (
setweight(to_tsvector('english', COALESCE(title, '')), 'A') ||
setweight(to_tsvector('english', COALESCE(body, '')), 'B')
) STORED,
-- Vector embedding (768-dim from MPNet)
embedding VECTOR(768),
PRIMARY KEY (id, created_at)
);
-- Convert to hypertable (6-month chunks)
SELECT create_hypertable('documents', 'created_at',
chunk_time_interval => INTERVAL '6 months');

It's worth mentioning that we used a dual timestamp architecture to separate partitioning (created_at) from filtering (published_date).
The primary key includes created_at, enabling time-range queries to skip entire partitions without scanning them.
The generated search_vector column assigns weight ‘A’ to titles and weight ‘B’ to body content. PostgreSQL automatically maintains this column when the title or body changes. Embeddings are generated from title and body concatenation, explicitly excluding metadata fields irrelevant to semantic search in our demonstration.
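The effect of the ‘A’/‘B’ weights can be illustrated with a rough Python analogue of `setweight` feeding `ts_rank` (the scoring function here is illustrative, not PostgreSQL's actual algorithm; the multipliers are PostgreSQL's documented defaults):

```python
# PostgreSQL's default ts_rank weight multipliers: {D, C, B, A} = {0.1, 0.2, 0.4, 1.0}
WEIGHTS = {"A": 1.0, "B": 0.4}

def weighted_term_score(term, title_tokens, body_tokens):
    """Rough analogue of setweight: a hit in the 'A'-weighted title counts
    2.5x more than a hit in the 'B'-weighted body."""
    return (WEIGHTS["A"] * title_tokens.count(term)
            + WEIGHTS["B"] * body_tokens.count(term))

title = "structured logging configuration".split()
body = "enable json structured logging output".split()
print(weighted_term_score("logging", title, body))  # 1.0 + 0.4 = 1.4
```

Title matches therefore dominate the text-search signal, which matches how users scan documentation results.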
For our demo app, two index types enabled performant hybrid search on the same table:
-- See: sql/02_create_indexes.sql
-- Note: Simplified index creation shown below. Production indexes include tuning parameters
-- for DiskANN (num_neighbors, search_list_size, max_alpha, num_dimensions, num_bits_per_dimension).
-- Vector similarity index (pgvectorscale)
CREATE INDEX documents_embedding_idx ON documents USING diskann (embedding);
-- Full-text search index (PostgreSQL)
CREATE INDEX documents_search_vector_idx ON documents USING GIN (search_vector);

These indexes coexist without conflicts because they serve orthogonal query patterns. PostgreSQL’s query planner treats them as independent access paths, selecting the appropriate index based on query predicates.
GIN (Generalized Inverted Index) on search_vector builds an inverted index where each unique lexeme points to the documents containing it. When you search for “authentication setup,” PostgreSQL stems both terms, then looks up which documents contain those lexemes. The ‘A’ and ‘B’ weights propagate into the index, influencing ranking without requiring separate indexes.
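An inverted index is straightforward to sketch. The toy tokenizer below is a crude stand-in for `to_tsvector`'s stemming (real lexeme normalization is far more sophisticated), but the lexeme-to-postings structure is the same idea a GIN index stores:

```python
from collections import defaultdict

def tokenize(text):
    """Crude stand-in for to_tsvector: lowercase, split, strip a trailing 's'."""
    return [w.rstrip("s") if len(w) > 3 else w for w in text.lower().split()]

def build_inverted_index(docs):
    """Map each lexeme to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for lexeme in tokenize(text):
            index[lexeme].add(doc_id)
    return index

docs = {
    1: "authentication setup guide",
    2: "structured logging configuration",
    3: "oauth authentication with environment variables",
}
index = build_inverted_index(docs)
# AND both query lexemes, like search_vector @@ to_tsquery('authentication & setup')
matches = index[tokenize("authentication")[0]] & index[tokenize("setup")[0]]
print(sorted(matches))  # [1]
```

Lookup cost scales with the number of matching postings, not the corpus size, which is why keyword search stays fast as tables grow.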
DiskANN on embedding builds an approximate nearest neighbor graph that keeps most data on disk rather than in memory. Unlike HNSW indexes, which require the entire graph in RAM, DiskANN scales to millions of vectors in memory-constrained environments (4-8 GB of RAM) while maintaining sub-50ms query performance.
Hybrid queries execute both index scans in parallel. Each scan runs concurrently, returning its ranked result set to the application layer, where RRF combines them.
We embedded the title and body because documentation queries target content. Your domain may require different fields; embed what users mentally query against.
With the schema and indexes in place, the next step is constructing the query that combines both search methods.
Hybrid search executes vector and text queries as separate Common Table Expressions (CTEs), then merges rankings with RRF:
-- See: src/search.py (search_hybrid function, lines 155-214)
WITH vector_search AS (
SELECT
id, title, body, version, created_at, published_date,
ROW_NUMBER() OVER (ORDER BY embedding <=> $1::vector) AS rank
FROM documents
ORDER BY embedding <=> $1::vector
LIMIT 20
),
text_search AS (
SELECT
id, title, body, version, created_at, published_date,
ROW_NUMBER() OVER (
ORDER BY ts_rank(search_vector, websearch_to_tsquery('english', $2)) DESC
) AS rank
FROM documents
WHERE search_vector @@ websearch_to_tsquery('english', $2)
ORDER BY ts_rank(search_vector, websearch_to_tsquery('english', $2)) DESC
LIMIT 20
),
combined AS (
SELECT
COALESCE(v.id, t.id) AS id,
COALESCE(v.title, t.title) AS title,
COALESCE(v.body, t.body) AS body,
COALESCE(v.version, t.version) AS version,
COALESCE(v.created_at, t.created_at) AS created_at,
COALESCE(v.published_date, t.published_date) AS published_date,
-- RRF scoring: 1/(60 + rank) with equal weights (0.5 each)
COALESCE(1.0 / (60 + v.rank), 0.0) * 0.5 +
COALESCE(1.0 / (60 + t.rank), 0.0) * 0.5 AS score
FROM vector_search v
FULL OUTER JOIN text_search t ON v.id = t.id
)
SELECT * FROM combined
ORDER BY score DESC
LIMIT 5;

Parameters:
- $1: Query embedding vector (768 dimensions)
- $2: Query text for full-text search

RRF formula: 1.0 / (60 + rank) assigns each document a score based on position rather than raw similarity. The constant 60 dampens rank differences, preventing a #1 result from completely dominating. Documents found by both methods receive combined scores; documents found by only one method receive partial credit via COALESCE.
Why LIMIT 20 in CTEs: RRF can rerank results. A document ranked #15 in vector search and #3 in text search might win overall. Retrieving 20 candidates from each method provides sufficient headroom before the final LIMIT 5.
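A quick calculation with the same weighted RRF formula shows the headroom effect (ranks chosen for illustration):

```python
def rrf_score(vector_rank=None, text_rank=None, k=60, w_vector=0.5, w_text=0.5):
    """Weighted RRF score; None means the document missed that method's top 20."""
    score = 0.0
    if vector_rank is not None:
        score += w_vector / (k + vector_rank)
    if text_rank is not None:
        score += w_text / (k + text_rank)
    return score

# A document ranked #15 by vector and #3 by text beats one that only
# vector search found, at #1 -- which is why each CTE retrieves 20 candidates.
deep_consensus = rrf_score(vector_rank=15, text_rank=3)
vector_only_top = rrf_score(vector_rank=1)
print(deep_consensus > vector_only_top)  # True
```

Trimming the CTEs to `LIMIT 5` would discard the #15 vector result before fusion ever saw it.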
Add a WHERE clause to both CTEs:
-- See: src/search.py (search_hybrid_temporal function, lines 270-330)
WITH vector_search AS (
SELECT
id, title, body, version, created_at, published_date,
ROW_NUMBER() OVER (ORDER BY embedding <=> $1::vector) AS rank
FROM documents
WHERE published_date >= NOW() - INTERVAL '12 months'
ORDER BY embedding <=> $1::vector
LIMIT 20
),
text_search AS (
SELECT
id, title, body, version, created_at, published_date,
ROW_NUMBER() OVER (
ORDER BY ts_rank(search_vector, websearch_to_tsquery('english', $2)) DESC
) AS rank
FROM documents
WHERE published_date >= NOW() - INTERVAL '12 months'
AND search_vector @@ websearch_to_tsquery('english', $2)
ORDER BY ts_rank(search_vector, websearch_to_tsquery('english', $2)) DESC
LIMIT 20
),
-- combined CTE and final SELECT are identical to the hybrid query above
The temporal filter applies before ranking, not after. Documents outside the time window never enter the candidate pool. TimescaleDB’s chunk exclusion skips entire partitions when filtering on the partitioning column (created_at), but filtering on published_date still benefits from index usage on that column.
Choose a time window that matches how quickly your content goes stale; this demo uses 12 months.
With the schema, indexes, and query patterns in place, you can explore these behaviors yourself using the demo application.
Clone the repository and restore the pre-built database:
# See: setup_demo.sh and run_demo.sh in repository root
git clone https://github.com/timescale/TimescaleDB-HybridSearch
cd TimescaleDB-HybridSearch
# Configure database connection
cp .env.example .env
# Edit .env with your DATABASE_URL
# Run automated setup (creates venv, installs dependencies, restores database)
./setup_demo.sh
# Launch the interactive demo
./run_demo.sh
The demo loads a sentence-transformer model (all-mpnet-base-v2) once at startup (~5 seconds). Enter any query at the prompt. For convenience, embeddings are generated locally on the fly rather than through calls to a remote embedding API, as is typical in production.

All four search methods run in parallel against the same query.
Deployment options:
See the repository README for detailed setup instructions.
The query at the start of the tutorial demonstrated one failure pattern: consensus failure, where all methods agreed on the wrong answer. But hybrid search can fail in other ways, and understanding these patterns helps you anticipate when temporal filtering alone won’t save you.
Each search method excels in specific scenarios and fails in others:
| Method | Index Type | Best Use Case | Primary Weakness |
|---|---|---|---|
| Vector Search | DiskANN (pgvectorscale) | Conceptual queries, semantic understanding | Keyword insensitivity; fails on exact technical terms |
| Text Search | GIN (tsvector) | Exact keyword matches, technical terms | Cannot handle synonyms or context |
| Hybrid Search | Both (RRF combination) | Queries requiring semantic + keyword precision | Cannot distinguish outdated from current docs |
| Hybrid + Temporal | Both + TimescaleDB partitioning | Current documentation, time-sensitive content | Excludes historical information by design |
The following case studies demonstrate these failure modes in practice.
Query: “How to enable SCRAM-SHA-256 authentication in NovaCLI”
This query demonstrates a counterintuitive failure: vector search finds the correct answer, but hybrid search returns a deprecated document.
Vector search succeeds (rank #1, score 0.693):

The embedding captures semantic intent. Vector search ranks the current v3.1 configuration guide first.
Text search fails (rank #1, score 1.000):

Text search finds perfect lexical matches, ranking the deprecated v1.0 document first with a perfect score. The trap document title is shorter and keyword-dense (4 tokens); the target includes additional context (6 tokens). Standard ts_rank without normalization sums matching lexeme weights, favoring shorter, keyword-stuffed documents.
Hybrid search fails (rank #1, RRF score 0.016):

RRF combines both rankings, but text search’s failure creates a veto effect. Even though vector search correctly identified the best document, text search’s strong preference for the deprecated document pulled the hybrid result toward the wrong answer.
How to fix the veto effect:
- ts_rank normalization (constant 32) to penalize keyword-stuffed documents

Query: “How to configure NovaCLI”
This query demonstrates hybrid search’s intended behavior: when neither method is confident individually, RRF combines its weak signals to surface the correct answer.
Vector search fails (rank #5, score 0.704):

Vector search returns a broad getting-started guide at rank #1. The correct document “TOML Configuration Guide” (v3.1) appears at rank #5.
Text search fails (rank #3, score 0.996):

Text search finds keyword matches for “configuration” and “NovaCLI,” but ranks the deprecated v2.0 YAML guide at #1. The correct v3.1 TOML guide appears at rank #3.
Hybrid search succeeds (rank #1, RRF score 0.016):

The TOML guide wasn’t the best match for either method individually (rank #5 and #3), but it scored moderately well in both rankings. When both methods point toward the same document despite neither being certain, that convergence carries information. RRF amplifies this consensus.
Key Takeaways:
Knowing when each method fails is diagnostic. Tuning your implementation to minimize those failures is the next step.
The default 50/50 RRF weighting suits balanced workloads. Adjust the weights when one method consistently outperforms the other.
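A small experiment shows how the weight shifts outcomes. The ranks below are illustrative: Doc A is vector search's favorite, Doc B is text search's favorite.

```python
def weighted_rrf(vector_rank, text_rank, w_vector, k=60):
    """RRF with tunable method weights; w_text = 1 - w_vector."""
    return w_vector / (k + vector_rank) + (1.0 - w_vector) / (k + text_rank)

# Doc A: vector rank #1, text rank #5.  Doc B: vector rank #5, text rank #1.
for w in (0.5, 0.7):
    a = weighted_rrf(1, 5, w)       # Doc A
    b = weighted_rrf(5, 1, w)       # Doc B
    print(w, "A" if a > b else ("B" if b > a else "tie"))
# 0.5 -> tie; 0.7 -> Doc A wins
```

Tune against a labeled query set rather than intuition: a small weight change can flip every tie in the result list.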
Match time windows to your content's lifecycle.
TimescaleDB chunk size affects partition pruning efficiency:
- chunk_time_interval => INTERVAL '1 hour'
- chunk_time_interval => INTERVAL '6 months'
- chunk_time_interval => INTERVAL '3 months'

Not every query needs hybrid search.
These routing rules provide starting points. The right strategy depends on your specific use case and user behavior.
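One way to encode such routing is a small heuristic dispatcher. The rules below are assumptions for illustration, not logic from the demo repository:

```python
import re

def route_query(query):
    """Heuristic routing sketch (assumed rules, not from the demo repo):
    quoted phrases and error-code-like tokens favor text search; natural-language
    questions favor hybrid + temporal; everything else defaults to hybrid."""
    if '"' in query or re.search(r"\b[A-Z]{2,}-?\d+\b", query):
        return "text"             # exact identifiers: error codes, ticket IDs
    if re.match(r"(?i)^(how|what|why|when|where)\b", query):
        return "hybrid_temporal"  # questions about current behavior
    return "hybrid"

print(route_query("ERR-4042 on startup"))               # text
print(route_query("How to enable logging in NovaCLI"))  # hybrid_temporal
print(route_query("oauth token refresh"))               # hybrid
```

Log the chosen route alongside result clicks so you can measure whether each rule actually improves outcomes.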
No single method dominates. Vector search excels at exploratory queries where exact terminology is unknown. Text search handles exact term lookups like API method names or error codes. Hybrid search applies when queries combine technical terms with semantic intent. Hybrid + temporal becomes necessary when content freshness determines correctness.
Monitor your search logs to measure which queries succeed, which fail, and why. The demo repository includes the full implementation; experiment with RRF weights, temporal windows, and ranking functions to find what works for your database and user behavior. For RAG and AI search applications, this combination of semantic understanding, keyword precision, and temporal awareness provides the retrieval foundation that keeps generated answers current.
About the Author: Damaso is a Technical Content Writer and Content Engineer with over 20 years of hands-on IT experience. He specializes in translating deep technical expertise in Kubernetes, DevOps, CI/CD, and AI/ML into practical content for developers, architects, and IT leaders. Learn more: damasosanoja.com
