
Published January 20, 2026

Building AI Agents with Persistent Memory: A Unified Database Approach


Written by Jakkie Koekemoer

The underlying language models in AI agents are stateless at the API level. They respond to prompts and immediately forget everything. This is a major limitation if you’re building autonomous production systems that need to learn, remember, and evolve.

When it comes to implementing agent memory, the fragmented approach of using separate databases for vectors, relational data, and time-series creates operational complexity, consistency problems, and infrastructure costs that scale faster than your data.

This guide teaches you how to consolidate AI agent persistent memory into a single PostgreSQL database using Tiger Data, combining hypertables for time-series conversation history, pgvectorscale for semantic search, and standard PostgreSQL for structured state—eliminating the multi-database complexity that plagues most agent architectures.

What Is AI Agent Persistent Memory Architecture?

AI agent persistent memory architecture is a unified database system that stores episodic events, semantic knowledge, and procedural state in PostgreSQL, enabling agents to maintain context across conversations while scaling to production workloads. Instead of managing separate vector databases, time-series stores, and relational systems, production-grade agents now consolidate all three memory types using PostgreSQL extensions: hypertables partition conversation history by time, pgvector indexes enable semantic search over embedded knowledge, and standard tables store user preferences with ACID guarantees.

The Three Types of Agent Memory

Production AI agents require three distinct cognitive memory systems working together.

Episodic memory captures timestamped events—every conversation turn, tool invocation, and API result becomes a queryable timeline. When users ask “What did we discuss yesterday?”, agents query episodic memory. This demands time-series optimization because range queries like “last 50 messages” are the norm, not the exception.
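For example, retrieving the last day of a conversation is a plain time-ordered range scan. A minimal sketch, assuming the messages table defined in the schema section below (the conversation ID is a placeholder):

```sql
-- Hypothetical episodic lookup: the most recent turns of one conversation
SELECT role, content, created_at
FROM messages
WHERE conversation_id = '00000000-0000-0000-0000-000000000000'::uuid  -- placeholder ID
  AND created_at > NOW() - INTERVAL '24 hours'
ORDER BY created_at DESC
LIMIT 50;
```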

Semantic memory represents embedded knowledge stored as vector embeddings. When users upload PDFs or agents scrape documentation, content gets chunked, embedded, and indexed for similarity search. This powers retrieval-augmented generation (RAG), letting agents fetch relevant context before generating responses. Production systems require vector similarity search with sub-50ms latency even at millions of embeddings.
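In SQL terms, this is a nearest-neighbor query over an embedding column. A hedged sketch, assuming the knowledge_items table defined later and reusing this post's placeholder-embedding convention:

```sql
-- Hypothetical semantic lookup: five nearest chunks to a query embedding
SELECT content, source_url,
       embedding <-> '[0.01, 0.02, ...]'::vector AS distance  -- placeholder embedding
FROM knowledge_items
ORDER BY distance
LIMIT 5;
```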

Procedural memory stores structured preferences and learned behaviors in relational tables. When agents remember “this user prefers Python” or “always format code as TypeScript”, that’s procedural memory requiring transactional updates with foreign key constraints.
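A minimal sketch of a procedural read, assuming the users table defined in the procedural schema below (the user ID is a placeholder):

```sql
-- Hypothetical procedural read: fetch a user's stored preferences
SELECT preferences->>'language' AS preferred_language
FROM users
WHERE id = '00000000-0000-0000-0000-000000000000'::uuid;  -- placeholder user ID
```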

| Memory Type | Data Pattern | Query Type | PostgreSQL Implementation |
|---|---|---|---|
| Episodic | Time-series events | Temporal range queries | Hypertables with automatic partitioning |
| Semantic | Vector embeddings | K-nearest-neighbor search | pgvector columns with DiskANN indexes |
| Procedural | Relational data | CRUD with joins | Standard PostgreSQL tables |

The multi-database approach forces serial round trips to three separate systems, then stitches results in application code. Every network hop adds latency. Every additional system multiplies operational complexity. Tiger Data consolidates all three patterns in PostgreSQL with production-grade performance.

Why Use a Unified Database for AI Agent Memory?

Unified PostgreSQL architecture reduces infrastructure costs by as much as 66% while enabling faster context retrieval through single-query joins across episodic, semantic, and procedural memory—impossible when data spans separate vector databases, time-series stores, and relational systems. A single database connection constructs complete context windows in one transaction with consistent point-in-time snapshots. One backup strategy protects all memory. One query language joins temporal, vector, and relational data without network round trips between systems.
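As a minimal sketch of that claim (using the messages and users tables defined in the schema sections below; IDs are placeholders), one statement can pull conversation history and stored preferences together:

```sql
-- Hypothetical cross-memory query: recent turns plus the user's preferences
SELECT m.role, m.content, m.created_at, u.preferences
FROM messages m
CROSS JOIN users u
WHERE m.conversation_id = '00000000-0000-0000-0000-000000000000'::uuid
  AND u.id = '00000000-0000-0000-0000-000000000000'::uuid
  AND m.created_at > NOW() - INTERVAL '24 hours'
ORDER BY m.created_at;
```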

Schema Design: Conversation Store (Episodic Memory)

```sql
-- Episodic memory: timestamped conversation history
CREATE TABLE messages (
    id BIGSERIAL,
    conversation_id UUID NOT NULL,
    role TEXT NOT NULL CHECK (role IN ('user', 'assistant', 'system', 'tool')),
    content TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    -- Performance and cost tracking
    model_name TEXT,
    tokens_used INTEGER,
    latency_ms INTEGER,
    metadata JSONB,
    PRIMARY KEY (conversation_id, created_at, id)
);

-- Convert to hypertable for time-series optimization
SELECT create_hypertable(
    'messages',
    by_range('created_at', INTERVAL '7 days')
);

-- Index for fast conversation retrieval
CREATE INDEX idx_messages_conversation_time
    ON messages (conversation_id, created_at DESC);
```

The create_hypertable call partitions the table into 7-day chunks. When querying recent messages, PostgreSQL scans only relevant chunks. As data ages, Tiger Data compresses old chunks at 10:1 ratios, keeping storage manageable without sacrificing query performance.
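To see chunk exclusion for yourself, inspect the plan of a time-bounded query; note that for NOW()-relative predicates, Timescale may exclude chunks at execution time rather than plan time (the ID is a placeholder):

```sql
-- Inspect the plan: only chunks overlapping the last 24 hours should be scanned
EXPLAIN (COSTS OFF)
SELECT role, content
FROM messages
WHERE conversation_id = '00000000-0000-0000-0000-000000000000'::uuid
  AND created_at > NOW() - INTERVAL '24 hours'
ORDER BY created_at DESC
LIMIT 50;
```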

Schema Design: Knowledge Base (Semantic Memory)

```sql
-- Semantic memory: embedded knowledge with temporal validity
CREATE TABLE knowledge_items (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content TEXT NOT NULL,
    source_url TEXT,
    embedding vector(1536),  -- OpenAI ada-002 dimension
    -- Temporal validity tracking for time-aware RAG
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    valid_from TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    valid_until TIMESTAMPTZ,
    -- Categorization for filtered search
    tags TEXT[],
    category TEXT,
    metadata JSONB
);

-- StreamingDiskANN index for production vector search
CREATE INDEX idx_knowledge_embedding
    ON knowledge_items USING diskann (embedding);

-- GIN index for hybrid search (vector + keyword)
CREATE INDEX idx_knowledge_content_fts
    ON knowledge_items USING GIN (to_tsvector('english', content));

-- Temporal filtering for time-aware RAG
CREATE INDEX idx_knowledge_temporal
    ON knowledge_items (valid_from, valid_until, category);
```

The diskann index uses pgvectorscale’s implementation of Microsoft’s DiskANN algorithm, optimized for SSD storage with lower RAM requirements than HNSW. pgvectorscale achieves 471 QPS at 99% recall on 50M vectors—competitive with specialized vector databases while staying in PostgreSQL.

Temporal validity columns (valid_from, valid_until) enable time-aware RAG. Pure vector similarity returns semantically similar but potentially outdated information. Temporal windows mark old knowledge as invalid without deletion, supporting queries like “What was the architecture recommendation in Q3 2024?”
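A hedged sketch of such a point-in-time query, reusing this post's placeholder-embedding convention (the category value is an assumption):

```sql
-- Hypothetical time-aware retrieval: knowledge valid during Q3 2024
SELECT content, source_url
FROM knowledge_items
WHERE category = 'architecture'  -- assumed category value
  AND valid_from <= '2024-09-30'::timestamptz
  AND (valid_until IS NULL OR valid_until > '2024-07-01'::timestamptz)
ORDER BY embedding <-> '[0.01, 0.02, ...]'::vector  -- placeholder embedding
LIMIT 5;
```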

Schema Design: Agent State (Procedural Memory)

```sql
-- Procedural memory: user preferences and agent configuration
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email TEXT UNIQUE NOT NULL,
    preferences JSONB DEFAULT '{}',
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    last_active_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE agents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT NOT NULL,
    system_prompt TEXT NOT NULL,
    learned_strategies JSONB DEFAULT '{}',
    avg_response_time_ms INTEGER,
    total_conversations INTEGER DEFAULT 0,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE user_agents (
    user_id UUID REFERENCES users(id) ON DELETE CASCADE,
    agent_id UUID REFERENCES agents(id) ON DELETE CASCADE,
    is_default BOOLEAN DEFAULT FALSE,
    PRIMARY KEY (user_id, agent_id)
);
```

This relational structure supports transactional updates like “update user preferences and default agent atomically”. JSONB columns provide schema flexibility while foreign keys ensure data integrity.
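For example, the “update user preferences and default agent atomically” case might look like the following sketch (all IDs are placeholders):

```sql
BEGIN;

-- Record the learned preference
UPDATE users
SET preferences = jsonb_set(preferences, '{format}', '"TypeScript"')
WHERE id = '00000000-0000-0000-0000-000000000000'::uuid;

-- Demote the old default agent, then promote the new one
UPDATE user_agents
SET is_default = FALSE
WHERE user_id = '00000000-0000-0000-0000-000000000000'::uuid;

UPDATE user_agents
SET is_default = TRUE
WHERE user_id = '00000000-0000-0000-0000-000000000000'::uuid
  AND agent_id = '11111111-1111-1111-1111-111111111111'::uuid;

COMMIT;
```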

How to Implement Hybrid RAG with Time Filtering?

Hybrid RAG combines vector similarity search with temporal validity filtering and full-text keyword matching, returning only knowledge that is both semantically relevant and currently valid—preventing agents from retrieving outdated information despite high embedding similarity. This pattern queries episodic, semantic, and procedural memory in one database round trip.

```typescript
import { Pool } from 'pg';

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

interface SearchOptions {
  query: string;
  embedding: number[];
  limit?: number;
  maxAgeDays?: number;
  category?: string;
  semanticWeight?: number; // Weight for vector similarity (0-1)
  textWeight?: number;     // Weight for text search (0-1)
}

async function hybridSearchWithTemporalValidity(options: SearchOptions) {
  const {
    query,
    embedding,
    limit = 10,
    maxAgeDays = 30,
    category,
    semanticWeight = 0.6,
    textWeight = 0.4
  } = options;

  // Build dynamic query parts
  const params: any[] = [
    `[${embedding.join(',')}]`, // $1
    query,                      // $2
    maxAgeDays,                 // $3
    limit                       // $4
  ];

  let categoryFilter = '';
  if (category) {
    params.push(category); // $5
    categoryFilter = `AND category = $${params.length}`;
  }

  const result = await pool.query(`
    SELECT
      id,
      content,
      source_url,
      created_at,
      embedding <-> $1::vector AS distance,
      ts_rank(to_tsvector('english', content),
              plainto_tsquery('english', $2)) AS text_rank,
      -- Hybrid score: combine normalized scores
      (
        (1 - (embedding <-> $1::vector)) * ${semanticWeight} +
        ts_rank(to_tsvector('english', content),
                plainto_tsquery('english', $2)) * ${textWeight}
      ) AS hybrid_score
    FROM knowledge_items
    WHERE
      -- Temporal validity: only currently valid knowledge
      valid_from <= NOW()
      AND (valid_until IS NULL OR valid_until > NOW())
      -- Recency filter (using parameterized query)
      AND created_at > NOW() - ($3 || ' days')::INTERVAL
      -- Full-text relevance filter
      AND to_tsvector('english', content) @@ plainto_tsquery('english', $2)
      ${categoryFilter}
    ORDER BY hybrid_score DESC
    LIMIT $4
  `, params);

  return result.rows;
}
```

This query uses the DiskANN index for vector search (the <-> operator), the GIN index for full-text matching (the @@ operator), and B-tree indexes for temporal filtering—all in one PostgreSQL statement. The valid_until column prevents outdated knowledge retrieval: facts remain in the database, but queries exclude them after invalidation.

Context Window Construction: Single-Query Pattern

```typescript
interface ContextWindow {
  recentMessages: Array<{ role: string; content: string; created_at: Date }>;
  relevantFacts: Array<{ content: string; source: string; distance: number }>;
  userPreferences: Record<string, any>;
  tokenCount?: number; // Track total tokens used
}

async function buildContextWindow(
  userId: string,
  conversationId: string,
  queryEmbedding: number[],
  queryText: string,         // For full-text search
  maxTokens: number = 100000 // Token budget
): Promise<ContextWindow> {
  const result = await pool.query(`
    WITH recent_messages AS (
      SELECT
        role,
        content,
        created_at,
        -- Estimate token count (rough: ~4 chars = 1 token)
        LENGTH(content) / 4 AS estimated_tokens
      FROM messages
      WHERE conversation_id = $1
        AND created_at > NOW() - INTERVAL '24 hours'
      ORDER BY created_at ASC -- Chronological order
      LIMIT 50
    ),
    relevant_knowledge AS (
      SELECT
        content,
        source_url,
        embedding <-> $2::vector AS distance
      FROM knowledge_items
      WHERE valid_from <= NOW()
        AND (valid_until IS NULL OR valid_until > NOW())
        AND created_at > NOW() - INTERVAL '30 days'
        -- Full-text pre-filter narrows candidates before ranking by distance
        AND to_tsvector('english', content) @@ plainto_tsquery('english', $4)
      ORDER BY distance ASC
      LIMIT 5
    ),
    user_prefs AS (
      SELECT preferences FROM users WHERE id = $3
    )
    SELECT jsonb_build_object(
      'messages', (
        SELECT jsonb_agg(
          jsonb_build_object(
            'role', role,
            'content', content,
            'created_at', created_at,
            'estimated_tokens', estimated_tokens
          ) ORDER BY created_at ASC -- Maintain order
        )
        FROM recent_messages
      ),
      'facts', (
        SELECT jsonb_agg(
          jsonb_build_object(
            'content', content,
            'source', source_url,
            'distance', distance
          ) ORDER BY distance ASC
        )
        FROM relevant_knowledge
      ),
      'preferences', (SELECT preferences FROM user_prefs)
    ) AS context
  `, [
    conversationId,
    `[${queryEmbedding.join(',')}]`,
    userId,
    queryText // Full-text search parameter
  ]);

  const context = result.rows[0]?.context || {};

  // Token budget enforcement
  let totalTokens = 0;
  const messages = (context.messages || []).filter((msg: any) => {
    totalTokens += msg.estimated_tokens || 0;
    return totalTokens <= maxTokens * 0.8; // Reserve 20% for facts/system prompt
  });

  return {
    recentMessages: messages,
    relevantFacts: context.facts || [],
    userPreferences: context.preferences || {},
    tokenCount: totalTokens
  };
}
```

One database query constructs the complete context window. PostgreSQL evaluates the three CTEs within a single plan, and the final SELECT combines everything into JSON: one network round trip, one transaction, and a consistent point-in-time snapshot across all memory types.

Memory Consolidation for Cost Optimization

```sql
-- Background consolidation: convert old episodic memory to semantic summaries
CREATE OR REPLACE FUNCTION consolidate_old_conversations()
RETURNS void AS $$
DECLARE
    conversation RECORD;
    summary TEXT;
    summary_embedding vector(1536);
    -- Holds each chunk during the compression phase
    chunk_to_compress REGCLASS;
BEGIN
    FOR conversation IN
        SELECT DISTINCT conversation_id
        FROM messages
        WHERE created_at < NOW() - INTERVAL '30 days'
          AND NOT EXISTS (
              SELECT 1 FROM knowledge_items
              WHERE metadata->>'conversation_id' = conversation_id::TEXT
                AND category = 'conversation_summary'
          )
    LOOP
        SELECT string_agg(content, E'\n' ORDER BY created_at)
        INTO summary
        FROM messages
        WHERE conversation_id = conversation.conversation_id;

        -- Generate embedding via application logic (placeholder)
        summary_embedding := '[0.01, 0.02, ...]'::vector;

        INSERT INTO knowledge_items (content, embedding, category, metadata)
        VALUES (
            summary,
            summary_embedding,
            'conversation_summary',
            jsonb_build_object(
                'conversation_id', conversation.conversation_id,
                'consolidated_at', NOW()
            )
        );
    END LOOP;

    -- Compress the historical chunk(s) once after consolidation.
    -- Iterates over all chunks in the 30-40 day window and compresses them.
    FOR chunk_to_compress IN
        SELECT show_chunks(
            'messages',
            newer_than => NOW() - INTERVAL '40 days',
            older_than => NOW() - INTERVAL '30 days'
        )
    LOOP
        PERFORM compress_chunk(chunk_to_compress);
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

This function converts episodic memory (raw messages) into semantic memory (summaries). Old conversations are selected, aggregated, embedded (via external application logic), and indexed in the knowledge_items table. After consolidation, the historical message chunks are compressed using columnar compression (e.g., via TimescaleDB/Tiger Data), achieving high data reduction while the messages remain queryable for compliance or debugging.
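One way to run this periodically is Timescale's job framework; a sketch, assuming a thin wrapper procedure named run_consolidation with the (job_id, config) signature that add_job expects:

```sql
-- Hypothetical wrapper so the job scheduler can call the function
CREATE OR REPLACE PROCEDURE run_consolidation(job_id INT, config JSONB)
LANGUAGE plpgsql AS $$
BEGIN
    PERFORM consolidate_old_conversations();
END;
$$;

-- Run consolidation once a day
SELECT add_job('run_consolidation', INTERVAL '1 day');
```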

Framework Integrations

Here is an example of LangChain integration:

```python
from langchain_postgres import PGVectorStore, PostgresChatMessageHistory
from langgraph.checkpoint.postgres import PostgresSaver
from psycopg import Connection

# Updated vector store pattern (PGVector class deprecated)
conn_string = "postgresql://user:pass@host:5432/db"

# Vector storage for semantic memory
vector_store = PGVectorStore(
    embeddings=embeddings,
    connection=conn_string,
    collection_name="knowledge_items",
    use_jsonb=True
)

# Chat history for episodic memory
chat_history = PostgresChatMessageHistory(
    connection_string=conn_string,
    session_id="conversation_123"
)

# LangGraph checkpointer for agent state persistence
with Connection.connect(conn_string) as conn:
    checkpointer = PostgresSaver(conn)
    checkpointer.setup()  # Create necessary tables
```

Critical update: LangChain v0.3+ deprecated individual memory classes in favor of LangGraph persistence via checkpointers. The PGVector class is deprecated—migrate to PGVectorStore (as used above) with a connection string. The PostgresSaver class from langgraph-checkpoint-postgres is the current class for agent state checkpointing (persistence and recovery), which is separate from the raw message logging handled by PostgresChatMessageHistory.

Postgres, via the pgvector extension, can efficiently serve as both a unified document store (raw text and metadata) and vector store (embeddings) within LlamaIndex. This simplifies RAG infrastructure: LlamaIndex stores all components and performs fast approximate nearest-neighbor (ANN) searches directly within a reliable, ACID-compliant relational database.

Production & Performance

```sql
-- Choose vector index based on workload characteristics

-- DiskANN: best for billion-scale datasets, lower RAM requirements
CREATE INDEX idx_knowledge_diskann
    ON knowledge_items USING diskann (embedding);

-- HNSW: best for datasets fitting in RAM, fastest queries
CREATE INDEX idx_knowledge_hnsw
    ON knowledge_items USING hnsw (embedding vector_cosine_ops);

-- Enable automatic compression for cost reduction
ALTER TABLE messages SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'conversation_id',
    timescaledb.compress_orderby = 'created_at DESC'
);

SELECT add_compression_policy('messages', INTERVAL '7 days');
```

DiskANN achieves 28x lower p95 latency than Pinecone’s s1 index at 99% recall, but choose based on your scale: HNSW for <100M vectors fitting in RAM, DiskANN for billion-scale or cost-sensitive deployments. Compression policies run automatically in the background, reducing storage by 10:1 without blocking writes.
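To confirm the policy is paying off, you can compare table sizes before and after compression; a quick sketch using TimescaleDB's hypertable_compression_stats:

```sql
-- Compare pre- and post-compression sizes for the messages hypertable
SELECT
    before_compression_total_bytes,
    after_compression_total_bytes
FROM hypertable_compression_stats('messages');
```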

Monitoring Agent Token Usage

Use a SQL query like the one below to monitor agent token usage:

```sql
-- Monitor daily token usage per model
SELECT
    time_bucket('1 day', created_at) AS day,
    model_name,
    SUM(tokens_used) AS total_tokens
FROM messages
WHERE tokens_used IS NOT NULL
GROUP BY 1, 2
ORDER BY 1 DESC, 2;
```
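The same messages table supports latency monitoring through its latency_ms column; a similar sketch computing a daily p95:

```sql
-- Daily p95 response latency per model
SELECT
    time_bucket('1 day', created_at) AS day,
    model_name,
    percentile_cont(0.95) WITHIN GROUP (ORDER BY latency_ms) AS p95_latency_ms
FROM messages
WHERE latency_ms IS NOT NULL
GROUP BY 1, 2
ORDER BY 1 DESC, 2;
```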

Conclusion

AI agent persistent memory consolidation into PostgreSQL eliminates the operational complexity of managing separate vector databases, time-series stores, and relational systems. Tiger Data delivers production-grade performance through hypertables (automatic time-partitioning), pgvectorscale (471 QPS at 99% recall on 50M vectors), and native compression (90%+ storage reduction). One database connection constructs complete context windows spanning episodic, semantic, and procedural memory in a single query with ACID guarantees.

Implement temporal validity modeling through valid_from and valid_until columns to prevent outdated knowledge retrieval. Use hybrid search combining vector similarity, full-text matching, and temporal filtering. For LangChain integration, migrate to PGVectorStore and PostgresSaver patterns following v0.3+ deprecations. Reserve specialized vector databases only for billion-scale deployments—unified PostgreSQL handles sub-100M vector workloads with lower complexity and cost.

Start building with Tiger Data for production AI agent memory: hypertables for conversation history, pgvectorscale for semantic search, and PostgreSQL reliability for everything else.


About the author: Jakkie Koekemoer is the Engineering Manager at Draft.dev. With more than 20 years of experience as a developer, data engineer, and manager, he's committed to solving real problems and bringing the best out of people. He lives in Cape Town, South Africa.
