
Published at Jan 15, 2026

How to Choose a Database: A Decision Framework for Modern Applications


Written by Jakkie Koekemoer

Most database selection guides fall into two camps: vendor pitches disguised as advice, or vague platitudes that end with "it depends." Neither helps when you're staring at a blank architecture diagram, wondering if you really need multiple databases to handle user accounts, sensor logs, and vector embeddings.

The main principle: Start with PostgreSQL extensions (Tiger Data/TimescaleDB for time-series, pgvector for embeddings) and only fragment your stack when you hit proven bottlenecks at 100M+ events/day or face hard compliance isolation requirements. Modern applications need hybrid data models, and managing multiple specialized databases creates synchronization lag, double storage costs, and distributed join complexity in application code.

This guide provides a structured decision framework based on data shape (i.e., whether the data consists of flat rows, nested documents, time-series points, or high-dimensional vectors), query patterns, and scale thresholds, with concrete performance benchmarks for PostgreSQL, Tiger Data (TimescaleDB), InfluxDB, MongoDB, and vector databases.

Here is a glimpse of the four main database categories, their typical uses, and the golden rule:

| # | Category | Examples | When to Use |
|---|----------|----------|-------------|
| 1 | General-Purpose Relational | Postgres, MySQL | Default choice for most applications. ACID guarantees, complex queries, data integrity. |
| 2 | Time-Series Optimized | TimescaleDB, InfluxDB, Tiger Data | High-volume timestamped data, metrics, IoT sensors, monitoring systems. |
| 3 | Document Databases | MongoDB | Flexible schemas, semi-structured data, rapid iteration, content management. |
| 4 | Vector Databases | Pinecone, Weaviate, pgvector | Embeddings, semantic search, AI/ML applications, similarity matching. |

Golden Rule: Don’t specialize until you hit >100M events/day or have specific isolation requirements for compliance.

Which Database Should I Use? The Three-Step Decision Framework

Database selection starts with understanding your data and how you’ll query it—not with feature lists or marketing claims.

Step 1: What Types of Data Am I Storing?

Different data structures require different storage patterns. Relational data (users, orders) needs ACID transactions and joins. Time-series data (sensor readings, metrics) requires compression and temporal queries. Document data (product catalogs) works with flexible schemas. Vector data (embeddings) needs specialized similarity search. When multiple types must be queried together, unified databases prevent data silos and eliminate synchronization lag.

Relational Data: Structured tables with foreign keys and complex relationships. Best stored in RDBMS systems that enforce ACID guarantees and support complex joins. Example: e-commerce with users, orders, and inventory.

Time-Series Data: Ordered sequences of timestamped events. Optimized storage requires compression, time-based partitioning, and efficient range queries. Example: IoT sensor readings, application metrics, financial tick data.

Document Data: Semi-structured JSON with flexible schemas. Works well in document stores or JSONB columns in PostgreSQL. Example: product catalogs with varying attributes.

Vector Data: High-dimensional arrays for semantic search and AI applications. Requires specialized indexing (HNSW, IVF) for performant similarity queries. Example: RAG applications, image search.

Key insight: If you have multiple data types that need to be queried together, a multi-model database prevents data silos. For example, an IoT platform storing both device metadata (relational) and sensor readings (time-series) benefits from querying both in a single transaction.
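As a quick sketch of what that buys you (table and column names here are hypothetical), a unified database can commit a metadata update and a time-series write atomically:

```sql
-- Hypothetical tables: update device metadata and record a reading
-- in one atomic transaction; no cross-system synchronization needed.
BEGIN;
UPDATE devices SET last_seen = NOW() WHERE id = 42;
INSERT INTO sensor_readings (time, device_id, temperature)
VALUES (NOW(), 42, 21.5);
COMMIT;
```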

Step 2: What Are My Query Patterns?

Your access patterns determine database architecture more than data shape.

Online transaction processing (OLTP): Point lookups, updates, and inserts optimized for low-latency single-row operations. Example: `SELECT * FROM users WHERE id = 123`.

Online analytical processing (OLAP): Aggregations across large datasets with time ranges. Example: `SELECT device_id, AVG(temperature) FROM sensors WHERE time > NOW() - INTERVAL '7 days' GROUP BY device_id`.

Vector Search: Nearest-neighbor queries using distance metrics. Example: Finding semantically similar documents for RAG applications.

Hybrid Queries: Combining multiple patterns. Example: “Find products similar to this one (vector) purchased in the last month (time-series) by premium users (relational).”

Most real-world applications require hybrid queries. This is where the “one database per data type” approach breaks down, forcing complex orchestration in application code.
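Here is a hedged sketch of that last example, with an assumed schema (`products`, `purchases`, `users`, and a pgvector embedding column), showing all three patterns in a single statement:

```sql
-- Hypothetical schema: vector similarity + time filter + relational filter.
SELECT p.id, p.name
FROM products p
JOIN purchases pu ON pu.product_id = p.id
JOIN users u ON u.id = pu.user_id
WHERE pu.purchased_at > NOW() - INTERVAL '1 month'
  AND u.tier = 'premium'
ORDER BY p.embedding <=> $1::vector
LIMIT 10;
```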

Step 3: What Scale Am I Operating At?

Scale determines whether optimization matters or if developer velocity should win.

Under 10K events/day: Any database works. Choose based on team familiarity; at this scale, poor tuning outweighs any real performance difference between engines.

10K to 1M events/day: Optimization starts to matter. Indexes, partitioning, and query planning become relevant. PostgreSQL with extensions (Tiger Data for time-series, pgvector for embeddings) provides the best balance.

1M to 100M events/day: Purpose-built optimizations become critical but don’t necessarily require separate databases. Features like hypertables (automatic time-based partitioning), continuous aggregates, and compression deliver 90%+ compression ratios and sub-millisecond queries.

Over 100M events/day: Horizontal scaling or highly specialized databases may be required. Operational complexity becomes justified by performance gains.
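To make the 1M-100M tier concrete, here is a minimal sketch of enabling those features in TimescaleDB, using a hypothetical `events` hypertable:

```sql
-- Automatic time-based partitioning for a hypothetical events table
SELECT create_hypertable('events', 'time');

-- Enable native compression, segmented by device for better ratios
ALTER TABLE events SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'device_id'
);

-- Compress chunks once they are older than seven days
SELECT add_compression_policy('events', INTERVAL '7 days');
```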

Here is a diagram that shows how you can choose the right database based on your data type and scale:

[Diagram: choosing a database by data type and scale]

How Do Different Database Types Compare?

| Database Type | Best For | Key Strengths | Limitations | Typical Use Case |
|---|---|---|---|---|
| PostgreSQL | ACID compliance, complex joins, structured data | Mature ecosystem, strong consistency, SQL standardization | High ingestion rates (>100K rows/sec) require tuning | E-commerce platforms |
| Tiger Data (TimescaleDB) | High ingest (>100K rows/sec), temporal queries, IoT | 90%+ compression, automatic partitioning, SQL compatible | Learning curve for time-series features | Sensor networks, monitoring |
| InfluxDB | Pure time-series workloads, metrics | Purpose-built for time-series, SQL support (v3+) | Limited relational capabilities | Application performance monitoring |
| MongoDB | Unstructured data, rapid prototyping | Schema flexibility, horizontal scaling | Complex analytical queries expensive | Content management, catalogs |
| Pinecone/Weaviate | Massive-scale vector search (100M+ vectors) | Automatic scaling, optimized similarity search | Limited metadata filtering, operational overhead | Large-scale semantic search |
| pgvector | Vector search with relational/temporal filters | Integrated filtering, PostgreSQL ecosystem | Scales efficiently to 50M vectors; beyond that, cost efficiency decreases | RAG applications, hybrid search |

When Should I Use One Database vs Multiple Databases?

Use a unified database (PostgreSQL with extensions) when: Your team has fewer than 10 engineers, your data types overlap (IoT platforms querying sensor readings with device metadata), or you operate between 1M-100M events/day where optimization matters but horizontal scaling doesn’t.

Use multiple databases when: Compliance requires physical isolation (HIPAA, PCI-DSS), workloads have fundamentally incompatible requirements (strict serializable isolation vs eventual consistency), or you’ve proven a bottleneck that exhausts optimization options.

The Hidden Costs of Multiple Databases

The “polyglot persistence” pattern (using multiple database types in the same application, each for a specific workload) creates:

  • Synchronization lag: Vector search returns documents deleted 5 seconds ago because the delete event hasn’t propagated from PostgreSQL to Pinecone.

  • Double storage costs: Same data exists in PostgreSQL (metadata) and InfluxDB (time-series), often with minimal compression in one system.

  • Distributed joins in application code: Your backend becomes a database query coordinator, implementing joins, filters, and pagination across multiple systems, which effectively slows it down.

Common Scenarios: Which Database Should I Choose?

Scenario A: IoT Platform (Sensors + Device Management)

Requirements: Ingest 500K sensor readings/second, join with device metadata, query historical trends.

Recommendation: Tiger Data (TimescaleDB)

Why: You need to join time-series data (sensor readings) with relational data (device configurations, alert rules). A split architecture (InfluxDB + PostgreSQL) requires denormalizing device metadata into every sensor reading (bloat), implementing joins in application code (complexity), or accepting stale metadata (correctness issues).

```sql
-- Single query joining time-series and relational data
SELECT
  d.name AS device_name,
  AVG(s.temperature) AS avg_temp,
  d.alert_threshold
FROM sensor_readings s
JOIN devices d ON s.device_id = d.id
WHERE s.time > NOW() - INTERVAL '1 hour'
  AND d.location = 'Building A'
GROUP BY d.id, d.name, d.alert_threshold
HAVING AVG(s.temperature) > d.alert_threshold;
```

Scenario B: AI Agent with Conversational Memory

Requirements: Store conversation history (time-series), embeddings for semantic search (vector), user preferences (relational).

Recommendation: Tiger Data with pgvector

Why: AI agents need episodic memory (“What did the user ask 10 minutes ago?”) and semantic memory (“What past conversations are relevant?”). These must be queried together for context-aware responses.

```sql
-- AI agent memory: hybrid query
SELECT
  c.message,
  c.timestamp,
  u.name AS user_name,
  c.embedding <=> $1::vector AS distance
FROM conversations c
JOIN users u ON c.user_id = u.id
WHERE c.user_id = $2
  AND c.timestamp > NOW() - INTERVAL '7 days'
  AND u.subscription_tier = 'premium'
ORDER BY c.embedding <=> $1::vector
LIMIT 5;
```

Scenario C: Content Management System

Requirements: Flexible schemas, document relationships, full-text search.

Recommendation: PostgreSQL with JSONB

Why: PostgreSQL’s JSONB provides schema flexibility without sacrificing indexing, full-text search via GIN indexes, and JSON path queries comparable to MongoDB. MongoDB is only necessary if you need horizontal sharding from day one or lack PostgreSQL expertise.
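For the full-text search requirement, here is one possible sketch (hypothetical `articles` table, assuming the searchable text lives under a `text` key in the JSONB body):

```sql
-- Hypothetical table: documents with flexible JSONB bodies
CREATE TABLE articles (
  id   SERIAL PRIMARY KEY,
  body JSONB
);

-- Expression GIN index for full-text search over a JSONB field
CREATE INDEX idx_articles_fts ON articles
  USING GIN (to_tsvector('english', body->>'text'));

-- Full-text query that can use the index
SELECT id
FROM articles
WHERE to_tsvector('english', body->>'text')
      @@ plainto_tsquery('english', 'database indexing');
```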

Scenario D: High-Frequency Trading

Requirements: Sub-millisecond latency, millions of ticks per second, complex event processing.

Recommendation: Tiger Data for storage/analytics + in-memory processing (Redis/Apache Flink) for execution

Why: Trading systems need both real-time processing (<1ms) and historical analysis (backtesting). Tiger Data handles the analytical workload while purpose-built in-memory systems handle execution. This is a justified split because the latency requirements are fundamentally different.

Performance Benchmarks

The following benchmarks compare Tiger Data/TimescaleDB against standard PostgreSQL and InfluxDB for typical time-series workloads. Test setup: 100M rows inserted, 1000 concurrent queries over 7-day time ranges on AWS EC2 m5.2xlarge instances.

| Metric | Tiger Data/TimescaleDB | Standard PostgreSQL | InfluxDB 3.x |
|---|---|---|---|
| Sustained Ingest Rate | 100K-115K rows/sec | 15K-20K rows/sec | 300K-400K rows/sec |
| Peak Ingest (optimized) | Up to 1.2M rows/sec* | 50K rows/sec | 800K rows/sec |
| Compression Ratio | 90-97% | 0% (no native compression) | 60% |
| Aggregate Query (p95) | 45-120ms | 8,200ms | 120ms |
| Point Query (p95) | 2ms | 3ms | 5ms |
| Storage (100M rows) | 2-4 GB | 42 GB | 16 GB |

* Peak rates require optimized conditions: high cardinality, 32 concurrent connections, ordered data. Real-world sustained rates typically range from 100K-140K rows/sec depending on hardware and data characteristics. Source: Official TimescaleDB benchmarks.

Key observations: Tiger Data’s compression achieves 90-97% reduction through columnar storage and delta encoding. Smaller datasets fit in memory, improving query performance. Aggregate queries show the biggest gap due to continuous aggregates pre-computing common rollups.

Frequently Asked Questions

Is PostgreSQL Fast Enough for Time-Series Data?

Yes, with extensions like Tiger Data (TimescaleDB). Standard PostgreSQL lacks automatic time-based partitioning, compression optimized for temporal patterns, and time-bucketing functions. Tiger Data adds these features while maintaining PostgreSQL’s ACID guarantees and full SQL compatibility.

Note: Tiger Data is the company name (rebranded from Timescale Inc. in June 2025); TimescaleDB is the PostgreSQL extension that provides time-series optimizations.

Time-Series Optimization Example

```sql
-- Create a hypertable for sensor data
CREATE TABLE sensor_readings (
  time        TIMESTAMPTZ NOT NULL,
  device_id   INTEGER NOT NULL,
  temperature DOUBLE PRECISION,
  humidity    DOUBLE PRECISION
);

-- Convert to hypertable (automatic time-based partitioning)
SELECT create_hypertable('sensor_readings', 'time');

-- Create continuous aggregate for real-time dashboards
CREATE MATERIALIZED VIEW sensor_hourly
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 hour', time) AS hour,
  device_id,
  AVG(temperature) AS avg_temp
FROM sensor_readings
GROUP BY hour, device_id;

-- Set up data retention (auto-delete data older than 90 days)
SELECT add_retention_policy('sensor_readings', INTERVAL '90 days');
```

Why this matters: You can join sensor data with device metadata without ETL pipelines. Performance scales automatically as data grows. Retention policies are native SQL, not custom scripts.
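To keep the `sensor_hourly` continuous aggregate fresh without manual refreshes, TimescaleDB supports refresh policies. A minimal sketch, with offsets chosen arbitrarily for illustration:

```sql
-- Refresh the recent portion of sensor_hourly every 30 minutes
SELECT add_continuous_aggregate_policy('sensor_hourly',
  start_offset      => INTERVAL '3 hours',
  end_offset        => INTERVAL '1 hour',
  schedule_interval => INTERVAL '30 minutes');
```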

Do I Need a Separate Vector Database for RAG Applications?

Not unless you exceed 50 million embeddings or require specialized distributed architectures. Integrated solutions (pgvector in PostgreSQL/Tiger Data) provide a critical advantage: filtering vectors by metadata or time in a single query.

The Vector Database Threshold

pgvector scales efficiently to tens of millions of vectors with proper tuning. At approximately 10M vectors with 1536 dimensions, you need ~60GB RAM for in-memory HNSW indexes. Beyond this scale, cost-efficiency may favor disk-based indexes (pgvectorscale’s StreamingDiskANN) or dedicated vector databases.

Verified capabilities: Tiger Data benchmarks show pgvector handling 50 million vectors successfully. With the pgvectorscale extension, performance shows 28x lower latency and 16x higher throughput versus Pinecone’s s1 pods at 99% recall. Source: Tiger Data benchmarks.
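As a hedged sketch of the disk-based route (assuming pgvectorscale is installed and a table with a vector column, like the `documents` example in the next section), switching from HNSW to StreamingDiskANN is essentially an index swap:

```sql
-- pgvectorscale's StreamingDiskANN index: a disk-friendly alternative to
-- in-memory HNSW at larger vector counts (CASCADE also installs pgvector).
CREATE EXTENSION IF NOT EXISTS vectorscale CASCADE;
CREATE INDEX ON documents USING diskann (embedding vector_cosine_ops);
```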

Hybrid Query Advantage

Consider this RAG (Retrieval-Augmented Generation) scenario:

```sql
-- Create table for document embeddings
CREATE TABLE documents (
  id         SERIAL PRIMARY KEY,
  content    TEXT,
  embedding  vector(1536),
  created_at TIMESTAMPTZ DEFAULT NOW(),
  category   TEXT,
  author_id  INTEGER
);

-- Create HNSW index for fast similarity search
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- Hybrid query: similarity + time filter + relational join
SELECT
  d.content,
  d.embedding <=> $1::vector AS distance,
  a.name AS author_name
FROM documents d
JOIN authors a ON d.author_id = a.id
WHERE d.created_at > NOW() - INTERVAL '30 days'
  AND d.category = 'technical'
ORDER BY d.embedding <=> $1::vector
LIMIT 10;
```

In a dedicated vector database, you’d need to:

1. Store embeddings in Pinecone.
2. Store metadata in PostgreSQL.
3. Query Pinecone for the top 1,000 similar vectors.
4. Filter in application code for time/category.
5. Query PostgreSQL for author names.
6. Merge results in application logic.

The integrated approach executes this in a single query, with proper indexing for all filters.

When Should I Use MongoDB vs PostgreSQL?

Migrate to MongoDB when: Your data is genuinely schema-less with unpredictable structure, you need horizontal sharding from day one, or your team has MongoDB expertise but not PostgreSQL.

Stay with PostgreSQL when: You need consistent joins, ACID transactions across documents, or complex analytics. PostgreSQL’s JSONB data type provides schema flexibility with better query performance for most use cases.

PostgreSQL JSONB vs MongoDB

Modern PostgreSQL handles semi-structured data efficiently:

```sql
-- Store flexible product catalog in PostgreSQL
CREATE TABLE products (
  id         SERIAL PRIMARY KEY,
  name       TEXT,
  attributes JSONB
);

-- Create GIN index for fast JSONB queries
CREATE INDEX idx_attributes ON products USING GIN (attributes);

-- Query nested JSON with indexing
SELECT
  name,
  attributes->'specs'->>'weight' AS weight
FROM products
WHERE attributes @> '{"category": "electronics", "in_stock": true}'::jsonb
ORDER BY (attributes->>'price')::numeric;
```

Trade-offs: MongoDB excels at horizontal scaling and schema evolution. PostgreSQL excels at complex joins (MongoDB’s $lookup is 10-100x slower according to independent benchmarks), analytical queries, and maintaining data consistency. MongoDB 5.0+ includes native time-series collections, narrowing the gap for temporal workloads.

What About InfluxDB 3.x?

InfluxDB 3.x (current version) uses SQL as its primary query language via Apache DataFusion, maintaining backward compatibility with InfluxQL. Previous versions used Flux (v2.x) or InfluxQL (v1.x). InfluxDB excels at pure metrics workloads but lacks the relational capabilities needed for hybrid queries. Learn more: InfluxDB vs TigerData comparison.

What’s the Difference Between Tiger Data and TimescaleDB?

Tiger Data is the company name (rebranded from Timescale Inc. in June 2025). TimescaleDB is the PostgreSQL extension that provides time-series optimizations. The extension works with PostgreSQL 15, 16, and 17.

Conclusion: Start Simple, Scale When Necessary

Most applications benefit from a unified architecture until they have proven bottlenecks. Start with PostgreSQL. Add Tiger Data extensions for time-series optimization. Use pgvector for vector search. Leverage JSONB for document flexibility. Only split your stack when you’ve exhausted optimization options or hit hard constraints like compliance isolation.
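In practice, that starting point is just a couple of extension installs on a stock PostgreSQL instance. A sketch, assuming the extensions are available on your server:

```sql
-- Enable the extensions discussed in this guide
-- (timescaledb also requires shared_preload_libraries configuration).
CREATE EXTENSION IF NOT EXISTS timescaledb;
CREATE EXTENSION IF NOT EXISTS vector;   -- pgvector
```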

The database decision rule: Don’t introduce operational complexity to solve theoretical problems. Benchmark your actual workload. If you’re considering multiple databases for relational, time-series, and vector data without hitting 100M+ events/day or facing compliance requirements, you’re likely over-engineering.

Stop managing three different databases for one application. Unify your relational, time-series, and vector workloads with Tiger Data. Experience the speed of a specialized database with the reliability of PostgreSQL. Try TigerData free today.
