---
title: "Time-Series Database: What It Is, How It Works, and When You Need One"
description: "What is a time-series database, how does it work, and when do you need one? A technical breakdown of TSDBs—from partitioning and compression to real-world use cases."
section: "Time series basics"
---

> **TimescaleDB is now Tiger Data.**

## The Rise of Time-Stamped Data

Virtually every modern application at scale generates time-stamped data. Server logs, sensor readings, stock ticks, user events, temperature measurements. The volume grows constantly, and the queries follow predictable patterns: filter by time range, aggregate by interval, compare periods.

General-purpose databases handle this fine when volumes are modest. But datasets grow. Hundreds of millions of rows become billions. Queries shift from point lookups to scans and aggregations across wide time ranges. What took milliseconds now takes seconds. Engineering teams spend more time [<u>tuning indexes and managing partitions</u>](https://www.tigerdata.com/blog/hidden-costs-table-partitioning-scale) than building product features.

Time series databases solve this. They are built to ingest, store, compress, and query time-stamped data at scale.

### What Is Time Series Data?

Time series data measures change over time: any data where the timestamp is the primary axis for organization and querying. Not just "a column that happens to be a date." The timestamp is how you slice, group, and analyze the data.

A data point typically includes a timestamp, one or more numeric values, and labels that identify the source. A temperature sensor on a factory floor produces a reading every second: timestamp, temperature, sensor ID, location, equipment type. For more, browse these [<u>timestamped data examples</u>](https://www.tigerdata.com/learn/types-of-data-supported-by-postgresql-and-timescale).

Examples span every industry: CPU metrics from server fleets, glucose readings from medical devices, price updates from exchanges, GPS coordinates from delivery vehicles, clickstream events from web apps. What makes all of this "time series" is not the timestamp itself but the access pattern. You almost never look up a single record by primary key. You scan ranges, group by intervals, and summarize. 

### The Ubiquity of Metrics: From IoT to Observability

Time series workloads are everywhere, even when teams don't initially frame their data that way.

In infrastructure observability, systems generate millions of data points per minute: CPU, memory, disk I/O, network throughput, request latency, error rates. Monitoring tools like Grafana and Prometheus feed metrics dashboards and alerting pipelines.

In IoT telemetry and industrial applications, sensors on factory equipment, energy grids, oil rigs, and connected vehicles produce continuous streams. Time series databases first gained traction here, serving as data historians for industrial equipment. A single manufacturing facility might generate thousands of data points per second.

In product analytics, every user interaction is a timestamped event. Pageviews, clicks, feature usage, session durations.

In financial markets, exchanges produce tick-level data at sub-millisecond granularity. Price updates, trade executions, order book changes.

The common thread: data arrives continuously, volumes grow fast, and the value comes from analyzing patterns over time.

### Why Specialized Databases?

When data has a strong temporal component, the database can make assumptions that general-purpose systems cannot.

Time series data is typically append-heavy: new readings arrive continuously, and historical data rarely changes. The database can optimize for sequential appends rather than random updates, converting expensive random I/O into cheap sequential writes.

Timestamps arrive at regular or near-regular intervals, so delta encoding (storing the difference between consecutive timestamps rather than the full value) compresses them to nearly nothing. Measurement values from the same source tend to be similar over short intervals, enabling further compression through XOR encoding and variable-length integers. The [<u>Facebook Gorilla paper</u>](https://dl.acm.org/doi/10.14778/2824032.2824078) demonstrated XOR-based compression achieving an average of 1.37 bytes per value. These techniques have directly influenced the design of several TSDBs.

Queries almost always include a time range filter. Organizing data into time-based partitions lets the database skip irrelevant chunks entirely, keeping query latency stable as total volume grows.

The numbers are real: 10x to 100x compression ratios depending on the workload, compared to row-oriented storage.

## Why Traditional Databases Struggle

Relational databases like vanilla [<u>PostgreSQL</u>](https://www.postgresql.org/) (without time-series extensions) and MySQL were designed for transactional workloads: frequent reads and writes to individual rows, updates in place, point lookups by primary key. Time series workloads push against every one of those assumptions.

### Limitations of Relational Databases

As tables grow past hundreds of millions of rows, B-tree indexes become less effective. The index grows large. Range scans across wide time windows touch many pages. Query planners struggle with aggregate queries that span large datasets.

Row-oriented storage keeps all columns of a row together on disk. Analytical queries that need only a few columns still read entire rows: wasted I/O. Many modern time-series systems address this by combining row storage for recent data with columnar storage for historical data.

Compression in row-oriented systems is generic. It can't exploit the patterns specific to time series data: regular timestamps, slowly changing values, repetitive tag sets.

And the operations relational databases are optimized for (updates, deletes, MVCC) are rare in time series workloads. The database carries that overhead without benefiting from it.

### The Scale Problem

A modest monitoring setup collecting 100 metrics every 10 seconds across 1,000 servers generates 864 million rows per day. In a month: more than 25 billion rows.
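
The arithmetic is worth spelling out. A quick sanity check you can run in any SQL shell (the figures are the hypothetical fleet described above):

```sql
-- 100 metrics x 1,000 servers = 100,000 readings per 10-second cycle,
-- and 86,400 / 10 = 8,640 cycles per day. Cast to bigint so the
-- monthly total doesn't overflow a 32-bit integer.
SELECT 100::bigint * 1000 * (86400 / 10)      AS rows_per_day,   -- 864,000,000
       100::bigint * 1000 * (86400 / 10) * 30 AS rows_per_month; -- ~25.9 billion
```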

At that scale, everything degrades simultaneously. Insert throughput drops as indexes grow. Query latency increases. Vacuum operations in PostgreSQL consume more resources. Backup times extend.

Manual partitioning helps but creates its own operational overhead: managing boundaries, creating new partitions, dropping old ones, ensuring query pruning works correctly. That overhead grows with the dataset.

### The Cardinality Problem

High cardinality means a large number of unique tag combinations. A system tracking metrics per user ID might have millions of unique series. Some TSDBs maintain an in-memory index entry for every unique series, so memory usage scales linearly with cardinality and becomes a real constraint as series counts climb.

The fix is usually columnar storage with efficient encoding, where high-cardinality fields are stored as regular columns rather than indexed tags. TimescaleDB can achieve [<u>chunk size reduction of up to 98%</u>](https://www.tigerdata.com/docs/use-timescale/latest/hypercore) across real deployments, including high-cardinality datasets, using columnar compression in its Hypercore storage engine.

## How Time Series Databases Work

A time series database (TSDB), sometimes categorized as a Time Series DBMS, organizes its storage, indexing, and query execution around time. Not as an afterthought but as the core design principle.

### Time-Based Partitioning

This is the single most impactful optimization. Data is automatically divided into chunks by time interval (hourly, daily, weekly).

New data always writes to the most recent chunk, which stays small enough to fit in memory. Queries with a time range predicate skip irrelevant chunks entirely. Old chunks can be compressed, moved to cheaper storage, or dropped without touching active data. Recent chunks stay in row format for fast writes; older chunks convert to columnar format for fast scans.

In TimescaleDB, this is implemented as hypertables. Developers interact with a single table through standard SQL. The partitioning is invisible.
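
As a minimal sketch (the `conditions` table and its columns are hypothetical, and the exact `create_hypertable` signature varies slightly across TimescaleDB versions):

```sql
-- A regular PostgreSQL table for sensor readings.
CREATE TABLE conditions (
  time        TIMESTAMPTZ      NOT NULL,
  sensor_id   INTEGER          NOT NULL,
  temperature DOUBLE PRECISION
);

-- Convert it into a hypertable, automatically partitioned by time.
SELECT create_hypertable('conditions', 'time');

-- From here on, reads and writes are plain SQL; chunking is invisible.
INSERT INTO conditions VALUES (now(), 42, 21.7);
SELECT * FROM conditions WHERE time > now() - INTERVAL '1 hour';
```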

### Append-Only Storage

Time series workloads are predominantly append-only. Sensor readings don't get updated. Metric data points aren't edited after collection.

This lets the database skip in-place update overhead: no complex locking, optimized write-ahead logging, immutable data structures that are simpler and more compressible. Append-only storage also converts random I/O to sequential writes, which is faster on both SSDs and spinning disks.

Some time series data does need corrections (late-arriving data, sensor recalibrations). Databases on relational foundations like TimescaleDB handle mutations naturally. TSDBs that assume strict immutability can struggle with them.

### Compression

Delta encoding stores timestamp differences instead of full values. For data arriving every second, the delta is constant (1000ms), which compresses to almost nothing. Delta-of-delta encoding pushes further: if deltas are constant, the delta-of-delta is zero, compressible to a single bit.
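
You can see why this works by computing the deltas yourself with standard window functions. A sketch against a hypothetical `readings` table with one row per second:

```sql
-- For perfectly regular per-second data, every delta is '1 second'
-- and every delta-of-delta is zero: exactly what the encoder exploits.
WITH deltas AS (
  SELECT time,
         time - lag(time) OVER (ORDER BY time) AS delta
  FROM readings
)
SELECT time,
       delta,
       delta - lag(delta) OVER (ORDER BY time) AS delta_of_delta
FROM deltas;
```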

Combined with columnar storage (same-column values stored together, enabling type-specific encoding), these techniques deliver 10x to 100x storage reductions.

### Ingestion

TSDBs sustain high write throughput through batched writes, in-memory buffering (memtables) that flush to disk in sorted order, and partitioned write paths that avoid contention between concurrent writers. A large IoT deployment might need tens of thousands of inserts per second with no tolerance for latency spikes.
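
On the client side, the simplest lever is batching. A hedged sketch, reusing the hypothetical `conditions` table from above:

```sql
-- One round trip and one transaction for many rows, instead of one
-- INSERT per reading. For true bulk loads, COPY is faster still.
INSERT INTO conditions (time, sensor_id, temperature) VALUES
  ('2025-01-15 10:00:00+00', 1, 21.4),
  ('2025-01-15 10:00:10+00', 1, 21.5),
  ('2025-01-15 10:00:10+00', 2, 19.8);
```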

### Queries and Aggregations

Time-range aggregations (averages, sums, min/max over hourly or daily windows) are the core operation. TSDBs optimize through partition pruning, columnar scans, vectorized execution, and pre-computed aggregations.
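
A representative aggregation against the hypothetical `conditions` hypertable, using TimescaleDB's `time_bucket` function to group by interval:

```sql
-- Hourly averages per sensor over the last 24 hours. The time-range
-- predicate lets the planner skip every chunk outside that window.
SELECT time_bucket('1 hour', time) AS bucket,
       sensor_id,
       avg(temperature) AS avg_temp,
       max(temperature) AS max_temp
FROM conditions
WHERE time > now() - INTERVAL '24 hours'
GROUP BY bucket, sensor_id
ORDER BY bucket;
```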

Continuous aggregates incrementally materialize common rollups in the background. Instead of recomputing a daily average across billions of raw rows, the database maintains a materialized view that updates with new data only. Real-time aggregates combine these rollups with the latest raw data in a single query.
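
In TimescaleDB this looks like the sketch below (view name and refresh intervals are illustrative choices; policy arguments vary by version):

```sql
-- Materialize the hourly rollup once, then refresh it incrementally.
CREATE MATERIALIZED VIEW conditions_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS bucket,
       sensor_id,
       avg(temperature) AS avg_temp
FROM conditions
GROUP BY bucket, sensor_id;

-- Keep the rollup current in the background.
SELECT add_continuous_aggregate_policy('conditions_hourly',
  start_offset      => INTERVAL '3 hours',
  end_offset        => INTERVAL '1 hour',
  schedule_interval => INTERVAL '30 minutes');
```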

### Downsampling and Retention

Raw per-second data matters for recent analysis. For long-term trends, per-hour or per-day granularity is enough. Downsampling replaces high-resolution data with aggregates: a year of per-second data (31.5 million rows) becomes 8,760 hourly rows.

TSDBs automate the data lifecycle: retention policies drop data older than a threshold, compression policies compress chunks after a specified age, and tiering moves old data to object storage while keeping it queryable. These run in the background without manual intervention.
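
In TimescaleDB, these policies are one-line calls (the intervals and `segmentby` column here are illustrative choices, not defaults):

```sql
-- Compress chunks once they are a week old, grouping compressed data
-- by sensor so per-sensor scans stay fast.
ALTER TABLE conditions SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'sensor_id'
);
SELECT add_compression_policy('conditions', INTERVAL '7 days');

-- Drop raw chunks entirely once they pass the retention threshold.
SELECT add_retention_policy('conditions', INTERVAL '1 year');
```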

## Under the Hood

### LSM Trees, Memtables, and SSTables

Some TSDBs use Log-Structured Merge (LSM) trees for storage, while others use columnar or hybrid architectures optimized for time-series workloads.

New data lands in a memtable (an in-memory sorted structure). When the memtable fills, it flushes to disk as an immutable SSTable (Sorted String Table). The database maintains multiple SSTable levels, with background merging to consolidate them. Reads check the memtable first, then SSTables newest to oldest.

This converts random writes to sequential I/O. Immutable SSTables compress well because the database never updates them in place.

### Specialized Indices

B-tree indexes become expensive to maintain at massive scale for time-series workloads.

Time-partitioned indexes restrict each index to a single chunk, keeping sizes bounded. Data within a chunk is already sorted by time, so sequential scans often beat index lookups. Some TSDBs skip traditional indexes entirely, using block-level metadata and Bloom filters to prune irrelevant data without maintaining a full index structure.

SQL queries against a billion-row TSDB can return in milliseconds because partition pruning, chunk exclusion, and Bloom filters together narrow the search space before any data is read from disk.
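
One way to check that pruning is actually happening: ask the planner. A sketch (plan output varies by version, but excluded chunks simply never appear in it):

```sql
-- Only chunks overlapping the time predicate should show up in the
-- plan; older chunks are excluded before execution.
EXPLAIN (COSTS OFF)
SELECT avg(temperature)
FROM conditions
WHERE time > '2025-01-14 00:00:00+00';
```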

### Distributed Column Stores

Columnar storage stores all values of a single column together. Queries that aggregate one column skip all other columns. Compression is better because adjacent values in a column are similar.

Systems like ClickHouse and Apache Pinot use distributed column store architectures for large-scale analytics workloads. TimescaleDB takes a hybrid approach with Hypercore: recent data stays in row format for fast writes, older data converts to columnar for analytical queries.

## Core Use Cases

**Monitoring infrastructure and observability.** Infrastructure metrics, application performance data, distributed traces, log aggregation. Real-time dashboards and alerting rules query this data constantly.

**Real-time analytics and alerting.** E-commerce checkout rates per minute. Trading volumes in real time. Foot traffic per retail location. Fast queries on fresh data to power alerts and dashboards.

**IoT telemetry and industrial applications.** Vibration and temperature readings from manufacturing equipment. Power generation and consumption from energy grids. Pressure and flow from oil rigs. GPS and diagnostics from connected vehicles.

**Financial tick data.** Price updates, trade executions, and order book changes at sub-millisecond intervals. Backtesting, risk metrics, compliance storage.

**Product and user analytics.** Clicks, sessions, feature usage, purchases. Conversion funnels, cohort retention, feature adoption across billions of events.

## When to Adopt a Time Series Database

Not every application with timestamps needs a TSDB. If your dataset fits in a well-indexed PostgreSQL table and queries return in acceptable time, you probably don't need one.

The signals are concrete: queries slowing as tables grow past hundreds of millions of rows. Storage costs climbing. Engineering time going to manual partitioning, index tuning, and retention scripts. Dashboards showing stale data because the batch pipeline runs on a delay.

### Choosing the Right Approach

You have two paths: adopt a specialized TSDB, or extend the database you already use (such as Postgres).

Specialized TSDBs like InfluxDB and QuestDB deliver strong performance for metrics and monitoring. The tradeoff: a new system, often with a proprietary query language or non-native SQL, separate tooling, and limited relational capabilities. Your data lives in two places, which means pipelines, sync, and operational overhead.

TimescaleDB takes the other path. It's an extension that gives PostgreSQL advanced time series capabilities: hypertables, Hypercore (hybrid row/columnar storage), continuous aggregates, native compression, and lifecycle automation (retention policies). Because TimescaleDB lets you stay with PostgreSQL, you keep full SQL, existing tooling, [<u>ACID guarantees</u>](https://www.tigerdata.com/learn/understanding-acid-compliance), and the ability to join time series data with relational data in the same query.

This matters when your time series data isn't isolated. If dashboards join metrics with customer records, or analytics must stay transactionally consistent with operational data, splitting into a separate system introduces lag and drift that compounds over time.
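
A sketch of what that single-database join looks like (the `api_metrics` hypertable and `customers` table are hypothetical):

```sql
-- Per-customer hourly latency: time-series and relational data in one
-- query, with no pipeline keeping two systems in sync.
SELECT c.name,
       time_bucket('1 hour', m.time) AS bucket,
       avg(m.latency_ms) AS avg_latency_ms
FROM api_metrics m
JOIN customers c ON c.id = m.customer_id
WHERE m.time > now() - INTERVAL '7 days'
GROUP BY c.name, bucket
ORDER BY bucket;
```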

Time-series database users have achieved significant query speedups, storage savings through compression, and infrastructure cost reductions, as shown in these [<u>case studies</u>](https://www.tigerdata.com/case-studies).

### Trade-offs

A specialized TSDB is entirely optimized for one workload but adds operational complexity. Extending your existing database preserves stack simplicity but works within that ecosystem's constraints.

The question isn't "which database is fastest on benchmarks?" It's "what does my architecture look like in two years?" If your data is growing, your analytical queries are getting more complex, and you're already on PostgreSQL, [<u>extending Postgres</u>](https://www.tigerdata.com/blog/vertical-scaling-buying-time-you-cant-afford) avoids the architectural fragmentation that eventually becomes the most expensive problem.

## Integrating TSDBs into Your Stack

TSDBs connect to streaming platforms (Kafka, Pulsar), warehouses (Snowflake, BigQuery), visualization tools (Grafana, Tableau), and ML pipelines. The challenge is maintaining data freshness while controlling costs.

A common pattern: real-time TSDB for operational queries and alerts, continuous aggregates for dashboards, periodic snapshots to a warehouse for historical analysis. TSDBs that support standard SQL and the PostgreSQL wire protocol integrate with existing tools without custom adapters.

Developers want fewer systems, not more. Analytics on live data without pipelines. Time series capabilities in the database they already use. The databases that win long-term won't force teams to choose between performance and simplicity.

## Frequently Asked Questions

### What exactly is time-series data?

Time series data is any data where the timestamp is the primary organizing dimension. Server metrics every 10 seconds, IoT sensor readings, stock ticks, clickstream events. The defining trait is the access pattern: queries by time range, aggregation by interval, comparison across periods.

### What is the best way to store 100 TB of time-series data?

Three things working together: columnar compression (a good TSDB compresses 100 TB down to 5 to 15 TB), time-based partitioning (queries only scan relevant chunks), and automated tiering (old data moves to cheap object storage but stays queryable). TimescaleDB handles all three natively. At 100 TB, managing this manually is a full-time job.

### What is the best database type for time series?

Depends on your stack. If you're on PostgreSQL, TimescaleDB adds time-series performance without a second database. For standalone metrics ingestion, InfluxDB and QuestDB are purpose-built. For heavy analytical aggregations, ClickHouse is strong. The worst choice is usually "stick with vanilla Postgres and hope tuning is enough."

### What are good services for a time-series database server?

Managed: Tiger Cloud (TimescaleDB + PostgreSQL, usage-based pricing), Amazon Timestream (AWS-native, limited SQL compatibility), InfluxDB Cloud. Self-hosted: TimescaleDB runs anywhere PostgreSQL runs. QuestDB and ClickHouse offer both. The managed-vs-self-hosted decision usually comes down to whether you have a DBA, along with your data privacy and compliance requirements.

### Which time series database works best for high-volume IoT data?

IoT needs sustained write throughput, compression for repetitive readings, and handling of late or out-of-order data. TimescaleDB combines high ingest with 84% to 97% compression in production IoT deployments, plus mutable data for sensor recalibrations. [<u>Waterbridge ingests 5,000 to 10,000 points per second</u>](https://www.tigerdata.com/blog/how-waterbridge-uses-timescaledb-for-real-time-data-consistency). [<u>Glooko ingests 3 billion data points/month</u>](https://www.tigerdata.com/blog/how-glooko-turns-3b-data-points-per-month-into-lifesaving-diabetes-healthcare-tiger-data).

### Can relational databases handle time-series data?

Up to a point. PostgreSQL handles time-stamped data fine under a few hundred million rows with simple queries. Past billions of rows, B-tree indexes become inefficient, row-oriented storage wastes I/O on analytical scans, and maintenance operations grow expensive. TimescaleDB extends PostgreSQL with time-series primitives: same SQL, same ACID guarantees, plus the partitioning, compression, and query optimizations of a TSDB.

### How do you store stock market time-series data?

Financial tick data needs sub-millisecond timestamp precision, high ingest during market hours, low-latency queries, and long-term compliance storage. The typical approach: ingest raw ticks with nanosecond timestamps, use continuous aggregates for OHLCV bars at 1-second, 1-minute, and 1-hour intervals. QuestDB and kdb+ are popular for ultra-low-latency trading. TimescaleDB fits when you need to join tick data with relational data (orders, accounts, compliance metadata) in the same database.
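
As a sketch, one-minute OHLCV bars from a hypothetical `ticks` table; `first()` and `last()` are TimescaleDB aggregates that pick a value ordered by a second column:

```sql
-- Open/high/low/close/volume per symbol per minute, from raw ticks.
SELECT time_bucket('1 minute', time) AS bar,
       symbol,
       first(price, time) AS open,
       max(price)         AS high,
       min(price)         AS low,
       last(price, time)  AS close,
       sum(volume)        AS volume
FROM ticks
GROUP BY bar, symbol
ORDER BY bar;
```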

### Where can I find TimescaleDB time series technical papers?

Apart from the Tiger Data [documentation](https://www.tigerdata.com/docs/) and [blog](https://www.tigerdata.com/blog), you can access whitepapers such as [Understanding Postgres Performance Limits for Analytics on Live Data](https://www.tigerdata.com/blog/postgres-optimization-treadmill) and [Tiger Data Architecture for Real-Time Analytics](https://www.tigerdata.com/docs/about/latest/whitepaper).