<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
    <channel>
        <title><![CDATA[Tiger Data Blog]]></title>
        <description><![CDATA[Insights, product updates, and tips from TigerData (Creators of TimescaleDB) engineers on Postgres, time series & AI. IoT, crypto, and analytics tutorials & use cases.]]></description>
        <link>https://www.tigerdata.com/blog</link>
        <image>
            <url>https://www.tigerdata.com/icon.ico</url>
            <title>Tiger Data Blog</title>
            <link>https://www.tigerdata.com/blog</link>
        </image>
        <generator>RSS for Node</generator>
        <lastBuildDate>Tue, 07 Apr 2026 09:54:48 GMT</lastBuildDate>
        <atom:link href="https://www.tigerdata.com/blog" rel="self" type="application/rss+xml"/>
        <ttl>60</ttl>
        <item>
            <title><![CDATA[Start on Postgres, Scale on Postgres: How TimescaleDB 2.25 Continues to Improve the Way Postgres Scales]]></title>
            <description><![CDATA[Start on Postgres, scale on Postgres: TimescaleDB 2.25 delivers 289× faster queries, better chunk pruning, and lower-cost continuous aggregates at scale.]]></description>
            <link>https://www.tigerdata.com/blog/start-on-postgres-scale-on-postgres</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/start-on-postgres-scale-on-postgres</guid>
            <category><![CDATA[Announcements & Releases]]></category>
            <category><![CDATA[TimescaleDB]]></category>
            <category><![CDATA[PostgreSQL]]></category>
            <dc:creator><![CDATA[Mike Freedman]]></dc:creator>
            <pubDate>Tue, 17 Feb 2026 17:33:46 GMT</pubDate>
            <media:content medium="image" url="https://timescale.ghost.io/blog/content/images/2026/02/timescaledb-2-25.png">
            </media:content>
            <content:encoded><![CDATA[<p>Most developers start building with Postgres because it’s simple, reliable, and flexible. You get a clear relational model, transactional semantics you can trust, and an ecosystem that lets teams move quickly without committing to a complex architecture early. The challenge is keeping that simplicity as systems grow. Higher ingest, larger datasets, and increasingly real-time analytical workloads can push teams toward a second system long before they want one.</p><p>This pressure is most visible in time-series workloads that demand real-time performance. High write rates, append-heavy tables, and repeated queries over recent windows stress both storage and execution paths. Without reducing the amount of work required per query, scale quickly becomes an architectural problem rather than a <a href="https://timescale.ghost.io/blog/postgres-optimization-treadmill/" rel="noreferrer">performance optimization</a> one, shifting effort from incremental tuning to changes in system design.</p><p>TimescaleDB is designed to change that trajectory. “Start on Postgres, scale on Postgres” is a promise, but it is grounded in a specific architectural approach: performance at scale comes from reducing the work the database must do as data grows, then parallelizing what remains. 
TimescaleDB 2.25 continues this evolution by tightening the execution and maintenance paths that dominate cost at scale, so common workloads become cheaper and operationally steadier under sustained growth.</p><p>This release focuses on three outcomes: faster queries without constant tuning, efficient scaling to larger datasets and higher ingest, and real-time analytics that stays current and trustworthy without introducing a second system.</p><h2 id="faster-postgres-queries-at-scale-with-less-tuning">Faster Postgres queries at scale, with less tuning</h2><p>Compression, chunk pruning, and columnar execution already reduce query cost by limiting how much data needs to be read and processed. In 2.25, more queries can avoid work entirely, and the planner is more consistent about selecting those cheaper plans.</p><p>A clear example is aggregation on compressed data. In earlier releases, queries using functions like <code>MIN</code>, <code>MAX</code>, <code>FIRST</code>, or <code>LAST</code> benefited from compression and metadata, but they still required scanning compressed batches and performing aggregation during execution. The scan was cheaper than a row-oriented approach, but it was still work proportional to the data touched.</p><p>In 2.25, these aggregates can often be answered directly from sparse metadata maintained for compressed chunks. The planner can choose a custom execution path that reads summaries rather than scanning or decompressing data. This is implemented via the new <code>ColumnarIndexScan</code> plan node (see <a href="https://github.com/timescale/timescaledb/pull/9088"><u>PR #9088</u></a>,<a href="https://github.com/timescale/timescaledb/pull/9103"> <u>PR #9103</u></a>, and <a href="https://github.com/timescale/timescaledb/pull/9108"><u>PR #9108</u></a>). On workloads where this applies, the 2.25 release notes report this class of queries speeding up by up to 289x. 
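</p><p>As an illustration, this is the class of query that can hit the metadata fast path. A sketch assuming a hypothetical compressed hypertable <code>metrics</code> (the schema is invented for illustration; the plan node name comes from the release):</p><pre><code class="language-sql">-- Hypothetical compressed hypertable: MIN/MAX over a time window
-- can often be answered from per-chunk sparse metadata in 2.25,
-- without decompressing batches.
SELECT min(temperature), max(temperature)
FROM metrics
WHERE time &gt; now() - INTERVAL '7 days';

-- EXPLAIN may show the new ColumnarIndexScan node when this
-- fast path applies.
EXPLAIN
SELECT max(temperature) FROM metrics;</code></pre><p>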
For teams running dashboards or monitoring queries over large compressed datasets, this can translate into dramatically faster response times with no query changes required.</p><p>The important shift here is in cost structure. Once an answer can be derived from metadata, performance is no longer tied to the number of rows stored inside a chunk. It is tied to the minimum work required to identify relevant chunks and read their summaries, which becomes more valuable as datasets grow.</p><p>A complementary improvement applies the same idea to another common pattern: time-filtered queries that do not need to materialize column values. For queries like <code>SELECT COUNT(*) FROM events WHERE time &gt; ...</code>, previously, the execution path could still require decompressing the time column to evaluate the predicate, even though the query does not need to read time values for every row. In 2.25, the time column can often be skipped entirely for these cases, reducing CPU and memory pressure while preserving the same result (see <a href="https://github.com/timescale/timescaledb/pull/9094"><u>PR #9094</u></a>). The release notes describe this pattern as up to 50x faster for the example query.</p><p>As these fast paths expand, plan stability becomes just as important as peak speed. Even when an efficient path exists, teams feel it when the planner chooses it inconsistently or when small changes in query shape lead to surprising regressions. In 2.25, planner improvements around columnar scan paths and ordering help make compression-aware execution more predictable (see <a href="https://github.com/timescale/timescaledb/pull/8986"><u>PR #8986</u></a> and <a href="https://github.com/timescale/timescaledb/pull/9133"><u>PR #9133</u></a>). 
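</p><p>Verifying which path the planner chose requires no new tooling; standard <code>EXPLAIN</code> works as usual (a sketch, with the table name invented):</p><pre><code class="language-sql">-- Inspect the chosen plan and the actual work done for a time-filtered count.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM events WHERE time &gt; now() - INTERVAL '1 day';</code></pre><p>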
Fewer surprises mean less time spent tuning and diagnosing why a query slowed down as data evolved.</p><h2 id="efficient-scaling-for-high-ingest-postgres-workloads">Efficient scaling for high-ingest Postgres workloads</h2><p>A hard part of scaling is not only achieving good performance at a given size, but preserving efficiency as data volume, ingest rate, and concurrency grow together over time. In practice, scaling pressure shows up in two ways. Some costs grow gradually, such as planning and execution work increasing with the number of partitions. Others appear more abruptly, when accumulated complexity makes execution brittle and small changes in data or query shape trigger different plans and sudden slowdowns.</p><p>TimescaleDB’s scaling model is designed to address both. It relies on clear boundaries: partitioning data into chunks, using metadata to prune irrelevant chunks, and compressing data to reduce the work required within each chunk. In 2.25, several refinements make these boundaries behave more efficiently and consistently under sustained growth.</p><p>One pressure point is that chunk counts rise over long retention windows, making pruning and constraint handling increasingly important. Earlier versions already used constraints and metadata to skip irrelevant chunks, but there were cases where constraint handling became more permissive than necessary, causing queries to consider more chunks than required as datasets aged. In 2.25, constraint handling improves for fully covered chunks, helping keep both planning and execution costs more tightly bounded as data volume increases (see <a href="https://github.com/timescale/timescaledb/pull/9127"><u>PR #9127</u></a>).</p><p>Planning behavior under high partition counts is another area where inefficiency and brittleness can emerge together. As hypertables accumulate thousands of chunks, planning time and plan quality can matter as much as execution speed, especially for joins and more complex query shapes. 
TimescaleDB 2.25 includes fixes for a planning performance regression on Postgres 16 and later affecting some join queries (see <a href="https://github.com/timescale/timescaledb/pull/8706"><u>PR #8706</u></a>). These changes reduce both how quickly planning cost grows and how likely it is to tip into unstable behavior as workloads evolve.</p><p>The result is more efficient scaling in practice. Costs still grow with data, but they grow more slowly and with fewer surprises, allowing Postgres to continue scaling in place rather than forcing architectural changes to manage accumulated overhead.</p><h2 id="real-time-analytics-in-postgres-without-a-split-architecture">Real-time analytics in Postgres, without a split architecture</h2><p>As refresh frequency increases and datasets grow, keeping analytics fresh inside the primary database can create background pressure. That pressure grows unless refresh and maintenance paths stay efficient. TimescaleDB has long supported real-time analytics inside Postgres through continuous aggregates, compression, and retention policies. In 2.25, the focus is on lowering the operational footprint of staying current as systems run continuously.</p><p>One improvement is compressed continuous aggregate refresh. Earlier versions supported refreshing into compressed hypertables, but the refresh path could include intermediate steps that added extra I/O and CPU work. In 2.25, direct compression on continuous aggregate refresh is enabled via a configuration option, reducing unnecessary data movement when keeping aggregates up to date (see <a href="https://github.com/timescale/timescaledb/pull/8777"><u>PR #8777</u></a> and <a href="https://github.com/timescale/timescaledb/pull/9038"><u>PR #9038</u></a>). The semantics are unchanged, but the cost of maintaining freshness is lower, especially for frequent refresh schedules.</p><p>This is complemented by refinements to batching. 
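</p><p>For context, the refresh policies these changes affect are created with <code>add_continuous_aggregate_policy</code>. A minimal sketch, assuming a hypothetical continuous aggregate <code>metrics_hourly</code> (the offsets and schedule are invented for illustration):</p><pre><code class="language-sql">-- Refresh the last few days of the hypothetical aggregate every 30 minutes;
-- the policy runs as a background job.
SELECT add_continuous_aggregate_policy('metrics_hourly',
  start_offset      =&gt; INTERVAL '3 days',
  end_offset        =&gt; INTERVAL '1 hour',
  schedule_interval =&gt; INTERVAL '30 minutes');</code></pre><p>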
Large refresh transactions can temporarily increase WAL volume and create uneven load. In 2.25, the default <code>buckets_per_batch</code> for continuous aggregate refresh policies is adjusted to keep transactions smaller (from 1 to 10 buckets), reducing WAL holding and making refresh behavior steadier under sustained ingest (see <a href="https://github.com/timescale/timescaledb/pull/9031"><u>PR #9031</u></a>).</p><p>The release also includes incremental improvements that reduce background churn from lifecycle operations like retention and deletes on long-running datasets, along with correctness and robustness fixes for compressed and partitioned workloads. For example, support for retention policies on UUIDv7-partitioned hypertables expands the set of configurations where lifecycle management remains reliable over time (see <a href="https://github.com/timescale/timescaledb/pull/9102"><u>PR #9102</u></a>). These changes are small individually, but they matter for trust. Real-time analytics only works if results stay aligned with transactional truth as schemas and workloads evolve.</p><h2 id="closing">Closing</h2><p>TimescaleDB 2.25 continues to make Postgres a better place to run real-time analytics at scale: faster queries through less work, smoother behavior as data and ingest grow, and lower operational overhead for keeping analytics current and correct.&nbsp;</p><p>All in service of a simple yet powerful idea: <strong>start on Postgres, scale on Postgres. 
</strong><a href="https://timescale.ghost.io/blog/postgres-optimization-treadmill/" rel="noreferrer"><strong>Learn why vanilla Postgres hits performance ceilings at scale</strong></a><strong>.</strong></p><p><strong><em>To learn more, check out the </em></strong><a href="https://github.com/timescale/timescaledb/releases"><strong><em><u>full release notes</u></em></strong></a><strong><em> or </em></strong><a href="https://console.cloud.timescale.com/signup"><strong><em><u>try Tiger Cloud for free</u></em></strong></a><strong><em> and experience TimescaleDB 2.25 on your largest hypertables. </em></strong><a href="https://www.tigerdata.com/blog/from-4-databases-to-1-how-plexigrid-replaced-influxdb-got-350x-faster-queries-tiger-data" rel="noreferrer"><strong><em>Learn how Plexigrid consolidated 4 databases into Postgres and got 350x faster queries.</em></strong></a></p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Postgres for Agents]]></title>
            <description><![CDATA[Agentic Postgres: the first database built for agents. Native search, instant forks, MCP integration, new CLI, and free tier. Built for agents. Designed for developers.]]></description>
            <link>https://www.tigerdata.com/blog/postgres-for-agents</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/postgres-for-agents</guid>
            <category><![CDATA[Announcements & Releases]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[AI agents]]></category>
            <category><![CDATA[PostgreSQL]]></category>
            <dc:creator><![CDATA[Ajay Kulkarni]]></dc:creator>
            <pubDate>Tue, 21 Oct 2025 13:46:50 GMT</pubDate>
            <media:content medium="image" url="https://timescale.ghost.io/blog/content/images/2025/10/2025-ABL-Launch-Blog-Thumbnail.png">
            </media:content>
            <content:encoded><![CDATA[<h3 id="announcing-agentic-postgres-the-first-database-built-for-agents"><em>Announcing Agentic Postgres: The first database built for agents.</em></h3><h2 id="agents-are-the-new-developer">Agents are the New Developer</h2><p>80% of Claude Code <a href="https://www.reddit.com/r/singularity/comments/1khxwjh/claude_code_wrote_80_of_its_own_code_anthropic_dev/"><u>was written by AI</u></a>. More than a quarter of all new code at Google <a href="https://arstechnica.com/ai/2024/10/google-ceo-says-over-25-of-new-google-code-is-generated-by-ai/"><u>was generated by AI</u></a> <em>one year ago</em>. It’s safe to say that in the next 12 months, the majority of all new code will be written by AI.</p><p>Agents don’t behave like humans. They behave in new ways. Software development tools need to evolve. Agents need a new kind of database made for how they work.</p><p>But what would a database for agents look like?</p><p>At Tiger, we’ve obsessed over databases for the past 10 years. We’ve built high-performance systems for time series data, scaled Postgres across millions of workloads, and served thousands of customers and hundreds of thousands of developers around the world.&nbsp;</p><p>​​So when agents arrived, we felt it immediately. In our bones. This new era of computing would need its own kind of data infrastructure. One that still delivered power without complexity, but built for a new type of user.&nbsp;</p><p>How do agents behave?</p><ul><li>Agents don’t click, they call.</li><li>Agents don’t remember, they retrieve.</li><li>Agents can download expertise to become experts.</li><li>Agents can parallelize effortlessly, acting like a multi-threaded team.</li><li>Agents need a safe sandbox where they can play (or wreak havoc).</li><li>Agents can also hammer your infrastructure (and your budget) if you’re not careful.</li></ul><p>We started on this problem over a year ago. 
Multiple teams working in parallel, months of engineering and internal user feedback, rethinking everything from the storage layer to how agents actually reason.&nbsp;</p><p>Here’s what we built.&nbsp;</p><h2 id="introducing-agentic-postgres">Introducing Agentic Postgres</h2><p>Today we’re launching Agentic Postgres, the first database designed from the ground up for agents. It includes:</p><p><strong>The best database MCP server ever built</strong></p><p>Agentic Postgres includes our new <a href="https://www.tigerdata.com/blog/free-postgres-mcp-prompt-templates" rel="noreferrer">MCP server</a> that enables agents not just to interact with the database but also understand how to use it well. We’ve taken our 10+ years of Postgres experience and distilled it into a set of built-in master prompts. This gives agents safe, structured access to the database through high-level tools for schema design, query tuning, migrations, and more. The MCP server also performs native full-text and semantic search over the Postgres docs, so agents can instantly retrieve the right context as they think.&nbsp;</p><pre><code class="language-markdown">&gt; I want to create a personal assistant app. Please create a free service on Tiger. Then using Postgres best practices, describe the schema you would create.</code></pre><figure class="kg-card kg-video-card kg-width-regular" data-kg-thumbnail="https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/media/2025/10/DE84BB33-B4FE-4F6E-8398-9267033F6870-2_thumb.jpg" data-kg-custom-thumbnail="">
            <div class="kg-video-container">
                <video src="https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/media/2025/10/DE84BB33-B4FE-4F6E-8398-9267033F6870-2.mp4" poster="https://img.spacergif.org/v1/1920x1080/0a/spacer.png" width="1920" height="1080" loop="" autoplay="" muted="" playsinline="" preload="metadata" style="background: transparent url('https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/media/2025/10/DE84BB33-B4FE-4F6E-8398-9267033F6870-2_thumb.jpg') 50% 50% / cover no-repeat;"></video>
                <div class="kg-video-overlay">
                    <button class="kg-video-large-play-icon" aria-label="Play video">
                        <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                            <path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"></path>
                        </svg>
                    </button>
                </div>
                <div class="kg-video-player-container kg-video-hide">
                    <div class="kg-video-player">
                        <button class="kg-video-play-icon" aria-label="Play video">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"></path>
                            </svg>
                        </button>
                        <button class="kg-video-pause-icon kg-video-hide" aria-label="Pause video">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"></rect>
                                <rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"></rect>
                            </svg>
                        </button>
                        <span class="kg-video-current-time">0:00</span>
                        <div class="kg-video-time">
                            /<span class="kg-video-duration">0:30</span>
                        </div>
                        <input type="range" class="kg-video-seek-slider" max="100" value="0">
                        <button class="kg-video-playback-rate" aria-label="Adjust playback speed">1×</button>
                        <button class="kg-video-unmute-icon" aria-label="Unmute">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"></path>
                            </svg>
                        </button>
                        <button class="kg-video-mute-icon kg-video-hide" aria-label="Mute">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"></path>
                            </svg>
                        </button>
                        <input type="range" class="kg-video-volume-slider" max="100" value="100">
                    </div>
                </div>
            </div>
            
        </figure><p><strong>Native search and retrieval</strong></p><p>Agentic Postgres comes with native full-text and semantic search built directly into the database. For semantic search, we’ve improved our existing extension pgvectorscale, for higher-throughput indexing, better recall, and lower latency at scale than pgvector.&nbsp;</p><p>For full-text search, <a href="https://www.tigerdata.com/blog/introducing-pg_textsearch-true-bm25-ranking-hybrid-retrieval-postgres" rel="noreferrer">pg_textsearch</a>, our newest Postgres extension, implements BM25 for modern ranked keyword search optimized for hybrid AI applications alongside pgvectorscale. The current preview release uses an in-memory structure for fast writes and queries. Future releases will add disk-based segments with compression and BlockMax WAND optimization, applying the same battle-tested techniques from production search engines.</p><p>Together, these extensions let agents retrieve structured data instantly without leaving Postgres.&nbsp;&nbsp;</p><pre><code class="language-markdown">&gt; Using service qilk2gqjuz, analyze user feedback with hybrid search (combining text search and semantic search). Group similar feedback by theme and show counts for each theme, using an ascii bar chart. First, look at the pg_textsearch (BM25) and pgvectorscale documentation in the Tiger docs to get the proper syntax, and then use those extensions.</code></pre><figure class="kg-card kg-video-card kg-width-regular" data-kg-thumbnail="https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/media/2025/10/demo-2--1-_thumb.jpg" data-kg-custom-thumbnail="">
            <div class="kg-video-container">
                <video src="https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/media/2025/10/demo-2--1-.mp4" poster="https://img.spacergif.org/v1/1920x1080/0a/spacer.png" width="1920" height="1080" loop="" autoplay="" muted="" playsinline="" preload="metadata" style="background: transparent url('https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/media/2025/10/demo-2--1-_thumb.jpg') 50% 50% / cover no-repeat;"></video>
                <div class="kg-video-overlay">
                    <button class="kg-video-large-play-icon" aria-label="Play video">
                        <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                            <path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"></path>
                        </svg>
                    </button>
                </div>
                <div class="kg-video-player-container kg-video-hide">
                    <div class="kg-video-player">
                        <button class="kg-video-play-icon" aria-label="Play video">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"></path>
                            </svg>
                        </button>
                        <button class="kg-video-pause-icon kg-video-hide" aria-label="Pause video">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"></rect>
                                <rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"></rect>
                            </svg>
                        </button>
                        <span class="kg-video-current-time">0:00</span>
                        <div class="kg-video-time">
                            /<span class="kg-video-duration">0:50</span>
                        </div>
                        <input type="range" class="kg-video-seek-slider" max="100" value="0">
                        <button class="kg-video-playback-rate" aria-label="Adjust playback speed">1×</button>
                        <button class="kg-video-unmute-icon" aria-label="Unmute">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"></path>
                            </svg>
                        </button>
                        <button class="kg-video-mute-icon kg-video-hide" aria-label="Mute">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"></path>
                            </svg>
                        </button>
                        <input type="range" class="kg-video-volume-slider" max="100" value="100">
                    </div>
                </div>
            </div>
            
        </figure><p><strong>Fast, zero-copy forks</strong></p><p>At the core of Agentic Postgres is a new copy-on-write block storage layer that makes <a href="https://www.tigerdata.com/blog/fast-zero-copy-database-forks" rel="noreferrer">databases instantly forkable</a>. Every agent can spin up its own isolated environment, a full copy of production data in seconds, without duplicating data (or costs). Every fork is lightweight and efficient, so you only pay for the blocks that change. It’s perfect for experiments, benchmarks, and migrations that can run safely in parallel.&nbsp;</p><pre><code class="language-markdown">&gt; Please create a fork of gf868h9j1y using the last snapshot, and then test 3 different indexes that we should create to speed up performance, then delete the fork, and report back on your findings. Before you start run “tiger service fork --help” and “tiger service delete --help” to get the right syntax. Use MCP over psql, using the password from the local keychain.</code></pre><figure class="kg-card kg-video-card kg-width-regular" data-kg-thumbnail="https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/media/2025/10/demo-3--1-_thumb.jpg" data-kg-custom-thumbnail="">
            <div class="kg-video-container">
                <video src="https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/media/2025/10/demo-3--1-.mp4" poster="https://img.spacergif.org/v1/1920x1080/0a/spacer.png" width="1920" height="1080" loop="" autoplay="" muted="" playsinline="" preload="metadata" style="background: transparent url('https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/media/2025/10/demo-3--1-_thumb.jpg') 50% 50% / cover no-repeat;"></video>
                <div class="kg-video-overlay">
                    <button class="kg-video-large-play-icon" aria-label="Play video">
                        <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                            <path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"></path>
                        </svg>
                    </button>
                </div>
                <div class="kg-video-player-container kg-video-hide">
                    <div class="kg-video-player">
                        <button class="kg-video-play-icon" aria-label="Play video">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"></path>
                            </svg>
                        </button>
                        <button class="kg-video-pause-icon kg-video-hide" aria-label="Pause video">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"></rect>
                                <rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"></rect>
                            </svg>
                        </button>
                        <span class="kg-video-current-time">0:00</span>
                        <div class="kg-video-time">
                            /<span class="kg-video-duration">1:06</span>
                        </div>
                        <input type="range" class="kg-video-seek-slider" max="100" value="0">
                        <button class="kg-video-playback-rate" aria-label="Adjust playback speed">1×</button>
                        <button class="kg-video-unmute-icon" aria-label="Unmute">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"></path>
                            </svg>
                        </button>
                        <button class="kg-video-mute-icon kg-video-hide" aria-label="Mute">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"></path>
                            </svg>
                        </button>
                        <input type="range" class="kg-video-volume-slider" max="100" value="100">
                    </div>
                </div>
            </div>
            
        </figure><p><strong>New CLI and free tier</strong></p><p>We’ve also built a new CLI that makes it easy to explore, fork, and build with Agentic Postgres, and we’re launching a <a href="https://www.tigerdata.com/blog/introducing-agentic-postgres-free-plan-experiment-ai-on-postgres"><u>free tier</u></a> so every developer and every agent can get hands-on right away.</p><p>All of this is launching today. You can try it with three commands in your terminal:</p><pre><code class="language-Shell"># 3 commands to install the Tiger CLI and MCP. That's it!
$ curl -fsSL https://cli.tigerdata.com | sh
$ tiger auth login
$ tiger mcp install
</code></pre><p>Then just tell your agent to spin up a new free service using MCP, or simply call <code>tiger create service</code> from the command line to get going.</p><h2 id="powered-by-fluid-storage">Powered by Fluid Storage</h2><p>Agentic Postgres is powered by <a href="https://www.tigerdata.com/blog/fluid-storage-forkable-ephemeral-durable-infrastructure-age-of-agents" rel="noreferrer">Fluid Storage</a>, our new distributed storage layer. Fluid Storage is built on a disaggregated architecture of a horizontally scalable distributed block store using local NVMe storage, a storage proxy layer that exposes copy-on-write volumes, and a user-space storage device driver.</p><p>It’s storage that looks like a local disk to Postgres yet scales like a cloud service.</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2025/10/2025-Oct-28-fluid-storage-architecture-diagram-1.png" class="kg-image" alt="" loading="lazy" width="2000" height="1043" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2025/10/2025-Oct-28-fluid-storage-architecture-diagram-1.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2025/10/2025-Oct-28-fluid-storage-architecture-diagram-1.png 1000w, https://timescale.ghost.io/blog/content/images/size/w1600/2025/10/2025-Oct-28-fluid-storage-architecture-diagram-1.png 1600w, https://timescale.ghost.io/blog/content/images/size/w2400/2025/10/2025-Oct-28-fluid-storage-architecture-diagram-1.png 2400w" sizes="(min-width: 720px) 720px"></figure><p>As a result, Fluid Storage delivers instant forks, snapshots, and automatic scaling (up or down) without downtime or over-provisioning.  
In benchmark testing, a single volume sustains more than 110,000 IOPS, while retaining all of Fluid’s elasticity and copy-on-write capabilities.&nbsp;</p><p>All free services on Tiger Cloud run on Fluid Storage today, so every developer can experience its performance and flexibility firsthand.&nbsp;</p><p>And this is just the start. We’ll dive deeper into each of these (<a href="https://www.tigerdata.com/blog/free-postgres-mcp-prompt-templates" rel="noreferrer">MCP</a>, <a href="https://www.tigerdata.com/blog/introducing-pg_textsearch-true-bm25-ranking-hybrid-retrieval-postgres" rel="noreferrer">pg_textsearch</a>, <a href="https://github.com/timescale/pgvectorscale" rel="noreferrer">pgvectorscale</a>, <a href="https://www.tigerdata.com/blog/fast-zero-copy-database-forks" rel="noreferrer">forkable databases</a>, <a href="https://www.tigerdata.com/blog/fluid-storage-forkable-ephemeral-durable-infrastructure-age-of-agents" rel="noreferrer">Fluid Storage</a>, <a href="https://github.com/timescale/tiger-cli" rel="noreferrer">CLI</a>, <a href="https://www.tigerdata.com/blog/introducing-agentic-postgres-free-plan-experiment-ai-on-postgres" rel="noreferrer">free tier</a>) later this week and next.&nbsp;</p><h2 id="built-for-agents-and-developers">Built for Agents and Developers</h2><p>Agentic Postgres is built for agents, so developers can work on higher-level problems.&nbsp;</p><p>Building <em>with </em>and <em>for </em>agents, we’ve learned something simple: agents are not here to replace us. They’re here to elevate us.</p><p>Agents take on the mechanical, repetitive work, freeing us to focus on what matters most: architecture, design, creativity, impact. They make us faster and smarter, and enable us to do more ambitious work than we could do alone.&nbsp;&nbsp;</p><p>The myth is that AI will replace developers. 
The truth is that developers who build with agents will replace those who don’t.&nbsp;</p><p>Agentic Postgres is for developers who want to build <em>with</em> AI. For developers who care more about working applications than disposable demos. For developers who want AI to feel like engineering, not just experimentation. </p><p>Today’s launch is just the beginning. There are still some rough edges. We’d love your help sanding them down. But expect more to come: more launches, big and small, in the weeks, months, and years ahead.</p><p>Agents are the new developers. Agentic Postgres is their new playground.</p><p><strong>Built for Agents. Designed to Elevate Developers.</strong></p><p>So let’s build. Together.</p><p>Get started today:&nbsp;</p><pre><code class="language-Shell">$ curl -fsSL https://cli.tigerdata.com | sh</code></pre>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Tiger Lake: A New Architecture for Real-Time Analytical Systems and Agents]]></title>
            <description><![CDATA[Mike Freedman, CTO of Tiger Data, introduces Tiger Lake: a native Postgres–lakehouse bridge for real-time, analytical, and agentic systems.]]></description>
            <link>https://www.tigerdata.com/blog/tiger-lake-a-new-architecture-for-real-time-analytical-systems-and-agents</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/tiger-lake-a-new-architecture-for-real-time-analytical-systems-and-agents</guid>
            <category><![CDATA[Announcements & Releases]]></category>
            <category><![CDATA[PostgreSQL]]></category>
            <category><![CDATA[Tiger Lake]]></category>
            <dc:creator><![CDATA[Mike Freedman]]></dc:creator>
            <pubDate>Thu, 17 Jul 2025 12:59:06 GMT</pubDate>
            <media:content medium="image" href="https://timescale.ghost.io/blog/content/images/2025/07/2025-july-15-tigerlake-thumbnail.png">
            </media:content>
            <content:encoded><![CDATA[<div class="kg-card kg-callout-card kg-callout-card-blue"><div class="kg-callout-emoji">🔈</div><div class="kg-callout-text">Tiger Lake is currently in public beta for scale and enterprise users.&nbsp;<a href="https://console.cloud.timescale.com/signup" target="_blank" rel="noopener noreferrer">Sign up</a>&nbsp;for Tiger Cloud to try out your use case.</div></div><p>Modern applications are becoming more dynamic, more intelligent, and more real time. Dashboards refresh with incoming telemetry. Monitoring systems respond to shifting baselines. Agents make decisions in context, not in isolation. Each depends on the same foundational requirement: the ability to unify live events with deep historical state.</p><p>Yet the data remains fragmented.</p><p>Operational systems, built on Postgres, handle ingestion and serving. Analytical systems, built on the lakehouse, handle enrichment and modeling. Connecting them means stitching together streams, pipelines, and custom jobs—each introducing latency, fragility, and cost. The result is a patchwork of systems that struggle to deliver the full picture, let alone do so in real time.</p><p>This fragmentation doesn’t just slow teams down—it limits what developers can build. You can’t deliver real-time dashboards with historical depth or ground agents in fresh operational context when the data is split by design.</p><p>This architectural divide is no longer sustainable.</p><p><a href="https://docs.tigerdata.com/use-timescale/latest/tigerlake/"><u>Tiger Lake</u></a> bridges that divide. Now in public beta, it introduces a new data loop—continuous, bidirectional, and deeply integrated—between Postgres and the lakehouse. 
It simplifies the stack, preserves open formats, and brings operational and analytical context into the same system.</p><h2 id="introducing-tiger-lake-real-time-data-full-context-systems">Introducing Tiger Lake: Real-Time Data, Full-Context Systems</h2><p>Tiger Lake eliminates the need for external pipelines, complex orchestration frameworks, and proprietary middleware. It is built directly into Tiger Cloud and integrated with Tiger Postgres, our production-grade Postgres engine for transactional, analytical, and agentic workloads.</p><p>The architecture uses open standards from end to end:</p><ul><li>Apache Iceberg tables stored in Amazon S3 Tables for lakehouse integration</li><li>Continuous replication from Postgres tables or hypertables into Iceberg</li><li>Streaming ingestion back into Postgres for low-latency serving and operations</li><li>Pushing down queries from Postgres to Iceberg for efficient rollups</li></ul><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2025/07/2025-july-14-tigerlake-post-diagram-1.png" class="kg-image" alt="Tiger Lake architecture diagram" loading="lazy" width="2000" height="1866" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2025/07/2025-july-14-tigerlake-post-diagram-1.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2025/07/2025-july-14-tigerlake-post-diagram-1.png 1000w, https://timescale.ghost.io/blog/content/images/size/w1600/2025/07/2025-july-14-tigerlake-post-diagram-1.png 1600w, https://timescale.ghost.io/blog/content/images/size/w2400/2025/07/2025-july-14-tigerlake-post-diagram-1.png 2400w" sizes="(min-width: 720px) 720px"></figure><p>These capabilities come built in. What previously required Flink jobs, DAG schedulers, and custom glue now works natively. 
Streaming behavior and schema compatibility are designed into the system from the start.</p><p>To understand how Tiger Lake reshapes data architecture, it helps to <a href="https://www.tigerdata.com/blog/the-database-meets-the-lakehouse-toward-a-unified-architecture-for-modern-applications"><u>revisit the medallion model</u></a> and consider how it evolves when real-time context becomes a core design principle.</p><p>You can think of it as an <strong>operational medallion architecture</strong>:</p><ul><li><strong>Bronze:</strong> Raw data lands in Iceberg-backed S3.</li><li><strong>Silver: </strong>Cleaned and validated data is replicated to Postgres.</li><li><strong>Gold:</strong> Aggregates are computed in Postgres for real-time serving, then streamed back to Iceberg for feature analysis.</li></ul><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2025/07/medallion-architecture.png" class="kg-image" alt="Operational medallion architecture" loading="lazy" width="2000" height="735" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2025/07/medallion-architecture.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2025/07/medallion-architecture.png 1000w, https://timescale.ghost.io/blog/content/images/size/w1600/2025/07/medallion-architecture.png 1600w, https://timescale.ghost.io/blog/content/images/2025/07/medallion-architecture.png 2234w" sizes="(min-width: 720px) 720px"></figure><p>Traditional Bronze–Silver–Gold workflows were built for batch systems. Tiger Lake enables a continuous flow where enrichment and serving happen in real time.</p><p>This shift transforms an overly complex pipeline into a dynamic and simpler real-time data loop. Context and data move freely between systems. Operational and analytical layers stay connected without redundant jobs or duplicated infrastructure.</p><p>All data remains native, up to date, and queryable with standard SQL. 
Tiger Lake supports a single write path that powers real-time applications, dashboards, and the lakehouse, using the architecture that best fits the developer. Users can write data to Postgres, then have appropriate data and rollups automatically synced to their lakehouse; conversely, users already feeding raw data into the lakehouse can automatically bring it to Postgres for operational serving. Now, applications can reason across the now and the then—without orchestration code or synchronization overhead.</p><blockquote><em>"We stitched together Kafka, Flink, and custom code to stream data from Postgres to Iceberg. It worked, but it was fragile and high-maintenance," <strong>said Kevin Otten, Director of Technical Architecture at Speedcast.</strong> "Tiger Lake replaces all of that with native infrastructure. It’s the architecture we wish we had from day one."</em></blockquote><h2 id="from-architecture-to-outcomes">From Architecture to Outcomes</h2><p>Tiger Lake enables real-time systems that were previously too complex to operate or too expensive to build.</p><h3 id="customer-facing-dashboards">Customer-facing dashboards</h3><p>Dashboards can now combine live metrics with historical aggregates in a single query. There is no need for dual stacks or stale insights. Tiger Lake supports high-throughput ingestion at production scale, powering pipelines that visualize billions of rows in real time. 
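</p><p>A combined query of this shape can be sketched in plain SQL. The <code>metrics</code> hypertable and <code>metrics_hourly</code> rollup below are illustrative names, not part of the announcement; TimescaleDB's <code>time_bucket</code> does the fresh-data aggregation:</p><pre><code class="language-SQL">-- Illustrative only: one query over a historical rollup plus fresh raw rows
SELECT bucket, avg_value
FROM metrics_hourly                      -- pre-computed hourly aggregate
WHERE bucket &lt; now() - INTERVAL '1 hour'
UNION ALL
SELECT time_bucket('1 hour', ts) AS bucket, avg(value)
FROM metrics                             -- live, not-yet-aggregated rows
WHERE ts &gt;= now() - INTERVAL '1 hour'
GROUP BY 1;</code></pre><p>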
Everything lives in one system, continuously updated and instantly queryable.</p><blockquote><em>"With Tiger Lake, we finally unified our real-time and historical data," <strong>said Maxwell Carritt, Lead IoT Engineer at Pfeifer &amp; Langen.</strong> "Now we seamlessly stream from Tiger Postgres into Iceberg, giving our analysts the power to explore, model, and act on data across S3, Athena, and Tiger Data."</em></blockquote><h3 id="monitoring-systems">Monitoring systems</h3><p>With a single source of truth and a continuous data loop, alerting becomes faster and more reliable. Engineers can run one SQL query to inspect fresh telemetry and historical incidents together—improving triage speed, reducing false positives, and staying focused on what matters.</p><p>Simplifying the data plane also improves system resilience. Tiger Lake lets monitoring systems operate on the same live operational backbone, where Iceberg provides historical depth and Tiger Postgres delivers low-latency access.</p><h3 id="agents">Agents</h3><p>Tiger Lake makes grounding possible without additional infrastructure. Developers can embed recent user activity and long-term interaction history directly inside Postgres. There is no need for orchestration, vector drift management, or custom AI pipelines.</p><p>Imagine a support agent receives a new inquiry. The large body of historical support cases remains in Iceberg, while Tiger Lake automatically creates chunks and <a href="https://www.tigerdata.com/blog/a-beginners-guide-to-vector-embeddings" rel="noreferrer">vector embeddings</a> in Postgres. 
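</p><p>The lookup itself can be sketched with pgvector's distance operator. The <code>support_cases</code> table, its columns, and the query-embedding parameter are illustrative, not a documented schema:</p><pre><code class="language-SQL">-- Illustrative only: nearest-neighbor search over the synced embeddings
SELECT id, summary
FROM support_cases
ORDER BY embedding &lt;=&gt; $1   -- cosine distance to the inquiry's embedding
LIMIT 5;</code></pre><p>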
Now vector search against the operational database can answer AI chat questions quickly, while ensuring that embeddings stay fresh and up-to-date without complex orchestration pipelines.&nbsp;&nbsp;</p><p>In doing so, Tiger Lake is also a key building block in what we call Agentic Postgres, a Postgres foundation for intelligent systems that learn, decide, and act.</p><blockquote><em>"With Tiger Lake, we believe Tiger Data is setting a strong foundation for turning Postgres into the operational engine of the open lakehouse for applications,"<strong> said Ken Yoshioka, CTO, Lumia Health.</strong> "It allows us the flexibility to grow our biotech startup quickly with infrastructure designed for both analytics and agentic AI."</em></blockquote><p>Companies like Speedcast, Lumia Health, and Pfeifer &amp; Langen are already building full-context and real-time analytical systems with Tiger Lake. These architectures power industrial telemetry, agentic workflows, and real-time operations, all from a unified, continuously streaming platform.</p><h3 id="coming-soon-round-trip-intelligence">Coming soon: Round-trip intelligence</h3><ul><li><strong>Later this summer:</strong> Query Iceberg catalogs directly from within Postgres. Explore, join, and reason across lakehouse and operational data using SQL.</li><li><strong>Fall 2025: </strong>Full round-trip workflows: ingest into Postgres, enrich in Iceberg and stream results back automatically. This lets developers move from event to analysis to action in one architecture.</li></ul><h3 id="how-to-set-up-tiger-lake">How to set up Tiger Lake</h3><p>Getting started is simple. No complex orchestration or manual integrations:</p><ul><li>Create a bucket for Iceberg-compatible S3 tables.</li><li>Provide ARN permissions to Tiger Cloud.</li><li>Enable table sync in Tiger Postgres:</li></ul><pre><code class="language-SQL">ALTER TABLE my_hypertable SET (
  tigerlake.iceberg_sync = true
);</code></pre><h2 id="the-future-of-data-architecture-is-real-time-contextual-and-open">The Future of Data Architecture Is Real-Time, Contextual, and Open</h2><p>Tiger Lake introduces a new kind of architecture. It is continuous by design, scalable by default, and optimized for applications that need full context and complete data in real time.</p><p>Operational data flows into the lakehouse for enrichment and modeling. Enriched insights flow back into Postgres for low-latency serving. Applications and agents complete the loop, responding with precision and speed.</p><p>We believe this is the foundation for what comes next:</p><ul><li>Systems that unify operational use cases and internal analytics</li><li>Architectures that reduce complexity instead of compounding it</li><li>Workloads that are not just reactive but grounded in understanding</li></ul><p>You should not have to choose between context and simplicity. You should not have to patch together systems that were never designed to work together. And you should not have to replatform to evolve.</p><p>Together with next-generation storage architecture and our Postgres-native AI tooling, Tiger Lake forms the backbone of Agentic Postgres. This is a foundation built for intelligent workloads that learn, simulate, and act. We’ll share more soon.</p><p>Try it today on <a href="https://console.cloud.timescale.com/signup"><u>Tiger Cloud</u></a>, and check out the <a href="https://docs.tigerdata.com/use-timescale/latest/tigerlake/"><u>Tiger Lake docs</u></a> to get started.</p><p>— Mike</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Speed Without Sacrifice: Building the Modern PostgreSQL for the Analytical and Agentic Era]]></title>
            <description><![CDATA[Cofounders of Tiger Data (creators of TimescaleDB) Ajay Kulkarni and Mike Freedman discuss the company’s new name, showing how it reflects the company’s evolution.]]></description>
            <link>https://www.tigerdata.com/blog/timescale-becomes-tigerdata</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/timescale-becomes-tigerdata</guid>
            <category><![CDATA[Announcements & Releases]]></category>
            <category><![CDATA[General]]></category>
            <category><![CDATA[PostgreSQL]]></category>
            <category><![CDATA[Tiger Data]]></category>
            <dc:creator><![CDATA[Ajay Kulkarni]]></dc:creator>
            <pubDate>Tue, 17 Jun 2025 14:14:45 GMT</pubDate>
            <media:content medium="image" href="https://timescale.ghost.io/blog/content/images/2025/06/tiger-data-hero-2.gif">
            </media:content>
            <content:encoded><![CDATA[<p><em>Timescale is now Tiger Data.</em></p><p><strong>TL;DR: Eight years ago, we launched Timescale to bring time-series to PostgreSQL. Our mission was simple: help developers building time-series applications.</strong></p><p><strong>Since then, we have built a thriving business: 2,000 customers, mid 8-digit ARR (&gt;100% growth year over year), $180 million raised from top investors.&nbsp;</strong></p><p><strong>We serve companies who are building real-time analytical products and large-scale AI workloads like: Mistral, HuggingFace, Nvidia, Toyota, Tesla, NASA, JP Morgan Chase, Schneider Electric, Palo Alto Networks, and Caterpillar. These are companies building developer tools, industrial dashboards, crypto exchanges, AI-native games, financial RAG applications, and more.&nbsp;</strong></p><p><strong>We’ve quietly evolved from a time-series database into the modern PostgreSQL for today’s and tomorrow’s computing, built for performance, scale, and the agentic future. So we’re changing our name: from Timescale to Tiger Data. Not to change who we are, but to reflect who we’ve become. Tiger Data is bold, fast, and built to power the next era of software.</strong></p><h2 id="developers-thought-we-were-crazy">Developers Thought We Were Crazy</h2><p>When we started 8 years ago, SQL databases were “old fashioned.” NoSQL was the future. Hadoop, MongoDB, Cassandra, InfluxDB – these were the new, exciting NoSQL databases. PostgreSQL was old and boring.</p><p>That’s when we launched Timescale: a time-series database on PostgreSQL. Developers thought we were crazy. PostgreSQL didn’t scale. PostgreSQL wasn’t fast. Time-series needed a NoSQL database. 
Or so they said.</p><p><em>“While I appreciate PostgreSQL every day, am I the only one who thinks this is a rather bad idea?” – top HackerNews comment on our launch (</em><a href="https://news.ycombinator.com/item?id=14035416"><em><u>link</u></em></a><em>)</em></p><p>But we believed in PostgreSQL. We knew that boring could be awesome, especially with databases. And frankly, we were selfish: PostgreSQL was the only database that we wanted to use.</p><p><strong>Today, PostgreSQL has won.</strong>&nbsp;</p><p>There are no more “SQL vs. NoSQL” debates. MongoDB, Cassandra, InfluxDB, and other NoSQL databases are seen as technical dead ends. Snowflake and Databricks are acquiring PostgreSQL companies. No one talks about Hadoop. The Lakehouse has won.&nbsp;</p><p><strong>Today, agentic workloads are here.&nbsp;</strong></p><p>Agents need a fast database. We see this in our customer base: private equity firms and hedge funds using agents to help understand market movements (“How did the market respond to Apple WWDC 2025?”); industrial equipment manufacturers building chat interfaces on top of internal manuals to help field technicians; developer platforms storing agentic interactions into history tables for greater transparency and trust; and so on.</p><h2 id="what-started-as-a-heretical-idea-is-now-a-thriving-business">What Started as a Heretical Idea Is Now a Thriving Business&nbsp;</h2><p>We have also changed. We met in September 1997, during our first week at MIT. 
We soon became friends, roommates, even marathon training partners (Boston 1998).</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2025/06/founder-image.png" class="kg-image" alt="Tiger Data (creators of TimescaleDB) cofounders" loading="lazy" width="1790" height="435" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2025/06/founder-image.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2025/06/founder-image.png 1000w, https://timescale.ghost.io/blog/content/images/size/w1600/2025/06/founder-image.png 1600w, https://timescale.ghost.io/blog/content/images/2025/06/founder-image.png 1790w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">While our hairlines and drinks (turmeric shots!) have changed, our enthusiasm has not</em></i></figcaption></figure><p>That friendship became the foundation for an entrepreneurial journey that has surpassed even our boldest imaginations.&nbsp;</p><p>What started as a heretical idea is now a thriving business:</p><ul><li>2,000 customers</li><li>Mid 8-digit ARR, growing &gt;100% y/y</li><li>200 people in 25 countries</li><li>$180 million raised from top investors</li><li>60%+ gross margins</li></ul><p>Cloud usage is up 5x in the last 18 months, based on paid customers alone.</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2025/06/2025-cloud-growth-dark-mode.png" class="kg-image" alt="Cloud usage is up 5x in the last 18 months" loading="lazy" width="2000" height="1252" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2025/06/2025-cloud-growth-dark-mode.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2025/06/2025-cloud-growth-dark-mode.png 1000w, https://timescale.ghost.io/blog/content/images/size/w1600/2025/06/2025-cloud-growth-dark-mode.png 1600w, 
https://timescale.ghost.io/blog/content/images/size/w2400/2025/06/2025-cloud-growth-dark-mode.png 2400w" sizes="(min-width: 720px) 720px"></figure><p>And that’s only the paid side of the story. Our open-source community is 10x-20x larger. (Based on telemetry, it’s 10x, but we estimate that at least half of all deployments have telemetry turned off.)</p><p>TimescaleDB is everywhere. It’s included in PostgreSQL offerings around the world: from Azure, Alibaba, and Huawei to Supabase, DigitalOcean, and Fly.io. You’ll also find it on Databricks Neon, Snowflake Crunchy Bridge, OVHCloud, Render, Vultr, Linode, Aiven, and more.</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2025/06/2025-community-cloud-dark-mode.png" class="kg-image" alt="Community 10-20x" loading="lazy" width="2000" height="1298" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2025/06/2025-community-cloud-dark-mode.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2025/06/2025-community-cloud-dark-mode.png 1000w, https://timescale.ghost.io/blog/content/images/size/w1600/2025/06/2025-community-cloud-dark-mode.png 1600w, https://timescale.ghost.io/blog/content/images/size/w2400/2025/06/2025-community-cloud-dark-mode.png 2400w" sizes="(min-width: 720px) 720px"></figure><h2 id="we-are-tiger-data">We Are Tiger Data</h2><p>Today, we are more than a time-series database. We are powering developer tools, SaaS applications, AI-native games, financial RAG applications, and more. The majority of workloads on our Cloud product aren’t time-series. Companies are running entire applications on us. CTOs would say to us, <em>“You keep talking about how you are the best time-series database, but I see you as the best PostgreSQL.”</em>&nbsp;</p><p><strong>So we are now “Tiger Data.”</strong> We offer the fastest PostgreSQL. 
Speed without sacrifice.</p><p>Our cloud offering is “Tiger Cloud.” Our logo stays the same: the tiger, looking forward, focused and fast. Some things do not change. Our open source time-series <a href="https://www.tigerdata.com/blog/top-8-postgresql-extensions" rel="noreferrer">PostgreSQL extension</a> remains TimescaleDB. Our vector extension is still pgvectorscale.&nbsp;</p><p><strong>Why “Tiger”? </strong>The tiger has been our mascot since 2017, symbolizing the speed, power, and precision we strive for in our database. Over time, it’s become a core part of our culture: from weekly “Tiger Time” All Hands and monthly “State of the Tiger” business reviews, to welcoming new teammates as “tiger cubs” to the “jungle.” As we reflected on our products, performance, and community, we realized: we aren’t just Timescale. We’re Tiger. Today, we’re making that official.</p><p><strong>This is not a reinvention: it’s a reflection of how we already serve our customers today.</strong></p><p><strong>Polymarket</strong> uses Tiger Data to track their price history. During the last election, Polymarket ramped up 4x when trade volumes spiked, powering over $3.7 billion worth of trades.</p><p><strong>Linktree</strong> uses Tiger Data for their premium analytics product, saving $17K per month on 12.6 TB of data through compression. 
They also compressed their time to launch, going from 2 weeks to 2 days for shipping analytical features.</p><p><strong>Titan America</strong> uses Tiger Data’s compression and continuous aggregates to reduce costs and increase visibility into their facilities for manufacturing cement, ready-mixed concrete, and related materials.&nbsp;</p><p><strong>Lucid Motors</strong> uses Tiger Data for real-time telemetry and autonomous driving analytics.&nbsp;</p><p><strong>The Financial Times </strong>runs time-sensitive analytics and <a href="https://www.tigerdata.com/learn/vector-search-vs-semantic-search" rel="noreferrer">semantic search</a>.&nbsp;</p><h2 id="tiger-is-the-fastest-postgres-for-modern-workloads">Tiger Is the Fastest Postgres for Modern Workloads</h2><p>We are building the fastest Postgres: purpose-built for the modern operational workloads where traditional <a href="https://www.tigerdata.com/learn/understanding-oltp" rel="noreferrer">OLTP</a> databases break down.&nbsp;</p><p>Operational workloads that go far beyond simple transactions are now the norm. They require real-time, user-facing analytics over massive <a href="https://www.tigerdata.com/learn/how-to-handle-high-cardinality-data-in-postgresql" rel="noreferrer">high-cardinality datasets</a>, from event streams to time-series to user-level behavioral data.&nbsp;</p><p>As the frontier moves further with agentic applications, the demands grow even more. These systems don’t just read and write: they observe, decide, and act. These AI applications require fast vector search across embeddings, and fast branching of data environments for experimentation and context-sensitive responses.</p><p><strong>Tiger is not a fork. It’s not a wrapper. 
It is PostgreSQL, extended with innovations in the database engine and cloud infrastructure to deliver speed without sacrifice.</strong></p><p><strong>How are we so fast?</strong> Because of consistent, disciplined engineering efforts to serve customer needs over several years. Here is a non-exhaustive list:&nbsp;</p><ul><li>Hypertables (2017)</li><li>Native <a href="https://www.tigerdata.com/blog/building-columnar-compression-in-a-row-oriented-database" rel="noreferrer">columnar</a> compression (2019)</li><li>Real-time materialized views for faster queries (2020)</li><li>Decoupled compute and storage (2021)</li><li>Tiered Storage to S3 Parquet (2022)</li><li>Vectorized query execution for fast analytics (2023)</li><li>Hybrid row-columnar store for faster queries on recent and historical data (2024)</li><li>Faster vector workloads on PostgreSQL via pgvectorscale (2024)</li><li>300x faster mutations (updates, upserts, deletes) to compressed columnar data (2024)</li><li>2500x faster distinct queries, 6x faster point queries on high-cardinality columns (2025)</li><li>Rapid horizontal scaling with load-balanced read replica sets (2025)</li><li>Enhanced high-performance storage up to 64 TB and 32,000 IOPS (2025)</li></ul><p><strong>Tiger brings together the familiarity and reliability of Postgres with the performance of purpose-built engines.</strong></p><p>We built the fastest PostgreSQL. Not because we wanted to, but because our customers wanted us to.</p><h2 id="building-the-modern-postgresql-for-the-analytical-and-agentic-era">Building the Modern PostgreSQL for the Analytical and Agentic Era</h2><p>PostgreSQL has won. The Lakehouse has won. Every application is becoming an analytical application.&nbsp;Agents are here, in production, and need to be fast. 
The future is hybrid: developers and agents working side by side, with tighter latency and throughput demands.</p><p>In this era, modern applications must:</p><ul><li>Handle terabytes and petabytes of data</li><li>Support real-time analytics</li><li>Integrate Gen AI features</li><li>Serve both humans and software agents, across dev, test, and production lifecycles</li><li>Meet sub-second latency and high concurrency expectations</li><li>Scale across operational databases and cost-efficient lakehouses</li><li>Maintain transactional integrity</li><li>Deliver all of this reliably and cost-effectively, because data volumes grow much faster than budgets</li></ul><p>Our history to date, our time in this market, our lived experience watching all these changes unfold in real time screams to us one thing: <strong>modern applications need a new kind of operational database.</strong>&nbsp;</p><p>One built for transactional, analytical, and agentic workloads. One that also acts as the operational serving layer for the Lakehouse. One built on Postgres.</p><p>That is what we are building.</p><p>And wow do we have some fun product announcements queued up for the upcoming weeks and months. A more agentic PostgreSQL. A deeper integration with the Lakehouse via Iceberg. A new compressed insert approach yielding 10 million rows per second. A new type of disaggregated storage architecture with zero-copy instant forks and replicas that we are deploying in our cloud for greater performance, as a replacement for EBS. And more.</p><p>We can’t wait to show it all to you. But first we had to clearly communicate who we are. <strong>We are Tiger Data.&nbsp;</strong></p><h2 id="come-join-us">Come Join Us</h2><p><strong>Tiger is the Fastest PostgreSQL. </strong>The operational database platform built for transactional, analytical, and agentic workloads. 
The only database platform that provides Speed without Sacrifice.</p><p>This is not a rebrand, but a recommitment to our customers, to our developers, and to our core mission.</p><p>If this mission resonates with you, come join us. Give us product feedback. Spread the word. Wear the swag. Join the team.&nbsp;</p><p>It’s Go Time. 🐯🚀</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[13 Tips to Improve PostgreSQL Insert Performance]]></title>
            <description><![CDATA[Some of these may surprise you, but all 13 ways will improve ingest (INSERT) performance using PostgreSQL and TimescaleDB.]]></description>
            <link>https://www.tigerdata.com/blog/13-tips-to-improve-postgresql-insert-performance</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/13-tips-to-improve-postgresql-insert-performance</guid>
            <category><![CDATA[PostgreSQL]]></category>
            <category><![CDATA[PostgreSQL Performance]]></category>
            <category><![CDATA[PostgreSQL Tips]]></category>
            <dc:creator><![CDATA[Mike Freedman]]></dc:creator>
            <pubDate>Wed, 17 Apr 2024 12:00:00 GMT</pubDate>
            <media:content medium="image" href="https://timescale.ghost.io/blog/content/images/2023/10/Screenshot-2023-10-12-at-4.06.02-PM.png">
            </media:content>
            <content:encoded><![CDATA[<p>Ingest performance is critical for many common PostgreSQL use cases, including application monitoring, application analytics, IoT monitoring, and more. <a href="https://timescale.ghost.io/blog/time-series-data/" rel="noreferrer">These use cases have something in common</a>: unlike standard relational "business" data, changes are treated as <em>inserts</em>, not overwrites. In other words, every new value becomes a <strong>new </strong>row in the database instead of replacing the row's prior value with the latest one.</p><p>If you're operating in a scenario where you need to retain all data vs. overwriting past values, optimizing the speed at which your database can ingest new data becomes essential.</p><p>At <a href="https://www.timescale.com" rel="noreferrer">Tiger Data</a> (the creators of TimescaleDB), we have a lot of experience <a href="https://www.timescale.com/learn/postgresql-performance-tuning-how-to-size-your-database" rel="noreferrer">optimizing performance</a>, so in this article, we will look at PostgreSQL inserts and how to improve their performance. We'll include the following:<br><br><strong>1. &nbsp;Useful tips for improving PostgreSQL insert performance, in general,</strong> such as moderating your use of indexes, reconsidering foreign key constraints, avoiding unnecessary UNIQUE keys, using separate disks for WAL (Write-Ahead Logging) and data, and deploying on performant disks. Each of these strategies can help optimize the speed at which your database ingests new data.</p><p><strong>2.&nbsp; TimescaleDB-specific insert performance tips</strong> (TimescaleDB works like PostgreSQL under the hood).</p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">💫</div><div class="kg-callout-text">Don't know what TimescaleDB is? 
<a href="https://www.timescale.com/learn/is-postgres-partitioning-really-that-hard-introducing-hypertables" rel="noreferrer">Read this article.</a></div></div><h2 id="postgresql-insert-overview">PostgreSQL Insert Overview</h2><p><a href="https://www.postgresql.org/docs/current/sql-insert.html" rel="noreferrer">One of PostgreSQL's fundamental commands, the <code>INSERT</code> operation</a> plays a crucial role in adding new data to a database. It adds one or more rows to a table, filling each column with specified data. When certain columns are not specified in the insert query, PostgreSQL automatically fills these columns with their default values, if any are defined. This feature ensures that the database maintains integrity and consistency, even when not all column values are provided.</p><p>The <code>INSERT</code> operation is fundamental for data ingest processes, where new data is continually added to the database. It allows for the efficient and organized storage of new information, making it accessible for querying and analysis.<br><br>Here’s a simple example of an <code>INSERT</code> query:</p><pre><code class="language-sql">INSERT INTO employees (name, position, department)
VALUES ('John Doe', 'Software Engineer', 'Development');</code></pre><p>In this example, the <code>INSERT INTO</code> statement specifies the table <code>employees</code> to which the row will be added. The columns <code>name</code>, <code>position</code>, and <code>department</code> are explicitly mentioned, indicating where the provided data should be inserted. <br><br>Following the <code>VALUES</code> keyword, the actual data to be inserted into these columns is provided in parentheses. If the <code>employees</code> table contains other columns for which default values are defined and are not included in the <code>INSERT</code> statement, PostgreSQL will automatically fill those columns with the default values.</p><h2 id="when-insert-performance-matters">When Insert Performance Matters</h2><p>The speed at which data can be ingested into a database directly impacts its utility and responsiveness, especially when <strong>reaction speed</strong> in real-time or near-real-time data processing is essential.</p><p>One prominent example of such a use case is <a href="https://timescale.ghost.io/blog/guide-to-postgres-data-management/" rel="noreferrer"><strong>time-series data management</strong></a>. Time-series data, characterized by its sequential nature, accumulates moment by moment, often originating from sensors, financial transactions, or user activity logs. <br><br>The value of time-series data lies in its timeliness and the insights that can be gleaned from analyzing patterns over time. To maintain the integrity and relevance of these insights, insert performance must be optimized to ensure data is updated consistently and without delay. 
High insert performance allows for the seamless integration of new data, preserving the chronological order and enabling accurate <a href="https://www.timescale.com/learn/real-time-analytics-in-postgres" rel="noreferrer">real-time analysis</a>.</p><p><strong>Application monitoring</strong> represents another critical area where insert performance is paramount. Effective monitoring systems rely on continuously ingesting application metrics and logs to provide an up-to-date view of the application's health and performance. Any lag in data ingest can lead to delays in detecting and responding to issues, potentially affecting user experience and system stability. Strong insert performance ensures that monitoring systems remain current, allowing for immediate action in response to any anomalies detected.</p><p><strong>Event detection applications</strong>, such as fraud detection systems, also underscore the importance of fast insert speeds. In these scenarios, the ability to rapidly ingest and process data can mean the difference between catching a fraudulent transaction as it happens and missing it entirely. </p><p>Fast data ingest enables these systems to analyze events in real time, applying algorithms to detect suspicious patterns and react promptly. The reaction speed is crucial in minimizing risk and protecting assets, highlighting the critical role of insert performance in maintaining system efficacy.</p><h2 id="improving-insert-performance">Improving Insert Performance</h2><p>The previous use cases stress the critical role of ingest speed in real-time or high-volume databases, such as those handling time series. These use cases make up most of our customer base here at Tiger Data, so we're confident in recommending these five best practices for improving ingest performance in vanilla PostgreSQL:</p><h3 id="1-use-indexes-in-moderation">1. 
Use indexes in moderation</h3><p><a href="https://www.timescale.com/learn/postgresql-performance-tuning-optimizing-database-indexes" rel="noreferrer">Having the right indexes</a> can speed up your queries, but they’re not a silver bullet. Incrementally maintaining indexes with each new row requires additional work. Check the number of indexes you’ve defined on your table (use the <code>psql</code> command <code>\d table_name</code>), and determine whether their potential query benefits outweigh the storage and insert overhead. Since every system is different, there aren’t any hard and fast rules or “magic number” of indexes—just be reasonable.</p><h3 id="2-reconsider-foreign-key-constraints">2. Reconsider foreign key constraints</h3><p>Sometimes, it's necessary to build <a href="https://www.postgresql.org/docs/current/tutorial-fk.html" rel="noreferrer">foreign keys (FK)</a> from one table to other relational tables. When you have an FK constraint, every <code>INSERT</code> will typically need to read from your referenced table, which can degrade performance. Consider denormalizing your data—we sometimes see pretty extreme use of FK constraints from a sense of “elegance” rather than engineering trade-offs.</p><h3 id="3-avoid-unnecessary-unique-keys">3. Avoid unnecessary UNIQUE keys</h3><p>Developers are often trained to specify primary keys in database tables, and many ORMs love them. Yet, many use cases—including common monitoring or time-series applications—don’t require them, as each event or sensor reading can simply be logged as a separate event by inserting it at the tail of a hypertable's current chunk during write time. </p><p>If a <code>UNIQUE</code> constraint is otherwise defined, that insert can necessitate an index lookup to determine if the row already exists, which will adversely impact the speed of your <code>INSERT</code>.</p><h3 id="4-use-separate-disks-for-wal-and-data">4. 
Use separate disks for WAL and data</h3><p>While this is a more advanced optimization that isn't always needed, if your disk becomes a bottleneck, you can further increase throughput by using a separate disk (tablespace) for the database's WAL and data.</p><h3 id="5-use-performant-disks">5. Use performant disks </h3><p>Sometimes developers deploy their database in environments with slower disks, whether due to poorly performing HDDs, remote storage area networks (SANs), or other configurations. Because inserted data is durably written to the WAL before the transaction completes, slow disks can hurt insert performance. You can check your disk IOPS using the <code>ioping</code> command.<br><br>Read test:</p><pre><code>$ ioping -q -c 10 -s 8k .
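# -q: quiet mode; -c 10: issue 10 requests; -s 8k: 8 KiB request size (PostgreSQL's default page size)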
--- . (hfs /dev/disk1 930.7 GiB) ioping statistics ---
9 requests completed in 208 us, 72 KiB read, 43.3 k iops, 338.0 MiB/s
generated 10 requests in 9.00 s, 80 KiB, 1 iops, 8.88 KiB/s
min/avg/max/mdev = 18 us / 23.1 us / 35 us / 6.17 us</code></pre><p>Write test:</p><pre><code>$ ioping -q -c 10 -s 8k -W .
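# -W: perform actual writes rather than reads (uses a temporary file on a directory target like "."; use with care elsewhere)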
--- . (hfs /dev/disk1 930.7 GiB) ioping statistics ---
9 requests completed in 10.8 ms, 72 KiB written, 830 iops, 6.49 MiB/s
generated 10 requests in 9.00 s, 80 KiB, 1 iops, 8.89 KiB/s
min/avg/max/mdev = 99 us / 1.20 ms / 2.23 ms / 919.3 us</code></pre><p>You should see at least thousands of read IOPS and many hundreds of write IOPS. If you are seeing far fewer, your disk hardware is likely affecting your INSERT performance. See if alternative storage configurations are feasible.</p><div class="kg-card kg-callout-card kg-callout-card-blue"><div class="kg-callout-emoji">✨</div><div class="kg-callout-text">Read our <a href="https://www.timescale.com/blog/benchmarking-postgresql-batch-ingest" rel="noreferrer">benchmark on batch ingest in PostgreSQL</a>.</div></div><h2 id="using-timescaledb-to-improve-ingest-performance">Using TimescaleDB to Improve Ingest Performance</h2><p><a href="https://www.timescale.com/performance" rel="noreferrer">TimescaleDB is built to improve query and ingest performance in PostgreSQL.</a> </p><p>The most common uses for TimescaleDB involve storing massive amounts of data for cloud infrastructure metrics, product analytics, web analytics, IoT devices, and many use cases involving <a href="https://www.timescale.com/learn/postgresql-partition-strategies-and-more" rel="noreferrer">large PostgreSQL tables</a>. The ideal TimescaleDB scenarios are time-centric, almost solely append-only (lots of INSERTs), and require fast ingestion of large amounts of data within small time windows. </p><p>Here are eight more techniques for improving ingest performance with TimescaleDB:</p><h3 id="6-use-parallel-writes">6. Use parallel writes </h3><p>Each <code>INSERT</code> or <code>COPY</code> command to TimescaleDB (as in PostgreSQL) is executed as a single transaction and thus runs in a single-threaded fashion. To achieve higher ingest rates, you should execute multiple <code>INSERT</code> or <code>COPY</code> commands in parallel. 
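</p><p>As a rough sketch (assuming a hypothetical <code>metrics</code> table in a database named <code>tsdb</code>, and a CSV file pre-split into four parts), parallel loading can be driven from the shell:</p><pre><code># Launch four COPY commands in parallel, one per CSV part,
# then wait for all of them to finish.
for f in metrics_part1.csv metrics_part2.csv metrics_part3.csv metrics_part4.csv; do
  psql -d tsdb -c "\copy metrics FROM '$f' WITH (FORMAT csv)" &amp;
done
wait</code></pre><p>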
</p><p>For help with bulk loading large CSV files in parallel, check out TimescaleDB's <a href="https://github.com/timescale/timescaledb-parallel-copy">parallel copy command</a>.</p><p>⭐ <strong>Pro tip</strong>: make sure your client machine has enough cores to execute this parallelism (running 32 client workers on a 2 vCPU machine doesn’t help much—the workers won’t actually be executed in parallel).</p><h3 id="7-insert-rows-in-batches">7. Insert rows in batches </h3><p>To achieve higher ingest rates, you should insert your data with many rows in each <code>INSERT</code> call (or else use some bulk insert command, like COPY or our parallel copy tool). </p><p>Don't insert your data row-by-row—instead, try at least hundreds (or thousands) of rows per insert. This allows the database to spend less time on connection management, transaction overhead, SQL parsing, etc., and more time on data processing.</p><h3 id="8-properly-configure-sharedbuffers">8. Properly configure shared_buffers</h3><p>We typically recommend 25% of available RAM. If you install TimescaleDB via a method that runs <a href="https://github.com/timescale/timescaledb-tune"><code>timescaledb-tune</code></a>, it should automatically configure <code>shared_buffers</code> to something well-suited to your hardware specs. </p><p>Note: in some cases, typically with virtualization and constrained cgroups memory allocation, these automatically configured settings may not be ideal. To check that your <code>shared_buffers</code> are set to within the 25% range, run <code>SHOW shared_buffers</code> from your <code>psql</code> connection.</p><h3 id="9-run-our-docker-images-on-linux-hosts">9. 
Run our Docker images on Linux hosts</h3><p>If you are <a href="https://docs.timescale.com/latest/getting-started/installation/docker/installation-docker/?utm_source=timescale-13-insert-tips&amp;utm_medium=blog&amp;utm_campaign=july-2020-advocacy&amp;utm_content=install-docs-docker">running a TimescaleDB Docker container (which runs Linux)</a> on top of another Linux operating system, you're in great shape. The container basically provides process isolation, and the overhead is extremely minimal. </p><p>If you're running the container on a Mac or Windows machine, you'll see some performance hits from OS virtualization, including for I/O.</p><p>If you need to run on Mac or Windows, we recommend <a href="https://docs.timescale.com/latest/getting-started/installation/?utm_source=timescale-13-insert-tips&amp;utm_medium=blog&amp;utm_campaign=july-2020-advocacy&amp;utm_content=install-docs">installing directly</a> rather than using a Docker image.</p><h3 id="10-avoid-too-many-or-too-small-chunks">10.&nbsp;Avoid too many or too small chunks</h3><p>We don't currently recommend using space partitioning. And if you do, remember that one chunk is created per space partition for every time interval. </p><p>So, if you create 64 space partitions and daily chunks, you'll have 23,360 chunks per year (64 × 365). This may lead to a bigger performance hit during query time (due to planning overhead) than during insert time, but it's something to consider nonetheless.<br><br>Another thing to avoid is using an incorrect integer value when you specify the time interval range in <code>create_hypertable</code>. </p><p>⭐ <strong>Pro tip</strong>: </p><ul><li>If your time column uses a native timestamp type, then any integer value should be in terms of microseconds (so one day = 86400000000). We recommend using interval types ('1 day') to avoid the potential for any confusion. 
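For example, a sketch using an interval type (assuming a hypothetical <code>conditions</code> hypertable with a <code>timestamptz</code> column named <code>time</code>):<pre><code class="language-sql">-- Daily chunks, expressed as an interval rather than a raw integer
SELECT create_hypertable('conditions', 'time', chunk_time_interval =&gt; INTERVAL '1 day');</code></pre>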
</li><li>If your time column is an integer or bigint itself,  use the appropriate range: if the integer timestamp is in seconds, use 86400; if the bigint timestamp is in nanoseconds, use 86400000000000.<br><br>In both cases, you can use <a href="https://docs.timescale.com/latest/api?utm_source=timescale-13-insert-tips&amp;utm_medium=blog&amp;utm_campaign=july-2020-advocacy&amp;utm_content=chunk-pretty-api-docs#chunk_relation_size_pretty"><code>chunk_relation_size_pretty</code></a> to make sure your chunk sizes or partition ranges seem reasonable:</li></ul><pre><code class="language-SQL">=&gt; SELECT chunk_table, ranges, total_size
FROM chunk_relation_size_pretty('hypertable_name')
ORDER BY ranges DESC LIMIT 4;
chunk_table               |                         ranges                          | total_size
-----------------------------------------+---------------------------------------------------------+------------
_timescaledb_internal._hyper_1_96_chunk | {"['2020-02-13 23:00:00+00','2020-02-14 00:00:00+00')"} | 272 MB
_timescaledb_internal._hyper_1_95_chunk | {"['2020-02-13 22:00:00+00','2020-02-13 23:00:00+00')"} | 500 MB
_timescaledb_internal._hyper_1_94_chunk | {"['2020-02-13 21:30:00+00','2020-02-13 22:00:00+00')"} | 500 MB
_timescaledb_internal._hyper_1_93_chunk | {"['2020-02-13 20:00:00+00','2020-02-13 21:00:00+00')"} | 500 MB</code></pre><h3 id="11-avoid-%E2%80%9Ctoo-large%E2%80%9D-chunks">11. Avoid “too large” chunks</h3><p>To maintain higher ingest rates, you want your latest chunk and all its associated indexes to stay in memory so that writes to the chunk and index updates merely update memory. (The write is still durable, as inserts are written to the WAL on disk before the database pages are updated.) </p><p>If your chunks are too large, then writes to even the latest chunk will start swapping to disk.</p><p>As a rule of thumb, we recommend that the latest chunks and all their indexes fit comfortably within the database's <code>shared_buffers</code>. You can check your chunk sizes via the <a href="https://docs.timescale.com/latest/api?utm_source=timescale-13-insert-tips&amp;utm_medium=blog&amp;utm_campaign=july-2020-advocacy&amp;utm_content=chunk-pretty-api-docs#chunk_relation_size_pretty"><code>chunk_relation_size_pretty</code></a> SQL command.</p><pre><code class="language-SQL">=&gt; SELECT chunk_table, table_size, index_size, toast_size, total_size
FROM chunk_relation_size_pretty('hypertable_name')
ORDER BY ranges DESC LIMIT 4;
chunk_table               | table_size | index_size | toast_size | total_size
-----------------------------------------+------------+------------+------------+------------
_timescaledb_internal._hyper_1_96_chunk | 200 MB     | 64 MB      | 8192 bytes | 272 MB
_timescaledb_internal._hyper_1_95_chunk | 388 MB     | 108 MB     | 8192 bytes | 500 MB
_timescaledb_internal._hyper_1_94_chunk | 388 MB     | 108 MB     | 8192 bytes | 500 MB
_timescaledb_internal._hyper_1_93_chunk | 388 MB     | 108 MB     | 8192 bytes | 500 MB</code></pre><p>If your chunks are too large, you can update the range for future chunks via the <a href="https://docs.timescale.com/latest/api?utm_source=timescale-13-insert-tips&amp;utm_medium=blog&amp;utm_campaign=july-2020-advocacy&amp;utm_content=set-chunk-interval-api-docs#set_chunk_time_interval"><code>set_chunk_time_interval</code></a> command. However, this does not modify the range of existing chunks (e.g., by rewriting large chunks into multiple small chunks). </p><p>For configurations where individual chunks are much larger than your available memory, we recommend dumping and reloading your hypertable data to properly sized chunks.</p><p>Keeping the latest chunk in memory applies to all active hypertables; if you are actively writing to two hypertables, the latest chunks from both should fit within <code>shared_buffers</code>.</p><h3 id="12-write-data-in-loose-time-order">12. Write data in loose time order</h3><p>When chunks are sized appropriately (see #10 and #11), the latest chunk(s) and their associated indexes are naturally maintained in memory. New rows inserted with recent timestamps will be written to these chunks and indexes already in memory. </p><p>If a row with a much older timestamp is inserted—i.e., it's an out-of-order or backfilled write—the disk pages corresponding to the older chunk (and its indexes) will need to be read in from disk. This will significantly increase write latency and lower insert throughput.</p><p>In particular, when loading data for the first time, try to load it in sorted, increasing timestamp order. </p><p>Be careful if you're bulk-loading data about many different servers, devices, and so forth: </p><ul><li>Do not bulk insert data sequentially by server (i.e., all data for server A, then server B, then C, and so forth). 
This will cause disk thrashing as loading each server will walk through all chunks before starting anew. </li><li>Instead, arrange your bulk load so that data from all servers are inserted in loose timestamp order (e.g., day 1 across all servers in parallel, then day 2 across all servers in parallel, etc.)</li></ul><h3 id="13-watch-row-width">13. Watch row width</h3><p>The overhead from inserting a wide row (say, 50, 100, 250 columns) is going to be much higher than inserting a narrower row (more network I/O, more parsing and data processing, larger writes to WAL, etc.). Most of our published benchmarks are using <a href="https://github.com/timescale/tsbs">TSBS</a>, which uses 12 columns per row. So you'll correspondingly see lower insert rates if you have very wide rows.</p><p>If you are considering very wide rows because you have different types of records, and each type has a disjoint set of columns, you might want to try using multiple hypertables (one per record type)—particularly if you don't often query across these types.</p><p>Additionally, JSONB records are another good option if virtually all columns are sparse. That said, if you're using sparse wide rows, use NULLs for missing records whenever possible, not default values, for the most performance gains (NULLs are much cheaper to store and query).</p><p>Finally, the cost of wide rows is actually much less once you compress rows using <a href="https://timescale.ghost.io/blog/blog/building-columnar-compression-in-a-row-oriented-database/?utm_source=timescale-13-insert-tips&amp;utm_medium=blog&amp;utm_campaign=july-2020-advocacy&amp;utm_content=1-5-release-blog">TimescaleDB’s native compression</a>.  
Rows are converted into more columnar compressed form, sparse columns compress extremely well, and compressed columns aren’t read from disk for queries that don’t fetch individual columns.</p><h2 id="summary">Summary</h2><p><a href="https://www.timescale.com/learn/types-of-data-supported-by-postgresql-and-timescale" rel="noreferrer">If ingest performance is critical to your use case</a>, consider using TimescaleDB. You can <a href="https://console.cloud.timescale.com/signup">get started with hosted TimescaleDB (Tiger Cloud)</a> for free today or <a href="https://docs.timescale.com/latest/getting-started/installation/?utm_source=timescale-13-insert-tips&amp;utm_medium=blog&amp;utm_campaign=july-2020-advocacy&amp;utm_content=install-docs">download TimescaleDB</a> to your own hardware. </p><p>Our approach to support is to address your whole solution, so we're here to help you achieve your desired performance results (see more details about our <a href="https://www.timescale.com/support" rel="noreferrer">Support team and ethos</a>). </p><p>Lastly, <a href="https://slack.timescale.com/">our Slack community</a> is a great place to connect with 8,000+ other developers with similar use cases, as well as myself, Tiger Data engineers, product team members, and developer advocates.</p><h3 id="keep-learning-about-improving-postgresql-performance">Keep learning about improving PostgreSQL performance</h3><p>If you're interested in improving your PostgreSQL performance, you'll find the following resources useful: </p><p><strong>👉 </strong><a href="https://www.tigerdata.com/learn/postgresql-partition-strategies-and-more" rel="noreferrer"><strong>Navigating growing PostgreSQL tables</strong></a><strong>. </strong>Are your PostgreSQL queries slowing down as your database tables grow? 
Learn about a few tactics that can get you back on track.</p><p><a href="https://www.timescale.com/learn/when-to-consider-postgres-partitioning" rel="noreferrer"><strong>👉 When to consider PostgreSQL partitioning.</strong></a><strong> </strong>Postgres partitioning can be a powerful tool to scale your database, although it’s not a one-size-fits-all solution. Learn if it's the solution you're looking for. </p><p><strong>👉 </strong>When your tables start growing, it might be time for some PostgreSQL fine-tuning. Get advice on how to optimize your database step by step: </p><ul><li><a href="https://www.timescale.com/learn/postgresql-performance-tuning-how-to-size-your-database" rel="noreferrer">Sizing your database properly (CPU, memory) </a></li><li><a href="https://www.timescale.com/learn/postgresql-performance-tuning-key-parameters" rel="noreferrer">Key PostgreSQL parameters to fine-tune (e.g., work_mem, shared_buffers)</a></li><li><a href="https://www.timescale.com/learn/postgresql-performance-tuning-optimizing-database-indexes" rel="noreferrer">Optimizing indexes</a></li><li><a href="https://www.timescale.com/learn/postgresql-performance-tuning-designing-and-implementing-database-schema" rel="noreferrer">Schema design best practices</a></li></ul><p><strong>👉</strong><a href="https://timescale.ghost.io/blog/timescale-cloud-tips-how-to-optimize-your-ingest-rate/" rel="noreferrer"><strong> Further tips on improving inserts</strong></a><strong>. </strong><br></p><h2 id="faqs-improving-postgresql-insert-performance">FAQs: Improving PostgreSQL Insert Performance</h2><p><strong>Q: How can I improve PostgreSQL insert performance when dealing with large amounts of data?</strong></p><p>A: To optimize PostgreSQL insert performance, focus on using indexes in moderation, inserting rows in batches rather than one by one, and ensuring your disk hardware is performant. If disk becomes a bottleneck, consider using separate disks for WAL and data. 
For bulk loading data, try tools like TimescaleDB's parallel copy command, which can significantly increase throughput.</p><p><strong>Q: What role do indexes and constraints play in PostgreSQL insert performance?</strong></p><p>A: While indexes speed up queries, they can slow down inserts since each new row requires index maintenance. Foreign key constraints force PostgreSQL to read from referenced tables during inserts, and <code>UNIQUE</code> constraints necessitate index lookups to check for duplicates. Consider whether these constraints are truly necessary for your use case, especially for append-only scenarios like time-series data.</p><p><strong>Q: How should I configure my hardware for optimal PostgreSQL insert performance?</strong></p><p>A: Use performant disks capable of thousands of read IOPS and hundreds of write IOPS, which you can check with the <code>ioping</code> command. Configure <code>shared_buffers</code> to approximately 25% of available RAM to ensure enough memory for caching active data. If running in containers, use Linux hosts for Docker images to minimize virtualization overhead.</p><p><strong>Q: What's the optimal approach for batch inserting data into PostgreSQL?</strong></p><p>A: Instead of row-by-row insertion, batch hundreds or thousands of rows per <code>INSERT</code> command to reduce overhead from connection management and SQL parsing. Execute multiple <code>INSERT</code> or <code>COPY</code> commands in parallel to leverage multiple cores. When bulk loading time-series data, insert in loose time order (e.g., day 1 across all servers, then day 2) rather than sequentially by server to prevent disk thrashing.</p><p><strong>Q: How can TimescaleDB improve insert performance compared to vanilla PostgreSQL?</strong></p><p>A: TimescaleDB enhances PostgreSQL insert performance through features like hypertables with automatic time-based chunking, which keeps recent data in memory for faster writes. 
It offers native compression to reduce the cost of wide rows and provides tools like parallel copy for efficient bulk loading. TimescaleDB also helps maintain appropriate chunk sizes (neither too large nor too small) to optimize memory usage and prevent unnecessary disk operations.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Achieving the Best of Both Worlds: Ensuring Up-To-Date Results With Real-Time Aggregation]]></title>
            <description><![CDATA[Real-time aggregates (released with TimescaleDB 1.7) build on continuous aggregates' ability to increase query speed and optimize storage. Learn what's new, details about how they work, and how to get started. ]]></description>
            <link>https://www.tigerdata.com/blog/achieving-the-best-of-both-worlds-ensuring-up-to-date-results-with-real-time-aggregation</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/achieving-the-best-of-both-worlds-ensuring-up-to-date-results-with-real-time-aggregation</guid>
            <category><![CDATA[Product & Engineering]]></category>
            <category><![CDATA[Engineering]]></category>
            <category><![CDATA[PostgreSQL]]></category>
            <dc:creator><![CDATA[Sven Klemm]]></dc:creator>
            <pubDate>Thu, 07 May 2020 15:11:33 GMT</pubDate>
            <media:content medium="image" href="https://timescale.ghost.io/blog/content/images/2020/05/mana5280-dkeOcAkors4-unsplash.jpg">
            </media:content>
            <content:encoded><![CDATA[<p>Real-time aggregates (released with TimescaleDB 1.7) build on continuous aggregates' ability to increase query speed and optimize storage. Learn what's new, details about how they work, and how to get started. </p><p>One constant across all time-series use cases is data: metrics, logs, events, sensor readings; IT and application performance monitoring, SaaS applications, IoT, martech, fintech, and more.  Lots (and lots) of data. What’s more, it typically arrives <em>continuously.</em></p><p>This need to handle large volumes of constantly generated data motivated some of our earliest TimescaleDB architectural decisions, such as its use of automated time-based partitioning and local-only indexing to achieve high insert rates.  And last year, we added type-specific <a href="https://www.tigerdata.com/blog/building-columnar-compression-in-a-row-oriented-database" rel="noreferrer">columnar</a> compression to significantly shrink the overhead involved in storing all of this data (often by 90% or higher – <a href="https://timescale.ghost.io/blog/blog/building-columnar-compression-in-a-row-oriented-database/?utm_source=timescale-real-time-aggregates-details&amp;utm_medium=blog&amp;utm_campaign=1-7-release&amp;utm_content=1-6-release-blog">see our technical description and benchmarking results</a>). </p><p>And another key capability in TimescaleDB, which is the focus of this post, has been <em>continuous aggregates</em>, which we first <a href="https://timescale.ghost.io/blog/blog/continuous-aggregates-faster-queries-with-automatically-maintained-materialized-views/?utm_source=timescale-real-time-aggregates-details&amp;utm_medium=blog&amp;utm_campaign=1-7-release&amp;utm_content=continuous-aggs-1-3-blog">introduced</a> in TimescaleDB 1.3.  Continuous aggregates allow one to specify a SQL query that continually processes raw data into a so-called materialized table.  
</p><p>Continuous aggregates are somewhat similar to materialized views in databases, but unlike a materialized view (as in <a href="https://www.postgresql.org/docs/current/rules-materializedviews.html">PostgreSQL</a>), continuous aggregates do not need to be refreshed manually; the view will be refreshed automatically in the background as new data is added or old data is modified. Additionally, TimescaleDB does not need to re-calculate all of the data on every refresh. Only new and/or invalidated data will be re-calculated. And since this re-aggregation is automatic – it executes as a background job at regular intervals – this process doesn’t add any maintenance burden to your database.</p><p>This is where most database or streaming systems that offer continuous aggregates or continuous queries give up.  We knew we could do better.</p><p>Enter Real-Time Aggregation, introduced in TimescaleDB 1.7 (<a href="https://timescale.ghost.io/blog/blog/timescaledb-1-7-fast-continuous-aggregates-with-real-time-views-postgresql-12-support-and-more-community-features/?utm_source=timescale-real-time-aggregates-details&amp;utm_medium=blog&amp;utm_campaign=1-7-release&amp;utm_content=1-7-release-announcement-blog">see our release blog</a>).</p><h2 id="quick-background-on-continuous-aggregates">Quick Background on Continuous Aggregates</h2><p>The benefits of continuous aggregates are twofold:</p><ul><li><strong>Query performance.</strong>  By executing queries against pre-calculated results, rather than the underlying raw data, continuous aggregates can significantly improve query performance.</li><li><strong>Storage savings with </strong><a href="https://docs.timescale.com/latest/using-timescaledb/continuous-aggregates?utm_source=timescale-real-time-aggregates-details&amp;utm_medium=blog&amp;utm_campaign=1-7-release&amp;utm_content=continuous-aggs-drop-data-docs#dropping-data"><strong>downsampling</strong></a><strong><em>. 
</em></strong> Continuous aggregates are often combined with data retention policies for better storage management.  Raw data can be continually aggregated into a materialized table, and dropped after it reaches a certain age.  So the database may only store some fixed period of raw data (say, one week), yet store aggregate data for much longer.</li></ul><p>Consider the following example, collecting system metrics around CPU usage and storing it in a CPU metrics <a href="https://www.tigerdata.com/blog/database-indexes-in-postgresql-and-timescale-cloud-your-questions-answered" rel="noreferrer">hypertable</a>, where each row includes a timestamp, hostname, and 3 metrics around CPU usage (usage_user, usage_system, usage_iowait).  </p><p>We collect these statistics every second per server.</p><pre><code>            time              | hostname |     usage_user     |    usage_system     |    usage_iowait
-------------------------------+----------+--------------------+---------------------+---------------------
2020-05-06 02:32:34.627143+00 | host0    | 0.5378765249290502 |  0.2958572490961302 | 0.10685818344495246
2020-05-06 02:32:34.627143+00 | host1    | 0.3175958910709298 |  0.7874926624954846 | 0.16615243032654803
2020-05-06 02:32:34.627143+00 | host2    | 0.4788377981501064 | 0.18277343256546175 |  0.7183967491020162</code></pre><p>So a query that wants to compute the per-hour histogram of usage consumption over the course of 7 days for 10 servers will process 10 servers * 60 seconds * 60 minutes * 24 hours * 7 days = 6,048,000 rows of data.</p><p>On the other hand, if we pre-compute a histogram per hour, then the same query on the continuous aggregate table will only need to process 10 servers * 24 hours * 7 days = 1680 rows of data.</p><p>But pre-computed results in the continuous aggregate view will lag behind the latest data, as the materialization only runs at scheduled intervals.  So, both to more cheaply handle out-of-order data and to avoid excessive load, there is typically some <em>refresh lag </em>between the raw data and when it’s materialized.  In fact, this refresh lag is configurable in TimescaleDB, such that the continuous aggregation engine will not materialize data that’s newer than the refresh lag.  </p><p>(Slightly more specifically, if we compute aggregations across some <a href="https://docs.timescale.com/latest/using-timescaledb/reading-data?utm_source=timescale-real-time-aggregates-details&amp;utm_medium=blog&amp;utm_campaign=1-7-release&amp;utm_content=time-bucket-docs#time-bucket">time bucket</a>, such as hourly, then each hourly interval has a start time and end time.  TimescaleDB will only materialize data when its corresponding aggregation interval’s <em>end time</em> is older than the refresh lag. So, if we are doing hourly rollups with a 30-minute refresh lag, then we’d only perform the materialized aggregation for, say, 2:00am - 3:00am <em>after</em> 3:30am.)</p><p>So, on one hand, using a continuous aggregate view has cut down the amount of data we process at query time by 3600x (i.e., from more than 6 million rows to fewer than 2000).  
But, in this view, we’re often missing the last hour or so of data.</p><p>While you could just make the refresh lag smaller and smaller to work around this problem, it comes at the cost of higher and higher load; unless these aggregates are recomputed on <em>every</em> new insert (expensive!), they’re fundamentally always stale.</p><h2 id="introducing-real-time-aggregation">Introducing Real-Time Aggregation</h2><p>With real-time aggregation, when you query a continuous aggregate view, rather than just getting the pre-computed aggregate from the materialized table, the query will transparently combine this pre-computed aggregate with raw data from the hypertable that’s yet to be materialized.  And, by combining raw and materialized data in this way, you get accurate and up-to-date results, while still enjoying the speedups that come from pre-computing a large portion of the result.</p><p>Let’s return to the example above.  Recall that when we created hourly rollups, we set the refresh lag to 30 minutes, so our continuous aggregate view will lag behind by 30-90 minutes.</p><p>But, when querying a view that supports real-time aggregation, the same query as before for hourly data across the past week will process and combine the results from two tables:</p><ul><li>Materialized table: 10 servers * (22 hours + 24 hours * 6 days) = 1660 rows</li><li>Raw data: 10 servers * 60 seconds * 90 minutes = 54,000 rows  </li></ul><p>So now, with these “back of the envelope” calculations, we’ve processed a total of 55,660 rows, still well below the 6 million from before. 
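</p><p>Conceptually, this split behaves like a UNION ALL around the materialization boundary.  The following is only a hand-written sketch of the idea (<code>materialized_table</code> and <code>completion_threshold</code> are placeholders, not real objects; the view definition TimescaleDB actually generates is shown later in this post):</p><pre><code class="language-SQL">-- Sketch only: real-time aggregation performs this split automatically.
SELECT hour, hostname, hist_usage_user           -- pre-computed rows
  FROM materialized_table
 WHERE hour &gt;= now() - interval '7 days'
   AND hour &lt; completion_threshold
UNION ALL
SELECT time_bucket('1 hour', time) AS hour,      -- aggregated at query time
       hostname,
       histogram(usage_user, 0.0, 1.0, 5) AS hist_usage_user
  FROM cpu
 WHERE time &gt;= completion_threshold
 GROUP BY 1, 2;</code></pre><p>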
Moreover, the last 90 minutes of data are more likely to already be memory resident for even better performance, given the database page caching already happening for recent data.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2020/09/image.png" class="kg-image" alt="Diagram showing how data moves to a materialized table as it ages and continuous aggregate queries execute, and how real-time aggregates combine this data with newer, not yet materialized data" loading="lazy" width="1500" height="1154" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2020/09/image.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2020/09/image.png 1000w, https://timescale.ghost.io/blog/content/images/2020/09/image.png 1500w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Real-time aggregates allow you to query your pre-calculated data </span><b><strong style="white-space: pre-wrap;">and</strong></b><span style="white-space: pre-wrap;"> newer, not yet materialized "raw" data</span></figcaption></figure><p>The above illustration shows this in practice. The database internally maintains a <strong><em>completion threshold</em></strong> as metadata, which records the point-in-time to which all previous records from the raw table have been materialized.  This completion threshold trails the current time by at least the <em>refresh lag </em>we discussed earlier, and gets updated by the database engine whenever a background task updates the materialized view.</p><p><em>(In fact, it’s a bit more complicated given TimescaleDB’s ability to handle late data that gets written after some time region has already been materialized, i.e., behind the completion threshold.  
But we’re going to ignore how TimescaleDB tracks invalidation regions in this post.)</em></p><p>So now when processing our query covering the past 7 days, the database engine will conceptually take a UNION ALL of results from the materialized table, starting at <code>now() - interval '7 days'</code> up to the completion threshold, with results from the raw table from the completion threshold up to <code>now()</code>.</p><p>But rather than just describe this behavior, let’s walk through a concrete example and compare our query times without continuous aggregates, with vanilla continuous aggregates, and with real-time aggregation enabled.</p><p>These capabilities were developed by Timescale engineers: <a href="https://github.com/svenklemm"><em>Sven Klemm</em></a><em>, </em><a href="https://github.com/cevian"><em>Matvey Arye</em></a><em>, </em><a href="https://github.com/gayyappan"><em>Gayathri Ayyapan</em></a><em>, </em><a href="https://github.com/davidkohn88"><em>David Kohn</em></a>, and <a href="https://github.com/JLockerman"><em>Josh Lockerman</em></a>.</p><h2 id="testing-real-time-aggregation">Testing Real-Time Aggregation</h2><p>In the following, I’ve created a TimescaleDB 1.7 instance via <a href="https://www.timescale.com/products">Managed Service for TimescaleDB</a> (specifically, a “basic-100-compute-optimized” instance with PostgreSQL 12, 4 vCPU, and 100GB SSD storage), and then created the following hypertable:</p><pre><code class="language-SQL">$ psql postgres://tsdbadmin@tsdb-bb8e760-internal-90d0.a.timescaledb.io:26479/defaultdb?sslmode=require

=&gt; CREATE TABLE cpu (
      time TIMESTAMPTZ,
      hostname TEXT,
      usage_user FLOAT,
      usage_system FLOAT,
      usage_iowait FLOAT
   );

=&gt; SELECT create_hypertable ('cpu', 'time', 
      chunk_time_interval =&gt; interval '1d');</code></pre><p>I’m now going to load the hypertable with 14 days of synthetic data (which is created with the following INSERT statement):</p><pre><code class="language-SQL">=&gt; INSERT INTO cpu (
   SELECT time, hostname, random(), random(), random()
      FROM generate_series(NOW() - interval '14d', NOW(), '1s') AS time
      CROSS JOIN LATERAL (
         SELECT 'host' || host_id::text AS hostname 
            FROM generate_series(0,9) AS host_id
      ) h
   );</code></pre><p>Okay, so that inserted 12,096,010 rows of synthetic data into our hypertable of the following format, stretching from 2:32am UTC on April 22 to 2:32am UTC on May 6:</p><pre><code class="language-SQL">=&gt; SELECT * FROM cpu ORDER BY time DESC LIMIT 3;

             time              | hostname |     usage_user     |    usage_system     |    usage_iowait     
-------------------------------+----------+--------------------+---------------------+---------------------
 2020-05-06 02:32:34.627143+00 | host0    | 0.5378765249290502 |  0.2958572490961302 | 0.10685818344495246
 2020-05-06 02:32:34.627143+00 | host1    | 0.3175958910709298 |  0.7874926624954846 | 0.16615243032654803
 2020-05-06 02:32:34.627143+00 | host2    | 0.4788377981501064 | 0.18277343256546175 |  0.7183967491020162


=&gt; SELECT min(time) AS start, max(time) AS end FROM cpu;

-[ RECORD 1 ]------------------------
start | 2020-04-22 02:32:34.627143+00
end   | 2020-05-06 02:32:34.627143+00</code></pre><p>Let’s now create a continuous aggregate view on this table with hourly <a href="https://docs.timescale.com/latest/api?utm_source=timescale-real-time-aggregates-details&amp;utm_medium=blog&amp;utm_campaign=1-7-release&amp;utm_content=api-docs-histograms#histogram">histograms</a>: </p><pre><code class="language-SQL">=&gt; CREATE VIEW cpu_1h 
   WITH (timescaledb.continuous, 
         timescaledb.refresh_lag = '30m',
         timescaledb.refresh_interval = '30m')
   AS
      SELECT 
         time_bucket('1 hour', time) AS hour,
         hostname, 
         histogram(usage_user, 0.0, 1.0, 5) AS hist_usage_user,
         histogram(usage_system, 0.0, 1.0, 5) AS hist_usage_system,
         histogram(usage_iowait, 0.0, 1.0, 5) AS hist_usage_iowait
      FROM cpu
      GROUP BY hour, hostname;</code></pre><p>By default, queries to this view use these real-time aggregation features.  If you want to disable real-time aggregation, set <code>materialized_only = true</code> when creating the view or by later ALTERing the view.  (See <a href="https://docs.timescale.com/latest/api?utm_source=timescale-real-time-aggregates-details&amp;utm_medium=blog&amp;utm_campaign=1-7-release&amp;utm_content=continuous-aggs-create-view-docs#continuous_aggregate-create_view">API docs here</a>.)</p><p>Now, the job scheduling framework will start to asynchronously process this view, which we can see in our <a href="https://docs.timescale.com/latest/api?utm_source=timescale-real-time-aggregates-details&amp;utm_medium=blog&amp;utm_campaign=1-7-release&amp;utm_content=continuous-aggs-stats-docs#timescaledb_information-continuous_aggregate_stats">informational view</a>.  (You can also <a href="https://docs.timescale.com/latest/api?utm_source=timescale-real-time-aggregates-details&amp;utm_medium=blog&amp;utm_campaign=1-7-release&amp;utm_content=continuous-aggs-refresh-view-docs#continuous_aggregate-refresh_view">manually force</a> the materialization to occur if needed.)  <br></p><pre><code class="language-SQL">=&gt; SELECT * FROM timescaledb_information.continuous_aggregate_stats;

- [ RECORD 1 ]
view_name              | cpu_1h
completed_threshold    | 2020-05-06 02:00:00+00
invalidation_threshold | 2020-05-06 02:00:00+00
job_id                 | 1000
last_run_started_at    | 2020-05-06 02:34:08.300524+00
last_successful_finish | 2020-05-06 02:34:09.04923+00
last_run_status        | Success
job_status             | Scheduled
last_run_duration      | 00:00:00.748706
next_scheduled_run     | 2020-05-06 03:04:09.04923+00
total_runs             | 17
total_successes        | 17
total_failures         | 0
total_crashes          | 0
</code></pre><p>From this data, we see that the materialized view includes data up to 2:00am on May 6, while from above we’ve learned that the raw data goes up to 2:32am. </p><p>Let’s try our query directly on the raw table, and use an EXPLAIN ANALYZE to both show the database plan, as well as actually execute the query and collect timing information.  (Note that in many use cases, one would offset queries from <code>now() - &lt;some interval&gt;</code>. But to ensure that we use identical datasets in our subsequent analysis, we explicitly select the interval offset from the dataset’s last timestamp.)</p><pre><code class="language-SQL">=&gt; EXPLAIN (ANALYZE, COSTS OFF)
   SELECT 
      time_bucket('1 hour', time) AS hour,
      hostname, 
      histogram(usage_user, 0.0, 1.0, 5) AS hist_usage_user,
      histogram(usage_system, 0.0, 1.0, 5) AS hist_usage_system,
      histogram(usage_iowait, 0.0, 1.0, 5) AS hist_usage_iowait
   FROM cpu
   WHERE time &gt; '2020-05-06 02:32:34.627143+00'::timestamptz - interval '7 days'
   GROUP BY hour, hostname
   ORDER BY hour DESC;

QUERY PLAN             
----------------------------------------------------------------
 Finalize GroupAggregate (actual time=1859.306..1862.331 rows=1690 loops=1)
   Group Key: (time_bucket('01:00:00'::interval, cpu."time")), cpu.hostname
   -&gt;  Gather Merge (actual time=1841.735..1849.604 rows=1881 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         -&gt;  Sort (actual time=1194.162..1194.222 rows=627 loops=3)
               Sort Key: (time_bucket('01:00:00'::interval, cpu."time")) DESC, cpu.hostname
               Sort Method: quicksort  Memory: 25kB
               Worker 0:  Sort Method: quicksort  Memory: 274kB
               Worker 1:  Sort Method: quicksort  Memory: 274kB
               -&gt;  Partial HashAggregate (actual time=1193.198..1193.594 rows=627 loops=3)
                     Group Key: time_bucket('01:00:00'::interval, cpu."time"), cpu.hostname
                     -&gt;  Parallel Custom Scan (ChunkAppend) on cpu (actual time=9.840..716.952 rows=2016000 loops=3)
                           Chunks excluded during startup: 7
                           -&gt;  Parallel Seq Scan on _hyper_1_14_chunk (actual time=14.751..199.098 rows=864000 loops=1)
                                 Filter: ("time" &gt; ('2020-05-06 02:32:34.627143+00'::timestamp with time zone - '7 days'::interval))
                           -&gt;  Parallel Seq Scan on _hyper_1_13_chunk (actual time=14.749..201.100 rows=864000 loops=1)
                                 Filter: ("time" &gt; ('2020-05-06 02:32:34.627143+00'::timestamp with time zone - '7 days'::interval))
                           -&gt;  Parallel Seq Scan on _hyper_1_12_chunk (actual time=0.025..182.591 rows=864000 loops=1)
                                 Filter: ("time" &gt; ('2020-05-06 02:32:34.627143+00'::timestamp with time zone - '7 days'::interval))
                           -&gt;  Parallel Seq Scan on _hyper_1_11_chunk (actual time=0.031..182.812 rows=864000 loops=1)
                                 Filter: ("time" &gt; ('2020-05-06 02:32:34.627143+00'::timestamp with time zone - '7 days'::interval))
                           -&gt;  Parallel Seq Scan on _hyper_1_10_chunk (actual time=0.035..183.918 rows=864000 loops=1)
                                 Filter: ("time" &gt; ('2020-05-06 02:32:34.627143+00'::timestamp with time zone - '7 days'::interval))
                           -&gt;  Parallel Seq Scan on _hyper_1_9_chunk (actual time=0.019..184.416 rows=864000 loops=1)
                                 Filter: ("time" &gt; ('2020-05-06 02:32:34.627143+00'::timestamp with time zone - '7 days'::interval))
                           -&gt;  Parallel Seq Scan on _hyper_1_8_chunk (actual time=0.823..91.605 rows=386225 loops=2)
                                 Filter: ("time" &gt; ('2020-05-06 02:32:34.627143+00'::timestamp with time zone - '7 days'::interval))
                                 Rows Removed by Filter: 45775
                           -&gt;  Parallel Seq Scan on _hyper_1_15_chunk (actual time=0.022..20.277 rows=91550 loops=1)
                                 Filter: ("time" &gt; ('2020-05-06 02:32:34.627143+00'::timestamp with time zone - '7 days'::interval))

 Planning Time: 1.917 ms
 Execution Time: 1921.753 ms</code></pre><p>Note that TimescaleDB’s constraint exclusion excluded 7 of the chunks from being queried given the WHERE predicate (as the query was for the last 7 days of the 14-day dataset), then processed the query on the remaining 8 chunks (performing a scan over 6,048,000 rows) using two parallel workers.  The query in total took just over 1.9 seconds.</p><p>Now let’s try the query on our materialized table, first turning off real-time aggregation just for this experiment: </p><pre><code class="language-SQL">=&gt; ALTER VIEW cpu_1h set (timescaledb.materialized_only = true);</code></pre><p>First, let’s look at the view definition, which is a SELECT over the materialized hypertable with the specified GROUP BYs.  But we also see that each of the histograms calls “finalize_agg.”  TimescaleDB doesn’t precisely pre-compute and store the exact answer that’s specified in the query, but rather a <a href="https://www.postgresql.org/docs/current/xaggr.html#XAGGR-PARTIAL-AGGREGATES">partial aggregate</a> that is then “finalized” at query time, which will allow for greater parallelization and rebucketing at query time (in a future release).</p><pre><code class="language-SQL"> \d+ cpu_1h;

                                          View "public.cpu_1h"
      Column       |           Type           | Collation | Nullable | Default | Storage  | Description 
-------------------+--------------------------+-----------+----------+---------+----------+-------------
 hour              | timestamp with time zone |           |          |         | plain    | 
 hostname          | text                     |           |          |         | extended | 
 hist_usage_user   | integer[]                |           |          |         | extended | 
 hist_usage_system | integer[]                |           |          |         | extended | 
 hist_usage_iowait | integer[]                |           |          |         | extended | 

View definition:
 SELECT _materialized_hypertable_2.hour,
    _materialized_hypertable_2.hostname,
    _timescaledb_internal.finalize_agg('histogram(double precision,double precision,double precision,integer)'::text, NULL::name, NULL::name, '{{pg_catalog,float8},{pg_catalog,float8},{pg_catalog,float8},{pg_catalog,int4}}'::name[], _materialized_hypertable_2.agg_3_3, NULL::integer[]) AS hist_usage_user,
    _timescaledb_internal.finalize_agg(...) AS hist_usage_system,
    _timescaledb_internal.finalize_agg(...) AS hist_usage_iowait
   FROM _timescaledb_internal._materialized_hypertable_2
  GROUP BY _materialized_hypertable_2.hour, _materialized_hypertable_2.hostname;</code></pre><p>Now let’s run the query with vanilla continuous aggregates enabled:</p><pre><code class="language-SQL">=&gt; EXPLAIN (ANALYZE, COSTS OFF)
   SELECT * FROM cpu_1h
   WHERE hour &gt; '2020-05-06 02:32:34.627143+00'::timestamptz - interval '7 days'
   ORDER BY hour DESC;

QUERY PLAN
----------------------------------------------------------------
 Sort (actual time=3.218..3.312 rows=1670 loops=1)
   Sort Key: _materialized_hypertable_2.hour DESC
   Sort Method: quicksort  Memory: 492kB
   -&gt;  HashAggregate (actual time=1.943..2.891 rows=1670 loops=1)
         Group Key: _materialized_hypertable_2.hour, _materialized_hypertable_2.hostname
         -&gt;  Custom Scan (ChunkAppend) on _materialized_hypertable_2 (actual time=0.064..0.688 rows=1670 loops=1)
               Chunks excluded during startup: 1
               -&gt;  Seq Scan on _hyper_2_17_chunk (actual time=0.063..0.590 rows=1670 loops=1)
                     Filter: (hour &gt; ('2020-05-06 02:32:34.627143+00'::timestamp with time zone - '7 days'::interval))
                     Rows Removed by Filter: 270

 Planning Time: 0.645 ms
 Execution Time: 3.461 ms</code></pre><p>Just 4 milliseconds, after a scan of 1,670 rows in the materialized hypertable.  And let’s look at the most recent 3 rows returned for a specific host:</p><pre><code class="language-SQL">=&gt; SELECT hour, hostname, hist_usage_user
    FROM cpu_1h
    WHERE hour &gt; '2020-05-06 02:32:34.627143+00'::timestamptz - interval '7 days'         
       AND hostname = 'host0'
    ORDER BY hour DESC LIMIT 3;

          hour          | hostname |      hist_usage_user      
------------------------+----------+---------------------------
 2020-05-06 01:00:00+00 | host0    | {0,781,676,712,719,712,0}
 2020-05-06 00:00:00+00 | host0    | {0,736,714,776,689,685,0}
 2020-05-05 23:00:00+00 | host0    | {0,714,759,715,692,720,0}</code></pre><p>Note that the last record is from the 1:00am - 2:00am hour.</p><p>Now let’s re-enable real-time aggregation and try the same query, first showing how the real-time aggregation is defined as a UNION ALL between the materialized and raw data.</p><pre><code class="language-SQL">=&gt; ALTER VIEW cpu_1h set (timescaledb.materialized_only = false);

=&gt; \d+ cpu_1h;

                                          View "public.cpu_1h"
      Column       |           Type           | Collation | Nullable | Default | Storage  | Description 
-------------------+--------------------------+-----------+----------+---------+----------+-------------
 hour              | timestamp with time zone |           |          |         | plain    | 
 hostname          | text                     |           |          |         | extended | 
 hist_usage_user   | integer[]                |           |          |         | extended | 
 hist_usage_system | integer[]                |           |          |         | extended | 
 hist_usage_iowait | integer[]                |           |          |         | extended | 

View definition:
 SELECT _materialized_hypertable_2.hour,
    _materialized_hypertable_2.hostname,
    _timescaledb_internal.finalize_agg(...) AS hist_usage_user,
    _timescaledb_internal.finalize_agg(...) AS hist_usage_system,
    _timescaledb_internal.finalize_agg(...) AS hist_usage_iowait
   FROM _timescaledb_internal._materialized_hypertable_2
  WHERE _materialized_hypertable_2.hour &lt; COALESCE(_timescaledb_internal.to_timestamp(_timescaledb_internal.cagg_watermark(1)), '-infinity'::timestamp with time zone)
  GROUP BY _materialized_hypertable_2.hour, _materialized_hypertable_2.hostname
UNION ALL
 SELECT time_bucket('01:00:00'::interval, cpu."time") AS hour,
    cpu.hostname,
    histogram(cpu.usage_user, 0.0::double precision, 1.0::double precision, 5) AS hist_usage_user,
    histogram(cpu.usage_system, 0.0::double precision, 1.0::double precision, 5) AS hist_usage_system,
    histogram(cpu.usage_iowait, 0.0::double precision, 1.0::double precision, 5) AS hist_usage_iowait
   FROM cpu
  WHERE cpu."time" &gt;= COALESCE(_timescaledb_internal.to_timestamp(_timescaledb_internal.cagg_watermark(1)), '-infinity'::timestamp with time zone)
  GROUP BY (time_bucket('01:00:00'::interval, cpu."time")), cpu.hostname;


=&gt; EXPLAIN (ANALYZE, COSTS OFF)
   SELECT * FROM cpu_1h
   WHERE hour &gt; '2020-05-06 02:32:34.627143+00'::timestamptz - interval '7 days'
   ORDER BY hour DESC;

QUERY PLAN               
----------------------------------------------------------------
 Sort (actual time=20.871..21.055 rows=1680 loops=1)
   Sort Key: _materialized_hypertable_2.hour DESC
   Sort Method: quicksort  Memory: 495kB
   -&gt;  Append (actual time=1.842..20.536 rows=1680 loops=1)
         -&gt;  HashAggregate (actual time=1.841..2.789 rows=1670 loops=1)
               Group Key: _materialized_hypertable_2.hour, _materialized_hypertable_2.hostname
               -&gt;  Custom Scan (ChunkAppend) on _materialized_hypertable_2 (actual time=0.105..0.580 rows=1670 loops=1)
                     Chunks excluded during startup: 1
                     -&gt;  Index Scan using _hyper_2_17_chunk__materialized_hypertable_2_hour_idx on _hyper_2_17_chunk (actual time=0.104..0.475 rows=1670 loops=1)
                           Index Cond: ((hour &lt; COALESCE(_timescaledb_internal.to_timestamp(_timescaledb_internal.cagg_watermark(1)), '-infinity'::timestamp with time zone)) AND (hour &gt; ('2020-05-06 02:32:34.627143+00'::timestamp with time zone - '7 days'::interval)))
         -&gt;  HashAggregate (actual time=17.641..17.655 rows=10 loops=1)
               Group Key: time_bucket('01:00:00'::interval, cpu."time"), cpu.hostname
               -&gt;  Custom Scan (ChunkAppend) on cpu (actual time=0.165..12.297 rows=19550 loops=1)
                     Chunks excluded during startup: 14
                     -&gt;  Index Scan using _hyper_1_15_chunk_cpu_time_idx on _hyper_1_15_chunk (actual time=0.163..9.723 rows=19550 loops=1)
                           Index Cond: ("time" &gt;= COALESCE(_timescaledb_internal.to_timestamp(_timescaledb_internal.cagg_watermark(1)), '-infinity'::timestamp with time zone))
                           Filter: (time_bucket('01:00:00'::interval, "time") &gt; ('2020-05-06 02:32:34.627143+00'::timestamp with time zone - '7 days'::interval))

 Planning Time: 3.532 ms
 Execution Time: 22.905 ms
</code></pre><p>Still very fast at just over 26 milliseconds including planning time (scanning 1,670 materialized rows and 19,550 raw rows), and now the results:</p><pre><code class="language-SQL">=&gt; SELECT hour, hostname, hist_usage_user
   FROM cpu_1h
   WHERE hour &gt; '2020-05-06 02:32:34.627143+00'::timestamptz - interval '7 days'
      AND hostname = 'host0'
   ORDER BY hour DESC LIMIT 3;

          hour          | hostname |      hist_usage_user      
------------------------+----------+---------------------------
 2020-05-06 02:00:00+00 | host0    | {0,384,388,385,400,398,0}
 2020-05-06 01:00:00+00 | host0    | {0,781,676,712,719,712,0}
 2020-05-06 00:00:00+00 | host0    | {0,736,714,776,689,685,0}

</code></pre><p>Unlike when we were processing the materialized table without the real-time aggregation, we have up-to-date results, including data from the 2:00 - 3:00am hour.  This is because the materialized table didn’t have data from the last hour, while the real-time aggregation was able to compute that result from the raw data at query time.  Notice also that there is less data in the final row (each histogram bucket has about half the counts of the prior rows), as this final row aggregates only 32 minutes of raw data, not a full hour. </p><p>These two stages of real-time aggregation are also visible in the above query plan:  the materialized hypertable is processed in the first section via <code>Custom Scan (ChunkAppend) on _materialized_hypertable_2</code>, while the underlying raw hypertable is processed in the second section via <code>Custom Scan (ChunkAppend) on cpu</code>, with each processing only the data before or after the completion threshold (shown as <code>_timescaledb_internal.cagg_watermark(1)</code> in the plan).</p><p>So, in summary:  a complete, up-to-date aggregate over the data, both at a fraction of the latency of querying the raw data, and avoiding the excessive overhead of schemes that update materializations through per-row or per-statement triggers.</p><table>
<thead>
<tr>
<th>Query Type</th>
<th>Latency</th>
<th>Freshness</th>
</tr>
</thead>
<tbody>
<tr>
<td>Raw Data</td>
<td>1924 ms</td>
<td>Up-to-date</td>
</tr>
<tr>
<td>Continuous Aggregates</td>
<td>4 ms</td>
<td>Lags up to 90 minutes</td>
</tr>
<tr>
<td>Real-Time Aggregation</td>
<td>26 ms</td>
<td>Up-to-date</td>
</tr>
</tbody>
</table>
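<p>Finally, recall the downsampling pattern from earlier: because the hourly aggregates live in their own materialized hypertable, you can drop aged raw chunks while keeping the rollups.  As a rough sketch using the TimescaleDB 1.x-era API (verify the exact signatures against the docs for your version, and only drop data that has already been materialized):</p><pre><code class="language-SQL">-- Sketch, TimescaleDB 1.x-style API: keep ~7 days of raw data in 'cpu';
-- the hourly rollups in cpu_1h are unaffected.
SELECT drop_chunks(interval '7 days', 'cpu');

-- Or schedule it as a recurring background policy:
SELECT add_drop_chunks_policy('cpu', interval '7 days');</code></pre>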
<p><strong>Continuous aggregates and real-time aggregation for the win!</strong></p><h2 id="conclusions">Conclusions</h2><p>What motivated us to build TimescaleDB is the firm belief that time-series use cases need a best-in-class, flexible time-series database, with advanced capabilities specifically designed for time-series workloads.  We developed real-time aggregation for time-series use cases such as devops monitoring, real-time analytics, and IoT, where fast queries over high-volume workloads and accurate, real-time results really matter. </p><p>Real-time aggregation joins a number of advanced capabilities in TimescaleDB around data lifecycle management and time-series analytics, including automated data retention, data reordering, native compression, downsampling, and traditional continuous aggregates.</p><p>And, <strong>there’s still much more to come</strong>. Keep an eye out for our much-anticipated TimescaleDB 2.0 release, which introduces horizontal scaling to TimescaleDB for terabyte to petabyte workloads.</p><h3 id="want-to-check-out-real-time-aggregation">Want to check out real-time aggregation?</h3><ul><li>Ready to dig in? Check out our <a href="https://docs.timescale.com/latest/using-timescaledb/continuous-aggregates/?utm_source=timescale-real-time-aggregates-details&amp;utm_medium=blog&amp;utm_campaign=1-7-release&amp;utm_content=continuous-aggregates-docs">docs</a>.</li><li>Brand new to TimescaleDB?  
Get started <a href="https://docs.timescale.com/latest/getting-started/?utm_source=timescale-real-time-aggregates-details&amp;utm_medium=blog&amp;utm_campaign=1-7-release&amp;utm_content=getting-started-docs">here</a>.</li></ul><p>If you have any questions along the way, we’re always available via our <a href="https://slack.timescale.com">community Slack</a> (we’re <a href="https://timescaledb.slack.com/archives/D011A62GNR0">@mike</a> and <a href="https://timescaledb.slack.com/archives/D0137UNE550">@sven </a>, come say hi 👋).</p><p>And, if you are interested in keeping up-to-date with future TimescaleDB releases, <a href="https://www.timescale.com/signup/release-notes/?utm_source=timescale-real-time-aggregates-details&amp;utm_medium=blog&amp;utm_campaign=1-7-release&amp;utm_content=release-notes-subscribe">sign up for our Release Notes</a>.  It’s low-traffic, we promise.</p><p>Until next time, keep it real!</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Time-series data: Why (and how) to use a relational database instead of NoSQL]]></title>
            <description><![CDATA[These days, time-series data applications (e.g., data center / server / microservice / container monitoring, sensor / IoT analytics, financial data analysis, etc.) are proliferating.]]></description>
            <link>https://www.tigerdata.com/blog/time-series-data-why-and-how-to-use-a-relational-database-instead-of-nosql-d0cd6975e87c</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/time-series-data-why-and-how-to-use-a-relational-database-instead-of-nosql-d0cd6975e87c</guid>
            <category><![CDATA[General]]></category>
            <category><![CDATA[PostgreSQL]]></category>
            <dc:creator><![CDATA[Mike Freedman]]></dc:creator>
            <pubDate>Thu, 20 Apr 2017 14:00:00 GMT</pubDate>
            <media:content medium="image" href="https://timescale.ghost.io/blog/content/images/2018/12/warehouse.png">
            </media:content>
<content:encoded><![CDATA[<p>These days, <a href="https://timescale.ghost.io/blog/time-series-data/">time-series data</a> applications (e.g., data center / server / microservice / container monitoring, sensor / IoT analytics, financial data analysis, etc.) are proliferating.</p><p>As a result, time-series databases are in fashion (<a href="https://misfra.me/2016/04/09/tsdb-list/" rel="noopener">here are 33 of them</a>). Most of these renounce the trappings of a traditional relational database and adopt what is generally known as a NoSQL model. Usage patterns bear this out: <a href="https://www.percona.com/blog/2017/02/10/percona-blog-poll-database-engine-using-store-time-series-data/" rel="noopener">a recent survey</a> showed that developers preferred NoSQL to relational databases for time-series data by over 2:1.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2018/12/image-78.png" class="kg-image" alt="" loading="lazy" width="570" height="240"><figcaption><b><strong style="white-space: pre-wrap;">Relational databases include:</strong></b><span style="white-space: pre-wrap;"> MySQL, MariaDB Server, PostgreSQL. </span><b><strong style="white-space: pre-wrap;">NoSQL databases include:</strong></b><span style="white-space: pre-wrap;"> Elastic, InfluxDB, MongoDB, Cassandra, Couchbase, Graphite, Prometheus, ClickHouse, OpenTSDB, DalmatinerDB, KairosDB, RiakTS. </span><b><strong style="white-space: pre-wrap;">Source: </strong></b><a href="https://www.percona.com/blog/2017/02/10/percona-blog-poll-database-engine-using-store-time-series-data/"><b><strong style="white-space: pre-wrap;">Percona</strong></b></a><b><strong style="white-space: pre-wrap;">.&nbsp;</strong></b></figcaption></figure><p>Typically, the reason for adopting NoSQL time-series databases comes down to scale. 
While relational databases have many useful features that most NoSQL databases do not (robust secondary index support, complex predicates, a rich query language, JOINs, etc.), they are difficult to scale.</p><p>And because time-series data piles up very quickly, many developers believe relational databases are ill-suited for it.</p><p>We take a different, somewhat heretical stance: relational databases can be quite powerful for time-series data. One just needs to solve the scaling problem. That is what we do in <a href="https://github.com/timescale/timescaledb" rel="noopener">TimescaleDB</a>.</p><p>When we <a href="https://timescale.ghost.io/blog/when-boring-is-awesome-building-a-scalable-time-series-database-on-postgresql-2900ea453ee2/">announced TimescaleDB two weeks ago</a>, we received a lot of positive feedback from the community. But we also heard from skeptics, who found it hard to believe that one should (or could) build a scalable time-series database on a relational database (in our case, PostgreSQL).</p><p>There are two separate ways to think about scaling: <strong>scaling up</strong> so that a single machine can store more data, and <strong>scaling out</strong> so that data can be stored across multiple machines.</p><p>Why are both important? The most common approach to scaling out across a cluster of <em>N</em> servers is to partition, or shard, a dataset into <em>N</em> partitions. If each server is limited in its throughput or performance (i.e., unable to scale up), then the overall cluster throughput is greatly reduced.</p><p>This post discusses <strong>scaling up</strong>. 
(A <strong>scaling-out</strong> post will be published at a later date.)</p><p>In particular, this post explains:</p><ul><li>Why relational databases do not normally scale up well</li><li>How LSM trees (typically used in NoSQL databases) do not adequately solve the needs of many time-series applications</li><li>How time-series data is unique, how one can leverage those differences to overcome the scaling problem, and some performance results</li></ul><p>Our motivations are twofold: <strong>for anyone facing similar problems</strong>, to share what we’ve learned; and <strong>for those considering using TimescaleDB for time-series data</strong> (including the skeptics!), to explain some of our design decisions.</p><hr><h2 id="why-databases-do-not-normally-scale-up-well-swapping-inout-of-memory-is-expensive">Why databases do not normally scale up well: Swapping in/out of memory is expensive</h2><p>A common problem with scaling database performance on a single machine is the significant cost/performance trade-off between memory and disk. While memory is faster than disk, it is much more expensive: about 20x costlier than solid-state storage like Flash, and 100x costlier than hard drives. Eventually, our entire dataset will not fit in memory, so we’ll need to write our data and indexes to disk.</p><p>This is an old, common problem for relational databases. Under most relational databases, a table is stored as a collection of fixed-size pages of data (e.g., 8KB pages in PostgreSQL), on top of which the system builds data structures (such as <a href="https://en.wikipedia.org/wiki/B-tree" rel="noopener">B-trees</a>) to index the data. 
With an index, a query can quickly find a row with a specified ID (e.g., bank account number) without scanning the entire table or “walking” the table in some sorted order.</p><p>Now, if the working set of data and indexes is small, we can keep it in memory.</p><p>But if the data is sufficiently large that we can’t fit all (similarly fixed-size) pages of our B-tree in memory, then updating a random part of the tree can involve significant disk I/O as we read pages from disk into memory, modify them in memory, and then write them back out to disk (when evicted to make room for other B-tree pages). And a relational database like PostgreSQL keeps a B-tree (or other data structure) for <em>each</em> table index, in order for values in that index to be found efficiently. So the problem compounds as you index more columns.</p><p>In fact, because the database only accesses the disk in page-sized units, even seemingly small updates can cause these swaps to occur: to change one cell, the database may need to swap out an existing 8KB page and write it back to disk, then read in the new page before modifying it.</p><p>But why not use smaller- or variable-sized pages? There are two good reasons: minimizing disk fragmentation, and (in the case of a spinning hard disk) minimizing the overhead of the “seek time” (usually 5–10ms) required to physically move the disk head to a new location.</p><p>What about solid-state drives (SSDs)? While solutions like NAND Flash drives eliminate any physical “seek” time, they can only be read from or written to at page-level granularity (today, typically 8KB). 
So, even to update a single byte, the SSD firmware needs to read an 8KB page from disk to its buffer cache, modify the page, then write the updated 8KB page back to a new disk block.</p><p>The cost of swapping in and out of memory can be seen in this performance graph from PostgreSQL, where insert throughput plunges with table size and increases in variance (depending on whether requests hit in memory or require (potentially multiple) fetches from disk).</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2018/12/gif3-1.gif" class="kg-image" alt="" loading="lazy" width="1067" height="600" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2018/12/gif3-1.gif 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2018/12/gif3-1.gif 1000w, https://timescale.ghost.io/blog/content/images/2018/12/gif3-1.gif 1067w" sizes="(min-width: 720px) 720px"></figure><p><em>Insert throughput as a function of table size for PostgreSQL 9.6.2, running with 10 workers on an Azure standard DS4 v2 (8 core) machine with SSD-based (premium LRS) storage. Clients insert individual rows into the database (each of which has 12 columns: a timestamp, an indexed randomly-chosen primary id, and 10 additional numerical metrics). 
The PostgreSQL rate starts at over 15K inserts/second, but then begins to drop significantly after 50M rows and to experience very high variance (including periods of only 100s of inserts/sec).</em></p><hr><h2 id="enter-nosql-databases-with-log-structured-merge-trees-and-new-problems">Enter NoSQL databases with Log-Structured Merge Trees (and new problems)</h2><p>About a decade ago, we started seeing a number of “NoSQL” storage systems address this problem via <a href="http://www.cs.umb.edu/~poneil/lsmtree.pdf" rel="noopener">Log-structured merge (LSM) trees</a>, which reduce the cost of making small writes by only performing larger append-only writes to disk.</p><p>Rather than performing “in-place” writes (where a small change to an existing page requires reading/writing that entire page from/to disk), LSM trees queue up several new updates (including deletes!) into pages and write them as a single batch to disk. In particular, all writes in an LSM tree are performed to a sorted table maintained <em>in memory</em>, which is then flushed to disk as an immutable batch when of sufficient size (as a “sorted string table”, or SSTable). This reduces the cost of making small writes.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2018/12/image-79.png" class="kg-image" alt="" loading="lazy" width="631" height="136" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2018/12/image-79.png 600w, https://timescale.ghost.io/blog/content/images/2018/12/image-79.png 631w"><figcaption><b><strong style="white-space: pre-wrap;">In an LSM tree, all updates are first written to a sorted table in memory, and then flushed to disk as an immutable batch, stored as an SSTable, which is often indexed in memory. 
Source: </strong></b><a href="https://www.igvita.com/2012/02/06/sstable-and-log-structured-storage-leveldb/"><b><strong style="white-space: pre-wrap;">igvita.com</strong></b></a></figcaption></figure><p>This architecture — which has been adopted by many “NoSQL” databases like LevelDB, Google BigTable, Cassandra, MongoDB (WiredTiger), and <a href="https://www.outfluxdata.com">InfluxDB</a> — may seem great at first. Yet it introduces other tradeoffs: higher memory requirements and poor secondary index support.</p><p><strong>Higher memory requirements:</strong> Unlike in a B-tree, in an LSM tree there is no single ordering: no global index to give us a sorted order over all keys. Consequently, looking up a value for a key gets more complex: first, check the memory table for the latest version of the key; otherwise, look to (potentially many) on-disk tables to find the latest value associated with that key. To avoid excessive disk I/O (and if the values themselves are large, such as the webpage content stored in Google’s BigTable), indexes for all SSTables may be kept entirely in memory, which in turn increases memory requirements.</p><p><strong>Poor secondary index support:</strong> Given that they lack any global sorted order, LSM trees do not naturally support secondary indexes. Various systems have added some additional support, such as by duplicating the data in a different order. Or, they emulate support for richer predicates by building their primary key as the concatenation of multiple values. Yet this approach comes at the cost of requiring a larger scan among these keys at query time, and thus supports only items with limited cardinality (e.g., discrete values, not numeric ones).</p><p>There is a better approach to this problem. 
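To make the read path concrete, the memtable-plus-SSTables lookup described above can be sketched in a few lines of Python (a toy model for illustration only; no real LSM engine is this simple):

```python
import bisect

class LSMTree:
    """Toy LSM tree: a mutable in-memory table plus immutable sorted runs."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}     # newest writes, mutable, in memory
        self.sstables = []     # immutable sorted runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Freeze the memtable into a sorted, immutable run (an "SSTable").
        keys = sorted(self.memtable)
        self.sstables.append((keys, [self.memtable[k] for k in keys]))
        self.memtable = {}

    def get(self, key):
        # 1. The latest version may still be in the memtable.
        if key in self.memtable:
            return self.memtable[key]
        # 2. Otherwise search runs newest-to-oldest; a point lookup can
        #    touch many tables, which is why SSTable indexes are often
        #    pinned in memory.
        for keys, values in reversed(self.sstables):
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None
```

Note that there is no single global ordering across runs, so a secondary index would mean duplicating the data into another set of runs, which is exactly the weakness discussed above.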
Let’s start by better understanding time-series data.</p><hr><h2 id="time-series-data-is-different">Time-series data is different</h2><p>Let’s take a step back, and look at the original problem that relational databases were designed to solve. Starting from <a href="https://en.wikipedia.org/wiki/IBM_System_R" rel="noopener">IBM’s seminal System R</a> in the mid-1970s, relational databases were employed for what became known as online transaction processing (<a href="https://www.tigerdata.com/learn/understanding-oltp" rel="noreferrer">OLTP</a>).</p><p>Under OLTP, operations are often transactional updates to various rows in a database. For example, think of a bank transfer: a user debits money from one account and credits another. This corresponds to updates to two rows (or even just two cells) of a database table. Because bank transfers can occur between any two accounts, the two rows that are modified are somewhat randomly distributed over the table.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2018/12/image-80.png" class="kg-image" alt="" loading="lazy" width="1404" height="899" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2018/12/image-80.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2018/12/image-80.png 1000w, https://timescale.ghost.io/blog/content/images/2018/12/image-80.png 1404w" sizes="(min-width: 720px) 720px"><figcaption><b><strong style="white-space: pre-wrap;">Time-series data arises from many different settings: industrial machines; transportation and logistics; DevOps, datacenter, and server monitoring; and financial applications.</strong></b></figcaption></figure><p>Now let’s consider a few examples of time-series workloads:</p><ul><li><strong>DevOps/server/container monitoring.</strong> The system typically collects metrics about different servers or containers: CPU usage, free/used memory, network tx/rx, disk IOPS, etc. 
Each set of metrics is associated with a timestamp, unique server name/ID, and a set of tags that describe an attribute of what is being collected.</li><li><strong>IoT sensor data.</strong> Each IoT device may report multiple sensor readings for each time period. As an example, for environmental and air quality monitoring this could include: temperature, humidity, barometric pressure, sound levels, measurements of nitrogen dioxide, carbon monoxide, particulate matter, etc. Each set of readings is associated with a timestamp and unique device ID, and may contain other metadata.</li><li><strong>Financial data. </strong>Financial tick data may include streams with a timestamp, the name of the security, and its current price and/or price change. Another type of financial data is payment transactions, which would include a unique account ID, timestamp, transaction amount, as well as any other metadata. (Note that this data is different from the OLTP example above: here we are recording every transaction, while the OLTP system was just reflecting the current state of the system.)</li><li><strong>Fleet/asset management.</strong> Data may include a vehicle/asset ID, timestamp, GPS coordinates at that timestamp, and any metadata.</li></ul><p>In all of these examples, the datasets are a stream of measurements that involve inserting “new data” into the database, typically to the latest time interval. 
While it’s possible for data to arrive much later than when it was generated/timestamped, either due to network/system delays or because of corrections to update existing data, this is typically the exception, not the norm.</p><p>In other words, these two workloads have very different characteristics:</p><h4 id="oltp-writes"><strong>OLTP Writes</strong></h4><ul><li>Primarily UPDATES</li><li>Randomly distributed (over the set of primary keys)</li><li>Often transactions across multiple primary keys</li></ul><h4 id="time-series-writes"><strong>Time-series Writes</strong></h4><ul><li>Primarily INSERTs</li><li>Primarily to a recent time interval</li><li>Primarily associated with both a timestamp and a separate primary key (e.g., server ID, device ID, security/account ID, vehicle/asset ID, etc.)</li></ul><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2018/12/image-81.png" class="kg-image" alt="" loading="lazy" width="1600" height="774" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2018/12/image-81.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2018/12/image-81.png 1000w, https://timescale.ghost.io/blog/content/images/2018/12/image-81.png 1600w" sizes="(min-width: 720px) 720px"><figcaption><b><strong style="white-space: pre-wrap;">TimescaleDB stores each chunk in an internal database table, so indexes only grow with the size of each chunk, not the entire </strong></b><a href="https://www.tigerdata.com/blog/database-indexes-in-postgresql-and-timescale-cloud-your-questions-answered" rel="noreferrer"><b><strong style="white-space: pre-wrap;">hypertable</strong></b></a><b><strong style="white-space: pre-wrap;">. As inserts are largely to the more recent interval, that one remains in memory, avoiding expensive swaps to disk.</strong></b></figcaption></figure><p>Why does this matter? 
As we will see, one can take advantage of these characteristics to solve the scaling-up problem on a relational database.</p><hr><h2 id="a-new-way-adaptive-timespace-chunking">A new way: Adaptive time/space chunking</h2><p><em>NOTE September 2021: following publication of this post, as explained in </em><a href="https://github.com/timescale/timescaledb/issues/2574"><em>this GitHub issue</em></a><em>, adaptive chunking was deprecated from the latest releases of TimescaleDB. There is </em><a href="https://github.com/timescale/timescaledb/issues/3472"><em>a feature request</em></a><em> for the approach to be reinstated. You may wish to follow or upvote that request. </em></p><p>When previous approaches tried to avoid small writes to disk, they were trying to address the broader OLTP problem of UPDATEs to random locations. But as we just established, time-series workloads are different: writes are primarily INSERTs (not UPDATEs), to a recent time interval (not a random location). In other words, time-series workloads are <em>append-only.</em></p><p>This is interesting: it means that, if data is sorted by time, we would always be writing towards the “end” of our dataset. Organizing data by time would also allow us to keep the actual working set of database pages rather small, and maintain them in memory. And reads, which we have spent less time discussing, could also benefit: if many read queries are to recent intervals (e.g., for real-time dashboarding), then this data would already be cached in memory.</p><p>There is another way, which we call “adaptive time/space chunking”. 
This is what we use in TimescaleDB.</p><p>Instead of just indexing by time, TimescaleDB builds distinct <em>tables </em>by splitting data according to two dimensions: the time interval <em>and</em> a primary key (e.g., server/device/asset ID). We refer to these as <em>chunks</em> to differentiate them from <em>partitions</em>, which are typically defined by splitting the primary key space. Because each of these chunks is stored as a database table itself, and the query planner is aware of the chunk’s ranges (in time and keyspace), the query planner can immediately tell to which chunk(s) an operation’s data belongs. (This applies both for inserting rows, as well as for pruning the set of chunks that need to be touched when executing queries.)</p><p>The key benefit of this approach is that now all of our indexes are built only across these much smaller chunks (tables), rather than a single table representing the entire dataset. So if we size these chunks properly, we <em>can </em>fit the latest tables (and their B-trees) completely in memory, and avoid this swap-to-disk problem, while maintaining support for multiple indexes.</p><h2 id="approaches-to-implementing-chunking">Approaches to implementing chunking</h2><p>The two intuitive approaches to designing this time/space chunking each have significant limitations:</p><h3 id="approach-1-fixed-duration-intervals">Approach #1: Fixed-duration intervals</h3><p>Under this approach, all chunks can have fixed, identical time intervals, e.g., 1 day. This works well if the volume of data collected per interval does not change. However, as services become popular, their infrastructure correspondingly expands, leading to more servers and more monitoring data. Similarly, successful IoT products will deploy ever greater numbers of devices. And once we start writing too much data to each chunk, we’re regularly swapping to disk (and will find ourselves back at square one). 
On the flip side, choosing too-small intervals to start with leads to other performance downsides, e.g., having to touch many tables at query time.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2018/12/image-82.png" class="kg-image" alt="" loading="lazy" width="699" height="204" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2018/12/image-82.png 600w, https://timescale.ghost.io/blog/content/images/2018/12/image-82.png 699w"><figcaption><b><strong style="white-space: pre-wrap;">Each chunk has a fixed duration in time. Yet if the data volume per time increases, then eventually chunk size becomes too large to fit in memory.</strong></b></figcaption></figure><h3 id="approach-2-fixed-sized-chunks">Approach #2: Fixed-size chunks</h3><p>With this approach, all chunks have fixed target sizes, e.g., 1GB. A chunk is written to until it reaches its maximum size, at which point it becomes “closed” and its time interval constraints become fixed. However, later data falling within the chunk’s “closed” interval will still be written to the chunk, in order to preserve the correctness of the chunk’s time constraints.</p><p>A key challenge is that the time interval of the chunk depends on the order of data. Consider if data (even a single datapoint) arrives “early” by hours or even days, potentially due to a non-synchronized clock, or because of varying delays in systems with intermittent connectivity. This early datapoint will stretch out the time interval of the “open” chunk, while subsequent on-time data can drive the chunk over its target size. The insert logic for this approach is also more complex and expensive, driving down throughput for large batch writes (such as large COPY operations), as the database needs to make sure it inserts data in temporal order to determine when a new chunk should be created (even in the <em>middle</em> of an operation). 
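The early-datapoint problem is easy to see in a toy model (Python, for illustration only):

```python
def open_chunk_interval(timestamps):
    """An 'open' fixed-size chunk's time interval is simply the span
    of whatever data has landed in it so far."""
    return min(timestamps), max(timestamps)

# On-time data arriving once per second for a minute...
on_time = list(range(1_000, 1_060))
lo, hi = open_chunk_interval(on_time)
assert hi - lo == 59          # a tight, roughly one-minute interval

# ...plus one datapoint from a clock that is a day behind.
skewed = on_time + [1_000 - 86_400]
lo, hi = open_chunk_interval(skewed)
assert hi - lo > 86_000       # one stray point stretches the interval
                              # to over a day
```

Once the interval is stretched like this, all subsequent on-time data still falls inside it, so the chunk keeps growing past its target size.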
Other problems exist for fixed- or max-size chunks as well, including time intervals that may not align well with data retention policies (“delete data after 30 days”).</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2018/12/image-83.png" class="kg-image" alt="" loading="lazy" width="706" height="202" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2018/12/image-83.png 600w, https://timescale.ghost.io/blog/content/images/2018/12/image-83.png 706w"><figcaption><b><strong style="white-space: pre-wrap;">Each chunk’s time interval is fixed only once its maximum size has been reached. Yet if data arrives early, this creates a large interval for the chunk, and the chunk eventually becomes too large to fit in memory.</strong></b></figcaption></figure><p>TimescaleDB takes a third approach that combines the strengths of both.</p><h3 id="approach-3-adaptive-intervals-our-current-design">Approach #3: Adaptive intervals (our current design)</h3><p><a href="https://timescale.ghost.io/blog/blog/time-series-data-why-and-how-to-use-a-relational-database-instead-of-nosql-d0cd6975e87c/#a-new-way-adaptive-timespace-chunking"><em>Please see the note above. </em></a></p><p>Chunks are created with a fixed interval, but the interval adapts from chunk to chunk based on changes in data volumes in order to hit maximum target sizes.</p><p>By avoiding open-ended intervals, this approach ensures that data arriving early doesn’t create too-long time intervals that will subsequently lead to over-large chunks. Further, like static intervals, it more naturally supports retention policies specified on time, e.g., “delete data after 30 days”. Given TimescaleDB’s time-based chunking, such policies are implemented by simply dropping chunks (tables) in the database. 
This means that individual <em>files </em>in the underlying file system can simply be deleted, rather than needing to delete individual <em>rows,</em> which requires erasing/invalidating portions of the underlying file. Such an approach therefore avoids fragmentation in the underlying database files, which in turn avoids the need for <a href="https://www.postgresql.org/docs/9.6/static/routine-vacuuming.html" rel="noopener">vacuuming</a>. Such vacuuming can be prohibitively expensive in very large tables.</p><p>At the same time, this approach ensures that chunks are sized appropriately so that the latest ones can be maintained in memory, even as data volumes may change.</p><p>Partitioning by primary key then takes each time interval and further splits it into a number of smaller chunks, which all share the same time interval but are disjoint in terms of their primary keyspace. This enables better parallelization — for both inserts and queries — both on servers with multiple disks and across multiple servers. 
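The adapt-the-interval idea reduces to a simple feedback rule, sketched here in Python (illustrative only; TimescaleDB’s actual sizing logic is more involved):

```python
def next_chunk_interval(prev_interval, prev_chunk_bytes, target_bytes):
    """Scale the next chunk's time interval by how far the previous
    chunk over- or under-shot the target size, assuming the recent
    ingest rate persists."""
    return prev_interval * (target_bytes / prev_chunk_bytes)

# The last chunk covered 24 hours but hit 2 GB against a 1 GB target,
# so the next chunk should cover half the time...
assert next_chunk_interval(24.0, 2 * 2**30, 2**30) == 12.0

# ...while a half-full chunk lets the next interval grow.
assert next_chunk_interval(24.0, 2**29, 2**30) == 48.0
```

Because intervals are fixed once a chunk is created, an early datapoint simply lands in a separate “future” chunk instead of stretching the current one.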
More on these issues in a later post.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2018/12/image-84.png" class="kg-image" alt="" loading="lazy" width="709" height="421" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2018/12/image-84.png 600w, https://timescale.ghost.io/blog/content/images/2018/12/image-84.png 709w"><figcaption><b><strong style="white-space: pre-wrap;">If the data volume per time increases, then chunk interval decreases to maintain right-sized chunks.</strong></b> <b><strong style="white-space: pre-wrap;">If data arrives early, then data is stored into a “future” chunk to maintain right-sized chunks.</strong></b></figcaption></figure><hr><h2 id="result-15x-improvement-in-insert-rate">Result: 15x improvement in insert rate</h2><p>Keeping chunks at the right size is how we achieve the INSERT results surpassing vanilla PostgreSQL that Ajay already showed in his <a href="https://timescale.ghost.io/blog/when-boring-is-awesome-building-a-scalable-time-series-database-on-postgresql-2900ea453ee2/">earlier post</a>.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2018/12/image-85.png" class="kg-image" alt="" loading="lazy" width="1600" height="900" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2018/12/image-85.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2018/12/image-85.png 1000w, https://timescale.ghost.io/blog/content/images/2018/12/image-85.png 1600w" sizes="(min-width: 720px) 720px"><figcaption><b><strong style="white-space: pre-wrap;">Insert throughput of TimescaleDB vs. PostgreSQL, using the same workload as described earlier. 
Unlike vanilla PostgreSQL, TimescaleDB maintains a constant insert rate (of about 14.4K inserts/second, or 144K metrics/second, with very low variance), independent of dataset size.</strong></b></figcaption></figure><p>This consistent insert throughput also persists when writing large batches of rows in single operations to TimescaleDB (instead of row-by-row). Such batched inserts are common practice for databases employed in higher-scale production environments, e.g., when ingesting data from a distributed queue like Kafka. <strong>In such scenarios, a single Timescale server can ingest 130K rows (or 1.3M metrics) per second, approximately 15x that of vanilla PostgreSQL once the table has reached a couple hundred million rows.</strong></p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2018/12/image-86.png" class="kg-image" alt="" loading="lazy" width="1600" height="900" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2018/12/image-86.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2018/12/image-86.png 1000w, https://timescale.ghost.io/blog/content/images/2018/12/image-86.png 1600w" sizes="(min-width: 720px) 720px"><figcaption><b><strong style="white-space: pre-wrap;">Insert throughput of TimescaleDB vs. PostgreSQL when performing INSERTs of 10,000-row batches.</strong></b></figcaption></figure><hr><h2 id="summary">Summary</h2><p>A relational database can be quite powerful for time-series data. Yet the costs of swapping in/out of memory significantly impact its performance. But NoSQL approaches that implement log-structured merge (LSM) trees have only shifted the problem, introducing higher memory requirements and poor secondary index support.</p><p>By recognizing that time-series data is different, we are able to organize data in a new way: adaptive time/space chunking. 
This minimizes swapping to disk by keeping the working data set small enough to fit in memory, while allowing us to maintain robust primary and secondary index support (and the full feature set of PostgreSQL). As a result, we are able to <strong>scale up </strong>PostgreSQL significantly, achieving a 15x improvement in insert rates.</p><p>But what about performance comparisons to NoSQL databases? That post is coming soon.</p><p>In the meantime, you can download the latest version of TimescaleDB, released under the permissive Apache 2 license, on <a href="https://github.com/timescale/timescaledb" rel="noopener">GitHub</a>.</p><hr><p><em>Like this post? Interested in learning more?</em></p><p><em>Check out our </em><a href="https://github.com/timescale/timescaledb" rel="noopener"><strong><em>GitHub</em></strong></a><em>, join our </em><a href="http://slack.timescale.com/" rel="noopener"><em><strong>Slack community</strong></em></a><em>, and sign up for the community mailing list below. We’re also </em><a href="https://www.timescale.com/careers" rel="noopener"><em>hiring</em></a><em>!</em><br></p>]]></content:encoded>
        </item>
    </channel>
</rss>