Jun 02, 2025
We're excited to announce TimescaleDB 2.20, a release that further solidifies TimescaleDB as the fastest PostgreSQL for time-series and analytical workloads. This version delivers remarkable performance enhancements, especially for demanding analytical queries and crucial real-world data management tasks like updates and upserts, without asking developers to sacrifice the flexibility, full SQL capabilities, or rich ecosystem of PostgreSQL.
Beyond significantly boosting query performance, TimescaleDB 2.20 streamlines the developer experience, particularly when managing data efficiently and working with continuous aggregates. Get ready for impressive speed-ups and powerful enhancements that help you move faster and scale smarter.
Below, we bring you all the details on this edition’s new features and developer experience enhancements. But if you want the TL;DR, here it is.
TimescaleDB 2.20 makes the fastest PostgreSQL for real-time analytics on time-series data even faster, delivering speed without the sacrifices found in specialized OLAP databases.
With the TL;DR out of the way, we can deep dive into everything 2.20 has to offer!
Inserting or updating data into compressed columnstore chunks has been a pain point for some users, often requiring creative workarounds to manage throughput. We've heard you, and TimescaleDB 2.20 brings a massive improvement to this experience!
For UPSERT and ON CONFLICT operations on heavily constrained tables, performance is now up to 10 times faster.
Consider a worst-case scenario backfill of 3,000 records where every record conflicts with an existing unique constraint:
INSERT INTO uk_ts_comp_table
SELECT * FROM insert_sample
ON CONFLICT (uk, ts) DO NOTHING;
TimescaleDB 2.19.3: ~17 seconds
TimescaleDB 2.20.0: ~1.7 seconds (a 10x improvement!)
This is significantly closer to the performance of inserting into uncompressed rowstore chunks (which took ~300ms in this test). If no unique constraints are present in a chunk, the backfill performance between rowstore and columnstore chunks is now nearly equivalent.
We're not stopping here! We have more improvements lined up for backfilling, including leveraging bloom filters, to further enhance performance in upcoming releases. Our goal is to make backfilling into columnstore only marginally slower than into uncompressed tables, even for the most complex scenarios.
TimescaleDB's columnstore on PostgreSQL already provides powerful analytical capabilities. Version 2.20 pushes these boundaries, offering speeds that challenge dedicated OLAP systems while retaining the full power and flexibility of PostgreSQL.
Traditional OLAP databases excel at scanning large datasets but often falter with point lookups or finding rare values (like a specific UUID or transaction ID). This is where TimescaleDB 2.20 further differentiates itself by enhancing PostgreSQL's capabilities. We introduce bloom filters for columnstore batches, addressing a common pain point and delivering "speed without sacrifice"—you don't lose point query performance for analytical speed.
A bloom filter is a space-efficient, probabilistic data structure that quickly determines if an element might be in a set or definitely is not.
How bloom filters help bridge the gap:
For columns that aren't segmentby columns, we now build a bloom filter for each batch. When you query for a specific value (e.g., WHERE subject = 'unsubscribe'), TimescaleDB first checks the bloom filter:
If the value is definitely not in the batch, the batch is skipped without decompression.
If it might be there, we then decompress and check.
This drastically reduces data processing for such lookups, making your PostgreSQL-based time-series database performant for mixed workloads.
Performance impact:
SELECT min(sent), max(sent), count(*) FROM hackers WHERE subject = 'unsubscribe';
can execute 3.5 times faster (e.g., from 12 ms down to 2.7 ms). Point lookups on high-cardinality columns benefit as well, for example:
SELECT * FROM token_transfers WHERE from_address = '0xa148e876bbd30e1937be268a2f4387b14eff5cb0' ORDER BY block_timestamp DESC LIMIT 10;
Storage and availability:
Each bloom filter adds a small amount of storage per batch (up to 1 kB, but as little as 9 bytes, depending on cardinality). This feature is enabled by default for all new chunks created in TimescaleDB 2.20. Existing chunks require recompression to gain bloom filters.
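Since existing chunks only pick up bloom filters after recompression, one way to trigger that is a chunk-by-chunk decompress/recompress pass. The sketch below uses the standard chunk compression API; the hypertable name hits is illustrative:

```sql
-- Decompress every compressed chunk of the hypertable...
SELECT decompress_chunk(c, if_compressed => true)
FROM show_chunks('hits') AS c;

-- ...then recompress, so the rewritten batches include bloom filter metadata.
SELECT compress_chunk(c, if_not_compressed => true)
FROM show_chunks('hits') AS c;
```

Run this during a maintenance window if the table is large, since decompression temporarily increases storage use.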
Building on its initial availability for rowstore since version 2.2.1, a standout feature in TimescaleDB 2.20 is the major advancement and optimization of SkipScan for columnstore hypertables. This optimization dramatically speeds up queries that retrieve distinct values from segmentby columns, a common operation that can be slow in many databases.
What is SkipScan?
SkipScan is an intelligent query execution technique that significantly accelerates finding distinct values from an ordered index. Instead of scanning every entry, SkipScan efficiently jumps from one distinct item to the next, bypassing duplicates. SkipScan was first supported for regular PostgreSQL tables and rowstore hypertables in version 2.2.1. In TimescaleDB 2.20, we've massively enhanced this capability for columnstore, unlocking its full potential and performance.
How SkipScan accelerates your columnstore queries
Columnstore in TimescaleDB organizes data into compressed batches, with an internal B-tree index on segmentby and orderby min/max metadata columns. SkipScan leverages this structure. If you need to find distinct devices or get the latest reading for each, SkipScan shines, delivering PostgreSQL performance at speeds that rival specialized systems.
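As a concrete sketch of the layout SkipScan exploits (assuming the metrics table used in the scenarios below), you could configure device as the segmentby column and time as the orderby column using the long-standing compression options:

```sql
-- device becomes the segmentby column, time the orderby column:
-- the internal index over this layout is what SkipScan jumps across.
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device',
    timescaledb.compress_orderby   = 'time DESC'
);
```

With this layout, each compressed batch holds rows for a single device, ordered by time, so "next distinct device" and "latest reading per device" become cheap index jumps.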
Significant performance gains:
The benchmarks speak for themselves:
SELECT DISTINCT ON (device) device FROM ht_metrics_compressed;
2500x faster: From ~2.2 seconds down to a mere 0.85 milliseconds!
WITH temp as (SELECT DISTINCT ON (device) * FROM ht_metrics_compressed) SELECT avg(value) FROM temp;
2000x faster: From ~3.3 seconds down to 1.65 milliseconds.
SELECT count(DISTINCT device) FROM ht_metrics_compressed;
2000x faster: From ~2.42 seconds down to 1.16 milliseconds.
The technique applies to finding distinct values of segmentby columns in a columnstore.
Scenario 1: Show latest readings on each device
Imagine you want the most recent metric reading for every device:
SELECT DISTINCT ON (device) device, time, value
FROM metrics
ORDER BY device, time DESC;
With SkipScan, this avoids scanning and decompressing unnecessary data. For a table with 10 distinct devices and millions of rows spread across thousands of batches, SkipScan might only need to perform a handful of targeted reads and decompressions, leading to execution times like 4.589 ms instead of 3912.069 ms (a ~850x improvement in this specific example).
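To confirm SkipScan kicks in for a query like this, you can inspect the plan. The exact node name may vary by version, but a dedicated SkipScan node should appear:

```sql
EXPLAIN (COSTS OFF)
SELECT DISTINCT ON (device) device, time, value
FROM metrics
ORDER BY device, time DESC;
-- Look for a "Custom Scan (SkipScan)" node in the output.
```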
Scenario 2: Show latest readings for devices updated in the last hour
What if you add a time filter?
SELECT DISTINCT ON (device) device, time, value
FROM metrics
WHERE time > (now() - INTERVAL '1 hour')
ORDER BY device, time DESC;
TimescaleDB's columnstore already uses min/max metadata per batch to prune batches that don't overlap with the time filter. SkipScan builds on this. Even if many batches are filtered out by the time condition, SkipScan still provides an advantage by efficiently finding the first relevant record for each distinct device within the remaining candidate batches.
In one test, this query ran in 2.537 ms with SkipScan, compared to 9.236 ms without it. While the relative improvement is smaller because the WHERE clause is already highly selective, SkipScan still ensures optimal performance.
Scenario 3: Show latest readings where measurement exceeds a threshold
Consider a filter on a non-indexed, non-segmentby column:
SELECT DISTINCT ON (device) *
FROM metrics
WHERE value > 50
ORDER BY device, time DESC;
Here, batch-level filtering on value isn't possible. Without SkipScan, the system would have to decompress and scan many rows for each device to find those with value > 50 before picking the latest.
With SkipScan, the scan can stop for each device as soon as it finds the latest row with value > 50. This "early exit" per device offers huge savings: one such query saw execution time drop from 2145.536 ms to 45.392 ms (a ~47x improvement) thanks to SkipScan.
SkipScan, now highly optimized for columnstore in addition to its existing rowstore capabilities, is a game-changer for DISTINCT queries on your primary segmentby column, especially across hybrid tables where it can efficiently process both data formats. It makes your analytical dashboards and reports snappier than ever.
We're continuing to optimize storage and processing for all data types. In TimescaleDB 2.20, we've finalized improvements for boolean columns in our columnstore.
Performance gains:
These improvements show up in benchmarks:
ClickBench Query 6 (URL, PageViews):
SELECT URL, COUNT(*) AS PageViews FROM hits WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND DontCountHits = false AND IsRefresh = false AND URL <> '' GROUP BY URL ORDER BY PageViews DESC LIMIT 10;
ClickBench Query 7 (Title, PageViews):
SELECT Title, COUNT(*) AS PageViews FROM hits WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND DontCountHits = false AND IsRefresh = false AND Title <> '' GROUP BY Title ORDER BY PageViews DESC LIMIT 10;
tsbench (uk_price_paid dataset):
SELECT town, sum(price) FROM uk_price_paid WHERE is_new = true AND duration > 1 GROUP BY 1;
These enhancements mean more efficient storage and faster queries when dealing with boolean flags and conditions in your time-series data.
Getting started with TimescaleDB just got dramatically easier. In earlier versions, creating a hypertable with columnstore and compression often required four separate SQL statements and prior knowledge of how to configure each option. This created a steep learning curve for new users and extra friction for experienced developers looking to spin up new time-series tables quickly.
In TimescaleDB 2.20, we’re collapsing those steps into a single, streamlined CREATE TABLE statement that gets you up and running faster:
CREATE TABLE metrics (
    time timestamptz NOT NULL,
    device text,
    value float
)
WITH (
    timescaledb.hypertable,
    timescaledb.partition_column = 'time',
    timescaledb.chunk_interval = '1 day'
);
This new declarative syntax offers a smarter default experience:
- A single statement turns your metrics table into a hypertable (timescaledb.hypertable).
- Compression is enabled by default, with segmentby and orderby settings chosen automatically based on statistical heuristics when compression first runs.
This means you can get started analyzing time-series data with compression and columnar performance without writing multiple DDL statements or tweaking config knobs upfront.
Note: While compression is now automatically enabled, a compression policy is still not configured by default. That step remains manual (for now)—but we’re working to automate this too in an upcoming release.
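Until that automation lands, you can add the policy yourself. For example (the interval here is purely illustrative):

```sql
-- Compress chunks once their data is older than 7 days.
SELECT add_compression_policy('metrics', INTERVAL '7 days');
```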
The previous APIs for creating hypertables and enabling compression still work and will continue to be supported. This new approach is fully backward-compatible and designed to improve the developer experience, not disrupt existing workflows.
Continuous aggregates are a cornerstone of managing and querying large time-series datasets efficiently. TimescaleDB 2.20 introduces more flexibility and power to how you define and manage them.
We're excited to announce that you can now use window functions and mutable functions within your continuous aggregate query definitions! The support is experimental, as there is a risk of data inconsistency if the user is not careful. For example, in backfill scenarios, buckets could be missed.
Window Functions: Support for window functions is experimental and must be enabled via a GUC (SET timescaledb.enable_cagg_window_functions = on;). This allows for more complex calculations like moving averages, rankings, and period-over-period comparisons directly within your cagg definition.
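For illustration, a continuous aggregate using a window function might look like the following. The view name is hypothetical, it assumes the metrics table from earlier examples, and (per the caution below) the semantics of window functions over refreshed buckets deserve care:

```sql
SET timescaledb.enable_cagg_window_functions = on;

CREATE MATERIALIZED VIEW daily_device_stats
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 day', time) AS bucket,
    device,
    avg(value) AS avg_value,
    -- Rank devices by average value within each daily bucket.
    rank() OVER (PARTITION BY time_bucket('1 day', time)
                 ORDER BY avg(value) DESC) AS device_rank
FROM metrics
GROUP BY bucket, device;
```

Note that a device's rank depends on the other devices in the same bucket, so a partial refresh of a bucket can change previously materialized ranks.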
Mutable Functions: Mutable functions (functions whose results can change even with the same inputs, e.g., random() or now()) are now permitted. The system will emit a warning if you use them in a materialized query.
Why is this important? These functions were previously blocked due to the potential risk of data inconsistency, especially during backfills or re-materializations where historical "states" might not be perfectly reproducible. While these risks still exist and require careful consideration by the user, enabling these functions significantly improves developer experience and allows for a wider range of analytical rollups.
Use with caution: Be mindful of the implications, especially with window functions that depend on specific data ordering, or mutable functions that could lead to non-deterministic aggregate values over time if not handled carefully.
Previously (since the batched refreshes introduced in the 2.19.0 release), when a continuous aggregate refresh policy ran, it would always prioritize materializing the newest data first. This makes sense for many real-time monitoring use cases.
However, some scenarios benefit from refreshing the oldest data first (e.g., backfilling historical data or ensuring completeness of older periods before newer ones). In TimescaleDB 2.20, you now have control over this behavior.
The add_continuous_aggregate_policy function (and related policy modification functions) now includes a new option: refresh_newest_first.
- refresh_newest_first = TRUE (default): refreshes start from the newest data and work backward.
- refresh_newest_first = FALSE: refreshes start from the oldest data and work forward.
This added flexibility allows you to tailor the materialization strategy to your specific application needs.
As announced with the release of TimescaleDB 2.19.0, version 2.20 drops support for PostgreSQL 14.
Please plan your PostgreSQL upgrades accordingly to take advantage of the latest TimescaleDB features.
TimescaleDB 2.20 marks a significant leap forward, reinforcing its position as the fastest PostgreSQL for time-series and demanding workloads. We've delivered substantial speed improvements for columnstore queries and data management without sacrificing the robust capabilities, full SQL, and developer experience you expect from PostgreSQL. Features like SkipScan, bloom filters for point queries, and dramatically faster, more flexible data updates on historic data allow you to tackle demanding workloads with greater efficiency and flexibility than typical OLAP solutions might offer.
For Timescale Cloud users on PostgreSQL 15 and above, these enhancements are already live and available, while self-hosted users can upgrade now to explore these benefits firsthand. Check out our release notes for a full list of improvements and bug fixes. As always, we welcome your feedback in our Community Forum or on GitHub.
Happy querying!