Oct 01, 2025

Everyone eventually asks the same “easy” questions: How many distinct things do I have right now? How many active devices? Which trading pairs saw activity today? What’s the latest reading per sensor? On PostgreSQL at scale, those questions turn from harmless SQL into multi‑second waits because DISTINCT usually means walking every qualifying row and deduplicating afterward. Even with an ordered index, the engine traverses the full range and pushes keys through a unique step—O(N) work when N is tens or hundreds of millions.
We built SkipScan to flip that script. Instead of visiting every row, SkipScan uses the B‑tree’s order to jump from one distinct value to the next: find device = 1, output it, then restart the index at device > 1, and so on. The cost becomes O(K × log N), where K is the number of distinct values. When K ≪ N, you get millisecond answers instead of coffee‑breaks.
SkipScan first landed in TimescaleDB 2.2.0 for rowstores. With 2.20.0, we extended it to columnstore hypertables and to distinct aggregates like COUNT(DISTINCT …), leveraging PostgreSQL 16's ability to feed presorted inputs into ORDER BY/DISTINCT aggregates. With 2.22.0, we added multi-column SkipScan for queries whose distinct columns are guaranteed to be not-null. What follows is the problem it solves, the design that makes columnstore skipping safe and fast, and the exact queries and plans you can reproduce.
Rowstores are bad enough for DISTINCT at large N. Columnstores raise the stakes: data are packed in compressed batches (≤1,000 rows) and reading a single tuple can imply decompressing a whole batch. If we naïvely walked every batch to deduplicate, compression would help storage but hurt latency.
To make DISTINCT fast on columnstore, the executor has to skip whole batches, not just rows: jump directly to the first batch for each distinct value, decompress only what it actually needs, and leap over everything in between. That's the essence of SkipScan on columnstore.
Columnstore chunks maintain an automatic B‑tree on (segmentby, orderby) (e.g., (device, time DESC)), plus per‑batch metadata columns like _ts_meta_min_1 and _ts_meta_max_1 for time. SkipScan hooks into that structure:
- Seek to the first batch for the current device using the (segmentby, orderby) index; for “latest per device,” this is typically the batch with the newest time for that device.
- Emit the distinct value (e.g., for DISTINCT ON (device) ... ORDER BY device, time DESC).
- Re-seek with device > current_device to hop to the next device’s first batch.
- Combine with batch metadata predicates (e.g., _ts_meta_max_1 > now() - '1 hour') so entire stretches of batches are skipped without decompression.

There’s one important constraint: SkipScan on columnstore applies when the DISTINCT key is the leading segmentby column (or the leading columns for multi-key SkipScan). That’s what allows safe jumps.
Note: by the leading index column we mean the first index column actually used by a query. For an index on (a, b), b is the leading index column in a query like SELECT DISTINCT a, b FROM t WHERE a = 1, because a is pinned by the equality filter.
Under the covers you’ll see a Custom Scan (SkipScan) node above an index scan and (when needed) a DecompressChunk. After emitting a distinct value, the executor rewrites its start condition from col >= v to col > v and re‑seeks—classic B‑tree hopscotch.
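The hopscotch above can be sketched in a few lines of Python. This is an illustrative model, not TimescaleDB internals: a sorted list stands in for the B‑tree, and each bisect_right call models restarting the index scan with col > v:

```python
from bisect import bisect_right

def skip_scan_distinct(sorted_keys):
    """Emit each distinct key using one binary-search re-seek per value."""
    distinct = []
    pos = 0
    while pos < len(sorted_keys):
        v = sorted_keys[pos]                      # first key at the seek point
        distinct.append(v)                        # emit the distinct value
        pos = bisect_right(sorted_keys, v, pos)   # re-seek with key > v
    return distinct

# 10 devices x 1000 rows: ~10 re-seeks instead of 10,000 row visits
keys = sorted(d for d in range(10) for _ in range(1000))
assert skip_scan_distinct(keys) == list(range(10))
```

Each loop iteration costs one O(log N) search, so the whole pass is O(K × log N), matching the estimate above.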
Table
CREATE TABLE metrics (
  time timestamptz NOT NULL,
  device integer,
  value double precision
);
Rowstore index
CREATE INDEX ON metrics(device, time DESC);
Scale: ~23,328,010 rows; device has 10 distinct values.
Columnstore layout
ALTER TABLE metrics SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'device',
  timescaledb.compress_orderby = 'time DESC'
);
SELECT compress_chunk(ch) FROM show_chunks('metrics') ch;
On one representative compressed chunk we see 5,370 batches totaling 5,364,693 compressed rows and an index on (device, _ts_meta_min_1 DESC, _ts_meta_max_1 DESC).
Complexity notes: the O(·) terms are theoretical guides. PostgreSQL uses B‑trees (multi‑way), so constants, caching, and pruning matter. We use them to show scale, not to predict exact milliseconds.
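As a back-of-envelope illustration of that caveat (numbers from the example table; this is a scale argument, not a latency model):

```python
import math

N = 23_328_010             # rows in the example table
K = 10                     # distinct devices

seeks = K * math.log2(N)   # idealized SkipScan work: ~K tree descents
print(f"~{seeks:.0f} seeks vs {N:,} row visits")
# The idealized ratio is far larger than the measured speedups below:
# constants, caching, and multi-way B-tree fanout absorb the difference.
```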
Counting devices (rowstore)
SELECT COUNT(DISTINCT device) FROM metrics;
With SkipScan: 3.430 ms.
Without (vanilla path): 6756.322 ms (~6.8 s).
Same 23M‑row table; SkipScan performs roughly 2000× faster by doing ~K log‑seeks instead of scanning all N entries.
Latest reading per device (columnstore)
SELECT DISTINCT ON (device) device, time, value
FROM metrics
ORDER BY device, time DESC;
With SkipScan: 4.589 ms by touching the first batch per device.
Without: 3912.069 ms—the engine decompresses and inspects ~5.3 M tuples to deduplicate.
Latest reading per device in the last hour (columnstore + metadata pruning)
SELECT DISTINCT ON (device) device, time, value
FROM metrics
WHERE time > now() - INTERVAL '1 hour'
ORDER BY device, time DESC;
With SkipScan: 2.537 ms.
Without: 9.236 ms.
The gap narrows because _ts_meta_max_1 lets the planner discard 5,330/5,370 batches up front; SkipScan still wins by visiting only the first qualifying batch per device.
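The batch-pruning behavior can be modeled the same way. In this Python sketch (hypothetical structures, not the real executor; iteration with continue stands in for the index re-seek), each batch carries min/max time metadata, and only the first qualifying batch per device is “decompressed”:

```python
from collections import namedtuple

# rows stay "compressed" until we choose to inspect them
Batch = namedtuple("Batch", "device meta_min meta_max rows")

def latest_per_device(batches, cutoff):
    """DISTINCT ON (device) ... WHERE time > cutoff, driven by batch metadata.

    batches are ordered (device, meta_max DESC), mirroring the
    (segmentby, orderby) layout.
    """
    latest, decompressed = {}, 0
    for b in batches:
        if b.device in latest:       # distinct value already emitted: skip
            continue
        if b.meta_max <= cutoff:     # metadata pruning: no row can qualify
            continue
        decompressed += 1            # only now pay the decompression cost
        latest[b.device] = max(t for t in b.rows if t > cutoff)
    return latest, decompressed

batches = [
    Batch(1, 90, 100, [95, 100]), Batch(1, 0, 50, [10, 50]),
    Batch(2, 80, 99,  [85, 99]),  Batch(2, 0, 40, [5, 40]),
]
result, n_decompressed = latest_per_device(batches, cutoff=60)
assert result == {1: 100, 2: 99} and n_decompressed == 2
```

Only 2 of the 4 batches are ever opened: one per device, exactly as in the plans above.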
Thresholded latest reading (columnstore; tuple‑level predicate)
SELECT DISTINCT ON (device) *
FROM metrics
WHERE value > 50
ORDER BY device, time DESC;
With SkipScan: 45.392 ms after scanning ~3,002 decompressed tuples on the first chunk.
Without: 2145.536 ms after touching ~6,303,586 tuples across chunks.
When a predicate can’t be answered by batch metadata, SkipScan still stops early per device once a match is found.
In IoT fleets, “show me the latest per device” powers every status board. On rowstores it’s a recurring spike; on columnstore without SkipScan it can become a wall. With SkipScan, the query returns in a few milliseconds by decompressing only on demand.
At exchanges and brokerages, COUNT(DISTINCT symbol) and “latest quote per pair” show up in every dashboard and alert. With 200M+ rows and hundreds of thousands of pairs, SkipScan changes the economics of distinct—turning once‑expensive questions into cheap, always‑on queries.
In crypto analytics, analysts slice by wallet, contract, or chain. Setting the segmentby to the entity you dedupe most often (e.g., address) lets SkipScan jump entity‑to‑entity, keeping answer times stable even under heavy ingest.
Multi-column SkipScan support was added in 2.22.0, with a caveat: it can only be chosen for queries that cannot produce NULL distinct values. In other words, a query with multiple distinct columns is eligible for multi-column SkipScan only when NULLs cannot appear among those distinct values.
Examples:
CREATE INDEX ON metrics(region, device, metric_type);
-- All distinct columns have filters which don't allow NULLs: can use SkipScan
SELECT DISTINCT ON (region, device, metric_type) *
FROM metrics
WHERE region IN ('UK','EU','JP') AND device > 1 AND metric_type IS NOT NULL
ORDER BY region, device, metric_type, time DESC;
-- Distinct columns are declared NOT NULL: can use SkipScan
CREATE TABLE metrics(region TEXT NOT NULL, device INT NOT NULL, ...);
SELECT DISTINCT ON (region, device) *
FROM metrics
ORDER BY region, device, time DESC;
Multi-column SkipScan skips over the current key values much like single-column SkipScan: it freezes the values of the keys preceding the current key, allows any values for the keys following it, and searches for the first current-key value greater than the current one.
When no more values are found for the current key, SkipScan moves to the previous key and repeats the process, or is done when there are no more previous keys.
For the above example, suppose we are currently scanning the values (AUS, 2, >56) for the tuple (region, device, metric_type). We’ve found the next tuple (AUS, 2, 59) and we search for (AUS, 2, >59) now.
If no more tuples are found, we switch the search to (AUS,>2, “any value”), find the next tuple (AUS,5,3) and search for (AUS,5,>3), etc.
If no more tuples for AUS are found we search for (>AUS, “any value”, “any value”), suppose we find (BR,4, 15) next, and restart the search for (BR,4,>15), etc.
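That cascade can be approximated in Python over sorted tuples. In this toy model (an assumption, not the real executor, which cascades per key and handles types generically), a single binary search with an infinity sentinel stands in for the “any value” re-seek:

```python
from bisect import bisect_right

INF = float("inf")   # sentinel for "any value" in trailing (numeric) columns

def multi_skip_distinct(rows, n_keys):
    """Emit distinct leading-key tuples from rows sorted on all columns."""
    out, pos = [], 0
    while pos < len(rows):
        key = rows[pos][:n_keys]     # freeze the leading key values
        out.append(key)
        # re-seek past (key, any, ...): first row whose key tuple is greater
        probe = key + (INF,) * (len(rows[pos]) - n_keys)
        pos = bisect_right(rows, probe, pos)
    return out

rows = [("AUS", 2, 56), ("AUS", 2, 59), ("AUS", 5, 3),
        ("BR", 4, 15), ("BR", 4, 20)]
assert multi_skip_distinct(rows, 2) == [("AUS", 2), ("AUS", 5), ("BR", 4)]
```

The search jumps (AUS, 2, …) → (AUS, 5, …) → (BR, 4, …) exactly as in the walkthrough above, without visiting the rows in between.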
The reason we don’t use SkipScan when NULL values are possible in distinct columns is the complexity of determining “any value” in the examples above. Because PostgreSQL sorts NULLs separately from other values, we would have to check NULL and NOT NULL ranges separately for each key to exhaust the “any value” possibilities.
Checking it for 2 keys means doing 4 checks for (NULL, NULL), (NULL, NOT NULL), (NOT NULL, NULL), (NOT NULL, NOT NULL). Checking it for M keys means 2^M complexity.
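The blow-up is simply the number of NULL/NOT NULL patterns the “any value” search would have to exhaust, e.g.:

```python
from itertools import product

def null_patterns(n_keys):
    """All NULL/NOT NULL combinations to cover for n_keys distinct keys."""
    return list(product(("NULL", "NOT NULL"), repeat=n_keys))

assert len(null_patterns(2)) == 4    # the four cases listed above
assert len(null_patterns(5)) == 32   # 2**M growth with M keys
```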
That is a lot of complexity to address for an edge case, since users usually do not care about NULL devices or regions.
To take advantage of multi-column SkipScan, it’s enough to add an IS NOT NULL check, or to declare the distinct columns NOT NULL if no NULL values are anticipated in them. Strict conditions on distinct columns, such as a > 1 or b IN (1, 2, 3), already rule out NULLs, so SkipScan can be applied to those columns.
Design your layout so the column you deduplicate is first:
- Rowstore: create an index leading with that column (e.g., (device, time DESC)).
- Columnstore: set timescaledb.compress_segmentby to that column (or columns for multicolumn SkipScan) and compress_orderby to match your query’s sort (e.g., time DESC). Then compress your historical chunks.

Verify with EXPLAIN: look for Custom Scan (SkipScan) above the index/decompress nodes. If the planner is shy, you can nudge it:
SET enable_seqscan = false; -- prefer indexes
SET timescaledb.skip_scan_run_cost_multiplier = 0; -- bias for SkipScan (2.20+)

To check whether SkipScan was used in a query, and if so, on which keys and whether those keys are NOT NULL, set the following flag:
SET timescaledb.debug_skip_scan_info TO true; -- (2.22+)
-- NULLs are excluded via ">1"
SELECT DISTINCT ON (device) device, time, value
FROM metrics
WHERE device > 1
ORDER BY device, time DESC;
-- SkipScan key info is output
INFO: SkipScan used on metrics_device_idx(device NOT NULL)
-- NULLs are not excluded
SELECT DISTINCT ON (device) device, time, value
FROM metrics
ORDER BY device, time DESC;
-- SkipScan key info is output
INFO: SkipScan used on metrics_device_idx(device NULLS LAST)
When it doesn’t apply: multiple NULL-allowing distinct keys at once, orders that don’t start with the distinct column, or a columnstore where your distinct key isn’t the leading segmentby.
-- Ordered index for rowstore
CREATE INDEX ON metrics(device, time DESC);
-- Ordered index for columnstore based on segmentby + orderby settings
ALTER TABLE metrics SET (timescaledb.compress, timescaledb.compress_orderby='time DESC', timescaledb.compress_segmentby='device');
SELECT compress_chunk(ch) FROM show_chunks('metrics') ch;
These queries can use SkipScan:
SELECT DISTINCT ON (device) * FROM metrics;
SELECT DISTINCT ON (device) * FROM metrics ORDER BY device, time DESC;
SELECT DISTINCT ON (device, value) * FROM metrics WHERE value = 10;
SELECT count(DISTINCT device), max(DISTINCT device) FROM metrics;
These queries cannot use SkipScan, either because the distinct columns are not in the index, non-distinct aggregates are present, or the index doesn’t match the query order:
SELECT DISTINCT ON (time) * FROM metrics;
SELECT DISTINCT ON (device) * FROM metrics ORDER BY device, time;
SELECT DISTINCT ON (device, value) * FROM metrics;
SELECT count(DISTINCT device), max(device) FROM metrics;
SELECT count(DISTINCT device), count(DISTINCT value) FROM metrics;
Millisecond‑fast deduplication on datasets measured in billions of rows; far fewer tuples decompressed per query; predictable dashboard SLOs under bursty ingest; and no application changes—just the right index or segment layout. Rowstore users get wins back to 2.2.0; columnstore and distinct‑aggregate users get the full experience on 2.20.0+ (with PostgreSQL 16+), and multicolumn SkipScan support for not-null distinct values added in 2.22.0.
Skip less relevant rows. Skip entire batches. Skip the scan.
Sign up for a free trial today to see the benefit for your own queries and workloads.
About the authors
Natalya Aksman is a senior developer at TigerData, focusing on query optimization and performance. Natalya joined TigerData recently, after working for many years on the Vertica columnar database, and before that on Sybase IQ, one of the earliest columnar databases in the industry.
Natalya is most passionate about optimizing the performance of analytical databases and streamlining query planning and execution. She holds a Master of Applied Mathematics from Moscow University and a Master of Computer Science from Northeastern University in Boston.
Noah Hein is a Senior Product Marketing Engineer at TigerData, where he helps developers understand, adopt, and succeed with the fastest PostgreSQL platform for real‑time and analytical workloads. Day‑to‑day, he translates deep technical capabilities—like hypertables, hypercore compression, and continuous aggregates—into clear product narratives and customer stories that drive adoption and growth.
Before joining TigerData, Noah spent several years on the “builder” side of the house as both a founding engineer and an educator. He co‑created Latent Space’s three‑week AI Engineering Fundamentals course and has taught hundreds of engineers how to apply LLMs in production. Noah frequently speaks on AI‑data convergence topics; at the first ever AI Engineer Summit he led the “AI Engineering 101” workshop, walking participants through hands‑on projects.
Outside of work, Noah tries to help more people land jobs with his side project JobMosaic. When he’s not crafting launch posts, you’ll find him experimenting with edge‑AI devices, tinkering with homelab Postgres clusters, or giving impromptu botany lessons to anyone who will listen.