
Back to blog
5 min read
Apr 24, 2026
Table of contents
01 What You Will Learn02 Why It Matters03 Audit Your p95 Latency04 Migrate High-Access Metadata05 Configure Columnar Compression06 Measuring the Join Tax07 Next StepStandard relational normalization is the bedrock of database design. It prevents data duplication and maintains integrity by splitting information into specialized tables. However, as your tables approach the 500 million row mark, the very joins that keep your data clean begin to degrade performance. You may notice p95 latencies creeping up even though your queries remain logically sound. This guide explains how to identify when relational purity is holding you back and how to use data flattening and columnar compression to reclaim your speed.
This guide breaks down the mechanics of read amplification and join overhead in high-volume systems. You will learn:
Standard normalization relies on joins to reconstruct data at query time. On small datasets, this is efficient. On tables with billions of rows, every join adds a layer of I/O and CPU overhead that compounds as your data volume grows. Row-based storage forces the database to read every column in a row, even if your query only needs two, leading to massive read amplification.
Flattening your schema, moving frequently joined metadata directly into your main ingestion table, removes the need for these expensive joins. While this usually increases storage size in a row-based system, specialized columnar storage allows for aggressive compression of this redundant data. By choosing the right schema architecture, you can significantly boost the responsiveness of your real-time dashboards and analytics.
Start by identifying the queries that drive your latency spikes. Use the pg_stat_statements extension to find queries with high total execution time. Run EXPLAIN (ANALYZE, BUFFERS) on these queries to see how the database interacts with the storage layer.
Look for "Nested Loop" or "Hash Join" operations where the "Shared Hit Blocks" are high. If the database spends 80% of its time matching keys between a 500M row metrics table and a 10k row device table, your architecture is likely CPU-bound by join coordination. A high number of "read" buffers in the output indicates that the database is hunting through indexes and table pages to find metadata that isn't stored locally with the record.
When you see a query plan where the join cost increases linearly with the size of the primary table, you have identified a "join tax" that cannot be solved with more memory or faster disks. At this stage, the overhead of managing the relationship between tables is outweighing the benefits of relational purity.
Identify the metadata columns that appear most often in your WHERE and JOIN clauses.
Then, add the new denormalized column to your high-frequency ingestion table:
ALTER TABLE device_metrics ADD COLUMN region TEXT;Next, backfill the existing rows. On tables with hundreds of millions of rows, a single UPDATE statement can cause massive transaction log bloat and table locks. Instead, backfill in batches or use a join-based update to pull metadata from your reference table:
UPDATE device_metrics m
SET region = d.region
FROM devices d
WHERE m.device_id = d.id
AND m.region IS NULL;This migration eliminates the join at query time, allowing the database to filter and aggregate in a single pass. By localizing the data, you ensure the database engine no longer needs to load secondary table pages into the buffer pool just to check a filter condition.
After flattening your table, you must enable columnar storage to handle the redundant metadata. In Tiger Data, this is achieved through a hybrid storage engine that partitions data into "chunks" and then compresses those chunks into a columnar format. This storage engine groups data by column rather than row, allowing the database to ignore unused columns during a scan and reducing the physical I/O required for every query.
To implement this, you first define your compression policy. You must choose a segmentby column, typically a device ID or primary key, and an orderby column, usually a timestamp. The segmentby column is the most important for performance; it determines how data is grouped within the compressed segments.
-- 1. Enable compression on the hypertable
ALTER TABLE device_metrics SET (
timescaledb.compress = true,
timescaledb.compress_orderby = 'ts DESC',
timescaledb.compress_segmentby = 'device_id, region'
);
-- 2. Add a policy to compress data older than 7 days
SELECT add_compression_policy('device_metrics', INTERVAL '7 days');
When the compression policy runs, the database transforms the row-based data into compressed columnar batches. For a flattened table, this is where the performance "magic" happens. If you have 100 million rows where the region is 'north_east', a row-based engine stores that string 100 million times. The columnar engine stores it once along with a metadata bitmask, reducing the storage footprint of that column by up to 99%.
You can verify the effectiveness of your compression and the reduction in read amplification by querying the compression statistics:
SELECT
total_chunks,
number_compressed_chunks,
pg_size_pretty(before_compression_total_bytes) AS before_size,
pg_size_pretty(after_compression_total_bytes) AS after_size
FROM hypertable_columnstore_stats('device_metrics');
This architectural shift moves your database from being I/O bound to being CPU efficient. By reducing the "Shared Hit Blocks" (the number of 8KB pages the database must load into memory), you free up the buffer pool for other critical operations and effectively raise the performance ceiling of your entire system.
You can measure the performance gap between a normalized join and a flattened table by comparing their execution costs. Consider a system tracking millions of IoT sensors that needs to find the average reading for a specific region.
This query must join two tables, forcing the database to match foreign keys for every row in the time range.
SELECT avg(m.value), d.region
FROM device_metrics m
JOIN devices d ON m.device_id = d.id
WHERE m.ts > now() - interval '1 hour.'
AND d.region = 'north_east'
GROUP BY d.region;
By moving the region column into the device_metrics table and using columnar compression, the join disappears. The database only reads the value and region columns from disk, ignoring the rest of the row.
SELECT avg(value), region
FROM device_metrics
WHERE ts > now() - interval '1 hour.'
AND region = 'north_east'
GROUP BY region;
Identify your most expensive join query using pg_stat_statements. Create a denormalized version of that table that includes the necessary metadata, and test query performance against a compressed columnar version. This test will help you determine if your system is hitting a structural ceiling that only an architectural change can fix.
To try this on a fully managed instance, start a free Tiger Cloud trial. You can compare and contrast query performance on real data without impacting your production database.
Receive the latest technical articles and release notes in your inbox.