
Back to blog
5 min read
May 08, 2026
Table of contents
01 What You Will Learn02 Why It Matters03 The Join Explosion Problem04 Moving From Relational to Flattened05 Measuring the Performance Gap06 Next StepWhen your Shared Buffer Hits cross 100,000 for a simple dashboard refresh, you have hit the wall of relational scaling. A simple query that used to return in 100ms now takes 1.5 seconds. You’ve added indexes and tuned the buffer pool, but the performance gains are fleeting. Your largest tables crossed 500M rows, and the joins that once felt elegant are now a weight dragging down your entire system.
High-frequency data and deep relational hierarchies eventually collide. By following this guide, you will be able to:
Relational schemas work during small-scale pilots but struggle when dashboards must stitch ten tables together for every refresh, hundreds of times per second, against billions of rows. Complex joins consume excessive CPU cycles, leading to the query speed degradation typical of the optimization treadmill.
Each join forces the database to navigate multiple B-tree indexes and load disparate pages into memory. By flattening data, you reduce the database's computational burden and enable much higher query concurrency.
A normalized schema fragments telemetry into separate tables for readings, hardware IDs, and site data. While efficient for storage, this creates a “join explosion” at scale. On tables with a billion rows, the query planner has to navigate multiple B-tree indexes and perform nested loops, or hash joins, just to assemble a single dashboard view. This forces the database to jump between disparate disk locations, incurring a heavy computational tax every time a user refreshes their screen.
The solution is to move the join cost from read-time to write-time. Instead of stitching tables together during every query, you flatten the data by pre-joining the metadata to the raw reading during ingestion.
Moving from relational to flattened means transitioning from a passive storage model to an active processing model. Instead of a lean ingestion phase followed by expensive reads, you perform the join logic exactly once when the data enters the system. Storing the result in a single, wide table allows the database to perform a single index scan rather than a multi-way join, reducing I/O load and restoring dashboard responsiveness.
The following sections illustrate how to move to a flattened data model, with example SQL commands for a smart building management system. In this scenario, we will move the computational cost of joining sensor data (e.g. temperature, air quality) with building and regional metadata from query-time to write-time.
Start by creating a table that includes the metadata as native columns. This removes the need for downstream joins.
This example query creates the flattened table and a composite index so the database can locate time-series readings and their metadata without expensive lookups across separate tables. This structure allows the database to instantly locate specific data, like North region telemetry, without performing costly table scans.
CREATE TABLE readings_flattened (
ts TIMESTAMPTZ NOT NULL,
sensor_name TEXT,
building_name TEXT,
region TEXT,
value DOUBLE PRECISION
);
CREATE INDEX idx_flattened_ts_region ON readings_flattened (ts DESC, region);
Moving a billion rows at once will lock your database. Use a batch approach to migrate data from your relational tables.
This example query creates a path to migrate historical records from fragmented tables into the new wide format one window at a time, starting with the past thirty days. By pre-joining the data during this migration, you eliminate the need for the database ever to stitch these specific records together again.
INSERT INTO readings_flattened (ts, sensor_name, building_name, region, value)
SELECT
r.ts,
s.sensor_name,
l.building_name,
l.region,
r.value
FROM readings r
JOIN sensors s ON r.sensor_id = s.id
JOIN locations l ON s.location_id = l.id
WHERE r.ts > now() - interval '30 days';
To keep the flattened table up to date without changing your application code, use a database trigger. This ensures that every new reading is pre-joined as it enters the system.
This example query creates an automated trigger that flattens every incoming sensor reading in real time. This ensures that, as new sensor data arrives, it is immediately linked to its building and regional context before being stored.
CREATE OR REPLACE FUNCTION flatten_reading_trigger()
RETURNS TRIGGER AS $$
BEGIN
INSERT INTO readings_flattened (ts, sensor_name, building_name, region, value)
SELECT
NEW.ts, s.sensor_name, l.building_name, l.region, NEW.value
FROM sensors s
JOIN locations l ON s.location_id = l.id
WHERE s.id = NEW.sensor_id;
RETURN NEW;
END;
CREATE TRIGGER trg_flatten_reading
AFTER INSERT ON readings
FOR EACH ROW EXECUTE FUNCTION flatten_reading_trigger();
To see the mechanical benefits of your new flattened table, you need to move beyond measuring execution time, which can fluctuate based on concurrent load. Instead, use the BUFFERS metric in your query plan to observe the physical I/O the database performs. This provides a stable, repeatable measure of the work required to retrieve your data.
To identify the join tax in your system, establish a performance baseline using your existing relational schema. This example query uses the EXPLAIN (ANALYZE, BUFFERS) command to measure the physical memory and I/O work required by a standard multi-way join.
In our example scenario, this query tracks how many data blocks your database engine must touch to assemble the North region dashboard.
EXPLAIN (ANALYZE, BUFFERS)
SELECT r.ts, s.sensor_name, l.building_name, r.value
FROM readings r
JOIN sensors s ON r.sensor_id = s.id
JOIN locations l ON s.location_id = l.id
WHERE r.ts > now() - interval '1 hour' AND l.region = 'North';
Focus on the shared hit count. Each hit represents one 8KB block of data the database had to find and load. If you see tens of thousands of hits for one hour of data, your I/O is the bottleneck.
Validate the architectural shift by running a comparative diagnostic against the new flattened structure. This example query proves the efficiency of data co-location by measuring shared buffer hits. Unlike the relational model, which hunts across multiple indexes, this approach allows the engine to find all necessary data within a single storage layer, significantly reducing the number of blocks touched.
In our example scenario, this query will show the efficiency of your database engine in assembling the North region dashboard with a flattened structure.
EXPLAIN (ANALYZE, BUFFERS)
SELECT ts, sensor_name, building_name, value
FROM readings_flattened
WHERE ts > now() - interval '1 hour' AND region = 'North';
Look for a 70–90% drop in shared hits — the foundation of any real-time analytics workload. This reduction in blocks touched is what allows your dashboard to scale to hundreds of concurrent users without CPU saturation.
Flattening introduces redundancy. If a building is renamed, the flattened table will still contain the old name for historical rows. However, for telemetry and IIoT, metadata is usually static. The massive gain in query concurrency and dashboard speed almost always outweighs the cost of a rare historical update.
Identify the slowest query in your dashboard. Test a flattened version of that dataset using a temporary table to see the I/O savings immediately:
CREATE TEMPORARY TABLE test_flattened AS
SELECT r.ts, s.sensor_name, l.building_name, l.region, r.value
FROM readings r
JOIN sensors s ON r.sensor_id = s.id
JOIN locations l ON s.location_id = l.id
LIMIT 1000000;
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM test_flattened WHERE region = 'North';
Compare the shared hit count of this test against your production query. If the savings are significant, it is time to move toward a denormalized architecture. For workloads exceeding billions of rows, see how Tiger Data does this flattening for you with hybrid row-columnar storage. Start a Tiger Cloud free trial today to see these savings for yourself.

Row vs Columnar Storage for Analytics: Why PostgreSQL Scans Are Slower Than They Should Be
Learn why PostgreSQL reads 16x more data than your queries need, and how a hybrid row-columnar storage layout eliminates the bottleneck without changing your SQL.
Read more
Receive the latest technical articles and release notes in your inbox.