---
title: "How Small Postgres Metadata Tables Quietly Throttle Your Largest Queries"
published: 2026-06-29T09:34:12.000-04:00
updated: 2026-06-29T09:34:12.000-04:00
excerpt: "Stale statistics on small Postgres metadata tables can silently throttle your largest queries. Learn to spot it with EXPLAIN and fix it—no schema migration."
tags: PostgreSQL, PostgreSQL Performance
authors: NanoHertz Communications
---

> **TimescaleDB is now Tiger Data.**

Most time-series performance work starts with the obvious suspect: the fact table. Five hundred million rows of sensor readings, hundreds of daily partitions, an autovacuum worker fighting write load. That is where engineers go first. That is where the [Optimization Treadmill](https://www.tigerdata.com/blog/postgres-optimization-treadmill) lives. But after reading this guide, you'll know how to find a different class of slowdown hiding in the opposite direction, detect it in under five minutes with [EXPLAIN](https://www.tigerdata.com/learn/explaining-postgresql-explain), and apply a same-day fix that requires no schema migration.

A 10,000-row device metadata table, rarely updated, sitting in the `FROM` clause of every dashboard query you run, can drag your entire time-series query path into the wrong performance tier. The time-series scan looks fast in `EXPLAIN`. The join node is where the time goes. The fix requires looking at a table that almost nobody thinks to audit.

## Why it matters

Postgres triggers ANALYZE on a table based on `autovacuum_analyze_scale_factor`, which defaults to 0.2. That means 20% of the table's rows need to change before autovacuum schedules a statistics refresh. On a 10,000-row `devices` table, that threshold is 2,000 rows. If your device metadata only sees occasional inserts as you onboard new sensors or record location changes, `ANALYZE` might have run once in the last quarter.

Meanwhile, `pg_statistic` still describes the distribution of `device_id` values your planner saw six months ago. Your 10,000-row table looks like a 600-row table to the query planner. Every query joining it runs a plan built for a much smaller dataset, and that plan doesn't improve on its own.

The symptom compounds constantly. Every dashboard refresh, every alert evaluation, every hourly export job runs this join. Stale statistics on a small metadata table are not a one-time cost. They tax your entire query workload invisibly, on every execution. The decision about how to execute that join is made by the query planner once, based on what `pg_statistic` describes at plan time.

## How Postgres picks a join strategy

When the planner receives a [query with a join](https://www.tigerdata.com/learn/postgresql-join-type-theory), it chooses between execution strategies based on estimated row counts from `pg_statistic`. A nested loop join works well when the inner table is small and the join condition hits an index. A hash join builds an in-memory hash table and scans both relations once.

When your `devices` table has 10,000 rows but `pg_statistic` thinks it has 600, the planner's cost model produces the wrong answer. It may choose a nested loop when the dataset is large enough to warrant a hash join, or allocate an undersized hash table that spills to disk. The direction of the error matters less than the consequence: a fast indexed scan on your fact table becomes the cheaper half of a misplanned join, and the join node pays the penalty.

This pushes your query from the indexed access tier, where performance scales with log(n), into the linear tier, where it scales with the full dataset the planner miscalculated. A join that ran in 20ms when the planner had accurate statistics can degrade to 4 seconds as the mismatch grows, with no change to the fact table, the indexes, or the query itself. The [IIoT performance envelope](https://www.tigerdata.com/blog/the-iiot-postgresql-performance-envelope) maps this degradation curve in detail.

## Finding the problem

Run `EXPLAIN (ANALYZE, BUFFERS)` on any query that joins your fact table to a metadata table:

```SQL
Run EXPLAIN (ANALYZE, BUFFERS) on any query that joins your fact table to a metadata table:

EXPLAIN (ANALYZE, BUFFERS)
SELECT
    r.ts,
    r.value,
    d.location
FROM readings r
JOIN devices d ON r.device_id = d.id
WHERE r.ts > now() - interval '1 hour'
  AND d.facility = 'plant-a';
```

Look at the join node in the output. Postgres shows two numbers for every node: rows=X (estimated) and actual rows=Y. A ratio of actual to estimated above 5:1 in either direction generally indicates the planner is working with stale information.

A real mismatch looks like this:

```SQL
Hash Join  (cost=45.00..12890.43 rows=620 width=48)
           (actual time=0.821..2847.332 rows=9841 loops=1)
  ->  Seq Scan on devices  (cost=0.00..42.20 rows=620 width=24)
                           (actual time=0.015..0.441 rows=9841 loops=1)
```

The planner expected 620 rows from `devices`. It got 9,841. That gap is why the hash table bucketing is wrong, why memory allocation is undersized, and why a join that should take 150ms ran for 2.8 seconds. The fact table scan was fast. The join node absorbed all the cost. 

## Fix 1: Force fresher statistics

The fastest fix lowers the scale factor on the metadata table so `ANALYZE` fires more aggressively:

```SQL
ALTER TABLE devices SET (
    autovacuum_analyze_scale_factor = 0.01,
    autovacuum_analyze_threshold = 50
);
```

With these settings, `ANALYZE` fires after roughly 150 rows change rather than 2,000. For a table that sees occasional updates as you onboard new sensors, this keeps statistics current without adding meaningful autovacuum load. The `devices` table is not your hot write path.

Run a manual `ANALYZE` immediately to rebuild statistics from current data:

```SQL
ANALYZE devices;
```

Then re-run the `EXPLAIN` query. The estimated rows on the join node should now match actual rows closely, and the join strategy will likely change.

## Fix 2: Audit all small metadata tables

The same problem applies to any metadata table that grows slowly but gets joined to your large fact tables (e.g. a `facilities` table, an `equipment_types` table, a `tag_groups` table). This query finds the candidates:

```SQL
SELECT
    schemaname,
    relname,
    n_live_tup AS estimated_rows,
    last_analyze,
    last_autoanalyze
FROM pg_stat_user_tables
WHERE n_live_tup < 100000
  AND (
      last_analyze < now() - interval '7 days'
      OR last_analyze IS NULL
  )
ORDER BY n_live_tup DESC;
```

Any metadata table with more than a week between `ANALYZE` runs and fewer than 100,000 rows is worth checking. Identify which ones appear in joins to large fact tables, then apply the scale factor adjustment to each. This is a one-time change with no ongoing maintenance burden.

## Fix 3: Denormalize for the highest-volume joins

When a join runs on every query in your workload, the most durable fix is to remove it entirely. For stable metadata like `location` or `facility`, you can store the value directly on the `readings` row:

```SQL
-- Add the column to the fact table
ALTER TABLE readings ADD COLUMN location TEXT;

-- Backfill from the current join
UPDATE readings r
SET location = d.location
FROM devices d
WHERE r.device_id = d.id;
```

The original query then becomes:

```SQL
SELECT ts, value, location
FROM readings
WHERE ts > now() - interval '1 hour'
  AND location = 'plant-a';
```

No join. No statistics dependency. No planning variability. The tradeoff is keeping `location` current when a device moves. The simplest pattern is a trigger on `devices`:

```SQL
CREATE OR REPLACE FUNCTION sync_device_location()
RETURNS TRIGGER AS $$
BEGIN
    UPDATE readings
    SET location = NEW.location
    WHERE device_id = NEW.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER devices_location_sync
AFTER UPDATE OF location ON devices
FOR EACH ROW
WHEN (OLD.location IS DISTINCT FROM NEW.location)
EXECUTE FUNCTION sync_device_location();
```

This works well when device location changes are infrequent. If `readings` has hundreds of millions of rows and devices move regularly, an application-layer update run as a background job is safer: it keeps the writes out of the transaction that updates `devices` and avoids locking large row sets. A chunked loop limits I/O pressure:

```SQL
DO $$
DECLARE
    part         TEXT;
    rows_updated INT;
BEGIN
    FOR part IN
        SELECT child.relname
        FROM pg_inherits
        JOIN pg_class child  ON pg_inherits.inhrelid  = child.oid
        JOIN pg_class parent ON pg_inherits.inhparent = parent.oid
        WHERE parent.relname = 'readings'
        ORDER BY child.relname
    LOOP
        EXECUTE format(
            'UPDATE %I r
             SET location = d.location
             FROM devices d
             WHERE r.device_id = d.id
               AND r.location IS DISTINCT FROM d.location',
            part
        );
        GET DIAGNOSTICS rows_updated = ROW_COUNT;
        COMMIT;
        PERFORM pg_sleep(0.1);
    END LOOP;
END $$;
```

The `IS DISTINCT FROM` check skips rows where the value already matches, so the loop terminates cleanly even if it runs across a restart. Either way, the application code that previously ran the join query gets simpler, not more complex.

## Validating the fix

After applying Fix 1 or Fix 3, re-run the original `EXPLAIN (ANALYZE, BUFFERS)`. A correctly-planned query shows estimated rows close to actual on the join node and a shorter total execution time. Here is what the output looks like after `ANALYZE` with corrected statistics:

```SQL
Hash Join  (cost=178.00..13210.55 rows=9930 width=48)
           (actual time=1.243..312.881 rows=9841 loops=1)
  ->  Seq Scan on devices  (cost=0.00..172.30 rows=9930 width=24)
                           (actual time=0.018..0.892 rows=9841 loops=1)
```

The estimated rows now match actual rows closely (9,930 vs. 9,841). Total execution time dropped from 2.8 seconds to 312ms. The hash table was allocated correctly from the start.

For a before-and-after comparison across your full query workload, use `pg_stat_statements`:

```SQL
SELECT
    query,
    calls,
    mean_exec_time,
    total_exec_time / 1000 AS total_sec
FROM pg_stat_statements
WHERE query ILIKE '%devices%'
  AND query ILIKE '%readings%'
ORDER BY mean_exec_time DESC
LIMIT 5;
```

Run this before the fix, then again 24 hours later once autovacuum has refreshed statistics and the query plan cache has turned over. The mean execution time change for your slowest join queries is your signal. 

## Your next step

Open psql and run the audit query from Fix 2. It takes under a minute, requires no permissions beyond `SELECT` on `pg_stat_user_tables`, and identifies every underanalyzed metadata table in your schema at once. For each table that comes back, check whether it joins to a large fact table in your workload, then apply the scale factor fix from Fix 1. No maintenance window, no schema migration. The new statistics take effect the moment `ANALYZE` completes.

If the audit returns nothing but your join queries are still slow, the bottleneck has moved. The [Optimization Treadmill](https://www.tigerdata.com/blog/postgres-optimization-treadmill) maps the next layer of performance constraints in high-volume Postgres workloads and what to do when the metadata tables aren't the problem. Start a [Tiger Data free trial](https://console.cloud.timescale.com/signup) today to validate these fixes on your own data.