When Continuous Ingestion Breaks Traditional Postgres

PostgreSQL
By Matty Stratton

6 min read

Mar 13, 2026

Table of contents

01 What "breathing room" actually means in Postgres
02 The maintenance competition problem
03 WAL as the throughput ceiling you can't tune past
04 Why the standard toolkit doesn't solve this
05 The workloads where this actually matters
06 Recognizing the pattern early


Your system writes data constantly. Not in jobs. Not in batches. A stream that runs at 3am the same as it runs at 3pm. IoT sensors. Trade feeds. Metrics collectors. The data never stops.

For a while, Postgres handles it fine. Then you start noticing things. Autovacuum is always running. Write latency has a pattern you can't explain by traffic alone. Maintenance tasks that used to take minutes now take hours. And the really annoying part: nothing is misconfigured.

You check the usual suspects. Indexes are correct. Query plans look reasonable. Configs follow best practices. A colleague confirms the same.

The problem isn't a missing index or a bad query plan. The problem is that Postgres was designed with a quiet period baked into its assumptions. Your system eliminated that quiet period. Now you're paying for it.

What "breathing room" actually means in Postgres

Most database systems are designed around a workload shape that includes peaks and valleys. Peaks are when users are active. Valleys are when the database catches up.

Postgres maintenance is built around the valley.

Autovacuum runs more aggressively when the database is quiet. ANALYZE refreshes statistics without competing for I/O. Checkpoint cycles complete cleanly. WAL accumulation clears out. The buffer cache warms up on predictable patterns.

Batch ETL fits this model perfectly. A nightly job writes data for two hours. The database writes, then rests, then writes again. Maintenance runs in the gaps. Everything resets before the next cycle starts.

Continuous ingestion has no gaps. The window that used to be quiet at 2am is now the same as the window at 2pm. Every maintenance process that depends on quiet time now runs in direct competition with writes. All day. All night.

The maintenance competition problem

Three maintenance processes need quiet time and don't get it under continuous ingestion.

Autovacuum. Even on append-only tables, autovacuum fires continuously at high insert rates. Since PostgreSQL 13, inserts themselves trigger autovacuum to freeze tuples and update the visibility map. This isn't about dead tuples from updates or deletes. It's insert-driven vacuum, and under continuous ingestion the work it's trying to catch up on never stops arriving.

At 50K inserts/second, autovacuum never finishes a cycle before the next one starts. It competes for I/O with your writes. When it loses, bloat accumulates. When it wins, write latency spikes.

There's no configuration fix for this. You can tune autovacuum_vacuum_cost_delay and autovacuum_max_workers all day. What you're tuning is how autovacuum loses gracefully. Not how it stops competing.
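You can watch this competition in your own system through the insert-driven vacuum knobs and the per-table counters. A minimal sketch, assuming a hypothetical append-heavy table called sensor_readings; the numbers are illustrative, not recommendations:

-- Per-table overrides for insert-driven autovacuum (PostgreSQL 13+).
ALTER TABLE sensor_readings SET (
  autovacuum_vacuum_insert_threshold    = 100000,  -- inserts before an insert-driven vacuum fires
  autovacuum_vacuum_insert_scale_factor = 0.05,    -- plus 5% of the table's estimated rows
  autovacuum_vacuum_cost_delay          = 2        -- how politely vacuum yields I/O, in milliseconds
);

-- How far behind insert-driven vacuum is right now, per table.
SELECT relname, n_ins_since_vacuum, last_autovacuum, autovacuum_count
FROM pg_stat_user_tables
ORDER BY n_ins_since_vacuum DESC
LIMIT 10;

None of these settings change the argument above; they only decide how the competition is scheduled.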

Checkpoints. Postgres writes dirty pages to disk at checkpoint intervals. After a checkpoint completes, the first write to any previously-clean page triggers a full-page write to WAL (that's the full_page_writes mechanism, and it's on by default for good reason). At high insert rates, checkpoint cycles are constant. The full-page write burst that follows each one adds significant WAL volume on top of your baseline write load.

Batch systems checkpoint, rest, then return to normal. Continuous systems checkpoint and immediately start generating the next burst. There's no recovery window.
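Both halves of this are visible in the server's own counters. A rough sketch; pg_stat_wal needs PostgreSQL 14 or later, and in PostgreSQL 17 the checkpoint counters moved from pg_stat_bgwriter to pg_stat_checkpointer:

-- How much of your WAL is full-page images versus ordinary records.
SELECT wal_records, wal_fpi, pg_size_pretty(wal_bytes) AS total_wal
FROM pg_stat_wal;

-- How often checkpoints fire, and whether they're forced by WAL volume
-- (requested) rather than by the timer.
SELECT checkpoints_timed, checkpoints_req
FROM pg_stat_bgwriter;

Raising checkpoint_timeout and max_wal_size spreads the full-page-write burst out over a longer interval; it doesn't make the burst disappear.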

ANALYZE and statistics. Query planning accuracy depends on fresh statistics. On a billion-row table, ANALYZE is expensive. On a batch system, you schedule it after the load completes. On a continuous system, there is no "after." You run it during writes or you let statistics go stale. Stale statistics mean bad query plans. Bad query plans mean unexpected sequential scans at the worst possible time.
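The staleness is measurable before the bad plans show up. A sketch of the two usual moves, with sensor_readings again standing in for your largest table:

-- Rows modified since the planner's statistics were last refreshed.
SELECT relname, n_mod_since_analyze, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_mod_since_analyze DESC
LIMIT 10;

-- On a billion-row table a percentage-based trigger almost never fires.
-- A fixed row count keeps statistics fresher, at the cost of running
-- ANALYZE while writes are in flight.
ALTER TABLE sensor_readings SET (
  autovacuum_analyze_scale_factor = 0,
  autovacuum_analyze_threshold    = 500000
);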

WAL as the throughput ceiling you can't tune past

This is the mechanical core of the problem.

Every insert generates WAL. Heap insert record, index insertion records for every index on the table, plus full-page writes after checkpoints. A single 1KB sensor reading with five indexes generates roughly 2.5-3.5KB of actual I/O once you account for the heap tuple, B-tree leaf page insertions, and WAL records. At 100K inserts/second, that puts sustained WAL throughput at 50-100MB/sec under normal conditions. After a checkpoint, it spikes higher because of full-page writes.

That's 3-6GB per minute. 180-360GB per hour. Just WAL.

WAL writes are sequential and synchronous by default. That's a hard ceiling on write throughput for a given storage configuration. You can raise the ceiling by buying faster storage. You can't eliminate it, because WAL is how Postgres guarantees durability. And you shouldn't want to eliminate it. Durability matters. But you should understand that your write throughput has a physical upper bound set by how fast your storage can absorb WAL, and continuous ingestion pushes against that bound constantly.
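The estimates above are just that: estimates. Your own WAL rate is easy to measure directly; in this sketch the LSN literal is a placeholder for whatever the first query returns on your system:

-- Note the current write-ahead log position...
SELECT pg_current_wal_lsn();

-- ...wait 60 seconds, then compute how much WAL was generated in between.
SELECT pg_size_pretty(
         pg_wal_lsn_diff(pg_current_wal_lsn(), '7D4/8E2A1B30'::pg_lsn)
       ) AS wal_last_minute;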

Here's where continuous ingestion and batch ETL diverge completely.

Batch ETL generates bursts of WAL followed by silence. The silence lets replicas catch up. A streaming replica can fall behind during a batch load and recover in the gap. Nobody notices because the gap is long enough.

Continuous ingestion generates WAL constantly. Replicas that fall slightly behind have no gap to recover in. They fall further behind. With replication slots, the primary retains the unconsumed WAL in pg_wal, consuming disk until the replica catches up. The further behind a replica gets, the more WAL it has to replay, and the more disk the primary holds. It's a feedback loop. The thing that causes the problem (WAL volume) is the same thing that prevents recovery (WAL volume).

Adding replicas makes it worse, not better. Each replica is another consumer that needs to keep up with the same WAL stream, and the primary holds WAL until the slowest one catches up.
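Two catalog views make the feedback loop visible before pg_wal fills the disk. A sketch, assuming physical replication with slots:

-- How far behind each replica is, in bytes of WAL and in time.
SELECT application_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) AS replay_lag_bytes,
       replay_lag
FROM pg_stat_replication;

-- How much WAL the primary is retaining on behalf of each slot.
SELECT slot_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;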

The standard fix is more provisioned IOPS. It works for a while. Then data volume grows and you're having the same conversation again, just with bigger numbers on the invoice.

Why the standard toolkit doesn't solve this

Walk through each common response and you'll see exactly where it runs out.

More autovacuum workers. By default the vacuum cost limit is shared across all running workers, so adding workers doesn't add vacuum throughput; it spreads the same I/O budget across more tables. Raise the cost limit to compensate and you've increased the I/O competition with your writes. Either way, the contention doesn't go away.

Aggressive autovacuum cost limits. You can configure vacuum to run faster and harder. It cleans up faster but hits writes harder. There's no setting that makes the competition disappear. You're choosing which process suffers.

More RAM. Bigger shared_buffers and page cache reduce physical reads. Write amplification is unchanged. WAL volume is unchanged. Autovacuum competition is unchanged. You bought better read performance for a write-bound problem.

Faster storage. Raises the WAL ceiling. Doesn't change the ratio of actual I/O to logical data. At 3-5x write amplification, faster storage lets you sustain a higher write rate before hitting the ceiling. But data volume grows, and the ceiling moves up proportionally.

Vertical scaling. Same as faster storage with more CPU. You've bought headroom measured in months. At the current data growth trajectory, that math doesn't improve over time.

Each of these is the right response to the symptom. None of them changes the underlying dynamic: continuous ingestion is in constant competition with the maintenance processes Postgres needs to stay healthy.

The workloads where this actually matters

Not every write-heavy system has this problem. Let's be precise.

The pattern shows up when three things are true at once: writes are continuous rather than bursty, data volume is growing on a sustained curve, and the database needs to stay queryable under latency requirements while ingestion is running.

Industrial IoT is the clearest example. A wind farm with 10,000 sensors reporting every five seconds generates roughly 2,000 inserts/second. That's modest by financial or observability standards, but it never pauses. The turbines don't stop overnight. Maintenance windows don't exist because the data source doesn't know what a maintenance window is.

Financial market data is the high-frequency version. Trade feeds run at hundreds of thousands of events per second during market hours. Pre-market and after-market data keeps coming. Systems that aggregate this data for risk and compliance queries need it available immediately, not at end of day.

Observability platforms are the distributed version. Metrics, traces, and logs from thousands of hosts. Each host generates data independently. The aggregate rate is enormous and constant.

What these have in common: the data source runs on its own schedule, completely independent of what the database needs. The wind turbine doesn't care that autovacuum is behind. The trading engine doesn't wait for a checkpoint to finish.

If your write pattern is bursty (user-driven traffic, nightly batch jobs, periodic syncs), you probably don't have this problem. The database gets its breathing room, maintenance catches up, and standard Postgres optimization works the way it's supposed to. The pattern described in this post shows up specifically when the gap disappears.

Recognizing the pattern early

The instinct when Postgres starts struggling under continuous ingestion is to tune harder. Add workers. Raise limits. Upgrade storage.

Those are correct responses for a database with a misconfiguration or a bad schema. But that isn't what's happening here. Postgres is doing exactly what it was designed to do. The MVCC model, the WAL architecture, the maintenance scheduler: these are good design decisions for the workloads Postgres was built to handle. The workload changed underneath it. That's not a criticism of the tool.

But continuous ingestion isn't a heavier version of batch ETL. It's a different workload class. The architectural assumptions underneath Postgres were built around a workload that breathes. Continuous ingestion doesn't breathe. And that distinction matters because it determines whether optimization will change your trajectory or just delay the same outcome.

Recognizing that early is worth a lot. At 50M rows, switching to a purpose-built architecture takes days. At 1B rows, it takes months. Every quarter you spend optimizing within the wrong architecture is a quarter where migration gets harder and the engineering team spends more time managing the database than building product.

If this sounds familiar, the full analysis covers the scoring framework and the mechanics behind why each optimization phase hits a ceiling. It's the same trajectory described here, zoomed out to show the complete path and where it leads.

Read the full analysis: Understanding Postgres Performance Limits for Analytics on Live Data →
