<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
    <channel>
        <title><![CDATA[Tiger Data Blog]]></title>
        <description><![CDATA[Insights, product updates, and tips from TigerData (Creators of TimescaleDB) engineers on Postgres, time series & AI. IoT, crypto, and analytics tutorials & use cases.]]></description>
        <link>https://www.tigerdata.com/blog</link>
        <image>
            <url>https://www.tigerdata.com/icon.ico</url>
            <title>Tiger Data Blog</title>
            <link>https://www.tigerdata.com/blog</link>
        </image>
        <generator>RSS for Node</generator>
        <lastBuildDate>Tue, 10 Mar 2026 06:27:33 GMT</lastBuildDate>
        <atom:link href="https://www.tigerdata.com/blog" rel="self" type="application/rss+xml"/>
        <ttl>60</ttl>
        <item>
            <title><![CDATA[Vertical Scaling: Buying Time You Can't Afford]]></title>
            <description><![CDATA[Postgres vertical scaling works, until it doesn't. Learn why high-frequency ingestion workloads hit an architectural wall and what to do about it.]]></description>
            <link>https://www.tigerdata.com/blog/vertical-scaling-buying-time-you-cant-afford</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/vertical-scaling-buying-time-you-cant-afford</guid>
            <category><![CDATA[PostgreSQL]]></category>
            <category><![CDATA[PostgreSQL Performance]]></category>
            <dc:creator><![CDATA[Matty Stratton]]></dc:creator>
            <pubDate>Thu, 26 Feb 2026 14:48:27 GMT</pubDate>
<media:content medium="image" url="https://timescale.ghost.io/blog/content/images/2026/02/Blog-Thumbnail-1280x720.png">
            </media:content>
<content:encoded><![CDATA[<p>Your Postgres database is struggling. Write latency is climbing, autovacuum is fighting for I/O, and the indexes you added three months ago aren't cutting it anymore. So you do the obvious thing.</p><p>You upgrade the instance. Metrics drop. Everyone exhales.</p><p>Six months later, you do it again.</p><p>Nobody puts this in a postmortem, because vertical scaling works. That's why teams keep reaching for it. But if you're running continuous high-frequency ingestion on Postgres, it's not a fix. It's a payment plan on a debt that keeps growing.</p><h2 id="the-cost-curve-doesnt-lie">The Cost Curve Doesn't Lie</h2><p>You've probably already run the numbers. At 50K inserts per second, you're adding roughly 1.5 trillion rows per year. Your data volume compounds as ingest rates grow. Your infrastructure cost moves in steps, doubling each time you provision the next tier up.</p><p>Plot both lines on the same chart. Watch them diverge.</p><p>You upgrade from 16 vCPU/64GB to 32 vCPU/128GB with provisioned IOPS (io2 at 10,000+ IOPS on AWS, say). Cost roughly doubles. You get six months of breathing room. Then the data keeps growing, and the metrics start climbing again.</p><p>So you upgrade again. The cost doubles again. Twelve months out, you're projecting another upgrade. The database line item is growing faster than the product revenue it supports.</p><p>Oof.</p><h2 id="what-youre-actually-buying">What You're Actually Buying</h2><p>More CPU gives autovacuum room to run without starving query execution. More RAM improves <code>shared_buffers</code> and OS page cache hit rates. Faster storage reduces I/O wait across the board.</p><p>All real wins. None of them touch the per-row overhead.</p><p>Here's what's actually happening underneath. Every row carries MVCC headers, index entries, and WAL records whether you asked for them or not. 
A 1KB sensor reading becomes roughly 2.5 to 3.5KB of actual I/O: 23-byte heap tuple header, five index entries at ~60 bytes each, plus a ~1.2KB WAL record stacking on top.</p><p>At 100K inserts/sec, that's 250-350MB/sec of real I/O to move 100MB/sec of data. A bigger instance tolerates that overhead more gracefully. It does not reduce it.</p><p>So the trajectory holds. Six months of headroom, metrics creep back, another upgrade, another budget conversation. Each step costs more than the last one and buys roughly the same amount of time.</p><h2 id="the-invisible-cost-nobody-tracks">The Invisible Cost Nobody Tracks</h2><p>Here's where it gets uncomfortable. The latency graphs are one thing. Engineers watch latency graphs. Finance watches the <em>invoice</em>.</p><p>At some point the database line item becomes visible enough that someone schedules a meeting. Now you're explaining autovacuum to a person who manages a spreadsheet for a living. (That meeting is not fun. The prep work for that meeting costs engineering time you don't have.)</p><p>But that's the visible cost. The invisible one is worse.</p><p>When teams hit this pattern, senior engineers typically spend 20-30% of their time on database operations. Not firefighting. Weekly. Monitoring autovacuum lag. Tuning per-partition settings. Watching replication delay. Reviewing runbooks before anyone touches the schema. Making sure the pg_partman automation didn't silently fail again.</p><p>None of that shows up in the cloud bill. It doesn't trigger a finance meeting – it just quietly drains your best people every single week. New engineers need weeks of onboarding before they can safely operate the partitioning scheme. What should be a one-person schema change becomes a team event with a rollback plan.</p><p>You've built a database operations practice inside your product engineering team. 
That wasn't the plan.</p><h2 id="why-vertical-scaling-feels-like-its-working">Why Vertical Scaling Feels Like It's Working</h2><p>The thing that makes this pattern so persistent is that each optimization phase genuinely does help. Vertical scaling is no exception.</p><p>You add the bigger instance, and autovacuum workers stop competing with queries for CPU. Shared buffers expand, and buffer cache hit rates climb. Those io2 IOPS stop being the bottleneck. For a while, the system breathes.</p><p>But here's the thing: Postgres wasn't designed for continuous, high-frequency, append-only ingestion at scale. The design choices that make it excellent for general-purpose workloads (MVCC for concurrency, row-based heap storage, B-tree indexes, the WAL architecture) all generate overhead that multiplies when you're hammering it with hundreds of thousands of inserts per second that never pause.</p><p>Vertical scaling gives the existing architecture more room to operate. It doesn't change the architecture.</p><p>MVCC creates per-tuple overhead on data you'll never update. Row storage forces you to read all 30 columns when your query needs two. B-tree indexes mean every insert has to traverse and update every index, and at 50K inserts/sec with five indexes, that's 250K index insertions per second. WAL records every single one of those operations before touching a data page, so at 100K inserts/sec you're generating 50-100MB/sec of WAL just to do normal work.</p><p>None of those problems shrink when you add more vCPUs.</p><h2 id="how-it-shows-up-before-its-a-crisis">How It Shows Up Before It's a Crisis</h2><p>The real tell isn't in a p95 latency chart. It's the <em>pattern</em>.</p><p>You optimize. You get relief. The metrics climb back. You optimize again. 
The relief lasts a little less time than before.</p><p>Before it becomes a full crisis, it shows up in how the team is spending its time.</p><p>Optimization is on every quarterly roadmap, not as a one-time project, but as a line item, every quarter, competing with features for engineering time.</p><p>The database bill goes up 40% while user growth is 15%. Finance notices. Those numbers don't get ignored.</p><p>You ship a 2x performance improvement and data growth erases it within two quarters. The treadmill doesn't slow down – it <strong>speeds up</strong>.</p><p>And autovacuum just keeps coming up! It's in the top five processes by CPU and I/O at all hours, and tuning it is somehow <em>always</em> on someone's plate.</p><p>Two or three of these? Pay attention. Four? You're <em>already</em> in the pattern.</p><h2 id="optimization-problem-vs-architecture-problem">Optimization Problem vs. Architecture Problem</h2><p>There are two different problems that both show up as "database performance is degrading."</p><p>The first is an optimization problem. The workload fits the database design. Better indexes, query rewrites, config tuning, vertical scaling. These directly improve the trajectory, and Postgres expertise solves it. For most workloads, vanilla Postgres is the right tool and this is the right path.</p><p>The second is an <strong>architectural mismatch</strong>. The workload is hitting design tradeoffs baked into the storage engine and the write path. Optimization helps short-term, but it doesn't change the trajectory. You're working <em>around</em> the architecture instead of <em>with</em> it.</p><p>Both of these look identical from the outside: degrading query latency, climbing infrastructure costs, teams spending more time on database operations than product work. The difference only becomes obvious when you notice each fix is lasting a little less time than the last one.</p><p>Vertical scaling is the right move for the first problem. 
For the second, it's just the most expensive item on the treadmill.</p><h2 id="when-to-think-about-architecture-instead">When to Think About Architecture Instead</h2><p>If your workload is continuous high-frequency ingestion, your data is append-only, queries predominantly filter on time ranges, and you're measuring retention in months or years, you're probably dealing with an architectural mismatch, not an optimization problem.</p><p>You also don't need to replace Postgres. TimescaleDB extends vanilla Postgres with columnar compression, hypertables with automatic chunking, and a query planner that understands time-based access patterns. You keep SQL, your extensions, your team's knowledge, and the entire Postgres ecosystem. What changes is the storage engine and write path underneath (the parts <em>actually</em> generating the overhead).</p><p>Migration complexity scales with data volume. At 10M-50M rows, it's days to two weeks. At 100M-500M rows, two to six weeks. At 1B+, you're looking at months. That time doesn't go toward product features. And there's no point on that curve where waiting makes it cheaper.</p><p>If your team is spending 20%+ of engineering time on database operations and scalability is on every quarterly roadmap, you already know something is off. The upgrade cycles don't get cheaper, and the breathing room each one buys keeps shrinking.</p><p><em>This post is part of a series on Postgres performance limits for high-frequency data workloads. The full analysis, including a workload scoring framework and migration complexity breakdown at different scales, is in the anchor essay:</em><a href="https://www.tigerdata.com/blog/postgres-optimization-treadmill" rel="noreferrer"><em> <u>Understanding Postgres Performance Limits for Analytics on Live Data</u></em></a><em>. Ready to test it on your own data?</em><a href="https://console.cloud.timescale.com/signup"><em> <u>Start a free Tiger Data trial.</u></em></a></p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Understanding Postgres Performance Limits for Analytics on Live Data]]></title>
            <description><![CDATA[PostgreSQL hits hard limits under analytics workloads. Here's why MVCC, WAL, and row storage compound — and what to do instead.]]></description>
            <link>https://www.tigerdata.com/blog/postgres-optimization-treadmill</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/postgres-optimization-treadmill</guid>
            <category><![CDATA[PostgreSQL]]></category>
            <category><![CDATA[Analytics]]></category>
            <category><![CDATA[PostgreSQL Performance]]></category>
            <dc:creator><![CDATA[Matty Stratton]]></dc:creator>
            <pubDate>Wed, 25 Feb 2026 19:18:16 GMT</pubDate>
<media:content medium="image" url="https://timescale.ghost.io/blog/content/images/2026/02/advocacy-essay-thumbnail-with-elephant.png">
            </media:content>
            <content:encoded><![CDATA[<h2 id="the-pattern-recognition-moment">The Pattern Recognition Moment</h2>
<p>You're reviewing monitoring on a normal workday. There hasn't been a new deployment, no weird traffic spike, and no schema changes. But p95 write latency has crept from 8ms to 25ms over the past month, and last week it touched 45ms. Your largest tables crossed 500M rows sometime in March and they're still climbing.</p>
<p>Six weeks of data points, all trending the same direction.</p>
<p><img src="https://timescale.ghost.io/blog/content/images/2026/02/diagram-2-p95-write-latency.png" alt="" loading="lazy"></p>
<p>You've run Postgres in production for years. You've tuned queries, rebuilt indexes, and right-sized instances. But this time the fixes don't stick; every new index or config tweak brings the metrics back down for a few weeks, then they climb again. You can plot the trajectory out three months and know exactly where it lands.</p>
<p>So you do a proper audit: query plans, connection overhead, table stats, bloat. Everything checks out. Schema is sound, indexes cover the hot paths, and configs follow best practices. A consultant confirms the same: nothing misconfigured. But performance keeps degrading, and it correlates with data volume, not traffic.</p>
<p>You look closer at the workload. Most writes are inserts, not updates, and every row carries a timestamp. Queries almost always filter by time range. Data arrives continuously, not in batches or bursts, but as a steady stream that never pauses. You need months or years of retention, and you're not just storing this data. You're querying it under latency requirements.</p>
<p>This doesn't fit the profile of a transactional workload, and it doesn't fit a data warehouse either. It's continuous high-frequency ingestion that needs to stay operationally queryable.</p>
<p>Postgres is a brilliant general-purpose database. The same design choices that make it handle e-commerce, SaaS backends, and CMS workloads so well create compounding overhead for sustained high-frequency time-series ingestion with long retention. Design tradeoffs, not bugs. Baked into the architecture by intent.</p>
<p>You are not fighting misconfiguration. You are fighting architectural boundaries designed for a different workload class.</p>
<p>This piece walks through what we call the Optimization Treadmill: the sequence of phases most teams follow, each a correct response to observed symptoms, each providing temporary relief without changing the underlying trajectory. Understanding the mechanics of why the treadmill exists is what lets you recognize it early. If you recognize the scenario above, this is a common path. The question isn't whether you'll hit the ceiling. It's when, and how much runway you have left when you do.</p>
<h2 id="what-this-workload-looks-like">What This Workload Looks Like</h2>
<p>Not all high-write workloads will hit this wall. Postgres handles enormous write volumes for e-commerce, social networks, and SaaS backends without issue. The friction comes from a specific combination of six characteristics. If four or five describe your system, the optimization phases in the next section will be familiar.</p>
<p><strong>Continuous high-frequency ingestion.</strong> Thousands to hundreds of thousands of inserts per second, 24/7, with no pause: IoT sensors reporting every few seconds, financial systems processing trades in real time, or APM platforms collecting metrics from thousands of hosts. High-frequency data generation is independent of user count. Batch systems get quiet periods where the database can run maintenance, but continuous ingestion never stops. Maintenance competes directly with writes, and there is no scheduling window.</p>
<p><strong>Time-series access patterns.</strong> Nearly every row has a timestamp, and queries almost always include time range filters. "Last 30 minutes of CPU utilization," "this week compared to last week," "all transactions between two dates." This goes beyond a <code>created_at</code> column; the entire query pattern revolves around time. General-purpose indexes aren't built for this access pattern, which is why teams end up reimplementing time-based data organization through manual partitioning scripts and custom tooling.</p>
<p><strong>Append-only data.</strong> Once written, rows rarely change. Sensor readings don't get updated, financial transactions are immutable, log entries are permanent. Deletes happen in bulk (drop an entire month's partition), not row by row. MVCC exists to handle concurrent reads and writes on the same rows. Append-only workloads pay that overhead on data they never touch again. Yet autovacuum still runs constantly, freezing tuples and setting hint bits on rows that will never see an UPDATE or DELETE.</p>
<p><strong>Long retention.</strong> Months to years, not days or weeks. Compliance might require seven years of financial records, manufacturing teams need root cause analysis across quarters, and ML pipelines need two-plus years of training data. Shortening retention only hides the architectural problem by aging old data out, and long retention means unbounded table growth. At 50K inserts per second, that's roughly 1.5 trillion rows per year. After three years? 4.5 trillion rows.</p>
<p><strong>Operational query requirements.</strong> This isn't cold storage or an analytics warehouse you query once a day. You need millisecond responses on the last day's data, sub-second on the last week, and reasonable performance across the full retention window. Real-time dashboards, alert systems, user-facing analytics, ad-hoc investigation, all querying the same database. Data warehouse depth with operational latency requirements.</p>
<p><strong>Sustained growth.</strong> Data volume growing 50–100%+ year over year on a predictable curve. Static workloads can be over-provisioned once and left alone, but growing workloads demand constant re-optimization. You're not solving for current scale. You're chasing projected scale, and the gap keeps widening.</p>
<p>If four or five of these apply, the next section maps the optimization path most teams follow. If your workload is standard OLTP, batch warehouse, low-volume time-series, or short-retention, the underlying issues are likely different.</p>
<p>This combination of characteristics didn't exist at scale 15 years ago. It's a product of specific infrastructure shifts: billions of connected devices generating continuous telemetry, high-frequency trading systems that treat microseconds as a competitive moat, AI pipelines that require years of operational history as training data, and observability platforms collecting metrics from every process in a distributed system. The cloud didn't just scale these workloads up. It made them continuous. Machines that never go offline generate data that never stops. That changed what operational databases are asked to do, and general-purpose engines weren't redesigned to match.</p>
<h2 id="the-optimization-path">The Optimization Path</h2>
<p>Most teams working this pattern follow roughly the same sequence. Each phase is a reasonable response to observed symptoms, but each buys 3–6 months of relief at most, adds operational complexity, and has diminishing returns. The optimizations address symptoms without changing the underlying architecture. The ceiling doesn't move. You do, until you run out of room.</p>
<h3 id="phase-1-index-optimization">Phase 1: Index optimization</h3>
<p>The trigger is predictable: query performance degrades as tables grow past 50–100M rows, or sequential scans on a 100M-row table take minutes. The textbook answer is to add B-tree indexes on timestamp columns, build composite indexes for common filter combinations, create partial indexes on hot time ranges, and run ANALYZE to refresh <code>pg_statistic</code>.</p>
<pre><code class="language-sql">-- Composite index for the most common dashboard query pattern
CREATE INDEX idx_metrics_device_time
  ON device_metrics (device_id, ts DESC);

-- Partial index covering only a recent hot window. Index predicates
-- must be immutable, so now() is not allowed here; use a literal
-- cutoff and recreate the index as the window advances
CREATE INDEX idx_metrics_recent
  ON device_metrics (ts DESC)
  WHERE ts &gt; '2025-06-01';
</code></pre>
<p>A query that did a sequential scan across 100M rows now hits an index and returns in milliseconds. 10–100x improvement on read performance is typical. Problem solved, for now.</p>
<p>Issues start showing up as tables grow past 300M rows. Every INSERT must update every index on the table. With five indexes, each insert performs six write operations: one heap tuple write and five B-tree leaf page insertions. At 50K inserts/sec, that's 300K write operations per second. Each index insertion traverses the B-tree, potentially causing page splits that trigger additional I/O. <code>pg_stat_user_indexes</code> starts showing index bloat climbing:</p>
<pre><code class="language-sql">-- Monitoring index bloat
SELECT schemaname, relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) as index_size,
       idx_scan as index_scans,
       idx_tup_read,
       idx_tup_fetch
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;
</code></pre>
<p>Index sizes grow faster than table sizes because B-trees don't reclaim space efficiently for append-heavy, time-ordered data. For keys that increase monotonically like timestamps, inserts concentrate on the rightmost leaf pages, resulting in repeated splits. Old leaf pages become sparse but remain allocated. You've improved read latency at the cost of write throughput, and this workload needs both.</p>
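<p>You can watch that divergence directly with the built-in size functions (a minimal check; <code>device_metrics</code> is the example table from the index definitions above):</p>
<pre><code class="language-sql">-- Heap size vs. total index size for one table
SELECT pg_size_pretty(pg_table_size('device_metrics'))   AS heap_size,
       pg_size_pretty(pg_indexes_size('device_metrics')) AS index_size;
</code></pre>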
<h3 id="phase-2-table-partitioning">Phase 2: Table partitioning</h3>
<p>Your largest table has crossed 800M to 1B rows, and dropping old data via DELETE causes table bloat and long-running transactions that block autovacuum. You implement time-based range partitioning (typically daily or weekly).</p>
<pre><code class="language-sql">-- Partitioned table setup
CREATE TABLE device_metrics (
    ts          timestamptz NOT NULL,
    device_id   bigint NOT NULL,
    metric      text NOT NULL,
    value       double precision
) PARTITION BY RANGE (ts);

-- Daily partitions created by cron or pg_partman
CREATE TABLE device_metrics_20250601
  PARTITION OF device_metrics
  FOR VALUES FROM ('2025-06-01') TO ('2025-06-02');
</code></pre>
<p>Implementation requires automation: cron jobs or pg_partman to create future partitions, monitoring to detect gaps where partition creation failed, and careful handling of queries that span partition boundaries. Backup and restore now operates on hundreds of individual tables, <code>pg_dump</code> time scales with partition count, and schema migrations touch every partition.</p>
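<p>A hand-rolled version of that automation might look like the sketch below, scheduled nightly from cron (illustrative only; in practice <code>pg_partman</code>'s maintenance job does this work):</p>
<pre><code class="language-sql">-- Create tomorrow's partition if it doesn't already exist
DO $$
DECLARE
    day date := current_date + 1;
BEGIN
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS device_metrics_%s
           PARTITION OF device_metrics
           FOR VALUES FROM (%L) TO (%L)',
        to_char(day, 'YYYYMMDD'), day, day + 1
    );
END $$;
</code></pre>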
<p>The wins are concrete. Queries with time-range filters trigger partition pruning, and EXPLAIN shows the planner excluding irrelevant partitions:</p>
<pre><code class="language-sql">EXPLAIN SELECT avg(value) FROM device_metrics
WHERE ts &gt; now() - interval '1 hour';

-- Scans 1-2 partitions instead of the entire table; the plan
-- shows the other ~498 of 500 partitions pruned away
</code></pre>
<p>Dropping old data becomes <code>DROP TABLE device_metrics_20240101</code> instead of a multi-hour DELETE that generates gigabytes of WAL and dead tuples.</p>
<p>What happens at 500+ partitions? The <a href="https://www.postgresql.org/docs/current/ddl-partitioning.html">PostgreSQL documentation on partitioning best practices</a> is direct about the cost: "Planning times become longer and memory consumption becomes higher when more partitions remain after the planner performs partition pruning." <code>pg_partman</code> maintenance jobs occasionally fail silently, leaving gaps. Queries spanning long ranges (quarterly reports, year-over-year comparisons) hit hundreds of partitions and regress in performance. Each active partition still has its own autovacuum overhead. The write path is faster per-partition but aggregate write load is unchanged. And the operational complexity is real. New engineers need to understand the partitioning scheme, the automation scripts, the monitoring for gaps, the procedures for backfills, and the implications for schema changes.</p>
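<p>Detecting those silent gaps means checking the catalog directly; a minimal sketch that lists the newest attached partitions so a missing day stands out:</p>
<pre><code class="language-sql">-- Most recent partitions attached to the parent table
SELECT c.relname
FROM pg_inherits i
JOIN pg_class c ON c.oid = i.inhrelid
WHERE i.inhparent = 'device_metrics'::regclass
ORDER BY c.relname DESC
LIMIT 14;
</code></pre>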
<h3 id="phase-3-autovacuum-tuning">Phase 3: Autovacuum tuning</h3>
<p>This is where it starts to feel wrong. You're tuning a cleanup process for data you never modify. <code>n_dead_tup</code> counts are climbing on active partitions, <code>last_autovacuum</code> timestamps show vacuum running constantly but falling behind during write peaks, and <code>pg_stat_activity</code> regularly shows autovacuum workers competing for I/O.</p>
<p>Even append-only workloads generate work for autovacuum. Aborted transactions leave dead tuples. Hint-bit setting (marking tuples as known-committed or known-aborted to avoid future <code>pg_xact</code> lookups) requires dirtying pages. And since PostgreSQL 13, autovacuum triggers based on insert count (not just dead tuples) specifically to freeze tuples and update the visibility map. At high insert rates, this means autovacuum fires continuously on tables that never see a single UPDATE or DELETE.</p>
<pre><code class="language-sql">-- Per-table autovacuum settings on high-traffic partitions
ALTER TABLE device_metrics_20250601 SET (
    autovacuum_vacuum_scale_factor = 0.01,    -- default 0.2
    autovacuum_vacuum_cost_delay = 2,         -- default 2ms (20ms before PG 12)
    autovacuum_vacuum_cost_limit = 1000       -- default 200
);
</code></pre>
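<p>Because the trigger on append-only tables is insert volume rather than dead tuples, the PostgreSQL 13+ insert-driven thresholds usually get tuned in the same pass (values here are illustrative):</p>
<pre><code class="language-sql">-- Insert-triggered autovacuum knobs (PostgreSQL 13+)
ALTER TABLE device_metrics_20250601 SET (
    autovacuum_vacuum_insert_scale_factor = 0.05,  -- default 0.2
    autovacuum_vacuum_insert_threshold = 100000    -- default 1000
);
</code></pre>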
<pre><code># postgresql.conf adjustments
autovacuum_max_workers = 6            # default 3
autovacuum_naptime = 15s              # default 1min
maintenance_work_mem = 2GB            # default 64MB
autovacuum_vacuum_cost_delay = 2ms
autovacuum_vacuum_cost_limit = 800
</code></pre>
<p>This helps stabilize bloat, and <code>pg_stat_user_tables.n_dead_tup</code> stays under control. But autovacuum workers now consume measurable CPU and I/O continuously, and monitoring shows autovacuum in <code>pg_stat_activity</code> at all hours. During write peaks, vacuum falls behind, bloat creeps back, and query performance becomes variable. You're tuning a process that exists to clean up overhead your workload doesn't fundamentally produce, but that the storage engine creates anyway.</p>
<h3 id="phase-4-vertical-scaling">Phase 4: Vertical scaling</h3>
<p>All of your optimizations are showing diminishing returns. The next logical step is to add more resources: upgrade from 16 vCPU/64GB to 32 vCPU/128GB with provisioned IOPS storage (e.g., io2 at 10,000+ IOPS on AWS).</p>
<p>More CPU gives autovacuum workers room to operate without starving query execution. More RAM increases <code>shared_buffers</code> and OS page cache hit rates, reducing physical disk reads. Faster storage reduces I/O wait time across the board. This gives you roughly six months of headroom.</p>
<p>Math doesn't lie: the infrastructure cost doubled or tripled, but data growth is still exponential. At the current trajectory, you'll need another upgrade in 12 months. The database cost line item is growing faster than the product revenue it supports.</p>
<h3 id="phase-5-read-replicas">Phase 5: Read replicas</h3>
<p>Dashboards and analytics queries compete with ingestion for CPU and I/O on the primary. You add 1–3 streaming replicas, configure pgbouncer or pgpool to route read traffic, and separate the connection pools. Immediately, write performance on the primary improves. Expensive analytical queries run against replicas without blocking ingestion.</p>
<p>The primary still carries the full write load. At sustained high insert rates generating tens of megabytes per second of WAL, replicas that fall behind accumulate WAL on the primary, consuming disk. The further behind a replica gets, the more WAL the primary must retain, and high write volume is exactly what causes replicas to fall behind in the first place. Real-time dashboards pointing at lagging replicas show stale data. You're now managing multiple Postgres instances with their own monitoring, autovacuum tuning, and connection pooling. The write bottleneck is still untouched.</p>
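<p>The lag itself is visible from the primary; a minimal check of how much WAL each replica still has to replay:</p>
<pre><code class="language-sql">-- Bytes of WAL not yet replayed on each standby
SELECT application_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) AS replay_lag
FROM pg_stat_replication;
</code></pre>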
<h3 id="taking-stock">Taking stock</h3>
<p>After all five phases, this is what the infrastructure looks like: partitioned tables across 500+ partitions with <code>pg_partman</code> automation and monitoring, aggressive per-table autovacuum settings under constant adjustment, instances upgraded 2–3x from original specs with provisioned IOPS, 2–3 streaming replicas with connection-level routing, detailed runbooks covering partition management, vacuum procedures, and failover scenarios.</p>
<p>Each optimization was the right response. Each bought time. Yet the trajectory is unchanged.</p>
<p><img src="https://timescale.ghost.io/blog/content/images/2026/02/diagram-1-latency-across-optimization-phases.png" alt="" loading="lazy"></p>
<p>Senior engineers are now spending 20–30% of their time on database operations. Quarterly planning includes a database scalability line item. New hire onboarding takes weeks before someone can safely operate the partitioning scheme. The team has become part product engineering, part DBA.</p>
<p>Is this inherent to the scale, or is it inherent to the architecture?</p>
<p>The answer matters because the two problems have different solutions. Optimization within the right architecture has a ceiling you can raise. Optimization against an architectural mismatch has a ceiling that doesn't move. Only the timeline changes. For this workload pattern, the ceiling is structural. The question was never if you'd hit it. It was always when.</p>
<h2 id="why-these-optimizations-hit-a-ceiling">Why These Optimizations Hit a Ceiling</h2>
<p>The optimization phases above aren't ineffective. Each one operates within architectural boundaries that weren't designed for this workload pattern, and those boundaries constrain how much any optimization can actually move the needle. Understanding the mechanics explains why returns diminish.</p>
<p>Postgres is a brilliant general-purpose relational database. Its design handles an enormous range of workloads well: e-commerce, content management, authentication, SaaS backends. "General-purpose" means optimized for the average case. High-frequency time-series ingestion with long retention is not the average case. Four core design decisions create this compounding overhead.</p>
<h3 id="mvcc-multi-version-concurrency-control">MVCC (Multi-Version Concurrency Control)</h3>
<p>MVCC lets readers and writers operate concurrently without lock contention. The <a href="https://www.postgresql.org/docs/current/mvcc-intro.html">PostgreSQL documentation on concurrency control</a> describes the core guarantee: "reading never blocks writing and writing never blocks reading." When a row is updated, Postgres keeps the old tuple version visible to in-flight transactions, and autovacuum later marks dead tuples as reusable. For workloads with concurrent reads and updates on shared rows, this is an excellent tradeoff.</p>
<p>For append-only ingestion, every insert still pays the full MVCC cost. Each heap tuple carries a fixed-size header (23 bytes on most machines) containing <code>t_xmin</code>, <code>t_xmax</code>, <code>t_cid</code>, <code>t_ctid</code>, <code>t_infomask</code>, <code>t_infomask2</code>, and <code>t_hoff</code>. These fields track transaction visibility, even though the row will never be updated or deleted by a transaction. Extra cost with no extra value.</p>
<p>The write amplification is easily observable. A 1KB sensor reading becomes:</p>
<ul>
<li>23-byte heap tuple header (plus alignment padding and a 4-byte <code>ItemIdData</code> pointer)</li>
<li>1,024 bytes of row data</li>
<li>5 index entries (assuming 5 indexes, ~40–80 bytes each in B-tree leaf pages)</li>
<li>~1.2KB WAL record (heap insert + index insertions)</li>
</ul>
<p>Total actual I/O: roughly 2.5–3.5KB per 1KB of logical data. At 100K inserts/sec of 1KB rows, you're writing 250–350MB/sec of actual I/O for 100MB/sec of application data. The exact ratio varies with row width, index count, and whether <code>full_page_writes</code> triggers after a checkpoint.</p>
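<p>As a sanity check, here's the same math as a back-of-envelope script, using mid-range byte estimates rather than measurements from a live system:</p>
<pre><code class="language-python"># Rough write amplification for one 1KB row with 5 indexes.
# Byte counts are mid-range estimates from the breakdown above.
ROW_DATA = 1024          # logical row payload
TUPLE_OVERHEAD = 23 + 4  # heap tuple header plus ItemIdData line pointer
INDEX_ENTRY = 60         # midpoint of the ~40-80 byte leaf entry range
NUM_INDEXES = 5
WAL_RECORD = 1200        # ~1.2KB WAL for heap insert plus index insertions

physical = ROW_DATA + TUPLE_OVERHEAD + NUM_INDEXES * INDEX_ENTRY + WAL_RECORD
print(f"{physical} bytes of physical I/O per 1KB row "
      f"({physical / ROW_DATA:.1f}x amplification)")

# Scale to the 100K inserts/sec rate discussed above.
mb_per_sec = physical * 100_000 / 1_000_000
app_mb_per_sec = ROW_DATA * 100_000 / 1_000_000
print(f"~{mb_per_sec:.0f} MB/sec of I/O for "
      f"~{app_mb_per_sec:.0f} MB/sec of application data")
</code></pre>
<p>Plugging in the top of each range (80-byte index entries, wider WAL records, alignment padding) pushes the same calculation toward the 3.5x end.</p>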
<p><img src="https://timescale.ghost.io/blog/content/images/2026/02/diagram-3-logical-data-vs-IO-breakdown.png" alt="" loading="lazy"></p>
<p>Autovacuum still has work to do on append-only tables. Aborted transactions leave dead tuples, and hint-bit setting (marking tuples as known-committed or known-aborted to avoid future <code>pg_xact</code> lookups) requires dirtying pages. At high insert rates, even these minor sources of work keep autovacuum continuously active. <code>pg_stat_user_tables.n_dead_tup</code> may stay low, but <code>vacuum_count</code> and <code>autovacuum_count</code> keep climbing steadily.</p>
<h3 id="row-based-storage-with-b-tree-indexes">Row-based storage with B-tree indexes</h3>
<p>Postgres stores data as a heap of 8KB pages, each containing variable-length tuples laid out row by row. Every tuple contains all columns. B-tree indexes map key values to ctid (page number + offset) pointers into the heap.</p>
<p>For time-series analytics, this creates read amplification:</p>
<pre><code class="language-sql">SELECT avg(temperature)
FROM sensor_readings
WHERE ts &gt; now() - interval '1 hour'
  AND device_id = 42;
</code></pre>
<p>This query needs two columns: <code>ts</code> and <code>temperature</code>. If the table has 30 columns, Postgres reads all 30 columns for every matching row from the heap pages. That's 15x the I/O a columnar layout would require, since a columnar store reads only the referenced columns from disk.</p>
<p>Time-series data also compresses extremely well in columnar formats. Sequential timestamps delta-encode to near-zero storage (a regular interval collapses from 8 bytes per timestamp down to a single bit via delta-of-delta encoding), and repeated device IDs run-length-encode. Floating-point sensor values compress with XOR-based compression derived from Facebook's Gorilla algorithm (<a href="http://www.vldb.org/pvldb/vol8/p1816-teller.pdf">Pelkonen et al., "Gorilla: A Fast, Scalable, In-Memory Time Series Database," VLDB, 2015</a>). Columnar storage routinely achieves 10–20x compression on time-series data. Row-based heap storage can't apply any of these techniques because values from different columns are interleaved on the same page.</p>
<p>On the write side, B-tree index maintenance creates significant overhead. Each insert traverses every index's B-tree from root to leaf, finds the correct leaf page, and inserts the new entry. If the leaf page is full, it splits, which can cascade up the tree. For time-ordered data, inserts concentrate on the right edge of timestamp indexes, creating contention on a small number of leaf pages. Non-timestamp indexes (device ID, metric type) scatter inserts across the tree, causing random I/O. With five indexes on a table, every row insert performs one heap page write, five B-tree traversals and leaf page insertions, plus WAL records for each. At 50K inserts/sec, that's 50K heap writes + 250K index insertions per second.</p>
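<p>To put a rough number on that right-edge churn: with ~60-byte entries and the default B-tree leaf fillfactor, an 8KB leaf holds on the order of 120 entries, so a time-ordered insert stream fills rightmost leaf pages hundreds of times per second. A sketch using an assumed entry size, not a measurement:</p>
<pre><code class="language-python"># Rough leaf-page churn on the timestamp index (illustrative estimate;
# ignores the 24-byte page header and per-entry line pointers).
PAGE_BYTES = 8192
ENTRY_BYTES = 60      # assumed B-tree leaf entry size
FILLFACTOR = 0.90     # default fillfactor for B-tree leaf pages

entries_per_leaf = int(PAGE_BYTES * FILLFACTOR / ENTRY_BYTES)
inserts_per_sec = 50_000
leaf_fills_per_sec = inserts_per_sec / entries_per_leaf
print(f"~{entries_per_leaf} entries per leaf, "
      f"~{leaf_fills_per_sec:.0f} rightmost leaf fills/sec")
</code></pre>
<p>Several hundred page allocations per second on one index is modest on its own; multiplied across five indexes, and with WAL generated for every split, it adds up.</p>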
<h3 id="query-planning-overhead">Query planning overhead</h3>
<p>The Postgres planner runs a full optimization pass on every query: it enumerates possible paths, estimates costs from <code>pg_statistic</code> entries, considers index usage, evaluates join orders, and selects an execution plan. For workloads with diverse, unpredictable query patterns involving complex joins, this is the right approach.</p>
<p>For time-series workloads, query shapes are highly repetitive. The same <code>WHERE ts &gt; now() - interval '...'</code> filter runs thousands of times per second. The full planning cycle executes every time. At high query rates, planning overhead is measurable in <code>pg_stat_statements</code> as the gap between <code>total_plan_time</code> and <code>total_exec_time</code>.</p>
<p>Statistics maintenance creates its own cost. ANALYZE samples rows to populate <code>pg_statistic</code>, with the sample size scaled by <code>default_statistics_target</code> (default: 100, which yields roughly 30,000 sampled rows). On billion-row tables, even this sampling-based statistics collection is expensive and must run frequently to keep estimates accurate. Stale statistics provide poor cardinality estimates, leading the planner to choose sequential scans over index scans, or vice versa.</p>
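<p>That 30,000 figure falls out of how ANALYZE sizes its sample: 300 rows per unit of statistics target, a constant in Postgres's analyze machinery. The fraction of a large table that actually gets sampled is tiny:</p>
<pre><code class="language-python"># ANALYZE samples 300 rows per unit of statistics target.
default_statistics_target = 100
sample_rows = 300 * default_statistics_target
print(sample_rows)  # 30000

table_rows = 1_000_000_000
fraction = sample_rows / table_rows
print(f"{fraction:.4%} of a billion-row table")
</code></pre>
<p>A sample that small keeps ANALYZE affordable, but it is also why cardinality estimates on skewed billion-row tables drift between runs.</p>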
<p>With hundreds of partitions, the planner must evaluate partition pruning for each partition's bounds against the query predicates. This is fast per-partition but scales linearly with partition count. At 500+ partitions, plan time for simple queries can exceed execution time.</p>
<h3 id="write-ahead-logging-wal-volume">Write-Ahead Logging (WAL) volume</h3>
<p>Every data modification generates a WAL record before it's applied to the heap or index pages. WAL writes are sequential and synchronous (fsync per commit, or per <code>wal_writer_delay</code> interval with asynchronous commit). At 100K inserts/sec, WAL generation is roughly:</p>
<ul>
<li>Heap insert records: ~100–150 bytes each = 10–15MB/sec</li>
<li>Index insert records: 5 indexes × ~60–80 bytes each = 30–40MB/sec</li>
<li>Full-page writes (after checkpoint): intermittent bursts of 8KB per dirtied page</li>
</ul>
<p>Total sustained WAL throughput: 50–100MB/sec under normal operation, spiking higher after checkpoints when <code>full_page_writes</code> triggers 8KB records for newly dirtied pages. <a href="https://www.postgresql.org/docs/current/wal-reliability.html">The PostgreSQL documentation</a> describes why: "the first modification of a data page after each checkpoint results in logging the entire page content." At those rates, that's 3–6GB/min, 180–360GB/hour.</p>
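<p>Setting aside full-page writes, the baseline arithmetic from those per-record sizes looks like this (illustrative estimates, not <code>pg_stat_wal</code> output):</p>
<pre><code class="language-python"># Baseline WAL generation at 100K inserts/sec, before full-page writes.
inserts_per_sec = 100_000
heap_record = (100, 150)   # bytes per heap-insert WAL record
index_record = (60, 80)    # bytes per index-insert WAL record
num_indexes = 5

low = inserts_per_sec * (heap_record[0] + num_indexes * index_record[0])
high = inserts_per_sec * (heap_record[1] + num_indexes * index_record[1])
print(f"sustained: {low / 1e6:.0f}-{high / 1e6:.0f} MB/sec, "
      f"{low * 3600 / 1e9:.0f}-{high * 3600 / 1e9:.0f} GB/hour")
</code></pre>
<p>Checkpoint-driven full-page writes stack on top of that baseline, which is how sustained volume reaches the 50–100MB/sec range above.</p>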
<p>WAL I/O becomes a direct throughput bottleneck. <code>pg_stat_wal</code> shows <code>wal_write</code> and <code>wal_sync</code> times climbing. Replicas that can't apply WAL fast enough fall behind, and unprocessed WAL files accumulate on the primary's <code>pg_wal</code> directory, consuming disk. <code>max_wal_size</code> and checkpoint frequency become critical tuning parameters.</p>
<h3 id="the-compounding-effect">The compounding effect</h3>
<p>None of these four constraints operates in isolation. Each amplifies the others, and that's where the math gets ugly.</p>
<p>MVCC overhead creates per-tuple bloat, which accumulates faster than autovacuum can clean at high insert rates. Autovacuum competing for I/O degrades write throughput. Degraded write throughput causes queries on bloated tables to slow down, which increases pressure to add more indexes. More indexes produce more write amplification, more WAL, and more replication lag. Row storage forces read amplification on time-range queries, which creates pressure to add covering indexes. Those indexes add to the write overhead feeding back into the MVCC/autovacuum loop.</p>
<p><img src="https://timescale.ghost.io/blog/content/images/2026/02/diagram-5-updated.png" alt="" loading="lazy"></p>
<p>At 50K inserts/sec with five indexes on a table, the steady-state database workload is: 50K heap tuple writes/sec, 250K B-tree index insertions/sec, 50–100MB/sec sustained WAL generation, continuous autovacuum activity across active partitions, and full query planning on every incoming query.</p>
<p>This is why a 16-core/64GB instance struggles with what appears to be a straightforward append-only workload.</p>
<p>Partitioning reduces per-partition table size but doesn't change the per-row overhead. Adding RAM improves buffer cache hit rates but doesn't reduce write amplification. Autovacuum tuning manages bloat but can't eliminate the cost of producing it. Each optimization operates within these constraints. None removes the constraints themselves.</p>
<p>This is the Optimization Treadmill at the mechanical level. You're not fighting configuration. You're fighting the storage model, the concurrency architecture, and the write path. All of which are designed for a workload that looks nothing like yours.</p>
<h2 id="when-to-choose-a-different-path">When to Choose a Different Path</h2>
<p>Most teams recognize this pattern 12–18 months too late. By then, the tables are massive, the partitioning scheme is deeply embedded, and migration has become a multi-month project. The difference between acting at 10M rows and acting at 1B rows is roughly an order of magnitude in engineering cost.</p>
<h3 id="postgres-workload-scoring-framework">Postgres Workload Scoring Framework</h3>
<p>Go back to <a href="#what-this-workload-looks-like">the six characteristics</a>. Be honest about how many describe your system right now, then score yourself again for where you'll be in 12–18 months.</p>
<p>If four or five apply, you're in this pattern. The optimization phases above are already in your future, or you've started them.</p>
<p>If all six apply, you're past the point of easy exits. Architectural friction is the dominant factor in your performance trajectory, and the migration cost is climbing every quarter you wait.</p>
<p>If three or fewer apply, you likely have a different problem. Standard Postgres optimization should change the trajectory.</p>
<h3 id="early-warning-signs">Early warning signs</h3>
<p>Before the pattern becomes a crisis, it shows up in how the team spends its time:</p>
<p><strong>Optimization dominates planning.</strong> 10–20% of engineering time goes to database performance, and every quarterly roadmap includes a scalability line item.</p>
<p><strong>Costs grow faster than revenue.</strong> Finance is asking why the database bill increased 40% while user growth was only 15%.</p>
<p><strong>Operational complexity accumulates.</strong> 20+ pages of runbooks, partition management scripts, monitoring for autovacuum lag, replication delay, and index bloat. New engineers need weeks of onboarding before they can safely operate the database.</p>
<p><strong>Growth outpaces optimization.</strong> You ship a 2x improvement and data growth erases it within two quarters.</p>
<p><strong>Autovacuum is a constant concern.</strong> It's in the top five processes by CPU and I/O at all hours, and tuning it is a recurring conversation.</p>
<p>Two or three of these signs mean you should be paying attention. Four or more means you're already in the pattern.</p>
<h3 id="migration-complexity-at-different-scales">Migration complexity at different scales</h3>
<p><strong>10M–50M rows.</strong> A day or two to 1–2 weeks. Simple dump/restore, or logical replication. Low risk, fast validation, easy rollback. 1–2 engineers part-time (roughly 80 engineer-hours).</p>
<p><strong>100M–500M rows.</strong> 2–6 weeks. Partition-by-partition migration. More dependencies to account for, more thorough testing required. 2–3 engineers, mostly full-time (roughly 400 engineer-hours).</p>
<p><strong>1B+ rows.</strong> 2–6 months. Hundreds or thousands of partitions. Zero-downtime required, complex rollback planning. Application-level dual-write or change-data-capture pipelines are in play. 3–5 engineers full-time plus a validation period (roughly 2,000 engineer-hours).</p>
<p>Those hours are not spent on product features. And there's no point on this curve where migration gets easier by waiting.</p>
<h3 id="what-purpose-built-postgres-variants-means">What "purpose-built Postgres variants" means</h3>
<p>TimescaleDB is built on top of Postgres, not in place of it. The PostgreSQL wire protocol, SQL query language, extensions like PostGIS and pgvector, your application code, and your ecosystem tooling all stay the same. What changes is the storage engine and execution layer underneath.</p>
<p><strong>MVCC overhead addressed through columnar compression.</strong> The problem: every row insert in vanilla Postgres generates per-tuple MVCC headers, index entries, and WAL records regardless of whether the data will ever be updated, driving 3–5x write amplification and continuous autovacuum load. TimescaleDB's columnar storage (the <code>Columnstore</code> layer) batches up to 1,000 row versions per column into compressed arrays before writing to disk. Each batch write replaces thousands of individual heap tuple insertions with a single compressed segment write. The per-tuple MVCC header overhead is amortized across the batch, and autovacuum pressure drops proportionally. Far less row-level churn to clean up. In practice, write amplification drops from the 3–5x range to near 1:1 for sustained append workloads. The <a href="https://www.tigerdata.com/docs/about/latest/whitepaper">Tiger Data architecture whitepaper</a> covers the columnar layout and compression pipeline in detail.</p>
<p><strong>Row storage replaced by columnar layout for time-series data.</strong> The problem: vanilla Postgres reads all columns of every matching row even when a query needs two, creating 15x+ read amplification on wide tables, with none of the compression techniques applicable to time-series data. Rather than reading all 30 columns of a row to get two values, queries against the columnar layer read only the referenced columns from compressed column arrays. The 15x read amplification drops to near 1:1. Time-series compression (delta-of-delta for timestamps, gorilla-style XOR for floats, run-length encoding for repeated values) routinely achieves 10–20x compression ratios vs. heap storage. A dataset that occupies 1TB in vanilla Postgres often fits in 50–100GB with columnar compression enabled.</p>
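<p>The storage claim is easy to gut-check. At the quoted ratios (assumed here, not benchmark output), a terabyte of heap shrinks to double-digit gigabytes:</p>
<pre><code class="language-python"># Implied footprint at the compression ratios quoted above.
heap_gb = 1024   # ~1TB of row-based heap storage
for ratio in (10, 20):
    print(f"{ratio}x compression: ~{heap_gb // ratio} GB")
</code></pre>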
<p><strong>Query planning overhead reduced through chunk exclusion and continuous aggregates.</strong> The problem: the Postgres planner runs a full optimization pass on every query, and with hundreds of partitions, partition pruning overhead can exceed execution time for simple queries. TimescaleDB's planner extension adds chunk exclusion that operates at a lower level than Postgres's partition pruning. Chunks are indexed by time range in a catalog table, and the planner excludes non-overlapping chunks before the standard planning phase. For query shapes that repeat thousands of times per second, this eliminates most of the per-partition pruning overhead. Continuous aggregates go further: pre-computed rollups stored as materialized views, updated incrementally as new data arrives, so dashboards querying hourly or daily aggregations hit a small summary table instead of scanning billions of raw rows.</p>
<p><strong>WAL volume reduced through batched ingestion.</strong> The problem: at 100K inserts/sec, vanilla Postgres generates 50–100MB/sec of WAL, creating I/O bottlenecks and causing replicas to fall behind. Lagging replicas force the primary to retain more unprocessed WAL, which consumes disk and makes the lag worse. The root cause is per-row WAL records: one per heap insert, one per index insertion. Columnar storage's batch writes generate WAL at the segment level rather than the row level. At 100K inserts/sec, WAL volume drops from 50–100MB/sec to roughly 5–15MB/sec in typical deployments, which eliminates most replication lag issues. Replicas that previously fell behind during write peaks can keep up without tuning.</p>
<p><img src="https://timescale.ghost.io/blog/content/images/2026/02/comparison-chart.png" alt="" loading="lazy"></p>
<p><strong>Concrete numbers.</strong> Benchmark results vary by workload, but the directional data is consistent: ingestion throughput 10–20x higher than vanilla Postgres at equivalent instance size, query performance on time-range aggregations 100x+ faster with columnar storage, storage footprint 10–20x smaller with compression enabled. RTABench, a benchmark for real-time analytics workloads, publishes results showing the performance gap between vanilla PostgreSQL and TimescaleDB across real-world query patterns. <a href="https://rtabench.com/">See the benchmark results</a>.</p>
<h3 id="decision-framework">Decision framework</h3>
<p>Choose a specialized architecture if you score 4+ on the <a href="#when-to-choose-a-different-path">Postgres Workload Scoring Framework</a> AND you're experiencing 2+ early warning signs AND you can project continued data growth.</p>
<p>Strong indicators to act now: you're under 100M rows, you're already building custom partitioning, your team spends 15%+ of engineering time on database optimization, and you can project 500M+ rows within 12 months.</p>
<p>You might not need this if writes are bursty rather than continuous, retention is 7–30 days, queries don't predominantly filter on time ranges, or growth is stable and slow.</p>
<h2 id="optimization-vs-architecture">Optimization vs. Architecture</h2>
<p>There are two different problems that both show up as "database performance is degrading."</p>
<p><strong>Problem 1: Optimization within the right architecture.</strong> The workload fits the database's design. Better indexes, query rewrites, configuration tuning, and hardware upgrades directly improve the trajectory. Postgres expertise solves the problem. For most workloads, vanilla Postgres is the right choice.</p>
<p><strong>Problem 2: The Optimization Treadmill.</strong> The workload hits design tradeoffs baked into the storage engine, concurrency model, and query planner. Optimization helps in the short term but doesn't change the trajectory. Each phase buys time. None buys a different outcome. You're working around the architecture rather than with it.</p>
<p>Knowing which problem you have determines the path forward.</p>
<p>If you followed the optimization phases in this piece, you weren't doing anything wrong. Those were correct responses to the symptoms. Any experienced Postgres team would have done the same. The pattern is common precisely because the progression makes sense at each step.</p>
<p>What changes with recognition is agency. At 10M–50M rows, you can choose a purpose-built architecture in days to weeks and redirect engineering time to product work. At 100M–500M rows, migration is harder but still reasonable, taking 2–6 weeks. At 1B+, it's a multi-month project, and every quarter of delay adds complexity.</p>
<p>The broader principle applies beyond this workload. Different databases have different architectural strengths, so the best choice depends on the workload. Postgres is brilliant for general-purpose relational work. Specialized variants built on top of Postgres excel at specialized patterns. Recognizing when architecture matters more than optimization is an engineering judgment call, not a criticism of the tool.</p>
<p>Architectural fit determines your ceiling. Optimization determines where you operate relative to that ceiling. When you're hitting the ceiling repeatedly, the productive question isn't "how do we optimize better?" It's "are we operating within the right architecture?" With this workload pattern, the ceiling was always there. You just needed enough data volume to find it. Score your workload. If you score 4+ and are under 100M rows, this is the cheapest architectural decision you'll make this year. <a href="https://www.tigerdata.com/docs/about/latest/whitepaper">The whitepaper</a> covers the mechanics. The <a href="https://console.cloud.timescale.com/signup">Tiger Data free trial</a> lets you validate on your own data.</p>
]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Six Signs That Postgres Tuning Won't Fix Your Performance Problems]]></title>
            <description><![CDATA[When Postgres tuning won't fix performance: recognize the six characteristics of time-series workloads that need TimescaleDB's purpose-built architecture.]]></description>
            <link>https://www.tigerdata.com/blog/six-signs-postgres-tuning-wont-fix-performance-problems</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/six-signs-postgres-tuning-wont-fix-performance-problems</guid>
            <category><![CDATA[PostgreSQL]]></category>
            <dc:creator><![CDATA[Matty Stratton]]></dc:creator>
            <pubDate>Thu, 12 Feb 2026 21:26:14 GMT</pubDate>
            <media:content medium="image" href="https://timescale.ghost.io/blog/content/images/2026/02/Postgres-Tuning-Performance-compressed.png">
            </media:content>
            <content:encoded><![CDATA[<p>You've added indexes. You've partitioned tables. You've tuned autovacuum within an inch of its life. Performance improves for a few months, and then the dashboards go red again. Sound familiar?</p><p>If so, you're probably not doing anything wrong. You're running a workload that vanilla Postgres was never designed for, and no amount of configuration will change that.</p><p>It's not transactional. It's not a data warehouse. It's analytics on live data: high-frequency ingestion that stays operationally queryable. If you're running this pattern, you've already been through the cycle: add indexes, partition tables, tune autovacuum, upgrade instances. Each fix buys a few months. Then the metrics climb again.</p><p><strong>The longer you wait, the harder the migration. At 10M rows it takes days. At 500M rows, weeks. At 1B+, months.</strong> Recognizing the pattern early is the highest-leverage decision you can make.</p><p>This post describes six characteristics that define this workload. If four or five apply to your system, the friction is architectural, not operational. (For a deeper look at how purpose-built time-series architecture addresses these constraints, see the <a href="https://www.tigerdata.com/docs/about/latest/whitepaper"><u>Tiger Data architecture whitepaper</u></a>.)</p><h2 id="continuous-high-frequency-ingestion">Continuous High-Frequency Ingestion</h2><p>The database is absorbing thousands to hundreds of thousands of inserts per second. Not in bursts. Not during a nightly ETL window. Continuously, 24/7.</p><p>Consider a semiconductor fab with 8,000 CNC machines and inspection stations on the floor, each reporting vibration, temperature, spindle speed, and tool wear every 2 seconds. That's 4,000 inserts/sec from a single facility. 
Add process control events, quality inspection results, and environmental monitoring across three plants, and you're at 30-50K inserts/sec before accounting for growth.</p><pre><code class="language-SQL">-- What a single station's insert stream looks like
INSERT INTO machine_telemetry (ts, station_id, metric, value)
VALUES
  (now(), 'CNC-4401', 'vibration_mm_s', 2.34),
  (now(), 'CNC-4401', 'spindle_rpm', 12045),
  (now(), 'CNC-4401', 'coolant_temp_c', 31.2),
  (now(), 'CNC-4401', 'tool_wear_pct', 67.8);
-- Multiply by 8,000 stations × 0.5 Hz × 3 facilities
</code></pre><p>This matters because Postgres needs breathing room to run maintenance. Autovacuum, index maintenance, statistics collection. Continuous ingestion means maintenance always competes with writes. There is no off-peak window.</p><h2 id="queries-revolve-around-time">Queries Revolve Around Time</h2><p>Nearly every row has a timestamp, and nearly every query filters on a time range. Last 30 minutes. This week versus last week. Everything between two dates.</p><p>A trading platform captures every order, fill, and cancellation across multiple venues. The operations team monitors execution quality in real time. The compliance team audits historical patterns. Both teams write queries that look like this:</p><pre><code class="language-SQL">-- Operations: real-time execution quality
SELECT venue, avg(fill_latency_us), percentile_cont(0.99)
  WITHIN GROUP (ORDER BY fill_latency_us)
FROM executions
WHERE ts &gt; now() - interval '15 minutes'
GROUP BY venue;

-- Compliance: historical pattern detection
SELECT account_id, count(*) as cancel_count
FROM order_events
WHERE ts BETWEEN '2025-01-01' AND '2025-03-31'
  AND event_type = 'cancel'
  AND cancel_reason = 'client_requested'
GROUP BY account_id
HAVING count(*) &gt; 500;
</code></pre><p>Time is the primary axis for both storage and retrieval. General-purpose B-tree indexes aren't built for this access pattern, which is why teams end up building manual partitioning schemes and custom tooling to get time-range queries to perform.</p><h2 id="data-is-append-only">Data Is Append-Only</h2><p>Once a row lands, it doesn't change. Sensor readings are immutable. Financial transactions don't get updated. Log entries are permanent. When data gets removed, it happens in bulk: drop an entire month's partition, not individual rows.</p><p>A wind farm operator collects turbine performance data: blade pitch, rotor speed, power output, nacelle orientation. Once recorded, these readings are facts. They never get corrected or overwritten.</p><pre><code class="language-SQL">-- This is the entire write pattern. INSERT. No UPDATE. No single-row DELETE.
INSERT INTO turbine_readings
  (ts, turbine_id, blade_pitch_deg, rotor_rpm, power_kw, wind_speed_ms)
VALUES
  (now(), 'WT-112', 12.4, 14.2, 2840, 11.3);

-- Data removal is always bulk
DROP TABLE turbine_readings_2023_q1;</code></pre><p><strong>Every row you insert carries 23 bytes of MVCC transaction metadata, on data you will never update.</strong> Autovacuum still scans these tables constantly, setting hint bits and freezing transaction IDs on rows that will never change. At 50K inserts/sec, that's MVCC overhead on 4.3 billion rows per day that will never be modified. You're paying the full cost of a concurrency model designed for workloads that look nothing like yours.</p><h2 id="retention-is-measured-in-months-or-years">Retention Is Measured in Months or Years</h2><p>Seven years of financial records for compliance. Quarters of manufacturing data for root cause analysis. Two-plus years of training data for ML pipelines.</p><p>A pharmaceutical manufacturer tracks environmental conditions (temperature, humidity, particulate count) across cleanroom facilities to meet FDA 21 CFR Part 11 requirements. When a batch fails quality control six months after production, the investigation pulls environmental data from the exact time window the batch was in each room.</p><pre><code class="language-SQL">-- Root cause investigation: what were cleanroom conditions
-- during a batch produced 6 months ago?
SELECT room_id, avg(temp_c), max(particulate_count),
  bool_or(humidity_pct &gt; 45) as humidity_excursion
FROM cleanroom_environment
WHERE ts BETWEEN '2025-08-14 06:00' AND '2025-08-14 18:00'
  AND facility = 'building_3'
GROUP BY room_id;
</code></pre><p>Short retention hides architectural problems because old data ages out. Long retention removes that escape valve. At 50K inserts per second, that's over 4 billion rows per day, roughly 1.6 trillion rows per year. After three years: nearly 5 trillion rows.</p><h2 id="queries-are-latency-sensitive">Queries Are Latency-Sensitive</h2><p>This data isn't sitting in cold storage waiting for a weekly report. It's being queried actively, under latency constraints.</p><p>A SaaS observability platform collects metrics from thousands of customer deployments. The product serves real-time dashboards, automated alerting, and deep-dive investigation, all from the same database. Latency expectations form a gradient:</p><pre><code class="language-SQL">-- Dashboard widget: last 5 minutes, needs &lt; 100ms response
SELECT host_id, avg(cpu_pct), max(mem_used_bytes)
FROM host_metrics
WHERE ts &gt; now() - interval '5 minutes'
  AND customer_id = 'cust_8821'
GROUP BY host_id;

-- Alert evaluation: last hour, needs &lt; 500ms
SELECT host_id, avg(cpu_pct)
FROM host_metrics
WHERE ts &gt; now() - interval '1 hour'
  AND customer_id = 'cust_8821'
GROUP BY host_id
HAVING avg(cpu_pct) &gt; 90;

-- Incident investigation: last 3 months, seconds acceptable
SELECT date_trunc('hour', ts), avg(cpu_pct), avg(mem_used_bytes)
FROM host_metrics
WHERE ts &gt; now() - interval '90 days'
  AND host_id = 'host-a3f9c'
GROUP BY 1 ORDER BY 1;
</code></pre><p>Data warehouse scope with operational latency requirements. All from a single system.</p><h2 id="growth-is-sustained">Growth Is Sustained</h2><p>Data volume growing 50-100%+ year over year on a predictable curve. Static workloads can be over-provisioned once and left alone. Growing workloads demand constant re-optimization.</p><p>A logistics company tracks GPS position, engine diagnostics, and cargo conditions across a fleet of refrigerated trucks. They started with 200 trucks. Expansion added 150 trucks in year one, another 300 in year two. Each truck reports every 10 seconds.</p><pre><code class="language-markdown">Year 1:  200 trucks ×  6 readings/min × 1,440 min/day = 1.7M rows/day
Year 2:  350 trucks ×  6 readings/min × 1,440 min/day = 3.0M rows/day
Year 3:  650 trucks ×  6 readings/min × 1,440 min/day = 5.6M rows/day

Cumulative after 3 years: ~3.8 billion rows
</code></pre><p>Every optimization you ship today is solving for a table size you'll blow past in six months. The treadmill doesn't stop.</p><h2 id="what-to-do-with-this">What to Do With This</h2><p>Count how many of these characteristics describe your system. If it's two or three, standard <a href="https://timescale.ghost.io/blog/postgres-optimization-treadmill/" rel="noreferrer">Postgres optimization</a> should have a real impact. The architecture fits your workload. Better indexes, smarter queries, autovacuum tuning. The usual playbook works.</p><p>If it's four or five, however, the friction is architectural, not operational. You don't need to abandon Postgres. Tiger Data extends vanilla Postgres to handle exactly this workload. You keep SQL, your extensions, your team's expertise, and the entire Postgres ecosystem. What changes is the storage engine, partitioning, and query planning underneath.</p><p>The numbers bear this out. In <a href="https://www.tigerdata.com/blog/postgresql-timescaledb-1000x-faster-queries-90-data-compression-and-much-more"><u>benchmarks against vanilla PostgreSQL at one billion rows</u></a>, TimescaleDB delivered up to 1,000x faster query performance while reducing storage by 90% through native compression. Ingest throughput stays constant past 10 billion rows, while PostgreSQL's performance degrades as indexed tables outgrow memory (throughput that starts at 100K+ rows/sec can crash to hundreds). On Azure infrastructure running <a href="https://www.tigerdata.com/blog/benchmark-results-fastest-time-series-database-azure"><u>RTABench workloads</u></a>, Tiger Cloud was 1,200x faster than vanilla PostgreSQL across 40 real-time analytics queries. These aren't synthetic edge cases. They're the exact query patterns this post describes: time-range filters, aggregations, selective scans on growing datasets.</p><p><em>This post is part of a series on Postgres performance limits for high-frequency data workloads. 
The full analysis, including a workload scoring framework and migration complexity breakdown at different scales, is in the anchor essay:</em><a href="https://www.tigerdata.com/blog/postgres-optimization-treadmill" rel="noreferrer"><em> <u>Understanding Postgres Performance Limits for Analytics on Live Data</u></em></a><em>. Ready to test it on your own data?</em><a href="https://console.cloud.timescale.com/signup"><em> <u>Start a free Tiger Data trial.</u></em></a></p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[How to Train Your Agent to Be a Postgres Expert]]></title>
            <description><![CDATA[Turn AI into a Postgres expert with our MCP server. Get 35 years of best practices, versioned docs, and prompt templates for production-ready schemas.]]></description>
            <link>https://www.tigerdata.com/blog/free-postgres-mcp-prompt-templates</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/free-postgres-mcp-prompt-templates</guid>
            <category><![CDATA[Announcements & Releases]]></category>
            <category><![CDATA[AI]]></category>
            <category><![CDATA[PostgreSQL]]></category>
            <dc:creator><![CDATA[Matty Stratton]]></dc:creator>
            <pubDate>Wed, 22 Oct 2025 14:02:12 GMT</pubDate>
            <media:content medium="image" href="https://timescale.ghost.io/blog/content/images/2025/10/2025-Oct-21-Prompt-Template-Thumbnail.png">
            </media:content>
<content:encoded><![CDATA[<h3 id="with-prompt-templates-and-versioned-docs-we-turn-35-years-of-postgres-wisdom-into-structured-knowledge-your-agent-can-reason-with">With prompt templates and versioned docs, we turn 35 years of Postgres wisdom into structured knowledge your Agent can reason with.</h3><p>Agents are the <a href="https://www.tigerdata.com/blog/postgres-for-agents" rel="noreferrer">new developer</a>. But they’re generalists.&nbsp;</p><p>What happens when they design your Postgres database? Your schema runs, your tests pass… and six months later your queries crawl and your costs skyrocket.&nbsp;</p><p>AI-generated SQL and database schemas are almost right. And that’s the problem. Fixing schema design mistakes is costlier than refactoring code. It often means multi-week migrations, downtime windows, rollback plans, and your CEO asking why the site is in maintenance mode. The root issue? LLMs don’t have the depth of Postgres and database expertise to build scalable systems. And when agents try to learn, they find documentation written for humans, not for them.&nbsp;</p><p>But agents don’t need more data; they need better context. They need to know what “good Postgres” actually looks like. The good news is that, given the right context and tools, agents can become instant experts. Even with Postgres.&nbsp;</p><p>That’s why we built an MCP server that provides 35 years of Postgres wisdom and full access to the Postgres docs, all in a format that agents can easily process.&nbsp;</p><p>And we think this just might be the best database MCP server ever built. While Neon, Supabase, and other Postgres companies created MCP servers as thin API wrappers, ours teaches AI how to think in Postgres. The Tiger MCP server gives AI tools that work automatically: no prompt engineering or manual lookups needed. You just ask. And it provides correct, idiomatic Postgres.
</p><p>Our new MCP server ships with detailed prompt templates written by our senior engineers, plus versioned Postgres (15-18) docs, and <a href="https://docs.tigerdata.com/" rel="noreferrer">TimescaleDB docs</a>.&nbsp;When your AI needs to design a schema, it automatically pulls the right template and searches the docs (using hybrid search) to generate code that actually lasts. And of course, it handles the basics: start, stop, <a href="https://www.tigerdata.com/blog/fast-zero-copy-database-forks" rel="noreferrer">fork a database</a>. </p><p>But the real magic happens when you ask AI to design your schema. So let’s see what happens when AI designs a simple schema. First without the MCP server, then with it.</p><h2 id="without-the-mcp-server">Without the MCP Server</h2><pre><code class="language-SQL">CREATE TABLE customers (
    customer_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    first_name VARCHAR(100) NOT NULL,
    last_name VARCHAR(100) NOT NULL,
    phone VARCHAR(20),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    is_active BOOLEAN DEFAULT TRUE
);

CREATE INDEX idx_customers_email ON customers(email);

CREATE TABLE authors (
    author_id SERIAL PRIMARY KEY,
    first_name VARCHAR(100) NOT NULL,
    last_name VARCHAR(100) NOT NULL,
    biography TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
</code></pre><p>What’s wrong here:&nbsp;</p><ul><li>Mixed ID types (SERIAL and UUID) with no clear reason</li><li>Arbitrary VARCHAR limits instead of TEXT</li><li>Case-sensitive email index → duplicate accounts</li><li>A redundant idx_customers_email index (the UNIQUE constraint already creates one)</li><li>Nullable is_active and timestamp columns</li><li>CURRENT_TIMESTAMP instead of the more idiomatic now()</li></ul><p>Someone new to Postgres may not notice these problems. Until it’s too late.</p><p>Now let’s see what AI generates with our new MCP server.</p><h2 id="with-the-mcp-server">With the MCP Server</h2><pre><code class="language-SQL">CREATE TABLE authors (
  author_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  name TEXT NOT NULL,
  bio TEXT,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ON authors (name);

CREATE TABLE users (
  user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  email TEXT NOT NULL UNIQUE,
  password_hash TEXT NOT NULL,
  first_name TEXT NOT NULL,
  last_name TEXT NOT NULL,
  phone TEXT,
  is_active BOOLEAN NOT NULL DEFAULT true,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX ON users (LOWER(email));
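-- (Illustrative comment, not part of the generated output: the expression
-- index above makes a lookup like WHERE LOWER(email) = LOWER($1) both
-- case-insensitive and index-backed, so 'Alice@Example.com' and
-- 'alice@example.com' can never become two accounts.)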
CREATE INDEX ON users (created_at);</code></pre><p>What’s better about this?</p><ul><li>Consistent ID strategy with BIGINT GENERATED ALWAYS AS IDENTITY</li><li>TEXT instead of arbitrary VARCHAR limits</li><li>Case-insensitive email lookups</li><li>Modern timestamp handling</li></ul><p>But why does this matter?</p><p>Each of these differences creates a compounding problem. Changing data types later requires full table rewrites. Missing lowercase email handling means duplicate accounts and confused users. And time zones? Every senior developer gets the thousand-yard stare when you mention UTC conversions.</p><p>This is just a small example; imagine what would happen with more complex schemas.</p><p>And if you don’t believe us, here’s what Claude has to say:</p><pre><code class="language-markdown">&gt; Please describe the schema you would create for an e-commerce website two times, first with the tiger mcp server disabled, then with the tiger mcp server enabled. For each time, write the schema to its own file in the current working directory. Then compare the two files and let me know which approach generated the better schema, using both qualitative and quantitative reasons. For this example, only use standard Postgres.</code></pre><figure class="kg-card kg-video-card kg-width-regular" data-kg-thumbnail="https://timescale.ghost.io/blog/content/media/2025/10/how-to-train-your-agent_thumb.jpg" data-kg-custom-thumbnail="">
            <div class="kg-video-container">
                <video src="https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/media/2025/10/how-to-train-your-agent.mp4" poster="https://img.spacergif.org/v1/1280x720/0a/spacer.png" width="1280" height="720" loop="" autoplay="" muted="" playsinline="" preload="metadata" style="background: transparent url('https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/media/2025/10/how-to-train-your-agent_thumb.jpg') 50% 50% / cover no-repeat;"></video>
                <div class="kg-video-overlay">
                    <button class="kg-video-large-play-icon" aria-label="Play video">
                        <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                            <path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"></path>
                        </svg>
                    </button>
                </div>
                <div class="kg-video-player-container kg-video-hide">
                    <div class="kg-video-player">
                        <button class="kg-video-play-icon" aria-label="Play video">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <path d="M23.14 10.608 2.253.164A1.559 1.559 0 0 0 0 1.557v20.887a1.558 1.558 0 0 0 2.253 1.392L23.14 13.393a1.557 1.557 0 0 0 0-2.785Z"></path>
                            </svg>
                        </button>
                        <button class="kg-video-pause-icon kg-video-hide" aria-label="Pause video">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <rect x="3" y="1" width="7" height="22" rx="1.5" ry="1.5"></rect>
                                <rect x="14" y="1" width="7" height="22" rx="1.5" ry="1.5"></rect>
                            </svg>
                        </button>
                        <span class="kg-video-current-time">0:00</span>
                        <div class="kg-video-time">
                            /<span class="kg-video-duration">0:50</span>
                        </div>
                        <input type="range" class="kg-video-seek-slider" max="100" value="0">
                        <button class="kg-video-playback-rate" aria-label="Adjust playback speed">1×</button>
                        <button class="kg-video-unmute-icon" aria-label="Unmute">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <path d="M15.189 2.021a9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h1.794a.249.249 0 0 1 .221.133 9.73 9.73 0 0 0 7.924 4.85h.06a1 1 0 0 0 1-1V3.02a1 1 0 0 0-1.06-.998Z"></path>
                            </svg>
                        </button>
                        <button class="kg-video-mute-icon kg-video-hide" aria-label="Mute">
                            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
                                <path d="M16.177 4.3a.248.248 0 0 0 .073-.176v-1.1a1 1 0 0 0-1.061-1 9.728 9.728 0 0 0-7.924 4.85.249.249 0 0 1-.221.133H5.25a3 3 0 0 0-3 3v2a3 3 0 0 0 3 3h.114a.251.251 0 0 0 .177-.073ZM23.707 1.706A1 1 0 0 0 22.293.292l-22 22a1 1 0 0 0 0 1.414l.009.009a1 1 0 0 0 1.405-.009l6.63-6.631A.251.251 0 0 1 8.515 17a.245.245 0 0 1 .177.075 10.081 10.081 0 0 0 6.5 2.92 1 1 0 0 0 1.061-1V9.266a.247.247 0 0 1 .073-.176Z"></path>
                            </svg>
                        </button>
                        <input type="range" class="kg-video-volume-slider" max="100" value="100">
                    </div>
                </div>
            </div>
            
        </figure><pre><code class="language-markdown">Verdict:

The Tiger MCP-enabled schema is objectively superior because:

1. Type Safety: Uses modern PostgreSQL types that prevent common errors
2. Data Integrity: 4x more constraints ensure data quality at the database level
3. Performance: Better indexing strategy with 55% more indexes, including partial and expression indexes
4. Maintainability: Better documentation, clear comments, proper naming
5. Features: More complete e-commerce functionality (discounts, full-text search, audit trails)
6. PostgreSQL Best Practices: Follows official PG17 recommendations from the documentation
7. Future-Proof: Uses modern features like GENERATED ALWAYS AS IDENTITY and NULLS NOT DISTINCT

The Tiger MCP server provided access to PostgreSQL-specific best practices documentation and design patterns that resulted in a more robust, performant, and maintainable schema.
</code></pre><h2 id="how-prompt-templates-make-it-all-possible">How Prompt Templates Make It All Possible</h2><p>Behind the scenes, AI uses the MCP server to call <code>get_prompt_template('design_postgres_tables')</code> to load schema design guidance. No pasted docs. No corrections. Just better code.</p><p>Prompt templates turn production wisdom into reusable guidance for AI. Instead of hunting through documentation written for humans, AI gets the information it needs in a format built for machines.</p><p>It comes down to the fact that humans and LLMs have opposite needs. Humans need narratives and memorable examples (and yes, even cat memes) to help them retain information. LLMs need to preserve context window space. That’s why prompt templates make terrible blog posts, but perfect AI guidance.</p><p>Our philosophy is: don't re-teach what the model already knows. LLMs have seen millions of lines of SQL. They know how to write CREATE TABLE. What they don’t know is the 35 years of Postgres wisdom about what works well and what doesn’t.</p><p>It's like your senior DBA whispering advice in the model's ear.</p><p>Our schema design template, <code>design_postgres_tables</code>, doesn’t explain what a primary key is. It jumps straight to guidance:</p><p>“Prefer <code>BIGINT GENERATED ALWAYS AS IDENTITY</code>; use <code>UUID</code> only when global uniqueness is needed.”</p><p>For data types, it doesn’t teach from scratch. It just tells you what works:</p><p>“DO NOT use <code>money</code> type; DO use <code>numeric</code> instead.”</p><p>Here’s a real snippet from the template:</p><pre><code class="language-markdown">## Postgres "Gotchas"

- **FK indexes**: Postgres **does not** auto-index FK columns. Add them.
- **No silent coercions**: length/precision overflows error out (no truncation). 
  Example: inserting 999 into `NUMERIC(2,0)` fails with error, unlike some 
  databases that silently truncate or round.
- **Heap storage**: no clustered PK by default (unlike SQL Server/MySQL InnoDB); 
row order on disk is insertion order unless explicitly clustered.</code></pre><p>These gotchas trip up LLMs the same way they trip up developers new to Postgres. We optimized these templates for machines: short, factual, and precise, packing maximum guidance into minimum tokens.&nbsp;</p><p>We tested the same approach on a real IoT schema design task. Without templates, the AI added forbidden configurations and missed critical optimizations. <em>With</em> templates, it generated production-ready code with compression, continuous aggregates, and tuned performance.</p><p>That’s how prompt templates work. Now let’s see how the MCP server makes it all happen.</p><h2 id="how-this-mcp-server-is-smarter-than-others">How This MCP Server is Smarter Than Others</h2><p>Where Neon, Supabase, and other Postgres companies built their MCP servers as thin API wrappers, ours teaches AI how to think in Postgres, with tools that work automatically: no prompt engineering or manual lookups needed.</p><p><strong><code>get_prompt_template</code> provides auto-discovered expertise. </strong>Instead of having to call a template explicitly, you just say “I want to make a schema for IoT devices…” and the MCP server figures it out. </p><p>With self-discoverable templates, the AI can detect intent and load the right recipe, applying 35 years of Postgres best practices behind the scenes. </p><p><strong>The templates have real depth. </strong>No scraped snippets or boilerplate. The templates are written by senior Postgres engineers and provide opinionated, production-tested guidance that sidesteps every trap a seasoned DBA knows to avoid.</p><p><strong>Postgres-native vector retrieval adds the right context.</strong> When the AI needs more information, the MCP server searches the versioned Postgres (15-18) and TimescaleDB docs.
And it uses Postgres itself for storage and vector search.</p><p>Versioning is critical. For example, Postgres 15 introduced UNIQUE NULLS NOT DISTINCT, while 16 improved parallel queries, and 17 changed COPY error handling. The MCP keeps AIs grounded in correct syntax every time, avoiding broken code from the wrong version.</p><p>The Tiger MCP doesn’t just wire up APIs. It teaches AI to think like a real Postgres engineer. </p><p>You don’t have to craft the perfect prompt. You just ask, and it does the right thing.</p><h2 id="see-it-for-yourself">See It For Yourself</h2><p>Install the Tiger CLI and MCP server:</p><pre><code class="language-shell">curl -fsSL https://cli.tigerdata.com | sh
tiger auth login
tiger mcp install</code></pre><p>(We also have alternative <a href="https://github.com/timescale/tiger-cli"><u>installation instructions</u></a> for the CLI tool.)</p><p>Then select your AI assistant (Claude Code, Cursor, VS Code, Windsurf, etc.) and immediately get real Postgres knowledge flowing into your AI.</p><p>This is how Postgres becomes the best database to use with AI coding tools: not by accident, not because someone pasted docs into a chat, but because the tooling now teaches AI how to think in Postgres.&nbsp;</p><p>Try the MCP server. Break it. <a href="https://timescaledb.slack.com/join/shared_invite/zt-38c4rrt9t-eR8I4hnb4qeGLUrL6hM3mA#/shared-invite/email"><u>Improve it</u></a>. Help us teach every AI to write real Postgres.</p><hr><p><strong>About the authors</strong></p><p><strong>Matty Stratton</strong></p><p>Matty Stratton is the Head of Developer Advocacy and Docs at Tiger Data, a well-known member of the DevOps community, founder and co-host of the popular <a href="https://www.arresteddevops.com/"><u>Arrested DevOps</u></a> podcast, and a global organizer of the <a href="https://devopsdays.org"><u>DevOpsDays</u></a> set of conferences.</p><p>Matty has over 20 years of experience in IT operations and is a sought-after speaker internationally, presenting at Agile, DevOps, and cloud engineering focused events worldwide. Demonstrating his keen insight into the changing landscape of technology, he recently changed his license plate from DEVOPS to KUBECTL.</p><p>He lives in the Chicagoland area and has three awesome kids and two Australian Shepherds, whom he loves just a little bit more than he loves Diet Coke.</p><p><strong>Matvey Arye</strong></p><p><a href="https://www.linkedin.com/in/matvey-arye/"><u>Matvey Arye</u></a> is a founding engineering leader at Tiger Data (creators of TimescaleDB), the premier provider of relational database technology for time-series data and AI.
Currently, he manages the team at Tiger Data responsible for building the go-to developer platform for AI applications.&nbsp;</p><p>Under his leadership, the Tiger Data engineering team has introduced partitioning, compression, and incremental materialized views for time-series data, plus cutting-edge indexing and performance innovations for AI.&nbsp;</p><p>Matvey earned a Bachelor’s degree in Engineering at The Cooper Union. He earned a Doctorate in Computer Science at Princeton University, where his research focused on cross-continental data analysis covering issues such as networking, approximate algorithms, and performant data processing.&nbsp;</p><p><strong>Jacky Liang</strong></p><p><a href="https://www.linkedin.com/in/jjackyliang/"><u>Jacky Liang</u></a> is a developer advocate at Tiger Data with an obsession with AI and LLMs. He's worked at Pinecone, Oracle Cloud, and Looker Data as both a software developer and product manager, which has shaped the way he thinks about software.&nbsp;</p><p>He cuts through AI hype to focus on what actually works. How can we use AI to solve real problems? What tools are worth your time? How will this technology actually change how we work?&nbsp;</p><p>When he's not writing or speaking about AI, Jacky builds side projects and tries to keep up with the endless stream of new AI tools and research—an impossible task, but he keeps trying anyway. His model of choice is Claude Sonnet 4 and his favorite coding tool is Claude Code.</p>]]></content:encoded>
        </item>
    </channel>
</rss>