<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
    <channel>
        <title><![CDATA[Tiger Data Blog]]></title>
        <description><![CDATA[Insights, product updates, and tips from TigerData (Creators of TimescaleDB) engineers on Postgres, time series & AI. IoT, crypto, and analytics tutorials & use cases.]]></description>
        <link>https://www.tigerdata.com/blog</link>
        <image>
            <url>https://www.tigerdata.com/icon.ico</url>
            <title>Tiger Data Blog</title>
            <link>https://www.tigerdata.com/blog</link>
        </image>
        <generator>RSS for Node</generator>
        <lastBuildDate>Tue, 07 Apr 2026 09:54:09 GMT</lastBuildDate>
        <atom:link href="https://www.tigerdata.com/blog" rel="self" type="application/rss+xml"/>
        <ttl>60</ttl>
        <item>
            <title><![CDATA[Boosting Postgres Performance With Prepared Statements and PgBouncer's Transaction Mode]]></title>
            <description><![CDATA[You can now boost both your application’s and Postgres’ performance by using prepared statements and PgBouncer’s transaction mode. Learn how.]]></description>
            <link>https://www.tigerdata.com/blog/boosting-postgres-performance-with-prepared-statements-and-pgbouncers-transaction-mode</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/boosting-postgres-performance-with-prepared-statements-and-pgbouncers-transaction-mode</guid>
            <category><![CDATA[PostgreSQL]]></category>
            <category><![CDATA[PostgreSQL Performance]]></category>
            <dc:creator><![CDATA[Grant Godeke]]></dc:creator>
            <pubDate>Fri, 03 Nov 2023 11:12:50 GMT</pubDate>
            <media:content medium="image" url="https://timescale.ghost.io/blog/content/images/2023/11/boosting-postgres-performance-with-prepared-statements-and-pgbouncer-transaction-mode.png">
            </media:content>
            <content:encoded><![CDATA[<p>Adopting <a href="https://en.wikipedia.org/wiki/Prepared_statement">prepared statements</a> in your application is an easy performance gain. Using prepared statements lets your application skip query parsing and analyzing, eliminating a substantial amount of overhead. Pairing this with a connection pooler and transaction mode can dramatically boost your Postgres database performance. Needless to say, we were excited to learn that support for prepared statements in transaction mode was introduced in the <a href="https://www.pgbouncer.org/changelog.html#pgbouncer-121x">1.21 release of PgBouncer</a>!&nbsp;</p><p>In this post, we’ll cover, at a high level, why you should enable prepared statements in your application. If you’re looking for the gory details of the implementation, <a href="https://www.crunchydata.com/blog/prepared-statements-in-transaction-mode-for-pgbouncer" rel="noreferrer">our peers over at Crunchy wrote a great post on the 1.21 release here</a>!</p><p><a href="https://timescale.ghost.io/blog/connection-pooling-on-timescale-or-why-pgbouncer-rocks/" rel="noreferrer">Connection poolers on our mature cloud platform, Timescale, use PgBouncer</a> under the hood, supporting prepared statements and transaction pooling. <a href="https://console.cloud.timescale.com/signup">Start a free trial today to try it out</a>—no credit card required!</p><h3 id="what-is-a-prepared-statement-and-why-does-it-boost-postgres%E2%80%99-performance">What is a prepared statement, and why does it boost Postgres’ performance?</h3><p>A <a href="https://www.postgresql.org/docs/current/sql-prepare.html#:~:text=A%20prepared%20statement%20is%20a,statement%20is%20planned%20and%20executed">prepared statement</a> is a query that can accept parameters that you <code>PREPARE</code> on the server side, which does the query parsing, analysis, and any rewriting. 
You can then call this query by <code>EXECUTE</code>ing it with the corresponding arguments.&nbsp;</p><p>I like to think of prepared statements as similar to functions. You create a template of what will happen in the database and can call that template with parameters to have it happen inside the database. To over-extend the analogy, this “template” (prepared statements) lets you pre-compile the query, potentially greatly improving the overall execution time.</p><p>In an application setting, many of your queries are probably already templated rather than executing arbitrary queries. These are ideal candidates to use as prepared statements since they repeat the same underlying query but with different values.&nbsp;</p><h3 id="how-to-make-it-work-with-my-app">How to make it work with my app?</h3><p>One limitation of the implementation of prepared statement support in <a href="https://www.tigerdata.com/blog/using-pgbouncer-to-improve-your-postgresql-database-performance" rel="noreferrer">PgBouncer</a> is that “PgBouncer tracks protocol-level named prepared statements.” Basically, rather than writing raw SQL, we should use the libpq implementation of prepared statements. </p><p>Fortunately, whatever language your application uses likely already relies on this as part of its object-relational mapping (ORM) implementation! We just need to use the corresponding API rather than writing <code>PREPARE … EXECUTE …</code> in raw SQL ourselves.</p><p>ActiveRecord, the default ORM of Rails, makes this even simpler for us by using prepared statements by default. In your <code>config/database.yml</code> file, ensure you have not altered your production environment to turn off <code>prepared_statements</code>. ActiveRecord allows up to 1,000 prepared statements by default. </p><p>Since Timescale allows for 100, we recommend reducing the config to equal our <code>max_prepared_statements</code> (see the next section for more detail). 
Thus, your config might look something like this:</p><pre><code class="language-YAML">production:
  adapter: postgresql
  statement_limit: 100
</code></pre>
<p>Note that <code>prepared_statements: false</code> is absent, as we want them on (which they are by default).</p><p>For an example of what is happening under the hood (or to use as a template for other ORMs that may not handle this automatically), the Ruby pg gem provides the <code>prepare()</code> function. Creating a prepared statement would look something like:</p><pre><code class="language-Ruby">conn = PG.connect(:dbname =&gt; 'tsdb')
conn.prepare('statement1', 'insert into metrics (created, type_id, value) values ($1::timestamptz, $2::int, $3::float)')
</code></pre>
<p>This uses the same table structure as our <a href="https://docs.timescale.com/tutorials/latest/energy-data/" rel="noreferrer">Energy tutorial</a>. Note that the <a href="https://deveiate.org/code/pg/PG/Connection.html#method-i-prepare">official gem documentation</a> recommends casting the values to the desired types to avoid type conflicts. These are SQL types, not Ruby types, though, since they are a part of the query.&nbsp;</p><p>To execute the query, you’d use something like:</p><pre><code class="language-Ruby">conn.exec_prepared('statement1', ['2023-05-31 23:59:59.043264+00', 13, 1.78])
</code></pre>
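Putting <code>prepare()</code> and <code>exec_prepared()</code> together, here is a minimal sketch of the prepare-once, execute-many pattern. Note that the <code>PreparedRunner</code> helper and its <code>run</code> method are our own illustration (not part of the pg gem), and the usage assumes the same <code>tsdb</code> database as above:

```ruby
# Illustrative helper (not part of the pg gem): prepares each named
# statement at most once per connection, then executes it. Repeated
# calls reuse the server-side statement, which PgBouncer can map onto
# its own cached PGBOUNCER_<n> statements.
class PreparedRunner
  def initialize(conn)
    @conn = conn
    @prepared = {} # statement name => true, once prepared on this connection
  end

  def run(name, sql, params)
    unless @prepared[name]
      @conn.prepare(name, sql)
      @prepared[name] = true
    end
    @conn.exec_prepared(name, params)
  end
end

# Usage with a real connection would look like:
#   require 'pg'
#   conn = PG.connect(dbname: 'tsdb')
#   runner = PreparedRunner.new(conn)
#   runner.run('statement1',
#     'insert into metrics (created, type_id, value) values ($1::timestamptz, $2::int, $3::float)',
#     ['2023-05-31 23:59:59.043264+00', 13, 1.78])
```

Because the helper only calls <code>prepare</code> and <code>exec_prepared</code>, it works with any connection object exposing those two methods.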
<p>As a quirk of the PgBouncer implementation, we do not need to <code>DEALLOCATE</code> prepared statements. PgBouncer handles this automatically for us.&nbsp;</p><p>All you need to do is, for each connection, prepare the statement you want to use once and then call <code>exec_prepared()</code> as often as the connection stays open.</p><h3 id="what%E2%80%99s-happening-behind-the-scenes">What’s happening behind the scenes?</h3><p>Behind the scenes, PgBouncer intercepts the creation of the prepared statement and creates a cache of prepared statements for the connection pool. It looks at what the client called it (<code>statement1</code> in our example) and checks whether a corresponding server-side statement (named something like <code>PGBOUNCER_1234</code>) already exists on that connection. If it does, it simply reuses the already created statement. If not, it will automatically create it on behalf of the client in the database connection.</p><p>This implementation lets you effectively cache prepared statements across connections—which is insanely cool—giving you the full benefit of prepared statements even when using a transaction pool. For example, if your pool size is 20, and you have 100 client connections, each running the same query, the query will be planned at most 20 times instead of the standard 100 in previous transaction mode usage. That’s pretty awesome!</p><p>One thing to be mindful of is the <code>max_prepared_statements</code> setting in your connection pooler. On Timescale, the default is 100. For ideal performance, it’s recommended to keep the number of distinct prepared statements your application uses below this value.</p><p>This lets you get the maximum efficiency out of PgBouncer’s cache. If you go over this limit, it is not a big deal, but it may result in slightly more query plans than otherwise, as prepared statements will get deallocated from the database connection. 
For example, on Timescale, with your 101st prepared statement, the first prepared statement will be replaced by it.</p><div class="kg-card kg-callout-card kg-callout-card-purple"><div class="kg-callout-emoji">📚</div><div class="kg-callout-text"><a href="https://www.timescale.com/blog/using-pgbouncer-to-improve-your-postgresql-database-performance/" rel="noreferrer">Need some advice on how to configure PgBouncer correctly</a>? Here you go!</div></div><h2 id="final-statements">Final Statements</h2><p>Prepared statements rock! Using them can be a pretty easy performance win for your application while also boosting your Postgres database performance. The only thing to be careful about is to make sure that your application is using the libpq version of creating a prepared statement. Basically, rather than writing the raw SQL yourself, make sure you use your ORM’s API to create a prepared statement! Also, in PgBouncer, you don’t have to worry about deallocating a prepared statement ever.</p><p>If you’d like to try this out yourself, <a href="https://console.cloud.timescale.com/signup" rel="noreferrer">Timescale offers a free trial</a>—no credit card required!</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Making PostgreSQL Backups 100x Faster via EBS Snapshots and pgBackRest]]></title>
            <description><![CDATA[pgBackRest is an awesome tool for backup creation and restore in Postgres, but it can get slow for large databases. We mitigated this problem by incorporating EBS snapshots into our backup strategy.]]></description>
            <link>https://www.tigerdata.com/blog/making-postgresql-backups-100x-faster-via-ebs-snapshots-and-pgbackrest</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/making-postgresql-backups-100x-faster-via-ebs-snapshots-and-pgbackrest</guid>
            <category><![CDATA[PostgreSQL]]></category>
            <category><![CDATA[Engineering]]></category>
            <dc:creator><![CDATA[Grant Godeke]]></dc:creator>
            <pubDate>Thu, 31 Aug 2023 14:16:35 GMT</pubDate>
            <media:content medium="image" url="https://timescale.ghost.io/blog/content/images/2023/08/elephant-armor.png">
            </media:content>
            <content:encoded><![CDATA[<p>If you have experience running PostgreSQL in a production environment, you know that maintaining database backups is a daunting task. In the event of a catastrophic failure, data corruption, or other form of data loss, the ability to quickly restore from these backups will be vital for minimizing downtime. If you’re managing a database, maintaining your backups and getting your recovery strategy in order is probably the first check on your checklist.</p><p>Perhaps this has already given you one headache or two because<strong> creating and restoring backups for large PostgreSQL databases can be a very slow process.</strong></p><div class="kg-card kg-callout-card kg-callout-card-purple"><div class="kg-callout-emoji">🗒️</div><div class="kg-callout-text">A refresher on your basic backup and restore Postgres commands:<br><br><a href="https://www.timescale.com/learn/backup" rel="noreferrer">Postgres Backup Cheat Sheet</a><br><a href="https://www.timescale.com/learn/postgres-cheat-sheet/restore" rel="noreferrer">Postgres Restore Cheat Sheet</a></div></div><p><br></p><p>The most widely used external tool for backup operations in PostgreSQL is <a href="https://pgbackrest.org/?ref=timescale.com">pgBackRest</a>, which is very powerful and reliable. But pgBackRest can also be very time-consuming, especially for databases well over 1 TB. </p><p>The problem is exacerbated when restoring backups from production databases that continue to ingest data, thus creating more WAL (write-ahead log) that must be applied. In this case, a full backup and restore can take hours or even days, which can be a nightmare in production databases.</p><p>When operating our platform (<a href="https://www.timescale.com/">Timescale</a>, a cloud database platform built on PostgreSQL), we struggled with this very thing. 
At Timescale, we pride ourselves on making PostgreSQL faster and more scalable for large volumes of time-series data—therefore, our customers’ databases are often large (many TBs). At first, we were basing our backup and restore operations entirely on pgBackRest, and we were experiencing some pain:</p><ul><li>Creating full backups was very slow. This was a problem, for example, when our customers were trying to upgrade their PostgreSQL major version within our platform, as we took a fresh, full backup after the upgrade in case there was a failure shortly after. Upgrades are already stressful, and adding a very slow backup experience was not helping. </li><li>Restoring from backups was also too slow, both restoring from the backups themselves and replaying any WAL that had accrued since the last backup. (<a href="https://docs.timescale.com/use-timescale/latest/backup-restore/backup-restore-cloud/" rel="noreferrer">In Timescale, we automatically take full and incremental backups of all our customers’ databases.</a>)</li></ul><p>In this blog post, we’re sharing how we solved this problem by combining pgBackRest with EBS snapshots. Timescale runs in AWS, so we had the advantage of cloud-native infrastructure. If you're running PostgreSQL in AWS, you may be able to benefit from a similar approach.</p><p><strong>After introducing EBS snapshots, our backup creation and restore process got 100x faster. </strong>This significantly improved the experience for our customers and made things much easier for our team.</p><h2 id="quick-introduction-to-database-backups-in-postgresql-and-why-we-used-pgbackrest">Quick Introduction to Database Backups in PostgreSQL (And Why We Used pgBackRest)</h2><p>If you asked 100 engineers if they thought backups were important for production databases, they would all say "yes"—but if you then took those same 100 engineers and gave them a grade on their backups, most wouldn’t hit a pass mark. 
</p><p>We all collectively understand the need for backups, but it’s still hard to create an effective backup strategy, implement it, run it, and test that it’s working appropriately.</p><p>In PostgreSQL specifically, there are two ways to implement backups: <strong>logical database dumps</strong>, which contain the SQL commands needed to recreate (not restore) your database from scratch, and <strong>physical backups</strong>, which capture the files that store your database state.  </p><p>Physical backups are usually paired with a mechanism to store the constant stream of write-ahead logs (WALs), which describe all data mutations on the system. A physical backup can then be restored to get PostgreSQL to the exact same state as it was when that backup was taken, and the WAL files rolled forward to get to a specific point in time, maybe just before someone (accidentally?) dropped all your data or your disk ate itself.</p><p>Logical backups are useful to recreate databases (potentially on other architectures), but maintaining physical backups is imperative for any production workload where uptime is valued. Physical backups are exact: they can be restored quickly and provide point-in-time recovery. In the rest of this article, we’ll discuss physical backups.</p><h3 id="how-are-physical-backups-usually-created-in-postgresql">How are physical backups usually created in PostgreSQL?</h3><ul><li>The first option is using the <a href="https://www.postgresql.org/docs/current/app-pgbasebackup.html">pg_basebackup</a> command. <code>pg_basebackup</code> copies the data directory and optionally includes the WAL files, but it doesn’t support incremental backups and has limited parallelization capabilities. The whole process is very manual, too. If you’re using <code>pg_basebackup</code>, you’ll instantly get the files you need to bootstrap a new database in a tarball or directory, but not much else. 
</li><li>Tools like <a href="https://pgbackrest.org/?ref=timescale.com">pgBackRest</a> were designed to overcome the limitations of <code>pg_basebackup</code>. pgBackRest allows for full and incremental backups, multi-threaded operations, and point-in-time recovery. It ensures data integrity by validating checksums during the backup process, supports different types of storage, and much more. In other words, pgBackRest is a robust and feature-rich tool, making it our choice for PostgreSQL backup operations.</li></ul><h2 id="the-problem-with-pgbackrest">The Problem With pgBackRest</h2><p>But pgBackRest is not perfect: it reads and backs up files, causing additional load on your system. This can create performance bottlenecks that complicate your backup and restore strategy, especially if you’re dealing with large databases.</p><p>Even though pgBackRest offers incremental backups and parallelization, it often gets slow when executing full backups over large data volumes or on an I/O-saturated system.</p><p>While you can sometimes rely on differential or incremental backups to minimize data (<a href="https://docs.timescale.com/use-timescale/latest/backup-restore/backup-restore-cloud/" rel="noreferrer">like we do in Timescale</a>), there are situations in which creating full backups is unavoidable. Backups could also be taken on a standby, but at the end of the day, you’re limited by how fast you can get data off your volumes. </p><p>We shared the example of full database upgrades earlier, but we're also talking about any other kind of migration, integrity checks, archival operations, etc. 
In Timescale, some of our most popular platform features (like <a href="https://timescale.ghost.io/blog/introducing-one-click-database-forking-in-timescale-cloud/">forks,</a> <a href="https://timescale.ghost.io/blog/high-availability-for-your-production-environments-introducing-database-replication-in-timescale-cloud/">high-availability replicas</a>, and <a href="https://timescale.ghost.io/blog/high-availability-for-your-production-environments-introducing-database-replication-in-timescale-cloud/">read replicas</a>) imply a data restore from a full backup.</p><p>Having a long-running full backup operation in your production database is not only inconvenient, it can also conflict with other high-priority DB tasks, affecting your overall performance. This was problematic for us.</p><p>The slowness of pgBackRest was also problematic when it was time to restore from these backups. It’s very good at CPU parallelization, but when you’re trying to write terabytes of data as fast as possible, I/O will be the bottleneck. When it comes to recovery time objective or RTO, every minute counts. In case of major failure, you want to get that database up as soon as possible.</p><h2 id="using-ebs-snapshots-to-speed-up-the-creation-of-backups">Using EBS Snapshots to Speed Up the Creation of Backups</h2><p>To speed up the process of creating fresh full backups, we decided to replace standard pgBackRest full backups with on-demand <a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSSnapshots.html?ref=timescale.com">EBS snapshots</a>.</p><p>Our platform runs in AWS, which comes with some advantages. Using snapshots is a much more cloud-native approach to the problem of backups compared to what’s been traditionally used in PostgreSQL management. </p><p>EBS snapshots create a point-in-time copy of a particular database: this snapshot can be restored, effectively making it a backup. 
The key is that <strong>taking a snapshot is significantly faster than the traditional approach with pgBackRest</strong>: in our case, our p90 snapshot time decreased by over 100x. This gap gets wider the larger your database is!</p><p>How did we implement this? Basically, we did a one-to-one replacement of pgBackRest. Instead of waiting for the pgBackRest fresh full backup to complete, we now take a snapshot. We still wait for the backup to complete, but the process is significantly faster via snapshots. This way, we get the quick snapshot but also the full data copy and checksumming for datafile integrity, which pgBackRest performs.</p><p>If a user experiences a failure shortly after an upgrade, we have a fresh backup—the snapshot—that we can quickly restore (we’ll cover how we handle restores next). We still take a fresh full backup using pgBackRest (yay for redundancy), but the key difference is that this happens after the upgrade process has been fully completed. </p><p>If a failure has happened, the service is available to our customer quickly: we don’t have to force them to wait for the lengthy pgBackRest process to finish before being able to use their service again.</p><p>The trade-offs for adopting this approach were minimal. The only downside to consider is that, by taking snapshots, we now have redundant backups (both snapshots and full backups), so we incur additional storage costs. But what we’ve gained (both in terms of customer satisfaction and our own peace of mind) is worth the price.</p><h2 id="combining-ebs-snapshots-and-pgbackrest-for-quick-data-restore-taking-partial-snapshots-replaying-wal">Combining EBS Snapshots and pgBackRest for Quick Data Restore: Taking Partial Snapshots, Replaying WAL</h2><p>Solving the first problem we encountered with pgBackRest (i.e., slow creation of full backups) was relatively simple. 
We knew exactly when we needed an EBS snapshot to be created, as this process is always tied to a very specific workflow (e.g., performing a major version upgrade).</p><p>But we also wanted to explore using EBS snapshots to improve our data restore functionality. As we mentioned earlier, some popular features in the Timescale platform rely heavily on restores, including <a href="https://timescale.ghost.io/blog/introducing-one-click-database-forking-in-timescale-cloud/">creating forks,</a> <a href="https://timescale.ghost.io/blog/high-availability-for-your-production-environments-introducing-database-replication-in-timescale-cloud/">high-availability replicas</a>, and <a href="https://timescale.ghost.io/blog/high-availability-for-your-production-environments-introducing-database-replication-in-timescale-cloud/">read replicas,</a> all of which imply a data restore from a full backup.</p><p>This use case posed a slightly different and more difficult challenge since to restore from a full backup, such a backup needs to exist first, reflecting the latest state of the service. </p><p>To implement this, the first option we explored was taking an EBS snapshot when the user clicked “Create” a fork, read replica, or high-availability replica, to then restore from that snapshot. However, this process was still too slow for the end user. To get the performance we wanted, we had to think a bit beyond the naive approach and determine a way to take semi-regular snapshots across our fleet.</p><p>Fortunately, we already had a backup strategy for pgBackRest in place that we chose to mirror. Now, all Timescale services have EBS snapshots taken daily. For redundancy reasons and to verify file checksums, we still take our standard pgBackRest partial backups, but we don’t depend on them.</p><p>Once the strategy is solved, restoring data from an EBS snapshot mirrors a restore from pgBackRest very closely. 
We simply choose the corresponding EBS snapshot we want to restore—in the cases mentioned above, always the most recent—and then replay any WAL that has accumulated since that restore point. Here, it is important to note that <a href="https://timescale.ghost.io/blog/how-high-availability-works-in-our-cloud-database/#what-if-theres-a-failure-affecting-your-storage">we still rely on pgBackRest to do our WAL management</a>. pgBackRest works great for us here; nothing comes close in terms of parallel WAL streaming.</p><p>This EBS snapshotting and pgBackRest approach has given us great results so far. Using snapshots for restores has helped improve our product experience, also providing our customers with an even higher level of reliability. Keeping pgBackRest in parallel has given us peace of mind that we still have a traditional backup approach validating our data alongside the snapshots.</p><p>We’re continually improving our strategy, though—for example, by being smarter about when we snapshot, such as looking at the accumulated WAL since the last snapshot to determine if we need to snapshot certain services more frequently. This practice helps improve restore times by reducing the amount of WAL that would need to be replayed, which is often the bottleneck in this process.</p><h2 id="on-snapshot-prewarming">On Snapshot Prewarming</h2><p>One important trade-off with this EBS snapshot approach is the balance between deployment time and initial performance. One limitation of a snapshot restore is that not all blocks are necessarily prewarmed and <a href="https://timescale.ghost.io/blog/scaling-postgresql-with-amazon-s3-an-object-storage-for-low-cost-infinite-database-scalability/">may need to be fetched from S3</a> the first time they are used, which is a slow process.</p><p>To give props to pgBackRest restore, it does not have this issue. 
For our platform features, our trade-off was between getting the user a running read replica (or fork or high-availability replica) as quickly as possible or making sure it was as performant as possible.</p><p>After some back and forth, we decided on our current approach on prewarming: we’re reading as much as we can for five minutes, prioritizing the most recently modified files first. The idea here is that we will warm the data the user is actively engaging with first. After five minutes, we then hand the process off to PostgreSQL to continue reading the rest of the volume at a slower pace until it is complete. For the initial warming, we use a custom <a href="https://www.educative.io/answers/what-is-a-goroutine?ref=timescale.com">goroutine</a> that reads concurrently from files.</p><h2 id="backing-it-up">Backing It Up</h2><p>We are not completely replacing our pgBackRest backup infrastructure with EBS snapshots anytime soon: it is hard to give up on the effectiveness and reliability of pgBackRest. </p><p>But by combining EBS snapshots with pgBackRest across our infrastructure, we’ve been able to mitigate its performance problem significantly, speeding up our backup creation and restore process. This allows us to build a better product, providing a better experience to our customers.</p><p>If you’re experiencing the same pains we were experiencing with pgBackRest, think about experimenting with something similar! It may cost you a little extra money, but it can be very much worth it.</p><p>We still have work to do on our end: we will continue to iterate on the ideal snapshotting strategy across the fleet to minimize deployment times as much as possible. We are also looking at smarter ways to prewarm the snapshots and more applications for snapshots in general.</p><p>If any of these problems interest you, <a href="https://www.timescale.com/careers/?ref=timescale.com">check out our open engineering roles</a> (we’re hiring!). 
And if you are a PostgreSQL user yourself, <a href="https://console.cloud.timescale.com/signup?ref=timescale.com">sign up for a free Timescale trial</a> and experience the result of EBS snapshots in action.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Read Before You Upgrade: Best Practices for Choosing Your PostgreSQL Version]]></title>
            <description><![CDATA[PostgreSQL upgrades have been known to be a bit of a controversial issue in the community. In this article, we will take the mystery out of the question of when an upgrade is appropriate and how Timescale allows you to do it as swiftly as possible.]]></description>
            <link>https://www.tigerdata.com/blog/read-before-you-upgrade-best-practices-for-choosing-your-postgresql-version</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/read-before-you-upgrade-best-practices-for-choosing-your-postgresql-version</guid>
            <category><![CDATA[PostgreSQL]]></category>
            <category><![CDATA[Cloud]]></category>
            <dc:creator><![CDATA[Kirk Laurence Roybal]]></dc:creator>
            <pubDate>Fri, 11 Nov 2022 18:35:31 GMT</pubDate>
            <media:content medium="image" url="https://timescale.ghost.io/blog/content/images/2022/11/Best-Practices-PostgreSQL-version_Hero--1-.png">
            </media:content>
            <content:encoded><![CDATA[<p>PostgreSQL has a long-standing reputation for having a miserable upgrade process. So, when the community heartily recommends that you should upgrade as soon as possible to the latest and greatest PostgreSQL version, it's not really surprising that your heart sinks, your mouth goes dry, and the outright dread of another laborious job takes over.</p><p>It's almost like finishing a long hike or trying to convince somebody that Betamax was better than VHS. Eventually, you just want it to be over so you can take a nap. There's not even any joy about all the new features and speed. It's just too exhausting to generate emotion anymore.</p><p>This blog post will hopefully serve as a guide for when to pull off the old band-aid. That is, when you should upgrade and what PostgreSQL version you should select as a target. By the end of this post, we will introduce you to our best practices for upgrading your PostgreSQL version in Timescale, so you can get over this process of upgrading as quickly and safely as possible.</p><h2 id="when-to-upgrade-postgresql-common-myths">When to Upgrade PostgreSQL: Common Myths</h2><p>The <a href="https://www.linkedin.com/company/postgresql-global-development-group">PostgreSQL Global Development Group</a> has simplified the upgrade process quite a bit with more explicit version numbering. Since there are only two external stimuli, there are only two choices: upgrade the binaries (minor version change) or upgrade the data on disk (major version change).</p><p>The developers of PostgreSQL never really had a plan in mind for when and how to upgrade.  This seems a bit of a harsh statement when tools like pg_upgrade exist but bear with me.   These tools were meant to make upgrades <strong>possible</strong>, not to imply any particular schedule or recommendations for an upgrade plan. 
The actual upgrade implementation was always left as an exercise for the administrator.</p><p>Let's start with some of the community's conventional wisdom and pretend that those ideas were actually a plan of sorts.</p><h3 id="myth-1-%E2%80%9Cupgrade-as-fast-as-possible-every-time%E2%80%9D">Myth 1: “Upgrade as fast as possible, every time”</h3><p>This "plan" is based on the fear of existing bugs. It is a very Rumsfeldian plan that assumes you don't know what the bugs are, but you're certainly better off if they're fixed. This makes for a very aggressive upgrade pace and hopes for a better tomorrow rather than a stable today.</p><h3 id="myth-2-upgrade-when-you-have-to">Myth 2: "Upgrade when you have to"</h3><p>The complete opposite fear-based pseudo-plan is to stick to the existing version—come hell or high water—unless you run into an otherwise unfixable bug that affects your installation. This is based on the idea that the bugs we know are better than the bugs we don't know. Unfortunately, it ignores the bugs you don't even know exist.</p><h3 id="myth-3-%E2%80%9Cupgrade-for-every-minor-version%E2%80%9D">Myth 3: “Upgrade for every minor version”</h3><p>This is the <a href="https://www.postgresql.org/support/versioning/">general recommendation</a> of the PostgreSQL Global Development Group. The general idea is that all software has bugs, and upgrading is better than not upgrading. That is a bit over-optimistic about new bugs being introduced and kind of ignores that new features that you don’t care about have to be configured—or else.</p><p>This comes a bit closer to planning than guessing for minor versions, as the minor versions of PostgreSQL do not change the file system; they only change the binaries. These upgrades tend to be super heavy on bug fixes and very low on new features, which is where bugs tend to get introduced. 
It doesn't say anything about bugs you have actually encountered, nor does it say anything about any improvements from which you might be able to benefit.</p><h3 id="myth-4-%E2%80%9Cupgrade-when-you-have-time-to-kill%E2%80%9D">Myth 4: “Upgrade when you have time to kill”</h3><p>Probably the most dangerous plan since you will never have more time in the future and will probably never upgrade. Experience says that this is a completely silly plan that never gets implemented.</p><h3 id="myth-5-%E2%80%9Cupgrade-when-there-are-security-fixes%E2%80%9D">Myth 5: “Upgrade when there are security fixes”</h3><p>Okay, this makes some kind of sense. Unfortunately, it ignores the rest of your installation and puts the application development team into tailspin mode for your DevOps enjoyment. It is the kind of policy you end up with when the DevOps team doesn’t really care about the Apps team.</p><h2 id="when-to-upgrade-postgresql">When to Upgrade PostgreSQL </h2><p>Much of this guide is based on personal experience with PostgreSQL upgrades over the years. In some cases, the old was better than the new, and in others, the other way around. In some cases, the fixes worked immediately. In others, well, not so much.</p><p>Very few hard and fast rules can be drawn when coming up with a plan of this nature, but I'll try to bring the experience to bear in a way that helps to make a decision in the future. That being said, this is a "best practice" based on experience, not a "sure-fire thing."</p><p>As a way to reduce the amount of just sheer subjectivity and opinion around choosing the moment to upgrade, I've taken a look through the release notes of PostgreSQL. In this lookie-look, I've attempted to note where bug fixes occurred and mentally move them back to the version where they were discovered. Unfortunately, this task is also somewhat subjective, as I was not a part of the bug fix development or the bug discovery. 
So these are just educated guesses, but I hope rather good ones.</p><p>Then I looked at the mental list that I had made and thought about whether it matched my personal experience with successful versus unsuccessful upgrades. It (again) seemed a subjectively good indicator of when an upgrade succeeded or failed.</p><p>So, on to the findings.</p><p>The first thing I noticed in my research is that the biggest upgrade failures were with a new major version containing updates to the <a href="https://www.postgresql.org/docs/current/wal-intro.html">write-ahead log (WAL)</a>. These were most notable for versions 10 and 12.</p><p>Version 10 could fill a book by itself. It was a major undertaking, with quite a few subsystem rewrites. This release added WAL support to new areas (like hash indexes) and made improvements and changes to the background writer to support structural changes on disk. These major updates introduced the largest number of unintended behaviors, which lasted the longest before being detected and fixed.</p><p>The next most striking failures came from logical replication between 10 and 11. Of course, logical replication was introduced in 10, so there had never been an attempt to use it for production upgrades before. This first use in the field was—how should I put it?—interesting.</p><p>After that, the bugs died down a lot but were never quite gone.</p><h2 id="upgrade-plan">Upgrade Plan</h2><p>Here is my list of questions to ask before an upgrade.</p><p>1. How big is the change?
Was it a major refactor, and did it involve any of the following?</p><ul><li><strong>Query planner:</strong> minor.</li><li><strong>WAL:</strong> major.</li><li><strong>Background writer:</strong> major.</li><li><strong>Memory, caching, locks, or anything else managed by the parent process:</strong> minor.</li><li><strong>Index engine:</strong> major (or just rebuild all your indexes anyway).</li><li><strong>Replication:</strong> major.</li><li><strong>Logging:</strong> minor.</li><li><strong>Vacuuming:</strong> minor.</li></ul><p>2. Were there any huge performance gains?</p><p>3. Does it include major security fixes?</p><p>4. Are there major built-in function improvements or enhancements?</p><p>5. Do all of my extensions exist for the new version?</p><p>These are my rules of thumb for whether a new PostgreSQL version is compelling for an upgrade. Unfortunately, this still requires some subjective evaluation and a bit of professional knowledge. For instance, just because vacuuming is a major feature doesn't mean it has ever been a problem with an upgrade. It <strong>could</strong> be, though, and we should look at its major changes with a bit of a raised eyebrow.</p><p>This brings me to my personal procedure, which has (so far) followed the above guidelines.</p><ol><li><strong>Upgrade major versions when they reach the minor version .2.</strong> That is, 10.2, 11.2, 12.2, etc. This technique avoids the most egregious bugs introduced in major versions but still allows for staying reasonably close to current.</li><li><strong>Upgrade minor versions as they are available.</strong> Minor upgrades have not created major issues thus far in my personal experience. The speed increases, bug fixes, security patches, and internationalization have been worth the minor risk.</li><li><strong>Upgrade immediately if your version is nearing the five-year mark</strong>.
<a href="https://www.postgresql.org/support/versioning/" rel="noreferrer">The PostgreSQL Global Development Group releases a new major version every year</a> and supports it for five years after its release. You don't want to be left with an unsupported version.</li><li><strong>Upgrade when the security team tells you to</strong>. It doesn't happen very often, but when it does, it's a major event.</li><li><strong>Upgrade because you need functionality</strong>. Things to upgrade for: <code>CONCURRENTLY</code>, <code>SYSTEM</code>, and performance. Things not to upgrade for: functions, operators, and libraries.</li></ol><p>That's all there is to it.</p><p>I hope this blog post has helped you decide when a new PostgreSQL version has features compelling enough to upgrade for.</p><p>Of course, this is only a general rule of thumb. If you feel compelled to upgrade for some other reason, don't let my guide tell you what <strong>not</strong> to do. It is only intended to help in the absence of any other stimulus to upgrade. You do you.</p><h2 id="i-am-ready-to-upgrade-now-what">I Am Ready to Upgrade. Now, What?</h2><p>So now you have followed the checklist above and determined that it’s time for you to upgrade your PostgreSQL version. If you’re running a production database, this may be easier said than done, especially if we are talking about upgrading your major version (e.g., from PostgreSQL 13 to PostgreSQL 14): </p><ul><li>Minor versions of PostgreSQL (e.g., from PostgreSQL 13 to PostgreSQL 13.2) are always backward compatible with the major version. That means that if you upgrade your production database, it is unlikely that anything is going to break due to the upgrade.</li><li>However, major versions of PostgreSQL are not backward compatible.
That means that upgrading the PostgreSQL version of a database behind a mission-critical application may introduce user-facing incompatibilities that require code changes in your application to avoid breakage.</li></ul><p>A practical example: if you are upgrading from PostgreSQL 13 to 14, the factorial operators <code>!</code> and <code>!!</code> are no longer supported in 14, nor is running the factorial function on negative numbers. What may seem like a silly example illustrates that assumptions about how certain functions (or even operators) behave between versions may break once you update. </p><p>Fortunately, PostgreSQL is awesome enough to provide clear <a href="https://www.postgresql.org/docs/current/release.html">Release Notes</a> stating the changes between versions. But this doesn’t solve our problem: how do we upgrade production databases safely? </p><h2 id="timescale-to-the-rescue">Timescale to the Rescue</h2><p>This is one of the many areas in which choosing a cloud database will help. If you are self-hosting your mission-critical PostgreSQL database and want to run a major upgrade, you would first have to create a copy of your database manually, dumping your production data and restoring it in another database with the same config as your production database. </p><p>Then, you would have to upgrade this database and run your testing there. This process can take a while depending on your database's size (and if we’re talking about a time-series application, it’s probably pretty big). </p><p>Timescale makes the upgrading process way more approachable. Timescale is a database cloud for time-series applications built on TimescaleDB and PostgreSQL. In other words, this is PostgreSQL under the hood—with a sprinkle of TimescaleDB as the time-series secret sauce.
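</p><p>Because minor and major upgrades are treated so differently, it helps to be precise about which kind you are facing. PostgreSQL encodes both numbers in the <code>server_version_num</code> setting (for version 10 and later, it is major * 10000 + minor, so 13.7 reports 130007). Here is a minimal sketch of telling the two apart; the helper names are mine, not part of any PostgreSQL or Timescale API:</p>

```python
def parse_version_num(version_num: int) -> tuple[int, int]:
    """Split server_version_num into (major, minor).

    For PostgreSQL 10+, server_version_num = major * 10000 + minor,
    e.g., SHOW server_version_num on 13.7 returns 130007.
    """
    return divmod(version_num, 10000)


def upgrade_kind(current: int, target: int) -> str:
    """Classify an upgrade as 'minor' (binaries only) or 'major' (on-disk data)."""
    current_major, _ = parse_version_num(current)
    target_major, _ = parse_version_num(target)
    return "major" if target_major != current_major else "minor"


# 13.7 -> 13.10 only swaps binaries; 13.7 -> 14.5 changes the on-disk format.
print(upgrade_kind(130007, 130010))  # minor
print(upgrade_kind(130007, 140005))  # major
```

<p>You can fetch the current value with <code>SHOW server_version_num;</code> on any connection, which makes this an easy check to automate before scheduling maintenance.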
</p><p>Timescale databases (which are called “services”) run on a particular version of TimescaleDB and PostgreSQL:</p><ul><li>As a user of Timescale, you don’t have to worry about TimescaleDB upgrades: they will be handled automatically by the platform during a maintenance window picked by you. These upgrades are backward compatible and require no downtime.</li><li>Upgrades between minor versions of PostgreSQL are also automatically handled by the platform during your maintenance window. As we mentioned, these upgrades are also backward compatible. However, they require a service restart, which could cause a small amount of downtime (30 seconds to a few minutes) if you do not have a replica. We always alert users about these in advance.</li></ul><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">✨</div><div class="kg-callout-text"><b><strong style="white-space: pre-wrap;">Editor's Note: </strong></b><i><em class="italic" style="white-space: pre-wrap;">For security reasons, we always run the latest available minor version within a major version of PostgreSQL in Timescale. These minor updates may contain security patches and fixes for data corruption problems and frequent bugs—as a managed service provider, we have to store our customers’ data as safely as possible.</em></i></div></div><p>But what about upgrades between major versions of PostgreSQL? Since these are often not backward compatible, we cannot automatically upgrade your service in Timescale from, let’s say, PostgreSQL 13 to 14, as doing so may introduce problems in your code and cause major issues! </p><p>Also, upgrading between major versions of PostgreSQL can (unfortunately but unavoidably) introduce some downtime. If you are running a mission-critical application, you want complete control over <em>when</em> that unavoidable downtime will occur. And you certainly want to test that upgrade first.
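</p><p>Some of that testing can start before you touch any infrastructure: collect the SQL your application actually runs (the statements recorded by <code>pg_stat_statements</code> are a good source) and scan it for constructs the target version removed, such as the PostgreSQL 14 factorial operators mentioned earlier. The sketch below is illustrative only; the pattern list and function name are mine, it covers just this one change, and it is no substitute for running real queries against an upgraded copy of your database:</p>

```python
import re

# Constructs removed in PostgreSQL 14 (a deliberately tiny, illustrative list).
REMOVED_IN_PG14 = [
    (re.compile(r"\d\s*!(?!=)"), "postfix factorial operator !"),
    (re.compile(r"!!\s*\d"), "prefix factorial operator !!"),
]


def pg14_incompatibilities(sql: str) -> list[str]:
    """Return descriptions of removed constructs that appear in the statement."""
    return [reason for pattern, reason in REMOVED_IN_PG14 if pattern.search(sql)]


print(pg14_incompatibilities("SELECT 5!;"))          # flags the postfix operator
print(pg14_incompatibilities("SELECT !!5;"))         # flags the prefix operator
print(pg14_incompatibilities("SELECT a FROM t WHERE a != 5;"))  # clean: []
```

<p>A real pre-upgrade check would parse statements rather than pattern-match them, but even a crude scan like this can surface obvious breakage early.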
</p><p>A database platform like Timescale can certainly help solve this issue. Upgrading your major version of Postgres will always be a decent lift—but a hosted database platform can make this process way smoother, helping you automate what can be automated and also facilitating your testing:</p><ul><li>In Timescale, you can upgrade the PostgreSQL version that’s running on your service by simply clicking a button.</li><li>You can use database forks to test your upgrade safely. Also by clicking a button, Timescale allows you to create a database fork (i.e., an exact copy of your database), which you can then upgrade to estimate the downtime required to upgrade your production instance.</li><li>You can also use forks to test your application changes. Once your fork is upgraded, you can run some of your production queries—you can find some of these using <a href="https://timescale.ghost.io/blog/identify-postgresql-performance-bottlenecks-with-pg_stat_statements/"><code>pg_stat_statements</code></a>—on the fork to ensure they don’t break on the new major version. </li></ul><p>Let’s explore this more in the next section. If you’re not using Timescale, you can create a <a href="https://console.cloud.timescale.com/signup">free account here</a>—you’ll have free access for 30 days, no credit card required. </p><h2 id="safely-upgrading-major-postgresql-versions-in-timescale">Safely Upgrading Major PostgreSQL Versions in Timescale </h2><p>Here’s how you can safely upgrade your Timescale service:</p><ul><li>First, fork your service. Timescale allows you to fork (i.e., copy) your databases in one click—a fast and cost-effective process.
You will only be charged while your fork is running, and you can immediately delete it after your testing is complete.</li><li>Now that you have a perfect copy of your production database ready for testing (with the click of a button), it’s time to click another button to tell the platform to upgrade your major PostgreSQL version automatically. You can do this in Timescale—we’ll tell you exactly how in a minute.</li><li>Once the upgrade is complete in your fork, run your tests.</li><li>To see how long the upgrade took on the fork, you can go to your metrics tab and check how long your service was unavailable (the grey zone in your CPU and RAM graphs). This will give you an estimate of how long your primary service will be down when you choose to upgrade it.</li><li>When you’re sure that nothing breaks, you can upgrade your primary service. Plan accordingly! Upgrading will cause downtime, so make sure you have accounted for that in your upgrade plan. </li></ul><p>Let’s see how this looks in the console. </p><p>First, check which TimescaleDB and PostgreSQL versions your database is running on your service Overview page.</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/11/Best-practices-upgrade-PostgreSQL_img-1.png" class="kg-image" alt="" loading="lazy" width="1095" height="635" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2022/11/Best-practices-upgrade-PostgreSQL_img-1.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2022/11/Best-practices-upgrade-PostgreSQL_img-1.png 1000w, https://timescale.ghost.io/blog/content/images/2022/11/Best-practices-upgrade-PostgreSQL_img-1.png 1095w" sizes="(min-width: 720px) 720px"></figure><p>Forking your service is as easy as going to the Operations tab and clicking on the Fork service option.
This will automatically create an exact snapshot of your database.</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/11/Best-practices-upgrade-PostgreSQL_img2.png" class="kg-image" alt="" loading="lazy" width="1170" height="553" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2022/11/Best-practices-upgrade-PostgreSQL_img2.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2022/11/Best-practices-upgrade-PostgreSQL_img2.png 1000w, https://timescale.ghost.io/blog/content/images/2022/11/Best-practices-upgrade-PostgreSQL_img2.png 1170w" sizes="(min-width: 720px) 720px"></figure><p>To upgrade your major version of PostgreSQL, go to your Maintenance tab. Under Service upgrades, you will see a Service upgrades button. If you click that button, your service will be updated to the next major version of Postgres (in the example below, the service would be upgraded from PostgreSQL 13.7 to PostgreSQL 14).</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/11/Best-practices-upgrade-PostgreSQL_img3.png" class="kg-image" alt="" loading="lazy" width="964" height="623" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2022/11/Best-practices-upgrade-PostgreSQL_img3.png 600w, https://timescale.ghost.io/blog/content/images/2022/11/Best-practices-upgrade-PostgreSQL_img3.png 964w" sizes="(min-width: 720px) 720px"></figure><h2 id="your-upgrade-is-complete">Your Upgrade Is Complete</h2><p>That’s it! You can now use the latest and greatest that PostgreSQL has to offer. That said, choosing to upgrade is no small feat. Before going through the upgrade process, there is a lot to consider, and it is important to have a plan to account for the downtime you will experience. </p><p>While the upgrade process can be a bit painful, you can at least rely on Timescale to handle the technical orchestration of the upgrade. 
In the future, we hope to offer even better tooling to make the upgrade process entirely pain-free (but we have to walk before we can run, right?).</p><p>If you’d like to see what Timescale has to offer, <a href="https://www.timescale.com/timescale-signup">start a free trial</a> if you haven’t already. There’s no credit card required!</p>]]></content:encoded>
        </item>
    </channel>
</rss>