
A Guide to pg_restore (and pg_restore Example)

Written by Abhinav D. | Published Apr 5, 2024

If you have worked with production-scale databases in PostgreSQL, you know that effective backups and a well-defined recovery plan are crucial for managing them. Backups protect your data from loss or corruption and enable you to recover your database in case of failures, outages, or human errors. In this guide, we'll explore pg_restore, a PostgreSQL utility specifically designed to work with logical backups, playing a vital role in database recovery.

pg_restore allows you to restore a database from a logical backup created by pg_dump. Logical backups contain SQL commands that recreate the database objects and data, offering flexibility and portability across different PostgreSQL versions and other database systems.

However, it's important to note that pg_restore and logical backups are not suitable for every situation. They have limitations, especially when dealing with large-scale databases or complex recovery scenarios. In this article, we'll dive deeper into:

  • Understanding what pg_restore is and how it works

  • Identifying when pg_restore is the right tool for your recovery needs

  • Exploring practical examples of using pg_restore effectively

By the end of this guide, you'll have a solid understanding of pg_restore and how it fits into your overall backup and recovery strategy for PostgreSQL databases.

Creating Logical Backups With pg_dump

Understanding logical backups

Logical backups differ from physical backups because they don't contain a direct copy of the database files. Instead, a logical backup is a file that includes a series of SQL commands that, when executed, rebuild the database to its state at the time of the backup.

This approach offers several advantages:

  1. Logical backups are portable across different PostgreSQL versions and can even be migrated to other database systems that support SQL.

  2. They allow for selective restoration of specific database objects, such as tables or schemas.

  3. Logical backups are human-readable and can be inspected or modified if needed.

Creating a logical backup with pg_dump

When creating logical backups of your PostgreSQL database, pg_dump is the go-to tool. To create a logical backup using pg_dump, you can use the following basic command:

pg_dump [database_name] -f backup_file.sql

  • -f or --file: Specifies the file path of the output file.

This command connects to the specified database, retrieves the SQL commands necessary to recreate the database objects and data, and saves them to a file named backup_file.sql.

For example, to create a logical backup of our e-commerce database, you would run:

pg_dump ecommerce -f /tmp/ecommerce_backup.sql
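
Because a plain-format dump is just SQL text, you can inspect it directly. A minimal sketch, reusing the path from the example above (the line count is arbitrary):

# Peek at the first lines of the plain-format dump
head -n 40 /tmp/ecommerce_backup.sql
# Expect ordinary SQL: SET commands, CREATE TABLE statements, COPY ... FROM stdin; blocks carrying the data, and ALTER TABLE ... ADD CONSTRAINT commands near the end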

pg_dump options

pg_dump provides many options to customize the backup process. Some commonly used options include:

  • -U or --username: Specifies the username to connect to the database.

  • -W or --password: Prompts for the password to authenticate the connection.

  • -j or --jobs: Enables parallel backup by specifying the number of concurrent jobs.

  • -F or --format: Specifies the output format of the backup file. The available formats are plain (default), custom, directory, and tar.

For example, to create a backup of the ecommerce database in tar format with username and password authentication, you would run:

pg_dump -U postgres -W -F tar ecommerce -f /tmp/ecommerce_backup.tar


In this example:

  • -F tar specifies the tar format, which creates an uncompressed tar archive.

  • The output file extension is .tar to reflect the tar format.

It's important to note that when using pg_dump for migration purposes, you should use the custom or directory format, as they provide additional features like compression and parallel restoration. You can find the complete list of available options in the PostgreSQL documentation.
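
As a rough sketch, dumps in those two formats could look like this (the output paths and job count are illustrative, not part of the example above):

# Custom format: a single compressed archive that pg_restore can read selectively
pg_dump -U postgres -W -F custom ecommerce -f /tmp/ecommerce_backup.dump

# Directory format: one file per table; the only format that also supports parallel dumps (-j)
pg_dump -U postgres -W -F directory -j 4 ecommerce -f /tmp/ecommerce_backup_dir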

pg_dumpall

In addition to pg_dump, PostgreSQL also provides pg_dumpall, which creates a logical backup of an entire PostgreSQL cluster, including all databases, roles, and other cluster-wide objects.

The key difference between pg_dump and pg_dumpall is that pg_dumpall includes cluster-level objects like user roles and permissions while pg_dump focuses on a single database.

To create a logical backup of an entire cluster using pg_dumpall, you can use the following command:

pg_dumpall -U postgres -W -f /tmp/cluster_backup.sql


This command will generate an SQL script that includes commands to recreate all databases, roles, and other cluster-wide objects.

Note the multiple password prompts: pg_dumpall connects to each database in turn, so with password authentication you are prompted once per connection. This is the default behavior when password authentication is used. To avoid repeated prompts across multiple databases, you can use a password file (the passfile connection option) instead.
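
A minimal sketch of setting up such a password file, assuming a local server and the postgres role (the password is a placeholder):

# libpq reads ~/.pgpass; each line is hostname:port:database:username:password
echo 'localhost:5432:*:postgres:mysecretpassword' >> ~/.pgpass
# The file is ignored unless it is readable only by its owner
chmod 0600 ~/.pgpass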

Using pg_dump and pg_dumpall, you can create comprehensive logical backups of your PostgreSQL databases and clusters, ensuring you have the necessary data and configuration to restore your database environment.

Rebuilding Databases With pg_restore

Once you have created a logical backup using pg_dump, you can use pg_restore to rebuild the database from that backup file. pg_restore is a powerful tool that allows you to restore an entire database or specific parts of it, giving you flexibility in the restoration process.

Examples of pg_restore command

Let's connect the examples with the previous section, where we created a logical backup of the e-commerce database using pg_dump. We'll use that backup file to demonstrate how to use pg_restore.

Simple example

To restore the entire e-commerce database from the tar-format backup file created in the previous section, you can use the following command:

pg_restore -d ecommerce /tmp/ecommerce_backup.tar

Note that pg_restore only reads the custom, directory, and tar archive formats; a plain-format .sql dump is restored by feeding it to psql instead. This command assumes that the target database ecommerce already exists and restores the data into it. If the database doesn't exist, you'll need to create it first, as shown below.
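
A short sketch of that, assuming the postgres superuser and the tar backup from the previous section:

# Create the empty target database, then restore into it
createdb -U postgres ecommerce
pg_restore -U postgres -W -d ecommerce /tmp/ecommerce_backup.tar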

Examples with options

pg_restore provides several options to customize the restoration process. Here are a few important ones:

  • -c or --clean: Drops database objects before recreating them. This ensures a clean restoration.

  • -C or --create: Creates the target database before restoring the data.

  • -j or --jobs: Specifies the number of concurrent jobs for parallel restoration.

Here's an example that will restore an e-commerce database from a backup file created in the above section:

pg_restore -U postgres -W -C -c --if-exists -d postgres /tmp/ecommerce_backup.tar

In this example,

  • -C: This option tells pg_restore to create the database before restoring the data. When combined with -c, the target database is dropped and recreated before pg_restore connects to it.

  • -c: This option specifies the "clean" mode for the restore. It tells pg_restore to drop database objects (tables, functions, etc.) before recreating them. This ensures that the restored database is in a clean state and matches the structure of the backup file.

  • --if-exists: This option is only valid together with -c. It makes the generated DROP commands use IF EXISTS, so pg_restore doesn't raise errors for objects that don't yet exist in the target database.

  • -d postgres: This option specifies the database to connect to initially. Here it is the postgres database, the default maintenance database present in most PostgreSQL installations. pg_restore needs to connect to an existing database in order to issue the CREATE DATABASE command for the database being restored.

  • /tmp/ecommerce_backup.tar: This is the path to the backup file that contains the database dump. It should be a valid backup file created by pg_dump in the "tar" format.

You can find the complete list of available options in the PostgreSQL documentation.

How pg_restore works

When you run pg_restore, it follows these steps to rebuild the database:

  1. Reading the backup file: pg_restore reads the specified backup file, which contains SQL commands generated by pg_dump.

  2. Creating the database (optional): If the -C option is used, pg_restore creates the target database before proceeding with the restoration.

  3. Dropping existing objects (optional): If the -c option is used, pg_restore drops any existing database objects before recreating them.

  4. Executing SQL commands: pg_restore executes the SQL commands from the backup file to recreate the database objects, such as tables, indexes, constraints, and data.

  5. Parallel processing (optional): If the -j option is used, pg_restore utilizes multiple jobs to execute the SQL commands in parallel, speeding up the restoration process. Parallel processing is available using the custom or directory archive formats in pg_restore.
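
As a rough sketch, a parallel restore of a directory-format dump might look like this (the directory path and job count are assumptions):

# -j only applies to custom- or directory-format archives
pg_restore -U postgres -W -j 4 -d ecommerce /tmp/ecommerce_backup_dir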

Throughout the restoration process, pg_restore provides flexibility in rebuilding specific parts of the database. You can use options like -t or --table to restore only specific tables, -n or --schema to restore specific schemas, and more. This allows you to restore the desired parts of the database selectively.

Furthermore, if you have created a logical backup of an entire PostgreSQL cluster using pg_dumpall, keep in mind that its output is a plain SQL script: you rebuild the cluster by running that script through psql rather than pg_restore.

By understanding how pg_restore works and keeping its limitations in mind (covered in the next section), you can effectively rebuild your databases from logical backups, ensuring data recovery and migration capabilities.

Benefits, Downsides, and Use Cases for pg_restore

Let's explore and discuss the specific use cases where pg_restore shines.

System migration

One of the primary use cases for pg_restore is system migration. When you need to move a PostgreSQL database to a different server, upgrade to a newer version, or switch to a different database system, logical backups created by pg_dump and restored with pg_restore can be a viable solution.

The SQL-based structure of logical backups allows for easy transfer across different PostgreSQL versions and even to other SQL-compliant databases. pg_restore can handle the recreation of database objects and data insertion on the target system, making the migration process more straightforward.

However, it's important to note that migrating large databases using logical backups can be time-consuming and resource-intensive. In those cases, migration techniques like logical replication or parallel restoration might be more suitable.

Partial restoration

Another valuable use case for pg_restore is partial restoration. Sometimes, you may only need to restore specific parts of a database rather than the entire one. pg_restore provides filters and options to selectively restore individual tables, schemas, or other database objects.

For example, you can use the -t or --table option to restore only specific tables or the -n or --schema option to restore objects within a particular schema. This granular control over the restoration process can be beneficial when recovering specific data or troubleshooting issues related to particular database objects.
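
For instance, a minimal sketch of restoring a single table, reusing the table and backup names from the earlier example:

# Restore only the customers table from the tar archive
pg_restore -U postgres -W -d ecommerce -t customers /tmp/ecommerce_backup.tar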

Logical backup and restoration tools like pg_restore come with both benefits and drawbacks.

Benefits of logical backup

  1. Flexible formatting: Logical backups created by pg_dump can be formatted in various ways, such as plain SQL, custom archive, or directory format. This flexibility allows for easier manipulation and customization of the backup files.

  2. Fully restores the database to its state at backup time: Logical backups capture the complete state of the database at the time the backup was taken. When restored using pg_restore, the database is rebuilt to its exact state at the time of the backup, including all data and schema objects. In contrast, a physical base backup on its own can only return the cluster to the point when that backup finished; recovering further requires replaying archived WAL.

  3. Suitable for migration and updates: Logical backups are particularly useful when migrating databases to newer versions of PostgreSQL or moving data between different database systems. The SQL-based nature of logical backups makes them compatible across different PostgreSQL versions and even with other SQL-compliant databases.

  4. Supports partial restores: pg_restore allows for selective restoration of specific database objects, such as tables, schemas, or functions. This feature is handy when you only need to restore a subset of the database rather than the entire database.

Downsides of logical backup

  1. Slower restoration process: Restoring a database from a logical backup using pg_restore can be slower than restoring from a physical backup. The restoration process involves executing SQL commands to recreate the database objects and insert the data, which can be time-consuming for large databases.

  2. Requires compute resources: pg_restore needs to execute the SQL commands from the backup file, which requires CPU and memory resources on the target database server. This can impact the performance of the server during the restoration process.

  3. Challenges with large databases: Logical backups and restoration with pg_restore can struggle with large databases. Generating the backup file and executing the SQL commands during restoration can take significant time and resources, making it less practical for databases with terabytes of data.

Selective restore with pg_restore

-l for table of contents

Let's look at an example of using pg_restore with filters to rebuild specific tables from a database backup. Suppose we have a logical backup file named ecommerce_backup.tar that contains a backup of our e-commerce database.

  • To view the table of contents of the backup file, you can use the -l or --list option:

pg_restore -l ecommerce_backup.tar

This command will display a list of all the objects in the backup file.


-L for list file

To restore only specific tables, you can create a list file containing the entries for the tables you want to restore. For example, say we want to restore only the customers and order_items tables.

  • First, let’s output the contents of the backup file to a restore_list.txt

pg_restore -l ecommerce_backup.tar > restore_list.txt

  • Then, edit the file so it keeps only the objects you want to restore. We will keep only two tables for this example.

215; 1259 17198 TABLE public customers postgres
217; 1259 17202 TABLE public order_items postgres


  • Then, use the -L or --use-list option to specify the list file:

pg_restore -U postgres -W -d ecommerce -L restore_list.txt ecommerce_backup.tar

  • This command will restore only the customers and order_items tables from the backup file into the ecommerce database.

-n for restoring a schema

You can use the -n or --schema option to restore objects within a specific schema. For example, to restore only the objects in the public schema:

pg_restore -U postgres -W -d ecommerce -n public ecommerce_backup.tar

-N for excluding a schema

Use the -N or --exclude-schema option to exclude objects within a specific schema. For example, to exclude the temp schema from the restoration:

pg_restore -U postgres -W -d ecommerce -N temp ecommerce_backup.tar

This command will restore all objects from the backup file except those belonging to the temp schema.

These examples demonstrate how pg_restore provides flexibility in selectively restoring specific parts of a database based on your requirements.

Database migration with pg_restore

Let's explore an example of migrating a PostgreSQL database to Timescale using pg_restore. Timescale is a time-series database that extends PostgreSQL with additional functionality for handling time-series data.

To migrate a PostgreSQL database to Timescale using pg_restore, follow the steps outlined in the Timescale documentation.

Here are a few key considerations to keep in mind:

  • Role management: Before dumping the data, it's important to handle the roles and permissions separately. You can use pg_dumpall with the --roles-only option to dump the roles from the source PostgreSQL database.

  • Schema and data dump: Use pg_dump to create a logical backup of the source database schema and data. However, you must specify certain flags to ensure compatibility with Timescale.

    • --no-tablespaces: Timescale has limitations on tablespace support, so this flag is necessary.

    • --no-owner and --no-privileges: These flags are required because Timescale's default user, tsdbadmin, is not a superuser and has restricted privileges compared to PostgreSQL's default superuser.

  • Restoring with concurrency: When using the directory format for pg_dump and pg_restore, you can speed up the restoration process by leveraging concurrency. However, concurrently loading the _timescaledb_catalog schema can cause errors due to insufficient privileges. To work around this, serially load the _timescaledb_catalog schema and then load the rest of the database concurrently.

  • Post-migration tasks: After the data is loaded, it's recommended to update the table statistics by running ANALYZE on all the data (a sketch follows this list). This helps optimize query performance in Timescale.

  • Verification and application setup: Before bringing your application online with the migrated database, thoroughly verify the data integrity and ensure the migration was successful.
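
As referenced in the post-migration step above, here is a hedged sketch of refreshing statistics once the load completes; the host, role, and database names are placeholders for your actual Timescale connection details:

# Update planner statistics on the freshly loaded data
psql -h <service-host> -U tsdbadmin -d tsdb -c "ANALYZE;"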

It's important to note that migrating large databases using pg_dump and pg_restore can be time-consuming and may require downtime for your application. For databases larger than 100 GB, Timescale recommends using their live migration strategy for a low-downtime migration solution instead.

Also, remember that migrating to Timescale may require additional steps to enable Timescale-specific features like hypertables, data compression, and retention policies after the migration is complete.
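
For example, one such follow-up step might look roughly like this, assuming a migrated table named conditions with a time column (both names are illustrative, not from this article):

# Convert an already-populated table into a hypertable after the migration
psql -h <service-host> -U tsdbadmin -d tsdb -c "SELECT create_hypertable('conditions', 'time', migrate_data => true);"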

Conclusion

This guide explored the versatility of pg_restore, a logical backup and recovery tool for PostgreSQL. We've learned how pg_restore works hand in hand with pg_dump to create and restore logical backups, providing flexibility and granular control over the restoration process.

Key advantages of pg_restore include its ability to facilitate system migrations and partial database restorations. Whether you need to migrate a PostgreSQL database to a newer version, move data between different systems, or selectively restore specific objects, pg_restore offers the tools and options to accomplish these tasks easily.

However, it's important to note that logical replication is often the recommended approach for system migrations for most real-world workloads, especially those involving larger databases or requiring minimal downtime.

If you're working with time-series data and considering migrating to a specialized time-series database like Timescale, pg_restore can be a valuable part of the move, enabling you to use Timescale's powerful features, such as hypertables, data compression, and retention policies, to optimize your time-series workloads and achieve maximum efficiency.

To experience the benefits of Timescale firsthand, try it for free today. Create your account and explore the possibilities of a purpose-built time-series database.

Further reading

Learn more about migration strategies:

  • Migrating a Terabyte-Scale PostgreSQL Database to Timescale With (Almost) Zero Downtime

  • How to Migrate Your Data to Timescale (3 Ways)

FAQs

What is pg_restore and when should I use it? pg_restore is a PostgreSQL utility specifically designed to work with logical backups, playing a vital role in database recovery. It allows you to restore a database from a logical backup created by pg_dump, containing SQL commands that recreate database objects and data. This tool is particularly useful for system migrations, upgrading to newer PostgreSQL versions, or when you need to selectively restore specific parts of a database.

How do I create a backup file that works with pg_restore? You create compatible backup files using pg_dump with format options like custom, directory, or tar. For example, run pg_dump -U postgres -W -F tar ecommerce -f /tmp/ecommerce_backup.tar to create a tar format backup of your e-commerce database. For migration purposes, it's recommended to use the custom or directory format as they support features like compression and parallel restoration.

What is the basic syntax for using pg_restore? The basic syntax is pg_restore [options] backup_file, where options control how restoration happens and backup_file is your backup file path. For example, pg_restore -U postgres -W -C -c --if-exists -d postgres /tmp/ecommerce_backup.tar will restore the e-commerce database, creating it if it doesn't exist (-C), cleaning existing objects first (-c), and connecting initially to the postgres maintenance database.

How can I selectively restore only specific tables or schemas? You can use pg_restore with filters to selectively restore database objects. First, view the backup contents with pg_restore -l ecommerce_backup.tar > restore_list.txt, then edit the list file to keep only desired objects, and finally restore using pg_restore -d ecommerce -L restore_list.txt ecommerce_backup.tar. Alternatively, use -n public to restore only objects in the public schema or -N temp to exclude the temp schema from restoration.

What are the limitations of pg_restore compared to physical backups? While pg_restore offers flexibility, it has notable limitations for large databases. The restoration process can be significantly slower than physical backups since it executes SQL commands to recreate objects and insert data. It requires substantial compute resources during restoration, potentially impacting server performance. For databases with terabytes of data, logical restoration becomes less practical, and alternative approaches like replication might be more suitable.
