Latest

Tiger Cloud: Performance, Scale, Enterprise, Free

Self-hosted products

MST

Upserting is an operation to add data to your database where:

  • A matching row does not exist: inserts a new row
  • A matching row exists: either updates the existing row, or does nothing

Upserts work when you have a unique index or constraint. A matching row is one that has identical values for the columns covered by the index or constraint. In Postgres, a primary key is a unique index with a NOT NULL constraint. If you have a primary key, you automatically have a unique index.

Unique constraints must include all partitioning columns. That means unique constraints on a hypertable must include the time column. If you added other partitioning columns to your hypertable, the constraint must include those as well. For more information, see Enforce constraints with unique indexes.

The examples in this page use a conditions table with a unique constraint on the columns (time, location). To create a unique constraint, either:

  • Use UNIQUE (<COLUMNS>) when you define your table:

    CREATE TABLE conditions (
    time TIMESTAMPTZ NOT NULL,
    location TEXT NOT NULL,
    temperature DOUBLE PRECISION NULL,
    humidity DOUBLE PRECISION NULL,
    UNIQUE (time, location)
    );
  • Use ALTER TABLE after the table is created:

    ALTER TABLE conditions
    ADD CONSTRAINT conditions_time_location
    UNIQUE (time, location);

To insert new data that doesn't violate the constraint, and to update the existing row if it does, use the syntax INSERT INTO ... VALUES ... ON CONFLICT ... DO UPDATE. For example, to update the temperature and humidity values if a row with the specified time and location already exists, run:

INSERT INTO conditions
VALUES ('2017-07-28 11:42:42.846621+00', 'office', 70.2, 50.1)
ON CONFLICT (time, location) DO UPDATE
SET temperature = excluded.temperature,
humidity = excluded.humidity;

You can also do nothing if the constraint is violated. The new data is not inserted, and the old row is not updated, the database engine skips the row and moves on. This is useful to prevent the entire transaction from failing when writing many rows as one batch.

To insert or do nothing, use the syntax INSERT INTO ... VALUES ... ON CONFLICT DO NOTHING:

INSERT INTO conditions
VALUES ('2017-07-28 11:42:42.846621+00', 'office', 70.1, 50.0)
ON CONFLICT DO NOTHING;

When you need to upsert large amounts of data, COPY is significantly faster than INSERT. However, COPY doesn't support ON CONFLICT clauses directly. Best practice is to use a staging table. This two-step approach combines the speed of COPY for bulk loading with the flexibility of INSERT...ON CONFLICT for upsert logic. For large datasets, this is much faster than using INSERT...ON CONFLICT directly.

To load data efficiently with COPY, then upsert:

  1. Create a staging table with the same structure as the destination table

    CREATE TEMP TABLE conditions_staging (LIKE conditions);
  2. Use COPY to bulk load data into the staging table

    COPY conditions_staging(time, location, temperature, humidity)
    FROM '/path/to/data.csv'
    WITH (FORMAT CSV, HEADER);
  3. Upsert from the staging table to the destination table

    INSERT INTO conditions
    SELECT * FROM conditions_staging
    ON CONFLICT (time, location) DO UPDATE
    SET temperature = EXCLUDED.temperature,
    humidity = EXCLUDED.humidity;

    To skip duplicate rows, set ON CONFLICT (time, location) DO NOTHING.

  4. Clean up the staging table

    DROP TABLE conditions_staging;

Keywords

Found an issue on this page?Report an issue or Edit this page in GitHub.