Skip to content

Insert data

Insert single and multiple rows and bulk load data into TimescaleDB with SQL

Inserts into a hypertable are normal PostgreSQL writes. Use the usual commands, and lean on batching when you can:

  • INSERT: single rows or small batches
  • COPY: bulk data loading

To improve performance, insert time series data directly to the columnstore using direct compress.

To insert a single row into a hypertable, use the syntax INSERT INTO ... VALUES:

INSERT INTO conditions(time, location, temperature, humidity)
VALUES (NOW(), 'office', 70.0, 50.0);

A more efficient method to insert row-by-row is to insert multiple rows into a hypertable using a single INSERT call. This works even for thousands of rows at a time. TimescaleDB batches the rows by chunk, then writes to each chunk in a single transaction.

You use the same syntax, separating rows with a comma:

INSERT INTO conditions
VALUES
(NOW(), 'office', 70.0, 50.0),
(NOW(), 'basement', 66.5, 60.0),
(NOW(), 'garage', 77.0, 65.2);

If you INSERT unsorted data, call convert_to_columnstore('<chunk_name>', recompress => true) on the chunk to reorder and optimize your data.

You can return some or all of the inserted data by adding a RETURNING clause to the INSERT command. For example, to return all the inserted data, run:

INSERT INTO conditions
VALUES (NOW(), 'office', 70.1, 50.1)
RETURNING *;

This returns:

time | location | temperature | humidity
------------------------------+----------+-------------+----------
2017-07-28 11:42:42.846621+00 | office | 70.1 | 50.1
(1 row)

If you INSERT unsorted data, call convert_to_columnstore('<chunk_name>', recompress => true) on the chunk to reorder and optimize your data.

The COPY command is the most efficient way to load large amounts of data into a hypertable. For bulk data loading, COPY can be 2-3x faster or more than INSERT, especially when combined with direct compress.

COPY supports loading from:

  • CSV files:

    COPY conditions(time, location, temperature, humidity)
    FROM '/path/to/data.csv'
    WITH (FORMAT CSV, HEADER);
  • Standard input

    To load data from your application or script using standard input:

    COPY conditions(time, location, temperature, humidity)
    FROM STDIN
    WITH (FORMAT CSV);

    To signal the end of input, add \. on a new line.

  • Program output

    To load data generated by a program or script:

    COPY conditions(time, location, temperature, humidity)
    FROM PROGRAM 'generate_data.sh'
    WITH (FORMAT CSV);

If you COPY unsorted data, call convert_to_columnstore('<chunk_name>', recompress => true) on the chunk to reorder and optimize your data.

When you set timescaledb.enable_direct_compress_insert or timescaledb.enable_direct_compress_copy, TimescaleDB compresses data in memory during ingestion and writes the compressed batches directly to the columnstore. This significantly reduces the I/O footprint, and INSERT and COPY produce columnstore chunks on the spot — the columnstore policy is less important.

Note

This feature is a tech preview and not production-ready. Using direct compress can regress query performance or storage ratio if the ingested rows are not sorted by the table’s orderby columns or if the data has very high cardinality.

To enable in-memory compression during ingestion, set the appropriate GUC:

SET timescaledb.enable_direct_compress_insert = on;
-- or
SET timescaledb.enable_direct_compress_copy = on;

Important facts

  • High cardinality data does not produce good batches and degrades query performance.
  • The columnstore is optimized for 1000 rows per batch per segmentby value.
  • WAL records are written for the compressed batches rather than the individual tuples.
  • Best results are achieved for batch ingestion with 1000 rows or more.
  • Tables with unique constraints cannot use direct compress.
  • If your hypertable has no segmentby column configured, TimescaleDB picks one for you based on the first batch of rows you ingest. Turn this off with SET timescaledb.enable_direct_compress_auto_segmentby = off.