Insert data
Insert single and multiple rows and bulk load data into TimescaleDB with SQL
Inserts into a hypertable are normal PostgreSQL writes. Use the usual commands, and lean on batching when you can:
INSERT: single rows or small batchesCOPY: bulk data loading
To improve performance, insert time series data directly to the columnstore using direct compress.
Insert a single row
Section titled “Insert a single row”To insert a single row into a hypertable, use the syntax INSERT INTO ... VALUES:
INSERT INTO conditions(time, location, temperature, humidity) VALUES (NOW(), 'office', 70.0, 50.0);Insert multiple rows
Section titled “Insert multiple rows”A more efficient method to insert row-by-row is to insert multiple rows into a hypertable using a single
INSERT call. This works even for thousands of rows at a time. TimescaleDB batches the rows by chunk, then writes to
each chunk in a single transaction.
You use the same syntax, separating rows with a comma:
INSERT INTO conditions VALUES (NOW(), 'office', 70.0, 50.0), (NOW(), 'basement', 66.5, 60.0), (NOW(), 'garage', 77.0, 65.2);If you INSERT unsorted data, call convert_to_columnstore('<chunk_name>', recompress => true)
on the chunk to reorder and optimize your data.
Insert and return data
Section titled “Insert and return data”You can return some or all of the inserted data by adding a RETURNING clause to the INSERT command. For example,
to return all the inserted data, run:
INSERT INTO conditions VALUES (NOW(), 'office', 70.1, 50.1) RETURNING *;This returns:
time | location | temperature | humidity------------------------------+----------+-------------+----------2017-07-28 11:42:42.846621+00 | office | 70.1 | 50.1(1 row)If you INSERT unsorted data, call convert_to_columnstore('<chunk_name>', recompress => true)
on the chunk to reorder and optimize your data.
Bulk insert with COPY
Section titled “Bulk insert with COPY”The COPY command is the most efficient way to load large amounts of data into a hypertable. For
bulk data loading, COPY can be 2-3x faster or more than INSERT, especially when combined with
direct compress.
COPY supports loading from:
-
CSV files:
COPY conditions(time, location, temperature, humidity)FROM '/path/to/data.csv'WITH (FORMAT CSV, HEADER); -
Standard input
To load data from your application or script using standard input:
COPY conditions(time, location, temperature, humidity)FROM STDINWITH (FORMAT CSV);To signal the end of input, add
\.on a new line. -
Program output
To load data generated by a program or script:
COPY conditions(time, location, temperature, humidity)FROM PROGRAM 'generate_data.sh'WITH (FORMAT CSV);
If you COPY unsorted data, call convert_to_columnstore('<chunk_name>', recompress => true)
on the chunk to reorder and optimize your data.
Improve performance with direct compress
Section titled “Improve performance with direct compress”When you set timescaledb.enable_direct_compress_insert or timescaledb.enable_direct_compress_copy, TimescaleDB compresses data in memory during ingestion and writes the compressed batches directly to the columnstore. This significantly reduces the I/O footprint, and INSERT and COPY produce columnstore chunks on the spot — the columnstore policy is less important.
This feature is a tech preview and not production-ready. Using direct compress can regress query performance or storage ratio if the ingested rows are not sorted by the table’s orderby columns or if the data has very high cardinality.
To enable in-memory compression during ingestion, set the appropriate GUC:
SET timescaledb.enable_direct_compress_insert = on;-- orSET timescaledb.enable_direct_compress_copy = on;Important facts
- High cardinality data does not produce good batches and degrades query performance.
- The columnstore is optimized for 1000 rows per batch per
segmentbyvalue. - WAL records are written for the compressed batches rather than the individual tuples.
- Best results are achieved for batch ingestion with 1000 rows or more.
- Tables with unique constraints cannot use direct compress.
- If your hypertable has no
segmentbycolumn configured, TimescaleDB picks one for you based on the first batch of rows you ingest. Turn this off withSET timescaledb.enable_direct_compress_auto_segmentby = off.