Published at Jul 11, 2024
Start supercharging your PostgreSQL today.

Written by Team Timescale
In PostgreSQL or TimescaleDB, you may often need to eliminate duplicate rows from your query results to get a unique set of records. The DISTINCT clause in PostgreSQL is designed to do just that, ensuring your results contain only unique rows.
The basic syntax of the DISTINCT clause is as follows:
SELECT DISTINCT column1, column2, ...
FROM table;
You can also use DISTINCT ON to return the first row of each group of duplicates based on a specified column or set of columns:
SELECT DISTINCT ON (column1) column1, column2, ...
FROM table
ORDER BY column1, column2, ...;
Let's consider a table weather_data with the following structure and data:
CREATE TABLE weather_data (
id SERIAL PRIMARY KEY,
device_id VARCHAR(50),
temperature NUMERIC,
humidity NUMERIC,
wind_speed NUMERIC,
recorded_at TIMESTAMP
);
INSERT INTO weather_data (device_id, temperature, humidity, wind_speed, recorded_at) VALUES
('device_1', 22.5, 55.0, 12.5, '2023-01-15 08:00:00'),
('device_2', 21.0, 60.0, 10.0, '2023-01-15 08:05:00'),
('device_1', 23.0, 57.0, 11.0, '2023-01-15 08:10:00'),
('device_2', 19.5, 62.0, 8.0, '2023-01-15 08:15:00'),
('device_1', 20.0, 59.0, 9.5, '2023-01-15 08:20:00'),
('device_3', 24.0, 54.0, 13.0, '2023-01-15 08:25:00'),
('device_2', 18.5, 63.0, 7.5, '2023-01-15 08:30:00'),
('device_3', 22.0, 55.5, 12.0, '2023-01-15 08:35:00'),
('device_1', 21.5, 58.0, 10.5, '2023-01-15 08:40:00'),
('device_3', 23.5, 53.0, 13.5, '2023-01-15 08:45:00');
Learn how the Timescale Team made DISTINCT queries up to 8,000x faster on PostgreSQL.
Let's dive into some examples to see the DISTINCT clause in action.
Example 1: Selecting distinct values from a single column
Suppose we want to retrieve all unique device_id values from the weather_data table. We can use the DISTINCT clause to achieve this:
SELECT DISTINCT device_id
FROM weather_data;
Result:
device_id
-----------
device_1
device_2
device_3
This query returns a unique set of device_id values from the weather_data table.
Example 2: Selecting distinct combinations of multiple columns
Now, let's say we want to retrieve unique combinations of device_id and humidity. We can use the DISTINCT clause on multiple columns:
SELECT DISTINCT device_id, humidity
FROM weather_data;
Result:
device_id | humidity
-----------+----------
device_1 | 55.0
device_2 | 60.0
device_1 | 57.0
device_2 | 62.0
device_1 | 59.0
device_3 | 54.0
device_2 | 63.0
device_3 | 55.5
device_1 | 58.0
device_3 | 53.0
This query returns unique combinations of device_id and humidity from the weather_data table.
Example 3: Using DISTINCT ON to get the first row of each group
Suppose we want to retrieve the first recorded temperature for each device. We can use DISTINCT ON to achieve this:
SELECT DISTINCT ON (device_id) device_id, temperature, recorded_at
FROM weather_data
ORDER BY device_id, recorded_at;
Result:
device_id | temperature | recorded_at
-----------+-------------+---------------------
device_1 | 22.5 | 2023-01-15 08:00:00
device_2 | 21.0 | 2023-01-15 08:05:00
device_3 | 24.0 | 2023-01-15 08:25:00
This query returns the first recorded temperature for each device_id based on the recorded_at timestamp.
The DISTINCT clause is a powerful tool in PostgreSQL for removing duplicate rows from your query results. To learn more about this PostgreSQL clause, check the official documentation. And if to get the behind-the-scenes story of how the Timescale Team made DISTINCT queries up to 8,000x faster, read our blog post.