---
title: Analyse geospatial data with PostGIS | Tiger Data Docs
description: Store, query, and analyze geospatial data using the PostGIS extension
---

The `postgis` PostgreSQL extension provides storing, indexing, and querying geographic data. It helps in spatial data analysis, the study of patterns, anomalies, and theories within spatial or geographical data.

For more information about these functions and the options available, see the [PostGIS documentation](https://postgis.net/documentation/).

## Use the `postgis` extension to analyze geospatial data

The `postgis` PostgreSQL extension allows you to conduct complex analyses of your geospatial time-series data. Tiger Data understands that you have a multitude of data challenges and helps you discover when things happened, and where they occurred. In this example you can query when the `covid` cases were reported, where they were reported, and how many were reported around a particular location.

1. **Install the `postgis` extension**

   ```
   CREATE EXTENSION postgis;
   ```

   Confirm the extension is installed using the `\dx` command:

   ```
                                       List of installed extensions
   Name         | Version |   Schema   |                                      Description
   ---------------------+---------+------------+---------------------------------------------------------------------------------------
    pg_stat_statements  | 1.10    | public     | track planning and execution statistics of all SQL statements executed
    pgcrypto            | 1.3     | public     | cryptographic functions
    plpgsql             | 1.0     | pg_catalog | PL/pgSQL procedural language
    postgis             | 3.3.3   | public     | PostGIS geometry and geography spatial types and functions
    timescaledb         | 2.24.0  | public     | Enables scalable inserts and complex queries for time-series data (Community Edition)
    timescaledb_toolkit | 1.22.0  | public     | Library of analytical hyperfunctions,     time-series pipelining, and other SQL utilities
   (6 rows)
   ```

2. **Create a hypertable named `covid_location`**

   `location` is a `GEOGRAPHY` type column that stores GPS coordinates using the 4326/WGS84 coordinate system, and `time` records the time the GPS coordinate was logged for a specific `state_id`. This hypertable is partitioned on the `time` column:

   ```
   CREATE TABLE covid_location (
     time TIMESTAMPTZ NOT NULL,
     state_id INT NOT NULL,
     location GEOGRAPHY(POINT, 4326),
     cases INT NOT NULL,
     deaths INT NOT NULL
   ) WITH (
     tsdb.hypertable
   );
   ```

   When you create a hypertable using [CREATE TABLE … WITH …](/docs/reference/timescaledb/hypertables/create_table/index.md), the default partitioning column is automatically the first column with a timestamp data type. Also, TimescaleDB creates a [columnstore policy](/docs/reference/timescaledb/hypercore/add_columnstore_policy/index.md) that automatically converts your data to the columnstore, after an interval equal to the value of the [chunk\_interval](/docs/reference/timescaledb/hypertables/set_chunk_time_interval/index.md), defined through `after` in the policy. This columnar format enables fast scanning and aggregation, optimizing performance for analytical workloads while also saving significant storage space. In the columnstore conversion, hypertable chunks are compressed by up to 98%, and organized for efficient, large-scale queries.

   You can customize this policy later using [alter\_job](/docs/reference/timescaledb/jobs-automation/alter_job/index.md). However, to change `after` or `created_before`, the compression settings, or the hypertable the policy is acting on, you must [remove the columnstore policy](/docs/reference/timescaledb/hypercore/remove_columnstore_policy/index.md) and [add a new one](/docs/reference/timescaledb/hypercore/add_columnstore_policy/index.md).

   You can also manually [convert chunks](/docs/reference/timescaledb/hypercore/convert_to_columnstore/index.md) in a hypertable to the columnstore.

3. **Create an index on the `state_id` column**

   ```
   CREATE INDEX ON covid_location (state_id, time DESC);
   ```

4. **Insert sample data**

   The longitude and latitude coordinates of New Jersey are (-73.935242 40.730610), and New York are (-74.871826 39.833851):

   ```
   INSERT INTO covid_location VALUES
   ('2023-06-28 20:00:00',34,'POINT(-74.871826 39.833851)',5,2),
   ('2023-06-28 20:00:00',36,'POINT(-73.935242 40.730610)',7,1),
   ('2023-06-29 20:00:00',34,'POINT(-74.871826 39.833851)',14,0),
   ('2023-06-29 20:00:00',36,'POINT(-73.935242 40.730610)',12,1),
   ('2023-06-30 20:00:00',34,'POINT(-74.871826 39.833851)',10,4);
   ```

5. **Fetch all cases of a specific state during a specific period**

   ```
   SELECT * FROM covid_location
   WHERE state_id = 34 AND time BETWEEN '2023-06-28 00:00:00' AND '2023-06-30 23:59:59';
   ```

   The data you get back looks a bit like this:

   ```
                        time          | state_id |                      location                      | cases | deaths
   ------------------------+----------+----------------------------------------------------+-------+--------
    2023-06-28 20:00:00+00 |       34 | 0101000020E61000005C7347FFCBB752C0535E2BA1BBEA4340 |     5 |      2
    2023-06-29 20:00:00+00 |       34 | 0101000020E61000005C7347FFCBB752C0535E2BA1BBEA4340 |    14 |      0
    2023-06-30 20:00:00+00 |       34 | 0101000020E61000005C7347FFCBB752C0535E2BA1BBEA4340 |    10 |      4
   (3 rows)
   ```

6. **Fetch the latest logged cases using SkipScan**

   Replace `<Interval_Time>` with the number of days between the day you are running the query and the day the last report was logged in the table, in this case 30, June, 2023:

   ```
   SELECT DISTINCT ON (state_id) state_id, ST_AsText(location) AS location
   FROM covid_location
   WHERE time > now() - INTERVAL '<Interval_Time>'
   ORDER BY state_id,
   time DESC;
   ```

   The `ST_AsText(location)` function converts the binary geospatial data into human-readable format. The data you get back looks a bit like this:

   ```
   state_id |          location
   ----------+-----------------------------
   34 | POINT(-74.871826 39.833851)
   (1 row)
   ```

7. **Fetch all cases within 10000 meters of Manhattan**

   ```
   SELECT DISTINCT cases, state_id
   FROM covid_location
   WHERE ST_DWithin(
   location,
   ST_GeogFromText('POINT(-73.9851 40.7589)'),
   10000
   );
   ```

   The data you get back looks a bit like this:

   ```
   cases | state_id
   -------+----------
    7 |       36
   12 |       36
   (2 rows)
   ```
