TimescaleDB to the Rescue - Speeding Up Statistics

By Kamil Ruczyński · 4 min read · Jun 20, 2025

Dev Q&A · TimescaleDB · PostgreSQL

Table of contents

01 MySQL Solution
02 What is TimescaleDB?
03 TimescaleDB Solution
04 TimescaleDB Continuous Aggregates
05 TimescaleDB Retention
06 Summary


This article, written by Kamil Ruczyński, was originally posted on Apr 6, 2025 on his blog. Kamil is a Staff Software Engineer specializing in backend solutions, with a strong focus on scalability and performance. It's reposted here with permission.

Some time ago, I was working on improving the performance of slow statistics. The problem was that our database contained billions of rows, making data retrieval slow, even for the last seven days. From a product perspective, we needed to display data for at least 30 days and in real-time. All the data was stored in MySQL without partitioning, so we had to find a better solution. Simply using a cache was not an option, as real-time data was required.

Let’s analyze it using a contrived example similar to the original one.

MySQL Solution

Let’s say that we have a table with the following structure:

CREATE TABLE agent_stats (
    id BIGINT AUTO_INCREMENT NOT NULL,
    agent_id INT NOT NULL,
    event_type VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id)
);

CREATE INDEX idx_agent_stats_created_at_agent_id_event_type ON agent_stats (created_at, agent_id, event_type);

We’re collecting statistics about our AI agents there. Now, we have two types of events:

  • triggered
  • response_generated

To test the performance of the database, we need to generate a lot of data. In MySQL there is no fast built-in way to generate random data, so I used a script that generated 24,234,964 records. It was quite slow.
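The original script isn’t shown here; as a rough sketch, one way to bulk-generate test rows in MySQL 8+ is a recursive CTE (the row count and value distribution below are assumptions, not the author’s actual script):

```sql
-- Sketch only: bulk-generate random test rows in MySQL 8+ with a recursive CTE.
-- The recursion depth limit defaults to 1000, so raise it for this session.
SET SESSION cte_max_recursion_depth = 1000000;

INSERT INTO agent_stats (agent_id, event_type, created_at)
WITH RECURSIVE seq (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 1000000
)
SELECT
    FLOOR(1 + RAND() * 10),                              -- agent_id in 1..10
    IF(RAND() < 0.5, 'triggered', 'response_generated'), -- random event type
    NOW() - INTERVAL FLOOR(RAND() * 2592000) SECOND      -- random time in the last 30 days
FROM seq;
```

Repeating this (or enlarging the sequence) gets the table to the tens of millions of rows, but it stays noticeably slower than the PostgreSQL approach shown later.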

And now, we want to get the number of triggered events for each agent in the last 30 days.

SELECT agent_id, event_type, COUNT(*) as count
FROM agent_stats
WHERE created_at > '2025-02-28 00:00:00'
GROUP BY agent_id, event_type

It’s very slow: it takes 11 seconds.

What is TimescaleDB?

TimescaleDB is an open-source time-series database built on PostgreSQL, designed to handle large volumes of time-series data efficiently. Basically, it is a PostgreSQL extension that adds time-series capabilities to the database. It’s optimized for time-series ingestion, and it provides features like automatic partitioning, retention policies, and continuous aggregates.
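Because it ships as an extension, enabling it in a database where the TimescaleDB packages are already installed takes a single statement:

```sql
-- Enable the TimescaleDB extension in the current database
CREATE EXTENSION IF NOT EXISTS timescaledb;
```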

TimescaleDB Solution

So let’s try to use TimescaleDB to speed up our statistics. Creating a similar table in TimescaleDB:

CREATE TABLE agent_stats(
   created_at TIMESTAMPTZ NOT NULL,
   agent_id BIGINT NOT NULL CHECK (agent_id > 0),
   event_type VARCHAR(255) NOT NULL
);

SELECT create_hypertable('agent_stats', 'created_at');

CREATE INDEX ON agent_stats (created_at DESC, agent_id, event_type);

The create_hypertable function creates a hypertable, which is a TimescaleDB abstraction for a standard PostgreSQL table, but with automatic partitioning based on time.
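By default, each chunk covers a fixed time interval (seven days unless configured otherwise). If you know your data volume, you can set the interval explicitly; the one-day value below is just an illustration, not a recommendation:

```sql
-- Sketch: create the hypertable with an explicit chunk interval
SELECT create_hypertable('agent_stats', 'created_at',
    chunk_time_interval => INTERVAL '1 day');
```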

Generating data in TimescaleDB is much more convenient and a lot faster than in MySQL, thanks to PostgreSQL’s generate_series function.

INSERT INTO agent_stats (created_at, agent_id, event_type)
SELECT
    time,
    random() * 9 + 1, /* 1 - 10 */
    'triggered'
FROM
    generate_series(
        '2024-01-01 00:00:00',
        '2025-03-28 16:00:00',
        INTERVAL '1 second'
    ) AS time;

INSERT INTO agent_stats (created_at, agent_id, event_type)
SELECT
    time,
    random() * 9 + 1, /* 1 - 10 */
    'response_generated'
FROM
    generate_series(
        '2024-01-01 00:00:00',
        '2025-03-28 16:00:00',
        INTERVAL '1 second'
    ) AS time;

This way we generate 2 records per second over a range of roughly 1 year and 3 months. I repeated this a few times, and in a few minutes a total of 184,675,516 records was generated.

Now, let’s get the same data as before:

SELECT agent_id, event_type, COUNT(*) as count
FROM agent_stats
WHERE created_at > '2025-02-28 00:00:00'
GROUP BY agent_id, event_type

Keep in mind that we now have much more data, so this query is also slow: it takes 9 seconds. Of course, on the same volume of data as in MySQL it would be much faster, thanks to the partitioning. OK, so now we need to speed it up.

TimescaleDB Continuous Aggregates

Continuous aggregates are a powerful feature of TimescaleDB that allows you to pre-compute and store the results of aggregate queries over time. They are based on the concept of materialized views in PostgreSQL, but they can also return data in real time.

Let’s create a new continuous aggregate for agent_stats table:

CREATE MATERIALIZED VIEW hourly_agent_stats WITH (timescaledb.continuous)
AS
SELECT
    time_bucket('1 hour', created_at) as hour,
    agent_id,
    event_type,
    COUNT(1) AS occurrences
FROM agent_stats
GROUP BY
    hour,
    agent_id,
    event_type
WITH NO DATA
;

We also need to create a refresh policy to keep the materialized view up to date.

SELECT add_continuous_aggregate_policy('hourly_agent_stats',
    start_offset => INTERVAL '1 day',
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour'
);

Now we need to populate the materialized view with the data we already have in the table.

CALL refresh_continuous_aggregate('hourly_agent_stats', '2024-01-01', '2025-03-28');

It takes some time; in production, it would probably be better to do it incrementally, e.g., day by day.

How it looks in the database:

[Image: sample rows from the hourly_agent_stats view]

Records in the view are aggregated by hour, so we have one record per hour per agent_id and event_type.
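You can verify this by querying the view like any other relation (the rows you see will depend on your generated data):

```sql
-- Peek at the most recent aggregated buckets
SELECT hour, agent_id, event_type, occurrences
FROM hourly_agent_stats
ORDER BY hour DESC, agent_id
LIMIT 5;
```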

Now we can modify the query to take advantage of the newly created continuous aggregate:

SELECT agent_id, event_type, SUM(occurrences) as occurrences
FROM hourly_agent_stats
WHERE hour > '2025-02-28 00:00:00'
GROUP BY agent_id, event_type

This query is much faster, and it takes only 0.02 seconds.

It’s possible to get real-time data from the continuous aggregate, as historical data is fetched from the view, while the last hour (since we use a 1-hour bucket) is fetched from the source table.
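This real-time behavior is controlled by the view’s materialized_only option. The default has changed across TimescaleDB versions (newer releases default to materialized-only), so if you rely on real-time results it’s worth setting it explicitly:

```sql
-- Make the continuous aggregate combine materialized buckets
-- with not-yet-materialized rows from the source table
ALTER MATERIALIZED VIEW hourly_agent_stats
SET (timescaledb.materialized_only = false);
```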

TimescaleDB Retention

TimescaleDB also provides a retention policy feature that allows you to automatically drop old data. Let’s say that we need only 6 months of detailed data, and 1 year of aggregated data. We can set up the following retention policies:

SELECT add_retention_policy('agent_stats', INTERVAL '6 MONTH');

This will drop all data older than 6 months from the source table.

SELECT add_retention_policy('hourly_agent_stats', INTERVAL '1 YEAR');

This will drop all data older than 1 year from the continuous aggregate.

Summary

We could achieve some performance improvement by using partitioning in MySQL, but it would be a slight improvement and additional work. Adding TimescaleDB increases the complexity of the whole system, as it’s a new technology that needs to be maintained. Still, it’s a great choice for time-series data, and a great choice from an application engineer’s point of view, as it provides a lot of useful features. And if you’re already using PostgreSQL, adopting TimescaleDB is even easier, since you don’t need to learn a whole new technology: it’s just an extension.
