TigerData logo
TigerData logo
  • Product

    Product

    Tiger Cloud

    Robust elastic cloud platform for startups and enterprises

    TimescaleDB Enterprise

    Self-managed TimescaleDB for on-prem, edge and private cloud

    Open source

    TimescaleDB

    Time-series, real-time analytics and events on Postgres

    Search

    Vector and keyword search on Postgres

  • Industry

    Crypto

    Energy & Utilities

    Oil & Gas Operations

  • Docs
  • Pricing
  • Developer Hub

    Changelog

    Benchmarks

    Blog

    Community

    Customer Stories

    Events

    Support

    Integrations

    Launch Hub

  • Company

    Contact us

    About

    TigerData logo

    Timescale

    Partners

    Security

    Careers

Start a free trial
Tiger Data

Products

  • Time-series and Analytics
  • AI and Vector
  • Enterprise Plan
  • Cloud Status
  • Support
  • Security
  • Cloud Terms of Service

Learn

  • Documentation
  • Blog
  • Tutorials
  • Changelog
  • Success Stories
  • Time-series Database

Company

  • Contact Us
  • Careers
  • About
  • Newsroom
  • Brand
  • Community
  • Code Of Conduct
  • Events
Privacy preferencesLegalPrivacySitemap

Subscribe to the Tiger Data newsletter

Gold Partner with Inductive Automation — Ignition

2026 (c) Timescale, Inc., d/b/a Tiger Data.
All rights reserved.

Tiger Data
GOLD PARTNER WITHINDUCTIVE AUTOMATION

2026 (c) Timescale, Inc., d/b/a Tiger Data.
All rights reserved.

Privacy preferencesLegalPrivacySitemap

Back to blog

Copy as HTML

Open in ChatGPT

Open in Claude

Open in v0

Kamil Ruczyński

By Kamil Ruczyński

4 min read

Jun 20, 2025

Dev Q&ATimescaleDBPostgreSQL

Table of contents

01 MySQL Solution02 What is TimescaleDB?03 TimescaleDB Solution04 TimescaleDB Continuous Aggregates05 TimescaleDB Retention06 Summary

TimescaleDB to the Rescue - Speeding Up Statistics

TimescaleDB to the Rescue - Speeding Up Statistics

Back to blog

Dev Q&A
Kamil Ruczyński

By Kamil Ruczyński

4 min read

Jun 20, 2025

Table of contents

01 MySQL Solution02 What is TimescaleDB?03 TimescaleDB Solution04 TimescaleDB Continuous Aggregates05 TimescaleDB Retention06 Summary

Copy as HTML

Open in ChatGPT

Open in Claude

Open in v0

This article, written by Kamil Ruczyński, was originally posted on Apr 6, 2025 on his blog. Kamil is a Staff Software Engineer specializing in backend solutions, with a strong focus on scalability and performance. It's reposted here with permission.

Some time ago, I was working on improving the performance of slow statistics. The problem was that our database contained billions of rows, making data retrieval slow, even for the last seven days. From a product perspective, we needed to display data for at least 30 days and in real-time. All the data was stored in MySQL without partitioning, so we had to find a better solution. Simply using a cache was not an option, as real-time data was required.

Let’s analyze it on contrived example but similar to the original one.

MySQL Solution

Let’s say that we have a table with the following structure:

CREATE TABLE agent_stats (
    id BIGINT AUTO_INCREMENT NOT NULL,
    agent_id INT NOT NULL,
    event_type VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id)
);

CREATE INDEX idx_agent_stats_created_at_agent_id_event_type ON agent_stats (created_at, agent_id, event_type);

We’re collecting there statistics of our AI agents. Now, we have two types of events:

  • triggered
  • response_generated

To test the performance of the database, we need to generate a lot of data. In MySQL there is no fast way to generate random data, so I used some script which generated 24 234 964 records. It was quite slow.

And now, we want to get the number of triggered events for each agent in the last 30 days.

SELECT agent_id, event_type, COUNT(*) as count
FROM agent_stats
WHERE created_at > '2025-02-28 00:00:00'
GROUP BY agent_id, event_type

It’s very slow, and takes 11 seconds.

What is TimescaleDB?

TimescaleDB is an open-source time-series database built on PostgreSQL. It is designed to handle large volumes of time-series data efficiently. Basically it is a PostgreSQL extension that adds time-series capabilities to the database. It’s optimized for insertions of time-series data, and it provides features like automatic partitioning, retention policies, and continuous aggregates.

TimescaleDB Solution

So let’s try to use TimescaleDB to speed up our statistics. Creating a similar table in TimescaleDB:

CREATE TABLE agent_stats(
   created_at TIMESTAMPTZ NOT NULL,
   agent_id BIGINT NOT NULL CHECK (agent_id > 0),
   event_type VARCHAR(255) NOT NULL
);

SELECT create_hypertable('agent_stats', 'created_at');

CREATE INDEX ON agent_stats (created_at DESC, agent_id, event_type);

The create_hypertable function creates a hypertable, which is a TimescaleDB abstraction for a standard PostgreSQL table, but with automatic partitioning based on time.

Generating data in TimescaleDB is much convenient and a lot faster than in MySQL.

INSERT INTO agent_stats (created_at, agent_id, event_type)
SELECT
    time,
    random() * 9 + 1, /* 1 - 10 */
    'triggered'
FROM
    generate_series(
        '2024-01-01 00:00:00',
        '2025-03-28 16:00:00',
        INTERVAL '1 second'
    ) AS time;

INSERT INTO agent_stats (created_at, agent_id, event_type)
SELECT
    time,
    random() * 9 + 1, /* 1 - 10 */
    'response_generated'
FROM
    generate_series(
        '2024-01-01 00:00:00',
        '2025-03-28 16:00:00',
        INTERVAL '1 second'
    ) AS time;

In this way we can generate 2 records per second in the range of around 1 year and 3 months. I repeated this a few times, and in a few minutes total of 184 675 516 records were generated.

Now, let’s get the same data as before:

SELECT agent_id, event_type, COUNT(*) as count
FROM agent_stats
WHERE created_at > '2025-02-28 00:00:00'
GROUP BY agent_id, event_type

Keep in mind that we have much more data now, so the query it’s also slow. Now it takes 9 seconds, but of course on the same data as in MySQL it would be much faster, because of the partitioning. Ok, so now we need to speed it up.

TimescaleDB Continuous Aggregates

Continuous aggregates are a powerful feature of TimescaleDB that allows you to pre-compute and store the results of query aggregations over time. It’s based on the concept of materialized views in PostgreSQL, but it can return data real-time.

Let’s create a new continuous aggregate for agent_stats table:

CREATE MATERIALIZED VIEW hourly_agent_stats WITH (timescaledb.continuous)
AS
SELECT
    time_bucket('1 hour', created_at) as hour,
    agent_id,
    event_type,
    COUNT(1) AS occurrences
FROM agent_stats
GROUP BY
    hour,
    agent_id,
    event_type
WITH NO DATA
;

We also need to create a refresh policy to keep the materialized view up to date.

SELECT add_continuous_aggregate_policy('hourly_agent_stats',
    start_offset => INTERVAL '1 day',
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour'
);

Now we need to populate the materialized view with the data we already have in the table.

CALL refresh_continuous_aggregate('hourly_agent_stats', '2024-01-01', '2025-03-28');

It takes some time, and probably on the production it would be better to do it incrementally e.g. per day.

How it looks in the database:

image

Records in the view are aggregated by hour, so we have one record per hour per agent_id and event_type.

Now we can modify the query to take advantage of the newly created continuous aggregate:

SELECT agent_id, event_type, SUM(occurrences) as occurrences
FROM hourly_agent_stats
WHERE hour > '2025-02-28 00:00:00'
GROUP BY agent_id, event_type

This query is much faster, and it takes only 0.02 seconds.

It’s possible to get real-time data from the continuous aggregate, as historical data is fetched from the view, while the last hour (since we use a 1-hour bucket) is fetched from the source table.

TimescaleDB Retention

TimescaleDB also provides a retention policy feature that allows you to automatically drop old data. Let’s say that we need only 6 months of detailed data, and 1 year of aggregated data. We can set up the following retention policies:

SELECT add_retention_policy('agent_stats', INTERVAL '6 MONTH');

This will drop all data older than 6 months from the source table.

SELECT add_retention_policy('hourly_agent_stats', INTERVAL '1 YEAR');

This will drop all data older than 1 year from the continuous aggregate.

Summary

We could achieve some performance improvement by using partitioning in MySQL, but it would be a slight improvement and additional work. Adding TimescaleDB increases the complexity of the whole system, as it’s a new technology that needs to be maintained, but it’s a great choice for time-series data and great choice from the point of view of application engineers, as it provides a lot of useful features. However, if you’re using PostgreSQL now, using TimescaleDB will be easier to implement, as you don’t need to learn a new technology. It’s just an extension.

Related posts

How ApexAnalytica Runs Building Telemetry, Transactional Data, and RAG on a Single Postgres Instance

How ApexAnalytica Runs Building Telemetry, Transactional Data, and RAG on a Single Postgres Instance

Dev Q&ATimescaleDB

May 27, 2026

See how open source TimescaleDB powers a year of hourly building energy data in a single heat map view — rendered in under a second across 38 live sites.

Read more

From 6 Seconds to Under 100ms: How the Embodied Carbon Observatory Separates Grid Improvement From Real Decarbonization With Tiger Data

From 6 Seconds to Under 100ms: How the Embodied Carbon Observatory Separates Grid Improvement From Real Decarbonization With Tiger Data

Dev Q&ATiger Data

Apr 23, 2026

How the Embodied Carbon Observatory uses TimescaleDB to cut queries from 6s to under 100ms, separating real decarbonization from grid improvement.

Read more

Stay updated with new posts and releases.

Receive the latest technical articles and release notes in your inbox.

Share

Start a free trial