---
title: "How FlightAware Fuels Flight Prediction Models for Global Travelers With TimescaleDB and Grafana"
published: 2020-10-05T13:38:11.000-04:00
updated: 2026-01-07T11:46:13.000-05:00
excerpt: "Learn how FlightAware architected a monitoring system - combining  TimescaleDB, Grafana, and Docker  - that allows them to power real-time flight predictions, analyze prediction performance, and continuously improve their models."
tags: Dev Q&A, #CTA-signup
authors: Caroline Rodewig, Lacey Butler
---

> **TimescaleDB is now Tiger Data.**

_This is an installment of our “Community Member Spotlight” series, where we invite our customers to share their work, shining a light on their success and inspiring others with new ways to use technology to solve problems._

  
_In this edition,_ [Caroline Rodewig](https://www.linkedin.com/in/caroline-rodewig-3ba81095/)_, Senior Software Engineer and Predict Crew Lead at FlightAware,  joins us to share how they’ve architected a monitoring system that allows them to power real-time flight predictions, analyze prediction performance, and continuously improve their models._

[FlightAware](https://flightaware.com/) is the world's largest [flight tracking and data platform](https://thepointsguy.com/news/how-flight-tracking-site-flightaware-works-for-consumers-and-airlines/); we fuse hundreds of global data sources to produce an accurate, consistent view of flights around the world. We make this data available to users through web and mobile applications, as well as different APIs.

Our customers cover a number of different segments, including:

-   **Travelers / aviation enthusiasts** who use our website and mobile apps to track flights (e.g., using our “where’s my flight?” program).
-   **Business aviation providers** (such as [Fixed Base Operators](https://en.wikipedia.org/wiki/Fixed-base_operator) or [aircraft operators](https://en.wikipedia.org/wiki/Air_operator%27s_certificate)) who use flight-tracking data and custom reporting to support their businesses.
-   **Airlines** that use flight-tracking data or our predictive applications to operate more efficiently.

_Editor’s Note: for more information about FlightAware’s products (and ways to harness its data infrastructure), check out_ [_this overview_](https://flightaware.com/commercial/)_. Want to build your own flight tracking receiver and ground station? See_ [_FlightAware’s PiAware tutorial._](https://flightaware.com/adsb/piaware/build)  

## About the team

The Predictive Technologies [crew](https://flightaware.engineering/crews-wings-and-alliances-part-1-the-principles-of-how-we-work/) is responsible for FlightAware's predictive applications, which as a whole are called "FlightAware Foresight." At the moment, our small-but-mighty team is made up of only three people: our project manager [James Parkman](https://www.linkedin.com/in/jamesparkman/), software engineer Andrew Brooks, and myself. We each wear many different hats; a day's work can cover anything from Tier 2 customer support to R&D, and everything in between.

A former crew member, Diorge Tavares, wrote a [cool article](https://flightaware.engineering/flying-through-the-clouds-flightawares-journey-into-machine-learning-at-scale/) about his experience as a site reliability engineer embedded in the Predict crew. He helped us design infrastructure and led our foray into cloud computing; now that our team is more established, he’s moved back to the FlightAware Systems team full-time.

## About the project

Our team's chief project is predicting flight arrival times, or ETAs; we predict both landing (EON) and gate arrival (EIN) times. And, ultimately, we need to monitor, visualize, and alarm on the _quality_ of those predictions. This is where TimescaleDB fits in.

Not only should we track how our prediction error changes over the course of each flight, we also need to track how our error changes over months - or years! - to ensure we're continually improving our predictions. Our predictive models can have short bursts of inaccuracy - like failing to anticipate the impact of a huge storm - but they can also drift slowly over time as real-life behaviors change.

As an example of the type of data we extract, the below is our "Worst Flights" dashboard, which we use for QA. (Looking through outliers is an easy way to spot bugs.) The rightmost column compares our error to third-parties', so we can see how we're doing relative to the rest of the industry.

![Screenshot of Grafana UI, showing 2 tables and colored bands for error rate/avg difference (green, orange, red)](https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/images/2022/01/Picture3-2.png)

_Our Grafana dashboard for tracking "Worst Flights" and our prediction quality vs. other data sources_

But, we also go deep into specific flights, like the below "Single Flight" dashboard view. This is useful for debugging, as it gives a detailed picture of how our predictions changed over the course of a _single_ flight.

![Screenshot of Grafana UI, showing line graph and 4 gauges](https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/images/2022/01/Picture4-2.png)

_Our Grafana dashboard for debugging and assessing our prediction quality at the individual flight level_

## Choosing (and using) TimescaleDB

We tested out several different monitoring setups before settling on TimescaleDB and Grafana. We recently published a [blog post](https://flightaware.engineering/systems-monitoring-with-prometheus-grafana/) detailing our quest for a monitoring system, which I’ve summarized below.

First, we considered using [Zabbix](https://www.zabbix.com); it's widely used at FlightAware, where most software reports into Zabbix in one way or another. However, we quickly realized that Zabbix was not the tool for the job – our Systems crew had serious doubts that Zabbix would be able to handle the load of all the metrics we wanted to track:

**We make predictions for around 75,000 flights per day; if we only stored two error values per flight (much fewer than we wanted), it would require making 100 inserts per minute.**

After ruling out Zabbix, I started looking at [Grafana](https://grafana.com/) as a visualization and alerting tool, and it seemed to have all the capabilities we needed. For my database backend, I first picked [Prometheus](https://prometheus.io), because it was near the top of Grafana's "supported databases" list and its built-in visualization capabilities seemed promising for rapid development.

I didn't know much about time-series databases, and, while Prometheus is a good fit for some data, it really didn't fit mine well:

-   **No JOINs**. My only prior database experience was with PostgreSQL, and it didn't occur to me that some databases just wouldn't support JOINs. While we _could_ have worked around this issue by inserting specific, already-joined error metrics, this would have limited the flexibility and "query-a-bility" of the data.
-   **Number of labels to store**. At the bare minimum, we wanted to store EON and EIN predictions for 600 airports, at least 10 times throughout each flight. This works out to 12,000 different label combinations, each stored as a time series – which [Prometheus is not currently designed to handle](https://prometheus.io/docs/practices/instrumentation/#do-not-overuse-labels).

And, that’s when I found TimescaleDB. A number of factors went into our decision to use TimescaleDB, but here are the top four:

-   **Excellent performance.** [This article comparing TimescaleDB vs. PostgreSQL performance](https://timescale.ghost.io/blog/blog/timescaledb-vs-6a696248104e/) really impressed me. Getting consistent performance, despite the number of rows in the table, was critical to our goal of storing performance data over several years.
-   **Institutional knowledge.** FlightAware uses PostgreSQL in a vast number of applications, so there was already a lot of institutional knowledge and comfort with SQL.
-   **Impressive documentation.** I have yet to have an issue or question that wasn't discussed and answered in the docs. Plus, it was trivial to test out – I love one-line docker start-up commands (see [TimescaleDB Docker Installation instructions](https://docs.timescale.com/latest/getting-started/installation/docker/installation-docker)).
-   **Grafana support.** I was pretty confident that I wanted to use Grafana to visualize our data and power our dashboards, so this was a potential dealbreaker.

![Screenshot of Grafana UI, showing various line graphs for different key metrics](https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/images/2022/01/Picture2-4.png)

_We use several Grafana dashboards, like this one, to view detailed performance over time (average error trends over one or more airports)_

_Editor’s Note: To learn more about TimescaleDB and Grafana,_ [_see our Grafana tutorials_](https://docs.timescale.com/latest/tutorials/tutorial-grafana) _(5 step-by-step guides for building visualizations, using variables, setting up alerts, and more) and_ [_Grafana how-to blog posts_](https://timescale.ghost.io/blog/tag/grafana/)_._

_To see how to use TimescaleDB to perform time-series forecasting and analysis,_ [_check out our time-series forecasting tutorial_](https://docs.timescale.com/latest/tutorials/tutorial-forecasting) _(includes two forecasting methods, best practices, and sample queries)._  

## Current deployment & use cases

Our architecture is pretty simple (see diagram below). We run a copy of this setup in several environments: production, production hot-standby, staging, and test. Each environment has its own predictions database, which allows us to compare our predictions in staging to those in production and validate changes _before_ they get released.  

⭐ **Pro tip:** we periodically sync Grafana configurations from production to each of the other environments, which reduces the manual work involved in updating dashboards across instances.

![Architecture diagram, depicting how variou system elements fit together](https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/images/2022/01/Picture1.jpg)

FlightAware Predict team's system architecture, which uses custom Python programs, Docker, Grafana, and TimescaleDB

After some trial and error, we’ve set up our TimescaleDB schema as follows:

**(1) Short term (1 week) tables for arrivals, our own predictions, and third-party predictions.** The predict-assessor program reads our flight data feed, extracts ETA predictions and arrival times, and inserts them into the database. For scale, the arrivals table typically contains 500k rows, and the predictions tables each contain 5M rows.

-   Each table is chunked: arrivals by arrival time and predictions by the time the prediction was made.
-   We use archiving functions to copy some data into long-term storage, and [a `drop_chunks` policy](https://docs.timescale.com/latest/api#drop_chunks) to ensure that rows older than one week are dropped to prevent unlimited table growth.

**(2) Long term (permanent) table for prediction and prediction-error data.** Archiving functions move data to the long term table by joining the short terms tables together. They also "threshold" the data to reduce verbosity, by only storing predictions at predetermined intervals; i.e., predictions that were present 1 and 2 hours before arrival are migrated to long-term tables, but intermediate predictions (i.e., at 1.5 hours before arrival) are not kept.

-   Between the join and the threshold, the archiving process reduces the average number rows per flight from **25** (across 3 short-term tables) to **6**!
-   We haven’t enabled a `drop_chunks` policy on this table as of now; after ~9 months of running this setup, our database file is pretty manageable at 54GB. If we start having space issues, we'd opt to store fewer predictions per flight rather than lose any year-over-year historical data.

##   
Biggest "Aha!" moment

> Continuous aggregates are what well and truly sold me on TimescaleDB. We went from 6.4 seconds to execute a query to 30_ms._ Yes, milliseconds.

I was embarrassingly late to the party when it comes to continuous aggregates. When I first set up our database, every query was fast because the database was small. However, as we added data over time, some queries slowed down significantly.

The biggest offender was a query on our KPIs dashboard, visualized in Grafana below. This graph gives us a birds-eye view of error over time. The blue line represents the average error for all airports at a certain time before arrival; the red line shows the number of flights per day. (You can see the huge traffic drop when airlines stopped flights in March, due to the COVID-19 pandemic.)

![Screenshot of Grafana UI, showing line graph](https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/images/2022/01/Picture5.png)

_Our KPI dashboard includes various metrics, including our average error rate and total flights per day across all airports_

**_Before_ learning about continuous aggregates, the query to extract this data looked like this:**

```SQL
SELECT
  time_bucket('1 day', arr_time) AS "time",
  AVG(get_error(prediction_fa, arr_time)) AS on_error,
  count(*) AS on_count
FROM prediction_history
WHERE 
  time_out = '02:00:00' AND 
  arr_time BETWEEN '2020-03-01' AND '2020-09-05'
GROUP BY 1
ORDER BY 1
```

It took 6.4 seconds and aggregated 1.6M rows, from a table of 147M rows.

For what the query was doing, this runtime wasn't too bad – the table was chunked by `arr_time`, which the query planner could take advantage of.

I considered adding indexes to make the query faster, but wasn't convinced they would help much and was concerned about the resulting performance penalties for inserts.

I also considered creating a materialized view to aggregate the data and writing a cron job to regularly refresh it...but that seemed like a hassle, and after all, I could wait 10 seconds for something to load 🤷‍♀️.

Then, I discovered TimescaleDB's continuous aggregations! For the unfamiliar, they basically implement that regularly-refreshing materialized view idea, but in a far smarter way and with a bunch of cool extra features.

**Here's the view for the continuous aggregate:**

```SQL
CREATE VIEW error_by_time_out
WITH (timescaledb.continuous) AS
  SELECT
    time_out,
    time_bucket(INTERVAL '1 hour', arr_time) AS bucket,
    AVG(get_error(prediction_fa, arr_time)) AS avg_error,
    COUNT(*) AS count
  FROM prediction_history
  GROUP BY time_out, bucket;
```

The new data extraction query is a little bit harder to parse, because the error needs to be aggregated across continuous aggregate buckets:

```SQL
SELECT
  time_bucket('1 day', bucket) AS "time",
  SUM(avg_error * count) / SUM(count) AS error,
  SUM(count) AS count
FROM error_by_time_out
WHERE 
  time_out = '02:00:00' AND 
  bucket BETWEEN '2020-03-01' AND '2020-09-05'
GROUP BY 1
ORDER BY 1
```

...and I'll let you guess how long it takes....

30ms. **Yes, milliseconds. We went from 6.4 seconds to execute the query to 30**_**ms**._

On top of that, unlike in a classic materialized view, the whole view doesn't have to be recalculated every time it needs to be updated - _just the parts that have changed._ This means refreshes are lightning fast too.

**Continuous aggregates are what well and truly sold me on TimescaleDB.**

The amazing developers at Timescale recently made continuous aggregates even better through "real-time" aggregates. These will automatically fill in data between the last view refresh and real-time when they're queried, so you always get the most up-to-date data possible. Unfortunately, our database is a few versions behind so we're not using real-time aggregates yet, but I can't wait to upgrade and start using them.

_Editor’s Note: To learn more about real-time aggregates and how they work, see our_ [_“Ensuring up-to-date results with Real-Time Aggregations” blog and mini-tutorial_](https://timescale.ghost.io/blog/blog/achieving-the-best-of-both-worlds-ensuring-up-to-date-results-with-real-time-aggregation/) _(includes benchmarks, example scenarios, and resources to get started)._

## Getting started advice & resources

In addition to the documentation I’ve linked throughout this post, I'd recommend doing what I did: reading [the TimescaleDB docs](https://docs.timescale.com/), spinning up a test database, and going to town.

And, after a few months of use, make sure to go back and read the docs again – you'll discover all sorts of new things to try to make your database even faster (looking at you [timescaledb-tune](https://github.com/timescale/timescaledb-tune))!

_Editor’s Note: If you’d like to follow Caroline’s advice and start testing TimescaleDB for yourself, [Timescale Cloud](https://www.timescale.com/cloud) is the fastest way to get up and running - 100% free for 30 days, no credit card required. You can see self-managed and other hosted options_ [_here_](https://www.timescale.com/products)_._

_To learn more about timescale-tune,_ [_see our Configuring TimescaleDB documentation_](https://docs.timescale.com/latest/getting-started/configuring)_._

_We’d like to thank Caroline and the FlightAware team for sharing their story, as well as for their work to make accurate, reliable flight data available to travelers, aviation enthusiasts, and operators everywhere. We’re big fans of FlightAware at Team Timescale, and we’re honored to have them as members of our community!_

_We’re always keen to feature new community projects and stories on our blog. If you have a story or project you’d like to share, reach out on Slack (_[_@lacey butler_](https://timescaledb.slack.com/archives/DPFTYT9E0)_), and we’ll go from there._

_Additionally, if you’re looking for more ways to get involved and show your expertise, check out the_ [_Timescale Heroes_](https://www.timescale.com/timescale-heroes/) _program._