<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
    <channel>
        <title><![CDATA[Tiger Data Blog]]></title>
        <description><![CDATA[Insights, product updates, and tips from TigerData (Creators of TimescaleDB) engineers on Postgres, time series & AI. IoT, crypto, and analytics tutorials & use cases.]]></description>
        <link>https://www.tigerdata.com/blog</link>
        <image>
            <url>https://www.tigerdata.com/icon.ico</url>
            <title>Tiger Data Blog</title>
            <link>https://www.tigerdata.com/blog</link>
        </image>
        <generator>RSS for Node</generator>
        <lastBuildDate>Tue, 07 Apr 2026 10:03:01 GMT</lastBuildDate>
        <atom:link href="https://www.tigerdata.com/blog" rel="self" type="application/rss+xml"/>
        <ttl>60</ttl>
        <item>
            <title><![CDATA[Slow Grafana Performance? Learn How to Fix It Using Downsampling]]></title>
            <description><![CDATA[Learn about two common visualization problems in Grafana—slow dashboards and noisy data—and how to fix them using downsampling in TimescaleDB.]]></description>
            <link>https://www.tigerdata.com/blog/slow-grafana-performance-learn-how-to-fix-it-using-downsampling</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/slow-grafana-performance-learn-how-to-fix-it-using-downsampling</guid>
            <category><![CDATA[Data Visualization]]></category>
            <category><![CDATA[PostgreSQL]]></category>
            <category><![CDATA[PostgreSQL Tips]]></category>
            <dc:creator><![CDATA[Brian Rowe]]></dc:creator>
            <pubDate>Thu, 23 Jun 2022 13:03:01 GMT</pubDate>
            <media:content medium="image" url="https://timescale.ghost.io/blog/content/images/2022/06/Grafana-Downsampling-post--1-.png">
            </media:content>
            <content:encoded><![CDATA[<h2 id="downsampling-in-grafana">Downsampling in Grafana</h2><p>Graphs are awesome. They allow us to understand data more quickly and easily, highlighting trends that otherwise wouldn’t stand out. And Grafana, the open-source visualization tool, is fantastic for creating graphs, especially for time-series data. </p><p>If you have some data that you want to analyze visually, you just hook it up to your Grafana instance, set up your query, and you’re off to the races. (If you’re new to Grafana and Timescale, don’t worry, we’ve got you covered. See our Getting Started with Grafana and TimescaleDB <a href="https://docs.timescale.com/timescaledb/latest/tutorials/grafana/">docs</a> or <a href="https://youtube.com/playlist?list=PLsceB9ac9MHTjwvV18QJnPcLrTXm_Q-Ft">videos</a> to get up and running.)</p><p>However, while Grafana is an awesome tool for generating graphs, problems still arise when we have too much data. Extremely large datasets can be prohibitively slow to load, leading to frustrated users or, worse, unusable dashboards.</p><p>These large <a href="https://timescale.ghost.io/blog/what-the-heck-is-time-series-data-and-why-do-i-need-a-time-series-database-dcf3b1b18563/">time-series datasets</a> are especially common in industries like financial services, the Internet of Things, and <a href="https://timescale.ghost.io/blog/observability-powered-by-sql-understand-your-systems-like-never-before-with-opentelemetry-traces-and-postgresql/">observability</a>, as data can be relentless, often generated at high rates and volumes.</p><p>To better understand the problems that can occur when we have extremely large datasets, consider the example of stock ticker data and this graph showing 30 days' worth of trades for five different stocks (AAPL, TSLA, NVDA, MSFT, and AMD):</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/06/Grafana-Downsampling-2--1-.png" class="kg-image" alt="" loading="lazy" width="512" height="257"></figure><p>This graph is composed of five queries that collectively contain nearly 1.3 million data points, and it takes nearly 20 seconds to load, pan, or zoom!</p><p>Even with more manageable amounts of data, our graphs can still sometimes be difficult to interpret if the data is too noisy. If the daily variance of our data is high enough, it can hide the underlying trends that we're looking for. Consider this graph showing the volume of taxi trips taken in New York City over a two-month period:</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/06/Grafana-downsampling-3--1-.png" class="kg-image" alt="" loading="lazy" width="512" height="225"></figure><p>That spike a third of the way in may be a significant shift in volume, and those lower peaks toward the right edge might be a significant decline. It's not immediately obvious though, and certainly, this is not the powerful tool we want our graphs to be.</p><p>We can use different types of downsampling to solve the problems of slow-loading Grafana dashboards and noisy graphs, respectively. Downsampling is the practice of replacing a large set of data points with a smaller set.</p><p>We’ll implement our solutions using two of <a href="https://docs.timescale.com/timescaledb/latest/how-to-guides/hyperfunctions/">TimescaleDB’s hyperfunctions</a> for downsampling, making it easy to manipulate and analyze time-series data with fewer lines of SQL code. 
We’ll look at one hyperfunction for downsampling using the Largest Triangle Three Buckets or <code>lttb()</code> method, and another for downsampling using the ASAP smoothing algorithm, both of which come pre-installed with Timescale or can be accessed via the <a href="https://docs.timescale.com/timescaledb/latest/how-to-guides/hyperfunctions/install-toolkit/">timescaledb_toolkit extension</a> if you self-manage your database.</p><h2 id="example-1-load-faster-dashboards-with-lttb-downsampling">Example 1: Load faster dashboards with lttb() downsampling</h2><p>In our first example, which plots the prices for five stocks over a 30-day period, the problem is that we have way too much data, resulting in a slow-loading graph. This is because the <a href="https://docs.timescale.com/getting-started/latest/add-data/#about-the-dataset">real-time stocks dataset</a> we’re using has upwards of 10,000 points per day for each stock symbol!</p><p>Given the timeframe of our analysis (30 days), this is far more data than we need to spot a trend, and the time needed to load this graph is dominated by the cost of fetching all of the data.</p><p>To solve this problem, we need to find a way to reduce the number of data points we're getting from our data source. Unfortunately, doing this in a manner that doesn't drastically deform our graph is actually a very tricky problem. 
For example, let’s look at just the NVDA ticker price:</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/06/Grafana-Downsampling-4--1-.png" class="kg-image" alt="" loading="lazy" width="512" height="224"></figure><p><br>Here's what we see if we just naively take the 10-minute average for the NVDA symbol (overlaid in yellow on the original data).</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/06/Grafana-Downsampling-5--1-.png" class="kg-image" alt="" loading="lazy" width="512" height="224"></figure><p>The graph of the average (mean) roughly follows the underlying data but completely smooths away almost all of the peaks and valleys, and those are the most interesting parts of the dataset! Taking the first or last point from each bucket results in an even more skewed graph, as the outlying points have no weight unless they happen to fall in just the right spot.  </p><p>What we need is a way to capture the most interesting point from each bucket. To do that, we can use the <a href="https://docs.timescale.com/api/latest/hyperfunctions/downsample/lttb/"><code>lttb()</code> algorithm</a> which gives us a downsampled graph that follows the pattern of the original graph quite closely. 
(As an aside, <a href="http://skemman.is/stream/get/1946/15343/37285/3/SS_MSthesis.pdf"><code>lttb()</code></a> was invented by <a href="https://is.linkedin.com/in/sveinn-steinarsson">Sveinn Steinarsson</a> in his master’s thesis.)</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/06/Grafana-Downsampling-6--1-.png" class="kg-image" alt="" loading="lazy" width="512" height="224"></figure><p>Using <code>lttb()</code>, the downsampled data is barely distinguishable from the original, <strong>despite having less than 0.5% of the points!</strong></p><p><code>lttb()</code> works by keeping the same first and last point as the original data but dividing the rest of the data into equal intervals. For each interval, it then tries to find the most impactful point. It does this by building a triangle for each point in the interval with the point selected from the previous interval and the average of the points in the next interval. These triangles are compared with one another by area. The largest resulting triangle corresponds to the point in the interval that has the largest impact on how the graph looks.</p><p>As we see above, the result is a graph that very closely resembles the original. What's not as obvious is that the raw data was nearly 315,000 rows, which took over five seconds to pull into our dashboard. The <code>lttb()</code> data was 1,404 rows that took less than one second to fetch.</p><p>Here is the SQL query we used in our Grafana panel to get the <code>lttb()</code> data:</p><pre><code class="language-SQL">SELECT
  time AS "time",
  value AS "NVDA lttb"
FROM unnest((
    SELECT lttb(time, price, 2 * (($__to - $__from) / $__interval_ms)::int)
    FROM stocks_real_time
    WHERE symbol = 'NVDA' AND $__timeFilter("time"))
)
  ORDER BY 1;
</code></pre>
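<p>To make the bucket-selection step concrete, here is a minimal Python sketch of the LTTB idea. This is an illustration only, assuming an in-memory list of <code>(time, value)</code> pairs; it is not TimescaleDB's in-database implementation:</p>

```python
# Minimal sketch of largest-triangle-three-buckets (LTTB) downsampling.
# Illustration only -- not TimescaleDB's in-database implementation.
def lttb(points, threshold):
    """points: sorted list of (x, y) pairs; threshold: points to keep (>= 3)."""
    n = len(points)
    if threshold >= n or threshold < 3:
        return list(points)
    sampled = [points[0]]               # always keep the first point
    bucket = (n - 2) / (threshold - 2)  # width of each interior bucket
    a = 0                               # index of the previously selected point
    for i in range(threshold - 2):
        # The *next* bucket's average acts as the third triangle vertex.
        ns, ne = int((i + 1) * bucket) + 1, min(int((i + 2) * bucket) + 1, n)
        avg_x = sum(p[0] for p in points[ns:ne]) / (ne - ns)
        avg_y = sum(p[1] for p in points[ns:ne]) / (ne - ns)
        # Keep the point in the current bucket that forms the largest triangle
        # with the previously kept point and the next bucket's average.
        cs, ce = int(i * bucket) + 1, int((i + 1) * bucket) + 1
        ax, ay = points[a]
        best_area, best_idx = -1.0, cs
        for j in range(cs, ce):
            x, y = points[j]
            area = abs((ax - avg_x) * (y - ay) - (ax - x) * (avg_y - ay)) / 2
            if area > best_area:
                best_area, best_idx = area, j
        sampled.append(points[best_idx])
        a = best_idx
    sampled.append(points[-1])          # always keep the last point
    return sampled
```

<p>Downsampling 1,000 points with <code>lttb(points, 20)</code> keeps the first point, the last point, and the 18 interior points that most affect the graph's shape, which is why the curve stays visually faithful.</p>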
<p>As you can see, the real work here is done by the <a href="https://docs.timescale.com/api/latest/hyperfunctions/downsample/lttb/"><code>lttb()</code> hyperfunction</a> call in the inner <code>SELECT</code>. This function takes the <code>time</code> and <code>price</code> columns from our table, plus a third integer argument specifying the target resolution, i.e., the number of points it should return.</p><p>Unfortunately, Grafana doesn't directly expose the panel width in pixels to us, but we can get an approximation from the <a href="https://grafana.com/docs/grafana/latest/variables/variable-types/global-variables/#__interval"><code>$__interval</code> global variable</a> (which is approximately <code>(to - from) / resolution</code>). For this graph, the interval was a bit of an underestimate, which is why we double it in the query above.</p><p>Our <code>lttb()</code> hyperfunction returns a custom <a href="https://docs.timescale.com/timescaledb/latest/how-to-guides/hyperfunctions/function-pipelines/#timevectors"><code>timevector</code> object</a>, which we expand with <code>unnest</code> into <code>time</code>, <code>value</code> rows that Grafana can understand and plot.</p><h2 id="example-2-find-signal-from-noisy-datasets-with-asap-smoothing-downsampling">Example 2: Find the signal in noisy datasets with ASAP smoothing</h2><p><code>lttb()</code> is a fantastic downsampling algorithm for producing a subset of points that maintains the visual appearance of a graph. Sometimes, however, the problem is that the original graph is so noisy that the long-term trends we're trying to see are lost in the normal periodic variance of the data. 
This is the case we saw in our second example above, that of taxi data (and shown below):</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/06/Graf-Downsampling-7--1-.png" class="kg-image" alt="" loading="lazy" width="512" height="225"></figure><p>In this case, what we're interested in isn't a way of just reducing the number of points in a graph (as we saw before, that ends up with a graph that looks the same!), but doing so in a manner that smooths away the noise.</p><p>We can use a downsampling technique called<a href="https://arxiv.org/pdf/1703.00983.pdf"> Automated Smoothing for Attention Prioritization (ASAP)</a>, which was developed by <a href="https://twitter.com/kexinrong?lang=en">Kexin Rong</a> and <a href="https://www.linkedin.com/in/pbailis">Peter Bailis</a>.</p><p><a href="https://docs.timescale.com/api/latest/hyperfunctions/downsample/asap/">ASAP works by analyzing the data for intervals of high autocorrelation.</a> Think of this as finding the size of the repeating shape of a graph, so maybe 24 hours for our taxi data, or even 168 hours (one week). Once ASAP has found the range with the highest autocorrelation, it will smooth out the data by computing a rolling average using that range as the window size.</p><p>For instance, if you have perfectly regular data, ASAP should mostly smooth everything away to the underlying flat trend, as in the following example: <br></p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/06/Graf-Downsa-8--1-.png" class="kg-image" alt="" loading="lazy" width="512" height="254"></figure><p>The green line here is the raw data. It is generated as a sine wave with an interval of 20 and an offset of 100 that repeats daily. 
The yellow line is the ASAP algorithm applied to the data, showing that the graph is entirely regular noise with no interesting underlying fluctuation.</p><p>Obviously, ASAP can work well on this type of synthetic data, but let's see how it does with our taxi data.</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/06/Graf-Downs-9--1-.png" class="kg-image" alt="" loading="lazy" width="512" height="254"></figure><p>Here it becomes very obvious that there was a significant dip from about 11/26 to 12/03, which happens to be Thanksgiving weekend, a US public holiday that falls at the end of November every year. We can see this even more dramatically by selecting only the ASAP output and letting Grafana auto-adjust the scale:</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/06/Grafana-Downsampling-10--1-.png" class="kg-image" alt="" loading="lazy" width="512" height="252"></figure><p>The data for this graph is the <a href="http://www.futuredata.io.s3-website-us-west-2.amazonaws.com/asap/">taxi trips CSV file</a>. The SQL query we're running in Grafana is this:</p><pre><code class="language-SQL">SELECT
  time AS "time",
  value AS "asap"
FROM unnest((
  SELECT asap_smooth(time, value, (($__to - $__from) / $__interval_ms)::integer)
  FROM taxidata
  WHERE $__timeFilter("time"))
)
ORDER BY 1;
</code></pre>
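<p>Conceptually, ASAP first estimates the period of the data (the lag with the strongest autocorrelation) and then applies a rolling average of that length. The following Python sketch is a heavily simplified illustration of that idea; the actual algorithm from Rong and Bailis does considerably more work to choose a window without smoothing away genuine deviations:</p>

```python
# Simplified sketch of the ASAP idea: find the lag with the strongest
# autocorrelation (the data's dominant period), then take a rolling mean
# of that length. Illustration only -- the real algorithm does more.
def autocorr(values, lag):
    """Autocorrelation of `values` at the given lag, normalized by variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    if var == 0:
        return 0.0
    cov = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return cov / var

def asap_sketch(values, max_window=None):
    """Return (smoothed_values, window) for a list of evenly spaced samples."""
    max_window = max_window or len(values) // 2
    # Window = candidate lag with the highest autocorrelation.
    window = max(range(2, max_window + 1), key=lambda lag: autocorr(values, lag))
    # A rolling mean of exactly one period cancels the periodic component.
    smoothed = [sum(values[i:i + window]) / window
                for i in range(len(values) - window + 1)]
    return smoothed, window
```

<p>Run on ten days of hourly data with a clean 24-hour cycle, this picks a 24-sample window and flattens the daily oscillation, leaving only the underlying trend, which matches the "entirely regular noise" behavior shown in the synthetic example above.</p>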
<p>As in example 1 above, the <code>asap_smooth</code> hyperfunction does most of the work here, taking the time and value columns, as well as a target resolution as arguments. We use the same trick from example 1 to approximate the panel width from Grafana's global variables.</p><h2 id="learn-more">Learn More</h2><p>Eager to try downsampling or learn more about other hyperfunctions? Check out our <a href="https://docs.timescale.com/api/latest/hyperfunctions/downsample/#downsample">downsample</a> and <a href="https://docs.timescale.com/timescaledb/latest/how-to-guides/hyperfunctions/#learn-hyperfunction-basics-and-install-timescale-toolkit">hyperfunctions</a> docs for more information on how hyperfunctions can help you efficiently query and analyze your data. </p><p>Looking for more Grafana guides? Here are our <a href="https://docs.timescale.com/timescaledb/latest/tutorials/grafana/">Grafana tutorials</a> and our <a href="https://timescale.ghost.io/blog/grafana-webinar-1-recap/">Grafana 101 Creating Awesome Visualizations</a> for more support on visualizations in Grafana. </p><p>If you need a database to store your time-series data and power your dashboards, try <a href="https://console.cloud.timescale.com/signup">Timescale</a>, our fast, easy-to-use, and reliable cloud-native data platform for <a href="https://www.tigerdata.com/blog/time-series-introduction" rel="noreferrer">time series</a> built on PostgreSQL. (You can sign up for a 30-day free trial, no credit card required.)</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Introducing Hyperfunctions: New SQL Functions to Simplify Working With Time-Series Data in PostgreSQL]]></title>
            <description><![CDATA[TimescaleDB hyperfunctions are pre-built functions for the most common and difficult queries that developers write today in TimescaleDB and PostgreSQL. Hyperfunctions help developers measure what matters in time-series data, which generates massive, ever-growing streams of information.]]></description>
            <link>https://www.tigerdata.com/blog/introducing-hyperfunctions-new-sql-functions-to-simplify-working-with-time-series-data-in-postgresql</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/introducing-hyperfunctions-new-sql-functions-to-simplify-working-with-time-series-data-in-postgresql</guid>
            <category><![CDATA[Announcements & Releases]]></category>
            <category><![CDATA[Engineering]]></category>
            <category><![CDATA[PostgreSQL]]></category>
            <dc:creator><![CDATA[Joshua Lockerman]]></dc:creator>
            <pubDate>Tue, 13 Jul 2021 13:02:11 GMT</pubDate>
            <media:content medium="image" url="https://timescale.ghost.io/blog/content/images/2021/07/ryan-stone-OlxJVn9fxz4-unsplash.jpg">
            </media:content>
            <content:encoded><![CDATA[<p>Today, we’re excited to launch <strong>TimescaleDB hyperfunctions</strong>, a series of SQL functions within TimescaleDB that make it easier to manipulate and analyze time-series data in PostgreSQL with fewer lines of code. You can use hyperfunctions to calculate percentile approximations of data, compute time-weighted averages, downsample and smooth data, and perform faster <code>COUNT DISTINCT</code> queries using approximations. Moreover, hyperfunctions are “easy” to use: you call a hyperfunction using the same SQL syntax you know and love.</p>
<p>At Timescale, our mission is to <a href="https://www.timescale.com/products" rel="noreferrer">enable every software developer to store, analyze, and build on top of their time-series data</a> so that they can measure what matters in their world: IoT devices, IT systems, marketing analytics, user behavior, financial metrics, and more.</p><p>We made the decision early in the design of TimescaleDB to build on top of PostgreSQL. We believed then, as we do now, that building on the <a href="https://db-engines.com/en/blog_post/85">world’s fastest-growing database</a> would have numerous benefits for our customers. Perhaps the biggest of these advantages is in developer productivity. Developers can use the tools and frameworks they know and love and bring all their skills and expertise with SQL with them.</p><p>SQL is a powerful language and we believe that by adding a specialized set of functions for time-series analysis, we can make it even better.</p><p>Today, there are nearly three million active TimescaleDB databases running mission-critical time-series workloads across industries. Time-series data comes at you fast, sometimes generating millions of data points per second. In order to measure everything that matters, you need to capture all of the data you possibly can. Because of the volume and rate of information, time-series data can be complex to query and analyze. </p><p>As we interviewed customers and learned how they analyze and manipulate time-series data, we noticed several common queries begin to take shape. Often, these queries were difficult to compose in standard SQL. TimescaleDB hyperfunctions are a series of SQL functions to address the most common, and often most difficult, queries developers write today. 
We made the decision to take the hard path ourselves so that we could give developers an easier path.</p><h3 id="hyperfunctions-included-in-this-initial-release">Hyperfunctions included in this initial release</h3><p>Today, we’re releasing several hyperfunctions, including:</p><ul>
<li><strong>Time-Weighted Average</strong> allows you to take the average over an irregularly spaced dataset that only includes changepoints.</li>
<li><strong>Percentile-Approximation</strong> brings percentile analysis to more workflows. When used with <a href="https://timescale.ghost.io/blog/blog/continuous-aggregates-faster-queries-with-automatically-maintained-materialized-views/">continuous aggregates</a>, you can compute percentiles over any time range of your dataset in near real-time and use them for baselining and normalizing incoming data. For maximum control, we provide implementations of two different approximation algorithms:
<ul>
<li><strong>Uddsketch</strong> gives formal guarantees to the accuracy of approximate percentiles, in exchange for always returning a range of possible values.</li>
<li><strong>T-Digest</strong> gives fuzzier guarantees which allow it to be more precise at the extremes of the distribution.</li>
</ul>
</li>
<li><strong>Hyperloglog</strong> enables faster approximate <code>COUNT DISTINCT</code>, making it easier to track how the cardinality of your data changes over time.</li>
<li><strong>Counter Aggregate</strong> enables working with counters in an ergonomic SQL-native manner.</li>
<li><strong>ASAP Smoothing</strong> smooths datasets to bring out the most important features when graphed.</li>
<li><strong>Largest Triangle Three Buckets Downsampling</strong> reduces the number of elements in a dataset while retaining important features when graphed.</li>
<li><strong>Stats-agg</strong> makes using rolling, cumulative and normalized statistics as easy as their standard counterparts.</li>
</ul>
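<p>To give a feel for how approximation trades memory for accuracy, here is a toy Python sketch of the hyperloglog idea: hash each item, use some bits of the hash to pick a register, and track the longest run of leading zeros each register has seen. This is an illustration with assumed parameters, not the <code>timescaledb_toolkit</code> implementation:</p>

```python
import hashlib
import math

# Toy hyperloglog: constant memory (2**p small registers) no matter how many
# items are added. Illustration only -- not the timescaledb_toolkit version.
class ToyHLL:
    def __init__(self, p=10):
        self.p = p
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash: low p bits pick a register, the rest estimate rarity.
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h & (self.m - 1)
        w = h >> self.p                             # (64 - p)-bit remainder
        rank = (64 - self.p) - w.bit_length() + 1   # leading zeros + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        est = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if est <= 2.5 * self.m and zeros:
            est = self.m * math.log(self.m / zeros)  # small-range correction
        return est
```

<p>With 1,024 registers the standard error is roughly 1.04 / sqrt(1024), about 3%, and adding the same item twice never changes the registers, which is what makes tracking distinct counts over time cheap.</p>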
<p>Note that hyperfunctions work on TimescaleDB <a href="https://docs.timescale.com/api/latest/hypertable/">hypertables</a>, as well as regular PostgreSQL tables.</p><h3 id="new-sql-functions-not-new-sql-syntax">New SQL functions, not new SQL syntax</h3><p>We made the decision to create <strong>new SQL functions</strong> for each of the time-series analysis and manipulation capabilities above. This stands in contrast to other efforts that aim to improve the developer experience by introducing new SQL <em>syntax</em>. </p><p>While introducing new syntax with new keywords and new constructs may have been easier from an implementation perspective, we made the deliberate decision not to do so since we believe it actually leads to a worse experience for the end user. </p><p>New SQL syntax means that existing drivers, libraries, and tools may no longer work. This can leave developers with more problems than solutions, as their favorite tools, libraries, or drivers may not support the new syntax, or may require time-consuming modifications to do so. </p><p>On the other hand, new SQL functions mean that your query will run in every visualization tool, database admin tool, or data analysis tool. We have the freedom to create custom functions, aggregates, and procedures that help developers better understand and work with their data, <strong>and </strong>ensure all their drivers and interfaces still work as expected.</p><h3 id="hyperfunctions-are-written-in-rust">Hyperfunctions are written in Rust</h3><p>Rust was our language of choice for developing the new hyperfunctions. We chose it for its superior productivity, community, and the <a href="https://github.com/zombodb/pgx">pgx software development kit</a>. We felt Rust was a friendlier language for a project like ours and would encourage more community contributions. </p><p>The inherent safety of Rust means we could focus more time on feature development rather than worrying about how the code is written. 
The extensive Rust community (💗 <a href="http://crates.io">crates.io</a>), along with excellent package-management tools, means we can use off-the-shelf solutions for common problems, leaving us more time to focus on the uncommon ones. </p><p>On the topic of community, we found the Rust community to be one of the friendliest on the internet, and its commitment to open source, open communication, and good documentation makes it an utter joy to work with. Libraries such as <a href="https://serde.rs/">Serde</a> and <a href="https://github.com/BurntSushi/quickcheck">quickcheck</a> make common tasks a breeze and let us focus on the code that’s novel to our project, instead of writing boilerplate that's already been written by thousands of others. </p><p>We’d like to shout out ZomboDB’s <a href="https://github.com/zombodb/pgx">pgx</a>, an SDK for building <a href="https://www.tigerdata.com/blog/top-8-postgresql-extensions" rel="noreferrer">Postgres extensions</a> using Rust. Pgx provides tools to generate extension scripts from Rust files and bind Rust functions to Postgres functions, as well as tools to set up, run, and test PostgreSQL instances. (For us, it’s been an amazing tool and experience with incredible benefits – we estimate that pgx has reduced our workload by at least one-third!)</p><h3 id="next-steps">Next steps</h3><p>In the rest of this post, we detail why we chose to build new SQL functions (not new SQL syntax), and explore each hyperfunction and its example usage.</p><p>But <strong>if you’d like to get started with hyperfunctions right away, the easiest way to do so is with a fully managed TimescaleDB service</strong>. <a href="https://console.cloud.timescale.com/signup">Try it for free</a> (no credit card required) for 30 days. 
Hyperfunctions are pre-loaded on each new database service on Timescale, so after you’ve created a new service, you’re all set to use them!</p><p><strong>If you prefer to manage your own database instances, you can </strong><a href="https://github.com/timescale/timescaledb-toolkit"><strong>download and install the timescaledb_toolkit extension</strong></a> on GitHub for free, after which you’ll be able to use all the hyperfunctions listed above. </p><p>Finally, we love building in public. You can view our <a href="https://github.com/timescale/timescaledb-toolkit">upcoming roadmap on GitHub</a> for a list of proposed features, as well as features we’re currently implementing and those that are available to use today. </p><p>We also welcome feedback from the community (it helps us prioritize the features users really want). To contribute feedback, comment on an <a href="https://github.com/timescale/timescaledb-toolkit/issues">open issue</a> or in a <a href="https://github.com/timescale/timescaledb-toolkit/discussions">discussion thread</a> on GitHub.</p><p>To learn more about hyperfunctions, please continue reading.</p><h2 id="building-new-sql-functions-instead-of-reinventing-syntax">Building new SQL functions instead of reinventing syntax</h2><p>SQL is the <a href="https://insights.stackoverflow.com/survey/2020#most-popular-technologies">third most popular programming language in the world</a>. It’s the language known and loved by many software developers, data scientists, and business analysts the world over, and it's a big reason we chose to build TimescaleDB on top of PostgreSQL in the first place.</p><p>Similarly, we chose to make our APIs user-friendly without breaking full SQL compatibility. This means we can create custom functions, aggregates, and procedures but no new syntax, so all the drivers and interfaces still work. 
You get the peace of mind that your query will run in every visualization tool, database admin tool, or data analysis tool that speaks SQL.</p><p>SQL is powerful and it’s even <a href="http://blog.coelho.net/database/2013/08/17/turing-sql-1.html">Turing complete</a>, so you can technically do anything with it. But that doesn’t mean you’d want to 😉. Our hyperfunctions are made to make complex analysis and time-series manipulation in SQL simpler, without undermining the guarantees of full SQL compatibility. We’ve spent a large amount of our time on design: prototyping and writing out different names and calling conventions for clarity and ease of use. </p><p>Our guiding philosophy is to make simple things easy and complex things possible. We enable things that <em>feel</em> like they should be straightforward, like using a single function call to calculate a time-weighted average of a single item over a time period. We also enable operations that would otherwise be prohibitively expensive (in terms of complexity to write) or would take too long to be useful, like calculating a rolling time-weighted average of each item normalized to the monthly average of the whole group.</p><p>For example, we’ve implemented a default for percentile approximation called <code>percentile_agg</code> that should work for most users, while also exposing the lower-level <code>UDDsketch</code> and <code>tdigest</code> implementations for users who want more control and to get into the weeds.</p>
<p>Another advantage of using SQL functions rather than new syntax is that we bring your code closer to your data, rather than forcing you to take your data to your code. Simply put, you can now perform more sophisticated analysis and manipulation operations on your data right inside your database, rather than creating data pipelines to funnel data into Python or other analysis libraries to conduct analysis there. </p><p>We want to make complex analysis simpler and easier in the database not just because we want to build a good product, but also because it’s far, far more efficient to do your analysis as close to the data as possible, and then get aggregated or other simpler results passed back to the user. </p><p>This is because the network transmission step is often the slowest and most expensive part of many calculations, and because the serialization and deserialization overhead can be very large with large datasets. So by making these functions and all sorts of analysis simpler to perform in the database, nearer to the data, developers save time and money.</p><p>Moreover, while you could perform some of the complex analysis enabled by hyperfunctions in other languages inside the database (e.g., programs in Python or R), hyperfunctions now enable you to perform such sophisticated time-series analysis and manipulation in SQL right in your query statements, making them more accessible.</p><h2 id="hyperfunctions-released-today">Hyperfunctions released today</h2><p>Hyperfunctions refactor some of the most gnarly SQL queries for time-series data into concise, elegant functions that feel natural to any developer who knows SQL. 
Let’s walk through the hyperfunctions we’re releasing today and the ones that will be available soon.</p><p>Back in January, <a href="https://timescale.ghost.io/blog/blog/time-series-analytics-for-postgresql-introducing-the-timescale-analytics-project/">when we launched our initial hyperfunctions release</a>, we asked for feedback and input from the community. We want this to be a community-driven project, so for our 1.0 release, we’ve prioritized several features requested by community members. We’ll have a brief overview here, with a technical deep dive into each family of functions in a series of separate blog posts in the coming weeks.</p><p><strong>Time-weighted averages</strong></p><p>Time-series averages can be complicated to calculate; generally, you need to determine how long each value has been recorded in order to know how much to weigh them. While doing this in native SQL is <em>possible</em>, it is extremely error-prone and unwieldy. More damningly, the SQL needed would not work in every context. In particular, it would not work in TimescaleDB’s automatically refreshing materialized views, <a href="https://timescale.ghost.io/blog/blog/continuous-aggregates-faster-queries-with-automatically-maintained-materialized-views/">continuous aggregates</a>, so users who wanted to calculate time-weighted averages over multiple time intervals would be forced to rescan the entire dataset for each average so calculated. Our time-weighted average hyperfunction removes this complexity and can be used in continuous aggregates to make multi-interval time-weighted averages as cheap as summing a few sub-averages.</p><p>Here’s an example of using time-weighted averages for an IoT use case, specifically to find the average temperature in a set of freezers over time. 
(Notice how it takes sixteen lines of complex SQL to find the time-weighted average, compared to just five lines when using the TimescaleDB hyperfunction):</p>
<p><strong>Time-weighted average using TimescaleDB hyperfunction</strong></p><pre><code class="language-SQL">SELECT freezer_id, 
	avg(temperature), 
	average(time_weight('Linear', ts, temperature)) as time_weighted_average 
FROM freezer_temps
GROUP BY freezer_id;
</code></pre>
<pre><code class="language-output"> freezer_id |  avg  | time_weighted_average 
------------+-------+-----------------------
          1 | 10.35 |     6.802777777777778
</code></pre>
<p><strong>Time-weighted average using regular SQL</strong></p><pre><code class="language-SQL">WITH setup AS (
	SELECT lag(temperature) OVER (PARTITION BY freezer_id ORDER BY ts) as prev_temp, 
		extract('epoch' FROM ts) as ts_e, 
		extract('epoch' FROM lag(ts) OVER (PARTITION BY freezer_id ORDER BY ts)) as prev_ts_e, 
		* 
	FROM  freezer_temps), 
nextstep AS (
	SELECT CASE WHEN prev_temp is NULL THEN NULL 
		ELSE (prev_temp + temperature) / 2 * (ts_e - prev_ts_e) END as weighted_sum, 
		* 
	FROM setup)
SELECT freezer_id, 
	avg(temperature),
	sum(weighted_sum) / (max(ts_e) - min(ts_e)) as time_weighted_average 
FROM nextstep
GROUP BY freezer_id;
</code></pre>
<pre><code class="language-output"> freezer_id |  avg  | time_weighted_average 
------------+-------+-----------------------
          1 | 10.35 |     6.802777777777778
</code></pre>
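<p>For intuition, here is the calculation the <code>time_weight('Linear', ...)</code> aggregate performs, sketched in Python (a toy illustration with made-up readings, not the toolkit's implementation): each pair of adjacent readings contributes a trapezoid, and the total area is divided by the elapsed time.</p>

```python
def time_weighted_average(samples):
    """samples: list of (timestamp_seconds, value) pairs, sorted by time.

    'Linear' weighting: each adjacent pair contributes the trapezoid
    (v0 + v1) / 2 * (t1 - t0); the sum is divided by total elapsed time.
    """
    weighted_sum = 0.0
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        weighted_sum += (v0 + v1) / 2 * (t1 - t0)
    return weighted_sum / (samples[-1][0] - samples[0][0])

# Hypothetical freezer readings: held near 4-6 degrees, then a drop
readings = [(0, 4.0), (10, 6.0), (40, 3.0)]
print(time_weighted_average(readings))  # 4.625
```

<p>Because the result is built from simple per-segment sums, partial results over sub-intervals can be combined later, which is exactly what makes the aggregate cheap to roll up in continuous aggregates.</p>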
<p><strong>Percentile approximation (UDDsketch &amp; TDigest)</strong></p><p>Aggregate statistics are useful when you know the underlying distribution of your data, but for other cases, they <a href="https://en.wikipedia.org/wiki/Anscombe%27s_quartet">can be misleading</a>. For cases where they don’t work, and for more exploratory analyses looking at the ground truth, <a href="https://en.wikipedia.org/wiki/Percentile">percentiles</a> are useful. </p><p>As useful as they are, percentiles come with one major downside: calculating them exactly requires storing the entire dataset in memory. This means that such analysis is only feasible for relatively small datasets, and even then can take longer than ideal to calculate. </p><p>The approximate-percentile hyperfunctions we’ve implemented suffer from neither of these problems: they take constant storage, and, when combined with automatically refreshing materialized views, they can produce results nearly instantaneously. This performance improvement opens up opportunities to use percentile analysis for use cases and datasets where it was previously infeasible.</p><p>Here’s an example of using percentile approximation for a DevOps use case, where we alert on response times that are over the 95th percentile:</p><pre><code class="language-SQL">WITH "95th percentile" as (
    SELECT approx_percentile(0.95, percentile_agg(response_time)) as threshold
    FROM response_times
)
SELECT count(*)
FROM response_times, "95th percentile"
WHERE response_time &gt; "95th percentile".threshold;
</code></pre>
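<p>To see where the constant storage comes from, here is a toy quantile sketch in Python in the spirit of UDDsketch's logarithmically sized buckets (a simplified illustration, not the actual algorithm): values collapse into exponentially spaced buckets chosen so every estimate is within a relative error <code>alpha</code>, so memory grows with the error bound rather than with the dataset.</p>

```python
import math
from collections import Counter

def make_sketch(alpha=0.01):
    # Bucket k covers (gamma^(k-1), gamma^k]; with this gamma, the bucket's
    # geometric midpoint is within relative error alpha of any value in it.
    gamma = (1 + alpha) / (1 - alpha)
    return {"gamma": gamma, "buckets": Counter(), "n": 0}

def insert(sketch, x):
    # Assumes x > 0 for simplicity.
    k = math.ceil(math.log(x, sketch["gamma"]))
    sketch["buckets"][k] += 1
    sketch["n"] += 1

def approx_quantile(sketch, q):
    rank = q * (sketch["n"] - 1)
    seen = 0
    for k in sorted(sketch["buckets"]):
        seen += sketch["buckets"][k]
        if seen > rank:
            # Midpoint estimate for bucket k.
            return 2 * sketch["gamma"] ** k / (1 + sketch["gamma"])
    raise ValueError("empty sketch")
```

<p>Ten thousand inserts here land in a few hundred buckets, and the 95th-percentile estimate stays within about 1% of the true value; the real UDDsketch adds stronger guarantees and bucket management on top of this idea.</p>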
<p><a href="https://docs.timescale.com/api/latest/hyperfunctions/">See our hyperfunctions docs</a> to get started today. In the coming weeks, we will be releasing a series of blog posts which detail each of the hyperfunctions released today, in the context of using them to solve a real-world problem. </p><h3 id="hyperfunctions-in-public-preview">Hyperfunctions in public preview</h3><p>In addition to the hyperfunctions released today, we’re making several hyperfunctions available for public preview. These include hyperfunctions for downsampling, smoothing, approximate count-distinct, working with counters, and working with more advanced forms of averaging. All of these are available for trial today through our experimental schema, and, with your feedback, will be made available for production usage soon. </p><p>Here’s a tour through each hyperfunction and why we created them:</p><p><strong>Graph Downsampling &amp; Smoothing</strong></p><p>We have two algorithms implemented to help downsample your data for better, faster graphing:</p><p>The first graphing algorithm for downsampling is <a href="https://github.com/timescale/timescaledb-toolkit/blob/main/docs/lttb.md"><strong>Largest triangle three bucket</strong></a><strong> (LTTB)</strong>. LTTB limits the number of points you need to send to your graphing engine while maintaining visual acuity. This means that you don’t try to plot 200,000 points on a graph that’s only 2000 pixels wide, which is inefficient in terms of network and rendering costs. 
</p><p>Given an original dataset which looks like the graph below:</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/01/lttb_raw.png" class="kg-image" alt="" loading="lazy" width="716" height="371" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2022/01/lttb_raw.png 600w, https://timescale.ghost.io/blog/content/images/2022/01/lttb_raw.png 716w"></figure><p>We can downsample it to just 34 points with the following query using the LTTB hyperfunction:</p><pre><code>SELECT toolkit_experimental.lttb(time, val, 34) FROM data
</code></pre>
<p>The above query yields the following graph, which retains the periodic pattern of the original graph, with just 34 points of data.</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/01/lttb_34.png" class="kg-image" alt="" loading="lazy" width="600" height="371" srcset="https://timescale.ghost.io/blog/content/images/2022/01/lttb_34.png 600w"></figure><p>The second graphing algorithm for downsampling is <a href="https://github.com/timescale/timescaledb-toolkit/blob/main/docs/asap.md"><strong>Automatic smoothing for attention prioritization</strong></a><strong> (ASAP smoothing). </strong>ASAP Smoothing uses optimal moving averages to smooth a graph to remove noise and make sure that trends are obvious to the user, while not over-smoothing and removing all the signals as well. This leads to vastly improved readability. </p><p>For example, the graph below displays 250 years of monthly temperature readings from England (raw data can be found <a href="http://futuredata.stanford.edu/asap/Temp.csv">here</a>):</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/01/ASAP_raw.png" class="kg-image" alt="" loading="lazy" width="809" height="341" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2022/01/ASAP_raw.png 600w, https://timescale.ghost.io/blog/content/images/2022/01/ASAP_raw.png 809w" sizes="(min-width: 720px) 720px"></figure><p>We can run the following query using the ASAP smoothing hyperfunction:</p><pre><code>SELECT toolkit_experimental.asap_smooth(month, value, 800) FROM temperatures
</code></pre>
<p>The result is the graph below, which is much less noisy than the original and one where users can more easily spot trends.</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/01/ASAP_smoothed.png" class="kg-image" alt="Smoothed data" loading="lazy" width="809" height="341" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2022/01/ASAP_smoothed.png 600w, https://timescale.ghost.io/blog/content/images/2022/01/ASAP_smoothed.png 809w" sizes="(min-width: 720px) 720px"></figure><p><a href="https://github.com/timescale/timescaledb-toolkit/blob/main/docs/counter_agg.md"><strong>Counter Aggregates</strong></a></p><p>Metrics generally come in a few different varieties, which many systems have come to call <strong>gauges</strong> and <strong>counters</strong>. A gauge is a typical metric that can vary up or down, something like temperature or percent utilization. A counter is meant to be monotonically increasing: it keeps track of, say, the total number of visitors to a website. The main difference in processing counters and gauges is that a decrease in the value of a counter (compared to its previous value in the <a href="https://www.tigerdata.com/blog/time-series-introduction" rel="noreferrer">time series</a>) is interpreted as a <strong>reset</strong>. TimescaleDB’s <a href="https://github.com/timescale/timescaledb-toolkit/blob/main/docs/counter_agg.md">counter aggregate hyperfunctions</a> enable a simple and optimized analysis of these counters. </p><p>For example, even though a dataset is stored like:</p><pre><code>data
------
  10
  20
   0
   5
  15
</code></pre>
<p>We can calculate the delta (along with various other statistics) over this monotonically-increasing counter with the following query using the counter aggregate hyperfunction:</p><pre><code>SELECT toolkit_experimental.delta(
    toolkit_experimental.counter_agg(ts, val))
FROM foo;
</code></pre>
<pre><code> delta
------
  25
</code></pre>
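<p>The reset-handling logic is simple enough to sketch in a few lines of Python (illustrative only, with made-up values, not the toolkit's implementation): whenever the raw value drops, carry the pre-reset value forward as an offset, then take last minus first over the adjusted series.</p>

```python
def counter_delta(values):
    """Delta over a reset-prone counter, in arrival order.

    A drop in the raw value is treated as a reset, so the pre-reset
    value is carried forward as an offset into all later readings.
    """
    offset = 0
    prev = first = adjusted = None
    for v in values:
        if prev is not None and v < prev:
            offset += prev          # reset detected: fold in the prior value
        adjusted = v + offset
        if first is None:
            first = adjusted
        prev = v
    return adjusted - first

# Hypothetical readings with one reset (10 -> 2):
print(counter_delta([5, 10, 2, 7]))  # 12
```

<p>The adjusted series is what <code>counter_agg</code> reasons over; <code>delta</code> is just one of the statistics it can then report.</p>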
<p><a href="https://github.com/timescale/timescaledb-toolkit/blob/main/docs/hyperloglog.md"><strong>Hyperloglog for Approximate Count Distinct</strong></a></p><p>We’ve implemented a version of the hyperloglog algorithm to run approximate count-distinct queries in a more efficient and parallelizable fashion. Existing TimescaleDB users will be happy to hear that these work in continuous aggregates, our automatically refreshing materialized views. </p><p><a href="https://github.com/timescale/timescaledb-toolkit/blob/main/docs/stats_agg.md"><strong>Statistical Aggregates</strong></a></p><p>Calculating rolling averages and other statistical aggregates over tumbling windows is very difficult in standard SQL: to do it accurately, you need to separate out the component aggregates (e.g., for an average, the count and the sum) and combine them yourself. Our statistical aggregates do this for you through a simple <code>rollup</code> call.</p>
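<p>The core trick behind hyperloglog is small enough to show in a toy Python version (an illustration of the idea, not the toolkit's implementation): hash each element, let the first few bits pick a register, record the longest run of leading zero bits seen per register, and estimate cardinality from a harmonic mean. Because registers combine by taking a simple maximum, two sketches merge into a sketch of the union, which is what lets the aggregate roll up in continuous aggregates.</p>

```python
import hashlib
import math

class HLL:
    def __init__(self, p=10):
        self.p, self.m = p, 1 << p
        self.reg = [0] * (1 << p)          # one small register per slot

    def add(self, item):
        h = int.from_bytes(hashlib.sha256(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                       # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)
        rank = (64 - self.p) - rest.bit_length() + 1   # leading zeros + 1
        self.reg[idx] = max(self.reg[idx], rank)

    def merge(self, other):
        # Register-wise max yields a sketch of the union of both inputs.
        self.reg = [max(a, b) for a, b in zip(self.reg, other.reg)]

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        est = alpha * self.m * self.m / sum(2.0 ** -r for r in self.reg)
        zeros = self.reg.count(0)
        if est <= 2.5 * self.m and zeros:              # small-range correction
            est = self.m * math.log(self.m / zeros)
        return round(est)
```

<p>With 1,024 registers the standard error is roughly 3%, no matter how many distinct elements flow through, which is what makes approximate count distinct feasible over data that would never fit in memory.</p>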
<p>To follow the progress and contribute to improving these (and future) hyperfunctions, you can view our <a href="https://github.com/timescale/timescaledb-toolkit">roadmap on GitHub.</a> Our development process is heavily influenced by community feedback, so your comments on <a href="https://github.com/timescale/timescaledb-toolkit/issues">issues</a> and <a href="https://github.com/timescale/timescaledb-toolkit/discussions">discussion threads</a> will help determine which features get prioritized, and when they’re stabilized for release.</p><h2 id="next-steps-1">Next Steps</h2><p><a href="https://console.cloud.timescale.com/signup">Try hyperfunctions today</a> with a fully-managed Timescale service (no credit card required, free for 30 days). Hyperfunctions are pre-loaded on each new database service on Timescale, so after you’ve created a new service, you’re all set to use them!</p><p>If you prefer to manage your own database instances, you can <a href="https://github.com/timescale/timescaledb-toolkit">download and install the timescaledb_toolkit extension</a> on GitHub for free, after which you’ll be able to use all the hyperfunctions listed above. </p><p>We love building in public. You can view our <a href="https://github.com/timescale/timescaledb-toolkit">upcoming roadmap on GitHub</a> for a list of proposed features, as well as features we’re currently implementing and those that are available to use today. We also welcome feedback from the community (it helps us prioritize the features users really want). To contribute feedback, comment on an <a href="https://github.com/timescale/timescaledb-toolkit/issues">open issue</a> or in a <a href="https://github.com/timescale/timescaledb-toolkit/discussions">discussion thread</a> in GitHub.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Time-Series Analytics for PostgreSQL: Introducing the Timescale Analytics Project]]></title>
            <description><![CDATA[We're excited to announce Timescale Analytics, a new project focused on combining all of the capabilities SQL needs to perform time-series analytics into one Postgres extension. Learn about our plans, why we're sharing it now, and ways to contribute your feedback and ideas.  ]]></description>
            <link>https://www.tigerdata.com/blog/time-series-analytics-for-postgresql-introducing-the-timescale-analytics-project</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/time-series-analytics-for-postgresql-introducing-the-timescale-analytics-project</guid>
            <category><![CDATA[Announcements & Releases]]></category>
            <category><![CDATA[PostgreSQL]]></category>
            <category><![CDATA[Analytics]]></category>
            <dc:creator><![CDATA[David Kohn]]></dc:creator>
            <pubDate>Thu, 21 Jan 2021 03:16:15 GMT</pubDate>
            <media:content medium="image" href="https://timescale.ghost.io/blog/content/images/2021/01/alexander-andrews-4JdvOwrVzfY-unsplash.jpg">
            </media:content>
            <content:encoded><![CDATA[<p>We're excited to announce Timescale Analytics, a new project focused on combining all of the capabilities SQL needs to perform time-series analytics into one <a href="https://www.tigerdata.com/blog/top-8-postgresql-extensions" rel="noreferrer">Postgres extension</a>. Learn about our plans, why we're announcing them now, and ways to contribute your feedback and ideas.  </p><p>At Timescale, our mission is to enable every software developer to store, analyze, and build on top of their time-series data so that they can measure what matters in their world: IoT devices, IT systems, marketing analytics, user behavior, financial metrics, and more. To this end, we’ve built a <a href="https://www.timescale.com/products" rel="noreferrer">petabyte-scale, time-series database</a>.</p><p>Today, we’re excited to announce the Timescale Analytics project, an initiative to make Postgres the best way to execute critical time-series queries quickly, analyze time-series data, and extract meaningful information. 
SQL is a powerful language (we're obviously big fans ourselves), and we believe that by adding a specialized set of functions for time-series analysis, we can make it even better.</p><p>The Timescale Analytics project aims to identify, build, and combine all of the functionality SQL needs to perform time-series analysis into a single extension.</p><p><strong>In other words, the Timescale Analytics extension will be a "one-stop shop" for time-series analytics in PostgreSQL, and we're looking for feedback from the community: what analytical functionality would you find most useful?</strong></p><p>We believe that it is important to develop our code in the open and are requiring radical transparency of ourselves: everything about this project, our priorities, intended features, trade-off discussions, and (tentative) roadmap, is available in <a href="https://github.com/timescale/timescale-analytics">our GitHub repository</a>.</p><p>It is our hope that working like this will make it easier for the community to interact with the project and allow us to respond quickly to community needs. </p><p>To this end, we’re announcing the project as early as possible, so we can get community feedback before we become too invested in a single direction. Over the next few weeks, we’ll be gathering thoughts on initial priorities and opening some sample PRs. Soon after that, we plan to create an initial version of the Timescale Analytics extension for you to experiment with.</p><p>Here are some examples of analytics functions we are considering adding: monotonic counters, tools for graphing, statistical sketching, and pipelining.</p><h2 id="monotonic-counters">Monotonic Counters</h2><p>A monotonically increasing counter is a type of metric often used in time-series analysis. Logically, such a counter should only ever increase, but the value is often read from an ephemeral source that can get reset back to zero at any time (due to crashes or other similar phenomena). 
To analyze data from such a source, you need to account for these resets: whenever the counter appears to decrease, you assume a reset occurred, and thus, you add the value after the reset to the value immediately prior to the reset.</p><p>Assume we have a counter that measures visitors to a website. If we were running a new marketing campaign focused on driving people to a new page on our site, we could use the change in the counter to measure the success of the campaign. While this kind of analysis can be performed in stock SQL, it quickly becomes unwieldy.</p><p>Using native SQL, such a query would look like:</p><pre><code class="language-SQL">SELECT sum(counter_reset_val) + last(counter, ts) - first(counter, ts) as counter_delta 
FROM (
    SELECT *,
        CASE WHEN counter - lag(counter) OVER (ORDER BY ts ASC) &lt; 0
            THEN lag(counter) OVER (ORDER BY ts ASC)
            ELSE 0
        END as counter_reset_val
    FROM user_counter
) f;
</code></pre><p></p><p>This is a relatively simple example, and more sophisticated queries are even more complicated.</p><p><a href="https://github.com/timescale/timescale-analytics/issues/4">One of our first proposals for capabilities to include in Timescale Analytics</a> would make this much simpler, allowing us to  write something like:</p><pre><code class="language-SQL">SELECT delta(counter_agg(counter, ts)) as counter_delta FROM user_counter;</code></pre><p></p><p>There are many examples like this: scenarios where it’s <em>possible</em> to solve the problem in stock SQL, but the resulting code is not exactly easy to write, nor pretty to read.</p><p>We believe we can solve that problem and make writing analytical SQL as easy as any other modern language.</p><h2 id="tools-for-graphing">Tools for Graphing</h2><p>When graphing time-series data, you often need to perform operations such as <a href="https://en.wikipedia.org/wiki/Change_detection">change-point analysis</a>, <a href="https://medium.com/@hayley.morrison/sampling-time-series-data-sets-fc16caefff1b">downsampling</a>, or <a href="https://dawn.cs.stanford.edu/2017/08/07/asap/">smoothing</a>. Right now, these are usually generated with a front-end service, such as <a href="https://grafana.com/">Grafana</a>, but this means the graphs you use are heavily tied to the renderer you’re using. </p><p>Moving these functions to the database offers a number of advantages:</p><ul><li>Users can choose their graphing front-end based on how well it does graphing, not on how well it does data analytics</li><li>Queries can remain consistent across all front-end tools and consumers of your data</li><li>Doing all the work in the database involves shipping a much smaller number of data points over the network</li></ul><p>Key to getting this project working is building the output formats that will work for a variety of front-ends and identifying the necessary APIs. 
If you have thoughts on the matter, please hop on our <a href="https://github.com/timescale/timescale-analytics/discussions/30">discussion threads</a>.</p><p>A fully worked-out pure-SQL example of a downsampling algorithm is too long to include inline here (for example, a worked-through version of largest-triangle-three-buckets can be found in <a href="https://medium.com/@hayley.morrison/sampling-time-series-data-sets-fc16caefff1b">this blog post</a>) – but with aggregate support could be as simple as:</p><pre><code class="language-SQL">SELECT lttb(time, value, num_buckets=&gt;500) FROM data;</code></pre><p></p><p>This could return a <code>timeseries</code> data type, which could be ingested directly into a tool like Grafana or another language, or it could be unnested to get back to the time-value pairs to send into an external tool. </p><p>These tools can then use the simplified query instead of doing their own custom analysis on your data.</p><figure class="kg-card kg-image-card"><img src="https://timescale.ghost.io/blog/content/images/2022/01/Screen-Shot-2021-01-20-at-3.19.57-PM.png" class="kg-image" alt="Grafana dashboard UI, showing initial and downsampled data" loading="lazy" width="1600" height="944" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2022/01/Screen-Shot-2021-01-20-at-3.19.57-PM.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2022/01/Screen-Shot-2021-01-20-at-3.19.57-PM.png 1000w, https://timescale.ghost.io/blog/content/images/2022/01/Screen-Shot-2021-01-20-at-3.19.57-PM.png 1600w" sizes="(min-width: 720px) 720px"></figure><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2022/01/Screen-Shot-2021-01-20-at-3.18.43-PM.png" class="kg-image" alt="Grafana dashboard UI, showing initial and downsampled data" loading="lazy" width="1600" height="944" 
srcset="https://timescale.ghost.io/blog/content/images/size/w600/2022/01/Screen-Shot-2021-01-20-at-3.18.43-PM.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2022/01/Screen-Shot-2021-01-20-at-3.18.43-PM.png 1000w, https://timescale.ghost.io/blog/content/images/2022/01/Screen-Shot-2021-01-20-at-3.18.43-PM.png 1600w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Example downsampling data from </span><a href="http://ecg.mit.edu/time-series/"><span style="white-space: pre-wrap;">this dataset</span></a><span style="white-space: pre-wrap;">. It keeps the large-scale features of the data, with an order of magnitude fewer data points</span></figcaption></figure><h2 id="statistical-sketching">Statistical Sketching</h2><p>Sketching algorithms, such as <a href="https://github.com/timescale/timescale-analytics/issues/1">t-digest</a>, <a href="https://github.com/timescale/timescale-analytics/issues/3">hyperloglog</a>, and <a href="https://github.com/timescale/timescale-analytics/issues/6">count-min</a>, allow us to get a quick, approximate answer for certain queries when the statistical bounds provided are acceptable.</p><p>This is even more exciting in the TimescaleDB ecosystem since it appears most of these sketches will fit nicely into <a href="https://docs.timescale.com/latest/using-timescaledb/continuous-aggregates">continuous aggregates</a>, allowing incredibly low query latency. </p><p>For instance, a continuous aggregate displaying the daily unique visitors to a website could be defined like:</p><pre><code class="language-SQL">CREATE MATERIALIZED VIEW unique_visitors
WITH (timescaledb.continuous) AS
    SELECT 
    time_bucket('1 day', time) as day, 
    hll(visitor_id) as visitors
    FROM connections
    GROUP BY time_bucket('1 day', time);
</code></pre><p></p><p>Such a view could be queried to get the visitors over a range of days, like so:</p><pre><code class="language-SQL">SELECT day, approx_distinct(visitors)
FROM unique_visitors
WHERE day &gt;= '2020-01-01' AND day &lt;= '2020-01-15'
</code></pre><p></p><p>Additionally, it would allow for re-aggregation to determine the number of unique visitors over a coarser time range, such as the number of monthly visitors: </p><pre><code class="language-SQL">SELECT time_bucket('30 days', day), approx_distinct(hll(visitors))
FROM unique_visitors
GROUP BY time_bucket('30 days', day)
</code></pre><p></p><h2 id="pipelining">Pipelining</h2><p>SQL queries can get long, especially when there are multiple layers of aggregation and function calls.</p><p>For instance, to write a pairwise delta at minute-granularity in TimescaleDB, we’d use something like:</p><pre><code class="language-SQL">SELECT minutes, sampled - lag(sampled) OVER (ORDER BY minutes) as delta
FROM (
    SELECT
        time_bucket_gapfill('1 minute', time) minutes,
        interpolate(first(value, time)) sampled
    FROM data
    GROUP BY time_bucket_gapfill('1 minute', time)
) interpolated;
</code></pre><p></p><p>To mitigate this, the Timescale Analytics proposal includes a unified <a href="https://github.com/timescale/timescale-analytics/discussions/10">pipeline API</a> capability that would allow us to use the much more straightforward (and elegant) query below:</p><pre><code class="language-SQL">SELECT timeseries(time, value) |&gt; sample('1 minute') |&gt; interpolate('linear') |&gt; delta() FROM data;</code></pre><p></p><p>Besides the simpler syntax, this API could also enable some powerful optimizations, such as incremental pipelines, single-pass processing, and vectorization. </p><p>This is still very much in the design phase, and we’re currently having discussions about what such an API should <a href="https://github.com/timescale/timescale-analytics/discussions/10">look like</a>, what pipeline elements <a href="https://github.com/timescale/timescale-analytics/discussions/26">are appropriate</a>, and what the textual format <a href="https://github.com/timescale/timescale-analytics/discussions/10#discussioncomment-282898">should be</a>.</p><h2 id="how-we%E2%80%99re-building-timescale-analytics">How we’re building Timescale Analytics</h2><p>We’re building Timescale Analytics as a <a href="https://www.tigerdata.com/blog/top-8-postgresql-extensions" rel="noreferrer">PostgreSQL extension</a>. PostgreSQL's extension framework is quite powerful and allows for different levels of integration with database internals. </p><p>Timescale Analytics will be separate from the core  TimescaleDB extension. This is because TimescaleDB core interfaces quite deeply into PostgreSQL’s internals— including the planner, executor, and DDL interfaces—due to the demands of time-series data storage. 
This necessitates a certain conservatism in its development process in order to ensure that updating TimescaleDB versions cannot damage existing databases, and that features interact appropriately with PostgreSQL’s core functionality.</p><p>By separating the new analytics functionality into a dedicated Timescale Analytics extension, we can vastly reduce the contact area for these new functions, enabling us to move faster without increased risk. We will be focusing on improvements that take advantage of the PostgreSQL extension hooks for creating functions, aggregates, operators, and other database objects, rather than those that require interfacing with the lower-level planning and execution infrastructure. Creating a separate extension also allows us to experiment with our build process and technologies, for instance, writing the extension <a href="https://github.com/zombodb/pgx">in Rust</a>.</p><p>More importantly, we hope using a separate extension will lower barriers for community contributions. We know that the complexity of our integrations with PostgreSQL can make it difficult to contribute to TimescaleDB proper. We believe this new project will allow for much more self-contained contributions by avoiding projects requiring deep integration with the PostgreSQL planner or executor.</p><p>So, if you’ve been wanting to contribute back but didn’t know how, or are a Rustacean looking to get involved in databases, please join us!</p><h2 id="get-involved">Get Involved</h2><p>Before the code is written is the perfect time to have a say in where the project will go. To this end, we want—and need—your feedback: what are the frustrating parts of analyzing time-series data? What takes far more code than you feel it should? 
What runs slowly or only runs quickly after seemingly arcane rewrites?</p><p>We want to solve community-wide problems and incorporate as much feedback as possible, in addition to relying on our intuition, observation, and experiences.</p><p><strong>Want to help? </strong>You can submit suggestions and help shape the direction in 3 primary ways:</p><ul><li><a href="https://github.com/timescale/timescale-analytics/discussions"><strong>Look at some of the discussions</strong></a> we’re having right now and weigh in with your opinions. Any and all comments are welcome, whether you’re an experienced developer or just learning.</li><li><a href="https://github.com/timescale/timescale-analytics/labels/proposed-feature"><strong>Check out the features</strong></a> we’re thinking of adding, and weigh in on if they’re something you want, if we’re missing something, or if there are any issues or alternatives. We are releasing nightly Docker images of our builds.</li><li><a href="https://github.com/timescale/timescale-analytics/labels/feature-request"><strong>Explore our running feature requests, add a +1, </strong></a><strong> and </strong><a href="https://github.com/timescale/timescale-analytics/issues/new?assignees=&amp;labels=feature-request&amp;template=feature-request.md&amp;title="><strong>contribute your own</strong></a>.</li></ul><p><strong>Most importantly: </strong><a href="https://github.com/timescale/timescale-analytics/discussions"><strong>share your problems</strong></a><strong>! </strong>Tell us the kinds of queries or analyses you wish were easier, the issues you run into, or the workarounds you’ve created to solve gaps. (Example datasets are especially helpful, as they concretize your problems and create a shared language in which to discuss them.)</p><p><br></p>]]></content:encoded>
        </item>
    </channel>
</rss>