
Published at May 16, 2024

The Best Time-Series Databases Compared


Written by Team Timescale

Time-series databases answer a challenge that companies increasingly recognize: measuring how systems change over time.

To analyze time-series data effectively, you must collect data at speed and scale. If you ingest hundreds, thousands, or even more data points per second, they can pile up quickly.

Ultimately, you need a database that can handle the scale, has excellent usability, and is reliable. A general database is unlikely to meet all three criteria for actual time-series workloads. You need a purpose-built time-series database.

In this post, we’ll compare some of the most popular time-series databases and review key features and criteria to consider when choosing one.

What Makes a Time-Series Database the Best?

There are many ways a database can be the best, depending on your preferred optimization points. In this section, we outline the most critical aspects of a time-series database: scalability, maintainability, reliability, and query language.

  • Scalability. Time-series data typically grows rapidly, and the database must accommodate large volumes of data without performance degradation.

  • Maintainability. How easy it is to perform backups, updates, and other routine tasks that keep data intact and available. This matters because it affects the database's long-term usability and reliability, reducing the time and effort that administration requires.

  • Reliability. Ensuring data accuracy and availability, especially during high demand or failures, is essential. Since time-series databases often support business decisions, they require consistent and dependable data access.

  • Query language. A database's query language determines how users interact with the data, affecting ease of use and the learning curve. A familiar query language like SQL can reduce training time and integrate more easily with existing tools, enhancing productivity.

Top Time-Series Databases in Comparison

| Database | Database Model | Scalability | Deployment | Query Language | Pricing Models |
|---|---|---|---|---|---|
| TimescaleDB | Relational database | Vertically scalable, with automatic partitioning, columnar compression, optimized queries, and automatically updated materialized views. Horizontally, you can scale your read load on Timescale Cloud by adding read replicas. | Self-managed or managed cloud service | SQL | Open-source TimescaleDB can be self-hosted and is free; Timescale Cloud follows a pay-as-you-go model |
| InfluxDB | Custom, non-relational NoSQL columnar database | Horizontally scalable | Self-managed or managed cloud service | SQL, InfluxQL, Flux | InfluxDB Open Source is free to use; InfluxDB Cloud Serverless is a managed pay-as-you-go service; InfluxDB Cloud Dedicated and InfluxDB Clustered target high-volume production workloads, with costs depending on storage, CPU, and RAM |
| Prometheus | Pull-based model that scrapes metrics from targets | Scalable vertically or through federation | Deployed as a single binary on a server or on container platforms such as Kubernetes | PromQL | Open source; no associated license cost |
| Kdb+ | Columnar database with a custom data model | Horizontally scalable with multi-node support and multi-threading | On-premises, in the cloud, or hybrid | q | Free 32-bit version for non-commercial purposes; for commercial deployments, pricing depends on the deployment model and number of cores |
| Graphite | Whisper (file-based time-series) database format | Horizontally scalable; supports replication and clustering | On-premises or in the cloud | GQL | Open source; no associated licensing cost |
| ClickHouse | Columnar database | Horizontally scalable | On-premises or in the cloud; also available as a managed service | SQL-based declarative query language, mostly identical to the ANSI SQL standard | Open source with no associated license cost; ClickHouse Cloud follows a pay-as-you-go model |
| MongoDB | NoSQL database with a JSON-like document model | Horizontally scalable; supports automatic load balancing, data sharding, and replication | Self-managed or managed cloud service | MQL (MongoDB Query Language) | MongoDB Community Edition is open source and free to use; MongoDB Enterprise pricing depends on the features you choose; MongoDB Atlas has a pay-as-you-go pricing model |

InfluxDB

InfluxDB is a popular time-series database known for its high performance and scalability. Built from the ground up for time-series data, it offers a schema-less data model, which makes it easy to add new fields without modifying the schema or migrating data. Depending on which version you're using, it uses specialized query languages called InfluxQL and Flux (you can also rely on standard SQL).

InfluxDB can handle high-throughput ingestion of metrics data and is widely used in monitoring and alerting, IoT data storage and analysis, and real-time analytics. It’s also suited for applications that involve analyzing and tracking data points over time and for scenarios where users need to query data quickly after ingestion for real-time analytics. 

The database uses columnar storage for data, Apache Arrow for in-memory data representation, and Parquet for file storage, which allows for faster queries and better data compression. It also lets you define retention policies that determine how long your data is stored before it is automatically deleted.

What sets InfluxDB apart is its architecture, which involves four key components (data ingest, data compaction, data querying, and garbage collection), each operating independently. With the query and ingest components decoupled, each can be scaled independently to match workload demands. InfluxDB's clustering capabilities also enable horizontal scalability, allowing it to grow with data volumes and workloads.

Prometheus

Prometheus is a monitoring and alerting toolkit known for its reliability and scalability. It is optimized for collecting and querying time-series data related to system and application metrics. Prometheus uses a pull-based model in which the server scrapes metrics from targets at regular intervals and stores the data in a custom time-series database optimized for low resource usage and fast querying. This model makes it well suited to monitoring dynamic environments, since service discovery lets it automatically find and monitor new instances.

Prometheus has an extensible and modular architecture with components such as service discovery mechanisms and exporters. However, since it’s a non-distributed system, it isn’t horizontally scalable and lacks built-in clustering, although it supports federation, which allows multiple servers to share data.

Its query language, PromQL, allows users to perform queries and aggregations on metric data. Prometheus is widely used in cloud-native environments and is known for its simplicity, flexibility, and robustness. It integrates with container orchestration platforms like Kubernetes, making it a popular choice for monitoring containerized and microservices architectures. Other use cases include application performance monitoring and anomaly detection.
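For a sense of what PromQL looks like, here is a minimal, illustrative query (node_cpu_seconds_total comes from the standard node_exporter; your metric names will differ). It computes the per-instance rate of non-idle CPU time over the past five minutes:

# Per-instance rate of non-idle CPU time over the past five minutes
# (node_cpu_seconds_total is a node_exporter metric; adjust to your setup).
sum by (instance) (rate(node_cpu_seconds_total{mode!="idle"}[5m]))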

However, note that Prometheus is not a general-purpose time-series database, so it might not be suitable for long-term data storage. 

Kdb+

kdb+ is a columnar database optimized for handling financial time-series data such as stock market trades, although it works for other kinds of time-series data, including IoT and sensor data. Developed by Kx Systems, kdb+ is known for its speed, scalability, and advanced analytics capabilities, making it suitable for applications that need high-speed querying and analysis of high-volume data. However, it uses a proprietary language called q for querying and manipulating data, which is tailored for working with time-series data in finance.
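To give a sense of how terse q is, here is a small, illustrative q-SQL query. It assumes a trade table with time, sym, and price columns, the canonical schema used in kdb+ tutorials:

/ highest price per symbol in hourly buckets (a leading slash starts a comment)
select high:max price by sym, 60 xbar time.minute from trade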

The database is suited to scenarios where low-latency access to large volumes of time-series data is critical, such as algorithmic trading, risk management, and tick data analysis. Its architecture allows for on-disk and in-memory storage, and it can scale horizontally across numerous machines. Its data storage and processing make it a niche choice for financial institutions and trading firms seeking to gain insights from real-time market data.

Timescale

TimescaleDB is a powerful open-source time-series database built on top of PostgreSQL. It offers scalability, reliability, and advanced time-series capabilities. It automatically partitions time-series data using chunks and hypertables, which improves query performance and simplifies data management. 
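Creating a hypertable is a one-time step on top of a regular Postgres table. A minimal sketch, with illustrative table and column names:

-- Create a regular Postgres table, then convert it into a hypertable
-- partitioned by time (names are illustrative).
CREATE TABLE conditions (
  time        TIMESTAMPTZ NOT NULL,
  device_id   TEXT,
  temperature DOUBLE PRECISION
);

SELECT create_hypertable('conditions', 'time');

-- Inserts and queries then work like on any Postgres table, while
-- TimescaleDB manages the underlying chunks automatically.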

Timescale's compression offers significant storage savings by reducing the footprint of time-series data, often achieving up to 90 percent compression rates. This not only lowers storage costs but also enhances query performance, as less data needs to be read from disk, leading to faster analytics and more efficient resource utilization.

With TimescaleDB, you can leverage the familiarity and power of SQL to query and manipulate time-series data, enabling complex analytics and insights. It also provides specialized SQL operators and functions for different scenarios, like first, last, and time_bucket, that simplify aggregation and querying.  

Timescale's continuous aggregates optimize query performance by precomputing and materializing the results of time-series queries at regular intervals. This allows for instantaneous access to up-to-date summaries and insights, reducing the computational overhead on your database and ensuring faster, more efficient data analysis.

Its integration with PostgreSQL also ensures data durability, ACID compliance, and compatibility with a wide range of tools and applications. Whether it's for monitoring, IoT, financial analysis, or other time-series use cases, TimescaleDB provides a robust solution for storing and analyzing time-series data at scale.

All of these features make Timescale ideal for storing and analyzing metrics such as application logs, sensor data, server performance metrics, and financial data like trading volumes, exchange rates, and stock prices. 

Graphite

Graphite is an open-source monitoring tool and time-series database used for collecting, storing, and visualizing metric data. It stores data in the Whisper format, which manages storage efficiently by aggregating and expiring data according to retention policies, and offers a simple yet powerful query language, making it easy to ingest and analyze time-series data. It also supports a range of functions for querying, aggregating, and transforming time-series data, allowing users to create custom dashboards and graphs.
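Queries typically go through Graphite's render API. A hypothetical example (the metric path servers.web01.load is made up) that averages a metric into one-hour buckets over the past day:

# Summarize a metric into 1-hour averages over the last 24 hours
# (the metric path is hypothetical).
GET /render?target=summarize(servers.web01.load,"1hour","avg")&from=-24h&format=json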

The database’s architecture consists of several components: Carbon (which receives metrics from various sources, caches them in memory, and stores them in Whisper), Whisper (which manages data aggregation and retention), and Graphite-web (a web app that provides a UI for visualizing and querying time-series data).

Graphite is commonly used for monitoring infrastructure and application metrics in environments ranging from small-scale deployments to large-scale systems. It provides customizable dashboards and graphs for visualizing metric trends and anomalies, helping users gain insights into the performance and health of their systems. While Graphite has limitations in scalability compared to some other solutions, its simplicity and versatility make it a popular choice for many monitoring use cases.

ClickHouse

ClickHouse is an open-source columnar database management system optimized for analytics workloads. It excels in performing fast analytical queries on large volumes of data, and stores data in a column-oriented format, which enables efficient compression and processing for analytical queries (since only the necessary columns are read as the query is executed). 

The database's native support for SQL allows you to leverage your existing SQL skills and tools for querying and analyzing data. It is also highly scalable and can handle petabytes of data across distributed clusters. Plus, it uses MergeTree as its primary table engine, which supports indexing, replication, and partitioning. It also supports materialized views, which improve query performance.
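As a sketch of what a time-series table can look like in ClickHouse (table and column names are illustrative):

-- A MergeTree table for tick data: partitioned by month and ordered for
-- fast per-symbol time-range scans (names are illustrative).
CREATE TABLE crypto_ticks (
  time   DateTime,
  symbol String,
  price  Float64
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(time)
ORDER BY (symbol, time);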

ClickHouse is commonly used for ad hoc analytics, real-time reporting, and data warehousing applications where fast query performance is essential. Although it’s not optimized specifically for time-series data, it can still store and analyze it effectively. And while it can query time-series data quickly once ingested, it can struggle with write-heavy scenarios where data arrives in many small batches.

MongoDB

MongoDB is a NoSQL database that is commonly known for its ease of use, scalability, and flexibility. It stores data in JSON-like documents, making it suitable for various use cases, including real-time analytics, content management, and mobile applications. Despite being a general-purpose NoSQL database, MongoDB can efficiently process and store time-series data. Its flexible data model easily adapts to the evolving structure of time-series data.

MongoDB's horizontal scalability allows it to handle large volumes of data and high-throughput workloads with ease. And its query language, MongoDB Query Language (MQL), supports a rich set of operations for querying and manipulating document data. 
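Since MongoDB 5.0, you can also create dedicated time series collections, which bucket documents by time internally. A minimal sketch in mongosh, with illustrative names:

// Create a time series collection keyed on a "time" field, with "symbol"
// as metadata (collection and field names are illustrative).
db.createCollection("ticks", {
  timeseries: {
    timeField: "time",
    metaField: "symbol",
    granularity: "seconds"
  }
})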

Time-Series Database Features to Consider

Time-series databases may provide general features as well as time-series features (e.g., TimescaleDB) or focus on providing time-series features at the cost of supporting more general workloads (e.g., InfluxDB). Time-series data doesn’t live in a silo (you often want to correlate it with other types of data), so whatever the approach, you will need a way to relate general and time-series data.

In the following section, you will see examples of time-series features you can expect from a time-series database. To show you both ends of the spectrum, you’ll see what the features look like in two totally different time-series databases: InfluxDB (with its own “Flux” query language) and TimescaleDB (which extends PostgreSQL and offers full-SQL support).

Time-based aggregation

Time-based aggregation is a must-have feature for time-series databases. Without it, you won’t be able to create hourly, daily, weekly, and monthly time aggregations (and many permutations in between) that are key for analyzing time-series data. Let’s see what time-based aggregate queries look like in the case of InfluxDB and TimescaleDB:

InfluxDB

from(bucket: "crypto")   |> range(start: 0)   |> aggregateWindow(every: 1d, fn: count, createEmpty: false)   |> sort(columns: ["_time", ], desc:true)   |> limit(n:10)

Using InfluxDB’s query language, Flux, you must define a time range for every query. If you don’t want to restrict the query to a specific time range, you can use 0 as the start parameter to work around this. You can then use the aggregateWindow() function to create arbitrary time “buckets.”

TimescaleDB

SELECT
  time_bucket('1 day', time) AS bucket,
  count(*) AS row_count
FROM crypto_ticks
GROUP BY bucket
ORDER BY bucket DESC
LIMIT 10;

bucket             |row_count|
-------------------+---------+
2022-09-09 02:00:00|    52277|
2022-09-08 02:00:00|   128657|
2022-09-07 02:00:00|   134849|
2022-09-06 02:00:00|   132837|
2022-09-05 02:00:00|   126254|

In TimescaleDB, you can use the time_bucket() function to create arbitrary time buckets. All other PostgreSQL functions, like count(*), work the same way as in regular PostgreSQL.

When working with time-series data, you will have to create many time-based aggregations, so make sure that the database you choose provides a simple and intuitive interface for creating time buckets.

Automatic downsampling

Time-series data is often ingested at a very high resolution (e.g., thousands of data points per second). To make it easier to analyze, users often downsample their data (e.g., they convert thousands of data points per second into just one). This technique saves storage costs, since only the lower-resolution data needs to be retained, and it also makes it easier to create visualizations and recognize trends in the data.

Downsampling is often done repeatedly and continuously, which means that if, for example, you insert multiple new rows every second, the database rolls up the incoming data into larger buckets automatically. Instead of aggregating the raw data yourself, the database takes care of it automatically and in real time. Let’s see how InfluxDB and TimescaleDB handle downsampling with an OHLC example.

InfluxDB

close = from(bucket: "crypto")
  |> range(start: -30d)
  |> group(columns: ["symbol"])
  |> filter(fn: (r) => r["_measurement"] == "ohlc")
  |> window(every: 1h)
  |> reduce(
    fn: (r, accumulator) => ({
      indexLow:
        if (r._field == "low") then accumulator.indexLow + 1
        else accumulator.indexLow,
      indexOpen:
        if (r._field == "open") then accumulator.indexOpen + 1
        else accumulator.indexOpen,
      open:
        if (r._field == "open") then
          if (accumulator.indexOpen == 0) then r._value else accumulator.open
        else accumulator.open,
      high:
        if (r._field == "high") then
          if (r._value > accumulator.high) then r._value else accumulator.high
        else accumulator.high,
      low:
        if (r._field == "low") then
          if (r._value < accumulator.low or accumulator.indexLow == 0.0) then r._value else accumulator.low
        else accumulator.low,
      close:
        if (r._field == "close") then r._value else accumulator.close,
      volume:
        if (r._field == "volume") then r._value + accumulator.volume
        else accumulator.volume
    }),
    identity: {indexLow: 0, indexOpen: 0, open: 0.0, high: 0.0, low: 0.0, close: 0.0, volume: 0.0}
  )
  |> drop(columns: ["indexOpen", "indexLow"])
  |> group(columns: ["pair"])
  |> yield(name: "candle")

InfluxDB’s Flux provides a convenient way to write simple queries, but if you want to create somewhat more complex queries, like creating OHLC aggregates from raw financial tick data, the final query can become quite long as you can see.

TimescaleDB

CREATE MATERIALIZED VIEW hourly_buckets
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 hour', time) AS bucket,
  symbol,
  first(price, time) AS open,
  max(price) AS high,
  min(price) AS low,
  last(price, time) AS close
FROM crypto_ticks
GROUP BY bucket, symbol;

SELECT * FROM hourly_buckets;

bucket             |symbol  |open   |high   |low    |close  |
-------------------+--------+-------+-------+-------+-------+
2022-02-08 22:00:00|ADA/USD |  1.166|   1.17|  1.157|  1.168|
2022-02-08 22:00:00|ATOM/USD|  30.44|  30.63|   30.3|  30.51|
2022-02-08 22:00:00|AVAX/USD|  87.85|   88.0|  86.72|  87.06|
2022-02-08 22:00:00|BNB/USD |  413.5|  416.5|  410.3|  410.3|
2022-02-08 22:00:00|BTC/USD |44192.4|44354.0|43938.6|44185.2|

If you are familiar with PostgreSQL syntax, you can see that the TimescaleDB method is very similar to a PostgreSQL materialized view. However, the mechanism under the hood is different to provide a better developer experience for time-series data by automatically storing pre-aggregated buckets over time, maintaining aggregations when raw data changes, and even returning real-time data.

Querying recent data

You might want to build visual dashboards to display time-series trends or even close to real-time data. For creating trend charts, you can use the previously mentioned downsampling method. But for real-time data, you probably want to see more granular and recent data, e.g., all data points from the past five minutes. Let’s see how you can make this simple request in InfluxDB and TimescaleDB.

InfluxDB

from(bucket: "crypto")   |> range(start: -5m)   |> filter(fn: (r) => r.symbol == "BTC/USD")   |> sort(columns: ["_time", ], desc:true)   |> limit(n:5)   |> keep(columns: ["_time", "_value"])

In Flux, you can specify a time range relative to now with start: -5m, which returns all data for the “BTC/USD” symbol from the past five minutes.

TimescaleDB

SELECT
  time,
  price
FROM crypto_ticks
WHERE
  "time" > NOW() - INTERVAL '5 minutes' AND
  symbol = 'BTC/USD'
ORDER BY time DESC
LIMIT 5;

time               |price  |
-------------------+-------+
2022-09-12 15:24:07|22346.7|
2022-09-12 15:24:03|22346.3|
2022-09-12 15:23:50|22346.7|
2022-09-12 15:23:45|22355.9|
2022-09-12 15:23:40|22358.1|

In the TimescaleDB example, you can see a familiar SQL example (if you already know SQL) with a symbol filter and a relative time filter in the WHERE clause using the NOW() PostgreSQL function. Under the hood, however, this query gets executed differently from regular PostgreSQL: when you insert time-series data into the database, TimescaleDB auto-partitions your table based on the time column.

Then, when you make a query containing a time filter, like in this example, TimescaleDB can exclude whole chunks from scanning, which makes querying recent data lightning fast, even if you have billions of rows stored in the database.
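You can see this partitioning for yourself with the show_chunks() function. For example:

-- List the chunks backing the crypto_ticks hypertable
SELECT show_chunks('crypto_ticks');

-- Or only the chunks holding data older than three months
SELECT show_chunks('crypto_ticks', older_than => INTERVAL '3 months');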

Long-range analytical queries

What if you are also interested in analyzing longer time frames, e.g., all data from the past year? Maybe you want to see the highest price of a certain stock or crypto symbol in the past year.

InfluxDB

from(bucket: "crypto")
  |> range(start: -1y)
  |> group(columns: ["code"])
  |> max()
  |> group()
  |> sort(columns: ["_value"], desc: true)

This example shows that Flux executes your query in the same order as you describe it.

TimescaleDB

SELECT
  symbol,
  MAX(price) AS max_price
FROM crypto_ticks
WHERE
  "time" >= NOW() - INTERVAL '1 year'
GROUP BY symbol
ORDER BY max_price DESC;

symbol   |max_price|
---------+---------+
BTC/USD  |  48210.1|
WBTC/USD | 48169.56|
ETH/USD  |  3579.38|
BNB/USD  |    460.0|
SOL/USD  |   143.55|

Analytical queries like this, with a larger time window as the filter, are not typical time-series queries, but you might want to run these from time to time. TimescaleDB provides two features that significantly speed up these queries: native compression, which saves space and converts your data into a columnar form, and continuous aggregates, which automatically maintain materialized aggregate data that can be retained separately from raw readings. Together, these features can have a dramatic effect on the performance of your application.

JOINing time-series data with other business data

Sometimes we only talk about time-series data without mentioning all the other data that real-world projects have in their data infrastructure. But the reality is that time-series data is always connected to non-time-series (business) data. If you plan to analyze your time-series data and business data together, the database you choose needs to be able to JOIN them and work with them quickly and simply. In the following examples, you can see how to join two tables in InfluxDB and TimescaleDB.

InfluxDB

crypto_assets = from(bucket: "crypto-assets")
  |> range(start: -1mo)
  |> filter(fn: (r) => r._measurement == "assets" and r._field == "symbol")

crypto_ticks = from(bucket: "crypto-ticks")
  |> range(start: -1m)
  |> filter(fn: (r) => r._measurement == "ticks" and r._field == "price")

join(
  tables: {assets: crypto_assets, ticks: crypto_ticks},
  on: ["symbol"],
)

The big difference between InfluxDB and TimescaleDB in this regard is that InfluxDB can only store timestamped data, while TimescaleDB can store timestamped and non-timestamped data right next to each other. Thus, in InfluxDB you can only join time-series data with other time-series data but not relational data.

TimescaleDB

SELECT
  crypto_assets.name,
  bucket,
  close,
  high,
  low,
  open
FROM one_day_candle
INNER JOIN crypto_assets ON crypto_assets.symbol = one_day_candle.symbol
WHERE
  bucket > NOW() - INTERVAL '1 month' AND
  one_day_candle.symbol = 'BTC/USD'
ORDER BY bucket;

name       |bucket             |close  |high   |low    |open   |
-----------+-------------------+-------+-------+-------+-------+
Bitcoin USD|2022-08-13 02:00:00|24460.6|24889.5|24312.3|24402.2|
Bitcoin USD|2022-08-14 02:00:00|24312.4|25034.2|24160.4|24455.2|
Bitcoin USD|2022-08-15 02:00:00|24092.8|25210.9|23798.7|24316.2|
Bitcoin USD|2022-08-16 02:00:00|23867.7|24247.5|23692.0|24103.2|
Bitcoin USD|2022-08-17 02:00:00|23340.1|24430.1|23184.4|23857.3|

In TimescaleDB, you can use PostgreSQL’s JOIN to connect any two tables in the same database, enabling you to store your non-time-series data next to your time-series data. Without this feature, you might have a harder time bringing your data together from multiple sources.

Fun fact: One of the reasons TimescaleDB was created was that the founders struggled to find a database that could do easy JOINs for time-series data.

Data retention and compression

While observing how thousands of companies handle time-series data, we found that it becomes less valuable over time. This means users often want to archive or even remove older data after a certain time to save on storage costs.

InfluxDB


In InfluxDB, you can change data retention settings on a per-bucket basis in the UI. In older versions, you could also add retention policies like this:

CREATE RETENTION POLICY "one_year" ON "some_dataset" DURATION 1y REPLICATION 1 DEFAULT

TimescaleDB

-- Compression:
ALTER TABLE crypto_ticks SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'symbol'
);

SELECT add_compression_policy('crypto_ticks', INTERVAL '2 weeks');

-- Data retention:
SELECT add_retention_policy('crypto_ticks', INTERVAL '1 year');

With TimescaleDB, you can set up both a compression policy (to save on storage needs but keep the data available for querying) and a data retention policy (which gets rid of the data after the defined time period). Without essential tooling around data compression and data retention in the database, you’d need to implement and maintain these automations manually.

Next Steps

The value that time-series data provides is unquestionable. It’s like watching the whole movie instead of just the last frame. You do need a proper database that can play that movie for you, though. We hope this article gave you the tools you need to choose the time-series database that best fits your team and your project.

Check out our docs for hands-on time-series tutorials:

  • Getting Started With TimescaleDB

  • How to Simulate a Basic IoT Sensor Dataset on PostgreSQL

  • And more!

If you are interested in Timescale, you can try it for free or check out our GitHub, community Slack, or Forum and join discussions with thousands of Timescale users.
