TigerData logo
TigerData logo
  • Product

    Product

    Tiger Cloud

    Robust elastic cloud platform for startups and enterprises

    TimescaleDB Enterprise

    Self-managed TimescaleDB for on-prem, edge and private cloud

    Open source

    TimescaleDB

    Time-series, real-time analytics and events on Postgres

    Search

    Vector and keyword search on Postgres

  • Industry

    Crypto

    Energy Telemetry

    Oil & Gas Operations

  • Docs
  • Pricing

    Pricing

    Enterprise Tier

  • Developer Hub

    Changelog

    Benchmarks

    Blog

    Community

    Customer Stories

    Events

    Support

    Integrations

    Launch Hub

  • Company

    Contact us

    About

    Timescale

    Partners

    Security

    Careers

Log InStart a free trial
Home
The Best Time-Series Databases Compared (2026)Time Series Anomaly Detection: Methods, SQL, and Real-Time ImplementationAWS Timestream Alternatives: Your Migration Options After LiveAnalyticsWhat Is Temporal Data?Time-Series Database: What It Is, How It Works, and When You Need OneIs Your Data Time Series? Data Types Supported by PostgreSQL and TimescaleUnderstanding Database Workloads: Variable, Bursty, and Uniform PatternsTime-Series Analysis and Forecasting With Python What Are Open-Source Time-Series Databases—Understanding Your OptionsStationary Time-Series AnalysisAlternatives to TimescaleWhy Consider Using PostgreSQL for Time-Series Data?Time-Series Analysis in RWhat Is a Time Series and How Is It Used?How to Work With Time Series in Python?Tools for Working With Time-Series Analysis in PythonGuide to Time-Series Analysis in PythonUnderstanding Autoregressive Time-Series ModelingCreating a Fast Time-Series Graph With Postgres Materialized Views
PostgreSQL vs. Cassandra: The Decision Framework for Time-Series and Write-Heavy WorkloadsUnderstanding PostgreSQLOptimizing Your Database: A Deep Dive into PostgreSQL Data TypesUnderstanding FROM in PostgreSQL (With Examples)How to Address ‘Error: Could Not Resize Shared Memory Segment’ Understanding FILTER in PostgreSQL (With Examples)How to Install PostgreSQL on MacOSUnderstanding GROUP BY in PostgreSQL (With Examples)Understanding LIMIT in PostgreSQL (With Examples)Understanding PostgreSQL FunctionsUnderstanding ORDER BY in PostgreSQL (With Examples)PostgreSQL Mathematical Functions: Enhancing Coding EfficiencyUnderstanding PostgreSQL WITHIN GROUPUnderstanding WINDOW in PostgreSQL (With Examples)Using PostgreSQL String Functions for Improved Data AnalysisUnderstanding DISTINCT in PostgreSQL (With Examples)PostgreSQL Joins : A SummaryUnderstanding PostgreSQL Date and Time FunctionsWhat Is a PostgreSQL Cross Join?Understanding ACID Compliance Understanding PostgreSQL Conditional FunctionsStructured vs. Semi-Structured vs. Unstructured Data in PostgreSQLUnderstanding percentile_cont() and percentile_disc() in PostgreSQL5 Common Connection Errors in PostgreSQL and How to Solve ThemData Processing With PostgreSQL Window FunctionsPostgreSQL Join Type TheoryA Guide to PostgreSQL ViewsData Partitioning: What It Is and Why It MattersUnderstanding PostgreSQL Array FunctionsUnderstanding PostgreSQL's COALESCE FunctionUnderstanding the rank() and dense_rank() Functions in PostgreSQLWhat Is a PostgreSQL Left Join? And a Right Join?Strategies for Improving Postgres JOIN PerformanceUnderstanding Foreign Keys in PostgreSQLUnderstanding PostgreSQL User-Defined FunctionsUnderstanding SQL Aggregate FunctionsUsing PostgreSQL UPDATE With JOINHow to Install PostgreSQL on LinuxUnderstanding HAVING in PostgreSQL (With Examples)How to Fix No Partition of Relation Found for Row in Postgres DatabasesHow to Fix Transaction ID Wraparound ExhaustionUnderstanding WHERE in PostgreSQL (With Examples)Understanding OFFSET in PostgreSQL (With Examples)What Is a PostgreSQL Inner Join?Understanding PostgreSQL SELECTWhat Is Data Compression and How Does It Work?What Is Data Transformation, and Why Is It Important?What Characters Are Allowed in PostgreSQL Strings?Understanding the Postgres string_agg FunctionWhat Is a PostgreSQL Full Outer Join?Self-Hosted or Cloud Database? A Countryside Reflection on Infrastructure ChoicesUnderstanding the Postgres extract() Function
How to Choose a Database: A Decision Framework for Modern ApplicationsA Guide to Scaling PostgreSQLHandling Large Objects in PostgresGuide to PostgreSQL PerformancePostgreSQL Performance Tuning: Key ParametersHow to Reduce Bloat in Large PostgreSQL TablesDetermining the Optimal Postgres Partition SizeNavigating Growing PostgreSQL Tables With Partitioning (and More)SQL/JSON Data Model and JSON in SQL: A PostgreSQL PerspectiveHow to Use PostgreSQL for Data TransformationPostgreSQL Performance Tuning: Designing and Implementing Your Database SchemaPostgreSQL Performance Tuning: Optimizing Database IndexesWhen to Consider Postgres PartitioningAn Intro to Data Modeling on PostgreSQLDesigning Your Database Schema: Wide vs. Narrow Postgres TablesGuide to PostgreSQL Database OperationsBest Practices for Time-Series Data Modeling: Single or Multiple Partitioned Table(s) a.k.a. Hypertables Best Practices for (Time-)Series Metadata Tables What Is a PostgreSQL Temporary View?PostgreSQL Performance Tuning: How to Size Your DatabaseA PostgreSQL Database Replication GuideGuide to Postgres Data ManagementHow to Compute Standard Deviation With PostgreSQLRecursive Query in SQL: What It Is, and How to Write OneHow to Query JSON Metadata in PostgreSQLHow to Query JSONB in PostgreSQLA Guide to Data Analysis on PostgreSQLGuide to PostgreSQL SecurityOptimizing Array Queries With GIN Indexes in PostgreSQLPg_partman vs. Hypertables for Postgres PartitioningTop PostgreSQL Drivers for PythonUnderstanding PostgreSQL TablespacesWhat Is Audit Logging and How to Enable It in PostgreSQLHow to Index JSONB Columns in PostgreSQLHow to Monitor and Optimize PostgreSQL Index PerformanceA Guide to pg_restore (and pg_restore Example)Explaining PostgreSQL EXPLAINHow PostgreSQL Data Aggregation WorksHow to Use Psycopg2: The PostgreSQL Adapter for PythonBuilding a Scalable DatabaseGuide to PostgreSQL Database Design
Best Practices for Postgres Data ManagementHow to Store Video in PostgreSQL Using BYTEABest Practices for Postgres PerformanceHow to Design Your PostgreSQL Database: Two Schema ExamplesBest Practices for Scaling PostgreSQLHow to Handle High-Cardinality Data in PostgreSQLBest Practices for PostgreSQL AggregationBest Practices for Postgres Database ReplicationHow to Use a Common Table Expression (CTE) in SQLBest Practices for Postgres SecurityBest Practices for PostgreSQL Database OperationsBest Practices for PostgreSQL Data AnalysisTesting Postgres Ingest: INSERT vs. Batch INSERT vs. COPYHow to Manage Your Data With Data Retention PoliciesHow to Use PostgreSQL for Data Normalization
PostgreSQL Extensions: amcheckPostgreSQL Extensions: Turning PostgreSQL Into a Vector Database With pgvectorPostgreSQL Extensions: Unlocking Multidimensional Points With Cube PostgreSQL Extensions: hstorePostgreSQL Extensions: ltreePostgreSQL Extensions: Secure Your Time-Series Data With pgcryptoPostgreSQL Extensions: pg_prewarmPostgreSQL Extensions: pgRoutingPostgreSQL Extensions: pg_stat_statementsPostgreSQL Extensions: Database Testing With pgTAPPostgreSQL Extensions: Install pg_trgm for Data MatchingPostgreSQL Extensions: PL/pgSQLPostgreSQL Extensions: Using PostGIS and Timescale for Advanced Geospatial InsightsPostgreSQL Extensions: Intro to uuid-ossp
What Is ClickHouse and How Does It Compare to PostgreSQL and TimescaleDB for Time Series?Timescale vs. Amazon RDS PostgreSQL: Up to 350x Faster Queries, 44 % Faster Ingest, 95 % Storage Savings for Time-Series DataWhat We Learned From Benchmarking Amazon Aurora PostgreSQL ServerlessTimescaleDB vs. Amazon Timestream: 6,000x Higher Inserts, 5-175x Faster Queries, 150-220x CheaperHow to Store Time-Series Data in MongoDB and Why That’s a Bad IdeaPostgreSQL + TimescaleDB: 1,000x Faster Queries, 90 % Data Compression, and Much MoreEye or the Tiger: Benchmarking Cassandra vs. TimescaleDB for Time-Series Data
EV Charging Management System: Architecture, OCPP Data, and the Right DatabaseWhat Is an Edge Database? On-Device Storage, Sync Patterns, and Choosing the Right StackWater Utilities Database: How to Store and Query SCADA, AMI, and Quality Data at ScaleA Beginner’s Guide to IIoT and Industry 4.0Data Historian vs. Time-Series Database: How to Choose and When to SwitchWhat Is a Data Historian?The Best Databases for IoT in 2026: A Practical ComparisonHow Hopthru Powers Real-Time Transit Analytics From a 1 TB TableUnderstanding IoT (Internet of Things)Storing IoT Data: 8 Reasons Why You Should Use PostgreSQLHow to Simulate a Basic IoT Sensor Dataset on PostgreSQLFrom Ingest to Insights in Milliseconds: Everactive's Tech Transformation With TimescaleHow Ndustrial Is Providing Fast Real-Time Queries and Safely Storing Client Data With 97 % CompressionWhy You Should Use PostgreSQL for Industrial IoT Data Migrating a Low-Code IoT Platform Storing 20M Records/DayHow United Manufacturing Hub Is Introducing Open Source to ManufacturingBuilding IoT Pipelines for Faster Analytics With IoT CoreVisualizing IoT Data at Scale With Hopara and TimescaleDB
A Brief History of AI: How Did We Get Here, and What's Next?A Beginner’s Guide to Vector EmbeddingsPostgreSQL as a Vector Database: A Pgvector TutorialUsing Pgvector With PythonHow to Choose a Vector DatabaseVector Databases Are the Wrong AbstractionUnderstanding DiskANNA Guide to Cosine SimilarityStreaming DiskANN: How We Made PostgreSQL as Fast as Pinecone for Vector DataImplementing Cosine Similarity in PythonVector Database Basics: HNSWVector Database Options for AWSVector Store vs. Vector Database: Understanding the ConnectionPgvector vs. Pinecone: Vector Database Performance and Cost ComparisonHow to Build LLM Applications With Pgvector Vector Store in LangChainHow to Implement RAG With Amazon Bedrock and LangChainRAG Is More Than Just Vector SearchRefining Vector Search Queries With Time Filters in Pgvector: A TutorialUnderstanding Semantic SearchVector Search vs Semantic SearchHNSW vs. DiskANNWhen Should You Use Full-Text Search vs. Vector Search?Building AI Agents with Persistent Memory: A Unified Database ApproachWhat Is Vector Search? Text-to-SQL: A Developer’s Zero-to-Hero GuideNearest Neighbor Indexes: What Are IVFFlat Indexes in Pgvector and How Do They WorkPostgreSQL Hybrid Search Using Pgvector and CohereBuilding an AI Image Gallery With OpenAI CLIP, Claude Sonnet 3.5, and Pgvector
Understanding OLTPUnderstanding OLAP: What It Is, How It Differs From OLTP, and Running It on PostgreSQLColumnar Databases vs. Row-Oriented Databases: Which to Choose?How to Choose an OLAP DatabaseHow to Choose a Real-Time Analytics DatabaseData Analytics vs. Real-Time Analytics: How to Pick Your Database (and Why It Should Be PostgreSQL)PostgreSQL as a Real-Time Analytics DatabaseWhat Is the Best Database for Real-Time AnalyticsHow to Build an IoT Pipeline for Real-Time Analytics in PostgreSQL
Alternatives to RDSWhy Is RDS so Expensive? Understanding RDS Pricing and CostsEstimating RDS CostsHow to Migrate From AWS RDS for PostgreSQL to TimescaleAmazon Aurora vs. RDS: Understanding the Difference
5 InfluxDB Alternatives for Your Time-Series Data8 Reasons to Choose Timescale as Your InfluxDB Alternative InfluxQL, Flux, and SQL: Which Query Language Is Best? (With Cheatsheet)What InfluxDB Got WrongTimescaleDB vs. InfluxDB: Purpose Built Differently for Time-Series Data
Is Postgres Partitioning Really That Hard? An Introduction To HypertablesComplete Guide: Migrating from MongoDB to Tiger Data (Step-by-Step)How to Migrate Your Data to Timescale (3 Ways)Postgres TOAST vs. Timescale CompressionBuilding Python Apps With PostgreSQL: A Developer's GuideData Visualization in PostgreSQL With Apache SupersetMore Time-Series Data Analysis, Fewer Lines of Code: Meet HyperfunctionsPostgreSQL Materialized Views and Where to Find Them5 Ways to Monitor Your PostgreSQL DatabaseTimescale Tips: Testing Your Chunk Size
Postgres cheat sheet
HomeTime series basicsPostgres basicsPostgres guidesPostgres best practicesPostgres extensionsBenchmarks
Home
The Best Time-Series Databases Compared (2026)Time Series Anomaly Detection: Methods, SQL, and Real-Time ImplementationAWS Timestream Alternatives: Your Migration Options After LiveAnalyticsWhat Is Temporal Data?Time-Series Database: What It Is, How It Works, and When You Need OneIs Your Data Time Series? Data Types Supported by PostgreSQL and TimescaleUnderstanding Database Workloads: Variable, Bursty, and Uniform PatternsTime-Series Analysis and Forecasting With Python What Are Open-Source Time-Series Databases—Understanding Your OptionsStationary Time-Series AnalysisAlternatives to TimescaleWhy Consider Using PostgreSQL for Time-Series Data?Time-Series Analysis in RWhat Is a Time Series and How Is It Used?How to Work With Time Series in Python?Tools for Working With Time-Series Analysis in PythonGuide to Time-Series Analysis in PythonUnderstanding Autoregressive Time-Series ModelingCreating a Fast Time-Series Graph With Postgres Materialized Views
PostgreSQL vs. Cassandra: The Decision Framework for Time-Series and Write-Heavy WorkloadsUnderstanding PostgreSQLOptimizing Your Database: A Deep Dive into PostgreSQL Data TypesUnderstanding FROM in PostgreSQL (With Examples)How to Address ‘Error: Could Not Resize Shared Memory Segment’ Understanding FILTER in PostgreSQL (With Examples)How to Install PostgreSQL on MacOSUnderstanding GROUP BY in PostgreSQL (With Examples)Understanding LIMIT in PostgreSQL (With Examples)Understanding PostgreSQL FunctionsUnderstanding ORDER BY in PostgreSQL (With Examples)PostgreSQL Mathematical Functions: Enhancing Coding EfficiencyUnderstanding PostgreSQL WITHIN GROUPUnderstanding WINDOW in PostgreSQL (With Examples)Using PostgreSQL String Functions for Improved Data AnalysisUnderstanding DISTINCT in PostgreSQL (With Examples)PostgreSQL Joins : A SummaryUnderstanding PostgreSQL Date and Time FunctionsWhat Is a PostgreSQL Cross Join?Understanding ACID Compliance Understanding PostgreSQL Conditional FunctionsStructured vs. Semi-Structured vs. Unstructured Data in PostgreSQLUnderstanding percentile_cont() and percentile_disc() in PostgreSQL5 Common Connection Errors in PostgreSQL and How to Solve ThemData Processing With PostgreSQL Window FunctionsPostgreSQL Join Type TheoryA Guide to PostgreSQL ViewsData Partitioning: What It Is and Why It MattersUnderstanding PostgreSQL Array FunctionsUnderstanding PostgreSQL's COALESCE FunctionUnderstanding the rank() and dense_rank() Functions in PostgreSQLWhat Is a PostgreSQL Left Join? And a Right Join?Strategies for Improving Postgres JOIN PerformanceUnderstanding Foreign Keys in PostgreSQLUnderstanding PostgreSQL User-Defined FunctionsUnderstanding SQL Aggregate FunctionsUsing PostgreSQL UPDATE With JOINHow to Install PostgreSQL on LinuxUnderstanding HAVING in PostgreSQL (With Examples)How to Fix No Partition of Relation Found for Row in Postgres DatabasesHow to Fix Transaction ID Wraparound ExhaustionUnderstanding WHERE in PostgreSQL (With Examples)Understanding OFFSET in PostgreSQL (With Examples)What Is a PostgreSQL Inner Join?Understanding PostgreSQL SELECTWhat Is Data Compression and How Does It Work?What Is Data Transformation, and Why Is It Important?What Characters Are Allowed in PostgreSQL Strings?Understanding the Postgres string_agg FunctionWhat Is a PostgreSQL Full Outer Join?Self-Hosted or Cloud Database? A Countryside Reflection on Infrastructure ChoicesUnderstanding the Postgres extract() Function
How to Choose a Database: A Decision Framework for Modern ApplicationsA Guide to Scaling PostgreSQLHandling Large Objects in PostgresGuide to PostgreSQL PerformancePostgreSQL Performance Tuning: Key ParametersHow to Reduce Bloat in Large PostgreSQL TablesDetermining the Optimal Postgres Partition SizeNavigating Growing PostgreSQL Tables With Partitioning (and More)SQL/JSON Data Model and JSON in SQL: A PostgreSQL PerspectiveHow to Use PostgreSQL for Data TransformationPostgreSQL Performance Tuning: Designing and Implementing Your Database SchemaPostgreSQL Performance Tuning: Optimizing Database IndexesWhen to Consider Postgres PartitioningAn Intro to Data Modeling on PostgreSQLDesigning Your Database Schema: Wide vs. Narrow Postgres TablesGuide to PostgreSQL Database OperationsBest Practices for Time-Series Data Modeling: Single or Multiple Partitioned Table(s) a.k.a. Hypertables Best Practices for (Time-)Series Metadata Tables What Is a PostgreSQL Temporary View?PostgreSQL Performance Tuning: How to Size Your DatabaseA PostgreSQL Database Replication GuideGuide to Postgres Data ManagementHow to Compute Standard Deviation With PostgreSQLRecursive Query in SQL: What It Is, and How to Write OneHow to Query JSON Metadata in PostgreSQLHow to Query JSONB in PostgreSQLA Guide to Data Analysis on PostgreSQLGuide to PostgreSQL SecurityOptimizing Array Queries With GIN Indexes in PostgreSQLPg_partman vs. Hypertables for Postgres PartitioningTop PostgreSQL Drivers for PythonUnderstanding PostgreSQL TablespacesWhat Is Audit Logging and How to Enable It in PostgreSQLHow to Index JSONB Columns in PostgreSQLHow to Monitor and Optimize PostgreSQL Index PerformanceA Guide to pg_restore (and pg_restore Example)Explaining PostgreSQL EXPLAINHow PostgreSQL Data Aggregation WorksHow to Use Psycopg2: The PostgreSQL Adapter for PythonBuilding a Scalable DatabaseGuide to PostgreSQL Database Design
Best Practices for Postgres Data ManagementHow to Store Video in PostgreSQL Using BYTEABest Practices for Postgres PerformanceHow to Design Your PostgreSQL Database: Two Schema ExamplesBest Practices for Scaling PostgreSQLHow to Handle High-Cardinality Data in PostgreSQLBest Practices for PostgreSQL AggregationBest Practices for Postgres Database ReplicationHow to Use a Common Table Expression (CTE) in SQLBest Practices for Postgres SecurityBest Practices for PostgreSQL Database OperationsBest Practices for PostgreSQL Data AnalysisTesting Postgres Ingest: INSERT vs. Batch INSERT vs. COPYHow to Manage Your Data With Data Retention PoliciesHow to Use PostgreSQL for Data Normalization
PostgreSQL Extensions: amcheckPostgreSQL Extensions: Turning PostgreSQL Into a Vector Database With pgvectorPostgreSQL Extensions: Unlocking Multidimensional Points With Cube PostgreSQL Extensions: hstorePostgreSQL Extensions: ltreePostgreSQL Extensions: Secure Your Time-Series Data With pgcryptoPostgreSQL Extensions: pg_prewarmPostgreSQL Extensions: pgRoutingPostgreSQL Extensions: pg_stat_statementsPostgreSQL Extensions: Database Testing With pgTAPPostgreSQL Extensions: Install pg_trgm for Data MatchingPostgreSQL Extensions: PL/pgSQLPostgreSQL Extensions: Using PostGIS and Timescale for Advanced Geospatial InsightsPostgreSQL Extensions: Intro to uuid-ossp
What Is ClickHouse and How Does It Compare to PostgreSQL and TimescaleDB for Time Series?Timescale vs. Amazon RDS PostgreSQL: Up to 350x Faster Queries, 44 % Faster Ingest, 95 % Storage Savings for Time-Series DataWhat We Learned From Benchmarking Amazon Aurora PostgreSQL ServerlessTimescaleDB vs. Amazon Timestream: 6,000x Higher Inserts, 5-175x Faster Queries, 150-220x CheaperHow to Store Time-Series Data in MongoDB and Why That’s a Bad IdeaPostgreSQL + TimescaleDB: 1,000x Faster Queries, 90 % Data Compression, and Much MoreEye or the Tiger: Benchmarking Cassandra vs. TimescaleDB for Time-Series Data
EV Charging Management System: Architecture, OCPP Data, and the Right DatabaseWhat Is an Edge Database? On-Device Storage, Sync Patterns, and Choosing the Right StackWater Utilities Database: How to Store and Query SCADA, AMI, and Quality Data at ScaleA Beginner’s Guide to IIoT and Industry 4.0Data Historian vs. Time-Series Database: How to Choose and When to SwitchWhat Is a Data Historian?The Best Databases for IoT in 2026: A Practical ComparisonHow Hopthru Powers Real-Time Transit Analytics From a 1 TB TableUnderstanding IoT (Internet of Things)Storing IoT Data: 8 Reasons Why You Should Use PostgreSQLHow to Simulate a Basic IoT Sensor Dataset on PostgreSQLFrom Ingest to Insights in Milliseconds: Everactive's Tech Transformation With TimescaleHow Ndustrial Is Providing Fast Real-Time Queries and Safely Storing Client Data With 97 % CompressionWhy You Should Use PostgreSQL for Industrial IoT Data Migrating a Low-Code IoT Platform Storing 20M Records/DayHow United Manufacturing Hub Is Introducing Open Source to ManufacturingBuilding IoT Pipelines for Faster Analytics With IoT CoreVisualizing IoT Data at Scale With Hopara and TimescaleDB
A Brief History of AI: How Did We Get Here, and What's Next?A Beginner’s Guide to Vector EmbeddingsPostgreSQL as a Vector Database: A Pgvector TutorialUsing Pgvector With PythonHow to Choose a Vector DatabaseVector Databases Are the Wrong AbstractionUnderstanding DiskANNA Guide to Cosine SimilarityStreaming DiskANN: How We Made PostgreSQL as Fast as Pinecone for Vector DataImplementing Cosine Similarity in PythonVector Database Basics: HNSWVector Database Options for AWSVector Store vs. Vector Database: Understanding the ConnectionPgvector vs. Pinecone: Vector Database Performance and Cost ComparisonHow to Build LLM Applications With Pgvector Vector Store in LangChainHow to Implement RAG With Amazon Bedrock and LangChainRAG Is More Than Just Vector SearchRefining Vector Search Queries With Time Filters in Pgvector: A TutorialUnderstanding Semantic SearchVector Search vs Semantic SearchHNSW vs. DiskANNWhen Should You Use Full-Text Search vs. Vector Search?Building AI Agents with Persistent Memory: A Unified Database ApproachWhat Is Vector Search? Text-to-SQL: A Developer’s Zero-to-Hero GuideNearest Neighbor Indexes: What Are IVFFlat Indexes in Pgvector and How Do They WorkPostgreSQL Hybrid Search Using Pgvector and CohereBuilding an AI Image Gallery With OpenAI CLIP, Claude Sonnet 3.5, and Pgvector
Understanding OLTPUnderstanding OLAP: What It Is, How It Differs From OLTP, and Running It on PostgreSQLColumnar Databases vs. Row-Oriented Databases: Which to Choose?How to Choose an OLAP DatabaseHow to Choose a Real-Time Analytics DatabaseData Analytics vs. Real-Time Analytics: How to Pick Your Database (and Why It Should Be PostgreSQL)PostgreSQL as a Real-Time Analytics DatabaseWhat Is the Best Database for Real-Time AnalyticsHow to Build an IoT Pipeline for Real-Time Analytics in PostgreSQL
Alternatives to RDSWhy Is RDS so Expensive? Understanding RDS Pricing and CostsEstimating RDS CostsHow to Migrate From AWS RDS for PostgreSQL to TimescaleAmazon Aurora vs. RDS: Understanding the Difference
5 InfluxDB Alternatives for Your Time-Series Data8 Reasons to Choose Timescale as Your InfluxDB Alternative InfluxQL, Flux, and SQL: Which Query Language Is Best? (With Cheatsheet)What InfluxDB Got WrongTimescaleDB vs. InfluxDB: Purpose Built Differently for Time-Series Data
Is Postgres Partitioning Really That Hard? An Introduction To HypertablesComplete Guide: Migrating from MongoDB to Tiger Data (Step-by-Step)How to Migrate Your Data to Timescale (3 Ways)Postgres TOAST vs. Timescale CompressionBuilding Python Apps With PostgreSQL: A Developer's GuideData Visualization in PostgreSQL With Apache SupersetMore Time-Series Data Analysis, Fewer Lines of Code: Meet HyperfunctionsPostgreSQL Materialized Views and Where to Find Them5 Ways to Monitor Your PostgreSQL DatabaseTimescale Tips: Testing Your Chunk Size
Postgres cheat sheet
TigerData logo

Products

Time-series and Analytics AI and Vector Enterprise Plan Cloud Status Support Security Cloud Terms of Service

Learn

Documentation Blog Tutorials Changelog Success Stories Time-series Database

Company

Contact Us Careers About Newsroom Brand Community Code Of Conduct Events

Subscribe to the Tiger Data Newsletter

2026 (c) Timescale, Inc., d/b/a Tiger Data. All rights reserved.

Privacy preferences
LegalPrivacySitemap

By Tiger Data Team

Updated at Apr 23, 2026

Table of contents

    Time Series Anomaly Detection: Methods, SQL, and Real-Time Implementation

    Time Series Anomaly Detection: Methods, SQL, and Real-Time Implementation

    By Tiger Data Team

    Updated at Apr 23, 2026

    Time series anomaly detection is the process of identifying data points, sequences, or patterns in time-ordered data that deviate significantly from expected behavior. It is a core technique in three of the most demanding production monitoring domains: IoT sensor monitoring (industrial equipment, energy telemetry, smart buildings), infrastructure observability (server metrics, latency spikes, traffic anomalies), and financial monitoring (fraud detection, unusual transaction patterns).

    What separates time series anomaly detection from general outlier detection is temporal structure. A reading of 95% CPU utilization looks like an outlier in the global distribution until you account for context: that value is expected every night at 2am during a scheduled batch job, but anomalous at 2am on a Sunday. Seasonality, trends, and autocorrelation mean a data point has to be evaluated against its temporal context, not just the overall distribution. That distinction justifies the specialization and shapes every method covered in this guide.

    This guide covers two implementation paths. The first is SQL-native detection using PostgreSQL window functions and TimescaleDB's continuous aggregates. Standard SQL handles the majority of production monitoring use cases without any external infrastructure or ML models. The second is Python integration for cases where labeled training data, ensemble methods, or foundation models like Moment or TimeGPT are the right tool. These paths are complementary, not competing.

    One transparency note up front: Tiger Data builds a PostgreSQL-based time-series database, so our examples are PostgreSQL-native. Where Python-only or ML-specific approaches are better for your use case, we say so.

    Types of Anomalies in Time Series Data

    Every anomaly detection strategy starts with knowing what type of anomaly you are looking for. Detection methods that work well for one type often fail for another.

    Point anomaly: A single observation that deviates from the expected distribution. A temperature sensor reading 9,999°C when all surrounding readings are 72°C is a point anomaly. These are the easiest to detect and the most common subject of simple threshold alerts.

    Contextual anomaly: A value that is normal in one context but anomalous in another. CPU utilization at 95% is expected during a scheduled batch job at 2am but is anomalous at the same hour on a Sunday with no scheduled jobs. The value itself is not unusual. Its context makes it anomalous. Detecting contextual anomalies requires partitioned baselines or time-of-week adjustments.

    Collective anomaly: A sequence of observations that together signal a problem, even if no individual reading is unusual. A gradual 0.5°C temperature rise in industrial equipment over six hours looks like measurement noise reading by reading, but collectively indicates bearing failure. Collective anomalies often require sequence models or pattern matching across a window.

    Type

    Description

    Example

    Detection challenge

    Point

    Single outlier observation

    Sensor reads 9,999°C vs. surrounding 72°C

    Threshold or Z-score

    Contextual

    Normal value in wrong context

    95% CPU at 2am on Sunday

    Partitioned baselines, time-of-week windows

    Collective

    Sequence that signals a problem

    Gradual 0.5°C rise over 6 hours

    Sliding windows, pattern matching

    The detection strategy differs by type. Point anomalies respond well to statistical thresholds (Z-score, MAD). Contextual anomalies require partitioned baselines or temporal segmentation. Collective anomalies often require sequence models or rolling window analysis across a meaningful time horizon.

    Anomaly Detection Methods: Statistical vs. Machine Learning

    A useful framing from experienced practitioners: for most production monitoring use cases, anomaly detection reduces to good forecasting plus a threshold on the residual. Machine learning methods add complexity without always adding reliability. That does not mean ML is wrong, but it should not be the default starting point.

    Statistical Methods

    Z-score (standard score): Flags values that are more than N standard deviations from the rolling mean. Formula: (x - mean) / stddev. Strengths: simple, interpretable, no training required, works natively in SQL. Limitations: sensitive to outliers in the baseline -- the standard deviation itself gets inflated by anomalies you are trying to detect, and the method assumes approximate normality in the data.

    Median Absolute Deviation (MAD): Uses the median instead of the mean as the baseline, making it resistant to contamination. Formula: MAD = median(|xi - median(x)|); flag values where |x - median| / MAD > threshold. MAD is more robust than Z-score for sensor data with occasional extreme values precisely because the median does not get inflated by the anomalies you are looking for.

    Moving average residuals / STL decomposition: Decompose the series into trend, seasonal, and residual components, then flag residuals beyond a threshold. This is the "good forecasting plus threshold on the residual" pattern that practitioners consistently describe as the most reliable production approach. Handles strong seasonality better than Z-score but requires more setup.

    Machine Learning Methods

    Isolation Forest: Unsupervised; isolates anomalies by recursively partitioning the feature space. Works well for multivariate data, does not assume a distribution shape, and requires no labeled data. Available via scikit-learn with a simple API.

    Autoencoders / LSTM: Neural network approaches that learn to reconstruct normal patterns and flag high reconstruction error as anomalous. Higher complexity than statistical methods. Appropriate when pattern complexity justifies it and compute is available for training and inference.

    Prophet + residual analysis: Facebook Prophet fits a trend and seasonality model; residuals beyond prediction intervals are flagged as anomalous. Handles missing data and holidays well. More setup than Z-score, but handles strong seasonality better than simpler statistical methods.

    When to Use Which Method

    Scenario

    Recommended method

    Reason

    Real-time monitoring, no labeled data

    Z-score or MAD

    SQL-native, zero infrastructure overhead

    Strong seasonality (day-of-week, time-of-day patterns)

    STL residuals or Prophet + threshold

    Captures periodic patterns Z-score misses

    Multivariate sensor data, no labels

    Isolation Forest

    Handles multiple correlated metrics

    Labeled historical incidents available

    Supervised classifier

    Maximize precision/recall on known failure modes

    Production monitoring at scale

    Statistical first, ML on top

    Interpretability plus operational simplicity

    The best method is always the one whose false positive rate is tolerable in production. Start simple and add complexity only when the simpler method demonstrably fails.

    Anomaly Detection in SQL with PostgreSQL and TimescaleDB

    The following queries implement Z-score and MAD detection using standard SQL window functions and TimescaleDB's time_bucket() function. No external ML infrastructure, no Python runtime, no proprietary query language.

    TimescaleDB implements Z-score and MAD anomaly detection natively using standard SQL window functions and continuous aggregates, without any external ML infrastructure or proprietary processing engine.

    All examples below use this consistent schema:

    CREATE TABLE sensor_readings ( time TIMESTAMPTZ NOT NULL, device_id TEXT NOT NULL, metric TEXT NOT NULL, value DOUBLE PRECISION ); SELECT create_hypertable('sensor_readings', 'time');

    The create_hypertable() call enables automatic time-based partitioning, which makes the following queries more efficient than standard PostgreSQL for time-series workloads by bounding chunk scans to relevant time ranges. It also enables the time_bucket() function and columnar compression via Hypercore.

    Z-Score Detection with Window Functions

    The rolling Z-score query uses AVG() OVER and STDDEV() OVER with a time-aligned window. The key detail is PARTITION BY device_id - this computes a separate baseline for each device, which is essential for fleet monitoring where each sensor has different normal behavior.

    WITH rolling_stats AS ( SELECT time, device_id, metric, value, AVG(value) OVER ( PARTITION BY device_id, metric ORDER BY time ROWS BETWEEN 59 PRECEDING AND CURRENT ROW ) AS rolling_mean, STDDEV(value) OVER ( PARTITION BY device_id, metric ORDER BY time ROWS BETWEEN 59 PRECEDING AND CURRENT ROW ) AS rolling_stddev FROM sensor_readings WHERE time > NOW() - INTERVAL '24 hours' ), scored AS ( SELECT time, device_id, metric, value, CASE WHEN rolling_stddev = 0 THEN 0 ELSE (value - rolling_mean) / rolling_stddev END AS z_score FROM rolling_stats ) SELECT * FROM scored WHERE ABS(z_score) > 3 ORDER BY time DESC;

    The ROWS BETWEEN 59 PRECEDING AND CURRENT ROW defines a 60-reading rolling window. Adjust this to match your sampling interval and how much history you want in the baseline. The CASE WHEN rolling_stddev = 0 guard prevents division by zero when a device is reporting a constant value.

    For very large datasets, pre-computing rolling statistics in a continuous aggregate and then applying the threshold check against the aggregate is more efficient than computing window functions over raw data on every query.

    See PostgreSQL window functions for the full reference on PARTITION BY, ORDER BY, and frame specifications.

    Median Absolute Deviation (MAD) with percentile_disc()

    MAD uses the median as the baseline instead of the mean, which makes it resistant to contamination. When anomalies are already in your rolling window, Z-score's standard deviation inflates. MAD's median does not.

    WITH windowed AS ( SELECT time, device_id, metric, value, -- Compute rolling median over the last 60 readings percentile_disc(0.5) WITHIN GROUP (ORDER BY value) OVER ( PARTITION BY device_id, metric ORDER BY time ROWS BETWEEN 59 PRECEDING AND CURRENT ROW ) AS rolling_median FROM sensor_readings WHERE time > NOW() - INTERVAL '24 hours' ), deviations AS ( SELECT time, device_id, metric, value, rolling_median, ABS(value - rolling_median) AS abs_deviation FROM windowed ), mad_scores AS ( SELECT time, device_id, metric, value, rolling_median, abs_deviation, -- Compute MAD: median of absolute deviations percentile_disc(0.5) WITHIN GROUP (ORDER BY abs_deviation) OVER ( PARTITION BY device_id, metric ORDER BY time ROWS BETWEEN 59 PRECEDING AND CURRENT ROW ) AS mad FROM deviations ) SELECT time, device_id, metric, value, rolling_median, CASE WHEN mad = 0 THEN 0 ELSE abs_deviation / mad END AS mad_score FROM mad_scores WHERE mad > 0 AND abs_deviation / mad > 3.5 ORDER BY time DESC;

    For large datasets, TimescaleDB also provides approx_percentile() as a hyperfunction for approximate but significantly faster percentile computation. See why percentiles are better than averages and the full percentile_disc() reference for details on exact vs. approximate trade-offs.

    Alert Deduplication with LAG()

    If an anomaly persists for 30 minutes and you check every minute, you generate 30 alerts for one incident. This is the most common operational complaint about anomaly detection in production - not missed detections, but alert fatigue from repeat notifications for the same ongoing condition.

    The fix is to emit an alert only when a reading transitions from normal to anomalous:

    WITH anomaly_flags AS ( -- Your Z-score or MAD query from above, returning is_anomaly boolean SELECT time, device_id, metric, value, ABS((value - rolling_mean) / NULLIF(rolling_stddev, 0)) > 3 AS is_anomaly FROM rolling_stats ), transitions AS ( SELECT time, device_id, metric, value, is_anomaly, LAG(is_anomaly) OVER ( PARTITION BY device_id, metric ORDER BY time ) AS prev_is_anomaly FROM anomaly_flags ) -- Only emit when transitioning from normal to anomalous SELECT time, device_id, metric, value FROM transitions WHERE is_anomaly = TRUE AND (prev_is_anomaly = FALSE OR prev_is_anomaly IS NULL);

    This pattern does not fix bad thresholds. If your baseline is wrong, deduplication just reduces the noise from a bad detection. But it prevents repeat alerts from overwhelming on-call teams during sustained anomalous periods. More sophisticated state management (sustained anomaly duration, multi-metric correlation) typically requires a dedicated alerting layer like PagerDuty or a custom pg_notify integration.

    Real-Time Anomaly Detection with Continuous Aggregates

    Batch SQL queries against raw data detect anomalies after the fact. They do not automatically alert as data arrives. Continuous aggregates solve this by pre-computing rolling statistics on a configurable refresh cadence, so threshold checks run against materialized statistics rather than rescanning raw data on every query.

    The architecture for real-time detection looks like this:

    1. Raw sensor data lands in the sensor_readings hypertable

    2. A continuous aggregate pre-computes rolling statistics (mean, standard deviation, or MAD approximation) per device per time bucket

    3. A separate query or alerting hook checks the continuous aggregate against anomaly thresholds

    4. Alerts emit via pg_notify, a polling service, or an external integration (Grafana, PagerDuty)

    The continuous aggregate for hourly rolling statistics:

    CREATE MATERIALIZED VIEW sensor_hourly_stats WITH (timescaledb.continuous) AS SELECT time_bucket('1 hour', time) AS bucket, device_id, metric, AVG(value) AS avg_value, STDDEV(value) AS stddev_value, MIN(value) AS min_value, MAX(value) AS max_value, COUNT(*) AS reading_count FROM sensor_readings GROUP BY time_bucket('1 hour', time), device_id, metric; -- Refresh policy: keep the aggregate current, refreshing every minute SELECT add_continuous_aggregate_policy( 'sensor_hourly_stats', start_offset => INTERVAL '3 hours', end_offset => INTERVAL '1 minute', schedule_interval => INTERVAL '1 minute' );

    The schedule_interval => INTERVAL '1 minute' means anomaly detection lag is at most one minute. A longer schedule interval (one hour) is more compute-efficient for historical analysis dashboards but unsuitable for real-time alerting. Tune this to your operational latency requirement.

    Threshold checks against the continuous aggregate are fast because they read pre-computed statistics, not raw rows:

    -- Check for anomalies in the last completed hour bucket SELECT bucket, device_id, metric, avg_value, stddev_value, -- Compare current bucket stats against a 24-hour rolling baseline (avg_value - AVG(avg_value) OVER ( PARTITION BY device_id, metric ORDER BY bucket ROWS BETWEEN 23 PRECEDING AND 1 PRECEDING )) / NULLIF( STDDEV(avg_value) OVER ( PARTITION BY device_id, metric ORDER BY bucket ROWS BETWEEN 23 PRECEDING AND 1 PRECEDING ), 0 ) AS z_score FROM sensor_hourly_stats WHERE bucket >= NOW() - INTERVAL '2 hours' ORDER BY ABS(z_score) DESC NULLS LAST;

    This pattern is sufficient for most IoT monitoring and infrastructure observability workloads at moderate data rates. When to consider heavier external infrastructure: sub-second latency requirements at billions of events per day typically require a streaming pipeline (Kafka, Flink). For everything short of that threshold, continuous aggregates handle the detection layer without additional infrastructure. See how to build an IoT pipeline in PostgreSQL for the full streaming architecture.

    Python-Based Anomaly Detection with TimescaleDB

    Python is not a fallback for when SQL fails. It is the right tool when detection requires ML models that need training data, ensemble methods combining multiple signals, foundation models like Moment or TimeGPT, or team workflows that already live in Python notebooks.

    The architectural trade-off: TimescaleDB stores data in PostgreSQL (standard SQL, full ecosystem compatibility, version-controllable schema), and Python reads from it using standard database connectors. Detection models live in Python where they belong - in version control, in CI/CD, alongside the rest of application code. This is a different trade-off from InfluxDB's in-database Python engine. Tiger Data's approach keeps detection logic portable and operationally familiar. The database is the source of truth; the Python model is a function that reads from it and writes back to it.

    Python Libraries for Time Series Anomaly Detection

    PyOD - Comprehensive outlier detection toolkit. Includes Isolation Forest, COPOD, LOF, and Autoencoders. The most complete option for production use, with 40+ algorithms under a consistent API.

    ADTK (Anomaly Detection ToolKit) - Time-series-aware; handles seasonality and trend natively. Good for practitioners who want something between raw scikit-learn and a full ML framework.

    scikit-learn IsolationForest - The workhorse for unsupervised detection on multivariate data. Simple API, well-documented, no training labels required.

    statsmodels STL - For seasonal decomposition and residual analysis. The right tool when the detection problem is fundamentally about separating trend and seasonality from the residual.

    Example: Isolation Forest with psycopg2

    import psycopg2 import pandas as pd from sklearn.ensemble import IsolationForest # Connect and pull a time-windowed slice from sensor_readings conn = psycopg2.connect("postgresql://user:password@host:5432/dbname") query = """ SELECT time, device_id, metric, value FROM sensor_readings WHERE time > NOW() - INTERVAL '7 days' AND device_id = %s AND metric = %s ORDER BY time """ df = pd.read_sql(query, conn, params=('sensor_001', 'temperature')) df = df.set_index('time') # Fit Isolation Forest on the value column clf = IsolationForest(contamination=0.01, random_state=42) df['anomaly_score'] = clf.fit_predict(df[['value']]) df['is_anomaly'] = df['anomaly_score'] == -1 # Write anomaly scores back to a results table anomalies = df[df['is_anomaly']][['value', 'is_anomaly']].reset_index() with conn.cursor() as cur: for _, row in anomalies.iterrows(): cur.execute( """ INSERT INTO anomaly_results (time, device_id, metric, value, model) VALUES (%s, %s, %s, %s, %s) ON CONFLICT DO NOTHING """, (row['time'], 'sensor_001', 'temperature', row['value'], 'isolation_forest') ) conn.commit() conn.close()

    For async Python applications, asyncpg is an alternative to psycopg2 with better performance for high-throughput workloads. See time-series analysis and forecasting with Python for additional Python integration patterns with TimescaleDB, and the guide to time-series analysis in Python for background on STL decomposition and forecasting.

    For seasonality-based detection, Prophet + residual analysis is worth considering: see time-series forecasting with Prophet for a full implementation example.

    Anomaly Detection for IoT and Sensor Monitoring

    IoT anomaly detection differs from infrastructure observability in one fundamental way: cardinality. A single industrial deployment can have 10,000 sensors, each generating readings every second. Per-device baseline computation has to scale to thousands of independent time series without a query-per-device architecture.

    Hypertable partitioning by device_id handles this. Each device's data is stored in adjacent chunks, and PARTITION BY device_id in window functions or continuous aggregate policies executes across the full fleet without per-device queries. The examples in the SQL section above already use this pattern. The PARTITION BY device_id, metric clause computes independent baselines for each device-metric combination in a single pass.

    Handling Missing Data and Sensor Drift

    Sensors go offline, transmit stale values, or drift gradually over time. Gaps in the series break rolling window computations and inflate or deflate baseline statistics.

    TimescaleDB's time_bucket_gapfill() with interpolate() or locf() (last observation carried forward) handles sparse series:

    SELECT time_bucket_gapfill('1 minute', time) AS bucket, device_id, metric, interpolate(AVG(value)) AS interpolated_value FROM sensor_readings WHERE time BETWEEN '2026-04-01' AND '2026-04-02' AND device_id = 'sensor_001' GROUP BY bucket, device_id, metric ORDER BY bucket;

    interpolate() fills gaps with linear interpolation. locf() carries the last known value forward, which is better for step-function metrics where interpolation would be misleading. See stationary time-series analysis for background on why gaps affect statistical baselines and when stationarity assumptions matter for detection methods.

    Data Retention and Compression at IoT Scale

    Keeping years of raw sensor data for baseline recalibration is expensive. TimescaleDB's Hypercore compression (columnar storage) reduces historical data storage by 90-98% while keeping it fully queryable for baseline recalibration. Tiered storage moves older data to low-cost object storage, again without losing SQL access.

    This matters for anomaly detection specifically because long-horizon baselines (a full year of data to capture seasonal patterns) become practical when storage costs are not prohibitive. Relevant verticals where this architecture applies: oil and gas (pressure, temperature, flow rate monitoring), smart buildings (HVAC, energy, occupancy), energy telemetry (solar and wind generation output), and manufacturing (vibration, torque, bearing temperature).

    For regulated or air-gapped environments where cloud connectivity is not an option, TimescaleDB Enterprise provides a self-managed deployment with high availability, automated backups, and certified SCADA connectors at edge sites. The same SQL-based anomaly detection queries in this guide run on enterprise edge nodes with optional cloud sync to Tiger Cloud for cross-site analytics.

    See data historian vs. time-series database for how these patterns compare to legacy historian architectures, and best databases for IoT for database selection guidance.

    Evaluating Anomaly Detection Quality

    Accuracy is the wrong metric for anomaly detection. In a monitoring system where 99.9% of readings are normal, a model that flags nothing achieves 99.9% accuracy. The metrics that matter are precision, recall, and false positive rate.

    The trade-off is context-dependent:

    High recall, lower precision: Catch more anomalies but generate more false alerts. Appropriate for critical safety systems (nuclear plants, medical devices) where missing an event is catastrophic and alert fatigue is an acceptable operational cost.

    High precision, lower recall: Fewer false alerts but some real anomalies missed. Appropriate for non-critical monitoring where alert fatigue is a bigger operational problem than missed detections.

    Alert fatigue is the most common production failure mode. Grafana users in particular note that detection bands break and require recalibration when anomalies occur - the anomalies themselves corrupt the baseline. Three practical approaches:

    1. Use MAD instead of Z-score to reduce baseline contamination

    2. Implement alert deduplication with LAG() to suppress repeat alerts during sustained anomalous periods (shown in the SQL section above)

    3. Set conservative thresholds and tune toward precision unless the cost of missing an event is catastrophic

    If you have a labeled incident history (past outages, failure events), use it to backtest your detection rule before deploying to production. A 90-day historical replay against known incidents gives a realistic precision/recall estimate. Without labeled history, start with a threshold of 3 standard deviations for Z-score and 3.5 for MAD, then tune based on the false positive rate you observe in the first week of deployment.

    Getting Started with Time Series Anomaly Detection in Tiger Data

    The SQL examples in this guide run on any PostgreSQL installation with TimescaleDB. To follow along with a managed environment, Tiger Cloud provides a free trial with create_hypertable(), continuous aggregates, and all hyperfunctions available immediately.

    Where to go next depends on where you are in the process:

    • New to time-series analysis: Time-series analysis: what it is and how to use it

    • Working in Python: Time-series analysis and forecasting with Python

    • Building an IoT pipeline: How to build an IoT pipeline in PostgreSQL

    • Evaluating database options: Best databases for IoT

    • Going deeper on percentiles and hyperfunctions: Why percentiles are better than averages

    Start a Tiger Cloud trial or install TimescaleDB locally via Docker to run the queries in this guide against your own data.

    FAQ: Time Series Anomaly Detection

    What is time series anomaly detection?

    Time series anomaly detection is the process of identifying data points, sequences, or patterns in time-ordered data that deviate significantly from expected behavior. It differs from general outlier detection because the temporal structure of the data (seasonality, trends, autocorrelation) must be accounted for -- what looks like an outlier in a global distribution may be entirely normal at a specific time of day or day of week.

    What are the types of anomalies in time series data?

    There are three canonical types: point anomalies (single outlier readings that deviate from surrounding values), contextual anomalies (values that are normal globally but anomalous in context, like high CPU usage during non-scheduled hours), and collective anomalies (sequences that signal a problem even if individual readings are not unusual, like a slow temperature rise indicating equipment failure). Each type requires a different detection strategy.

    What is the difference between anomaly detection and outlier detection?

    Outlier detection treats data points as independent observations and flags statistical extremes. Anomaly detection in time-series accounts for temporal dependencies. A data point is evaluated against its temporal context (time of day, day of week, recent history), not just the global distribution. In practice, many time-series anomaly detection methods are adapted outlier detection methods applied to temporally-contextualized features.

    Can you do anomaly detection with SQL?

    Yes. Standard SQL window functions (AVG() OVER, STDDEV() OVER, percentile_disc()) implement Z-score and MAD detection without any external tools. TimescaleDB adds time_bucket() for time-aligned rolling windows and continuous aggregates for pre-computing rolling statistics at scale. TimescaleDB implements Z-score and MAD anomaly detection natively using standard SQL window functions, without external ML infrastructure.

    What is Z-score anomaly detection and how do you implement it?

    Z-score measures how many standard deviations a value is from the rolling mean: (x - mean) / stddev. Values beyond a threshold (commonly 2.5 to 3 standard deviations) are flagged as anomalous. In SQL, implement with AVG(value) OVER (PARTITION BY device_id ORDER BY time ROWS BETWEEN N PRECEDING AND CURRENT ROW) and STDDEV(value) OVER the same window. The PARTITION BY device_id clause computes a separate baseline per device, which is essential for fleet monitoring.

    What is Median Absolute Deviation (MAD) and why is it better than Z-score?

    MAD uses the median instead of the mean as the baseline, making it resistant to contamination: if anomalies are already present in the window, they inflate Z-score's standard deviation but not MAD's median. For sensor monitoring where anomalies may persist for several readings, MAD tends to produce fewer false negatives during sustained anomalous periods. It is implemented in SQL using percentile_disc(0.5) for both the rolling median and the median of absolute deviations.

    How do you detect anomalies in real time?

    For SQL-based detection: use a continuous aggregate to pre-compute rolling statistics on a configurable refresh cadence (for example, every minute). Query the aggregate against anomaly thresholds and emit alerts via pg_notify or an external alerting integration. For ML-based detection: run a Python detection model on a polling interval, reading windowed data from TimescaleDB via psycopg2 and writing anomaly scores back to a results table.

    What is a good anomaly detection algorithm for IoT sensor data?

    For real-time fleet monitoring: Z-score or MAD using SQL window functions, with PARTITION BY device_id to compute separate baselines per sensor. This handles thousands of devices without per-device queries. For batch analysis with labeled historical data: Isolation Forest (scikit-learn) or ADTK for time-series-aware detection. For strong seasonality: STL decomposition plus threshold on residuals.

    How do continuous aggregates help with anomaly detection?

    Continuous aggregates pre-compute rolling statistics (means, standard deviations, percentiles) and materialize them on a refresh schedule. Threshold checks against materialized statistics are dramatically faster than recomputing window functions over raw data on every query, which matters at IoT scale with thousands of devices. They also enable the "compute once, alert many times" pattern: a single continuous aggregate can power both a Grafana dashboard and an automated alerting check.

    What Python libraries are available for time series anomaly detection?

    The most widely used: PyOD (comprehensive toolkit, 40+ algorithms), ADTK (time-series-aware, handles seasonality natively), scikit-learn IsolationForest (simple, unsupervised, no training labels required), and statsmodels STL (seasonal decomposition for the forecasting plus residual pattern). For foundation model approaches, Moment and TimeGPT are emerging options for zero-shot detection without custom training.

    How do you avoid false positives in anomaly detection?

    Three practical approaches: (1) use MAD instead of Z-score to reduce baseline contamination from existing anomalies; (2) implement alert deduplication with LAG() to suppress repeat alerts during sustained anomalies; (3) tune thresholds toward precision (fewer alerts, more real ones) rather than recall, unless the cost of missing an event is catastrophic. Backtesting against a labeled incident history before deploying to production is the most reliable way to set realistic thresholds.

    How does STL decomposition work for anomaly detection?

    STL (Seasonal-Trend decomposition using Loess) separates a time series into trend, seasonal, and residual components. Anomaly detection then applies a threshold to the residual component. Values where the residual exceeds N standard deviations from its own distribution are flagged. This is the "good forecasting plus threshold on the residual" pattern that experienced practitioners consistently describe as the most reliable production approach for series with strong periodicity (daily cycles, weekly patterns, seasonal trends).