---
title: "What Is a Data Historian?"
description: "Data historians vs. time-series databases: how they work, where they're used in oil & gas and manufacturing, and when modern alternatives make sense."
section: "Postgres for IoT"
---

> **TimescaleDB is now Tiger Data.**

A data historian, also called an operational historian or process historian, is specialized software designed to collect, store, and retrieve time-stamped process data from industrial equipment: sensors, PLCs, distributed control systems (DCS), and SCADA systems. It connects to industrial devices via protocols like OPC-UA, OPC-DA, and Modbus, storing every reading as a tag-value pair with a timestamp and quality code. Data historians are the standard data storage layer in oil and gas, manufacturing, utilities, and pharmaceutical process environments.

Historians emerged in the 1980s as industrial control systems generated more data than operators could manually log. Purpose-built for plant-floor environments, they became foundational infrastructure across capital-intensive industries where process data is both safety-critical and heavily regulated. Today, a single midstream oil and gas operator may run a historian database tracking millions of data points per day across thousands of field sensors.

## How a Data Historian Works

Data historians share a three-layer architecture: data acquisition, tag-based storage, and retrieval with compression.

### Data Acquisition

Historians connect to industrial equipment through standardized protocols. OPC-UA (OPC Unified Architecture) is the current interoperability standard; OPC-DA is its older predecessor, still common in legacy installations. Modbus, DNP3, and proprietary manufacturer interfaces are also in wide use. In environments built around SCADA systems, the historian functions as a SCADA historian, collecting and archiving the real-time process data that SCADA supervisory layers generate. The historian's acquisition layer, sometimes called an interface or collector, polls devices or receives data pushes at defined intervals, often once per second or faster for critical measurements.

Each data point arrives as a "tag" value. A historian receiving data from a gas compression station might collect hundreds of tags simultaneously: suction pressure, discharge pressure, flow rate, motor temperature, and valve position, each at its own polling frequency.
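The acquisition cycle can be sketched in a few lines of Python. This is a minimal illustration, not a real collector: `read_sensor` is a hypothetical stand-in for an OPC-UA or Modbus client call, and the tag names are illustrative.

```python
import random
import time
from dataclasses import dataclass

@dataclass
class Reading:
    tag: str          # hierarchical tag name
    timestamp: float  # time of the reading (epoch seconds)
    value: float
    quality: str      # "Good", "Bad", "Uncertain", or "Substituted"

def read_sensor(tag: str) -> float:
    """Hypothetical stand-in for an OPC-UA/Modbus read of one tag."""
    return random.uniform(70.0, 75.0)

def poll_once(tags: list[str]) -> list[Reading]:
    """One acquisition cycle: stamp each tag's value with time and quality."""
    now = time.time()
    return [Reading(t, now, read_sensor(t), "Good") for t in tags]

readings = poll_once([
    "SITE_01.UNIT_A.PUMP_101.FLOW_RATE",
    "COMPRESSOR_07.MOTOR.TEMP_DEGC",
])
```

A production collector would run this loop on a schedule per tag group, at the polling frequency each measurement requires.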

### The Tag-Based Data Model: What Developers Need to Know

The tag-based data model is the single most important concept for developers approaching historian data for the first time.

A tag is a named data point corresponding to a single sensor or measurement. In OSIsoft PI (now AVEVA PI System), the dominant historian platform, a tag follows a hierarchical naming convention that maps to physical plant structure:

```
SITE_01.UNIT_A.PUMP_101.FLOW_RATE
WELLHEAD_03.TUBING.PRESSURE_PSI
COMPRESSOR_07.MOTOR.TEMP_DEGC
```

Each tag has defined attributes: TagName, engineering units (EngUnits), point type (Float32, Int32, Digital, String), a description, and range limits. When a sensor reading arrives, the historian stores it as: **tag name + timestamp + value + quality code**. That's the entire data model.

Where a SQL database has tables, columns, and rows, a historian has tags. No schema migration when you add a new sensor; you add a new tag. No foreign keys, no joins, no relational structure. That model is extremely efficient for high-frequency writes on thousands of measurement streams, but it makes cross-tag analytics and integration with modern data platforms harder than it needs to be.

A developer thinking in SQL would model this as a two-column table, timestamp and value, keyed by tag name. That is essentially what a historian does internally, but optimized end-to-end for OT protocols, industrial compression, and high-frequency ingestion. Tiger Data’s TimescaleDB uses the same underlying model, a table with a time column and a measurement column, but builds on PostgreSQL, so the data is immediately queryable with standard SQL and accessible to any tool in the modern data stack.
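An in-memory sketch of that model, assuming illustrative tag names: one (timestamp, value) series per tag, keyed by tag name. The point is that adding a sensor is just writing to a new tag name, with no schema migration.

```python
from collections import defaultdict

# In-memory equivalent of the historian's tag model:
# one (timestamp, value) series per tag, keyed by tag name.
store: dict[str, list[tuple[float, float]]] = defaultdict(list)

def write(tag: str, ts: float, value: float) -> None:
    """Writing to an unseen tag name creates it -- no schema change."""
    store[tag].append((ts, value))

write("WELLHEAD_03.TUBING.PRESSURE_PSI", 0.0, 1450.2)
write("WELLHEAD_03.TUBING.PRESSURE_PSI", 1.0, 1449.8)
write("COMPRESSOR_07.MOTOR.TEMP_DEGC", 0.0, 64.1)  # new tag, no migration
```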

### Storage and Compression

Raw historian data at 1-second intervals for 10,000 tags works out to more than 300 billion readings per year. Compression is not optional.

Historians apply two primary compression algorithms before writing to disk. Exception reporting stores a new value only when the reading changes beyond a configured deadband. If a temperature sensor holds at 72.3 degrees for 30 seconds, only the first reading is stored. Swinging door compression fits a linear segment through data points and stores only the segment endpoints when readings stay within a tolerance band. Together, these techniques can reduce storage requirements by an order of magnitude or more while preserving enough fidelity for process analysis.
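Exception reporting is simple enough to sketch directly. A minimal deadband filter in Python (real historians tune deadbands per tag and combine this with swinging door compression, which is more involved):

```python
def exception_filter(samples, deadband):
    """Exception reporting: keep a sample only when it moves more than
    `deadband` away from the last *stored* value. First sample always kept."""
    stored = []
    for ts, value in samples:
        if not stored or abs(value - stored[-1][1]) > deadband:
            stored.append((ts, value))
    return stored

# A temperature holding near 72.3 compresses down to the points that moved.
raw = [(0, 72.3), (1, 72.3), (2, 72.4), (3, 72.3), (4, 73.1), (5, 73.1)]
kept = exception_filter(raw, deadband=0.5)
# kept == [(0, 72.3), (4, 73.1)] -- six readings stored as two
```

Here the deadband of 0.5 degrees drops the small fluctuations; only the initial reading and the genuine move to 73.1 survive.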

Retrieval works by tag name and time range. Historians support trend queries (all values for tag X between T1 and T2), downsampled queries (average, max, or last value per hour), and interpolated queries (the value at an exact timestamp, calculated from the stored points on either side).
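An interpolated query can be sketched as linear interpolation between the stored points on either side of the requested timestamp. This toy version assumes the series is sorted by time and the timestamp falls within the stored range:

```python
from bisect import bisect_left

def interpolate_at(series, ts):
    """Interpolated query: value at an exact timestamp, linearly
    interpolated from the stored points on either side.
    Assumes `series` is sorted and `ts` lies within its time range."""
    times = [t for t, _ in series]
    i = bisect_left(times, ts)
    if i < len(times) and times[i] == ts:
        return series[i][1]  # exact hit on a stored point
    t0, v0 = series[i - 1]
    t1, v1 = series[i]
    return v0 + (v1 - v0) * (ts - t0) / (t1 - t0)

# Compression stored only the endpoints; interpolation recovers the middle.
series = [(0.0, 72.3), (4.0, 73.1)]
value = interpolate_at(series, 2.0)  # halfway between the stored points
```

This is why compressed storage and interpolated retrieval go together: the historian discards in-band readings on write and reconstructs approximate values on read.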

## What Data Does a Historian Store?

Historians store three categories of industrial process data.

**Analog data** covers continuous measurements: temperature, pressure, flow rate, liquid level, vibration frequency, electrical power. Analog readings are the dominant data type. Most process data is continuous and arrives at 1-second or sub-second intervals.

**Digital (discrete) data** covers binary on/off signals from switches, valves, and motors: pump running or stopped, valve open or closed, high-pressure alarm active or inactive. Digital readings occur less frequently than analog but matter for event correlation and root cause analysis. When an alarm triggered, what were the analog conditions at that moment? The historian is the system that answers that question.

**Quality data** is what distinguishes a historian from a simple time-series log. Every reading carries a quality code (Good, Bad, Uncertain, or Substituted) that reflects sensor health or communication status at the time of the reading. That metadata is what makes historian data auditable for regulated industries like pharmaceuticals and oil and gas.

Some historians also support event frames or batch records: structured records that capture the start, end, and conditions of a defined process event. This is standard in pharmaceutical manufacturing for FDA 21 CFR Part 11 compliance, where a batch record must document the exact conditions under which a drug was produced.

## Where Data Historians Are Used

### Oil and Gas

Oil and gas is the largest and most demanding application domain for data historians. Wellhead telemetry captures pressure, temperature, and flow at each producing well. Pipeline monitoring tracks pressure transients for leak detection and compressor station performance. Refineries run process historians on process units monitoring thousands of measurement points simultaneously.

WaterBridge, a large produced-water management company operating across the Permian Basin, shows what scale looks like in practice. In a [<u>real-world example from oil and gas</u>](https://www.tigerdata.com/blog/how-waterbridge-uses-timescaledb-for-real-time-data-consistency), WaterBridge handles historian-class real-time data consistency challenges - thousands of sensor readings per second across field assets, with requirements for both data completeness and low-latency access for operational decisions. Teams at that scale need infrastructure that can ingest at historian rates and serve queries at operational speed.

### Manufacturing and Process Industries

Manufacturing was the original home of data historians. Quality control monitoring on production lines, batch process tracking in chemical, food, and beverage plants, and predictive maintenance on motors and compressors via vibration and temperature trending are all standard historian use cases. Automotive, semiconductor, pulp and paper, and metals manufacturers all carry substantial historian deployments.

### Utilities and Energy

Power grid substations, generation plants (gas turbines, wind farms, hydroelectric), and transmission infrastructure run historians to track grid frequency, voltage, and generation output. As energy grids add distributed generation and smart meters, the number of monitored data points keeps growing.

### Pharmaceuticals

FDA 21 CFR Part 11 compliance requires validated, tamper-evident data records for batch manufacturing. Historians are the standard audit trail mechanism for pharmaceutical process environments. The quality code infrastructure - Good, Bad, Uncertain, Substituted - maps directly to the regulatory requirement to document data integrity.

### Building Automation

HVAC systems, energy management platforms, and building management systems (BMS) use historians at smaller scale for energy optimization and facilities reporting - an application area that has grown as building operators pursue sustainability targets.

## Major Data Historian Vendors

The historian market has been dominated by a small set of purpose-built vendors for decades.

**OSIsoft PI / AVEVA PI System** is the market leader by installed base, originally developed by OSIsoft (founded 1980) and deployed across oil and gas, utilities, and manufacturing worldwide. OSIsoft was acquired by AVEVA in 2021, and AVEVA was subsequently acquired by Schneider Electric in 2023. Engineers on legacy PI deployments are actively evaluating their options as PI ProcessBook has been deprecated and the product roadmap keeps shifting under new ownership.

**GE Proficy Historian** is GE's historian, deployed across manufacturing and power generation. Following GE's 2024 industrial spinoff into GE Vernova, the Proficy software business was sold to private equity firm TPG in March 2026 and now operates as an independent company.

**AspenTech InfoPlus.21 (IP21)** is common in chemical, refining, and petrochemical industries. Part of AspenTech's broader process optimization platform, it carries higher licensing costs and significant professional services dependency.

**Honeywell PHD (Process History Database)** is prevalent in refining and chemical plants running Honeywell DCS equipment. PHD is tightly coupled to the Honeywell ecosystem, which makes it a natural default for Honeywell shops but creates friction in mixed-vendor environments.

**Modern open alternatives** - cloud-native and open-source time-series databases (TSDBs) including TimescaleDB, InfluxDB, and QuestDB - are increasingly deployed for historian-class workloads. Many industrial teams run them alongside existing historians rather than as direct replacements, routing analytics and reporting to the TSDB while keeping the historian in place for OT/control-layer collection. It’s worth noting that in industrial circles, time-series databases are sometimes referred to as historians.

## Limitations of Traditional Data Historians

Historians were purpose-built for their original problem and solved it well. The limitations that have emerged are structural. They reflect design decisions made in the 1980s and 1990s that are increasingly mismatched with modern industrial data requirements.

**Proprietary architecture and vendor lock-in.** Tag naming conventions, data formats, and APIs vary per vendor and are not standardized. Getting historian data into a data lake, ML pipeline, or cloud analytics platform requires custom connectors, middleware, or historian vendor tooling.

**Scalability ceilings.** Traditional historians were architected for thousands of tags at 1-second intervals. Modern IoT deployments can generate hundreds of thousands of tags at sub-second intervals. Some historians handle this at high hardware cost; others hit architectural limits that more hardware cannot fix.

**SQL and analytics gaps.** Historians have proprietary query languages - PI SQL and Asset Framework queries for AVEVA PI, vendor-specific syntax for GE Proficy. None are directly queryable via standard SQL, which cuts off BI tools, Jupyter notebooks, and standard data engineering pipelines.

**Cloud-native deployment constraints.** Legacy historians were designed for on-premises OT environments, often air-gapped from IT networks. Cloud deployment requires architectural changes those products were not built to support.

**Licensing and total cost.** Enterprise historian licensing is priced per tag, with annual maintenance fees. For greenfield IIoT deployments adding thousands of new sensor streams, that model becomes cost-prohibitive fast.

## Data Historian vs. Time-Series Database

The boundaries between data historians and [<u>time-series databases</u>](https://www.tigerdata.com/learn/time-series-database-what-it-is-how-it-works-and-when-you-need-one) are blurring. Modern TSDBs are adding OT protocol connectors and industrial data model support; some legacy historians are adding SQL interfaces and cloud deployment options. The underlying design philosophies remain distinct.

| | **Historian** | **Time-Series Database** |
| --- | --- | --- |
| **Data model** | Tag-based (tag name + timestamp + value + quality code) | Flexible schema; typically table with time column + metrics |
| **Query language** | Proprietary (PI SQL, AF queries, vendor-specific syntax) | Standard SQL or SQL-adjacent |
| **OT protocol support** | Native (OPC-UA, OPC-DA, Modbus, DNP3 built in) | Requires connectors or middleware |
| **Cloud deployment** | Limited; designed for on-premises OT environments | Cloud-native; horizontal scaling |
| **Licensing** | Per-tag; enterprise contracts | Usage-based or open source |

Historians are built specifically for industrial process data, with native OT protocol connectivity, tag-based data models, and OT-specific compression baked in. Time-series databases are general-purpose - they handle industrial and non-industrial workloads but require more configuration for OT data ingestion.

Historians also often carry OT-specific compliance certifications relevant to regulated industries. TSDBs typically do not come pre-certified, though data stored in them can satisfy compliance requirements when the surrounding processes are validated correctly.

Tiger Data’s TimescaleDB is built on PostgreSQL, which means historian-class workloads run on infrastructure engineers already know - standard SQL, standard connectors, no proprietary query language. [<u>Hypertables</u>](https://www.tigerdata.com/docs/use-timescale/latest/hypertables) (time-partitioned table structures) handle high-frequency ingestion and keep queries fast as data grows; [<u>continuous aggregates</u>](https://www.tigerdata.com/docs/use-timescale/latest/continuous-aggregates/about-continuous-aggregates) (precomputed rollups that refresh automatically) handle the downsampling and trend queries that historians compute with proprietary functions. Teams in oil and gas and energy have used Tiger Data to handle historian-class telemetry volumes - thousands of sensor readings per second - while keeping the data accessible to SQL-based analytics workflows.

Many teams take an augment approach rather than full replacement. The historian stays in place for OT/control-layer data collection and compliance; a modern TSDB receives a copy or subset of the data for analytics, reporting, and integration with IT systems. That sidesteps the operational disruption of ripping out production infrastructure while solving the SQL and integration gaps.

For a detailed breakdown of when each approach makes sense, see [<u>Moving Past Legacy Systems: Data Historian vs. Time-Series Database</u>](https://www.tigerdata.com/learn/moving-past-legacy-systems-data-historian-vs-time-series-database). For how modern energy teams are navigating this shift, see [<u>IoT Energy Data at Scale: Engineering Solutions Beyond Legacy Historians</u>](https://www.tigerdata.com/blog/iot-energy-data-at-scale-engineering-solutions-beyond-legacy-historians).

## Frequently Asked Questions

### What is a data historian?

A data historian is specialized software designed to collect, store, and retrieve time-stamped process data from industrial equipment - sensors, PLCs, DCS, and SCADA systems. It uses a tag-based data model where each measurement point (temperature gauge, flow sensor, valve position) gets a unique tag name, and every reading is stored as a tag plus timestamp plus value plus quality code. Data historians are the standard data storage layer in oil and gas, manufacturing, utilities, and pharmaceutical process environments.

### How does a data historian work?

Historians connect to industrial equipment via protocols like OPC-UA, OPC-DA, and Modbus. Data arrives continuously from sensors as tag values at high frequency, often once per second or faster. Before writing to storage, the historian applies compression algorithms (typically exception reporting or swinging door compression) to reduce storage volume while preserving enough fidelity for process analysis. Data is then retrievable by tag name and time range, with support for trend analysis, quality filtering, and downsampled queries.

### What is the difference between a data historian and a time-series database?

Historians and time-series databases are converging, but the design differences still matter. Historians are built specifically for industrial process data - native OT protocol connectors, proprietary tag-based schemas, built-in OT-specific compression. Time-series databases are general-purpose: they handle industrial and non-industrial workloads and are queryable with standard SQL. Modern TSDBs like Tiger Data (built on PostgreSQL) are used for historian-class workloads because they offer SQL access alongside high-frequency time-series ingestion, without per-tag licensing.

### What industries use data historians?

Data historians are most common in oil and gas (wellhead telemetry, pipeline monitoring, refinery process control), manufacturing (quality monitoring, batch processing, predictive maintenance), utilities and power generation (grid monitoring, substation data), and pharmaceuticals (FDA-compliant batch records). Building automation and transportation infrastructure are growing application areas.

### What are the limitations of traditional data historians?

Traditional historians face four structural limitations as industrial data volumes grow. First, proprietary data formats and query languages create integration friction with modern analytics tools. Second, per-tag licensing models become cost-prohibitive at IIoT scale. Third, legacy historians were designed for on-premises OT environments while cloud deployment and IT/OT integration require architectural work those products were not built to support. Fourth, SQL support is limited or absent, which cuts historian data off from standard data engineering and machine learning workflows.

### What is a modern alternative to a data historian?

For new deployments and historian modernization, most teams look at time-series databases. InfluxDB, QuestDB, and TimescaleDB each offer SQL or SQL-adjacent query support, cloud-native deployment, and horizontal scalability without per-tag licensing. Tiger Data, built on PostgreSQL, is used by teams in oil and gas and energy for historian-class telemetry with full SQL access, hypertable partitioning, and continuous aggregates for rollups. Many teams keep their existing historian for OT/control-layer collection while routing analytics and reporting to a modern TSDB.

### Where can I find additional resources about using modern historians like TimescaleDB?

The Tiger Data blog has a substantial list of articles and tutorials on using [<u>historians and databases for IIoT</u>](https://www.tigerdata.com/blog/search?query=iiot).

## Historian-Class Workloads on PostgreSQL

Tiger Cloud is a fully managed time-series database built on PostgreSQL. It handles the ingest rates, compression, and rollup queries that historian-class workloads require, with standard SQL and no per-tag licensing.

[<u>Start a free Tiger Cloud trial</u>](https://www.tigerdata.com/cloud)