---
title: "From 6 Seconds to Under 100ms: How the Embodied Carbon Observatory Separates Grid Improvement From Real Decarbonization With Tiger Data"
published: 2026-04-23T09:00:00.000-04:00
updated: 2026-04-23T11:31:30.000-04:00
excerpt: "How the Embodied Carbon Observatory uses TimescaleDB to cut queries from 6s to under 100ms, separating real decarbonization from grid improvement."
tags: Dev Q&A, Tiger Data
authors: Andrew Stebbins
---

> **TimescaleDB is now Tiger Data.**

_A solo developer turns 40,000 Environmental Product Declarations and 25+ years of EPA grid data into an interactive map that answers a question no ESG rating agency publishes: is this plant actually decarbonizing, or did the grid just get cleaner?_

_This is an installment of our_ [_Community Member Spotlight series_](https://www.tigerdata.com/blog/community-member-spotlight-series-featuring-your-timescaledb-projects)_, in which we invite our open source community members to share their work, spotlight what they have built, and inspire others with new ways to use TimescaleDB._

_Today we hear from Ankur Podder, founder of_ [_dexdogs.earth_](https://dexdogs.earth/) _and solo developer of the Embodied Carbon Observatory, an open source temporal observatory for embodied carbon in US building materials. The observatory brings an attribution query that used to take 6 to 8 seconds on vanilla Postgres down to under 100ms on open source TimescaleDB, which is what lets it power an interactive map that answers the Buy Clean question: when a concrete plant's carbon number drops, is the manufacturer improving, or did the regional grid get cleaner?_

## About the Embodied Carbon Observatory

Architects, procurement managers, and sustainability researchers are under growing pressure from Buy Clean procurement policies, LEED v5 requirements, and corporate net zero commitments to prove that the materials going into their projects are actually decarbonizing. The evidence they are handed is a stack of Environmental Product Declarations (EPDs), one per plant per product, treated as static PDFs. You look up the Global Warming Potential (GWP) number, and you move on.

That interpretation has a blind spot. EC3, the largest open EPD database, operated by Building Transparency, has been storing every version of every EPD since launch, creating a sparse, irregular time-series of carbon intensity per facility going back to 2019 across 40,000 records. When a concrete plant's GWP drops 12% between EPD version 1 and version 2, two different stories look identical in a spreadsheet: either the manufacturer improved the process (changed fuel, added supplementary cementitious materials, raised efficiency), or the regional electricity grid got cleaner on its own and the A3 manufacturing-stage number fell without the plant doing anything.

The Embodied Carbon Observatory tracks GWP trajectories for 15,000+ US manufacturing plants across 40,000 EPDs, joins them to 25+ years of EPA eGRID grid carbon data, and computes, per plant and per period, what percentage of a GWP change came from the grid versus the manufacturing process. The output is an interactive map where every plant click returns a counterfactual time series and a verdict: process\_improvement, grid\_improvement, mixed, or worsening.

The observatory is built by Ankur Podder, a solo developer at dexdogs. Every line of code and every data source is open source under MIT license, and every input (EC3, EPA eGRID, the Federal LCA Commons) is free and reproducible.

## The Challenge: Attribution Is a Time-Series Join Problem

To answer whether a plant is genuinely decarbonizing, the observatory has to join two time-series that move at completely different cadences: EPD version history, which is irregular and arrives roughly every five years per plant, against EPA eGRID grid carbon intensity, which is annual per subregion. Neither dataset answers the question on its own. The join, with a geographic lookup so each plant is matched to its eGRID subregion, is what produces the attribution signal.
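To make the cadence mismatch concrete, here is a minimal Python sketch of that join. All names and values are illustrative, not the observatory's actual schema: each irregular EPD version is matched to the annual grid intensity of its issue year in the plant's subregion.

```python
from datetime import date

# Annual grid carbon intensity per eGRID subregion (illustrative values).
grid_carbon = {
    ("RFCE", 2019): 710.0,  # lb CO2e per MWh
    ("RFCE", 2021): 640.0,
    ("RFCE", 2024): 560.0,
}

# Irregular EPD version history for one plant (illustrative values).
epd_versions = [
    {"plant": "Plant A", "subregion": "RFCE",
     "issued_at": date(2019, 6, 1), "gwp": 410.0},
    {"plant": "Plant A", "subregion": "RFCE",
     "issued_at": date(2024, 3, 15), "gwp": 352.0},
]

def join_epd_to_grid(epds, grid):
    """Match each EPD version to the grid intensity of its issue year."""
    joined = []
    for e in epds:
        rate = grid.get((e["subregion"], e["issued_at"].year))
        joined.append({**e, "grid_rate": rate})
    return joined

rows = join_epd_to_grid(epd_versions, grid_carbon)
```

Only where the two series line up, per subregion and year, does a row carry both a GWP value and a grid rate, and only those rows can feed an attribution calculation.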

On vanilla Postgres, that join is a full table scan on epd\_versions joined to grid\_carbon, filtered by a lat/lng bounding box, every time a user clicks a plant. At current data size it takes 6 to 8 seconds. The query has to run inline on every map click to power an interactive experience. A 6 to 8 second click is not an interactive experience. It is an overnight batch job pretending to be a map.

The problem is also going to get worse. EC3 publishes new EPDs daily as manufacturers renew their declarations under Buy Clean pressure. Over the next two to three years, as Buy Clean thresholds tighten across more states, the rate of EPD updates will accelerate and the observatory needs append performance that keeps up without a nightly rebuild.

## Why Tiger Data: Architecture-First From Day One

Ankur found Tiger Data while evaluating time-series options inside the Postgres ecosystem, and he eliminated three alternatives before writing production code.

-   InfluxDB was eliminated immediately. No SQL, no PostGIS, no joins. The attribution query requires joining EPD history to grid carbon by geographic subregion, and InfluxDB cannot do that join.
-   Supabase was eliminated on performance. Without hypertables, continuous aggregates, or time\_bucket functions, the attribution query is a full table scan on every request. The query ran at current scale, but slowly, and would break outright as the dataset grew.
-   DuckDB was eliminated on ingestion. The analytical performance is strong, but EC3 publishes new EPDs daily, and DuckDB is not built for a live append workload running alongside analytical queries.

TimescaleDB was the only option that combined Postgres compatibility, [hypertables](https://www.tigerdata.com/docs/reference/timescaledb/hypertables) for time-series performance, [continuous aggregates](https://www.tigerdata.com/docs/reference/timescaledb/continuous-aggregates) for pre-computed trends, and live ingestion for new EPDs as they are published. Staying in the Postgres ecosystem also meant that when the observatory is ready to add PostGIS for native spatial-temporal queries, it is a single extension away. No ETL to a separate system, no data duplication, no second connection pool.

> TimescaleDB was the only option that combined Postgres compatibility, hypertables for time-series performance, continuous aggregates for pre-computed trends, and live ingestion for new EPDs as they are published. _- Ankur Podder, Founder, dexdogs.earth_

## The Embodied Carbon Observatory Stack

The observatory's pipeline starts with three free, public data sources. The EC3 API supplies EPD records, plant locations, and the full version history of each declaration. EPA eGRID supplies annual grid carbon intensity and the renewable mix per subregion going back to 1996. The Federal LCA Commons (USDA, NREL, EPA) supplies upstream process data for supply chain attribution.

A Python ETL pipeline normalizes all three sources, assigns each plant to its eGRID subregion via a lat/lng to subregion lookup, and writes time-series rows into a TimescaleDB-extended Postgres database running on Tiger Cloud's free tier. Inside it, epd\_versions and grid\_carbon are hypertables partitioned on issued\_at. A continuous aggregate, gwp\_attribution, pre-computes the per-plant and per-period split between grid and process contribution, refreshing automatically as new EPDs land. Plant location is stored as indexed lat/lng columns so radius filters stay fast without a heavy spatial extension.
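That radius filter reduces to a great-circle distance check over the stored lat/lng columns. A minimal Python sketch of the idea, with made-up plant coordinates and a 200-mile radius matching the observatory's central query (the function names are illustrative, not the project's code):

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_MI = 3958.8

def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance in miles between two lat/lng points."""
    dlat = radians(lat2 - lat1)
    dlng = radians(lng2 - lng1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlng / 2) ** 2)
    return 2 * EARTH_RADIUS_MI * asin(sqrt(a))

def plants_within(plants, lat, lng, radius_mi=200.0):
    """Keep only the plants inside radius_mi of (lat, lng)."""
    return [p for p in plants
            if haversine_miles(lat, lng, p["lat"], p["lng"]) <= radius_mi]

# Illustrative plants: one near Philadelphia, one near Chicago.
plants = [
    {"name": "Plant A", "lat": 39.95, "lng": -75.17},
    {"name": "Plant B", "lat": 41.88, "lng": -87.63},
]
near_nyc = plants_within(plants, 40.7128, -74.0060)  # 200 miles around NYC
```

An index on the lat/lng columns lets the database prune most candidates with a cheap bounding-box check before any distance math runs, which is why the filter stays fast without a spatial extension.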

A FastAPI layer on Railway sits between the database and the frontend as a thin, stateless query service. The frontend is Next.js on Vercel. The interactive map is rendered in Mapbox GL with verdict color coding and radius filters, the GWP versus counterfactual charts are drawn in D3.js, and the upstream supply chain dependency graph is rendered with Cytoscape.js. The entire project is open source under MIT license, and the GitHub repo is public.

![](https://storage.ghost.io/c/6b/cb/6bcb39cf-9421-4bd1-9c9d-fa7b6755ba0e/content/images/2026/04/td-dexdogs-spotlight-diagram-blog-v1.svg)

__Embodied Carbon Observatory data flow: EC3, EPA eGRID, and the Federal LCA Commons land in open source TimescaleDB through a Python ETL. Hypertables and a continuous aggregate power the attribution query that FastAPI serves to the Next.js map.__

## Results: Interactive Attribution, Not Overnight Batch

### From 6 to 8 Seconds to Under 100ms

The central query in the observatory finds all concrete plants within 200 miles of a location, computes their annual average GWP, and returns both the GWP delta and the grid carbon delta year over year. Those are the raw ingredients the frontend needs to render the attribution verdict.

On vanilla Postgres (no hypertable, no continuous aggregate) the query runs 6 to 8 seconds. It is a full sequential scan on epd\_versions joined to grid\_carbon with a lat/lng bounding box filter. With TimescaleDB installed, a hypertable on issued\_at, the gwp\_attribution continuous aggregate, and an index on plant location, the same query returns in under 100ms. The SQL does not change. The performance is what makes it possible to run the attribution on every map click instead of as an overnight batch job.

> On vanilla Postgres that join is a full table scan every time. 6 to 8 seconds at current data size, and worse as the dataset grows. With TimescaleDB the same query runs under 100ms. _- Ankur Podder_

### The Counterfactual: What Would GWP Be If Only the Grid Changed?

The attribution signal comes out of a single query that builds a counterfactual time-series per plant: what the plant's GWP would be today if the manufacturing process had not changed at all and only the regional grid had gotten cleaner. The difference between the actual GWP and the counterfactual is the genuine process signal. Subtracting it out is what makes it possible to separate a manufacturer that is actually investing in lower-carbon inputs from one that looks green on paper because the state built more solar.

The shape of that query, running on TimescaleDB:

```sql
SELECT p.name, p.state, p.egrid_subregion,
       e.issued_at::date AS epd_date,
       ROUND(e.gwp_total::numeric, 2) AS gwp_actual,
       -- counterfactual: what would GWP be if only the grid changed?
       ROUND((
         FIRST_VALUE(e.gwp_total) OVER w *
         (g.co2e_rate_lb_per_mwh /
          NULLIF(FIRST_VALUE(g.co2e_rate_lb_per_mwh) OVER w, 0))
       )::numeric, 2) AS gwp_counterfactual_grid_only
FROM epd_versions e
JOIN plants p ON p.id = e.plant_id
LEFT JOIN grid_carbon g ON g.egrid_subregion = p.egrid_subregion
                       AND EXTRACT(year FROM g.year) = EXTRACT(year FROM e.issued_at)
WINDOW w AS (PARTITION BY p.id, e.declared_unit ORDER BY e.issued_at
             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
ORDER BY p.name, e.issued_at;
```

Every plant and every period returns a verdict: process\_improvement if the process share of the improvement is at least 50%, grid\_improvement if the grid share is, mixed if neither dominates, and worsening if GWP went up. That verdict is what the map color codes on every click.
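A rough Python sketch of how such a verdict could fall out of the actual-versus-counterfactual numbers. The 50% rule follows the text above; the tie handling and the function name are assumptions, not the observatory's code:

```python
def attribution_verdict(gwp_baseline, gwp_actual, gwp_counterfactual):
    """Classify a GWP change as process-driven, grid-driven, or mixed.

    gwp_counterfactual is what GWP would be if only the grid had changed.
    """
    total_improvement = gwp_baseline - gwp_actual
    if total_improvement <= 0:
        return "worsening"
    grid_share = (gwp_baseline - gwp_counterfactual) / total_improvement
    process_share = (gwp_counterfactual - gwp_actual) / total_improvement
    if process_share >= 0.5 and process_share > grid_share:
        return "process_improvement"
    if grid_share >= 0.5 and grid_share > process_share:
        return "grid_improvement"
    return "mixed"

# Baseline 400, actual 340, grid-only counterfactual 380: the grid explains
# 20 of the 60-point drop, the process explains 40.
verdict = attribution_verdict(400.0, 340.0, 380.0)
```

In that example the process share is two-thirds of the improvement, so the plant earns a process\_improvement verdict even though the grid also got cleaner.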

## Looking Ahead

The next unlock is adding PostGIS so spatial plus temporal queries stay in a single SQL statement. Because PostGIS and TimescaleDB run inside the same Postgres engine, the observatory will be able to answer questions like “all plants within 200 miles of a construction site whose process-driven GWP improvement exceeds 15% since 2020” or “nearest low-carbon alternative to this plant ranked by GWP trajectory, not just current GWP” directly from the database, without swapping the lat/lng filter for a separate geospatial service. The same query shape surfaces eGRID subregions where grid improvement is masking process deterioration, which is a view no ESG rating agency currently publishes.

After that, Ankur plans to extend ingestion to steel, timber, and insulation (all already covered in EC3), roll plant-level attribution up to the manufacturer fleet level for capital markets use cases, and move ingestion to a daily cron. As Buy Clean thresholds tighten across more states, the rate of new EPD publications will accelerate. TimescaleDB's append performance is what keeps continuous tracking tractable at that scale, so the observatory can become the authoritative dataset for whether US building materials are measurably decarbonizing, not just claiming to be.

* * *

_Are you building something interesting with TimescaleDB? Get featured! The Community Member Spotlight series is where we highlight work like Ankur's: real problems solved on open source TimescaleDB by developers in the community. If you've shipped something on TimescaleDB, a weekend project, a research tool, or production infrastructure, I'd love to feature yours next in the series.  
  
The one requirement is that the project runs on open source TimescaleDB (self-hosted or on the free tier both work). Fill out_ [_this short form_](https://forms.gle/JrDojxNHop49wGAr9) _and I'll follow up to talk through what you've built._