May 20, 2025
Posted by Jônatas Davi Paganini
If you're an investor actively trading on decentralized exchanges, you’re swapping tokens daily and trying to maximize profits while managing risk. An effective swap analytics process will enable you to stop trading blind: You will harness analytics to make decisions based on results, not hunches.
Swap analytics examines the pricing, risk, and performance of swap derivatives to improve trading decisions and risk management. It enables you to understand whether your strategies are working, which tokens are performing best, and your overall profit or loss.
However, this analysis becomes increasingly difficult to implement as your trading activities grow. With dozens or hundreds of swaps across multiple tokens, tracking performance manually becomes nearly impossible. 🫠
But worry not: a robust real-time swap analytics system powered by TimescaleDB can fix this and completely transform your trading experience.
In this tutorial, we'll build a system that stores raw swap events in a TimescaleDB hypertable, rolls up simple volume metrics with continuous aggregates, calculates FIFO-based profit and loss with window functions, and surfaces performance metrics such as win rate in real time.
Swap analytics employs quantitative methods to evaluate swap derivatives, using mathematical models to optimize pricing, risk assessment, and trading strategies under varying market conditions.
Core components include pricing models using interest rate curves and volatility surfaces, risk metrics like PV01 (price value of a basis point) and value-at-risk, and performance attribution to explain trading results.
Modern swap analytics extends to regulatory compliance calculations like credit valuation adjustment (CVA) and increasingly incorporates machine learning to improve predictions, execution, and anomaly detection.
Since swap transactions follow time patterns, TimescaleDB delivers real-time insights that other databases can't match. It unifies historical and current data in one place, letting you learn from past trades while monitoring live market activity without compromise.
First-in-first-out accounting is a standard method in finance for calculating profit and loss. When applied to cryptocurrency swaps, it means that when you sell tokens, you're assumed to be selling your oldest tokens first. This assumption creates a specific challenge for time-series databases because FIFO accounting requires maintaining state across transactions. Each sale needs to reference previous purchases, potentially going back days, weeks, or even months. This creates a tension with time-series databases that are optimized for time-bounded queries.
Let's consider a simple, illustrative example: you buy 100 tokens for $100, later buy another 100 tokens for $200, and then sell 150 tokens for $300.
To calculate the profit using FIFO, the 150 tokens sold are matched against your oldest purchases first: all 100 tokens from the first buy (cost basis $100) plus 50 tokens from the second buy (cost basis $100), for a total cost of $200 and a realized profit of $300 - $200 = $100.
This calculation requires maintaining a running inventory of tokens and their purchase prices, which becomes complex in a time-series context, especially when trying to create real-time analytics.
TimescaleDB offers several approaches to solve this challenge, which we'll walk through below: continuous aggregates for the simpler, stateless metrics, and regular views with window functions for the FIFO calculations themselves.
The main difficulty comes from the fact that FIFO accounting requires maintaining the order of purchases, that complex window functions and common table expressions (CTEs) are needed, and that continuous aggregates (our improved version of materialized views) have limitations with certain SQL constructs. Before diving into the implementation details, let's clarify some key terminology.
To ensure clarity throughout this tutorial, here's a short glossary of key terms:
Swap: an exchange of one token for another on a decentralized exchange, recorded here as a single event with token and USD amounts in and out.
FIFO (first in, first out): an accounting method that assumes the oldest tokens you bought are the first ones you sell.
Cost basis: the USD amount originally paid for the tokens being sold, used to compute profit.
Realized PnL: the profit or loss locked in when tokens are actually sold, as opposed to unrealized gains on tokens still held.
Hypertable: a TimescaleDB table that is automatically partitioned into time-based chunks.
Continuous aggregate: a TimescaleDB materialized view that refreshes incrementally on a schedule.
Now that we understand the challenge and terminology, let's explore how TimescaleDB can help us solve it.
TimescaleDB is a PostgreSQL extension that optimizes storing and querying time-series, event, and vector data. It works by automatically partitioning data into "chunks" based on time intervals, creating what's called a hypertable. This architecture improves query performance as queries that filter on time ranges only scan relevant chunks, dramatically reducing I/O operations. As part of TimescaleDB’s hybrid row-columnar engine, these chunks are then moved from the rowstore to the columnstore as they get older, saving significant storage space and automatically streamlining data management.
This chunking approach allows TimescaleDB to handle billions of data points efficiently while maintaining native PostgreSQL compatibility—you can use all the PostgreSQL features you already know alongside TimescaleDB’s optimizations. Leveraging this familiarity helps teams onboard faster and build better, shortening the time to market.
For this particular case (financial data like swap events), these benefits translate to faster analytics, lower storage costs, and the ability to maintain longer historical records without performance degradation.
First, we'll create a table to store our swap events and convert it into a TimescaleDB hypertable:
CREATE TABLE swap_events (
id SERIAL,
time TIMESTAMPTZ NOT NULL,
token_address TEXT NOT NULL,
token_in NUMERIC,
token_out NUMERIC,
usd_in NUMERIC,
usd_out NUMERIC,
wallet_address TEXT,
PRIMARY KEY (id, time) -- Include time in the primary key for TimescaleDB
);
-- Convert to a TimescaleDB hypertable
SELECT create_hypertable('swap_events', by_range('time', INTERVAL '1 week'));
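To make the schema concrete, here's some hypothetical sample data (the token address, timestamps, and amounts are made up for illustration). In this schema, a buy records the tokens received in token_in and the USD paid in usd_in, while a sell records the tokens sold in token_out and the USD received in usd_out:
-- Illustrative sample data: two buys followed by a sell for one wallet
INSERT INTO swap_events (time, token_address, wallet_address, token_in, token_out, usd_in, usd_out)
VALUES
  -- buy 100 tokens for $100
  ('2025-05-01 10:00:00+00', '0xtokenA', '0xuser1', 100, 0, 100, 0),
  -- buy another 100 tokens for $200
  ('2025-05-02 11:30:00+00', '0xtokenA', '0xuser1', 100, 0, 200, 0),
  -- sell 150 tokens for $300
  ('2025-05-03 09:15:00+00', '0xtokenA', '0xuser1', 0, 150, 0, 300);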
With our base structure in place, we can now implement different types of analytics, starting with simpler metrics that don't require FIFO accounting.
Continuous aggregates are one of TimescaleDB's most powerful features. They work through a materialization engine specifically designed for time-series data. Unlike traditional materialized views, continuous aggregates refresh incrementally on a schedule, recomputing only the time buckets whose underlying data has changed, and can combine materialized results with the most recent raw data at query time.
For straightforward metrics that don't require FIFO accounting, continuous aggregates work perfectly:
CREATE MATERIALIZED VIEW swap_events_hourly WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 hour', time) AS bucket,
wallet_address,
token_address,
SUM(usd_in) AS total_usd_in,
SUM(usd_out) AS total_usd_out,
SUM(token_in) AS total_token_in,
SUM(token_out) AS total_token_out,
COUNT(*) AS swap_count,
COUNT(CASE WHEN token_out > 0 THEN 1 END) AS sell_count,
COUNT(CASE WHEN token_in > 0 THEN 1 END) AS buy_count
FROM swap_events
GROUP BY bucket, wallet_address, token_address;
-- Set refresh policy
SELECT add_continuous_aggregate_policy('swap_events_hourly',
start_offset => INTERVAL '1 day',
end_offset => INTERVAL '1 hour',
schedule_interval => INTERVAL '1 hour');
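Once the continuous aggregate is populated, querying it is plain SQL. As a quick example (the wallet address is illustrative), here's hourly buy/sell volume for a single wallet over the last day:
SELECT bucket, token_address, total_usd_in, total_usd_out, buy_count, sell_count
FROM swap_events_hourly
WHERE wallet_address = '0xuser1'
  AND bucket > now() - INTERVAL '24 hours'
ORDER BY bucket DESC;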
While continuous aggregates efficiently handle simple metrics, they are limited when it comes to the complex window functions needed for FIFO accounting. We need a different approach for these more complex calculations.
For the complex FIFO accounting, we'll use a regular view with window functions that maintain the state we need across transactions:
CREATE OR REPLACE VIEW swap_fifo_pnl AS
WITH token_queue AS (
SELECT
time,
id,
token_address,
wallet_address,
token_in,
token_out,
usd_in,
usd_out,
SUM(token_in) OVER (
PARTITION BY wallet_address, token_address
ORDER BY time, id
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) - SUM(token_out) OVER (
PARTITION BY wallet_address, token_address
ORDER BY time, id
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS token_balance,
SUM(token_in) OVER (
PARTITION BY wallet_address, token_address
ORDER BY time, id
) AS cumulative_token_in,
SUM(token_out) OVER (
PARTITION BY wallet_address, token_address
ORDER BY time, id
) AS cumulative_token_out,
SUM(usd_in) OVER (
PARTITION BY wallet_address, token_address
ORDER BY time, id
) AS cumulative_usd_in
FROM swap_events
),
fifo_calcs AS (
SELECT
time,
id,
token_address,
wallet_address,
token_in,
token_out,
usd_in,
usd_out,
token_balance,
cumulative_token_in,
cumulative_token_out,
cumulative_usd_in,
CASE
WHEN token_out > 0 THEN
-- Calculate the average cost basis for tokens being sold using FIFO
usd_out - (token_out *
(LAG(cumulative_usd_in, 1, 0) OVER (PARTITION BY wallet_address, token_address ORDER BY time, id) /
LAG(cumulative_token_in, 1, 1) OVER (PARTITION BY wallet_address, token_address ORDER BY time, id)))
ELSE 0
END AS realized_pnl
FROM token_queue
)
SELECT
time,
wallet_address,
token_address,
token_in,
token_out,
usd_in,
usd_out,
token_balance,
realized_pnl,
SUM(realized_pnl) OVER (
PARTITION BY wallet_address, token_address
ORDER BY time, id
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_pnl
FROM fifo_calcs;
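As a quick sanity check before aggregating, you can inspect individual sells and their realized PnL for one wallet (a sketch; the address is illustrative):
SELECT time, token_address, token_out, usd_out, realized_pnl, cumulative_pnl
FROM swap_fifo_pnl
WHERE wallet_address = '0xuser1'
  AND token_out > 0
ORDER BY time;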
With our FIFO PnL view in place, we can now build higher-level performance metrics to gain insights into trading performance.
Note that the same window specification keeps repeating throughout the previous SQL snippet. We can refactor this code by introducing named windows with the `WINDOW` clause:
CREATE OR REPLACE VIEW swap_fifo_pnl AS
WITH token_queue AS (
SELECT
time,
id,
token_address,
wallet_address,
token_in,
token_out,
usd_in,
usd_out,
SUM(token_in) OVER w_balance - SUM(token_out) OVER w_balance AS token_balance,
SUM(token_in) OVER w_cumulative AS cumulative_token_in,
SUM(token_out) OVER w_cumulative AS cumulative_token_out,
SUM(usd_in) OVER w_cumulative AS cumulative_usd_in
FROM swap_events
WINDOW
w_balance AS (PARTITION BY wallet_address, token_address ORDER BY time, id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
w_cumulative AS (PARTITION BY wallet_address, token_address ORDER BY time, id)
),
fifo_calcs AS (
SELECT
time,
id,
token_address,
wallet_address,
token_in,
token_out,
usd_in,
usd_out,
token_balance,
cumulative_token_in,
cumulative_token_out,
cumulative_usd_in,
CASE
WHEN token_out > 0 THEN
-- Calculate the average cost basis for tokens being sold using FIFO
usd_out - (token_out *
(LAG(cumulative_usd_in, 1, 0) OVER w_lag /
LAG(cumulative_token_in, 1, 1) OVER w_lag))
ELSE 0
END AS realized_pnl
FROM token_queue
WINDOW
w_lag AS (PARTITION BY wallet_address, token_address ORDER BY time, id)
)
SELECT
time,
wallet_address,
token_address,
token_in,
token_out,
usd_in,
usd_out,
token_balance,
realized_pnl,
SUM(realized_pnl) OVER w_cumulative_pnl AS cumulative_pnl
FROM fifo_calcs
WINDOW
w_cumulative_pnl AS (PARTITION BY wallet_address, token_address ORDER BY time, id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);
Using the FIFO PnL view, we can easily calculate performance metrics such as win rate and total profit:
-- Query for performance metrics (win rate, etc.)
SELECT
wallet_address,
token_address,
COUNT(*) AS total_trades,
COUNT(CASE WHEN realized_pnl > 0 THEN 1 END) AS winning_trades,
ROUND(COUNT(CASE WHEN realized_pnl > 0 THEN 1 END)::numeric / NULLIF(COUNT(*), 0) * 100, 2) AS win_rate,
SUM(realized_pnl) AS total_pnl
FROM swap_fifo_pnl
WHERE token_out > 0
GROUP BY wallet_address, token_address;
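Because swap_fifo_pnl is just a view over the hypertable, you can also bucket realized PnL over time with time_bucket, for example to chart daily results per wallet and token (a sketch):
SELECT
  time_bucket('1 day', time) AS day,
  wallet_address,
  token_address,
  SUM(realized_pnl) AS daily_pnl
FROM swap_fifo_pnl
WHERE token_out > 0
GROUP BY day, wallet_address, token_address
ORDER BY day;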
Now that we have our core functionality implemented, let's explore how to optimize performance for large datasets.
Even though we can't use continuous aggregates for our FIFO calculations, TimescaleDB's architecture still provides significant performance advantages: the extension's chunking mechanism, query planner optimizations, and parallel query execution can dramatically speed up complex real-time analytics.
To further enhance performance, consider these optimization strategies.
First, materialize the FIFO results for a recent, frequently queried window instead of recomputing the full history on every request:
CREATE MATERIALIZED VIEW fifo_pnl_daily AS
SELECT * FROM swap_fifo_pnl WHERE time > now() - INTERVAL '30 days';
Second, filter queries against the view by time and wallet so that only the relevant data is processed:
SELECT * FROM swap_fifo_pnl
WHERE time > now() - INTERVAL '7 days'
AND wallet_address = '0xuser1';
Third, enable compression on the hypertable, segmenting by the same columns your window functions partition on:
ALTER TABLE swap_events SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'wallet_address,token_address'
);
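With compression enabled, you would typically also add a policy so that older chunks are compressed automatically; the 30-day threshold below is an assumption you can tune to your query patterns:
-- Automatically compress chunks older than 30 days (threshold is illustrative)
SELECT add_compression_policy('swap_events', INTERVAL '30 days');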
While continuous aggregates have limitations with the complex window functions and CTEs needed for FIFO accounting, this hybrid approach gives you the best of both worlds: continuous aggregates deliver fast, pre-computed results for the simple volume metrics, while regular views with window functions handle the stateful FIFO calculations on top of the same hypertable.
The solution scales as your dataset grows, and you can further optimize by materializing the complex FIFO calculations for specific time periods that are frequently accessed.
I hope this helps! Let me know if you have any questions about implementing this approach. The full example is on GitHub if you want to play with it! If you haven’t installed TimescaleDB yet, visit our GitHub page (⭐ welcome!) or follow our installation instructions.