May 20, 2025
Posted by Jônatas Davi Paganini
If you're an investor actively trading on decentralized exchanges, you’re swapping tokens daily and trying to maximize profits while managing risk. An effective swap analytics process will enable you to stop trading blind: You will harness analytics to make decisions based on results, not hunches.
Swap analytics examines the pricing, risk, and performance of swap derivatives to improve trading decisions and risk management. It enables you to understand whether your strategies are working, which tokens are performing best, and your overall profit or loss.
However, this analysis becomes increasingly difficult to implement as your trading activities grow. With dozens or hundreds of swaps across multiple tokens, tracking performance manually becomes nearly impossible. 🫠
But worry not: a robust real-time swap analytics system powered by TimescaleDB can fix this and completely transform your trading experience.
In this tutorial, we'll build a system that stores raw swap events in a TimescaleDB hypertable, rolls up simple volume metrics with continuous aggregates, calculates FIFO-based profit and loss with window functions, and surfaces performance metrics such as win rate in real time.
Swap analytics employs quantitative methods to evaluate swap derivatives, using mathematical models to optimize pricing, risk assessment, and trading strategies under varying market conditions.
Core components include pricing models using interest rate curves and volatility surfaces, risk metrics like PV01 (price value of a basis point) and value-at-risk, and performance attribution to explain trading results.
Modern swap analytics extends to regulatory compliance calculations like credit valuation adjustment (CVA) and increasingly incorporates machine learning to improve predictions, execution, and anomaly detection.
Since swap transactions follow time patterns, TimescaleDB delivers real-time insights that other databases can't match. It unifies historical and current data in one place, letting you learn from past trades while monitoring live market activity without compromise.
First-in-first-out accounting is a standard method in finance for calculating profit and loss. When applied to cryptocurrency swaps, it means that when you sell tokens, you're assumed to be selling your oldest tokens first. This assumption creates a specific challenge for time-series databases because FIFO accounting requires maintaining state across transactions. Each sale needs to reference previous purchases, potentially going back days, weeks, or even months. This creates a tension with time-series databases that are optimized for time-bounded queries.
Let's consider a simple, illustrative example: you buy 100 tokens for $100, later buy another 100 tokens for $200, and then sell 150 tokens for $300.
To calculate the profit using FIFO, the 150 tokens sold are matched against your oldest purchases first: all 100 tokens from the first buy (cost basis $100) plus 50 tokens from the second buy (cost basis $100), for a total cost of $200 and a realized profit of $300 - $200 = $100.
This calculation requires maintaining a running inventory of tokens and their purchase prices, which becomes complex in a time-series context, especially when trying to create real-time analytics.
TimescaleDB offers several approaches to solve this challenge, which we'll walk through below: continuous aggregates for the simpler, stateless metrics, and regular views with window functions for the FIFO calculations themselves.
The main difficulty comes from the fact that FIFO accounting requires maintaining the order of purchases, that complex window functions and common table expressions (CTEs) are needed, and that continuous aggregates (our improved version of materialized views) have limitations with certain SQL constructs. Before diving into the implementation details, let's clarify some key terminology.
To ensure clarity throughout this tutorial, here's a short glossary of key terms:
Swap: an exchange of one token for another on a decentralized exchange, recorded here as a single event with token and USD amounts in and out.
FIFO (first in, first out): an accounting method that assumes the oldest tokens you bought are the first ones you sell.
Cost basis: the USD amount originally paid for the tokens being sold, used to compute profit.
Realized PnL: the profit or loss locked in when tokens are actually sold, as opposed to unrealized gains on tokens still held.
Hypertable: a TimescaleDB table that is automatically partitioned into time-based chunks.
Continuous aggregate: a TimescaleDB materialized view that refreshes incrementally on a schedule.
Now that we understand the challenge and terminology, let's explore how TimescaleDB can help us solve it.
TimescaleDB is a PostgreSQL extension that optimizes storing and querying time-series, event, and vector data. It works by automatically partitioning data into "chunks" based on time intervals, creating what's called a hypertable. This architecture improves query performance as queries that filter on time ranges only scan relevant chunks, dramatically reducing I/O operations. As part of TimescaleDB’s hybrid row-columnar engine, these chunks are then moved from the rowstore to the columnstore as they get older, saving significant storage space and automatically streamlining data management.
This chunking approach allows TimescaleDB to handle billions of data points efficiently while maintaining native PostgreSQL compatibility—you can use all the PostgreSQL features you already know alongside TimescaleDB’s optimizations. Leveraging this familiarity helps teams onboard faster and build better, shortening the time to market.
For this particular case (financial data like swap events), these benefits translate to faster analytics, lower storage costs, and the ability to maintain longer historical records without performance degradation.
First, we'll create a table to store our swap events and convert it into a TimescaleDB hypertable:
CREATE TABLE swap_events (
id SERIAL,
time TIMESTAMPTZ NOT NULL,
token_address TEXT NOT NULL,
token_in NUMERIC,
token_out NUMERIC,
usd_in NUMERIC,
usd_out NUMERIC,
wallet_address TEXT,
PRIMARY KEY (id, time) -- Include time in the primary key for TimescaleDB
);
-- Convert to a TimescaleDB hypertable
SELECT create_hypertable('swap_events', by_range('time', INTERVAL '1 week'));
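To make the schema concrete, here's some hypothetical sample data (the token address, timestamps, and amounts are made up for illustration). In this schema, a buy records the tokens received in token_in and the USD paid in usd_in, while a sell records the tokens sold in token_out and the USD received in usd_out:
-- Illustrative sample data: two buys followed by a sell for one wallet
INSERT INTO swap_events (time, token_address, wallet_address, token_in, token_out, usd_in, usd_out)
VALUES
  -- buy 100 tokens for $100
  ('2025-05-01 10:00:00+00', '0xtokenA', '0xuser1', 100, 0, 100, 0),
  -- buy another 100 tokens for $200
  ('2025-05-02 11:30:00+00', '0xtokenA', '0xuser1', 100, 0, 200, 0),
  -- sell 150 tokens for $300
  ('2025-05-03 09:15:00+00', '0xtokenA', '0xuser1', 0, 150, 0, 300);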
With our base structure in place, we can now implement different types of analytics, starting with simpler metrics that don't require FIFO accounting.
Continuous aggregates are one of TimescaleDB's most powerful features. They work through a materialization engine specifically designed for time-series data. Unlike traditional materialized views, continuous aggregates refresh incrementally on a schedule, recomputing only the time buckets whose underlying data has changed, and can combine materialized results with the most recent raw data at query time.
For straightforward metrics that don't require FIFO accounting, continuous aggregates work perfectly:
CREATE MATERIALIZED VIEW swap_events_hourly WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 hour', time) AS bucket,
wallet_address,
token_address,
SUM(usd_in) AS total_usd_in,
SUM(usd_out) AS total_usd_out,
SUM(token_in) AS total_token_in,
SUM(token_out) AS total_token_out,
COUNT(*) AS swap_count,
COUNT(CASE WHEN token_out > 0 THEN 1 END) AS sell_count,
COUNT(CASE WHEN token_in > 0 THEN 1 END) AS buy_count
FROM swap_events
GROUP BY bucket, wallet_address, token_address;
-- Set refresh policy
SELECT add_continuous_aggregate_policy('swap_events_hourly',
start_offset => INTERVAL '1 day',
end_offset => INTERVAL '1 hour',
schedule_interval => INTERVAL '1 hour');
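Once the continuous aggregate is populated, querying it is plain SQL. As a quick example (the wallet address is illustrative), here's hourly buy/sell volume for a single wallet over the last day:
SELECT bucket, token_address, total_usd_in, total_usd_out, buy_count, sell_count
FROM swap_events_hourly
WHERE wallet_address = '0xuser1'
  AND bucket > now() - INTERVAL '24 hours'
ORDER BY bucket DESC;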
While continuous aggregates efficiently handle simple metrics, they are limited when it comes to the complex window functions needed for FIFO accounting. We need a different approach for these more complex calculations.
For the complex FIFO accounting, we'll use a regular view with window functions that maintain the state we need across transactions:
CREATE OR REPLACE VIEW swap_fifo_pnl AS
WITH token_queue AS (
SELECT
time,
id,
token_address,
wallet_address,
token_in,
token_out,
usd_in,
usd_out,
SUM(token_in) OVER (
PARTITION BY wallet_address, token_address
ORDER BY time, id
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) - SUM(token_out) OVER (
PARTITION BY wallet_address, token_address
ORDER BY time, id
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS token_balance,
SUM(token_in) OVER (
PARTITION BY wallet_address, token_address
ORDER BY time, id
) AS cumulative_token_in,
SUM(token_out) OVER (
PARTITION BY wallet_address, token_address
ORDER BY time, id
) AS cumulative_token_out,
SUM(usd_in) OVER (
PARTITION BY wallet_address, token_address
ORDER BY time, id
) AS cumulative_usd_in
FROM swap_events
),
fifo_calcs AS (
SELECT
time,
id,
token_address,
wallet_address,
token_in,
token_out,
usd_in,
usd_out,
token_balance,
cumulative_token_in,
cumulative_token_out,
cumulative_usd_in,
CASE
WHEN token_out > 0 THEN
-- Calculate the average cost basis for tokens being sold using FIFO
usd_out - (token_out *
(LAG(cumulative_usd_in, 1, 0) OVER (PARTITION BY wallet_address, token_address ORDER BY time, id) /
LAG(cumulative_token_in, 1, 1) OVER (PARTITION BY wallet_address, token_address ORDER BY time, id)))
ELSE 0
END AS realized_pnl
FROM token_queue
)
SELECT
time,
wallet_address,
token_address,
token_in,
token_out,
usd_in,
usd_out,
token_balance,
realized_pnl,
SUM(realized_pnl) OVER (
PARTITION BY wallet_address, token_address
ORDER BY time, id
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_pnl
FROM fifo_calcs;
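As a quick sanity check before aggregating, you can inspect individual sells and their realized PnL for one wallet (a sketch; the address is illustrative):
SELECT time, token_address, token_out, usd_out, realized_pnl, cumulative_pnl
FROM swap_fifo_pnl
WHERE wallet_address = '0xuser1'
  AND token_out > 0
ORDER BY time;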
With our FIFO PnL view in place, we can now build higher-level performance metrics to gain insights into trading performance.
Note that the same window specification keeps repeating throughout the previous SQL snippet. We can refactor this code by introducing named windows with the `WINDOW` clause:
CREATE OR REPLACE VIEW swap_fifo_pnl AS
WITH token_queue AS (
SELECT
time,
id,
token_address,
wallet_address,
token_in,
token_out,
usd_in,
usd_out,
SUM(token_in) OVER w_balance - SUM(token_out) OVER w_balance AS token_balance,
SUM(token_in) OVER w_cumulative AS cumulative_token_in,
SUM(token_out) OVER w_cumulative AS cumulative_token_out,
SUM(usd_in) OVER w_cumulative AS cumulative_usd_in
FROM swap_events
WINDOW
w_balance AS (PARTITION BY wallet_address, token_address ORDER BY time, id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
w_cumulative AS (PARTITION BY wallet_address, token_address ORDER BY time, id)
),
fifo_calcs AS (
SELECT
time,
id,
token_address,
wallet_address,
token_in,
token_out,
usd_in,
usd_out,
token_balance,
cumulative_token_in,
cumulative_token_out,
cumulative_usd_in,
CASE
WHEN token_out > 0 THEN
-- Calculate the average cost basis for tokens being sold using FIFO
usd_out - (token_out *
(LAG(cumulative_usd_in, 1, 0) OVER w_lag /
LAG(cumulative_token_in, 1, 1) OVER w_lag))
ELSE 0
END AS realized_pnl
FROM token_queue
WINDOW
w_lag AS (PARTITION BY wallet_address, token_address ORDER BY time, id)
)
SELECT
time,
wallet_address,
token_address,
token_in,
token_out,
usd_in,
usd_out,
token_balance,
realized_pnl,
SUM(realized_pnl) OVER w_cumulative_pnl AS cumulative_pnl
FROM fifo_calcs
WINDOW
w_cumulative_pnl AS (PARTITION BY wallet_address, token_address ORDER BY time, id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);
Using the FIFO PnL view, we can easily calculate performance metrics such as win rate and total profit:
-- Query for performance metrics (win rate, etc.)
SELECT
wallet_address,
token_address,
COUNT(*) AS total_trades,
COUNT(CASE WHEN realized_pnl > 0 THEN 1 END) AS winning_trades,
ROUND(COUNT(CASE WHEN realized_pnl > 0 THEN 1 END)::numeric / NULLIF(COUNT(*), 0) * 100, 2) AS win_rate,
SUM(realized_pnl) AS total_pnl
FROM swap_fifo_pnl
WHERE token_out > 0
GROUP BY wallet_address, token_address;
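Because swap_fifo_pnl is just a view over the hypertable, you can also bucket realized PnL over time with time_bucket, for example to chart daily results per wallet and token (a sketch):
SELECT
  time_bucket('1 day', time) AS day,
  wallet_address,
  token_address,
  SUM(realized_pnl) AS daily_pnl
FROM swap_fifo_pnl
WHERE token_out > 0
GROUP BY day, wallet_address, token_address
ORDER BY day;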
Now that we have our core functionality implemented, let's explore how to optimize performance for large datasets.
Even though we can't use continuous aggregates for our FIFO calculations, TimescaleDB's architecture still provides significant performance advantages: the extension's chunking mechanism, query planner optimizations, and parallel query execution can dramatically speed up complex real-time analytics.
To further enhance performance, consider these optimization strategies.
First, materialize the FIFO results for a recent, frequently queried window instead of recomputing the full history on every request:
CREATE MATERIALIZED VIEW fifo_pnl_daily AS
SELECT * FROM swap_fifo_pnl WHERE time > now() - INTERVAL '30 days';
Second, filter queries against the view by time and wallet so that only the relevant data is processed:
SELECT * FROM swap_fifo_pnl
WHERE time > now() - INTERVAL '7 days'
AND wallet_address = '0xuser1';
Third, enable compression on the hypertable, segmenting by the same columns your window functions partition on:
ALTER TABLE swap_events SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'wallet_address,token_address'
);
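With compression enabled, you would typically also add a policy so that older chunks are compressed automatically; the 30-day threshold below is an assumption you can tune to your query patterns:
-- Automatically compress chunks older than 30 days (threshold is illustrative)
SELECT add_compression_policy('swap_events', INTERVAL '30 days');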
While continuous aggregates have limitations with the complex window functions and CTEs needed for FIFO accounting, this hybrid approach gives you the best of both worlds: continuous aggregates deliver fast, pre-computed results for the simple volume metrics, while regular views with window functions handle the stateful FIFO calculations on top of the same hypertable.
The solution scales as your dataset grows, and you can further optimize by materializing the complex FIFO calculations for specific time periods that are frequently accessed.
I hope this helps! Let me know if you have any questions about implementing this approach. The full example is on GitHub if you want to play with it! If you haven’t installed TimescaleDB yet, visit our GitHub page (⭐ welcome!) or follow our installation instructions.