May 22, 2025
This is an installment of our Community Member Spotlight series, in which we invite our customers to share their work, spotlight their success, and inspire others with new ways to use technology to solve problems.
At Shoplogix IMS, we build Industrial IoT (IIoT) solutions, integrating software and hardware designed around batteryless wireless sensors. These sensors continuously monitor critical manufacturing assets, delivering real-time insights. As our platform evolved and scaled, the need to propagate real-time database changes—like sensor activations, configuration updates, or asset status transitions—across different systems became essential. Such capabilities power interactive dashboards, automated workflows, and analytics pipelines.
To address these requirements, we developed a streamlined Change Data Capture (CDC) pipeline utilizing TimescaleDB, PostgreSQL triggers, and Kafka Connect. This solution allowed us to stream database changes into Kafka seamlessly, without modifying existing table schemas or relying on PostgreSQL's write-ahead logs. In this post, we'll explore the details of our CDC architecture, highlight TimescaleDB's strengths, and demonstrate how we've achieved continuous synchronization across downstream systems with minimal overhead.
Proof of Concept
To better illustrate our approach, we have prepared a public repository that mirrors the implementation described in this article. The repository provides a fully containerized proof of concept (PoC) demonstrating a Change Data Capture (CDC) pipeline using TimescaleDB, Kafka, Kafka Connect, and a JDBC Source Connector.
https://github.com/carlospsikick/timescale-cdc
Change Data Capture (CDC) captures database operations (INSERT, UPDATE, DELETE) at the row level, converting these changes into structured events for immediate downstream consumption. Unlike traditional batch processes, CDC supports real-time data propagation.
While tools like Debezium rely on transaction logs, requiring elevated privileges or logical replication, our approach employs PostgreSQL triggers. This method offers full control, simplicity, and compatibility with managed environments, seamlessly integrating with TimescaleDB.
Our data model comprises sensor-generated time-series data (e.g., temperature, pressure, vibration) and metadata describing assets (such as types, serial numbers, and locations). Combined, these data streams offer a comprehensive view of the operational state of industrial assets, enabling visualization, anomaly detection, and predictive analytics.
As our business expands, we continuously evolve our ecosystem to support new sensor types, analytics, and integrations with external systems. Central to these developments is event streaming, which lets us transform and reroute data without impacting the broader infrastructure. Change Data Capture (CDC) is vital in this context, translating row-level database changes into real-time events that downstream systems can immediately process.
In our CDC pipeline, database changes triggered by APIs or microservices are captured in real time using database triggers and logged into a dedicated CDC time-series schema. A Kafka Connect JDBC Source Connector polls these CDC tables and views and streams the captured changes as events into Kafka topics. These topics can then be consumed by various subscribers, enabling real-time data propagation across microservices, analytics platforms, and external systems in a decoupled and scalable manner.
Let’s take a closer look:
cdc.change_data_capture()
This function serves as the core mechanism for capturing data changes—inserts, updates, and deletes—from any table that invokes it via a trigger.
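The exact implementation lives in the repository linked above; as a rough sketch (the `cdc.event_log` columns referenced here are assumptions, described later in this post), a generic version of the function could look like this:

```sql
-- Sketch of a generic CDC trigger function. It records which table changed,
-- what kind of operation occurred, and the affected row as JSON.
CREATE SCHEMA IF NOT EXISTS cdc;

CREATE OR REPLACE FUNCTION cdc.change_data_capture()
RETURNS trigger AS $$
BEGIN
    IF (TG_OP = 'DELETE') THEN
        -- For deletes, the only row image available is OLD.
        INSERT INTO cdc.event_log (schema_name, table_name, operation, row_data)
        VALUES (TG_TABLE_SCHEMA, TG_TABLE_NAME, TG_OP, to_jsonb(OLD));
        RETURN OLD;
    ELSE
        -- For inserts and updates, capture the new row image.
        INSERT INTO cdc.event_log (schema_name, table_name, operation, row_data)
        VALUES (TG_TABLE_SCHEMA, TG_TABLE_NAME, TG_OP, to_jsonb(NEW));
        RETURN NEW;
    END IF;
END;
$$ LANGUAGE plpgsql;
```

Because the function relies only on the standard trigger variables (`TG_OP`, `TG_TABLE_SCHEMA`, `TG_TABLE_NAME`, `OLD`, `NEW`), the same function can be attached to any table without modification.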
Note: To capture events from a TimescaleDB hypertable, the function needs a slight adjustment, but the functionality is the same. See the code repo for more details.
cdc.event_log is the central audit and event-tracking table. It stores detailed, structured records of every change captured by the cdc.change_data_capture() trigger function.
Defining the cdc.event_log table as a Timescale hypertable has significant performance and scalability benefits for CDC workloads. It enables efficient time-based partitioning, making incremental polling and historical queries faster. Hypertables are optimized for high-throughput inserts, ideal for the append-only nature of CDC logs. Timescale also offers native features like automated data retention, compression through columnar storage, and real-time analytics capabilities, allowing you to manage storage effectively and build responsive downstream applications. Importantly, this setup integrates seamlessly with tools like Kafka Connect, without altering your connector configuration.
Column Description:
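The authoritative schema is in the repo; as an illustrative sketch (column names, types, and policy intervals below are assumptions), the event log and its hypertable setup might look like this, with each assumed column annotated inline:

```sql
-- Illustrative definition of the CDC event log.
CREATE TABLE IF NOT EXISTS cdc.event_log (
    event_id     BIGINT GENERATED ALWAYS AS IDENTITY, -- monotonically increasing id, handy for incremental polling
    captured_at  TIMESTAMPTZ NOT NULL DEFAULT now(),  -- when the change was captured
    schema_name  TEXT NOT NULL,                       -- source schema (e.g., dataschema)
    table_name   TEXT NOT NULL,                       -- source table (e.g., assets)
    operation    TEXT NOT NULL,                       -- INSERT, UPDATE, or DELETE
    row_data     JSONB                                -- full row image as JSON
);

-- Convert the log into a hypertable partitioned by capture time.
SELECT create_hypertable('cdc.event_log', 'captured_at', if_not_exists => TRUE);

-- Optional: drop old events automatically and compress older chunks.
SELECT add_retention_policy('cdc.event_log', INTERVAL '90 days', if_not_exists => TRUE);

ALTER TABLE cdc.event_log SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'table_name'
);
SELECT add_compression_policy('cdc.event_log', INTERVAL '7 days', if_not_exists => TRUE);
```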
Creating views like cdc.event_log_assets enables clean separation of events from a shared CDC log into table-specific or domain-specific streams. These views simplify Kafka topic routing. Mapping each view to a unique Kafka topic reduces downstream filtering, improving performance by narrowing the data scope for polling connectors. These views also provide a flexible layer for schema shaping, enrichment, and transformation, making the CDC pipeline more modular, scalable, and easier to maintain.
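A minimal sketch of such a view, assuming the asset events come from a `dataschema.assets` table (the filter predicate is an assumption):

```sql
-- Domain-specific stream: only asset-related events from the shared CDC log.
CREATE OR REPLACE VIEW cdc.event_log_assets AS
SELECT event_id,
       captured_at,
       operation,
       row_data
FROM cdc.event_log
WHERE schema_name = 'dataschema'
  AND table_name  = 'assets';
```

A Kafka Connect JDBC Source Connector can then poll this view, for example in incrementing or timestamp mode using `event_id` or `captured_at`, and publish its rows to a dedicated topic.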
To begin capturing change events for a particular table, all we need to do is attach a trigger that invokes the function to that table. For example, to monitor `dataschema.assets`:
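A trigger along these lines would do it (the trigger name is illustrative):

```sql
-- Route every row-level change on dataschema.assets through the capture function.
CREATE TRIGGER assets_cdc_trigger
AFTER INSERT OR UPDATE OR DELETE ON dataschema.assets
FOR EACH ROW
EXECUTE FUNCTION cdc.change_data_capture();
```

From that point on, every insert, update, or delete on the table lands as a row in cdc.event_log, ready to be picked up by the connector.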