SlideShare a Scribd company logo
Change Data Capture Pipelines With
Debezium and Kafka Streams
Gunnar Morling
Software Engineer
@gunnarmorling
Recap: Debezium
What's Change Data Capture?
Use Cases
Kafka Streams with Quarkus
Supersonic Subatomic Java
The Kafka Streams Extension
Debezium + Kafka Streams =
Data Enrichment
Auditing
Aggregate View Materialisation
1
2
3
Gunnar Morling
Open source software engineer at Red Hat
Debezium
Quarkus
Hibernate
Spec Lead for Bean Validation 2.0
Other projects: Deptective, MapStruct
Java Champion
#Debezium @gunnarmorling
@gunnarmorling
Postgres
MySQL
Kafka Connect Kafka ConnectApache Kafka
DBZ PG
DBZ
MySQL
Search Index
ES
Connector
JDBC
Connector
ES
Connector
ISPN
Connector
Cache
Debezium
Enabling Zero-Code Data Streaming Pipelines
Data
Warehouse
#Debezium
@gunnarmorling
Debezium
Low-Latency Change Data Streaming
https://medium.com/convoy-tech/
#Debezium
@gunnarmorling
Debezium
Low-Latency Change Data Streaming
https://medium.com/convoy-tech/
#Debezium
@gunnarmorling
CDC – "Liberation for Your Data"
#Debezium
@gunnarmorling
CDC – "Liberation for Your Data"
#Debezium
{
"before": null,
"after": {
"id": 1004,
"first_name": "Anne",
"last_name": "Kretchmar",
"email": "annek@noanswer.org"
},
"source": {
"name": "dbserver1",
"server_id": 0,
"ts_sec": 0,
"file": "mysql-bin.000003",
"pos": 154,
"row": 0,
"snapshot": true,
"db": "inventory",
"table": "customers"
},
"op": "c",
"ts_ms": 1486500577691
}
Change Event Structure
Key: Primary key of table
Value: Describing the change event
Old row state
New row state
Metadata
@gunnarmorling#Debezium
Recap: Debezium
What's Change Data Capture?
Use Cases
1
2
3
Debezium + Kafka Streams =
Data Enrichment
Auditing
Aggregate View Materialisation
Kafka Streams with Quarkus
Supersonic Subatomic Java
The Kafka Streams Extension
@gunnarmorling
Quarkus
Supersonic Subatomic Java
“ A Kubernetes Native Java stack tailored for
OpenJDK HotSpot and GraalVM, crafted from the
best of breed Java libraries and standards.
#Debezium
@gunnarmorling#Debezium
@gunnarmorling#Debezium
Quarkus
Supersonic Subatomic Java
Developer joy
Imperative and Reactive
Best-of-breed libraries
@gunnarmorling#Debezium
Quarkus
Supersonic Subatomic Java
Developer joy
Imperative and Reactive
Best-of-breed libraries
Quarkus
The Kafka Streams Extension
Management of topology
Health checks
Dev Mode
Support for native binaries via GraalVM
@gunnarmorling#Debezium
Recap: Debezium
What's Change Data Capture?
Use Cases
1
3
2
Debezium + Kafka Streams =
Data Enrichment
Auditing
Aggregate View Materialisation
Kafka Streams with Quarkus
Supersonic Subatomic Java
The Kafka Streams Extension
Data Enrichment -
Demo
Auditing
@gunnarmorling
Auditing
Source DB Kafka Connect Apache Kafka
DBZ
Customer Events
CRM
Service
#Debezium
@gunnarmorling
Auditing
Source DB Kafka Connect Apache Kafka
DBZ
Customer Events
CRM
Service
Id User Use Case
tx-1 Bob Create Customer
tx-2 Sarah Delete Customer
tx-3 Rebecca Update Customer
"Transactions" table
#Debezium
@gunnarmorling
Auditing
Source DB Kafka Connect Apache Kafka
DBZ
Customer Events
Transactions
CRM
Service
Id User Use Case
tx-1 Bob Create Customer
tx-2 Sarah Delete Customer
tx-3 Rebecca Update Customer
"Transactions" table
#Debezium
@gunnarmorling
Auditing
Source DB Kafka Connect Apache Kafka
DBZ
Customer Events
Transactions
CRM
Service
Kafka
Streams
Id User Use Case
tx-1 Bob Create Customer
tx-2 Sarah Delete Customer
tx-3 Rebecca Update Customer
"Transactions" table
#Debezium
@gunnarmorling
Auditing
Source DB Kafka Connect Apache Kafka
DBZ
Customer Events
Transactions
CRM
Service
Kafka
Streams
Id User Use Case
tx-1 Bob Create Customer
tx-2 Sarah Delete Customer
tx-3 Rebecca Update Customer
"Transactions" table
Enriched Customer Events
#Debezium
@gunnarmorling
Auditing
{
"before": {
"id": 1004,
"last_name": "Kretchmar",
"email": "annek@example.com"
},
"after": {
"id": 1004,
"last_name": "Kretchmar",
"email": "annek@noanswer.org"
},
"source": {
"name": "dbserver1",
"table": "customers",
"txId": "tx-3"
},
"op": "u",
"ts_ms": 1486500577691
}
Customers
#Debezium
@gunnarmorling
{
"before": {
"id": 1004,
"last_name": "Kretchmar",
"email": "annek@example.com"
},
"after": {
"id": 1004,
"last_name": "Kretchmar",
"email": "annek@noanswer.org"
},
"source": {
"name": "dbserver1",
"table": "customers",
"txId": "tx-3"
},
"op": "u",
"ts_ms": 1486500577691
}
{
"before": null,
"after": {
"id": "tx-3",
"user": "Rebecca",
"use_case": "Update customer"
},
"source": {
"name": "dbserver1",
"table": "transactions",
"txId": "tx-3"
},
"op": "c",
"ts_ms": 1486500577691
}
Transactions Customers
{
"id": "tx-3"
}
#Debezium
{
"id": "tx-3"
}
{
"before": {
"id": 1004,
"last_name": "Kretchmar",
"email": "annek@example.com"
},
"after": {
"id": 1004,
"last_name": "Kretchmar",
"email": "annek@noanswer.org"
},
"source": {
"name": "dbserver1",
"table": "customers",
"txId": "tx-3"
},
"op": "u",
"ts_ms": 1486500577691
}
Transactions Customers
@gunnarmorling
{
"before": null,
"after": {
"id": "tx-3",
"user": "Rebecca",
"use_case": "Update customer"
},
"source": {
"name": "dbserver1",
"table": "transactions",
"txId": "tx-3"
},
"op": "c",
"ts_ms": 1486500577691
}
#Debezium
@gunnarmorling
{
"before": {
"id": 1004,
"last_name": "Kretchmar",
"email": "annek@example.com"
},
"after": {
"id": 1004,
"last_name": "Kretchmar",
"email": "annek@noanswer.org"
},
"source": {
"name": "dbserver1",
"table": "customers",
"txId": "tx-3",
"user": "Rebecca",
"use_case": "Update customer"
},
"op": "u",
"ts_ms": 1486500577691
}
Enriched Customers
Auditing
#Debezium
@gunnarmorling
@Override
public KeyValue<JsonObject, JsonObject>
transform(JsonObject key, JsonObject value) {
boolean enrichedAllBufferedEvents =
enrichAndEmitBufferedEvents();
if (!enrichedAllBufferedEvents) {
bufferChangeEvent(key, value);
return null;
}
KeyValue<JsonObject, JsonObject> enriched =
enrichWithTxMetaData(key, value);
if (enriched == null) {
bufferChangeEvent(key, value);
}
return enriched;
}
Auditing
Non-trivial join implementation
no ordering across topics
need to buffer change events
until TX data available
bit.ly/debezium-auditlogs
#Debezium
Aggregate
View Materalization
Aggregate View Materialization
From Multiple Topics to One View
@gunnarmorling#Debezium
PurchaseOrder
OrderLine
{
"purchaseOrderId" : "order-123",
"orderDate" : "2020-08-24",
"customerId": "customer-123",
"orderLines" : [
{
"orderLineId" : "orderLine-456",
"productId" : "product-789",
"quantity" : 2,
"price" : 59.99
},
{
"orderLineId" : "orderLine-234",
"productId" : "product-567",
"quantity" : 1,
"price" : 14.99
}
1
n
@gunnarmorling#Debezium
Aggregate View Materialization
Non-Key Joins (KIP-213)
KTable<Long, OrderLine> orderLines = ...;
KTable<Integer, PurchaseOrder> purchaseOrders = ...;
KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines
.join(
purchaseOrders,
orderLine -> orderLine.purchaseOrderId,
(orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder))
.groupBy(
(orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder))
.aggregate(
PurchaseOrderWithLines::new,
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value),
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value)
);
@gunnarmorling#Debezium
Aggregate View Materialization
Non-Key Joins (KIP-213)
KTable<Long, OrderLine> orderLines = ...;
KTable<Integer, PurchaseOrder> purchaseOrders = ...;
KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines
.join(
purchaseOrders,
orderLine -> orderLine.purchaseOrderId,
(orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder))
.groupBy(
(orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder))
.aggregate(
PurchaseOrderWithLines::new,
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value),
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value)
);
@gunnarmorling#Debezium
Aggregate View Materialization
Non-Key Joins (KIP-213)
KTable<Long, OrderLine> orderLines = ...;
KTable<Integer, PurchaseOrder> purchaseOrders = ...;
KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines
.join(
purchaseOrders,
orderLine -> orderLine.purchaseOrderId,
(orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder))
.groupBy(
(orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder))
.aggregate(
PurchaseOrderWithLines::new,
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value),
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value)
);
@gunnarmorling#Debezium
Aggregate View Materialization
Non-Key Joins (KIP-213)
KTable<Long, OrderLine> orderLines = ...;
KTable<Integer, PurchaseOrder> purchaseOrders = ...;
KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines
.join(
purchaseOrders,
orderLine -> orderLine.purchaseOrderId,
(orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder))
.groupBy(
(orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder))
.aggregate(
PurchaseOrderWithLines::new,
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value),
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value)
);
@gunnarmorling#Debezium
Aggregate View Materialization
Non-Key Joins (KIP-213)
KTable<Long, OrderLine> orderLines = ...;
KTable<Integer, PurchaseOrder> purchaseOrders = ...;
KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines
.join(
purchaseOrders,
orderLine -> orderLine.purchaseOrderId,
(orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder))
.groupBy(
(orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder))
.aggregate(
PurchaseOrderWithLines::new,
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value),
(Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value)
);
@gunnarmorling#Debezium
Aggregate View Materialization
Awareness of Transaction Boundaries
TX metadata in change events (e.g. dbserver1.inventory.orderline)
{
"before": null,
"after": { ... },
"source": { ... },
"op": "c",
"ts_ms": "1580390884335",
"transaction": {
"id": "571",
"total_order": "1",
"data_collection_order": "1"
}
}
Aggregate View Materialization
Awareness of Transaction Boundaries
Topic with BEGIN/END markers
Enable consumers to buffer all events
of one transaction
@gunnarmorling
{
"transactionId" : "571",
"eventType" : "begin transaction",
"ts_ms": 1486500577125
}
{
"transactionId" : "571",
"ts_ms": 1486500577691,
"eventType" : "end transaction",
"eventCount" : [
{
"name" : "dbserver1.inventory.order",
"count" : 1
},
{
"name" : "dbserver1.inventory.orderLine",
"count" : 5
}
]
}
BEGIN END
#Debezium
Takeaways
Many Use Cases for Debezium and Kafka Streams
Data enrichment
Creating aggregated events
Interactive query services for legacy databases
@gunnarmorling#Debezium
Takeaways
Many Use Cases for Debezium and Kafka Streams
Data enrichment
Creating aggregated events
Interactive query services for legacy databases
@gunnarmorling#Debezium
Debezium + Kafka
Streams =
Resources
Website:
Examples:
Latest news: @debezium
debezium.io
github.com/debezium/debezium-examples/
@gunnarmorling#Debezium
gunnar@hibernate.org
@gunnarmorling
@gunnarmorling
Q&A
#Debezium
Change Data Capture Pipelines with Debezium and Kafka Streams (Gunnar Morling, Red Hat) Kafka Summit 2020
Ad

More Related Content

What's hot (20)

Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Guido Schmutz
 
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 20190-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
confluent
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Flink Forward
 
Oracle RAC Internals - The Cache Fusion Edition
Oracle RAC Internals - The Cache Fusion EditionOracle RAC Internals - The Cache Fusion Edition
Oracle RAC Internals - The Cache Fusion Edition
Markus Michalewicz
 
Deploying Kafka Streams Applications with Docker and Kubernetes
Deploying Kafka Streams Applications with Docker and KubernetesDeploying Kafka Streams Applications with Docker and Kubernetes
Deploying Kafka Streams Applications with Docker and Kubernetes
confluent
 
Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
Shiao-An Yuan
 
PostgreSQL + Kafka: The Delight of Change Data Capture
PostgreSQL + Kafka: The Delight of Change Data CapturePostgreSQL + Kafka: The Delight of Change Data Capture
PostgreSQL + Kafka: The Delight of Change Data Capture
Jeff Klukas
 
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroThe Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
Databricks
 
Apache Pulsar First Overview
Apache PulsarFirst OverviewApache PulsarFirst Overview
Apache Pulsar First Overview
Ricardo Paiva
 
Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...
Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...
Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...
Databricks
 
Spark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsSpark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka Streams
Guido Schmutz
 
Apache Pulsar Overview
Apache Pulsar OverviewApache Pulsar Overview
Apache Pulsar Overview
Streamlio
 
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 Best Practice of Compression/Decompression Codes in Apache Spark with Sophia... Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Databricks
 
Performance tuning in BlueStore & RocksDB - Li Xiaoyan
Performance tuning in BlueStore & RocksDB - Li XiaoyanPerformance tuning in BlueStore & RocksDB - Li Xiaoyan
Performance tuning in BlueStore & RocksDB - Li Xiaoyan
Ceph Community
 
Distributed applications using Hazelcast
Distributed applications using HazelcastDistributed applications using Hazelcast
Distributed applications using Hazelcast
Taras Matyashovsky
 
Best practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudBest practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloud
Anshum Gupta
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
confluent
 
Oracle RAC 12c Overview
Oracle RAC 12c OverviewOracle RAC 12c Overview
Oracle RAC 12c Overview
Markus Michalewicz
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
Saroj Panyasrivanit
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Guido Schmutz
 
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 20190-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
confluent
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Flink Forward
 
Oracle RAC Internals - The Cache Fusion Edition
Oracle RAC Internals - The Cache Fusion EditionOracle RAC Internals - The Cache Fusion Edition
Oracle RAC Internals - The Cache Fusion Edition
Markus Michalewicz
 
Deploying Kafka Streams Applications with Docker and Kubernetes
Deploying Kafka Streams Applications with Docker and KubernetesDeploying Kafka Streams Applications with Docker and Kubernetes
Deploying Kafka Streams Applications with Docker and Kubernetes
confluent
 
Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
Shiao-An Yuan
 
PostgreSQL + Kafka: The Delight of Change Data Capture
PostgreSQL + Kafka: The Delight of Change Data CapturePostgreSQL + Kafka: The Delight of Change Data Capture
PostgreSQL + Kafka: The Delight of Change Data Capture
Jeff Klukas
 
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroThe Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
Databricks
 
Apache Pulsar First Overview
Apache PulsarFirst OverviewApache PulsarFirst Overview
Apache Pulsar First Overview
Ricardo Paiva
 
Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...
Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...
Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...
Databricks
 
Spark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsSpark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka Streams
Guido Schmutz
 
Apache Pulsar Overview
Apache Pulsar OverviewApache Pulsar Overview
Apache Pulsar Overview
Streamlio
 
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 Best Practice of Compression/Decompression Codes in Apache Spark with Sophia... Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Databricks
 
Performance tuning in BlueStore & RocksDB - Li Xiaoyan
Performance tuning in BlueStore & RocksDB - Li XiaoyanPerformance tuning in BlueStore & RocksDB - Li Xiaoyan
Performance tuning in BlueStore & RocksDB - Li Xiaoyan
Ceph Community
 
Distributed applications using Hazelcast
Distributed applications using HazelcastDistributed applications using Hazelcast
Distributed applications using Hazelcast
Taras Matyashovsky
 
Best practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudBest practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloud
Anshum Gupta
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
confluent
 

Similar to Change Data Capture Pipelines with Debezium and Kafka Streams (Gunnar Morling, Red Hat) Kafka Summit 2020 (20)

LendingClub RealTime BigData Platform with Oracle GoldenGate
LendingClub RealTime BigData Platform with Oracle GoldenGateLendingClub RealTime BigData Platform with Oracle GoldenGate
LendingClub RealTime BigData Platform with Oracle GoldenGate
Rajit Saha
 
Oracle Goldengate for Big Data - LendingClub Implementation
Oracle Goldengate for Big Data - LendingClub ImplementationOracle Goldengate for Big Data - LendingClub Implementation
Oracle Goldengate for Big Data - LendingClub Implementation
Vengata Guruswamy
 
Event streaming webinar feb 2020
Event streaming webinar feb 2020Event streaming webinar feb 2020
Event streaming webinar feb 2020
Maheedhar Gunturu
 
From Kafka to BigQuery - Strata Singapore
From Kafka to BigQuery - Strata SingaporeFrom Kafka to BigQuery - Strata Singapore
From Kafka to BigQuery - Strata Singapore
Ofir Sharony
 
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQLBuilding a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
ScyllaDB
 
Scaling Writes on CockroachDB with Apache NiFi
Scaling Writes on CockroachDB with Apache NiFiScaling Writes on CockroachDB with Apache NiFi
Scaling Writes on CockroachDB with Apache NiFi
Chris Casano
 
Building a Microservices-based ERP System
Building a Microservices-based ERP SystemBuilding a Microservices-based ERP System
Building a Microservices-based ERP System
MongoDB
 
What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2
What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2
What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2
HostedbyConfluent
 
Pragmatic CQRS with existing applications and databases (Digital Xchange, May...
Pragmatic CQRS with existing applications and databases (Digital Xchange, May...Pragmatic CQRS with existing applications and databases (Digital Xchange, May...
Pragmatic CQRS with existing applications and databases (Digital Xchange, May...
Lucas Jellema
 
Demystifying Event-Driven Architectures with Apache Kafka | Bogdan Sucaciu, P...
Demystifying Event-Driven Architectures with Apache Kafka | Bogdan Sucaciu, P...Demystifying Event-Driven Architectures with Apache Kafka | Bogdan Sucaciu, P...
Demystifying Event-Driven Architectures with Apache Kafka | Bogdan Sucaciu, P...
HostedbyConfluent
 
PostgreSQL - масштабирование в моде, Valentine Gogichashvili (Zalando SE)
PostgreSQL - масштабирование в моде, Valentine Gogichashvili (Zalando SE)PostgreSQL - масштабирование в моде, Valentine Gogichashvili (Zalando SE)
PostgreSQL - масштабирование в моде, Valentine Gogichashvili (Zalando SE)
Ontico
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Guido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafk...
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafk...Solutions for bi-directional integration between Oracle RDBMS and Apache Kafk...
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafk...
confluent
 
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
Andrew Liu
 
Getting Buzzed on Buzzwords: Using Cloud & Big Data to Pentest at Scale
Getting Buzzed on Buzzwords: Using Cloud & Big Data to Pentest at ScaleGetting Buzzed on Buzzwords: Using Cloud & Big Data to Pentest at Scale
Getting Buzzed on Buzzwords: Using Cloud & Big Data to Pentest at Scale
Bishop Fox
 
Apache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream ProcessingApache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream Processing
Guozhang Wang
 
Dynamo DB Indexes
Dynamo DB IndexesDynamo DB Indexes
Dynamo DB Indexes
Nathan Jones
 
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to StreamingBravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Yaroslav Tkachenko
 
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
HostedbyConfluent
 
MongoDB World 2018: Keynote
MongoDB World 2018: KeynoteMongoDB World 2018: Keynote
MongoDB World 2018: Keynote
MongoDB
 
LendingClub RealTime BigData Platform with Oracle GoldenGate
LendingClub RealTime BigData Platform with Oracle GoldenGateLendingClub RealTime BigData Platform with Oracle GoldenGate
LendingClub RealTime BigData Platform with Oracle GoldenGate
Rajit Saha
 
Oracle Goldengate for Big Data - LendingClub Implementation
Oracle Goldengate for Big Data - LendingClub ImplementationOracle Goldengate for Big Data - LendingClub Implementation
Oracle Goldengate for Big Data - LendingClub Implementation
Vengata Guruswamy
 
Event streaming webinar feb 2020
Event streaming webinar feb 2020Event streaming webinar feb 2020
Event streaming webinar feb 2020
Maheedhar Gunturu
 
From Kafka to BigQuery - Strata Singapore
From Kafka to BigQuery - Strata SingaporeFrom Kafka to BigQuery - Strata Singapore
From Kafka to BigQuery - Strata Singapore
Ofir Sharony
 
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQLBuilding a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQL
ScyllaDB
 
Scaling Writes on CockroachDB with Apache NiFi
Scaling Writes on CockroachDB with Apache NiFiScaling Writes on CockroachDB with Apache NiFi
Scaling Writes on CockroachDB with Apache NiFi
Chris Casano
 
Building a Microservices-based ERP System
Building a Microservices-based ERP SystemBuilding a Microservices-based ERP System
Building a Microservices-based ERP System
MongoDB
 
What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2
What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2
What's in store? Part Deux; Creating Custom Queries with Kafka Streams IQv2
HostedbyConfluent
 
Pragmatic CQRS with existing applications and databases (Digital Xchange, May...
Pragmatic CQRS with existing applications and databases (Digital Xchange, May...Pragmatic CQRS with existing applications and databases (Digital Xchange, May...
Pragmatic CQRS with existing applications and databases (Digital Xchange, May...
Lucas Jellema
 
Demystifying Event-Driven Architectures with Apache Kafka | Bogdan Sucaciu, P...
Demystifying Event-Driven Architectures with Apache Kafka | Bogdan Sucaciu, P...Demystifying Event-Driven Architectures with Apache Kafka | Bogdan Sucaciu, P...
Demystifying Event-Driven Architectures with Apache Kafka | Bogdan Sucaciu, P...
HostedbyConfluent
 
PostgreSQL - масштабирование в моде, Valentine Gogichashvili (Zalando SE)
PostgreSQL - масштабирование в моде, Valentine Gogichashvili (Zalando SE)PostgreSQL - масштабирование в моде, Valentine Gogichashvili (Zalando SE)
PostgreSQL - масштабирование в моде, Valentine Gogichashvili (Zalando SE)
Ontico
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Guido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafk...
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafk...Solutions for bi-directional integration between Oracle RDBMS and Apache Kafk...
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafk...
confluent
 
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
Andrew Liu
 
Getting Buzzed on Buzzwords: Using Cloud & Big Data to Pentest at Scale
Getting Buzzed on Buzzwords: Using Cloud & Big Data to Pentest at ScaleGetting Buzzed on Buzzwords: Using Cloud & Big Data to Pentest at Scale
Getting Buzzed on Buzzwords: Using Cloud & Big Data to Pentest at Scale
Bishop Fox
 
Apache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream ProcessingApache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream Processing
Guozhang Wang
 
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to StreamingBravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Yaroslav Tkachenko
 
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
HostedbyConfluent
 
MongoDB World 2018: Keynote
MongoDB World 2018: KeynoteMongoDB World 2018: Keynote
MongoDB World 2018: Keynote
MongoDB
 
Ad

More from HostedbyConfluent (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
HostedbyConfluent
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit London
HostedbyConfluent
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at Trendyol
HostedbyConfluent
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
HostedbyConfluent
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and Kafka
HostedbyConfluent
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit London
HostedbyConfluent
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
HostedbyConfluent
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
HostedbyConfluent
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
HostedbyConfluent
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
HostedbyConfluent
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
HostedbyConfluent
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
HostedbyConfluent
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
HostedbyConfluent
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
HostedbyConfluent
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
HostedbyConfluent
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
HostedbyConfluent
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
HostedbyConfluent
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
HostedbyConfluent
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
HostedbyConfluent
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
HostedbyConfluent
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
HostedbyConfluent
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit London
HostedbyConfluent
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at Trendyol
HostedbyConfluent
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
HostedbyConfluent
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and Kafka
HostedbyConfluent
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit London
HostedbyConfluent
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
HostedbyConfluent
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
HostedbyConfluent
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
HostedbyConfluent
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
HostedbyConfluent
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
HostedbyConfluent
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
HostedbyConfluent
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
HostedbyConfluent
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
HostedbyConfluent
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
HostedbyConfluent
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
HostedbyConfluent
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
HostedbyConfluent
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
HostedbyConfluent
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
HostedbyConfluent
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
HostedbyConfluent
 
Ad

Recently uploaded (20)

UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 

Change Data Capture Pipelines with Debezium and Kafka Streams (Gunnar Morling, Red Hat) Kafka Summit 2020

  • 1. Change Data Capture Pipelines With Debezium and Kafka Streams Gunnar Morling Software Engineer @gunnarmorling
  • 2. Recap: Debezium What's Change Data Capture? Use Cases Kafka Streams with Quarkus Supersonic Subatomic Java The Kafka Streams Extension Debezium + Kafka Streams = Data Enrichment Auditing Aggregate View Materialisation 1 2 3
  • 3. Gunnar Morling Open source software engineer at Red Hat Debezium Quarkus Hibernate Spec Lead for Bean Validation 2.0 Other projects: Deptective, MapStruct Java Champion #Debezium @gunnarmorling
  • 4. @gunnarmorling Postgres MySQL Kafka Connect Kafka ConnectApache Kafka DBZ PG DBZ MySQL Search Index ES Connector JDBC Connector ES Connector ISPN Connector Cache Debezium Enabling Zero-Code Data Streaming Pipelines Data Warehouse #Debezium
  • 5. @gunnarmorling Debezium Low-Latency Change Data Streaming https://medium.com/convoy-tech/ #Debezium
  • 6. @gunnarmorling Debezium Low-Latency Change Data Streaming https://medium.com/convoy-tech/ #Debezium
  • 7. @gunnarmorling CDC – "Liberation for Your Data" #Debezium
  • 8. @gunnarmorling CDC – "Liberation for Your Data" #Debezium
  • 9. { "before": null, "after": { "id": 1004, "first_name": "Anne", "last_name": "Kretchmar", "email": "[email protected]" }, "source": { "name": "dbserver1", "server_id": 0, "ts_sec": 0, "file": "mysql-bin.000003", "pos": 154, "row": 0, "snapshot": true, "db": "inventory", "table": "customers" }, "op": "c", "ts_ms": 1486500577691 } Change Event Structure Key: Primary key of table Value: Describing the change event Old row state New row state Metadata @gunnarmorling#Debezium
  • 10. Recap: Debezium What's Change Data Capture? Use Cases 1 2 3 Debezium + Kafka Streams = Data Enrichment Auditing Aggregate View Materialisation Kafka Streams with Quarkus Supersonic Subatomic Java The Kafka Streams Extension
  • 11. @gunnarmorling Quarkus Supersonic Subatomic Java “ A Kubernetes Native Java stack tailored for OpenJDK HotSpot and GraalVM, crafted from the best of breed Java libraries and standards. #Debezium
  • 13. @gunnarmorling#Debezium Quarkus Supersonic Subatomic Java Developer joy Imperative and Reactive Best-of-breed libraries
  • 14. @gunnarmorling#Debezium Quarkus Supersonic Subatomic Java Developer joy Imperative and Reactive Best-of-breed libraries
  • 15. Quarkus The Kafka Streams Extension Management of topology Health checks Dev Mode Support for native binaries via GraalVM @gunnarmorling#Debezium
  • 16. Recap: Debezium What's Change Data Capture? Use Cases 1 3 2 Debezium + Kafka Streams = Data Enrichment Auditing Aggregate View Materialisation Kafka Streams with Quarkus Supersonic Subatomic Java The Kafka Streams Extension
  • 19. @gunnarmorling Auditing Source DB Kafka Connect Apache Kafka DBZ Customer Events CRM Service #Debezium
  • 20. @gunnarmorling Auditing Source DB Kafka Connect Apache Kafka DBZ Customer Events CRM Service Id User Use Case tx-1 Bob Create Customer tx-2 Sarah Delete Customer tx-3 Rebecca Update Customer "Transactions" table #Debezium
  • 21. @gunnarmorling Auditing Source DB Kafka Connect Apache Kafka DBZ Customer Events Transactions CRM Service Id User Use Case tx-1 Bob Create Customer tx-2 Sarah Delete Customer tx-3 Rebecca Update Customer "Transactions" table #Debezium
  • 22. @gunnarmorling Auditing Source DB Kafka Connect Apache Kafka DBZ Customer Events Transactions CRM Service Kafka Streams Id User Use Case tx-1 Bob Create Customer tx-2 Sarah Delete Customer tx-3 Rebecca Update Customer "Transactions" table #Debezium
  • 23. @gunnarmorling Auditing Source DB Kafka Connect Apache Kafka DBZ Customer Events Transactions CRM Service Kafka Streams Id User Use Case tx-1 Bob Create Customer tx-2 Sarah Delete Customer tx-3 Rebecca Update Customer "Transactions" table Enriched Customer Events #Debezium
  • 24. @gunnarmorling Auditing { "before": { "id": 1004, "last_name": "Kretchmar", "email": "[email protected]" }, "after": { "id": 1004, "last_name": "Kretchmar", "email": "[email protected]" }, "source": { "name": "dbserver1", "table": "customers", "txId": "tx-3" }, "op": "u", "ts_ms": 1486500577691 } Customers #Debezium
  • 25. @gunnarmorling { "before": { "id": 1004, "last_name": "Kretchmar", "email": "[email protected]" }, "after": { "id": 1004, "last_name": "Kretchmar", "email": "[email protected]" }, "source": { "name": "dbserver1", "table": "customers", "txId": "tx-3" }, "op": "u", "ts_ms": 1486500577691 } { "before": null, "after": { "id": "tx-3", "user": "Rebecca", "use_case": "Update customer" }, "source": { "name": "dbserver1", "table": "transactions", "txId": "tx-3" }, "op": "c", "ts_ms": 1486500577691 } Transactions Customers { "id": "tx-3" } #Debezium
  • 26. { "id": "tx-3" } { "before": { "id": 1004, "last_name": "Kretchmar", "email": "[email protected]" }, "after": { "id": 1004, "last_name": "Kretchmar", "email": "[email protected]" }, "source": { "name": "dbserver1", "table": "customers", "txId": "tx-3" }, "op": "u", "ts_ms": 1486500577691 } Transactions Customers @gunnarmorling { "before": null, "after": { "id": "tx-3", "user": "Rebecca", "use_case": "Update customer" }, "source": { "name": "dbserver1", "table": "transactions", "txId": "tx-3" }, "op": "c", "ts_ms": 1486500577691 } #Debezium
  • 27. @gunnarmorling { "before": { "id": 1004, "last_name": "Kretchmar", "email": "[email protected]" }, "after": { "id": 1004, "last_name": "Kretchmar", "email": "[email protected]" }, "source": { "name": "dbserver1", "table": "customers", "txId": "tx-3", "user": "Rebecca", "use_case": "Update customer" }, "op": "u", "ts_ms": 1486500577691 } Enriched Customers Auditing #Debezium
  • 28. @gunnarmorling @Override public KeyValue<JsonObject, JsonObject> transform(JsonObject key, JsonObject value) { boolean enrichedAllBufferedEvents = enrichAndEmitBufferedEvents(); if (!enrichedAllBufferedEvents) { bufferChangeEvent(key, value); return null; } KeyValue<JsonObject, JsonObject> enriched = enrichWithTxMetaData(key, value); if (enriched == null) { bufferChangeEvent(key, value); } return enriched; } Auditing Non-trivial join implementation no ordering across topics need to buffer change events until TX data available bit.ly/debezium-auditlogs #Debezium
  • 30. Aggregate View Materialization From Multiple Topics to One View @gunnarmorling#Debezium PurchaseOrder OrderLine { "purchaseOrderId" : "order-123", "orderDate" : "2020-08-24", "customerId": "customer-123", "orderLines" : [ { "orderLineId" : "orderLine-456", "productId" : "product-789", "quantity" : 2, "price" : 59.99 }, { "orderLineId" : "orderLine-234", "productId" : "product-567", "quantity" : 1, "price" : 14.99 } 1 n
  • 31. @gunnarmorling#Debezium Aggregate View Materialization Non-Key Joins (KIP-213) KTable<Long, OrderLine> orderLines = ...; KTable<Integer, PurchaseOrder> purchaseOrders = ...; KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines .join( purchaseOrders, orderLine -> orderLine.purchaseOrderId, (orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder)) .groupBy( (orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder)) .aggregate( PurchaseOrderWithLines::new, (Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value), (Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value) );
  • 32. @gunnarmorling#Debezium Aggregate View Materialization Non-Key Joins (KIP-213) KTable<Long, OrderLine> orderLines = ...; KTable<Integer, PurchaseOrder> purchaseOrders = ...; KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines .join( purchaseOrders, orderLine -> orderLine.purchaseOrderId, (orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder)) .groupBy( (orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder)) .aggregate( PurchaseOrderWithLines::new, (Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value), (Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value) );
  • 33. @gunnarmorling#Debezium Aggregate View Materialization Non-Key Joins (KIP-213) KTable<Long, OrderLine> orderLines = ...; KTable<Integer, PurchaseOrder> purchaseOrders = ...; KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines .join( purchaseOrders, orderLine -> orderLine.purchaseOrderId, (orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder)) .groupBy( (orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder)) .aggregate( PurchaseOrderWithLines::new, (Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value), (Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value) );
  • 34. @gunnarmorling#Debezium Aggregate View Materialization Non-Key Joins (KIP-213) KTable<Long, OrderLine> orderLines = ...; KTable<Integer, PurchaseOrder> purchaseOrders = ...; KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines .join( purchaseOrders, orderLine -> orderLine.purchaseOrderId, (orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder)) .groupBy( (orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder)) .aggregate( PurchaseOrderWithLines::new, (Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value), (Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value) );
  • 35. @gunnarmorling#Debezium Aggregate View Materialization Non-Key Joins (KIP-213) KTable<Long, OrderLine> orderLines = ...; KTable<Integer, PurchaseOrder> purchaseOrders = ...; KTable<Integer, PurchaseOrderWithLines> purchaseOrdersWithOrderLines = orderLines .join( purchaseOrders, orderLine -> orderLine.purchaseOrderId, (orderLine, purchaseOrder) -> new OrderLineAndPurchaseOrder(orderLine, purchaseOrder)) .groupBy( (orderLineId, lineAndOrder) -> KeyValue.pair(lineAndOrder.purchaseOrder.id, lineAndOrder)) .aggregate( PurchaseOrderWithLines::new, (Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.addLine(value), (Integer key, OrderLineAndPurchaseOrder value, PurchaseOrderWithLines agg) -> agg.removeLine(value) );
  • 36. @gunnarmorling#Debezium Aggregate View Materialization Awareness of Transaction Boundaries TX metadata in change events (e.g. dbserver1.inventory.orderline) { "before": null, "after": { ... }, "source": { ... }, "op": "c", "ts_ms": "1580390884335", "transaction": { "id": "571", "total_order": "1", "data_collection_order": "1" } }
  • 37. Aggregate View Materialization Awareness of Transaction Boundaries Topic with BEGIN/END markers Enable consumers to buffer all events of one transaction @gunnarmorling { "transactionId" : "571", "eventType" : "begin transaction", "ts_ms": 1486500577125 } { "transactionId" : "571", "ts_ms": 1486500577691, "eventType" : "end transaction", "eventCount" : [ { "name" : "dbserver1.inventory.order", "count" : 1 }, { "name" : "dbserver1.inventory.orderLine", "count" : 5 } ] } BEGIN END #Debezium
  • 38. Takeaways Many Use Cases for Debezium and Kafka Streams Data enrichment Creating aggregated events Interactive query services for legacy databases @gunnarmorling#Debezium
  • 39. Takeaways Many Use Cases for Debezium and Kafka Streams Data enrichment Creating aggregated events Interactive query services for legacy databases @gunnarmorling#Debezium Debezium + Kafka Streams =