Kafka Connect and Kafka Streams
The Rise of Apache Kafka as Streaming Platform
Kai Waehner
Technology Evangelist
kontakt@kai-waehner.de
LinkedIn
@KaiWaehner
www.confluent.io
www.kai-waehner.de
https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63921 (2018)
https://qconlondon.com/london2018/presentation/cloud-native-and-scalable-kafka-architecture (2018)
Apache Kafka
A Distributed, Scalable Commit Log
A Brief History of Apache Kafka and Confluent
(Timeline, 2012 to 2018)
0.7 Cluster mirroring
0.8 Intra-cluster replication
0.9 Data integration (Connect API)
0.10 Data processing (Streams API)
0.11 Exactly-once semantics
1.0 “Enterprise Ready”
CP 4.1 KSQL GA
Apache Kafka
The Rise of a Streaming Platform
Apache Kafka
Single Shared Source of Truth for (Micro)Services
(Diagram labels: Orders, Customers, Payments, Stock)
Independent Dev / Test / Prod
No Matter Where it Runs
Kafka Connect
Declarative Data Integration for Apache Kafka
Apache Kafka as Central Nervous System
Kafka Connect
Standalone Mode
Distributed Mode
Scalable Consumption
Certified Connectors
(Distributed) Workers
Converters
Avro Converter
Single Message Transforms
Modify events before storing in Kafka:
• Mask sensitive information
• Add identifiers
• Tag events
• Lineage/provenance
• Remove unnecessary columns
Modify events going out of Kafka:
• Route high priority events to faster data stores
• Direct events to different Elasticsearch indexes
• Cast data types to match destination
• Remove unnecessary columns
Built-in Transformations
• InsertField – Add a field using either static data or record metadata
• ReplaceField – Filter or rename fields
• MaskField – Replace a field with the valid null value for its type (0, empty string, etc.)
• ValueToKey – Set the key to one of the value’s fields
• HoistField – Wrap the entire event as a single field inside a Struct or a Map
• ExtractField – Extract a specific field from a Struct or Map and include only this field in the results
• SetSchemaMetadata – Modify the schema name or version
• TimestampRouter – Modify the topic of a record based on the original topic and timestamp. Useful when using a sink that needs to write to different tables or indexes based on timestamps
• RegexRouter – Modify the topic of a record based on the original topic, a replacement string and a regular expression
• "Build your own" – A Transformation is just a Java class
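The effect of a few of these transformations on a single record can be modeled in plain Java. The sketch below is not the Kafka Connect API: the class SmtSketch, its method names, and the sample record are hypothetical, and a record is simplified to a Map of field names to values. It only illustrates what InsertField, MaskField and the filtering part of ReplaceField do to the data when chained.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Plain-Java model of three built-in SMTs; this is NOT the Kafka Connect API. */
final class SmtSketch {

    /** InsertField: return a copy of the record with one extra static field. */
    static Map<String, Object> insertField(Map<String, Object> record, String field, Object value) {
        Map<String, Object> out = new LinkedHashMap<>(record);
        out.put(field, value);
        return out;
    }

    /** MaskField: replace the field with a "valid null" for its type (0 or empty string here). */
    static Map<String, Object> maskField(Map<String, Object> record, String field) {
        Map<String, Object> out = new LinkedHashMap<>(record);
        Object current = out.get(field);
        if (current instanceof Number) {
            out.put(field, 0);      // simplification: every numeric type is masked as Integer 0
        } else if (current instanceof String) {
            out.put(field, "");
        }
        return out;
    }

    /** ReplaceField (filter part): return a copy of the record without the listed fields. */
    static Map<String, Object> dropFields(Map<String, Object> record, Set<String> fields) {
        Map<String, Object> out = new LinkedHashMap<>(record);
        out.keySet().removeAll(fields);
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample record.
        Map<String, Object> payment = new LinkedHashMap<>();
        payment.put("customer", "alice");
        payment.put("cardNumber", "4111111111111111");
        payment.put("amount", 42);
        payment.put("debugInfo", "trace-123");

        // Apply the transformations one after another, as a chain would.
        Map<String, Object> result = insertField(payment, "source", "payments-db");
        result = maskField(result, "cardNumber");
        result = dropFields(result, Set.of("debugInfo"));

        System.out.println(result);
    }
}
```

Each method returns a new map instead of mutating its input, which mirrors the idea that a transformation takes one record and hands a modified record to the next step.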
Kafka Streams
Stream Processing natively on top of Apache Kafka without an additional big data cluster
Kafka Streams - Part of Apache Kafka
Stream Processing
Data at Rest / Data in Motion
Key concepts
Kafka Streams - Processor Topology
1) Read input from Kafka
2) Operator (directed acyclic graph):
• Filter / map / aggregation / joins
• Operators can be stateful
3) Write result back to Kafka
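The three steps above can be sketched without any Kafka dependency. The following is a minimal in-memory model, not Kafka Streams code: the class TopologySketch, the "key:amount" record format, and the sample input are assumptions made for illustration. It shows stateless operators (filter, map) followed by a stateful one (a running total per key).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory model of a source -> operators -> sink pipeline; no Kafka involved. */
final class TopologySketch {

    /** Processes "key:amount" records and emits "key:runningTotal" for each valid record. */
    static List<String> process(List<String> input) {
        Map<String, Integer> totals = new HashMap<>();   // state kept by the aggregation operator
        List<String> output = new ArrayList<>();
        for (String record : input) {                    // step 1: read input
            String[] parts = record.split(":");
            if (parts.length != 2) {
                continue;                                // filter: skip malformed records
            }
            String key = parts[0].trim().toLowerCase();  // map: normalise the key
            int amount = Integer.parseInt(parts[1].trim());
            int total = totals.merge(key, amount, Integer::sum); // stateful aggregation
            output.add(key + ":" + total);               // step 3: write the result
        }
        return output;
    }

    public static void main(String[] args) {
        // Hypothetical input records.
        List<String> input = List.of("Alice:10", "bob:5", "broken", "alice:7");
        System.out.println(process(input));
    }
}
```

In this model the state lives in a local HashMap, so it only illustrates what "stateful operator" means; it says nothing about how state is distributed or recovered.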
Kafka Streams - Runtime
Kafka Streams - Distributed State
Kafka Streams - Scaling
Kafka Streams - Streams and Tables
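KStream
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
StreamsBuilder builder = new StreamsBuilder();
// Example: reading data from Kafka
KStream<byte[], String> textLines = builder.stream("textlines-topic",
    Consumed.with(Serdes.ByteArray(), Serdes.String()));
// Example: transforming data
KStream<byte[], String> upperCasedLines = textLines.mapValues(String::toUpperCase);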
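KTable
import java.util.Arrays;
import org.apache.kafka.streams.kstream.KTable;
// Example: aggregating data (split each line into words, then count per word)
KTable<String, Long> wordCounts = textLines
    .flatMapValues(textLine -> Arrays.asList(textLine.toLowerCase().split("\\W+")))
    .groupBy((key, word) -> word)
    .count();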
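The relationship between the stream of words and the resulting table of counts can be illustrated in plain Java. This is a sketch under stated assumptions, not Kafka Streams itself: the class WordCountTableSketch and the sample lines are hypothetical, the table is an ordinary Map, and empty tokens are skipped, which the snippet above does not do. Every incoming word updates the table and produces one changelog entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of a word-count table built from a stream of text lines. */
final class WordCountTableSketch {

    /** Updates the table for every word and returns one "word=count" entry per update. */
    static List<String> changelog(List<String> textLines, Map<String, Long> table) {
        List<String> updates = new ArrayList<>();
        for (String line : textLines) {
            // Same tokenisation as the snippet: lower-case, split on non-word characters.
            for (String word : Arrays.asList(line.toLowerCase().split("\\W+"))) {
                if (word.isEmpty()) {
                    continue; // split can yield a leading empty token
                }
                long count = table.merge(word, 1L, Long::sum); // update the table in place
                updates.add(word + "=" + count);               // record the change
            }
        }
        return updates;
    }

    public static void main(String[] args) {
        Map<String, Long> table = new LinkedHashMap<>();
        List<String> updates = changelog(List.of("Hello Kafka", "hello streams"), table);
        System.out.println("changelog: " + updates);
        System.out.println("table:     " + table);
    }
}
```

The changelog lists every intermediate count in arrival order, while the table holds only the latest count per word; replaying the changelog from the start rebuilds the table.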
Kafka Streams
A complete streaming microservice, ready for production at large scale
• App configuration
• Define processing (here: WordCount)
• Start processing
What if you are NOT a Java Coder?
(Chart: axes Population and Coding Sophistication; regions "Realm of Stream Processing" and "New, Expanded Realm"; groups BI Analysts, Data Engineers, Core Developers, Core Developers who don’t like Java; labels Java and KSQL)
KSQL is the Streaming SQL Engine for Apache Kafka
Questions?


Kafka Connect and Streams (Concepts, Architecture, Features)