SlideShare a Scribd company logo
1
2
Tim is a teacher, author and technology leader with
Confluent. He is not only an expert on KSQL but he can
also frequently be found speaking at conferences in the
United States and all over the world. He is the co-presenter
of various O’Reilly training videos on topics ranging from Git
to Distributed Systems, and he is the author of Gradle
Beyond the Basics.
Tim Berglund
Senior Director of Developer Experience,
Confluent
3
Housekeeping Items
● This session will last about an hour.
● It will be recorded.
● You can submit your questions by entering them into the GoToWebinar panel.
● The last 10 minutes will consist of Q&A.
● The slides and recording will be available after the talk.
Declarative
Stream
Language
Processing
KSQLis a
KSQLis the
Streaming
SQL Enginefor
Apache Kafka
KSQL Concepts
Exploring KSQL Patterns
KSQL Concepts
• Streams are first-class citizens
• Tables are first-class citizens
• Some queries are persistent
• All queries run until terminated
CREATE STREAM clickstream
WITH (
value_format = ‘JSON’,
kafka_topic=‘my_clickstream_topic’
);
Creating a Stream
• Let’s say we have a topic called my_clickstream_topic
• The topic contains JSON data
• KSQL now knows about that topic
Exploring that Stream
SELECT status, bytes
FROM clickstream
WHERE user_agent =
‘Mozilla/5.0 (compatible; MSIE 6.0)’;
• Now that the stream exists, we can examine its contents
• Simple, declarative filtering
• A non-persistent query
CREATE TABLE users
WITH (
key = ‘user_id',
kafka_topic=‘clickstream_users’,
value_format=‘JSON’
);
Creating a Table
• We have a topic called my_clickstream_topic
• The topic contains JSON data
• The topic contains changelog data
Inspecting that Table
SELECT userid, username
FROM users
WHERE level = ‘Platinum’;
• Now that the table exists, we can examine its contents
• Simple, declarative filtering
• A non-persistent query
Joining a Stream to a Table
• Now that we have clickstream and users, we can join them
• This allows us to do filtering of clicks on a user attribute
CREATE STREAM vip_actions AS
SELECT userid, page, action FROM clickstream c
LEFT JOIN users u ON c.userid = u.user_id
WHERE u.level = 'Platinum';
Usage Patterns
KSQL for Streaming ETL
• Kafka is popular for data pipelines.
• KSQL enables easy transformations of data within the pipe.
• Transforming data while moving from Kafka to another system.
CREATE STREAM vip_actions AS
SELECT userid, page, action FROM clickstream c
LEFT JOIN users u ON c.userid = u.user_id
WHERE u.level = 'Platinum';
KSQL for Anomaly Detection
CREATE TABLE possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY card_number
HAVING count(*) > 3;
Identifying patterns or anomalies in real-time data,
surfaced in milliseconds
KSQL for Real-Time
Monitoring• Log data monitoring, tracking and alerting
• Sensor / IoT data
CREATE TABLE error_counts AS
SELECT error_code, count(*)
FROM monitoring_stream
WINDOW TUMBLING (SIZE 1 MINUTE)
WHERE type = 'ERROR'
GROUP BY error_code;
KSQL for Data Transformation
CREATE STREAM views_by_userid
WITH (PARTITIONS=6,
VALUE_FORMAT=‘JSON’,
TIMESTAMP=‘view_time’) AS
SELECT * FROM clickstream PARTITION BY user_id;
Make simple derivations of existing topics from the command line
Demo
Deployment Patterns
Kafka Cluster
JVM
KSQL ServerKSQL CLI
KSQL in Local Mode
• Starts a CLI and a server in the same JVM
• Ideal for developing on your laptop
bin/ksql-cli local
• Or with customized settings
bin/ksql-cli local --properties-file ksql.properties
KSQL in Local Mode
KSQL in Client-Server Mode
JVM
KSQL Server
KSQL CLI
JVM
KSQL Server
JVM
KSQL Server
Kafka Cluster
• Start any number of server nodes
bin/ksql-server-start
• Start one or more CLIs and point them to a server
bin/ksql-cli remote https://myksqlserver:8090
• All servers share the processing load
Technically, instances of the same Kafka Streams Applications
Scale up/down without restart
KSQL in Client-Server Mode
KSQL in Application Mode
Kafka Cluster
JVM
KSQL Server
JVM
KSQL Server
JVM
KSQL Server
• Start any number of server nodes
Pass a file of KSQL statement to execute
bin/ksql-node query-file=foo/bar.sql
• Ideal for streaming ETL application deployment
Version-control your queries and transformations as code
• All running engines share the processing load
Technically, instances of the same Kafka Streams Applications
Scale up/down without restart
KSQL in Application Mode
Resources and Next Steps
https://github.com/confluentinc/ksql
http://confluent.io/ksql
https://slackpass.io/confluentcommunity #ksql
29
30
Thank you for attending Exploring KSQL
Patterns.
Ad

More Related Content

What's hot (20)

Practical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsPractical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobs
Flink Forward
 
Kafka Connect
Kafka ConnectKafka Connect
Kafka Connect
Oleg Kuznetsov
 
Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®
confluent
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
Flink Forward
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
Guozhang Wang
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes Operator
Flink Forward
 
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...
HostedbyConfluent
 
Apache Flink 101 - the rise of stream processing and beyond
Apache Flink 101 - the rise of stream processing and beyondApache Flink 101 - the rise of stream processing and beyond
Apache Flink 101 - the rise of stream processing and beyond
Bowen Li
 
Enabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache SparkEnabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache Spark
Kazuaki Ishizaki
 
Deploying and Operating KSQL
Deploying and Operating KSQLDeploying and Operating KSQL
Deploying and Operating KSQL
confluent
 
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsRunning Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Databricks
 
From Zero to Hero with Kafka Connect
From Zero to Hero with Kafka ConnectFrom Zero to Hero with Kafka Connect
From Zero to Hero with Kafka Connect
confluent
 
ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database System
confluent
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?
confluent
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptx
Flink Forward
 
CDC Stream Processing with Apache Flink
CDC Stream Processing with Apache FlinkCDC Stream Processing with Apache Flink
CDC Stream Processing with Apache Flink
Timo Walther
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
Flink Forward
 
Best Practices for Middleware and Integration Architecture Modernization with...
Best Practices for Middleware and Integration Architecture Modernization with...Best Practices for Middleware and Integration Architecture Modernization with...
Best Practices for Middleware and Integration Architecture Modernization with...
Claus Ibsen
 
Streams, Tables, and Time in KSQL
Streams, Tables, and Time in KSQLStreams, Tables, and Time in KSQL
Streams, Tables, and Time in KSQL
confluent
 
KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!
Guido Schmutz
 
Practical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsPractical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobs
Flink Forward
 
Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®
confluent
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
Flink Forward
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
Guozhang Wang
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes Operator
Flink Forward
 
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...
Using Kafka at Scale - A Case Study of Micro Services Data Pipelines at Evern...
HostedbyConfluent
 
Apache Flink 101 - the rise of stream processing and beyond
Apache Flink 101 - the rise of stream processing and beyondApache Flink 101 - the rise of stream processing and beyond
Apache Flink 101 - the rise of stream processing and beyond
Bowen Li
 
Enabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache SparkEnabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache Spark
Kazuaki Ishizaki
 
Deploying and Operating KSQL
Deploying and Operating KSQLDeploying and Operating KSQL
Deploying and Operating KSQL
confluent
 
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsRunning Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Databricks
 
From Zero to Hero with Kafka Connect
From Zero to Hero with Kafka ConnectFrom Zero to Hero with Kafka Connect
From Zero to Hero with Kafka Connect
confluent
 
ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database System
confluent
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?
confluent
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptx
Flink Forward
 
CDC Stream Processing with Apache Flink
CDC Stream Processing with Apache FlinkCDC Stream Processing with Apache Flink
CDC Stream Processing with Apache Flink
Timo Walther
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
Flink Forward
 
Best Practices for Middleware and Integration Architecture Modernization with...
Best Practices for Middleware and Integration Architecture Modernization with...Best Practices for Middleware and Integration Architecture Modernization with...
Best Practices for Middleware and Integration Architecture Modernization with...
Claus Ibsen
 
Streams, Tables, and Time in KSQL
Streams, Tables, and Time in KSQLStreams, Tables, and Time in KSQL
Streams, Tables, and Time in KSQL
confluent
 
KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!
Guido Schmutz
 

Similar to Exploring KSQL Patterns (20)

Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka Meetup
Cliff Gilmore
 
Webinar: Unlock the Power of Streaming Data with Kinetica and Confluent
Webinar: Unlock the Power of Streaming Data with Kinetica and ConfluentWebinar: Unlock the Power of Streaming Data with Kinetica and Confluent
Webinar: Unlock the Power of Streaming Data with Kinetica and Confluent
Kinetica
 
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQLKafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
confluent
 
Exploring KSQL Patterns
Exploring KSQL Patterns Exploring KSQL Patterns
Exploring KSQL Patterns
confluent
 
Riviera Jug - 20/03/2018 - KSQL
Riviera Jug - 20/03/2018 - KSQLRiviera Jug - 20/03/2018 - KSQL
Riviera Jug - 20/03/2018 - KSQL
Florent Ramiere
 
Deploying and Operating KSQL
Deploying and Operating KSQLDeploying and Operating KSQL
Deploying and Operating KSQL
confluent
 
ksqlDB Workshop
ksqlDB WorkshopksqlDB Workshop
ksqlDB Workshop
confluent
 
What’s New in CloudStack 4.15 - CloudStack European User Group Virtual, May 2021
What’s New in CloudStack 4.15 - CloudStack European User Group Virtual, May 2021What’s New in CloudStack 4.15 - CloudStack European User Group Virtual, May 2021
What’s New in CloudStack 4.15 - CloudStack European User Group Virtual, May 2021
ShapeBlue
 
KSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache KafkaKSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache Kafka
confluent
 
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
confluent
 
Real time data pipline with kafka streams
Real time data pipline with kafka streamsReal time data pipline with kafka streams
Real time data pipline with kafka streams
Yoni Farin
 
Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!
Paolo Castagna
 
Putting Kafka In Jail – Best Practices To Run Kafka On Kubernetes & DC/OS
Putting Kafka In Jail – Best Practices To Run Kafka On Kubernetes & DC/OSPutting Kafka In Jail – Best Practices To Run Kafka On Kubernetes & DC/OS
Putting Kafka In Jail – Best Practices To Run Kafka On Kubernetes & DC/OS
Lightbend
 
KSQL – An Open Source Streaming Engine for Apache Kafka
KSQL – An Open Source Streaming Engine for Apache KafkaKSQL – An Open Source Streaming Engine for Apache Kafka
KSQL – An Open Source Streaming Engine for Apache Kafka
Kai Wähner
 
KSQL---Streaming SQL for Apache Kafka
KSQL---Streaming SQL for Apache KafkaKSQL---Streaming SQL for Apache Kafka
KSQL---Streaming SQL for Apache Kafka
Matthias J. Sax
 
Jug - ecosystem
Jug -  ecosystemJug -  ecosystem
Jug - ecosystem
Florent Ramiere
 
Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022
Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022
Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022
HostedbyConfluent
 
ITB 2023 qb, Migration, Seeders. Recipe For Success - Gavin-Pickin.pdf
ITB 2023 qb, Migration, Seeders. Recipe For Success - Gavin-Pickin.pdfITB 2023 qb, Migration, Seeders. Recipe For Success - Gavin-Pickin.pdf
ITB 2023 qb, Migration, Seeders. Recipe For Success - Gavin-Pickin.pdf
Ortus Solutions, Corp
 
[AWS Dev Day] 실습워크샵 | Amazon EKS 핸즈온 워크샵
 [AWS Dev Day] 실습워크샵 | Amazon EKS 핸즈온 워크샵 [AWS Dev Day] 실습워크샵 | Amazon EKS 핸즈온 워크샵
[AWS Dev Day] 실습워크샵 | Amazon EKS 핸즈온 워크샵
Amazon Web Services Korea
 
Live Coding a KSQL Application
Live Coding a KSQL ApplicationLive Coding a KSQL Application
Live Coding a KSQL Application
confluent
 
Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka Meetup
Cliff Gilmore
 
Webinar: Unlock the Power of Streaming Data with Kinetica and Confluent
Webinar: Unlock the Power of Streaming Data with Kinetica and ConfluentWebinar: Unlock the Power of Streaming Data with Kinetica and Confluent
Webinar: Unlock the Power of Streaming Data with Kinetica and Confluent
Kinetica
 
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQLKafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
confluent
 
Exploring KSQL Patterns
Exploring KSQL Patterns Exploring KSQL Patterns
Exploring KSQL Patterns
confluent
 
Riviera Jug - 20/03/2018 - KSQL
Riviera Jug - 20/03/2018 - KSQLRiviera Jug - 20/03/2018 - KSQL
Riviera Jug - 20/03/2018 - KSQL
Florent Ramiere
 
Deploying and Operating KSQL
Deploying and Operating KSQLDeploying and Operating KSQL
Deploying and Operating KSQL
confluent
 
ksqlDB Workshop
ksqlDB WorkshopksqlDB Workshop
ksqlDB Workshop
confluent
 
What’s New in CloudStack 4.15 - CloudStack European User Group Virtual, May 2021
What’s New in CloudStack 4.15 - CloudStack European User Group Virtual, May 2021What’s New in CloudStack 4.15 - CloudStack European User Group Virtual, May 2021
What’s New in CloudStack 4.15 - CloudStack European User Group Virtual, May 2021
ShapeBlue
 
KSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache KafkaKSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache Kafka
confluent
 
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
confluent
 
Real time data pipline with kafka streams
Real time data pipline with kafka streamsReal time data pipline with kafka streams
Real time data pipline with kafka streams
Yoni Farin
 
Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!
Paolo Castagna
 
Putting Kafka In Jail – Best Practices To Run Kafka On Kubernetes & DC/OS
Putting Kafka In Jail – Best Practices To Run Kafka On Kubernetes & DC/OSPutting Kafka In Jail – Best Practices To Run Kafka On Kubernetes & DC/OS
Putting Kafka In Jail – Best Practices To Run Kafka On Kubernetes & DC/OS
Lightbend
 
KSQL – An Open Source Streaming Engine for Apache Kafka
KSQL – An Open Source Streaming Engine for Apache KafkaKSQL – An Open Source Streaming Engine for Apache Kafka
KSQL – An Open Source Streaming Engine for Apache Kafka
Kai Wähner
 
KSQL---Streaming SQL for Apache Kafka
KSQL---Streaming SQL for Apache KafkaKSQL---Streaming SQL for Apache Kafka
KSQL---Streaming SQL for Apache Kafka
Matthias J. Sax
 
Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022
Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022
Knock Knock, Who’s There? With Justin Chen and Dhruv Jauhar | Current 2022
HostedbyConfluent
 
ITB 2023 qb, Migration, Seeders. Recipe For Success - Gavin-Pickin.pdf
ITB 2023 qb, Migration, Seeders. Recipe For Success - Gavin-Pickin.pdfITB 2023 qb, Migration, Seeders. Recipe For Success - Gavin-Pickin.pdf
ITB 2023 qb, Migration, Seeders. Recipe For Success - Gavin-Pickin.pdf
Ortus Solutions, Corp
 
[AWS Dev Day] 실습워크샵 | Amazon EKS 핸즈온 워크샵
 [AWS Dev Day] 실습워크샵 | Amazon EKS 핸즈온 워크샵 [AWS Dev Day] 실습워크샵 | Amazon EKS 핸즈온 워크샵
[AWS Dev Day] 실습워크샵 | Amazon EKS 핸즈온 워크샵
Amazon Web Services Korea
 
Live Coding a KSQL Application
Live Coding a KSQL ApplicationLive Coding a KSQL Application
Live Coding a KSQL Application
confluent
 
Ad

More from confluent (20)

Webinar Think Right - Shift Left - 19-03-2025.pptx
Webinar Think Right - Shift Left - 19-03-2025.pptxWebinar Think Right - Shift Left - 19-03-2025.pptx
Webinar Think Right - Shift Left - 19-03-2025.pptx
confluent
 
Migration, backup and restore made easy using Kannika
Migration, backup and restore made easy using KannikaMigration, backup and restore made easy using Kannika
Migration, backup and restore made easy using Kannika
confluent
 
Five Things You Need to Know About Data Streaming in 2025
Five Things You Need to Know About Data Streaming in 2025Five Things You Need to Know About Data Streaming in 2025
Five Things You Need to Know About Data Streaming in 2025
confluent
 
Data in Motion Tour Seoul 2024 - Keynote
Data in Motion Tour Seoul 2024 - KeynoteData in Motion Tour Seoul 2024 - Keynote
Data in Motion Tour Seoul 2024 - Keynote
confluent
 
Data in Motion Tour Seoul 2024 - Roadmap Demo
Data in Motion Tour Seoul 2024  - Roadmap DemoData in Motion Tour Seoul 2024  - Roadmap Demo
Data in Motion Tour Seoul 2024 - Roadmap Demo
confluent
 
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
confluent
 
Confluent per il settore FSI: Accelerare l'Innovazione con il Data Streaming...
Confluent per il settore FSI:  Accelerare l'Innovazione con il Data Streaming...Confluent per il settore FSI:  Accelerare l'Innovazione con il Data Streaming...
Confluent per il settore FSI: Accelerare l'Innovazione con il Data Streaming...
confluent
 
Data in Motion Tour 2024 Riyadh, Saudi Arabia
Data in Motion Tour 2024 Riyadh, Saudi ArabiaData in Motion Tour 2024 Riyadh, Saudi Arabia
Data in Motion Tour 2024 Riyadh, Saudi Arabia
confluent
 
Build a Real-Time Decision Support Application for Financial Market Traders w...
Build a Real-Time Decision Support Application for Financial Market Traders w...Build a Real-Time Decision Support Application for Financial Market Traders w...
Build a Real-Time Decision Support Application for Financial Market Traders w...
confluent
 
Strumenti e Strategie di Stream Governance con Confluent Platform
Strumenti e Strategie di Stream Governance con Confluent PlatformStrumenti e Strategie di Stream Governance con Confluent Platform
Strumenti e Strategie di Stream Governance con Confluent Platform
confluent
 
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not WeeksCompose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
confluent
 
Building Real-Time Gen AI Applications with SingleStore and Confluent
Building Real-Time Gen AI Applications with SingleStore and ConfluentBuilding Real-Time Gen AI Applications with SingleStore and Confluent
Building Real-Time Gen AI Applications with SingleStore and Confluent
confluent
 
Unlocking value with event-driven architecture by Confluent
Unlocking value with event-driven architecture by ConfluentUnlocking value with event-driven architecture by Confluent
Unlocking value with event-driven architecture by Confluent
confluent
 
Il Data Streaming per un’AI real-time di nuova generazione
Il Data Streaming per un’AI real-time di nuova generazioneIl Data Streaming per un’AI real-time di nuova generazione
Il Data Streaming per un’AI real-time di nuova generazione
confluent
 
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
confluent
 
Break data silos with real-time connectivity using Confluent Cloud Connectors
Break data silos with real-time connectivity using Confluent Cloud ConnectorsBreak data silos with real-time connectivity using Confluent Cloud Connectors
Break data silos with real-time connectivity using Confluent Cloud Connectors
confluent
 
Building API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructureBuilding API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructure
confluent
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
confluent
 
Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Era
confluent
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
confluent
 
Webinar Think Right - Shift Left - 19-03-2025.pptx
Webinar Think Right - Shift Left - 19-03-2025.pptxWebinar Think Right - Shift Left - 19-03-2025.pptx
Webinar Think Right - Shift Left - 19-03-2025.pptx
confluent
 
Migration, backup and restore made easy using Kannika
Migration, backup and restore made easy using KannikaMigration, backup and restore made easy using Kannika
Migration, backup and restore made easy using Kannika
confluent
 
Five Things You Need to Know About Data Streaming in 2025
Five Things You Need to Know About Data Streaming in 2025Five Things You Need to Know About Data Streaming in 2025
Five Things You Need to Know About Data Streaming in 2025
confluent
 
Data in Motion Tour Seoul 2024 - Keynote
Data in Motion Tour Seoul 2024 - KeynoteData in Motion Tour Seoul 2024 - Keynote
Data in Motion Tour Seoul 2024 - Keynote
confluent
 
Data in Motion Tour Seoul 2024 - Roadmap Demo
Data in Motion Tour Seoul 2024  - Roadmap DemoData in Motion Tour Seoul 2024  - Roadmap Demo
Data in Motion Tour Seoul 2024 - Roadmap Demo
confluent
 
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
confluent
 
Confluent per il settore FSI: Accelerare l'Innovazione con il Data Streaming...
Confluent per il settore FSI:  Accelerare l'Innovazione con il Data Streaming...Confluent per il settore FSI:  Accelerare l'Innovazione con il Data Streaming...
Confluent per il settore FSI: Accelerare l'Innovazione con il Data Streaming...
confluent
 
Data in Motion Tour 2024 Riyadh, Saudi Arabia
Data in Motion Tour 2024 Riyadh, Saudi ArabiaData in Motion Tour 2024 Riyadh, Saudi Arabia
Data in Motion Tour 2024 Riyadh, Saudi Arabia
confluent
 
Build a Real-Time Decision Support Application for Financial Market Traders w...
Build a Real-Time Decision Support Application for Financial Market Traders w...Build a Real-Time Decision Support Application for Financial Market Traders w...
Build a Real-Time Decision Support Application for Financial Market Traders w...
confluent
 
Strumenti e Strategie di Stream Governance con Confluent Platform
Strumenti e Strategie di Stream Governance con Confluent PlatformStrumenti e Strategie di Stream Governance con Confluent Platform
Strumenti e Strategie di Stream Governance con Confluent Platform
confluent
 
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not WeeksCompose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
confluent
 
Building Real-Time Gen AI Applications with SingleStore and Confluent
Building Real-Time Gen AI Applications with SingleStore and ConfluentBuilding Real-Time Gen AI Applications with SingleStore and Confluent
Building Real-Time Gen AI Applications with SingleStore and Confluent
confluent
 
Unlocking value with event-driven architecture by Confluent
Unlocking value with event-driven architecture by ConfluentUnlocking value with event-driven architecture by Confluent
Unlocking value with event-driven architecture by Confluent
confluent
 
Il Data Streaming per un’AI real-time di nuova generazione
Il Data Streaming per un’AI real-time di nuova generazioneIl Data Streaming per un’AI real-time di nuova generazione
Il Data Streaming per un’AI real-time di nuova generazione
confluent
 
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
confluent
 
Break data silos with real-time connectivity using Confluent Cloud Connectors
Break data silos with real-time connectivity using Confluent Cloud ConnectorsBreak data silos with real-time connectivity using Confluent Cloud Connectors
Break data silos with real-time connectivity using Confluent Cloud Connectors
confluent
 
Building API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructureBuilding API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructure
confluent
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
confluent
 
Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Era
confluent
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
confluent
 
Ad

Recently uploaded (20)

Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 

Exploring KSQL Patterns

  • 1. 1
  • 2. 2 Tim is a teacher, author and technology leader with Confluent. He is not only an expert on KSQL but he can also frequently be found speaking at conferences in the United States and all over the world. He is the co-presenter of various O’Reilly training videos on topics ranging from Git to Distributed Systems, and he is the author of Gradle Beyond the Basics. Tim Berglund Senior Director of Developer Experience, Confluent
  • 3. 3 Housekeeping Items ● This session will last about an hour. ● It will be recorded. ● You can submit your questions by entering them into the GoToWebinar panel. ● The last 10 minutes will consist of Q&A. ● The slides and recording will be available after the talk.
  • 8. KSQL Concepts • Streams are first-class citizens • Tables are first-class citizens • Some queries are persistent • All queries run until terminated
  • 9. CREATE STREAM clickstream WITH ( value_format = ‘JSON’, kafka_topic=‘my_clickstream_topic’ ); Creating a Stream • Let’s say we have a topic called my_clickstream_topic • The topic contains JSON data • KSQL now knows about that topic
  • 10. Exploring that Stream SELECT status, bytes FROM clickstream WHERE user_agent = ‘Mozilla/5.0 (compatible; MSIE 6.0)’; • Now that the stream exists, we can examine its contents • Simple, declarative filtering • A non-persistent query
  • 11. CREATE TABLE users WITH ( key = ‘user_id', kafka_topic=‘clickstream_users’, value_format=‘JSON’ ); Creating a Table • We have a topic called my_clickstream_topic • The topic contains JSON data • The topic contains changelog data
  • 12. Inspecting that Table SELECT userid, username FROM users WHERE level = ‘Platinum’; • Now that the table exists, we can examine its contents • Simple, declarative filtering • A non-persistent query
  • 13. Joining a Stream to a Table • Now that we have clickstream and users, we can join them • This allows us to do filtering of clicks on a user attribute CREATE STREAM vip_actions AS SELECT userid, page, action FROM clickstream c LEFT JOIN users u ON c.userid = u.user_id WHERE u.level = 'Platinum';
  • 15. KSQL for Streaming ETL • Kafka is popular for data pipelines. • KSQL enables easy transformations of data within the pipe. • Transforming data while moving from Kafka to another system. CREATE STREAM vip_actions AS SELECT userid, page, action FROM clickstream c LEFT JOIN users u ON c.userid = u.user_id WHERE u.level = 'Platinum';
  • 16. KSQL for Anomaly Detection CREATE TABLE possible_fraud AS SELECT card_number, count(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 SECONDS) GROUP BY card_number HAVING count(*) > 3; Identifying patterns or anomalies in real-time data, surfaced in milliseconds
  • 17. KSQL for Real-Time Monitoring• Log data monitoring, tracking and alerting • Sensor / IoT data CREATE TABLE error_counts AS SELECT error_code, count(*) FROM monitoring_stream WINDOW TUMBLING (SIZE 1 MINUTE) WHERE type = 'ERROR' GROUP BY error_code;
  • 18. KSQL for Data Transformation CREATE STREAM views_by_userid WITH (PARTITIONS=6, VALUE_FORMAT=‘JSON’, TIMESTAMP=‘view_time’) AS SELECT * FROM clickstream PARTITION BY user_id; Make simple derivations of existing topics from the command line
  • 19. Demo
  • 21. Kafka Cluster JVM KSQL ServerKSQL CLI KSQL in Local Mode
  • 22. • Starts a CLI and a server in the same JVM • Ideal for developing on your laptop bin/ksql-cli local • Or with customized settings bin/ksql-cli local --properties-file ksql.properties KSQL in Local Mode
  • 23. KSQL in Client-Server Mode JVM KSQL Server KSQL CLI JVM KSQL Server JVM KSQL Server Kafka Cluster
  • 24. • Start any number of server nodes bin/ksql-server-start • Start one or more CLIs and point them to a server bin/ksql-cli remote https://myksqlserver:8090 • All servers share the processing load Technically, instances of the same Kafka Streams Applications Scale up/down without restart KSQL in Client-Server Mode
  • 25. KSQL in Application Mode Kafka Cluster JVM KSQL Server JVM KSQL Server JVM KSQL Server
  • 26. • Start any number of server nodes Pass a file of KSQL statement to execute bin/ksql-node query-file=foo/bar.sql • Ideal for streaming ETL application deployment Version-control your queries and transformations as code • All running engines share the processing load Technically, instances of the same Kafka Streams Applications Scale up/down without restart KSQL in Application Mode
  • 27. Resources and Next Steps https://github.com/confluentinc/ksql http://confluent.io/ksql https://slackpass.io/confluentcommunity #ksql
  • 28. 29
  • 29. 30 Thank you for attending Exploring KSQL Patterns.

Editor's Notes

  • #5: Really, stream processing is still a pretty new discipline. We are only on the second generation of OSS tooling (depending on how you look at things), and most people who are building streaming systems are building their first. As a result, most stream processing requires a bunch of custom code, often deployed to specialized infrastructure, coded against specialized APIs. And hey, sometimes that’s what you gotta do, but having a declarative language and getting infrastructure problems out of your way is a good thing. KSQL aims to do both things.
  • #6: Another way to put that, is that KSQL is a SQL engine for Kafka. It’s not a subset of ANSI SQL—it can’t be, since streaming systems deal with unbounded data sets and relational databases are fundamentally about bounded data sets, and that difference matters—but man, how great would it be to have a SQL-like language to describe stream processing computation you want done to the data you have stored in Kafka topics? (If you don’t know Kafka already, it’s a messaging system, and topics are just queues of messages. Basic stuff here, and don’t let it confuse you if you’re new to all of this.)
  • #7: Where does it fit into my system? What is the language syntax like?
  • #8: architecture diagram stuff goes into Kafka, KSQL processes it, it goes out KSQL takes the place of more complex options that have preceded it, like the Streams API or the Producer and Consumer API.
  • #9: KSQL is familiar, but is also different in important ways. What is a stream? An unbounded sequence of facts. What is a table? A collection of evolving facts. We’ll see examples. Queries tend to run until you stop them. This is counterintuitive, but remember we’re dealing with streaming data here. There’s never a “last” record. Persistent queries are really stream processing programs that run in KSQL.
  • #10: Ok, so we want to make ourselves a stream out of a topic we have in Kafka, how to start ? This is a lightweight abstraction on top of the topic. Note that the stream has metadata, but the metadata is extracted automatically from the topic.
  • #11: It’s not an ad-hoc query language as such, but since you can define stream processing jobs with it, it’s certainly possible to use it to arbitrary filtering and projection on existing topics. You have to create streams first, fo course, but we’ve gone over that now.
  • #12: Creating a table. Note that this is fundamentally tabular data: the key is the user_id, so each message in the topic is an update to that user’s record. We don’t need to specify the metadata, because it gets sucked in from the topic.
  • #13: It’s not an ad-hoc query language as such, but since you can define stream processing jobs with it, it’s certainly possible to use it to arbitrary filtering and projection on existing topics. You have to create streams first, fo course, but we’ve gone over that now.
  • #16: On the third bullet: often people build streaming pipelines with Kafka dumping data into C* or Elastic. Well, you’re probably going to need to do some work on the data along the way. No need to have a Spark Streaming job running now!
  • #19: KSQL also turns out to be super-useful for housekeeping and administrative actions that would otherwise require a stream-transforming program of some sort to be written and tested, or changes in the underlying source data ssytems to produce to a topic in a different format in the first place. In this example we’re simply: taking all the records from the ‘clickstream’ topic and copying them into a new ‘views_by_userid’ topic, which we’ve asked to be written out in json format (notice that the inout stream could be in any other format KSQL can read), and explicitly asked for there to be 6 partitions of this output topic, for the record timestamps to be populated from the value of the ‘view_time’ field in the input topic And finally, the records should be distributed across the 6 partitions based on their ‘user_id’ All the options we’re specifying here have sensible defaults and can be omitted if you don’t want or need to override them
  • #20: 4
  • #22: When we’re looking to select tools for solving a particular tech problem in front of us we are always making trade-offs. In kafka-land, one interesting set of trade-offs to consider is this, fairly typical, spectrum: ranging from very flexible and low-level on the left side, using the original kafka client producer and consumer APIs – think of this as being at the level of ‘get-message’, ‘put message’ and you of course have to take care of many details of orchestrating these reads and writes yourself; up through something like the kafka streams api, shown in the center here, where we can hide a lot of lower-level implementation concerns and focus on using functions which operate on a stream of records as a whole – perhaps filtering or transforming every record that passes by in a more functional-programming style. The real shift when using this is in mindset, to a place where you think of passing functions to be run against everything in a stream rather than strictly iterating over the stream yourself. KSQL shifts it up another gear to a place where we can declaratively transform one or more streams into another stream, using syntax and ideas that may be more familiar. Notice how, as we go from left to right on this spectrum, each thing builds upon the preceding one – both conceptually and also literally in terms of implementation – each of these APIs is built around the preceding one
  • #27: Leave resource mgmt. to dedicated systems such as k8s All running Engines share the processing load Technically, instances of the same Kafka Streams Applications Scale up/down without restart
  • #29: This is open source, and you should get involved. You can check out the code on GitHub or play with the many examples there. Also, you are hereby solemnly adjured to join the Slack community and ask questions there!