SlideShare a Scribd company logo
Kafka Streams Windowing
Behind the Curtain
Neil Buesing
Principal Solutions Architect, Rill
Confluent Meetup
July 15th, 2021
• Operational intelligence for data in motion
• Easy In - Easy Up - Easy Out
• Work with customers to build & modernize their pipelines
What Does Rill Do?
• Principal Solutions Architect
• Help customers with pipelines leveraging Apache Druid, Apache Kafka, Kafka
Streams, Apache Beam, and other technologies.
• Data Modeling and Governance
• Rill Data / Apache Druid
What Do I Do?
• Overview of the Windowing Options within Kafka Streams
• Windowing Use-Cases
• Examples of Aggregate Windowing
• What each windowing options does within RocksDB and the -changelog topics
• The key serialization of the -changeling topics
• Developer Tools & Ideas
Takeaways
• Stream / Table Duality
• Compacted Topics need stateful front-end
• Stateful Operations
• Finite datasets — tables
• Boundaries for unbounded data — windows
Why Kafka Streams
Windowing Options
Window Type Time boundary Examples
# records for key
@ point in time
Fixed Size
Tumbling Epoch
[8:00, 8:30)
[8:30, 9:00)
single
Yes
Hopping Epoch
[8:00, 8:30)
[8:15, 8:45)
[8:30, 8:45)
[8:45, 9:00)
constant
Yes
Sliding Record
[8:02, 8:32]
[8:20, 8:50]
[8:21, 8:51]
variable
Yes
Session Record
[8:02, 8:02]
[8:02, 8:10]
[9:10, 12:56]
single
(by tombstoning)
No
001 / 12:00
002 / 12:00
003 / 12:00
001 / 12:30
002 / 12:30
003 / 12:30
12:00 12:15 12:30 12:45 1:00
01
5
Tumbling Time Windows
01
3
02
9
03
5
01
4
5 8 12
5
9
03
11
02
3
02
6
02
5
02
1
03
2
03
7
01
7
01
5
01
3
16 7 9
6 11 12
3 8 15
12
Hopping Time Windows
001/ 12:00
002 / 12:00
003 / 12:00
001 / 12:30
002 / 12:30
003 / 12:30
12:00 12:15 12:30 12:45 1:00
01
5
01
3
02
9
03
5
01
4
5 8 12
5
9
03
11
02
3
02
6
02
5
02
1
03
2
03
7
01
7
01
5
01
3
16 7 9
6 11 12
3 8 15
001 / 11:45
5 8 12
001 / 12:45
002 / 11:45 002 / 12:15 002 / 12:45
003 / 11:45 003 / 12:45
003 / 12:15
3 8 15
1
18 20
11
5
9
12
3 9 14
Sliding Time Windows
001 / 11:31:00.000
12:00 12:15 12:30 12:45 1:00
01
5
01
3
01
4
5
01
5
01
3
001 / 11:33:00.000
8
001 / 11:47.00.000
12
001 / 12:31:00.001
7
001 / 11:33.00:001
4
001 / 11:45:00.000
3
12:01
12:03
12:12
12:48
12:55
001 / 12:31:00.001
3
001 / 11:55:00.000
001 / 11:45:00.001
8
5
12:00 12:15 12:30 12:45 1:00
01
5
Session Windows
01
3
02
9
03
5
01
4
03
11
02
3
02
6
02
5
02
1
03
2
03
7
01
7
01
5
01
3
5 8 12 3 8 15
5
9
16 23 25
18 23 24
12
Windowing Options
Window Type Time boundary Examples
# records for key
@ point in time
Fixed Size
Tumbling Epoch
[8:00, 8:30)
[8:30, 9:00)
single
Yes
Hopping Epoch
[8:00, 8:30)
[8:15, 8:45)
[8:30, 8:45)
[8:45, 9:00)
constant
Yes
Sliding Record
[8:02, 8:32]
[8:20, 8:50]
[8:21, 8:51]
variable
Yes
Session Record
[8:02, 8:02]
[8:02, 8:10]
[9:10, 12:56]
single
(by tombstoning)
No
• Good
• Web Visitors
• Products Purchased
• Inventory Management
• IoT Sensors
• Ad Impressions*
• Bad
• Fraud Detection
• User Interactions
• Composition*
Tumbling Time Windows
* event timestamp & grace period
• Good
• Web Visitors
• Products Purchased
• Fraud Detection
• IoT Sensors
• Bad
• User Interactions
• Inventory Management
• Composition
• Ad Impressions
Hopping Time Windows
• Good
• User Interactions
• Fraud Detection
• Usage Changes
• Bad
• Composition
• IoT Sensors
Sliding Time Windows
• Good
• User Interactions / Click Stream
• User Behavior Analysis
• IoT device - session oriented
(running)
• Bad
• Data Analytics (Generalizations)
• IoT sensors - always on
(pacemaker)
Session Windows
• Good
• Composition*
• Finite Datasets
• Bad
• Fraud Detection
• Monitoring
• Unbounded data*
No Windows
* manual tombstoning
Order Processing
Order Analytics
Demo Applications
orders-purchase orders-pickup
repartition
attach
user
& store
attach
line item
pricing
assemble
product
analytics
pickup-order-handler-purchase-order-join-product-repartition
product-repartition product-stats
State
Materialized!<> store = Materialized.as("po").withCachingDisabled();
builder.<String, PurchaseOrder>stream(opt.getTopic())
.groupByKey(Grouped.as("groupByKey"))
.windowedBy(TimeWindows.of(Duration.ofSeconds(opt.getWindowSize()))
.grace(Duration.ofSeconds(opt.getGracePeriod())))
.aggregate(Streams!::initialize,
Streams!::aggregator,
Named.as("aggregate"),
store)
.toStream(Named.as("toStream"))
.selectKey((k, v) !-> k.key() + " " + toStr(k.window()) + "," + ")")
.mapValues(Streams!::minimize)
.to(opt.topic(), Produced.as("to"));
Materialized!<> store = Materialized.as("po").withCachingDisabled();
builder.<String, PurchaseOrder>stream(opt.getTopic())
.groupByKey(Grouped.as("groupByKey"))
.windowedBy(TimeWindows.of(Duration.ofSeconds(opt.getWindowSize()))
.grace(Duration.ofSeconds(opt.getGracePeriod())))
.aggregate(Streams!::initialize,
Streams!::aggregator,
Named.as("aggregate"),
store)
.toStream(Named.as("toStream"))
.selectKey((k, v) !-> k.key() + " " + toStr(k.window()) + "," + ")")
.mapValues(Streams!::minimize)
.to(opt.topic(), Produced.as("to"));
TimeWindows.of(Duration.ofSeconds(opt.getWindowSize()))
.advanceBy(Duration.ofSeconds(opt.getWindowSize() / 2))
.grace(Duration.ofSeconds(opt.getGracePeriod())
SlidingWindows.withTimeDifferenceAndGrace(
Duration.ofSeconds(opt.getWindowSize()),
Duration.ofSeconds(opt.getGracePeriod()))
SessionWindows.with(Duration.ofSeconds(opt.getWindowSize()))
Demo “Time”
product
analytics
product-repartition product-stats
rocksdb_ldb
console-
consumer
console-
consumer
Metrics
-changelog
* Caching disabled
• Deserializers. % ln -s {jar} /usr/local/confluent/share/java/kafka
• WindowDeserializer
• SessionDeserializer
• RocksDB
• RocksDB’s rocksdb_ldb % brew install rocksdb
• Scripts
• rocksdb_key_parser
• rocksdb_window_parser
• rocksdb_session_parser
Demo Tools
DEMO
• Emitting Results
• Suppression
• Commit Time
• Window Boundaries
• Epoch vs. Event
• Long Windows*
• Join Windowing
• RocksDB Tuning
• RocksDB state store instances…
What Next
• Overview of the Windowing Options within Kafka Streams
• Windowing Use-Cases
• Examples of Aggregate Windowing
• What each windowing options does within RocksDB and the -changelog topics
• The key serialization of the -changeling topics
• Advance Considerations
Takeaways
• Demo Application
https://github.com/nbuesing/kafka-streams-dashboards
• Kafka Summit Europe 2021
https://www.confluent.io/events/kafka-summit-europe-2021/what-is-the-state-of-
my-kafka-streams-application-unleashing-metrics/
Resources
• Nick Dearden’s
https://www.confluent.io/kafka-summit-ny19/zen-and-the-art-of-streaming-joins/
• Anna McDonald’s
https://www.confluent.io/kafka-summit-san-francisco-2019/using-kafka-to-discover-events-hidden-in-
your-database/
• Matthias Sax’s
https://www.confluent.io/kafka-summit-san-francisco-2019/whats-the-time-and-why/
https://www.confluent.io/resources/kafka-summit-2020/the-flux-capacitor-of-kafka-streams-and-ksqldb/
Additional Resources
Blooper Reel
Kafka streams windowing behind the curtain
Ad

More Related Content

What's hot (20)

A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
confluent
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
Amita Mirajkar
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
Knoldus Inc.
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin Podval
Martin Podval
 
Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
DataWorks Summit/Hadoop Summit
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache Kafka
Jiangjie Qin
 
Stateful stream processing with Apache Flink
Stateful stream processing with Apache FlinkStateful stream processing with Apache Flink
Stateful stream processing with Apache Flink
Knoldus Inc.
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Flink Forward
 
Deploying Flink on Kubernetes - David Anderson
 Deploying Flink on Kubernetes - David Anderson Deploying Flink on Kubernetes - David Anderson
Deploying Flink on Kubernetes - David Anderson
Ververica
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
Xiang Fu
 
Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...
Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...
Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...
HostedbyConfluent
 
Improving fault tolerance and scaling out in Kafka Streams with Bill Bejeck |...
Improving fault tolerance and scaling out in Kafka Streams with Bill Bejeck |...Improving fault tolerance and scaling out in Kafka Streams with Bill Bejeck |...
Improving fault tolerance and scaling out in Kafka Streams with Bill Bejeck |...
HostedbyConfluent
 
kafka
kafkakafka
kafka
Amikam Snir
 
Dynamic Allocation in Spark
Dynamic Allocation in SparkDynamic Allocation in Spark
Dynamic Allocation in Spark
Databricks
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
Kostas Tzoumas
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at Pinterest
Flink Forward
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
DataWorks Summit
 
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Flink Forward
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
Flink Forward
 
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
confluent
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
Amita Mirajkar
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
Knoldus Inc.
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin Podval
Martin Podval
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache Kafka
Jiangjie Qin
 
Stateful stream processing with Apache Flink
Stateful stream processing with Apache FlinkStateful stream processing with Apache Flink
Stateful stream processing with Apache Flink
Knoldus Inc.
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Flink Forward
 
Deploying Flink on Kubernetes - David Anderson
 Deploying Flink on Kubernetes - David Anderson Deploying Flink on Kubernetes - David Anderson
Deploying Flink on Kubernetes - David Anderson
Ververica
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
Xiang Fu
 
Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...
Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...
Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...
HostedbyConfluent
 
Improving fault tolerance and scaling out in Kafka Streams with Bill Bejeck |...
Improving fault tolerance and scaling out in Kafka Streams with Bill Bejeck |...Improving fault tolerance and scaling out in Kafka Streams with Bill Bejeck |...
Improving fault tolerance and scaling out in Kafka Streams with Bill Bejeck |...
HostedbyConfluent
 
Dynamic Allocation in Spark
Dynamic Allocation in SparkDynamic Allocation in Spark
Dynamic Allocation in Spark
Databricks
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
Kostas Tzoumas
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at Pinterest
Flink Forward
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
DataWorks Summit
 
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Flink Forward
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
Flink Forward
 
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward
 

Similar to Kafka streams windowing behind the curtain (20)

Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingCloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
DoiT International
 
Dataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data ProcessingDataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data Processing
DoiT International
 
Gcp dataflow
Gcp dataflowGcp dataflow
Gcp dataflow
Igor Roiter
 
Azure Stream Analytics : Analyse Data in Motion
Azure Stream Analytics  : Analyse Data in MotionAzure Stream Analytics  : Analyse Data in Motion
Azure Stream Analytics : Analyse Data in Motion
Ruhani Arora
 
Data Stream Processing - Concepts and Frameworks
Data Stream Processing - Concepts and FrameworksData Stream Processing - Concepts and Frameworks
Data Stream Processing - Concepts and Frameworks
Matthias Niehoff
 
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...
Flink Forward
 
Google Cloud Dataflow Two Worlds Become a Much Better One
Google Cloud Dataflow Two Worlds Become a Much Better OneGoogle Cloud Dataflow Two Worlds Become a Much Better One
Google Cloud Dataflow Two Worlds Become a Much Better One
DataWorks Summit
 
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
Flink Forward
 
Onyx data processing the clojure way
Onyx   data processing  the clojure wayOnyx   data processing  the clojure way
Onyx data processing the clojure way
Bahadir Cambel
 
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
WSO2
 
Streaming Data Pipelines With Apache Beam
Streaming Data Pipelines With Apache BeamStreaming Data Pipelines With Apache Beam
Streaming Data Pipelines With Apache Beam
All Things Open
 
Event Hub & Azure Stream Analytics
Event Hub & Azure Stream AnalyticsEvent Hub & Azure Stream Analytics
Event Hub & Azure Stream Analytics
Davide Mauri
 
[WSO2Con EU 2018] The Rise of Streaming SQL
[WSO2Con EU 2018] The Rise of Streaming SQL[WSO2Con EU 2018] The Rise of Streaming SQL
[WSO2Con EU 2018] The Rise of Streaming SQL
WSO2
 
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Data Con LA
 
Apache Beam and Google Cloud Dataflow - IDG - final
Apache Beam and Google Cloud Dataflow - IDG - finalApache Beam and Google Cloud Dataflow - IDG - final
Apache Beam and Google Cloud Dataflow - IDG - final
Sub Szabolcs Feczak
 
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
DataWorks Summit
 
SplunkLive! Washington DC May 2013 - Splunk Security Workshop
SplunkLive! Washington DC May 2013 - Splunk Security WorkshopSplunkLive! Washington DC May 2013 - Splunk Security Workshop
SplunkLive! Washington DC May 2013 - Splunk Security Workshop
Splunk
 
Sql azure cluster dashboard public.ppt
Sql azure cluster dashboard public.pptSql azure cluster dashboard public.ppt
Sql azure cluster dashboard public.ppt
Qingsong Yao
 
Google's Infrastructure and Specific IoT Services
Google's Infrastructure and Specific IoT ServicesGoogle's Infrastructure and Specific IoT Services
Google's Infrastructure and Specific IoT Services
Intel® Software
 
Empowering the AWS DynamoDB™ application developer with Alternator
Empowering the AWS DynamoDB™ application developer with AlternatorEmpowering the AWS DynamoDB™ application developer with Alternator
Empowering the AWS DynamoDB™ application developer with Alternator
ScyllaDB
 
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingCloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
DoiT International
 
Dataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data ProcessingDataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data Processing
DoiT International
 
Azure Stream Analytics : Analyse Data in Motion
Azure Stream Analytics  : Analyse Data in MotionAzure Stream Analytics  : Analyse Data in Motion
Azure Stream Analytics : Analyse Data in Motion
Ruhani Arora
 
Data Stream Processing - Concepts and Frameworks
Data Stream Processing - Concepts and FrameworksData Stream Processing - Concepts and Frameworks
Data Stream Processing - Concepts and Frameworks
Matthias Niehoff
 
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea...
Flink Forward
 
Google Cloud Dataflow Two Worlds Become a Much Better One
Google Cloud Dataflow Two Worlds Become a Much Better OneGoogle Cloud Dataflow Two Worlds Become a Much Better One
Google Cloud Dataflow Two Worlds Become a Much Better One
DataWorks Summit
 
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
Flink Forward
 
Onyx data processing the clojure way
Onyx   data processing  the clojure wayOnyx   data processing  the clojure way
Onyx data processing the clojure way
Bahadir Cambel
 
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
[WSO2Con EU 2017] Deriving Insights for Your Digital Business with Analytics
WSO2
 
Streaming Data Pipelines With Apache Beam
Streaming Data Pipelines With Apache BeamStreaming Data Pipelines With Apache Beam
Streaming Data Pipelines With Apache Beam
All Things Open
 
Event Hub & Azure Stream Analytics
Event Hub & Azure Stream AnalyticsEvent Hub & Azure Stream Analytics
Event Hub & Azure Stream Analytics
Davide Mauri
 
[WSO2Con EU 2018] The Rise of Streaming SQL
[WSO2Con EU 2018] The Rise of Streaming SQL[WSO2Con EU 2018] The Rise of Streaming SQL
[WSO2Con EU 2018] The Rise of Streaming SQL
WSO2
 
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Data Con LA
 
Apache Beam and Google Cloud Dataflow - IDG - final
Apache Beam and Google Cloud Dataflow - IDG - finalApache Beam and Google Cloud Dataflow - IDG - final
Apache Beam and Google Cloud Dataflow - IDG - final
Sub Szabolcs Feczak
 
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
DataWorks Summit
 
SplunkLive! Washington DC May 2013 - Splunk Security Workshop
SplunkLive! Washington DC May 2013 - Splunk Security WorkshopSplunkLive! Washington DC May 2013 - Splunk Security Workshop
SplunkLive! Washington DC May 2013 - Splunk Security Workshop
Splunk
 
Sql azure cluster dashboard public.ppt
Sql azure cluster dashboard public.pptSql azure cluster dashboard public.ppt
Sql azure cluster dashboard public.ppt
Qingsong Yao
 
Google's Infrastructure and Specific IoT Services
Google's Infrastructure and Specific IoT ServicesGoogle's Infrastructure and Specific IoT Services
Google's Infrastructure and Specific IoT Services
Intel® Software
 
Empowering the AWS DynamoDB™ application developer with Alternator
Empowering the AWS DynamoDB™ application developer with AlternatorEmpowering the AWS DynamoDB™ application developer with Alternator
Empowering the AWS DynamoDB™ application developer with Alternator
ScyllaDB
 
Ad

More from confluent (20)

Webinar Think Right - Shift Left - 19-03-2025.pptx
Webinar Think Right - Shift Left - 19-03-2025.pptxWebinar Think Right - Shift Left - 19-03-2025.pptx
Webinar Think Right - Shift Left - 19-03-2025.pptx
confluent
 
Migration, backup and restore made easy using Kannika
Migration, backup and restore made easy using KannikaMigration, backup and restore made easy using Kannika
Migration, backup and restore made easy using Kannika
confluent
 
Five Things You Need to Know About Data Streaming in 2025
Five Things You Need to Know About Data Streaming in 2025Five Things You Need to Know About Data Streaming in 2025
Five Things You Need to Know About Data Streaming in 2025
confluent
 
Data in Motion Tour Seoul 2024 - Keynote
Data in Motion Tour Seoul 2024 - KeynoteData in Motion Tour Seoul 2024 - Keynote
Data in Motion Tour Seoul 2024 - Keynote
confluent
 
Data in Motion Tour Seoul 2024 - Roadmap Demo
Data in Motion Tour Seoul 2024  - Roadmap DemoData in Motion Tour Seoul 2024  - Roadmap Demo
Data in Motion Tour Seoul 2024 - Roadmap Demo
confluent
 
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
confluent
 
Confluent per il settore FSI: Accelerare l'Innovazione con il Data Streaming...
Confluent per il settore FSI:  Accelerare l'Innovazione con il Data Streaming...Confluent per il settore FSI:  Accelerare l'Innovazione con il Data Streaming...
Confluent per il settore FSI: Accelerare l'Innovazione con il Data Streaming...
confluent
 
Data in Motion Tour 2024 Riyadh, Saudi Arabia
Data in Motion Tour 2024 Riyadh, Saudi ArabiaData in Motion Tour 2024 Riyadh, Saudi Arabia
Data in Motion Tour 2024 Riyadh, Saudi Arabia
confluent
 
Build a Real-Time Decision Support Application for Financial Market Traders w...
Build a Real-Time Decision Support Application for Financial Market Traders w...Build a Real-Time Decision Support Application for Financial Market Traders w...
Build a Real-Time Decision Support Application for Financial Market Traders w...
confluent
 
Strumenti e Strategie di Stream Governance con Confluent Platform
Strumenti e Strategie di Stream Governance con Confluent PlatformStrumenti e Strategie di Stream Governance con Confluent Platform
Strumenti e Strategie di Stream Governance con Confluent Platform
confluent
 
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not WeeksCompose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
confluent
 
Building Real-Time Gen AI Applications with SingleStore and Confluent
Building Real-Time Gen AI Applications with SingleStore and ConfluentBuilding Real-Time Gen AI Applications with SingleStore and Confluent
Building Real-Time Gen AI Applications with SingleStore and Confluent
confluent
 
Unlocking value with event-driven architecture by Confluent
Unlocking value with event-driven architecture by ConfluentUnlocking value with event-driven architecture by Confluent
Unlocking value with event-driven architecture by Confluent
confluent
 
Il Data Streaming per un’AI real-time di nuova generazione
Il Data Streaming per un’AI real-time di nuova generazioneIl Data Streaming per un’AI real-time di nuova generazione
Il Data Streaming per un’AI real-time di nuova generazione
confluent
 
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
confluent
 
Break data silos with real-time connectivity using Confluent Cloud Connectors
Break data silos with real-time connectivity using Confluent Cloud ConnectorsBreak data silos with real-time connectivity using Confluent Cloud Connectors
Break data silos with real-time connectivity using Confluent Cloud Connectors
confluent
 
Building API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructureBuilding API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructure
confluent
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
confluent
 
Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Era
confluent
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
confluent
 
Webinar Think Right - Shift Left - 19-03-2025.pptx
Webinar Think Right - Shift Left - 19-03-2025.pptxWebinar Think Right - Shift Left - 19-03-2025.pptx
Webinar Think Right - Shift Left - 19-03-2025.pptx
confluent
 
Migration, backup and restore made easy using Kannika
Migration, backup and restore made easy using KannikaMigration, backup and restore made easy using Kannika
Migration, backup and restore made easy using Kannika
confluent
 
Five Things You Need to Know About Data Streaming in 2025
Five Things You Need to Know About Data Streaming in 2025Five Things You Need to Know About Data Streaming in 2025
Five Things You Need to Know About Data Streaming in 2025
confluent
 
Data in Motion Tour Seoul 2024 - Keynote
Data in Motion Tour Seoul 2024 - KeynoteData in Motion Tour Seoul 2024 - Keynote
Data in Motion Tour Seoul 2024 - Keynote
confluent
 
Data in Motion Tour Seoul 2024 - Roadmap Demo
Data in Motion Tour Seoul 2024  - Roadmap DemoData in Motion Tour Seoul 2024  - Roadmap Demo
Data in Motion Tour Seoul 2024 - Roadmap Demo
confluent
 
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
confluent
 
Confluent per il settore FSI: Accelerare l'Innovazione con il Data Streaming...
Confluent per il settore FSI:  Accelerare l'Innovazione con il Data Streaming...Confluent per il settore FSI:  Accelerare l'Innovazione con il Data Streaming...
Confluent per il settore FSI: Accelerare l'Innovazione con il Data Streaming...
confluent
 
Data in Motion Tour 2024 Riyadh, Saudi Arabia
Data in Motion Tour 2024 Riyadh, Saudi ArabiaData in Motion Tour 2024 Riyadh, Saudi Arabia
Data in Motion Tour 2024 Riyadh, Saudi Arabia
confluent
 
Build a Real-Time Decision Support Application for Financial Market Traders w...
Build a Real-Time Decision Support Application for Financial Market Traders w...Build a Real-Time Decision Support Application for Financial Market Traders w...
Build a Real-Time Decision Support Application for Financial Market Traders w...
confluent
 
Strumenti e Strategie di Stream Governance con Confluent Platform
Strumenti e Strategie di Stream Governance con Confluent PlatformStrumenti e Strategie di Stream Governance con Confluent Platform
Strumenti e Strategie di Stream Governance con Confluent Platform
confluent
 
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not WeeksCompose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
confluent
 
Building Real-Time Gen AI Applications with SingleStore and Confluent
Building Real-Time Gen AI Applications with SingleStore and ConfluentBuilding Real-Time Gen AI Applications with SingleStore and Confluent
Building Real-Time Gen AI Applications with SingleStore and Confluent
confluent
 
Unlocking value with event-driven architecture by Confluent
Unlocking value with event-driven architecture by ConfluentUnlocking value with event-driven architecture by Confluent
Unlocking value with event-driven architecture by Confluent
confluent
 
Il Data Streaming per un’AI real-time di nuova generazione
Il Data Streaming per un’AI real-time di nuova generazioneIl Data Streaming per un’AI real-time di nuova generazione
Il Data Streaming per un’AI real-time di nuova generazione
confluent
 
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
confluent
 
Break data silos with real-time connectivity using Confluent Cloud Connectors
Break data silos with real-time connectivity using Confluent Cloud ConnectorsBreak data silos with real-time connectivity using Confluent Cloud Connectors
Break data silos with real-time connectivity using Confluent Cloud Connectors
confluent
 
Building API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructureBuilding API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructure
confluent
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
confluent
 
Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Era
confluent
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
confluent
 
Ad

Recently uploaded (20)

Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 

Kafka streams windowing behind the curtain

  • 1. Kafka Streams Windowing Behind the Curtain Neil Buesing Principal Solutions Architect, Rill Confluent Meetup July 15th, 2021
  • 2. • Operational intelligence for data in motion • Easy In - Easy Up - Easy Out • Work with customers to build & modernize their pipelines What Does Rill Do?
  • 3. • Principal Solutions Architect • Help customers with pipelines leveraging Apache Druid, Apache Kafka, Kafka Streams, Apache Beam, and other technologies. • Data Modeling and Governance • Rill Data / Apache Druid What Do I Do?
  • 4. • Overview of the Windowing Options within Kafka Streams • Windowing Use-Cases • Examples of Aggregate Windowing • What each windowing options does within RocksDB and the -changelog topics • The key serialization of the -changeling topics • Developer Tools & Ideas Takeaways
  • 5. • Stream / Table Duality • Compacted Topics need stateful front-end • Stateful Operations • Finite datasets — tables • Boundaries for unbounded data — windows Why Kafka Streams
  • 6. Windowing Options Window Type Time boundary Examples # records for key @ point in time Fixed Size Tumbling Epoch [8:00, 8:30) [8:30, 9:00) single Yes Hopping Epoch [8:00, 8:30) [8:15, 8:45) [8:30, 8:45) [8:45, 9:00) constant Yes Sliding Record [8:02, 8:32] [8:20, 8:50] [8:21, 8:51] variable Yes Session Record [8:02, 8:02] [8:02, 8:10] [9:10, 12:56] single (by tombstoning) No
  • 7. 001 / 12:00 002 / 12:00 003 / 12:00 001 / 12:30 002 / 12:30 003 / 12:30 12:00 12:15 12:30 12:45 1:00 01 5 Tumbling Time Windows 01 3 02 9 03 5 01 4 5 8 12 5 9 03 11 02 3 02 6 02 5 02 1 03 2 03 7 01 7 01 5 01 3 16 7 9 6 11 12 3 8 15 12
  • 8. Hopping Time Windows 001/ 12:00 002 / 12:00 003 / 12:00 001 / 12:30 002 / 12:30 003 / 12:30 12:00 12:15 12:30 12:45 1:00 01 5 01 3 02 9 03 5 01 4 5 8 12 5 9 03 11 02 3 02 6 02 5 02 1 03 2 03 7 01 7 01 5 01 3 16 7 9 6 11 12 3 8 15 001 / 11:45 5 8 12 001 / 12:45 002 / 11:45 002 / 12:15 002 / 12:45 003 / 11:45 003 / 12:45 003 / 12:15 3 8 15 1 18 20 11 5 9 12 3 9 14
  • 9. Sliding Time Windows 001 / 11:31:00.000 12:00 12:15 12:30 12:45 1:00 01 5 01 3 01 4 5 01 5 01 3 001 / 11:33:00.000 8 001 / 11:47.00.000 12 001 / 12:31:00.001 7 001 / 11:33.00:001 4 001 / 11:45:00.000 3 12:01 12:03 12:12 12:48 12:55 001 / 12:31:00.001 3 001 / 11:55:00.000 001 / 11:45:00.001 8 5
  • 10. 12:00 12:15 12:30 12:45 1:00 01 5 Session Windows 01 3 02 9 03 5 01 4 03 11 02 3 02 6 02 5 02 1 03 2 03 7 01 7 01 5 01 3 5 8 12 3 8 15 5 9 16 23 25 18 23 24 12
  • 11. Windowing Options Window Type Time boundary Examples # records for key @ point in time Fixed Size Tumbling Epoch [8:00, 8:30) [8:30, 9:00) single Yes Hopping Epoch [8:00, 8:30) [8:15, 8:45) [8:30, 8:45) [8:45, 9:00) constant Yes Sliding Record [8:02, 8:32] [8:20, 8:50] [8:21, 8:51] variable Yes Session Record [8:02, 8:02] [8:02, 8:10] [9:10, 12:56] single (by tombstoning) No
  • 12. • Good • Web Visitors • Products Purchased • Inventory Management • IoT Sensors • Ad Impressions* • Bad • Fraud Detection • User Interactions • Composition* Tumbling Time Windows * event timestamp & grace period
  • 13. • Good • Web Visitors • Products Purchased • Fraud Detection • IoT Sensors • Bad • User Interactions • Inventory Management • Composition • Ad Impressions Hopping Time Windows
  • 14. • Good • User Interactions • Fraud Detection • Usage Changes • Bad • Composition • IoT Sensors Sliding Time Windows
  • 15. • Good • User Interactions / Click Stream • User Behavior Analysis • IoT device - session oriented (running) • Bad • Data Analytics (Generalizations) • IoT sensors - always on (pacemaker) Session Windows
  • 16. • Good • Composition* • Finite Datasets • Bad • Fraud Detection • Monitoring • Unbounded data* No Windows * manual tombstoning
  • 17. Order Processing Order Analytics Demo Applications orders-purchase orders-pickup repartition attach user & store attach line item pricing assemble product analytics pickup-order-handler-purchase-order-join-product-repartition product-repartition product-stats State
  • 18. Materialized!<> store = Materialized.as("po").withCachingDisabled(); builder.<String, PurchaseOrder>stream(opt.getTopic()) .groupByKey(Grouped.as("groupByKey")) .windowedBy(TimeWindows.of(Duration.ofSeconds(opt.getWindowSize())) .grace(Duration.ofSeconds(opt.getGracePeriod()))) .aggregate(Streams!::initialize, Streams!::aggregator, Named.as("aggregate"), store) .toStream(Named.as("toStream")) .selectKey((k, v) !-> k.key() + " " + toStr(k.window()) + "," + ")") .mapValues(Streams!::minimize) .to(opt.topic(), Produced.as("to"));
  • 19. Materialized!<> store = Materialized.as("po").withCachingDisabled(); builder.<String, PurchaseOrder>stream(opt.getTopic()) .groupByKey(Grouped.as("groupByKey")) .windowedBy(TimeWindows.of(Duration.ofSeconds(opt.getWindowSize())) .grace(Duration.ofSeconds(opt.getGracePeriod()))) .aggregate(Streams!::initialize, Streams!::aggregator, Named.as("aggregate"), store) .toStream(Named.as("toStream")) .selectKey((k, v) !-> k.key() + " " + toStr(k.window()) + "," + ")") .mapValues(Streams!::minimize) .to(opt.topic(), Produced.as("to")); TimeWindows.of(Duration.ofSeconds(opt.getWindowSize())) .advanceBy(Duration.ofSeconds(opt.getWindowSize() / 2)) .grace(Duration.ofSeconds(opt.getGracePeriod()) SlidingWindows.withTimeDifferenceAndGrace( Duration.ofSeconds(opt.getWindowSize()), Duration.ofSeconds(opt.getGracePeriod())) SessionWindows.with(Duration.ofSeconds(opt.getWindowSize()))
  • 21. • Deserializers. % ln -s {jar} /usr/local/confluent/share/java/kafka • WindowDeserializer • SessionDeserializer • RocksDB • RocksDB’s rocksdb_ldb % brew install rocksdb • Scripts • rocksdb_key_parser • rocksdb_window_parser • rocksdb_session_parser Demo Tools
  • 22. DEMO
  • 23. • Emitting Results • Suppression • Commit Time • Window Boundaries • Epoch vs. Event • Long Windows* • Join Windowing • RocksDB Tuning • RocksDB state store instances… What Next
  • 24. • Overview of the Windowing Options within Kafka Streams • Windowing Use-Cases • Examples of Aggregate Windowing • What each windowing options does within RocksDB and the -changelog topics • The key serialization of the -changeling topics • Advance Considerations Takeaways
  • 25. • Demo Application https://github.com/nbuesing/kafka-streams-dashboards • Kafka Summit Europe 2021 https://www.confluent.io/events/kafka-summit-europe-2021/what-is-the-state-of- my-kafka-streams-application-unleashing-metrics/ Resources
  • 26. • Nick Dearden’s https://www.confluent.io/kafka-summit-ny19/zen-and-the-art-of-streaming-joins/ • Anna McDonald’s https://www.confluent.io/kafka-summit-san-francisco-2019/using-kafka-to-discover-events-hidden-in- your-database/ • Matthias Sax’s https://www.confluent.io/kafka-summit-san-francisco-2019/whats-the-time-and-why/ https://www.confluent.io/resources/kafka-summit-2020/the-flux-capacitor-of-kafka-streams-and-ksqldb/ Additional Resources