Data Transformations on Operational Metrics using Kafka Streams
Vidhya Ramachandran
Mukund Murthy
About Priceline
Priceline.com is part of the Booking Holdings Group. The Booking Holdings Group is the world leader in online travel & related services. The Group’s mission is to help people experience the world.
About Us
Vidhya Ramachandran: Director Engineering, Common Platform Services
Mukund Murthy: Software Engineer, Common Platform Services
Legacy Monitoring System
Legacy Architecture
Custom Alerting System
Custom Dashboards
Application Servers
Kafka - The Only Constant!
Current Architecture
Motive behind going with Kafka & Kafka Streams
Kafka Infrastructure & new Monitoring Solution & other Sinks
Config Reader Event Listener
Applications
Message Router
Embedded Agent
Asynchronous Message Dispatcher
Kafka Consumer
Cluster
Infrastructure Code embedded in the Products to Produce Events to the Kafka Topic
Splunk HEC
Add other sinks
Data Collection Console
Priceline Data Collection Console
Priceline Streams Multiplexing
• Stream = events emitted by some source (e.g. a Priceline application)
• Multiple streams can come from the same application and be written into the same topic
• A stream can be emitted by multiple applications
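One way to picture the multiplexing described above: each event carries a stream identifier, so a consumer of the shared topic can regroup events per stream no matter which application emitted them. The sketch below is illustrative only. The field names `stream` and `app`, and the in-memory list standing in for a topic, are assumptions and not Priceline's actual schema.

```python
from collections import defaultdict

# Hypothetical events as they might sit in one shared topic.
# The "stream" and "app" field names are assumptions for illustration.
topic = [
    {"stream": "stream-1", "app": "app-a", "msg": "login"},
    {"stream": "stream-2", "app": "app-a", "msg": "search"},
    {"stream": "stream-1", "app": "app-b", "msg": "login"},
    {"stream": "stream-3", "app": "app-b", "msg": "checkout"},
]


def demultiplex(events):
    """Group events from a shared topic by their stream identifier."""
    by_stream = defaultdict(list)
    for event in events:
        by_stream[event["stream"]].append(event)
    return dict(by_stream)


streams = demultiplex(topic)
# stream-1 is emitted by two applications; app-a emits two streams.
apps_for_stream_1 = sorted({e["app"] for e in streams["stream-1"]})
```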
Stream 1
Stream 3
Stream 2
Topic
Priceline Data Collection Console
Streams in Splunk before & after transformations
License Reduction from Transformations
Operational metrics data flowing through Kafka is converted to a pipe-separated format, and the keys are applied back as search-time extractions.
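A minimal sketch of the idea, assuming a fixed field order per stream: the keys are dropped from each event before indexing and reattached when the data is read back. The field names and their order are hypothetical, and values that themselves contain `|` would need escaping, which this sketch does not handle.

```python
# Field order is fixed per stream so the keys can be dropped from each event
# and reattached later. This order is a hypothetical example.
FIELD_ORDER = ["timestamp", "host", "metric", "value"]


def to_pipe_separated(event, field_order=FIELD_ORDER):
    """Emit only the values, in a fixed order, joined by '|'."""
    return "|".join(str(event.get(name, "")) for name in field_order)


def from_pipe_separated(line, field_order=FIELD_ORDER):
    """Reattach the keys, as a search-time extraction would."""
    return dict(zip(field_order, line.split("|")))


event = {"timestamp": "1554120000", "host": "web01",
         "metric": "latency_ms", "value": 42}
line = to_pipe_separated(event)
restored = from_pipe_separated(line)
```

The saving comes from not storing the key names with every event; note that all values come back as strings after the round trip.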
Transformations for PCI/PII
Statsd Conversion
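The slide does not spell out the conversion, so the following is only a sketch of what turning a metric event into the StatsD line format (`<name>:<value>|<type>`) could look like. The event fields `app`, `metric`, `value` and `type` are assumed for illustration.

```python
# StatsD type codes: c = counter, g = gauge, ms = timer.
TYPE_CODES = {"counter": "c", "gauge": "g", "timer": "ms"}


def to_statsd(name, value, metric_type):
    """Format one metric as a StatsD line: <name>:<value>|<type>."""
    return f"{name}:{value}|{TYPE_CODES[metric_type]}"


def event_to_statsd(event):
    """Build a dotted metric name from a (hypothetical) event and format it."""
    name = f"{event['app']}.{event['metric']}"
    return to_statsd(name, event["value"], event["type"])


statsd_line = event_to_statsd(
    {"app": "checkout", "metric": "response_time", "value": 120, "type": "timer"}
)
```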
Topology – Kafka Streams
Windowed Events
Custom Partitioner
Custom TimeStamp Extractor
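Kafka Streams lets an application plug in its own `TimestampExtractor` so that windowing follows event time rather than the time the record was written. The logic such an extractor typically carries can be sketched in a language-neutral way as below. The `event_time` payload field and the ISO-8601 format are assumptions, as is the choice to fall back to the record timestamp when the field is missing or malformed.

```python
from datetime import datetime, timezone


def extract_timestamp_ms(payload, record_timestamp_ms):
    """Return event time in epoch milliseconds, else the record timestamp."""
    raw = payload.get("event_time")  # hypothetical payload field
    if raw is None:
        return record_timestamp_ms
    try:
        parsed = datetime.fromisoformat(raw)
    except (TypeError, ValueError):
        # Malformed event time: fall back rather than fail the record.
        return record_timestamp_ms
    if parsed.tzinfo is None:
        # Assume UTC when the payload carries no offset.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)


ts = extract_timestamp_ms({"event_time": "2019-04-02T00:00:00+00:00"}, 5)
```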
License Reduction from Summarizations
Late Arriving Events
Windowed Event Entries
Event-time windows:
| T1      | T2      | T3        | ……
| 1 2 3   | 5 6 7   | 9 10 11   |
Late arrivals: 4 (belongs to T1), 8 (belongs to T2)

Key   Value
T1    1,2,3
T1    1,2,3,4
T2    5,6,7
T2    5,6,7,8
T3    9,10,11
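The window entries above can be reproduced with a small simulation. This is a sketch, not the talk's topology: it assumes tumbling windows of four time units (T1 covers event times 1 to 4, T2 covers 5 to 8, and so on) and an arrival order in which 4 and 8 show up after the next window has already opened.

```python
WINDOW_SIZE = 4  # assumed: T1 = event times 1-4, T2 = 5-8, T3 = 9-12


def window_index(event_time):
    """Zero-based index of the tumbling window an event time falls into."""
    return (event_time - 1) // WINDOW_SIZE


def aggregate(arrivals):
    """Append each event to its window by event time, in arrival order.

    Returns the final window contents and the updates emitted when an
    event arrived after a later window had already opened.
    """
    windows = {}
    late_updates = []
    newest = 0  # highest window index seen so far
    for event_time in arrivals:
        index = window_index(event_time)
        key = f"T{index + 1}"
        windows.setdefault(key, []).append(event_time)
        if index < newest:
            # Late event: the earlier window's entry is updated again.
            late_updates.append((key, list(windows[key])))
        newest = max(newest, index)
    return windows, late_updates


arrivals = [1, 2, 3, 5, 6, 7, 4, 9, 10, 11, 8]
windows, late_updates = aggregate(arrivals)
```

Each late update corresponds to a second row for the same key in the table: the window's earlier value is superseded rather than a new window being created.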
Processing too long…
Measure processing time, and keep your Processors simple and straightforward.
Maintain the SLA in the Monitoring system: < 5 seconds from Event Time…
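Two measurements follow from this advice: how long a single processor takes, and whether an event reached the monitoring system within the SLA counted from its event time. A minimal sketch, with hypothetical helper names and `str.upper` standing in for a real processor:

```python
import time

SLA_SECONDS = 5.0  # target from the slide: under 5 seconds from event time


def timed(processor, event):
    """Run one processor on one event; return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = processor(event)
    return result, time.perf_counter() - start


def within_sla(event_time_s, delivered_time_s, sla_seconds=SLA_SECONDS):
    """True if the event was delivered less than sla_seconds after event time."""
    return (delivered_time_s - event_time_s) < sla_seconds


result, elapsed = timed(str.upper, "cpu=42")
```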
Exception Handling
Gotchas…
Testing Kafka Streams
• Unit testing individual processors and transformers
• Integration testing with an embedded Kafka cluster, multiple instances
• Integration testing the aggregation by repeating the concatenated raw messages in the final aggregated event
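One reading of the last bullet: in test mode the aggregated event also carries the raw messages it was built from, so a test can recompute the aggregate and compare. The sketch below follows that reading. The `metric=value` message format, the `;` separator, and the `count`/`sum`/`max` summary fields are all assumptions.

```python
def summarize(raw_messages, include_raw=False):
    """Aggregate raw 'metric=value' messages into one summary event."""
    values = [int(m.split("=", 1)[1]) for m in raw_messages]
    summary = {"count": len(values), "sum": sum(values), "max": max(values)}
    if include_raw:
        # Test-only: carry the inputs along so the result can be re-derived.
        summary["raw"] = ";".join(raw_messages)
    return summary


def verify(summary):
    """Recompute the aggregate from the embedded raw messages and compare."""
    expected = summarize(summary["raw"].split(";"))
    return all(summary[key] == value for key, value in expected.items())


aggregated = summarize(["latency=10", "latency=30", "latency=20"], include_raw=True)
```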
Debugging / Testing
Kafka Monitoring – JMX Stats
• Splunk Add-on to poll local or remote JMX management servers
• Index MBean attributes, outputs from MBean operations, and MBean notifications
• Configurable templates to selectively monitor JMX stats
Kafka Monitoring - Dashboards
Kafka Monitoring - Alerts
Monitoring Kafka Streams
• Thread metrics
  • Average time for commit, poll, and process operations
  • Tasks created per second, tasks closed per second
• Task metrics
  • Average number of commits per second
  • Average commit time
• Processor node metrics
  • Average and max processing time
  • Average number of process operations per second
  • Forward rate
• State store metrics
  • Average execution time for put, get, and flush operations
  • Average number of put, get, and flush operations per second
Data Transformations on Ops Metrics using Kafka Streams (Srividhya Ramachandran, Priceline.com) Kafka Summit NYC 2019
Conclusion & Next Steps
Acknowledgements
• Our Core Platform team for their contributions to the Streaming library
and Data Collection Console
• To Confluent support and sales team
We are Hiring!!! https://careers.priceline.com/
Questions
