Apache Kafka
Build a solid application architecture
TEBOURBI Maher
@mtebourbi
Client App, Browser
Web App
DB
Cache
Web App
HDFS
Search
Graph
DB
…
Web App
Search Graph DB
ETL …
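Two clients write concurrently: Set X = 1 and Set X = 3.
Each write is sent to two stores, and every write returns Ok.
One store ends with X = 3, the other with X = 1.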
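The race in this diagram can be reproduced in a few lines. This is a minimal sketch, assuming the two stores are a database and a cache (the names `database` and `cache` are illustrative) and that each store simply keeps the last value it receives.

```python
def apply_writes(arrival_order):
    """Apply writes in the order they arrive; the last write wins."""
    store = {}
    for key, value in arrival_order:
        store[key] = value
    return store


# Two clients write concurrently; each write is sent to both stores.
write_a = ("X", 1)
write_b = ("X", 3)

# With no shared ordering, each store may see the writes in a different order.
database = apply_writes([write_a, write_b])  # last write seen: X = 3
cache = apply_writes([write_b, write_a])     # last write seen: X = 1

# Every individual write succeeded, yet the stores now disagree permanently.
print(database, cache)
```

Nothing in either store signals the mismatch, which is why the slides look for a single ordered source of writes.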
Other drawbacks
• Database schema evolution.
• Impact of some bugs on data.
• System debugging.
Change Change
Cache
Web App
Jobs
MQ
Search
Graph
DB
…
Same data
F(x)
Raw Data Derived Data
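One way to read "Derived Data = F(Raw Data)": if F is a pure function, any derived store can be thrown away and rebuilt from the raw data. The sketch below is illustrative only; the page-view events and the function body are assumptions, not taken from the slides.

```python
# Raw data: immutable facts, recorded as they happened (hypothetical events).
raw_data = [
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/search"),
]


def f(events):
    """Derive a page -> view-count table from the raw events."""
    counts = {}
    for _user, page in events:
        counts[page] = counts.get(page, 0) + 1
    return counts


derived_data = f(raw_data)

# If the derived store is lost or a bug corrupts it, rerunning F restores it.
rebuilt = f(raw_data)
print(derived_data, rebuilt)
```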
Log to the rescue
DB
1 2 3 4 5 6 7 8 9 10 11
Command log
Web App
Search Cache
1 2 3 4 5 6 7 8 9 10 11
Topic
Producer
Consumer
Offset
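The four terms on this slide fit together as an append-only list. The following is a toy model, not Kafka's client API: the `Log` class and its method names are made up for illustration. Kafka numbers offsets from 0, while the slide's diagram starts at 1.

```python
class Log:
    """A toy append-only log standing in for one topic."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Producer side: add a record at the end and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=100):
        """Consumer side: return records starting at the given offset."""
        return self._records[offset:offset + max_records]


topic = Log()
for n in range(1, 12):  # eleven records, as in the diagram
    topic.append(f"event-{n}")

# Each consumer keeps its own position, so readers never disturb each other.
search_batch = topic.read(0, max_records=3)  # a consumer starting from the beginning
cache_batch = topic.read(8)                  # a consumer that is almost caught up
print(search_batch, cache_batch)
```

Because reading does not remove records, a new consumer can be added later and replay the topic from offset 0.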
1 2 3 4 5 6 7 8 9 10 11
1 2 3 4 5 6 7
1 2 3 4 5 6 7 8 9
P1
P2
P3
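A topic split into partitions can be sketched as several independent logs, with the record key choosing the partition. This is an assumption-laden toy: `zlib.crc32` stands in for whatever hash a real client uses, and `send` is a hypothetical helper, not a Kafka call.

```python
import zlib

NUM_PARTITIONS = 3  # P1, P2, P3 in the diagram

# One independent append-only list per partition.
partitions = [[] for _ in range(NUM_PARTITIONS)]


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition index with a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def send(key, value):
    """Append to the key's partition; return (partition index, offset)."""
    index = partition_for(key)
    partitions[index].append((key, value))
    return index, len(partitions[index]) - 1


first = send("user-42", "login")
second = send("user-42", "click")
send("user-7", "login")
print(first, second, [len(p) for p in partitions])
```

Records with the same key land in the same partition and keep their relative order; ordering across partitions is not defined, which is why the three logs in the diagram have different lengths.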
Producer
Topic partitions:
P1
P2
P3
P4
C1
C2
C3
C4
C5
Consumer Group
CG1
CG2
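Within a consumer group, each partition is read by exactly one consumer, while separate groups each receive the full topic. The sketch below uses a simple round-robin assignment. Which consumers belong to CG1 and CG2 is an assumption here, and `assign` is an illustrative helper rather than Kafka's own assignment strategy.

```python
def assign(partitions, consumers):
    """Give each partition to exactly one consumer of the group."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


partitions = ["P1", "P2", "P3", "P4"]

# Assumed membership: CG1 = {C1, C2}, CG2 = {C3, C4, C5}.
cg1 = assign(partitions, ["C1", "C2"])
cg2 = assign(partitions, ["C3", "C4", "C5"])

# A group with more consumers than partitions leaves some consumers idle.
crowded = assign(["P1", "P2"], ["C1", "C2", "C3"])
print(cg1, cg2, crowded)
```

The number of partitions therefore caps the useful parallelism of a single group.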
Pull vs Push
1 2 3 4 5 6 7 8 9 10 11
Partition
Consumer
Offset
ZooKeeper vs Topic vs DB
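In the pull model the consumer asks for records at its own pace and remembers how far it got. A minimal sketch follows, assuming an in-memory `OffsetStore` as a stand-in for whichever backend holds the offset (ZooKeeper, a topic, or a database); all names are hypothetical.

```python
class OffsetStore:
    """Stand-in for the place where consumed offsets are kept."""

    def __init__(self):
        self._next = {}

    def fetch(self, group, partition):
        """Return the next offset to read (0 if nothing was committed)."""
        return self._next.get((group, partition), 0)

    def commit(self, group, partition, next_offset):
        self._next[(group, partition)] = next_offset


def pull(log, store, group, partition, max_records):
    """Fetch the next batch for a group; the caller commits when done."""
    start = store.fetch(group, partition)
    batch = log[start:start + max_records]
    return batch, start + len(batch)


partition_log = [f"m{n}" for n in range(1, 12)]  # eleven records
store = OffsetStore()

seen = []
while True:
    batch, next_offset = pull(partition_log, store, "CG1", "P1", max_records=4)
    if not batch:
        break
    seen.extend(batch)                       # process the batch first
    store.commit("CG1", "P1", next_offset)   # then record the progress
print(seen)
```

Committing after processing means a crash can replay a batch but not skip one. A slow consumer just falls behind instead of being flooded, which is the usual argument for pull over push.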
Compaction

Before:
Offset  1  2  3  4  5  6  7  8    9  10  11
Key     1  2  1  3  4  5  4  3    1  6   9
Value   C  B  F  X  K  D  H  nil  Y  W   V

After:
Offset  2  6  7  9  10  11
Key     2  5  4  1  6   9
Value   B  D  H  Y  W   V
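The compaction diagram can be checked with a short function that keeps only the latest record per key and drops keys whose latest value is nil. This is a sketch of the idea, not Kafka's log cleaner: Kafka retains delete markers for a configurable period before removing them, whereas this version drops them immediately. The input mirrors the diagram, with nil written as `None`.

```python
def compact(records):
    """Keep the newest (offset, key, value) per key; drop deleted keys."""
    latest = {}
    for offset, key, value in records:
        latest[key] = (offset, value)  # later offsets overwrite earlier ones
    survivors = [
        (offset, key, value)
        for key, (offset, value) in latest.items()
        if value is not None           # a nil value marks the key as deleted
    ]
    return sorted(survivors)           # surviving records keep offset order


keys = [1, 2, 1, 3, 4, 5, 4, 3, 1, 6, 9]
values = ["C", "B", "F", "X", "K", "D", "H", None, "Y", "W", "V"]
log = [(offset, k, v) for offset, (k, v) in enumerate(zip(keys, values), start=1)]

compacted = compact(log)
print(compacted)
```

Surviving records keep their original offsets, so consumers that stored an offset before compaction can still resume from it.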
App
time window
S1 S2 B1 B2
Real-time services
Batch services
Advantages
• Eventual consistency.
• More agile and maintainable.
• Scalability.
• Fault tolerance.
Other use cases
• Messaging
• Website Activity Tracking
• Metrics
• Stream processing
Some annoyances
• Requires ZooKeeper.
• The official Java client API changes between versions.
• Log compaction is not possible with compressed topics.
• A partition is limited to a single machine.
Performance
• Use of OS page cache.
• Linux: sendfile() syscall copies data from the page cache directly to the socket.
• 1.1 trillion messages per day at LinkedIn.
• 2 million writes per second: https://goo.gl/acmi00
3 servers:
Intel Xeon 2.5 GHz processor with six cores
Six 7200 RPM SATA drives
32 GB of RAM
1 Gb Ethernet
Kafka version: 0.8.1
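The sendfile() point can be illustrated from Python, whose `socket.sendfile()` method uses `os.sendfile` where the platform provides it and otherwise falls back to ordinary sends. This is a sketch under assumptions: the temporary file stands in for a log segment, a local socket pair stands in for a consumer connection, and a Unix-like system is assumed.

```python
import socket
import tempfile


def send_segment(path, sock):
    """Send a whole file over a socket without reading it into user space
    (when the platform supports sendfile); return the bytes sent."""
    with open(path, "rb") as segment:
        return sock.sendfile(segment)


with tempfile.NamedTemporaryFile(suffix=".log") as segment_file:
    segment_file.write(b"m1\nm2\nm3\n")
    segment_file.flush()

    sender, receiver = socket.socketpair()
    with sender, receiver:
        sent = send_segment(segment_file.name, sender)

        # Read until everything that was sent has arrived.
        received = b""
        while len(received) < sent:
            received += receiver.recv(4096)

print(sent, received)
```

The application never touches the record bytes on the way out, so data that is already in the OS page cache can go to the network card without extra copies.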