©2013 LinkedIn Corporation. All Rights Reserved.
Tuning Kafka for Fun and Profit
Zookeeper
• 5-node vs. 3-node Ensembles
• Solid State Disks
– Use good SSDs
– Transaction logs only
– Significant improvement in latency and outstanding requests
Kafka Broker Disks
• Disk Layout
• JBOD vs. RAID
– JBOD and RAID-0 are similar
– RAID-5/6 has significant performance overhead
– RAID-10 still offers the best performance and protection
• Filesystem
– New testing shows XFS has a clear benefit
– No tuning required
– Will be continuing testing with more production traffic
Scaling Kafka Clusters
• Disk Capacity
• Network Capacity
• Partition Counts
– Per-Cluster
– Per-Broker
• Limitations
– Topic list length
Topic Configuration
• Retention Settings
• Partition Counts
– Balance over consumers
– Balance over brokers
– Partition size on disk
– Application-specific requirements
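
As a rough illustration of how retention and partition count interact with partition size on disk, here is a minimal Python sketch. It assumes time-based retention, traffic spread evenly across partitions, and a steady produce rate; the function name and the example figures are hypothetical, and the result is per replica, before any replication factor is applied.

```python
def partition_size_bytes(topic_bytes_per_sec: float,
                         partitions: int,
                         retention_hours: float) -> float:
    """Estimate the steady-state on-disk size of one partition replica.

    Assumes even distribution over partitions and time-based retention;
    ignores segment roll granularity and index files.
    """
    per_partition_rate = topic_bytes_per_sec / partitions
    return per_partition_rate * retention_hours * 3600


if __name__ == "__main__":
    # Hypothetical topic: 10 MB/s inbound, 16 partitions, 4 days of retention.
    size = partition_size_bytes(10_000_000, 16, 96)
    print(f"approx. {size / 1e9:.0f} GB per partition replica")
```

Doubling the partition count halves the estimate, while doubling retention doubles it, which is one reason the two settings are usually tuned together.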
Mirror Maker
• Network Locality
• Consumer Tuning
– Number of streams
– Partition assignment strategy
• Producer Tuning
– Number of streams
– In flight requests
– Linger time
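
The consumer and producer knobs named above correspond to ordinary Kafka client properties. The sketch below renders a hypothetical pair of property sets from Python; the keys (`partition.assignment.strategy`, `linger.ms`, `max.in.flight.requests.per.connection`) are standard Kafka client settings, but every value here is a placeholder chosen for illustration, not a recommendation from the talk, and which keys apply depends on the client version in use.

```python
# Placeholder values only; the talk does not state the settings used.
consumer_props = {
    "partition.assignment.strategy": "roundrobin",
}
producer_props = {
    "linger.ms": "5",
    "max.in.flight.requests.per.connection": "1",
}


def render_properties(props: dict) -> str:
    """Render a dict as Java-style .properties text, one key=value per line."""
    return "\n".join(f"{key}={value}" for key, value in sorted(props.items())) + "\n"


if __name__ == "__main__":
    print(render_properties(consumer_props))
    print(render_properties(producer_props))
```

The number of consumer streams is not shown because it is typically passed to Mirror Maker on the command line rather than through a client properties file.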

Editor's Notes

  • #3: We start talking about tuning from the ground up, and Kafka is underpinned by Zookeeper. It is an application we tend to forget about until there is a problem, because it just runs, but it needs love too.

    One thing we’ve learned recently concerns ensemble sizing. A lot of work has been done on Zookeeper performance at different ensemble sizes, and the results are largely driven by the ZAB protocol and the network traffic it involves. We run either 3-node or 5-node ensembles, with most of the 3-node ensembles in our staging environments, but we are moving everything to 5 nodes for a very important reason. To add a new server to the ensemble, you have to take down each node in turn, add the new server to its config, and bring it back up. If you don’t want to take Zookeeper down, you have to maintain quorum while you do this. If one node in a 3-node ensemble is already down due to hardware problems, there is no way to change the server list without an outage, because you cannot take a second server offline and still maintain quorum.

    The other important change we have made is to run Zookeeper on solid state disks. Some information out there suggests this is a bad idea, but our experience has been the opposite. The first thing to note is that we use really good SSDs, not the consumer-grade ones you can buy from Best Buy. The Virident cards we use have garbage collection and are very robust. We put only the transaction logs on SSD and keep the snapshots on spinning disk. By doing this, we have dropped min, max, and average latency to 0 ms (from an average of 20 ms), with no outstanding requests during normal operations, even at peak load.
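
The quorum argument in the note above is simple majority arithmetic, and a small Python sketch makes it concrete. The function names are hypothetical; the only assumption is that an ensemble needs a strict majority of its configured servers to stay available.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the configured ensemble."""
    return ensemble_size // 2 + 1


def can_roll_restart(ensemble_size: int, already_down: int) -> bool:
    """True if one more node can be taken offline without losing quorum."""
    remaining = ensemble_size - already_down - 1
    return remaining >= quorum_size(ensemble_size)


if __name__ == "__main__":
    for size in (3, 5):
        for down in (0, 1):
            ok = can_roll_restart(size, down)
            print(f"{size}-node ensemble, {down} node(s) already down: "
                  f"rolling change possible = {ok}")
```

With 3 nodes and one already failed, taking another node down leaves 1 of 3, below the quorum of 2. With 5 nodes and one failed, a rolling change still leaves 3 of 5, which is exactly a quorum.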
  • #4: Moving on from Zookeeper to the Kafka brokers, mostly what we look at here is disk. Our hardware is fairly standard: 12-CPU systems (with hyperthreading) and 64 GB of memory, and we do not colocate any other application with Kafka (which is running on physical hardware, not a virtual environment). Having a lot of memory is helpful because Kafka depends on the pagecache to get the best performance for consumers. With disk, the more spindles you have, the better off you will be. Produce times are dependent on disk IO (assuming you are not using an acknowledgement setting of 0 where you are producing in a “fire and forget” mode), so the more you can spread that out the better. We have recently done a lot of testing of RAID layouts, to validate that our configuration of using RAID-10 on 14 disks was the optimal layout. What we found is that JBOD and RAID-0 perform the best, but offer no protection of the data (if you lose one disk, you lose everything on that broker). RAID-5 and RAID-6 give you a nice balance of protection and disk capacity, but we ran into significant performance problems (99th percentile produce times shot up to over 20 seconds). RAID-10 gave us the best balance of performance and protection, and is where we are staying for now. It is notable that we are running software RAID, and have not done any testing with hardware RAID. All of our testing was done with a variety of RAID stripe settings, and we found that at least for RAID-10, the default 512 KB stripe is the best choice. Larger stripes did not offer a significant improvement. We have also been retesting the filesystem lately. Currently, Kafka log segments are stored on an ext4 filesystem, configured with a 120 second commit interval with writeback mode. These settings are obviously unsafe, and we justified them by knowing that we were also replicating data within Kafka and could tolerate a system failure.
A datacenter power outage changed this view, and we were left with a large amount of disk corruption, both at the file level and the block level. We found that XFS is a better choice of filesystem, offering significant performance benefits without needing to resort to unsafe tuning. We’ll be continuing this testing in some of our staging environments soon.
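To make the capacity side of the RAID trade-off in note #4 concrete, here is a rough sketch of usable capacity per layout for the 14-disk brokers mentioned above. It uses the standard capacity formulas for each RAID level. The function name is hypothetical, and the sketch says nothing about performance, which the note measured empirically.

```python
def usable_disks(layout: str, disks: int) -> int:
    """Number of disks' worth of capacity left for data in each layout."""
    if layout in ("jbod", "raid0"):
        return disks          # no redundancy at all
    if layout == "raid5":
        return disks - 1      # one disk's worth of parity
    if layout == "raid6":
        return disks - 2      # two disks' worth of parity
    if layout == "raid10":
        if disks % 2:
            raise ValueError("RAID-10 with mirrored pairs needs an even disk count")
        return disks // 2     # every disk is mirrored
    raise ValueError(f"unknown layout: {layout}")


if __name__ == "__main__":
    for layout in ("jbod", "raid0", "raid5", "raid6", "raid10"):
        print(f"{layout:>6}: {usable_disks(layout, 14)} of 14 disks usable")
```

RAID-10 gives up half the raw capacity, which is the price paid for the performance and protection balance described in the note.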
  • #5: Once we have an optimal configuration for a single broker, we look at how many brokers we need to have in a cluster. The driving factor for us right now is the disk capacity. We use a default retention of 4 days for almost all topics, and having enough disk space to handle this is the primary driver behind increasing the size of a cluster. We threshold our alerts at 60%, and increase the cluster size when we hit this limit. This gives us enough headroom to move partitions around (which resets the retention clock), and wait for new hardware to arrive if needed. Another concern with sizing is the network capacity. While Kafka can definitely operate at line speed for a 1 Gigabit NIC, you want to have some overhead reserved for intra-cluster replication and communication. For this reason, we threshold our network alerts at 75%. If we go above that at peak load, we need to spread out the traffic over more systems. This is another good reason to make sure you balance partitions across your brokers as evenly as possible. The number of partitions you have in your cluster is a lesser, but important, concern. Here we are mostly concerned with the number of partitions on a single broker. We have noticed performance problems above 4000 partitions per broker, though we are not sure exactly where that problem is (whether it is with open filehandles, data structures in the broker, or problems in the controller). We are about to start testing on much larger Kafka broker hardware, however, and will be digging into this limitation. As a side note, you should keep an eye on the number of topics you have for a reason that is not immediately obvious. Zookeeper has a limit of 1 MB as the size of the data in a node. This also applies to the combined length of all the names of the child nodes. Because all of the topics exist as child nodes under /brokers/topics, there is a limitation here.
If your topic names are all 50 characters long, and you have more than about 20,900 topics, you will hit this limitation. This could cause Zookeeper to fail entirely, or it could cause problems in Kafka. The guarantee is that it will cause problems.
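The sizing rules in note #5 reduce to a few threshold checks and one division. The sketch below restates them using the figures quoted in the note. The constant and function names are hypothetical, and the topic estimate ignores any per-entry overhead in how Zookeeper stores child names, so the real ceiling is somewhat lower than the value computed here.

```python
DISK_ALERT = 0.60                 # expand the cluster at 60% disk used
NETWORK_ALERT = 0.75              # spread traffic out above 75% of line speed
MAX_PARTITIONS_PER_BROKER = 4000  # problems were observed above this count
ZNODE_LIMIT_BYTES = 1024 * 1024   # the 1 MB limit on a node's data


def over_threshold(used: float, capacity: float, threshold: float) -> bool:
    """True when utilization has reached the alert threshold."""
    return used / capacity >= threshold


def max_topics(name_length: int, limit_bytes: int = ZNODE_LIMIT_BYTES) -> int:
    """Upper bound on topic count if every name has the same length."""
    return limit_bytes // name_length


if __name__ == "__main__":
    print("disk alert:", over_threshold(7.0, 10.0, DISK_ALERT))
    print("network alert:", over_threshold(500.0, 1000.0, NETWORK_ALERT))
    print("too many partitions:", 4500 > MAX_PARTITIONS_PER_BROKER)
    print("max 50-character topics:", max_topics(50))
```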
  • #6: Now that Kafka is running well, we can turn our attention to the topics. In general, there are two things to configure when it comes to topics: the retention, and the number of partitions. There are other things you can look at, such as the segment size, or how long until the segments are rolled, which may have application-specific concerns. But in large part, all we really care about is how long we keep the data, and how much we spread it out. Topics can be configured for retention by time, by size, or by key. There is a default broker-level setting for this, and it can be overridden per-topic. How you retain data is mostly application-dependent. We use a default retention of 4 days, and the reason for this is that in the normal state of affairs, consumers are caught up and reading from the end of the stream. We want enough retention so that if a problem happens with an individual application on the weekend, there is enough time to identify it, figure out what the problem is, resolve it, and catch back up before they fall off the end of their topic. We have certain types of data, such as some of the monitoring, which uses a shorter retention time because the data size is much larger and it gets fixed very quickly if there is ever a problem. We also have topics that are retained for much longer, up to a month, when there is a reason to because of how the application uses the data. The rule of thumb is to never hang on to more data than you really need. There are systems (such as HDFS) which are better designed for long-term storage of data. Partition counts are the tricky calculation. General guidance is to have fewer partitions, not more. This is because more partitions means more log segments, which is more file handles open, and more overhead in the brokers. At the same time, you need to make sure you have enough. There are several ways to look at this, all of which should be taken into account. 
Balancing over consumers – You must have at least as many partitions as you have consumers in the largest group for a topic. If a topic has 8 partitions, and you have 16 consumer instances, 8 of those consumers will be idle all the time. Balancing over brokers – If the number of partitions in a topic is not a multiple of the number of brokers in your cluster, the topic cannot be evenly balanced over the brokers. In a cluster with a large number of topics, this is less of a concern because over all the topics you should have a good balance regardless. In cases where you get a dump of messages (high number of messages in a short period of time), balancing over the brokers is very important so you don’t swamp the network. Partition size on disk – This is one of our primary drivers in how we expand topics, as it is a good indication of how busy the topic is. We’ve picked a somewhat arbitrary threshold of 50 GB as the size of a single partition on disk on the brokers. Once a topic exceeds that, we increase the number of partitions (in general). This keeps the log segments of a reasonable size, which is good for recovering a crashed broker, and it also allows us to balance busy topics over more of the cluster. Through all of this, you also need to keep in mind application-specific requirements. You may have an application which is very concerned about message ordering, and only wants a single partition. You may have an application that is using keyed partitioning, and wants a high number of partitions so that they do not need to be expanded at any point (which would change the hashing of keys to partitions). This will often override other concerns. In a multi-tenant environment, the important thing is to have communication with the users, and a way of keeping track of these requirements so they are not forgotten.
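One way to combine the three numeric criteria from note #6 is sketched below. This is an illustration, not the procedure described in the talk: the function names are hypothetical, rounding up to a multiple of the broker count is an assumption about how to satisfy the balancing criterion, and application-specific requirements (such as strict ordering on a single partition) would override the result.

```python
import math


def idle_consumers(partitions: int, consumers: int) -> int:
    """Consumers in a group that can never be assigned a partition."""
    return max(0, consumers - partitions)


def suggest_partitions(
    largest_consumer_group: int,
    brokers: int,
    topic_size_gb: float,
    max_partition_gb: float = 50.0,  # the 50 GB threshold quoted in the note
) -> int:
    """Smallest partition count that meets all three numeric criteria."""
    by_size = math.ceil(topic_size_gb / max_partition_gb)
    needed = max(1, largest_consumer_group, by_size)
    # Round up to a multiple of the broker count so the topic balances evenly.
    return math.ceil(needed / brokers) * brokers


if __name__ == "__main__":
    print(idle_consumers(partitions=8, consumers=16))
    print(suggest_partitions(largest_consumer_group=16, brokers=6, topic_size_gb=400))
```

The sketch returns the minimum that satisfies the criteria, in keeping with the note's guidance to prefer fewer partitions.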
  • #7: In an environment with multiple Kafka clusters, you are often using the mirror maker application to replicate data between them. In addition, because mirror maker has both a consumer and a producer, it’s a useful case to look at when tuning both. If you want more information about using mirror maker for running Kafka clusters in tiers, I encourage you to look at one of my other presentations on multi-tier architectures that goes into more depth on the design and concerns around setting this up. With any consumer or producer, network locality is a big factor in performance. If your client is not in the same network as your Kafka cluster, you will have latency, bandwidth concerns, network partitions, and any number of other problems that you get when you have a lot of network hops in the way. With mirror maker, we need to choose whether we are going to locate it proximate to the cluster we are consuming from or the cluster we are producing to (as we use it most often for inter-datacenter replication). Our choice is always to locate it with the produce cluster. The reason for this is that if there is a problem with the produce side of the mirror maker, it will lose messages, because the consumer will continue to consume messages and commit offsets. If there is a problem with the consumer, it will just stop. So we choose to put the higher risk of network problems on the consume side, rather than the produce side. With tuning the mirror maker consumer, you will mostly consider how much data you need to consume, and the number of streams. You need to have enough copies of mirror maker in a given pipeline to handle the peak traffic, and mirror maker will not operate at line speed because it needs to decompress and recompress every message batch. This is also why you should run more than one consumer stream in a single mirror maker copy, to take advantage of parallelism to get around some of this inefficiency.
You will also want to look at the partition assignment strategy that is used when balancing consumers. There is a strategy available for wildcard consumers called “roundrobin” which provides a much more even balance of partitions than the standard assignment strategy. There are also improvements in the most recent mirror maker code to the speed with which the consumer rebalance is performed. On the producer side, you also should be running multiple streams. Where the consumer is responsible for decompressing message batches, the producer is responsible for compressing them again before sending to Kafka. You will also want to consider the number of in flight requests that are allowed between the producer and the Kafka cluster. A higher number will allow for greater throughput, but it will also introduce a higher risk of loss. When the leadership changes on a partition in the produce cluster, message batches that are in flight will be lost. It is also possible to improve this by changing the acknowledgement configuration on the producer, but this raises other performance concerns. Another parameter to look at is the linger time. The mirror maker producer will flush a batch to the produce cluster when the batch reaches either the byte size limit for a single batch or the linger time. For busy topics, you will be subject to the size limit. For slow topics, you will be subject to the time limit. A higher linger time will allow the producer to assemble more efficient batches, with better compression (and the Kafka broker itself does not decompress and break up batches, so this affects your disk utilization on the brokers). It will also increase the amount of time it takes for messages to get from one cluster to the next. You will need to determine how important these factors are and strike a balance.
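The interaction between batch size and linger time in note #7 can be illustrated with a simplified model. The sketch below assumes a steady byte rate into a single batch and ignores compression. The function name is hypothetical. The keys in the dictionary are property names from Kafka's Java producer configuration, but the values are illustrative only and are not the settings used in the talk.

```python
# Illustrative producer settings; the values are assumptions, not recommendations.
EXAMPLE_PRODUCER_SETTINGS = {
    "batch.size": 16384,                         # byte size limit per batch
    "linger.ms": 100,                            # how long to wait for a fuller batch
    "max.in.flight.requests.per.connection": 1,  # fewer batches at risk on leader change
    "acks": "1",                                 # acknowledgement setting
}


def flush_trigger(bytes_per_second: float, batch_size_bytes: int, linger_ms: float):
    """Return which limit flushes the batch first and after how many milliseconds."""
    if bytes_per_second <= 0:
        raise ValueError("the model needs a positive byte rate")
    ms_to_fill = batch_size_bytes / bytes_per_second * 1000.0
    if ms_to_fill <= linger_ms:
        return ("size", ms_to_fill)   # busy topic: the size limit wins
    return ("linger", linger_ms)      # slow topic: the time limit wins


if __name__ == "__main__":
    size = EXAMPLE_PRODUCER_SETTINGS["batch.size"]
    linger = EXAMPLE_PRODUCER_SETTINGS["linger.ms"]
    print("busy topic:", flush_trigger(1_000_000, size, linger))
    print("slow topic:", flush_trigger(1_000, size, linger))
```

Raising the linger time only changes the outcome for the slow topic, which is where the latency cost between clusters shows up.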