Kafka Lessons That We Learned

The company experienced several issues after upgrading to Kafka 0.8: data imbalance across brokers caused by changes in partition assignment and the new data replication feature, excessive disk usage from bugs and compression settings, and increased data volume that required scaling the cluster. Lessons learned include properly configuring data replication, monitoring disk usage and topics, fixing bugs that contributed to duplicate data, and planning for future scaling by separating data types across clusters.

Kafka lessons that we learned the hard way
Data Balancing
The Kafka 0.7 cluster has been stable and well-balanced from the beginning. Kafka 0.8 introduced some new changes.

Partition assignment.
Data replication feature.
Data Balancing
Partition assignment

Cannot use F5 for load balancing.

Load among brokers out of balance.
Monitor disk usage with Bosun (oops).
Occasional maintenance with kafka-reassign-partitions.sh.
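
The occasional rebalancing step can be sketched as follows. kafka-reassign-partitions.sh takes a JSON file that lists the desired replicas for each partition; the Python below builds such a plan by spreading partitions round-robin across brokers. The helper name build_reassignment, the broker ids, and the partition count are hypothetical, and round-robin placement is an assumption made for illustration, not necessarily the policy used on this cluster.

```python
import json


def build_reassignment(topic, num_partitions, broker_ids, replication_factor):
    """Spread partitions round-robin across brokers (illustrative policy only)."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor exceeds number of brokers")
    partitions = []
    for p in range(num_partitions):
        # Start each partition on the next broker, then take the following
        # brokers as extra replicas so no broker appears twice in one list.
        replicas = [
            broker_ids[(p + r) % len(broker_ids)]
            for r in range(replication_factor)
        ]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Hypothetical values: 6 partitions of the eventdata topic over brokers 1-3.
plan = build_reassignment("eventdata", num_partitions=6,
                          broker_ids=[1, 2, 3], replication_factor=2)
print(json.dumps(plan, indent=2))
```

The printed JSON could be saved to a file and passed to the tool with --reassignment-json-file and --execute, after checking the plan against the real broker ids.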
Data Balancing
Data replication feature

Switched from RAID-10 to JBOD as recommended on the Kafka web site.
Drives were severely out-of-balance.
A bad drive brings down the whole broker.
Switching back to RAID-10.
Monitor disk usage with Bosun (oops).
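
A minimal sketch of the kind of check that would have caught the drive imbalance earlier. This is not Bosun configuration; it is plain Python using the standard library, and the function names, the 20-point threshold, and the directory list are all assumptions for illustration.

```python
import shutil


def used_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def skewed_dirs(fractions, threshold=0.20):
    """Return directories whose used fraction exceeds the mean by more than `threshold`."""
    if not fractions:
        return []
    mean = sum(fractions.values()) / len(fractions)
    return sorted(d for d, f in fractions.items() if f - mean > threshold)


# Hypothetical log directories; "." stands in so the sketch runs anywhere.
log_dirs = ["."]
fractions = {d: used_fraction(d) for d in log_dirs}
print(fractions, skewed_dirs(fractions))
```

On a JBOD broker, log_dirs would hold one entry per data drive, and any directory returned by skewed_dirs would be a candidate for an alert.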
Data Balancing
Our own bugs

Log forwarder topic explosion.

Cap on number of forensic topics per stack.
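
The per-stack cap can be sketched as a simple guard that runs before a new topic is created. The topic naming scheme ("forensic.<stack>.<name>"), the function name, and the default limit of 50 are hypothetical; the slides do not say how the cap was implemented.

```python
def can_create_topic(existing_topics, stack, max_per_stack=50):
    """Return True if `stack` is still below its forensic topic cap."""
    # Assumed naming convention: forensic.<stack>.<name>
    prefix = f"forensic.{stack}."
    count = sum(1 for t in existing_topics if t.startswith(prefix))
    return count < max_per_stack


existing = ["forensic.web.access", "forensic.web.error", "forensic.db.slowlog"]
print(can_create_topic(existing, "web", max_per_stack=2))  # cap reached for "web"
print(can_create_topic(existing, "db", max_per_stack=2))   # "db" still has room
```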
Increased Data
Why is there so much more data in the Kafka 0.8 cluster?

How do we scale going forward?

Increased Data
Why is there so much more data in the Kafka 0.8 cluster?

Log forwarder EOF bug.

Fixed.
Preventative measures going forward:
Monitor topics with Spark.
quota.producer.default property in Kafka 0.9.
Increased Data
Why is there so much more data in the Kafka 0.8 cluster?

Snappy compression. Brokers are I/O bound.

Switched back to gzip for forensic data.
Continue to use Snappy for binary Avro data.
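
A rough illustration of why gzip suits text-heavy forensic data on I/O-bound brokers. Snappy is not in the Python standard library, so only gzip is measured here; the log lines are synthetic, and the seeded random bytes are an assumed stand-in for data with little redundancy (real Avro payloads will usually fall somewhere in between).

```python
import gzip
import random


def gzip_ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(data)) / len(data)


# Synthetic, log-like text with a lot of repetition between lines.
log_text = "".join(
    f"2015-01-01 12:00:{i % 60:02d} INFO request id={i} status=200\n"
    for i in range(2000)
).encode("utf-8")

# Seeded pseudo-random bytes: an extreme case with almost no redundancy.
rng = random.Random(0)
dense = bytes(rng.getrandbits(8) for _ in range(len(log_text)))

print(f"log-like text: {gzip_ratio(log_text):.2f}")
print(f"dense bytes:   {gzip_ratio(dense):.2f}")
```

The smaller the ratio, the fewer bytes the brokers have to write to disk, which is the trade-off that matters when disk I/O rather than CPU is the bottleneck.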
Increased Data
Why is there so much more data in the Kafka 0.8 cluster?

Duplicate data. Forwarder sends logs to both eventdata and stack-specific topics.
Handle multiple topics with Camus or Gobblin.
Increased Data
How do we scale going forward?

Add nodes to Kafka cluster.

Repurpose 0.7 servers.
Separate Kafka clusters for business and forensic data.
