SlideShare a Scribd company logo
The Impact of Hardware and Software
Version Changes on Apache Kafka®
Performance and Scalability
Paul Brebner, Hendra Gunadi, and more!
Instaclustr by NetApp
© Instaclustr Pty Limited, 2022
ApacheCon NA Performance Engineering Track 2022
Who Am I?
Previously
§ R&D in distributed systems and performance engineering
Last 5 years
§ Technology Evangelist for Instaclustr by NetApp
§ 100+ Blogs, Demo Applications, Talks
§ Benchmarking and Performance Insights
§ Open Source Technologies including
o Apache Cassandra®, Spark™, ZooKeeper, Kafka®
o OpenSearch®, Redis™, PostgreSQL®
o Uber’s Cadence®
© Instaclustr Pty Limited, 2022
Cloud Platform for Big Data
Open Source Technologies
Focus of this talk is on
Apache Kafka®
Instaclustr Managed Platform
© Instaclustr Pty Limited, 2022
Kafka is a distributed streams processing system, it allows
distributed producers to send messages to distributed
consumers via a Kafka cluster.
What is
Kafka?
© Instaclustr Pty Limited, 2022
And Change:
Hardware and Software
© Instaclustr Pty Limited, 2022
Some changes have “obvious”
performance impacts – e.g. Horse à Steam
“The Iron Horse Wins” (!) Race in 1830 between steam locomotive and a horse, horse won (due to
mechanical failure), but was obviously inferior. Source: https://www.fhwa.dot.gov/rakeman/1830.htm
“The Iron Horse Wins” (It didn’t)
Others are not so obvious – e.g.
Electric + Steam Locomotive
https://en.wikipedia.org/wiki/Electric-steam_locomotive#/media/File:SBB_Ee_3-3_8521.png
Part 1: Hardware Change
(Source: Shutterstock)
© Instaclustr Pty Limited, 2022
Hardware Change: CISC
VAX 11/780
CISC = Complex
Instruction Set Computer
University of Waikato NZ
1980-85
© Instaclustr Pty Limited, 2022
CISC to RISC
Pyramid Technology
RISC = Reduced
Instruction Set Computer
UNSW Sydney Australia
2nd half of 1980s
© Instaclustr Pty Limited, 2022
More Recently: Intel PC
https://commons.wikimedia.org/wiki/File:HP_OMEN_X_900_Gaming_Desktop_PC.jpg
© Instaclustr Pty Limited, 2022
Intel PC to iPhone?!
https://commons.wikimedia.org/wiki/File:IPhone_12_Pro_Max_-_3.jpg
© Instaclustr Pty Limited, 2022
Acorn BBC Micro Computer:
1980s
https://www.classic-computers.org.nz/collection/BBCb-1920x.jpg
© Instaclustr Pty Limited, 2022
Fast Forward 40 Years:
Have Gravitons Been Discovered?
https://commons.wikimedia.org/wiki/File:Standard_Model_of_Elementary_Particles_%2B_Gravity.svg
Discovered?
© Instaclustr Pty Limited, 2022
§ RISC for Servers!
• ARM-based New AWS instance types
• Designed by ARM, formerly Advanced RISC Machines and originally Acorn RISC Machine
o made the BBC micro
§ Real Cores
• 64 Cores, 256GB RAM per CPU
• No hyperthreading
• Each vCPU = 1 physical core
§ Benchmarking
• Reported to be up to 40% faster than Intel and AMD
§ Less Power Consumption
• So Cheaper and Faster
Fast Forward 40 Years to the
AWS Graviton2
© Instaclustr Pty Limited, 2022
§ Apache Kafka® deployed on
§ R5 (Intel) vs. R6g (Graviton2) instances
§ New R6g configuration:
• AWS Gp3 disks
• Java 11 OpenJDK à Amazon Corretto
• Client à Broker encryption enabled
§ Hoping for easy and large performance gains…
Initial Benchmarking
© Instaclustr Pty Limited, 2022
Initial Results: 40% Worse!
0
0.5
1
1.5
2
2.5
3
R5 R6G
Initial Results (Million messages/s)
© Instaclustr Pty Limited, 2022
Why? Hypotheses and Tests
0
0.5
1
1.5
2
2.5
3
3.5
R5 R6G R6G-OpenJDK R6G-GP2 R6G-x2
Initial Results (Million messages/s):
Hypotheses and Tests
© Instaclustr Pty Limited, 2022
Deep Dive: CPU Profiling
with Flame Graphs
Hottest code
is widest
Stack
Function
Calls
Crypto/SSL
Problem?!
Java
JVM
Kernel
© Instaclustr Pty Limited, 2022
Encryption Off: Better!
200% improvement, 18% better than R5
0
0.5
1
1.5
2
2.5
3
3.5
4
4.5
5
R5 R6G
Encryption Off (Millions messages/s)
© Instaclustr Pty Limited, 2022
§ Cryptography obviously has a big overhead
• but we still need it turned on…
§ An alternative?
• Try Amazon Corretto Crypto Provider (ACCP)
Solution? ACCP
© Instaclustr Pty Limited, 2022
Encryption On and ACCP
Comparable Performance à Cheaper
0
0.5
1
1.5
2
2.5
3
3.5
4
4.5
5
R5 R6G R5-ACCP R6G-ACCP
Encryption On and ACCP (Millions messages/s)
© Instaclustr Pty Limited, 2022
§ OpenJDK/Amazon Corretto encryption on Graviton was slow
• Due to lack of support for Intel operation that sped up cryptography
§ Amazon Corretto Crypto Provider (ACCP)
• Uses OpenSSL, written in C, and faster!
§ ACCP no longer needed
• As there’s a patch for OpenJDK (JDK-8267993 & JDK-8271567)
Explanation?
© Instaclustr Pty Limited, 2022
Part 2: Software Change
© Instaclustr Pty Limited, 2022
Kafka Topic Partitions Enable
Consumer Concurrency
partitions >= consumers
Partition n
Topic “Parties”
Partition 1
Producer
Partition 2
Consumer Group
Consumer
Consumer
Consumers share
work within groups
Consumer
© Instaclustr Pty Limited, 2022
But Partitions Are Expensive:
Replication and Meta-Data Management
© Instaclustr Pty Limited, 2022
Kafka Controller
§ The Kafka Controller manages broker, topic,
and partition meta-data—Kafka’s “Brain”
§ But which controller is active and where is the
meta-data stored?
Kafka
Broker
Kafka
Broker
Kafka
Broker
Active
Controller
Controller Controller
Control Plane (Meta-data)
Data Plane
(Source: Shutterstock)
© Instaclustr Pty Limited, 2022
Apache ZooKeeper®
Kafka
Broker
Kafka
Broker
Kafka
Broker
Active
Controller
ZK 1
ZK 2
Leader
ZK 3
Kafka Cluster
Kafka Cluster Data
ZooKeeper Ensemble
Kafka Cluster Metadata
Changes, failures, and recovery à SLOW!
Controller Controller Kafka Cluster Metadata cached in brokers
ZooKeeper used for Controller election and storing meta-data
Meta-data changes and recovery from failover are SLOW; Reads are fast due to caching
© Instaclustr Pty Limited, 2022
New KRaft Mode
Kafka
Broker
Kafka
Broker
Kafka
Broker
Active
Controller
§ Kafka Cluster Metadata—only stored in
Kafka so fast and scalable
§ Kafka Cluster Metadata replicated to all
brokers, very fast failover
§ Kafka Cluster Data
Controller Controller
Kafka Cluster
KRaft à Fast
Active Controller is Quorum Leader (using Raft to elect leader)
The Raft Mascot
Kafka + Raft Consensus Algorithm = KRaft
© Instaclustr Pty Limited, 2022
Hypotheses
What ZooKeeper KRaft
Reads and therefore data
layer operations
cached/replicated
FAST FAST
Meta-data changes and
recovery from failover
SLOW FAST
Partitions per cluster LESS MORE
Robustness GOOD UNKNOWN
© Instaclustr Pty Limited, 2022
Experiment 1:
Message Throughput Benchmarking
Hypothesis:
There will be no or only minimal difference between ZooKeeper and
KRaft message throughput
Why?
§ As ZK and Kraft are only concerned with meta-data management, not data workloads
§ Kafka producers only need read-only access to partition meta-data
How?
§ Kafka 3.1.1. on identical AWS R6G.large x 3 nodes clusters
© Instaclustr Pty Limited, 2022
Performance:
Partitions vs. Throughput (x-axis log)- identical, cliff > 1000
0
0.5
1
1.5
2
2.5
1 10 100 1000 10000
Partitions vs. Throughput (M TPS)
ZK TPS (M) KRAFT TPS (M)
© Instaclustr Pty Limited, 2022
Partitions vs. Latency (ms):
Identical, Worse > 1000 Partitions
0
200
400
600
800
1000
1200
1 10 100 1000 10000
Partitions vs. Latency (ms)
ZK Latency (ms) KRAFT Latency (ms)
© Instaclustr Pty Limited, 2022
Comparison With Previous
Experiments (2020)—Clusters
Configuration Instances Nodes Total cores RF Kafka version Date bytes per msg
Original cluster r5.xlarge 3 12 3 2.3 Jan-20 80
New cluster r6g.large 3 6 3 3.1.1 Aug-22 8
§ Note apples-to-oranges comparison as
almost everything is different!
§ Also not directly comparable with results
from 1st part of the talk
(Source: Shutterstock)
© Instaclustr Pty Limited, 2022
Throughput higher and more scalable with increasing partitions
c.f. 2020 results
0
0.5
1
1.5
2
2.5
1 10 100 1000 10000
Partitions vs. Throughput (M TPS)
ZK TPS (M) KRAFT TPS (M) 2020 TPS (M)
© Instaclustr Pty Limited, 2022
Experiment 2
§ How many partitions can we create?
§ Can we create more on a KRaft cluster c.f. ZK cluster?
§ How long does it take?
§ RF=1 otherwise background CPU due to replication too high
§ 50% CPU load on clusters with 100 partitions and no data or workload
© Instaclustr Pty Limited, 2022
Approaches Attempted
1. kafka-topics.sh -create topic with lots of partitions
2. kafka-topics.sh -alter topic with more partitions
3. curl with our provisioning API
4. script to create multiple topics with fixed (1000) partitions each
Problem!
All approaches failed eventually, some sooner rather than later...
© Instaclustr Pty Limited, 2022
Problem!
After some failures, the Kafka cluster was unusable,
even after restarting Kafka
(Source: Commons Wikipedia)
© Instaclustr Pty Limited, 2022
Errors Included
Error while executing topic command : The request timed out.
ERROR org.apache.kafka.common.errors.TimeoutException: The request timed out.
From curl: {"errors":[{"name":"Create
Topic","message":"org.apache.kafka.common.errors.RecordBatchTooLargeException
: The total record(s) size of 56991841 exceeds the maximum allowed batch size
of 8388608"}]}
org.apache.kafka.common.errors.DisconnectException: Cancelled createTopics
request with correlation id 3 due to node 2 being disconnected
org.apache.kafka.common.errors.DisconnectException: Cancelled
createPartitions request with correlation id 6 due to node 1 being
disconnected
* A historical error, “Shannon and Bill” = Bill Shannon – 1955-2020 (Sun, UNIX, J2EE)
panic("Shannon
and Bill* say
this can't
happen.");
© Instaclustr Pty Limited, 2022
Partition Creation Time
Eventual
Timeout
ZooKeeper O(n)
KRaft O(1)
Eventual
Failure
© Instaclustr Pty Limited, 2022
Incremental Approach
Time per 1000 partition
increment - increases
with total partitions
ZooKeeper slower than
KRaft
Slow process to create
many partitions!
And eventual failure
0
5
10
15
20
25
30
0 10000 20000 30000 40000 50000 60000 70000 80000 90000
Total Partitions
Time (s) per 1000 partition increment
ZK Increment time Kraft Increment time
© Instaclustr Pty Limited, 2022
Initial Conclusions?
§ Faster to create more partitions on KRaft c.f. ZK
§ There’s a limit of around 80,000 partitions on both ZK and KRaft clusters
§ And Kafka fails!
§ It’s very easy and quick to kill Kafka on KRaft – just try and create a 100k
partition topic
© Instaclustr Pty Limited, 2022
Experiment 3:
Reassign Partitions
A common Kafka operation—if a server fails, you can move all of the leader
partitions on it to other brokers
kafka-reassign-partitions.sh
§ Run once to get a plan, and then again to actually move the partitions
§ Moving partitions from 1 broker to the other 2 brokers
§ 10,000 partitions, RF=2
© Instaclustr Pty Limited, 2022
The Answer to Life, the
Universe, and Everything = 42s
ZooKeeper, 600
KRaft , 42
0
100
200
300
400
500
600
700
ZooKeeper KRaft
Time to Reassign 10,000 Partitions (s)
Note that there was no partition data,
so in real life the time to move data
would be dominant
© Instaclustr Pty Limited, 2022
Experiment 4:
Maximum Partitions
§ Final attempt to reach 1 Million+ Partitions on a cluster (RF=1 only however)
§ Used manual installation of Kafka 3.2.1. on large EC2 instance
§ Hit limits at around 30,000 partitions:
ERROR [BrokerMetadataPublisher id=1] Error publishing broker metadata at 33037
(kafka.server.metadata.BrokerMetadataPublisher)
java.io.IOException: Map failed
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 65536 bytes for committing reserved memory.
More RAM needed? No – didn’t help.
More file descriptors? 2 descriptors used per partition. Only 65535 by default on Linux. Increased –
still failed.
© Instaclustr Pty Limited, 2022
Experiment 4:
Maximum Partitions
§ Plenty of spare RAM but out of memory error
§ Googling found this:
§ KAFKA-6343 OOM as the result of creation of 5k topics (2017!)
§ Linux system setting: vm.max_map_count: Maximum number of memory map areas a
process may have.
§ Each partition uses 2 map areas, default is 65530, allowing a maximum of only 32765
partitions.
§ Set to a very large number, tried again…
§ Now just get a normal memory error:
§ “java.lang.OutOfMemoryError: Java heap space”
§ Tweaked JVM settings, and tried again…
© Instaclustr Pty Limited, 2022
1.9M Partitions > 1M à Success
0
200000
400000
600000
800000
1000000
1200000
1400000
1600000
1800000
2000000
ZooKeeper KRaft 1 Broker (measured) KRaft 3 Brokers (predicted)
Maximum Partitions
But How About the Batch Error?
§ Still painfully slow to create this many partitions due to the batch error when
creating too many partitions at once.
§ This is a real bug: KAFKA-14204: QuorumController must correctly handle
overly large batches
§ Fixed in 3.3.3. (maybe, not tested)
© Instaclustr Pty Limited, 2022
Use Cases for Lots of Partitions
1. Lots of topics! E.g. due to data model or security
2. High throughput
3. Slow consumers—shoppers with more groceries take longer at the checkout,
so you need more checkouts to service shoppers
Possible problems?
§ RF=3 à large clusters
§ Lots of consumers
§ Consumer resources
§ Consumer group balancing performance
§ Key values >> partitions, etc.
© Instaclustr Pty Limited, 2022
(Source: AdobeStock)
Little’s Law: Partitions = TP x RT
RT is Kafka consumer latency
0
0.2
0.4
0.6
0.8
1
1.2
0 2 4 6 8 10 12
Partitions
(M)
Throughput (M messages/s)
Minimum Partitions needed for target throughput and increasing consumer latency
Latency 1ms Latency 10ms Latency 100ms
Increasing TP and Latency à
More Partitions
Conclusions
What ZooKeeper KRaft Results
Reads and therefore data
layer operations
cached/replicated
FAST FAST Identical
Confirmed
Meta-data changes SLOW FAST Confirmed
Maximum Partitions LESS MORE Confirmed
Robustness YES WATCH OUT OS settings!
© Instaclustr Pty Limited, 2022
Kafka will soon abandon the
Zoo(Keeper) on a KRaft!
© Instaclustr Pty Limited, 2022
Kafka 3.3 (out soon)
KRaft Production Ready
Performance Engineering
Takeaways For Apache Software?
§ Hardware and Software changes will cause performance surprises
§ potentially due to underlying layers
§ Regular benchmarking, hypotheses, experiments, profiling, testing, etc
§ help improve community understanding and end-user experience of performance and scalability
§ Open source cloud providers have a useful role to play in
§ performance assurance, and
§ for providing insights into running, optimizing and using Apache technologies at scale in
production
(Source: Paul Brebner, Broken Hill, Australia)
www.instaclustr.com
info@instaclustr.com
@instaclustr
THANK
YOU!
© Instaclustr Pty Limited, 2022
https://www.instaclustr.com/company/policies/terms-conditions/
Except as permitted by the copyright law applicable to you, you may
not reproduce, distribute, publish, display, communicate or transmit
any of the content of this document, in any form, but any means,
without the prior written permission of Instaclustr Pty Limited
Ad

More Related Content

Similar to The Impact of Hardware and Software Version Changes on Apache Kafka Performance and Scalability (20)

Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Paul Brebner
 
AWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul Maddox
AWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul MaddoxAWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul Maddox
AWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul Maddox
AWS Riyadh User Group
 
Kafka streams decoupling with stores
Kafka streams decoupling with storesKafka streams decoupling with stores
Kafka streams decoupling with stores
Yoni Farin
 
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
javier ramirez
 
Citi Tech Talk: Hybrid Cloud
Citi Tech Talk: Hybrid CloudCiti Tech Talk: Hybrid Cloud
Citi Tech Talk: Hybrid Cloud
confluent
 
Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...
Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...
Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...
HostedbyConfluent
 
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningApache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Guido Schmutz
 
Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...
Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...
Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...
HostedbyConfluent
 
Accelerated SDN in Azure
Accelerated SDN in AzureAccelerated SDN in Azure
Accelerated SDN in Azure
Open Networking Summit
 
Not Your Mother's Kafka - Deep Dive into Confluent Cloud Infrastructure | Gwe...
Not Your Mother's Kafka - Deep Dive into Confluent Cloud Infrastructure | Gwe...Not Your Mother's Kafka - Deep Dive into Confluent Cloud Infrastructure | Gwe...
Not Your Mother's Kafka - Deep Dive into Confluent Cloud Infrastructure | Gwe...
HostedbyConfluent
 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journey
Peter Clapham
 
Machine learning at scale with aws sage maker
Machine learning at scale with aws sage makerMachine learning at scale with aws sage maker
Machine learning at scale with aws sage maker
PhilipBasford
 
JOSA TechTalks - Downgrade your Costs
JOSA TechTalks - Downgrade your CostsJOSA TechTalks - Downgrade your Costs
JOSA TechTalks - Downgrade your Costs
Jordan Open Source Association
 
Unleash oracle 12c performance with cisco ucs
Unleash oracle 12c performance with cisco ucsUnleash oracle 12c performance with cisco ucs
Unleash oracle 12c performance with cisco ucs
solarisyougood
 
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
HostedbyConfluent
 
Capital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream ProcessingCapital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream Processing
confluent
 
Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph
Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph
Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph
Ceph Community
 
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store
Azure + DataStax Enterprise (DSE) Powers Office365 Per User StoreAzure + DataStax Enterprise (DSE) Powers Office365 Per User Store
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store
DataStax Academy
 
Tradeoffs in Distributed Systems Design: Is Kafka The Best? (Ben Stopford and...
Tradeoffs in Distributed Systems Design: Is Kafka The Best? (Ben Stopford and...Tradeoffs in Distributed Systems Design: Is Kafka The Best? (Ben Stopford and...
Tradeoffs in Distributed Systems Design: Is Kafka The Best? (Ben Stopford and...
HostedbyConfluent
 
Phil Basford - machine learning at scale with aws sage maker
Phil Basford - machine learning at scale with aws sage makerPhil Basford - machine learning at scale with aws sage maker
Phil Basford - machine learning at scale with aws sage maker
AWSCOMSUM
 
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Paul Brebner
 
AWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul Maddox
AWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul MaddoxAWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul Maddox
AWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul Maddox
AWS Riyadh User Group
 
Kafka streams decoupling with stores
Kafka streams decoupling with storesKafka streams decoupling with stores
Kafka streams decoupling with stores
Yoni Farin
 
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
javier ramirez
 
Citi Tech Talk: Hybrid Cloud
Citi Tech Talk: Hybrid CloudCiti Tech Talk: Hybrid Cloud
Citi Tech Talk: Hybrid Cloud
confluent
 
Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...
Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...
Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...
HostedbyConfluent
 
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningApache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Guido Schmutz
 
Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...
Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...
Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...
HostedbyConfluent
 
Not Your Mother's Kafka - Deep Dive into Confluent Cloud Infrastructure | Gwe...
Not Your Mother's Kafka - Deep Dive into Confluent Cloud Infrastructure | Gwe...Not Your Mother's Kafka - Deep Dive into Confluent Cloud Infrastructure | Gwe...
Not Your Mother's Kafka - Deep Dive into Confluent Cloud Infrastructure | Gwe...
HostedbyConfluent
 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journey
Peter Clapham
 
Machine learning at scale with aws sage maker
Machine learning at scale with aws sage makerMachine learning at scale with aws sage maker
Machine learning at scale with aws sage maker
PhilipBasford
 
Unleash oracle 12c performance with cisco ucs
Unleash oracle 12c performance with cisco ucsUnleash oracle 12c performance with cisco ucs
Unleash oracle 12c performance with cisco ucs
solarisyougood
 
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
HostedbyConfluent
 
Capital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream ProcessingCapital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream Processing
confluent
 
Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph
Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph
Ceph Day Seoul - AFCeph: SKT Scale Out Storage Ceph
Ceph Community
 
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store
Azure + DataStax Enterprise (DSE) Powers Office365 Per User StoreAzure + DataStax Enterprise (DSE) Powers Office365 Per User Store
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store
DataStax Academy
 
Tradeoffs in Distributed Systems Design: Is Kafka The Best? (Ben Stopford and...
Tradeoffs in Distributed Systems Design: Is Kafka The Best? (Ben Stopford and...Tradeoffs in Distributed Systems Design: Is Kafka The Best? (Ben Stopford and...
Tradeoffs in Distributed Systems Design: Is Kafka The Best? (Ben Stopford and...
HostedbyConfluent
 
Phil Basford - machine learning at scale with aws sage maker
Phil Basford - machine learning at scale with aws sage makerPhil Basford - machine learning at scale with aws sage maker
Phil Basford - machine learning at scale with aws sage maker
AWSCOMSUM
 

More from Paul Brebner (20)

Streaming More For Less With Apache Kafka Tiered Storage
Streaming More For Less With Apache Kafka Tiered StorageStreaming More For Less With Apache Kafka Tiered Storage
Streaming More For Less With Apache Kafka Tiered Storage
Paul Brebner
 
30 Of My Favourite Open Source Technologies In 30 Minutes
30 Of My Favourite Open Source Technologies In 30 Minutes30 Of My Favourite Open Source Technologies In 30 Minutes
30 Of My Favourite Open Source Technologies In 30 Minutes
Paul Brebner
 
Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...
Paul Brebner
 
Architecting Applications With Multiple Open Source Big Data Technologies
Architecting Applications With Multiple Open Source Big Data TechnologiesArchitecting Applications With Multiple Open Source Big Data Technologies
Architecting Applications With Multiple Open Source Big Data Technologies
Paul Brebner
 
Apache ZooKeeper and Apache Curator: Meet the Dining Philosophers
Apache ZooKeeper and Apache Curator: Meet the Dining PhilosophersApache ZooKeeper and Apache Curator: Meet the Dining Philosophers
Apache ZooKeeper and Apache Curator: Meet the Dining Philosophers
Paul Brebner
 
Spinning your Drones with Cadence Workflows and Apache Kafka
Spinning your Drones with Cadence Workflows and Apache KafkaSpinning your Drones with Cadence Workflows and Apache Kafka
Spinning your Drones with Cadence Workflows and Apache Kafka
Paul Brebner
 
Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...
Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...
Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...
Paul Brebner
 
A Visual Introduction to Apache Kafka
A Visual Introduction to Apache KafkaA Visual Introduction to Apache Kafka
A Visual Introduction to Apache Kafka
Paul Brebner
 
Massively Scalable Real-time Geospatial Anomaly Detection with Apache Kafka a...
Massively Scalable Real-time Geospatial Anomaly Detection with Apache Kafka a...Massively Scalable Real-time Geospatial Anomaly Detection with Apache Kafka a...
Massively Scalable Real-time Geospatial Anomaly Detection with Apache Kafka a...
Paul Brebner
 
Building a real-time data processing pipeline using Apache Kafka, Kafka Conne...
Building a real-time data processing pipeline using Apache Kafka, Kafka Conne...Building a real-time data processing pipeline using Apache Kafka, Kafka Conne...
Building a real-time data processing pipeline using Apache Kafka, Kafka Conne...
Paul Brebner
 
Grid Middleware – Principles, Practice and Potential
Grid Middleware – Principles, Practice and PotentialGrid Middleware – Principles, Practice and Potential
Grid Middleware – Principles, Practice and Potential
Paul Brebner
 
Grid middleware is easy to install, configure, secure, debug and manage acros...
Grid middleware is easy to install, configure, secure, debug and manage acros...Grid middleware is easy to install, configure, secure, debug and manage acros...
Grid middleware is easy to install, configure, secure, debug and manage acros...
Paul Brebner
 
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Paul Brebner
 
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Paul Brebner
 
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Paul Brebner
 
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Paul Brebner
 
0b101000 years of computing: a personal timeline - decade "0", the 1980's
0b101000 years of computing: a personal timeline - decade "0", the 1980's0b101000 years of computing: a personal timeline - decade "0", the 1980's
0b101000 years of computing: a personal timeline - decade "0", the 1980's
Paul Brebner
 
ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...
ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...
ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...
Paul Brebner
 
ApacheCon2019 Talk: Kafka, Cassandra and Kubernetes at Scale – Real-time Ano...
ApacheCon2019 Talk: Kafka, Cassandra and Kubernetesat Scale – Real-time Ano...ApacheCon2019 Talk: Kafka, Cassandra and Kubernetesat Scale – Real-time Ano...
ApacheCon2019 Talk: Kafka, Cassandra and Kubernetes at Scale – Real-time Ano...
Paul Brebner
 
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
Paul Brebner
 
Streaming More For Less With Apache Kafka Tiered Storage
Streaming More For Less With Apache Kafka Tiered StorageStreaming More For Less With Apache Kafka Tiered Storage
Streaming More For Less With Apache Kafka Tiered Storage
Paul Brebner
 
30 Of My Favourite Open Source Technologies In 30 Minutes
30 Of My Favourite Open Source Technologies In 30 Minutes30 Of My Favourite Open Source Technologies In 30 Minutes
30 Of My Favourite Open Source Technologies In 30 Minutes
Paul Brebner
 
Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...
Paul Brebner
 
Architecting Applications With Multiple Open Source Big Data Technologies
Architecting Applications With Multiple Open Source Big Data TechnologiesArchitecting Applications With Multiple Open Source Big Data Technologies
Architecting Applications With Multiple Open Source Big Data Technologies
Paul Brebner
 
Apache ZooKeeper and Apache Curator: Meet the Dining Philosophers
Apache ZooKeeper and Apache Curator: Meet the Dining PhilosophersApache ZooKeeper and Apache Curator: Meet the Dining Philosophers
Apache ZooKeeper and Apache Curator: Meet the Dining Philosophers
Paul Brebner
 
Spinning your Drones with Cadence Workflows and Apache Kafka
Spinning your Drones with Cadence Workflows and Apache KafkaSpinning your Drones with Cadence Workflows and Apache Kafka
Spinning your Drones with Cadence Workflows and Apache Kafka
Paul Brebner
 
Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...
Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...
Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...
Paul Brebner
 
A Visual Introduction to Apache Kafka
A Visual Introduction to Apache KafkaA Visual Introduction to Apache Kafka
A Visual Introduction to Apache Kafka
Paul Brebner
 
Massively Scalable Real-time Geospatial Anomaly Detection with Apache Kafka a...
Massively Scalable Real-time Geospatial Anomaly Detection with Apache Kafka a...Massively Scalable Real-time Geospatial Anomaly Detection with Apache Kafka a...
Massively Scalable Real-time Geospatial Anomaly Detection with Apache Kafka a...
Paul Brebner
 
Building a real-time data processing pipeline using Apache Kafka, Kafka Conne...
Building a real-time data processing pipeline using Apache Kafka, Kafka Conne...Building a real-time data processing pipeline using Apache Kafka, Kafka Conne...
Building a real-time data processing pipeline using Apache Kafka, Kafka Conne...
Paul Brebner
 
Grid Middleware – Principles, Practice and Potential
Grid Middleware – Principles, Practice and PotentialGrid Middleware – Principles, Practice and Potential
Grid Middleware – Principles, Practice and Potential
Paul Brebner
 
Grid middleware is easy to install, configure, secure, debug and manage acros...
Grid middleware is easy to install, configure, secure, debug and manage acros...Grid middleware is easy to install, configure, secure, debug and manage acros...
Grid middleware is easy to install, configure, secure, debug and manage acros...
Paul Brebner
 
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Paul Brebner
 
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Paul Brebner
 
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Paul Brebner
 
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...
Paul Brebner
 
0b101000 years of computing: a personal timeline - decade "0", the 1980's
0b101000 years of computing: a personal timeline - decade "0", the 1980's0b101000 years of computing: a personal timeline - decade "0", the 1980's
0b101000 years of computing: a personal timeline - decade "0", the 1980's
Paul Brebner
 
ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...
ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...
ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...
Paul Brebner
 
ApacheCon2019 Talk: Kafka, Cassandra and Kubernetes at Scale – Real-time Ano...
ApacheCon2019 Talk: Kafka, Cassandra and Kubernetesat Scale – Real-time Ano...ApacheCon2019 Talk: Kafka, Cassandra and Kubernetesat Scale – Real-time Ano...
ApacheCon2019 Talk: Kafka, Cassandra and Kubernetes at Scale – Real-time Ano...
Paul Brebner
 
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
Paul Brebner
 
Ad

Recently uploaded (20)

Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Ad

The Impact of Hardware and Software Version Changes on Apache Kafka Performance and Scalability

  • 1. The Impact of Hardware and Software Version Changes on Apache Kafka® Performance and Scalability Paul Brebner, Hendra Gunadi, and more! Instaclustr by NetApp © Instaclustr Pty Limited, 2022 ApacheCon NA Performance Engineering Track 2022
  • 2. Who Am I? Previously § R&D in distributed systems and performance engineering Last 5 years § Technology Evangelist for Instaclustr by NetApp § 100+ Blogs, Demo Applications, Talks § Benchmarking and Performance Insights § Open Source Technologies including o Apache Cassandra®, Spark™, ZooKeeper, Kafka® o OpenSearch®, Redis™, PostgreSQL® o Uber’s Cadence® © Instaclustr Pty Limited, 2022
  • 3. Cloud Platform for Big Data Open Source Technologies Focus of this talk is on Apache Kafka® Instaclustr Managed Platform © Instaclustr Pty Limited, 2022
  • 4. Kafka is a distributed streams processing system, it allows distributed producers to send messages to distributed consumers via a Kafka cluster. What is Kafka? © Instaclustr Pty Limited, 2022
  • 5. And Change: Hardware and Software © Instaclustr Pty Limited, 2022
  • 6. Some changes have “obvious” performance impacts – e.g. Horse à Steam “The Iron Horse Wins” (!) Race in 1830 between steam locomotive and a horse, horse won (due to mechanical failure), but was obviously inferior. Source: https://www.fhwa.dot.gov/rakeman/1830.htm “The Iron Horse Wins” (It didn’t)
  • 7. Others are not so obvious – e.g. Electric + Steam Locomotive https://en.wikipedia.org/wiki/Electric-steam_locomotive#/media/File:SBB_Ee_3-3_8521.png
  • 8. Part 1: Hardware Change (Source: Shutterstock) © Instaclustr Pty Limited, 2022
  • 9. Hardware Change: CISC VAX 11/780 CISC = Complex Instruction Set Computer University of Waikato NZ 1980-85 © Instaclustr Pty Limited, 2022
  • 10. CISC to RISC Pyramid Technology RISC = Reduced Instruction Set Computer UNSW Sydney Australia 2nd half of 1980s © Instaclustr Pty Limited, 2022
  • 11. More Recently: Intel PC https://commons.wikimedia.org/wiki/File:HP_OMEN_X_900_Gaming_Desktop_PC.jpg © Instaclustr Pty Limited, 2022
  • 12. Intel PC to iPhone?! https://commons.wikimedia.org/wiki/File:IPhone_12_Pro_Max_-_3.jpg © Instaclustr Pty Limited, 2022
  • 13. Acorn BBC Micro Computer: 1980s https://www.classic-computers.org.nz/collection/BBCb-1920x.jpg © Instaclustr Pty Limited, 2022
  • 14. Fast Forward 40 Years: Have Gravitons Been Discovered? https://commons.wikimedia.org/wiki/File:Standard_Model_of_Elementary_Particles_%2B_Gravity.svg Discovered? © Instaclustr Pty Limited, 2022
  • 15. § RISC for Servers! • ARM-based New AWS instance types • Designed by ARM, formerly Advanced RISC Machines and originally Acorn RISC Machine o made the BBC micro § Real Cores • 64 Cores, 256GB RAM per CPU • No hyperthreading • Each vCPU = 1 physical core § Benchmarking • Reported to be up to 40% faster than Intel and AMD § Less Power Consumption • So Cheaper and Faster Fast Forward 40 Years to the AWS Graviton2 © Instaclustr Pty Limited, 2022
  • 16. § Apache Kafka® deployed on § R5 (Intel) vs. R6g (Graviton2) instances § New R6g configuration: • AWS Gp3 disks • Java 11 OpenJDK à Amazon Corretto • Client à Broker encryption enabled § Hoping for easy and large performance gains… Initial Benchmarking © Instaclustr Pty Limited, 2022
  • 17. Initial Results: 40% Worse! 0 0.5 1 1.5 2 2.5 3 R5 R6G Initial Results (Million messages/s) © Instaclustr Pty Limited, 2022
  • 18. Why? Hypotheses and Tests 0 0.5 1 1.5 2 2.5 3 3.5 R5 R6G R6G-OpenJDK R6G-GP2 R6G-x2 Initial Results (Million messages/s): Hypotheses and Tests © Instaclustr Pty Limited, 2022
  • 19. Deep Dive: CPU Profiling with Flame Graphs Hottest code is widest Stack Function Calls Crypto/SSL Problem?! Java JVM Kernel © Instaclustr Pty Limited, 2022
  • 20. Encryption Off: Better! 200% improvement, 18% better than R5 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 R5 R6G Encryption Off (Millions messages/s) © Instaclustr Pty Limited, 2022
  • 21. § Cryptography obviously has a big overhead • but we still need it turned on… § An alternative? • Try Amazon Corretto Crypto Provider (ACCP) Solution? ACCP © Instaclustr Pty Limited, 2022
  • 22. Encryption On and ACCP Comparable Performance à Cheaper 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 R5 R6G R5-ACCP R6G-ACCP Encryption On and ACCP (Millions messages/s) © Instaclustr Pty Limited, 2022
  • 23. § OpenJDK/Amazon Corretto encryption on Graviton was slow • Due to lack of support for Intel operation that sped up cryptography § Amazon Corretto Crypto Provider (ACCP) • Uses OpenSSL, written in C, and faster! § ACCP no longer needed • As there’s a patch for OpenJDK (JDK-8267993 & JDK-8271567) Explanation? © Instaclustr Pty Limited, 2022
  • 24. Part 2: Software Change © Instaclustr Pty Limited, 2022
  • 25. Kafka Topic Partitions Enable Consumer Concurrency partitions >= consumers Partition n Topic “Parties” Partition 1 Producer Partition 2 Consumer Group Consumer Consumer Consumers share work within groups Consumer © Instaclustr Pty Limited, 2022
  • 26. But Partitions Are Expensive: Replication and Meta-Data Management © Instaclustr Pty Limited, 2022
  • 27. Kafka Controller § The Kafka Controller manages broker, topic, and partition meta-data—Kafka’s “Brain” § But which controller is active and where is the meta-data stored? Kafka Broker Kafka Broker Kafka Broker Active Controller Controller Controller Control Plane (Meta-data) Data Plane (Source: Shutterstock) © Instaclustr Pty Limited, 2022
  • 28. Apache ZooKeeper® Kafka Broker Kafka Broker Kafka Broker Active Controller ZK 1 ZK 2 Leader ZK 3 Kafka Cluster Kafka Cluster Data ZooKeeper Ensemble Kafka Cluster Metadata Changes, failures, and recovery à SLOW! Controller Controller Kafka Cluster Metadata cached in brokers ZooKeeper used for Controller election and storing meta-data Meta-data changes and recovery from failover are SLOW; Reads are fast due to caching © Instaclustr Pty Limited, 2022
  • 29. New KRaft Mode Kafka Broker Kafka Broker Kafka Broker Active Controller § Kafka Cluster Metadata—only stored in Kafka so fast and scalable § Kafka Cluster Metadata replicated to all brokers, very fast failover § Kafka Cluster Data Controller Controller Kafka Cluster KRaft à Fast Active Controller is Quorum Leader (using Raft to elect leader) The Raft Mascot Kafka + Raft Consensus Algorithm = KRaft © Instaclustr Pty Limited, 2022
  • 30. Hypotheses What ZooKeeper KRaft Reads and therefore data layer operations cached/replicated FAST FAST Meta-data changes and recovery from failover SLOW FAST Partitions per cluster LESS MORE Robustness GOOD UNKNOWN © Instaclustr Pty Limited, 2022
  • 31. Experiment 1: Message Throughput Benchmarking Hypothesis: There will be no or only minimal difference between ZooKeeper and KRaft message throughput Why? § As ZK and Kraft are only concerned with meta-data management, not data workloads § Kafka producers only need read-only access to partition meta-data How? § Kafka 3.1.1. on identical AWS R6G.large x 3 nodes clusters © Instaclustr Pty Limited, 2022
  • 32. Performance: Partitions vs. Throughput (x-axis log)- identical, cliff > 1000 0 0.5 1 1.5 2 2.5 1 10 100 1000 10000 Partitions vs. Throughput (M TPS) ZK TPS (M) KRAFT TPS (M) © Instaclustr Pty Limited, 2022
  • 33. Partitions vs. Latency (ms): Identical, Worse > 1000 Partitions 0 200 400 600 800 1000 1200 1 10 100 1000 10000 Partitions vs. Latency (ms) ZK Latency (ms) KRAFT Latency (ms) © Instaclustr Pty Limited, 2022
  • 34. Comparison With Previous Experiments (2020)—Clusters Configuration Instances Nodes Total cores RF Kafka version Date bytes per msg Original cluster r5.xlarge 3 12 3 2.3 Jan-20 80 New cluster r6g.large 3 6 3 3.1.1 Aug-22 8 § Note apples-to-oranges comparison as almost everything is different! § Also not directly comparable with results from 1st part of the talk (Source: Shutterstock) © Instaclustr Pty Limited, 2022
  • 35. Throughput higher and more scalable with increasing partitions c.f. 2020 results 0 0.5 1 1.5 2 2.5 1 10 100 1000 10000 Partitions vs. Throughput (M TPS) ZK TPS (M) KRAFT TPS (M) 2020 TPS (M) © Instaclustr Pty Limited, 2022
  • 36. Experiment 2 § How many partitions can we create? § Can we create more on a KRaft cluster c.f. ZK cluster? § How long does it take? § RF=1 otherwise background CPU due to replication too high § 50% CPU load on clusters with 100 partitions and no data or workload © Instaclustr Pty Limited, 2022
  • 37. Approaches Attempted 1. kafka-topics.sh -create topic with lots of partitions 2. kafka-topics.sh -alter topic with more partitions 3. curl with our provisioning API 4. script to create multiple topics with fixed (1000) partitions each Problem! All approaches failed eventually, some sooner rather than later... © Instaclustr Pty Limited, 2022
  • 38. Problem! After some failures, the Kafka cluster was unusable, even after restarting Kafka (Source: Commons Wikipedia) © Instaclustr Pty Limited, 2022
  • 39. Errors Included Error while executing topic command : The request timed out. ERROR org.apache.kafka.common.errors.TimeoutException: The request timed out. From curl: {"errors":[{"name":"Create Topic","message":"org.apache.kafka.common.errors.RecordBatchTooLargeException : The total record(s) size of 56991841 exceeds the maximum allowed batch size of 8388608"}]} org.apache.kafka.common.errors.DisconnectException: Cancelled createTopics request with correlation id 3 due to node 2 being disconnected org.apache.kafka.common.errors.DisconnectException: Cancelled createPartitions request with correlation id 6 due to node 1 being disconnected * A historical error, “Shannon and Bill” = Bill Shannon – 1955-2020 (Sun, UNIX, J2EE) panic("Shannon and Bill* say this can't happen."); © Instaclustr Pty Limited, 2022
  • 40. Partition Creation Time Eventual Timeout ZooKeeper O(n) KRaft O(1) Eventual Failure © Instaclustr Pty Limited, 2022
  • 41. Incremental Approach Time per 1000 partition increment - increases with total partitions ZooKeeper slower than KRaft Slow process to create many partitions! And eventual failure 0 5 10 15 20 25 30 0 10000 20000 30000 40000 50000 60000 70000 80000 90000 Total Partitions Time (s) per 1000 partition increment ZK Increment time Kraft Increment time © Instaclustr Pty Limited, 2022
  • 42. Initial Conclusions? § Faster to create more partitions on KRaft c.f. ZK § There’s a limit of around 80,000 partitions on both ZK and KRaft clusters § And Kafka fails! § It’s very easy and quick to kill Kafka on KRaft – just try and create a 100k partition topic © Instaclustr Pty Limited, 2022
  • 43. Experiment 3: Reassign Partitions A common Kafka operation—if a server fails, you can move all of the leader partitions on it to other brokers kafka-reassign-partitions.sh § Run once to get a plan, and then again to actually move the partitions § Moving partitions from 1 broker to the other 2 brokers § 10,000 partitions, RF=2 © Instaclustr Pty Limited, 2022
  • 44. The Answer to Life, the Universe, and Everything = 42s ZooKeeper, 600 KRaft , 42 0 100 200 300 400 500 600 700 ZooKeeper KRaft Time to Reassign 10,000 Partitions (s) Note that there was no partition data, so in real life the time to move data would be dominant © Instaclustr Pty Limited, 2022
  • 45. Experiment 4: Maximum Partitions § Final attempt to reach 1 Million+ Partitions on a cluster (RF=1 only however) § Used manual installation of Kafka 3.2.1. on large EC2 instance § Hit limits at around 30,000 partitions: ERROR [BrokerMetadataPublisher id=1] Error publishing broker metadata at 33037 (kafka.server.metadata.BrokerMetadataPublisher) java.io.IOException: Map failed # There is insufficient memory for the Java Runtime Environment to continue. # Native memory allocation (mmap) failed to map 65536 bytes for committing reserved memory. More RAM needed? No – didn’t help. More file descriptors? 2 descriptors used per partition. Only 65535 by default on Linux. Increased – still failed. © Instaclustr Pty Limited, 2022
  • 46. Experiment 4: Maximum Partitions § Plenty of spare RAM but out of memory error § Googling found this: § KAFKA-6343 OOM as the result of creation of 5k topics (2017!) § Linux system setting: vm.max_map_count: Maximum number of memory map areas a process may have. § Each partition uses 2 map areas, default is 65530, allowing a maximum of only 32765 partitions. § Set to a very large number, tried again… § Now just get a normal memory error: § “java.lang.OutOfMemoryError: Java heap space” § Tweaked JVM settings, and tried again… © Instaclustr Pty Limited, 2022
  • 47. 1.9M Partitions > 1M à Success 0 200000 400000 600000 800000 1000000 1200000 1400000 1600000 1800000 2000000 ZooKeeper KRaft 1 Broker (measured) KRaft 3 Brokers (predicted) Maximum Partitions
  • 48. But How About the Batch Error? § Still painfully slow to create this many partitions due to the batch error when creating too many partitions at once. § This is a real bug: KAFKA-14204: QuorumController must correctly handle overly large batches § Fixed in 3.3.3. (maybe, not tested) © Instaclustr Pty Limited, 2022
  • 49. Use Cases for Lots of Partitions 1. Lots of topics! E.g. due to data model or security 2. High throughput 3. Slow consumers—shoppers with more groceries take longer at the checkout, so you need more checkouts to service shoppers Possible problems? § RF=3 à large clusters § Lots of consumers § Consumer resources § Consumer group balancing performance § Key values >> partitions, etc. © Instaclustr Pty Limited, 2022 (Source: AdobeStock)
  • 50. Little’s Law: Partitions = TP x RT RT is Kafka consumer latency 0 0.2 0.4 0.6 0.8 1 1.2 0 2 4 6 8 10 12 Partitions (M) Throughput (M messages/s) Minimum Partitions needed for target throughput and increasing consumer latency Latency 1ms Latency 10ms Latency 100ms Increasing TP and Latency à More Partitions
  • 51. Conclusions What ZooKeeper KRaft Results Reads and therefore data layer operations cached/replicated FAST FAST Identical Confirmed Meta-data changes SLOW FAST Confirmed Maximum Partitions LESS MORE Confirmed Robustness YES WATCH OUT OS settings! © Instaclustr Pty Limited, 2022
  • 52. Kafka will soon abandon the Zoo(Keeper) on a KRaft! © Instaclustr Pty Limited, 2022 Kafka 3.3 (out soon) KRaft Production Ready
  • 53. Performance Engineering Takeaways For Apache Software? § Hardware and Software changes will cause performance surprises § potentially due to underlying layers § Regular benchmarking, hypotheses, experiments, profiling, testing, etc § help improve community understanding and end-user experience of performance and scalability § Open source cloud providers have a useful role to play in § performance assurance, and § for providing insights into running, optimizing and using Apache technologies at scale in production (Source: Paul Brebner, Broken Hill, Australia)
  • 54. www.instaclustr.com [email protected] @instaclustr THANK YOU! © Instaclustr Pty Limited, 2022 https://www.instaclustr.com/company/policies/terms-conditions/ Except as permitted by the copyright law applicable to you, you may not reproduce, distribute, publish, display, communicate or transmit any of the content of this document, in any form, but any means, without the prior written permission of Instaclustr Pty Limited