Performance Tuning RocksDB
for Kafka Streams’ State Store
Dhruba Borthakur (Rockset), Bruno Cadonna (Confluent)
About the Presenters
Dhruba Borthakur
CTO & Co-founder Rockset
rockset.com
Bruno Cadonna
Contributor to Apache Kafka &
Software Engineer at Confluent
confluent.io
Agenda
• Kafka Streams and State Stores
• Introduction to RocksDB
• Compaction Styles in RocksDB
• Possible Operational Issues
• Tuning RocksDB
• RocksDB Command Line Utilities
• Takeaways
Kafka Streams and State Stores
Kafka Streams
● Stateless and stateful processors
● Stateful processors use state stores
● Create one topology per input partition, i.e., task
State Stores in Kafka Streams
• A stateful processor may use one or more state stores
• Each partition has its own state store
• State stores are layered:
  • Metrics & De-/Serialization layer: collects metrics and de-/serializes records
  • Caching layer: caches records
  • Changelogging layer: writes records to changelog
  • Local state store: writes records to local state store
• State stores are restored from changelog topics
• Restoration is byte-based and bypasses the wrapping layers
RocksDB is the Default State Store
• Kafka Streams needed a write-optimized state store
• Kafka Streams 2.6 uses RocksDB 5.18.4
• Kafka Streams provides metrics to monitor RocksDB state stores
• RocksDB can be configured by passing a class that implements interface RocksDBConfigSetter to configuration rocksdb.config.setter
Example: Configuring RocksDB in Kafka Streams
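import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.CompactionStyle;
import org.rocksdb.Options;

// Declared as a nested class here; drop "static" for a top-level class.
public static class MyRocksDBConfig implements RocksDBConfigSetter {
    @Override
    public void setConfig(final String storeName,
                          final Options options,
                          final Map<String, Object> configs) {
        // e.g. set compaction style
        options.setCompactionStyle(CompactionStyle.LEVEL);
    }

    @Override
    public void close(final String storeName, final Options options) {
        // Nothing to release: this setter creates no RocksDB objects of its own.
    }
}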
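A config setter class only takes effect once it is named under rocksdb.config.setter. The sketch below shows one way to do that with plain java.util.Properties. The class StreamsProps, the application ID, the bootstrap server and the class name com.example.MyRocksDBConfig are placeholders for illustration, not part of the talk.

```java
import java.util.Properties;

final class StreamsProps {
    // Configuration key named on the slide above.
    static final String ROCKSDB_CONFIG_SETTER = "rocksdb.config.setter";

    // Builds Kafka Streams properties. All three arguments are placeholders
    // that an application would replace with its own values.
    static Properties build(String applicationId,
                            String bootstrapServers,
                            String configSetterClassName) {
        Properties props = new Properties();
        props.put("application.id", applicationId);
        props.put("bootstrap.servers", bootstrapServers);
        // Fully qualified name of the class implementing RocksDBConfigSetter.
        props.put(ROCKSDB_CONFIG_SETTER, configSetterClassName);
        return props;
    }
}
```

The resulting Properties object would then be passed to the KafkaStreams constructor together with the topology.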
Introduction to RocksDB
What is RocksDB?
• Key-value persistent store
• Embedded C++ & Java library
• Server workloads
What is it not?
• Not distributed
• No failover
• Not highly-available. If the machine dies, you lose your data
• Focus on performance
Kafka Streams makes it fault-tolerant
RocksDB API
• Keys and values are byte arrays
• Data are stored sorted by key
• Update Operations: Put/Delete/Merge
• Queries: Get/Iterator
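To make "byte arrays, sorted by key" concrete, here is a toy in-memory stand-in, not the RocksDB API. It orders keys bytewise (unsigned, lexicographic), which is how RocksDB's default comparator orders them. The class SortedByteStore and its methods are invented for this sketch, and Merge is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

final class SortedByteStore {
    // Unsigned lexicographic order over the raw key bytes.
    private static final Comparator<byte[]> BYTEWISE = Arrays::compareUnsigned;

    private final TreeMap<byte[], byte[]> data = new TreeMap<>(BYTEWISE);

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    void put(byte[] key, byte[] value) { data.put(key.clone(), value.clone()); }

    void delete(byte[] key) { data.remove(key); }

    byte[] get(byte[] key) { return data.get(key); }

    // Walks all keys in sorted order, as an iterator over the store would.
    List<String> keysInOrder() {
        List<String> keys = new ArrayList<>();
        for (byte[] key : data.keySet()) {
            keys.add(new String(key, StandardCharsets.UTF_8));
        }
        return keys;
    }
}
```

One practical consequence: the serialized form of a key decides its position, so range scans follow byte order rather than the order of the original objects.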
Log Structured Merge Architecture
[Diagram: write requests from the application go to read-write data in RAM and to a transaction log; periodic compaction produces read-only data on SSD or disk; scan requests from the application are also shown.]
RocksDB Write Path
[Diagram: a write request goes to the active MemTable and its log; a switch turns the active MemTable into a read-only MemTable; a flush writes read-only MemTables to sst files; compaction merges sst files.]
RocksDB Reads
• Data can be in memory or on disk
• Consult multiple files to find the latest instance of the key
• Use bloom filters to reduce IO
  • Every sst file has a bloom filter
  • bloom filters are cached in memory
  • default config: eliminates 99% of unnecessary file reads
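The read path can be sketched with a toy model, which is not RocksDB code. It checks the memtable first, then the files from newest to oldest, and skips any file whose bloom filter rules the key out. The class ToyLsmReader, the two-hash filter and the 1024-bit filter size are assumptions made for illustration, and deletes are not modelled.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ToyLsmReader {
    /** An immutable sorted run standing in for one sst file, with a tiny bloom filter. */
    static final class SstFile {
        private static final int BITS = 1024;
        private final TreeMap<String, String> entries;
        private final BitSet bloom = new BitSet(BITS);

        SstFile(Map<String, String> sortedEntries) {
            this.entries = new TreeMap<>(sortedEntries);
            for (String key : entries.keySet()) {
                bloom.set(slot(key, 17));
                bloom.set(slot(key, 31));
            }
        }

        private static int slot(String key, int seed) {
            return Math.floorMod(key.hashCode() * seed + seed, BITS);
        }

        // false: the key is definitely not in this file; true: it may be.
        boolean mightContain(String key) {
            return bloom.get(slot(key, 17)) && bloom.get(slot(key, 31));
        }

        String read(String key) { return entries.get(key); }
    }

    private final TreeMap<String, String> memTable = new TreeMap<>();
    private final List<SstFile> filesNewestFirst = new ArrayList<>();
    int fileReads = 0;  // counts lookups that actually touched a file

    void put(String key, String value) { memTable.put(key, value); }

    // Turns the memtable into a new file that is searched before older files.
    void flush() {
        filesNewestFirst.add(0, new SstFile(memTable));
        memTable.clear();
    }

    String get(String key) {
        String value = memTable.get(key);
        if (value != null) return value;
        for (SstFile file : filesNewestFirst) {
            if (!file.mightContain(key)) continue;  // bloom filter avoided a file read
            fileReads++;
            value = file.read(key);
            if (value != null) return value;        // newest instance wins
        }
        return null;
    }
}
```

Because files are searched newest first, the lookup stops at the latest instance of a key and older instances in older files are never read.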
RocksDB Read Path
[Diagram: a read request Get(k) consults the active MemTable and the read-only MemTable in memory, then uses the bloom filters to decide which sst files in persistent storage to read.]
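RocksDB Architecture
[Diagram: combined write and read paths. In memory: active MemTable, read-only MemTable and read-only BlockCache. In persistent storage: logs and sst files. Write requests enter the active MemTable; switch, flush and compaction move data to sst files; read requests are served from the MemTables, the BlockCache and the sst files.]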
RocksDB Open & Pluggable
[Diagram: the LSM architecture annotated with the parts that can be replaced or customized: pluggable compaction, pluggable sst data format on storage, pluggable Memtable format in RAM, blooms, and a customizable WAL (transaction log). Write requests and get or scan requests come from the application.]
Compaction Styles in RocksDB
What is Compaction
• Multi-threaded
• Parallel compactions on different parts of the database
• Deletes overwritten keys
• Two types of compactions
  • level compaction
  • universal compaction
Level Compaction
• RocksDB default compaction is Level Compaction (for read heavy workloads)
• Stores data in multiple levels
  • More recent data stored in L0
  • Older data stored in Lmax
• Files in L0
  • overlapping keys, sorted by flush time
• Files in L1 to Lmax
  • non overlapping keys, sorted by key
• Max space amplification = 10%
https://github.com/facebook/rocksdb/wiki/Leveled-Compaction
Universal Compaction
• For write heavy workloads
  • needed if Level style compaction is bottlenecked by disk throughput
• Stores all files in L0
• All files are arranged in time order
• Decreases write amplification but increases space amplification
• Pick up files that are chronologically adjacent to one another
  • merge them
  • replace them with a new file in L0
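The last step can be illustrated with a toy model, which is not RocksDB's actual picking logic. Files are kept in time order, a chronologically adjacent range is merged, and the merged file takes the place of its inputs. The class ToyUniversalCompaction and the use of sorted maps as files are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class ToyUniversalCompaction {
    /**
     * Merges the files at positions [from, to) and puts the merged file in their place.
     * The list is in time order, oldest first, so values from later files
     * overwrite values from earlier ones.
     */
    static List<TreeMap<String, String>> compact(
            List<TreeMap<String, String>> filesOldestFirst, int from, int to) {
        TreeMap<String, String> merged = new TreeMap<>();
        for (TreeMap<String, String> file : filesOldestFirst.subList(from, to)) {
            merged.putAll(file);  // the newer value replaces the overwritten one
        }
        List<TreeMap<String, String>> result =
                new ArrayList<>(filesOldestFirst.subList(0, from));
        result.add(merged);
        result.addAll(filesOldestFirst.subList(to, filesOldestFirst.size()));
        return result;
    }
}
```

Merging only adjacent files keeps the time order of the remaining files intact, so "newest wins" still holds for every later read or compaction.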
Possible Operational Issues
Operational Issues
• High memory usage
  • Application gets slower or even crashes
  • Operating system shows high memory usage
  • Kafka Streams metrics for monitoring memory usage of RocksDB (KIP-607, planned for 2.7) show high values
• High disk usage
  • Application crashes with I/O errors
  • Operating system shows high disk usage
Operational Issues
• High disk I/O
  • Operating system shows high disk I/O
  • Kafka Streams metrics with high values
    • memtable-bytes-flushed-[rate | total]
    • bytes-[read | written]-compaction-rate
  • Kafka Streams metrics with low values
    • memtable-hit-ratio
    • block-cache-[data | index | filter]-hit-ratio
• Write stalls
  • Processing latency of the application increases
  • Kafka Streams client gets kicked out of the group
  • Kafka Streams metric write-stall-duration-[avg | total] shows high values
Operational Issues
• Too many open files
  • Application crashes with I/O errors
  • Kafka Streams metric number-open-files shows high values
Operational Issues
• Kafka Streams client gets kicked out of the consumer group during restoration
  • Before 2.6, Kafka Streams used RocksDB’s bulk loading feature (Options#prepareForBulkLoad()) to restore the state store faster
  • Bulk loading basically consists of:
    • disabling automatic compaction
    • writing all data to level 0
    • triggering a manual compaction
  • Manual compaction is a blocking call that may take longer than max.poll.interval.ms
  • Bulk loading was removed in 2.6
  • Alternatives that use other RocksDB features, e.g., ingesting SST files directly, are currently being evaluated to speed up state store restoration
Tuning RocksDB
Debug Kafka Streams OOM
• Memory consumption
  • memtable (for writes)
    • memtable size, number of memtables
  • block cache (for reads)
    • configure it to be shared among all the partitions of the Kafka Streams store
    • Kafka Streams keeps index blocks in the block cache
  • rocksdb-java bugs (https://github.com/facebook/rocksdb/issues/6247)
• High disk usage
  • Use level compaction instead of universal compaction
  • provision more disk space
https://docs.confluent.io/current/streams/developer-guide/memory-mgmt.html
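A back-of-the-envelope calculation shows why memtables and block caches add up across partitions. The sketch below is only an upper-bound estimate under stated assumptions: every RocksDB instance fills all of its memtables, and an unshared block cache fills completely. It ignores any other native memory. The class RocksDbMemoryEstimate is invented for this sketch.

```java
final class RocksDbMemoryEstimate {
    // Worst case for memtables: every instance has all of its write buffers full.
    static long memtableBytes(long writeBufferSize, int maxWriteBufferNumber, int instances) {
        return writeBufferSize * maxWriteBufferNumber * instances;
    }

    // One cache per instance unless a single cache is shared by all instances.
    static long blockCacheBytes(long blockCacheSize, int instances, boolean shared) {
        return shared ? blockCacheSize : blockCacheSize * instances;
    }
}
```

With illustrative values (16 MiB write buffers, 3 buffers per instance, a 50 MiB block cache and 40 RocksDB instances on one machine), the memtables can reach 1920 MiB and unshared block caches 2000 MiB, while a single shared cache stays at 50 MiB.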
Debug write stalls
• Debug write stalls in RocksDB
  • Is disk IO utilization at 100%?
    • add more storage spindles
    • use universal compaction
  • Check number of background compaction threads
    • Kafka Streams uses Max(2, number of available processors) by default
  • Check memtable configuration
    • AdvancedColumnFamilyOptions.max_write_buffer_number
    • ColumnFamilyOptions.write_buffer_size
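The knobs listed above could be set from a RocksDBConfigSetter along these lines. This is a configuration sketch, not a recommendation: the class name WriteStallTuningConfig and all numeric values are made up for illustration and would need to be validated against the write-stall metrics of the actual application. It needs the kafka-streams and rocksdbjni libraries on the classpath.

```java
import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.CompactionStyle;
import org.rocksdb.Options;

public class WriteStallTuningConfig implements RocksDBConfigSetter {
    @Override
    public void setConfig(final String storeName,
                          final Options options,
                          final Map<String, Object> configs) {
        // Memtable configuration (illustrative values).
        options.setWriteBufferSize(32 * 1024 * 1024L);  // write_buffer_size
        options.setMaxWriteBufferNumber(4);             // max_write_buffer_number

        // Threads available for background flushes and compactions (illustrative value).
        options.setIncreaseParallelism(4);

        // Universal compaction trades space amplification for fewer writes to disk.
        options.setCompactionStyle(CompactionStyle.UNIVERSAL);
    }

    @Override
    public void close(final String storeName, final Options options) {
        // Nothing to release: this setter creates no RocksDB objects of its own.
    }
}
```

Larger or more numerous memtables increase memory usage per state store, so these settings interact with the memory considerations from the OOM slide.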
Debugging file descriptor issues
• Too many open files
  • DBOptions.max_open_files = -1 (default)
    • opens all sst files at db open time
    • good for performance but can run out of file descriptors
  • Increase operating system number of open file descriptors
  • Set DBOptions.max_open_files = 10000
    • will open a max of 10000 files concurrently
  • Decrease number of files by making each file larger
    • AdvancedColumnFamilyOptions.target_file_size_base = 128 MB (default is 64 MB)
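The effect of a larger target file size on the number of files can be estimated with simple arithmetic. The sketch below assumes every sst file is exactly the target size, which real files are not, so treat the result as an order of magnitude. The class SstFileCountEstimate is invented for this sketch. In a RocksDBConfigSetter, the two options above correspond to Options#setMaxOpenFiles and Options#setTargetFileSizeBase.

```java
final class SstFileCountEstimate {
    // Rough estimate: total data size divided by the target file size, rounded up.
    static long estimateFiles(long totalBytes, long targetFileSizeBytes) {
        return (totalBytes + targetFileSizeBytes - 1) / targetFileSizeBytes;
    }
}
```

For example, 100 GiB of state at 64 MiB per file is about 1600 files, and about 800 files at 128 MiB.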
RocksDB Command Line Utilities
Build rocksdb command line utilities
git clone git@github.com:facebook/rocksdb.git
cd rocksdb
make ldb sst_dump
cp ldb /usr/local/bin
cp sst_dump /usr/local/bin
Useful RocksDB command line tools: https://github.com/facebook/rocksdb/wiki/Administration-and-Data-Access-Tool
# change these values accordingly
APP_ID=my-app
STATE_STORE=my-counts
STATE_STORE_DIR=/tmp/kafka-streams
TASKS=$(ls $STATE_STORE_DIR/$APP_ID)
Useful commands
# View all keys
for i in $TASKS; do
  ldb --db=$STATE_STORE_DIR/$APP_ID/$i/rocksdb/$STATE_STORE scan 2>/dev/null;
done
# Show table properties
for i in $TASKS; do
  TABLE_PROPERTIES=$(sst_dump --file=$STATE_STORE_DIR/$APP_ID/$i/rocksdb/$STATE_STORE --show_properties)
  echo -e "Table properties for task: $i\n$TABLE_PROPERTIES\n\n"
done
Useful commands- Example output
# example output
Table properties for task: 1_9
from [] to []
Process /tmp/kafka-streams/my-app/1_9/rocksdb/my-counts/000006.sst
Sst file format: block-based
Table Properties:
------------------------------
# data blocks: 1
# entries: 2
raw key size: 18
raw average key size: 9.000000
raw value size: 88
raw average value size: 44.000000
data block size: 125
index block size: 35
filter block size: 0
(estimated) table size: 160
Takeaways
• RocksDB is the default state store in Kafka Streams
• Kafka Streams provides functionality to configure and monitor RocksDB
• RocksDB uses a log structured merge (LSM) architecture with different compaction styles
• You might run into operational issues, but you can solve them by debugging and tuning RocksDB
• RocksDB offers command line utilities for analysing state stores
Thank you!
dhruba@rockset.com
bruno@confluent.io
cnfl.io/meetups cnfl.io/slack cnfl.io/blog
Learn how Rockset uses RocksDB
https://rockset.com/blog/how-we-use-rocksdb-at-rockset/
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
confluent
 
Webinar Think Right - Shift Left - 19-03-2025.pptx
Webinar Think Right - Shift Left - 19-03-2025.pptxWebinar Think Right - Shift Left - 19-03-2025.pptx
Webinar Think Right - Shift Left - 19-03-2025.pptx
confluent
 
Migration, backup and restore made easy using Kannika
Migration, backup and restore made easy using KannikaMigration, backup and restore made easy using Kannika
Migration, backup and restore made easy using Kannika
confluent
 
Five Things You Need to Know About Data Streaming in 2025
Five Things You Need to Know About Data Streaming in 2025Five Things You Need to Know About Data Streaming in 2025
Five Things You Need to Know About Data Streaming in 2025
confluent
 
Data in Motion Tour Seoul 2024 - Keynote
Data in Motion Tour Seoul 2024 - KeynoteData in Motion Tour Seoul 2024 - Keynote
Data in Motion Tour Seoul 2024 - Keynote
confluent
 
Data in Motion Tour Seoul 2024 - Roadmap Demo
Data in Motion Tour Seoul 2024  - Roadmap DemoData in Motion Tour Seoul 2024  - Roadmap Demo
Data in Motion Tour Seoul 2024 - Roadmap Demo
confluent
 
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
confluent
 
Confluent per il settore FSI: Accelerare l'Innovazione con il Data Streaming...
Confluent per il settore FSI:  Accelerare l'Innovazione con il Data Streaming...Confluent per il settore FSI:  Accelerare l'Innovazione con il Data Streaming...
Confluent per il settore FSI: Accelerare l'Innovazione con il Data Streaming...
confluent
 
Data in Motion Tour 2024 Riyadh, Saudi Arabia
Data in Motion Tour 2024 Riyadh, Saudi ArabiaData in Motion Tour 2024 Riyadh, Saudi Arabia
Data in Motion Tour 2024 Riyadh, Saudi Arabia
confluent
 
Build a Real-Time Decision Support Application for Financial Market Traders w...
Build a Real-Time Decision Support Application for Financial Market Traders w...Build a Real-Time Decision Support Application for Financial Market Traders w...
Build a Real-Time Decision Support Application for Financial Market Traders w...
confluent
 
Strumenti e Strategie di Stream Governance con Confluent Platform
Strumenti e Strategie di Stream Governance con Confluent PlatformStrumenti e Strategie di Stream Governance con Confluent Platform
Strumenti e Strategie di Stream Governance con Confluent Platform
confluent
 
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not WeeksCompose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
confluent
 
Building Real-Time Gen AI Applications with SingleStore and Confluent
Building Real-Time Gen AI Applications with SingleStore and ConfluentBuilding Real-Time Gen AI Applications with SingleStore and Confluent
Building Real-Time Gen AI Applications with SingleStore and Confluent
confluent
 
Unlocking value with event-driven architecture by Confluent
Unlocking value with event-driven architecture by ConfluentUnlocking value with event-driven architecture by Confluent
Unlocking value with event-driven architecture by Confluent
confluent
 
Il Data Streaming per un’AI real-time di nuova generazione
Il Data Streaming per un’AI real-time di nuova generazioneIl Data Streaming per un’AI real-time di nuova generazione
Il Data Streaming per un’AI real-time di nuova generazione
confluent
 
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
confluent
 
Break data silos with real-time connectivity using Confluent Cloud Connectors
Break data silos with real-time connectivity using Confluent Cloud ConnectorsBreak data silos with real-time connectivity using Confluent Cloud Connectors
Break data silos with real-time connectivity using Confluent Cloud Connectors
confluent
 
Building API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructureBuilding API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructure
confluent
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
confluent
 
Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Era
confluent
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
confluent
 
Ad

Recently uploaded (20)

Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
Quantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur MorganQuantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
Quantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur MorganQuantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 

Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur, Rockset, Bruno Cadonna, Confluent) Kafka Summit 2020

  • 1. Performance Tuning RocksDB for Kafka Streams’ State Store Dhruba Borthakur (Rockset), Bruno Cadonna (Confluent)
  • 2. About the Presenters
      • Dhruba Borthakur, CTO & Co-founder, Rockset (rockset.com)
      • Bruno Cadonna, Contributor to Apache Kafka & Software Engineer at Confluent (confluent.io)
  • 3. Agenda
      • Kafka Streams and State Stores
      • Introduction to RocksDB
      • Compaction Styles in RocksDB
      • Possible Operational Issues
      • Tuning RocksDB
      • RocksDB Command Line Utilities
      • Takeaways
  • 4. Kafka Streams and State Stores
  • 11. Kafka Streams
      • Stateless and stateful processors
      • Stateful processors use state stores
      • Create one topology per input partition, i.e., task
  • 20. State Stores in Kafka Streams
      • A stateful processor may use one or more state stores
      • Each partition has its own state store
      • State stores are layered (diagram labels: Metrics & De-/Serialization, Caching, Changelogging, Restoration):
          • collects metrics and de-/serializes records
          • caches records
          • writes records to changelog
          • writes records to local state store
      • State stores are restored from changelog topics
      • Restoration is byte-based and bypasses the wrapping layers
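The layering on this slide can be pictured as a chain of wrappers around the local store. The following is a minimal sketch using only the Java standard library; `SimpleStore`, `LocalStore`, `ChangeloggingStore`, and `CachingStore` are hypothetical names for illustration and are not the Kafka Streams classes. The metrics and serialization layer is left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified store interface (not the Kafka Streams API).
interface SimpleStore {
    void put(String key, byte[] value);
    byte[] get(String key);
}

// Innermost layer: stands in for the local RocksDB state store.
final class LocalStore implements SimpleStore {
    private final Map<String, byte[]> data = new HashMap<>();
    @Override public void put(String key, byte[] value) { data.put(key, value); }
    @Override public byte[] get(String key) { return data.get(key); }
}

// Changelogging layer: records each write, then forwards it inward.
final class ChangeloggingStore implements SimpleStore {
    final List<String> changelog = new ArrayList<>(); // stand-in for a changelog topic
    private final SimpleStore inner;
    ChangeloggingStore(SimpleStore inner) { this.inner = inner; }
    @Override public void put(String key, byte[] value) {
        changelog.add(key);
        inner.put(key, value);
    }
    @Override public byte[] get(String key) { return inner.get(key); }
}

// Caching layer: buffers writes until flush() and serves reads from the buffer first.
final class CachingStore implements SimpleStore {
    private final Map<String, byte[]> cache = new HashMap<>();
    private final SimpleStore inner;
    CachingStore(SimpleStore inner) { this.inner = inner; }
    @Override public void put(String key, byte[] value) { cache.put(key, value); }
    @Override public byte[] get(String key) {
        byte[] cached = cache.get(key);
        return cached != null ? cached : inner.get(key);
    }
    void flush() {
        cache.forEach(inner::put); // push buffered writes through the inner layers
        cache.clear();
    }
}
```

In this picture, restoration corresponds to writing bytes straight into `LocalStore`, so neither the cache nor the changelog wrapper sees those writes.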
  • 21. RocksDB is the Default State Store
      • Kafka Streams needed a write-optimized state store
      • Kafka Streams 2.6 uses RocksDB 5.18.4
      • Kafka Streams provides metrics to monitor RocksDB state stores
      • RocksDB can be configured by passing a class that implements the interface RocksDBConfigSetter to the configuration rocksdb.config.setter
  • 22. Example: Configuring RocksDB in Kafka Streams
        public static class MyRocksDBConfig implements RocksDBConfigSetter {

            @Override
            public void setConfig(final String storeName, final Options options, final Map<String, Object> configs) {
                // e.g. set compaction style
                options.setCompactionStyle(CompactionStyle.LEVEL);
            }

            @Override
            public void close(final String storeName, final Options options) {}
        }
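To take effect, the config setter class has to be named in the Streams configuration under `rocksdb.config.setter`. A minimal sketch using plain `java.util.Properties` follows; the helper name `StreamsProps`, the application id, the bootstrap address, and the class name passed in are placeholders chosen for illustration.

```java
import java.util.Properties;

final class StreamsProps {
    // Builds a configuration that names the RocksDBConfigSetter implementation.
    // "application.id" and "bootstrap.servers" values are placeholders.
    static Properties withRocksDbConfigSetter(String setterClassName) {
        Properties props = new Properties();
        props.put("application.id", "my-app");
        props.put("bootstrap.servers", "localhost:9092");
        // The configuration key named on the previous slide.
        props.put("rocksdb.config.setter", setterClassName);
        return props;
    }
}
```

The resulting `Properties` object would then be passed to the `KafkaStreams` constructor together with the topology. In application code the constant `StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG` can be used instead of the string literal.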
  • 24. What is RocksDB?
      • Key-value persistent store
      • Embedded C++ & Java library
      • Server workloads
  • 25. What is it not?
      • Not distributed
      • No failover
      • Not highly-available. If the machine dies, you lose your data
      • Focus on performance
      • Kafka Streams makes it fault-tolerant
  • 26. RocksDB API
      • Keys and values are byte arrays
      • Data are stored sorted by key
      • Update Operations: Put/Delete/Merge
      • Queries: Get/Iterator
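The API shape on this slide (byte-array keys and values, sorted storage, point and range access) can be modeled with a sorted map. This is a sketch of the semantics only and does not use the RocksDB Java bindings; `SortedBytesStore` is a hypothetical name, the unsigned bytewise ordering is an assumption that mirrors RocksDB's default comparator, and Merge is omitted. It needs Java 9 or later for `Arrays.compareUnsigned`.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

final class SortedBytesStore {
    // Keys are ordered as unsigned bytes, lexicographically.
    private static final Comparator<byte[]> BYTEWISE = Arrays::compareUnsigned;
    private final TreeMap<byte[], byte[]> data = new TreeMap<>(BYTEWISE);

    void put(byte[] key, byte[] value) { data.put(key.clone(), value.clone()); }

    void delete(byte[] key) { data.remove(key); }

    // Returns null when the key is absent.
    byte[] get(byte[] key) { return data.get(key); }

    // Iterates over all keys in sorted order, decoded as UTF-8 for readability.
    List<String> scanKeys() {
        List<String> keys = new ArrayList<>();
        for (byte[] key : data.keySet()) {
            keys.add(new String(key, StandardCharsets.UTF_8));
        }
        return keys;
    }
}
```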
  • 27. Log Structured Merge Architecture (diagram labels: write request from application; scan request from application; read-write data in RAM; transaction log; read-only data on SSD or disk; periodic compaction)
  • 28. RocksDB Write Path (diagram labels: write request; active MemTable with log; switch; read-only MemTable with log; flush; sst files; LS compaction)
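The write path in the diagram (append to the log, insert into the active MemTable, switch a full MemTable to read-only, flush it to an sst file) can be sketched as a toy model. `TinyLsm` is a hypothetical class written for this illustration; it keeps "files" in memory, never truncates the log, and performs no compaction.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.TreeMap;

final class TinyLsm {
    private final int memtableLimit;                                            // entries per memtable
    private TreeMap<String, String> active = new TreeMap<>();                   // active MemTable
    private final Deque<TreeMap<String, String>> readOnly = new ArrayDeque<>(); // newest first
    private final Deque<TreeMap<String, String>> sstFiles = new ArrayDeque<>(); // newest first
    final List<String> log = new ArrayList<>();                                 // stand-in for the write-ahead log

    TinyLsm(int memtableLimit) { this.memtableLimit = memtableLimit; }

    void put(String key, String value) {
        log.add(key + "=" + value);  // 1. append to the log
        active.put(key, value);      // 2. insert into the active MemTable
        if (active.size() >= memtableLimit) {
            readOnly.addFirst(active); // 3. switch: the full MemTable becomes read-only
            active = new TreeMap<>();
        }
    }

    // 4. flush: read-only MemTables become sst files, oldest first.
    void flush() {
        while (!readOnly.isEmpty()) {
            sstFiles.addFirst(readOnly.removeLast());
        }
    }

    // Reads check the newest data first so the latest value for a key wins.
    String get(String key) {
        if (active.containsKey(key)) return active.get(key);
        for (TreeMap<String, String> table : readOnly) {
            if (table.containsKey(key)) return table.get(key);
        }
        for (TreeMap<String, String> file : sstFiles) {
            if (file.containsKey(key)) return file.get(key);
        }
        return null;
    }

    int sstFileCount() { return sstFiles.size(); }
}
```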
  • 29. RocksDB Reads
      • Data can be in memory or on disk
      • Consult multiple files to find the latest instance of the key
      • Use bloom filters to reduce IO
          • Every sst file has a bloom filter
          • Bloom filters are cached in memory
          • Default config: eliminates 99% of reads
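A bloom filter answers "definitely not in this file" or "possibly in this file", which is why it lets a read skip most sst files. The sketch below shows the general technique with the Java standard library; `SimpleBloomFilter` and its hashing scheme are assumptions made for illustration and do not reflect RocksDB's actual filter format.

```java
import java.util.BitSet;

final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Derives the i-th bit position from two hash values (double hashing).
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1, 16);
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void add(String key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(key, i));
        }
    }

    // false means the key was never added; true means it may have been added.
    boolean mightContain(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(key, i))) return false;
        }
        return true;
    }
}
```

A reader would call `mightContain` for each candidate file and open only the files that return true. False positives cost an unnecessary read; false negatives cannot occur.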
  • 30. RocksDB Read Path (diagram labels: read request Get(k); memory: active MemTable, read-only MemTable, blooms; persistent storage: logs, sst files; flush; LS compaction)
  • 31. RocksDB Architecture (diagram labels: write request; read request; memory: active MemTable, read-only MemTable, read-only BlockCache; persistent storage: logs, sst files; switch; flush; LS compaction)
  • 32. RocksDB Open & Pluggable (diagram labels: pluggable compaction; pluggable sst data format on storage; pluggable Memtable format in RAM; transaction log; blooms; customizable WAL; get or scan request from application; write request from application)
  • 34. What is Compaction
      • Multi-threaded
      • Parallel compactions on different parts of the database
      • Deletes overwritten keys
      • Two types of compaction
          • level compaction
          • universal compaction
  • 35. Level compaction
      • RocksDB default compaction is Level Compaction (for read-heavy workloads)
      • Stores data in multiple levels
          • More recent data stored in L0
          • Older data stored in Lmax
      • Files in L0: overlapping keys, sorted by flush time
      • Files in L1 to Lmax: non-overlapping keys, sorted by key
      • Max space amplification = 10%
      • https://github.com/facebook/rocksdb/wiki/Leveled-Compaction
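One way to read the space amplification figure on this slide is as a geometric sum: if each level is a fixed multiple larger than the one above it and the last level holds the live data, the upper levels add only a small fraction on top. The sketch below works through that arithmetic; `LevelSizes` is a hypothetical helper, and the multiplier of 10 is an assumption matching RocksDB's default `max_bytes_for_level_multiplier`. It is an interpretation of the slide, not a statement of how RocksDB computes the bound.

```java
final class LevelSizes {
    // Size of all levels above the last one, as a fraction of the last level,
    // when each level is `multiplier` times larger than the level above it.
    static double upperLevelsFraction(int multiplier, int levelsAboveLast) {
        double sum = 0.0;
        double size = 1.0; // the last level is the unit of measure
        for (int i = 0; i < levelsAboveLast; i++) {
            size /= multiplier;
            sum += size;
        }
        return sum;
    }
}
```

With a multiplier of 10 and three levels above the last, the fraction is 0.1 + 0.01 + 0.001 = 0.111, i.e. roughly a tenth of extra space.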
  • 36. Universal Compaction
      • For write-heavy workloads
          • needed if Level style compaction is bottlenecked by disk throughput
      • Stores all files in L0
      • All files are arranged in time order
      • Decreases write amplification but increases space amplification
      • Pick up files that are chronologically adjacent to one another
          • merge them
          • replace them with a new file in L0
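Merging chronologically adjacent files and keeping only the newest value per key can be sketched as follows. `RunMerger` and the `TOMBSTONE` marker are hypothetical names for this illustration; the sketch assumes all runs that could contain a key take part in the merge, which is what makes it safe to drop delete markers here.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RunMerger {
    // Hypothetical marker that a delete leaves behind in a sorted run.
    static final String TOMBSTONE = "<deleted>";

    // Merges sorted runs given oldest first; newer values overwrite older ones.
    static TreeMap<String, String> compact(List<? extends Map<String, String>> runsOldestFirst) {
        TreeMap<String, String> merged = new TreeMap<>();
        for (Map<String, String> run : runsOldestFirst) {
            merged.putAll(run); // a later run replaces earlier values for the same key
        }
        // Assumes every run holding these keys was merged, so tombstones are no longer needed.
        merged.values().removeIf(TOMBSTONE::equals);
        return merged;
    }
}
```

The result is a single sorted run that replaces its inputs, which is where the reclaimed space comes from: overwritten and deleted keys are not carried over.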
  • 39. Operational Issues
      • High memory usage
          • Application gets slower or even crashes
          • Operating system shows high memory usage
          • Kafka Streams metrics for monitoring memory usage of RocksDB (KIP-607, planned for 2.7) show high values
      • High disk usage
          • Application crashes with I/O errors
          • Operating system shows high disk usage
  • 41. Operational Issues
      • High disk I/O
          • Operating system shows high disk I/O
          • Kafka Streams metrics with high values
              • memtable-bytes-flushed-[rate | total]
              • bytes-[read | written]-compaction-rate
          • Kafka Streams metrics with low values
              • memtable-hit-ratio
              • block-cache-[data | index | filter]-hit-ratio
      • Write stalls
          • Processing latency of the application increases
          • Kafka Streams client gets kicked out of the group
          • Kafka Streams metric write-stall-duration-[avg | total] shows high values
  • 42. Operational Issues
      • Too many open files
          • Application crashes with I/O errors
          • Kafka Streams metric number-open-files shows high values
  • 45. Operational Issues
      • Kafka Streams client gets kicked out of the consumer group during restoration
          • Before 2.6, Kafka Streams used RocksDB's bulk loading feature (Options#prepareForBulkLoad()) to restore the state store faster
          • Bulk loading basically consists of:
              • disable automatic compaction
              • write all data to level 0
              • trigger manual compaction
          • Manual compaction is a blocking call that may take longer than max.poll.interval.ms
          • Bulk loading is removed in 2.6
          • Currently evaluating alternatives to increase the performance of state store restoration by using other features of RocksDB, e.g., ingesting SST files directly
  • 47. Debug Kafka Streams OOM
      • Memory consumption
          • memtable (for writes)
              • memtable size, number of memtables
          • block cache (for reads)
              • configure it to be shared among all partitions of the Kafka Streams state store
              • Kafka Streams keeps index blocks in the block cache
          • rocksdb-java bugs (https://github.com/facebook/rocksdb/issues/6247)
      • High disk usage
          • Use level compaction instead of universal compaction
          • Provision more disk space
      • https://docs.confluent.io/current/streams/developer-guide/memory-mgmt.html
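Because every partition has its own store, the memtable and block cache settings multiply by the number of stores on an instance. The sketch below is a rough back-of-the-envelope estimate under that assumption; `RocksDbMemoryEstimate` is a hypothetical helper, the figures used in the example are illustrative, and the estimate ignores other consumers such as iterators or index and filter blocks held outside the block cache.

```java
final class RocksDbMemoryEstimate {
    // Upper bound on memtable memory: every store may fill all of its write buffers.
    static long memtableBytes(int stores, long writeBufferSize, int maxWriteBufferNumber) {
        return (long) stores * writeBufferSize * maxWriteBufferNumber;
    }

    // Adds the block cache, counted once if shared across stores, else once per store.
    static long totalBytes(int stores, long writeBufferSize, int maxWriteBufferNumber,
                           long blockCacheBytes, boolean sharedBlockCache) {
        long cache = sharedBlockCache ? blockCacheBytes : (long) stores * blockCacheBytes;
        return memtableBytes(stores, writeBufferSize, maxWriteBufferNumber) + cache;
    }
}
```

For example, with 12 stores, 16 MiB write buffers, 3 write buffers per store, and a 50 MiB block cache, the estimate is 1176 MiB with per-store caches and 626 MiB with one shared cache, which illustrates why sharing the block cache is the first knob the slide names.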
  • 48. Debug write stalls in RocksDB
      • Is disk IO utilization at 100%?
          • add more storage spindles
          • use universal compaction
      • Check number of background compaction threads
          • Kafka Streams uses Max(2, number of available processors) by default
      • Check memtable configuration
          • AdvancedColumnFamilyOptions.max_write_buffer_number
          • ColumnFamilyOptions.write_buffer_size
  • 49. Debugging file descriptor issues
      • Too many open files
          • DBOptions.max_open_files = -1 (default)
              • opens all sst files at db open time
              • good for performance but can run out of file descriptors
          • Increase operating system number of open file descriptors
          • Set DBOptions.max_open_files = 10000
              • will open a max of 10000 files concurrently
          • Decrease number of files by making each file larger
              • AdvancedColumnFamilyOptions.target_file_size_base = 128 MB (default is 64 MB)
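The effect of a larger target file size on the number of open files can be estimated with simple division. The sketch below assumes every sst file is exactly `target_file_size_base` in size, which ignores level 0 files and any per-level size multiplier; `SstFileEstimate` and the 100 GiB state size are hypothetical values for illustration.

```java
final class SstFileEstimate {
    // Approximate number of sst files needed to hold stateBytes (rounded up).
    static long approxFiles(long stateBytes, long targetFileSizeBytes) {
        return (stateBytes + targetFileSizeBytes - 1) / targetFileSizeBytes;
    }
}
```

Under these assumptions, 100 GiB of state needs about 1600 files at 64 MB per file and about 800 files at 128 MB, so doubling the target file size roughly halves the file descriptors a store can demand.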
  • 50. RocksDB Command Line Utilities
  • 51. Build RocksDB command line utilities
      • Build:
            git clone git@github.com:facebook/rocksdb.git
            cd rocksdb
            make ldb sst_dump
            cp ldb /usr/local/bin
            cp sst_dump /usr/local/bin
      • Useful RocksDB command line tools: https://github.com/facebook/rocksdb/wiki/Administration-and-Data-Access-Tool
      • Change these values:
            # change these values accordingly
            APP_ID=my-app
            STATE_STORE=my-counts
            STATE_STORE_DIR=/tmp/kafka-streams
            TASKS=$(ls $STATE_STORE_DIR/$APP_ID)
  • 52. Useful commands
            # View all keys
            for i in $TASKS; do
              ldb --db=$STATE_STORE_DIR/$APP_ID/$i/rocksdb/$STATE_STORE scan 2>/dev/null;
            done

            # Show table properties
            for i in $TASKS; do
              TABLE_PROPERTIES=$(sst_dump --file=$STATE_STORE_DIR/$APP_ID/$i/rocksdb/$STATE_STORE --show_properties)
              echo -e "Table properties for task: $i\n$TABLE_PROPERTIES\n\n"
            done
  • 53. Useful commands: example output
            # example output
            Table properties for task: 1_9
            from [] to []
            Process /tmp/kafka-streams/my-app/1_9/rocksdb/my-counts/000006.sst
            Sst file format: block-based
            Table Properties:
            ------------------------------
              # data blocks: 1
              # entries: 2
              raw key size: 18
              raw average key size: 9.000000
              raw value size: 88
              raw average value size: 44.000000
              data block size: 125
              index block size: 35
              filter block size: 0
              (estimated) table size: 160
  • 55. Takeaways
      • RocksDB is the default state store in Kafka Streams
      • Kafka Streams provides functionality to configure and monitor RocksDB
      • RocksDB uses a log structured merge (LSM) architecture with different compaction styles
      • You might run into operational issues, but you can solve them by debugging and tuning RocksDB
      • RocksDB offers command line utilities for analysing state stores
  • 56. Thank you!
      • [email protected] [email protected]
      • cnfl.io/meetups, cnfl.io/slack, cnfl.io/blog
      • Learn how Rockset uses RocksDB: https://rockset.com/blog/how-we-use-rocksdb-at-rockset/