SlideShare a Scribd company logo
Introducing Kafka-on-Pulsar:
Bring native Kafka protocol support
to Apache Pulsar
Sijie Guo / Pierre Zemb
Pulsar Summit
2020-06-17
Who are we?
● Sijie Guo (@sijieg)
● Co-Founder & CEO, StreamNative
● PMC Member of Pulsar/BookKeeper
● Ex Co-Founder, Streamlio
● Ex-Twitter, Ex-Yahoo
● Work on messaging and streaming data
technologies for many years
Who are we?
● Pierre Zemb (@PierreZ)
● Technical Leader
● Working around distributed systems
● Newcomer as an Apache contributor
○ Apache {Flink, HBase, Pulsar}
● Involved into local communities
Agenda
● What is Apache Pulsar?
● Why KoP?
● Introduction of protocol handler
● Kafka VS Pulsar, the protocol version
● How we implement KoP
● Demo
● Roadmap
● Q&A
What is Apache Pulsar?
Flexible pub/sub messaging
backed by durable log storage
Flexible pub/sub messaging
backed by durable log storage
Cloud-Native Event Streaming
Apache Pulsar
● Publish-subscribe: unified messaging model (streaming + queueing)
● Infinite event stream storage: Apache BookKeeper + Tiered Storage
● Connectors: ingest events without writing code
● Process events in real-time
○ Pulsar Functions for serverless / lightweight computation
○ Spark / Flink for unified data processing
○ Presto for interactive queries
Pulsar Highlights
● Multi-tenancy
● Unified messaging (queuing + streaming)
● Layered Architecture
● Tiered Storage
● Built-in schema support
● Built-in geo-replication
The Need of KoP
● Adoptions
● Inbound requests
● Migration
The Existing Efforts
● Kafka Java Wrapper
● Pulsar IO Connector
Implement Kafka protocol on Pulsar?
● Proxy / Gateway
● Implement Kafka protocol on Pulsar broker
KoP as a proxy, OVHcloud version
We first implemented KoP has a proxy PoC in Rust:
● Rust async was out in nightly compiler when we started
● We wanted no GC on proxy layers
● Rust has awesome libraries at TCP-level
Our goal was to convert TCP frames from Kafka to Pulsar
KoP as a proxy, OVHcloud version
Proxy layer
"Hyperion"
clients
KoP as a proxy, OVHcloud version
Everything is a state-machine:
● TCP cnx from Kafka clients
● TCP cnx to Pulsar brokers
Those event-driven finite-state machines were triggered by TCP frames
from their respective protocol.
A third one was above the two to provide synchronization
KoP as a proxy, OVHcloud version
Proxy layer
"Hyperion"
clients
State machines
KoP as a proxy, OVHcloud version
Pros
● Working at TCP layer enables
performance
● nice PoC to discover both protocols
● Rust is blazing fast
● Proxify production is easy
● We could bump old version of Kafka
frames for old Kafka clients
Cons
● Rewrite everything
● Some things were hard to proxify:
○ Group coordinator
○ Offsets management
● Difficult to open-source (different
language)
The group-coordinator/offsets problem
In Kafka, the group coordinator is an
elected actor within the cluster
responsible for:
● assigning partitions to consumers of a
consumer group
● managing offsets for each consumer
group
In Pulsar, partition assignment is managed
by broker on a per-partition basis.
Offset management is done by storing the
acknowledgements in cursors by the
owner broker of that partition.
The group-coordinator/offsets problem
In Kafka, the group coordinator is an
elected actor within the cluster
responsible for:
● assigning partitions to consumers of a
consumer group
● managing offsets for each consumer
group
In Pulsar, partition assignment is managed
by broker on a per-partition basis.
Offset management is done by storing the
acknowledgements in cursors by the
owner broker of that partition.
Simulate this at proxy-level is hard
(missing low-level info)
And then we saw this 😍
Which lead to our collaboration 🤝
What is Apache Pulsar??
How Pulsar implements its protocol
Protocol Handler
What is the protocol handler?
What is the protocol handler?
How to load plugins in a jvm without using classpath?
Pulsar is using NAR to load plugins!
- Pulsar Function
- Pulsar Connector
- Pulsar Offloader
- Pulsar Protocol Handler
How-to load protocol handlers?
1. Upgrade your cluster to 2.5
2. Set the following configurations:
3. Configure each protocol handlers
4. Restart your broker
5. Enjoy!
Kafka-on-Pulsar Protocol Handler
The KoP Implementation
● “distributedlog”
○ Kafka and Pulsar are built on same data structure
● Similarities
○ Topic Lookup
○ Topic / Partitions / Messages / Offset
○ Produce
○ Consume
○ Consumption State
KoP Implementation
● Topic flat map: Brokers set `kafkaNamespace`
● MessageID and offset: LedgerId + EntryId
● Message: Convert key/value/timestamps/headers
● Topic Lookup: Pulsar admin topic lookup -> owner broker
● Produce: Convert message, then PulsarTopic.publishMessage
● Consume: Convert requests, then nonDurableCursor.readEntries
● Group Coordinator: Keep group information in topic
`public/__kafka/__offsets`
KoP for production
❏ What Pulsar Provides
❏ Multi-Tenancy
❏ Security
❏ TLS Encryption
❏ Authentication, Authorization
❏ Data Encryption
❏ Geo-replication
❏ Tiered storage
❏ Schema
❏ Integrations with big data ecosystem (Flink / Spark / Presto)
KoP for production
❏ What Pulsar Provides
❏ Multi-Tenancy
❏ Security
❏ TLS Encryption
❏ Authentication, Authorization
❏ Data Encryption
✓ Geo-replication
✓ Tiered storage
❏ Schema
❏ Integrations with big data ecosystem (Flink / Spark / Presto)
KoP for production
❏ What Pulsar Provides
✓ Multi-Tenancy
✓ Security
✓ TLS Encryption
✓ Authentication, Authorization
❏ Data Encryption
✓ Geo-replication
✓ Tiered storage
❏ Schema
❏ Integrations with big data ecosystem (Flink / Spark / Presto)
KoP for production
❏ What Pulsar Provides
✓ Multi-Tenancy
✓ Security
✓ TLS Encryption
✓ Authentication, Authorization
❏ Data Encryption
✓ Geo-replication
✓ Tiered storage
❏ Schema
❏ Integrations with big data ecosystem (Flink / Spark / Presto)
KoP multi-tenancy
Pulsar has great support for multi-tenancy, how-to use it in KoP?
SASL-PLAIN is used to inject info:
● The username of Kafka JAAS is the tenant/namespace
● The password must be your classic Pulsar token authentication
parameters
TLS can be added over SASL-PLAIN
KoP multi-tenancy
String tenant = "ns1/tenant1";
String password = "token:xxx";
String jaasTemplate =
"org.apache.kafka.common.security.plain.PlainLoginModule required
username="%s" password="%s";";
String jaasCfg = String.format(jaasTemplate, tenant, password);
props.put("sasl.jaas.config", jaasCfg);
props.put("security.protocol", "SASL_PLAINTEXT");
props.put("sasl.mechanism", "PLAIN");
KoP Compatibility checklist
Integrations tests are runned with Kafka official Java client and popular
Kafka clients in other languages
Golang
● https://github.com/Shopify/sarama
● https://github.com/confluentinc/confluent-kafka-go
Rust
● https://github.com/fede1024/rust-rdkafka
NodeJS
● https://github.com/Blizzard/node-rdkafka
Roadmap / Future work
● KoP Proxy
● Schema
● Kafka transaction (Waiting for Pulsar transaction)
● Kafka 1.X support
● Kafka > 2.0 support
Try it now!
● Download and try it out today!
● https://github.com/streamnative/kop
More protocol handlers are coming!
● AoP - AMQP on Pulsar ← First week of May
● MoP - MQTT on Pulsar
● ...
Q & A
Follow us!
● Follow us at Twitter
○ Pierre Zemb (@PierreZ)
○ Sijie Guo (@sijieg)
○ Apache Pulsar (@apache_pulsar)
○ StreamNative (@streamnativeio)
○ OVHcloud (@OVHcloud)
● Join us at #kop channel on Pulsar slack!
Ad

More Related Content

What's hot (20)

Using the JMS 2.0 API with Apache Pulsar - Pulsar Virtual Summit Europe 2021
Using the JMS 2.0 API with Apache Pulsar - Pulsar Virtual Summit Europe 2021Using the JMS 2.0 API with Apache Pulsar - Pulsar Virtual Summit Europe 2021
Using the JMS 2.0 API with Apache Pulsar - Pulsar Virtual Summit Europe 2021
StreamNative
 
Apache Pulsar at Yahoo! Japan
Apache Pulsar at Yahoo! JapanApache Pulsar at Yahoo! Japan
Apache Pulsar at Yahoo! Japan
StreamNative
 
Building a FaaS with pulsar
Building a FaaS with pulsarBuilding a FaaS with pulsar
Building a FaaS with pulsar
StreamNative
 
Transaction preview of Apache Pulsar
Transaction preview of Apache PulsarTransaction preview of Apache Pulsar
Transaction preview of Apache Pulsar
StreamNative
 
Introducing HerdDB - a distributed JVM embeddable database built upon Apache ...
Introducing HerdDB - a distributed JVM embeddable database built upon Apache ...Introducing HerdDB - a distributed JVM embeddable database built upon Apache ...
Introducing HerdDB - a distributed JVM embeddable database built upon Apache ...
StreamNative
 
Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...
Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...
Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...
StreamNative
 
Pulsar Storage on BookKeeper _Seamless Evolution
Pulsar Storage on BookKeeper _Seamless EvolutionPulsar Storage on BookKeeper _Seamless Evolution
Pulsar Storage on BookKeeper _Seamless Evolution
StreamNative
 
Simplifying Migration from Kafka to Pulsar - Pulsar Summit NA 2021
Simplifying Migration from Kafka to Pulsar - Pulsar Summit NA 2021Simplifying Migration from Kafka to Pulsar - Pulsar Summit NA 2021
Simplifying Migration from Kafka to Pulsar - Pulsar Summit NA 2021
StreamNative
 
Scaling customer engagement with apache pulsar
Scaling customer engagement with apache pulsarScaling customer engagement with apache pulsar
Scaling customer engagement with apache pulsar
StreamNative
 
[March sn meetup] apache pulsar + apache nifi for cloud data lake
[March sn meetup] apache pulsar + apache nifi for cloud data lake[March sn meetup] apache pulsar + apache nifi for cloud data lake
[March sn meetup] apache pulsar + apache nifi for cloud data lake
Timothy Spann
 
Apache Pulsar: A Foundation Backbone for Clever Cloud - Pulsar Virtual Summit...
Apache Pulsar: A Foundation Backbone for Clever Cloud - Pulsar Virtual Summit...Apache Pulsar: A Foundation Backbone for Clever Cloud - Pulsar Virtual Summit...
Apache Pulsar: A Foundation Backbone for Clever Cloud - Pulsar Virtual Summit...
StreamNative
 
Unifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
Unifying Messaging, Queueing & Light Weight Compute Using Apache PulsarUnifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
Unifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
Karthik Ramasamy
 
Kafka and Spark Streaming
Kafka and Spark StreamingKafka and Spark Streaming
Kafka and Spark Streaming
datamantra
 
No Surprises Geo Replication - Pulsar Virtual Summit Europe 2021
No Surprises Geo Replication - Pulsar Virtual Summit Europe 2021No Surprises Geo Replication - Pulsar Virtual Summit Europe 2021
No Surprises Geo Replication - Pulsar Virtual Summit Europe 2021
StreamNative
 
Introducing Kafka-on-Pulsar: bring native Kafka protocol support to Apache Pu...
Introducing Kafka-on-Pulsar: bring native Kafka protocol support to Apache Pu...Introducing Kafka-on-Pulsar: bring native Kafka protocol support to Apache Pu...
Introducing Kafka-on-Pulsar: bring native Kafka protocol support to Apache Pu...
StreamNative
 
Introduction to Apache Kafka- Part 1
Introduction to Apache Kafka- Part 1Introduction to Apache Kafka- Part 1
Introduction to Apache Kafka- Part 1
Knoldus Inc.
 
Kafka on Pulsar
Kafka on Pulsar Kafka on Pulsar
Kafka on Pulsar
StreamNative
 
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
Lucas Jellema
 
How Orange Financial combat financial frauds over 50M transactions a day usin...
How Orange Financial combat financial frauds over 50M transactions a day usin...How Orange Financial combat financial frauds over 50M transactions a day usin...
How Orange Financial combat financial frauds over 50M transactions a day usin...
JinfengHuang3
 
Kafka Summit SF 2017 - Kafka Connect Best Practices – Advice from the Field
Kafka Summit SF 2017 - Kafka Connect Best Practices – Advice from the FieldKafka Summit SF 2017 - Kafka Connect Best Practices – Advice from the Field
Kafka Summit SF 2017 - Kafka Connect Best Practices – Advice from the Field
confluent
 
Using the JMS 2.0 API with Apache Pulsar - Pulsar Virtual Summit Europe 2021
Using the JMS 2.0 API with Apache Pulsar - Pulsar Virtual Summit Europe 2021Using the JMS 2.0 API with Apache Pulsar - Pulsar Virtual Summit Europe 2021
Using the JMS 2.0 API with Apache Pulsar - Pulsar Virtual Summit Europe 2021
StreamNative
 
Apache Pulsar at Yahoo! Japan
Apache Pulsar at Yahoo! JapanApache Pulsar at Yahoo! Japan
Apache Pulsar at Yahoo! Japan
StreamNative
 
Building a FaaS with pulsar
Building a FaaS with pulsarBuilding a FaaS with pulsar
Building a FaaS with pulsar
StreamNative
 
Transaction preview of Apache Pulsar
Transaction preview of Apache PulsarTransaction preview of Apache Pulsar
Transaction preview of Apache Pulsar
StreamNative
 
Introducing HerdDB - a distributed JVM embeddable database built upon Apache ...
Introducing HerdDB - a distributed JVM embeddable database built upon Apache ...Introducing HerdDB - a distributed JVM embeddable database built upon Apache ...
Introducing HerdDB - a distributed JVM embeddable database built upon Apache ...
StreamNative
 
Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...
Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...
Take Kafka-on-Pulsar to Production at Internet Scale: Improvements Made for P...
StreamNative
 
Pulsar Storage on BookKeeper _Seamless Evolution
Pulsar Storage on BookKeeper _Seamless EvolutionPulsar Storage on BookKeeper _Seamless Evolution
Pulsar Storage on BookKeeper _Seamless Evolution
StreamNative
 
Simplifying Migration from Kafka to Pulsar - Pulsar Summit NA 2021
Simplifying Migration from Kafka to Pulsar - Pulsar Summit NA 2021Simplifying Migration from Kafka to Pulsar - Pulsar Summit NA 2021
Simplifying Migration from Kafka to Pulsar - Pulsar Summit NA 2021
StreamNative
 
Scaling customer engagement with apache pulsar
Scaling customer engagement with apache pulsarScaling customer engagement with apache pulsar
Scaling customer engagement with apache pulsar
StreamNative
 
[March sn meetup] apache pulsar + apache nifi for cloud data lake
[March sn meetup] apache pulsar + apache nifi for cloud data lake[March sn meetup] apache pulsar + apache nifi for cloud data lake
[March sn meetup] apache pulsar + apache nifi for cloud data lake
Timothy Spann
 
Apache Pulsar: A Foundation Backbone for Clever Cloud - Pulsar Virtual Summit...
Apache Pulsar: A Foundation Backbone for Clever Cloud - Pulsar Virtual Summit...Apache Pulsar: A Foundation Backbone for Clever Cloud - Pulsar Virtual Summit...
Apache Pulsar: A Foundation Backbone for Clever Cloud - Pulsar Virtual Summit...
StreamNative
 
Unifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
Unifying Messaging, Queueing & Light Weight Compute Using Apache PulsarUnifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
Unifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
Karthik Ramasamy
 
Kafka and Spark Streaming
Kafka and Spark StreamingKafka and Spark Streaming
Kafka and Spark Streaming
datamantra
 
No Surprises Geo Replication - Pulsar Virtual Summit Europe 2021
No Surprises Geo Replication - Pulsar Virtual Summit Europe 2021No Surprises Geo Replication - Pulsar Virtual Summit Europe 2021
No Surprises Geo Replication - Pulsar Virtual Summit Europe 2021
StreamNative
 
Introducing Kafka-on-Pulsar: bring native Kafka protocol support to Apache Pu...
Introducing Kafka-on-Pulsar: bring native Kafka protocol support to Apache Pu...Introducing Kafka-on-Pulsar: bring native Kafka protocol support to Apache Pu...
Introducing Kafka-on-Pulsar: bring native Kafka protocol support to Apache Pu...
StreamNative
 
Introduction to Apache Kafka- Part 1
Introduction to Apache Kafka- Part 1Introduction to Apache Kafka- Part 1
Introduction to Apache Kafka- Part 1
Knoldus Inc.
 
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
Lucas Jellema
 
How Orange Financial combat financial frauds over 50M transactions a day usin...
How Orange Financial combat financial frauds over 50M transactions a day usin...How Orange Financial combat financial frauds over 50M transactions a day usin...
How Orange Financial combat financial frauds over 50M transactions a day usin...
JinfengHuang3
 
Kafka Summit SF 2017 - Kafka Connect Best Practices – Advice from the Field
Kafka Summit SF 2017 - Kafka Connect Best Practices – Advice from the FieldKafka Summit SF 2017 - Kafka Connect Best Practices – Advice from the Field
Kafka Summit SF 2017 - Kafka Connect Best Practices – Advice from the Field
confluent
 

Similar to Kafka on Pulsar:bringing native Kafka protocol support to Pulsar_Sijie&Pierre (20)

Building a Messaging Solutions for OVHcloud with Apache Pulsar_Pierre Zemb
Building a Messaging Solutions for OVHcloud with Apache Pulsar_Pierre ZembBuilding a Messaging Solutions for OVHcloud with Apache Pulsar_Pierre Zemb
Building a Messaging Solutions for OVHcloud with Apache Pulsar_Pierre Zemb
StreamNative
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
Athens Big Data
 
Structured Streaming with Kafka
Structured Streaming with KafkaStructured Streaming with Kafka
Structured Streaming with Kafka
datamantra
 
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache PulsarApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Big mountain data and dev conference   apache pulsar with mqtt for edge compu...Big mountain data and dev conference   apache pulsar with mqtt for edge compu...
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Timothy Spann
 
OSS EU: Deep Dive into Building Streaming Applications with Apache Pulsar
OSS EU:  Deep Dive into Building Streaming Applications with Apache PulsarOSS EU:  Deep Dive into Building Streaming Applications with Apache Pulsar
OSS EU: Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
DBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data LakesDBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data Lakes
Timothy Spann
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
Timothy Spann
 
Data science online camp using the flipn stack for edge ai (flink, nifi, pu...
Data science online camp   using the flipn stack for edge ai (flink, nifi, pu...Data science online camp   using the flipn stack for edge ai (flink, nifi, pu...
Data science online camp using the flipn stack for edge ai (flink, nifi, pu...
Timothy Spann
 
CODEONTHEBEACH_Streaming Applications with Apache Pulsar
CODEONTHEBEACH_Streaming Applications with Apache PulsarCODEONTHEBEACH_Streaming Applications with Apache Pulsar
CODEONTHEBEACH_Streaming Applications with Apache Pulsar
Timothy Spann
 
Tips for Apache Flink on Kafka with Olena Babenko | Kafka Summit London 2022
Tips for Apache Flink on Kafka with Olena Babenko | Kafka Summit London 2022Tips for Apache Flink on Kafka with Olena Babenko | Kafka Summit London 2022
Tips for Apache Flink on Kafka with Olena Babenko | Kafka Summit London 2022
HostedbyConfluent
 
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Pulsar summit asia 2021   apache pulsar with mqtt for edge computingPulsar summit asia 2021   apache pulsar with mqtt for edge computing
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Timothy Spann
 
bigdata 2022_ FLiP Into Pulsar Apps
bigdata 2022_ FLiP Into Pulsar Appsbigdata 2022_ FLiP Into Pulsar Apps
bigdata 2022_ FLiP Into Pulsar Apps
Timothy Spann
 
Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event Streaming[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event Streaming
Timothy Spann
 
Machine Intelligence Guild_ Build ML Enhanced Event Streaming Applications wi...
Machine Intelligence Guild_ Build ML Enhanced Event Streaming Applications wi...Machine Intelligence Guild_ Build ML Enhanced Event Streaming Applications wi...
Machine Intelligence Guild_ Build ML Enhanced Event Streaming Applications wi...
Timothy Spann
 
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Timothy Spann
 
Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022
Timothy Spann
 
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
confluent
 
A day in the life of a log message
A day in the life of a log messageA day in the life of a log message
A day in the life of a log message
Josef Karásek
 
Building a Messaging Solutions for OVHcloud with Apache Pulsar_Pierre Zemb
Building a Messaging Solutions for OVHcloud with Apache Pulsar_Pierre ZembBuilding a Messaging Solutions for OVHcloud with Apache Pulsar_Pierre Zemb
Building a Messaging Solutions for OVHcloud with Apache Pulsar_Pierre Zemb
StreamNative
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
Athens Big Data
 
Structured Streaming with Kafka
Structured Streaming with KafkaStructured Streaming with Kafka
Structured Streaming with Kafka
datamantra
 
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache PulsarApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Big mountain data and dev conference   apache pulsar with mqtt for edge compu...Big mountain data and dev conference   apache pulsar with mqtt for edge compu...
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Timothy Spann
 
OSS EU: Deep Dive into Building Streaming Applications with Apache Pulsar
OSS EU:  Deep Dive into Building Streaming Applications with Apache PulsarOSS EU:  Deep Dive into Building Streaming Applications with Apache Pulsar
OSS EU: Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
DBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data LakesDBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data Lakes
Timothy Spann
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
Timothy Spann
 
Data science online camp using the flipn stack for edge ai (flink, nifi, pu...
Data science online camp   using the flipn stack for edge ai (flink, nifi, pu...Data science online camp   using the flipn stack for edge ai (flink, nifi, pu...
Data science online camp using the flipn stack for edge ai (flink, nifi, pu...
Timothy Spann
 
CODEONTHEBEACH_Streaming Applications with Apache Pulsar
CODEONTHEBEACH_Streaming Applications with Apache PulsarCODEONTHEBEACH_Streaming Applications with Apache Pulsar
CODEONTHEBEACH_Streaming Applications with Apache Pulsar
Timothy Spann
 
Tips for Apache Flink on Kafka with Olena Babenko | Kafka Summit London 2022
Tips for Apache Flink on Kafka with Olena Babenko | Kafka Summit London 2022Tips for Apache Flink on Kafka with Olena Babenko | Kafka Summit London 2022
Tips for Apache Flink on Kafka with Olena Babenko | Kafka Summit London 2022
HostedbyConfluent
 
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Pulsar summit asia 2021   apache pulsar with mqtt for edge computingPulsar summit asia 2021   apache pulsar with mqtt for edge computing
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Timothy Spann
 
bigdata 2022_ FLiP Into Pulsar Apps
bigdata 2022_ FLiP Into Pulsar Appsbigdata 2022_ FLiP Into Pulsar Apps
bigdata 2022_ FLiP Into Pulsar Apps
Timothy Spann
 
Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event Streaming[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event Streaming
Timothy Spann
 
Machine Intelligence Guild_ Build ML Enhanced Event Streaming Applications wi...
Machine Intelligence Guild_ Build ML Enhanced Event Streaming Applications wi...Machine Intelligence Guild_ Build ML Enhanced Event Streaming Applications wi...
Machine Intelligence Guild_ Build ML Enhanced Event Streaming Applications wi...
Timothy Spann
 
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Timothy Spann
 
Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022
Timothy Spann
 
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
confluent
 
A day in the life of a log message
A day in the life of a log messageA day in the life of a log message
A day in the life of a log message
Josef Karásek
 
Ad

More from StreamNative (20)

Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
StreamNative
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
StreamNative
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
StreamNative
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...
StreamNative
 
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
StreamNative
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
StreamNative
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
StreamNative
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
StreamNative
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
StreamNative
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
StreamNative
 
Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022
StreamNative
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
StreamNative
 
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
StreamNative
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022
StreamNative
 
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
StreamNative
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
StreamNative
 
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
StreamNative
 
Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022
StreamNative
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
StreamNative
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
StreamNative
 
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
StreamNative
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
StreamNative
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
StreamNative
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...
StreamNative
 
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
StreamNative
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
StreamNative
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
StreamNative
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
StreamNative
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
StreamNative
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
StreamNative
 
Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022
StreamNative
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
StreamNative
 
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
StreamNative
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022
StreamNative
 
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
StreamNative
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
StreamNative
 
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
StreamNative
 
Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022
StreamNative
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
StreamNative
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
StreamNative
 
Ad

Recently uploaded (20)

Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
gmuir1066
 
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnTemplate_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
cegiver630
 
chapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.pptchapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.ppt
justinebandajbn
 
Stack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptxStack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptx
binduraniha86
 
Calories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptxCalories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptx
TijiLMAHESHWARI
 
VKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptxVKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptx
Vinod Srivastava
 
04302025_CCC TUG_DataVista: The Design Story
04302025_CCC TUG_DataVista: The Design Story04302025_CCC TUG_DataVista: The Design Story
04302025_CCC TUG_DataVista: The Design Story
ccctableauusergroup
 
How iCode cybertech Helped Me Recover My Lost Funds
How iCode cybertech Helped Me Recover My Lost FundsHow iCode cybertech Helped Me Recover My Lost Funds
How iCode cybertech Helped Me Recover My Lost Funds
ireneschmid345
 
Medical Dataset including visualizations
Medical Dataset including visualizationsMedical Dataset including visualizations
Medical Dataset including visualizations
vishrut8750588758
 
Defense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptxDefense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptx
Greg Makowski
 
Classification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptxClassification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptx
wencyjorda88
 
Digilocker under workingProcess Flow.pptx
Digilocker  under workingProcess Flow.pptxDigilocker  under workingProcess Flow.pptx
Digilocker under workingProcess Flow.pptx
satnamsadguru491
 
IAS-slides2-ia-aaaaaaaaaaain-business.pdf
IAS-slides2-ia-aaaaaaaaaaain-business.pdfIAS-slides2-ia-aaaaaaaaaaain-business.pdf
IAS-slides2-ia-aaaaaaaaaaain-business.pdf
mcgardenlevi9
 
C++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptxC++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptx
aquibnoor22079
 
183409-christina-rossetti.pdfdsfsdasggsag
183409-christina-rossetti.pdfdsfsdasggsag183409-christina-rossetti.pdfdsfsdasggsag
183409-christina-rossetti.pdfdsfsdasggsag
fardin123rahman07
 
03 Daniel 2-notes.ppt seminario escatologia
03 Daniel 2-notes.ppt seminario escatologia03 Daniel 2-notes.ppt seminario escatologia
03 Daniel 2-notes.ppt seminario escatologia
Alexander Romero Arosquipa
 
Deloitte Analytics - Applying Process Mining in an audit context
Deloitte Analytics - Applying Process Mining in an audit contextDeloitte Analytics - Applying Process Mining in an audit context
Deloitte Analytics - Applying Process Mining in an audit context
Process mining Evangelist
 
Secure_File_Storage_Hybrid_Cryptography.pptx..
Secure_File_Storage_Hybrid_Cryptography.pptx..Secure_File_Storage_Hybrid_Cryptography.pptx..
Secure_File_Storage_Hybrid_Cryptography.pptx..
yuvarajreddy2002
 
GenAI for Quant Analytics: survey-analytics.ai
GenAI for Quant Analytics: survey-analytics.aiGenAI for Quant Analytics: survey-analytics.ai
GenAI for Quant Analytics: survey-analytics.ai
Inspirient
 
Ch3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendencyCh3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendency
ayeleasefa2
 
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
gmuir1066
 
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnTemplate_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
cegiver630
 
chapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.pptchapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.ppt
justinebandajbn
 
Stack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptxStack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptx
binduraniha86
 
Calories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptxCalories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptx
TijiLMAHESHWARI
 
VKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptxVKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptx
Vinod Srivastava
 
04302025_CCC TUG_DataVista: The Design Story
04302025_CCC TUG_DataVista: The Design Story04302025_CCC TUG_DataVista: The Design Story
04302025_CCC TUG_DataVista: The Design Story
ccctableauusergroup
 
How iCode cybertech Helped Me Recover My Lost Funds
How iCode cybertech Helped Me Recover My Lost FundsHow iCode cybertech Helped Me Recover My Lost Funds
How iCode cybertech Helped Me Recover My Lost Funds
ireneschmid345
 
Medical Dataset including visualizations
Medical Dataset including visualizationsMedical Dataset including visualizations
Medical Dataset including visualizations
vishrut8750588758
 
Defense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptxDefense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptx
Greg Makowski
 
Classification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptxClassification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptx
wencyjorda88
 
Digilocker under workingProcess Flow.pptx
Digilocker  under workingProcess Flow.pptxDigilocker  under workingProcess Flow.pptx
Digilocker under workingProcess Flow.pptx
satnamsadguru491
 
IAS-slides2-ia-aaaaaaaaaaain-business.pdf
IAS-slides2-ia-aaaaaaaaaaain-business.pdfIAS-slides2-ia-aaaaaaaaaaain-business.pdf
IAS-slides2-ia-aaaaaaaaaaain-business.pdf
mcgardenlevi9
 
C++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptxC++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptx
aquibnoor22079
 
183409-christina-rossetti.pdfdsfsdasggsag
183409-christina-rossetti.pdfdsfsdasggsag183409-christina-rossetti.pdfdsfsdasggsag
183409-christina-rossetti.pdfdsfsdasggsag
fardin123rahman07
 
Deloitte Analytics - Applying Process Mining in an audit context
Deloitte Analytics - Applying Process Mining in an audit contextDeloitte Analytics - Applying Process Mining in an audit context
Deloitte Analytics - Applying Process Mining in an audit context
Process mining Evangelist
 
Secure_File_Storage_Hybrid_Cryptography.pptx..
Secure_File_Storage_Hybrid_Cryptography.pptx..Secure_File_Storage_Hybrid_Cryptography.pptx..
Secure_File_Storage_Hybrid_Cryptography.pptx..
yuvarajreddy2002
 
GenAI for Quant Analytics: survey-analytics.ai
GenAI for Quant Analytics: survey-analytics.aiGenAI for Quant Analytics: survey-analytics.ai
GenAI for Quant Analytics: survey-analytics.ai
Inspirient
 
Ch3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendencyCh3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendency
ayeleasefa2
 

Kafka on Pulsar:bringing native Kafka protocol support to Pulsar_Sijie&Pierre

  • 1. Introducing Kafka-on-Pulsar: Bring native Kafka protocol support to Apache Pulsar Sijie Guo / Pierre Zemb Pulsar Summit 2020-06-17
  • 2. Who are we? ● Sijie Guo (@sijieg) ● Co-Founder & CEO, StreamNative ● PMC Member of Pulsar/BookKeeper ● Ex Co-Founder, Streamlio ● Ex-Twitter, Ex-Yahoo ● Work on messaging and streaming data technologies for many years
  • 3. Who are we? ● Pierre Zemb (@PierreZ) ● Technical Leader ● Working around distributed systems ● Newcomer as an Apache contributor ○ Apache {Flink, HBase, Pulsar} ● Involved into local communities
  • 4. Agenda ● What is Apache Pulsar? ● Why KoP? ● Introduction of protocol handler ● Kafka VS Pulsar, the protocol version ● How we implement KoP ● Demo ● Roadmap ● Q&A
  • 5. What is Apache Pulsar?
  • 6. Flexible pub/sub messaging backed by durable log storage
  • 7. Flexible pub/sub messaging backed by durable log storage
  • 9. Apache Pulsar ● Publish-subscribe: unified messaging model (streaming + queueing) ● Infinite event stream storage: Apache BookKeeper + Tiered Storage ● Connectors: ingest events without writing code ● Process events in real-time ○ Pulsar Functions for serverless / lightweight computation ○ Spark / Flink for unified data processing ○ Presto for interactive queries
  • 10. Pulsar Highlights ● Multi-tenancy ● Unified messaging (queuing + streaming) ● Layered Architecture ● Tiered Storage ● Built-in schema support ● Built-in geo-replication
  • 11. The Need of KoP ● Adoptions ● Inbound requests ● Migration
  • 12. The Existing Efforts ● Kafka Java Wrapper ● Pulsar IO Connector
  • 13. Implement Kafka protocol on Pulsar? ● Proxy / Gateway ● Implement Kafka protocol on Pulsar broker
  • 14. KoP as a proxy, OVHcloud version We first implemented KoP has a proxy PoC in Rust: ● Rust async was out in nightly compiler when we started ● We wanted no GC on proxy layers ● Rust has awesome libraries at TCP-level Our goal was to convert TCP frames from Kafka to Pulsar
  • 15. KoP as a proxy, OVHcloud version Proxy layer "Hyperion" clients
  • 16. KoP as a proxy, OVHcloud version Everything is a state-machine: ● TCP cnx from Kafka clients ● TCP cnx to Pulsar brokers Those event-driven finite-state machines were triggered by TCP frames from their respective protocol. A third one was above the two to provide synchronization
  • 17. KoP as a proxy, OVHcloud version Proxy layer "Hyperion" clients State machines
  • 18. KoP as a proxy, OVHcloud version Pros ● Working at TCP layer enables performance ● nice PoC to discover both protocols ● Rust is blazing fast ● Proxify production is easy ● We could bump old version of Kafka frames for old Kafka clients Cons ● Rewrite everything ● Some things were hard to proxify: ○ Group coordinator ○ Offsets management ● Difficult to open-source (different language)
  • 19. The group-coordinator/offsets problem In Kafka, the group coordinator is an elected actor within the cluster responsible for: ● assigning partitions to consumers of a consumer group ● managing offsets for each consumer group In Pulsar, partition assignment is managed by broker on a per-partition basis. Offset management is done by storing the acknowledgements in cursors by the owner broker of that partition.
  • 20. The group-coordinator/offsets problem In Kafka, the group coordinator is an elected actor within the cluster responsible for: ● assigning partitions to consumers of a consumer group ● managing offsets for each consumer group In Pulsar, partition assignment is managed by broker on a per-partition basis. Offset management is done by storing the acknowledgements in cursors by the owner broker of that partition. Simulate this at proxy-level is hard (missing low-level info)
  • 21. And then we saw this 😍
  • 22. Which lead to our collaboration 🤝
  • 23. What is Apache Pulsar??
  • 24. How Pulsar implements its protocol
  • 26. What is the protocol handler?
  • 27. What is the protocol handler? How to load plugins in a jvm without using classpath? Pulsar is using NAR to load plugins! - Pulsar Function - Pulsar Connector - Pulsar Offloader - Pulsar Protocol Handler
  • 28. How-to load protocol handlers? 1. Upgrade your cluster to 2.5 2. Set the following configurations: 3. Configure each protocol handlers 4. Restart your broker 5. Enjoy!
  • 30. The KoP Implementation ● “distributedlog” ○ Kafka and Pulsar are built on same data structure ● Similarities ○ Topic Lookup ○ Topic / Partitions / Messages / Offset ○ Produce ○ Consume ○ Consumption State
  • 31. KoP Implementation ● Topic flat map: Brokers set `kafkaNamespace` ● MessageID and offset: LedgerId + EntryId ● Message: Convert key/value/timestamps/headers ● Topic Lookup: Pulsar admin topic lookup -> owner broker ● Produce: Convert message, then PulsarTopic.publishMessage ● Consume: Convert requests, then nonDurableCursor.readEntries ● Group Coordinator: Keep group information in topic `public/__kafka/__offsets`
  • 32. KoP for production ❏ What Pulsar Provides ❏ Multi-Tenancy ❏ Security ❏ TLS Encryption ❏ Authentication, Authorization ❏ Data Encryption ❏ Geo-replication ❏ Tiered storage ❏ Schema ❏ Integrations with big data ecosystem (Flink / Spark / Presto)
  • 33. KoP for production ❏ What Pulsar Provides ❏ Multi-Tenancy ❏ Security ❏ TLS Encryption ❏ Authentication, Authorization ❏ Data Encryption ✓ Geo-replication ✓ Tiered storage ❏ Schema ❏ Integrations with big data ecosystem (Flink / Spark / Presto)
  • 34. KoP for production ❏ What Pulsar Provides ✓ Multi-Tenancy ✓ Security ✓ TLS Encryption ✓ Authentication, Authorization ❏ Data Encryption ✓ Geo-replication ✓ Tiered storage ❏ Schema ❏ Integrations with big data ecosystem (Flink / Spark / Presto)
  • 35. KoP for production ❏ What Pulsar Provides ✓ Multi-Tenancy ✓ Security ✓ TLS Encryption ✓ Authentication, Authorization ❏ Data Encryption ✓ Geo-replication ✓ Tiered storage ❏ Schema ❏ Integrations with big data ecosystem (Flink / Spark / Presto)
  • 36. KoP multi-tenancy Pulsar has great support for multi-tenancy, how-to use it in KoP? SASL-PLAIN is used to inject info: ● The username of Kafka JAAS is the tenant/namespace ● The password must be your classic Pulsar token authentication parameters TLS can be added over SASL-PLAIN
  • 37. KoP multi-tenancy String tenant = "ns1/tenant1"; String password = "token:xxx"; String jaasTemplate = "org.apache.kafka.common.security.plain.PlainLoginModule required username="%s" password="%s";"; String jaasCfg = String.format(jaasTemplate, tenant, password); props.put("sasl.jaas.config", jaasCfg); props.put("security.protocol", "SASL_PLAINTEXT"); props.put("sasl.mechanism", "PLAIN");
  • 38. KoP Compatibility checklist Integrations tests are runned with Kafka official Java client and popular Kafka clients in other languages Golang ● https://github.com/Shopify/sarama ● https://github.com/confluentinc/confluent-kafka-go Rust ● https://github.com/fede1024/rust-rdkafka NodeJS ● https://github.com/Blizzard/node-rdkafka
  • 39. Roadmap / Future work ● KoP Proxy ● Schema ● Kafka transaction (Waiting for Pulsar transaction) ● Kafka 1.X support ● Kafka > 2.0 support
  • 40. Try it now! ● Download and try it out today! ● https://github.com/streamnative/kop More protocol handlers are coming! ● AoP - AMQP on Pulsar ← First week of May ● MoP - MQTT on Pulsar ● ...
  • 41. Q & A
  • 42. Follow us! ● Follow us at Twitter ○ Pierre Zemb (@PierreZ) ○ Sijie Guo (@sijieg) ○ Apache Pulsar (@apache_pulsar) ○ StreamNative (@streamnativeio) ○ OVHcloud (@OVHcloud) ● Join us at #kop channel on Pulsar slack!