Kafka Disaster Recovery
Abdelkrim Hadjidj
Senior Data Streaming Specialist
© 2019 Cloudera, Inc. All rights reserved. 2
Quick intro
• Senior Specialist Solution Engineer at Cloudera
• Focus on CDF offering
● Edge Management & IoT (MiNiFi, CEM)
● Flow Management (NiFi, Registry)
● Stream Processing (Kafka, KStreams, SMM, SR, …)
• Founder of Future of Data Paris Meetup http://tiny.cc/fodp
• Founder of Solutions Engineers of Paris http://tiny.cc/PSE
@ahadjidj
Kafka Disaster Recovery options
• Multiple DC* (one cluster with brokers spread across DC1, DC2 and DC3): Zero RPO
• Dual ingest (data written to independent clusters in DC1 and DC2): Zero RPO
• Mirroring** (data copied between the clusters in DC1 and DC2): Very low RPO
* A stretch cluster across geographically distributed DCs is not recommended
** "Mirroring" denotes copying between clusters; "replication" is reserved for internal broker replication
Agenda
From MM to MM2 and SRM
Active Passive Architecture
Active Active Architectures
Other use cases
Monitoring
Q&A
Mirror Maker use cases
[Diagrams: four MirrorMaker (MM) topologies between Kafka clusters K1, K2 and K3 spread over DC1, DC2 and DC3, with producers (P) and consumers (C)]
• Aggregation
• Data Deployment
• Segmentation
• Acquisitions & mergers
Mirror Maker use cases
[Diagram: Tracking and Queuing clusters with producers (P) and consumers (C); MM instances mirror them into Tracking Aggregate and Queuing Aggregate clusters; consumers (C) and HDFS sit downstream]
MirrorMaker limitations for Disaster Recovery
• Static Whitelists and Blacklists
• Configuration sync: topic configuration and ACL changes are not propagated between clusters
• Manual Topic Naming to avoid Cycles
• Scalability and Throughput Limitations due to Rebalances
• Lack of Monitoring and Operational Support
• No Disaster Recovery, Migration, Failover
• Too many MirrorMaker Clusters
Streams Replication Manager
• Mirror Maker 2 (KIP-382)
• Supports active-active, multi-cluster, cross-DC replication & other complex scenarios
• Leverages Kafka Connect for scalability and HA
• Replicates data and configurations (ACL, partitioning, new topics, etc.)
• Offset translation for failover and failback
• Monitoring integration with SMM
[Diagram: a Kafka Connect based MM2 cluster sits between the source and target clusters. Replicated partitions keep their partition number and gain the source cluster prefix, e.g. topic1.part0 from cluster A appears as A.topic1.part0, and topic1.part0 from cluster X as X.topic1.part0.]
Active – Passive Architecture
Active/standby
• Producers send to the primary cluster if it is available, and to the secondary cluster if not.
• Consumers can be migrated between the primary and secondary clusters.
• SRM replicates data, offset syncs, and consumer checkpoints between the clusters.
[Diagram: producers behind VIP/load balancers, Primary Cluster, SRM, Secondary Cluster, consumers]
Configuration file
• Simple file configuration
• Multi directional
• Fine grained replication
• Topics white/black lists
• Group white/black lists
• Interval configurations
• Supports patterns
$ ./bin/connect-mirror-maker.sh mm2.properties
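A minimal sketch of what such an mm2.properties file might contain for a one-way (active/passive) flow. The cluster aliases match the ones used in this deck, but the host names and interval values are placeholders chosen for illustration, and exclusion-list property names differ between Kafka versions, so they are left out here.

```properties
# mm2.properties (illustrative sketch; host names are placeholders)
clusters = primary, secondary
primary.bootstrap.servers = primary-host:9092
secondary.bootstrap.servers = secondary-host:9092

# Replicate primary -> secondary only (active/passive)
primary->secondary.enabled = true
# Topic and consumer group selection accept regular expressions
primary->secondary.topics = .*
primary->secondary.groups = .*
secondary->primary.enabled = false

# Interval settings (values are examples, not recommendations)
refresh.topics.interval.seconds = 60
emit.checkpoints.interval.seconds = 60
```

For an active/active setup, the reverse flow (secondary->primary) would be enabled as well.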
Remote topics
• Replicated topics are renamed according to the ReplicationPolicy.
• Default policy: <source>.<topic>
• Custom policies can be implemented.
[Diagram: SRM between the two clusters. Primary Cluster holds topic1, topic2, secondary.topic1, secondary.topic2. Secondary Cluster holds topic1, topic2, primary.topic1, primary.topic2.]
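The default naming rule can be sketched in a few lines of plain Java. This is an illustration only: the class RemoteTopicNames and its methods are hypothetical helpers written for this example, not the ReplicationPolicy API itself, and the separator is assumed to be a dot as shown on the slide.

```java
import java.util.Optional;

/** Hypothetical helper that mimics the default <source>.<topic> naming rule. */
final class RemoteTopicNames {
    private static final String SEPARATOR = "."; // assumed separator, as on the slide

    /** Name of a topic from sourceCluster as it appears on a target cluster. */
    static String remoteName(String sourceCluster, String topic) {
        return sourceCluster + SEPARATOR + topic;
    }

    /** Cluster the topic was replicated from, or empty for a local topic. */
    static Optional<String> sourceOf(String topic) {
        int i = topic.indexOf(SEPARATOR);
        return i < 0 ? Optional.empty() : Optional.of(topic.substring(0, i));
    }

    public static void main(String[] args) {
        System.out.println(remoteName("primary", "topic1"));
        // A topic replicated over two hops carries both prefixes, latest hop first.
        System.out.println(remoteName("secondary", remoteName("primary", "topic1")));
        System.out.println(sourceOf("primary.topic1"));
        System.out.println(sourceOf("topic1"));
    }
}
```

Note that with this simple rule a local topic whose own name contains a dot would be misread by sourceOf as a remote topic, which is one reason to keep dots out of local topic names or to plug in a custom policy.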
Heartbeats
• MM2 emits a heartbeats topic in each source cluster, which is replicated to the other clusters.
• A downstream cluster uses this topic to verify that:
● The connector is running
● The corresponding source cluster is available
[Diagram: SRM and the Secondary Cluster, with topic primary.heartbeats. Example record: target=primary, source=secondary, timestamp=5434356]
Offset Syncs
• The offset sync stream maps offsets between mirrored clusters.
[Diagram: SRM and the Secondary Cluster, with topic primary.offset-syncs.internal. Example record: topic=primary.topic1, partition=4, upstreamOffset=100, downstreamOffset=102]
Checkpoints
• The checkpoint stream replicates consumer group state.
• MM2 periodically emits checkpoints in the destination cluster.
• The checkpoint topic is log-compacted to reflect only the latest offsets across consumer groups.
[Diagram: SRM and the Secondary Cluster, with topic primary.checkpoints.internal. Example record: topic=primary.topic1, partition=4, group=consumer-group-2, upstreamOffset=100, offset=102]
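How an offset sync can turn a consumer group's committed upstream offset into a checkpoint offset may be sketched as follows. This is a simplified illustration, not the MM2 implementation: the class and method are hypothetical, and it assumes that offsets on both clusters advance in lockstep after the sync point. Actual MM2 releases may translate more conservatively.

```java
/** Hypothetical sketch of offset translation from a single offset sync. */
final class OffsetTranslationSketch {

    /**
     * Translates a committed upstream offset to a downstream offset.
     * Assumption: after the sync point, both logs grow by one offset per record.
     */
    static long translate(long syncUpstream, long syncDownstream, long committedUpstream) {
        if (committedUpstream < syncUpstream) {
            throw new IllegalArgumentException("offset sync is ahead of the committed offset");
        }
        return syncDownstream + (committedUpstream - syncUpstream);
    }

    public static void main(String[] args) {
        // Values from the slides: sync 100 -> 102, group committed at upstream offset 100.
        System.out.println(translate(100L, 102L, 100L)); // prints 102
    }
}
```

With the slide's values, the upstream offset 100 maps to the checkpoint offset 102, matching the example checkpoint record.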
Cross-cluster offset translation
Translate offsets between clusters via RemoteClusterUtils
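// RemoteClusterUtils is in org.apache.kafka.connect.mirror (connect-mirror-client)
Map<TopicPartition, OffsetAndMetadata> newOffsets =
RemoteClusterUtils.translateOffsets(
newClusterProperties, oldClusterName,
consumerGroupId, Duration.ofSeconds(30)); // timeout value is illustrative
// seek() takes one partition at a time, so assign and seek per partition
consumer.assign(newOffsets.keySet());
newOffsets.forEach((tp, o) -> consumer.seek(tp, o.offset()));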
● offset translation based on checkpoints in new cluster
● no connection to old cluster required
Active/standby
• Producers publish to topic.
• Consumers subscribe to *.topic.
• SRM replicates data, offset syncs, and consumer checkpoints.
[Diagram: producers behind VIP/load balancers, Primary Cluster, SRM, Secondary Cluster, consumers]
Primary down: fail over
• Producers publish to topic.
• Migrate consumers: use RemoteClusterUtils to migrate to primary.topic (old data) and topic (new data).
[Diagram: producers behind VIP/load balancers, Primary Cluster, SRM carrying data, offset syncs, and consumer checkpoints, Secondary Cluster, consumers]
Primary down: fail over
Migrate consumers
[Diagram: producers behind VIP/load balancers, Primary Cluster, SRM carrying data, offset syncs, and consumer checkpoints, Secondary Cluster, consumers]
$ srm-control offsets --bootstrap-server :9092 --source primary --group foo --export > out.csv
$ kafka-consumer-groups --bootstrap-server B_host:9092 --reset-offsets --group foo --execute --from-file out.csv
Primary permanently lost? Recover from secondary.
• Producers publish to topic.
• Lost primary topics can be recovered from remote topics on the secondary cluster.
[Diagram: the lost Primary Cluster is replaced by a new cluster, Primary-2, linked to the Secondary Cluster by SRM. Primary-2 holds topic1, topic2, secondary.topic1, secondary.topic2, and secondary.primary.topic1, secondary.primary.topic2 (data from old primary). Secondary Cluster holds topic1, topic2, primary.topic1, primary.topic2, primary-2.topic1, primary-2.topic2.]
Active – Passive Demo
Active/standby Demo Scenario
• Publish to retail-store
• Subscribe to retail-store and nyc_retail-store
[Diagram: producers and NiFi, Paris Cluster, SRM, NYC Cluster]
Active - Active
Active/active: Cross Consumer Groups or XDCR
• Consumer subscription defines the patterns.
• Producers publish to topic and produce to both clusters.
• Consumers consume from both clusters.
[Diagram: producers and consumers on each side, VIP/load balancers, Primary Cluster, SRM, Secondary Cluster]
A/ Cross-cluster consumer groups
Cross-cluster consumer groups
• Effectively one big consumer group.
• Producers publish to topic and produce to both clusters.
• Consumers on both sides subscribe to topic.
[Diagram: records R1 and R2; producers and consumers on each side, VIP/load balancers, Primary Cluster, SRM, Secondary Cluster]
Cross-cluster consumer groups
• What does it take to fail over? Nothing.
• Producers publish to topic and produce to both clusters.
• Consumers subscribe to topic.
[Diagram: DC temporarily lost (Primary Cluster); record R3; SRM; Secondary Cluster]
Cross-cluster consumer groups
• What does it take to fail back? Nothing either.
• Consumers recover from the last point and resume; some events may be delayed.
• Producers publish to topic and produce to both clusters.
[Diagram: DC issue resolved; record R4; Primary Cluster, SRM, Secondary Cluster]
Cross-cluster consumer groups
• DC permanently lost: bring up a new DC (Primary-2).
• Data previously in primary is not lost and can be recovered from secondary.
• Producers publish to topic and produce to both clusters; consumers subscribe to topic.
[Diagram: lost Primary Cluster, new Primary-2 Cluster, SRM, Secondary Cluster]
XDCR
Cross Data Center Replication (XDCR)
• All consumers process all records.
• Producers publish to topic and produce to both clusters.
• Consumers on both sides subscribe to *.topic.
[Diagram: records R1 and R2 appear on both the Primary Cluster and the Secondary Cluster, replicated by SRM]
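The two active/active styles differ mainly in the topic pattern the consumers subscribe to. The sketch below uses java.util.regex to show one possible reading, assuming the default <source>.<topic> naming and a topic literally named topic; the class, constants and method are hypothetical. With the Kafka consumer, such a pattern would be passed to subscribe(Pattern).

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical patterns for the two active/active subscription styles. */
final class SubscriptionPatterns {
    /** Local topic only, as in cross-cluster consumer groups ("Subscribe to topic"). */
    static final Pattern LOCAL_ONLY = Pattern.compile("topic");

    /** Local topic plus any replicated copy, one reading of "*.topic" for XDCR. */
    static final Pattern LOCAL_AND_REMOTE = Pattern.compile("(.+\\.)?topic");

    /** Returns the topics whose full name matches the pattern. */
    static List<String> matching(Pattern pattern, List<String> topics) {
        return topics.stream()
                .filter(t -> pattern.matcher(t).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> onSecondary = Arrays.asList("topic", "primary.topic", "heartbeats");
        System.out.println(matching(LOCAL_ONLY, onSecondary));
        System.out.println(matching(LOCAL_AND_REMOTE, onSecondary));
    }
}
```

A literal glob such as *.topic would not match the unprefixed local topic, which is why the sketch makes the prefix optional.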
Active – Passive Demo
Other use cases
Cloud migration or Kafka version upgrade
Aggregation for Analytics
Monitoring: Demo integration with SMM
THANK YOU
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 

Disaster Recovery and High Availability with Kafka, SRM and MM2

  • 1. Kafka Disaster Recovery
    Abdelkrim Hadjidj, Senior Data Streaming Specialist
  • 2. Quick intro
    • Senior Specialist Solution Engineer at Cloudera
    • Focus on CDF offering
      ● Edge Management & IoT (MiNiFi, CEM)
      ● Flow Management (NiFi, Registry)
      ● Stream Processing (Kafka, KStreams, SMM, SR, …)
    • Founder of Future of Data Paris Meetup http://tiny.cc/fodp
    • Founder of Solutions Engineers of Paris http://tiny.cc/PSE
    • @ahadjidj
    © 2019 Cloudera, Inc. All rights reserved.
  • 3. Kafka Disaster Recovery options
    • Dual ingest: zero RPO
    • Mirroring**: very low RPO
    • Multiple DC*: zero RPO
    [Diagrams: brokers and data flows across DC1, DC2 and DC3]
    * Stretch cluster on geographically distributed DC is not recommended
    ** Replication is used for internal broker replication
  • 4. Agenda
    • From MM to MM2 and SRM
    • Active Passive Architecture
    • Active Active Architectures
    • Other use cases
    • Monitoring
    • Q&A
  • 5. Mirror Maker use cases
    • Aggregation
    • Data Deployment
    • Segmentation
    • Acquisitions & mergers
    [Diagrams: clusters K1, K2, K3 in DC1, DC2, DC3 linked by MM, with producers (P) and consumers (C)]
  • 6. Mirror Maker use cases
    [Diagram: Tracking and Queuing clusters with producers (P) and consumers (C); MM; Tracking Aggregate and Queuing Aggregate clusters with consumers (C); HDFS]
  • 7. Mirror Maker limitations for Disaster Recovery
    • Static Whitelists and Blacklists
    • Configuration sync
    • Manual Topic Naming to avoid Cycles
    • Scalability and Throughput Limitations due to Rebalances
    • Lack of Monitoring and Operational Support
    • No Disaster Recovery, Migration, Failover
    • Too many MirrorMaker Clusters
  • 8. Streams Replication Manager
    • Mirror Maker 2 (KIP-382)
    • Supports active-active, multi-cluster, cross-DC replication & other complex scenarios
    • Leverages Kafka Connect for scalability and HA
    • Replicates data and configurations (ACL, partitioning, new topics, etc.)
    • Offset translation for failover and failback
    • Monitoring integration with SMM
    [Diagram: Kafka Connect MM2 cluster between clusters A, B, C, X and Y; local partitions topic1.part0 and topic1.part1, remote partitions A.topic1.part0, A.topic1.part1, X.topic1.part0 and X.topic1.part1]
  • 9. Active – Passive Architecture
  • 10. Active/standby
    • Producers send to primary if available, to secondary if not
    • Consumers can be migrated between primary and secondary clusters.
    [Diagram: producers and consumers behind VIP/Load Balancers; SRM between Primary Cluster and Secondary Cluster, carrying data, offset syncs, and consumer checkpoints]
  • 11. Configuration file
    • Simple file configuration
    • Multi directional
    • Fine grained replication
    • Topics white/black lists
    • Group white/black lists
    • Interval configurations
    • Supports patterns
    $ ./bin/connect-mirror-maker.sh mm2.properties
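A configuration in the spirit of slide 11 might look like the fragment embedded in the sketch below. It is not taken from the slides: the cluster aliases `primary` and `secondary`, the host names, the topic pattern and the interval value are assumptions, and the property keys follow the layout described in KIP-382 (exact key names can differ between Kafka and SRM versions). The Java wrapper only parses the fragment with `java.util.Properties`, to show that the file is an ordinary properties file with per-cluster and per-flow prefixes.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

final class Mm2ConfigSketch {
    // Hypothetical mm2.properties content; aliases, hosts and patterns are assumptions.
    static final String MM2_PROPERTIES = String.join("\n",
        "clusters = primary, secondary",
        "primary.bootstrap.servers = primary-host:9092",
        "secondary.bootstrap.servers = secondary-host:9092",
        "# replicate every topic matching the pattern from primary to secondary",
        "primary->secondary.enabled = true",
        "primary->secondary.topics = retail-.*",
        "# the reverse direction is configured independently",
        "secondary->primary.enabled = false",
        "# how often new topics are detected",
        "refresh.topics.interval.seconds = 60");

    // Parses the fragment exactly as a properties file on disk would be parsed.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory reader
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(MM2_PROPERTIES);
        props.stringPropertyNames().stream().sorted()
            .forEach(k -> System.out.println(k + " = " + props.getProperty(k)));
    }
}
```

Each `source->target` prefix describes one replication direction, which is how a single file can express the "multi directional" and "fine grained" bullets.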
  • 12. Remote topics
    • Replicated topics are renamed according to ReplicationPolicy.
    • Default policy: <source>.<topic>
    • Can implement custom policies
    [Diagram: SRM between Primary Cluster (topic1, topic2, secondary.topic1, secondary.topic2) and Secondary Cluster (topic1, topic2, primary.topic1, primary.topic2)]
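The default `<source>.<topic>` naming from slide 12 can be illustrated with a few lines of Java. This is a standalone sketch, not MM2's own ReplicationPolicy class: `RemoteTopicNaming` and its methods are hypothetical names, and the sketch assumes that local topic names contain no dots.

```java
final class RemoteTopicNaming {
    static final String SEPARATOR = ".";

    // Name a topic receives on the target cluster: <source>.<topic>
    static String formatRemoteTopic(String sourceAlias, String topic) {
        return sourceAlias + SEPARATOR + topic;
    }

    // Alias of the cluster a topic was replicated from, or null for a local topic.
    // Assumes local topic names never contain the separator.
    static String topicSource(String topic) {
        int i = topic.indexOf(SEPARATOR);
        return i < 0 ? null : topic.substring(0, i);
    }

    public static void main(String[] args) {
        System.out.println(formatRemoteTopic("primary", "topic1"));
        // A second hop simply adds another prefix.
        System.out.println(formatRemoteTopic("secondary", "primary.topic1"));
        System.out.println(topicSource("primary.topic1"));
    }
}
```

Because every hop adds a prefix, a topic that travels from primary to secondary and on to a third cluster ends up as `secondary.primary.topic1`, which is also how replication cycles can be detected.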
  • 13. Heartbeats
    • MM2 emits a heartbeat topic in each source cluster, which is replicated to other clusters
    • Downstream cluster uses this topic to verify that
      ● The connector is running
      ● The corresponding source cluster is available
    [Diagram: SRM and Secondary Cluster; topic primary.heartbeats with record target=primary source=secondary Timestamp=5434356]
  • 14. Offset Syncs
    • Offset sync stream maps offsets between mirrored clusters.
    [Diagram: SRM and Secondary Cluster; topic primary.offset-syncs.internal with record topic=primary.topic1 partition=4 upstreamOffset=100 downstreamOffset=102]
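To make the record on slide 14 concrete, here is a minimal sketch of how one offset sync could be used to map an upstream offset to a downstream one. It assumes that records after the sync point were mirrored one-to-one; the class and method names are hypothetical, and the actual MM2 translation logic may be more conservative than this.

```java
final class OffsetSyncSketch {
    // One record of the offset-syncs topic, reduced to the two offsets.
    static final class OffsetSync {
        final long upstreamOffset;
        final long downstreamOffset;

        OffsetSync(long upstreamOffset, long downstreamOffset) {
            this.upstreamOffset = upstreamOffset;
            this.downstreamOffset = downstreamOffset;
        }
    }

    // Maps an upstream offset at or after the sync point to a downstream offset,
    // assuming a one-to-one copy of records since the sync was emitted.
    static long translate(OffsetSync sync, long upstreamOffset) {
        if (upstreamOffset < sync.upstreamOffset) {
            throw new IllegalArgumentException("offset precedes the sync point");
        }
        return sync.downstreamOffset + (upstreamOffset - sync.upstreamOffset);
    }

    public static void main(String[] args) {
        OffsetSync sync = new OffsetSync(100L, 102L); // values from the slide
        System.out.println(translate(sync, 100L));
        System.out.println(translate(sync, 105L));
    }
}
```

With the slide's values, upstream offset 100 corresponds to downstream offset 102, and under the one-to-one assumption upstream offset 105 would correspond to 107.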
  • 15. Checkpoints
    • Checkpoint stream replicates consumer group state.
    • MM2 periodically emits checkpoints in the destination cluster
    • The checkpoint topic is log-compacted to reflect only the latest offsets across consumer groups
    [Diagram: SRM and Secondary Cluster; topic primary.checkpoints.internal with record topic=primary.topic1 partition=4 group=consumer-group-2 upstreamOffset=100 offset=102]
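The effect of log compaction on the checkpoint topic can be sketched as "keep the last record per key". The example below is an illustration only: the `Checkpoint` class, the choice of group, topic and partition as the key, and the second record's offsets are assumptions, not MM2's actual record format.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CheckpointCompactionSketch {
    // Fields mirror the record shown on the slide.
    static final class Checkpoint {
        final String group;
        final String topic;
        final int partition;
        final long upstreamOffset;
        final long offset; // translated offset in the destination cluster

        Checkpoint(String group, String topic, int partition, long upstreamOffset, long offset) {
            this.group = group;
            this.topic = topic;
            this.partition = partition;
            this.upstreamOffset = upstreamOffset;
            this.offset = offset;
        }

        // Assumed compaction key: one entry per group and topic partition.
        String key() {
            return group + "|" + topic + "|" + partition;
        }
    }

    // Keeps only the most recent checkpoint per key, as compaction eventually does.
    static Map<String, Checkpoint> compact(List<Checkpoint> log) {
        Map<String, Checkpoint> latest = new LinkedHashMap<>();
        for (Checkpoint c : log) {
            latest.put(c.key(), c);
        }
        return latest;
    }

    public static void main(String[] args) {
        List<Checkpoint> log = Arrays.asList(
            new Checkpoint("consumer-group-2", "primary.topic1", 4, 100L, 102L),
            new Checkpoint("consumer-group-2", "primary.topic1", 4, 150L, 152L)); // hypothetical later record
        compact(log).values().forEach(c -> System.out.println(c.key() + " -> " + c.offset));
    }
}
```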
  • 16. Cross-cluster offset translation
    Translate offsets between clusters via RemoteClusterUtils:
      // Signature as in the released Kafka API; the timeout value is an illustrative assumption.
      Map<TopicPartition, OffsetAndMetadata> newOffsets = RemoteClusterUtils.translateOffsets(
          newClusterProperties, oldClusterName, consumerGroupId, Duration.ofSeconds(30));
      // Seek each assigned partition to its translated offset.
      newOffsets.forEach(consumer::seek);
    ● offset translation based on checkpoints in new cluster
    ● no connection to old cluster required
  • 17. Active/standby
    • Publish to topic
    • Subscribe to *.topic
    [Diagram: producers and consumers behind VIP/Load Balancers; SRM between Primary Cluster and Secondary Cluster, carrying data, offset syncs, and consumer checkpoints]
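The `*.topic` subscription on slide 17 is glob-like shorthand. A Kafka consumer subscribes by regular expression through `KafkaConsumer.subscribe(Pattern)`, so one way to cover both the local topic and its remote copies is sketched below. The helper names are hypothetical, and the pattern assumes the default `<source>.<topic>` naming.

```java
import java.util.regex.Pattern;

final class TopicPatternSketch {
    // Matches the local topic and any prefixed remote copy, e.g. topic1 and primary.topic1.
    static Pattern localAndRemote(String topic) {
        return Pattern.compile("(.+\\.)?" + Pattern.quote(topic));
    }

    static boolean matches(String topic, String candidate) {
        return localAndRemote(topic).matcher(candidate).matches();
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"topic1", "primary.topic1", "topic10"}) {
            System.out.println(candidate + " -> " + matches("topic1", candidate));
        }
    }
}
```

Making the prefix optional lets the same subscription work on either cluster, so consumers do not need a different configuration after a failover.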
  • 18. Primary down: fail over
    • Publish to topic
    • Migrate consumers: use RemoteClusterUtils to migrate to primary.topic (old data) and topic (new data)
    [Diagram: producers and consumers behind VIP/Load Balancers; SRM between Primary Cluster and Secondary Cluster, carrying data, offset syncs, and consumer checkpoints]
  • 19. Primary down: fail over
    • Publish to topic
    • Migrate consumers:
    $ srm-control offsets --bootstrap-server :9092 --source primary --group foo --export > out.csv
    $ kafka-consumer-groups --bootstrap-server B_host:9092 --reset-offsets --group foo --execute --from-file out.csv
    [Diagram: producers and consumers behind VIP/Load Balancers; SRM between Primary Cluster and Secondary Cluster, carrying data, offset syncs, and consumer checkpoints]
  • 20. Primary permanently lost? Recover from secondary.
    • Publish to topic
    • Lost primary topics can be recovered from remote topics on secondary cluster.
    [Diagram: producers behind VIP/Load Balancers; SRM; Primary Cluster, Secondary Cluster and Primary-2; topics topic1, topic2, secondary.topic1, secondary.topic2, secondary.primary.topic1, secondary.primary.topic2 and topic1, topic2, primary.topic1, primary.topic2, primary-2.topic1, primary-2.topic2; label "Data from old primary"]
  • 22. Active/standby Demo Scenario
    • Publish to retail-store
    • Subscribe to retail-store and nyc_retail-store
    [Diagram: NiFi producers; SRM between Paris Cluster and NYC Cluster]
  • 24. Active/active: Cross Consumer Groups or XDCR
    • Consumer subscription defines the patterns
    • Publish to topic
    • Produce to both clusters.
    • Consume from both clusters.
    [Diagram: producers and consumers on each side behind VIP/Load Balancers; SRM between Primary Cluster and Secondary Cluster]
  • 26. Cross-cluster consumer groups
    • Effectively one big consumer group
    • Publish to topic
    • Produce to both clusters.
    • Subscribe to topic (consumers on both sides)
    [Diagram: producers and consumers on each side behind VIP/Load Balancers; SRM between Primary Cluster and Secondary Cluster; records R1 and R2]
  • 27. Cross-cluster consumer groups
    • What does it take to fail over? Nothing
    • Primary Cluster DC temporarily lost
    • Publish to topic
    • Produce to both clusters.
    • Subscribe to topic (consumers on both sides)
    [Diagram: producers and consumers on each side behind VIP/Load Balancers; SRM between Primary Cluster and Secondary Cluster; record R3]
  • 28. Cross-cluster consumer groups
    • What does it take to fail back? Nothing either
    • DC issue resolved
    • Recover from last point and resume; some events may be delayed
    • Publish to topic
    • Produce to both clusters.
    [Diagram: producers and consumers on each side behind VIP/Load Balancers; SRM between Primary Cluster and Secondary Cluster; record R4]
  • 29. Cross-cluster consumer groups
    • DC permanently lost
    • Bring up a new DC (Primary-2 Cluster)
    • Data previously in primary is not lost and can be recovered from secondary
    • Publish to topic
    • Produce to both clusters.
    • Subscribe to topic
    [Diagram: producers and consumers on each side behind VIP/Load Balancers; SRM between Primary-2 Cluster and Secondary Cluster; lost Primary Cluster]
  • 30. XDCR
  • 31. Cross Data Center Replication (XDCR)
    • All consumers process all records
    • Publish to topic
    • Produce to both clusters.
    • Subscribe to *.topic (consumers on both sides)
    [Diagram: producers and consumers on each side behind VIP/Load Balancers; SRM between Primary Cluster and Secondary Cluster; records R1 and R2 delivered to consumers on both sides]
  • 34. Cloud migration or Kafka version upgrade
  • 35. Aggregation for Analytics