1 © Hortonworks Inc. 2011–2018. All rights reserved
High throughput data replication over RAFT
Mukul Kumar Singh, Staff Software Engineer, Hortonworks
Lokesh Jain, Software Engineer, Hortonworks
Speakers
Mukul Kumar Singh
• msingh@apache.org
• Staff Software Engineer, Hortonworks
• ASF
• Committer for Apache Hadoop
• Committer for Apache Ratis
• MS from Carnegie Mellon University, Pittsburgh
Lokesh Jain
• ljain@apache.org
• Software Engineer, Hortonworks
• ASF
• Committer for Apache Ratis
• BE (Hons) Computer Science & M.Sc. (Hons) Mathematics from BITS Pilani
Raft
• Raft is a consensus algorithm
• Works as long as a majority of the nodes in the cluster are alive
• i.e., it can tolerate the loss of a minority of the nodes.
• “In Search of an Understandable Consensus Algorithm”
• by Diego Ongaro and John Ousterhout
• USENIX ATC’14, https://raft.github.io
Raft Library
• Our Motivations
• Use Raft in Ozone
• “In Search of a Usable Raft Library”
• A long list of Raft implementations is available
• None of them is a general library ready to be consumed by other projects.
• Most of them are tied to another project or a part of another project.
• We need a Raft library!
Raft Basic
• Leader Election
• Every server starts as a Follower
• A Follower times out after a randomized interval, becomes a Candidate, and starts a leader election
• Candidate sends requestVote to other servers
• It becomes the leader once it gets a majority of the votes.
• Append Entries
• Clients send requests to the Leader
• Leader forwards the requests to the Followers
• Leader sends appendEntries to Followers
• When there are no client requests, the Leader also sends empty appendEntries (heartbeats) to the Followers to maintain leadership
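The election steps above can be sketched in a few lines of Python. This is a toy, single-process illustration and not Ratis code: the names Server, pick_candidate, request_vote and run_election are hypothetical, the 150 to 300 timeout range is only an illustrative choice, and the log up-to-date check that real Raft performs before granting a vote is left out.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Server:
    server_id: int
    role: str = "follower"            # every server starts as a Follower
    current_term: int = 0
    voted_for: Optional[int] = None


def majority(cluster_size: int) -> int:
    """Smallest number of servers that forms a majority."""
    return cluster_size // 2 + 1


def pick_candidate(servers: List[Server], rng: random.Random) -> Server:
    """The server whose randomized timeout fires first becomes the Candidate."""
    timeouts = {s.server_id: rng.uniform(150, 300) for s in servers}  # illustrative range
    first = min(timeouts, key=timeouts.get)
    return next(s for s in servers if s.server_id == first)


def request_vote(peer: Server, candidate_id: int, term: int) -> bool:
    """A peer grants at most one vote per term (log checks omitted in this sketch)."""
    if term > peer.current_term:
        peer.current_term = term
        peer.voted_for = None
        peer.role = "follower"
    if term == peer.current_term and peer.voted_for in (None, candidate_id):
        peer.voted_for = candidate_id
        return True
    return False


def run_election(servers: List[Server], candidate: Server, reachable_ids: Set[int]) -> bool:
    """Candidate votes for itself, asks reachable peers, and wins on a majority."""
    candidate.current_term += 1
    candidate.role = "candidate"
    candidate.voted_for = candidate.server_id
    votes = 1
    for peer in servers:
        if peer is candidate or peer.server_id not in reachable_ids:
            continue
        if request_vote(peer, candidate.server_id, candidate.current_term):
            votes += 1
    if votes >= majority(len(servers)):
        candidate.role = "leader"
    return candidate.role == "leader"
```

With three servers the candidate needs two votes, so it can still win when one peer is unreachable, but not when it is isolated from both.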
Apache Ratis
Data Intensive Applications
• In Raft,
• All transactions and the data are written in the log
• Not suitable for data intensive applications
• In Ratis
• Application could choose to not write all the data to log
• State machine data and log data can be separately managed
• See the FileStore example in ratis-example
• See the ContainerStateMachine as an implementation in Apache Hadoop Ozone.
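The idea of managing state machine data separately from the log can be illustrated with a small Python sketch. It is not the FileStore or ContainerStateMachine code: DataSeparatingStore and its fields are hypothetical names, and an in-memory dict stands in for files on disk. The bulk bytes are written once to the data store, while the log entry carries only a small descriptor.

```python
import zlib
from typing import Dict, List


class DataSeparatingStore:
    """Toy store: bulk data goes to a data area, the log keeps only metadata."""

    def __init__(self) -> None:
        self.log: List[dict] = []                 # small, replicated log entries
        self.data_files: Dict[str, bytes] = {}    # stand-in for state machine storage

    def write(self, path: str, data: bytes) -> int:
        # The payload is written exactly once, outside the log.
        self.data_files[path] = data
        # The log entry describes the write but does not contain the payload.
        entry = {
            "index": len(self.log),
            "path": path,
            "length": len(data),
            "checksum": zlib.crc32(data),
        }
        self.log.append(entry)
        return entry["index"]

    def verify(self, index: int) -> bool:
        """Check that the stored data still matches its log descriptor."""
        entry = self.log[index]
        data = self.data_files[entry["path"]]
        return len(data) == entry["length"] and zlib.crc32(data) == entry["checksum"]
```

A 1 MB write therefore adds one short descriptor to the log instead of a second copy of the 1 MB payload.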
Ratis: Standard Raft Features
• Leader Election + Log Replication
• Automatically elect a leader among the servers in a Raft group
• Randomized timeout for avoiding split votes
• Log is replicated in the Raft group
• Membership Changes
• Members in a Raft group can be reconfigured at runtime
• Replication factor can be changed at runtime
• Log Compaction
• Snapshot is taken periodically
• Send snapshot instead of a long log history.
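Log compaction can be sketched as follows, assuming a key-value state. CompactingLog and catch_up are hypothetical names, not Ratis APIs, and the sketch folds the whole log into the snapshot, whereas a real implementation would only cover entries that have already been applied.

```python
from typing import Dict, List, Tuple

Entry = Tuple[int, str, int]  # (log index, key, value)


class CompactingLog:
    """Toy log that can fold its entries into a snapshot of key-value state."""

    def __init__(self) -> None:
        self.snapshot: Dict[str, int] = {}
        self.snapshot_index = 0           # last log index covered by the snapshot
        self.entries: List[Entry] = []    # entries newer than the snapshot

    def append(self, key: str, value: int) -> int:
        index = self.snapshot_index + len(self.entries) + 1
        self.entries.append((index, key, value))
        return index

    def take_snapshot(self) -> None:
        """Apply all retained entries to the snapshot and drop them from the log."""
        for _, key, value in self.entries:
            self.snapshot[key] = value
        if self.entries:
            self.snapshot_index = self.entries[-1][0]
        self.entries = []

    def catch_up(self, follower_next_index: int) -> dict:
        """Decide what to send to a follower that needs entries from a given index."""
        if follower_next_index <= self.snapshot_index:
            # The needed history was compacted away: send the snapshot plus the tail.
            return {"snapshot": dict(self.snapshot), "entries": list(self.entries)}
        tail = [e for e in self.entries if e[0] >= follower_next_index]
        return {"snapshot": None, "entries": tail}
```

A follower that is far behind receives one snapshot instead of the long history, while a follower that is nearly current receives only the missing entries.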
Ratis: Pluggability
• Pluggable state machine
• Application must define its state machine
• Example: a key-value map
• Pluggable RPC
• Users may provide their own RPC implementation
• Default implementations: gRPC, Netty, Hadoop RPC
• gRPC allows implementation of native client
• Pluggable Raft log
• Users may provide their own log implementation
• The default implementation stores log in local files
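The key-value map example of a pluggable state machine might look like the sketch below. This is plain Python and not the Ratis StateMachine interface: the class names, the text command format (PUT, DELETE, GET) and the replay helper are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Optional


class StateMachine(ABC):
    """What the application plugs in: how commands change and read its state."""

    @abstractmethod
    def apply_transaction(self, command: str) -> str:
        ...

    @abstractmethod
    def query(self, command: str) -> Optional[str]:
        ...


class KeyValueStateMachine(StateMachine):
    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def apply_transaction(self, command: str) -> str:
        op, _, rest = command.partition(" ")
        if op == "PUT":
            key, _, value = rest.partition(" ")
            self.data[key] = value
            return "OK"
        if op == "DELETE":
            return "OK" if self.data.pop(rest, None) is not None else "NOT_FOUND"
        raise ValueError(f"unknown command: {command}")

    def query(self, command: str) -> Optional[str]:
        op, _, key = command.partition(" ")
        if op != "GET":
            raise ValueError(f"not a read-only command: {command}")
        return self.data.get(key)


def replay(log: List[str], factory: Callable[[], StateMachine]) -> StateMachine:
    """Apply the same committed log, in order, to a fresh state machine."""
    machine = factory()
    for command in log:
        machine.apply_transaction(command)
    return machine
```

Because every replica applies the same log in the same order, two replicas built with replay end up with identical maps.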
Ratis: Asynchronous/Synchronous APIs
• Using gRPC bi-directional stream API
• Netty and Hadoop RPC could support async, but this is not yet implemented
• Server-to-server
• Asynchronous append entries
• Client-to-server
• Asynchronous client requests
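Asynchronous append entries can be pictured with Python's asyncio. This is a simulation under stated assumptions, not the gRPC streaming code in Ratis: send_append and replicate are hypothetical names, network latency is modeled with sleep, and the slow followers are simply cancelled at the end so that the sketch terminates (a real leader would keep retrying them).

```python
import asyncio
from typing import List


async def send_append(delay_seconds: float, entry: str) -> bool:
    """Simulated appendEntries call: the follower acknowledges after a delay."""
    await asyncio.sleep(delay_seconds)
    return True


async def replicate(entry: str, follower_delays: List[float]) -> int:
    """Send to all followers concurrently and return once a majority has the entry."""
    cluster_size = len(follower_delays) + 1       # followers plus the leader
    needed = cluster_size // 2 + 1
    acks = 1                                      # the leader already has the entry
    pending = {asyncio.create_task(send_append(d, entry)) for d in follower_delays}
    while pending and acks < needed:
        done, pending = await asyncio.wait(pending, return_when=asyncio.FIRST_COMPLETED)
        acks += sum(1 for task in done if task.result())
    # Sketch only: stop waiting for slow followers so the example finishes.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return acks
```

Calling asyncio.run(replicate("entry-1", [0.01, 0.5])) returns as soon as the fast follower acknowledges, since two of three servers already form a majority.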
General Ratis Use Cases
• You want to:
• (1) replicate the server log/states to multiple machines
• The replication number/cluster membership can be changed at runtime
• It can tolerate server failures.
• or
• (2) have an HA (highly available) service
• When a server fails, another server will automatically take over.
• Clients automatically fail over to the new server.
• Apache Ratis is for you!
API
• Client Side APIs
• send/sendReadOnly
• Read-only commands do not change the state of the Raft server.
• Async versions also available (sendAsync, sendReadOnlyAsync)
• Server Side APIs
• applyTransaction
• Applies the transaction to the statemachine
• writeStateMachineData
• An optimization to avoid the double write penalty for data intensive applications.
High Throughput Data Pipeline
Building a high performance data pipeline
• Requirements
• High data write throughput
• Parallelism/async interface
• Large number of transactions per second
• Configurable parameters
• Support for security
Building a high performance data pipeline
• Optimizations
• Separate user data from the raft log
– Avoids double write penalty for data
• Efficient batching of raft log entries
– High write performance during local disk write
– Efficient network replication
• Async processing of operations
– Client ops
– Append entries to followers
– StateMachine implementation
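Batching of log entries can be sketched as a small buffer that turns many appends into fewer, larger writes. LogBatcher, max_batch_bytes and the sink callback are hypothetical names chosen for this illustration; the actual batching logic and thresholds in Ratis are not shown in these slides.

```python
from typing import Callable, List


class LogBatcher:
    """Collects log entries and hands them to the sink in batches."""

    def __init__(self, max_batch_bytes: int, sink: Callable[[bytes], None]) -> None:
        self.max_batch_bytes = max_batch_bytes
        self.sink = sink                  # e.g. one disk write or one network send
        self.buffer: List[bytes] = []
        self.buffered_bytes = 0
        self.flushes = 0

    def append(self, entry: bytes) -> None:
        # Flush first if adding this entry would exceed the batch size.
        if self.buffer and self.buffered_bytes + len(entry) > self.max_batch_bytes:
            self.flush()
        self.buffer.append(entry)
        self.buffered_bytes += len(entry)

    def flush(self) -> None:
        if not self.buffer:
            return
        self.sink(b"".join(self.buffer))  # one write for the whole batch
        self.flushes += 1
        self.buffer = []
        self.buffered_bytes = 0
```

With a 10-byte batch limit, seven 4-byte entries result in four writes instead of seven; larger limits reduce the number of writes further.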
FileStoreStateMachine
• Located at org.apache.ratis.examples.filestore
• Simple state machine implementation to write bytes to a file
• Separates file data from raft log.
• File data written is persisted to disk
• Client generates random bytes of the specified file size
• Client uses writeAsync
Performance Benchmarking
• Setup: 3 nodes, each with
• Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
• 256GiB System memory
• 10 Gigabit Network Connection
• 4 HGST (HUS726060AL4210) HDD of 5.5TB each
Performance – Write Throughput
[Figure: Write throughput for 1GB. X-axis: file size in KB, from 128000 down to 100. Y-axis: data throughput in MB/s, 0 to 300.]
Performance – Transactions per second
[Figure: Number of transactions with 100000 files. X-axis: file size in bytes (100000, 10000, 1000, 100, 10). Y-axis: number of transactions per second, 0 to 12000.]
Ozone
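[Figure: Ozone architecture diagram. Components: Ozone Client, Ozone Master, Storage Container Manager, and three datanodes (DN) connected through Ratis. Labeled interactions: Get Block, Get Container Location (list of DNs), Write Data.]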
Terminologies
• OM – Ozone Master
• Namespace manager inside Ozone; manages the key name to block ID mapping.
• Also manages the volume, bucket, and key namespaces
• SCM – Storage Container Manager
• Block manager; manages cluster membership, container location information, and containers
• Datanode
• Stores user data; a Ratis server is spawned inside the datanode
• Ozone datanodes persist containers; blocks are allocated out of containers.
Storage Container
• Hadoop Distributed Data Storage (HDDS) introduces Storage Containers
• Provide generic data storage functionalities.
• Configurable Size (2GB - 16GB+)
• Unit of management and replication in SCM.
• Blocks are allocated from containers
• BID = CID + LocalID
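One way to read BID = CID + LocalID is that a block ID is the pair of a container ID and an ID that is local to that container. The Python sketch below follows that reading; BlockID, Container and allocate_block are hypothetical names, the counter-based local ID is an assumption, and the real HDDS data structures may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BlockID:
    container_id: int   # CID: which container holds the block
    local_id: int       # LocalID: unique only within that container


class Container:
    """Toy container that hands out blocks until its configured size is used up."""

    def __init__(self, container_id: int, max_size_bytes: int) -> None:
        self.container_id = container_id
        self.max_size_bytes = max_size_bytes
        self.used_bytes = 0
        self._next_local_id = 1

    def allocate_block(self, size_bytes: int) -> Optional[BlockID]:
        if self.used_bytes + size_bytes > self.max_size_bytes:
            return None                      # container is full for this request
        block = BlockID(self.container_id, self._next_local_id)
        self._next_local_id += 1
        self.used_bytes += size_bytes
        return block
```

Under this reading, the container ID is enough to locate the replicas of a block, and the local ID only has to be unique inside one container.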
Use of Ratis in Ozone
• Replicating data in open containers
• Replication of user data using Ratis
• Support HA in Storage Container Manager
• Work in Progress
• Support HA in Ozone Manager
• Work in Progress
Ozone Ratis Commands
• The Ozone data pipeline involves interaction between the client and the datanodes.
• Commands are marked as read-only if they do not change the state of the datanode.
• Read-only: GetKey, ReadChunk, ReadContainer
• State-changing: WriteChunk, PutKey, CreateContainer, etc.
• The Ozone client sends container commands to the leader datanode using the Ratis protocol (gRPC as the underlying RPC)
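The read-only marking can be sketched as a simple lookup that decides which client call a command would use. The command names come from the slide; the function client_call_for and the mapping onto send and sendReadOnly are illustrative assumptions, not the Ozone client code.

```python
# Commands listed on the slide as not changing datanode state.
READ_ONLY_COMMANDS = frozenset({"GetKey", "ReadChunk", "ReadContainer"})


def is_read_only(command: str) -> bool:
    """True if the command does not change the state of the datanode."""
    return command in READ_ONLY_COMMANDS


def client_call_for(command: str) -> str:
    """Pick the client-side call: read-only commands need no new log entry."""
    return "sendReadOnly" if is_read_only(command) else "send"
```

For example, client_call_for("ReadChunk") selects the read-only path, while client_call_for("WriteChunk") goes through the replicated write path.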
Command Replication on Containers
[Figure: command replication diagram. Labels: Leader, two Followers, Write Chunk, CSM, Response.]
Open Container Replication using Ratis
• Ratis is used for replication of data being written to Ozone Datanodes.
• Ratis replicates container commands on open containers.
• Ozone Datanode provides its own state machine implementation
• This implementation handles various datanode commands (write chunk, put key, create container)
• Performance optimizations
• To avoid writing data to disk twice, the state machine implementation separates user data from block/chunk metadata.
• Multiple chunks are written in parallel.
• Append requests from the Leader to the Followers are asynchronous, which allows multiple appends in parallel.
• Raft journal on a separate disk: fast contiguous writes without seeking
Ozone Data Write Performance
• The performance numbers were taken for different key sizes with 10 client writes in parallel.
• Measures the end-to-end throughput
• Key allocation in OM and block allocation in SCM are also included in the total throughput.
• Ozone Client
• Uses sync APIs to write data to the datanodes
• ContainerStateMachine implementation
• Parallelizes write chunk operations
Key size            10 MB    100 MB
Throughput (MB/s)   81.3     110.3
Summary
• Ratis is a Java-based implementation of the Raft protocol
• Essentially a replicated state machine.
• Suitable for data intensive applications.
• Features
• Sync/Async client APIs
• Pluggable StateMachine
• Pluggable Raft Log Implementation
• Performance
• Write throughput: 250–300 MB/s
• IOPS: 10,000 txns/s
Contributors
• A big thanks to all the contributors to Apache Ratis, Apache Hadoop, and Ozone
• Animesh Trivedi, Anu Engineer, Arpit Agarwal, Brent,
• Chen Liang, Chris Nauroth, Devaraj Das, Enis Soztutar,
• garvit, Hanisha Koneru, Hugo Louro, Jakob Homan,
• Jian He, Jing Chen, Jing Zhao, Jitendra Pandey, Junping Du,
• kaiyangzhang, Karl Heinz Marbaise, Li Lu, Lokesh Jain,
• Marton Elek, Mayank Bansal, Mingliang Liu,
• Mukul Kumar Singh, Sen Zhang, Shashikant Banerjee, Sriharsha Chintalapani, Tsz Wo Nicholas Sze,
• Uma Maheswara Rao G, Venkat Ranganathan, Wangda Tan,
• Weiqing Yang, Will Xu, Xiaobing Zhou, Xiaoyu Yao, Yubo Xu,
• yue liu, Zhiyuan Yang
Apache Ratis & Apache Hadoop Ozone
• Contributions are welcome!
• Ratis
• http://ratis.incubator.apache.org
• dev@ratis.incubator.apache.org
• Ozone
• http://hadoop.apache.org
• hdfs-dev@hadoop.apache.org
Questions?
Thank you
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
DataWorks Summit
 
Ad

Recently uploaded (20)

Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
Quantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur MorganQuantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersLinux Support for SMARC: How Toradex Empowers Embedded Developers
Linux Support for SMARC: How Toradex Empowers Embedded Developers
Toradex
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
Quantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur MorganQuantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 

High throughput data replication over RAFT

  • 1. 1 © Hortonworks Inc. 2011–2018. All rights reserved High throughput data replication over RAFT Mukul Kumar Singh, Staff Software Engineer, Hortonworks Lokesh Jain, Software Engineer, Hortonworks
  • 2. Speakers • Mukul Kumar Singh • msingh@apache.org • Staff Software Engineer, Hortonworks • ASF • Committer for Apache Hadoop • Committer for Apache Ratis • MS from Carnegie Mellon University, Pittsburgh • Lokesh Jain • ljain@apache.org • Software Engineer, Hortonworks • ASF • Committer for Apache Ratis • BE(Hons) Computer Science & M.Sc. (Hons) Mathematics from BITS Pilani
  • 3. Raft
  • 4. Raft • Raft is a consensus algorithm • Works when a majority of the nodes in the cluster are alive • i.e. it can handle the loss of a minority of the nodes • “In Search of an Understandable Consensus Algorithm” • by Diego Ongaro and John Ousterhout • USENIX ATC’14, https://raft.github.io
  • 5. Raft Library • Our Motivations • Use Raft in Ozone • “In Search of a Usable Raft Library” • A long list of Raft implementations is available • None of them is a general library ready to be consumed by other projects. • Most of them are tied to, or are part of, another project. • We need a Raft library!
  • 6. Raft Basics • Leader Election • Every server starts as a Follower • A Follower times out after a randomized interval, becomes a Candidate, and starts a leader election • The Candidate sends requestVote to the other servers • It becomes the Leader once it gets a majority of the votes. • Append Entries • Clients send requests to the Leader • The Leader forwards the requests to the Followers • The Leader sends appendEntries to the Followers • When there are no client requests, the Leader still sends empty appendEntries (heartbeats) to the Followers to maintain leadership
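The election rule on this slide can be sketched in a few lines. This is only an illustration of the majority check and the randomized timeout, not code from Ratis; the function names and the 150–300 ms range are assumptions chosen for the example.

```python
import random


def election_timeout_ms(low=150.0, high=300.0, rng=random):
    """Pick a randomized election timeout (assumed example range, in ms).

    Randomizing the timeout makes it unlikely that two followers
    become candidates at the same moment and split the vote.
    """
    return rng.uniform(low, high)


def wins_election(votes_granted, cluster_size):
    """A candidate becomes leader once it holds a strict majority of votes."""
    return votes_granted > cluster_size // 2


if __name__ == "__main__":
    # Two votes are a majority in a 3-node group but not in a 5-node group.
    print(wins_election(2, 3), wins_election(2, 5))
```

The same majority rule is why a group of 3 tolerates one failed node and a group of 5 tolerates two.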
  • 7. Apache Ratis
  • 8. Data Intensive Applications • In Raft, • All transactions and the data are written to the log • Not suitable for data intensive applications • In Ratis • An application can choose not to write all of its data to the log • State machine data and log data can be managed separately • See the FileStore example in ratis-example • See ContainerStateMachine for an implementation in Apache Hadoop Ozone.
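A minimal sketch of the idea of keeping bulk data out of the log, assuming an in-memory stand-in for both the log and the data files. The class and field names are hypothetical and do not correspond to the Ratis or Ozone classes named on the slide.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """Metadata-only log record: it points at the data instead of carrying it."""
    index: int
    path: str
    offset: int
    length: int
    checksum: str


class SketchStore:
    """Hypothetical store that keeps bulk data separate from the log."""

    def __init__(self):
        self.log = []    # small metadata records only
        self.files = {}  # state machine data: path -> bytearray

    def write(self, path, data):
        buf = self.files.setdefault(path, bytearray())
        offset = len(buf)
        buf.extend(data)  # the payload is written once, outside the log
        entry = LogEntry(
            index=len(self.log),
            path=path,
            offset=offset,
            length=len(data),
            checksum=hashlib.sha256(data).hexdigest(),
        )
        self.log.append(entry)  # the log records only where the data lives
        return entry
```

Because each log record has a fixed small size regardless of payload size, the payload bytes reach the disk once rather than twice.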
  • 9. Ratis: Standard Raft Features • Leader Election + Log Replication • Automatically elects a leader among the servers in a Raft group • Randomized timeouts to avoid split votes • The log is replicated in the Raft group • Membership Changes • Members of a Raft group can be reconfigured at runtime • The replication factor can be changed at runtime • Log Compaction • A snapshot is taken periodically • A snapshot is sent instead of a long log history.
  • 10. Ratis: Pluggability • Pluggable state machine • The application must define its state machine • Example: a key-value map • Pluggable RPC • Users may provide their own RPC implementation • Default implementations: gRPC, Netty, Hadoop RPC • gRPC allows a native client to be implemented • Pluggable Raft log • Users may provide their own log implementation • The default implementation stores the log in local files
  • 11. Ratis: Asynchronous/Synchronous APIs • Uses the gRPC bi-directional stream API • Netty and Hadoop RPC could support async, but this is not yet implemented • Server-to-server • Asynchronous append entries • Client-to-server • Asynchronous client requests
  • 12. General Ratis Use Cases • You want to: • (1) replicate the server log/states to multiple machines • The replication number/cluster membership can be changed at runtime • It can tolerate server failures. • or • (2) have an HA (highly available) service • When a server fails, another server automatically takes over. • Clients automatically fail over to the new server. • Apache Ratis is for you!
  • 13. API • Client Side APIs • Send/SendReadOnly • Read-only commands do not change the state of the Raft server. • Async versions are also available (sendAsync, sendReadOnlyAsync) • Server Side APIs • applyTransaction • Applies the transaction to the state machine • writeStateMachineData • An optimization to avoid the double write penalty for data intensive applications.
  • 14. High Throughput Data Pipeline
  • 15. Building a high performance data pipeline • Requirements • High data write throughput • Parallelism/async interface • Large number of transactions per second • Configurable parameters • Support for security
  • 16. Building a high performance data pipeline • Optimizations • Separate user data from the Raft log – Avoids the double write penalty for data • Efficient batching of Raft log entries – High write performance for local disk writes – Efficient network replication • Async processing of operations – Client ops – Append entries to followers – StateMachine implementation
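The batching optimization can be illustrated with a size-bounded grouping of pending entries. This is a sketch of the general technique, not the batching logic Ratis uses; `batch_entries` and `max_batch_bytes` are hypothetical names.

```python
def batch_entries(entries, max_batch_bytes):
    """Group consecutive entries into batches bounded by total size.

    Order is preserved. An entry larger than the limit is placed in a
    batch of its own rather than being dropped or split.
    """
    batches, current, size = [], [], 0
    for entry in entries:
        # Flush the current batch if adding this entry would exceed the limit.
        if current and size + len(entry) > max_batch_bytes:
            batches.append(current)
            current, size = [], 0
        current.append(entry)
        size += len(entry)
    if current:
        batches.append(current)
    return batches
```

Each batch can then be written to the local disk with one sequential write and shipped to followers in one append request, which amortizes per-operation overhead across many small entries.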
  • 17. FileStoreStateMachine • Located at org.apache.ratis.examples.filestore • A simple state machine implementation that writes bytes to a file • Separates file data from the Raft log. • The file data written is persisted to disk • The client generates random bytes of the specified file size • The client uses writeAsync
  • 18. Performance Benchmarking • Setup: 3 nodes with • Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz • 256 GiB system memory • 10 Gigabit network connection • 4 HGST (HUS726060AL4210) HDDs of 5.5 TB each
  • 19. Performance – Write Throughput • [Chart: write throughput for 1 GB of data; x-axis: file size in KB (128000 down to 100); y-axis: data throughput in MB/s (0 to 300)]
  • 20. Performance – Transactions per second • [Chart: number of transactions per second with 100000 files; x-axis: file size in bytes (100000, 10000, 1000, 100, 10); y-axis: transactions per second (0 to 12000)]
  • 21. Ozone
  • 22. [Diagram: components are the Ozone Client, the Ozone Master, the Storage Container Manager, and three DNs (datanodes) grouped under RATIS; labelled interactions are Get Block, Get Container Location (List of DNs), and Write Data]
  • 23. Terminologies • OM – Ozone Master • Namespace manager inside Ozone; manages the key name to block ID mapping. • Also manages the volume, bucket and key namespaces • SCM – Storage Container Manager • Block manager; manages cluster membership, container location information and containers • Datanode • Used to store user data; a Ratis server is spawned inside the datanode • Ozone datanodes persist containers; blocks are allocated out of containers.
  • 24. Storage Container • Hadoop Distributed Data Storage (HDDS) introduces Storage Containers • Provide generic data storage functionality. • Configurable size (2GB - 16GB+) • Unit of management and replication in SCM. • Blocks are allocated from containers • BID = CID + LocalID
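The relation BID = CID + LocalID reads as a composite identifier: a block is named by its container plus an ID that is unique within that container. A small sketch, assuming integer IDs and a simple per-container counter (both are assumptions; the slide does not say how Ozone encodes or allocates them):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockID:
    """Composite block identifier: container ID plus container-local ID."""
    container_id: int
    local_id: int


class Container:
    """Hypothetical container that hands out blocks from a local counter."""

    def __init__(self, container_id):
        self.container_id = container_id
        self._next_local_id = 0

    def allocate_block(self):
        block_id = BlockID(self.container_id, self._next_local_id)
        self._next_local_id += 1
        return block_id
```

With this shape, locating a block only requires locating its container, which keeps the container as the unit that has to be tracked and replicated.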
  • 25. Use of Ratis in Ozone • Replicating data in open containers • Replication of user data using Ratis • Support HA in Storage Container Manager • Work in Progress • Support HA in Ozone Manager • Work in Progress
  • 26. Ozone Ratis Commands • The Ozone data pipeline involves interaction between the client and the datanodes. • Commands are marked as read-only if they do not change the state of the datanode. • Read-only: GetKey, ReadChunk, Read Container • State-changing: WriteChunk, PutKey, CreateContainer, etc. • The Ozone Client sends container commands to the leader datanode using the Ratis protocol (gRPC as the underlying RPC)
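The read-only marking can be pictured as a lookup that decides which client call a command goes through. This is a sketch only: the command spellings and the returned labels are illustrative strings, not Ozone or Ratis identifiers.

```python
# Commands that do not change datanode state (spellings are illustrative).
READ_ONLY_COMMANDS = frozenset({"GetKey", "ReadChunk", "ReadContainer"})


def is_read_only(command):
    """Return True if the command leaves the datanode state unchanged."""
    return command in READ_ONLY_COMMANDS


def choose_client_call(command):
    """Pick the kind of client call for a command.

    Read-only commands need no new log entry to be replicated, so they
    can take the read-only path; everything else takes the write path.
    """
    return "send_read_only" if is_read_only(command) else "send"
```

Treating unknown commands as state-changing is the conservative default here, since wrongly routing a write through the read-only path would skip replication.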
  • 27. Command Replication on Containers • [Diagram: a Leader and two Followers; labels: Write Chunk, CSM, Response]
  • 28. Open Container Replication using Ratis • Ratis is used to replicate data being written to Ozone datanodes. • Ratis replicates container commands on open containers. • The Ozone datanode provides its own state machine implementation • This implementation handles the various datanode commands (write chunk, put key, create container) • Performance optimizations • To avoid writing data to disk twice, the state machine implementation separates user data from block/chunk metadata. • Multiple chunks are written in parallel. • Append requests from the Leader to the Followers are asynchronous, which allows multiple appends in parallel. • Raft journal on a separate disk – fast contiguous writes without seeking
  • 29. Ozone Data Write Performance • The performance numbers were taken for different key sizes and 10 client writes in parallel. • The numbers measure end-to-end throughput • Key allocation in OM and block allocation in SCM are also included in the total throughput. • Ozone Client • Uses sync APIs to write data to the datanodes • ContainerStateMachine implementation • Parallelizes write chunk operations • Results: key size 10 MB: 81.3 MB/s; key size 100 MB: 110.3 MB/s
  • 30. Summary • Ratis is a Java-based implementation of the Raft protocol • Essentially a replicated state machine. • Suitable for data intensive applications. • Features • Sync/async client APIs • Pluggable StateMachine • Pluggable Raft log implementation • Performance • Write throughput: 250–300 MB/s • IOPS: 10,000 txns/s
  • 31. Contributors • A big thanks to all the contributors to Apache Ratis, Apache Hadoop and Ozone • Animesh Trivedi, Anu Engineer, Arpit Agarwal, Brent, Chen Liang, Chris Nauroth, Devaraj Das, Enis Soztutar, garvit, Hanisha Koneru, Hugo Louro, Jakob Homan, Jian He, Jing Chen, Jing Zhao, Jitendra Pandey, Junping Du, kaiyangzhang, Karl Heinz Marbaise, Li Lu, Lokesh Jain, Marton Elek, Mayank Bansal, Mingliang Liu, Mukul Kumar Singh, Sen Zhang, Shashikant Banerjee, Sriharsha Chintalapani, Tsz Wo Nicholas Sze, Uma Maheswara Rao G, Venkat Ranganathan, Wangda Tan, Weiqing Yang, Will Xu, Xiaobing Zhou, Xiaoyu Yao, Yubo Xu, yue liu, Zhiyuan Yang
  • 32. Apache Ratis & Apache Hadoop Ozone • Contributions are welcome! • Ratis • http://ratis.incubator.apache.org • [email protected] • Ozone • http://hadoop.apache.org • [email protected]
  • 33. Questions?
  • 34. Thank you

Editor's Notes

  • #2: TALK TRACK Hortonworks Powers the Future of Data: data-in-motion, data-at-rest, and Modern Data Applications. [NEXT SLIDE]