SlideShare a Scribd company logo
hbaseconasia2019 HBase at Tencent
HBase At Tencent
Andrew Cheng | 程广旭
Tencent | HBase Committer
Content
01. HBase Service In Tencent
02. Applications
03. Practices & Optimization
01. HBase Service
In Tencent
HBase Story in Tencent
l Began using since 2013
l Used version
l 0.94.17 -> 0.98.6 -> 1.2.5 -> 2.2.0 (ing)
l Largest cluster more than 500 nodes
90+
Clusters
4000+
Nodes
10PB+
Data
3Tri+
RPD
Overview
HBase Users come from 6 groups , more than 100+ different applications
Architecture
Tencent HBase Zookeeper
OpenTSDB
S2Graph
Spark Tookit
HBase Api
TDBank
Lhotse
RestServer
ThriftServer
Kylin
Phoenix Tenpay
Doss
monitoring
TNM2
Deploy CenterWepay Game
Advertiseme
nt
…
02. Applications
Tencent Ads – Real-Time Logjoin System
Mixer Exposure
TDBank
Tencent HBase
Model learning Freshness Budget control Report
Association Table
Flow Table
Click …
LogJoin LogJoin LogJoin LogJoin
Data Source
Transport
Logical
Storage
Consumer
Tenpay - Transaction record
Data Source MySQL
Binlog Paser DBSync
Cache Hippo
Storage Tencent HBase
Thrift Server
Application
C++
Read
Read
Write
Application
JAVA
Read
Write TDSort
03. Practices &
Optimization
Practices–Data migration
add_peer
disable_peer
Set REPLICATION_SCOPE => '1'
snapshot clone_snapshot
Set REPLICATION_SCOPE => '0'
Check Dataenable_peer
Client switch to new cluster
Cluster A Cluster B
ExportSnapshot
delete_snapshot
Business-insensitive data migration
Practices–Table
l Create table per day
l Large amount of data
l TTL is short
l Benefit
l Reduce the amount of data in compaction
l Easy to delete expired data
Optimization - Bandwidth
② RS2 and RS3 Wal data
① Input Data
③ RS2 and RS3 Flush data
⑤ RS2 and RS3 Large compact
④ RS2 and RS3 Small compact
RS1 RS2 RS3
Input Data
Wal
Flush
①
Small compact
Large compact
②
③
④
⑤
Input Data Input Data
Optimization - Bandwidth
l Enable compressing of CellBlocks
l Wal compressor
l Increase the size of memstore
l Reduce the number of threads about compaction
l Turn off major compaction
l create tables by day
Optimization - Online filtering of dirty data
l A large amount of data which have the same Rowkey
l How to find filter rowkeys?
l ResponseTooSlow
l How to set filter rowkeys?
l hbase.hregion.filter.rowkeys
l How to refresh filter rowkeys?
l update_config
Input Data
Filter
Enable
Write
Filter
Yes
Yes
No
No
Optimization - Prefix Bloom Filter(HBASE-20636)
l ROWPREFIX_FIXED_LENGTH
l ROWPREFIX_DELIMITER
uin ts action
Bloom Filter
Prefix
Create Table:
File info:
Optimization - Prefix Bloom Filter(HBASE-20636)
Scan
Not Filter StoreFile
Same
prefix?
{StartKey,EndKey}
Computer hash
value
Hit
BloomFilter?
Prefix length
>=
prefix_length
Yes
Yes
No
Filter StoreFile
No
No
Get prefix key by
prefix_length
Yes
Read
Rowkey
Get prefix key by prefix_length
Computer hash value
Set BloomFilter
Last line?
Input Data
Write BloomFilter information to StoreFile metadata
Yes
No
Write
Optimization - RestServer
RestServer A
Cluster A Cluster B Cluster C
RestServer CRestServer B RestServer D
User
Nginx
Optimization - RestServer
RestServer A
Cluster A Cluster B Cluster C
RestServer CRestServer B
User
Nginx
Mysql
Optimization - RestServer
l Only maintain one configuration
l use effectively resources
l User-friendly access
HBase Community
l 1 Committer, 2 Contributor
l Total commits: 80+
l Feature
l HBASE-20636 Introduce two bloom filter type : ROWPREFIX_FIXED_LENGTH and ROWPREFIX_DELIMITED
l HBASE-19799 Add web UI to rsgroup
l HBASE-20243 [Shell] Add shell command to create a new table by cloning the existent table
l HBASE-19483 Add proper privilege check for rsgroup commands
l ………
Join Us
Personal WechatDept. Wechat
Thanks!
Ad

Recommended

hbaseconasia2019 Recent work on HBase at Pinterest
hbaseconasia2019 Recent work on HBase at Pinterest
Michael Stack
 
HBaseConAsia2018 Keynote1: Apache HBase Project Status
HBaseConAsia2018 Keynote1: Apache HBase Project Status
Michael Stack
 
HBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC time
HBaseConAsia2018 Track1-1: Use CCSMap to improve HBase YGC time
Michael Stack
 
HBaseConAsia2018 Track3-3: HBase at China Life Insurance
HBaseConAsia2018 Track3-3: HBase at China Life Insurance
Michael Stack
 
HBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBase
HBaseConAsia2018: Track2-5: JanusGraph-Distributed graph database with HBase
Michael Stack
 
HBaseConAsia2018 Track2-6: Scaling 30TB's of data lake with Apache HBase and ...
HBaseConAsia2018 Track2-6: Scaling 30TB's of data lake with Apache HBase and ...
Michael Stack
 
Hadoop Networking at Datasift
Hadoop Networking at Datasift
huguk
 
Роман Новиков "Best Practices for MySQL Performance & Troubleshooting with th...
Роман Новиков "Best Practices for MySQL Performance & Troubleshooting with th...
Fwdays
 
Amazon RedShift - Ianni Vamvadelis
Amazon RedShift - Ianni Vamvadelis
huguk
 
Amazon Elastic Map Reduce - Ian Meyers
Amazon Elastic Map Reduce - Ian Meyers
huguk
 
HBaseCon 2015: HBase Operations in a Flurry
HBaseCon 2015: HBase Operations in a Flurry
HBaseCon
 
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
Michael Stack
 
Володимир Цап "Constraint driven infrastructure - scale or tune?"
Володимир Цап "Constraint driven infrastructure - scale or tune?"
Fwdays
 
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
Fwdays
 
Rolling Out Apache HBase for Mobile Offerings at Visa
Rolling Out Apache HBase for Mobile Offerings at Visa
HBaseCon
 
Argus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce
HBaseCon
 
HBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon
 
HBaseConAsia2018 Track3-7: The application of HBase in New Energy Vehicle Mon...
HBaseConAsia2018 Track3-7: The application of HBase in New Energy Vehicle Mon...
Michael Stack
 
HBaseConAsia2018 Track2-2: Apache Kylin on HBase: Extreme OLAP for big data
HBaseConAsia2018 Track2-2: Apache Kylin on HBase: Extreme OLAP for big data
Michael Stack
 
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
HBaseCon
 
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
HBaseCon
 
Introduction to streaming and messaging flume,kafka,SQS,kinesis
Introduction to streaming and messaging flume,kafka,SQS,kinesis
Omid Vahdaty
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at Didi
HBaseCon
 
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Redis Labs
 
Ameya Kanitkar: Using Hadoop and HBase to Personalize Web, Mobile and Email E...
Ameya Kanitkar: Using Hadoop and HBase to Personalize Web, Mobile and Email E...
WebExpo
 
Big dataarchitecturesandecosystem+nosql
Big dataarchitecturesandecosystem+nosql
Khanderao Kand
 
Stratebi Big Data
Stratebi Big Data
Stratebi
 
Hadoop and Pig at Twitter__HadoopSummit2010
Hadoop and Pig at Twitter__HadoopSummit2010
Yahoo Developer Network
 
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Cloudera, Inc.
 
Gaming SEC Filings Using Machine Learning to Detect Vectors and Sentiment in ...
Gaming SEC Filings Using Machine Learning to Detect Vectors and Sentiment in ...
Safe Software
 

More Related Content

What's hot (16)

Amazon RedShift - Ianni Vamvadelis
Amazon RedShift - Ianni Vamvadelis
huguk
 
Amazon Elastic Map Reduce - Ian Meyers
Amazon Elastic Map Reduce - Ian Meyers
huguk
 
HBaseCon 2015: HBase Operations in a Flurry
HBaseCon 2015: HBase Operations in a Flurry
HBaseCon
 
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
Michael Stack
 
Володимир Цап "Constraint driven infrastructure - scale or tune?"
Володимир Цап "Constraint driven infrastructure - scale or tune?"
Fwdays
 
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
Fwdays
 
Rolling Out Apache HBase for Mobile Offerings at Visa
Rolling Out Apache HBase for Mobile Offerings at Visa
HBaseCon
 
Argus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce
HBaseCon
 
HBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon
 
HBaseConAsia2018 Track3-7: The application of HBase in New Energy Vehicle Mon...
HBaseConAsia2018 Track3-7: The application of HBase in New Energy Vehicle Mon...
Michael Stack
 
HBaseConAsia2018 Track2-2: Apache Kylin on HBase: Extreme OLAP for big data
HBaseConAsia2018 Track2-2: Apache Kylin on HBase: Extreme OLAP for big data
Michael Stack
 
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
HBaseCon
 
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
HBaseCon
 
Introduction to streaming and messaging flume,kafka,SQS,kinesis
Introduction to streaming and messaging flume,kafka,SQS,kinesis
Omid Vahdaty
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at Didi
HBaseCon
 
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Redis Labs
 
Amazon RedShift - Ianni Vamvadelis
Amazon RedShift - Ianni Vamvadelis
huguk
 
Amazon Elastic Map Reduce - Ian Meyers
Amazon Elastic Map Reduce - Ian Meyers
huguk
 
HBaseCon 2015: HBase Operations in a Flurry
HBaseCon 2015: HBase Operations in a Flurry
HBaseCon
 
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud
Michael Stack
 
Володимир Цап "Constraint driven infrastructure - scale or tune?"
Володимир Цап "Constraint driven infrastructure - scale or tune?"
Fwdays
 
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
Fwdays
 
Rolling Out Apache HBase for Mobile Offerings at Visa
Rolling Out Apache HBase for Mobile Offerings at Visa
HBaseCon
 
Argus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce
HBaseCon
 
HBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon
 
HBaseConAsia2018 Track3-7: The application of HBase in New Energy Vehicle Mon...
HBaseConAsia2018 Track3-7: The application of HBase in New Energy Vehicle Mon...
Michael Stack
 
HBaseConAsia2018 Track2-2: Apache Kylin on HBase: Extreme OLAP for big data
HBaseConAsia2018 Track2-2: Apache Kylin on HBase: Extreme OLAP for big data
Michael Stack
 
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
HBaseCon 2015: Optimizing HBase for the Cloud in Microsoft Azure HDInsight
HBaseCon
 
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
HBaseCon
 
Introduction to streaming and messaging flume,kafka,SQS,kinesis
Introduction to streaming and messaging flume,kafka,SQS,kinesis
Omid Vahdaty
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at Didi
HBaseCon
 
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Redis Labs
 

Similar to hbaseconasia2019 HBase at Tencent (20)

Ameya Kanitkar: Using Hadoop and HBase to Personalize Web, Mobile and Email E...
Ameya Kanitkar: Using Hadoop and HBase to Personalize Web, Mobile and Email E...
WebExpo
 
Big dataarchitecturesandecosystem+nosql
Big dataarchitecturesandecosystem+nosql
Khanderao Kand
 
Stratebi Big Data
Stratebi Big Data
Stratebi
 
Hadoop and Pig at Twitter__HadoopSummit2010
Hadoop and Pig at Twitter__HadoopSummit2010
Yahoo Developer Network
 
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Cloudera, Inc.
 
Gaming SEC Filings Using Machine Learning to Detect Vectors and Sentiment in ...
Gaming SEC Filings Using Machine Learning to Detect Vectors and Sentiment in ...
Safe Software
 
Planning your Next-Gen Change Data Capture (CDC) Architecture in 2019 - Strea...
Planning your Next-Gen Change Data Capture (CDC) Architecture in 2019 - Strea...
Impetus Technologies
 
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Etu Solution
 
Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010
nzhang
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Yahoo Developer Network
 
Admin Tech Clash: Discussing Best (and Worst) Administration Practices from ...
Admin Tech Clash: Discussing Best (and Worst) Administration Practices from ...
Christoph Adler
 
Admin Tech Clash: Discussing Best (and Worst) Administration Practices from ...
Admin Tech Clash: Discussing Best (and Worst) Administration Practices from ...
Christoph Adler
 
Enterprise Data Lakes
Enterprise Data Lakes
Farid Gurbanov
 
History of Apache Pinot
History of Apache Pinot
Kishore Gopalakrishna
 
Web performance optimization
Web performance optimization
Kaliop-slide
 
PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase
PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase
DataWorks Summit
 
OSMC 2013 | openTSDB - metrics for a distributed world
OSMC 2013 | openTSDB - metrics for a distributed world
NETWAYS
 
Membase Meetup 2010
Membase Meetup 2010
Membase
 
Hadoop World 2011: Hadoop’s Life in Enterprise Systems - Y Masatani, NTTData
Hadoop World 2011: Hadoop’s Life in Enterprise Systems - Y Masatani, NTTData
Cloudera, Inc.
 
hbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Netease
HBaseCon
 
Ameya Kanitkar: Using Hadoop and HBase to Personalize Web, Mobile and Email E...
Ameya Kanitkar: Using Hadoop and HBase to Personalize Web, Mobile and Email E...
WebExpo
 
Big dataarchitecturesandecosystem+nosql
Big dataarchitecturesandecosystem+nosql
Khanderao Kand
 
Stratebi Big Data
Stratebi Big Data
Stratebi
 
Hadoop and Pig at Twitter__HadoopSummit2010
Hadoop and Pig at Twitter__HadoopSummit2010
Yahoo Developer Network
 
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Cloudera, Inc.
 
Gaming SEC Filings Using Machine Learning to Detect Vectors and Sentiment in ...
Gaming SEC Filings Using Machine Learning to Detect Vectors and Sentiment in ...
Safe Software
 
Planning your Next-Gen Change Data Capture (CDC) Architecture in 2019 - Strea...
Planning your Next-Gen Change Data Capture (CDC) Architecture in 2019 - Strea...
Impetus Technologies
 
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Etu Solution
 
Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010
nzhang
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Yahoo Developer Network
 
Admin Tech Clash: Discussing Best (and Worst) Administration Practices from ...
Admin Tech Clash: Discussing Best (and Worst) Administration Practices from ...
Christoph Adler
 
Admin Tech Clash: Discussing Best (and Worst) Administration Practices from ...
Admin Tech Clash: Discussing Best (and Worst) Administration Practices from ...
Christoph Adler
 
Web performance optimization
Web performance optimization
Kaliop-slide
 
PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase
PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase
DataWorks Summit
 
OSMC 2013 | openTSDB - metrics for a distributed world
OSMC 2013 | openTSDB - metrics for a distributed world
NETWAYS
 
Membase Meetup 2010
Membase Meetup 2010
Membase
 
Hadoop World 2011: Hadoop’s Life in Enterprise Systems - Y Masatani, NTTData
Hadoop World 2011: Hadoop’s Life in Enterprise Systems - Y Masatani, NTTData
Cloudera, Inc.
 
hbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Netease
HBaseCon
 
Ad

More from Michael Stack (20)

hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
Michael Stack
 
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
Michael Stack
 
hbaseconasia2019 HBase at Didi
hbaseconasia2019 HBase at Didi
Michael Stack
 
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
Michael Stack
 
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
Michael Stack
 
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
Michael Stack
 
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
Michael Stack
 
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
Michael Stack
 
hbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 OpenTSDB at Xiaomi
Michael Stack
 
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
Michael Stack
 
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
Michael Stack
 
hbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solution
Michael Stack
 
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
Michael Stack
 
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
Michael Stack
 
hbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 BDS: A data synchronization platform for HBase
Michael Stack
 
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
Michael Stack
 
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
Michael Stack
 
HBaseConAsia2019 Keynote
HBaseConAsia2019 Keynote
Michael Stack
 
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
Michael Stack
 
HBaseConAsia2018 Track1-3: HBase at Xiaomi
HBaseConAsia2018 Track1-3: HBase at Xiaomi
Michael Stack
 
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
Michael Stack
 
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
Michael Stack
 
hbaseconasia2019 HBase at Didi
hbaseconasia2019 HBase at Didi
Michael Stack
 
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
Michael Stack
 
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
Michael Stack
 
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
Michael Stack
 
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
Michael Stack
 
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
Michael Stack
 
hbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 OpenTSDB at Xiaomi
Michael Stack
 
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
Michael Stack
 
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
Michael Stack
 
hbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solution
Michael Stack
 
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
Michael Stack
 
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
Michael Stack
 
hbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 BDS: A data synchronization platform for HBase
Michael Stack
 
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
Michael Stack
 
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
Michael Stack
 
HBaseConAsia2019 Keynote
HBaseConAsia2019 Keynote
Michael Stack
 
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
Michael Stack
 
HBaseConAsia2018 Track1-3: HBase at Xiaomi
HBaseConAsia2018 Track1-3: HBase at Xiaomi
Michael Stack
 
Ad

Recently uploaded (20)

IAREUOUSTPIDWHY$)CHARACTERARERWUEEJJSKWNSND
IAREUOUSTPIDWHY$)CHARACTERARERWUEEJJSKWNSND
notgachabite123
 
最新版美国特拉华大学毕业证(UDel毕业证书)原版定制
最新版美国特拉华大学毕业证(UDel毕业证书)原版定制
taqyea
 
Slides: Eco Economic Epochs for The World Game (s) pdf
Slides: Eco Economic Epochs for The World Game (s) pdf
Steven McGee
 
原版澳洲斯文本科技大学毕业证(SUT毕业证书)如何办理
原版澳洲斯文本科技大学毕业证(SUT毕业证书)如何办理
taqyed
 
The ARUBA Kind of new Proposal Umum .pptx
The ARUBA Kind of new Proposal Umum .pptx
andiwarneri
 
Transmission Control Protocol (TCP) and Starlink
Transmission Control Protocol (TCP) and Starlink
APNIC
 
BroadLink Cloud Service introduction.pdf
BroadLink Cloud Service introduction.pdf
DevendraDwivdi1
 
Logging and Automated Alerting Webinar.pdf
Logging and Automated Alerting Webinar.pdf
ControlCase
 
原版一样(ANU毕业证书)澳洲澳大利亚国立大学毕业证在线购买
原版一样(ANU毕业证书)澳洲澳大利亚国立大学毕业证在线购买
Taqyea
 
Paper: The World Game (s) Great Redesign.pdf
Paper: The World Game (s) Great Redesign.pdf
Steven McGee
 
Make DDoS expensive for the threat actors
Make DDoS expensive for the threat actors
APNIC
 
ChatGPT A.I. Powered Chatbot and Popularization.pdf
ChatGPT A.I. Powered Chatbot and Popularization.pdf
StanleySamson1
 
Clive Dickens RedTech Public Copy - Collaborate or Die
Clive Dickens RedTech Public Copy - Collaborate or Die
Clive Dickens
 
history of internet in nepal Class-8 (sparsha).pptx
history of internet in nepal Class-8 (sparsha).pptx
SPARSH508080
 
DDoS in India, presented at INNOG 8 by Dave Phelan
DDoS in India, presented at INNOG 8 by Dave Phelan
APNIC
 
TCP/IP presentation SET2- Information Systems
TCP/IP presentation SET2- Information Systems
agnesegtcagliero
 
BASICS OF SAP _ ALL ABOUT SAP _WHY SAP OVER ANY OTHER ERP SYSTEM
BASICS OF SAP _ ALL ABOUT SAP _WHY SAP OVER ANY OTHER ERP SYSTEM
AhmadAli716831
 
PROCESS FOR CREATION OF BUSINESS PARTNER IN SAP
PROCESS FOR CREATION OF BUSINESS PARTNER IN SAP
AhmadAli716831
 
原版一样(ISM毕业证书)德国多特蒙德国际管理学院毕业证多少钱
原版一样(ISM毕业证书)德国多特蒙德国际管理学院毕业证多少钱
taqyed
 
最新版加拿大奎斯特大学毕业证(QUC毕业证书)原版定制
最新版加拿大奎斯特大学毕业证(QUC毕业证书)原版定制
taqyed
 
IAREUOUSTPIDWHY$)CHARACTERARERWUEEJJSKWNSND
IAREUOUSTPIDWHY$)CHARACTERARERWUEEJJSKWNSND
notgachabite123
 
最新版美国特拉华大学毕业证(UDel毕业证书)原版定制
最新版美国特拉华大学毕业证(UDel毕业证书)原版定制
taqyea
 
Slides: Eco Economic Epochs for The World Game (s) pdf
Slides: Eco Economic Epochs for The World Game (s) pdf
Steven McGee
 
原版澳洲斯文本科技大学毕业证(SUT毕业证书)如何办理
原版澳洲斯文本科技大学毕业证(SUT毕业证书)如何办理
taqyed
 
The ARUBA Kind of new Proposal Umum .pptx
The ARUBA Kind of new Proposal Umum .pptx
andiwarneri
 
Transmission Control Protocol (TCP) and Starlink
Transmission Control Protocol (TCP) and Starlink
APNIC
 
BroadLink Cloud Service introduction.pdf
BroadLink Cloud Service introduction.pdf
DevendraDwivdi1
 
Logging and Automated Alerting Webinar.pdf
Logging and Automated Alerting Webinar.pdf
ControlCase
 
原版一样(ANU毕业证书)澳洲澳大利亚国立大学毕业证在线购买
原版一样(ANU毕业证书)澳洲澳大利亚国立大学毕业证在线购买
Taqyea
 
Paper: The World Game (s) Great Redesign.pdf
Paper: The World Game (s) Great Redesign.pdf
Steven McGee
 
Make DDoS expensive for the threat actors
Make DDoS expensive for the threat actors
APNIC
 
ChatGPT A.I. Powered Chatbot and Popularization.pdf
ChatGPT A.I. Powered Chatbot and Popularization.pdf
StanleySamson1
 
Clive Dickens RedTech Public Copy - Collaborate or Die
Clive Dickens RedTech Public Copy - Collaborate or Die
Clive Dickens
 
history of internet in nepal Class-8 (sparsha).pptx
history of internet in nepal Class-8 (sparsha).pptx
SPARSH508080
 
DDoS in India, presented at INNOG 8 by Dave Phelan
DDoS in India, presented at INNOG 8 by Dave Phelan
APNIC
 
TCP/IP presentation SET2- Information Systems
TCP/IP presentation SET2- Information Systems
agnesegtcagliero
 
BASICS OF SAP _ ALL ABOUT SAP _WHY SAP OVER ANY OTHER ERP SYSTEM
BASICS OF SAP _ ALL ABOUT SAP _WHY SAP OVER ANY OTHER ERP SYSTEM
AhmadAli716831
 
PROCESS FOR CREATION OF BUSINESS PARTNER IN SAP
PROCESS FOR CREATION OF BUSINESS PARTNER IN SAP
AhmadAli716831
 
原版一样(ISM毕业证书)德国多特蒙德国际管理学院毕业证多少钱
原版一样(ISM毕业证书)德国多特蒙德国际管理学院毕业证多少钱
taqyed
 
最新版加拿大奎斯特大学毕业证(QUC毕业证书)原版定制
最新版加拿大奎斯特大学毕业证(QUC毕业证书)原版定制
taqyed
 

hbaseconasia2019 HBase at Tencent

  • 2. HBase At Tencent Andrew Cheng | 程广旭 Tencent | HBase Committer
  • 3. Content 01. HBase Service In Tencent 02. Applications 03. Practices & Optimization
  • 5. HBase Story in Tencent l Began using since 2013 l Used version l 0.94.17 -> 0.98.6 -> 1.2.5 -> 2.2.0 (ing) l Largest cluster more than 500 nodes 90+ Clusters 4000+ Nodes 10PB+ Data 3Tri+ RPD
  • 6. Overview HBase Users come from 6 groups , more than 100+ different applications
  • 7. Architecture Tencent HBase Zookeeper OpenTSDB S2Graph Spark Tookit HBase Api TDBank Lhotse RestServer ThriftServer Kylin Phoenix Tenpay Doss monitoring TNM2 Deploy CenterWepay Game Advertiseme nt …
  • 9. Tencent Ads – Real-Time Logjoin System Mixer Exposure TDBank Tencent HBase Model learning Freshness Budget control Report Association Table Flow Table Click … LogJoin LogJoin LogJoin LogJoin Data Source Transport Logical Storage Consumer
  • 10. Tenpay - Transaction record Data Source MySQL Binlog Paser DBSync Cache Hippo Storage Tencent HBase Thrift Server Application C++ Read Read Write Application JAVA Read Write TDSort
  • 12. Practices–Data migration add_peer disable_peer Set REPLICATION_SCOPE => '1' snapshot clone_snapshot Set REPLICATION_SCOPE => '0' Check Dataenable_peer Client switch to new cluster Cluster A Cluster B ExportSnapshot delete_snapshot Business-insensitive data migration
  • 13. Practices–Table l Create table per day l Large amount of data l TTL is short l Benefit l Reduce the amount of data in compaction l Easy to delete expired data
  • 14. Optimization - Bandwidth ② RS2 and RS3 Wal data ① Input Data ③ RS2 and RS3 Flush data ⑤ RS2 and RS3 Large compact ④ RS2 and RS3 Small compact RS1 RS2 RS3 Input Data Wal Flush ① Small compact Large compact ② ③ ④ ⑤ Input Data Input Data
  • 15. Optimization - Bandwidth l Enable compressing of CellBlocks l Wal compressor l Increase the size of memstore l Reduce the number of threads about compaction l Turn off major compaction l create tables by day
  • 16. Optimization - Online filtering of dirty data l A large amount of data which have the same Rowkey l How to find filter rowkeys? l ResponseTooSlow l How to set filter rowkeys? l hbase.hregion.filter.rowkeys l How to refresh filter rowkeys? l update_config Input Data Filter Enable Write Filter Yes Yes No No
  • 17. Optimization - Prefix Bloom Filter(HBASE-20636) l ROWPREFIX_FIXED_LENGTH l ROWPREFIX_DELIMITER uin ts action Bloom Filter Prefix Create Table: File info:
  • 18. Optimization - Prefix Bloom Filter(HBASE-20636) Scan Not Filter StoreFile Same prefix? {StartKey,EndKey} Computer hash value Hit BloomFilter? Prefix length >= prefix_length Yes Yes No Filter StoreFile No No Get prefix key by prefix_length Yes Read Rowkey Get prefix key by prefix_length Computer hash value Set BloomFilter Last line? Input Data Write BloomFilter information to StoreFile metadata Yes No Write
  • 19. Optimization - RestServer RestServer A Cluster A Cluster B Cluster C RestServer CRestServer B RestServer D User Nginx
  • 20. Optimization - RestServer RestServer A Cluster A Cluster B Cluster C RestServer CRestServer B User Nginx Mysql
  • 21. Optimization - RestServer l Only maintain one configuration l use effectively resources l User-friendly access
  • 22. HBase Community l 1 Committer, 2 Contributor l Total commits: 80+ l Feature l HBASE-20636 Introduce two bloom filter type : ROWPREFIX_FIXED_LENGTH and ROWPREFIX_DELIMITED l HBASE-19799 Add web UI to rsgroup l HBASE-20243 [Shell] Add shell command to create a new table by cloning the existent table l HBASE-19483 Add proper privilege check for rsgroup commands l ………