September 6, 2016
Achieving 100k Queries per
Hour with Hive on Tez
About Yahoo! JAPAN
The Largest Portal Site in Japan
65 billion pageviews / month
2.1 billion pageviews / day
YDN Report
What is YDN Report?
• Reporting for the Yahoo! Display Ad Network (YDN)
Batch Reporting over Massive Dataset
• 13 months, 800B+ rows of data
• Adding 3.3B+ rows of data per day
Highly Parallel Workload
• 100K reports per hour
YDN Report Query
Typical Query
• Queries are relatively simple
• They answer questions such as “How many clicks did I get last week?”
SELECT account, yyyymmdd,
sum(total_imps),
sum(total_click),
...
FROM table_x
WHERE yyyymmdd >= xxx
AND yyyymmdd < xxx
AND account = xxx
...
GROUP BY account, yyyymmdd, ...;
Test Setup
Hive Performance Recap
Hive is fast: interactive response
• ORC columnar file format
• Cost based optimizer (CBO)
• Vectorized SQL engine
• Tez execution engine (replacing MapReduce)
Hive 0.10 (batch processing) -> Hive 1.2 (human interactive, 5 seconds): 100-150x query speedup
Hive on Tez Query Execution
Query execution is essentially made up of the following steps:
• Client execution [ 0s if done correctly ]
• Optimization [HiveServer2] [~ 0.1s]
• Metadata lookups [HCatalog, Metastore] [ very fast in Hive 0.14 ]
• Application Master creation [4-5s]
• Container Allocation [3-5s]
• Tez task execution on YARN
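As a rough illustration, the fixed costs listed above can be added up to see how much time passes before any Tez task runs. This is only a sketch: it assumes the steps happen one after another with nothing pre-warmed or reused, it leaves out metadata lookups and task execution because the slide gives no numbers for them, and the names `STARTUP_COSTS_S` and `startup_overhead` are made up for the example.

```python
# Fixed per-query startup costs quoted on the slide, as (low, high) seconds.
# Steps without a number on the slide (metadata lookups, Tez task execution)
# are intentionally left out.
STARTUP_COSTS_S = {
    "client execution": (0.0, 0.0),
    "optimization (HiveServer2)": (0.1, 0.1),
    "application master creation": (4.0, 5.0),
    "container allocation": (3.0, 5.0),
}


def startup_overhead(costs):
    """Return the (low, high) sum of all listed startup costs in seconds."""
    low = sum(lo for lo, _ in costs.values())
    high = sum(hi for _, hi in costs.values())
    return low, high


low, high = startup_overhead(STARTUP_COSTS_S)
print(f"startup overhead: {low:.1f}s to {high:.1f}s before any task runs")
```

Under these assumptions, application master creation and container allocation account for nearly all of the overhead, which is why the later tuning concentrates on YARN.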
[Diagram: a client running the testing tool opens N connections to each of two HiveServer2 servers (Server #1, Server #2), which use the Metastore and Metastore DB and run Tez AMs and Tez containers on YARN and HDFS]
Mini Test
Mini Setup Tested
• 50 nodes
• 450B rows dataset
• Achieved 15K queries per hour
So, can we get 100K qph on 700 nodes?
We thought it should be easy, but…
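The expectation can be written down as a back-of-envelope projection. The sketch below assumes throughput grows in direct proportion to node count, which is exactly the assumption that turned out not to hold; it also ignores the larger production dataset and any shared services. The function name `linear_projection` is made up for the example.

```python
def linear_projection(base_qph, base_nodes, target_nodes):
    """Project throughput, assuming it grows in proportion to node count."""
    return base_qph * target_nodes / base_nodes


# Mini test result from the slide: 15K queries per hour on 50 nodes.
projected = linear_projection(base_qph=15_000, base_nodes=50, target_nodes=700)
print(f"projected on 700 nodes: {projected:,.0f} qph")
```

On paper, 700 nodes leave a wide margin over the 100K qph target, so any shortfall has to come from components that do not scale with the number of worker nodes.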
The Bottlenecks at Scale
Challenges at Scale
• Hive Metastore Server
• YARN Resource Manager
• Datanode Hotspot
• YARN ATS
Hive Metastore Server
Use Local Metastore
• Before: HS2 -> Metastore Server -> Metastore DB
• After: HS2 (local metastore) -> Metastore DB
• Throughput: 7.6K -> 22K qph
Pending Apps
YARN ResourceManager Scalability
• Too many pending apps
• Resolution: increase yarn.resourcemanager.amlauncher.thread-count
• Throughput: 22K -> 26K qph
Pending Containers
YARN ResourceManager Scalability
• Too many pending containers
• Resolution: increase tez.am-rm.heartbeat.interval-ms.max
• Throughput: 26K -> 72.5K qph
Datanode Hotspot
Last Hour Problem
• Connection timeout and disk access error
• Many queries access recently added data
• Resolution: increase the HDFS replication factor
• Throughput: 72.5K -> 95K qph
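To compare the tuning steps, the throughput figures quoted on the preceding slides can be lined up and turned into relative gains. This is just arithmetic on the slide numbers; the step labels are paraphrased and the helper name `gains` is made up for the example.

```python
# Throughput in queries per hour after each step, as quoted on the slides.
STEPS = [
    ("baseline (metastore server)", 7_600),
    ("local metastore", 22_000),
    ("more AM launcher threads", 26_000),
    ("higher max AM-RM heartbeat interval", 72_500),
    ("higher HDFS replication factor", 95_000),
]


def gains(steps):
    """Return (label, qph, gain over previous step, gain over baseline)."""
    baseline = steps[0][1]
    previous = baseline
    rows = []
    for label, qph in steps:
        rows.append((label, qph, qph / previous, qph / baseline))
        previous = qph
    return rows


rows = gains(STEPS)
for label, qph, step_gain, total_gain in rows:
    print(f"{label:38s} {qph:>7,d} qph  x{step_gain:.2f} step  x{total_gain:.2f} total")
```

The two largest single steps are the local metastore and the heartbeat interval change, each close to a 3x gain; end to end the throughput rises 12.5x.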
Other Tunings
Other Tunings We Did
• Container reuse timeout
• YARN capacity scheduler node locality delay
• Tez shuffle keep alive
• TCP fin_wait
Notes on YARN ATS
• Disabling YARN ATS gives higher throughput
• Trade-off: YARN log aggregation is lost
End of first half
Yohei Abe
@Yahoo! JAPAN
Real-life Hive LLAP at
Yahoo! JAPAN
Aug 2016
Agenda
• Hive LLAP at Yahoo! JAPAN
• Tuning
• Performance Result
• Future Work
Hive LLAP at Yahoo!
JAPAN
Hive on Tez
Hive on Tez is able to
produce 100K
reports/hour
Hive on Tez+LLAP
How does Hive on Tez+LLAP handle 100K reports?
• How many servers?
• What tuning?
What is LLAP
What is LLAP?
LLAP is for sub-second query processing
• Persistent daemons
• Caching data
[Diagram: with Tez, the Tez AppMaster runs work in Tez containers that are created dynamically; with Tez+LLAP, the Tez AppMaster runs work in LLAP daemons, which are persistent]
Basic Tuning
LLAP test cluster
Server node: Xeon E5-2660v2 2.2GHz / 2 CPUs / 128GB memory / 10GBase-T, 2 ports
Slave nodes: 45
HiveServer2 nodes: 10
Hadoop: 2.7.1
Hive: 2.1.0-snapshot
Tez: 0.8.3
Parameters
Some basic parameters need to be changed:
performance is very slow with the default values
Threading model
[Diagram: an executor thread pool, sized by hive.llap.daemon.num.executors, and an I/O thread pool, sized by hive.llap.io.threadpool.size, which reads the data]
Executor thread pool
hive.llap.daemon.num.executors (default 4)
• Number of JVM threads for query execution
• Set this to the number of vCPUs
• 40 in our case
Performance: executor thread
[Chart: executor threads vs. QPS (higher is better)]
4 executor threads (default): 5.49 QPS
40 executor threads (our CPU): 23.68 QPS
72 executor threads: 23.42 QPS
I/O thread pool
hive.llap.io.threadpool.size (default 10)
• Number of I/O threads
• Set this to the number of vCPUs
• 40 in our case
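The value 40 can be derived from the hardware of the test cluster. The sketch below assumes 10 cores per socket and 2 hardware threads per core for the Xeon E5-2660v2; those two figures come from the processor's published specification, not from the slides, and Hyper-Threading is assumed to be enabled. The helper name `logical_cpus` is made up for the example.

```python
def logical_cpus(sockets, cores_per_socket, threads_per_core):
    """Number of logical CPUs (vCPUs) the operating system sees on one node."""
    return sockets * cores_per_socket * threads_per_core


# 2 CPUs per node is from the slides; the per-socket core count and the
# threads per core are assumptions taken from the processor specification.
vcpus = logical_cpus(sockets=2, cores_per_socket=10, threads_per_core=2)

# Both pools are sized to the vCPU count, as the slides recommend.
llap_settings = {
    "hive.llap.daemon.num.executors": vcpus,
    "hive.llap.io.threadpool.size": vcpus,
}

for key, value in llap_settings.items():
    print(f"{key}={value}")
```

On a running node, Python's `os.cpu_count()` reports the same logical CPU count without needing the hardware specification.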
Performance: I/O thread
[Chart: I/O threads vs. QPS (higher is better)]
10 I/O threads (default): 12.82 QPS
40 I/O threads (our CPU): 23.42 QPS
Memory
hive.llap.daemon.memory.per.instance.mb: executor memory, JVM on-heap (java -Xmx …)
hive.llap.io.memory.size: I/O memory, JVM off-heap
Performance
(compared to Tez)
Performance: QPS
[Chart: QPS vs. number of clients (32 to 384) for LLAP and Tez]
100K / hour ?
LLAP on 45 nodes (test cluster)
max: 24 qps ≈ 87K queries/hour
70 nodes for 100K
(if it scales linearly)
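The conversion between queries per second and queries per hour used here is simple arithmetic. A small sketch follows; the function names are made up for the example.

```python
SECONDS_PER_HOUR = 3600


def qps_to_qph(qps):
    """Convert queries per second to queries per hour."""
    return qps * SECONDS_PER_HOUR


def qph_to_qps(qph):
    """Convert queries per hour to queries per second."""
    return qph / SECONDS_PER_HOUR


measured_qph = qps_to_qph(24)       # peak rate measured on the test cluster
required_qps = qph_to_qps(100_000)  # rate needed for the 100K/hour target

print(f"24 qps = {measured_qph:,d} queries/hour")
print(f"100K queries/hour = {required_qps:.1f} qps")
```

At exactly 24 qps the hourly figure is 86,400, close to the slide's ≈ 87K, and the 100K/hour target corresponds to roughly 27.8 qps sustained.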
Advanced Tuning
hive.llap.client.consistent.splits
false (default) => Use file locality to select the LLAP daemon
true => LLAP daemons are selected evenly (by hash distribution)
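The idea of selecting a daemon by hash instead of by file locality can be illustrated with a toy example. This is not Hive's implementation: it uses plain modulo hashing over a hypothetical list of daemon names and made-up split paths, only to show that a stable hash spreads splits across daemons regardless of where the HDFS blocks live.

```python
import hashlib
from collections import Counter


def pick_daemon(path, offset, daemons):
    """Pick a daemon from a stable hash of the split, ignoring block locality."""
    key = f"{path}:{offset}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return daemons[int.from_bytes(digest[:8], "big") % len(daemons)]


# Hypothetical daemon names and splits, for illustration only.
daemons = [f"llap-{i:02d}" for i in range(45)]
splits = [
    (f"/warehouse/table_x/part-{p:04d}.orc", block * 256 * 2**20)
    for p in range(200)
    for block in range(4)
]

load = Counter(pick_daemon(path, offset, daemons) for path, offset in splits)
counts = [load[d] for d in daemons]  # a Counter returns 0 for unused daemons
print(f"splits per daemon: min={min(counts)}, max={max(counts)}")
```

Because the same split always hashes to the same daemon, repeated queries over the same data also land where that data is likely to be cached already.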
Recap: LLAP
Each node runs both an LLAP daemon and a DataNode
hive.llap.client.consistent.splits
[Chart: QPS by hive.llap.client.consistent.splits setting (higher is better)]
false (default, use file locality): 17.5 QPS
true (no locality): 23.42 QPS
Future Work
Web UI
Web UI (HIVE-11526)
Each LLAP daemon exposes basic metrics on port 15002 (default)
Included in Hive 2.1
Contributed by Yahoo! JAPAN
Web UI (HIVE-14030)
HIVE-11526 covers only individual daemons
HIVE-14030 provides an aggregated view of an LLAP cluster (not yet in master)
Contributed by Yahoo! JAPAN
ACL
Hive Column-level ACL
[Diagram: SQL access through HS2 and LLAP, on top of YARN and HDFS]
GOAL: Column-level ACL
ANSWER(?): HiveServer2 can do it
Direct Access to HDFS
breaks everything
[Diagram: HS2 and LLAP on YARN and HDFS apply SQL Standard Based ACLs; M/R, Pig, and Spark reach HDFS under Storage Based Authorization and break those ACLs]
Direct access to HDFS (not through Hive) breaks the security model.
Other solutions (not only Hive) are necessary.
Future Directions
[Diagram: M/R, Pig, and Spark read through LlapInputFormat, which checks SQL Based ACLs with HS2 and LLAP on YARN and HDFS]
LlapInputFormat checks ACLs against HS2 on behalf of other applications.
HIVE-13441
HIVE-12991
see LlapDump.java
Summary
• Throughput is greatly improved by LLAP
• Some tuning is necessary
• LLAP is also effective for batch processing
Q & A
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 

Achieving 100k Queries per Hour on Hive on Tez

Editor's Notes

  • #2: Thank you for coming today. My name is Yohei Abe, from Yahoo! JAPAN. This presentation is given by two people, not only me. The talk consists of two parts, both related to Yahoo! JAPAN. The first part, from Mr. Jiang, is about the Hive on Tez use case at Yahoo! JAPAN. Mr. Jiang …. The last part, from me, is about the LLAP use case for the same dataset and queries. First, allow me to introduce myself. I'm an engineer at Yahoo! JAPAN working on Hadoop infrastructure, supporting our Hive and Hadoop systems.
  • #3: Yahoo! JAPAN is the largest portal site in Japan, providing many services such as weather, auctions, news, and more. Our site reaches 81% of all Japanese internet users, so it offers advertisers a place to advertise. We offer a variety of advertising solutions.
  • #4: YDN, the Yahoo Display Network, is one of those solutions. It uses Hive to generate the YDN report, which shows how many impressions, clicks, and views there were over a certain period of time. It contains useful information for advertisers. The point is that the data is massive. The source table from which the report is generated has 800 billion rows covering a 13-month period. The report generation job is a parallel batch workload, not interactive querying. We need to generate 100,000 reports per hour; this is our business and customer requirement.
  • #6: From a single client machine, we run 60K queries and calculate queries per hour (qph) from the result. Throughput = 60,000 queries * (60 / minutes taken to process 60K queries). For each cluster configuration change, we made several attempts with different concurrencies: 32, 64, 96, 128, 256, and higher.
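The throughput formula in this note can be written as a small helper. This is a minimal sketch; the function name and the 36-minute run are illustrative assumptions, not measurements from the talk.

```python
def queries_per_hour(total_queries: int, minutes_taken: float) -> float:
    """Throughput = total queries * (60 / minutes taken), as in the note."""
    if minutes_taken <= 0:
        raise ValueError("minutes_taken must be positive")
    # Multiply first, then divide, to keep the arithmetic exact for round inputs.
    return total_queries * 60.0 / minutes_taken


# Hypothetical run: 60K queries finishing in 36 minutes.
print(queries_per_hour(60_000, 36))
```

Read the other way round, the 100K qph target means the 60K-query test has to finish within 36 minutes.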
  • #8: Our queries were already highly optimized, so we focused on other parts. A query execution is essentially put together from: client execution [0s if done correctly]; optimization [HiveServer2] [~0.1s]; HCatalog lookups [HCatalog, Metastore] [very fast in Hive 0.14]; Application Master creation [4-5s]; container allocation [3-5s]; query execution.
  • #14: Pending apps decreased, but throughput did not improve much.
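To see why these fixed costs matter at scale, Little's law (average concurrency = arrival rate × time in system) can be applied to the startup figures above. This is a rough sketch that assumes every query pays both Application Master creation and container allocation, with no reuse; the function name is illustrative.

```python
def in_flight_queries(target_qph: float, latency_s: float) -> float:
    """Little's law: average concurrency = arrival rate * time in system."""
    return target_qph / 3600.0 * latency_s


# Startup overhead per query from the note: AM creation 4-5 s plus
# container allocation 3-5 s, so 7 s at best and 10 s at worst.
low = in_flight_queries(100_000, 4 + 3)
high = in_flight_queries(100_000, 5 + 5)
print(f"{low:.0f} to {high:.0f} queries in startup at any moment")
```

Under that assumption, roughly 190 to 280 queries would be sitting in startup at any moment at 100K qph, before any task has run.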
  • #16: Increased tez.am-rm.heartbeat.interval-ms.max from 250ms to 1000ms
  • #18: Increased replication factor for specific directory from 3 to 10
  • #23: OK, so LLAP. We are going to use LLAP for the YDN report.
  • #24: As Jiang said in his talk, Hive on Tez can produce 100K reports per hour. Our engineers found several bottlenecks and fixed them to meet the requirement, mostly by tuning parameters.
  • #25: The next step is LLAP. LLAP is a new Hive feature introduced in Hive 2.0, so we did a technical investigation, mainly to learn how LLAP handles the YDN report. How many servers are necessary? Which parameters need to be changed? Is it possible to generate 100K reports per hour?
  • #26: What is LLAP?
  • #27: I think LLAP has already been introduced in detail in other sessions and at previous Hadoop Summits, so here I'll talk only briefly about what it does. LLAP is for sub-second query processing, and its main component is the persistent daemon.
  • #28: Let me compare it with the Tez processing model; I think that's an easy way to understand what LLAP does differently. In the case of Tez, when the client submits a SQL query, an application master is created. This behavior is the same with LLAP. The application master then creates child Tez containers for computation. These are created dynamically and are not persistent. An LLAP daemon, on the other hand, is persistent. "Persistent" means it can be shared by multiple queries and users, as long as a query is not using private data. Persistence provides benefits such as avoiding startup cost, an intelligent cache, more effective JIT compilation by the JVM, and so on. Again, I won't go through the internals. If you're interested in them, please make sure to catch the talks by the core engineers from Hortonworks.
  • #29: From here, I’m going to talk about some tuning points and performance results.
  • #30: This is our LLAP cluster, used only for evaluation. The important point here is that 45 nodes are used for LLAP, which means 45 daemons are running. We also set up HiveServer2 so that it does not become a bottleneck.
  • #31: LLAP, like Hive, is configured through XML files. These files have many parameters. Some are basic ones that you need to change for performance, because some default values will not suit your system.
  • #32: These basic parameters relate to thread counts. This is a very simplified threading model of LLAP. LLAP has two main components, the executors and the IO layer. The IO layer reads data from HDFS, decodes the ORC data, converts it to vectorized data, and passes it to the executors. The executors get data from the IO layer, then compute and generate results. This data passing is completely asynchronous.
  • #33: The number of executor threads is specified by hive.llap.daemon.num.executors. The default is 4. You need to set this value according to your CPU vcores. In our case, it's 40.
  • #34: This chart shows the performance results. The vertical axis is queries per second; higher is better. The horizontal axis is the number of executor threads. The leftmost bar, the default of 4, is very slow, and the CPU is almost idle. The second bar is 40, which matches our CPU vcore count. No further improvement is observed when the value is larger than the vcore count, so it's good to set this value to the number of vcores.
  • #35: Next, the IO thread count can be specified by hive.llap.io.threadpool.size. The default is 10, which is also too small in our case.
  • #36: This chart shows the performance results when changing the IO thread count. The default does not perform well; it does not suit our CPUs. Performance is better when the value equals the vcore count. Following these executor and IO results, I set both values to the CPU vcore count for the performance tests on later slides.
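The rule of thumb from these two notes (set both thread counts to the vcore count) could be turned into configuration roughly as follows. This is a sketch only: the two property names come from the notes, while the helper names and the idea of generating the XML from a script are assumptions for illustration.

```python
from xml.sax.saxutils import escape


def llap_thread_properties(vcores: int) -> dict:
    """Set executor and IO thread counts to the node's vcore count."""
    if vcores < 1:
        raise ValueError("vcores must be at least 1")
    return {
        "hive.llap.daemon.num.executors": str(vcores),
        "hive.llap.io.threadpool.size": str(vcores),
    }


def to_hadoop_xml(props: dict) -> str:
    """Render properties in the usual Hadoop-style <configuration> layout."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


# 40 vcores, as on the evaluation cluster described in the notes.
props = llap_thread_properties(40)
print(to_hadoop_xml(props))
```

Whether the vcore count is the right value elsewhere depends on the hardware and workload; the notes only report it as the sweet spot on this cluster.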
  • #37: Memory. When it comes to memory, there are two parameters: one for the executors and one for the IO layer. One thing to note is that the executors use JVM on-heap memory, while the IO layer uses off-heap memory. The executor memory value is adjusted slightly by an internal process and passed as the -Xmx Java command-line parameter. There is no clear guideline on which values are effective; in our case, we split the physical memory equally between the two. If an LLAP daemon runs out of memory, you can see it in the LLAP Web UI, which I'll talk about later in these slides.
  • #38: Performance
  • #39: In this chart, the blue line is LLAP (Tez + LLAP) and the red line is Tez. The vertical axis is queries per second; higher is better. The horizontal axis is the number of clients; more clients means more concurrent queries. This chart indicates that LLAP always performs better than Tez, even for batch processing rather than interactive queries. (Note: match the scales of the charts.)
  • #40: Does the previous chart mean 100K per hour? We need 100K queries per hour for our ad report. From the chart, the maximum is about 24 qps, which is about 87,000 queries per hour using 45 LLAP daemons. It was very close, but the 45 nodes in our test environment are not enough. We calculated that, if LLAP scales almost linearly, 70 nodes are enough for 100K. That is far smaller than the Tez system. LLAP gives us really good performance.
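The linear-scaling estimate can be checked with a few lines of arithmetic. This sketch assumes perfectly linear scaling and takes the peak as exactly 24 qps, which the note only gives as approximate; the function name is illustrative.

```python
import math


def nodes_for_target(measured_qps: float, measured_nodes: int, target_qph: float) -> int:
    """Nodes needed for target_qph if throughput scales linearly with node count."""
    per_node_qph = measured_qps * 3600.0 / measured_nodes
    return math.ceil(target_qph / per_node_qph)


# 24 qps on 45 daemons is 86,400 queries per hour, close to the ~87,000 quoted.
print(24 * 3600)
print(nodes_for_target(24, 45, 100_000))
```

A strictly linear extrapolation comes out lower than the 70 nodes quoted in the talk, so the 70 presumably leaves headroom for less-than-linear scaling; that reading is an assumption, not something the notes state.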
  • #41: More tuning. We found one more parameter that can be effective in our case.
  • #42: The parameter is called client consistent splits. It takes a boolean value, and the default is false. The setting determines whether work assignment to LLAP daemons follows data locality, that is, whether the data is on the same machine as the LLAP daemon. Computation may be faster when an LLAP daemon uses local data instead of remote data. With the default, false, the Tez application master distributes computation based on file locality. With true, the Tez application master uses a kind of hash distribution to select the LLAP daemon. This means file locality is ignored and computation is distributed evenly across the LLAP cluster.
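The idea of choosing a daemon by hash instead of by locality can be illustrated with a simple modulo hash. This is not Hive's actual algorithm, only a sketch of the general technique; the daemon names, file path, and function name are hypothetical.

```python
import hashlib


def pick_daemon(file_path: str, offset: int, daemons: list) -> str:
    """Map a split to a daemon by hashing its identity, ignoring where the data lives."""
    key = f"{file_path}:{offset}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to the daemon count.
    index = int.from_bytes(digest[:8], "big") % len(daemons)
    return daemons[index]


# Hypothetical 45-daemon cluster and ORC file path.
daemons = [f"llap-{i}" for i in range(45)]
chosen = pick_daemon("/warehouse/table_x/part-00000.orc", 0, daemons)
print(chosen)
```

Because the mapping depends only on the split identity, the same split always lands on the same daemon, which helps the daemon's cache, and splits spread roughly evenly across daemons regardless of where the HDFS blocks sit.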
  • #43: Recap: each node runs an LLAP daemon and also a DataNode daemon.
  • #44: Here is the result. It's somewhat the opposite of what I expected: ignoring file locality is faster than the default setting. However, it depends on data size, table size, and so on. We don't think this result can be generalized, but in our case it was faster when I changed the value from the default.
  • #45: We have two items of future work. We are currently investigating and verifying these LLAP features.
  • #46: The first is the Web UI.
  • #47: The LLAP daemon exposes some basic metrics, such as memory footprint, CPU usage, and cache hit ratio, on a specific port. This feature is in Hive 2.1 and was contributed by Yahoo! JAPAN; thanks to my co-worker. It is really useful for cluster administrators. For example, when you cannot get good performance even on modern machines, LLAP may be misconfigured. In that case, you can use this UI to see how the daemon is working and what the cache hit rate is. In my case, I found through the UI that the number of executors was too small; the CPU was almost idle.
  • #48: In another JIRA ticket, this UI will be improved further. I think this is not yet included in the master branch. The ticket provides an aggregated view of the previous UI, so you can easily check the status of all cluster machines and all daemons.
  • #50: Column-level ACLs are really important for us, and I think for other companies as well. Of course, Hive can enforce them through HiveServer2: HiveServer2 and the metastore know which data should be exposed to which user.
  • #51: But in our environment we use not only Hive but also other products, such as MapReduce. They break the ACLs because they can read HDFS directly, without going through HiveServer2. If you need column-level ACLs, you have to use only Hive. But we need another solution; it is a must for us.
  • #52: LLAP provides a solution for this issue. It exposes LLAP as a storage layer, so products other than Hive can access data through it while the ACLs are kept. If you are interested, please see the JIRA ticket and LlapDump.java in the Hive repository on GitHub.