How to Automate Performance Tuning for Apache Spark


Jean-Yves Stephan & Julien Dumazert,
Founders of Data Mechanics
Automating performance tuning for Apache Spark

What is performance tuning?
Cluster parameters
● Size
● Instance type
● Number of processors
● Amount of memory
● Disks
● ...
Spark configurations
● Parallelism
● Shuffle
● Storage
● JVM tuning
● Feature flags
● ...

Why automate performance tuning?
Manual tuning is a loop: pick new params → run the job → analyze logs → repeat.
● Hard manual work
● Frequent outages
● Slow and expensive
In practice:
● Pager ringing at 3am
● 30% of your engineers' time
● Missing SLAs every week

Agenda
● Manual performance tuning
● Automated performance tuning

Perf tuning is an iterative process
For the first run
There are rules of thumb for some params:
• # of partitions: 3x the number of cores in the cluster
• # of cores per executor: 4-8
• memory per executor: 85% * (node memory / # of executors per node)
For the other params: make an educated guess!
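The rules of thumb above can be turned into a small calculator. This is only a sketch: the function name `first_run_config` and the example cluster (10 nodes, 16 cores and 64 GB each) are hypothetical, and it assumes every node is filled with identical executors.

```python
def first_run_config(num_nodes, cores_per_node, node_memory_gb, cores_per_executor=4):
    """Starting configuration derived from the rules of thumb (hypothetical helper)."""
    if not 4 <= cores_per_executor <= 8:
        raise ValueError("rule of thumb: 4-8 cores per executor")
    executors_per_node = cores_per_node // cores_per_executor
    total_cores = num_nodes * executors_per_node * cores_per_executor
    # 85% of the node memory share, leaving headroom for the OS and overhead
    memory_per_executor_gb = 0.85 * (node_memory_gb / executors_per_node)
    return {
        "num_partitions": 3 * total_cores,
        "cores_per_executor": cores_per_executor,
        "executors_per_node": executors_per_node,
        "memory_per_executor_gb": round(memory_per_executor_gb, 1),
    }


# Hypothetical cluster: 10 nodes with 16 cores and 64 GB of memory each
config = first_run_config(num_nodes=10, cores_per_node=16, node_memory_gb=64)
print(config)
```

The result is only a first guess; the iterative process described next refines it.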

Perf tuning is an iterative process
On the first attempt, the job crashes or does not meet the SLA. What to do?
Iterate: pick new params → run the job → analyze logs.
• Ensure stability of the job
• Solve performance issues
• Adjust speed-cost trade-off

Common issues: lack of parallelism
Configuration:
• 26 instances n1-highmem-16
• spark.executor.cores = 16
Symptom: only 8 cores used on each machine!
The reason:
• 26 executors
• spark.sql.shuffle.partitions = 200
→ 200 / 26 ≈ 7.7 tasks per executor
Fix: use 400+ partitions (duration and cost divided by 2).
See also Adaptive execution (SPARK-23128) for a way to dynamically and automatically set the number of partitions.
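The arithmetic behind this diagnosis can be checked in a few lines. The helper `core_utilization` is a hypothetical name, and the sketch assumes one core per task (the default `spark.task.cpus = 1`) and tasks spread evenly over executors.

```python
import math


def core_utilization(num_partitions, num_executors, cores_per_executor):
    """Return (tasks per executor, fraction of cores busy, number of task waves)."""
    total_cores = num_executors * cores_per_executor
    tasks_per_executor = num_partitions / num_executors
    # With fewer tasks than cores, some cores never receive a task
    busy_fraction = min(1.0, num_partitions / total_cores)
    waves = math.ceil(num_partitions / total_cores)
    return tasks_per_executor, busy_fraction, waves


# Configuration from the slide: 26 executors with 16 cores each
before = core_utilization(num_partitions=200, num_executors=26, cores_per_executor=16)
# One partition per core (416) satisfies the "400+ partitions" fix
after = core_utilization(num_partitions=416, num_executors=26, cores_per_executor=16)
print(before, after)
```

With 200 partitions, fewer than half of the 416 cores ever get a task, which matches the "only 8 cores used" observation.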

Common issues: shuffle spill
The deserialized data produced by the map stages in a shuffle does not fit in memory. Spark temporarily writes it to disk, which degrades performance.
Fixes:
1. Reduce the input data of each task by increasing the number of partitions
2. Increase the memory available to each task
– by increasing spark.executor.memory
– by decreasing the number of cores per executor
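A rough back-of-the-envelope estimate shows why both variants of the second fix help. This is a simplified sketch, not Spark's actual memory manager: it assumes the unified memory region is (heap - 300 MB) * `spark.memory.fraction` (default 0.6) and that running tasks share it evenly. The helper name and the example sizes are hypothetical.

```python
RESERVED_MB = 300        # heap memory Spark reserves for itself
MEMORY_FRACTION = 0.6    # default value of spark.memory.fraction


def approx_memory_per_task_mb(executor_memory_mb, executor_cores,
                              memory_fraction=MEMORY_FRACTION):
    """Rough upper bound on the memory one task can use (simplified model)."""
    unified_mb = (executor_memory_mb - RESERVED_MB) * memory_fraction
    # Assumption: one task per core, all tasks sharing the region equally
    return unified_mb / executor_cores


baseline = approx_memory_per_task_mb(executor_memory_mb=8192, executor_cores=8)
more_memory = approx_memory_per_task_mb(executor_memory_mb=16384, executor_cores=8)
fewer_cores = approx_memory_per_task_mb(executor_memory_mb=8192, executor_cores=4)
print(round(baseline), round(more_memory), round(fewer_cores))
```

Halving the cores per executor doubles the per-task share without paying for more memory, at the price of less parallelism per executor.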

Common issues: data skew
This issue is not addressable with parameter tuning. A change in code is required!
Change in code:
1. Find a better partition key if possible
2. Use a map-side (broadcast) join
3. Use a salted key
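To illustrate the salted-key idea without a Spark cluster, here is a plain-Python sketch on made-up data. The names `salt_key` and `NUM_SALTS` are hypothetical, and the dataset (one hot key holding 900 of 1000 rows) is invented for the example.

```python
from collections import Counter

NUM_SALTS = 8  # hypothetical value; choose it according to the degree of skew


def salt_key(key, row_index, num_salts=NUM_SALTS):
    """Spread one logical key over num_salts physical keys."""
    return f"{key}_{row_index % num_salts}"


# Invented skewed dataset: one hot key holds 900 of the 1000 rows
keys = ["hot"] * 900 + [f"k{i}" for i in range(100)]

rows_per_key = Counter(keys)
rows_per_salted_key = Counter(salt_key(k, i) for i, k in enumerate(keys))

largest_before = max(rows_per_key.values())
largest_after = max(rows_per_salted_key.values())
print(largest_before, largest_after)
```

The largest group shrinks by roughly the number of salts. In a real job the salt has to be undone afterwards: an aggregation needs a second pass over the unsalted key, and a join needs the other side replicated once per salt value.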

Improvements based on node metrics
● Low CPU usage ⇒ Consider oversubscription, i.e. telling Spark to schedule, say, 2x more tasks per executor than the number of cores
● Low memory usage ⇒ Consider pruning the excess memory and switching to CPU-intensive instances
● IO-bound queries ⇒ Consider switching instance type or Spark IO configurations such as compression or IO buffer sizes

Cost-speed trade-off
On the efficient frontier:
cheaper ⇒ longer
shorter ⇒ more expensive
Once performance issues are solved, adjust your trade-off given your needs.
[Figure: efficient frontier, annotated with "Solving performance issues" and "Adjusting cost-speed trade-off"]
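The efficient frontier can be computed from a set of measured runs as the runs that no other run beats on both cost and duration. The sketch below uses invented (cost, duration) pairs, and `efficient_frontier` is a hypothetical helper name.

```python
def efficient_frontier(runs):
    """Return the (cost, duration) pairs that no other run dominates."""
    frontier = []
    best_duration = float("inf")
    # Scan from cheapest to most expensive; keep a run only if it is
    # faster than every cheaper run seen so far
    for cost, duration in sorted(runs):
        if duration < best_duration:
            frontier.append((cost, duration))
            best_duration = duration
    return frontier


# Invented measurements: (cost in dollars, duration in minutes)
runs = [(10, 120), (12, 150), (20, 60), (25, 70), (40, 30), (40, 45)]
frontier = efficient_frontier(runs)
print(frontier)
```

Runs off the frontier, such as (12, 150), indicate a performance issue to solve; moving between points on the frontier is the cost-speed trade-off.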

Cost-speed trade-off
Example: impact of # of instances
[Figure: runs with 4, 10, and 40 instances]

Recap: manual perf tuning
Iterative process: solve performance issues → adjust cost-speed trade-off
Most of the impact comes from a few parameters:
• # and type of instances for executors and driver
• executor and driver size (memory, # of cores)
• # of partitions

Open source tuning tools
To detect performance issues: Dr. Elephant (LinkedIn)
To simulate cost-speed trade-off: Sparklens (only supports adjusting # of executors)

Motivations
Performance tuning can make periodic workloads 2x faster
and more stable.
But:
• tedious manual process
• requires expertise
→ to scale it to 100+ pipelines, automation is required!

Architecture (tech)
[Diagram components: Scheduler, Gateway, Optimization engine, Job history, Spark job (customer code, Spark listener); labels: Data Mechanics, Kubernetes cluster]
1) Unoptimized Spark job description
2) Optimization engine identifies job from history and returns config
3) Optimized Spark job description
4) An agent exports event logs and system metrics during job run

Architecture (algo)
[Diagram, built up over several slides. Spark event logs and metrics from past runs (Jun 2nd to Jun 6th) feed Heuristic A and Heuristic B. For the next run (Jun 7th), the heuristics propose param sets (A1, A3, B2). Evaluator A and Evaluator B, which leverage historical data, return evaluated param sets (A1, B2). The Experiment manager selects the optimistic best param set.]

Heuristics
Heuristics look for performance issues:
• FewerTasksThanTotalCores
• ShuffleSpill
• LongShuffleReadTime
• LongGC
• ExecMemoryLargerThanNeeded
• TooShortTasks
• CPUTimeShorterThanComputeTime
• …
Or push in a given direction:
• IncreaseDefaultNumPartitions
• IncreaseTotalCores
• …
Every heuristic proposes different param sets.

Heuristics example
FewerTasksThanTotalCores
If a stage has fewer tasks than the total number of cores:
1. Increase the default number of partitions if applicable
2. Decrease the number of instances
3. Decrease the # of cores per instance (and adjust
memory)
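As an illustration only, a heuristic of this kind could look like the sketch below. It is not the Data Mechanics implementation: the `ParamSet` fields, the function name, and the concrete choices (partitions set to the total number of cores, instances sized to the smallest stage, cores and memory halved) are all assumptions made for the example.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ParamSet:
    """Hypothetical description of a job configuration."""
    num_instances: int
    cores_per_instance: int
    memory_per_instance_gb: float
    default_partitions: int

    @property
    def total_cores(self):
        return self.num_instances * self.cores_per_instance


def fewer_tasks_than_total_cores(current, stage_task_counts):
    """Propose param sets when some stage has fewer tasks than total cores."""
    smallest = min(stage_task_counts)
    if smallest >= current.total_cores:
        return []  # every stage keeps all cores busy: nothing to propose
    proposals = []
    # 1. Increase the default number of partitions, if applicable
    if current.default_partitions < current.total_cores:
        proposals.append(replace(current, default_partitions=current.total_cores))
    # 2. Decrease the number of instances to fit the smallest stage
    fewer = max(1, math.ceil(smallest / current.cores_per_instance))
    if fewer < current.num_instances:
        proposals.append(replace(current, num_instances=fewer))
    # 3. Decrease the cores per instance and adjust memory proportionally
    if current.cores_per_instance > 1:
        proposals.append(replace(
            current,
            cores_per_instance=current.cores_per_instance // 2,
            memory_per_instance_gb=current.memory_per_instance_gb / 2,
        ))
    return proposals


current = ParamSet(num_instances=26, cores_per_instance=16,
                   memory_per_instance_gb=104.0, default_partitions=200)
proposals = fewer_tasks_than_total_cores(current, stage_task_counts=[200, 1000])
no_issue = fewer_tasks_than_total_cores(current, stage_task_counts=[500, 1000])
for p in proposals:
    print(p)
```

The heuristic only proposes candidates; deciding which one is actually better is left to a separate ranking step.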

Ranking param sets
Evaluators estimate the cost and duration of each param set.
The Experiment manager selects the best solution according to customer settings.
[Diagram: unevaluated param sets → Evaluator A, Evaluator B → Experiment manager → optimistic best param set]

Evaluator
Estimates cost and duration distributions.
• From history if possible
• By simulation otherwise
The simulation: an optimistic model of the Spark scheduler.
• Takes as input a Spark app (Spark event log)
• Simulates a new execution under different conditions
– Different # of partitions, cores per executor, executors
• Optimistic assumptions: no GC time, no shuffle read time
Why optimistic? Encourages exploration!
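A toy version of such an optimistic simulation is sketched below. It is not the actual simulator: it assumes task durations have already been extracted from the event log, that stages run one after another, and that each task goes to whichever core frees up first, with no GC, shuffle read, or scheduling overhead. The function names and durations are invented.

```python
import heapq


def simulate_stage(task_durations, num_slots):
    """Greedy scheduling: each task runs on the slot that frees up first."""
    slots = [0.0] * num_slots  # time at which each core becomes free
    heapq.heapify(slots)
    for duration in task_durations:
        free_at = heapq.heappop(slots)
        heapq.heappush(slots, free_at + duration)
    return max(slots)  # the stage ends when its last task ends


def simulate_app(stages, num_executors, cores_per_executor):
    """Total duration, assuming stages run sequentially (a simplification)."""
    num_slots = num_executors * cores_per_executor
    return sum(simulate_stage(tasks, num_slots) for tasks in stages)


# Invented app: a stage of 8 tasks of 10 s, then a stage of 4 tasks of 5 s
stages = [[10.0] * 8, [5.0] * 4]
duration_small = simulate_app(stages, num_executors=1, cores_per_executor=4)
duration_large = simulate_app(stages, num_executors=2, cores_per_executor=4)
print(duration_small, duration_large)
```

Because overheads are ignored, the simulated duration is a lower bound, so untested configurations look at least as good as they can possibly be.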
[Diagram: unevaluated param set + history → Evaluator → evaluated param set (cost and duration)]

Experiment manager
Selects the best param set given customer objectives like:
• as cheap as possible within maximum duration
• as fast as possible within budget
• finish at 6am no matter what
Contains Bayesian optimization logic to account for noise.
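The first two objectives amount to a constrained selection over evaluated param sets, sketched below. This deliberately leaves out the Bayesian optimization logic and treats each estimate as a single number rather than a distribution; `pick_param_set` and the example costs and durations are hypothetical.

```python
def pick_param_set(evaluated, max_duration=None, max_cost=None):
    """Select from (name, cost, duration) tuples under one constraint."""
    if max_duration is not None:
        # As cheap as possible within maximum duration
        feasible = [e for e in evaluated if e[2] <= max_duration]
        return min(feasible, key=lambda e: e[1]) if feasible else None
    if max_cost is not None:
        # As fast as possible within budget
        feasible = [e for e in evaluated if e[1] <= max_cost]
        return min(feasible, key=lambda e: e[2]) if feasible else None
    raise ValueError("provide max_duration or max_cost")


# Invented estimates: (param set, cost in dollars, duration in minutes)
evaluated = [("A1", 12.0, 50), ("A3", 8.0, 95), ("B2", 20.0, 30)]
cheapest_in_time = pick_param_set(evaluated, max_duration=60)
fastest_in_budget = pick_param_set(evaluated, max_cost=10.0)
print(cheapest_in_time, fastest_in_budget)
```

When no param set satisfies the constraint, the sketch returns `None`; a real system would need a fallback policy for that case.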
[Diagram: evaluated param sets (cost and duration) → Experiment manager → optimistic best param set]

Impact of automated tuning
● Stability: automatic remediation of OOMs and timeouts (upon retry)
● Performance: 50% cost reduction
● The algorithm typically converges and adapts to changes in 5-10 iterations

Data Mechanics platform
● A managed platform for containerized Spark apps in your cloud account
● Just send your code; we automate the scaling and configuration tuning
● Pricing based on real Spark compute time, not on idle server uptime
[Diagram: data engineers and data scientists → Gateway → data source]

The hassle-free Spark platform powered by Kubernetes
Learn more and sign up for private beta on https://www.datamechanics.co
Also, we’re hiring :) jobs@datamechanics.co
