1
JVM JIT-compiler
overview
Vladimir Ivanov
HotSpot JVM Compiler
Oracle Corp.
2
Agenda
§  about compilers in general
–  … and JIT-compilers in particular
§  about JIT-compilers in HotSpot JVM
§  monitoring JIT-compilers in HotSpot JVM
3
Static vs Dynamic
AOT vs JIT
4
Dynamic vs Static Compilation
§ Static compilation
–  “ahead-of-time” (AOT) compilation
–  Source code → Native executable
–  Most of compilation work happens before execution
§ Modern Java VMs use dynamic compilers (JIT)
–  “just-in-time” (JIT) compilation
–  Source → Bytecode → Interpreter + JITted executable
–  Most of compilation work happens during execution
Differences
5
Dynamic vs Static Compilation
§ Static compilation (AOT)
–  can utilize complex and heavy analyses and
optimizations
–  … but static information sometimes isn’t enough
–  … and it’s hard to rely on profiling info, if any
–  moreover, how to utilize specific platform features
(like SSE 4.2)?
Differences
6
Dynamic vs Static Compilation
§ Modern Java VMs use dynamic compilers (JIT)
–  aggressive optimistic optimizations
§ through extensive usage of profiling info
–  … but budget is limited and shared with an application
–  startup speed suffers
–  peak performance may suffer as well (not necessarily)
Differences
7
Dynamic Compilation
in JVM
8
JVM
§ Runtime
–  class loading, bytecode verification, synchronization
§ JIT
–  profiling, compilation plans, OSR
–  aggressive optimizations
§ GC
–  different algorithms: throughput vs. response time
§ Serviceability
–  JVMTI, JFR, Serviceability Agent
Subsystems
9
Dynamic Compilation (JIT)
§ Just-In-Time compilation
§ Compiled when needed
§ Maybe immediately before execution
–  ...or when we decide it’s important
–  ...or never?
10
Dynamic Compilation (JIT)
§ Knows about
–  loaded classes, methods the program has executed
§ Makes optimization decisions based on code
paths executed
–  Code generation depends on what is observed:
§ loaded classes, code paths executed, branches taken
§ May re-optimize if assumption was wrong, or
alternative code paths taken
–  Instruction path length may change between invocations of
methods as a result of de-optimization / re-compilation
11
Dynamic Compilation (JIT)
§ Can do non-conservative optimizations in a dynamic setting
§ Separates optimization from product delivery cycle
–  Update JVM, run the same application,
realize improved performance!
–  Can be "tuned" to the target platform
12
JVM: Makes Bytecodes Fast
§ JVMs eventually JIT bytecodes
–  To make them fast
–  Some JITs are high quality optimizing compilers
§ But cannot use existing static compilers directly:
–  Tracking OOPs (ptrs) for GC
–  Java Memory Model (volatile reordering & fences)
–  New code patterns to optimize
–  Time & resource constraints (CPU, memory)
13
JVM: Makes Bytecodes Fast
§ JIT'ing requires profiling
–  Because you don't want to JIT everything
§ Profiling allows focused code-gen
§ Profiling allows better code-gen
–  Inline what’s hot
–  Loop unrolling, range-check elimination, etc
–  Branch prediction, spill-code-gen, scheduling
14
Profiling
§ Gathers data about code during execution
–  invariants
§ types, constants (e.g. null pointers)
–  statistics
§ branches, calls
§ Gathered data is used during optimization
–  Educated guess
–  Guess can be wrong
15
Optimistic Compilers
§ Assume profile is accurate
–  Aggressively optimize based on profile
–  Bail out if the profile lies
§  ... and hope that they’re usually right
16
Profile-guided optimization (PGO)
§ Use profile for more efficient optimization
§ PGO in JVMs
–  Always have it, turned on by default
–  Developers (usually) not interested/concerned about it
–  Profile is always consistent with the execution scenario
17
Dynamic Compilation (JIT)
§ Is dynamic compilation overhead essential?
–  The longer your application runs, the less the overhead
§ Trading off compilation time, not application time
–  Steal some cycles very early in execution
–  Done “automagically” and transparently to application
§ Most of “perceived” overhead is compiler waiting
for more data
–  ... thus running semi-optimal code for the time being
Overhead
18
Java Application lifetime
JVM point of view
Author: Aleksey Shipilev
19
Mixed-Mode Execution
§ Interpreted
–  Bytecode-walking
–  Artificial stack machine
§ Compiled
–  Direct native operations
–  Native register machine
20
Bytecode Execution
[Figure: cycle of four stages: (1) Interpretation, (2) Profiling, (3) Dynamic Compilation, (4) Deoptimization]
21
Deoptimization
§ Bail out of running native code
–  stop executing native (JIT-generated) code
–  start interpreting bytecode
§ It’s a complicated operation at runtime…
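A minimal sketch of a situation in which a bail-out might be provoked. The names (DeoptDemo, Shape, Square, Circle, total) are hypothetical and not from the slides. The idea is that the hot method sees a single receiver type while warming up, so the JIT may speculate on it, and a second type arriving later may invalidate that speculation. Whether deoptimization actually happens depends on the JVM version and flags; running with -XX:+PrintCompilation may show a "made not entrant" line for total.

```java
// Hypothetical demo; class and method names are illustrative only.
class DeoptDemo {
    interface Shape { double area(); }

    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    static final class Circle implements Shape {
        final double r;
        Circle(double r) { this.r = r; }
        public double area() { return Math.PI * r * r; }
    }

    // Hot method: while only Square has been observed at s.area(),
    // the JIT may assume the receiver is always a Square.
    static double total(Shape s, int n) {
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += s.area();
        }
        return sum;
    }

    public static void main(String[] args) {
        double acc = 0;
        // Warm-up phase: a single receiver type.
        for (int i = 0; i < 20_000; i++) {
            acc += total(new Square(1.0), 100);
        }
        // A new receiver type: compiled code built on the old assumption
        // may have to be abandoned in favor of the interpreter.
        acc += total(new Circle(1.0), 100);
        System.out.println(acc);
    }
}
```

A possible invocation is `java -XX:+PrintCompilation DeoptDemo`; the exact output is not guaranteed.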
22
OSR: On-Stack Replacement
§ Running method never exits?
§ But it’s getting really hot?
§ Generally means loops, back-branching
§ Compile and replace while running
§ Not typically useful in large systems
§ Looks great on benchmarks!
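The typical OSR candidate can be sketched as a method that is called once but spends a long time in a single loop. The class and method names below (OsrDemo, sumTo) are hypothetical. Since the method never returns to be re-entered as compiled code, the only way to speed it up is to replace the running frame. With -XX:+PrintCompilation, an OSR compilation would be expected to appear with the % marker, though this depends on the JVM and its thresholds.

```java
// Hypothetical demo; names are illustrative only.
class OsrDemo {
    // Called only once from main, but the loop back-branch runs many times.
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // One invocation, one very hot loop.
        System.out.println(sumTo(100_000_000));
    }
}
```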
23
Optimizations
24
Optimizations in HotSpot JVM
§  compiler tactics
delayed compilation
tiered compilation
on-stack replacement
delayed reoptimization
program dependence graph rep.
static single assignment rep.
§  proof-based techniques
exact type inference
memory value inference
memory value tracking
constant folding
reassociation
operator strength reduction
null check elimination
type test strength reduction
type test elimination
algebraic simplification
common subexpression elimination
integer range typing
§  flow-sensitive rewrites
conditional constant propagation
dominating test detection
flow-carried type narrowing
dead code elimination
§  language-specific techniques
class hierarchy analysis
devirtualization
symbolic constant propagation
autobox elimination
escape analysis
lock elision
lock fusion
de-reflection
§  speculative (profile-based) techniques
optimistic nullness assertions
optimistic type assertions
optimistic type strengthening
optimistic array length strengthening
untaken branch pruning
optimistic N-morphic inlining
branch frequency prediction
call frequency prediction
§  memory and placement transformation
expression hoisting
expression sinking
redundant store elimination
adjacent store fusion
card-mark elimination
merge-point splitting
§  loop transformations
loop unrolling
loop peeling
safepoint elimination
iteration range splitting
range check elimination
loop vectorization
§  global code shaping
inlining (graph integration)
global code motion
heat-based code layout
switch balancing
throw inlining
§  control flow graph transformation
local code scheduling
local code bundling
delay slot filling
graph-coloring register allocation
linear scan register allocation
live range splitting
copy coalescing
constant splitting
copy removal
address mode matching
instruction peepholing
DFA-based code generator
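As one illustration of the techniques listed above, here is a sketch of code where escape analysis could apply. The names (EscapeDemo, Point, sumOfSquares) are hypothetical. The Point object is never stored in a field, passed to other code, or returned, so the compiler may be able to avoid the heap allocation altogether. Whether it does is up to the JIT; comparing runs with -XX:+DoEscapeAnalysis and -XX:-DoEscapeAnalysis is one way to look for a difference.

```java
// Hypothetical demo; names are illustrative only.
class EscapeDemo {
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            // p is used only inside this iteration: it does not escape.
            Point p = new Point(i, i + 1);
            total += (long) p.x * p.x + (long) p.y * p.y;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10_000_000));
    }
}
```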
25
Inlining
§ Combine caller and callee into one unit
–  e.g. based on profile
–  … or prove something using CHA (Class Hierarchy Analysis)
–  Perhaps with a guard/test
§ Optimize as a whole
–  More code means better visibility
26
Inlining
int addAll(int max) {
    int accum = 0;
    for (int i = 0; i < max; i++) {
        accum = add(accum, i);
    }
    return accum;
}
int add(int a, int b) { return a + b; }
27
Inlining
int addAll(int max) {
    int accum = 0;
    for (int i = 0; i < max; i++) {
        accum = accum + i;
    }
    return accum;
}
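The two versions above can be placed side by side in a runnable class to confirm that they compute the same result. This is only a sketch: the wrapper class InliningDemo, the name addAllInlined, and the use of static methods are assumptions made here for self-containment. Running it with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining may show whether the JIT inlined add into addAll.

```java
// Hypothetical wrapper around the slide code; static methods are an assumption.
class InliningDemo {
    static int add(int a, int b) { return a + b; }

    // Version with a call in the loop body.
    static int addAll(int max) {
        int accum = 0;
        for (int i = 0; i < max; i++) {
            accum = add(accum, i);
        }
        return accum;
    }

    // What the loop looks like after the call has been inlined by hand.
    static int addAllInlined(int max) {
        int accum = 0;
        for (int i = 0; i < max; i++) {
            accum = accum + i;
        }
        return accum;
    }

    public static void main(String[] args) {
        int a = 0;
        int b = 0;
        // Repeat enough times for addAll to become hot.
        for (int i = 0; i < 20_000; i++) {
            a += addAll(1_000);
            b += addAllInlined(1_000);
        }
        System.out.println(a == b);
    }
}
```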
28
JVM: Makes Virtual Calls Fast
§ C++ avoids virtual calls – because they are slow
§ Java embraces them – and makes them fast
–  Well, mostly fast – JIT's do Class Hierarchy Analysis
(CHA)
–  CHA turns most virtual calls into static calls
–  JVM detects new classes loaded, adjusts CHA
§ May need to re-JIT
–  When CHA fails to make the call static, inline caches (ICs) are used
–  When ICs fail, virtual calls are back to being slow
29
Inlining and devirtualization
§ Inlining is the most profitable compiler optimization
–  Rather straightforward to implement
–  Huge benefits: expands the scope for other
optimizations
§ OOP needs polymorphism, which implies virtual
calls
–  Prevents naïve inlining
–  Devirtualization is required
–  (This does not mean you should not write OOP code)
30
Devirtualization in JVM
§ Application developers shouldn't care
§ Analyzes hierarchy of currently loaded classes
§ Efficiently devirtualizes all monomorphic calls
§ Able to devirtualize polymorphic calls
§ JVM may inline dynamic methods
–  Reflection calls
–  Runtime-synthesized methods
–  JSR 292
31
Call Site
§ The place where you make a call
§ Types of call sites
– Monomorphic (“one shape”)
§ Single target class
– Bimorphic (“two shapes”)
– Polymorphic (“many shapes”)
– Megamorphic
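The call-site categories above can be sketched with three textually identical methods that differ only in which receiver types reach them at run time. All names (CallSiteDemo, Animal, Dog, Cat, Cow, mono, bi, mega) are hypothetical. The shape of a call site is a property of what the profile observes, not of the source code, which is why separate methods are used here: each one has its own a.sound() call site and its own profile.

```java
// Hypothetical demo; names are illustrative only.
class CallSiteDemo {
    interface Animal { String sound(); }

    static final class Dog implements Animal { public String sound() { return "woof"; } }
    static final class Cat implements Animal { public String sound() { return "meow"; } }
    static final class Cow implements Animal { public String sound() { return "moo"; } }

    // Each method contains exactly one virtual call site: a.sound().
    static int mono(Animal a) { return a.sound().length(); } // fed only Dog
    static int bi(Animal a)   { return a.sound().length(); } // fed Dog and Cat
    static int mega(Animal a) { return a.sound().length(); } // fed Dog, Cat and Cow

    public static void main(String[] args) {
        Animal[] all = { new Dog(), new Cat(), new Cow() };
        long total = 0;
        for (int i = 0; i < 300_000; i++) {
            total += mono(all[0]);       // one receiver type
            total += bi(all[i % 2]);     // two receiver types
            total += mega(all[i % 3]);   // three receiver types
        }
        System.out.println(total);
    }
}
```

Running with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining may show different inlining decisions for the three sites; the details vary by JVM version.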
32
Intrinsics
§ Methods known to the JIT compiler
–  method bytecode is ignored
–  inserts “best” native code
§ e.g. optimized sqrt in machine code
§ Existing intrinsics
–  String::equals, Math::*, System::arraycopy,
Object::hashCode, Object::getClass, sun.misc.Unsafe::*
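A small sketch that exercises two of the methods listed above, Math.sqrt and System.arraycopy. The class and helper names (IntrinsicDemo, hypotenuse, copy) are hypothetical. Run with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining, such calls may be reported with an (intrinsic) marker; which methods are actually intrinsified depends on the JVM and the platform.

```java
// Hypothetical demo; names are illustrative only.
class IntrinsicDemo {
    // Math.sqrt is a typical intrinsic candidate.
    static double hypotenuse(double a, double b) {
        return Math.sqrt(a * a + b * b);
    }

    // System.arraycopy is another one named on the slide.
    static int[] copy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    public static void main(String[] args) {
        int[] data = { 1, 2, 3, 4 };
        double acc = 0;
        // Make both helpers hot so that they get compiled.
        for (int i = 0; i < 1_000_000; i++) {
            acc += hypotenuse(i, i + 1) + copy(data).length;
        }
        System.out.println(acc);
    }
}
```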
33
Feedback multiplies optimizations
§ On-line profiling and CHA produce information
–  ...which lets the JIT ignore unused paths
–  ...and helps the JIT sharpen types on hot paths
–  ...which allows calls to be devirtualized
–  ...allowing them to be inlined
–  ...expanding an ever-widening optimization horizon
§ Result:
Large native methods containing tightly optimized
machine code for hundreds of inlined calls!
34
HotSpot JVM
35
JVMs
§ Oracle HotSpot
§ IBM J9
§ Oracle JRockit
§ Azul Zing
§ Excelsior JET
§ Jikes RVM
Implementations
36
HotSpot JVM
§ client / C1
§ server / C2
§ tiered mode (C1 + C2)
JIT-compilers
37
HotSpot JVM
§ client / C1
–  $ java -client
§ only available in 32-bit VM
–  fast code generation of acceptable quality
–  basic optimizations
–  doesn’t need profile
–  compilation threshold: 1.5k invocations
JIT-compilers
38
HotSpot JVM
§ server / C2
–  $ java -server
–  highly optimized code for speed
–  many aggressive optimizations which rely on profile
–  compilation threshold: 10k invocations
JIT-compilers
39
HotSpot JVM
§ Client / C1
+ fast startup
–  peak performance suffers
§ Server / C2
+ very good code for hot methods
–  slow startup / warmup
JIT-compilers comparison
40
Tiered compilation
§ -XX:+TieredCompilation
§ Multiple tiers of interpretation, C1, and C2
§ Level 0 = Interpreter
§ Level 1-3 = C1
–  #1: C1 w/o profiling
–  #2: C1 w/ basic profiling
–  #3: C1 w/ full profiling
§ Level 4 = C2
C1 + C2
41
Monitoring JIT
42
Monitoring JIT-Compiler
§ how to print info about compiled methods?
–  -XX:+PrintCompilation
§ how to print info about inlining decisions?
–  -XX:+PrintInlining
§ how to control compilation policy?
–  -XX:CompileCommand=…
§ how to print assembly code?
–  -XX:+PrintAssembly
–  -XX:+PrintOptoAssembly (C2-only)
43
Print Compilation
§ -XX:+PrintCompilation
§ Print methods as they are JIT-compiled
§ Class + name + size
44
Print Compilation
$ java -XX:+PrintCompilation
988 1 java.lang.String::hashCode (55 bytes)
1271 2 sun.nio.cs.UTF_8$Encoder::encode (361 bytes)
1406 3 java.lang.String::charAt (29 bytes)
Sample output
45
Print Compilation
2043 470 % !bs jdk.nashorn.internal.ir.FunctionNode::accept @ 136 (265 bytes)
% == OSR compilation
! == has exception handlers (may be expensive)
s == synchronized method
b == blocking compilation
2028 466 n java.lang.Class::isArray (native)
n == native method
Other useful info
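The line layout and flag characters above can be captured in a small parser. This is a sketch for the format exactly as it appears on these slides; newer JVMs print additional columns (for example a tier level), which this code would lump into the flags. The class name CompilationEvent and its fields are hypothetical, and treating the first column as milliseconds since JVM start is an assumption.

```java
// Hypothetical helper for the -XX:+PrintCompilation lines shown on the slides.
final class CompilationEvent {
    final long timestampMs; // first column (assumed: ms since JVM start)
    final int compileId;    // second column
    final String flags;     // e.g. "%!bs" or "n"; empty when no flags are printed
    final String method;    // e.g. "java.lang.String::hashCode"

    private CompilationEvent(long timestampMs, int compileId, String flags, String method) {
        this.timestampMs = timestampMs;
        this.compileId = compileId;
        this.flags = flags;
        this.method = method;
    }

    static CompilationEvent parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length < 3) {
            throw new IllegalArgumentException("too short: " + line);
        }
        long timestampMs = Long.parseLong(tokens[0]);
        int compileId = Integer.parseInt(tokens[1]);
        // Everything between the compile id and the method name is treated as flags.
        StringBuilder flags = new StringBuilder();
        int i = 2;
        while (i < tokens.length && !tokens[i].contains("::")) {
            flags.append(tokens[i++]);
        }
        if (i == tokens.length) {
            throw new IllegalArgumentException("no method name: " + line);
        }
        return new CompilationEvent(timestampMs, compileId, flags.toString(), tokens[i]);
    }

    boolean isOsr()                { return flags.indexOf('%') >= 0; }
    boolean hasExceptionHandlers() { return flags.indexOf('!') >= 0; }
    boolean isSynchronized()       { return flags.indexOf('s') >= 0; }
    boolean isBlocking()           { return flags.indexOf('b') >= 0; }
    boolean isNative()             { return flags.indexOf('n') >= 0; }

    public static void main(String[] args) {
        CompilationEvent e = parse(
            "2043 470 % !bs jdk.nashorn.internal.ir.FunctionNode::accept @ 136 (265 bytes)");
        System.out.println(e.method + " osr=" + e.isOsr() + " sync=" + e.isSynchronized());
    }
}
```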
46
Print Compilation
§  621 160 java.lang.Object::equals (11 bytes) made not entrant
–  don't allow any new calls into this compiled version
§  1807 160 java.lang.Object::equals (11 bytes) made zombie
–  can safely throw away compiled version
Not just compilation notifications
47
No JIT At All?
§ Code is too large
§ Code isn’t «hot» enough
–  not executed often enough
48
Print Inlining
§ -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining
–  PrintInlining is a diagnostic option, so it has to be unlocked first
§ Shows hierarchy of inlined methods
§ Prints reason, if a method isn’t inlined
49
Print Inlining Decisions
$ java -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining
75 1 java.lang.String::hashCode (55 bytes)
88 2 sun.nio.cs.UTF_8$Encoder::encode (361 bytes)
@ 14 java.lang.Math::min (11 bytes) (intrinsic)
@ 139 java.lang.Character::isSurrogate (18 bytes) never executed
103 3 java.lang.String::charAt (29 bytes)
50
Inlining Heuristic Tuning
§ -XX:MaxInlineSize=#
–  Largest inlinable method (bytecode)
§ -XX:InlineSmallCode=#
–  Largest inlinable compiled method
§ -XX:FreqInlineSize=#
–  Largest frequently-called method…
§ -XX:MaxInlineLevel=#
–  How deep does the rabbit hole go?
§ -XX:MaxRecursiveInlineLevel=#
–  recursive inlining
51
Machine Code
§ -XX:+PrintAssembly
§ http://wikis.sun.com/display/HotSpotInternals/PrintAssembly
§ Knowing code compiles is good
§ Knowing code inlines is better
§ Seeing the actual assembly is best!
52
-XX:CompileCommand=
§ Syntax
–  “[command] [method] [signature]”
§ Supported commands
–  exclude – never compile
–  inline – always inline
–  dontinline – never inline
§ Method reference
–  class.name::methodName
§ Method signature is optional
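A sketch of a small program for experimenting with the commands above. The class and method names (CompileCommandDemo, square, sumOfSquares) are hypothetical, and the command lines in the comments are examples of how the option is commonly written (command, then a comma, then the method reference), not output taken from the slides. Combining them with -XX:+PrintCompilation makes the effect easier to observe.

```java
// Hypothetical demo; names are illustrative only.
//
// Possible invocations (the effect depends on the JVM version):
//   java -XX:CompileCommand=dontinline,CompileCommandDemo::square CompileCommandDemo
//   java -XX:CompileCommand=exclude,CompileCommandDemo::sumOfSquares CompileCommandDemo
class CompileCommandDemo {
    // Small, hot helper: normally a natural inlining candidate.
    static long square(long x) {
        return x * x;
    }

    // Hot loop that calls the helper on every iteration.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += square(i);
        }
        return total;
    }

    public static void main(String[] args) {
        long acc = 0;
        for (int i = 0; i < 20_000; i++) {
            acc += sumOfSquares(1_000);
        }
        System.out.println(acc);
    }
}
```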
53
What Have We Learned?
§ How JIT compilers work
§ How HotSpot’s JIT works
§ How to monitor the JIT in HotSpot
54
Questions?
vladimir.x.ivanov@oracle.com
@iwanowww
"JIT compiler overview" @ JEEConf 2013, Kiev, Ukraine
