Part 1: Introduction to the G1 Garbage Collector
To most people, the Java Garbage Collector is a black box that happily goes about
its business. Programmers develop an application, QE validates the functionality
and the Operations team deploys it. Within that process, you may do some tweaking
of the overall heap, PermGen / Metaspace or thread settings, but beyond that
things just seem to work. The question then becomes, what happens when you start
pushing the envelope? What happens when those defaults no longer suffice? As a
developer, tester, performance engineer or architect, it’s invaluable to understand
not only the basics of how Garbage Collection works, but also how to collect and
analyze the corresponding data and translate it into effective tuning practices. In
this ongoing series, we’re going to take you on a journey with the G1 Garbage
Collector and transform your understanding from that of a beginner to that of an
aficionado who places GC at the top of the performance pile.
We’re leading off this series with the most fundamental of topics: What is the point
of the G1 (Garbage First) Collector and how does it actually work? Without a general
understanding of its goals, how it makes decisions and how it’s designed, you are
setting out to achieve a desired end state with no vehicle or map to get you there.
At its heart, the goal of the G1 collector is to achieve a predictable soft-target pause
time, defined through -XX:MaxGCPauseMillis, while also maintaining consistent
application throughput. The catch and ultimate goal is to be able to maintain those
targets with the present day demands of high-powered, multi-threaded
applications with an appetite for continually larger heap sizes. A general rule with
https://www.redhat.com/en/blog/part-1-introduction-g1-garbage-collector 1/16
3/13/2018 Part 1: Introduction to the G1 Garbage Collector
G1 is that the higher the pause time target, the higher the achievable throughput
and overall latency. The lower the pause time target, the lower the achievable
throughput and overall latency. Your goal with Garbage Collection is to combine
an understanding of the runtime requirements of your application, the physical
characteristics of your application and your understanding of G1 to tune a set of
options and achieve an optimal running state that satisfies your business
requirements. It’s important to keep in mind that tuning is a constantly evolving
process in which you establish a set of baselines and optimal settings through
repetitive testing and evaluation. There is no definitive guide or magic set of
options; you are responsible for evaluating performance, making incremental
changes and re-evaluating until you reach your goals.
For its part, G1 works to accomplish those goals in a few different ways. First, being
true to its name, G1 collects regions with the least amount of live data (Garbage
First!) and compacts/evacuates live data into new regions. Secondly, it uses a series
of incremental, parallel and multi-phased cycles to achieve its soft pause time
target. This allows G1 to do what’s necessary, in the time defined, irrespective of the
overall heap size.
Based on the above calculation, the JVM would, by default, allocate 3072 regions,
each capable of holding 4 MB, as illustrated in the diagram, below. You also have the
option of explicitly specifying the region size through -XX:G1HeapRegionSize. When
setting the region size, it’s important to understand how many regions the ratio of
heap size to region size will create, because the fewer the regions, the less flexibility
G1 has and the longer it takes to scan, mark and collect each of them. In all cases,
empty regions are added to an unordered linked list also known as the “free list”.
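As a rough illustration of how a default of 3072 regions of 4 MB each could come about, the sketch below assumes a 12 GB heap and a simplified version of the sizing rule: divide the heap by a target of 2048 regions, round down to a power of two, and clamp the result between 1 MB and 32 MB. The 12 GB figure, the constants and the class name are assumptions made for illustration, not values taken from the JVM source.

```java
// Simplified sketch of default G1 region sizing (an assumption, not HotSpot code).
final class RegionSizing {
    static final long MB = 1024L * 1024L;
    static final long TARGET_REGION_COUNT = 2048; // assumed ergonomic target
    static final long MIN_REGION_SIZE = 1 * MB;   // assumed lower bound
    static final long MAX_REGION_SIZE = 32 * MB;  // assumed upper bound

    /** Heap / 2048, rounded down to a power of two, clamped to [1 MB, 32 MB]. */
    static long regionSizeBytes(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGION_COUNT, MIN_REGION_SIZE);
        long powerOfTwo = Long.highestOneBit(raw); // round down to a power of two
        return Math.min(powerOfTwo, MAX_REGION_SIZE);
    }

    /** Number of whole regions the heap is divided into. */
    static long regionCount(long heapBytes) {
        return heapBytes / regionSizeBytes(heapBytes);
    }

    public static void main(String[] args) {
        long heap = 12L * 1024 * MB; // hypothetical 12 GB heap
        System.out.println("region size (MB): " + regionSizeBytes(heap) / MB); // 4
        System.out.println("region count:     " + regionCount(heap));          // 3072
    }
}
```

Setting -XX:G1HeapRegionSize explicitly would bypass a calculation like this one, which is why it is worth checking the resulting region count before choosing a value.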
The key is that while G1 is a generational collector, the allocation and consumption
of space is both non-contiguous and free to evolve as it gains a better
understanding of the most efficient young to old ratio. When object production
begins, a region is allocated from the free list as a thread-local allocation buffer
(TLAB) using a compare and swap methodology to achieve synchronization.
Objects can then be allocated within those thread-local buffers without the need for
additional synchronization. When the region has been exhausted of space, a new
region is selected, allocated and filled. This continues until the cumulative Eden
region space has been filled, triggering an evacuation pause (also known as a young
collection / young gc / young pause or mixed collection / mixed gc / mixed pause).
The cumulative amount of Eden space represents the number of regions we believe
can be collected within the defined soft pause time target. The percentage of total
heap allocated for Eden regions can range from 5% to 60% and gets dynamically
adjusted after each young collection based on the performance of the previous
young collection.
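To make the 5% to 60% range concrete, here is a minimal sketch that converts those percentages into region counts for the 3072-region, 4 MB example heap. Rounding down to whole regions is an assumption, and the class and method names are hypothetical.

```java
// Sketch: Eden bounds as whole regions, assuming the result is rounded down.
final class EdenBounds {
    /** Number of whole regions covered by the given percentage of the heap. */
    static long regionsForPercent(long totalRegions, int percent) {
        return totalRegions * percent / 100; // integer division rounds down
    }

    public static void main(String[] args) {
        long totalRegions = 3072; // from the example heap
        long regionSizeMb = 4;    // from the example heap
        long min = regionsForPercent(totalRegions, 5);
        long max = regionsForPercent(totalRegions, 60);
        System.out.println("min Eden: " + min + " regions, " + min * regionSizeMb + " MB");
        System.out.println("max Eden: " + max + " regions, " + max * regionSizeMb + " MB");
    }
}
```

Under these assumptions the lower bound works out to 153 regions, or 612 MB, which matches the starting Eden size in the pause #1 log entry.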
Here is an example of what it looks like with objects being allocated into
non-contiguous Eden regions:
GC pause (young); #1
[Eden: 612.0M(612.0M)->0.0B(532.0M) Survivors: 0.0B->80.
GC pause (young); #2
[Eden: 532.0M(532.0M)->0.0B(532.0M) Survivors: 80.0M->80
Based on the above ‘GC pause (young)’ logs, you can see that in pause #1,
evacuation was triggered because Eden reached 612.0M out of a total of 612.0M
(153 regions). The current Eden space was fully evacuated (down to 0.0B) and, given the time
taken, it also decided to reduce the total Eden allocation to 532.0M or 133 regions.
In pause #2, you can see the evacuation is triggered when we reach the new limit of
532.0M. Because we achieved an optimal pause time, Eden was kept at 532.0M.
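The region counts quoted above can be derived directly from the log. The following sketch parses only the Eden portion of a ‘GC pause (young)’ line and converts the sizes to regions, assuming the 4 MB region size from the earlier example. The class name, the regular expression and the helper methods are illustrative, not part of any JDK tool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: extract Eden usage from a G1 log fragment and convert it to regions.
final class EdenLogLine {
    // Matches e.g. "Eden: 612.0M(612.0M)->0.0B(532.0M)"
    private static final Pattern EDEN = Pattern.compile(
        "Eden: ([0-9.]+)([BKMG])\\(([0-9.]+)([BKMG])\\)->([0-9.]+)([BKMG])\\(([0-9.]+)([BKMG])\\)");

    /** Converts a value with a B/K/M/G suffix to megabytes. */
    static double toMb(String value, String unit) {
        double v = Double.parseDouble(value);
        switch (unit) {
            case "B": return v / (1024.0 * 1024.0);
            case "K": return v / 1024.0;
            case "G": return v * 1024.0;
            default:  return v; // "M"
        }
    }

    /** Returns {usedBefore, capacityBefore, usedAfter, capacityAfter} in MB. */
    static double[] parse(String line) {
        Matcher m = EDEN.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no Eden entry in: " + line);
        }
        return new double[] {
            toMb(m.group(1), m.group(2)), toMb(m.group(3), m.group(4)),
            toMb(m.group(5), m.group(6)), toMb(m.group(7), m.group(8))
        };
    }

    /** Number of regions for a size in MB, given the region size in MB. */
    static long regions(double mb, int regionSizeMb) {
        return Math.round(mb / regionSizeMb);
    }

    public static void main(String[] args) {
        double[] eden = parse("[Eden: 612.0M(612.0M)->0.0B(532.0M)");
        System.out.println("Eden before: " + regions(eden[1], 4) + " regions"); // 153
        System.out.println("Eden after:  " + regions(eden[3], 4) + " regions"); // 133
    }
}
```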
When the aforementioned young collection takes place, dead objects are collected
and any remaining live objects are evacuated and compacted into the Survivor
space. G1 has an explicit hard-margin, defined by the G1ReservePercent (default
10%), that results in a percentage of the heap always being available for the
Survivor space during evacuation. Without this available space, the heap could fill to
a point in which there are no available regions for evacuation. There is no guarantee
this will not still happen, but that’s what tuning is for! This principle ensures that
after every successful evacuation, all previously allocated Eden regions are
returned to the free list and any evacuated live objects end up in Survivor space.
Continuing with this pattern, objects are again allocated into newly requested Eden
regions. When Eden space fills up, another young collection occurs and, depending
on the age (how many young collections the various objects have survived) of
existing live objects, you will see promotion to Old regions. Given that the Survivor
space is part of the young generation, its objects are either collected (if dead) or
promoted (if live and old enough) during these young pauses.
Below is an example of what a young collection looks like when live objects from the
Survivor space are evacuated and promoted to a new region in the Old space while
live objects from Eden are evacuated into a new Survivor space region. Evacuated
regions, denoted by the strikethrough, are now empty and returned to the free list.
G1 will continue with this pattern until one of three things happens:
Focusing on the primary trigger, the IHOP (InitiatingHeapOccupancyPercent)
represents a point in time, as calculated during a young collection, where the
objects in the Old regions account for more than 45% (default) of the total heap.
This liveness ratio is constantly being calculated and evaluated as a component of
each young collection. When one of these triggers is hit, a request is made to start
a concurrent marking cycle.
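The IHOP comparison itself is simple arithmetic. Here is a minimal sketch, assuming a hypothetical 12 GB heap and the default threshold of 45%. The class and method names are illustrative, and the real collector bases this decision on its own internal accounting.

```java
// Sketch: the IHOP test as a percentage comparison (names are hypothetical).
final class IhopCheck {
    /** True when Old occupancy exceeds ihopPercent of the total heap. */
    static boolean shouldStartConcurrentMark(long oldBytes, long heapBytes, int ihopPercent) {
        // Cross-multiply to stay in integer arithmetic and avoid rounding.
        return oldBytes * 100 > heapBytes * ihopPercent;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long heap = 12288 * mb; // hypothetical 12 GB heap
        System.out.println(shouldStartConcurrentMark(5000 * mb, heap, 45)); // false
        System.out.println(shouldStartConcurrentMark(5600 * mb, heap, 45)); // true
    }
}
```

With a 12 GB heap, the 45% line sits at roughly 5530 MB of Old occupancy, so the first call stays below the threshold and the second crosses it.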
Below is an example of what a heap may look like after a young collection when the
IHOP threshold is reached, triggering a concurrent mark.
The above log tells us that a mixed collection is starting because the candidate
Old regions (553) have a combined 21.75% reclaimable space. This value
is higher than our 5% minimum threshold (5% default in JDK8u40+ / 10% default
Compared to a young collection, a mixed collection will look to collect all three
region types (Eden, Survivor and Old) within the same pause time target. It manages
this through the incremental collection of the Old regions based on the value of
G1MixedGCCountTarget (defaults to 8). Meaning, it will divide the number of
candidate Old regions by the G1MixedGCCountTarget and try to collect at least that
many regions during each cycle. After each cycle finishes, the liveness of the Old
regions is re-evaluated. If the reclaimable space is still greater than the
G1HeapWastePercent, mixed collections will continue.
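Using the figures from the log discussion above (553 candidate Old regions, 21.75% reclaimable, a 5% waste threshold and a count target of 8), the two decisions can be sketched as follows. Rounding the per-cycle region count up is an assumption, and the class and method names are hypothetical.

```java
// Sketch: planning mixed collections from the values discussed in the text.
final class MixedCollectionPlan {
    /** Minimum Old regions per mixed cycle, assuming the division rounds up. */
    static int minOldRegionsPerCycle(int candidateRegions, int mixedGcCountTarget) {
        return (candidateRegions + mixedGcCountTarget - 1) / mixedGcCountTarget;
    }

    /** Mixed collections continue while reclaimable space exceeds the waste threshold. */
    static boolean shouldContinue(double reclaimablePercent, double heapWastePercent) {
        return reclaimablePercent > heapWastePercent;
    }

    public static void main(String[] args) {
        System.out.println(minOldRegionsPerCycle(553, 8)); // 70
        System.out.println(shouldContinue(21.75, 5.0));    // true
        System.out.println(shouldContinue(4.2, 5.0));      // false
    }
}
```

The last call uses a made-up 4.2% value to show the point at which the cycle would stop and G1 would return to young collections.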
This diagram represents a mixed collection. All Eden regions are collected and
evacuated to Survivor regions and, depending on age, all survivor regions are
collected and sufficiently tenured live objects are promoted to new Old regions. At
the same time, a select subset of Old regions is also collected and any remaining
live objects are compacted into new Old regions. The process of compaction and
evacuation allows for a significant reduction in fragmentation and ensures adequate
free regions are maintained.
This diagram represents the heap after a mixed collection finishes. All Eden regions
are collected and live objects reside in a newly allocated Survivor region. Existing
Survivor regions are collected and live objects are promoted to new Old regions.
The set of collected Old regions is returned to the free list and any remaining live
objects are compacted into new Old regions.
Mixed collections will continue until all eight are completed or until the reclaimable
percentage no longer meets the G1HeapWastePercent. From there, you will see the
mixed collection cycle finish and the following events will return to standard young
collections.
Now that we have covered the standard use-cases, let’s jump back and discuss the
exception I mentioned earlier. It applies in situations where the size of an object is
greater than 50% of a single region. In this case, objects are considered to be
humongous and are handled by performing specialized humongous allocations.
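The 50% rule is easy to check for a given object size. The sketch below assumes a 4 MB region size and uses a 12800 KB object as the sample; it also estimates how many contiguous regions such an object would span and how much of the last region would go unused. The class and method names are hypothetical, and the real threshold is computed inside the JVM.

```java
// Sketch: humongous threshold and region footprint (names are hypothetical).
final class HumongousCheck {
    /** An object is treated as humongous when it exceeds half a region. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    /** Whole regions needed to hold the object, rounding up. */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long kb = 1024L;
        long region = 4096 * kb;  // assumed 4 MB region size
        long object = 12800 * kb; // sample object size
        long needed = regionsNeeded(object, region);
        System.out.println("humongous:      " + isHumongous(object, region)); // true
        System.out.println("regions needed: " + needed);                      // 4
        System.out.println("unused KB:      " + (needed * region - object) / kb); // 3584
    }
}
```

The unused tail in the final region is one reason a steady stream of such allocations fragments the heap.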
Object A: 12800 KB
A few humongous objects may not cause a problem, but a steady allocation of them
can lead to significant heap fragmentation and a noticeable performance impact.
Prior to JDK8u40, humongous objects could only be collected through a Full GC so
the potential for this to impact JDK7 and early JDK8 users is very high. This is why
it’s critical to understand both the size of objects your application produces and
what G1 is defining for region size. Even in the latest JDK8, if you are doing
significant numbers of humongous allocations, it is a good idea to evaluate and tune
away as many as possible.
Finally and unfortunately, G1 also has to deal with the dreaded Full GC. While G1 is
ultimately trying to avoid Full GCs, they are still a harsh reality, especially in
improperly tuned environments. Given that G1 is targeting larger heap sizes, the
impact of a Full GC can be catastrophic to in-flight processing and SLAs. One of the
primary reasons is that Full GCs are still a single-threaded operation in G1. Looking
at causes, the first, and most avoidable, is related to Metaspace.
The other two causes are real and oftentimes unavoidable. Our job as engineers
is to do our best to delay and avoid these situations through tuning and evaluating
the code producing the objects we’re trying to collect. The first major issue is a
‘to-space exhausted’ event followed by a Full GC. This event accounts for evacuation
failures in which the heap can no longer be expanded and there are no available
regions to accommodate evacuation. If you recall, we previously discussed the
hard-margin, defined by the G1ReservePercent. This event says that you’re
evacuating more objects to the to-space than your reserve accounts for and that
the heap is so full, we have no other available regions. On some occasions, if the
JVM can resolve the space condition, this will not be followed by a Full GC, but it is
still a very costly stop-the-world event.
If you see this pattern happening often, you can immediately assume you have a
lot of room for tuning! The second case is a Full GC during concurrent marking. In
this case, we’re not failing evacuation, we’re simply running out of heap before
concurrent marking can finish and trigger a mixed collection. The two likely causes
are a memory leak or producing and promoting objects faster than they
can be collected. If the Full GC collects a significant portion of the heap, you can
assume it’s related to production and promotion. If very little is being collected and
you eventually hit an OutOfMemoryError, you’re more than likely looking at a
memory leak.
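That rule of thumb can be expressed as a small triage helper. This is only a sketch: the cut-off for what counts as a "significant portion" is left to the caller, and the 30% used in the example is an arbitrary, hypothetical value rather than a recommendation. The class, enum and method names are likewise illustrative.

```java
// Sketch: classify a Full GC by how much of the used heap it reclaimed.
final class FullGcTriage {
    enum LikelyCause { ALLOCATION_AND_PROMOTION_PRESSURE, POSSIBLE_MEMORY_LEAK }

    /**
     * @param usedBeforeMb        heap used before the Full GC
     * @param usedAfterMb         heap used after the Full GC
     * @param significantFraction caller-chosen cut-off, e.g. 0.30 (hypothetical)
     */
    static LikelyCause classify(long usedBeforeMb, long usedAfterMb, double significantFraction) {
        if (usedBeforeMb <= 0 || usedAfterMb < 0 || usedAfterMb > usedBeforeMb) {
            throw new IllegalArgumentException("inconsistent heap usage values");
        }
        double reclaimed = (double) (usedBeforeMb - usedAfterMb) / usedBeforeMb;
        return reclaimed >= significantFraction
            ? LikelyCause.ALLOCATION_AND_PROMOTION_PRESSURE
            : LikelyCause.POSSIBLE_MEMORY_LEAK;
    }

    public static void main(String[] args) {
        // Made-up usage figures for illustration only.
        System.out.println(classify(11800, 3200, 0.30));  // large reclaim
        System.out.println(classify(11800, 11500, 0.30)); // almost nothing reclaimed
    }
}
```

A single Full GC is weak evidence either way; the pattern across several events, together with whether an OutOfMemoryError eventually follows, is what the classification is meant to summarize.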
In closing, I hope this post sheds some light on the way G1 is designed and how it
goes about making its garbage collection decisions. I hope you stay tuned for the
next article in this series where we will dig into the various options to collect and
interpret the monumental amount of data produced via advanced GC logging.