

Adventures in RAC: gc buffer busy acquire and release
It seems that I’m getting more and more drawn into the world of performance analysis, and since I sometimes tend to forget things I need to write them down. I almost enjoy the “getting there” more than ultimately solving the problem. You pick up quite a few things on the way.

This environment is Exadata 12.1.1.1.1/Oracle 12.1.0.2, but as with so many things, the fact that the database is on Exadata shouldn’t matter.

So here is one of these posts; this time I’m writing up what I saw related to gc buffer busy acquire.
gc buffer busy acquire?
Whenever I see a wait event I haven’t dealt with extensively in the past I try to
provoke behaviour to study it more closely. But first you need to know the
event’s meaning. One option is to check v$event_name:
SQL> select name, parameter1, parameter2, parameter3, wait_class
  2  from v$event_name where name = 'gc buffer busy acquire';

NAME                           PARAMETER1 PARAMETER2 PARAMETER3 WAIT_CLASS
------------------------------ ---------- ---------- ---------- ----------
gc buffer busy acquire         file#      block#     class#     Cluster

So when you see this event in ASH, traces, v$session etc. you know it’s a cluster wait that is potentially limiting your processing throughput. The event also gives you the file#, block# and class# of the buffer, which you can link to v$bh. This view allows you to find the data object ID given these input parameters, along the lines of the sketch below.
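As a minimal sketch (the substitution variables stand for the values reported with the wait event, and the block must still be in the buffer cache for gv$bh to show it), the lookup could be:

select o.owner, o.object_name, o.object_type, bh.inst_id, bh.status
  from gv$bh bh, dba_objects o
 where o.data_object_id = bh.objd
   and bh.file#  = &file_no
   and bh.block# = &block_no;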
Using https://orainternals.wordpress.com/2012/04/19/gc-buffer-busy-acquire-
vs-release/ as a source I worked out that the event has to do with acquiring a
buffer (=block) in RAC (gc = global cache) on the local instance. If the block you
need is on the remote instance you wait for it to be released, and the wait event
is gc buffer busy release.
Since Oracle will clone blocks in buffer caches for consistent reads, and uses a shared lock on these for reading, I thought that waiting could only happen if someone requested a block in XCUR (exclusive current) mode. So with that working hypothesis I went to work.
How to test
I started off writing a small Java class that creates a connection pool against my RAC database. I initially used the default service name in the connect descriptor, only to find out that dbms_monitor.SERV_MOD_ACT_TRACE_ENABLE did not trace my sessions. In the end I created a true RAC service with CLB and RLB goals against both instances, and I ended up with traces in the diagnostic_dest.
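For completeness, enabling trace for the service/module combination looks roughly like this; the service name is a made-up example, while the module matches what the test sessions set:

begin
  dbms_monitor.serv_mod_act_trace_enable(
    service_name => 'bufferbusy_svc',   -- hypothetical service name
    module_name  => 'BufferBusy',       -- module set by the test sessions
    waits        => true,
    binds        => false);
end;
/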
After setting up the UCP connection pool the code will create a number of
threads that each will pull a connection from the pool, do some work (*) and
hand it back to the pool as good citizens should do.
(*) The “do some work” bit is this:
...
try {

    PreparedStatement pstmt = conn.prepareStatement(
        "select /* bufferbusy001 */ id, to_char(sysdate,'hh24:mi:ss') d " +
        "from t1 where id = ? for update");

    // pick a random ID from the small range that maps to the contended block
    // (IDs 1450765 to 1450770, see the dbms_rowid query below)
    int randomID = new Random().nextInt((1450770 - 1450765) + 1) + 1450765;
    System.out.println("thread " + mThreadID + " picked ID " + randomID);
    pstmt.setInt(1, randomID);

    ResultSet rs = pstmt.executeQuery();

    while (rs.next()) {
        System.out.println("Thread " + mThreadID + " selected ID "
            + rs.getInt("id") + ". Now it is " + rs.getString("d"));
    }

    rs.close();
    pstmt.close();
    conn.rollback();   // release the row lock without committing
    conn.close();      // hand the connection back to the pool
    conn = null;

    Thread.sleep(2000);

} catch (Exception e) {
    e.printStackTrace();
}
...

I think that’s how a Java developer would do it (with more error handling of course), but then I’m not a Java developer. It did work though! What I considered most important was to generate contention on a single block. Using dbms_rowid I could find out which IDs belong to (a random) block:
SQL> select * from (
  2    select id, DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid, 'BIGFILE') as block
  3    from t1
  4  ) where block = 11981654;

        ID      BLOCK
---------- ----------
   1450765   11981654
   1450766   11981654
   1450767   11981654
   1450768   11981654
   1450769   11981654
   1450770   11981654

6 rows selected.

So if I manage to randomly select from the table where the ID is in the range …765 to …770 then I should be OK and just hit that particular block.

It turned out that the SQL statement completed so quickly that I had to ramp up the number of sessions in the pool considerably to see anything. I went up from 10 to 500 before I could notice a change. Most of the statements were too quick to even be caught in ASH: Tanel’s ashtop script showed pretty much nothing except the occasional ON CPU sample and the odd log file sync event. Snapper also reported the sessions as mostly idle.
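For the record, a snapper invocation along these lines showed it (a sketch using Tanel’s script; the quoted sid expression is my assumption, check your snapper version for the exact syntax):

SQL> @snapper ash 5 1 "select sid from v$session where module = 'BufferBusy'"

A quick look at gv$session tells the same story: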
SQL> r
  1  select count(*), inst_id, status, sql_id, event, state
  2  from gv$session where module = 'BufferBusy'
  3* group by inst_id, status, sql_id, event, state

  COUNT(*)    INST_ID STATUS   SQL_ID        EVENT                         STATE
---------- ---------- -------- ------------- ----------------------------- -------
       251          1 INACTIVE               SQL*Net message from client   WAITING
       248          2 INACTIVE               SQL*Net message from client   WAITING

2 rows selected.

That’s what you see most of the time.


Let me trace this for you
So in order to get any meaningful idea about the occurrence (or absence) of the gc buffer busy acquire event I added a MODULE to my sessions so that I could later run trcsess to combine the traces.
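Something like this one-liner does the job (a sketch; the output file name is made up, and you would run it in the trace directory underneath the diagnostic_dest):

trcsess output=bufferbusy.trc module=BufferBusy *.trc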
Here is the resulting raw trace, or rather an excerpt from it:
=====================
PARSING IN CURSOR #140650659166120 len=96 dep=0 uid=65 oct=3 lid=65 ti
select /* bufferbusy001 */ id, to_char(sysdate,'hh24:mi:ss') d from t1
END OF STMT
PARSE #140650659166120:c=0,e=5598,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,p
WAIT #140650659166120: nam='gc buffer busy acquire' ela= 12250 file#=6
WAIT #140650659166120: nam='buffer busy waits' ela= 1890 file#=6 block
WAIT #140650659166120: nam='enq: TX - row lock contention' ela= 188606
WAIT #140650659166120: nam='gc buffer busy acquire' ela= 1590 file#=6
WAIT #140650659166120: nam='enq: TX - row lock contention' ela= 36313
WAIT #140650659166120: nam='gc buffer busy acquire' ela= 1268 file#=6
WAIT #140650659166120: nam='buffer busy waits' ela= 668 file#=6 block#
WAIT #140650659166120: nam='KJC: Wait for msg sends to complete' ela=
EXEC #140650659166120:c=1999,e=243530,p=0,cr=9,cu=4,mis=0,r=0,dep=0,og
WAIT #140650659166120: nam='SQL*Net message to client' ela= 4 driver i
FETCH #140650659166120:c=0,e=16,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=1,plh
STAT #140650659166120 id=1 cnt=1 pid=0 pos=1 obj=0 op='FOR UPDATE (cr
STAT #140650659166120 id=2 cnt=2 pid=1 pos=1 obj=0 op='BUFFER SORT (cr
STAT #140650659166120 id=3 cnt=1 pid=2 pos=1 obj=48863 op='INDEX RANGE
WAIT #140650659166120: nam='SQL*Net message from client' ela= 260 driv
CLOSE #140650659166120:c=0,e=5,dep=0,type=1,tim=4170152758141
XCTEND rlbk=1, rd_only=0, tim=4170152758170
WAIT #0: nam='gc buffer busy acquire' ela= 3764 file#=6 block#=1198165
WAIT #0: nam='buffer busy waits' ela= 1084 file#=6 block#=11981654 cla
WAIT #0: nam='log file sync' ela= 246 buffer#=119491 sync scn=19690898
WAIT #0: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #

Result! There are gc buffer busy acquire events recorded. I can’t rule out TX - row lock contention, since with all those threads and only six IDs to choose from there was bound to be some locking on the same ID caused by the “for update” clause. Now I am reasonably confident that I have worked out at least one scenario causing a gc buffer busy acquire. You might also find the location of the blocks in the buffer cache interesting:

SYS:dbm011> select count(*), inst_id, block#, status
  2  from gv$bh where block# = 11981654
  3  group by inst_id, block#, status order by inst_id, status;

   COUNT(*)     INST_ID      BLOCK# STATUS
----------- ----------- ----------- ----------
          9           1    11981654 cr
          1           1    11981654 xcur
          9           2    11981654 cr

There is the one buffer in XCUR mode on instance 1, and nine CR clones of that block in each instance’s buffer cache.
Making it worse
Now I didn’t want to stop there; I was interested in what would happen under CPU load. During my career I have noticed that cluster waits appear primarily when you are CPU-bound (all other things being equal). This could be the infamous middle-tier connection pool mismanagement, or an execution plan going wrong with hundreds of users performing nested loop joins when they should hash-join large data sets… This is usually the point where OEM users ask the DBAs to do something about that “sea of grey” in the performance pages.
As with every cluster technology, an overloaded CPU does not help. Well, I guess that’s true for all computing. To increase the CPU load I created 10 dd sessions reading from /dev/zero and writing to /dev/null. It sounds silly, but each of these hogs one CPU core at 100% (see the one-liner below).
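For reference, the load generator boils down to something like this sketch (run on the node whose CPUs you want to saturate):

for i in $(seq 1 10); do
  dd if=/dev/zero of=/dev/null &   # each dd busy-loops and pins one CPU core
done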
With 10 out of 12 cores on node 1 occupied that way, I relaunched my test. The hypothesis that CPU overload has an effect was proven right when I suddenly found ASH samples of my sessions.
SQL> @ash/ashtop sql_id,session_state,event "sql_id='6a5jfvpcqvbk6'" s

    Total
  Seconds     AAS %This   SQL_ID        SESSION EVENT
--------- ------- ------- ------------- ------- -----------------------------
      373     1.2    79% | 6a5jfvpcqvbk6 WAITING enq: TX - row lock contention
       54      .2    11% | 6a5jfvpcqvbk6 WAITING gc buffer busy release
       20      .1     4% | 6a5jfvpcqvbk6 ON CPU
       11      .0     2% | 6a5jfvpcqvbk6 WAITING gc buffer busy acquire
       11      .0     2% | 6a5jfvpcqvbk6 WAITING gc current block busy
        1      .0     0% | 6a5jfvpcqvbk6 WAITING gc current block 2-way

6 rows selected.


Using ASQLMON I can even see where time is spent:


SQL> @scripts/ash/asqlmon 6a5jfvpcqvbk6 % sysdate-1 sysdate

    SECONDS Activity Visual       Line ID Parent ASQLMON_OPERATION
----------- -------- ------------ ------- ------ --------------------
         38    2.4 % |          |       0        SELECT STATEMENT
          1     .1 % |          |       1      0 FOR UPDATE
         25    1.6 % |          |       1      0
         17    1.1 % |          |       1      0
        109    6.8 % |#         |       1      0
         31    1.9 % |          |       1      0
         27    1.7 % |          |       1      0
        768   48.0 % |#####     |       1      0
          3     .2 % |          |       2      1 BUFFER SORT
          2     .1 % |          |       3      2 INDEX RANGE SCAN

          2     .1 % |          |       0        SELECT STATEMENT
          2     .1 % |          |       1      0 FOR UPDATE
         24    1.5 % |          |       1      0
          9     .6 % |          |       1      0
         30    1.9 % |          |       1      0
          9     .6 % |          |       1      0
          7     .4 % |          |       1      0
        496   31.0 % |###       |       1      0
               % |          |       2      1 BUFFER SORT
               % |          |       3      2 INDEX RANGE SCAN

Further Reading
I’m sure there is a wealth of resources available out there; in my case Riyaj’s blog helped me a lot. He even tagged posts with gc buffer busy: https://orainternals.wordpress.com/tag/gc-buffer-busy

Have a look at the Oaktable World 2014 agenda and watch Tanel Poder’s session attentively. You’d be surprised how many scripts he has made publicly available for troubleshooting performance. Like snapper? It’s only the tip of the iceberg. And if you can, you should really attend his advanced troubleshooting seminar.
Responses
Stefan Koehler
December 16, 2014


Hi Martin,
nice blog post.
I usually have seen extensive “gc buffer busy release” waits in case of LMS / LGWR (gcs log flush sync / log file sync) issues. The long duration of “gcs log flush sync / log file sync” was mostly caused by “broadcast on commit” bugs (in the case of fast log file parallel writes), but CPU load can be an influencing factor as well, as you have demonstrated nicely :-))
Regards
Stefan

Martin Bach
December 16, 2014
Hi Stefan!
Always nice to get comments from the experts. If memory serves
me right then Riyaj’s post had an example of an overloaded lgwr
process for anyone who wants to see what happens in that case.
Martin
