Note: Tested in Oracle 12c, 18c, 19c.
1. Test Setup
First we drop the previous reverse index, create a new reverse primary key index, and then get the object_id of that index:
--drop table test_tab purge;
--create table test_tab (id number, val number);
drop index test_tab#r;
create unique index test_tab#r on test_tab(id) reverse;
alter table test_tab add constraint test_tab#r primary key (id);
select object_id, to_char(object_id, 'XXXXXXXX') id_hex, data_object_id, to_char(data_object_id, 'XXXXXXXX') did_hex
from dba_objects where object_name = 'TEST_TAB#R';
OBJECT_ID ID_HEX DATA_OBJECT_ID DID_HEX
--------- ------ -------------- -------
2460326 258AA6 2460326 258AA6
2. Leaf Block Split Point Finding
With the appended PL/SQL procedure leaf_split_finder, we can find the exact point of a leaf block split.
SQL > exec leaf_split_finder(5000);
--- Leaf Split at row count: 4906, (id between 1 and 11) in 2 Blocks
block: 2661262, cnt: 5, min_id: 1, max_id: 5
block: 3665613, cnt: 6, min_id: 6, max_id: 11
The above output shows that the index split occurs at row count 4906; in the previous Blog it was 4708.
So the reverse primary key index is about 4.2% more compact than the normal reverse index.
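The procedure leaf_split_finder itself is appended in the original post and not reproduced here. Below is a minimal sketch of how such a finder could work, reusing the sys_op_lbid technique from the queries in this Blog; the procedure name, parameter and exact logic are assumptions.
create or replace procedure leaf_split_finder(p_max_rows number) as
  l_blocks number;
begin
  execute immediate 'truncate table test_tab';
  for i in 1 .. p_max_rows loop
    insert into test_tab values (i, -i);
    commit;
    -- count distinct leaf blocks holding keys 1 to 11;
    -- 2460326 is the object_id of index TEST_TAB#R from the setup above
    select count(distinct dbms_rowid.rowid_block_number(sys_op_lbid(2460326, 'L', t.rowid)))
      into l_blocks
      from test_tab t
     where id between 1 and 11;
    if l_blocks > 1 then
      dbms_output.put_line('--- Leaf Split at row count: '||i||', (id between 1 and 11) in '||l_blocks||' Blocks');
      for r in (select dbms_rowid.rowid_block_number(sys_op_lbid(2460326, 'L', t.rowid)) blk,
                       count(*) cnt, min(id) min_id, max(id) max_id
                  from test_tab t
                 where id between 1 and 11
                 group by dbms_rowid.rowid_block_number(sys_op_lbid(2460326, 'L', t.rowid))
                 order by min_id) loop
        dbms_output.put_line('block: '||r.blk||', cnt: '||r.cnt||', min_id: '||r.min_id||', max_id: '||r.max_id);
      end loop;
      exit;
    end if;
  end loop;
end;
/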
3. Producing CBC Latch Contention
Open a PL/SQL window, truncate the table, and insert 4905 rows first.
truncate table test_tab;
insert into test_tab select level, -level from dual connect by level <= 4905;
commit;
-- substitute the index object_id obtained above (here 2460326)
select blk, count(*), min(id) min_id, max(id) max_id from (
select id
,dbms_rowid.rowid_block_number(sys_op_lbid (2460326, 'L', rowid)) blk
from test_tab)
where id between 1 and 11
group by blk order by min_id, blk;
BLK COUNT(*) MIN_ID MAX_ID
------- -------- ------ ------
2661263 11 1 11
We can see that all rows from 1 to 11 are in the same leaf block (2661263). Now we start two groups of jobs, each group having 6 jobs running a query loop (see the appended test_tab_select_job). All 6 jobs in the first group access rows 1 to 5, whereas the 6 jobs in the second group access rows 6 to 10:
exec test_tab_select_job(6, 1e9, 1, 0);
exec test_tab_select_job(6, 1e9, 6, 0);
-- stop all jobs by:
-- exec clean_jobs;
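The procedures test_tab_select_job and clean_jobs are appended in the original post and not reproduced here. A minimal sketch of what they might look like follows, assuming dbms_scheduler jobs and reading the parameters as (job count, loop count, start id, sleep seconds); the names, signature interpretation and client_identifier format are assumptions.
create or replace procedure test_tab_select_job(
    p_job_cnt  number,    -- number of concurrent jobs to start
    p_loop_cnt number,    -- query loops per job
    p_start_id number,    -- first id of the 5-id range each job queries
    p_sleep    number)    -- sleep in seconds between loops
as
begin
  for j in 1 .. p_job_cnt loop
    dbms_scheduler.create_job(
      job_name   => 'TEST_TAB_SELECT_JOB_'||p_start_id||'_'||j,
      job_type   => 'PLSQL_BLOCK',
      job_action =>
        'declare l_val number; begin '||
        '  dbms_session.set_identifier(''test_tab_select_id_'||p_start_id||'''); '||
        '  for l in 1 .. '||p_loop_cnt||' loop '||
        '    for i in '||p_start_id||' .. '||p_start_id||' + 4 loop '||
        '      select val into l_val from test_tab where id = i; '||
        '    end loop; '||
        '    dbms_session.sleep('||p_sleep||'); '||   -- dbms_lock.sleep before 18c
        '  end loop; '||
        'end;',
      enabled    => true);
  end loop;
end;
/

create or replace procedure clean_jobs as
begin
  for r in (select job_name from user_scheduler_jobs
             where job_name like 'TEST_TAB_SELECT_JOB%') loop
    dbms_scheduler.drop_job(r.job_name, force => true);
  end loop;
end;
/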
Each job in the first group executes the query with id from 1 to 5 in a loop,
whereas each job in the second group executes it with id from 6 to 10. Both share the same Xplan
(INDEX UNIQUE SCAN; in the previous Blog it was INDEX RANGE SCAN).
select val from test_tab where id = :B1
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 1 (100)| |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST_TAB | 1 | 26 | 1 (0)| 00:00:01 |
|* 2 | INDEX UNIQUE SCAN | TEST_TAB#R | 1 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Now insert a single new row:
insert into test_tab values(4906, -4906);
select blk, count(*), min(id) min_id, max(id) max_id from (
select id
,dbms_rowid.rowid_block_number(sys_op_lbid (2460326, 'L', rowid)) blk
from test_tab)
where id between 1 and 11
group by blk order by min_id, blk;
BLK COUNT(*) MIN_ID MAX_ID
------- -------- ------ ------
2661263 5 1 5
3665609 6 6 11
One leaf block split occurs. Rows 6 to 11 are moved to the new block (3665609).
Now if we look again at all 12 job sessions, only the 6 sessions in the second group have contention,
similar to the case in the previous Blog.
select s.client_identifier, s.sid, s.serial#, p.pid, p.spid, s.event, p1, p2, p3, latchwait, latchspin
from v$session s, v$process p
where s.paddr=p.addr and lower(s.program) like '%j0%'
order by s.client_identifier;
CLIENT_IDENTIFIER SID SERIAL# PID SPID EVENT P1 P2 P3 LATCHWAIT LATCHSPIN
-------------------- --- ------- --- ---- --------------------------- ---------- ------- -- ---------------- ---------
test_tab_select_id_1 725 53859 46 5226 buffer busy waits 1548 2661259 1
test_tab_select_id_1 549 449 51 5236 null event 0 0 0
test_tab_select_id_1 547 35779 45 5224 buffer busy waits 1548 2661263 1
test_tab_select_id_1 6 60174 42 5218 buffer busy waits 1548 2661263 1
test_tab_select_id_1 8 49157 48 5230 buffer busy waits 1548 2661263 1
test_tab_select_id_1 187 44089 43 5220 null event 0 0 0
test_tab_select_id_6 727 18689 52 5238 latch: cache buffers chains 6636068072 228 0 000000018B8A5CE8
test_tab_select_id_6 369 18373 50 5234 latch: cache buffers chains 6636068072 228 0 000000018B8A5CE8
test_tab_select_id_6 189 58221 49 5232 latch: cache buffers chains 6636068072 228 0 000000018B8A5CE8
test_tab_select_id_6 905 10131 53 5240 latch: cache buffers chains 6636068072 228 0 000000018B8A5CE8
test_tab_select_id_6 367 49253 44 5222 latch: cache buffers chains 6636068072 228 0
test_tab_select_id_6 903 64294 47 5228 latch: cache buffers chains 6636068072 228 0 000000018B8A5CE8
A further query on v$latch_children shows that the majority of CBC SLEEPS (140174) is on the child latch 000000018B8A5CE8 of the second group.
The child latch 000000018B8DC5E8 of the first group has almost no SLEEPS and WAIT_TIME.
select latch#, child#, addr, gets, misses, sleeps, spin_gets, wait_time, round(wait_time/sleeps) us_per_sleep
from v$latch_children
where addr = hextoraw('000000018B8DC5E8') or addr = hextoraw('000000018B8A5CE8');
LATCH# CHILD# ADDR GETS MISSES SLEEPS SPIN_GETS WAIT_TIME US_PER_SLEEP
------ ------ ---------------- --------- -------- ------- --------- --------- ------------
228 31143 000000018B8DC5E8 57352278 59783 4 59779 81 20
228 29979 000000018B8A5CE8 104090722 28359401 140174 28213248 334439191 2386
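For reference, the two child latch addresses above can be taken from P1 of the waiting sessions (6636068072 = 0x18B8A5CE8), or, assuming access to x$bh as SYS, by mapping each leaf block to the hash chain latch that protects it; a sketch:
-- map each leaf block to its protecting CBC child latch (x$bh.hladdr)
select bh.dbablk, bh.hladdr, lc.child#, lc.gets, lc.misses, lc.sleeps
  from x$bh bh, v$latch_children lc
 where bh.hladdr = lc.addr
   and lc.name   = 'cache buffers chains'
   and bh.dbablk in (2661263, 3665609);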
With one more query on v$latch_misses, we can display the CBC latch contention locations.
select * from v$latch_misses
where parent_name = 'cache buffers chains' and sleep_count > 0
order by sleep_count desc;
PARENT_NAME WHERE NWFAIL_COUNT SLEEP_COUNT WTR_SLP_COUNT LONGHOLD_COUNT LOCATION
-------------------- ------------------------ ------------ ----------- ------------- -------------- -----------------------
cache buffers chains kcbgtcr: slow path excl 0 181709 24123 2864 kcbgtcr: slow path excl
cache buffers chains kcbgtcr: fast path exam 0 61093 160708 10363 kcbgtcr: fast path exam
cache buffers chains kcbgkcbcr 0 8999 65758 1806 kcbgkcbcr
cache buffers chains kcbrls: pin release 0 7171 10946 1580 kcbrls: pin release
cache buffers chains kcb_trim_hash_chain 0 2191 19 30 kcb_trim_hash_chain
cache buffers chains kcbchg: cr pin cleanout 0 444 82 43 kcbchg: cr pin cleanout
cache buffers chains kcbcge 0 3 54 1 kcbcge
cache buffers chains kcbchg: cr pin change 0 3 0 0 kcbchg: cr pin change
cache buffers chains kcbzwb 0 1 0 0 kcbzwb
cache buffers chains kcbgtcr: retry after read 0 1 0 0 kcbgtcr: retry after read
The top Locations are "kcbgtcr: slow path excl" and "kcbgtcr: fast path exam".
"kcbgtcr: slow path excl" has the higher SLEEP_COUNT,
but "kcbgtcr: fast path exam" has a much higher
WTR_SLP_COUNT (number of times a waiter slept) and
LONGHOLD_COUNT (number of times someone held a latch for the entire duration of someone else's sleep). In the previous Blog for the normal reverse index, the top Location is "kcbgtcr: fast path pin".
LONGHOLD_COUNT is a paramount indicator for latch misses since it signifies that the sleeping sessions can go to sleep again ("Recurrent Sleeps", see Blog: Oracle Latch Misses, Spin_gets, Sleeps and Recurrent_Misses). But it is not included in the AWR "Latch Miss Sources" section.
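For comparison, the AWR "Latch Miss Sources" section is built from DBA_HIST_LATCH_MISSES_SUMMARY; assuming its usual column layout, the snapshot history exposes SLEEP_COUNT and WTR_SLP_COUNT, but no LONGHOLD_COUNT column:
-- AWR snapshots of latch miss sources; note there is no LONGHOLD_COUNT column here
select snap_id, parent_name, where_in_code, nwfail_count, sleep_count, wtr_slp_count
  from dba_hist_latch_misses_summary
 where parent_name = 'cache buffers chains'
 order by snap_id, sleep_count desc;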
If we make a tree dump of index TEST_TAB#R, we can see the two leaf blocks 2661263 and 3665609:
alter session set tracefile_identifier = "TD_TEST_TAB#R_1";
alter session set events 'immediate trace name treedump level 2460326';
----- begin tree dump
branch: 0x13589b 1267867 (0: nrow: 16, level: 1)
leaf: 0x289b8f 2661263 (-1: row:291.291 avs:3932)
leaf: 0x37eec9 3665609 (0: row:282.282 avs:4060)
leaf: 0x1358a1 1267873 (1: row:295.295 avs:3878)
The above dump shows that the index root node block is 1267867.
With the following query, we can see that it is the block immediately after the header_block (1267866) of the index TEST_TAB#R segment:
SQL > select segment_name, header_file, header_block from dba_segments where segment_name = 'TEST_TAB#R';
SEGMENT_NAME HEADER_FILE HEADER_BLOCK
------------ ----------- ------------
TEST_TAB#R 22 1267866
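If needed, the root branch block can also be inspected directly with a block dump to the session trace file (assuming the file/block numbers above):
alter session set tracefile_identifier = "ROOT_TEST_TAB#R";
alter system dump datafile 22 block 1267867;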
Even though there is heavy "latch: cache buffers chains" contention, it seems that no blocking sessions are tracked for this wait event:
SQL > select * from v$wait_chains where chain_signature like 'latch: cache buffers chains' or blocker_sid is not null;
no rows selected
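For latch waits, one way to try to catch the current holder is v$latchholder, although CBC latches are held so briefly that the query usually returns nothing:
-- current latch holders, if any are caught at query time
select pid, sid, laddr, name, gets
  from v$latchholder
 where name = 'cache buffers chains';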
4. Latch State Transition Diagram By Dtrace
Pick one CBC latch contention session (SID: 367, Oracle PID: 44 (0x2C), spid: 5222) from the above v$session and v$process query.
Run the following DTrace script on the contention latch address 0x18B8A5CE8 (for the Latch State Transition Diagram, see Blog: PL/SQL Function Result Cache Invalidation (I)).
sudo dtrace -w -n \
'
typedef unsigned int ub4;
typedef struct latchstate /* latch state */ {
ub4 nproc_pid; /* Nproc or pid */
ub4 flag; /* blocking flag */
ub4 gets; /* number of gets */
ub4 latch; /* latch number */
ub4 level; /* latch level */
ub4 where; } latchstate;
BEGIN {laddr = $1; step = 1; begin_gets = 0; end_gets = 0; prev_ts = timestamp;
self->evt = "begin"; self->loc = 0x0;}
pid$target::kslgess:entry /arg0 == laddr && (self->evt != probefunc || self->loc != arg4)/ {
printf("\n---------Step %d at %Y (Elapsed-ns=%d)---------", step, walltimestamp, (timestamp-prev_ts));
ls = ((latchstate *) copyin(arg0, sizeof(latchstate)));
begin_gets = (begin_gets == 0 ? ls->gets: begin_gets); end_gets = ls->gets;
printf("\n***State Entry>>>(nproc_pid=>%d, flag=>%d, gets=>%d, latch=>%d, level=>%d, where=>%d)"
,ls->nproc_pid, ls->flag, ls->gets, ls->latch, ls->level, ls->where);
printf("\n>>>Transition>>>%s:%s\n (Addr=>0x%X, Where=>%d)",
probefunc, probename, arg0, arg4);
self->evt = probefunc; self->loc = arg4;
ustack(10, 0);
/*stop(); exit(0);*/ /* no latch blocking, failed get, try again */
step = step + 1; self->entry = 1; prev_ts = timestamp;}
pid$target::kslgess:return /self->entry == 1/ {
ls = ((latchstate *) copyin(laddr, sizeof(latchstate)));
printf("\n>>>State Return***(nproc_pid=>%d, flag=>%d, gets=>%d, latch=>%d, level=>%d, where=>%d)"
,ls->nproc_pid, ls->flag, ls->gets, ls->latch, ls->level, ls->where);
/*stop(); exit(0);*/ /* simulate long latch holder, run by: dtrace -w -n */
self->entry = 2;}
pid$target::kslfre:entry /self->entry == 2 && arg0 == laddr && self->evt != probefunc/ {
printf("\n---------Step %d at %Y (Elapsed-ns=%d)---------", step, walltimestamp, (timestamp-prev_ts));
ls = ((latchstate *) copyin(arg0, sizeof(latchstate)));
end_gets = ls->gets;
printf("\n>>>Transition>>>%s:%s(Addr=>0x%-X)", probefunc, probename, arg0);
self->evt = probefunc; self->loc = 0x0;
/*stop(); exit(0);*/ /* simulate long latch holder, run by: dtrace -w -n */
printf("\n===Latch freed===");
ustack(10, 0); step = step + 1; prev_ts = timestamp;}
END {
printf("\n---------Step %d at %Y (Elapsed-ns=%d)---------", step, walltimestamp, (timestamp-prev_ts));
printf("\nLatch Gets = %d (%d - %d)", (end_gets - begin_gets), (unsigned long)begin_gets, (unsigned long)end_gets); }
' -p 5222 0x18B8A5CE8
The output consists of repeated occurrences of the following pattern:
kslgess:entry
---------Step 1825 at 2020 Feb 24 15:51:09 (Elapsed-ns=6088)---------
***State Entry>>>(nproc_pid=>47, flag=>536870912, gets=>461619058, latch=>228, level=>1, where=>2168)
>>>Transition>>>kslgess:entry
(Addr=>0x18B8A5CE8, Where=>2218)
oracle`kslgess
a.out`kcbgtcr+0x1a323
a.out`ktrgtc2+0x316
a.out`qeilbk1+0x4a1
a.out`qeilbk0+0xe
a.out`qerixFetchUniqueIndex+0x17f
a.out`qerstFetch+0x543
a.out`qertbFetchByRowID+0x39e
a.out`qerstFetch+0x4ca
a.out`opifch2+0x188b
kslgess:return
>>>State Return***(nproc_pid=>44, flag=>536870912, gets=>461619062, latch=>228, level=>1, where=>2218)
kslgess:entry
---------Step 1826 at 2020 Feb 24 15:51:09 (Elapsed-ns=250117)---------
***State Entry>>>(nproc_pid=>43, flag=>536870912, gets=>461619113, latch=>228, level=>1, where=>2186)
>>>Transition>>>kslgess:entry
(Addr=>0x18B8A5CE8, Where=>2186)
oracle`kslgess
oracle`kcbrls+0x170c
a.out`ktrgtc2+0x554
a.out`qeilbk1+0x4a1
a.out`qeilbk0+0xe
a.out`qerixFetchUniqueIndex+0x17f
a.out`qerstFetch+0x543
a.out`qertbFetchByRowID+0x39e
a.out`qerstFetch+0x4ca
a.out`opifch2+0x188b
It shows the CBC latch state transition.
At first, the C subroutine kslgess is called
to get the CBC latch at the entry location (where=>2168) and the next location (where=>2186).
The latch is then freed by kslfre.
The call stack shows that each row is accessed by qertbFetchByRowID via qerixFetchUniqueIndex
(TABLE ACCESS BY INDEX ROWID via INDEX UNIQUE SCAN). For the where locations, we can use the query below to resolve them.
CREATE OR REPLACE FORCE VIEW X_GV$LATCH_MISSES as
SELECT t1.indx INDX,
t1.inst_id INST_ID,
t1.ksllasnam PARENT_NAME,
t2.ksllwnam "WHERE",
t1.kslnowtf NWFAIL_COUNT,
t1.kslsleep SLEEP_COUNT,
t1.kslwscwsl WTR_SLP_COUNT,
t1.kslwsclthg LONGHOLD_COUNT,
t2.ksllwnam LOCATION,
t1.con_id CON_ID
FROM x$ksllw t2, x$kslwsc t1
WHERE t2.indx = t1.indx;
select * from sys.X_GV$LATCH_MISSES
where parent_name in ('cache buffers chains')
and indx in (2168, 2186, 2218);
INDX INST_ID PARENT_NAME WHERE NWFAIL_COUNT SLEEP_COUNT WTR_SLP_COUNT LONGHOLD_COUNT LOCATION
---- ------- -------------------- ---------------------- ------------ ----------- ------------- -------------- ---------------------
2168 1 cache buffers chains kcbgtcr: fast path exam 0 106323 185902 15798 kcbgtcr: fast path exam
2186 1 cache buffers chains kcbrls: pin release 0 14959 22770 3187 kcbrls: pin release
2218 1 cache buffers chains kcbgtcr: slow path excl 0 189591 48044 3576 kcbgtcr: slow path excl
The above output shows that the highest LONGHOLD_COUNT is at location "kcbgtcr: fast path exam"
when using INDEX UNIQUE SCAN for the TABLE ACCESS,
whereas when using INDEX RANGE SCAN, the top location is "kcbgtcr: fast path pin".
This tiny nuance of location change can play a critical role in CBC latch contention investigation.
It also shows the different handling of a reverse primary key index vs. a normal reverse index,
which is probably the result of some diligent Oracle optimization work.
In the above DTrace script, we trace the kslgess function. Blog: Oracle 12 and latches, part 2 describes it as:
    kslgess: kernel service latch get spin shared (latch)
Blog: Latch, mutex and beyond writes:
    kslgess: shared latch wait get
About kcbrls, Blog: Buffer is pinned count says:
    The function kcbrls (Kernel Cache Buffer Release) is used to release a buffer; a release is also possible from other places, for example, kcb_post_apply can call kcbzfs (detach handle from x$kcbbf) directly.
5. Buffer Cache Activity Internals
If we run a 10200 and 10046 trace for 100 loops over id 6 to 10, we can see 500 "ktrgtc2 -> ktrexc" INDEX UNIQUE SCAN calls on the new block (3665609 = 0x0037eec9)
("ktrexc" is used for a unique index and the index root, "ktrexf" for a non-unique index;
see Blog: Oracle Unique Index Non-Consistency Read with RowCR).
alter session set events='10046 trace name context forever, level 12:
10200 trace name context forever, level 10'
tracefile_identifier='10046_10200_index_rp_2';
exec test_tab_select(6, 1000, 6, 0.00);
alter session set events='10046 trace name context off : 10200 trace name context off ';
ktrgtc2(): started for block <0x07cf : 0x0037eec9> objd: 0x00258aa6
ktrexc(): returning 9
ktrgtc2(): completed for block <0x07cf : 0x0037eec9> objd: 0x00258aa6
6. X$TRACE Latch Operations
In this section, we try to use X$TRACE to trace latch operations (10005). Pick one CBC latch contention session (SID: 367, Oracle PID: 44 (0x2C), spid: 5222) and trace all levels (15).
$ > oerr ora 10005
10005, 00000, "trace latch operations for debugging"
// *Action: Enable tracing for various latch operations
// level 1 - trace latch gets and frees
// level 2 - protect/unprotect latch recovery areas
// level 4 - trace multiple posts by processes when latch is freed
// level 8 - signal error on bad latch waits
alter tracing enable "10005:15:44";
SQL> select * from x$trace_events where procs in (44);
ADDR INDX INST_ID CON_ID EVENT TRCLEVEL STATUS PROCS FLAGS
---------------- ---- ------- ------ ----- -------- ------ ----- -----
FFFF80FFBC1361C8 5 1 0 10005 15 1 44 1
The following query on x$trace shows the latch wait begin (KSL WAIT BEG) and end (KSL WAIT END) for each seq_num
on the contention child latch (6636068072/0x18b8a5ce8).
One wait has a longer elapsed time of 169973 usecs, another is as short as 81 usecs. 169973 usecs is about 170 ms. For a CPU time slice of 10 ms, the process can be scheduled to run only after about 17 sleeps. That is also what we have observed in some real applications.
select event, seq#, pid, file_loc,function, data
from x$trace
where pid in (44)
order by time;
EVENT SEQ# PID FILE_LOC FUNCTION DATA
----- ---- --- ----------- --------- ----------------------------------------------------------------------------------------------------------------------
10005 7186 44 ksl2.c@2966 kslwtbctx KSL WAIT BEG [latch: cache buffers chains] 6636068072/0x18b8a5ce8 228/0xe4 0/0x0 wait_id=11277 seq_num=11278 snap_id=1
10005 7200 44 ksl2.c@3109 kslwtectx KSL WAIT END [latch: cache buffers chains] 6636068072/0x18b8a5ce8 228/0xe4 0/0x0 wait_id=11277 seq_num=11278 snap_id=1
10005 7201 44 ksl2.c@3117 kslwtectx KSL WAIT END wait times (usecs) - snap=169973, exc=169973, tot=169973
10005 7280 44 ksl2.c@2966 kslwtbctx KSL WAIT BEG [latch: cache buffers chains] 6636068072/0x18b8a5ce8 228/0xe4 0/0x0 wait_id=11278 seq_num=11279 snap_id=1
10005 7301 44 ksl2.c@3109 kslwtectx KSL WAIT END [latch: cache buffers chains] 6636068072/0x18b8a5ce8 228/0xe4 0/0x0 wait_id=11278 seq_num=11279 snap_id=1
10005 7302 44 ksl2.c@3117 kslwtectx KSL WAIT END wait times (usecs) - snap=35031, exc=35031, tot=35031
10005 7312 44 ksl2.c@2966 kslwtbctx KSL WAIT BEG [latch: cache buffers chains] 6636068072/0x18b8a5ce8 228/0xe4 0/0x0 wait_id=11279 seq_num=11280 snap_id=1
10005 7317 44 ksl2.c@3109 kslwtectx KSL WAIT END [latch: cache buffers chains] 6636068072/0x18b8a5ce8 228/0xe4 0/0x0 wait_id=11279 seq_num=11280 snap_id=1
10005 7318 44 ksl2.c@3117 kslwtectx KSL WAIT END wait times (usecs) - snap=81, exc=81, tot=81