First we run queries and make dumps to expose row cache data structure and content. Then we perform 10222 event trace and gdb script to understand row cache data access and row cache mutex usage. Finally we make a few discussions on the test result.
Note: Tested in Oracle 19c on Linux.
Acknowledgement: This work results from the help and discussions with a longtime Oracle specialist.
1. Row Cache: 'dc_props' and 'dc_cdbprops'
Following query shows that dc_props has 85 objects, and dc_cdbprops has 6.
select cache#, type, subordinate#, parameter, count, usage, fixed, gets, fastgets
from v$rowcache where parameter in ('dc_props', 'dc_cdbprops');
CACHE# TYPE SUBORDINATE# PARAMETER COUNT USAGE FIXED GETS FASTGETS
------ ------ ------------ ----------- ----- ----- ----- ------- --------
15 PARENT dc_props 85 85 0 1162164 0
60 PARENT dc_cdbprops 6 6 0 1127 0
One query on nls_database_parameters requires 88 row cache Gets.
(GETS_DELTA should be 85, the difference of 3 is due to v$rowcache query.
dc_props is the backbone of nls_database_parameters.
See Blog:
nls_database_parameters, dc_props, latch: row cache objects )
column gets new_value gets_old;
column fastgets new_value gets_oldf;
select sum(gets) gets, sum(fastgets) fastgets from v$rowcache where parameter = 'dc_props';
select value from nls_database_parameters where parameter = 'NLS_CHARACTERSET';
select (sum(gets)-&gets_old) gets_delta, (sum(fastgets)-&gets_oldf) gets_deltaf, sum(gets) gets, sum(fastgets) fastgets from v$rowcache where parameter = 'dc_props';
GETS_DELTA GETS_DELTAF GETS FASTGETS
---------- ----------- ---------- ----------
88 0 138749 0
By the way, Oracle has a hidden parameter to turn on FASTGETS when TRUE,
and 19c changed the default value from FALSE to TRUE.
- _kqr_optimistic_reads
- Description: optimistic reading of row cache objects
- Default: FALSE (Oracle 12.1); TRUE (Oracle 19.6)
- Change: alter system set "_kqr_optimistic_reads"=false; (require DB restart)
select Hash_Bucket_Size, count(*) Number_of_Buckets, (Hash_Bucket_Size * count(*)) Number_of_RowCache_Objects from
(select (hash+1) Hash_Bucket, count(*) Hash_Bucket_Size from v$rowcache_parent where cache_name = 'dc_props' group by hash)
group by Hash_Bucket_Size order by 2 desc;
HASH_BUCKET_SIZE NUMBER_OF_BUCKETS NUMBER_OF_ROWCACHE_OBJECTS
---------------- ----------------- --------------------------
1 22 22
2 13 26
3 5 15
4 4 16
6 1 6
Here the similar query for 6 'dc_cdbprops' objects:
select Hash_Bucket_Size, count(*) Number_of_Buckets, (Hash_Bucket_Size * count(*)) Number_of_RowCache_Objects from
(select (hash+1) Hash_Bucket, count(*) Hash_Bucket_Size from v$rowcache_parent where cache_name = 'dc_cdbprops' group by hash)
group by Hash_Bucket_Size order by 2 desc;
HASH_BUCKET_SIZE NUMBER_OF_BUCKETS NUMBER_OF_ROWCACHE_OBJECTS
---------------- ----------------- --------------------------
2 2 4
1 2 2
With following helper function and query, we can reveal the exact name of each 'dc_props' row cache object
(See Blog:
Oracle ROWCACHE Views and Contents)
----helper function----
create or replace function dump_hex2str (dump_hex varchar2) return varchar2 is
l_str varchar2(100);
begin
with sq_pos as (select level pos from dual connect by level <= 1000)
,sq_chr as (select pos, chr(to_number(substr(dump_hex, (pos-1)*2+1, 2), 'XX')) ch
from sq_pos where pos <= length(dump_hex)/2)
select listagg(ch, '') within group (order by pos) word
into l_str
from sq_chr;
return l_str;
end;
/
select dump_hex2str(rtrim(key, '0')) dc_prop_name, v.*
from v$rowcache_parent v
where cache_name in ('dc_props')
order by key;
DC_PROP_NAME INDX HASH ADDRESS CACHE# CACHE_NAME EXISTENT
-------------------- ---- ---- ---------------- ------ ---------- --------
COMMON_DATA_MA 8840 1 0000000094E51558 15 dc_props N
MAX_PDB_SNAPSHOTS 8838 0 0000000093638360 15 dc_props Y
NLS_SPECIAL_CHARS 8842 2 000000009F55FB18 15 dc_props N
NLS_TIMESTAMP_FORMAT 8841 2 000000008C2726C8 15 dc_props Y
PDB_AUTO_UPGRADE 8839 0 0000000095BCE218 15 dc_props N
...
85 rows selected.
2. Row Cache Data Dump
With following command, we can dump 'dc_props'. (See MOS Bug 19354335 - Diagnostic enhancement for rowcache data dumps (Doc ID 19354335.8))
alter session set tracefile_identifier = 'dc_props_dump';
-- dump level 0xf2b: f is cache id 15 ('dc_props'), 2 is single cacheiddump, b is level of 11
alter session set events 'immediate trace name row_cache level 0xf2b';
alter session set events 'immediate trace name row_cache off';
The dump consists of 3 sections, the first is about all ROW CACHE STATISTICS,
the second is ROW CACHE HASH TABLE for our specified 'dc_props' (same as above query output about HASH_BUCKET),
the third is about each BUCKET and its grouped rows.
For example, 4 rows are hashed to BUCKET 50.
After BUCKET 50 is BUCKET 52, there is no BUCKET 51.We will pick BUCKET 50 and its last row cache parent object "addr=0x8d7bc858" in our later discussion.
ROW CACHE STATISTICS:
cache size gets misses hit ratio
-------------------------- ------- ------- ------ ---------
dc_tablespaces 560 94329 3277 0.966
dc_free_extents 336 0 0 0.000
dc_segments 416 9751 3050 0.762
dc_rollback_segments 496 769333 367 1.000
...
ROW CACHE HASH TABLE: cid=15 ht=0xa709f910 size=64
Hash Chain Size Number of Buckets
--------------- -----------------
0 0
1 22
2 13
3 5
4 4
5 0 <<< HASH_BUCKET_SIZE 5 has 0 Buckets. Same as previous query
6 1
BUCKET 50:
row cache parent object: addr=0x8d776c38 cid=15(dc_props) conid=0 conuid=0
hash=641d81f1 typ=21 transaction=(nil) flags=00000001 inc=1, pdbinc=1
own=0x8d776d08[0x8d776d08,0x8d776d08] wat=0x8d776d18[0x8d776d18,0x8d776d18] mode=N req=N
status=EMPTY/-/-/-/-/-/-/-/- KGH unpinned
data=
row cache parent object: addr=0x8e8c5b10 cid=15(dc_props) conid=0 conuid=0
hash=534cbb1 typ=21 transaction=(nil) flags=00000002 inc=1, pdbinc=1
own=0x8e8c5be0[0x8e8c5be0,0x8e8c5be0] wat=0x8e8c5bf0[0x8e8c5bf0,0x8e8c5bf0] mode=N req=N
status=VALID/-/-/-/-/-/-/-/- KGH unpinned
row cache parent object: addr=0x8e8e5be0 cid=15(dc_props) conid=0 conuid=0
hash=3cf1dc71 typ=21 transaction=(nil) flags=00000002 inc=1, pdbinc=1
own=0x8e8e5cb0[0x8e8e5cb0,0x8e8e5cb0] wat=0x8e8e5cc0[0x8e8e5cc0,0x8e8e5cc0] mode=N req=N
status=VALID/-/-/-/-/-/-/-/- KGH unpinned
row cache parent object: addr=0x8d7bc858 cid=15(dc_props) conid=0 conuid=0
hash=819e131 typ=21 transaction=(nil) flags=00000002 inc=1, pdbinc=1
own=0x8d7bc928[0x8d7bc928,0x8d7bc928] wat=0x8d7bc938[0x8d7bc938,0x8d7bc938] mode=N req=N
status=VALID/-/-/-/-/-/-/-/- KGH unpinned
data=
BUCKET 50 total object count=4
BUCKET 52:
row cache parent object: addr=0x8d76c148 cid=15(dc_props) conid=0 conuid=0
3. Processstate Dump
Additionally, we make a Processstate Dump, which contains "call: 0xb74520f0", and can be mapped to later 10222 trace "pso=0xb74520f0", and gdb "kqrpre1 pso (SOC): b74520f0".
alter session set tracefile_identifier = 'proc_state';
alter session set events 'immediate trace name PROCESSSTATE level 10';
alter session set events 'immediate trace name PROCESSSTATE off';
Each SO (State Object) seems having one SOC (State Object Call (or Copy) as current instance)
PROCESS STATE
-------------
Process global information:
process: 0xb84202a0, call: 0xb74520f0, xact: (nil), curses: 0xb898ecc8, usrses: 0xb898ecc8
in_exception_handler: no
----------------------------------------
SO: 0xb8eeb9a8, type: process (2), map: 0xb84202a0
state: LIVE (0x4532), flags: 0x1
owner: (nil), proc: 0xb8eeb9a8
link: 0xb8eeb9c8[0xb8eeb9c8, 0xb8eeb9c8]
child list count: 6, link: 0xb8eeba18[0xb8f060c8, 0xa6fe8208]
pg: 0
SOC: 0xb84202a0, type: process (2), map: 0xb8eeb9a8
state: LIVE (0x99fc), flags: INIT (0x1)
(process) Oracle pid:55, ser:96, calls cur/top: 0xb74520f0/0xb74520f0
...
----------------------------------------
SO: 0xb8f226a8, type: call (3), map: 0xb74520f0
state: LIVE (0x4532), flags: 0x1
owner: 0xb8eeb9a8, proc: 0xb8eeb9a8
link: 0xb8f226c8[0xa6fe8208, 0xb4ed1eb0]
child list count: 0, link: 0xb8f22718[0xb8f22718, 0xb8f22718]
pg: 0
SOC: 0xb74520f0, type: call (3), map: 0xb8f226a8
state: LIVE (0x99fc), flags: INIT (0x1)
(call) sess: cur b898ecc8, rec 0, usr b898ecc8; flg:20 fl2:1; depth:0
svpt(xcb:(nil) sptn:0x13f uba: 0x00000000.0000.00 uba: 0x00000000.0000.00)
4. Row Cache Event 10222
Make one 10222 trace: (See Blog: Oracle row cache objects Event: 10222, Dtrace Scripts (I))
alter session set tracefile_identifier = 'dc_props_10222';
alter session set events '10222 trace name context forever, level 4294967295';
select value from nls_database_parameters where parameter = 'NLS_CHARACTERSET';
alter session set events '10222 trace name context off';
The trace file contains Function Calls and Call Stack,
for example, Parent Object po=0x8d7bc858 has hash=819e131, pso=0xb74520f0.
The same information can be found in the output of next gdb Script.
kqrpre: start hash=819e131 mode=S keyIndex=0 dur=CALL opt=FALSE
kqrpre: found cached object po=0x8d7bc858 flg=2
kqrmpin : kqrpre: found po Pin po 0x8d7bc858 cid=15 flg=2 hash=819e131
time=978904531
kqrpre: pinned po=0x8d7bc858 flg=2 pso=0xb74520f0
pinned stack po=0x8d7bc858 cid=15: ----- Abridged Call Stack Trace -----
ksedsts()+426<-kqrpre2 code="" kkdlpexecsql="" kkdlpexecsqlcbk="" kkdlpget="" kqrpre1="">-kqrpre2>
5. gdb Script
In Oracle session, we run again query on nls_database_parameters, and at the same time, trace the process with Blog appended Script: gdb_cmd_script2 in a UNIX window, write the trace output into file mutex-test2.log.
$ > gdb -x gdb_cmd_script2 -p 18158
SQL > select value from nls_database_parameters where parameter = 'NLS_CHARACTERSET';
Excerpted from mutex-test2.log, here the access pattern for row cache row with addr=0x8d7bc858.
Breakpoint 334, 0x00000000126e3930 in kqrpre1 ()
<<<==================== (1) kqrpre1 pso (SOC): b74520f0 ====================>>>
Breakpoint 124, 0x00000000126dd4e0 in kqrpre2 ()
Breakpoint 125, 0x00000000126df840 in kqrhsh ()
Breakpoint 335, 0x00000000126df8ad in kqrhsh ()
(1) kqrhsh retq hash (rax): hash=819e131
Breakpoint 135, 0x00000000126e33f0 in kqrGetHashMutexInt ()
Breakpoint 336, 0x00000000129da970 in kgxExclusive ()
(1-1) ---> kgxExclusive Mutex addr (rsi): a70a01a8
Breakpoint 146, 0x00000000126e5fd0 in kqrGetPOMutex ()
Breakpoint 336, 0x00000000129da970 in kgxExclusive ()
(1-2) ---> kgxExclusive Mutex addr (rsi): 8d7bc960
Breakpoint 137, 0x00000000126e38c0 in kqrIncarnationMismatch ()
Breakpoint 132, 0x00000000126e2170 in kqrCacheHit ()
Breakpoint 338, 0x00000000126e2980 in kqrmpin ()
(1) kqrmpin v$rowcache_parent.address (rdi): addr=8d7bc858 cid=15
Breakpoint 131, 0x00000000126e1fa0 in kqrFreeHashMutex ()
Breakpoint 337, 0x00000000129db980 in kgxRelease ()
(1-1) <--- kgxRelease Mutex addr (*rsi): a70a01a8
Breakpoint 127, 0x00000000126dfd30 in kqrLockPo ()
Breakpoint 128, 0x00000000126e0180 in kqrAllocateEnqueue ()
Breakpoint 129, 0x00000000126e0790 in kqrget ()
Breakpoint 137, 0x00000000126e38c0 in kqrIncarnationMismatch ()
Breakpoint 137, 0x00000000126e38c0 in kqrIncarnationMismatch ()
Breakpoint 126, 0x00000000126dfc40 in kqrFreePOMutex ()
Breakpoint 337, 0x00000000129db980 in kgxRelease ()
(1-2) <--- kgxRelease Mutex addr (*rsi): 8d7bc960
Breakpoint 139, 0x00000000126e3980 in kqrprl ()
Breakpoint 140, 0x00000000126e3ed0 in kqreqd ()
Breakpoint 146, 0x00000000126e5fd0 in kqrGetPOMutex ()
Breakpoint 336, 0x00000000129da970 in kgxExclusive ()
(1-3) ---> kgxExclusive Mutex addr (rsi): 8d7bc960
Breakpoint 141, 0x00000000126e4d40 in kqrReleaseLock ()
Breakpoint 143, 0x00000000126e5910 in kqrmupin ()
Breakpoint 142, 0x00000000126e4ff0 in kqrpspr ()
Breakpoint 337, 0x00000000129db980 in kgxRelease ()
(1-3) <--- kgxRelease Mutex addr (*rsi): 8d7bc960
Breakpoint 144, 0x00000000126e5e30 in kqrFreeEnqueue ()
6. Discussions
- gdb output mutex-test2.log contains 95 kqrpre1 Calls, each for one 'dc_props' / 'dc_cdbprops' object
(total 85+6=91, 4 are fetched twice). - For each row cache access, first compute a hash value (hash=819e131) and then access object (addr=8d7bc858).
- Each RowCache Object Get triggers three kgxExclusive Mutex Gets,
one from kqrGetHashMutexInt (Mutex addr (rsi): a70a01a8),
two from kqrGetPOMutex (Mutex addr (rsi): 8d7bc960). - Mutex a70a01a8 is used in kqrGetHashMutexInt to protect hash BUCKET, hence BUCKET mutex.
In above example, BUCKET 50 contains 4 row cache objects, so there are 4 occurrences in mutex-test2.log.
Mutex 8d7bc960 is used in kqrGetPOMutex to protect individual row cache object, hence OBJECT mutex.
So each row cache object has its own mutex. - kqrGetHashMutexInt Mutex is released after first kqrGetPOMutex Get (interleaved).
So at certain instant, one session can hold two Mutexes at the same time (same as latch).
For latch, there is MOS Note: ORA-600 [504] "Trying to obtain a latch which is already held" (Doc ID 28104.1), which said:
"This ORA-600 triggers when a latch request violates the ordering rules for obtaining latches and granting the latch would potentially result in a deadlock." (See Blog: ORA-600 [504] LATCH HIERARCHY ERROR Deadlock)
For Mutex, it is not clear if there could exist such similar deadlock. - The first Mutex Get is from kqrGetHashMutexInt, called by kqrpre, and released only after the second Mutex Get (the first of kqrGetPOMutex mutex Get).
So kqrpre can hold two Mutexes and keep the first Mutex longer (till kqrFreeHashMutex).
kqreqd makes one Mutex Get.
In case of Latch (Oracle 12c), there are three Latch Gets with 3 different Locations:
(see Blog: Oracle row cache objects Event: 10222, Dtrace Scripts (I) )Where=>4441(0x1159): kqrpre: find obj -> kslgetl() -- latch Get at 1st Location Where=>4464(0x1170): kqreqd -> kslgetl() -- latch Get at 2nd Location Where=>4465(0x1171): kqreqd: reget -> kslgetl() -- latch Get at 3rd Location
They are probably mapped to three kgxExclusive Mutex Gets as discussed in above Point (c). - If one session, which held a mutex, but abnormally terminated, CLMN wrote a log
about its cleanup and recovery of the mutex (0xa70a01a8).
KGX cleanup... KGX Atomic Operation Log 0x928ad480 Mutex 0xa70a01a8(14, 0) idn f00003c oper EXCL(6) Row Cache uid 14 efd 6 whr 19 slp 0 oper=0 pt1=(nil) pt2=(nil) pt3=(nil) pt4=(nil) pt5=(nil) ub4=15 flg=0x8 KQR UOL Recovery lc 928ad480 options = 1
7. kqrMutexLocations[] array
For all kqrMutexLocations in kqr.c, we can try to list them with following command. They often appear in AWR section "Mutex Sleep Summary" for Mutex Type: "Row Cache" (or V$MUTEX_SLEEP / V$MUTEX_SLEEP_HISTORY.location).
// uname -srm
// Linux 3.10.0-957.5.1.el7.x86_64 x86_64 cast (uint64_t *) optional
// Linux 4.18.0-372.9.1.el8.x86_64 x86_64 requires cast (uint64_t *)
#Define Command to kqrMutexLocations
define PrintkqrMutexLocations
set $i = 0
while $i < $arg0
x /s *((uint64_t *)&kqrMutexLocations + $i)
set $i = $i + 1
end
end
(gdb) PrintkqrMutexLocations 51
0x14c21f40: "[01] kqrHashTableInsert"
0x14c21f58: "[02] kqrHashTableRemove"
0x14c21f70: "[03] kqrUpdateHashTable"
0x14c21f88: "[04] kqrshu"
0x14c21f94: "[05] kqrWaitForObjectLoad"
0x14c21fb0: "[06] kqrGetClusterLock"
0x14c21fc8: "[07] kqrBackgroundInvalidate"
0x14c21fe8: "[08] kqrget"
0x14c21ff4: "[09] kqrReleaseLock"
0x14c22008: "[10] kqreqd"
0x14c22014: "[11] kqrfpo"
0x14c22020: "[12] kqrpad"
0x14c2202c: "[13] kqrsad"
0x14c22038: "[14] kqrScan"
0x14c22048: "[15] kqrCacheHit"
0x14c2205c: "[16] kqrReadFromDB"
0x14c22070: "[17] kqrCreateUsingSecondaryKey"
0x14c22090: "[18] kqrCreateNewVersion"
0x14c220ac: "[19] kqrpre"
0x14c220b8: "[20] kqrpla"
0x14c220c4: "[21] kqrpScanAndInvalidateLocal"
0x14c220e4: "[22] kqrpScan"
0x14c220f4: "[23] kqrpsiv"
0x14c22104: "[24] kqrpsci"
0x14c22114: "[25] kqrpup"
0x14c22120: "[26] kqrpfu"
0x14c2212c: "[27] kqrpdl"
0x14c22138: "[28] kqrLocalInvalidateByHash"
0x14c22158: "[29] kqrpiv"
0x14c22164: "[30] kqrpsf"
0x14c22170: "[31] kqrcmt"
0x14c2217c: "[32] kqrsfd"
0x14c22188: "[33] kqrsrd"
0x14c22194: "[34] kqrssc"
0x14c221a0: "[35] kqrisi"
0x14c221ac: "[36] kqrsup"
0x14c221b8: "[37] kqrsfu"
0x14c221c4: "[38] kqrsdl"
0x14c221d0: "[39] kqrsiv"
0x14c221dc: "[40] kqrfrpo"
0x14c221ec: "[41] kqrfrso"
0x14c221fc: "[42] kqrdfc"
0x14c22208: "[43] kqrlfc"
0x14c22214: "[44] kqrdpc"
0x14c22220: "[45] kqrdhst"
0x14c22230: "[46] kqrMutexCleanup"
0x14c22248: "[47] kqrftr"
0x14c22254: "[48] kqrhngc"
0x14c22264: "[49] kqrcic"
0x14c22270: "[50] kqrpspr"
0x14c22280: "[51] kqrglblkrs"
8. gdb script: gdb_cmd_script2
Before running the script, the breakpoint number and address have to be adjusted.
### ----================ gdb script: gdb -x gdb_cmd_script2 -p 18158 ================----
set pagination off
set logging file /temp/mutex-test2.log
set logging overwrite on
set logging on
set $k = 0
set $g = 0
set $r = 0
set $spacex = ' '
rbreak ^kqr.*
rbreak ^kgx.*
command 1-333
continue
end
#kqrmpin
delete 133
#kqrpre1
delete 138
#kgxExclusive
delete 326
#kgxRelease
delete 331
#10222 trc pso, processstate dump SOC (State Object Call): $r8
break *kqrpre1
command
printf "<<<==================== (%i) kqrpre1 pso (SOC): %x ====================>>>\n", ++$k, $r8
set $g = 0
set $r = 0
continue
end
#kqrhsh+109 retq, output hash value in "oradebug dump row_cache 0xf2b" (0xf2b is for "dc_props")
break *0x00000000126df8ad
command
printf "%5c(%i) kqrhsh retq hash (rax): hash=%x\n", $spacex, $k, $rax
continue
end
break *kgxExclusive
command
printf "%5c(%i-%i) ---> kgxExclusive Mutex addr (rsi): %x\n", $spacex, $k, ++$g, $rsi
continue
end
break *kgxRelease
command
printf "%5c(%i-%i) <--- kgxRelease Mutex addr (*rsi): %x\n", $spacex, $k, ++$r, *((int *)$rsi)
continue
end
#output v$rowcache_parent.address for row cache object
break *kqrmpin
command
printf "%5c(%i) kqrmpin v$rowcache_parent.address (rdi): addr=%x cid=%i\n", $spacex, $k, $rdi, $r11
continue
end