Oracle Global Customer Support Document: Oracle Database 12c Release 2 Support Lessons Learned from Early Adopters wrote:
"row cache mutex" replaces "latch: row cache objects" but rarely seen
As a start point to be acquainted with this new mechanism, this Blog takes the same Test Code in Blog: nls_database_parameters, dc_props, latch: row cache objects to focus only on Dictionary Cache dc_props.
We will run tests in both 12cR2 and 12cR1 by varying the number of parallel sessions, compare their behaviors, and then try to measure their scalability.
Note: All tests are done in 12.2.0.1.0 (12cR2) and 12.1.0.2.0 (12cR1) on Linux with 6 physical processors.
Update: (31-Aug-2021): "row cache mutex" seems caused hot objects, see:
Bug 31135517 - High row cache mutex waits on hot objects - superseded (Doc ID 31135517.8)
Description
This problem is introduced in release 12.2 when latches are replaced with the introduction
of row cache mutexes.
In nearly all cases, the mutex architecture is better because the locking can be targeted at
the object level. Because the mutexes are embedded within the objects themselves, there are
no false conflicts caused by sessions trying to lock different objects within the same cache.
So in general, contention is improved.
It is unfortunate that where one or two very 'hot' row cache objects exist(s), the previous
architecture is better because the current architecture requires two mutex compared to one
latch get. The reasons for this are complex, but it is primarily because many row cache objects
have multiple keys, and are linked into the hash table on multiple hash chains. The potential for
contention is increased because the two mutex gets introduce multiple choke points.
The fix implements 'hot object cloning' to alleviate row cache mutex contention.
Stack from hang analyze trace will likely contain:
... kslwaitctx ksfwaitctx kgxWait kgxExclusive kqrGetMutexByAddr ..
1. 12cR2 "row cache mutex" vs. 12cR1 "latch: row cache objects"
The new Event "row cache mutex" is exposed with row cache parameters:
P1TEXT = cache id
P2TEXT = where requested
select event, p1text, p1, p2text, p2, p3text, p3
from v$session v where event in ('row cache mutex');
EVENT P1TEXT P1 P2TEXT P2 P3TEXT P3
--------------- -------- -- --------------- -- ------ --
row cache mutex cache id 15 where requested 19 0
"cache id" can be used to directly pinpoint the exact contention row cache objects.Whereas in lower Releases, Event "latch: row cache objects" is exposed with latch parameters, and v$latch_children.child# is used to join with x$kqrst.kqrstcln and filtered by v$latch_children.addr = :P1RAW to find row cache info.
select event, p1text, p1, p1raw, p2text, p2, p2raw, p3text, p3, p3raw
from v$session v where event in ('latch: row cache objects');
EVENT P1TEXT P1 P1RAW P2TEXT P2 P2RAW P3TEXT P3 P3RAW
------------------------ ------- ---------- ---------------- ------ --- ---------------- ------ -- -----
latch: row cache objects address 5970309816 0000000163DBB2B8 number 411 000000000000019B tries 0 00
select la.child# latch_child, kqrstcid cache#, kqrsttxt parameter, la.addr latch_addr --, dc.*, la.*
from x$kqrst dc, v$latch_children la
where dc.kqrstcln = la.child#
and la.name = 'row cache objects'
and la.addr = '0000000182D98958';
LATCH_CHILD CACHE# PARAMETER LATCH_ADDR
----------- ------ --------- ----------------
18 15 dc_props 0000000163DBB2B8
12cR2 v$rowcache contains 71 rows, one new dictionary cache: 'dc_cdbprops' is added.By the way, 'row cache objects' LATCH# is numbered as "506" (12cR1 is "411"), but hardly used.
select name, latch# from v$latch where name in ('row cache objects');
NAME LATCH# GETS
----------------- ------ ----
row cache objects 506 456
2. Test
Run following test script in 12cR2 and 12cR1 on a Linux with 6 physical processors.
The script consists of 8 cases with 1, 2, 3, 4, 5, 6, 9 and 12 parallel sessions respectively.
Each case runs for 10 minutes.
declare
type t_num_tab is table of number;
l_nums t_num_tab := new t_num_tab(1, 2, 3, 4, 5, 6, 9, 12);
begin
clean_jobs;
sys.dbms_workload_repository.create_snapshot('ALL');
for n in 1..l_nums.count loop
nls_select_jobs(l_nums(n), 1e9); -- n Sessions
dbms_lock.sleep(600); -- 10 minutes
clean_jobs;
sys.dbms_workload_repository.create_snapshot('ALL');
end loop;
end;
/
3. Test Result
After test, collect all 8 AWR reports in 12cR2 and 12cR1, extract relevant 4 Sections to compare.
3.1 12cR2
Top 10 Foreground Events by Total Wait Time
Session# Event Waits Total Wait Time (sec) Avg Wait % DB time Wait Class
-------- --------------- -------- --------------------- -------- --------- -----------
1 row cache mutex 9 0 10.33us 0.0 Concurrency
2 row cache mutex 12,836 18 1.46ms 1.6 Concurrency
3 row cache mutex 48,390 111 2.31ms 5.9 Concurrency
4 row cache mutex 111,312 280 2.52ms 10.5 Concurrency
5 row cache mutex 199,253 491 2.47ms 14.6 Concurrency
6 row cache mutex 322,542 886 2.75ms 20.1 Concurrency
9 row cache mutex 418,819 3030 7.24ms 35.6 Concurrency
12 row cache mutex 573,556 4673 8.15ms 38.7 Concurrency
SQL ordered by Elapsed Time
(SQL Id: 7mgsfc44trnr8
SQL Text: SELECT VALUE FROM NLS_DATABASE_PARAMETERS WHERE PARAMETER = 'NLS_CHARACTERSET')
Session# Elapsed Time (s) Executions Elapsed Time per Exec (s) %Total %CPU %IO
-------- ---------------- ---------- ------------------------- ------ ------- ----
1 498 3,174,194 0.00 81.47 100.40 0.00
2 983 5,876,575 0.00 83.48 98.52 0.00
3 1,515 8,383,434 0.00 79.99 93.22 0.00
4 2,087 10,518,830 0.00 77.82 87.48 0.00
5 2,533 11,688,284 0.00 74.97 81.51 0.00
6 3,112 12,242,878 0.00 70.67 71.93 0.00
9 4,894 9,121,274 0.00 57.49 34.12 0.00
12 6,685 9,300,453 0.00 55.32 25.77 0.00
Mutex Sleep Summary
Session# Mutex Type Location Sleeps Wait Time (ms)
-------- ---------- ----------- --------- --------------
1 Row Cache [19] kqrpre 12 0
2 Row Cache [19] kqrpre 16,734 18,744
3 Row Cache [19] kqrpre 70,398 108,322
4 Row Cache [19] kqrpre 164,963 266,506
5 Row Cache [19] kqrpre 290,989 466,843
6 Row Cache [19] kqrpre 478,910 831,835
9 Row Cache [19] kqrpre 890,437 2,929,696
16 Row Cache [19] kqrpre 1,268,073 4,499,378
1 Row Cache [13] kqreqd 8 0
2 Row Cache [13] kqreqd 415 75
3 Row Cache [13] kqreqd 1,536 300
4 Row Cache [13] kqreqd 4,555 850
5 Row Cache [13] kqreqd 9,913 1,625
6 Row Cache [13] kqreqd 27,487 3,565
9 Row Cache [13] kqreqd 20,223 14,429
16 Row Cache [13] kqreqd 24,497 22,772
(Row Cache [19] kqrpre: find obj
Row Cache [13] kqreqd: reget)
Dictionary Cache Stats
Session# Cache Get Requests Pc t Miss Scan Reqs Pct Miss Mod Reqs Final Usage
-------- -------- ------------ --------- --------- -------- -------- -----------
1 dc_props 222,194,045 0.00 0 0 70
2 dc_props 406,987,135 0.00 0 0 70
3 dc_props 572,806,160 0.00 0 0 70
4 dc_props 708,102,414 0.00 0 0 70
5 dc_props 776,478,519 0.00 0 0 70
6 dc_props 807,080,147 0.00 0 0 70
9 dc_props 605,040,134 0.00 0 0 70
12 dc_props 616,520,863 0.00 0 0 70
3.2 12cR1
Top 10 Foreground Events by Total Wait Time
Session# Event Waits Total Wait Time (sec) Avg Wait(ms) % DB time Wait Class
-------- ------------------------ --------- --------------------- ------------ --------- -----------
1 latch: row cache objects 12 0 0.02 0.0 Concurrency
2 latch: row cache objects 19,775 1 0.03 0.1 Concurrency
3 latch: row cache objects 259,069 6 0.02 0.3 Concurrency
4 latch: row cache objects 612,329 16 0.03 0.7 Concurrency
5 latch: row cache objects 1,008,481 36 0.04 1.2 Concurrency
6 latch: row cache objects 1,189,823 116 0.10 3.1 Concurrency
9 latch: row cache objects 1,036,139 2441 2.36 30.2 Concurrency
12 latch: row cache objects 948,170 3861 4.07 34.5 Concurrency
SQL ordered by Elapsed Time
Session# Elapsed Time (s) Executions Elapsed Time per Exec (s) %Total %CPU %IO
-------- ---------------- ---------- ------------------------- ------ ------ ----
1 506 6,164,161 0.00 85.76 100.53 0.00
2 1,056 10,848,660 0.00 88.01 100.32 0.00
3 1,614 9,977,224 0.00 91.07 99.91 0.00
4 2,227 8,172,259 0.00 93.72 99.69 0.00
5 2,843 6,951,871 0.00 94.78 99.06 0.00
6 3,493 6,890,154 0.00 93.93 96.45 0.00
9 5,243 7,245,168 0.00 64.80 49.04 0.00
12 6,728 7,562,209 0.00 60.19 36.65 0.00
Latch Sleep Breakdown
Session# Latch Name Get Requests Misses Sleeps Spin Gets
-------- ----------------- ------------- ----------- --------- -----------
1 row cache objects 1,109,597,151 633 14 619
2 row cache objects 1,953,334,110 105,279,309 20,757 105,259,556
3 row cache objects 1,796,543,339 378,078,181 269,547 377,819,261
4 row cache objects 1,471,777,609 541,981,017 632,528 541,369,153
5 row cache objects -3,043,081,012 596,060,413 1,034,539 595,053,127
6 row cache objects 1,229,364,378 689,496,609 1,200,975 688,326,423
9 row cache objects 1,304,630,648 561,205,207 1,091,371 560,171,774
16 row cache objects 1,361,872,758 607,120,518 1,007,494 606,174,258
* -3,043,081,012 is due to 32-bit overflow, it should be 1,251,886,284 (=2^32 - 3,043,081,012).
Dictionary Cache Stats
Session# Cache Get Requests Pc t Miss Scan Reqs Pct Miss Mod Reqs Final Usage
-------- -------- ------------ --------- --------- -------- -------- -----------
1 dc_props 369,850,105 0.00 0 0 60
2 dc_props 651,060,932 0.00 0 0 60
3 dc_props 598,832,676 0.00 0 0 60
4 dc_props 490,574,970 0.00 0 0 60
5 dc_props 417,291,212 0.00 0 0 60
6 dc_props 413,622,949 0.00 0 0 60
9 dc_props 434,872,658 0.00 0 0 60
12 dc_props 453,953,683 0.00 0 0 60
4. Discussions
4.1 Top 10 Foreground Events by Total Wait Time
12cR2 "row cache mutex" has about 2 times less "Waits" than 12cR1 "latch: row cache objects".
That is the advantage of light-weight "Mutex" over "Latch".
"Total Wait Time" and "Avg Wait" increase linearly when Session# <= 6 (number of physical processors), non-linearly when > 6.
12cR2 "row cache mutex" has higher concurrency ("% DB time") than 12cR1 "latch: row cache objects".
Strangely 12cR2 Mutex "Avg Wait" is much higher than that of 12cR1 Latch.
4.2 SQL ordered by Elapsed Time
"Executions" in 12cR2 reaches maximum at Session# = 6, 20% higher than 12cR1 peak value at Session# = 2.
It means 12cR1 can use less CPU resource to reach its peak value.
12cR1 has higher throughput than 12cR2 when less Sessions (less than 4); but lower when more Sessions.
4.3 Mutex Sleep Summary vs Latch Sleep Breakdown
12cR2 "row cache mutex" Sleeps and Wait Time increase non-linearly.
12cR1 "latch: row cache objects" Misses and Sleeps increase non-linearly although "Get Requests" are almost similar.
In 12cR1, "Waits" in AWR "Top 10 Foreground Events by Total Wait Time" is close to "Sleeps" in "Latch Sleep Breakdown".
in 12cR2, it is far from "Sleeps" in "Mutex Sleep Summary".
4.4 Dictionary Cache Stats
dc_props "Get Requests" depends on query "Executions".
Each query (SQL Id: 7mgsfc44trnr8) Execution triggers 70 "Get Requests" in 12cR2, and 60 in 12cR1
since dc_props rows (Field "Final Usage") increased from 60 in 2cR1 to 70 in 12cR2.
5. Test Code
create or replace procedure nls_select(p_cnt number) as
l_val VARCHAR2 (256 Byte);
begin
for i in 1..p_cnt loop
select value into l_val from nls_database_parameters where parameter = 'NLS_CHARACTERSET';
end loop;
end;
/
create or replace procedure nls_select_jobs(p_job_cnt number, p_cnt number) as
l_job_id pls_integer;
begin
for i in 1.. p_job_cnt loop
dbms_job.submit(l_job_id, 'begin while true loop nls_select('||p_cnt||'); end loop; end;');
end loop;
commit;
end;
/
create or replace procedure clean_jobs as
begin
for c in (select job from dba_jobs) loop
begin
dbms_job.remove (c.job);
exception when others then null;
end;
commit;
end loop;
for c in (select d.job, d.sid, (select serial# from v$session where sid = d.sid) ser
from dba_jobs_running d) loop
begin
execute immediate
'alter system kill session '''|| c.sid|| ',' || c.ser|| ''' immediate';
dbms_job.remove (c.job);
exception when others then null;
end;
commit;
end loop;
dbms_lock.sleep(2);
-- select * from dba_jobs;
-- select * from dba_jobs_running;
end;
/
-- exec nls_select_jobs(2, 1e9);
-- exec clean_jobs; -- stop all jobs