On Oracle: PL/SQL Function Result Cache Invalidation (I)

Oracle PL/SQL Function Result Cache is a global cross-session cached associative array, mapping from base data sources to function result, and stored in shared pool:


SQL(64) > select * from v$sgastat where name like '%Result Cache%';

POOL         NAME                            BYTES
------------ -------------------------- ----------
shared pool  Result Cache: Cache Mgr           256
shared pool  Result Cache: Memory Mgr          208
shared pool  Result Cache: State Objs        69920
shared pool  Result Cache                   159880

If changes to any of data sources are committed, the cached result becomes invalid by the DML session, and get rebuilt by the user session which is accessing the function. All build, invalidate and access are synchronized by a single latch:


SQL(854) > select * from v$latch where name = 'Result Cache: RC Latch';

Note: all tests are done in Oracle 12.1.0.2.0. See appended Test Code.

1. Build

Run:


SQL(64) > exec dbms_result_cache.flush;
SQL(64) > exec run_test;

List Result Cache:


select scn_to_timestamp(scn) scn_time, ro.* 
from v$result_cache_objects ro 
where 1=1
--  and type = 'Dependency'
--  and status = 'Invalid'
order by scn_time, scn;

Display dependencies:


select count(*) from v$result_cache_dependency;

 COUNT(*)
---------
       51

select distinct
  from_obj.name from_name, from_obj.type, from_obj.status, from_obj.namespace,
  from_obj.scn, scn_to_timestamp(from_obj.scn) scn_time,
  to_obj.name to_name, to_obj.type to_type, to_obj.status to_status, to_obj.namespace to_namespace, 
  to_obj.scn to_scn, scn_to_timestamp(to_obj.scn) to_scn_time
from v$result_cache_objects    from_obj, 
     v$result_cache_dependency dep,
     v$result_cache_objects    to_obj
where from_obj.id = dep.result_id 
  and to_obj.id   = dep.depend_id
--  and to_obj.name in ('K.RC_TAB_DEP', 'K.RC_TAB')
--  and from_obj.name like '%"K"."%GET_VAL"%'
order by from_obj.scn;

2. Invalidate


SQL(64) > update rc_tab_dep set dval=33 where did = 3;

SQL(64) > select s.sid, t.used_urec from v$transaction t, v$session s 
           where t.ses_addr=s.saddr and s.sid=sys.dbms_support.mysid;      
 
 SID  USED_UREC
---- ----------
  64          1 

SQL(64) > select * from v$result_cache_statistics
            where lower(name) like '%invalid%'; 

 ID NAME                   VALUE
--- ---------------------- -----
  8 Invalidation Count     0
  9 Delete Count Invalid   0

SQL(64) > select count(*) from v$result_cache_dependency;

 COUNT(*)
---------
       51

SQL(64) > commit;

SQL(64) > select * from v$result_cache_statistics
            where lower(name) like '%invalid%'; 

 ID NAME                   VALUE
--- ---------------------- -----
  8 Invalidation Count     17
  9 Delete Count Invalid   0
  
SQL(64) > select count(*) from v$result_cache_dependency;

 COUNT(*)
---------
       0

The granularity of invalidation seems in table unit, even only one single row is updated, all result cache entries depending on that table are invalidate. Above test shows that "Invalidation Count" is increased from 0 to 17 (17 "GET_VAL" result cache entries).

3. Invalidate: select for update

Rebuild Cache, and then run a "select for update" for one row:


SQL(64) > exec dbms_result_cache.flush;
SQL(64) > exec run_test;

SQL(64) > select * from rc_tab_dep where did = 3 for update;

SQL(64) > select s.sid, t.used_urec from v$transaction t, v$session s 
           where t.ses_addr=s.saddr and s.sid=sys.dbms_support.mysid;      
 
 SID  USED_UREC
---- ----------
  64          1

Get spid 13699 of test session (SID 64), open a Solaris window, run command:


$ dtrace -w -n 'pid$target::qesrcRO_Invalidate:entry {@[pid, ustack(10, 0)] = count(); stop(); exit(0);}' -p 13699

Back to SQL Window, run:


SQL(64) > commit;

Session is hanging.

Solaris window prints out Stack Trace:


    oracle`qesrcRO_Invalidate
    oracle`qesrcDO_Invalidate+0x9c8
    oracle`qesrcCM_PreCmt_+0xd81
    oracle`ktcCommitTxn_new+0x2a7
    oracle`ktcCommitTxn+0x59
    oracle`ktdcmt+0x8b
    oracle`k2lcom+0x94
    oracle`k2send+0x4f3
    oracle`xctctl+0x62
    oracle`xctCommitTxn+0x2b8

Open a second SQL Window, build Result Cache for function get_val_2, which is irrelevant to function get_val:


SQL(512) > exec run_test_2;

Make some check on Unix side:


-- spid 10502 of test session (SID 512), check callstack:

$  pstack 10502

 fffffd7ffc9d3e3b semsys   (2, c00000f, fffffd7fffdf4968, 1, d0)
 0000000001ab90f5 sskgpwwait () + 1e5
 0000000001ab8c95 skgpwwait () + c5
 0000000001ab7ab0 kslgess () + 980
 0000000001ab6cef ksl_get_shared_latch () + 39f
 0000000005ab76ab qesrcPin_Get () + 8cb
 0000000005aa8ff4 qesrcCM_Acquire () + 354
 000000000bd2287d psdfrcLookup () + 12d
 000000000e001883 pevm_ENTERX () + 143
 000000000dfc2907 pfrinstr_ENTERX () + 47
 0000000001a435ca pfrrun_no_tool () + 12a
 
$  ipcs -as
 T         ID      KEY        MODE        OWNER    GROUP  CREATOR   CGROUP NSEMS   OTIME    CTIME
 Semaphores:
 s  201326607   0x375d691c --ra-r-----   oracle      dba   oracle      dba   704 13:14:45 15:52:33

We can see:
 semid = 201326607 = 0xc00000f
 
and "0xc00000f" is 
 semsys   (2, c00000f, fffffd7fffdf4968, 1, d0)
 
in pstack output, and 2 is semop defined in syscall.h:
 semop (...) :: semsys(2, ...)
 
 int semop(int semid, struct sembuf *sops, size_t nsops);
 

If we truss the blocking session, we can also see the same semid with semnum begin 34 and 16.
One thing we noticed is that truss de-blocks (unlock) the blocking session (like prun).

$  truss -dp 13699 
/1:     833.5393        semctl(201326607, 34, SETVAL, 1)                = 0
/1:     833.5395        semctl(201326607, 16, SETVAL, 1)                = 0
/1:     833.5395        lwp_info(0xFFFFFD7FFFDF7440)                    = 0

It hangs on Result Cache: RC Latch.

On a third SQL Window, watch the blocking:


SQL(1024) > select sid, blocker_sid, blocker_is_valid, wait_event_text, p2 from v$wait_chains;

  SID BLOCKER_SID BLOCK WAIT_EVENT_TEXT    P2
----- ----------- ----- --------------- -----
  512          64 TRUE  latch free        559
   64             FALSE

SQL(1024) > select * from v$latchholder;

 PID  SID LADDR            NAME                       GETS
---- ---- ---------------- ---------------------- --------
  25   64 00000000600572B0 Result Cache: RC Latch  15564353

On a fourth SQL Window, make select for update on another row:


SQL(768) > select * from rc_tab_dep where did = 5 for update;

       DID       DVAL
---------- ----------
         5         50
1 row selected.

SQL(768) > commit;

The commit is also hanging on "latch free" of 'Result Cache: RC Latch'.

And all other sessions which are accessing or updating Result Cache are waiting for the same latch free (ksl_get_shared_latch).

In this test, we started one transaction by "select for update", but did not update any row, however, "commit" triggered a call to "qesrcRO_Invalidate", hence held "RC Latch". Any other later session which requires this latch is waiting for "latch free".

Furthermore, query on any one of
v$result_cache_objects
v$result_cache_dependency
v$result_cache_statistics

is also pending on "RC Latch" "latch free".

The above Stack Trace shows that when a transaction user session calls commit command, commit takes a detour to visit Result Cache along its code path in order to perform the invalidation before publishing the confirmed and consistent news to the world. Therefore, Function Result Cache is not free of charge. It demands the resources to maintain the database wide consistence, and thus PL/SQL Function Result Cache (in Oracle Database PL/SQL Language Reference) said:

Oracle Database automatically detects all data sources (tables and views) that are queried while a result-cached function is running. If changes to any of these data sources are committed, the cached result becomes invalid and must be recomputed. The best candidates for result-caching are functions that are invoked frequently but depend on information that changes infrequently or never.

This behavior of "SELECT FOR UPDATE" is described in Oracle MOS
Bug 17855824 : PLSQL RESULT CACHE WRONGLY INVALIDATED BY A SELECT FOR UPDATE STATEMENT

4. RC Latch "ksl_get_shared_latch" tracing

For each RC Latch Get, the call stack looks like:


 0000000001ab696b ksl_get_shared_latch () + 1b
 0000000005ab76ab qesrcPin_Get () + 8cb
 0000000005aa8ff4 qesrcCM_Acquire () + 354
 000000000bd2287d psdfrcLookup () + 12d
 000000000e001883 pevm_ENTERX () + 143

In case of Latch Miss, the process yields the CPU and goes to sleep:


 fffffd7ffc9d396a yield () + a
 0000000005ab78d6 qesrcPin_Get () + af6
 0000000005aa8ff4 qesrcCM_Acquire () + 354
 000000000bd2287d psdfrcLookup () + 12d
 000000000e001883 pevm_ENTERX () + 143

Run the same test, and measure Latch Get time in qesrcCM_Acquire (CM: Cache Manager) by dtrace:


SQL > exec run_test;

$ dtrace -n \
'BEGIN {self->start_wts = walltimestamp; self->start_ts = timestamp;}
pid$target::qesrcCM_Acquire:entry /execname == "oracle"/ { self->rc = 1; }
pid$target::ksl_get_shared_latch:entry /execname == "oracle" && self->rc == 1/ { self->ts = timestamp; }
pid$target::ksl_get_shared_latch:return /self->ts > 0/ {  
 @lquant["ns"] = lquantize(timestamp - self->ts, 0, 10000, 1000); 
 @avgs["AVG_ns"] = avg(timestamp - self->ts);
 @mins["MIN_ns"] = min(timestamp - self->ts);
 @maxs["MAX_ns"] = max(timestamp - self->ts);
 @sums["SUM_ms"] = sum((timestamp - self->ts)/1000000);
 @counts[ustack(10, 0)] = count(); 
 self->rc = 0; self->ts = 0;}
END { printf("Start: %Y, End: %Y, Elapsed_ms: %d\n", self->start_wts, walltimestamp, (timestamp - self->start_ts)/1000000);}
' -p 13699

  Start: 2017 Sep  4 14:54:02, End: 2017 Sep  4 14:54:06, Elapsed_ms: 3939

  ns
           value  ------------- Distribution ------------- count
               0 |                                         0
            1000 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@      15
            2000 |@@                                       1
            3000 |                                         0
            4000 |                                         0
            5000 |                                         0
            6000 |@@                                       1
            7000 |                                         0

  AVG_ns    1605
  MIN_ns    1140
  MAX_ns    6891
  SUM_ms       0

  oracle`ksl_get_shared_latch+0x17a
  oracle`qesrcCM_Acquire+0x34b
  oracle`psdfrcLookup+0x119
  oracle`pevm_ENTERX+0x142
  oracle`pfrinstr_ENTERX+0x3e
  oracle`pfrrun_no_tool+0xc2
  oracle`pfrrun+0x4a2
  oracle`plsql_run+0x278
  oracle`peicnt+0x14d
  oracle`kkxexe+0x205
    17

5. Causal Analysis of Result Cache

V$RESULT_CACHE_OBJECTS provides 4 time related columns:

(a). CREATION_TIMESTAMP:
Time when the object was first time created in Result Cache, and never changed for both TYPE: Result and Dependency.

(b). SCN:
For TYPE = Result, it is the SCN of first time created in Result Cache, same as CREATION_TIMESTAMP;
Its STATUS is 'Published' if valid; 'Invalid' if invalidated (for example, due to Dependency).
Its OBJECT_NO is always 0.

For TYPE = Dependency, when first time inserted, it is the SCN of time created in Result Cache, same as CREATION_TIMESTAMP.
Its STATUS is always 'Published', never 'Invalid'. Therefore for Dependency, there is no 'Invalid'.
Once changed, it is the last Commit SCN (Getting Oracle Transaction Commit SCN).
SCN timestamp minus CREATION_TIMESTAMP gives us the time span from time Created to time Changed.
Any existence of this time difference indicates the Change of Dependency,
hence it can be used to track Result Invalidation caused by its Dependencies.
Its OBJECT_NO is Dictionary object number (DBA_OBJECTS.object_id).

(c). LRU_NUMBER and SCAN_COUNT:
LRU list position, the bigger the value, the more recent the usage.
(Oracle Docu wrote "the smaller the value, the more recent the usage". It seems not the case.)
LRU_NUMBER is monotonic and increases one for each access of Result Cache.
To maintain this ordering, LRU_NUMBERs are shifted by increasing 1 to reflect RC access.
It is not clear if there exist multiple LRU Chains because each LRU_NUMBER can have multiple rows.
(select lru_number, count(*) from v$result_cache_objects where lru_number > 0 group by lru_number order by 2 desc;)

Following test shows that SCAN_COUNT is the number of calls for one Result Cache item;
LRU_NUMBER is the total number of calls over entire Result Cache.


 
create or replace function test_scan_lru (p_id number) return number result_cache as
begin
  return p_id;
end;
/

exec dbms_result_cache.flush;

select name, scan_count, lru_number 
from v$result_cache_objects 
where type = 'Result' and name like '%TEST_SCAN_LRU%';

 -- no rows selected

declare
  l_ret number;
begin
  for i in 1..1e6 loop
    l_ret := test_scan_lru(123);
  end loop;
end;
/

select name, scan_count, lru_number 
from v$result_cache_objects 
where type = 'Result' and name like '%TEST_SCAN_LRU%';

 -- NAME                                                   SCAN_COUNT   LRU_NUMBER
 -- ----  ---------------------------------------------- ------------ ------------
 -- "S".  "GET_VAL_3"::8."GET_VAL_3"#3048d2af80817a01 #1      999,999   48,007,056

declare
  l_ret number;
begin
  for i in 1..1e6 loop
    l_ret := test_scan_lru(123);
  end loop;
end;
/

select name, scan_count, lru_number 
from v$result_cache_objects 
where type = 'Result' and name like '%TEST_SCAN_LRU%';

 -- NAME                                                 SCAN_COUNT   LRU_NUMBER
 -- -------------------------------------------------- ------------ ------------
 -- "S"."GET_VAL_3"::8."GET_VAL_3"#3048d2af80817a01 #1    1,999,999   49,007,056

Synthesizing above information, we can compose 3 queries to track Result Cache invalidations, and 2 queries to monitor its usage (see appended "Result Cache Monitoring Views").


-- Last invalidated Dependencies
select * from rc_scn_view;

-- last invalidated Dependencies and Result
select * from rc_scn_dep_view;

-- RC usage
select * from rc_lru_number_view;

-- RC memory
select * from rc_memory_view;
 
-- RC stats
select * from rc_stats_view;

6. Result Cache: RC Latch S mode and X mode Blocking Chains

We will make S mode and X mode blocking test based on the calling API revealed in Blog Oracle 12 and latches:

 The kslgetl and ksl_get_shared_latch functions take the following arguments:
    1-latch address
    2-immediate get (0 means yes, 1 means no)
    3-where (X$KSLLW.INDX)
    4-why (X$KSLWSC.INDX)
    5-mode (8=shared,16=exclusive; only for ksl_get_shared_latch function)

The above text needs two corrections. We will have a look later.

   (1). arguments why (arg2) comes before where(arg3)
   (2). X$KSLLW.INDX and X$KSLWSC.INDX are the same (equal join columns), both point to the where (location).
        X$KSLLW is parent table, X$KSLWSC is child table.

If Mode=16, arg5 is 20000000 00000000 | <PID>.

It seems that Oracle calls different subroutines for different Latches. ksl_get_shared_latch is C subroutine to get shared Latch (in S Mode or X Mode), for example, result_cache Latch; whereas kslgetl is used to get exclusive Latch, for example, row cache objects Latch.

Following documents have a deep investigation of Latches and Mutexes:
     New features of Latches and Mutexes in Oracle 12c
     [Oracle] Researching internal latch implementation (ksl_get_shared_latch, kslfre, kslgetsl_w) and crashing PMON
     The missing argument of ksl_get_shared_latch : the power of disassembly in action

6.1. S mode Blocking


--------------------- Setup ---------------------
create or replace function rc_s_x_mode_blocking_func (p_x number) return number result_cache as
begin
  return p_x;
end;
/

create or replace procedure rc_s_x_mode_blocking_proc (p_x number) as
 l_x number;
begin
  l_x := rc_s_x_mode_blocking_func(p_x);
end;
/

--------------------- Prepare Test ---------------------
-- Open 3 Test Sessions (SID): S1(4335), S2(6122), S3(7497) with Process (SPID): P1(6114), P2(26548), P3(26550); 
-- and one monitoring Sessions (SID): S4.
-- S1 and S2 request S Mode to get already built Items, where S3 requests X Mode to add a new Item.

-- Build 2 Items in Result Cache.

SQL (S4) > exec dbms_result_cache.flush;

SQL (S4) > exec rc_s_x_mode_blocking_proc(1);
SQL (S4) > exec rc_s_x_mode_blocking_proc(2);

--------------------- Test Steps ---------------------

--==== @T1: S1 requets S Mode, stop at Return (stop at Entry has no blocking). Launch Dtrace at first, then run plsql.

SQL (S1)> exec rc_s_x_mode_blocking_proc(1);

P1$ sudo dtrace -w -n \
'pid$target::qesrcCM*:entry /execname == "oracle"/ { self->rc = 1; }
pid$target::ksl_get_shared_latch:entry /execname == "oracle" && self->rc == 1  && arg4 == 8 / { 
  printf("\n--------Calling %s:%s(Addr=>0x%X, Immed_Get=>%d, Why=>%d, Where=>%d(0x%X), Mode=>%d, PID=>0x%X)", 
         probefunc, probename, arg0, arg1, arg2, arg3, arg3, arg4, arg5);
  self->rc = 2;}
pid$target::ksl_get_shared_latch:return /self->rc == 2/ {
   printf("\n--------Stopped at %s:%s", probefunc, probename);
   printf("\n--------ustack(5, 0)--------"); ustack(5, 0);
   stop(); exit(0);}
' -p 6114

 CPU     ID                    FUNCTION:NAME
   1  80660       ksl_get_shared_latch:entry
--------Calling ksl_get_shared_latch:entry
    (Addr=>0x600572B0, Immed_Get=>1, Why=>1, Where=>5358(0x14EE), Mode=>8, PID=>0x1)
   1  80661      ksl_get_shared_latch:return
--------Stopped at ksl_get_shared_latch:return
--------ustack(5, 0)--------
              oracle`ksl_get_shared_latch+0x1b8
              oracle`qesrcCM_Acquire+0x354
              oracle`psdfrcLookup+0x12d
              oracle`pevm_ENTERX+0x143
              oracle`pfrinstr_ENTERX+0x47

-- oradebug peek RC Latch address
SQL> oradebug setmypid
Statement processed.
SQL> oradebug peek 0x600572B0 24
[0600572B0, 0600572C8) = 00000001 00000000 0C56FEFC 0000022F 00000002 000014EE
                            Nproc  ˆX flag     gets   latch#   level#    where


SQL (S4)> select addr, gets, misses, sleeps, immediate_gets, spin_gets, wait_time from v$latch where latch# in (559) or name in ('Result Cache: RC Latch');
  ADDR              GETS       MISSES   SLEEPS  IMMEDIATE_GETS  SPIN_GETS  WAIT_TIME
  ----------------  ---------  -------  ------  --------------  ---------  ----------
  00000000600572B0  207027964  1011584  508174  0               797961     4975709595

SQL (S4)> select * from v$latchholder;
  PID  SID   LADDR             NAME                    GETS     
  ---- ----  ----------------  ----------------------  ---------  
  33   4335  00000000600572B0  Result Cache: RC Latch  207027964 


--==== @T2: S2 requets S Mode, not blocked

SQL (S2)> exec rc_s_x_mode_blocking_proc(2);

SQL (S4)> select addr, gets, misses, sleeps, immediate_gets, spin_gets, wait_time from v$latch where latch# in (559) or name in ('Result Cache: RC Latch');
  ADDR              GETS       MISSES   SLEEPS  IMMEDIATE_GETS  SPIN_GETS  WAIT_TIME
  ----------------  ---------  -------  ------  --------------  ---------  ----------
  00000000600572B0  207027965  1011584  508174  0               797961     4975709595

SQL (S4)> select * from v$latchholder;
  PID  SID   LADDR             NAME                    GETS     
  ---- ----  ----------------  ----------------------  ---------  
  33   4335  00000000600572B0  Result Cache: RC Latch  207027965 

--==== @T3: S3 requets X Mode Request, blocked by S1, Wait: latch free. Launch Dtrace at first, then run plsql.

SQL (S3)> exec rc_s_x_mode_blocking_proc(3);

P3$ sudo dtrace -n \
'pid$target::qesrcCM*:entry /execname == "oracle"/ { self->rc = 1; }
pid$target::ksl_get_shared_latch:entry /execname == "oracle" && self->rc == 1  && arg4 == 16 / { 
  printf("\n--------Calling %s:%s(Addr=>0x%X, Immed_Get=>%d, Why=>%d, Where=>%d(0x%X), Mode=>%d, PID=>0x%X))", 
          probefunc, probename, arg0, arg1, arg2, arg3, arg3, arg4, arg5);
  printf("\n--------ustack(5, 0)--------"); ustack(5, 0); self->rc = 0;}
' -p 26550

 CPU     ID                    FUNCTION:NAME
   1  80816       ksl_get_shared_latch:entry
--------Calling ksl_get_shared_latch:entry
    (Addr=>0x600572B0, Immed_Get=>1, Why=>1, Where=>5347(0x14E3), Mode=>16, PID=>0x2000000000000023))
--------ustack(5, 0)--------
              oracle`ksl_get_shared_latch
              oracle`qesrcCM_AddDO_+0x43b5
              oracle`qesrcCM_AddDO+0xf3
              oracle`qesrcCM_Acquire+0x702
              oracle`psdfrcLookup+0x12d

-- oradebug peek RC Latch address              
SQL>  oradebug peek 0x600572B0 24
[0600572B0, 0600572C8) = 00000001 40000000 0C56FEFE 0000022F 00000002 000014EE
                            Nproc  ˆX flag     gets   latch#   level#    where  

--******** S3 stopped here, resumed when S1 is set run ********--

   1  80816       ksl_get_shared_latch:entry
--------Calling ksl_get_shared_latch:entry
    (Addr=>0x600572B0, Immed_Get=>1, Why=>1, Where=>5358(0x14EE), Mode=>16, PID=>0x2000000000000023))
--------ustack(5, 0)--------
              oracle`ksl_get_shared_latch
              oracle`qesrcPin_Get+0x959
              oracle`qesrcCM_Acquire+0x354
              oracle`psdfrcLookup+0x12d
              oracle`pevm_ENTERX+0x143

   1  80816       ksl_get_shared_latch:entry
--------Calling ksl_get_shared_latch:entry
    (Addr=>0x600572B0, Immed_Get=>1, Why=>1, Where=>5374(0x14FE), Mode=>16, PID=>0x2000000000000023))
--------ustack(5, 0)--------
              oracle`ksl_get_shared_latch
              oracle`qesrcCM_Release+0x293
              oracle`psdfrcWriteResult+0x213
              oracle`pevm_RET+0x3d9
              oracle`pfrinstr_RET+0x26

P4$ pstack 26550

  ------------  lwp# 1 / thread# 1  ---------------
   ffff80ffbdba36eb semsys   (2, 33, ffff80ffbfff2f68, 1, 29d)
   000000000578adc5 sskgpwwait () + 1e5
   000000000578a965 skgpwwait () + c5
   000000000578972d kslgess () + 92d
   00000000057889bf ksl_get_shared_latch () + 39f
   00000000097a4145 qesrcCM_AddDO_ () + 43b5
   00000000097a4d03 qesrcCM_AddDO () + f3
   00000000097aa772 qesrcCM_Acquire () + 702
   000000000fa4e10d psdfrcLookup () + 12d
   0000000011d2e163 pevm_ENTERX () + 143

SQL (S4)> select addr, gets, misses, sleeps, immediate_gets, spin_gets, wait_time from v$latch where latch# in (559) or name in ('Result Cache: RC Latch');
  ADDR              GETS       MISSES   SLEEPS  IMMEDIATE_GETS  SPIN_GETS  WAIT_TIME
  ----------------  ---------  -------  ------  --------------  ---------  ----------
  00000000600572B0  207027966  1011584  508174  0               797961     4975709595

SQL (S4)> select * from v$latchholder;
  PID  SID   LADDR             NAME                    GETS     
  ---- ----  ----------------  ----------------------  ---------  
  33   4335  00000000600572B0  Result Cache: RC Latch  207027966 
  
SQL (S4)> select sid, blocker_sid, blocker_is_valid, chain_signature, in_wait wait_event_text, p1, p2, p3, in_wait_secs, num_waiters from v$wait_chains;  
  SID   OSID   BLOCKER_SID  BLOCKER_OSID  BLOCKER_IS_VALID  CHAIN_SIGNATURE                IN_WAIT  WAIT_EVENT_TEXT  P1          P2   P3  IN_WAIT_SECS  NUM_WAITERS
  ----  -----  -----------  ------------  ----------------  -----------------------------  -------  ---------------  ----------  ---  --  ------------  -----------  
  4335  6114                              FALSE             <='latch free'  FALSE                                                        1
  7497  26550  4335         6114          TRUE              <='latch free'  TRUE     latch free       1610969776  559  0   406           0
 

--==== @T4: S2 requets S Mode again, blocked by S1. Wait Event: latch free. But the real Blocking is caused by S3.

SQL (S2)> exec rc_s_x_mode_blocking_proc(2);

P2$ pstack 26548

   ffff80ffbdba36eb semsys   (2, 33, ffff80ffbfff3948, 1, 11d)
   000000000578adc5 sskgpwwait () + 1e5
   000000000578a965 skgpwwait () + c5
   000000000578972d kslgess () + 92d
   00000000057889bf ksl_get_shared_latch () + 39f
   00000000097b8a7b qesrcPin_Get () + 8cb
   00000000097aa3c4 qesrcCM_Acquire () + 354
   000000000fa4e10d psdfrcLookup () + 12d
   0000000011d2e163 pevm_ENTERX () + 143

SQL (S4)> select addr, gets, misses, sleeps, immediate_gets, spin_gets, wait_time from v$latch where latch# in (559) or name in ('Result Cache: RC Latch');
  ADDR              GETS       MISSES   SLEEPS  IMMEDIATE_GETS  SPIN_GETS  WAIT_TIME
  ----------------  ---------  -------  ------  --------------  ---------  ----------
  00000000600572B0  207027966  1011584  508174  0               797961     4975709595

SQL (S4)> select * from v$latchholder;
  PID  SID   LADDR             NAME                    GETS     
  ---- ----  ----------------  ----------------------  ---------  
  33   4335  00000000600572B0  Result Cache: RC Latch  207027966
  
SQL (S4)> select sid, osid, blocker_sid, blocker_osid, blocker_is_valid, chain_signature, in_wait, wait_event_text, p1, p2, p3, in_wait_secs, num_waiters from v$wait_chains;
  SID   OSID   BLOCKER_SID  BLOCKER_OSID  BLOCKER_IS_VALID  CHAIN_SIGNATURE                IN_WAIT  WAIT_EVENT_TEXT  P1          P2   P3  IN_WAIT_SECS  NUM_WAITERS
  ----  -----  -----------  ------------  ----------------  -----------------------------  -------  ---------------  ----------  ---  --  ------------  -----------
  4335  6114                              FALSE             <='latch free'  FALSE                                                        2
  7497  26550  4335         6114          TRUE              <='latch free'  TRUE     latch free       1610969776  559  0   1403          0
  6122  26548  4335         6114          TRUE              <='latch free'  TRUE     latch free       1610969776  559  0   803           0

 
--==== @T5: resumne S1 running

P1$ prun 6114

SQL (S4)> select addr, gets, misses, sleeps, immediate_gets, spin_gets, wait_time from v$latch where latch# in (559) or name in ('Result Cache: RC Latch');
  ADDR              GETS       MISSES   SLEEPS  IMMEDIATE_GETS  SPIN_GETS  WAIT_TIME
  ----------------  ---------  -------  ------  --------------  ---------  ----------
  00000000600572B0  207027970  1011586  508177  0               797961     6975739595

Above test shows that compatible S Mode not blocking each other; but it blocks X Mode request at ksl_get_shared_latch:entry. Once there exists an X Mode request at ksl_get_shared_latch:entry, all coming S Mode requests are blocked.

Probably this is the behaviour pointed out in Presentation: New features of Latches and Mutexes in Oracle 12c (Andrey Nikolaev) Page 28:
    If the latch is being held in S mode, the X mode waiter will block all further requests.

At T2, V$LATCHHOLDER shows S1 is a RC Latch Holder, but there are not yet any blocked sessions. So the existence of Latch Holder does not necessarily mean blockings.

From T3 (or T4) to T5, V$LATCH shows there are 4 (207027970-207027966) Gets, 2 (1011586-1011584) Misses, 3 (508177-508174) Sleeps. Among them,
     S3 made 3 X Mode (16) Gets, 1 Miss and 2 Sleeps.
     S2 made 1 S Mode (8) Get, 1 Miss and 1 Sleep.

S3 made 3 X Mode Gets for:
     qesrcCM_AddDO
     qesrcCM_Acquire
     psdfrcWriteResult

It also demonstrated that 1 Miss can provoke more Sleeps.

If we expand GV$LATCH_MISSES definition by adding column: INDX:


CREATE OR REPLACE FORCE VIEW X_GV$LATCH_MISSES as
SELECT t1.indx              INDX,
       t1.inst_id           INST_ID,       
       t1.ksllasnam         PARENT_NAME,   
       t2.ksllwnam          "WHERE",       
       t1.kslnowtf          NWFAIL_COUNT,  
       t1.kslsleep          SLEEP_COUNT,   
       t1.kslwscwsl         WTR_SLP_COUNT, 
       t1.kslwsclthg        LONGHOLD_COUNT,
       t2.ksllwnam          LOCATION,      
       t1.con_id            CON_ID         
  FROM x$ksllw t2, x$kslwsc t1
 WHERE t2.indx = t1.indx;

and run query below to map INDX to "WHERE" or "LOCATION".


select * from sys.X_GV$LATCH_MISSES 
where parent_name in ('Result Cache: RC Latch') 
  and indx in (5347, 5358, 5374, 5359);
  
INDX  INST_ID  PARENT_NAME             WHERE                          NWFAIL_COUNT  SLEEP_COUNT  WTR_SLP_COUNT  LONGHOLD_COUNT  LOCATION
----- -------  ----------------------  -----------------------------  ------------  -----------  -------------  --------------  -----------------------------
5347  1        Result Cache: RC Latch  Result Cache: Serialization01  0             9633         119281         712             Result Cache: Serialization01
5358  1        Result Cache: RC Latch  Result Cache: Serialization12  0             1303643      1036967        79711           Result Cache: Serialization12
5359  1        Result Cache: RC Latch  Result Cache: Serialization13  0             1776         116077         176             Result Cache: Serialization13
5374  1        Result Cache: RC Latch  Result Cache: Serialization28  0             2300         53438          474             Result Cache: Serialization28

In GV$LATCH_MISSES definition, parent x$ksllw stores "WHERE" names, child x$kslwsc stores real values. Both WHERE and LOCATION are same.

In AWR Section: Latch Miss Sources, only gv$latch_misses (nwfail_count, sleep_count, wtr_slp_count) are visible. Unfortunately longhold_count is not reported, but longhold_count often signifies the X Mode Requests, which caused RC Latch longhold.

oradebug peek shows S Mode Blocking state transition. Flag 40000000 denotes S Mode Blocking.


--==== @T1: S1 requets S Mode. Get RC item at 1st Session, then Stop
[0600572B0, 0600572C8) = 00000001 00000000 0C56FEFC 0000022F 00000002 000014EE   
                            Nproc ˆX flag      gets   latch#   level#    where
                            
--==== @T3: S3 requets X Mode Request. Add RC item at 3rd Session, blocked
[0600572B0, 0600572C8) = 00000001 40000000 0C56FEFE 0000022F 00000002 000014EE   
                            Nproc ˆX flag      gets   latch#   level#    where

Note: In AIX oradebug peek output, "ˆX flag" comes before "Nproc":


SQL> oradebug peek 0x0700000000055978 24;
SQL> [700000000055978, 700000000055990) = 00000000 00000001 0D298B4B 022F0000 00000002 000014EE
SQL> [700000000055978, 700000000055990) = 40000000 00000001 0D298B4F 022F0000 00000002 000014EE
SQL> [700000000055978, 700000000055990) = 40000000 00000000 0D298B50 022F0000 00000002 000014EE
                                           ˆX flag    Nproc     gets   latch#   level#    where

SQL> [700000000055978, 700000000055990) = 20000000 00000058 0D2984B2 022F0000 00000002 000014FE
                                           ˆX flag     pidˆ     gets   latch#   level#    where

6.2. X mode Blocking

The approach we picked out to study Oracle RC Latch activity is trying to depict its State-Transition Diagrams.

  dtrace to show Transition Event (Call Stack), paused to view State.
  oradebug Peek to display State.
  prun to resume Transition.


--------------------- Test X Mode Blocking ---------------------
-- Open 2 Solaris Windows, and one Sqlplus session. 
-- In first Solaris Windows, run dtrace; in second Solaris Windows, run oradebug and host prun.
-- Launch Dtrace at first, then run plsql.

SQL > exec rc_s_x_mode_blocking_proc(6);   

sudo dtrace -w -n \
'BEGIN {self->seq = 1; }
pid$target::qesrcCM*:entry /execname == "oracle"/ { self->rc = 1; }
pid$target::ksl_get_shared_latch:entry / self->rc > 0 && arg0 == 0x600572B0 && (arg4 == 8 || arg4 == 16) / { 
   printf("\nStep %d at %Y--------Calling %s:%s(Addr=>0x%X, Immed_Get=>%d, Why=>%d, Where=>%d(0x%X), Mode=>%d, PID=>0x%X)", 
          self->seq, walltimestamp, probefunc, probename, arg0, arg1, arg2, arg3, arg3, arg4, arg5);
   self->rc = 2; self->seq = self->seq + 1; }
pid$target::ksl_get_shared_latch:return /self->rc > 1/ {
   printf("\nStep %d at %Y--------Stopped at %s:%s", self->seq, walltimestamp, probefunc, probename);
   printf("\n--------ustack(5, 0)--------"); ustack(5, 0); self->seq = self->seq + 1; 
   stop();}
pid$target::kslfre:entry /self->rc > 1 && arg0 == 0x600572B0/ {
   printf("\nStep %d at %Y--------Stopped at %s:%s(Addr=>0x%-X)", self->seq, walltimestamp, probefunc, probename, arg0);
   printf("\n--------ustack(5, 0)--------"); ustack(5, 0); self->seq = self->seq + 1; self->rc = 3; 
   stop();} 
pid$target::kslfre:return /self->rc == 3/ {
   printf("\nStep %d at %Y--------Stopped at %s:%s(Addr=>0x%-X)", self->seq, walltimestamp, probefunc, probename, arg0);
   printf("\n--------ustack(5, 0)--------"); ustack(5, 0); self->seq = self->seq + 1; self->rc = 2;
   stop();}       
' -p 24559

SQL> oradebug peek 0x600572B0 24
     [0600572B0, 0600572C8) = 00000000 00000000 84BD178B 0000022F 00000002 000014FE
SQL> host prun 24559

dtrace: allowing destructive actions
 CPU     ID                    FUNCTION:NAME
   1  81838       ksl_get_shared_latch:entry
Step 1 at 2017 Sep 21 12:04:56--------Calling ksl_get_shared_latch:entry
       (Addr=>0x600572B0, Immed_Get=>1, Why=>1, Where=>5358(0x14EE), Mode=>8, PID=>0x1)

   1  81839      ksl_get_shared_latch:return
Step 3 at 2017 Sep 21 12:04:56--------Stopped at ksl_get_shared_latch:return
--------ustack(5, 0)--------
              oracle`ksl_get_shared_latch+0x1b8
              oracle`qesrcCM_Acquire+0x354
              oracle`psdfrcLookup+0x12d
              oracle`pevm_ENTERX+0x143
              oracle`pfrinstr_ENTERX+0x47

SQL> oradebug peek 0x600572B0 24             
     [0600572B0, 0600572C8) = 00000001 00000000 84BD178C 0000022F 00000002 000014EE
SQL> host prun 24559                   

   1  81838       ksl_get_shared_latch:entry
Step 5 at 2017 Sep 21 12:06:05--------Calling ksl_get_shared_latch:entry
     (Addr=>0x600572B0, Immed_Get=>1, Why=>1, Where=>5347(0x14E3), Mode=>16, PID=>0x2000000000000023)

   1  81839      ksl_get_shared_latch:return
Step 7 at 2017 Sep 21 12:06:05--------Stopped at ksl_get_shared_latch:return
--------ustack(5, 0)--------
              oracle`ksl_get_shared_latch+0x1b8
              oracle`qesrcCM_AddDO+0xf3
              oracle`qesrcCM_Acquire+0x702
              oracle`psdfrcLookup+0x12d
              oracle`pevm_ENTERX+0x143
              
SQL> oradebug peek 0x600572B0 24
     [0600572B0, 0600572C8) = 00000023 20000000 84BD178D 0000022F 00000002 000014E3         
SQL> host prun 24559     

   1  81838       ksl_get_shared_latch:entry
Step 9 at 2017 Sep 21 12:08:07--------Calling ksl_get_shared_latch:entry
     (Addr=>0x600572B0, Immed_Get=>1, Why=>1, Where=>5358(0x14EE), Mode=>16, PID=>0x2000000000000023)

   1  81839      ksl_get_shared_latch:return
Step 11 at 2017 Sep 21 12:08:07--------Stopped at ksl_get_shared_latch:return
--------ustack(5, 0)--------
              oracle`ksl_get_shared_latch+0x1b8
              oracle`qesrcCM_Acquire+0x354
              oracle`psdfrcLookup+0x12d
              oracle`pevm_ENTERX+0x143
              oracle`pfrinstr_ENTERX+0x47
              
SQL> oradebug peek 0x600572B0 24  
     [0600572B0, 0600572C8) = 00000023 20000000 84BD178E 0000022F 00000002 000014EE
SQL> host prun 24559               

   1  81838       ksl_get_shared_latch:entry
Step 13 at 2017 Sep 21 12:08:51--------Calling ksl_get_shared_latch:entry
     (Addr=>0x600572B0, Immed_Get=>1, Why=>1, Where=>5374(0x14FE), Mode=>16, PID=>0x2000000000000023)

   1  81839      ksl_get_shared_latch:return
Step 15 at 2017 Sep 21 12:08:51--------Stopped at ksl_get_shared_latch:return
--------ustack(5, 0)--------
              oracle`ksl_get_shared_latch+0x1b8
              oracle`psdfrcWriteResult+0x213
              oracle`pevm_RET+0x3d9
              oracle`pfrinstr_RET+0x26
              oracle`pfrrun_no_tool+0x12a
              
SQL> oradebug peek 0x600572B0 24  
     [0600572B0, 0600572C8) = 00000023 20000000 84BD178F 0000022F 00000002 000014FE
SQL> host prun 24559               

   1  81840                     kslfre:entry
Step 16 at 2017 Sep 21 12:09:27--------Stopped at kslfre:entry(Addr=>0x600572B0)
--------ustack(5, 0)--------
              oracle`kslfre
              oracle`qesrcCM_Release+0x22d
              oracle`psdfrcWriteResult+0x213
              oracle`pevm_RET+0x3d9
              oracle`pfrinstr_RET+0x26

   1  81841                     kslfre:entry
Step 17 at 2017 Sep 21 12:09:27--------Stopped at kslfre:entry(Addr=>0x600572B0)
--------ustack(5, 0)--------
              oracle`kslfre
              oracle`qesrcCM_Release+0x22d
              oracle`psdfrcWriteResult+0x213
              oracle`pevm_RET+0x3d9
              oracle`pfrinstr_RET+0x26
              
SQL> oradebug peek 0x600572B0 24  
     [0600572B0, 0600572C8) = 00000023 20000000 84BD178F 0000022F 00000002 000014FE
SQL> host prun 24559               

   0  81842                    kslfre:return
Step 18 at 2017 Sep 21 12:10:25--------Stopped at kslfre:return(Addr=>0x172)
--------ustack(5, 0)--------
              oracle`kslfre+0x172
              oracle`psdfrcWriteResult+0x213
              oracle`pevm_RET+0x3d9
              oracle`pfrinstr_RET+0x26
              oracle`pfrrun_no_tool+0x12a
              
SQL> oradebug peek 0x600572B0 24  
     [0600572B0, 0600572C8) = 00000000 00000000 84BD178F 0000022F 00000002 000014FE

Only looking the oradebug peek output, we can see the state transition of RC Latch values:


SQL> oradebug peek 0x600572B0 24  --S0: initial  
    [0600572B0, 0600572C8) = 00000000 00000000 84BD178B 0000022F 00000002 000014FE
                                 pidˆ  ˆX flag     gets   latch#   level#    where    

SQL> oradebug peek 0x600572B0 24  --S1: Mode=>8  qesrcCM_Acquire+0x354
    [0600572B0, 0600572C8) = 00000001 00000000 84BD178C 0000022F 00000002 000014EE
SQL> host prun 24559  
    
SQL> oradebug peek 0x600572B0 24  --S2: Mode=>16 qesrcCM_AddDO+0xf3
    [0600572B0, 0600572C8) = 00000023 20000000 84BD178D 0000022F 00000002 000014E3
SQL> host prun 24559  

SQL> oradebug peek 0x600572B0 24  --S3: Mode=>16 qesrcCM_Acquire+0x354
    [0600572B0, 0600572C8) = 00000023 20000000 84BD178E 0000022F 00000002 000014EE
SQL> host prun 24559  

SQL> oradebug peek 0x600572B0 24  --S4: Mode=>16 psdfrcWriteResult+0x213
    [0600572B0, 0600572C8) = 00000023 20000000 84BD178F 0000022F 00000002 000014FE
SQL> host prun 24559  

SQL> oradebug peek 0x600572B0 24  --S5: psdfrcWriteResult -> qesrcCM_Release -> kslfre:entry
    [0600572B0, 0600572C8) = 00000023 20000000 84BD178F 0000022F 00000002 000014FE
SQL> host prun 24559  

SQL> oradebug peek 0x600572B0 24  --S6: psdfrcWriteResult -> qesrcCM_Release -> kslfre:return
    [0600572B0, 0600572C8) = 00000000 00000000 84BD178F 0000022F 00000002 000014FE

From S1 to S4, each Step gets once Latch, in total, it requires 4 RC Latch Gets (from 84BD178C to 84BD178F), which are one S Mode(8), and 3 X Mode(16). Therefore, inserting new RC items is much expensive than simply fetching them. In consequence, more latch free wait event on RC Latch pervades over system.

If after the first dtrace stop (Step S1), we also open 3 Sqlplus Sesssions to fetch already existed Result Cache Item (S Mode Get):


SQL > exec rc_s_x_mode_blocking_proc(1);

We observed the following behaviours:

  From S0 -> S1: S Mode get.
  From S2 -> S3 and S3 -> S4: one single blocked S Mode Session(the 1st wating Session) can get RC Latch (deblocked);
                              other Sessions are still blocked (kslfre called ???).
  From S4 -> S5: no S Mode Sessions can get RC Latch, all S Mode Sessions are still blocked.
  From S5 -> S6: all S Mode Sessions get RC Latch.

It looks like that during each X Mode state transition in that X Mode session, maximum one S Mode Session can occasionally get RC Latch, all other S Mode Sessions have to wait till "kslfre" called by that X Mode Session (of course, other X Mode Sessions are also blocked. And first RC Latch Get by each X Mode Session is a S Mode Get).

Above oradebug peek also shows that S Mode and X Mode Blockings are marked by different flags:

  S Mode Blocking:  Blocking is achieved by flag 0x40000000 bit in the latch value.
  X Mode Blocking:  Blocking is achieved by flag 0x20000000 bit in the latch value.

In S Mode Blocking, first position (4 bytes) stores Nproc; whereas in X Mode Blocking, first position stores blocking session PID.


[0600572B0, 0600572C8) = 00000001 40000000 0C56FEFE 0000022F 00000002 000014EE
                            Nproc  ˆX flag     gets   latch#   level#    where
                            
[0600572B0, 0600572C8) = 00000023 20000000 84BD178D 0000022F 00000002 000014E3
                             pidˆ  ˆX flag     gets   latch#   level#    where

Refering to X_GV$LATCH_MISSES (GV$LATCH_MISSES with INDX), we can map INDX to "WHERE":

  5347 (0x14E3) -> Result Cache: Serialization01
  5358 (0x14EE) -> Result Cache: Serialization12
  5360 (0x14F0) -> Result Cache: Serialization14
  5374 (0x14FE) -> Result Cache: Serialization28

Cross-checking with V$RESULT_CACHE_OBJECTS:

  V$RESULT_CACHE_OBJECTS.creation_timestamp = timestamp of psdfrcWriteResult+0x213 Call
  V$RESULT_CACHE_OBJECTS.build_time = 
    from  timestamp of psdfrcWriteResult+0x213 Call 
    to    timestamp of (psdfrcWriteResult -> qesrcCM_Release ->) kslfre:entry Call

Look further two Addresses, they contain Misses: 0x70C257=7389783, and Sleeps: 0xAC913=706835


SQL> oradebug peek 0x0600572C8 24
[0600572C8, 0600572E0) = 00000001 00000000 00000000 0070C257 00000000 00000000

SQL> oradebug peek 0x0600572E0 24
[0600572E0, 0600572F8) = 5CA280DF 00000005 000AC913 00000000 00000000 00000000

6.3. X mode Blocking: Dtrace

With following Dtrace script, we will draw a State Transition Diagram to demonstrate 4 RC Latch Gets (1 S Mode and successive 3 X Mode) when adding a new RC item (only one S Mode when fetching an existed item). For each Transition, we show State Entry and State Return. We can see that there is a kslfre call between two Transitions. That is probably why occasionally other S Mode Sessions (fetching an existed item) can still get RC Latch during this X Mode Session (adding a new item) as discussed in the last section.


-- Launch Dtrace at first, then run plsql to add a new RC item.

SQL > exec rc_s_x_mode_blocking_proc(7); 

sudo dtrace -n \
'
typedef unsigned int ub4;

typedef struct latchstate /* latch state */ {
    ub4 nproc_pid;  /* Nproc or pid   */
    ub4 flag;       /* blocking flag  */ 
    ub4 gets;       /* number of gets */
    ub4 latch;      /* latch number   */
    ub4 level;      /* latch level    */
    ub4 where; } latchstate;
    
BEGIN {laddr = 0x600572B0; step = 1; begin_gets = 0, end_gets = 0; prev_ts = timestamp;
   self->evt = "begin"; self->loc = 0x0;}
pid$target::qesrcCM*:entry /execname == "oracle"/ { self->rc = 1; }
pid$target::ksl_get_shared_latch:entry /self->rc > 0 && arg0 == laddr && (self->evt != probefunc || self->loc != arg3)/ { 
   printf("\n---------Step %d at %Y (Elapsed-ns=%d)---------", step, walltimestamp, (timestamp-prev_ts));
   ls = ((latchstate *) copyin(arg0, sizeof(latchstate)));
   begin_gets = (begin_gets == 0 ? ls->gets: begin_gets); end_gets = ls->gets;
   printf("\n***State Entry>>>(nproc_pid=>0x%X, flag=>0x%X, gets=>0x%X, latch=>0x%X, level=>0x%X, where=>0x%X)"
           ,ls->nproc_pid, ls->flag, ls->gets, ls->latch, ls->level, ls->where);
   printf("\n>>>Transition>>>%s:%s\n      (Addr=>0x%X, Immed_Get=>%d, Why=>%d, Where=>%d(0x%X), Mode=>%d, PID=>0x%X)", 
          probefunc, probename, arg0, arg1, arg2, arg3, arg3, arg4, arg5);
   self->evt = probefunc; self->loc = arg3;    
   ustack(5, 0);   
   step = step + 1; self->entry = 1; prev_ts = timestamp;} 
pid$target::ksl_get_shared_latch:return /self->rc > 0 && self->entry == 1/ {
   ls = ((latchstate *) copyin(laddr, sizeof(latchstate))); 
   printf("\n>>>State Return***(nproc_pid=>0x%X, flag=>0x%X, gets=>0x%X, latch=>0x%X, level=>0x%X, where=>0x%X)"
           ,ls->nproc_pid, ls->flag, ls->gets, ls->latch, ls->level, ls->where);
    self->entry = 0;}   
pid$target::kslfre:entry /self->rc > 0 && arg0 == laddr && self->evt != probefunc/ {
   printf("\n---------Step %d at %Y (Elapsed-ns=%d)---------", step, walltimestamp, (timestamp-prev_ts));
   ls = ((latchstate *) copyin(arg0, sizeof(latchstate)));
   end_gets = ls->gets;
   printf("\n>>>Transition>>>%s:%s(Addr=>0x%-X)", probefunc, probename, arg0);
   self->evt = probefunc; self->loc = 0x0;
   printf("\n===Latch freed===");
   ustack(5, 0); step = step + 1; prev_ts = timestamp;}      
END {
   printf("\n---------Step %d at %Y (Elapsed-ns=%d)---------", step, walltimestamp, (timestamp-prev_ts));
   printf("\nLatch Gets = %d (%d - %d)", (end_gets - begin_gets), (unsigned long)begin_gets, (unsigned long)end_gets); }
' -p 18118

 CPU     ID                    FUNCTION:NAME
   4  81846       ksl_get_shared_latch:entry
---------Step 1 at 2017 Sep 25 08:14:15 (Elapsed-ns=2668662103)---------
***State Entry>>>(nproc_pid=>0x0, flag=>0x0, gets=>0x84BD1DBE, latch=>0x22F, level=>0x2, where=>0x14EE)
>>>Transition>>>ksl_get_shared_latch:entry
      (Addr=>0x600572B0, Immed_Get=>1, Why=>1, Where=>5358(0x14EE), Mode=>8, PID=>0x1)
              oracle`ksl_get_shared_latch
              oracle`qesrcPin_Get+0x8cb
              oracle`qesrcCM_Acquire+0x354
              oracle`psdfrcLookup+0x12d
              oracle`pevm_ENTERX+0x143

   4 128142      ksl_get_shared_latch:return
>>>State Return***(nproc_pid=>0x1, flag=>0x0, gets=>0x84BD1DBF, latch=>0x22F, level=>0x2, where=>0x14EE)
   4  89490                     kslfre:entry
---------Step 2 at 2017 Sep 25 08:14:15 (Elapsed-ns=48752)---------
>>>Transition>>>kslfre:entry(Addr=>0x600572B0)
===Latch freed===
              oracle`kslfre
              oracle`qesrcPin_Get+0xaf6
              oracle`qesrcCM_Acquire+0x354
              oracle`psdfrcLookup+0x12d
              oracle`pevm_ENTERX+0x143

   4  81846       ksl_get_shared_latch:entry
---------Step 3 at 2017 Sep 25 08:14:15 (Elapsed-ns=19989)---------
***State Entry>>>(nproc_pid=>0x0, flag=>0x0, gets=>0x84BD1DBF, latch=>0x22F, level=>0x2, where=>0x14EE)
>>>Transition>>>ksl_get_shared_latch:entry
      (Addr=>0x600572B0, Immed_Get=>1, Why=>1, Where=>5347(0x14E3), Mode=>16, PID=>0x2000000000000024)
              oracle`ksl_get_shared_latch
              oracle`qesrcCM_AddDO_+0x43b5
              oracle`qesrcCM_AddDO+0xf3
              oracle`qesrcCM_Acquire+0x702
              oracle`psdfrcLookup+0x12d

   4 128142      ksl_get_shared_latch:return
>>>State Return***(nproc_pid=>0x24, flag=>0x20000000, gets=>0x84BD1DC0, latch=>0x22F, level=>0x2, where=>0x14E3)
   4  89490                     kslfre:entry
---------Step 4 at 2017 Sep 25 08:14:15 (Elapsed-ns=11792)---------
>>>Transition>>>kslfre:entry(Addr=>0x600572B0)
===Latch freed===
              oracle`kslfre
              oracle`qesrcCM_AddDO_+0x2e0
              oracle`qesrcCM_AddDO+0xf3
              oracle`qesrcCM_Acquire+0x702
              oracle`psdfrcLookup+0x12d

   4  81846       ksl_get_shared_latch:entry
---------Step 5 at 2017 Sep 25 08:14:15 (Elapsed-ns=9352)---------
***State Entry>>>(nproc_pid=>0x0, flag=>0x0, gets=>0x84BD1DC0, latch=>0x22F, level=>0x2, where=>0x14E3)
>>>Transition>>>ksl_get_shared_latch:entry
      (Addr=>0x600572B0, Immed_Get=>1, Why=>1, Where=>5358(0x14EE), Mode=>16, PID=>0x2000000000000024)
              oracle`ksl_get_shared_latch
              oracle`qesrcPin_Get+0x959
              oracle`qesrcCM_Acquire+0x354
              oracle`psdfrcLookup+0x12d
              oracle`pevm_ENTERX+0x143

   4 128142      ksl_get_shared_latch:return
>>>State Return***(nproc_pid=>0x24, flag=>0x20000000, gets=>0x84BD1DC1, latch=>0x22F, level=>0x2, where=>0x14EE)
   4  89490                     kslfre:entry
---------Step 6 at 2017 Sep 25 08:14:15 (Elapsed-ns=25046)---------
>>>Transition>>>kslfre:entry(Addr=>0x600572B0)
===Latch freed===
              oracle`kslfre
              oracle`qesrcPin_Get+0xaf6
              oracle`qesrcCM_Acquire+0x354
              oracle`psdfrcLookup+0x12d
              oracle`pevm_ENTERX+0x143

   4  81846       ksl_get_shared_latch:entry
---------Step 7 at 2017 Sep 25 08:14:15 (Elapsed-ns=25170)---------
***State Entry>>>(nproc_pid=>0x0, flag=>0x0, gets=>0x84BD1DC1, latch=>0x22F, level=>0x2, where=>0x14EE)
>>>Transition>>>ksl_get_shared_latch:entry
      (Addr=>0x600572B0, Immed_Get=>1, Why=>1, Where=>5374(0x14FE), Mode=>16, PID=>0x2000000000000024)
              oracle`ksl_get_shared_latch
              oracle`qesrcCM_Release+0x293
              oracle`psdfrcWriteResult+0x213
              oracle`pevm_RET+0x3d9
              oracle`pfrinstr_RET+0x26

   4 128142      ksl_get_shared_latch:return
>>>State Return***(nproc_pid=>0x24, flag=>0x20000000, gets=>0x84BD1DC2, latch=>0x22F, level=>0x2, where=>0x14FE)
   4  89490                     kslfre:entry
---------Step 8 at 2017 Sep 25 08:14:15 (Elapsed-ns=9845)---------
>>>Transition>>>kslfre:entry(Addr=>0x600572B0)
===Latch freed===
              oracle`kslfre
              oracle`qesrcCM_Release+0x22d
              oracle`psdfrcWriteResult+0x213
              oracle`pevm_RET+0x3d9
              oracle`pfrinstr_RET+0x26

^C
   1      2                             :END
---------Step 9 at 2017 Sep 25 08:14:17 (Elapsed-ns=2254749376)---------
Latch Gets = 4 (2226986430 - 2226986434)

7. RESULT_CACHE Object Status Transition

SQL RESULT_CACHE hint has an option of SNAPSHOT. which specifies the Expired time interval. In the following test, we set "snapshot=5" (5 seconds) to track status transition of RESULT_CACHE Object:
New -> Published -> Expired -> Invalid

As the test shows, SNAPSHOT option turns off usual RESULT_CACHE invalidation of DML statement, and allows stale Read (dirty Read, non Consistency Read) of RESULT_CACHE.

We observed this usage in Oracle dynamic sampling (SNAPSHOT=3600). In real application, this undocumented feature can be used to make RESULT_CACHE as a database-wide provider of periodically sampling data.


----------------- Test Code -----------------
exec dbms_result_cache.flush;

select /*+ RESULT_CACHE(snapshot=5) */ id, val from rc_tab where id = 3;
select status, name from v$result_cache_objects; 

exec dbms_lock.sleep(2);
update rc_tab set val = -val where id = 3;
commit;
select id, val from rc_tab where id = 3;     

--stale Read
select /*+ RESULT_CACHE(snapshot=5) */ id, val from rc_tab where id = 3;
select status, name from v$result_cache_objects; 

exec dbms_lock.sleep(3);
select status, name from v$result_cache_objects; 

exec dbms_lock.sleep(1);
select /*+ RESULT_CACHE(snapshot=5) */ id, val from rc_tab where id = 3;
select status, name from v$result_cache_objects;

----------------- Test Output -----------------

07:11:00 > exec dbms_result_cache.flush;
07:11:00 > select /*+ RESULT_CACHE(snapshot=5) */ id, val from rc_tab where id = 3;
  
          ID        VAL
  ---------- ----------
           3         30
07:11:00 > select status, name from v$result_cache_objects;

  STATUS    NAME
  --------- -----------------------------------------------------------------------
  Published select /*+ RESULT_CACHE(snapshot=5) */ id, val from rc_tab where id = 3

07:11:00 > exec dbms_lock.sleep(2);
07:11:02 > update rc_tab set val = -val where id = 3;
  1 row updated.
07:11:02 > commit;
07:11:02 > select id, val from rc_tab where id = 3;
  
          ID        VAL
  ---------- ----------
           3        -30
           
07:11:02 > --stale Read           
07:11:02 > select /*+ RESULT_CACHE(snapshot=5) */ id, val from rc_tab where id = 3;
  
          ID        VAL
  ---------- ----------
           3         30
07:11:02 > select status, name from v$result_cache_objects;

  STATUS    NAME
  --------- -----------------------------------------------------------------------
  Published select /*+ RESULT_CACHE(snapshot=5) */ id, val from rc_tab where id = 3

07:11:02 > exec dbms_lock.sleep(3);
07:11:05 > select status, name from v$result_cache_objects;

  STATUS    NAME
  --------- -----------------------------------------------------------------------
  Expired   select /*+ RESULT_CACHE(snapshot=5) */ id, val from rc_tab where id = 3

07:11:05 > exec dbms_lock.sleep(1);
07:11:06 > select /*+ RESULT_CACHE(snapshot=5) */ id, val from rc_tab where id = 3;
  
          ID        VAL
  ---------- ----------
           3        -30
07:11:06 > select status, name from v$result_cache_objects;

  STATUS    NAME
  --------- -----------------------------------------------------------------------
  Published select /*+ RESULT_CACHE(snapshot=5) */ id, val from rc_tab where id = 3
  Invalid   select /*+ RESULT_CACHE(snapshot=5) */ id, val from rc_tab where id = 3
  2 rows selected.

8. Result Cache History Recording

There exist no DBA_HIST_* views to display the RC history of active sessions. We can create a small utility to record RC history usages (see appended code).

For example, we record 10 snaps in an interval of 1 second for top 5 rows, and then check the created history.


exec rc_scn_record_hist(p_snaps => 10, p_interval=> 1, p_rows => 5);

select * from hist_rc_scn 
order by ts, sign(extract(minute from(scn_diff))*60 + extract(second from(scn_diff))) desc
        ,scn desc, scn_diff desc, status, name, id;

select * from hist_rc_scn_dep 
order by ts, sign(extract(minute from(f_scn_diff))*60 + extract(second from(f_scn_diff))) desc
        ,f_scn desc, f_scn_diff desc, f_status, f_name;
        
select * from hist_rc_lru_number t
order by ts, name;   

select * from hist_rc_memory t
order by ts, free; 

select * from hist_rc_stats t
order by name, ts;

9. Result Cache Hidden Parameters

Here a list of RC hidden parameters in Oracle 12.1.0.2.0:

Name                                       Description                                             Default 
------------------------------------------ ------------------------------------------------------- -------
_result_cache_auto_size_threshold          result cache auto max size allowed                      100       
_result_cache_auto_time_threshold          result cache auto time threshold                        1000 
_result_cache_auto_execution_threshold     result cache auto execution threshold                   1    
_result_cache_deterministic_plsql          result cache deterministic PLSQL functions              FALSE
_result_cache_block_size                   result cache block size                                 1024 
_result_cache_copy_block_count             blocks to copy instead of pinning the result            1    
_result_cache_global                       Are results available globally across RAC?              TRUE 
_result_cache_timeout                      maximum time (sec) a session waits for a result         10   
_result_cache_auto_time_distance           result cache auto time distance                         300  
_result_cache_auto_dml_monitoring_slots    result cache auto dml monitoring slot                   4    
_result_cache_auto_dml_monitoring_duration result cache auto dml monitoring duration               15   
_result_cache_auto_dml_threshold           result cache auto dml threshold                         16   
_result_cache_auto_dml_trend_threshold     result cache auto dml trend threshold                   20   
_optimizer_ads_result_cache_life           result cache shelf life for ADS queries                 3600 
_client_result_cache_bypass                bypass the client result cache                          FALSE
_re_result_cache_keysiz                    defines max number key for result cache hash table      20   
_re_result_cache_size                      defines max number of cached elements for result cache  20

In verbatim text, they sound like something to regulate the refresh of Result Cache or its Dynamic Performance Views. But it is not clear how to make the test cases to detect their usage.

10. Test Code


drop table rc_tab;

drop table rc_tab_dep;

drop table rc_tab_2;

create table rc_tab (id number, val number);

create table rc_tab_dep (did number, dval number);

create table rc_tab_2 (id number, val number);

insert into rc_tab select level, level*10 from dual connect by level <= 17;

insert into rc_tab_dep select level, level*10 from dual connect by level <= 17;

insert into rc_tab_2 select level, level*10 from dual connect by level <= 13;

commit;

create or replace function get_val_dep (p_did number) return number as
  l_dval_dep number;
begin
  select dval into l_dval_dep from rc_tab_dep where did = p_did;
  return l_dval_dep;
end;
/

create or replace function get_val (p_id number) return number result_cache as
  l_val     number;
  l_val_dep number;
begin
  select val into l_val from rc_tab where id = p_id;
  l_val_dep := get_val_dep(p_id);
  return l_val + l_val_dep;
end;
/

create or replace procedure run_test as
  l_val number;
begin
  -- SQL RESULT_CACHE
  --for c in (select /*+ RESULT_CACHE */ * from rc_tab) loop
  --  null;
  --end loop;
  
  for i in 1 .. 17 loop
    l_val := get_val(i);
  end loop;
end;
/

create or replace function get_val_2 (p_id number) return number result_cache as
  l_val     number;
  l_val_dep number;
begin
  select val into l_val from rc_tab_2 where id = p_id;
  return l_val;
end;
/

create or replace procedure run_test_2 as
  l_val number;
begin
  for i in 1 .. 13 loop
    l_val := get_val_2(i);
  end loop;
end;
/

11. Result Cache Monitoring Views


drop view rc_scn_view;

create view rc_scn_view as
select systimestamp ts, scn_to_timestamp(t.scn) scn_time, (scn_to_timestamp(scn) - creation_timestamp) scn_diff, t.*  
from   v$result_cache_objects t
order by sign(extract(minute from(scn_diff))*60 + extract(second from(scn_diff))) desc
        ,scn desc, scn_diff desc, status, name, id;    

drop view rc_scn_dep_view;

create view rc_scn_dep_view as
with rs_scn as 
  (select scn_to_timestamp(t.scn) scn_time, (scn_to_timestamp(scn) - creation_timestamp) scn_diff, 
          id, type, status, name, namespace, creation_timestamp, scn, object_no, invalidations, row_count
   from   v$result_cache_objects t)
select systimestamp ts,
       f.scn_time f_scn_time, f.scn_diff f_scn_diff, f.id f_id, f.type f_type, f.status f_status, f.name f_name, f.namespace f_namespace, 
       f.creation_timestamp f_creation_timestamp, f.scn f_scn, f.object_no f_object_no, f.invalidations f_invalidations, f.row_count f_row_count,
       t.scn_time t_scn_time, t.scn_diff t_scn_diff, t.id t_id, t.type t_type, t.status t_status, t.name t_name, t.namespace t_namespace, 
       t.creation_timestamp t_creation_timestamp, t.scn t_scn, t.object_no t_object_no, t.invalidations  t_invalidations, t.row_count t_row_count
from rs_scn                    f, 
     v$result_cache_dependency d,
     rs_scn                    t
where f.id    = d.result_id(+) 
  and t.id(+) = d.depend_id
--  and t.name in ('K.RC_TAB_DEP', 'K.RC_TAB')
--  and f.name like '%"K"."%GET_VAL"%'
order by sign(extract(minute from(f.scn_diff))*60 + extract(second from(f.scn_diff))) desc
        ,f.scn desc, f.scn_diff desc, f.status, f.name;
        
drop view rc_lru_number_view;

create view rc_lru_number_view as
select systimestamp ts, name, cache_id, type, status, namespace
      ,min(lru_number) min_lru, max(lru_number) max_lru, max(lru_number) - min(lru_number) lru_diff
      ,count(*) rc_count, sum(block_count) blocks, sum(ceil(row_count*row_size_avg/1024)) est_blocks
      ,sum(row_count) row_cnt, min(row_size_min) row_size_min, max(row_size_max) row_size_max, round(avg(row_size_avg)) row_size_avg
      ,sum(pin_count) pin_count, round(avg(scan_count)) avg_scan_cnt
      ,sum(build_time*10) build_time_ms, round(avg(build_time*10), 3) avg_build_time, min(build_time*10) min_build_time, max(build_time*10) max_build_time
      ,sum(invalidations) invalidations
from   v$result_cache_objects t
where  lru_number > 0
group by name, cache_id, type, status, namespace
order by rc_count desc, avg_scan_cnt desc;  

drop view rc_memory_view;

create view rc_memory_view as 
select systimestamp ts, v.* from
(select free, count(*) cnt, min(offset) min_offset, max(offset) max_offset from v$result_cache_memory group by free) v;

drop view rc_stats_view;

create view rc_stats_view as 
select systimestamp ts, v.* from v$result_cache_statistics v;

12. Result Cache History Recording


drop table hist_rc_scn;

create table hist_rc_scn as select 123456 snap_id, v.* from rc_scn_view v where 1=2;

drop table hist_rc_scn_dep;

create table hist_rc_scn_dep as select 123456 snap_id, v.* from rc_scn_dep_view v where 1=2;

drop table hist_rc_lru_number;

create table hist_rc_lru_number as select 123456 snap_id, v.* from rc_lru_number_view v where 1=2;

drop table hist_rc_memory;

create table hist_rc_memory as select 123456 snap_id, v.* from rc_memory_view v where 1=2;

drop table hist_rc_stats;

create table hist_rc_stats as select 123456 snap_id, v.* from rc_stats_view v where 1=2;


create or replace procedure rc_scn_record_hist(p_snaps number := 10, p_interval number := 1, p_rows number := 10) as
begin
  for i in 1..p_snaps loop
    insert into hist_rc_scn   
    select i snap_id, c.* from  
    (select v.* from rc_scn_view v 
     order by sign(extract(minute from(scn_diff))*60 + extract(second from(scn_diff))) desc
             ,scn desc, scn_diff desc, status, name, id) c
    where rownum <= p_rows;
    
    insert into hist_rc_scn_dep 
    select i snap_id, c.* from 
    (select v.* from rc_scn_dep_view v
     order by sign(extract(minute from(f_scn_diff))*60 + extract(second from(f_scn_diff))) desc
             ,f_scn desc, f_scn_diff desc, f_status, f_name) c
    where rownum <= p_rows;
    
    insert into hist_rc_lru_number 
    select i snap_id, c.* from rc_lru_number_view c;    
    
    insert into hist_rc_memory 
    select i snap_id, c.* from rc_memory_view c;
        
    insert into hist_rc_stats 
    select i snap_id, c.* from rc_stats_view c;     
     
    commit;
    dbms_lock.sleep(p_interval);
  end loop;
end;
/

truncate table hist_rc_scn;
truncate table hist_rc_scn_dep;
truncate table hist_rc_lru_number;
truncate table hist_rc_memory;
truncate table hist_rc_stats;

exec rc_scn_record_hist(p_snaps => 10, p_interval=> 1, p_rows => 5);

select * from hist_rc_scn 
order by ts, sign(extract(minute from(scn_diff))*60 + extract(second from(scn_diff))) desc
        ,scn desc, scn_diff desc, status, name, id;

select * from hist_rc_scn_dep 
order by ts, sign(extract(minute from(f_scn_diff))*60 + extract(second from(f_scn_diff))) desc
        ,f_scn desc, f_scn_diff desc, f_status, f_name;
        
select * from hist_rc_lru_number t
order by ts, name;  

select * from hist_rc_memory t
order by ts, free; 

select * from hist_rc_stats t
order by name, ts;

Monday, March 20, 2017

PL/SQL Function Result Cache Invalidation (I)