Tuesday, November 1, 2016

"library cache: mutex X" and Application Context

Heavy Event: "library cache: mutex X" is observed when Application Context is frequently changed. The application is using Oracle Virtual Private Database to regulate data access with driving application context, which determines which policy group is in effect for each use case.

In this Blog, Application Context is used as a concrete case to discuss Oracle "library cache: mutex X".

Note: All tests are done in Oracle 12.1.0.2 on AIX, Solaris, Linux with 6 physical processors.


1. Test


Run the appended Test Code by launching 4 Jobs:
  exec ctx_set_jobs(4);

Monitor Job sessions:

select sid, program, event, p1text, p1, p2text, p2, p3text, p3
  from v$session where program like '%(J%';

  SID PROGRAM              EVENT                     P1TEXT          P1 P2TEXT             P2 P3TEXT                P3
----- -------------------- ------------------------- ------ ----------- ------ -------------- ------ -----------------
   38 oracle@testdb (J003) library cache: mutex X    idn     1317011825 value   3968549781504 where   9041305591414788
  890 oracle@testdb (J000) library cache: mutex X    idn     1317011825 value    163208757248 where   9041305591414874
  924 oracle@testdb (J001) library cache: mutex X    idn     1317011825 value   4556960301056 where   9041305591414874
 1061 oracle@testdb (J002) library cache: mutex X    idn     1317011825 value   3968549781504 where   9041305591414879

Pick idn (P1): 1317011825, and query v$db_object_cache:

select name, namespace, type, hash_value, locks, pins, locked_total, pinned_total
from v$db_object_cache where hash_value in (1317011825);

NAME       NAMESPACE       TYPE             HASH_VALUE       LOCKS        PINS LOCKED_TOTAL PINNED_TOTAL
---------- --------------- --------------- ----------- ----------- ----------- ------------ ------------
TEST_CTX   APP CONTEXT     APP CONTEXT      1317011825           4           0            4    257802287

It shows that "library cache: mutex X" is on application context: TEST_CTX, and PINNED_TOTAL is probably increased for each access.
Although TEST_CTX is a local context and its values is stored in the User Global Area (UGA), the content of "library cache: mutex X" is globally on its definition.

select namespace, package, type from dba_context where namespace = 'TEST_CTX';

NAMESPACE  PACKAGE       TYPE
---------- ------------  ----------------
TEST_CTX   TEST_CTX_PKG  ACCESSED LOCALLY

After test, clean-up all jobs by:
  exec clean_jobs;


2. Mutex Contention and Performance


Run the test, monitor Mutex Contention and Performance:

SQL > exec ctx_set_jobs(4);

column owner format a6
column name format a10
column property format a10
column namespace format a12
column type format a12

SQL > select owner, name, property, hash_value, locks, pins, locked_total, pinned_total, executions, sharable_mem, namespace, type, full_hash_value
      from  v$db_object_cache v
      where (name in ('TEST_CTX') or hash_value in (1317011825) or property like '%HOT%');

OWNER  NAME       PROPERTY   HASH_VALUE  LOCKS  PINS LOCKED_TOTAL PINNED_TOTAL EXECUTIONS SHARABLE_MEM NAMESPACE    TYPE         FULL_HASH_VALUE
------ ---------- ---------- ---------- ------ ----- ------------ ------------ ---------- ------------ ------------ ------------ --------------------------------
SYS    TEST_CTX              1317011825      4     0            4   1191106893 1191106886         4096 APP CONTEXT  APP CONTEXT  3581f5a97dfac7485a3330954e800171

SQL > select * from v$mutex_sleep order by sleeps desc, location;

MUTEX_TYPE        LOCATION                        SLEEPS  WAIT_TIME 
----------------- --------------------------- ---------- ---------- 
Library Cache     kglpndl1  95                    760307   76983943 
Library Cache     kglpin1   4                     288210  172310041 
Library Cache     kglpnal1  90                    193282   38493570 
Library Cache     kglGetHandleReference 123            1          0 

--display mutex REQUESTING/BLOCKING details for each session
SQL > select * from v$mutex_sleep_history order by sleep_timestamp desc, location;

MUTEX_IDENTIFIER SLEEP_TIMESTAMP      MUTEX_TYPE          GETS  SLEEPS REQUESTING_SESSION BLOCKING_SESSION LOCATION                    MUTEX_VALUE       P1  P1RAW           
---------------- -------------------- ------------- ---------- ------- ------------------ ---------------- -------------------------- ------------------ --- ----------------
      1317011825 24-OCT-2017 14:15:40 Library Cache  675726540  449377                  7                0 kglpin1   4                 00                 0  0000000176D1E860
      1317011825 24-OCT-2017 14:15:40 Library Cache  675726542  442641                368                0 kglpndl1  95                00                 0  0000000176D1E860
      1317011825 24-OCT-2017 14:15:40 Library Cache  675711683  444299                901                7 kglpndl1  95                0000000700000000   0  0000000176D1E860
      1317011825 24-OCT-2017 14:15:40 Library Cache  675709618  438207                187                0 kglpin1   4                 00                 0  0000000176D1E860
      1317011825 24-OCT-2017 14:09:06 Library Cache    2806872       1                900                0 kglGetHandleReference 123   00                 0  0000000176D1E860


Pick spid of one Oracle session, for example, 10684, get callstack:

$ > pstack 10684
10684:  ora_j000_testdb 
 fffffd7ffc9d3e3b semsys   (4, e000013, fffffd7fffdf5658, 1, fffffd7fffdf5660)
 0000000001ab9008 sskgpwwait () + f8
 0000000001ab8c95 skgpwwait () + c5
 0000000001c710d5 ksliwat () + 8f5
 0000000001c70410 kslwaitctx () + 90
 0000000001e6ffb0 kgxWait () + 520
 000000000dd1ae6f kgxExclusive () + 1cf
 00000000021cc025 kglGetMutex () + b5
 000000000212400e kglpin () + 2fe
 00000000026aa159 kglpnp () + 269
 00000000026a71ab kgiina () + 1db
 000000000dd118b9 kgintu_named_toplevel_unit () + 39
 0000000007ac16a6 kzctxBInfoGet () + 746
 0000000007ac38ed kzctxChkTyp () + fd
 0000000007ac43f0 kzctxesc () + 510
 0000000002781d9d pevm_icd_call_common () + 29d
 0000000002781930 pfrinstr_ICAL () + 90
 0000000001a435ca pfrrun_no_tool () + 12a
 0000000001a411e0 pfrrun () + 4c0
 0000000001a3fb48 plsql_run () + 288

Where semsys(4, ...) is semtimedop(int semid, struct sembuf *sops, size_t nsops, const struct timespec *timeout) specified in syscall.h.

Run a small dtrace script:

$ > sudo dtrace -n \
'BEGIN {self->start_wts = walltimestamp; self->start_ts = timestamp;}
pid$target::kglpndl:entry /execname == "oracle"/ { self->rc = 1; }
pid$target::kgxExclusive:entry /execname == "oracle" && self->rc == 1/ { self->ts = timestamp; }
pid$target::kgxExclusive:return /self->ts > 0/ {  
 @lquant["ns"] = lquantize(timestamp - self->ts, 0, 10000, 1000); 
 @avgs["AVG_ns"] = avg(timestamp - self->ts);
 @mins["MIN_ns"] = min(timestamp - self->ts);
 @maxs["MAX_ns"] = max(timestamp - self->ts);
 @sums["SUM_ms"] = sum((timestamp - self->ts)/1000000);
 @counts[ustack(10, 0)] = count(); 
 self->rc = 0; self->ts = 0;}
END { printf("Start: %Y, End: %Y, Elapsed_ms: %d\n", self->start_wts, walltimestamp, (timestamp - self->start_ts)/1000000);}
' -p 10684

dtrace: description 'BEGIN ' matched 8 probes
Start: 2017 Oct 24 14:30:02, End: 2017 Oct 24 14:31:08, Elapsed_ms: 66183

  ns
     value  ------------- Distribution ------------- count
       < 0 |                                         0
         0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@         3352394
      1000 |@@@@@@@@                                 803168
      2000 |                                         11598
      3000 |                                         1484
      4000 |                                         890
      5000 |                                         626
      6000 |                                         460
      7000 |                                         315
      8000 |                                         265
      9000 |                                         147
  >= 10000 |                                         2227

  AVG_ns                                                         1999
  MIN_ns                                                          777
  MAX_ns                                                     20411473
  SUM_ms                                                         4214

      oracle`kgxExclusive+0x105
      oracle`kglpndl+0x1fe
      oracle`kglUnPin+0x101
      a.out`kzctxChkTyp+0x14e
      a.out`kzctxesc+0x510
      a.out`pevm_icd_call_common+0x29d
      a.out`pfrinstr_ICAL+0x90
      oracle`pfrrun_no_tool+0x12a
      oracle`pfrrun+0x4c0
      oracle`plsql_run+0x288
  4173574
The above CallStack shows that kgxExclusive is triggered by kglpin via kglGetMutex.

Solaris "prstat -mL" show about 30% percentage of time the process has spent sleeping (SLP).


3. Hot library cache objects


Learning from Blog: Divide and conquer the true mutex contention
 
    KGLPIN:   KGL: PIN heaps and load data pieces of an object 
    KGLPNDL:  KGL PiN DeLete
    KGLPNAL1: KGL PiN ALlOcate
    
    KGLHBH1 63, KGLHDGN2 106: Invalid Password, Application Context(eg: SYS_CONTEXT)
"library cache: mutex X" can be allevaited by creating multiple copies of hot objects, which are controlled by two hidden parameters:
    _kgl_hot_object_copies: controls the maximum number of copies
    _kgl_debug:             marks hot library cache objects as a candidate for cloning
Configure two hidden parameters:

SQL > alter system set "_kgl_hot_object_copies"= 255 scope=spfile;

      --alter system reset "_kgl_hot_object_copies" scope=spfile;

SQL > alter system set "_kgl_debug"=
      "name='TEST_CTX' schema='SYS'    namespace=21 debug=33554432", 
      "name='PLITBLM'  schema='PUBLIC' namespace=1  debug=33554432"
      scope=spfile;
      
      --alter system reset "_kgl_debug" scope=spfile;
PUBLIC SYNONYM (namespace=1) 'PLITBLM' is added here to show multiple library cache objects can be specified in _kgl_debug. PLITBLM is package for PLSQL Index TaBLe Mangement, i.e PLSQL Collections (Associative Arrays, Nested Table, Varrays). All its implementations are through c interface.

library cache object NAMESPACE/TYPE name and id mapping can be listed by following two queries:

SQL > select distinct kglhdnsp NAMESPACE_id, kglhdnsd NAMESPACE_name from x$kglob 
      --where kglhdnsd in ('APP CONTEXT')
      order by kglhdnsp;

SQL > select distinct kglobtyp TYPE_id, kglobtyd TYPE_name from x$kglob 
      --where kglobtyd in ('APP CONTEXT')
      order by kglobtyp;
Run the same test:

-- Stop all Jobs
SQL > exec clean_jobs;

--Restart DB to activate Hot library cache objects
SQL> startup force

SQL > select owner, name, property, hash_value, locks, pins, locked_total, pinned_total, executions, sharable_mem, namespace, type, full_hash_value
      from  v$db_object_cache v
      where (name in ('TEST_CTX') or hash_value in (1317011825) or property like '%HOT%');
      
OWNER  NAME       PROPERTY   HASH_VALUE LOCKS PINS LOCKED_TOTAL PINNED_TOTAL EXECUTIONS SHARABLE_MEM NAMESPACE    TYPE     FULL_HASH_VALUE
------ ---------- ---------- ---------- ----- ---- ------------ ------------ ---------- ------------ ------------ -------- -----------------------------
SYS    TEST_CTX   HOT        1317011825     0    0            1            0          0         0 APP CONTEXT  CURSOR   3581f5a97dfac7485a3330954e800171       
 
SQL > exec ctx_set_jobs(4); 
 
SQL > select sid, program, event, p1text, p1, p2text, p2, p3text, p3
      from v$session where program like '%(J%'; 

 SID PROGRAM                  EVENT                     P1TEXT          P1 P2TEXT             P2 P3TEXT                P3
---- ------------------------ ------------------------- ------ ----------- ------ -------------- ------ -----------------
   5 oracle@s5d00003 (J001)   null event                                 0                     0                        0
 186 oracle@s5d00003 (J004)   null event                                 0                     0                        0
 369 oracle@s5d00003 (J005)   null event                                 0                     0                        0
 902 oracle@s5d00003 (J000)   null event                                 0                     0                        0
 
SQL > select owner, name, property, hash_value, locks, pins, locked_total, pinned_total, executions, sharable_mem, namespace, type, full_hash_value
      from  v$db_object_cache v
      where (name in ('TEST_CTX') or hash_value in (1317011825) or property like '%HOT%'); 
 
OWNER  NAME       PROPERTY   HASH_VALUE LOCKS PINS LOCKED_TOTAL PINNED_TOTAL EXECUTIONS SHARABLE_MEM NAMESPACE    TYPE         FULL_HASH_VALUE
------ ---------- ---------- ---------- ----- ---- ------------ ------------ ---------- ------------ ------------ ------------ --------------------------------
SYS    TEST_CTX   HOT        1317011825     0    0            1            0          0         0    APP CONTEXT  CURSOR       3581f5a97dfac7485a3330954e800171
SYS    TEST_CTX   HOTCOPY6   1487681198     1    0            2    151394920  151394917      4096    APP CONTEXT  APP CONTEXT  1047cbfbca3fd100cf5758a258ac36ae
SYS    TEST_CTX   HOTCOPY138 3082567164     1    0            2    151821083  151821080      4096    APP CONTEXT  APP CONTEXT  31ac20c09044dfec01acd307b7bc3dfc
SYS    TEST_CTX   HOTCOPY187 3192676979     1    0            2    151252013  151252010      4096    APP CONTEXT  APP CONTEXT  6635075a7adcc68672a02262be4c6273
SYS    TEST_CTX   HOTCOPY115 4198626891     1    0            2    150529629  150529626      4096    APP CONTEXT  APP CONTEXT  d6724a1c9f480d55f73dc8fcfa41f64b 

SQL > select * from v$mutex_sleep order by sleeps desc, location;

MUTEX_TYPE                       LOCATION                                     SLEEPS  WAIT_TIME
-------------------------------- ---------------------------------------- ---------- ----------
Cursor Pin                       kkslce [KKSCHLPIN2]                               2      20118

SQL > select * from v$mutex_sleep_history order by sleep_timestamp desc, location;

MUTEX_IDENTIFIER SLEEP_TIMESTAMP      MUTEX_TYPE GETS SLEEPS REQUESTING_SESSION BLOCKING_SESSION LOCATION            MUTEX_VALUE      P1 P1RAW
---------------- -------------------- ---------- ---- ------ ------------------ ---------------- ------------------- ---------------- -- -----
      2816823972 24-OCT-2017 15:09:13 Cursor Pin    1      1                183              364 kkslce [KKSCHLPIN2] 0000016C00000000  2 00
      2214650983 24-OCT-2017 15:04:50 Cursor Pin    1      1                  5              902 kkslce [KKSCHLPIN2] 0000038600000000  2 00


Run the same dtrace script:

$ > sudo dtrace -n \
'BEGIN {self->start_wts = walltimestamp; self->start_ts = timestamp;}
pid$target::kglpndl:entry /execname == "oracle"/ { self->rc = 1; }
pid$target::kgxExclusive:entry /execname == "oracle" && self->rc == 1/ { self->ts = timestamp; }
pid$target::kgxExclusive:return /self->ts > 0/ {  
 @lquant["ns"] = lquantize(timestamp - self->ts, 0, 10000, 1000); 
 @avgs["AVG_ns"] = avg(timestamp - self->ts);
 @mins["MIN_ns"] = min(timestamp - self->ts);
 @maxs["MAX_ns"] = max(timestamp - self->ts);
 @sums["SUM_ms"] = sum((timestamp - self->ts)/1000000);
 @counts[ustack(10, 0)] = count(); 
 self->rc = 0; self->ts = 0;}
END { printf("Start: %Y, End: %Y, Elapsed_ms: %d\n", self->start_wts, walltimestamp, (timestamp - self->start_ts)/1000000);}
' -p 11751

Start: 2017 Oct 24 15:21:02, End: 2017 Oct 24 15:22:40, Elapsed_ms: 97999
  ns
     value  ------------- Distribution ------------- count
       < 0 |                                         0
         0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@   8050589
      1000 |@@                                       330902
      2000 |                                         1106
      3000 |                                         1606
      4000 |                                         1352
      5000 |                                         630
      6000 |                                         322
      7000 |                                         201
      8000 |                                         133
      9000 |                                         94
  >= 10000 |                                         481

  AVG_ns         897
  MIN_ns         813
  MAX_ns      315083
  SUM_ms           0

      oracle`kgxExclusive+0x105
      oracle`kglpndl+0x1fe
      oracle`kglUnPin+0x101
      a.out`kzctxChkTyp+0x14e
      a.out`kzctxesc+0x510
      a.out`pevm_icd_call_common+0x29d
      a.out`pfrinstr_ICAL+0x90
      oracle`pfrrun_no_tool+0x12a
      oracle`pfrrun+0x4c0
      oracle`plsql_run+0x288
  8387416
Solaris "prstat -mL" show almost 100% percentage of time the process has spent in user mode (USR).

Try with official API in dbms_shared_pool, it seems that NAMESPACE: 'APP CONTEXT' not yet supported.

-- Stop all Jobs
SQL > exec clean_jobs;

SQL > alter system reset "_kgl_debug" scope=spfile;

--Restart DB
SQL> startup force

SQL > exec sys.dbms_shared_pool.markhot('SYS', 'TEST_CTX', 21);
      --exec sys.dbms_shared_pool.unmarkhot('SYS', 'TEST_CTX', 21);

 ORA-26680: object type not supported
 ORA-06512: at "SYS.DBMS_SHARED_POOL", line 133
 
-- Using 32 Byte (16 hexadecimal) V$DB_OBJECT_CACHE.FULL_HASH_VALUE
SQL > exec sys.dbms_shared_pool.markhot(hash=>'3581f5a97dfac7485a3330954e800171', NAMESPACE=>21);
      --exec sys.dbms_shared_pool.unmarkhot(hash=>'3581f5a97dfac7485a3330954e800171', NAMESPACE=>21);

 ORA-26680: object type not supported
 ORA-06512: at "SYS.DBMS_SHARED_POOL", line 138
Comparing "_kgl_debug" and markhot, it seems that "_kgl_debug" is persistent after DB restart, but not always stable after DB restart. Several sessions can still contend for the same library cache objects without creating/using HOT objects.

Whereas markhot seems stable after DB restart, but not always persistent after DB restart. Moreover, markhot does not support all NAMESPACEs of library cache objects.

When a SYNONYM is marked HOT, it can encounter core dump with Error:
    ORA-00600: internal error code, arguments: [kgltti-no-dep1]
with CallStack:
 
  kgltti()+1358            -> dbgeEndDDEInvocation() // ERROR SIGNALED: yes COMPONENT: LIBCACHE
  kqlCompileSynonym()+3840 -> kgltti() 
  kqllod_new()+3768        -> kqlCompileSynonym() 
  kqlCallback()+79         -> kqllod_new() 
  kqllod()+710             -> kqlCallback() 
  kglobld()+1058           -> kqllod() 
  kglobpn()+1232           -> kglobld() 
  kglpim()+489             -> kglobpn() 
  kglpin()+1785            -> kglpim() 
  kglgob()+493             -> kglpin() 
  kgiind()+1529            -> kglgob() 
  pfri8_inst_spec()+126    -> kgiind() 
  pfri1_inst_spec()+69     -> pfri8_inst_spec() 
  pfrrun()+1506            -> pfri1_inst_spec() 
  plsql_run()+648          -> pfrrun() 
This error is addressed by Oracle MOS:
    ORA-00600 [kgltti-no-dep1] When Synonym Marked Hot (Doc ID 2153847.1)


4. V$MUTEX_SLEEP_HISTORY


V$MUTEX_SLEEP_HISTORY displays time-series data. Each row in this view is for a specific time, mutex type, location, requesting session and blocking session combination. The data in this view is contained within a circular buffer, with the most recent sleeps shown (Oracle V$MUTEX_SLEEP_HISTORY).

Two fields are documented as:
  GETS    The total number of gets since the mutex was created and up until the time of the wait 
          (and from all sessions past and present)
  SLEEPS  The number of times the requester had to sleep before obtaining the mutex
It seems that GETS is an instance-wide accumulated historized data, whereas SLEEPS is per REQUESTING_SESSION for a specific time, mutex type, location.

We can try to edit a query to reveal Mutex contention details. Column SLEEPS and GETs_Per_MS are the points to monitor.

with mutex_1 as (select /*+ materializse */ m.*
                ,(gets -lag(gets) over(partition by mutex_identifier, mutex_type, location order by sleep_timestamp, requesting_session)) Total_GETs_Delta
                ,(((sleep_timestamp -lag(sleep_timestamp) over(partition by mutex_identifier, mutex_type, location 
                    order by sleep_timestamp, requesting_session)))) Elapsed
               from v$mutex_sleep_history m) 
    ,mutex_2 as (select /*+ materializse */ m.*
                ,(extract(second from Elapsed) + extract(minute from Elapsed) * 60 + extract(hour from Elapsed))*1e3 Elapsed_MS
               from mutex_1 m)
    ,mutex   as (select /*+ materializse */ m.*
                ,round(Total_GETs_Delta/nullif(Elapsed_MS, 0)) GETs_Per_MS
               from mutex_2 m)
    ,oc as (select /*+ materializse */ * from v$db_object_cache where hash_value in (select mutex_identifier from mutex)) 
    ,ses as (select /*+ materializse */ s.*
              ,(select owner|| '.' ||object_name||case when procedure_name is not null then '.' ||procedure_name end
                from dba_procedures
                where object_id = s.plsql_entry_object_id and subprogram_id = s.plsql_entry_subprogram_id) plsql_entry
              ,(select owner||'.'||object_name||case when procedure_name is not null then  '.' || procedure_name end
                from dba_procedures
                where object_id = s.plsql_object_id and subprogram_id = s.plsql_subprogram_id) plsql
              ,(select dbms_lob.substr(sql_text, 50, 1) from dba_hist_sqltext where sql_id = s.sql_id and rownum = 1) sql_text
             from v$active_session_history s where sample_time <= sysdate-10/1440)   -- only last 10 minutes
select m.*, o.*, s.*
from  mutex m, oc o, ses s;
where m.mutex_identifier = o.hash_value(+)
  and m.requesting_session = s.session_id(+)
  and m.sleep_timestamp between (s.sample_time(+) - INTERVAL'1'SECOND) and s.sample_time(+)
order by m.mutex_identifier, m.mutex_type, m.location, m.gets desc, m.sleep_timestamp desc, m.requesting_session;


5. Code Path of "library cache: mutex X"


Run Application Context settings 1000 times and at the same time dtrace its SPID: 1217.

It prints out 29 "kgxExclusive" callstacks (Top 3 callstacks are listed at first). Editing them together, we can build up a small dictionary of "library cache: mutex X" to lookup different occurrences and frequencies of mutex X.

The test (on a fresh started instance) shows that 'TEST_CTX' was pinned 1000 times for 1000 executions.

1000 Application Context settings are implemented by 1000:
    PinAndPrepare:  kglpnp   -> kglpin
    PinAndAllocate: kglpin   -> kglpnal 
    PinAndDelete:   kglUnPin -> kglpndl

SQL > select name, locks, pins, locked_total, pinned_total, executions
      from v$db_object_cache where name in ('TEST_CTX') or hash_value in (1317011825);
      
      NAME             LOCKS        PINS LOCKED_TOTAL PINNED_TOTAL  EXECUTIONS
      ---------- ----------- ----------- ------------ ------------ -----------
      TEST_CTX             1           0            1         1000         999      

SQL > exec ctx_set(1000, 123);

SQL > select name, locks, pins, locked_total, pinned_total, executions
      from v$db_object_cache where name in ('TEST_CTX') or hash_value in (1317011825);

      NAME             LOCKS        PINS LOCKED_TOTAL PINNED_TOTAL  EXECUTIONS
      ---------- ----------- ----------- ------------ ------------ -----------
      TEST_CTX             1           0            1         2000        1999 

$ > sudo dtrace -n \
'pid$target::kgxExclusive:entry /execname == "oracle"/ { self->in = 1; }
pid$target::kgxExclusive:return /self->in > 0/ {
@counts[ustack(10, 0)] = count(); 
  self->in = 0;}
' -p 1217

dtrace: description 'pid$target::kgxExclusive:entry ' matched 4 probes

oracle`kgxExclusive+0x105               oracle`kgxExclusive+0x105          oracle`kgxExclusive+0x105              
a.out`kglpin+0x2fe                      a.out`kglpndl+0x1fe                a.out`kglpnal+0x1ae                    
a.out`kglpnp+0x269                      a.out`kglUnPin+0x101               a.out`kglpin+0x52f                     
a.out`kgiina+0x1db                      a.out`kzctxChkTyp+0x14e            a.out`kglpnp+0x269                     
oracle`kgintu_named_toplevel_unit+0x39  a.out`kzctxesc+0x510               a.out`kgiina+0x1db                     
a.out`kzctxBInfoGet+0x746               a.out`pevm_icd_call_common+0x29d   oracle`kgintu_named_toplevel_unit+0x39 
a.out`kzctxChkTyp+0xfd                  a.out`pfrinstr_ICAL+0x90           a.out`kzctxBInfoGet+0x746              
a.out`kzctxesc+0x510                    oracle`pfrrun_no_tool+0x12a        a.out`kzctxChkTyp+0xfd                 
a.out`pevm_icd_call_common+0x29d        oracle`pfrrun+0x4c0                a.out`kzctxesc+0x510                   
a.out`pfrinstr_ICAL+0x90                oracle`plsql_run+0x288             a.out`pevm_icd_call_common+0x29d       
  1000                                    1000                               1000                                 

oracle`kgxExclusive+0x105              oracle`kgxExclusive+0x105           oracle`kgxExclusive+0x105              
a.out`kglpin+0x2fe                     a.out`kgldtin+0x85d                 a.out`kglhdgn+0xc2                     
a.out`kglgob+0x1ed                     a.out`kgldti+0x63                   a.out`kglLock+0x679                    
a.out`kgldpo0+0x4d8                    a.out`kksauc+0x3b7                  a.out`kglget+0x11b                     
a.out`kglgbo+0x95                      a.out`kkscscid_auc_eval+0x176       a.out`kkspsc0+0x744                    
a.out`kksauc+0x888                     a.out`kkscsCheckCriteria+0x691      a.out`kksParseCursor+0x74              
a.out`kkscscid_auc_eval+0x176          a.out`kkscsCheckCursor+0x30a        a.out`opiosq0+0x70a                    
a.out`kkscsCheckCriteria+0x691         a.out`kkscsSearchChildList+0x356    a.out`kpooprx+0x102                    
a.out`kkscsCheckCursor+0x30a           a.out`kksfbc+0x916                  a.out`kpoal8+0x308                     
a.out`kkscsSearchChildList+0x356       a.out`kkspsc0+0x9d7                 oracle`opiodr+0x433                    
  1                                      1                                   1                                    
                                                                                                                  
oracle`kgxExclusive+0x105              oracle`kgxExclusive+0x105           oracle`kgxExclusive+0x105           
a.out`kglLock+0x93d                    a.out`kglpin+0x2fe                  a.out`kglpndl+0x1fe                 
a.out`kglget+0x11b                     a.out`kglgob+0x1ed                  a.out`kglUnPin+0x101                
a.out`kglgob+0x13b                     a.out`kkdogty+0x3fd                 a.out`kksauc+0x3d0                  
a.out`kkdogty+0x3fd                    a.out`kkdot2t+0x36                  a.out`kkscscid_auc_eval+0x176       
a.out`kkdot2t+0x36                     a.out`opibnd0+0xbe0                 a.out`kkscsCheckCriteria+0x691      
a.out`psdgbaa+0x791                    a.out`opibnd+0x175                  a.out`kkscsCheckCursor+0x30a        
a.out`pevm_GBVAR+0xdc                  a.out`kpopbnd+0xc7                  a.out`kkscsSearchChildList+0x356    
a.out`pfrinstr_GBVAR+0x37              a.out`kpoal8+0x1431                 a.out`kksfbc+0x916                  
oracle`pfrrun_no_tool+0x12a            oracle`opiodr+0x433                 a.out`kkspsc0+0x9d7                 
  1                                      1                                   1                                 
                                                                                                                  
oracle`kgxExclusive+0x105              oracle`kgxExclusive+0x105           oracle`kgxExclusive+0x105              
a.out`kglLock+0x93d                    a.out`kglati+0x6a                   a.out`kglLockCursor+0xf6               
a.out`kglget+0x11b                     a.out`kksaxs+0x76b                  a.out`kxsGetLookupLock+0x70            
a.out`kglgob+0x13b                     a.out`kksauc+0x1cc                  a.out`kkscsCheckCursor+0x198           
a.out`kgldpo0+0x4d8                    a.out`kkscscid_auc_eval+0x176       a.out`kkscsSearchChildList+0x356       
a.out`kglgbo+0x95                      a.out`kkscsCheckCriteria+0x691      a.out`kksfbc+0x916                     
a.out`kksauc+0x888                     a.out`kkscsCheckCursor+0x30a        a.out`kkspsc0+0x9d7                    
a.out`kkscscid_auc_eval+0x176          a.out`kkscsSearchChildList+0x356    a.out`kksParseCursor+0x74              
a.out`kkscsCheckCriteria+0x691         a.out`kksfbc+0x916                  a.out`opiosq0+0x70a                    
a.out`kkscsCheckCursor+0x30a           a.out`kkspsc0+0x9d7                 a.out`kpooprx+0x102                    
  1                                      1                                   1                                    
    
oracle`kgxExclusive+0x105              oracle`kgxExclusive+0x105           oracle`kgxExclusive+0x105       
a.out`kglhdgn+0xc2                     a.out`kglLock+0x93d                 a.out`kgllkdl+0x196             
a.out`kglLock+0x679                    a.out`kglget+0x11b                  a.out`kglUnLock+0xf8            
a.out`kglget+0x11b                     a.out`kglgob+0x13b                  a.out`kxsUnlock+0x1b1           
a.out`kglgob+0x13b                     a.out`kkdogty+0x3fd                 a.out`kxsFreeXsc+0x16e          
a.out`kkdogty+0x3fd                    a.out`kkdot2t+0x36                  a.out`kksCloseCursor+0x60a      
a.out`kkdot2t+0x36                     oracle`psdsbaa+0x244                a.out`opicca+0x81               
a.out`opibnd0+0xbe0                    a.out`pevm_SBVAR+0x123              a.out`opiclo+0x96               
a.out`opibnd+0x175                     a.out`pfrinstr_SBVAR+0x3a           a.out`kpoclsa+0x44              
a.out`kpopbnd+0xc7                     oracle`pfrrun_no_tool+0x12a         oracle`opiodr+0x433             
  1                                      1                                   1                             
    
oracle`kgxExclusive+0x105              oracle`kgxExclusive+0x105           oracle`kgxExclusive+0x105          
a.out`kgllkdl+0x196                    a.out`kglpnal+0x1ae                 a.out`kglLock+0x2aa               
a.out`kglUnLock+0xf8                   a.out`kglpin+0x52f                  a.out`kglget+0x11b                
a.out`kksauc+0x3e0                     a.out`kglgob+0x1ed                  a.out`kksaxs+0x52d                
a.out`kkscscid_auc_eval+0x176          a.out`kgldpo0+0x4d8                 a.out`kksauc+0x1cc                
a.out`kkscsCheckCriteria+0x691         a.out`kglgbo+0x95                   a.out`kkscscid_auc_eval+0x176     
a.out`kkscsCheckCursor+0x30a           a.out`kksauc+0x888                  a.out`kkscsCheckCriteria+0x691    
a.out`kkscsSearchChildList+0x356       a.out`kkscscid_auc_eval+0x176       a.out`kkscsCheckCursor+0x30a      
a.out`kksfbc+0x916                     a.out`kkscsCheckCriteria+0x691      a.out`kkscsSearchChildList+0x356  
a.out`kkspsc0+0x9d7                    a.out`kkscsCheckCursor+0x30a        a.out`kksfbc+0x916                
  1                                      1                                   1                               
                                                                                                                  
oracle`kgxExclusive+0x105              oracle`kgxExclusive+0x105           oracle`kgxExclusive+0x105           
a.out`kglpin+0x2fe                     a.out`kgllkdl+0x196                 a.out`kglpnal+0x1ae                 
a.out`kglpnp+0x269                     a.out`kglUnLock+0xf8                a.out`kglpin+0x52f                  
a.out`kgiinb+0x5e0                     a.out`kxsReleaseParentLock+0xb6     a.out`kglgob+0x1ed                  
a.out`pfri7_inst_body_common+0x18d     a.out`kxsFreeXsc+0x181              a.out`kkdogty+0x3fd                 
a.out`pfri3_inst_body+0x45             a.out`kksCloseCursor+0x60a          a.out`kkdot2t+0x36                  
oracle`pfrrun+0x685                    a.out`opicca+0x81                   a.out`opibnd0+0xbe0                 
oracle`plsql_run+0x288                 a.out`opiclo+0x96                   a.out`opibnd+0x175                  
oracle`peicnt+0x946                    a.out`kpoclsa+0x44                  a.out`kpopbnd+0xc7                  
oracle`kkxexe+0x2f3                    oracle`opiodr+0x433                 a.out`kpoal8+0x1431                 
  3                                      1                                   1                                 
                                                                                                                  
oracle`kgxExclusive+0x105              oracle`kgxExclusive+0x105           oracle`kgxExclusive+0x105              
a.out`kglhdgn+0x23e                    a.out`kglpnal+0x1ae                 a.out`kgllkdl+0x196                    
a.out`kglLock+0x679                    a.out`kglpin+0x52f                  a.out`kss_del_cb+0x19d                 
a.out`kglget+0x11b                     a.out`kglpnp+0x269                  a.out`kssdel+0xf2                      
a.out`kglgob+0x13b                     a.out`kgiinb+0x5e0                  a.out`kssdch_int+0x326                 
a.out`kkdogty+0x3fd                    a.out`pfri7_inst_body_common+0x18d  a.out`ksudlc+0xf6                      
a.out`kkdot2t+0x36                     a.out`pfri3_inst_body+0x45          a.out`kss_del_cb+0x106                 
a.out`opibnd0+0xbe0                    oracle`pfrrun+0x685                 a.out`kssdel+0xf2                      
a.out`opibnd+0x175                     oracle`plsql_run+0x288              a.out`ksupop+0x23a                     
a.out`kpopbnd+0xc7                     oracle`peicnt+0x946                 oracle`opiodr+0x4b1                    
  1                                      3                                   1                                    

oracle`kgxExclusive+0x105              oracle`kgxExclusive+0x105           oracle`kgxExclusive+0x105        
a.out`kglnti+0x83                      a.out`kglhdgn+0x23e                 a.out`kglpin+0x2fe               
a.out`kksauc+0x7c0                     a.out`kglLock+0x679                 a.out`kglpnp+0x269               
a.out`kkscscid_auc_eval+0x176          a.out`kglget+0x11b                  a.out`kgiind+0xbae               
a.out`kkscsCheckCriteria+0x691         a.out`kkspsc0+0x744                 a.out`pfri8_inst_spec+0x7e       
a.out`kkscsCheckCursor+0x30a           a.out`kksParseCursor+0x74           a.out`pfri1_inst_spec+0x45       
a.out`kkscsSearchChildList+0x356       a.out`opiosq0+0x70a                 oracle`pfrrun+0x5e2              
a.out`kksfbc+0x916                     a.out`kpooprx+0x102                 oracle`plsql_run+0x288           
a.out`kkspsc0+0x9d7                    a.out`kpoal8+0x308                  oracle`peicnt+0x946              
a.out`kksParseCursor+0x74              oracle`opiodr+0x433                 oracle`kkxexe+0x2f3              
  1                                      1                                   4                              
    
oracle`kgxExclusive+0x105              oracle`kgxExclusive+0x105   
a.out`kglpnal+0x1ae                    a.out`kglpndl+0x1fe         
a.out`kglpin+0x52f                     a.out`kss_del_cb+0x19d      
a.out`kglpnp+0x269                     a.out`kssdel+0xf2           
a.out`kgiind+0xbae                     a.out`kssdch_int+0x326      
a.out`pfri8_inst_spec+0x7e             a.out`ksudlc+0xf6           
a.out`pfri1_inst_spec+0x45             a.out`kss_del_cb+0x106      
oracle`pfrrun+0x5e2                    a.out`kssdel+0xf2           
oracle`plsql_run+0x288                 a.out`ksupop+0x23a          
oracle`peicnt+0x946                    oracle`opiodr+0x4b1         
  4                                      8  


6. Mutex vs. Latch


Latch is an instance-wide centralized locking mechanism, whereas Mutex is a distributed locking mechanism, directly attached on the specific shared memory data structures. That is why there exists v$latch (v$latch_children) for all Latches, whereas Mutex is exposed as V$DB_OBJECT_CACHE.hash_value. Latch is pre-defined and limited, whereas Mutex is dynamically created/released when requested.

Blog: Reducing "library cache: mutex X" concurrency with dbms_shared_pool.markhot lists top 3 differences between mutexes and latches:
  – A mutex can protect a single structure, latches often protect many structures
  – A mutex get is about 30-35 instructions in the algorithm, compared to 150-200 instructions for a latch get
  – A mutex is around 16 bytes in size, compared to 112-200 bytes for a latch
Mutex looks about 5 times slimmer and hence hopefully proportionally faster than Latch.

Further deep discussion can be found in Blog: LATCHES, LOCKS, PINS AND MUTEXES


7. Mutex Contention Test on Linux


Run the test on Linux, the same "library cache: mutex X" contentions can be observed.

Pick one Oracle session's spid, for example, 16060.
Run Command strace, We can see continues output of lines beginning with semtimedop:

$ > strace -Tp 16060

    semtimedop(20873218, {{41, -1, 0}}, 1, {0, 10000000}) = -1 EAGAIN (Resource temporarily unavailable) <0.011305>
Refer to Linux Documentations:

int semtimedop(int semid, struct sembuf *sops, size_t nsops, const struct timespec *timeout);

struct sembuf {
  unsigned short  sem_num; /* semaphore index in array, semaphore number */
  short          sem_op;  /* semaphore operation */
  short          sem_flg; /* operation flags */
};

struct timespec {
  time_t          tv_sec;
  long            tv_nsec;
};
                 
EAGAIN: An operation could not proceed immediately and 
        either IPC_NOWAIT was specified in sem_flg or the time limit specified in timeout expired.             
timespec.tv_nsec = 10000000 ns (0.011305 Second) looks like CFS Scheduler Time Slice as 10ms.

We can try to simulate Oracle ORA-07445 by:

kill -s SEGV 16060         
We get database alert log and incident file of killed session:
    
-- alert log    
Thu Oct 26 09:04:08 2017
Exception [type: SIGSEGV, unknown code] [ADDR:0x6400003F6A] [PC:0x7F98B7BA728A, semtimedop()+10] [exception issued by pid: 16234, uid: 100] [flags: 0x0, count: 1]
Errors in file /orabin/app/oracle/admin/c0d00445/diag/rdbms/c0d00445/c0d00445/trace/c0d00445_j001_16060.trc  (incident=22697):
ORA-07445: exception encountered: core dump [semtimedop()+10] [SIGSEGV] [ADDR:0x6400003F6A] [PC:0x7F98B7BA728A] [unknown code] []
Incident details in: /orabin/app/oracle/admin/c0d00445/diag/rdbms/c0d00445/c0d00445/incident/incdir_22697/c0d00445_j001_16060_i22697.trc

-- incident file 
Exception [type: SIGSEGV, unknown code] [ADDR:0x6400003F6A] [PC:0x7F98B7BA728A, semtimedop()+10] 
          [exception issued by pid: 16234, uid: 100] [flags: 0x0, count: 1]
Registers:
%rax: 0xfffffffffffffffc %rbx: 0x0000000000000000 %rcx: 0xffffffffffffffff
%rdx: 0x0000000000000001 %rdi: 0x00000000013e8002 %rsi: 0x00007ffc98536278
%rsp: 0x00007ffc985360a8 %rbp: 0x00007ffc985362a0  %r8: 0x0000000000002710
 %r9: 0x0000000000989680 %r10: 0x00007ffc98536228 %r11: 0x0000000000000202
%r12: 0x00007ffc985364d0 %r13: 0x00000000013e8002 %r14: 0x00000001708d3998
%r15: 0x0000000000000000 %rip: 0x00007f98b7ba728a %efl: 0x0000000000000202
  semtimedop()+0 (0x7f98b7ba7280) mov %rcx,%r10
  semtimedop()+3 (0x7f98b7ba7283) mov $0xdc,%eax
  semtimedop()+8 (0x7f98b7ba7288) syscall
> semtimedop()+10 (0x7f98b7ba728a) cmp $0xfffff001,%rax
  semtimedop()+16 (0x7f98b7ba7290) jae 0x7f98b7ba7293
  semtimedop()+18 (0x7f98b7ba7292) ret
  semtimedop()+19 (0x7f98b7ba7293) mov 0x2a2d0e(%rip),%rcx
  semtimedop()+26 (0x7f98b7ba729a) xor %edx,%edx
 
   sslsshandler()+456         -> ssexhd()        // CALL TYPE: call   ERROR SIGNALED: yes   COMPONENT: (null)
   __sighandler()             -> sslsshandler() 
   semtimedop()+10            -> __sighandler() 
   sskgpwwait()+237           -> semtimedop() 
   skgpwwait()+200            -> sskgpwwait() 
   ksliwat()+2091             -> skgpwwait() 
   kslwaitctx()+161           -> ksliwat() 
   ksfwaitctx()+32            -> kslwaitctx() 
   kgxWait()+1076             -> ksfwaitctx() 
   kgxExclusive()+741         -> kgxWait() 
   kglGetMutex()+170          -> kgxExclusive() 
   kglpndl()+503              -> kglGetMutex() 
   kglpnds()+81               -> kglpndl() 
   kglUnPin()+228             -> kglpnds() 
   kzctxChkTyp()+306          -> kglUnPin() 
   kzctxesc()+347             -> kzctxChkTyp() 
   pevm_icd_call_common()+411 -> kzctxesc() 
   pfrinstr_ICAL()+135        -> pevm_icd_call_common() 
   pfrrun_no_tool()+60        -> pfrinstr_ICAL() 
   pfrrun()+1155              -> pfrrun_no_tool() 
   plsql_run()+708            -> pfrrun() 


8. No Read Consistency in Application Context (Addendum 2017.06.06)


Modify above local context as global context, and start a job to update context value by:

exec clean_jobs;

create or replace context test_ctx using test_ctx_pkg accessed globally;

create or replace procedure ctx_set_global(p_cnt number) as
begin
 for i in 1..p_cnt loop
   test_ctx_pkg.set_val(i);    
 end loop;
end;
/

create or replace procedure ctx_set_jobs_global as
  l_job_id pls_integer;
begin
  dbms_job.submit(l_job_id, 'begin while true loop ctx_set_global(1000000); end loop; end;');
  commit;
end;    
/

exec ctx_set_jobs_global;

And then query systimestamp and global context by:

alter session set nls_timestamp_tz_format ='yyyy-mon-dd hh24:mi:ss.ff tzr tzd';
column name format a8
column min_time format a35
column max_time format a35
column min_ctx format a10
column max_ctx format a10

-- grace period for job start
exec dbms_lock.sleep(10);   

with sq_time_1 as (select systimestamp tm from dual connect by level <= 10000),
     sq_ctx_1  as (select sys_context('test_ctx', 'attr') ctx from dual connect by level <= 100),
     sq_time_2 as (select systimestamp tm from dual connect by level <= 10000),
     sq_ctx_2  as (select sys_context('test_ctx', 'attr') ctx from dual connect by level <= 100)
select 'Snap_1' name, min(tm) min_time, max(tm) max_time, min(ctx) min_ctx, max(ctx) max_ctx from sq_time_1, sq_ctx_1
union all
select 'Snap_2' name, min(tm) min_time, max(tm) max_time, min(ctx) min_ctx, max(ctx) max_ctx from sq_time_2, sq_ctx_2;


NAME     MIN_TIME                            MAX_TIME                            MIN_CTX    MAX_CTX
-------- ----------------------------------- ----------------------------------- ---------- ----------
Snap_1   2017-jun-06 08:05:09.625790 +02:00  2017-jun-06 08:05:09.625790 +02:00  263007     263007
Snap_2   2017-jun-06 08:05:09.625790 +02:00  2017-jun-06 08:05:09.625790 +02:00  362922     362922

exec dbms_lock.sleep(4);

with sq_ctx_1  as (select sys_context('test_ctx', 'attr') ctx from dual connect by level <= 100),
     sq_time_1 as (select systimestamp tm from dual connect by level <= 10000)
select 'Snap_1' name, min(tm) min_time, max(tm) max_time, min(ctx) min_ctx, max(ctx) max_ctx from sq_time_1, sq_ctx_1
union all
select 'Snap_2' name, systimestamp, systimestamp, min(sys_context('test_ctx', 'attr')), max(sys_context('test_ctx', 'attr')) 
  from dual connect by level <= 100;

NAME     MIN_TIME                            MAX_TIME                            MIN_CTX    MAX_CTX
-------- ----------------------------------- ----------------------------------- ---------- ----------
Snap_1   2017-jun-06 08:05:17.354265 +02:00  2017-jun-06 08:05:17.354265 +02:00  693574     693574
Snap_2   2017-jun-06 08:05:17.354265 +02:00  2017-jun-06 08:05:17.354265 +02:00  794397     794396

The output shows that systimestamp satisfies Read Consistency, but global context has two or more different values, hence global context does not guarantee Statement-Level Read Consistency.

Even for local context, Blog: Gotcha: Application Contexts demonstrated Non Read Consistency.

9. Test Code


create or replace context test_ctx using test_ctx_pkg;

create or replace package test_ctx_pkg is 
  procedure set_val (val number);
 end;
/

create or replace package body test_ctx_pkg is
  procedure set_val (val number) as
  begin
    dbms_session.set_context('test_ctx', 'attr', val);
  end;
end;
/

create or replace procedure ctx_set(p_cnt number, val number) as
begin
 for i in 1..p_cnt loop
  test_ctx_pkg.set_val(val);    -- 'library cache: mutex X' on TEST_CTX
 end loop;
end;
/

create or replace procedure ctx_set_jobs(p_job_cnt number) as
  l_job_id pls_integer;
begin
  for i in 1.. p_job_cnt loop
    dbms_job.submit(l_job_id, 'begin while true loop ctx_set(100000, '||i||'); end loop; end;');
  end loop;
  commit;
end;    
/

create or replace procedure clean_jobs as
begin
  for c in (select job from dba_jobs) loop
    begin
       dbms_job.remove (c.job);
    exception when others then null;
    end;
    commit;
  end loop;

  for c in (select d.job, d.sid, (select serial# from v$session where sid = d.sid) ser 
              from dba_jobs_running d) loop
    begin
      execute immediate
             'alter system kill session '''|| c.sid|| ',' || c.ser|| ''' immediate';
      dbms_job.remove (c.job);
    exception when others then null;
    end;
    commit;
  end loop;
  
  -- select * from dba_jobs;
  -- select * from dba_jobs_running;
end;
/


/**
exec ctx_set_jobs(4);

exec clean_jobs;

column program format a20
column event format a25
column p1text format a6
column p1 format 9999999999
column p2text format a6
column p2 format 9999999999999
column p3text format a6
column p3 format 9999999999999999

select sid, program, event, p1text, p1, p2text, p2, p3text, p3
  from v$session where program like '%(J%';

  SID PROGRAM              EVENT                     P1TEXT          P1 P2TEXT             P2 P3TEXT                P3
----- -------------------- ------------------------- ------ ----------- ------ -------------- ------ -----------------
   38 oracle@testdb (J003) library cache: mutex X    idn     1317011825 value   3968549781504 where   9041305591414788
  890 oracle@testdb (J000) library cache: mutex X    idn     1317011825 value    163208757248 where   9041305591414874
  924 oracle@testdb (J001) library cache: mutex X    idn     1317011825 value   4556960301056 where   9041305591414874
 1061 oracle@testdb (J002) library cache: mutex X    idn     1317011825 value   3968549781504 where   9041305591414879

  
column name format a10
column namespace format a15
column type format a15
set numformat 9999999999
  
select name, namespace, type, hash_value, locks, pins, locked_total, pinned_total
from v$db_object_cache where hash_value in (1317011825);

NAME       NAMESPACE       TYPE             HASH_VALUE       LOCKS        PINS LOCKED_TOTAL PINNED_TOTAL
---------- --------------- --------------- ----------- ----------- ----------- ------------ ------------
TEST_CTX   APP CONTEXT     APP CONTEXT      1317011825           4           0            4    257802287

**/