Wednesday, November 7, 2018

Row Cache and Sql Executions (IV)


(I)-Tracing Methods      (II)-Object Type(Part-1)      (III)-Object Type(Part-2)       (IV)-Sql Executions (IV)       (V)-Contentions and Scalability


A common belief is that Row Cache is used in Sql hard parsing or Plsql compiling. While minimizing the number of Sql parsing is to reduce non-productive overhead, maximizing the number of Sql executions is to achieve the utmost productive outcome(throughput).

If each Sql execution is bundled with Row Cache GETs and is going hand in hand together, it will be gradually dragged down, and eventually at some point be contained by Row Cache Latch contentions (see next Blog).

In previous two Blogs, we have showed that the Object Type Row Cache Gets are proportional to the number of executions in host language Plsql and Java (still interesting to see other languages).

This Blog will look Row Cache Gets and Sql Executions.

Note: All tests are done in Oracle 12.1.0.2.

Update (2020-Jul-16): See section: 1.1. Oracle 19c rowcache FASTGETS


1. dc_tablespaces, dc_users and Hash Joins


At first let's run a test for 1000 Hash Joins (after warmup), and then examine the output:

------------------------- Setup -------------------------
drop table hj1;
drop table hj2;

create table hj1 as select level x from dual connect by level <= 1000; 
create table hj2 as select level x from dual connect by level <= 1000; 

------------------------- Test -------------------------
column name format a20
column parameter format a20
column cache#_list format a20

select parameter, sum(gets) gets, listagg(cache#, ',') within group (order by cache#) as cache#_list 
from v$rowcache where parameter in ('dc_tablespaces', 'dc_users') and gets > 0
group by parameter;

select name, gets, addr, latch#, child#
from v$latch_children v where name in ('row cache objects') and child# in (5, 8);

select name, gets, addr, latch#
      ,(select count(*) from v$latch_children where name = v.name) children_cnt
from v$latch v where name in ('simulator hash latch');

select a.name, b.value from v$statname a, v$mystat b 
where a.statistic# = b.statistic# and name in ('parse count (total)', 'parse count (hard)');

declare
  l_cnt number;
begin
  for i in 1..1000 loop
    select /*+ use_hash(t2) */ count(*) into l_cnt from hj1 t1, hj2 t2 where t1.x=t2.x;
    -- with sq as (select /*+ materialize*/ * from hj1) select count(*) into l_cnt from sq;  -- for next test
  end loop;
end;
/

--grace period
exec dbms_lock.sleep(1);  

select a.name, b.value from v$statname a, v$mystat b 
where a.statistic# = b.statistic# and name in ('parse count (total)', 'parse count (hard)');

select parameter, sum(gets) gets, listagg(cache#, ',') within group (order by cache#) as cache#_list 
from v$rowcache where parameter in ('dc_tablespaces', 'dc_users') and gets > 0
group by parameter;

select name, gets, addr, latch#, child#
from v$latch_children v where name in ('row cache objects') and child# in (5, 8);

select name, gets, addr, latch#
      ,(select count(*) from v$latch_children where name = v.name) children_cnt
from v$latch v where name in ('simulator hash latch');

------------------------- Output -------------------------
---*** Before Test ***---
   PARAMETER             GETS   CACHE#_LIST
   --------------------- ------ -----------
   dc_tablespaces        11000            0
   dc_users              18746         7,10
   
   NAME                  GETS   ADDR              LATCH#  CHILD#
   --------------------- ------ ----------------- ------- ------
   row cache objects     34571  0000000182D973D0  411     5
   row cache objects     56110  0000000182D978C8  411     8
   
   NAME                  GETS   ADDR              LATCH#  CHILDREN_CNT
   --------------------- ------ ----------------- ------- ------------
   simulator hash latch  55711  00000000600296D0  240     2048

   NAME                  VALUE
   -------------------- ------
   parse count (total)     321
   parse count (hard)        1

---*** After Test ***---
   NAME                  VALUE
   -------------------- ------
   parse count (total)     328
   parse count (hard)        1

   PARAMETER             GETS   CACHE#_LIST
   --------------------- ------ -----------
   dc_tablespaces        12000            0
   dc_users              19756         7,10
   
   NAME                  GETS   ADDR              LATCH#  CHILD#
   --------------------- ------ ----------------- ------- ------
   row cache objects     37571  0000000182D973D0  411     5
   row cache objects     59140  0000000182D978C8  411     8
   
   NAME                  GETS   ADDR              LATCH#  CHILDREN_CNT
   --------------------- ------ ----------------- ------- ------------
   simulator hash latch  57712  00000000600296D0  240     2048

------------------------- Delta -------------------------
   NAME                 Delta-Parse Count
   -------------------- -----------------
   parse count (total)                  7
   parse count (hard)                   0

   PARAMETER             Delta-GETS   CACHE#_LIST
   --------------------- ------------ -----------
   dc_tablespaces        1000                   0
   dc_users              1010                7,10
                                      
   NAME                  Delta-GETS   ADDR              LATCH#  CHILD#
   --------------------- ------------ ----------------- ------- -----------------
   row cache objects     3000         0000000182D973D0  411     5 (dc_tablespaces)
   row cache objects     3030         0000000182D978C8  411     8 (dc_users)     
                                      
   NAME                  Delta-GETS   ADDR              LATCH#  CHILDREN_CNT
   --------------------- ------------ ----------------- ------- ------------
   simulator hash latch  2001         00000000600296D0  240     2048
The Delta values for 1000 Hash Joins showed:
  total parse count is 7, hard parse count is 0, hence no any hard parse observed (misconception ?).
  Row Cache: dc_tablespaces (CID=0), and dc_users (CID=7) made about 1000 GETs each.
  Row Cache Latch: dc_tablespaces (5) and dc_users (8) made about 3000 GETs each (3 times of Row Cache GETs).
  simulator hash latch made about 2000 GETs.
(Note: more statistics are listed with Tom Kyte Runstats)

The same behaviour can be observed by 10222 trace or dtrace, for example,
run dtace: rco_dtrace_cache, the output looks like (CID ADDR are same too):

CID = 0  ADDR = 17C376EF8
           kqrLockAndPinPo:entry
    oracle`kqrLockAndPinPo
    oracle`kqrpre1+0x8d4
    oracle`ktatminextsz+0x2f2
    oracle`qerhjComputeFanoutAndBPS+0x7a5
    oracle`qerhjComputeHjParameters+0x12f

CID = 7  ADDR = 17BD52070
           kqrLockAndPinPo:entry
    oracle`kqrLockAndPinPo
    oracle`kqrpre1+0x8d4
    oracle`ktsitbs_info+0x53
    oracle`ktatminextsz+0x9a3
    oracle`qerhjComputeFanoutAndBPS+0x7a5

  *************** CID Stats with Address ****************
  CID = 11   ADDR = 178FC8160        CNT = 3
  CID = 17   ADDR = 179A6D968        CNT = 3
  CID = 0    ADDR = 17C376EF8        CNT = 1000
  CID = 7    ADDR = 17BD52070        CNT = 1000
  *************** CID Stats ****************
  CID = 11   CNT = 3
  CID = 17   CNT = 3
  CID = 0    CNT = 1000
  CID = 7    CNT = 1000
  *************** CID Stats Summary ****************
  CNT = 2006  
If we put one more table (e.g, hj3) in the above select, all the GETs (except "simulator hash latch") will be doubled because of the one additional Hash Join.

This behaviour is described in MOS:
  Bug 13902396 Hash joins cause "row cache objects" latch gets and "shared pool" latch gets (disabled fix)
    The function making these calls, ktatminextsz(), returns the minimum extent size for the users temporary tablespace
It can be switched off (except "simulator hash latch") by a special event:

  alter system set events '45053 trace name context forever, level 255';
  
  --Switched on by: 
  alter system set events '45053 trace name context off';
  
  (45053 Event has no effect on "simulator hash latch")
where 255 is obtained by:

(Blog: 如何诊断latch: row cache objects)
  SELECT name, decode (bitmapped, 0, least (DFLincr, dflinit), bitmapped) -1 "Level" 
  FROM sys.ts$ where name in ('TEMP', 'UNDO', 'SYSTEM');
    NAME    Level
    ------- -----
    TEMP    255      --TEMP allocation_type is UNIFORM
    UNDO    7
    SYSTEM  7
    
  select tablespace_name, block_size, initial_extent, next_extent, allocation_type
        ,(least(initial_extent, coalesce(next_extent, initial_extent))/block_size) - 1 "Level" 
  from dba_tablespaces where tablespace_name in ('TEMP', 'UNDO', 'SYSTEM');   
  
    TABLESPACE_NAME  BLOCK_SIZE  INITIAL_EXTENT  NEXT_EXTENT  ALLOCATION_TYPE  Level
    ---------------- ----------- --------------- ------------ ---------------- -----
    SYSTEM           8192        65536                        SYSTEM           7
    TEMP             8192        2097152         2097152      UNIFORM          255
    UNDO             8192        65536                        SYSTEM           7 
With following Dtrace script, we can verify the return value (255) of subroutine: ktatminextsz. Therefore, setting value 255 effectively bypasses repeated calls of ktatminextsz.

declare
  l_cnt number;
begin
  for i in 1..1000 loop
    select /*+ use_hash(t2) */ count(*) into l_cnt from hj1 t1, hj2 t2 where t1.x=t2.x;
  end loop;
end;
/

sudo dtrace -n \
'pid$target::ktatminextsz:return /execname == "oracle"/ {
  @RET_CNT[arg0, arg1] = count();}
END {
  printf("\n**** ktatminextsz Output ****");
  printa("\nProgram Counter Offset = %d \nMINEXTSZ(Return Value) = %d \nCNT = %-10@d", @RET_CNT); }
' -p 17419


  **** ktatminextsz Output ****
  Program Counter Offset = 2578
  MINEXTSZ(Return Value) = 255
  CNT = 1000
Bug 13902396 Note said:
  This fix is disabled by default. ...
  This fix should ONLY be used under the direction of Oracle.
The Bug was first created in Mar 29, 2012 for Oracle 11.2.0.3, but still reproducible in 12cR1 since the fix is disabled by default according to the Note.

We have seen that in Oracle, the second TOP Latch by children count (the first is: cache buffers chains, see next Blog) is:

  NAME                     CNT     RATIO
  ----------------------- ------  ------
  simulator hash latch    2048    4.71
which is also heavily used in Hash Join, but hardly noticed any contentions owing to the number of its children
("simulator hash latch" is included here because it is a long time searched latch in Oracle community).

In the above test, if we use the same Table join by replacing hj2 with hj1:

declare
  l_cnt number;
begin
  for i in 1..1000 loop
    select /*+ use_hash(t2) */ count(*) into l_cnt from hj1 t1, hj1 t2 where t1.x=t2.x;
  end loop;
end;
/
There is (almost) no more GETs of "simulator hash latch".

By the way, 45053 is a kg_event with valid value from 0 to 2^32.

SQL> ORADEBUG DOC EVENT NAME

     Events in library GENERIC:
     ------------------------------
     kg_event[]           Support old error number events (use err# for short)

45053 Level Range: [0, 4294967295]  


1.1. Oracle 19c rowcache FASTGETS (update in 2020-Jul-16)


In Oracle 19c, we need to run following test to observe v$rowcache.FASTGETS (after first time warmup). (See Blog: Oracle rowcache fastgets)

------------------------- Setup -------------------------
drop table hj1;
drop table hj2;

create table hj1 as select level x from dual connect by level <= 1000; 
create table hj2 as select level x from dual connect by level <= 1000; 

------------------------- Tested (Oracle 19c on AIX, LINUX, Solaris) -------------------------
-- parse count (hard) not changed
-- v$rowcache.FASTGETS increase is equal to loop number
-- v$rowcache.GETS not changed (always about 600, probably from v$rowcache select)

column name format a20
column parameter format a20
column cache#_list format a20
column gets new_value gets_old
column fastgets new_value gets_old_fast

select parameter, sum(gets) gets, sum(fastgets) fastgets, listagg(cache#, ',') within group (order by cache#) as cache#_list 
from v$rowcache where parameter in ('dc_users') and gets > 0
group by parameter;

select a.name, b.value from v$statname a, v$mystat b 
where a.statistic# = b.statistic# and name in ('parse count (total)', 'parse count (hard)');

declare
  l_cnt number;
begin
  for i in 1..7000 loop      -- increase loop number to see FASTGETS changes.  
    select /*+ use_hash(t2) */ count(*) into l_cnt from hj1 t1, hj2 t2 where t1.x=t2.x;
    -- with sq as (select /*+ materialize*/ * from hj1) select count(*) into l_cnt from sq;  -- for next test
  end loop;
end;
/

select a.name, b.value from v$statname a, v$mystat b 
where a.statistic# = b.statistic# and name in ('parse count (total)', 'parse count (hard)');

select parameter, sum(gets) gets, sum(fastgets) fastgets, listagg(cache#, ',') within group (order by cache#) as cache#_list
      ,sum(gets)-&gets_old delta_gets,  sum(fastgets)-&gets_old_fast detal_fastgets
from v$rowcache where parameter in ('dc_users') and gets > 0
group by parameter;

------------------------- Test Output -------------------------
---*** Before Test ***---

PARAMETER                  GETS   FASTGETS CACHE#_LIST
-------------------- ---------- ---------- --------------------
dc_users                4510502   14201934 7,10

NAME                      VALUE
-------------------- ----------
parse count (total)       10232
parse count (hard)          444

---*** After Test ***---
NAME                      VALUE
-------------------- ----------
parse count (total)       17237
parse count (hard)          444

PARAMETER                  GETS   FASTGETS CACHE#_LIST          DELTA_GETS DETAL_FASTGETS
-------------------- ---------- ---------- -------------------- ---------- --------------
dc_users                4511076   14208939 7,10                        574           7005
Above test output showed that there is no parse count (hard), but DETAL_FASTGETS is close to number of loop iterations.

Oracle has a hidden parameter:
  _kqr_optimistic_reads 
  optimistic reading of row cache objects 
  default: FALSE (Oracle 12.1); TRUE (Oracle 19.6)
19c changed the default value from FALSE to TRUE.
  When default "_kqr_optimistic_reads"=TRUE,  FASTGETS is euqal to iterations, GETS is 0 
       (non-zero 574 of DELTA_GETS is due to monitoring query).
  When default "_kqr_optimistic_reads"=FALSE, GETS is euqal to iterations, FASTGETS is 0.
In 19c, if we set it back to 12c (require DB re-start):

   alter system set "_kqr_optimistic_reads"=FALSE;
we can observe the 12c behaviour.


2. dc_tablespaces, dc_users and temporary tables


Now it raises the question if 45053 is a universal trace event, which is also temporary table relevant.

Let's try it. In the above test, replace:

    select /*+ use_hash(t2) */ count(*) into l_cnt from hj1 t1, hj2 t2 where t1.x=t2.x;
by a Subquery Factoring Clause with materialize hint to resolve the subquery as a global temporary table:

    with sq as (select /*+ materialize*/ * from hj1) select count(*) into l_cnt from sq;
Then make two tests: one is without 45053 trace event (default); one is with it.

we got the identical result as follows:

------------------------- Delta -------------------------
   PARAMETER             Delta-GETS   CACHE#_LIST
   --------------------- ------------ -----------
   dc_tablespaces        5000                   0
   dc_users              2000                7,10
                                      
   NAME                  Delta-GETS   ADDR              LATCH#  CHILD#
   --------------------- ------------ ----------------- ------- -----------------
   row cache objects     15000        0000000182D973D0  411     5 (dc_tablespaces)
   row cache objects     6000         0000000182D978C8  411     8 (dc_users)     
                                      
   NAME                  Delta-GETS   ADDR              LATCH#  CHILDREN_CNT
   --------------------- ------------ ----------------- ------- ------------
   simulator hash latch  2399         00000000600296D0  240     2048
Therefore setting 45053 trace event does not reduce the Row Cache Gets.

Run dtace: rco_dtrace_cache, the output is consistent with above output:

*************** CID Stats with Address (double counting) ****************
CID = 7    ADDR = 17BF59D30        CNT = 1
CID = 11   ADDR = 16669DED0        CNT = 3
CID = 17   ADDR = 179A6D968        CNT = 3
CID = 7    ADDR = 17D7CACD0        CNT = 2000
CID = 0    ADDR = 17C376EF8        CNT = 5005
*************** CID Stats ****************
CID = 11   CNT = 3
CID = 17   CNT = 3
CID = 7    CNT = 2001
CID = 0    CNT = 5005
*************** CID Stats Summary ****************
CNT = 7012
Comparing to the test of Hash Joins in Section 1, dc_tablespaces is 5 times, dc_users is 2 times.

If we further investigate the reason by checking the call of subroutine: "ktatminextsz", there is no output.

sudo dtrace -n \
'pid$target::ktatminextsz:return /execname == "oracle"/ {
  @RET_CNT[arg0, arg1] = count();}
END {
  printf("\n**** ktatminextsz Output ****");
  printa("\nProgram Counter Offset = %d \nMINEXTSZ(Return Value) = %d \nCNT = %-10@d", @RET_CNT); }
' -p 13767

**** ktatminextsz Output ****
That is probably why activating 45053 trace event has no effect on above temporary tables Factoring Clause.

Now it is unclear how to enumerate all the cases in which Row Cache Gets are proportional to Sql Executions.