Wednesday, January 24, 2018

Oracle Logical Read: Consistent Gets Access Path and Cost

Oracle Logical Reads (Buffer Gets) fetch data from Buffer Cache (Memory) in two different Modes: Consistent Mode Get (Consistent Gets) and Current Mode Get (DB Block Gets). Blocks in Consistent Mode are the memory versions at the POINT IN TIME the query started, whereas Blocks in Current Mode are the versions at CURRENT TIME (right now). Each block can have multiversion Clones in Consistent Mode, but maximum one single version in Current Mode.

This Blog will discuss Consistent Gets, and next Blog will talk about DB Block Gets.

At first, we test Consistent Gets in 4 different Access Paths and measure the Cost in terms of Event 10200 ("consistent read buffer status"), then demonstrate 'latch: cache buffers chains' in row-by-row slow processing.

Note: All tests are done in Oracle 12.1.0.2 on AIX, Solaris, Linux with 6 physical processors.


1. Test Setup


Pick the same Test Code from Blog: Hot Block Identification and latch: cache buffers chains.

drop table test_tab;
              
create table test_tab 
  INITRANS   26           -- prevent Segments ITL Waits and Deadlocks
as select level id, rpad('ABC', 10, 'X') val from dual connect by level <= 5;

alter table test_tab minimize records_per_block;

truncate table test_tab;

insert into test_tab select level id, rpad('ABC', 10, 'X') val from dual connect by level <= 100;

commit;

drop index test_tab#u1;

create unique index test_tab#u1 on k.test_tab (id) 
  pctfree    90
  initrans   26        -- prevent Segments ITL Waits and Deadlocks
;

exec dbms_stats.gather_table_stats(null, 'TEST_TAB', cascade => TRUE);

select block, to_char(block, 'xxxxxxxx') block_hex, count (*) cnt 
  from (select rowid rid, 
               dbms_rowid.rowid_object (rowid) object, 
               dbms_rowid.rowid_relative_fno (rowid) fil, 
               dbms_rowid.rowid_block_number (rowid) block, 
               dbms_rowid.rowid_row_number (rowid) ro, 
               t.* 
          from test_tab t 
         order by fil, block, ro) 
 group by block 
 order by block, cnt desc;

     BLOCK BLOCK_HEX        CNT
  -------- --------- ----------
   3070331    2ed97b          5
   3070332    2ed97c          5
   3070333    2ed97d          5
   3070334    2ed97e          5
   3070335    2ed97f          5
   3072640    2ee280          5
   3072641    2ee281          5
   3072642    2ee282          5
   3072643    2ee283          5
   3072644    2ee284          5
   3072645    2ee285          5
   3072646    2ee286          5
   3072647    2ee287          5
   3072649    2ee289          5
   3072650    2ee28a          5
   3072651    2ee28b          5
   3072652    2ee28c          5
   3072653    2ee28d          5
   3072654    2ee28e          5
   3072655    2ee28f          5
 20 rows selected. 
  
select object_name, object_id from dba_objects where object_name = 'TEST_TAB#U1';  

  OBJECT_NAME  OBJECT_ID
  ----------- ----------
  TEST_TAB#U1    2260907

select block#, to_char(block#, 'xxxxxxxx') block#_hex
      ,count(*) rows_per_block
from (
  select dbms_rowid.rowid_block_number(sys_op_lbid (2260907, 'L', t.rowid))  block#
        ,t.*
    from test_tab t
   where id is not null
)
group by block#
order by block#; 

    BLOCK# BLOCK#_HE ROWS_PER_BLOCK
  -------- --------- --------------
   3072660    2ee294              5
   3072661    2ee295              5
   3072662    2ee296              5
   3072663    2ee297              5
   3072664    2ee298              5
   3072665    2ee299              5
   3072666    2ee29a              5
   3072667    2ee29b              5
   3072668    2ee29c              5
   3072669    2ee29d              5
   3072670    2ee29e              5
   3072671    2ee29f              5
   3072673    2ee2a1              5
   3072674    2ee2a2              5
   3072675    2ee2a3              5
   3072676    2ee2a4              5
   3072677    2ee2a5              5
   3072678    2ee2a6              5
   3072679    2ee2a7              5
   3072680    2ee2a8              5
 20 rows selected 


2. Buffer Read Access Path Tests


Trace Event 10200 shows all consistent read Blocks and their sequence. We will demonstrate the number of consistent read Blocks with different access path.

At first, display Table and Index information:

column object_name format a12
column object_id_hex format a20
column data_object_id_hex format a20

select object_name, object_id, data_object_id, 
       to_char(object_id, 'xxxxxxxx') object_id_hex, to_char(data_object_id, 'xxxxxxxx') data_object_id_hex
 from dba_objects where object_name in ('TEST_TAB', 'TEST_TAB#U1');


 OBJECT_NAME  OBJECT_ID  DATA_OBJECT_ID  OBJECT_ID_HEX  DATA_OBJECT_ID_HEX
 ----------- ---------- --------------- -------------- -------------------
 TEST_TAB       2260905         2260906         227fa9              227faa
 TEST_TAB#U1    2260907         2260907         227fab              227fab


2.1 USER ROWID without using Index


First collect ROWID for test, then make the test:

select /*+ user_rowid_no_index */ t.*, rowid rid from test_tab t 
where id in (1, 2, 3, 4, 5, 6, 7) order by t.id;

  ID VAL        RID
 --- ---------- ------------------
   1 ABCXXXXXXX AAIn+qAAAAALtl7AAA
   2 ABCXXXXXXX AAIn+qAAAAALtl7AAB
   3 ABCXXXXXXX AAIn+qAAAAALtl7AAC
   4 ABCXXXXXXX AAIn+qAAAAALtl7AAD
   5 ABCXXXXXXX AAIn+qAAAAALtl7AAE
   6 ABCXXXXXXX AAIn+qAAAAALtl8AAA
   7 ABCXXXXXXX AAIn+qAAAAALtl8AAB

alter session set events '10200 trace name context forever,level 10';

select /*+ user_rowid_no_index */ * from test_tab t 
where id in (1, 2, 3, 4, 5, 6, 7)
  and rowid in 
      ('AAIn+qAAAAALtl7AAA'
      ,'AAIn+qAAAAALtl7AAB'
      ,'AAIn+qAAAAALtl7AAC'
      ,'AAIn+qAAAAALtl7AAD'
      ,'AAIn+qAAAAALtl7AAE'
      ,'AAIn+qAAAAALtl8AAA'
      ,'AAIn+qAAAAALtl8AAB');

alter session set events '10200 trace name context off';

Rows (1st) Rows (avg) Rows (max)  Row Source Operation (0 Branch Block, 0 Leaf Blocks, 3 Table Blocks)
---------- ---------- ----------  ---------------------------------------------------
         7          7          7  INLIST ITERATOR  (cr=3 pr=0 pw=0 time=130 us)
         7          7          7   TABLE ACCESS BY USER ROWID TEST_TAB (cr=3 pr=0 pw=0 time=261 us cost=1 size=14 card=1)

ktrget2(): started for block  
 <0x07cf : 0x002ed97b> objd: 0x00227faa       <-- Table  Block 1
 <0x07cf : 0x002ed97b> objd: 0x00227faa       <-- Table  Block 1
 <0x07cf : 0x002ed97c> objd: 0x00227faa       <-- Table  Block 2


2.2 INDEX RANGE SCAN with INDEX ROWID BATCHED



alter session set events '10200 trace name context forever,level 10';

select /*+ index_range_scan */ * from test_tab t where id between 1 and 7;

alter session set events '10200 trace name context off';

Row Source Operation (1 Branch Block, 3 Leaf Blocks, 3 Table Blocks)
---------------------------------------------------
TABLE ACCESS BY INDEX ROWID BATCHED TEST_TAB (cr=7 pr=0 pw=0 time=210 us cost=3 size=98 card=7)
 INDEX RANGE SCAN TEST_TAB#U1 (cr=4 pr=0 pw=0 time=552 us cost=2 size=0 card=7)(object id 2260907)

ktrgtc2(): started for block
  <0x07cf : 0x002ee293> objd: 0x00227fab   <-- Branch Block 
  <0x07cf : 0x002ee294> objd: 0x00227fab     <-- Leaf  Block 1
  <0x07cf : 0x002ed97b> objd: 0x00227faa       <-- Table  Block 1
  <0x07cf : 0x002ee294> objd: 0x00227fab     <-- Leaf  Block 1
  <0x07cf : 0x002ed97b> objd: 0x00227faa       <-- Table  Block 1
  <0x07cf : 0x002ee295> objd: 0x00227fab     <-- Leaf  Block 2
  <0x07cf : 0x002ed97c> objd: 0x00227faa       <-- Table  Block 2


2.3 INDEX UNIQUE SCAN



select /*+ index_uniqe_scan */ * from test_tab t where id in (1, 2, 3, 4, 5, 6, 7);

Row Source Operation (2 Branch Block, 8 Leaf Blocks, 3 Table Blocks)
---------------------------------------------------
INLIST ITERATOR  (cr=13 pr=0 pw=0 time=218 us)
 TABLE ACCESS BY INDEX ROWID TEST_TAB (cr=13 pr=0 pw=0 time=812 us cost=3 size=98 card=7)
  INDEX UNIQUE SCAN TEST_TAB#U1 (cr=10 pr=0 pw=0 time=585 us cost=2 size=0 card=7)(object id 2260907)
  
ktrgtc2(): started for block  
  <0x07cf : 0x002ee293> objd: 0x00227fab   <-- Branch Block
  <0x07cf : 0x002ee294> objd: 0x00227fab     <-- Leaf  Block 1
  <0x07cf : 0x002ed97b> objd: 0x00227faa       <-- Table  Block 1
  <0x07cf : 0x002ee293> objd: 0x00227fab   <-- Branch Block
  <0x07cf : 0x002ee294> objd: 0x00227fab     <-- Leaf  Block 1
  <0x07cf : 0x002ed97b> objd: 0x00227faa       <-- Table  Block 1
  <0x07cf : 0x002ee294> objd: 0x00227fab     <-- Leaf  Block 1
  <0x07cf : 0x002ee294> objd: 0x00227fab     <-- Leaf  Block 1
  <0x07cf : 0x002ee294> objd: 0x00227fab     <-- Leaf  Block 1
  <0x07cf : 0x002ee294> objd: 0x00227fab     <-- Leaf  Block 1
  <0x07cf : 0x002ee295> objd: 0x00227fab     <-- Leaf  Block 2
  <0x07cf : 0x002ed97c> objd: 0x00227faa       <-- Table  Block 2
  <0x07cf : 0x002ee295> objd: 0x00227fab     <-- Leaf  Block 2


2.4 INDEX ROWID GET


We can construct a pipelined functions to force INDEX ROWID GETs in a given frequency.

create type number_tab as table of number;
/

create or replace function id_pipelined(p_cnt number, p_sleep_seconds number := 0) 
return number_tab pipelined as
begin
  for i in 1..p_cnt loop
    if p_sleep_seconds > 0 then
      dbms_lock.sleep(p_sleep_seconds);  -- purposed delay, specially before producing first row
    end if;
    pipe row(i);
  end loop;
  return;
  exception
    when no_data_needed then
      raise;
    when others then
      dbms_output.put_line('others Handler');
      raise;
end;
/

select /*+ leading(c) cardinality(c 10) indx(t test_tab#u1) index_rowid_pipelined */ * 
  from table(cast (id_pipelined(7, 1.3) as number_tab)) c, test_tab t
 where c.column_value = t.id;
  
Row Source Operation (2 Branch Block, 8 Leaf Blocks, 7 Table Blocks)
---------------------------------------------------
NESTED LOOPS  (cr=17 pr=0 pw=0 time=253 us cost=35 size=160 card=10)
 NESTED LOOPS  (cr=10 pr=0 pw=0 time=7802413 us cost=35 size=160 card=10)
  COLLECTION ITERATOR PICKLER FETCH ID_PIPELINED (cr=0 pr=0 pw=0 time=7801325 us cost=30 size=20 card=10)
  INDEX UNIQUE SCAN TEST_TAB#U1 (cr=10 pr=0 pw=0 time=823 us cost=1 size=0 card=1)(object id 2260907)
 TABLE ACCESS BY INDEX ROWID TEST_TAB (cr=7 pr=0 pw=0 time=395 us cost=1 size=14 card=1)
  
ktrgtc2(): started for block 
 <0x07cf : 0x002ee293> objd: 0x00227fab  <-- Branch Block 
 <0x07cf : 0x002ee294> objd: 0x00227fab    <-- Leaf  Block 1
 <0x07cf : 0x002ed97b> objd: 0x00227faa      <-- Table  Block 1
 <0x07cf : 0x002ee293> objd: 0x00227fab  <-- Branch Block 
 <0x07cf : 0x002ee294> objd: 0x00227fab    <-- Leaf  Block 1
 <0x07cf : 0x002ed97b> objd: 0x00227faa      <-- Table  Block 1
 <0x07cf : 0x002ee294> objd: 0x00227fab    <-- Leaf  Block 1
 <0x07cf : 0x002ed97b> objd: 0x00227faa      <-- Table  Block 1
 <0x07cf : 0x002ee294> objd: 0x00227fab    <-- Leaf  Block 1
 <0x07cf : 0x002ed97b> objd: 0x00227faa      <-- Table  Block 1
 <0x07cf : 0x002ee294> objd: 0x00227fab    <-- Leaf  Block 1
 <0x07cf : 0x002ed97b> objd: 0x00227faa      <-- Table  Block 1
 <0x07cf : 0x002ee294> objd: 0x00227fab    <-- Leaf  Block 1
 <0x07cf : 0x002ee295> objd: 0x00227fab    <-- Leaf  Block 2
 <0x07cf : 0x002ed97c> objd: 0x00227faa      <-- Table  Block 2
 <0x07cf : 0x002ee295> objd: 0x00227fab    <-- Leaf  Block 2
 <0x07cf : 0x002ed97c> objd: 0x00227faa      <-- Table  Block 2


3. Discussions


3.1 Access Path


Above 4 tests showed:
  USER ROWID:         3 Blocks (0 Branch Block, 0 Leaf Blocks, 3 Table Blocks)
  INDEX RANGE SCAN:   7 Blocks (1 Branch Block, 3 Leaf Blocks, 3 Table Blocks)
  INDEX UNIQUE SCAN: 13 Blocks (2 Branch Block, 8 Leaf Blocks, 3 Table Blocks)
  INDEX ROWID GET:   17 Blocks (2 Branch Block, 8 Leaf Blocks, 7 Table Blocks)
The fourth test is the most expensive one because pipelined function is a row-by-row processing. The consumer starts processing only after producer produces it (the principle of "pipelined"). In the above example, consumer waits for "p_sleep_seconds" when receiving first row from producer. In the real applications, during "p_sleep_seconds", any modifications can occur, and in order to maintain Read Consistency from query start SCN, Oracle should use UNDO to clone Buffer, hence, 'latch: cache buffers chains' (see Demo in next Section).

In the aspect of Disk Read, row-by-row access is fulfilled by "db file sequential read" instead of more efficient multiblock (batch) "db file scattered read" or "db file parallel read" (to be discussed in next Blog: Oracle DB File Read Access Path and Cost).

Oracle 12c Database Data Cartridge Developer's Guide Section: 13 Using Pipelined and Parallel Table Functions wrote:

Rows from a collection returned by a table function can also be pipelined; this means that they are iteratively returned as they are produced, instead of being returned in a single batch after all processing of the table function's input is completed.

Pipelining enables a table function to return rows faster and can reduce the memory required to cache a table function's results.


It looks like that Pipelined function is a reflection of consuming row-by-row produced data as a memory (PGA) optimization of collection processing if intermediate result can be aggregated and consumed rows can be forgotten.

By the way, Pipelined table functions can be implemented in two ways: native PL/SQL and interface. In the above example, we use the native PL/SQL approach.

Oracle is a RDBMS based on relational model, where "relation" means "tabular", in mathematics, a set of objects (rows).

Blog Ask TOM: Slow processing on database using merge declared:
    SQL is about sets, the best way to make SQL perform is to think SETS, not procedurally

The first test with "USER ROWID" is the cheapest fetching approach. In real application, it can be implemented by first caching ROWID of previously fetched ROWs, and subsequently, make the direct table fetch, and therefore bypass INDEX access. (See: Blog: Hot Block Identification and latch: cache buffers chains Section: "5. Index Hot Block Application Fix")


3.2 'latch: cache buffers chains' on Index Blocks


Index is made of a part of table columns. Usually it is more condensed than table, i.e, each Branch/Leaf Block can contain more rows than table Block. In other words, the same index Block is more frequently touched than table Block, hence more prone to 'latch: cache buffers chains' when multi sessions concurrently access it.


4. 'latch: cache buffers chains' Demo



create or replace procedure index_rowid_pipelined_get(p_cnt number, p_sleep_seconds number := 0) as
begin
  for c in (
    select /*+ leading(c) cardinality(c 10) indx(t test_tab#u1) index_rowid_pipelined */ * 
    from table(cast (id_pipelined(p_cnt, p_sleep_seconds) as number_tab)) c, test_tab t
    where c.column_value = t.id)
  loop
    null;
  end loop;
end;
/

create or replace procedure index_rowid_pipelined_get_jobs(p_job_cnt number, p_cnt number, p_sleep_seconds number := 0) as
  l_job_id pls_integer;
begin
  for i in 1.. p_job_cnt loop
    dbms_job.submit(l_job_id, 'begin while true loop index_rowid_pipelined_get('||p_cnt||', '||p_sleep_seconds||'); end loop; end;');
  end loop;
  commit;
end;    
/
Launch 6 Jobs on a UNIX with 6 physical Processors:

exec index_rowid_pipelined_get_jobs(6, 90, 0);
All Blocks in V$BH have status 'xcur'.

When modifying Table and no commit/rollback:

update test_tab set id = -id where abs(id) between 3 and 12;   -- Without commit/rollback
We can immediately observe 'latch: cache buffers chains' or 'buffer busy waits':

select * from v$session_wait where event in ('latch: cache buffers chains') order by event, sid;

select * from  v$session_wait_history where event in ('latch: cache buffers chains') order by event, sid;
In case of 'latch: cache buffers chains', one guess is that when
  V$SESSION_WAIT.state = 'WAITED SHORT TIME' (less than one centisecond), V$LATCH.Spin_Gets increased
  V$SESSION_WAIT.state = 'WAITED KNOWN TIME' (more than one centisecond), V$LATCH.Sleeps increased
  no entries in V$SESSION_WAIT, but V$BH.status='cr' Block Read,          V$LATCH.immediate_gets increased
but it is not clear how to confirm it.

By querying v$bh, we can see Consistent Read Copy of Blocks (cloned) marked as 'cr':

select d.object_name, data_object_id, b.status, b.* 
  from v$bh b, dba_objects d
 where b.objd = d.data_object_id
   and object_name in ('TEST_TAB', 'TEST_TAB#U1')
   --and b.status not in ('free')
   and b.status in ('xcur', 'cr')
order by b.status, object_name, block#;
In this case, Event 10201 ("consistent read undo application") can be used to trace CR Reads constructed by applying UNDO Blocks.

To stop all Jobs, execute following "clean_jobs".


-------------- clean_jobs --------------

create or replace procedure clean_jobs as
begin
  for c in (select job from dba_jobs) loop
    begin
       dbms_job.remove (c.job);
    exception when others then null;
    end;
    commit;
  end loop;

  for c in (select d.job, d.sid, (select serial# from v$session where sid = d.sid) ser 
              from dba_jobs_running d) loop
    begin
      execute immediate
             'alter system kill session '''|| c.sid|| ',' || c.ser|| ''' immediate';
      dbms_job.remove (c.job);
    exception when others then null;
    end;
    commit;
  end loop;
  
  -- select * from dba_jobs;
  -- select * from dba_jobs_running;
end;
/

-- exec clean_jobs; -- stop all jobs