Wednesday, January 29, 2020

Oracle Unique Index Non-Consistency Read with RowCR

RowCR is an optimization to avoid constructing a CR block if the unique index accessed row is not in an uncommitted transaction. If a row is updated and committed after a query started using unique index, instead of reading undo block and cloning a before image, Oracle fetches the row from modified block (after image). It results in a wrong result since it is no more a consistent read.

Blog: Oracle Consistency Read Changed made an extensive test of RowCR behaviors and contains a short description on RowCR:
     A brief overview of this optimization is that we try to avoid rollbacks while constructing a CR block 
     if the present block has no uncommitted changes.
and cited some Oracler's confirmation: "We have RowCR Optimization turned on by default, which may not be appropriate."

In this Blog, we will repeat the same tests as above Blog with:
  -.Sql Trace Event (10046)
  -.Consistent Read Event (10200)
  -.Dtrace 
so that we can try to have a further understanding of RowCR internals.
(see Blog: Dynamic tracing of Oracle logical I/O: part 2. Dtrace LIO v2 is released
and right linked book: Oracle Database Performance Tuning (Studies . Practices . Researches) - Chapter 1: Data Accesses),

Parallel to the discussion of Read Consistency, we will give a look of Write Consistency achieved by DML Restart, and show a new way to track exact three update starts.

Note:
     -. All tests done in 12c, 18c, 19c.
     -. Trace files are shortened by removing irrelevant text, and commented by meta data names (started with "<==").
     -. As of Oracle 11gR2, "_row_cr" is TRUE in default. The feature can be disabled by setting "_row_cr" = FALSE


1. Test Setup


We first create a table and two indexes, one unique, another non-unique. Then create two Plsql procedures, one is a query procedure with two indexes, and another is an update procedure. In the query procedure, open a cursor and then go to sleep for a few seconds so that during the sleep, update procedure is executed.

---====================== Test Setup ======================---

drop table rowcr_tab purge;

create table rowcr_tab (id_uniq, id_non_uniq, color, ts, seq) 
   as select level, level, 'BLACK', localtimestamp, 0 from dual connect by level <= 1e3;

create unique index rowcr_tab#id_uniq on rowcr_tab(id_uniq);

--alter index rowcr_tab#id_uniq rebuild reverse;
--alter index rowcr_tab#id_uniq rebuild noreverse;

create        index rowcr_tab#id_non_uniq on rowcr_tab(id_non_uniq);

exec dbms_stats.gather_table_stats(null, 'ROWCR_TAB', cascade=>true);

create or replace procedure rowcr_read_u1n2 (p_test varchar2, p_id number, p_sleep number) as
  cursor c_uniq is select /*+ index(t rowcr_tab#id_uniq) */ * from rowcr_tab t where id_uniq = p_id;
  cursor c_non_uniq is select /*+ index(t rowcr_tab#id_non_uniq) */ * from rowcr_tab t where id_non_uniq = p_id;
  --cursor c_rowid is select * from rowcr_tab t where rowid = chartorowid('AAJXn8AAAAAANCVADN');
  l_row          rowcr_tab%rowtype;
begin
  if p_test in ('all', 'uniq') then open c_uniq; end if;
  if p_test in ('all', 'non_uniq') then open c_non_uniq; end if;
  dbms_output.put_line('-------------------- Cursor Opened at '|| localtimestamp);
  
  dbms_lock.sleep(p_sleep);
  
  if p_test in ('all', 'uniq') then 
   fetch c_uniq into l_row;
   while (c_uniq%found) loop
     dbms_output.put_line('Unqiue     Index Read Color: '||l_row.color||' at '|| l_row.ts);
     fetch c_uniq into l_row;
   end loop;
   close c_uniq;
  end if;
    
  if p_test in ('all', 'non_uniq') then 
   fetch c_non_uniq into l_row;
   while (c_non_uniq%found) loop
     dbms_output.put_line('Non-Unqiue Index Read Color: '||l_row.color||' at '|| l_row.ts);
     fetch c_non_uniq into l_row;
   end loop;
   close c_non_uniq;
 end if;
end;
/

create or replace procedure rowcr_update (p_id number, p_commit boolean) as
  l_old_val varchar2(50);
  l_new_val varchar2(50);
begin
  select 'Old Color: '||color||' at '||ts into l_old_val from rowcr_tab t where id_uniq = p_id;
  dbms_output.put_line(l_old_val);
  dbms_output.put_line('--------- Update at '|| localtimestamp);
  update rowcr_tab set color=decode(color, 'BLACK', 'WHITE', 'WHITE', 'BLACK'), ts=localtimestamp where id_uniq=p_id;
  if p_commit then 
     commit;
  end if;
  select 'New Color: '||color||' at '||ts into l_new_val from rowcr_tab t where id_uniq = p_id;
  dbms_output.put_line(l_new_val);
end;
/


2. Collect Meta Data


Then we list the Meta Data which are referenced in later trace files.

---====================== Collect Meta Data ======================---

select object_name, object_id, object_type, to_char(object_id, 'XXXXXXXXXXX') objd_hex 
  from dba_objects where object_name like 'ROWCR_TAB%';
  
OBJECT_NAME            OBJECT_ID  OBJECT_TYPE  OBJD_HEX
---------------------  ---------  -----------  --------
ROWCR_TAB              2456060    TABLE        2579FC
ROWCR_TAB#ID_UNIQ      2456061    INDEX        2579FD
ROWCR_TAB#ID_NON_UNIQ  2456062    INDEX        2579FE

select segment_name, segment_type, tablespace_name, header_file, header_block,
       to_char(header_file, 'xxxxxx') header_file_hex, to_char(header_block+1, 'xxxxxxx') header_block_hex
  from dba_segments where segment_name like 'ROWCR_TAB%';

SEGMENT_NAME           SEGMENT_TYPE  TABLESPACE_NAME  HEADER_FILE  HEADER_BLOCK  HEADER_FILE_HEX  HEADER_BLOCK_HEX
---------------------  ------------  ---------------  -----------  ------------  ---------------  ----------------
ROWCR_TAB              TABLE         U1               1548         53394             60c              d093
ROWCR_TAB#ID_UNIQ      INDEX         U1               1548         53402             60c              d09b
ROWCR_TAB#ID_NON_UNIQ  INDEX         U1               1548         53410             60c              d0a3

select name, ts#, to_char(ts#, 'xxxxx') ts#_hex from v$tablespace where name='U1';

NAME  TS#   TS#_HEX
----  ----  -------
U1    1999     7cf

select indx, kcbwhdes from sys.x_kcbwh where indx in (1298, 1299, 1004, 1007, 61, 1061);

INDX  KCBWHDES
----  ----------------
61    ktuwh27: kturbk
1004  kdswh02: kdsgrp
1007  kdswh05: kdsgrp
1061  kdiwh16: kdifxs
1298  qeilwhrp: qeilbk
1299  qeilwhnp: qeilbk

select rowid rd, t.* from rowcr_tab t where id_uniq=678 or id_non_uniq=678;

RD                    ID_UNIQ ID_NON_UNIQ COLOR TS
------------------ ---------- ----------- ----- ---------------------------
AAJXn8AAAAAANCVADN        678         678 BLACK 23.01.20 15:21:21.903262000


3. Run Test


We open two Sqlplus sessions. In the first session, we open cursor, sleep 30 seconds, then fetch row from opened cursor. In the second session, we make update on that same row within above 30 sleeping seconds. Here the test output from both sessions:

---========== Session-1 Run @T1, Wait 30 Seconds ==========---

15:05:41 TESTDB(111)> exec rowcr_read_u1n2(p_test => 'all', p_id => 678, p_sleep => 30);

  -------------------- Cursor Opened at 23-JAN-2020 15:05:56
  Unqiue     Index Read Color: WHITE at 23-JAN-2020 15:06:09
  Non-Unqiue Index Read Color: BLACK at 23-JAN-2020 12:42:49


---========== Session-2 Run @T2 (T2 < T1 + 30) ==========---

15:06:05 TESTDB(222)> exec rowcr_update(678, true);

  Old Color: BLACK at 23-JAN-2020 12:42:49
  -------------------- Update at 23-JAN-2020 15:06:09
  New Color: WHITE at 23-JAN-2020 15:06:09
In Session-2, the color in row (id = 678) is updated from BLACK to WHITE and committed. In Session-1, Unqiue Index Read returns WHITE, which is the after image even though the cursor was opened before update, whereas Non-Unqiue Index Read returns BLACK, which is the before image.

The consequence is that Unqiue Index Read returns a wrong Non-Consistency result, whereas Non-Unqiue Index Read gives the correct Consistency result.

Now we run two tests with Sql Trace Event (10046), Consistent Read Event (10200) and Dtrace.


3.1. Unique Index Read


Here the test and output:

---========== Session-1 Run @T1, Wait 30 Seconds ==========---

alter session set events='10046 trace name context forever, level 12: 
                          10200 trace name context forever, level 10' 
      tracefile_identifier='10046_10200_rowcr_uniq_1';
      
exec rowcr_read_u1n2(p_test => 'uniq', p_id => 678, p_sleep => 30);

alter session set events='10046 trace name context off : 10200 trace name context off ';


15:08:24 TESTDB(111)> exec rowcr_read_u1n2(p_test => 'uniq', p_id => 678, p_sleep => 30);
  
  -------------------- Cursor Opened at 23-JAN-2020 15:08:24
  Unqiue     Index Read Color: BLACK at 23-JAN-2020 15:08:40


---========== Session-2 Run @T2 (T2 < T1 + 30) ==========---

15:08:38 TESTDB(222)> exec rowcr_update(p_id => 678, p_commit => true);

  Old Color: WHITE at 23-JAN-2020 15:06:09
  -------------------- Update at 23-JAN-2020 15:08:40
  New Color: BLACK at 23-JAN-2020 15:08:40


3.1.1. Unique Index - Sql Trace



SELECT /*+ index(t rowcr_tab#id_uniq) */ * FROM ROWCR_TAB T WHERE ID_UNIQ = :B1 

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        0      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          3          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.00       0.00          0          3          0           1

Row Source Operation
---------------------------------------------------
TABLE ACCESS BY INDEX ROWID ROWCR_TAB (cr=3 pr=0 pw=0 time=593 us cost=1 size=25 card=1)
 INDEX UNIQUE SCAN ROWCR_TAB#ID_UNIQ (cr=2 pr=0 pw=0 time=187 us cost=1 size=0 card=1)(object id 2456061)


3.1.2. Unique Index - Consistent Read Trace (Event 10200)



ktrgtc2(): started for block <0x07cf : 0x0000d09b> objd: 0x002579fd    <== ROWCR_TAB#ID_UNIQ root
ktrexc(): returning 2 on:  ffff80ffbccf4e70  
ktrgtc2(): completed for block <0x07cf : 0x0000d09b> objd: 0x002579fd

ktrgtc2(): started for block <0x07cf : 0x0000d09d> objd: 0x002579fd    <== ROWCR_TAB#ID_UNIQ leaf
ktrexc(): returning 2 on:  ffff80ffbccf4e70  
ktrgtc2(): completed for block <0x07cf : 0x0000d09d> objd: 0x002579fd

ktrgtc2(): started for block <0x07cf : 0x0000d095> objd: 0x002579fc    <== ROWCR_TAB
ktrexc(): returning 9 on:  ffff80ffbccf4e70 
ktrgtc2(): completed for block <0x07cf : 0x0000d095> objd: 0x002579fc


3.1.3. Unqiue Index - Dtrace



$ dtracelio.d 1111

Dynamic tracing of Oracle logical I/O v2.1 by Alexander Anokhin ( http://alexanderanokhin.wordpress.com )

  kcbgtcr(0xFFFF80FFBFFF39E0,1,1298,0) [tsn: 1999 rdba: 0xd09b (0/53403) obj: 2456061] where: 1298 exam: 1  <== ROWCR_TAB#ID_UNIQ root
  kcbgtcr(0xFFFF80FFBC1446E8,1,1299,0) [tsn: 1999 rdba: 0xd09d (0/53405) obj: 2456061] where: 1299 exam: 1  <== ROWCR_TAB#ID_UNIQ leaf
  kcbgtcr(0xFFFF80FFBC14C518,1,1004,0) [tsn: 1999 rdba: 0xd095 (0/53397) obj: 2456060] where: 1004 exam: 1  <== ROWCR_TAB

===================== Logical I/O Summary (grouped by object/function) ==============
 function    stat   object_id   data_object_id   mode_held   where     bufs     calls
--------- ------- ----------- ---------------- ----------- ------- -------- ---------
  kcbgtcr      cr     2456060          2456060                1004        1         1
  kcbgtcr      cr     2456061          2456061                1298        1         1
  kcbgtcr      cr     2456061          2456061                1299        1         1
=====================================================================================

============================= Logical I/O Summary (grouped by object) =============================
 object_id  data_object_id   lio   cr   cr (e)   cr (d)   cu   cu (d) ispnd (Y) ispnd (N)   pin rls
---------- --------------- ----- ---- -------- -------- ---- -------- --------- --------- ---------
         0               0     0    0        0        0    0        0         0         1         0
   2456060         2456060     1    1        1        0    0        0         0         1         1   <== ROWCR_TAB
   2456061         2456061     2    2        2        0    0        0         0         1         0   <== ROWCR_TAB#ID_UNIQ
---------- --------------- ----- ---- -------- -------- ---- -------- --------- --------- ---------
     total                     3    3        3        0    0        0         0         3         1
===================================================================================================

Legend
  lio      : logical gets (cr + cu)
  cr       : consistent gets
  cr (e)   : consistent gets - examination
  cr (d)   : consistent gets direct
  cu       : db block gets
  cu (d)   : db block gets direct
  ispnd (Y): buffer is pinned count
  ispnd (N): buffer is not pinned count
  pin rls  : pin releases

where
  INDX  KCBWHDES
  ----  ----------------
  1298  qeilwhrp: qeilbk
  1299  qeilwhnp: qeilbk
  1004  kdswh02: kdsgrp  
The above trace files showed that Unique Index made 2 ROWCR_TAB#ID_UNIQ cr reads and 1 ROWCR_TAB cr read, without any UNDO read.

If we dump the block, we can see that the fetched row has been updated and committed (commit SCN) after query started (query start SCN), and when query read the row using unique index, it saw the row already committed, and simply returned it without any SCN checking although (query start SCN < commit SCN).


3.2. Non-Unique Index Read



---========== Session-1 Run @T1, Wait 30 Seconds ==========---

alter session set events='10046 trace name context forever, level 12: 
                          10200 trace name context forever, level 10' 
      tracefile_identifier='10046_10200_rowcr_non_uniq_1';
      
exec rowcr_read_u1n2(p_test => 'non_uniq', p_id => 678, p_sleep => 30);

alter session set events='10046 trace name context off : 10200 trace name context off ';


15:17:34 TESTDB(111)> exec rowcr_read_u1n2(p_test => 'non_uniq', p_id => 678, p_sleep => 30);

  -------------------- Cursor Opened at 23-JAN-2020 15:17:34
  Non-Unqiue Index Read Color: WHITE at 23-JAN-2020 15:12:53


---========== Session-2 Run @T2 (T2 < T1 + 30) ==========---

15:17:25 TESTDB(222)> exec rowcr_update(p_id => 678, p_commit => true);
  Old Color: WHITE at 23-JAN-2020 15:12:53
  -------------------- Update at 23-JAN-2020 15:17:43
  New Color: BLACK at 23-JAN-2020 15:17:43


3.2.1. Non-Unique Index - Sql Trace



********************************************************************************

SELECT /*+ index(t rowcr_tab#id_non_uniq) */ * FROM ROWCR_TAB T WHERE ID_NON_UNIQ = :B1 


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        0      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          5          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.00       0.00          0          5          0           1

 Row Source Operation
 ---------------------------------------------------
 TABLE ACCESS BY INDEX ROWID BATCHED ROWCR_TAB (cr=5 pr=0 pw=0 time=778 us cost=1 size=25 card=1)
  INDEX RANGE SCAN ROWCR_TAB#ID_NON_UNIQ (cr=3 pr=0 pw=0 time=422 us cost=1 size=0 card=1)(object id 2456062)


3.2.2. Non-Unique Index - Consistent Read Trace (Event 10200)



ktrgtc2(): started for block <0x07cf : 0x0000d0a3> objd: 0x002579fe      <== ROWCR_TAB#ID_NON_UNIQ root
ktrexc(): returning 2 on:  ffff80ffbccf4e70  
ktrgtc2(): completed for block <0x07cf : 0x0000d0a3> objd: 0x002579fe

ktrget2(): started for block  <0x07cf : 0x0000d0a5> objd: 0x002579fe     <== ROWCR_TAB#ID_NON_UNIQ leaf
ktrexf(): returning 9 on:  ffff80ffbccf4e70 
ktrgcm(): completed for block  <0x07cf : 0x0000d0a5> objd: 0x002579fe
ktrget2(): completed for  block <0x07cf : 0x0000d0a5> objd: 0x002579fe

ktrget2(): started for block  <0x07cf : 0x0000d095> objd: 0x002579fc     <== ROWCR_TAB
ktrexf(): returning 9 on:  ffff80ffbccf4e70 
kcbchg updating CR fields for 0xa0f720d8, 53397; 896:7b82885d            <== read UNDO block
kcbchg new CR fields for 0xa0f720d8, 53397; 896:7b82885d
ktrgcm(): completed for block  <0x07cf : 0x0000d095> objd: 0x002579fc
ktrget2(): completed for  block <0x07cf : 0x0000d095> objd: 0x002579fc

ktrget2(): started for block  <0x07cf : 0x0000d0a5> objd: 0x002579fe     <== ROWCR_TAB#ID_NON_UNIQ leaf
ktrexf(): returning 9 on:  ffff80ffbccf4e70                              <== INDEX RANGE SCAN next read
ktrgcm(): completed for block  <0x07cf : 0x0000d0a5> objd: 0x002579fe
ktrget2(): completed for  block <0x07cf : 0x0000d0a5> objd: 0x002579fe


3.2.3. Non-Unique Index - Dtrace



$ dtracelio1t.d 1111

Dynamic tracing of Oracle logical I/O v2.1 by Alexander Anokhin ( http://alexanderanokhin.wordpress.com )

  kcbgtcr(0xFFFF80FFBC147A40,1,1298,0)          [tsn: 1999 rdba: 0xd0a3   (0/53411) obj: 2456062] where: 1298 exam: 1    <== ROWCR_TAB#ID_NON_UNIQ root
  kcbgtcr(0xFFFF80FFBC147A40,0,1299,4033943420) [tsn: 1999 rdba: 0xd0a5   (0/53413) obj: 2456062] where: 1299 exam: 0    <== ROWCR_TAB#ID_NON_UNIQ leaf
  kcbgtcr(0xFFFF80FFBC1463C8,0,1007,0)          [tsn: 1999 rdba: 0xd095   (0/53397) obj: 2456060] where: 1007 exam: 0    <== ROWCR_TAB
  kcbgtcr(0xFFFF80FFBFFF1FB0,1,61,0)            [tsn:    2 rdba: 0xc051ef (3/20975) obj: 0  dobj: -1] where: 61 exam: 1  <== UNDO block
  kcbgtcr(0xFFFF80FFBC147790,0,1061,0)          [tsn: 1999 rdba: 0xd0a5   (0/53413) obj: 2456062] where: 1061 exam: 0    <== ROWCR_TAB#ID_NON_UNIQ leaf

===================== Logical I/O Summary (grouped by object/function) ==============
 function    stat   object_id   data_object_id   mode_held   where     bufs     calls
--------- ------- ----------- ---------------- ----------- ------- -------- ---------
  kcbgtcr      cr           0               -1                  61        1         1
  kcbgtcr      cr     2456060          2456060                1007        1         1
  kcbgtcr      cr     2456062          2456062                1061        1         1
  kcbgtcr      cr     2456062          2456062                1298        1         1
  kcbgtcr      cr     2456062          2456062                1299        1         1
=====================================================================================

============================= Logical I/O Summary (grouped by object) =============================
 object_id  data_object_id   lio   cr   cr (e)   cr (d)    cu  cu (d) ispnd (Y) ispnd (N)   pin rls
---------- --------------- ----- ---- -------- -------- ----- ------- --------- --------- ---------
         0               0     0    0        0        0     0       0         0         1         0
   2456060         2456060     1    1        0        0     0       0         0         1         1   <== ROWCR_TAB
         0              -1     1    1        1        0     0       0         0         0         0   <== UNDO block
   2456062         2456062     3    3        1        0     0       0         0         2         2   <== ROWCR_TAB#ID_NON_UNIQ
---------- --------------- ----- ---- -------- -------- ----- ------- --------- --------- ---------
     total                     5    5        2        0     0       0         0         4         3
===================================================================================================

where
  INDX  KCBWHDES
  ----  ----------------
  1298  qeilwhrp: qeilbk
  1299  qeilwhnp: qeilbk
  1007  kdswh05: kdsgrp
  61    ktuwh27: kturbk  
  1061  kdiwh16: kdifxs  
The above trace files showed that Non-Unique Index made 3 ROWCR_TAB#ID_UNIQ cr reads, 1 ROWCR_TAB cr read, and additionally 1 UNDO read to re-construct before image.


4. RowCR Documentation


Web page: RE: _row_cr setting in RAC has a detailed explanation about Row CR:
  In Oracle9i Release 2, the Row CR feature can partially alleviate the overhead of the global block cleanout/rollback problem. 
  The Row CR feature will reduce the number of CR rollbacks and avoid a costly block cleanout/rollback in a RAC environment. 
  Instead of performing a block cleanout, Row CR will only attempt to generate a CR version for the particular row. 
  Currently, Row CR works for UPDATE statements that have a Unique Index Scan or Fetch by Row ID in the row source. 
  Whenever there is a Fetch by Unique Index row source in the execution plan, that causes CR cleanout / rollback, 
  Row CR will kick in. The statistic Row CR attempts essentially measures the number of updates that have this property 
  that cause CR cleanout or rollbacks to happen. The statistic Row CR hits measures the success ratio of 
  Row CR among the attempts made. Oracle10g and beyond this feature should extend for index range scans, 
  as well as further optimize cleanout processing. 
  
  To extend Row CR functionality, the hidden parameter _row_cr needs to be set to TRUE (default is FALSE). 
  Row CR will be tried before invoking CR Rollback. The parameter can be changed dynamically, by using ALTER SYSTEM. 
  Currently, Row CR is turned on only for Updates whose plan is Unique Scans or Fetch by Row ID. 
  No other operation gets Row CR. When setting the _row_cr parameter to TRUE, Row CR will be attempted for ALL SQL 
  (this will include Updates, Selects, Deletes and Joins) that have a Fetch By Unique Index 
  OR a Fetch BY Row ID row source in the row source tree.
  
  This parameter has benefited several bugs, however this is unsupported functionality. Consider setting _row_cr=TRUE 
  if majority are immediate CR cleanouts
  
  Scott  
Oracle MOS Doc: ORA-00600: [KTRGCM_3] (Doc ID 424779.1) wrote:
     "_row_cr" is a hidden parameter used to control CR requests and Buffer waits on remote undo segment headers 
     which is a common problem on RAC instances
     
     RowCR is now enabled and supported with 10.2. RowCR can partially reduce the overhead of 
     the global block cleanout/rollback problem by simply attempting to generate a CR version for the requested row. 
     
     Please note that RowCR (_row_cr=TRUE) is NOT SUPPORTED in Oracle versions prior to 10g Release.
Reading above documentation, it looks like that RowCR is an optimization, originally implemented for RAC.


5. ktrexc, ktrexf and Buffer Pin


Above trace files showed two code paths of logical read:
     ktrgtc2->kcbgtcr->ktrexc: for unique index and index root
     ktrgtc2->kcbgtcr->ktrexf: for non-unique index
For unique index, the first path with ktrexc is used. For index root block, the same ktrexc path is used.

According to Blog: Buffer is pinned count:

Notice that function ktrexc is called in kcbgtcr. It is examination - the case when Oracle just read the buffer and does not pin it. The statistic "consistent gets - examination" is incremented inside this function.

It has a deep discussion with Dtrace on statistics "buffer is pinned count" and "buffer is not pinned count". It looks like that:
     index root block is never pinned.
     non-unique index leaf block is always pinned.
     table block is checked and pinned.


6. Query without using Cursor


In the above test, we observed this Non-Consistency Read with cursor.

Now we can also make one more test without using cursor, and check if there still exists such Non-Consistency Read.

At first, create a sleeping function to postpone query row fetch.

create or replace function row_sleep(p_id number, p_seconds number) return number as 
begin
  dbms_output.put_line('Sleeping seconds: '||p_seconds||' for id: '||p_id||' at '||localtimestamp);
  dbms_lock.sleep(p_seconds);
  return p_id;
end;
/
Then run following statement in Session-1, and same previous update statement in Session-2.

  select t.* from rowcr_tab t where id_uniq in (1, 678);
  select t.* from rowcr_tab t where id_uniq in (1, 678) and row_sleep(id_uniq, 10) = id_uniq;
  select t.* from rowcr_tab t where id_uniq in (1, 678);
Here the test out put from both sessions:

---========== Session-1 Run @T1, Wait 40 Seconds ==========---

09:23:02 TESTDB(111)> select t.* from rowcr_tab t where id_uniq in (1, 678);

  ID_UNIQ ID_NON_UNIQ COLOR TS
  ------- ----------- ----- --------------------
        1           1 BLACK 23-JAN-2020 11:16:45
      678         678 BLACK 24-JAN-2020 09:22:52

09:23:10 TESTDB(111)> select t.* from rowcr_tab t where id_uniq in (1, 678) and row_sleep(id_uniq, 10) = id_uniq;

  ID_UNIQ ID_NON_UNIQ COLOR TS
  ------- ----------- ----- --------------------
        1           1 BLACK 23-JAN-2020 11:16:45
      678         678 BLACK 24-JAN-2020 09:22:52
  
  Sleeping seconds: 10 for id:   1 at 24-JAN-2020 09:23:10
  Sleeping seconds: 10 for id:   1 at 24-JAN-2020 09:23:20
  Sleeping seconds: 10 for id: 678 at 24-JAN-2020 09:23:30
  Sleeping seconds: 10 for id: 678 at 24-JAN-2020 09:23:40
  
  Elapsed: 00:00:40.09

09:23:50 TESTDB(111)> select t.* from rowcr_tab t where id_uniq in (1, 678);

  ID_UNIQ ID_NON_UNIQ COLOR TS
  ------- ----------- ----- --------------------
        1           1 BLACK 23-JAN-2020 11:16:45
      678         678 WHITE 24-JAN-2020 09:23:17
    
    
---========== Session-2 Run @T2 (T2 < T1 + 20) ==========---

09:22:53 TESTDB(222)> exec rowcr_update(p_id => 678, p_commit => true);
  Old Color: BLACK at 24-JAN-2020 09:22:52
  -------------------- Update at 24-JAN-2020 09:23:17
  New Color: WHITE at 24-JAN-2020 09:23:17
The test output shows that query made Consistency Read when not using cursor. The query returned before image (BLACK for id 678) when the row with id 678 was updated to WHITE after query started.

By the way, we can see that query sleeps 40 seconds although only two rows are selected with each of 10 seconds. The two additional of 10 seconds are due to filter predicate in Rowsource Id 2 (TABLE ACCESS BY INDEX ROWID), which is visible in XPLAN.

--------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |     1 |    25 |     1   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR             |                   |       |       |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID| ROWCR_TAB         |     1 |    25 |     1   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | ROWCR_TAB#ID_UNIQ |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

  Predicate Information (identified by operation id):
  ---------------------------------------------------
     2 - filter("ROW_SLEEP"("ID_UNIQ",10)=1 OR "ROW_SLEEP"("ID_UNIQ",10)=678)
     3 - access("ID_UNIQ"=1 OR "ID_UNIQ"=678)
         filter("ID_UNIQ"="ROW_SLEEP"("ID_UNIQ",10))
However, if we select one single row, the sleeping time is 10 seconds (not 20 seconds) because there is no more filter predicate in Rowsource Id 1 (TABLE ACCESS BY INDEX ROWID). Therefore, the additional filter predicate is only generated for selection of multiple rows.
         
09:43:02 TESTDB(111)> select t.* from rowcr_tab t where id_uniq in (678) and row_sleep(id_uniq, 10) = id_uniq;

  ID_UNIQ ID_NON_UNIQ COLOR TS
  ------- ----------- ----- --------------------
      678         678 WHITE 30-JAN-2020 09:23:17
  
  1 row selected.
  Sleeping seconds: 10 for id: 678 at 30-JAN-2020 09:43:03
  Elapsed: 00:00:10.05     
  
-------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                   |     1 |    25 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ROWCR_TAB         |     1 |    25 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | ROWCR_TAB#ID_UNIQ |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------  

  Predicate Information (identified by operation id):
  ---------------------------------------------------
     2 - access("ID_UNIQ"=678)
         filter("ROW_SLEEP"("ID_UNIQ",10)=678)  


7. Write Consistency and DML Restart


Following above discussion of Read Consistency, we can also have a look of Write Consistency achieved with DML Restart.

Book: Expert Oracle Database Architecture (3rd ed. Edition) Page 267-274 described Write consistency and showed update restart with before row trigger. Tanel's video: Oracle SQL Monitoring and Write Consistency Demo used V$SQL_PLAN_MONITOR and SQL trace to demonstrate update restart. Blog: Write consistency and DML restart revealed internal "updThreePhaseExe" mechanism: the implementation of multiple time update restarts and explored update restart with DML Trace.

In this section, we will show 3 update starts caused by DML restart with previous defined delay function row_sleep. In both Session-1 and Session-2, we run the same update (and commit) one immediately after another:

---========== Session-1 Run @T1 ==========---
alter session set tracefile_identifier = 'trc_1';
alter system set events 'trace[DML] {callstack: fname dmlTrace} disk=high trace("DML restarted sqlid : %\n", sqlid())';
alter session set events '10046 trace name context forever, level 12';

update /*+ GATHER_PLAN_STATISTICS MONITOR Upd_1 */ rowcr_tab 
   set id_uniq = -id_uniq, ts=localtimestamp
 where id_uniq in (-678, 678)   --id_uniq+0 avoids multi evaluation of filter
   and row_sleep(id_uniq, 10) = id_uniq+0;    
 
commit;

alter session set events '10046 trace name context off';

---========== Session-2 Run @T2 ==========---
alter session set tracefile_identifier = 'trc_2';
alter system set events 'trace[DML] {callstack: fname dmlTrace} disk=high trace("DML restarted sqlid : %\n", sqlid())';
   -- DML restarted sqlid : fsg7u662gr5kj
alter session set events '10046 trace name context forever, level 12';
 				
update /*+ GATHER_PLAN_STATISTICS MONITOR Upd_2 */ rowcr_tab 
   set id_uniq = -id_uniq, ts=localtimestamp
 where id_uniq in (-678, 678)   --id_uniq+0 avoids multi evaluation of filter
   and row_sleep(id_uniq, 10) = id_uniq+0;  

commit;

alter session set events '10046 trace name context off';  


Test Output



---========== Session-1 Run @T1, id_uniq changed from 678 to -678 ==========---
TESTDB(111)> update /*+ GATHER_PLAN_STATISTICS MONITOR Upd_1 */ rowcr_tab 
                set id_uniq = -id_uniq, ts=localtimestamp
              where id_uniq in (-678, 678)   --id_uniq+0 avoids multi evaluation of filter
                and row_sleep(id_uniq, 10) = id_uniq+0;  
                           
     Sleeping seconds: 10 for id: 678 at 25-JAN-2020 15:04:53

---========== Session-2 Run @T2, id_uniq first Consistency Read is 678, changed from -678 to 678 ==========---
TESTDB(222)> update /*+ GATHER_PLAN_STATISTICS MONITOR Upd_2 */ rowcr_tab 
                set id_uniq = -id_uniq, ts=localtimestamp
              where id_uniq in (-678, 678)   --id_uniq+0 avoids multi evaluation of filter
                and row_sleep(id_uniq, 10) = id_uniq+0;  
                
     Sleeping seconds: 10 for id: 678 at 25-JAN-2020 15:04:59
     Sleeping seconds: 10 for id: -678 at 25-JAN-2020 15:05:09
     Sleeping seconds: 10 for id: -678 at 25-JAN-2020 15:05:19
We can see that Session-1 shows one single line: "Sleeping seconds: 10 for id: 678".

But Session-2 shows three lines, one is "Sleeping seconds: 3 for id: 678", which is the first Consistency Read; the other two are "Sleeping seconds: 3 for id: -678", which are the updated result of Session-1 and re-read because of update restart. Each of them are with an inteval of 10 seconds. The three output lines by function row_sleep exactly reflect the first initial "update (rollbacked)", the second promoted "select for update", and the third "final update".

With this new approach, we can track all three update starts, to be precise, three update select phases. Whereas with "before row trigger", it is only fired two times, hence two pairs of output.

In the following SQL Monitoring Report and SQL Trace, Session-2 shows Execs=3 or starts=3 for row source "UPDATE". sqlmon_restarts.sql lists SQL_ID=fsg7u662gr5kj with starts=3.


SQL Monitoring Report



--------------- Session-1 ---------------
select SYS.DBMS_SQLTUNE.REPORT_SQL_MONITOR('grvrt3ux6xhpd', report_level=>'ALL', type=>'TEXT') from dual;
  update /*+ GATHER_PLAN_STATISTICS MONITOR Upd_1 */ rowcr_tab set id_uniq = -id_uniq, ts=localtimestamp where id_uniq in (-678, 678) --id_uniq+0 avoids multi evaluation of filter and row_sleep(id_uniq, 10) = id_uniq+0

========================================================================================================================================
| Id |       Operation       |       Name        |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Activity | Activity Detail |
|    |                       |                   | (Estim) |      | Active(s) | Active |       | (Actual) |   (%)    |   (# samples)   |
========================================================================================================================================
|  0 | UPDATE STATEMENT      |                   |         |      |         1 |    +10 |     1 |        0 |          |                 |
|  1 |   UPDATE              | ROWCR_TAB         |         |      |         1 |    +10 |     1 |        0 |          |                 |
|  2 |    INLIST ITERATOR    |                   |         |      |         1 |    +10 |     1 |        1 |          |                 |
|  3 |     INDEX UNIQUE SCAN | ROWCR_TAB#ID_UNIQ |       1 |    1 |         1 |    +10 |     2 |        1 |          |                 |
========================================================================================================================================

--------------- Session-2 ---------------
select SYS.DBMS_SQLTUNE.REPORT_SQL_MONITOR('bd0x94n13aqsc', report_level=>'ALL', type=>'TEXT') from dual;
  update /*+ GATHER_PLAN_STATISTICS MONITOR Upd_2 */ rowcr_tab set id_uniq = -id_uniq, ts=localtimestamp where id_uniq in (-678, 678) --id_uniq+0 avoids multi evaluation of filter and row_sleep(id_uniq, 10) = id_uniq+0

========================================================================================================================================
| Id |       Operation       |       Name        |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Activity | Activity Detail |
|    |                       |                   | (Estim) |      | Active(s) | Active |       | (Actual) |   (%)    |   (# samples)   |
========================================================================================================================================
|  0 | UPDATE STATEMENT      |                   |         |      |        11 |    +20 |     3 |        1 |          |                 |
|  1 |   UPDATE              | ROWCR_TAB         |         |      |        11 |    +20 |     3 |        1 |          |                 |
|  2 |    INLIST ITERATOR    |                   |         |      |        11 |    +20 |     3 |        3 |          |                 |
|  3 |     INDEX UNIQUE SCAN | ROWCR_TAB#ID_UNIQ |       1 |    1 |        21 |    +10 |     6 |        3 |          |                 |
========================================================================================================================================

   --  Predicate Information (identified by operation id):
   --  ---------------------------------------------------
   --     3 - access(("ID_UNIQ"=(-678) OR "ID_UNIQ"=678))
   --         filter("ROW_SLEEP"("ID_UNIQ",10)="ID_UNIQ"+0)       


SQL Trace



****************** Session-1 ******************
update /*+ GATHER_PLAN_STATISTICS MONITOR Upd_1 */ rowcr_tab
   set id_uniq = -id_uniq, ts=localtimestamp
 where id_uniq in (-678, 678)   --id_uniq+0 avoids multi evaluation of filter
   and row_sleep(id_uniq, 10) = id_uniq+0

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.02      10.02          0          4          6           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.02      10.02          0          4          6           1

Rows (1st) Row Source Operation
---------- ---------------------------------------------------
         0 UPDATE  ROWCR_TAB (cr=4 pr=0 pw=0 time=10000668 us starts=1)
         1  INLIST ITERATOR  (cr=4 pr=0 pw=0 time=10000416 us starts=1)
         1   INDEX UNIQUE SCAN ROWCR_TAB#ID_UNIQ (cr=4 pr=0 pw=0 time=10000409 us starts=2 cost=1 size=15 card=1)(object id 4460833)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  PL/SQL lock timer                               1        9.99          9.99

****************** Session-2 ******************
update /*+ GATHER_PLAN_STATISTICS MONITOR Upd_2 */ rowcr_tab
   set id_uniq = -id_uniq, ts=localtimestamp
 where id_uniq in (-678, 678)   --id_uniq+0 avoids multi evaluation of filter
   and row_sleep(id_uniq, 10) = id_uniq+0

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.05      30.05          0          9          8           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.05      30.05          0          9          8           1

Rows (1st) Row Source Operation
---------- ---------------------------------------------------
         0 UPDATE  ROWCR_TAB (cr=9 pr=0 pw=0 time=30036737 us starts=3)
         3  INLIST ITERATOR  (cr=9 pr=0 pw=0 time=30002585 us starts=3)
         3   INDEX UNIQUE SCAN ROWCR_TAB#ID_UNIQ (cr=9 pr=0 pw=0 time=30002573 us starts=6 cost=1 size=15 card=1)(object id 4460833)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  PL/SQL lock timer                               3       10.00         30.00


tpt-oracle/sqlmon_restarts.sql



--(https://github.com/tanelpoder/tpt-oracle/blob/master/sqlmon_restarts.sql) 
-- Copyright 2018 Tanel Poder. All rights reserved. More info at http://tanelpoder.com
-- Licensed under the Apache License, Version 2.0. See LICENSE.txt for terms & conditions.
-- Purpose:     List UPDATE/DELETE statements that have experienced restarts due to write consistency from V$SQL_PLAN_MONITOR

SELECT
    inst_id
  , sql_id
  , starts
  , sql_exec_start
  , sql_exec_id
  , plan_operation
  , plan_object_owner||'.'||plan_object_name object_name 
FROM 
    gv$sql_plan_monitor 
WHERE 
    plan_line_id = 1 
AND starts > 1
ORDER BY
    sql_id
  , sql_exec_start;

INST_ID  SQL_ID         STARTS SQL_EXEC_START        SQL_EXEC_ID  PLAN_OPERATION  OBJECT_NAME
-------  -------------  ------ --------------------  -----------  --------------  -----------
      1  bd0x94n13aqsc       3 25-JAN-2020 15:04:59     16777217  UPDATE          K.ROWCR_TAB


update deadlock


But continue to ask: "what happens if ...": if we modify two rows with one update statement in Session-1 (at first commit open transaction), but in Session-2, we immediately modify the same two rows with two update statements, at first id_uniq=101, then id_uniq=1, both sessions are hanging for a few seconds, then Session-1 (first updating session) throws a deadlock (all update statements are using INDEX UNIQUE SCAN ROWCR_TAB#ID_UNIQ).

---========== Session-1 Run @T3 ==========---
update rowcr_tab set ts=localtimestamp, seq=seq+row_sleep(1, 30) where id_uniq in (1, 101);

---========== Session-2 Run @T4 ==========---
update rowcr_tab set ts=localtimestamp, seq=seq+row_sleep(1, 30) where id_uniq in (101);
update rowcr_tab set ts=localtimestamp, seq=seq+row_sleep(1, 30) where id_uniq in (1);
Here the output:

---========== Session-1 Run @T3, Output==========---
19:03:03 TESTDB(111)> update rowcr_tab set ts=localtimestamp, seq=seq+row_sleep(1, 30) where id_uniq in (1, 101);
  Sleeping seconds: 30 for id: 1 at 25-JAN-2020 19:04:05
  update rowcr_tab set ts=localtimestamp, seq=seq+row_sleep(1, 30) where id_uniq in (1, 101)
                                            *
  ERROR at line 1:
  ORA-00060: deadlock detected while waiting for resource
  Elapsed: 00:00:36.17


---========== Session-2 Run @T4, Output ==========---
19:03:08 TESTDB(222)> update rowcr_tab set ts=localtimestamp, seq=seq+row_sleep(1, 30) where id_uniq in (101);
  Sleeping seconds: 30 for id: 1 at 25-JAN-2020 19:04:10
  1 row updated.
  Elapsed: 00:00:30.01
  
19:04:41 TESTDB(222)> update rowcr_tab set ts=localtimestamp, seq=seq+row_sleep(1, 30) where id_uniq in (1);
  Sleeping seconds: 30 for id: 1 at 25-JAN-2020 19:04:41
  1 row updated.
  Elapsed: 00:00:30.39