Monday, September 12, 2022

Update Restart and Undo Extends

In Blog: That old restart problem again... there is a comment about update restart and undo extends.

Thereafter two Blogs are composed:
     Update Restart and new Active Undo Extent
     Delete Restart and new Active Undo Extent
which demonstrated that the number of Update Restart (or Delete Restart) is equal to the number of times rollback segment size is extended (V$ROLLSTAT.extends).

In this Blog, we will add more flavors to the code of Blog: Update Restart and new Active Undo Extent to show one-to-one mapping of UpdateRestart-to-UndoExtend.

Furthermore, we track down each Update restart and Undo extend to one respective Oracle internal subrountine call of ktuAddExtent with Dtrace and GDB.

The probable reason of restart is when there is more Undo space requested by an update, it first allocates a new Undo extent, then restarts the update (it is also possible that is an Undo pre-allocation).

Note 1. Tested on Oracle 19.13, 19.7
     2. DB settings:
           temp_undo_enabled                    boolean     TRUE
           undo_management                      string      AUTO
           undo_retention                       integer     3600
           
           _rollback_segment_count              integer     1
           transactions_per_rollback_segment    integer     1

1. Test Code


See Appendix 1. Test Code


2. Test Run and Output


For each test, make a new Sqlplus connection. If not reproduced occasionally (due to "AUTO" undo_management), repeat the test.
We also set the diagnostic events from Blog: That Old Restart Problem Strikes Back: Setting the Stage .

conn k/s@test_db

truncate table delete_data;

insert into delete_data 
  select owner, object_name, rownum id from dba_objects where rownum  < 53001;

commit;

select blocks, bytes/power(2,20) mbytes from user_segments where segment_name = 'DELETE_DATA';
  -- BLOCKS = 384    MBYTES = 3

alter session set max_dump_file_size = UNLIMITED;
alter session set tracefile_identifier = "dml_trc_4a2";
--set diagnostic events in my session:
alter session set events 'trace[dml]:sql_trace wait=true';

exec testp;
Here the Sqlplus output:

--======= Begin Rollstat: USN=107, EXTENTS=24, EXTENDS=115 =======--
Restart#: 1 @ 03:49:44
            doing "AASqdcAAAAABKokABg(id = 5226)" again, trigger called count = 5227
            Undo: l_max_extent_id=21, extent count = 5, extent_id = 21, block_id = 16000, blocks = 128
Restart#: 2 @ 03:49:48
            doing "AASqdcAAAAABKpmABj(id = 16676)" again, trigger called count = 16678
            Undo: l_max_extent_id=22, extent count = 6, extent_id = 22, block_id = 16128, blocks = 128
Restart#: 3 @ 03:49:53
            doing "AASqdcAAAAABLLVABV(id = 28192)" again, trigger called count = 28195
            Undo: l_max_extent_id=23, extent count = 7, extent_id = 23, block_id = 16256, blocks = 128
Restart#: 4 @ 03:49:57
            doing "AASqdcAAAAABLKbACl(id = 39666)" again, trigger called count = 39670
            Undo: l_max_extent_id=24, extent count = 8, extent_id = 24, block_id = 16512, blocks = 128
Restart#: 5 @ 03:50:02
            doing "AASqdcAAAAABLNkAAm(id = 51186)" again, trigger called count = 51191
            Undo: l_max_extent_id=25, extent count = 9, extent_id = 25, block_id = 16640, blocks = 128
--======= End Rollstat: USN=107, EXTENTS=29, EXTENDS=120 =======--
    row     count = 53000
    trigger count = 53005
    restart count = 5
    extents delta = 5
    extends delta = 5
    
Elapsed: 00:00:21.98
The above output shows that 53000 single row updates, trigger fired 53005 times, hence 5 update restarts.
The mapping of DELETE_DATA.id to Undo extent_id / block_id in 5 row restarts is:

  id = 5226      extent_id = 21, block_id = 16000
  id = 16676     extent_id = 22, block_id = 16128
  id = 28192     extent_id = 23, block_id = 16256
  id = 39666     extent_id = 24, block_id = 16512
  id = 51186     extent_id = 25, block_id = 16640   
We can also see the EXTENT_ID / BLOCK_ID of 5 restarts in dba_undo_extents (EXTENT_ID 21 to 25):

select * from dba_undo_extents 
 where segment_name like'_SYSSMU107_%' and status = 'ACTIVE'
 order by extent_id, file_id, block_id;

  OWN SEGMENT_NAME            TABLESPACE_NAME EXTENT_ID FILE_ID  BLOCK_ID    BYTES  BLOCKS RELATIVE_FNO  STATUS
  --- ----------------------- --------------- --------- ------- --------- -------- ------- ------------  ------
  SYS _SYSSMU107_2079059358$  UNDO                   17       3     14208  1048576     128            3  ACTIVE
  SYS _SYSSMU107_2079059358$  UNDO                   18       3     15616  1048576     128            3  ACTIVE
  SYS _SYSSMU107_2079059358$  UNDO                   19       3     15744  1048576     128            3  ACTIVE
  SYS _SYSSMU107_2079059358$  UNDO                   20       3     15872  1048576     128            3  ACTIVE
  
  SYS _SYSSMU107_2079059358$  UNDO                   21       3     16000  1048576     128            3  ACTIVE
  SYS _SYSSMU107_2079059358$  UNDO                   22       3     16128  1048576     128            3  ACTIVE
  SYS _SYSSMU107_2079059358$  UNDO                   23       3     16256  1048576     128            3  ACTIVE
  SYS _SYSSMU107_2079059358$  UNDO                   24       3     16512  1048576     128            3  ACTIVE
  SYS _SYSSMU107_2079059358$  UNDO                   25       3     16640  1048576     128            3  ACTIVE
  
  SYS _SYSSMU107_2079059358$  UNDO                   26       3     16768  1048576     128            3  ACTIVE
  
  10 rows selected.
Further more, we can get Undo Segment info and make block dump:

select * from dba_rollback_segs where segment_id = 107;

  SEGMENT_NAME           OWNER  TABLESPACE_NAME SEGMENT_ID FILE_ID BLOCK_ID INITIAL_EXTENT NEXT_EXTENT MIN_EXTENTS MAX_EXTENTS PCT_INCREASE STATUS  RELATIVE_FNO
  ---------------------- ------ --------------- ---------- ------- -------- -------------- ----------- ----------- ----------- ------------ ------  ------------
  _SYSSMU107_2079059358$ PUBLIC UNDO                   107       3      992         131072       65536           2       32765              ONLINE             3


alter session set max_dump_file_size = UNLIMITED;
alter session set tracefile_identifier = "undo_seg_header_blk_992_4a1";  
alter system dump datafile  3 block 992;	
In the dump, we can also see the extent_id / block_id of 5 restarts in Extent Map
(added comments after "==#" are heximal to decimal conversion. Extent Number:21 to 25)

BH (0x10bf59580) file#: 3 rdba: 0x00c003e0 (3/992) class: 229 ba: 0x10b104000
  TRN TBL::
  index  state cflags  wrap#    uel         scn            dba            parent-xid    nub     stmt_num    cmt
  ------------------------------------------------------------------------------------------------
   0x00    9    0x00  0x5a5d  0x0013  0x00000b7302e37794  0x00c037f8   0x0000.000.00000000  0x00000418   0x00000000  1662040111
   0x01    9    0x00  0x5a59  0x0008  0x00000b7302e376ad  0x00000000   0x0000.000.00000000  0x00000000   0x00000000  1662039990
   0x02    9    0x00  0x5a5a  0x000c  0x00000b7302e377d3  0x00c037f9   0x0000.000.00000000  0x00000001   0x00000000  1662040112
   0x03    9    0x00  0x5a5e  0x0015  0x00000b7302e37675  0x00000000   0x0000.000.00000000  0x00000000   0x00000000  1662039990
   0x04   10    0x80  0x5a5e  0x0011  0x00000b7302e378c5  0x00c04191   0x0000.000.00000000  0x00000419   0x00000000  0
   0x05    9    0x00  0x5a5c  0x001f  0x00000b7302e376ff  0x00000000   0x0000.000.00000000  0x00000000   0x00000000  1662039990
   ....
   0x21    9    0x00  0x5a5d  0x001b  0x00000b7302e3772d  0x00000000   0x0000.000.00000000  0x00000000   0x00000000  1662039990
  EXT TRN CTL::

 Retention Table                               Extent Map                                                       
  ---------------------------------------      ------------------------
 Extent Number:0  Commit Time: 1662039334       0x00c003e1  length: 7                                           
 Extent Number:1  Commit Time: 1662039334       0x00c003d8  length: 8                                           
 Extent Number:2  Commit Time: 1662039334       0x00c01380  length: 128                                         
 ......                                      
 Extent Number:20  Commit Time: 0               0x00c03e00  length: 128  
                                        
 Extent Number:21  Commit Time: 0               0x00c03e80  length: 128      ==# 0x00c03e80 = 3/16000                                  
 Extent Number:22  Commit Time: 0               0x00c03f00  length: 128      ==# 0x00c03f00 = 3/16128              
 Extent Number:23  Commit Time: 0               0x00c03f80  length: 128      ==# 0x00c03f80 = 3/16256              
 Extent Number:24  Commit Time: 0               0x00c04080  length: 128      ==# 0x00c04080 = 3/16512              
 Extent Number:25  Commit Time: 0               0x00c04100  length: 128      ==# 0x00c04100 = 3/16640
                                     
 Extent Number:26  Commit Time: 0               0x00c04180  length: 128                                         
 Extent Number:27  Commit Time: 1662039334      0x00c04000  length: 128                                         
 Extent Number:28  Commit Time: 1662039334      0x00c0a800  length: 128   
Now look the trace file of diagnostic events.

Two related FETCH and UPDATE cursors are:

PARSING IN CURSOR #140024761145664 len=48 dep=1 uid=49 oct=3 lid=49 tim=8901333911119 hv=2815500193 ad='9b939860' sqlid='f4qphfamx27x1'
SELECT * FROM DELETE_DATA ORDER BY ID FOR UPDATE

PARSING IN CURSOR #140024754434376 len=62 dep=1 uid=49 oct=6 lid=49 tim=8901334052791 hv=4254591981 ad='9a87f850' sqlid='7k30h4gytguzd'
UPDATE DELETE_DATA SET OWNER = LOWER(OWNER) WHERE ROWID = :B1 
Pick 3 consecutive rows from trace file: DELETE_DATA.id: 5225 (c2 35 1a), 5226 (c2 35 1b), and 5227 (c2 35 1c).

For a normal row update, for example, DELETE_DATA.id: 5225 ('c2 35 1a') with rowid 004aa75c.0004aa24.5f = 4892508.305700.95, it looks like:

FETCH #140024761145664:c=2,e=2,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,plh=3817064686,tim=8901335815921
updSetExecCmpColInfo: not RHS: objn=4892505, cid=3
  kduukcmpf=0x7f5a0e3ac1a2, kduukcmpl=0x7f5a0e3ac1a0, kduukcmpp=(nil)
updThreePhaseExe: objn=4892505 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378df  xid: 0x006b.004.00005a5e  uba: 0x00c03efc.1732.57
updrow: objn=4892505 phase=NOT LOCKED
updicf: SAVE phase=NOT LOCKED, cid=3, skflag=x1
	kccflg=x20, kccfl2=x100
updicf: SAVE kduukcmpl 3
	kduukcmpp: c2 35 1a             
updrow: kauupd objn:4892505 table:0 rowMigrated:FALSE  rowid 004aa75c.0004aa24.5f code 0
EXEC #140024754434376:c=299,e=299,p=0,cr=1,cu=2,mis=0,r=1,dep=1,og=1,plh=1680159706,tim=8901335816231
CLOSE #140024754434376:c=0,e=0,dep=1,type=3,tim=8901335816253
For 5 update restart rows, there are 5 respective lines containing "error=1551", all of which are similar to following example of DELETE_DATA.id: 5226 ('c2 35 1b') with rowid: "AASqdcAAAAABKokABg" = 004aa75c.0004aa24.60 = 4892508.305700.96:

FETCH #140024761145664:c=3,e=2,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,plh=3817064686,tim=8901335816264
updSetExecCmpColInfo: not RHS: objn=4892505, cid=3
  kduukcmpf=0x7f5a0e3b12ba, kduukcmpl=0x7f5a0e3b12b8, kduukcmpp=(nil)
updThreePhaseExe: objn=4892505 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378df  xid: 0x006b.004.00005a5e  uba: 0x00c03efc.1732.58
updrow: objn=4892505 phase=NOT LOCKED
updicf: SAVE phase=NOT LOCKED, cid=3, skflag=x1
	kccflg=x20, kccfl2=x100
updicf: SAVE kduukcmpl 3
	kduukcmpp: c2 35 1b

-------------- START of extra lines for restart --------------
updrow: objn=4892505 error=1551
updrow: qecinvsub objn=4892505
updrow: setting ROW_RETRY objn=4892505
updrow: retry_this_row: ROW_RETRY set, objn= 4892505 phase=NOT LOCKED
updicf: RETRY phase=NOT LOCKED, cid=3, skflag=x1
	kccflg=x20, kccfl2=x100
updicf: RETRY kduukcmpl 3
	kduukcmpp: c2 35 1b

*** 15:49:44.051090
WAIT #140024754434376: nam='PL/SQL lock timer' ela= 999222 duration=0 p2=0 p3=0 obj#=-1 tim=8901336816048
doing "AASqdcAAAAABKokABg(id = 5226)" again, trigger called count = 5227
-------------- END of extra lines for restart --------------
......
updrow: kauupd objn:4892505 table:0 rowMigrated:FALSE  rowid 004aa75c.0004aa24.60 code 0
EXEC #140024754434376:c=3929,e=1002883,p=0,cr=69,cu=17,mis=0,r=1,dep=1,og=1,plh=1680159706,tim=8901336819159
CLOSE #140024754434376:c=0,e=0,dep=1,type=3,tim=8901336819249
Here trace for DELETE_DATA.id: 5227 ('c2 35 1c') with rowid 004aa75c.0004aa24.61 = 4892508.305700.97:

FETCH #140024761145664:c=6,e=6,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,plh=3817064686,tim=8901336819272
updSetExecCmpColInfo: not RHS: objn=4892505, cid=3
  kduukcmpf=0x7f5a0e3ac1a2, kduukcmpl=0x7f5a0e3ac1a0, kduukcmpp=(nil)
updThreePhaseExe: objn=4892505 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378df  xid: 0x006b.004.00005a5e  uba: 0x00c03efd.1732.01
updrow: objn=4892505 phase=NOT LOCKED
updicf: SAVE phase=NOT LOCKED, cid=3, skflag=x1
	kccflg=x20, kccfl2=x100
updicf: SAVE kduukcmpl 3
	kduukcmpp: c2 35 1c
updrow: kauupd objn:4892505 table:0 rowMigrated:FALSE  rowid 004aa75c.0004aa24.61 code 0
EXEC #140024754434376:c=352,e=351,p=0,cr=1,cu=2,mis=0,r=1,dep=1,og=1,plh=1680159706,tim=8901336819641
CLOSE #140024754434376:c=1,e=0,dep=1,type=3,tim=8901336819661
Comparing normal update and update restart, we mark extra lines between START and END. They are all about "RETRY" with cid=3 (dc_rollback_segments) in "phase=NOT LOCKED". One notable line is "updicf: RETRY kduukcmpl 3".

In both case, "FETCH #140024761145664" is executed only once to fetch one row (r=1).

Here the lines of 5 update restarts with uba, kduukcmpp, and dbms_system.ksdwrt messages:

updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378df  xid: 0x006b.004.00005a5e  uba: 0x00c03efc.1732.58
	kduukcmpp: c2 35 1b
*** 15:49:44.051090
doing "AASqdcAAAAABKokABg(id = 5226)" again, trigger called count = 5227

updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378e0  xid: 0x006b.004.00005a5e  uba: 0x00c03f7c.1733.5a
	kduukcmpp: c3 02 43 4d
*** 15:49:48.676064
doing "AASqdcAAAAABKpmABj(id = 16676)" again, trigger called count = 16678

updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378e5  xid: 0x006b.004.00005a5e  uba: 0x00c03ffc.1734.5a
	kduukcmpp: c3 03 52 5d
*** 15:49:53.312088
doing "AASqdcAAAAABLLVABV(id = 28192)" again, trigger called count = 28195

updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378e8  xid: 0x006b.004.00005a5e  uba: 0x00c040fc.1735.5a
	kduukcmpp: c3 04 61 43
*** 15:49:57.950063
doing "AASqdcAAAAABLKbACl(id = 39666)" again, trigger called count = 39670

updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378eb  xid: 0x006b.004.00005a5e  uba: 0x00c0417c.1736.5a
	kduukcmpp: c3 06 0c 57
*** 15:50:02.548055
doing "AASqdcAAAAABLNkAAm(id = 51186)" again, trigger called count = 51191
We can see that all DELETE_DATA.id of 5 restarts are marked with kduukcmpp:

	kduukcmpp: c2 35 1b     = 5226	
	kduukcmpp: c3 02 43 4d  = 16676	
	kduukcmpp: c3 03 52 5d  = 28192	
	kduukcmpp: c3 04 61 43  = 39666	
	kduukcmpp: c3 06 0c 57  = 51186
Pick uba of 5 corresponding restarts:

  scn: 0x00000b7302e378df  xid: 0x006b.004.00005a5e  uba: 0x00c03efc.1732.58
  scn: 0x00000b7302e378e0  xid: 0x006b.004.00005a5e  uba: 0x00c03f7c.1733.5a
  scn: 0x00000b7302e378e5  xid: 0x006b.004.00005a5e  uba: 0x00c03ffc.1734.5a
  scn: 0x00000b7302e378e8  xid: 0x006b.004.00005a5e  uba: 0x00c040fc.1735.5a
  scn: 0x00000b7302e378eb  xid: 0x006b.004.00005a5e  uba: 0x00c0417c.1736.5a
and map them to file_id / block_id:
                                                                                                                          
  uba: 0x00c03efc = 3/16124
  uba: 0x00c03f7c = 3/16252
  uba: 0x00c03ffc = 3/16380
  uba: 0x00c040fc = 3/16636
  uba: 0x00c0417c = 3/16764          
then compare with respective file_id / block_id in dba_undo_extents (or Extent Map):

  0x00c03e80 = 3/16000 
  0x00c03f00 = 3/16128 
  0x00c03f80 = 3/16256 
  0x00c04080 = 3/16512 
  0x00c04100 = 3/16640 
all are with a difference of 124:

  16124 - 16000 = 124         
  16252 - 16128 = 124 
  16380 - 16256 = 124 
  16636 - 16512 = 124 
  16764 - 16640 = 124 
Above dba_undo_extents output shows that each extent contains 128 blocks. Probably when there are only 4 free blocks remaining (124 blocks are already used), Oracle makes an Undo pre-allocation (pre-extention).

Look above 3 consecutive row updates and their respective uba: row DELETE_DATA.id: 5225 without restart, 5226 ('c2 35 1b') with restart, 5227 ('c2 35 1c') without restart:

  DELETE_DATA.id: 5225 ('c2 35 1a')  uba: 0x00c03efc.1732.57
  DELETE_DATA.id: 5226 ('c2 35 1b')  uba: 0x00c03efc.1732.58
  DELETE_DATA.id: 5227 ('c2 35 1c')  uba: 0x00c03efd.1732.01
Both row 5225 and 5226 are in the same Undo block (uba: 0x00c03efc= 3/16124) of EXTENT_ID = 21, and row 5226 is stored as last slot (0x58).
But row 5227 is jumped to next Undo block (uba: 0x00c03efd= 3/16125) of same EXTENT_ID = 21 and stored in first slot(0x01).

In average, each Undo block is used for 53000(total rows)/5(Undo extents)/128(blocks per extent) = 83 rows, which is close to 0x58 = 88.

Above dba_undo_extents query output shows:

  EXTENT_ID = 21 has 128 BLOCKS, which are BLOCK_ID 16000 to 16127
  EXTENT_ID = 22 has 128 BLOCKS, which are BLOCK_ID 16128 to 16255
so both uba (0x00c03efc= 3/16124 and 0x00c03efd= 3/16125) are still in the same EXTENT_ID = 21. Hence we could probably say that Oracle makes an Undo pre-allocation (pre-extention) when there are only 4 free blocks remaining from 128 allocated blocks.

That seems exactly what ORA-01551 Docu wrote:

  $> oerr ora 1551
       01551, 00000, "extended rollback segment, pinned blocks released"
       // *Cause: Doing recursive extent of rollback segment, trapped internally
       //        by the system
       // *Action: None
Restart is caused by creating Undo recursive extent with internally trapped error "1551" (possibly restart is performed in "1551" exeception handler).

By the way, if there is no trigger (drop trigger delete_data_bt), 3 consecutive updates are as follows.

Update 1 has no "error=1551";

Update 2 has "error=1551", and three extra lines starting with "updrowFastPath", last extra line marked with "ROW_RETRY".
         Both Update 1 and Update 2 are in same Undo block "0x00c01efc", Update 2 is at last slot "5a" (0x5a=90) of "0x00c01efc".

Update 3 has no "error=1551"; It is in next Undo block "0x00c01efd". It is at first slot "01" of "0x00c01efd".

-- Update 1
FETCH #140562258761720:c=1,e=1,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,plh=3817064686,tim=9006876120842
updThreePhaseExe: objn=4892673 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e9971b  xid: 0x0069.000.00005af3  uba: 0x00c01efc.1757.59 
EXEC #140562256908616:c=186,e=186,p=0,cr=0,cu=1,mis=0,r=1,dep=1,og=1,plh=1680159706,tim=9006876121038
CLOSE #140562256908616:c=0,e=0,dep=1,type=3,tim=9006876121052

-- Update 2
FETCH #140562258761720:c=1,e=1,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,plh=3817064686,tim=9006876121062
updThreePhaseExe: objn=4892673 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e9971b  xid: 0x0069.000.00005af3  uba: 0x00c01efc.1757.5a 
	updrowFastPath: objn=4892673 error=1551
	updrowFastPath: qecinvsub objn=4892673
	updrowFastPath: setting ROW_RETRY objn=4892673
EXEC #140562256908616:c=118,e=379,p=0,cr=1,cu=13,mis=0,r=1,dep=1,og=1,plh=1680159706,tim=9006876121451
CLOSE #140562256908616:c=0,e=0,dep=1,type=3,tim=9006876121507

-- Update 3
FETCH #140562258761720:c=2,e=2,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,plh=3817064686,tim=9006876121521
updThreePhaseExe: objn=4892673 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e9971b  xid: 0x0069.000.00005af3  uba: 0x00c01efd.1757.01
EXEC #140562256908616:c=192,e=192,p=0,cr=0,cu=1,mis=0,r=1,dep=1,og=1,plh=1680159706,tim=9006876121726
CLOSE #140562256908616:c=1,e=1,dep=1,type=3,tim=9006876121739


3. Dtrace and GDB


We can further use Dtrace and GDB to reveal that each Update restart / Undo extend is one call of Oracle internal subroutine: ktuAddExtent.


3.1 Dtrace


First, build a clean test data in a new Sqlplus session:

conn k/s@test_db_solaris

truncate table delete_data;

insert into delete_data 
  select owner, object_name, rownum id from dba_objects where rownum  < 53001;

commit;
Then, find its UNIX process id (pid=1234), start Dtrace;

$ sudo dtrace -n 'pid$target:oracle:ktuAddExtent:entry 
{ 
  printf("\n---- at %Y ", walltimestamp);
  printf("DBA_ROLLBACK_SEGS.segment_id(arg0)=%d, DBA_UNDO_EXTENTS.extent_id(arg2)=%d\n", arg0, arg2);
  ustack();    
}'  -p 1234
Run test, the output shows 5 update restarts and its respective Undo USN/(extent_id, block_id)

19:55:21 SQL > exec testp;
--======= Begin Rollstat: USN=131, EXTENTS=6, EXTENDS=2 =======--
Restart#: 1 @ 07:55:55
            doing "AAQVWNAAAAAMeQGAAJ(id = 5139)" again, trigger called count = 5140
            Undo: l_max_extent_id=6, extent count = 6, extent_id = 6, block_id = 11264, blocks = 128
Restart#: 2 @ 07:55:56
            doing "AAQVWNAAAAAMeRHAAA(id = 16588)" again, trigger called count = 16590
            Undo: l_max_extent_id=7, extent count = 7, extent_id = 7, block_id = 11264, blocks = 128
Restart#: 3 @ 07:55:58
            doing "AAQVWNAAAAAMePDACw(id = 28105)" again, trigger called count = 28108
            Undo: l_max_extent_id=8, extent count = 8, extent_id = 8, block_id = 11264, blocks = 128
Restart#: 4 @ 07:55:59
            doing "AAQVWNAAAAAMeONAAr(id = 39578)" again, trigger called count = 39582
            Undo: l_max_extent_id=9, extent count = 9, extent_id = 9, block_id = 11264, blocks = 128
Restart#: 5 @ 07:56:01
            doing "AAQVWNAABAAAHpGAAB(id = 51098)" again, trigger called count = 51103
            Undo: l_max_extent_id=10, extent count = 10, extent_id = 10, block_id = 11264, blocks = 128
--======= End Rollstat: USN=131, EXTENTS=11, EXTENDS=7 =======--
    row     count = 53000
    trigger count = 53005
    restart count = 5
    extents delta = 5
    extends delta = 5
Elapsed: 00:00:07.45
Here dba_undo_extents rows after test (no ACTIVE rows before test):

SQL > select * from dba_undo_extents 
       where segment_name like'_SYSSMU131_%' and status = 'ACTIVE'
       order by extent_id, file_id, block_id;


OWN SEGMENT_NAME           TABLESPACE_NAME EXTENT_ID FILE_ID BLOCK_ID   BYTES BLOCKS RELATIVE_FNO COMMIT_WTIME STATUS
--- ---------------------- --------------- --------- ------- -------- ------- ------ ------------ ------------ ------
SYS _SYSSMU131_3111742082$ UNDO                    0       3     1264   65536      8            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    1       3     1256   65536      8            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    2       3    59520 1048576    128            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    3       3    59648 1048576    128            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    4       3    60928 1048576    128            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    5       3    61056 1048576    128            3              ACTIVE

SYS _SYSSMU131_3111742082$ UNDO                    6       3    62464 1048576    128            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    7       3    62592 1048576    128            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    8       3    63872 1048576    128            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    9       3    10112 1048576    128            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                   10       3    11264 1048576    128            3              ACTIVE

11 rows selected.
Here the output of Dtrace (edited). Probe ktuAddExtent is fired 7 times, each of which results in one new extent (DBA_UNDO_EXTENTS.extent_id: 4-10).

The timestamp shows that extent_id=4 and 5 are at (almost) same time (19:55:54), but then (almost) each second one new extent_id since we call "dbms_session.sleep(1) in "trigger delete_data_bt".

 ID      FUNCTION:NAME
 89987   ktuAddExtent:entry
-------- at 19:55:54 DBA_ROLLBACK_SEGS.segment_id(arg0)=131, DBA_UNDO_EXTENTS.extent_id(arg2)=4
-------- at 19:55:54 DBA_ROLLBACK_SEGS.segment_id(arg0)=131, DBA_UNDO_EXTENTS.extent_id(arg2)=5

-------- at 19:55:54 DBA_ROLLBACK_SEGS.segment_id(arg0)=131, DBA_UNDO_EXTENTS.extent_id(arg2)=6
-------- at 19:55:55 DBA_ROLLBACK_SEGS.segment_id(arg0)=131, DBA_UNDO_EXTENTS.extent_id(arg2)=7
-------- at 19:55:57 DBA_ROLLBACK_SEGS.segment_id(arg0)=131, DBA_UNDO_EXTENTS.extent_id(arg2)=8
-------- at 19:55:58 DBA_ROLLBACK_SEGS.segment_id(arg0)=131, DBA_UNDO_EXTENTS.extent_id(arg2)=9
-------- at 19:56:00 DBA_ROLLBACK_SEGS.segment_id(arg0)=131, DBA_UNDO_EXTENTS.extent_id(arg2)=10
Now look ustack:

DBA_UNDO_EXTENTS.extent_id(arg2)=4/5    DBA_UNDO_EXTENTS.extent_id(arg2)=6/7/8/9/10
------------------------------------    -------------------------------------------
oracle`ktuAddExtent                     oracle`ktuAddExtent            
oracle`ktusmaex+0x170                   oracle`ktusmaex+0x170          
oracle`ktusmasp1r+0xf7                  oracle`ktusmasp1r+0xf7         
oracle`ktrsexec+0x228                   oracle`ktrsexec+0x228          
oracle`ktuaspResumableInPdb+0xb         oracle`ktuaspResumableInPdb+0xb
oracle`ktusmasp+0x290                   oracle`ktusmasp+0x290          
oracle`ktuchg2+0x633                    oracle`ktuchg2+0x633           
oracle`ktbchg2+0x113                    oracle`ktbchg2+0x113           
oracle`kddchg+0x397                     oracle`kddchg+0x397            
oracle`kddlok+0x922                     oracle`kduovw+0x1296           
a.out`kddlkr+0x12a                      oracle`kduurp+0x446            
a.out`qerfuProcessFetchedRow+0x2ec      oracle`kdusru+0x4f3            
a.out`qerfuLockingRowProcedure+0x45     oracle`kauupd+0x1ec            
a.out`qerstRowP+0x587                   oracle`updrow+0x6a0            
a.out`qersoFetchSimple+0x141            a.out`qerupUpdRow+0x355        
a.out`qersoFetch+0xbe                   oracle`qerupFetch+0x1e9        
a.out`qerstFetch+0x542                  a.out`qerstFetch+0x5c0         
a.out`qerfuStart+0x68c                  oracle`updaul+0x678            
a.out`qerstStart+0x4d2                  oracle`updThreePhaseExe+0x14e  
a.out`selexe0+0x87a                     oracle`updexe+0x1ec            
Left column is ustack for 2 extents (extent_id=4/5);
Right column is ustack for 5 (extent_id=6/7/8/9/10).

Left column is the code path for the execution of "select for update" in

  procedure testp 
    cursor l_delete_csr is select * from delete_data order by id for update;
which contains "selexe0" and "qerfuLockingRowProcedure" (no new extents is generated if select without "for update").

right column is the code path for the execution of "update" in:

  procedure testp
    update delete_data set owner = lower(owner) where current of l_delete_csr;
which shows "updexe" and "updThreePhaseExe".

Note*: On Solaris test with Dtrace, we use default setting of "transactions_per_rollback_segment" and "_rollback_segment_count"


3.2 GDB


Similar to Dtrace, at first, build clean test data in a new Sqlplus session:

conn k/s@test_db_linux

truncate table delete_data;

insert into delete_data 
  select owner, object_name, rownum id from dba_objects where rownum  < 53001;

commit;
Compose a GDB command script (gdb_cmd_undo_extend) as follows:

------- gdb_cmd_undo_extend:   gdb -x gdb_cmd_undo_extend -p 6789 -----
set pagination off
set logging file /tmp/gdb_cmd_undo_extend.log
set logging overwrite on
set logging on
set $k = 0

break *ktuAddExtent
command 
printf "<<<=== (%i). DBA_ROLLBACK_SEGS.segment_id ($rdi)=%i, DBA_UNDO_EXTENTS.extent_id($rax)=%i ===>>>\n", ++$k, $rdi, $rax
bt         
continue
end
Then, find its UNIX process id (pid=6789) of Sqlplus session, start GDB command script on it:

$gdb -x gdb_cmd_undo_extend -p 6789 
Run test:

SQL > exec testp;
The output shows 5 update restarts and its respective Undo USN/extent_id

21:25:26 SQL> exec testp;
--======= Begin Rollstat: USN=105, EXTENTS=8, EXTENDS=160 =======--
Restart#: 1 @ 09:25:50
            doing "AASqnfAAAAABKolAAP(id = 5374)" again, trigger called count = 5375
            Undo: l_max_extent_id=6, extent count = 5, extent_id = 6, block_id = 9472, blocks = 128
Restart#: 2 @ 09:25:51
            doing "AASqnfAAAAABKpnABz(id = 16828)" again, trigger called count = 16830
            Undo: l_max_extent_id=7, extent count = 6, extent_id = 7, block_id = 9728, blocks = 128
Restart#: 3 @ 09:25:52
            doing "AASqnfAAAAABLLZABd(id = 28344)" again, trigger called count = 28347
            Undo: l_max_extent_id=8, extent count = 7, extent_id = 8, block_id = 9856, blocks = 128
Restart#: 4 @ 09:25:54
            doing "AASqnfAAAAABLKfABz(id = 39815)" again, trigger called count = 39819
            Undo: l_max_extent_id=9, extent count = 8, extent_id = 9, block_id = 9984, blocks = 128
Restart#: 5 @ 09:25:55
            doing "AASqnfAAAAABLNkAC7(id = 51335)" again, trigger called count = 51340
            Undo: l_max_extent_id=10, extent count = 9, extent_id = 10, block_id = 10240, blocks = 128
--======= End Rollstat: USN=105, EXTENTS=13, EXTENDS=165 =======--
    row     count = 53000
    trigger count = 53005
    restart count = 5
    extents delta = 5
    extends delta = 5
Elapsed: 00:00:07.05
Here dba_undo_extents rows after test (no ACTIVE rows before test):

SQL > select * from dba_undo_extents 
       where segment_name like'_SYSSMU105_%' and status = 'ACTIVE'
       order by extent_id, file_id, block_id;
 
OWN SEGMENT_NAME          TABLESPACE_NAME EXTENT_ID FILE_ID BLOCK_ID   BYTES BLOCKS RELATIVE_FNO COMMIT_WTIME STATUS
--- --------------------- --------------- --------- ------- -------- ------- ------ ------------ ------------ ------
SYS _SYSSMU105_589038956$ UNDO                    2       3     7808 1048576    128            3              ACTIVE
SYS _SYSSMU105_589038956$ UNDO                    3       3    16640 1048576    128            3              ACTIVE
SYS _SYSSMU105_589038956$ UNDO                    4       3     8320 1048576    128            3              ACTIVE
SYS _SYSSMU105_589038956$ UNDO                    5       3     8448 1048576    128            3              ACTIVE

SYS _SYSSMU105_589038956$ UNDO                    6       3     9472 1048576    128            3              ACTIVE
SYS _SYSSMU105_589038956$ UNDO                    7       3     9728 1048576    128            3              ACTIVE
SYS _SYSSMU105_589038956$ UNDO                    8       3     9856 1048576    128            3              ACTIVE
SYS _SYSSMU105_589038956$ UNDO                    9       3     9984 1048576    128            3              ACTIVE
SYS _SYSSMU105_589038956$ UNDO                   10       3    10240 1048576    128            3              ACTIVE

SYS _SYSSMU105_589038956$ UNDO                   11       3    10752 1048576    128            3              ACTIVE

10 rows selected.
Here the output of GDB (edited). Breakpoint ktuAddExtent is hit 8 times, each of which results in one new extent (DBA_UNDO_EXTENTS.extent_id: 3-10).

Breakpoint 1, 0x0000000001b8ea20 in ktuAddExtent ()

<<<=== (1). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=3 ===>>>
<<<=== (2). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=4 ===>>>
<<<=== (3). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=5 ===>>>

<<<=== (4). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=6 ===>>>
<<<=== (5). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=7 ===>>>
<<<=== (6). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=8 ===>>>
<<<=== (7). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=9 ===>>>
<<<=== (8). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=10 ===>>>
Now look GDB backtrace:

BA_UNDO_EXTENTS.extent_id($rax)=3/4/5    DBA_UNDO_EXTENTS.extent_id(arg2)=6/7/8/9/10
-------------------------------------    -------------------------------------------
#0  ktuAddExtent ()                      #0   ktuAddExtent ()              
#1  ktusmaex ()                          #1   ktusmaex ()                  
#2  ktusmasp1r ()                        #2   ktusmasp1r ()                
#3  ktrsexecExecuteInExcepHdlr ()        #3   ktrsexecExecuteInExcepHdlr ()
#4  ktrsexec ()                          #4   ktrsexec ()                  
#5  ktuaspResumableInPdb ()              #5   ktuaspResumableInPdb ()      
#6  ktusmasp ()                          #6   ktusmasp ()                  
#7  ktuchg2 ()                           #7   ktuchg2 ()                   
#8  ktbchg2 ()                           #8   ktbchg2 ()                   
#9  kddchg ()                            #9   kddchg ()                    
#10 kddlok ()                            #10  kduovw ()                    
#11 kddlkr ()                            #11  kduurp ()                    
#12 qerfuProcessFetchedRow ()            #12  kdusru ()                    
#13 qerfuLockingRowProcedure ()          #13  kauupd ()                    
#14 qersoFetchSimple ()                  #14  updrow ()                    
#15 qersoFetch ()                        #15  qerupUpdRow ()               
#16 qerfuStart ()                        #16  qerupFetch ()                
#17 selexe0 ()                           #17  updaul ()                    
#18 opiexe ()                            #18  updThreePhaseExe ()          
#19 opipls ()                            #19  updexe ()                    
#20 opiodr ()                            #20  opiexe ()                    
#21 rpidrus ()                           #21  opipls ()                    
#22 skgmstack ()                         #22  opiodr ()                    
#23 rpidru ()                            #23  rpidrus ()                   
#24 rpiswu2 ()                           #24  skgmstack ()                 
#25 rpidrv ()                            #25  rpidru ()                    
#26 psddr0 ()                            #26  rpiswu2 ()                   
#27 psdnal ()                            #27  rpidrv ()                    
#28 pevm_EXECC ()                        #28  psddr0 ()                    
#29 pfrinstr_EXECC ()                    #29  psdnal ()                    
#30 pfrrun_no_tool ()                    #30  pevm_EXECC ()                
#31 pfrrun ()                            #31  pfrinstr_EXECC ()            
#32 plsql_run ()                         #32  pfrrun_no_tool ()            
#33 peicnt ()                            #33  pfrrun ()                    
#34 kkxexe ()                            #34  plsql_run ()        
Left column is backtrace for 3 extents (extent_id=3/4/5);
right column is backtrace for 5 (extent_id=6/7/8/9/10).

Left column is the code path for "select for update" in

  procedure testp 
    cursor l_delete_csr is select * from delete_data order by id for update;
which contains "selexe0" and "qerfuLockingRowProcedure" (no new extents is generated if select without "for update").

right column is the code path for "update" in:

  procedure testp
    update delete_data set owner = lower(owner) where current of l_delete_csr;
which shows "updexe" and "updThreePhaseExe".


4. Discussions


Update Restart can be looked in different aspects, for example,
     -. Single vs. multiple rows;
     -. With vs. Without other sessions interfering;
     -. One vs. Three Phases.


4.1 Three Phase Update Restart


Two recent Blogs:
     Oracle Write Consistency and ORA-00600: [13030], [20]
     Oracle Write Consistency and ORA-30926: "unable to get a stable set of rows in the source tables"
showed the update restart of multiple rows in every update execution with other sessions interfering, in which we also listed a few well deep Related Works on this area.

Blog: That Old Restart Problem Strikes Back: Setting the Stage showed the update restart of multiple rows in every update execution without other sessions interfering (see repeated test in Appendix 2).

The test code in Book: Expert Oracle Database Architecture (Thomas Kyte, Darl Kuhn, 3rd Edition) Page: 270 shows the update restart of single row in every update execution with other sessions interfering (see test in Appendix 4).

All those tests showed three phases, which are ordered as "NOT LOCKED", "LOCK", "ALL LOCKED":

   updThreePhaseExe: objn=3122646 phase=NOT LOCKED
   updThreePhaseExe: objn=3122646 phase=LOCK
   updThreePhaseExe: objn=3122646 phase=ALL LOCKED


4.2 One Phase Update Restart


In this Blog, we looked the update restart of single row in every update execution without other session interfering.

In this Blog, DML UTS (Unified Tracing Service) only showed one phase, which is "NOT LOCKED":

   updThreePhaseExe: objn=4892505 phase=NOT LOCKED 
In fact, if we set 10200 Consistent Reads, and repeat the test:

alter session set tracefile_identifier = 'trace_10200_10046_test_2';
alter session set events '10046 trace name context forever, level 12';
-- alter session set events '10051 trace name context forever, level 1';
alter session set events '10200 trace name context forever, level 10';
--set diagnostic events in my session:
alter session set events 'trace[dml]:sql_trace wait=true';

exec testp;
The trace shows that every update (whatever restart or not) has only phase=NOT LOCKED, and has the same calls of ktrget2 and ktrgcm:

updThreePhaseExe: objn=4892505 phase=NOT LOCKED
updaul: phase is NOT LOCKED

ktrget2(): started for block  <0x0d8f : 0x0004aa35> objd: 0x004aa7fb
...
ktrgcm(): completed for block  <0x0d8f : 0x0004aa35> objd: 0x004aa7fb
ktrget2(): completed for  block <0x0d8f : 0x0004aa35> objd: 0x004aa7fb


5. Summary


This Blog shows the update restart of single row in every update execution without other sessions interfering, which is caused by new Undo extends.

Since every update execution has only one phase, i.e. "NOT LOCKED", and has only one "FETCH" for Consistent Read, it is hard to say that such update restart is a real "Restart".

It is different to the other update restart, which is caused by row locking from other sessions interfering.

By the way, we also use the same above diagnostic events to run the test in Blog: Delete Restart and new Active Undo Extent) the output shows update restarts, but the trace file does not contain anything like:


  updThreePhaseExe
  phase=NOT LOCKED
  error=1551


Appendix 1. Test Code


alter system set transactions_per_rollback_segment=1 scope=spfile;
alter system set "_rollback_segment_count"=1         scope=spfile;
      --alter system reset "_rollback_segment_count" scope=spfile;
  --Re-Start DB

create or replace function get_undo_extent_cnt return number
as
  l_cnt number;
begin   
 select count(*) into l_cnt from dba_undo_extents where status = 'ACTIVE';
 return l_cnt;
end;
/

create or replace function get_undo_extent_info return varchar2
as
  l_xid         varchar2(100);
  l_usn         number;
  l_extent_cnt  number;
  l_last_extent varchar2(100);
  l_max_extent_id number;
begin  
 l_xid := dbms_transaction.local_transaction_id;
 l_usn := substr(dbms_transaction.local_transaction_id, 1, instr(dbms_transaction.local_transaction_id, '.') -1);
 
 -- all stats only on ACTIVE extent
 select count(*) 
   into l_extent_cnt 
   from dba_undo_extents 
  where instr(segment_name, '_SYSSMU'||l_usn||'_') > 0 and status = 'ACTIVE';
 
 select max(extent_id) 
   into l_max_extent_id
   from dba_undo_extents 
  where instr(segment_name, '_SYSSMU'||l_usn||'_') > 0 and status = 'ACTIVE'; 
                 
 select ', extent_id = '||extent_id||', block_id = '||block_id||', blocks = '|| blocks 
   into l_last_extent
   from dba_undo_extents
  where instr(segment_name, '_SYSSMU'||l_usn||'_') > 0 and status = 'ACTIVE'
    and extent_id = l_max_extent_id;
 
 return 'l_max_extent_id='||l_max_extent_id ||', extent count = '||l_extent_cnt||l_last_extent;
end;
/

create or replace package pacepack
as
  type array is table of number index by varchar2(40);
  g_data                  array;
  g_cnt                   number;
  g_restart_cnt           number;
  g_undo_extent_cnt       number;
end pacepack;
/

drop table delete_data;

create table delete_data as
  select owner, object_name, rownum id from dba_objects where 0 = 1;
    
create or replace trigger delete_data_bt
  before delete or update on delete_data
  for each row
declare
  l_msg varchar2(100);
begin
  pacepack.g_cnt        := pacepack.g_cnt + 1;
  if ( pacepack.g_data.exists(rowidtochar(:old.rowid)))
  then
    dbms_session.sleep(1);
    pacepack.g_restart_cnt := pacepack.g_restart_cnt + 1;
    
    l_msg := 'doing "'||:old.rowid ||'(id = '||:old.id||')" again, '||'trigger called count = '||pacepack.g_cnt;
    sys.dbms_system.ksdwrt(3, l_msg);
    dbms_output.put_line('Restart#: '||pacepack.g_restart_cnt||' @ ' ||(to_char(sysdate, 'hh:mi:ss')));
    dbms_output.put_line(lpad(chr(32), 12, chr(32))||l_msg);
    dbms_output.put_line(lpad(chr(32), 12, chr(32))||'Undo: '||get_undo_extent_info);
  else
    pacepack.g_data(rowidtochar(:old.rowid)) := 1;
  end if;
end;
/

create or replace procedure testp as
  l_cnt           number := 0;
  cursor          l_delete_csr is select * from delete_data order by id for update;
  l_start_extents number;
  l_start_extends number;
  l_end_extents   number;
  l_end_extends   number;
begin
  pacepack.g_data.delete;
  pacepack.g_cnt         := 0;
  pacepack.g_restart_cnt := 0;
         
  for l_delete_row in l_delete_csr loop
    update delete_data set owner = lower(owner) where current of l_delete_csr;
    l_cnt := l_cnt + 1;
   
    if l_cnt = 1 then
     for c in (select usn, extents, extends from v$rollstat where xacts > 0) loop
      dbms_output.put_line('--======= Begin Rollstat: '||'USN='||c.usn||', EXTENTS='||c.extents||', EXTENDS='||c.extends||' =======--');
      l_start_extents := c.extents;
      l_start_extends := c.extends;
     end loop;
    end if;
  end loop;
      
  for c in (select usn, extents, extends from v$rollstat where xacts > 0) loop
    dbms_output.put_line('--======= End Rollstat: '||'USN='||c.usn||', EXTENTS='||c.extents||', EXTENDS='||c.extends||' =======--');
    l_end_extents := c.extents;
    l_end_extends := c.extends;
  end loop;
  
  dbms_output.put_line(lpad(chr(32), 4, chr(32))||'row     count = '||l_cnt); 
  dbms_output.put_line(lpad(chr(32), 4, chr(32))||'trigger count = '||pacepack.g_cnt);
  dbms_output.put_line(lpad(chr(32), 4, chr(32))||'restart count = '||(pacepack.g_cnt - l_cnt));
  
  dbms_output.put_line(lpad(chr(32), 4, chr(32))||'extents delta = '||(l_end_extents - l_start_extents));
  dbms_output.put_line(lpad(chr(32), 4, chr(32))||'extends delta = '||(l_end_extends - l_start_extends));
 end;
/


Appendix 2. error=1551 Test


We repeated the same test in Blog: That Old Restart Problem Strikes Back: Setting the Stage .

The test table (big) has all similar stats as the Blog (163840 blocks and 1280 MB).

alter session set max_dump_file_size = UNLIMITED;
alter session set tracefile_identifier = "big_upd_3"; 
alter session set events 'trace[dml]:sql_trace wait=true';

-- added MONITOR hint
update /*+ MONITOR */ big
      set val = val + 1
    where val = 1;

100000 rows updated.
Here the trace file (edited and comment added after "=====>"). It shows three phases, and Oracle stats in each phase.

test_db_ora_31526_big_upd_3.trc

=====>Line: 255
PARSING IN CURSOR #140473762595472 len=67 dep=0 uid=49 oct=6 lid=49 tim=8799656282217 hv=2426892631 ad='6a8460b0' sqlid='fm6dxja8afwar'
update /*+ MONITOR */ big
      set val = val + 1
    where val = 1
END OF STMT
PARSE #140473762595472:c=1131,e=1269,p=0,cr=1,cu=0,mis=1,r=0,dep=0,og=1,plh=3225303172,tim=8799656282216

updThreePhaseExe: objn=4892269 phase=NOT LOCKED
updaul: phase is NOT LOCKED

=====>Line: 5204. About 5000 lines (from 255 to 5204) with 'db file scattered read' blocks=32. that is about 150000 blocks.
WAIT #140473762595472: nam='db file scattered read' ela= 151 file#=22 block#=605928 blocks=32 obj#=4892269 tim=8799658444950
updaul: objn=4892269 error=1551

updThreePhaseExe: objn=4892269 phase=LOCK
updaul: phase is LOCK

=====>Line:10145
WAIT #140473762595472: nam='db file scattered read' ela= 149 file#=22 block#=605864 blocks=32 obj#=4892269 tim=8799660393207

=====>Line: 25003, 39864, 54722, 60581, 84439, 99390 have "updrowFastPath: objn=4892269 error=1551"
updrowFastPath: objn=4892269 error=1551

=====>Line:110218. About 100000 lines (from 10145 to 110218) with 'updrowFastPath'.
updrowFastPath: kddlkr objn 4892269 table 0  rowid 004aa66d.0009452f.2c code 0

updThreePhaseExe: objn=4892269 phase=ALL LOCKED
updaul: phase is ALL LOCKED

=====>Line:115173. About 5000 lines (from 110218 to 115173) with 'db file scattered read' blocks=32. that is about 150000 blocks.
WAIT #140473762595472: nam='db file scattered read' ela= 170 file#=22 block#=605896 blocks=32 obj#=4892269 tim=8799671146703
Here the monitor report. Execs = 3 reveals three phase updates, 3 times excessive I/O stats also indicates such three phases.

SQL > select dbms_sqltune.report_sql_monitor('fm6dxja8afwar') from dual;

------------------------------
update /*+ MONITOR */ big set val = val + 1 where val = 1

Global Information
------------------------------
 SQL ID              :  fm6dxja8afwar       
 Duration            :  38s                 
 Program             :  sqlplus.exe         

Global Stats
=========================================================
| Elapsed |   Cpu   |    IO    | Buffer | Read  | Read  |
| Time(s) | Time(s) | Waits(s) |  Gets  | Reqs  | Bytes |
=========================================================
|      39 |      29 |       10 |   793K | 15723 |   4GB |
=========================================================

SQL Plan Monitoring Details (Plan Hash Value=3225303172)
=====================================================================================================================================================
| Id |      Operation       | Name |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read  | Read  | Activity |      Activity Detail       |
|    |                      |      | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs  | Bytes |   (%)    |        (# samples)         |
=====================================================================================================================================================
|  0 | UPDATE STATEMENT     |      |         |      |        27 |    +12 |     3 |        0 |       |       |          |                            |
|  1 |   UPDATE             | BIG  |         |      |        27 |    +12 |     3 |        0 |     3 | 24576 |     2.56 | Cpu (1)                    |
|  2 |    TABLE ACCESS FULL | BIG  |       1 |    2 |        38 |     +2 |     3 |     205K | 15720 |   4GB |    94.87 | Cpu (28)                   |
|    |                      |      |         |      |           |        |       |          |       |       |          | db file scattered read (9) |
=====================================================================================================================================================


Appendix 3. Event 1551 Errorstack Trace.


We also make a 1551 Errorstack Trace:

alter session set max_dump_file_size = UNLIMITED;
alter session set tracefile_identifier = "trc_1551_1";
alter session set events='1551 trace name errorstack level 3';

exec testp;
Here the output:

----- Error Stack Dump -----
ORA-01551: extended rollback segment, pinned blocks released
----- Current SQL Statement for this session (sql_id=f4qphfamx27x1) -----
SELECT * FROM DELETE_DATA ORDER BY ID FOR UPDATE
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0x8bcaefe8         3  procedure K.TESTP
0x8bcaefe8        13  procedure K.TESTP
0x6aea90d8         1  anonymous block

----- Call Stack Trace -----
[12] (dbgePostErrorKGE()+1066       
[13] (dbkePostKGE_kgsf()+71         
[14] (kgeade()+392                  
[15] (kgeselv()+89                  
[16] (ksesecl0()+189                
[17] (ktugur()+7201                 
[18] (ktuchg2()+1063                
[19] (ktbchg2()+231                 
[20] (kddchg()+611                  
[21] (kddlok()+2350                 
[22] (kddlkr()+322                  
[23] (qerfuProcessFetchedRow()+669  
[24] (qerfuLockingRowProcedure()+68 
[25] (qersoFetchSimple()+420        
[26] (qersoFetch()+210              
[27] (qerfuStart()+602              
[28] (selexe0()+3409                
[29] (opiexe()+11896                
[30] (opipls()+2427                 
[31] (opiodr()+1202                 
[32] (rpidrus()+198                 
[33] (skgmstack()+65                
[34] (rpidru()+132                  
[35] (rpiswu2()+543                 
[36] (rpidrv()+1266                 
[37] (psddr0()+467                  
[38] (psdnal()+624                  
[39] (pevm_EXECC()+306              
[40] (pfrinstr_EXECC()+56           
[41] (pfrrun_no_tool()+60           
[42] (pfrrun()+902                  
[43] (plsql_run()+752  


Appendix 4. Single row update in every update execution with other sessions interfering


We also repeat the code test in Book: Expert Oracle Database Architecture (Thomas Kyte, Darl Kuhn, 3rd Edition) Page: 270.

drop table t;
create table t (x int, y int);
insert into t values (1, 5);
commit;

--SID-1@T1
Update t set y=10 where y=5;

--SID-2@T2
alter session set tracefile_identifier = 'trace_10200_10046_test_x3';
alter session set events '10046 trace name context forever, level 12';
alter session set events '10200 trace name context forever, level 10';
-- alter session set events '10051 trace name context off';
alter session set events 'trace[dml]:sql_trace wait=true';

Update t Set x=x+1 Where y=5;

--SID-1@T3
commit;
Trace file shows three phases:

updThreePhaseExe: objn=4892669 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e8a526 

updThreePhaseExe: objn=4892669 phase=LOCK
updaul: phase is LOCK snap oldsnap env: (scn: 0x00000b7302e8a533 

updThreePhaseExe: objn=4892669 phase=ALL LOCKED
updaul: phase is ALL LOCKED snap oldsnap env: (scn: 0x00000b7302e8a533

and 13 occurrences of following call patterns:

ktrget2(): started for block  <0x0d8f : 0x0011cb7b> objd: 0x004aa7fd
ktrexf(): returning 9 on:  0x7fbd2ead6588  
ktrgcm(): completed for block  <0x0d8f : 0x0011cb7b> objd: 0x004aa7fd
ktrget2(): completed for  block <0x0d8f : 0x0011cb7b> objd: 0x004aa7fd
The xplan also reveals three phases by "starts=3":

SQL ID: gwkdp2wjjxxks Plan Hash: 931696821
Update t Set x=x+1 Where y=5

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00      17.23          0         21          2           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00      17.23          0         22          2           0

Rows (1st) Row Source Operation
---------- ---------------------------------------------------
         0 UPDATE  T (cr=21 pr=0 pw=0 time=17229569 us starts=3)
         1  TABLE ACCESS FULL T (cr=20 pr=0 pw=0 time=824 us starts=3 cost=2 size=26 card=1)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  enq: TX - row lock contention                   1       17.22         17.22

Sunday, September 11, 2022

Proposal for ORA-04030 incident/trace files Enhancement to list Plsql package Memory Usage

For "ORA-04030: out of process memory" incident/trace files, we would like to propose to list Plsql package (both spec and body) names, and amount of respective used memory by each Plsql package. The selection criteria could be TOP-n Plsql packages (for example, n=10), or the sum of used memory usage by the listed Plsql packages is greater than Percentage-x (for example, x=90), or any package which uses more than KB-m memory (for example, m=1024).

The same list of Plsql packages can be also applied for Event 10261.
(see ORA-00600 [723], [67108952], [pga heap] When Event 10261 Set To Limit The PGA Leak (Doc ID 1162423.1),
and Blog: Limit PGA Memory Usage)

The content of Plsql package usage can be similar to dbms_session.get_package_memory_utilization.
(see Blog: dbms_session package_memory_utilization)

We often hit such ORA-04030 in big Oracle applications. It would be a great new feature to facilitate Plsql applications and memory usage investigation.