Blog: That old restart problem again...
(
http://tkyte.blogspot.com/2010/04/that-old-restart-problem-again.html)
or Link:
That old restart problem again...
presents a new case of update restart, which demonstrates that restart can also happen within a single session, not necessarily two sessions with one being blocked by a update from another session. Inherently single session restart is distinct from the normal two sessions' restart in which all rows till first detected changed row are restarted, however in single session restart, only occasionally a few rows are restarted.
Referring to Oracle documentation, it is not evident to get a clue about such behavior. Sometimes I hear the warning: "Don't go searching for it in the documentation" (Blog: NO_DATA_NEEDED - something I learned recently in
http://tkyte.blogspot.com/2010/04/nodataneeded-something-i-learned.html).
So it is worth of pursuing some research on it. Following modified script shows certain coincidence of new active undo_extent and restart.
I slightly modified the code in Blog: That old restart problem again...: by adding a helper function to count the active undo extents, and some lines to print out this count.
create or replace function get_undo_extent_cnt return number
as
l_cnt number;
begin
select count(*) into l_cnt from dba_undo_extents where status = 'ACTIVE';
return l_cnt;
end;
/
create or replace package pacepack
as
type array is table of number index by varchar2(40);
g_data array;
g_cnt number;
g_restart_cnt number;
g_undo_extent_cnt number;
end pacepack;
/
drop table delete_data;
create table delete_data as
select owner, object_name, rownum id
from dba_objects
where rownum < 53001;
create or replace trigger delete_data_bt
before delete or update on delete_data
for each row
begin
if ( pacepack.g_data.exists(rowidtochar(:old.rowid)))
then
pacepack.g_restart_cnt := pacepack.g_restart_cnt + 1;
dbms_output.put_line( 'doing "' || :old.rowid ||
'" again was called ' || pacepack.g_cnt );
dbms_output.put_line('-- Restart#: ' || pacepack.g_restart_cnt ||
', Undo Extent Count: ' || get_undo_extent_cnt);
else
pacepack.g_data(rowidtochar(:old.rowid)) := 1;
end if;
pacepack.g_cnt := pacepack.g_cnt + 1;
end;
/
Assume you are the only one on the database, then run the same script:
declare
cursor l_delete_csr is
select * from delete_data for update;
l_cnt number := 0;
begin
pacepack.g_data.delete;
pacepack.g_cnt := 0;
pacepack.g_restart_cnt := 0;
for l_delete_row in l_delete_csr loop
update delete_data set owner = lower(owner) where current of l_delete_csr;
l_cnt := l_cnt + 1;
if l_cnt = 1 then
for c in (select usn, extents, extends from v$rollstat where xacts > 0) loop
dbms_output.put_line('---- Begin Rollstat: '||'USN='||c.usn||', EXTENTS='||c.extents||', EXTENDS='||c.extends);
end loop;
end if;
end loop;
dbms_output.put_line( 'trigger count = ' || pacepack.g_cnt || ' local count = ' || l_cnt );
for c in (select usn, extents, extends from v$rollstat where xacts > 0) loop
dbms_output.put_line('---- End Rollstat: '||'USN='||c.usn||', EXTENTS='||c.extents||', EXTENDS='||c.extends);
end loop;
end;
/
output:
---- Begin Rollstat: USN=20, EXTENTS=24, EXTENDS=21
doing "AABEYTAAAAAC9WMAB6" again was called 7144
-- Restart#: 1, Undo Extent Count: 23
doing "AABEYTAAAAAC9XMACo" again was called 18475
-- Restart#: 2, Undo Extent Count: 24
doing "AABEYTAAAAAC9YpACA" again was called 29740
-- Restart#: 3, Undo Extent Count: 25
doing "AABEYTAAAAAC9ZsAB1" again was called 41005
-- Restart#: 4, Undo Extent Count: 26
doing "AABEYTAAAAAC9axAAa" again was called 52298
-- Restart#: 5, Undo Extent Count: 27
trigger count = 53005 local count = 53000
---- End Rollstat: USN=20, EXTENTS=29, EXTENDS=26
The output shows 5 Restarts(53005-53000) , 5 new Extents(29-24) created by 5 extending(26-21).
So for each restart, a new active Undo Extent is created. If you run it repeatedly, you will see the same behavior.
As we know, Oracle fulfils a DML by following 5 steps:
create UNDO Change Vector
create REDO Change Vector
combine both Change Vectors into Redo Buffer, and then write to redo log
write UNDO record into UNDO file
write modified Data into DATA file
When a transaction fills rollback segment and finds that the last remaining block is not sufficient to hold the UNDO record, it will trigger SMON to perform a recursive space transaction, which will allocate a new UNDO extent. Hereafter Oracle recommences the above 5 steps, and hence an "Update Restart".
As I tested in 10g and 11g (undo_management=AUTO), this output is confirmed, but I would be careful to have any earlier conclusion before Oracle says.
Update (2022Aug25) --------------------------------------------------
Blog:
That Old Restart Problem Strikes Back: Setting the Stage (1 August 2019)
sets DML diagnostic events in session:
alter session set events 'trace[dml]:sql_trace wait=true';
and shows that there is one ORA-01551 error for each restart.
Setting this DML event, repeat our above test.
For above 5 Restarts, we can see 5 occurrences of following text in the trace file:
updThreePhaseExe: objn=4891285 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302b6b7ac xid: 0x006e.011.00006316 uba: 0x00c0227c.1936.58
updrow: objn=4891285 phase=NOT LOCKED
updrow: objn=4891285 error=1551
updrow: qecinvsub objn=4891285
updrow: setting ROW_RETRY objn=4891285
updrow: retry_this_row: ROW_RETRY set, objn= 4891285 phase=NOT LOCKED
The difference of those 5 occurrences are the second line about (scn, xid, uba),
where scn and uba are different for each restart, but for the same xid:
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302b6b7ac xid: 0x006e.011.00006316 uba: 0x00c0227c.1936.58
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302b6b7b0 xid: 0x006e.011.00006316 uba: 0x00c0517c.1937.5a
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302b6b7b2 xid: 0x006e.011.00006316 uba: 0x00c051fc.1938.5a
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302b6b7b3 xid: 0x006e.011.00006316 uba: 0x00c0527c.1939.59
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302b6b7b7 xid: 0x006e.011.00006316 uba: 0x00c052fc.193a.5a
If we made a 1551 errorstack trace:
alter session set events='1551 trace name errorstack level 3';
the Error Stack shows:
----- Error Stack Dump -----
ORA-01551: extended rollback segment, pinned blocks released
----- Current SQL Statement for this session (sql_id=4u6b4b82z7xrp) -----
SELECT * FROM DELETE_DATA FOR UPDATE
----- PL/SQL Call Stack -----
[15] (kgeselv()+89
[16] (ksesecl0()+189
[17] (ktusmasp()+1088 --->ktusmasp - kernel transaction undo smu (system managed undo) add space
[18] (ktuchg2()+9603
[19] (ktbchg2()+231
[20] (kddchg()+611
[21] (kddlok()+2350
[22] (kddlkr()+322
[23] (qerfuProcessFetchedRow()+669
[24] (qerfuLockingRowProcedure()+68
[25] (qersoRowP()+880
[26] (kdstf000110100000000km()+1014
[27] (kdsttgr()+2154
[28] (qertbFetch()+1090
[29] (qersoProcessULS()+281
[30] (qersoFetchSimple()+2251
[31] (qersoFetch()+210
[32] (qerfuStart()+602
[33] (selexe0()+3409
[34] (opiexe()+11896
End of Update (2022Aug25)--------------------------------------------------
The same behavior is also observed for the single row update,
in which the cursor is defined with an input parameter p_id to select a single row:
declare
cursor l_delete_csr (p_id number) is
select * from delete_data where id = p_id for update;
l_cnt number := 0;
begin
pacepack.g_data.delete;
pacepack.g_cnt := 0;
pacepack.g_restart_cnt := 0;
for i in 1..53000 loop
for l_delete_row in l_delete_csr(i) loop
update delete_data set owner = lower(owner) where current of l_delete_csr;
l_cnt := l_cnt + 1;
if l_cnt = 1 then
for c in (select usn, extents, extends from v$rollstat where xacts > 0) loop
dbms_output.put_line('---- Begin Rollstat: '||'USN='||c.usn||', EXTENTS='||c.extents||', EXTENDS='||c.extends);
end loop;
end if;
end loop;
end loop;
dbms_output.put_line( 'trigger count = ' || pacepack.g_cnt ||' local count = ' || l_cnt );
for c in (select usn, extents, extends from v$rollstat where xacts > 0) loop
dbms_output.put_line('---- End Rollstat: '||'USN='||c.usn||', EXTENTS='||c.extents||', EXTENDS='||c.extends);
end loop;
end;
/
Output:
---- Begin Rollstat: USN=34, EXTENTS=29, EXTENDS=25
doing "AABEYTAAAAAC9WhABL" again was called 11315
-- Restart#: 1, Undo Extent Count: 29
doing "AABEYTAAAAAC9XnAAB" again was called 22612
-- Restart#: 2, Undo Extent Count: 30
doing "AABEYTAAAAAC9ZCAAe" again was called 33877
-- Restart#: 3, Undo Extent Count: 31
doing "AABEYTAAAAAC9aHAAr" again was called 45142
-- Restart#: 4, Undo Extent Count: 32
trigger count = 53004 local count = 53000
---- End Rollstat: USN=34, EXTENTS=33, EXTENDS=29
The Blog: Three new "hints" (
http://rwijk.blogspot.com/2009/10/three-new-hints.html) shows a new Oracle 11.2.0.1 hint: RETRY_ON_ROW_CHANGE, which I would call it block level restart, in comparing to the traditional row level (maybe this hint already exists in 10g since it is mentioned in 10g "Bug 6082734 Dump (kdr9igtl) accessing a row from an IOT").
One interesting code in this Blog is that it uses after trigger to catch the restart.