Thursday, May 5, 2011

ORA-04027: self-deadlock during automatic validation

Sometimes ago I hit an error:

   ORA-04027: self-deadlock during automatic validation for object

I wonder why there is a self-deadlock in a single session (no autonomous transaction involved). Then I tried to extract a short test case to reproduce it as follows.

drop table testt1;
drop package testp1;
drop procedure proc1;

create table testt1 (a number(2));
insert into testt1 values (12);
commit;

create or replace package testp1 as
  function f1 return number;
  procedure p1;
end testp1;
/

create or replace procedure proc1              
as
  procedure prt(p_name varchar2) as
  begin
      for c in (select p_name || object_name || '(' || object_type || ')' || status s
                from   dba_objects
                where  object_name in ('TESTP1', 'PROC1'))
      loop
        dbms_output.put_line(c.s);
      end loop; 
  end;
begin
  prt('Before Alter: ');
  execute immediate 'alter table testt1 modify (a number(2))';
  prt('After Alter:  ');
  update testt1 set a=testp1.f1;
end proc1;
/

create or replace package body testp1 as
  function f1 return number as
    begin
      return 10;
    end;
   
  procedure p1 is
    begin
      proc1;
    end;
end testp1;
/

Now if I run:

SQL> exec proc1;

Before Alter: TESTP1(PACKAGE BODY)VALID
Before Alter: TESTP1(PACKAGE)VALID
Before Alter: PROC1(PROCEDURE)VALID
After Alter: TESTP1(PACKAGE BODY)INVALID
After Alter: TESTP1(PACKAGE)VALID
After Alter: PROC1(PROCEDURE)INVALID
BEGIN proc1; END;

*
ERROR at line 1:
ORA-04045: errors during recompilation/revalidation of TESTP1
ORA-04027: self-deadlock during automatic validation for object PROC1
ORA-06512: at "PROC1", line 16
ORA-06512: at line 1

There are two invalids: TESTP1(PACKAGE BODY) and PROC1(PROCEDURE). You can make both valid by:


alter PACKAGE TESTP1 COMPILE BODY;

but whenever you call proc1, they are again invalid.

The dependency is TESTP1(PACKAGE BODY) on PROC1, and PROC1 & TESTP1(PACKAGE BODY) on TESTP1(PACKAGE).

Here is an attempt to make some reasoning.

When running "exec proc1;", proc1 is pinned, after "alter table testt1", proc1 is invalid, which in turn, causes the invalid of testp1 body due to dependency (the pinned version of proc1 is still valid since it is the directly called unit), then proc1 runs to the update statement, which requires a valid testp1 body, thus it demands an X-lock on testp1 body, which again asks for an X-lock of proc1 (due to dependency). Since proc1 is already pinned (Share-lock) by its own, it is not possible to allocate an X-lock to itself. So a self-deadlock is generated during validation of proc1.
                       
The code was tested on 10gr2 and 11gr2. Toggling 11gr2 hidden parameters:
_disable_fast_validate(TRUE, FALSE), _ignore_fg_deps  (TABLES, PLSQL, ALL, NONE)
seems having no influence on the behavior.

Wednesday, May 4, 2011

Foreign Key on Nested Table

This Blog will make an attempt to define a referential constraint on a nested table column.

At first I copy and run the code from Book: Expert Oracle Database Architecture: 9i and 10g (http://www.apress.com/9781590595305).

drop table dept_and_emp cascade constraints;
drop type emp_tab_type;
drop type emp_type;

create or replace NONEDITIONABLE type emp_type  -- NONEDITIONABLE since 11.2
--create or replace type emp_type
as object
(empno       number(4),
 ename       varchar2(10),
 job         varchar2(9),
 mgr         number(4),
 hiredate    date,
 sal         number(7, 2),
 comm        number(7, 2)
)
/

create or replace NONEDITIONABLE type emp_tab_type  -- NONEDITIONABLE since 11.2
--create or replace type emp_tab_type
as table of emp_type
/

create table dept_and_emp
(deptno    number(2) primary key,
 dname     varchar2(14),
 loc       varchar2(13),
 emps      emp_tab_type
)
nested table emps store as emps_nt
/

alter table emps_nt add constraint
emps_empno_unique unique(empno)
/

insert into dept_and_emp values(1, 'Dept_1', 'Loc_1',
  new emp_tab_type(new emp_type(11, 'Emp_11', 'Job_11', null, date'1962-01-11', 1, 1),
                   new emp_type(12, 'Emp_12', 'Job_12', 11, date'1962-01-12', 2, 2)
                   ));
                  
insert into dept_and_emp values(2, 'Dept_2', 'Loc_2',
  new emp_tab_type(new emp_type(21, 'Emp_21', 'Job_21', null, date'1962-02-11', 1, 1),
                   new emp_type(22, 'Emp_22', 'Job_22', 21, date'1962-02-12', 2, 2)
                   )); 
                  
commit;
As already showed in the book, if run:

alter table emps_nt add constraint mgr_fk foreign key(mgr) references emps_nt(empno);

one will get an error:

ORA-30730: referential constraint not allowed on nested table column

Now I will try with other columns. At first, I list the columns from master and nested table (not relevant columns are removed):

SQL > select o.object_name tname, c.name, c.type# ctype
        from   dba_objects o, sys.col$ c
       where c.obj# = o.object_id
         and o.object_name in ('EMPS_NT', 'DEPT_AND_EMP')
       order by o.object_name, c.name;

DEPT_AND_EMP            SYS_NC0000400005$    23
EMPS_NT                 NESTED_TABLE_ID      23

I see both DEPT_AND_EMP.SYS_NC0000400005$ and EMPS_NT.NESTED_TABLE_ID having the same type: 23(SQLT_BIN).
   
then run:

alter table emps_nt add constraint emps_nt_fk foreign key(nested_table_id)
references dept_and_emp(sys_nc0000400005$);

it will create a foreign key constraint EMPS_NT_FK.

I can verify it by:

SQL > select constraint_name, constraint_type, table_name 
        from dba_constraints where table_name = 'EMPS_NT';

EMPS_NT_FK          R     EMPS_NT
EMPS_EMPNO_UNIQUE   U     EMPS_NT

Tested on Oracle 10gr2 and 11gr2, both works.

Now I am lost with Oracle documentation:

ORA-30730: referential constraint not allowed on nested table column
    Cause: An attempt was made to define a referential constraint on a nested table column.
    Action: Do not specify referential constraints on nested table columns.

By the way, on 11gr2, with the query:

SQL > select  index_name, table_name, column_name 
        from  dba_ind_columns 
       where table_name in ('EMPS_NT', 'DEPT_AND_EMP');

SYS_C0016818               DEPT_AND_EMP      DEPTNO
SYS_C0016819               DEPT_AND_EMP      EMPS
SYS_FK0000184243N00004$    EMPS_NT           NESTED_TABLE_ID
EMPS_EMPNO_UNIQUE          EMPS_NT           EMPNO

we can see the column NESTED_TABLE_ID in EMPS_NT is indexed by: SYS_FK0000184243N00004$. So Oracle follows its rule: "foreign keys should be indexed" (Oracle® Database Concepts 11g Release 2 (11.2) Part Number E16508-05 in http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/datainte.htm#CNCPT88886)

However, this index can not be found in 10gr2.    

We can also verify this foreign key constraint:

SQL > delete dept_and_emp where deptno = 1;
ORA-02292: integrity constraint (K.EMPS_NT_FK) violated - child record found

Now disabling this constraint, delete deptno 1 again:

SQL > alter table emps_nt disable constraint emps_nt_fk;
SQL > delete dept_and_emp where deptno = 1;
1 row deleted.

We can see that both parent and child rows are deleted:

SQL > select SYS_NC0000400005$, t.* from dept_and_emp t;

SQL > select /*+ nested_table_get_refs */ NESTED_TABLE_ID, SYS_NC_ROWINFO$, t.* from emps_nt t;

(Without hint: nested_table_get_refs, it hits: "ORA-22812: cannot reference nested table column's storage table")

However, if we delete with two additional hints:

SQL > delete /*+ no_index(t SYS_FK0000184243N00004$) nested_table_get_refs */ dept_and_emp t
      where deptno = 2;
1 row deleted.

We can see that only parent row is deleted, but child rows remain:

SQL > select SYS_NC0000400005$, t.* from dept_and_emp t;
no rows selected

SQL > select /*+ nested_table_get_refs */ NESTED_TABLE_ID, SYS_NC_ROWINFO$, t.* from emps_nt t;

E40DC694BEAC31C7E04478E7D1E92954 (21; Emp_21; Job_21; ; 11.02.1962; 1; 1) 21 Emp_21 Job_21  11.02.1962 1 1

E40DC694BEAC31C7E04478E7D1E92954 (22; Emp_22; Job_22; 21; 12.02.1962; 2; 2) 22 Emp_22 Job_22 21 12.02.1962 2 2

2 rows selected.

Taking the NESTED_TABLE_ID of above query, we can delete these child rows directly:

SQL > delete /*+ nested_table_get_refs */ emps_nt t
      where nested_table_id=hextoraw('E40DC694BEAC31C7E04478E7D1E92954');
2 rows deleted.

SQL > select /*+ nested_table_get_refs */ NESTED_TABLE_ID, SYS_NC_ROWINFO$, t.* from emps_nt t;
no rows selected

 

Monday, May 2, 2011

Update Restart and new Active Undo Extent

Blog:  That old restart problem again...
(http://tkyte.blogspot.com/2010/04/that-old-restart-problem-again.html) or Link: That old restart problem again...
presents a new case of update restart, which demonstrates that restart can also happen within a single session, not necessarily two sessions with one being blocked by a update from another session. Inherently single session restart is distinct from the normal two sessions' restart  in which all rows till first detected changed row are restarted, however in single session restart, only occasionally a few rows are restarted.

Referring to Oracle documentation, it is not evident to get a clue about such behavior. Sometimes I hear the warning: "Don't go searching for it in the documentation" (Blog:  NO_DATA_NEEDED - something I learned recently in http://tkyte.blogspot.com/2010/04/nodataneeded-something-i-learned.html).

So it is worth of pursuing some research on it. Following modified script shows certain coincidence of new active undo_extent and restart.

I slightly modified the code in Blog:  That old restart problem again...: by adding a helper function to count the active undo extents, and some lines to print out this count.

create or replace function get_undo_extent_cnt return number
as
  l_cnt number;
begin   
 select count(*) into l_cnt from dba_undo_extents where status = 'ACTIVE';
 return l_cnt;
end;
/

create or replace package pacepack
as
  type array is table of number index by varchar2(40);
  g_data                         array;
  g_cnt                           number;
  g_restart_cnt                number;
  g_undo_extent_cnt       number;
end pacepack;
/

drop table delete_data;

create table delete_data as
select owner, object_name, rownum id
from  dba_objects
where rownum  < 53001;
    
create or replace trigger delete_data_bt
  before delete or update on delete_data
  for each row
begin
  if ( pacepack.g_data.exists(rowidtochar(:old.rowid)))
  then
    pacepack.g_restart_cnt := pacepack.g_restart_cnt + 1;
   
    dbms_output.put_line( 'doing "' || :old.rowid ||
                           '" again was called ' || pacepack.g_cnt );
    dbms_output.put_line('-- Restart#: ' || pacepack.g_restart_cnt ||
                            ', Undo Extent Count: ' || get_undo_extent_cnt);
  else
    pacepack.g_data(rowidtochar(:old.rowid)) := 1;
  end if;
      
  pacepack.g_cnt        := pacepack.g_cnt + 1;
end;
/

Assume you are the only one on the database, then run the same script:

declare
  cursor l_delete_csr is
    select * from delete_data for update;
  l_cnt number := 0;
begin
  pacepack.g_data.delete;
  pacepack.g_cnt         := 0;
  pacepack.g_restart_cnt := 0;
         
  for l_delete_row in l_delete_csr loop
    update delete_data set owner = lower(owner) where current of l_delete_csr;
    l_cnt := l_cnt + 1;
   
    if l_cnt = 1 then
     for c in (select usn, extents, extends from v$rollstat where xacts > 0) loop
      dbms_output.put_line('---- Begin Rollstat: '||'USN='||c.usn||', EXTENTS='||c.extents||', EXTENDS='||c.extends);
     end loop;
    end if;
  end loop;
    
  dbms_output.put_line( 'trigger count = ' || pacepack.g_cnt || ' local count = '   || l_cnt );   
  for c in (select usn, extents, extends from v$rollstat where xacts > 0) loop
    dbms_output.put_line('---- End Rollstat: '||'USN='||c.usn||', EXTENTS='||c.extents||', EXTENDS='||c.extends);
  end loop;
 end;
 /

output:

---- Begin Rollstat: USN=20, EXTENTS=24, EXTENDS=21
doing "AABEYTAAAAAC9WMAB6" again was called 7144
-- Restart#: 1, Undo Extent Count: 23
doing "AABEYTAAAAAC9XMACo" again was called 18475
-- Restart#: 2, Undo Extent Count: 24
doing "AABEYTAAAAAC9YpACA" again was called 29740
-- Restart#: 3, Undo Extent Count: 25
doing "AABEYTAAAAAC9ZsAB1" again was called 41005
-- Restart#: 4, Undo Extent Count: 26
doing "AABEYTAAAAAC9axAAa" again was called 52298
-- Restart#: 5, Undo Extent Count: 27
trigger count = 53005 local count = 53000
---- End Rollstat: USN=20, EXTENTS=29, EXTENDS=26

The output shows 5 Restarts(53005-53000) , 5 new Extents(29-24) created by 5 extending(26-21).
So for each restart, a new active Undo Extent is created. If you run it repeatedly, you will see the same behavior.

As we know, Oracle fulfils a DML by following 5 steps:
  create UNDO Change Vector
  create REDO Change Vector
  combine both Change Vectors into Redo Buffer, and then write to redo log
  write UNDO record into UNDO file
  write modified Data into DATA file

When a transaction fills rollback segment and finds that the last remaining block is not sufficient to hold the UNDO record, it will trigger SMON to perform a recursive space transaction, which will allocate a new UNDO extent. Hereafter Oracle recommences the above 5 steps, and hence an "Update Restart".

As I tested in 10g and 11g (undo_management=AUTO), this output is confirmed,  but I would be careful to have any earlier conclusion before Oracle says.

Update (2022Aug25) --------------------------------------------------

Blog: That Old Restart Problem Strikes Back: Setting the Stage (1 August 2019)
sets DML diagnostic events in session:

  alter session set events 'trace[dml]:sql_trace wait=true';
and shows that there is one ORA-01551 error for each restart.

Setting this DML event, repeat our above test.
For above 5 Restarts, we can see 5 occurrences of following text in the trace file:

updThreePhaseExe: objn=4891285 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302b6b7ac  xid: 0x006e.011.00006316  uba: 0x00c0227c.1936.58 
updrow: objn=4891285 phase=NOT LOCKED
updrow: objn=4891285 error=1551
updrow: qecinvsub objn=4891285
updrow: setting ROW_RETRY objn=4891285
updrow: retry_this_row: ROW_RETRY set, objn= 4891285 phase=NOT LOCKED
The difference of those 5 occurrences are the second line about (scn, xid, uba),
where scn and uba are different for each restart, but for the same xid:

updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302b6b7ac  xid: 0x006e.011.00006316  uba: 0x00c0227c.1936.58 
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302b6b7b0  xid: 0x006e.011.00006316  uba: 0x00c0517c.1937.5a 
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302b6b7b2  xid: 0x006e.011.00006316  uba: 0x00c051fc.1938.5a 
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302b6b7b3  xid: 0x006e.011.00006316  uba: 0x00c0527c.1939.59 
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302b6b7b7  xid: 0x006e.011.00006316  uba: 0x00c052fc.193a.5a 
If we made a 1551 errorstack trace:

  alter session set events='1551 trace name errorstack level 3';
the Error Stack shows:

----- Error Stack Dump -----
ORA-01551: extended rollback segment, pinned blocks released
----- Current SQL Statement for this session (sql_id=4u6b4b82z7xrp) -----
SELECT * FROM DELETE_DATA FOR UPDATE
----- PL/SQL Call Stack -----

[15] (kgeselv()+89                    
[16] (ksesecl0()+189                  
[17] (ktusmasp()+1088   --->ktusmasp - kernel transaction undo smu (system managed undo) add space              
[18] (ktuchg2()+9603                  
[19] (ktbchg2()+231                   
[20] (kddchg()+611                    
[21] (kddlok()+2350                   
[22] (kddlkr()+322                    
[23] (qerfuProcessFetchedRow()+669    
[24] (qerfuLockingRowProcedure()+68   
[25] (qersoRowP()+880                 
[26] (kdstf000110100000000km()+1014   
[27] (kdsttgr()+2154                  
[28] (qertbFetch()+1090               
[29] (qersoProcessULS()+281           
[30] (qersoFetchSimple()+2251         
[31] (qersoFetch()+210                
[32] (qerfuStart()+602                
[33] (selexe0()+3409                  
[34] (opiexe()+11896  
End of Update (2022Aug25)--------------------------------------------------



The same behavior is also observed for the single row update, in which the cursor is defined with an input parameter p_id to select a single row:

declare
   cursor l_delete_csr (p_id number) is
     select * from delete_data where id = p_id for update;
   l_cnt number := 0;
begin
  pacepack.g_data.delete;
  pacepack.g_cnt         := 0;
  pacepack.g_restart_cnt := 0;

  for i in 1..53000 loop       
    for l_delete_row in l_delete_csr(i) loop
      update delete_data set owner = lower(owner) where current of l_delete_csr;
      l_cnt := l_cnt + 1;
     
      if l_cnt = 1 then
       for c in (select usn, extents, extends from v$rollstat where xacts > 0) loop
        dbms_output.put_line('---- Begin Rollstat: '||'USN='||c.usn||', EXTENTS='||c.extents||', EXTENDS='||c.extends);
       end loop;
      end if;     
    end loop;
  end loop;
  
  dbms_output.put_line( 'trigger count = ' || pacepack.g_cnt ||' local count = '   || l_cnt );
  for c in (select usn, extents, extends from v$rollstat where xacts > 0) loop
    dbms_output.put_line('---- End Rollstat: '||'USN='||c.usn||', EXTENTS='||c.extents||', EXTENDS='||c.extends);
  end loop;      
end;
/



Output:

---- Begin Rollstat: USN=34, EXTENTS=29, EXTENDS=25
doing "AABEYTAAAAAC9WhABL" again was called 11315
-- Restart#: 1, Undo Extent Count: 29
doing "AABEYTAAAAAC9XnAAB" again was called 22612
-- Restart#: 2, Undo Extent Count: 30
doing "AABEYTAAAAAC9ZCAAe" again was called 33877
-- Restart#: 3, Undo Extent Count: 31
doing "AABEYTAAAAAC9aHAAr" again was called 45142
-- Restart#: 4, Undo Extent Count: 32
trigger count = 53004 local count = 53000
---- End Rollstat: USN=34, EXTENTS=33, EXTENDS=29


The Blog: Three new "hints" (http://rwijk.blogspot.com/2009/10/three-new-hints.html) shows a new  Oracle 11.2.0.1 hint: RETRY_ON_ROW_CHANGE, which I would call it block level restart, in comparing to the traditional row level (maybe this hint already exists in 10g since it is mentioned in 10g  "Bug 6082734  Dump (kdr9igtl) accessing a row from an IOT").

One interesting code in this Blog is that it uses after trigger to catch the restart.