Monday, December 7, 2015

Oracle bigfile tablespace pre-allocation and session blocking

Oracle 11g introduced a new feature, tablespace extent pre-allocation, described as:

SMCO decides to expand the tablespace based on history, extension is split evenly across all datafiles in the tablespace which have not reached their maxsize and are still limited to 10% of the full tablespace size in one hourly SMCO wakeup.

With the 11g tablespace pre-extension feature, more extents are required to meet the free space projection requirement. The system does the extension in multiples of the increment (NEXT), not just keeping free space within one increment.

and provided the workaround:

 1. User can set hidden parameter "_enable_space_preallocation" to 0 (default value 3) to turn off the tbs pre-extension feature.
   alter system set "_enable_space_preallocation"=0;

 2. User can set hidden parameter "_kttext_warning" to specify the percentage we'd like to pre-allocate. By default, it is 5%.
   alter system set "_kttext_warning"=5;

See MOS Notes:
   SMCO (Space Management Coordinator) For Autoextend On Datafiles And How To Disable/Enable (Doc ID 743773.1)
   AUTOEXTEND Grows To Full Size Without Reason (Doc ID 1459097.1)
   Master Note: Overview of Oracle Segment Storage (Doc ID 1491960.1)

A bigfile tablespace contains only one datafile. For a 2 TB tablespace, the default pre-allocation of 5% amounts to 100 GB. On a system with a disk write throughput of 100 MB/s, it takes about 1000 seconds to complete this pre-allocation. That is probably why the hidden parameter is provided to turn off the new feature.
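
As a rough check of this arithmetic, a query like the following estimates the 5% pre-allocation size, and the time it would take at 100 MB/s, for each autoextensible datafile (a sketch only; it assumes the pre-allocation is sized against maxbytes and that the default 5% "_kttext_warning" applies):

 select tablespace_name, file_name
       ,round(maxbytes/1024/1024/1024, 2)       max_size_gb
       ,round(maxbytes*0.05/1024/1024/1024, 2)  prealloc_gb
       ,round(maxbytes*0.05/(100*1024*1024))    est_seconds_at_100mb_s
   from dba_data_files
  where autoextensible = 'YES';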

Test


Run test with default setting (see appended Test Code):
 ALTER SYSTEM SET "_ENABLE_SPACE_PREALLOCATION" = 3; 
 exec test_run(5, 2, 1);

then look at the tablespace usage statistics with:

 select d.snap_id
       ,round((tablespace_usedsize*8192/1024/1024/1024),2) used_size_gb
       ,round((tablespace_size*8192/1024/1024/1024),2)     size_gb
       ,round((tablespace_maxsize*8192/1024/1024/1024),2)  max_size_gb
       ,round(100*(tablespace_size - tablespace_usedsize)/tablespace_size, 2) percentage
 from dba_hist_tbspc_space_usage d, v$tablespace t
 where d.tablespace_id = t.ts# and t.name in ('TBS')
 order by d.rtime;

the output:

SNAP_ID USED_SIZE_GB SIZE_GB MAX_SIZE_GB PERCENTAGE
    120        21.19   21.95         100       3.48
    121        22.19   23.12         100       4.01
    122        23.19   24.41         100       5.02
    123        24.19   24.69         100       2.02
    124        25.13   26.05         100       3.56

The percentage is (almost) under 5%.

Verify the segment size against the unix file size:

 select segment_name, sum(blocks) blocks, round(sum(bytes)/1024/1024/1024, 2) gb
       ,round(27976015872/1024/1024/1024, 2)               unix_gb
       ,round(100*(27976015872-sum(bytes))/27976015872, 2) percentage
   from dba_segments
  where segment_name in ('TESTT') group by segment_name;
 
 where 27976015872 is the unix file size (ls -l tbs.dbf)

SEGMENT_NAME   BLOCKS    GB UNIX_GB PERCENTAGE
TESTT         3256320 24.84   26.05       4.65

We can see that SIZE_GB (tablespace_size) of 26.05 matches well with UNIX_GB (the unix file size).

Test again with pre-allocation off:
 ALTER SYSTEM SET "_ENABLE_SPACE_PREALLOCATION" = 0;  
 exec test_run(5, 2, 1);

and collect the tablespace usage statistics. The output:

SNAP_ID USED_SIZE_GB SIZE_GB MAX_SIZE_GB PERCENTAGE
    132        31.19   31.25         100       0.20
    133        32.25   32.26         100       0.02
    134        33.25   33.28         100       0.09
    135        34.34   34.38         100       0.09
    136        35.32   35.36         100       0.11

The percentage is under 1%.

Verify again the segment size against the unix file size (37968945152):

 select segment_name, sum(blocks) blocks, round(sum(bytes)/1024/1024/1024, 2) gb
       ,round(37968945152/1024/1024/1024, 2)               unix_gb
       ,round(100*(37968945152-sum(bytes))/37968945152, 2) percentage
   from dba_segments
  where segment_name in ('TESTT') group by segment_name;

SEGMENT_NAME   BLOCKS    GB UNIX_GB PERCENTAGE
TESTT         4563968 34.82   35.36       1.53

With the following query, one can also estimate the disk write throughput:

select count(*)                                                     cnt
      ,round(sum(delta_write_io_bytes)/(1024*1024*1024), 2)         write_gb
      ,round(sum(delta_interconnect_io_bytes)/(1024*1024*1024), 2)  write_int_gb
      ,count(*)                                                     elapsed_seconds
      ,round(sum(delta_write_io_bytes)/(1024*1024)/(count(*)*1), 2) disk_write_MB_per_second
from v$active_session_history t
where event in 'Data file init write';

The same behavior can be reproduced by simply resizing the datafile to a bigger size,
for instance, from the current 60G to 100G:
 ALTER DATABASE DATAFILE '/home/oradata/testdb/tbs.dbf' RESIZE 100G;
and monitor the file size with:
 select * from dba_data_files where file_name = '/home/oradata/testdb/tbs.dbf';

Wait Events


There are 3 blocking wait events observed among Wnnn, DBWn, and user sessions.
1. Data file init write
Wnnn or user sessions wait on this event when allocating space in the datafile. It is never blocked by another session.
When "_ENABLE_SPACE_PREALLOCATION" = 3, both Wnnn and user sessions can wait on this event.
When "_ENABLE_SPACE_PREALLOCATION" = 0, no Wnnn is active, and only user sessions can wait on this event.

2. local write wait  
Wnnn or user sessions wait on this event when writing "bitmap block" (CLASS# 11). The blocker is DBWn.

3. buffer busy waits
User sessions wait on this event when reading the file header block (CLASS# 13, BLOCK# 2).
The blocker is a Wnnn or another user session which is waiting on "Data file init write".

It can also happen when one Wnnn is reading the file header block (CLASS# 13, BLOCK# 2).
The blocker is then another Wnnn or a user session which is waiting on "Data file init write".

The user session that is waiting on event "buffer busy waits" while reading the file header block (CLASS# 13, BLOCK# 2) can in turn block other user sessions if they are performing DML (e.g. INSERT) on the same segment. These other user sessions wait on event "enq: TX - contention".

For applications using distributed transactions, this can cause the error:
    ORA-02049: timeout: distributed transaction waiting for lock
because there is a hard limit: distributed_lock_timeout.
If the "enq: TX - contention" wait time of a distributed transaction exceeds distributed_lock_timeout, the session throws ORA-02049.
(For ORA-02049, See Blog: Redo Practice)
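
While the pre-allocation is running, the waits and blockers described above can be watched with a query like the following (a simple sketch on v$session):

 select sid, serial#, program, event, blocking_session, seconds_in_wait
   from v$session
  where event in ('Data file init write', 'local write wait',
                  'buffer busy waits', 'enq: TX - contention');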

During pre-allocation, Wnnn or user sessions write to the datafile up to its target size (in big chunks) while holding a lock on its file header block. The consequence is that all access to the datafile by user sessions is blocked.

For batch processing, it could improve the overall throughput; whereas for OLTP applications, user sessions could be frozen during the entire allocation. Even with pre-allocation turned off by "_ENABLE_SPACE_PREALLOCATION" = 0, the allocation performed by user sessions still blocks other sessions, because the file header block lock (event "Data file init write") is an exclusive lock and serializes all operations on it.

Dumping the file header block shows that it records the file size and allocation details (all numbers in database blocks), which probably explains why it is a single serialization point.
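
Such a dump can be produced with a command like the following (a sketch; file# and block# are taken from the dump trailer shown below):

 alter session set tracefile_identifier = 'file_space_header';
 alter system dump datafile 859 block 2;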

 File Space Header Block:
 Header Control:
 RelFno: 1024, Unit: 8, Size: 664320, Flag: 9
 AutoExtend: YES, Increment: 1280, MaxSize: 13107200
 Initial Area: 126, Tail: 664319, First: 82240, Free: 784
 Deallocation scn: 2440207268.2059
 Header Opcode:
 Save: No Pending Op
 End dump data blocks tsn: 861 file#: 859 minblk 2 maxblk 2


truss (AIX) on Wnnn when it is waiting on "Data file init write":

  listio64(0x0000000010000004, 0x000000000FFFFFFF, 0x00000000FFFF6140, 0x0000000000000001, ...
  listio64(0x0000000010000004, 0x000000000FFFFFFF, 0x00000000FFFF6140, 0x0000000000000001, ...
  listio64(0x0000000010000004, 0x000000000FFFFFFF, 0x00000000FFFF6140, 0x0000000000000001, ...
  aio_nwait_timeout64(0x0000000000001000, 0x0000000000000003, ...) = 0x0000000000000003
  fsync(16)                        = 0
  statx("/home/oradata/testdb/tbs.dbf", 0x0FFFFFFFFFFF7D00, 176, 010) = 0
  kopen64x("/home/oradata/testdb/tbs.dbf", 0400400000200, 037777676400, , 1152921504606846944) = 17
  close(17)

Oracle has designated a dedicated event:
 db file single write
and says:
 This event is used to wait for the writing of the file headers.

but it is only observed a single time, when the TABLESPACE is first created with its DATAFILE.

Other Test


Curious about the jump of "_ENABLE_SPACE_PREALLOCATION" from the default 3 to 0, values 1 and 2 were also tested; they behave more or less like 0.

Parallel to "_kttext_warning" for tablespaces, there are also 4 hidden parameters for segment allocation:

NAME                      DESCRIPTION                                             INST_DEF_VAL
_ktslj_segext_warning     segment pre-extension warning threshold in percentage   10
_ktslj_segext_warning_mb  segment pre-extension warning threshold in MB           0
_ktslj_segext_max_mb      segment pre-extension max size in MB (0: unlimited)     0
_ktslj_segext_retry       segment pre-extension retry                             5
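
The hidden parameter names, descriptions and default values above can be listed with a query like this one (a sketch; it reads the x$ fixed tables and therefore has to be run as SYS):

 select i.ksppinm name, i.ksppdesc description, v.ksppstvl value
   from x$ksppi i, x$ksppcv v
  where i.indx = v.indx
    and i.ksppinm like '\_ktslj\_segext%' escape '\';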

Test Code


drop tablespace tbs including contents and datafiles cascade constraints;

create bigfile tablespace tbs datafile
  '/home/oradata/testdb/tbs.dbf' size 10M autoextend on next 10M maxsize 100G
logging online extent management local autoallocate blocksize 8k
segment space management auto flashback on;


drop table testt cascade constraints;

create table testt(id1 number, id2 number, id3 number, id4 number, text1 varchar2(3000), text2 varchar2(4000))
tablespace tbs storage (initial 16k next 128k);
       
create index testt_ix1 on testt (id1)
tablespace tbs storage (initial 16k next 128k); 

create index testt_ix2 on testt (id2)
tablespace tbs storage (initial 16k next 128k); 

create index testt_ix3 on testt (id3)
tablespace tbs storage (initial 16k next 128k); 

create index testt_ix4 on testt (id4)
tablespace tbs storage (initial 16k next 128k); 
                                          
create or replace procedure insert_testt (p_cnt number default 1) is
begin
  for i in 1..p_cnt loop
   insert into testt select level id1, level id2, level id3, level id4
                           ,rpad('ABC', 3000, 'K') text1, rpad('XYZ', 4000, 'K') text2
     from dual connect by level <= 65000;  -- about 0.5GB
   commit;
  end loop;
end;
/

create or replace procedure insert_testt_job (p_jobs number default 2, p_cnt number default 1)
is
  l_job_id       pls_integer;
  l_running_job_cnt number := 1;
  l_start_time    number := dbms_utility.get_time;
begin
  for i in 1..p_jobs loop
   dbms_job.submit(l_job_id, 'begin insert_testt(p_cnt=> '||p_cnt||'); end;');
  end loop;
  commit;
 
  while l_running_job_cnt > 0
  loop
    dbms_lock.sleep(1);
   select count(*) into l_running_job_cnt
     from dba_jobs
    where (job = l_job_id or what like 'begin insert_testt(p_cnt=>%')
      and rownum = 1;
  end loop;
  dbms_output.put_line('Elapsed Seconds: '|| round((dbms_utility.get_time - l_start_time)/100));
end;   
/

create or replace procedure test_run (p_run number default 5, p_jobs number default 2, p_cnt number default 1) is
begin
 sys.dbms_workload_repository.create_snapshot('ALL');
 for i in 1..p_run loop
  insert_testt_job(p_jobs, p_cnt);
  sys.dbms_workload_repository.create_snapshot('ALL');
 end loop;
end;
/

Tuesday, November 3, 2015

Oracle 12c single session "library cache lock (cycle)" deadlock

Note:

This is a 12c follow-up of previous Blog: Oracle 11gR2 single session "library cache pin"

Update (2020-Oct-23):

The discussed behaviour is no longer reproducible in 18c or later versions because 18c introduced a new hidden parameter "_EnableShadowTypes" (enable shadow types, default FALSE). To reproduce the problem in 18c or later, set "_EnableShadowTypes" to TRUE.

It is documented in MOS Note:
Pipeline Function Returning UDT Defined in the Package Specification Raises ORA-00902 Post Database Upgrade to 18c (Doc ID 2696498.1)

Prior to 18c, when such pipeline function is compiled, database implicitly creates shadow objects at SQL level 
(e.g. SYS_PLSQL_135723_9_1) so that the pipeline function can return the value. 
However, from RDBMS version 18c and onward, it fails because implicit creation of shadow object type 
for pipelined table functions returning user defined object type defined in package has been stopped. 
This change has been introduced by the following internal.

Alternatively, there's a backout switch to re-enable the generation of shadow object types 
for pipelined table functions returning user defined object type defined in package specification. 
The backout switch is to set the hidden parameter "_EnableShadowTypes" to TRUE.

      alter session SET "_EnableShadowTypes" = true;
(also see MOS Note:
-. Bug 29935685 - ORA-7445 [qcopgoty] Error During Compilation of Package When Using Pipelined Table Functions in 19.3 and Above (Doc ID 29935685.8)
-. Pipelined Function Consumes More Time IN 18C Than 12.2.0.1 (Doc ID 2603568.1) )

library cache lock (cycle)


Recently we observed a single session "library cache lock (cycle)" deadlock when upgrading to 12c (12.1.0.2.0) from 11gR2.

To reproduce the deadlock, run the attached TestCode (at the end of this Blog) to create the lc_pin# package and package body.

To observe the newly generated objects, run the query:
 select owner, object_name, object_type from dba_objects
 where last_ddl_time > sysdate -10/1440
 order by object_name;


it lists all the newly created objects:

 TEST LC_PIN#                    PACKAGE BODY
 TEST LC_PIN#                    PACKAGE
 TEST SYS_PLSQL_6174CDA6_21_1    TYPE
 TEST SYS_PLSQL_6174CDA6_31_1    TYPE
 TEST SYS_PLSQL_6174CDA6_9_1     TYPE
 TEST SYS_PLSQL_6174CDA6_DUMMY_1 TYPE
 SYS   SYS_PLSQL_750F00_462_1     TYPE
 SYS   SYS_PLSQL_750F00_DUMMY_1   TYPE


Run query on dba_source:
 select * from dba_source
 where name like 'SYS_PLSQL_6174CDA6%' or name like 'SYS_PLSQL_750F00%'
 order by name, line;


it shows the mapping between the types defined in LC_PIN# and the newly generated types:

 TEST SYS_PLSQL_6174CDA6_21_1     for t_vc
 TEST SYS_PLSQL_6174CDA6_31_1     for t_vc_tab (table of SYS_PLSQL_6174CDA6_21_1)
 TEST SYS_PLSQL_6174CDA6_9_1      for t_dba_row_tab (table of SYS_PLSQL_750F00_462_1)
 TEST SYS_PLSQL_6174CDA6_DUMMY_1  for index table of SYS_PLSQL_6174CDA6_31_1
 SYS  SYS_PLSQL_750F00_462_1      for sys.dba_tables%rowtype
 SYS  SYS_PLSQL_750F00_DUMMY_1    for index table of SYS_PLSQL_6174CDA6_9_1


Now if we manually drop the generated SYS type (in real applications, this is done by the CLEANUP_NON_EXIST_OBJ task; see later discussion):

 SQL > drop type SYS.SYS_PLSQL_750F00_462_1 force;

SYS_PLSQL_750F00_462_1 is no longer registered in dba_objects, but it is still retained in sys.obj$.

 select * from sys.obj$
 where mtime > sysdate -10/1440
 order by mtime;


In sys.obj$, however, it is altered from type# 13 (TYPE) to a type# 10 object (also called a NON-EXISTENT object in Oracle).
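
The change can be checked directly with a query like this (a sketch; type# 13 is TYPE, type# 10 is NON-EXISTENT):

 select obj#, name, type#, mtime
   from sys.obj$
  where name = 'SYS_PLSQL_750F00_462_1';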

SYS_PLSQL_6174CDA6_9_1 is a table of the dropped SYS_PLSQL_750F00_462_1, so it becomes invalid. Try to recompile it:

 SQL > alter type test.sys_plsql_6174cda6_9_1 compile;
   Warning: Type altered with compilation errors.

 SQL > show error
   Errors for TYPE TEST.SYS_PLSQL_6174CDA6_9_1:
   LINE/COL ERROR
   -------- -----------------------------------------------------------------
   0/0      PL/SQL: Compilation unit analysis terminated
   1/46     PLS-00201: identifier 'SYS.SYS_PLSQL_750F00_462_1' must be
            declared


The invalid SYS_PLSQL_6174CDA6_9_1 is used in the package body; try to compile the body:

 SQL > alter package lc_pin# compile body;
   Warning: Package Body altered with compilation errors.

 SQL > show error
   Errors for PACKAGE BODY LC_PIN#:
   LINE/COL ERROR
   -------- -----------------------------------------------------------------
   25/8     PL/SQL: ORA-04020: deadlock detected while trying to lock object
            TEST.SYS_PLSQL_6174CDA6_31_1
   25/8     PL/SQL: SQL Statement ignored
   33/8     PL/SQL: Statement ignored
   33/17    PLS-00364: loop index variable 'C' use is invalid


where Line 25 (see attached TestCode) is
  with sq as (select * from table(foo))

Now SYS_PLSQL_6174CDA6_9_1 and the LC_PIN# package body are invalid, but the LC_PIN# package spec is still valid.

The above body compilation took a few seconds. If we open a new SQL*Plus session and query v$wait_chains during the compilation:

 select chain_signature, to_char(p1, 'xxxxxxxxxxxxxxxxxxxx') p1, p1_text,
    to_char(p2, 'xxxxxxxxxxxxxxxxxxxxxxxx') p2, p2_text,
    to_char(p3, 'xxxxxxxxxxxxxxxxx') p3, p3_text,
       in_wait_secs, time_remaining_secs
 from v$wait_chains;


it shows CHAIN_SIGNATURE: 'library cache lock' (cycle) (v$session.event: 'library cache lock', see later discussion):

CHAIN_SIGNATURE          'library cache lock' (cycle)
P1  (handle address)     15ab8f290
P2  (lock address)       15f2189a8
P3  (100*mode+namespace) 585a300010003
IN_WAIT_SECS             1
TIME_REMAINING_SECS      898


It also generates a trace file, which contains the line:
     ORA-04020: deadlock detected while trying to lock object TEST.SYS_PLSQL_6174CDA6_31_1

(In 11gR2, they are "library cache pin" and "ORA-04021: timeout occurred while waiting to lock object".
see Blog: Oracle 11gR2 single session "library cache pin")

A quick workaround is to recompile the package spec even though it is still valid:

     alter package lc_pin# compile;

It re-compiled SYS_PLSQL_6174CDA6_9_1 (TYPE) and LC_PIN# (PACKAGE BODY), but not LC_PIN# (PACKAGE).

After the re-compilation, all objects are valid and we can run the query:

   select * from table(lc_pin#.soo);    

And watch object dependencies currently loaded in the shared pool by:

 select (select to_name from v$object_dependency where to_hash = d.from_hash and rownum=1) from_name
       ,(select sql_text from v$sql where hash_value = d.from_hash) sql_text
       ,d.*
 from v$object_dependency d
 where to_name like 'SYS_PLSQL_6174CDA6%' or to_name like 'SYS_PLSQL_750F00%' or to_name = 'LC_PIN#'
 order by to_name;


The Oracle view v$object_dependency has a to_name column, but no from_name column; we can add two from_name columns with this query:

with sq as (select hash_value, name, type from v$db_object_cache v where name = 'LC_PIN#')
select (select name||' ('||type||')' from v$db_object_cache where hash_value = from_hash and rownum=1) from_name_1
      ,(select to_name from v$object_dependency where to_hash = d.from_hash and rownum=1) from_name_2
      ,(select substr(sql_text, 1, 20) from v$sql where hash_value = d.from_hash and rownum=1) sql_text
      ,d.*
  from v$object_dependency d
 where from_hash in (select hash_value from sq) or to_hash in (select hash_value from sq) 
 order by from_hash, to_name;

Reasoning:


The problem seems to be caused by the "with" factoring clause (see TestCode, function soo in package body lc_pin#).

When Oracle parses the "with" factoring clause, it acquires a "library cache pin" in Share mode (S) on the dependent objects, in this case "t_vc_tab". It then proceeds to the main clause, where it realizes that the dependent object "t_dba_row_tab" is invalid. To resolve this invalidity, Oracle attempts to recompile the package spec, which requests Exclusive mode (X) on the related objects.

Since the already held mode (S) on "t_vc_tab" is not compatible with the requested mode (X), the Oracle session generates a dump:

 A deadlock among DDL and parse locks is detected.
 ORA-04020: deadlock detected while trying to lock object TEST.SYS_PLSQL_6174CDA6_31_1

 Short stack dump of current process:
  ksedsts()+244<-ksfdmp()+59<-kgllkde()+742<-kglLockWait()+1216<-kgllkal()+1493<-kglLock()+1533
  <-kglget()+354<-kokt_gtyp()+1195<-koktgdptt()+1678<-koktsdcu()+327<-koktdrp()+3321<-opiexe()+23402

 --------------------------------------------------------
   object   waiting  waiting       blocking blocking
   handle   session     lock mode   session     lock mode
 --------  -------- -------- ----  -------- -------- ----
 15ab8f290  18fbfb3c0 15f2189a8    X  18fbfb3c0 165dbbe28    S


 ------------- WAITING LOCK -------------
 SO: 0x15f2189a8, type: 96, owner: 0x180658498, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
  LibraryObjectLock:  Address=15f2189a8 Handle=15ab8f290 RequestMode=X
   CanBeBrokenCount=9 Incarnation=7 ExecutionCount=0
   User=18fbfb3c0 Session=18fbff560 ReferenceCount=0
   Flags=[0000] SavepointNum=2043e
   LibraryHandle:  Address=15ab8f290 Hash=f71a33e4 LockMode=S PinMode=S LoadLockMode=0 Status=VALD

 ------------- BLOCKING LOCK ------------
 SO: 0x165dbbe28, type: 96, owner: 0x15f102fe0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
  LibraryObjectLock:  Address=165dbbe28 Handle=15ab8f290 Mode=S
   CallPin=155fbeed8 CanBeBrokenCount=9 Incarnation=7 ExecutionCount=0
   User=18fbfb3c0 Session=18fbfb3c0 ReferenceCount=1
   Flags=CNB/PNC/[0003] SavepointNum=203a9
  LibraryHandle:  Address=15ab8f290 Hash=f71a33e4 LockMode=S PinMode=S LoadLockMode=0 Status=VALD


If we quickly select on v$wait_chains by:

 select chain_signature, to_char(p1, 'xxxxxxxxxxxxxxxxxxxx') p1, p1_text,
    to_char(p2, 'xxxxxxxxxxxxxxxxxxxxxxxx') p2, p2_text,
    to_char(p3, 'xxxxxxxxxxxxxxxxx') p3, p3_text,
       in_wait_secs, time_remaining_secs
 from v$wait_chains;


We got:

CHAIN_SIGNATURE          'library cache lock' (cycle)
P1  (handle address)     15ab8f290
P2  (lock address)       15f2189a8
P3  (100*mode+namespace) 585a300010003
IN_WAIT_SECS             1
TIME_REMAINING_SECS      898


Although time_remaining_secs is 898 seconds (the total wait time is about 900), the above row disappeared after 9 seconds,
probably because the session had already generated the dump. The time_remaining_secs is still calculated with the 15-minute "library cache pin" timeout of 11gR2 (Blog: Oracle 11gR2 single session "library cache pin").

The total wait time of 900 seconds (15 minutes) seems to come from the default setting of "_kgl_time_to_wait_for_locks" (time in minutes, modification requires an instance restart), which controls the time to wait for locks and pins before timing out (see Blog: One Oracle AQ Deadlock). But in Oracle 12c, the 15 minutes (900 seconds) is only displayed; it is no longer really honoured.

A further query:

  select (select kglnaobj||'('||kglobtyd||')'
            from x$kglob v 
           where kglhdadr = object_handle and rownum=1) kglobj_name
         ,v.*
  from v$libcache_locks v
  where v.holding_user_session  =
           (select saddr from v$session
             where event ='library cache lock' and rownum = 1)
    and object_handle in (select object_handle from v$libcache_locks where mode_requested !=0)
  order by kglobj_name, holding_user_session, type, mode_held, mode_requested;


shows there are two rows on SYS_PLSQL_6174CDA6_31_1 (TYPE) with TYPE: LOCK.
In the first row, HOLDING_USER_SESSION and HOLDING_SESSION are different, and it has
 MODE_REQUESTED = 3 (Exclusive mode).

COLUMN               ROW 1                          ROW 2
KGLOBJ_NAME          SYS_PLSQL_6174CDA6_31_1(TYPE)  SYS_PLSQL_6174CDA6_31_1(TYPE)
TYPE                 LOCK                           LOCK
ADDR                 15F2189A8                      165DBBE28
HOLDING_USER_SESSION 18FBFB3C0                      18FBFB3C0
HOLDING_SESSION      18FBFF560                      18FBFB3C0
OBJECT_HANDLE        15AB8F290                      15AB8F290
LOCK_HELD            0                              155FBEED8
REFCOUNT             0                              1
MODE_HELD            0                              2
MODE_REQUESTED       3                              0
SAVEPOINT_NUMBER     132158                         132009
CON_ID               0                              0
(some leading ZEROs are removed)

From the query result, we can see that the HOLDING_USER_SESSION already holds the lock in mode 2 (Share), but at the same time designates a different recursive session to request the lock in mode 3 (Exclusive). The column SAVEPOINT_NUMBER seems to record the sequence of the lock GET (132009) and REQUEST (132158): first the GET, then the REQUEST (132009 < 132158).

Oracle throws such a deadlock since both the GET and the REQUEST originate from the same HOLDING_USER_SESSION.

Cross-checking with the above dump file under "WAITING LOCK", we can see:
    User=18fbfb3c0 Session=18fbff560
where HOLDING_USER_SESSION (18fbfb3c0) is different from HOLDING_SESSION (18fbff560), whereas under "BLOCKING LOCK" both are the same (18fbfb3c0).

SavepointNum is hex: 2043e (decimal 132158), and 203a9 (decimal 132009).
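
The hex-to-decimal conversion can be done directly in SQL, for example:

 select to_number('2043e', 'xxxxx') waiting_savepointnum
       ,to_number('203a9', 'xxxxx') blocking_savepointnum
   from dual;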

In Oracle, the HOLDING_USER_SESSION is the session we see in v$session, whereas the HOLDING_SESSION is the recursive session when the two are not the same. Normally a recursive session is spawned when the HOLDING_USER_SESSION requires "SYS" user privileges to perform certain tasks.

By the way, recursive sessions are not exposed in v$session because of the filter predicate:
    bitand("s"."ksuseflg",1)<>0
on x$ksuse, where bitand(s.ksuseflg, 19) =
    1  for a 'USER' session,
    17 for 'BACKGROUND',
    2  for 'RECURSIVE'.

(Tanel has a Blog on recursive session)
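
To actually see recursive sessions, one can query x$ksuse directly and bypass the v$session filter (a sketch; run as SYS):

 select indx sid, addr saddr
       ,decode(bitand(ksuseflg, 19), 1, 'USER', 17, 'BACKGROUND', 2, 'RECURSIVE') session_type
   from x$ksuse
  where bitand(ksuseflg, 19) = 2;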

One more point: in 11.2.0.3.0, the names of the generated types are composed from the DBA_OBJECTS.object_id of lc_pin# and dba_tables, but in 12c they are named with "6174CDA6" and "750F00". It is not clear to me where these come from.

Repeating the same test in 11.2.0.4.0, the behavior is the same as in 12c, but the naming convention is similar to 11.2.0.3.0.
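
For comparison, the object_id of the two involved objects can be listed in decimal and in hex (a sketch; whether the 12c names are actually derived from these values remains, as said, unclear):

 select owner, object_name, object_type, object_id
       ,to_char(object_id, 'fmXXXXXXXX') object_id_hex
   from dba_objects
  where object_name in ('LC_PIN#', 'DBA_TABLES');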

TYPE Dropping


In the above discussion, we manually dropped the type to force the invalidation by:

 drop type SYS.SYS_PLSQL_750F00_462_1 force;

Before upgrade to 12c, the query in 11gR2:

 select owner, object_name, object_type
 from dba_objects
 where (object_name like 'LC_PIN#'
     or object_name like 'SYS_PLSQL_1349946%' or object_name like 'SYS_PLSQL_3238%'
     or object_name like 'SYS_PLSQL_6174CDA6%' or object_name like 'SYS_PLSQL_750F00%')
 order by object_name;


returns:

 TEST LC_PIN#                    PACKAGE BODY
 TEST LC_PIN#                    PACKAGE
 TEST SYS_PLSQL_1349946_21_1     TYPE
 TEST SYS_PLSQL_1349946_31_1     TYPE
 TEST SYS_PLSQL_1349946_9_1      TYPE
 TEST SYS_PLSQL_1349946_DUMMY_1  TYPE
 SYS  SYS_PLSQL_3238_382_1       TYPE
 SYS  SYS_PLSQL_3238_DUMMY_1     TYPE


and after the upgrade to 12c, it returns:

 TEST LC_PIN#                     PACKAGE BODY
 TEST LC_PIN#                     PACKAGE
 TEST SYS_PLSQL_6174CDA6_21_1     TYPE
 TEST SYS_PLSQL_6174CDA6_31_1     TYPE
 TEST SYS_PLSQL_6174CDA6_9_1      TYPE
 TEST SYS_PLSQL_6174CDA6_DUMMY_1  TYPE
 SYS  SYS_PLSQL_750F00_462_1      TYPE
 SYS  SYS_PLSQL_750F00_DUMMY_1    TYPE


so all the TYPES in 12c are newly created, and the TYPES in 11gR2 were dropped during upgrade.

Looking at the columns of DBA_TABLES:
  select * from dba_tab_cols where table_name = 'DBA_TABLES' order by column_name; 

we can see that it has grown to 65 columns in 12c from 55 in 11gR2. The query:
  select * from dba_objects where object_name = 'DBA_TABLES';
shows its creation time and last_ddl_time.

CLEANUP_NON_EXIST_OBJ Task


Oracle 12c has introduced a few CLEANUP JOBs. We can see them by:

 select * from dba_scheduler_jobs
 where job_name like 'CLEANUP%';
  

output:

 CLEANUP_NON_EXIST_OBJ
 CLEANUP_ONLINE_IND_BUILD
 CLEANUP_ONLINE_PMO
 CLEANUP_TAB_IOT_PMO
 CLEANUP_TRANSIENT_PKG
 CLEANUP_TRANSIENT_TYPE


Looking at the job CLEANUP_NON_EXIST_OBJ, its comments column says:

 Cleanup Non Existent Objects in obj$

and the job_action Column:

 declare
   myinterval   number;
 begin
   myinterval := dbms_pdb.cleanup_task (1);
   if myinterval <> 0
   then
     next_date := systimestamp + numtodsinterval (myinterval, 'second');
   end if;
 end;


If we run the above block, the NON-EXISTENT object, in this case,
        SYS.SYS_PLSQL_750F00_462_1,
will be removed.
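
A simple verification (a sketch): after running the block, the type# 10 row should be gone from sys.obj$.

 select count(*) cnt
   from sys.obj$
  where name = 'SYS_PLSQL_750F00_462_1' and type# = 10;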

Opening the package SYS.dbms_pdb, it is documented as:

        The following routine is related to operations done in SMON
        until 11.2. But, with the introduction of PDBs in 12c and with
        the possibility of having multiple PDBs in a single
        CDB, we want to move this cleanup out of SMON so that SMON is not
        overloaded with work that can be done in some other background process.
        The goal is to move everything except transaction recovery out of SMON.


        cleanup_task - cleanup task previously done in SMON
        This procedure performs cleanup task previously done in SMON


It looks like the type was dropped by SMON in 11gR2; in 12c, the cleanup is performed by a dedicated JOB.

But in one 12c instance crash, the SMON dump contained a section on its 20 most recent activities:

  SMON's recent activities: (most recent first)
    Activity  1:              Active transaction recovery -  started 2205" ago, not ended 
    Activity  2:                  Doing Sort Segment work -  0" (started 2205" ago)
    ...
    Activity  9:                         Coalesce extents -  1" (started 2221" ago)
    Activity 10:                      Auto-tuning for IMU -  0" (started 2221" ago)
    Activity 11:                        scn-->time update -  0" (started 2221" ago)    
    ...
    Activity 20:        read afn from prewarm fet channel -  0" (started 2521" ago)
  End -- SMON's most recent 20 activities.
in which we can see 10 SMON activities:
  Active transaction recovery
  Auto-tuning for IMU
  Coalesce extents
  Doing Sort Segment work
  Launch background space manager
  Offline undo segments
  read afn from prewarm fet channel
  Read from broadcast channel
  read from segment info channel
  scn-->time update
So SMON is not only performing active transaction recovery, but also (at least) 9 other activities. It is not clear how to reconcile the above SYS.dbms_pdb.cleanup_task package description with the real SMON dump. (In general, both sources are highly reliable, since the description should have been written by the developers, and the SMON dump reports activities that were really performed.)

There are also other new 12c JOBs:

 select * from dba_scheduler_jobs where owner = 'SYS';

For example, FILE_SIZE_UPD, whose comments column says "Update file size periodically", and whose job_action is:

 declare
   myinterval   number;
 begin
   myinterval := dbms_pdb.cleanup_task (7);
   if myinterval <> 0
   then
     next_date := systimestamp + numtodsinterval (myinterval, 'second');
   end if;
 end;



TestCode


 -- This test is with dba_tables.
 -- It is also reproducible with dba_segments, dba_objects, dba_indexes.

drop package lc_pin#;

create or replace package lc_pin#
 as
   type t_dba_row_tab is table of sys.dba_tables%rowtype; 
   type t_vc          is record (name varchar2(30));
   type t_vc_tab      is table of t_vc;
    
   function foo return t_vc_tab pipelined;
   function koo return t_dba_row_tab pipelined;
   function soo return t_dba_row_tab pipelined;
 end lc_pin#;
 /

create or replace package body lc_pin#
 as

  function foo return t_vc_tab pipelined
   is
     l_result  t_vc;
   begin
     l_result.name     := 'lc_test';
     pipe row(l_result);
     return;
   end foo;  
   
  function koo return t_dba_row_tab pipelined
   is
   begin
     for c in (select * from dba_tables where  rownum = 1) loop
       pipe row(c);
     end loop;
   end koo;
  
  function soo return t_dba_row_tab pipelined
   is
   begin
     for c in (
       with sq as (select * from table(foo))    -- Line 25
       select nt.*
      from   sq 
            ,(select * from table(koo)) nt
      
      -- following re-write works
      -- select nt.* from (select * from table(foo)) sq, (select * from table(koo)) nt
     ) loop
       pipe row(c);                 -- Line 33
     end loop;
   end soo;
  
end lc_pin#;
/

Friday, September 25, 2015

SQL DML Exceptions, Rollbacks and PL/SQL Exception Handlers

When a SQL DML statement hits a runtime error, Oracle rolls back only the last DML statement, the one that caused the error. PL/SQL exception handling can make use of this behavior to save the work that was not rolled back and keep the application running. Oracle is optimized to roll back the entire transaction if there is only one single DML statement inside the transaction.

Oracle "Database PL/SQL Language Reference" Section: "Retrying Transactions After Handling Exceptions" said:

   To retry a transaction after handling an exception that it raised, use this technique:
    ......
   If the transaction fails, control transfers to the exception-handling part of the sub-block, and after the exception handler runs, the loop repeats.

which interprets exception handling at the transaction level. There, the example contains two DML statements (DELETE and INSERT), and the rollback is controlled by a savepoint.
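
For reference, a minimal sketch of that documented retry technique could look as follows (the table my_t and the retry limit of 3 are hypothetical, not part of the documentation example):

declare
  l_tries pls_integer := 0;
begin
  loop
    begin
      savepoint sp;
      -- the "transaction": two DML statements that must succeed or fail together
      delete from my_t where id = 1;
      insert into my_t (id) values (1);
      exit;                        -- success: leave the retry loop
    exception when others then
      rollback to sp;              -- undo both statements of this attempt
      l_tries := l_tries + 1;
      exit when l_tries >= 3;      -- give up after 3 attempts
    end;
  end loop;
  commit;
end;
/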

First, pick the test code from the book Oracle PL/SQL Programming (5th edition, Steven Feuerstein), Chapter 14, Section "DML and Exception Handling" (page 446), and add some extra lines:
   
drop table books;

create table books (book_id integer);

create or replace function tabcount return pls_integer is
   l_return pls_integer;
begin
   select count (*) into l_return from books;
   return l_return;
end tabcount;
/

create or replace procedure empty_library (pre_empty_count out pls_integer) is
begin
   pre_empty_count := tabcount ();
   dbms_output.put_line ('empty_library pre_empty_count='||pre_empty_count);
   dbms_output.put_line ('empty_library tabcount S1='||tabcount());
   delete from books where book_id=1;
   dbms_output.put_line ('empty_library tabcount S2='||tabcount());
   raise no_data_found;
end;
/

Run first test with Exception Handler:

set serveroutput on;

truncate table books;

declare
  table_count number := -1;
begin
  insert into books (book_id) values (1);
  insert into books (book_id) values (2);
  dbms_output.put_line ('tabcount S1='||tabcount());
  empty_library (table_count);
  exception when others then
    dbms_output.put_line ('tabcount S2='||tabcount());
    dbms_output.put_line ('table_count S3='||table_count);
end;
/

select * from books;

The output looks like:

tabcount S1=2
empty_library pre_empty_count=2
empty_library tabcount S1=2
empty_library tabcount S2=1
tabcount S2=1
table_count S3=-1

SQL> select * from books;
   BOOK_ID
----------
         2   

Run another test without Exception Handler:

set serveroutput on;

truncate table books;

declare
  table_count number := -1;
begin
  insert into books (book_id) values (1);
  insert into books (book_id) values (2);
  dbms_output.put_line ('tabcount S1='||tabcount());
  empty_library (table_count);
end;
/

select * from books;

The output is:

tabcount S1=2
empty_library pre_empty_count=2
empty_library tabcount S1=2
empty_library tabcount S2=1
ORA-01403: no data found
ORA-06512: at "K.EMPTY_LIBRARY", line 8

SQL> select * from books;
no rows selected  

Reading text in Page 446:

  When an exception occurs in a PL/SQL block, the Oracle database does not roll back
  any of the changes made by DML statements in that block.


and Page 447:

  If an exception propagates past the outermost block (i.e., it goes “unhandled”),
  then in most host execution environments for PL/SQL like SQL*Plus, a rollback is
  automatically executed, reversing any outstanding changes.


Cross-checking the above two tests, we can see that SQL*Plus does not roll back if there is an error handler,
and rolls back to the beginning of the block if there is no error handler (unhandled exceptions).

Relevant information can also be found in Page 145 about Unhandled Exceptions:

  If an exception is raised in your program, and it is not handled by an exception section
  in either the current or enclosing PL/SQL blocks, that exception is unhandled. PL/SQL
  returns the error that raised the unhandled exception all the way back to the application
  environment from which PL/SQL was run. That environment (a tool like SQL*Plus,
  Oracle Forms, or a Java program) then takes an action appropriate to the situation; in
  the case of SQL*Plus, a ROLLBACK of any DML changes from within that top-level
  block’s logic is automatically performed.


and Page 137,

  When this procedure(RAISE_APPLICATION_ERROR) is run, execution of the current PL/SQL block halts immediately,
  and any changes made to OUT or IN OUT arguments (if present and without the NOCOPY hint) will be reversed.

The book Expert Oracle Database Architecture (3rd edition, Thomas Kyte, Darl Kuhn), Chapter 8, Section "Atomicity" (pages 277-283), explains the principle of statement-level atomicity and mimics the work Oracle normally does with a SAVEPOINT:

 Savepoint sp;
 statement;
 If error then rollback to sp;


It further shows that Oracle considers Procedure-Level Atomicity (PL/SQL anonymous blocks) to be statements as well.

This Blog will try to show how to apply the above principle to PL/SQL blocks with exception handlers in order to keep the application from being interrupted.

We will run 5 test cases to demonstrate such statement-level rollbacks. When an error occurs, it is not the transaction that fails, but only the statement.

All test code is appended at the end of the Blog.

1. undo_tbs_test_1


Run following code:

  set serveroutput on lines=200
  select n.name, s.value from v$mystat s, v$statname n
  where s.statistic#=n.statistic#
    and name in ('user commits', 'transaction rollbacks', 'rollback changes - undo records applied');
  select id, ts, step from test_t1 where id <= 2;

  exec undo_tbs_test_1;

  select n.name, s.value from v$mystat s, v$statname n
  where s.statistic#=n.statistic#
    and name in ('user commits', 'transaction rollbacks', 'rollback changes - undo records applied');
  select id, ts, step from test_t1 where id <= 2;


The output looks as follows:

  NAME                                             VALUE
  ------------------------------------------- ----------
  user commits                                       685
  rollback changes - undo records applied         147854
  transaction rollbacks                              126

  ID TS                                     STEP
  -- -------------------------------- ----------
   1 22-SEP-2015 07:15:01                      0
   2 22-SEP-2015 07:15:01                      0

  ORA-30036: unable to extend segment by 8 in undo tablespace 'TEMPUNDO', at Step=1
  ORA-30036: unable to extend segment by 8 in undo tablespace 'TEMPUNDO', at Step=2
  ORA-30036: unable to extend segment by 8 in undo tablespace 'TEMPUNDO', at Step=3
  End committed, at Step=4

  NAME                                            VALUE
  ------------------------------------------ ----------
  user commits                                      685
  rollback changes - undo records applied        151098
  transaction rollbacks                             129

  ID TS                                   STEP
  -- ------------------------------ ----------
   1 22-SEP-2015 07:15:01                    0
   2 22-SEP-2015 07:15:01                    0


we can see 3 (129-126) "transaction rollbacks", no user commits (685-685), and hence no rows updated.

2. undo_tbs_test_2


Run following code:

  update test_t1 set acc = null;
  commit;
  select n.name, s.value from v$mystat s, v$statname n
  where s.statistic#=n.statistic#
    and name in ('user commits', 'transaction rollbacks', 'rollback changes - undo records applied');
  select count(*) from test_t1 where acc is null;

  exec undo_tbs_test_2;

  select n.name, s.value from v$mystat s, v$statname n
  where s.statistic#=n.statistic#
    and name in ('user commits', 'transaction rollbacks', 'rollback changes - undo records applied');
  select count(*) from test_t1 where acc is null;


The output looks as follows:

  NAME                                                    VALUE
  -------------------------------------------------- ----------
  user commits                                              707
  rollback changes - undo records applied                153199
  transaction rollbacks                                     129

  sql%rowcount=50 Updated_1, at Step=1
  ORA-30036: unable to extend segment by 8 in undo tablespace 'TEMPUNDO', at Step=1
  sql%rowcount=50 Updated_1, at Step=2
  ORA-30036: unable to extend segment by 8 in undo tablespace 'TEMPUNDO', at Step=2
  ......
  ORA-30036: unable to extend segment by 8 in undo tablespace 'TEMPUNDO', at Step=18
  sql%rowcount=50 Updated_1, at Step=19
  ORA-30036: unable to extend segment by 8 in undo tablespace 'TEMPUNDO', at Step=19
  sql%rowcount=50 Updated_1, at Step=20
  sql%rowcount=0 Updated_1, at Step=20
  End committed, at Step=20

  NAME                                                    VALUE
  -------------------------------------------------- ----------
  user commits                                              727
  rollback changes - undo records applied                154923
  transaction rollbacks                                     129

  sql> select count(*) from test_t1 where acc is null;
     COUNT(*)
   ----------
            0


we can see 20 user commits (727-707), no "transaction rollbacks" (129-129), and all rows updated. From the application's point of view, all the updates are performed successfully, even with a small UNDO tablespace that does not meet the one-run requirement.

This could be used as a workaround when the UNDO tablespace is limited.

3. unique_constraint_test_1


Run following code:

  update test_t1 set id = rownum;
  commit;
  select n.name, s.value from v$mystat s, v$statname n
  where s.statistic#=n.statistic#
    and name in ('user commits', 'transaction rollbacks', 'rollback changes - undo records applied');

  exec unique_constraint_test_1;

  select n.name, s.value from v$mystat s, v$statname n
  where s.statistic#=n.statistic#
    and name in ('user commits', 'transaction rollbacks', 'rollback changes - undo records applied');


The output looks as follows:

  NAME                                                    VALUE
  -------------------------------------------------- ----------
  user commits                                              728
  rollback changes - undo records applied                154923
  transaction rollbacks                                     129

  ORA-00001: unique constraint (K.TEST_T1_PK) violated, at Step=1
  ORA-00001: unique constraint (K.TEST_T1_PK) violated, at Step=2
  ORA-00001: unique constraint (K.TEST_T1_PK) violated, at Step=3
  End committed, at Step=4

  NAME                                                    VALUE
  -------------------------------------------------- ----------
  user commits                                              728
  rollback changes - undo records applied                157962
  transaction rollbacks                                     132


Again we see 3 (132-129) "transaction rollbacks", no user commits (728-728), and hence no rows updated.

4. unique_constraint_test_2


Run following code:

 update test_t1 set id = rownum, step = 0, acc = 0;
 commit;
 select id, ts, step, acc from test_t1
  where id in (1, 900, 901, 1000, -1, -900, -901, 1000) order by step, acc, id;
 select n.name, s.value from v$mystat s, v$statname n
 where s.statistic#=n.statistic#
   and name in ('user commits', 'transaction rollbacks', 'rollback changes - undo records applied');

 exec unique_constraint_test_2;

 select n.name, s.value from v$mystat s, v$statname n
 where s.statistic#=n.statistic#
   and name in ('user commits', 'transaction rollbacks', 'rollback changes - undo records applied'); 
 select id, ts, step, acc from test_t1
  where id in (1, 900, 901, 1000, -1, -900, -901, 1000) order by step, acc, id;


The output looks as follows:

   ID TS                               STEP     ACC
 ---- -------------------------- ---------- -------
    1 22-SEP-2015 07:35:13                0       0
  900 22-SEP-2015 07:35:13                0       0
  901 22-SEP-2015 07:35:13                0       0
 1000 22-SEP-2015 07:35:13                0       0

 NAME                                           VALUE
 ----------------------------------------- ----------
 user commits                                     742
 rollback changes - undo records applied       163307
 transaction rollbacks                            145

 sql%rowcount=100 Updated_1, at Step=1
 sql%rowcount=100 Updated_1, at Step=1
 sql%rowcount=100 Updated_1, at Step=1
 sql%rowcount=100 Updated_1, at Step=1
 sql%rowcount=100 Updated_1, at Step=1
 sql%rowcount=100 Updated_1, at Step=1
 sql%rowcount=100 Updated_1, at Step=1
 sql%rowcount=100 Updated_1, at Step=1
 sql%rowcount=100 Updated_1, at Step=1
 ORA-00001: unique constraint (K.TEST_T1_PK) violated, at Step=1
 ORA-00001: unique constraint (K.TEST_T1_PK) violated, at Step=2
 ORA-00001: unique constraint (K.TEST_T1_PK) violated, at Step=3
 End committed, at Step=4

 NAME                                          VALUE
 ---------------------------------------- ----------
 user commits                                    743
 rollback changes - undo records applied      164201
 transaction rollbacks                           147

   ID TS                              STEP     ACC
 ---- ------------------------- ---------- -------
  901 22-SEP-2015 07:35:13               0       0
 1000 22-SEP-2015 07:35:13               0       0
   -1 22-SEP-2015 07:35:13               1       1
 -900 22-SEP-2015 07:35:13               1       9


we can see 1 user commit (743-742), 2 "transaction rollbacks" (147-145), 900 rows updated, and 100 rows not updated.

5. deadlock_test


Restore test table:

 column name format a50
 update test_t1 set id = rownum, name = null, acc = 0;

 commit;


Open 3 sessions and run the following 3 scripts in the 3 different sessions sequentially at times T1, T2, and T3:

 exec deadlock_s1;  --Session_1_T1

 exec deadlock_s2;  --Session_2_T2

 exec deadlock_s3;  --Session_3_T3


Check the updates by:

 select id, ts, name, acc from test_t1 where id <= 3 order by id; 
 ID TS                         NAME                                ACC
 -- -------------------------- -------------------------------- ----------
  1 22-SEP-2015 07:35:13       Session_1_T1/Session_2_T2/                2
  2 22-SEP-2015 07:35:13       Session_2_T2/                         1
  3 22-SEP-2015 07:35:13       Session_1_T1/Session_3_T3/                2


All sessions terminate in about 120 seconds, and Session_1 throws the exception:
 ORA-00060: deadlock detected while waiting for resource, in Session_1_T1/

Row 1 was updated twice (Session_1 and Session_2), Row 3 was updated twice (Session_1 and Session_3),
and Row 2 was updated once (Session_2).

In fact, Row 2 was also updated once by Session_1, but that update hit the deadlock error and was rolled back. However, the other two row updates (Rows 1 and 3) by Session_1 were kept, and thus committed.

Session_3 is blocked by Session_1, so no transaction can be started (see v$transaction).
It will wait until Session_1's transaction terminates.

TestCode


--ensure no transaction is alive on TEMPUNDO and the undo_retention of all DML terminated on TEMPUNDO has expired.
drop tablespace tempundo;


create undo tablespace tempundo
datafile '/testdb/undo/tempundo.dbf'
size 8m reuse autoextend off retention noguarantee
/


alter system set undo_tablespace = tempundo scope=both;
select name, value from v$parameter where name like '%undo%';
drop table test_t1;
create table test_t1
  (id number, ts timestamp default systimestamp, step number, name varchar2(4000), acc number);
alter table test_t1 add constraint test_t1_pk primary key (id);
insert into test_t1 select level, systimestamp, 0, lpad('x', 4000, 'y'), null
  from dual connect by level <= 1000;
commit;
select bytes from dba_segments where segment_name ='TEST_T1';  
--8'388'608

create or replace procedure undo_tbs_test_1 as
  l_step number := 0;
begin
  loop
    begin
      l_step := l_step + 1;
      exit when l_step > 3;  -- limit the number of retries. without it, endless loop.
      update test_t1 set step=l_step, name = lower(name) where id >= 2;
      dbms_output.put_line('sql%rowcount='||sql%rowcount||' Updated_2, at Step='||l_step);
      exit;
    exception when others then
      dbms_output.put_line(sqlerrm||', at Step='||l_step);
      commit;
    end;
  end loop;
  commit;
  dbms_output.put_line('End committed'||', at Step='||l_step);
end;
/


create or replace procedure undo_tbs_test_2 as
  l_step number := 0;
begin
  loop
    begin
      l_step := l_step + 1;
      exit when l_step > 20;
     
      for i in 1..3 loop
        update test_t1 set step=l_step, name = lower(name), acc=rownum

         where acc is null and rownum <=50;
        dbms_output.put_line('sql%rowcount='||sql%rowcount||' Updated_1, at Step='||l_step);
        exit when sql%rowcount=0;
      end loop;
     
      exit;
    exception when others then
      dbms_output.put_line(sqlerrm||', at Step='||l_step);
      commit;
    end;
  end loop;
  commit;
  dbms_output.put_line('End committed'||', at Step='||l_step);
end;
/


create or replace procedure unique_constraint_test_1 as
  l_step number := 0;
begin
  loop
    begin
      l_step := l_step + 1;
      exit when l_step > 3;   -- limit the number of retries. without it, endless loop.
      update test_t1 set id = - mod(id, 998);
      dbms_output.put_line('sql%rowcount='||sql%rowcount||' Updated_1, at Step='||l_step);
      exit;
    exception when others then
      dbms_output.put_line(sqlerrm||', at Step='||l_step);
      commit;
    end;
  end loop;
  commit;
  dbms_output.put_line('End committed'||', at Step='||l_step);
end;
/


create or replace procedure unique_constraint_test_2 as
  l_step number := 0;
begin
  loop
    begin
      l_step := l_step + 1;
      exit when l_step > 3;
     
      for i in 1..10 loop
        update test_t1 set id = - mod(id, 998), step = l_step, acc = i where id > 0 and rownum <=100;
        dbms_output.put_line('sql%rowcount='||sql%rowcount||' Updated_1, at Step='||l_step);
        exit when sql%rowcount=0;
      end loop;
     
      exit;
    exception when others then
      dbms_output.put_line(sqlerrm||', at Step='||l_step);
      commit;
    end;
  end loop;
  commit;
  dbms_output.put_line('End committed'||', at Step='||l_step);
end;
/


create or replace procedure deadlock_s1 as
begin
  update test_t1 set name = name||'Session_1_T1/', acc = acc + 1 where id = 3; 

  -- updated, keep locked till committed
  update test_t1 set name = name||'Session_1_T1/', acc = acc + 1 where id = 1; 

  -- updated, keep locked till committed
  dbms_lock.sleep(60);
  update test_t1 set name = name||'Session_1_T1/', acc = acc + 1 where id = 2; 

  -- rollback due to deadlock
  commit;
exception when others then
  dbms_output.put_line(sqlerrm||', in Session_1_T1/');
  dbms_lock.sleep(60);
  commit; 
end;
/

create or replace procedure deadlock_s2 as
begin
  update test_t1 set name = name||'Session_2_T2/', acc = acc + 1 where id = 2; 

  -- updated, keep locked till committed
  dbms_lock.sleep(60);
  update test_t1 set name = name||'Session_2_T2/', acc = acc + 1 where id = 1; 

  -- updated, keep locked till committed
  commit;
exception when others then
  dbms_output.put_line(sqlerrm||', in Session_2_T2/');
  dbms_lock.sleep(60);
  commit;   
end;
/

create or replace procedure deadlock_s3 as
begin
  update test_t1 set name = name||'Session_3_T3/', acc = acc + 1 where id = 3;  
  -- blocked 120 seconds by Session_1, no transaction started, see v$transaction
  commit;
end;
/

Monday, September 14, 2015

Limit PGA Memory Usage

The following MOS notes list 2 measures to control and limit PGA memory usage, but neither is recommended by Oracle. The best option is to fix the application.

This Blog is a follow-up of previous Blog: ORA-04030 incident file and alert.log.

1.   Event 10261


Oracle MOS:
    ORA-00600 [723], [67108952], [pga heap] When Event 10261 Set To Limit The PGA Leak (Doc ID 1162423.1)

This event is useful for PGA memory leaks (and UGA leaks if the UGA is in the PGA). The event causes Oracle to raise an ORA-600 if the PGA tries to grow above the specified size.

In the pfile/spfile, for example, the event below:
  event = 10261 trace name context forever,level 3145728
enforces a 3 GB (3145728 KB) limit on the PGA size and replaces the ORA-4030 with an ORA-600 [723] error.

Let's run two tests (see the appended Test Code, tested with Oracle 11.2.0.3.0 on AIX):

alter system set event = "10261 trace name context forever,level 3145728" scope=spfile;
    -- DB must be restarted

SQL > exec pga_mem_test.allo(2*1024);
    -- 2GB allocation is OK
SQL > exec pga_mem_test.allo(4*1024);
    -- 4GB allocation throws error
   
ORA-00600: internal error code, arguments: [723], [65520], [top uga heap], [], [], [], [], [], [], [], [], []

Incident Dump shows:

 ORA-00600: internal error code, arguments: [723], [65520], [top uga heap], [], [], [], [], [], [], [], [], []
 
 ========= Dump for incident 16985 (ORA 600 [723]) ========
 ----- Beginning of Customized Incident Dump(s) -----
 ****** ERROR: PGA size limit exceeded in rfg: 3221284184 > 3221225472 *****
 ******************************************************

where the bottom limit is computed as:
   3221225472 = 3145728*1024
Update: In Oracle 19.13, the output looks like:

SQL > exec pga_mem_test.allo(4*1024);
BEGIN pga_mem_test.allo(4*1024); END;

*
ERROR at line 1:
ORA-04068: existing state of packages has been discarded
ORA-10260: PGA limit (3072 MB) exceeded - process terminated
ORA-06512: at "K.PGA_MEM_TEST", line 44
ORA-06512: at line 1

2.   Limit Parameters


Oracle MOS:
   PLSQL Procedure Causing ORA-04030: (pga heap,control file i/o buffer) And ORA-04030:
   (koh-kghu sessi,pmuccst: adt/record) or ORA-04030: (koh-kghucall ,pmucalm coll) Errors (Doc ID 1325100.1)

Either change the page count (memory map entries per process) limit at the OS level, or adjust the realfree heap pagesize at the database level.

Change the page count at the OS level:
     more /proc/sys/vm/max_map_count
  sysctl -w vm.max_map_count=262144 (for example)


Adjust the realfree heap pagesize within the database by setting the following parameters in the init/spfile and restarting the database.

For versions 11.2.0.4 and lower:
  _use_realfree_heap=TRUE
  _realfree_heap_pagesize_hint = 262144

For 12.1 and higher:
  _use_realfree_heap=TRUE
  _realfree_heap_pagesize = 262144

As a test with Oracle 11.2.0.4.0 on Linux with the following configuration:

Linux$ cat /proc/sys/vm/max_map_count
65530
 
Linux$ > free -mt
             total       used       free     shared    buffers     cached
Mem:         24160       5571      18588       1761        131       4257
-/+ buffers/cache:       1183      22977
Swap:            0          0          0
Total:       24160       5571      18588

The system has 24GB physical memory, and database SGA is 4GB.

First, limit the PGA per session to 65530*4K = 256M

SQL > alter system set "_realfree_heap_pagesize_hint"=4K scope=spfile;
   -- restart DB
SQL > exec pga_mem_test.allo(1024); 

the incident dump shows:      

Dump of Real-Free Memory Allocator Heap [0x7fd64003a000]
mag=0xfefe0001 flg=0x5000003 fds=0x6 blksz=4096
blkdstbl=0x7fd64003a010, iniblk=331776 maxblk=524288 numsegs=71
In-use num=65353 siz=1335451648, Freeable num=32 siz=155648, Free num=113 siz=2225766
...
******************* Dumping process map ****************
...
7f455e47e000-7f455e483000 rw-p 00000000 00:05 4219                       /dev/zero
...
 
where blksz=4096.

However, siz/num = 1335451648/65353 = 20434, which means that the average page size is about 20K instead of the specified 4K.
The entries in the process map also confirm it:
 7f455e47e000-7f455e483000
is in decimal:
 139935911239680-139935911260160
i.e. 20K.
Then, increase the PGA per-session limit to 65530*256K = 16G

SQL > alter system set "_realfree_heap_pagesize_hint"=256K scope=spfile;
   -- restart DB

Open 3 SQL*Plus sessions. In the first session, run:

SQL(162,55) > exec pga_mem_test.allo(1024*13, 120);

Wait 5 seconds; in the second session, run the query:

SQL(84,59) > set numformat 99,999,999,999
SQL(84,59) > select s.sid, s.program, p.pga_used_mem, p.pga_alloc_mem 
               from v$session s, v$process p
              where s.paddr=p.addr and p.pga_used_mem >1e9;

 SID PROGRAM        PGA_USED_MEM   PGA_ALLOC_MEM
---- ----------- --------------- ---------------
 162 sqlplus.exe  15,957,853,158  15,959,926,406

In the third session, run:

SQL(242,559) > exec pga_mem_test.allo(1024*13, 120);

Wait 5 seconds; in the second session, run the query again:

SQL(84,59) > select s.sid, s.program, p.pga_used_mem, p.pga_alloc_mem 
               from v$session s, v$process p
              where s.paddr=p.addr and p.pga_used_mem >1e9;
 
 SID PROGRAM        PGA_USED_MEM   PGA_ALLOC_MEM
---- ----------- --------------- ---------------
 242 sqlplus.exe  15,957,853,158  15,959,926,406 

Look again at the first session:

SQL(162,55) > exec pga_mem_test.allo(1024*13, 120);
ERROR:
ORA-03114: not connected to ORACLE
BEGIN pga_mem_test.allo(1024*13, 120); END;
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 19710
Session ID: 162 Serial number: 55

Checking the Oracle alert.log, trace, and incident files, there is nothing about the disconnected 1st session (Process ID: 19710, Session ID: 162, Serial number: 55).

Resorting to Linux dmesg, we can see:

[12:15:20] oracle invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
[12:15:20] oracle cpuset=/ mems_allowed=0
[12:15:20] Pid: 19555, comm: oracle Not tainted 2.6.32-642.el6.x86_64 #1  <--- Pid: 19555 PMON
[12:15:20] Call Trace:
[12:15:20] [<ffffffff81131640>] ? dump_header+0x90/0x1b0
[12:15:20] [<ffffffff8123c20c>] ? security_real_capable_noaudit+0x3c/0x70
[12:15:20] [<ffffffff81131ac2>] ? oom_kill_process+0x82/0x2a0
[12:15:20] [<ffffffff81131a01>] ? select_bad_process+0xe1/0x120
[12:15:20] [<ffffffff81131f00>] ? out_of_memory+0x220/0x3c0
[12:15:20] [<ffffffff8113e8dc>] ? __alloc_pages_nodemask+0x93c/0x950
[12:15:20] [<ffffffff81177b2a>] ? alloc_pages_current+0xaa/0x110
[12:15:20] [<ffffffff8112ea37>] ? __page_cache_alloc+0x87/0x90
[12:15:20] [<ffffffff8112e41e>] ? find_get_page+0x1e/0xa0
[12:15:20] [<ffffffff8112f9d7>] ? filemap_fault+0x1a7/0x500
[12:15:20] [<ffffffff81159394>] ? __do_fault+0x54/0x530
[12:15:20] [<ffffffff81159967>] ? handle_pte_fault+0xf7/0xb20
[12:15:20] [<ffffffff8122959c>] ? sem_lock+0x6c/0x130
[12:15:20] [<ffffffff8122b298>] ? sys_semtimedop+0x338/0xae0
[12:15:20] [<ffffffff8115a629>] ? handle_mm_fault+0x299/0x3d0
[12:15:20] [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
[12:15:20] [<ffffffff81052156>] ? __do_page_fault+0x146/0x500
[12:15:20] [<ffffffff810688ed>] ? thread_group_times+0x3d/0x120
[12:15:20] [<ffffffff81079b8e>] ? mmput+0x1e/0x120
[12:15:20] [<ffffffff8109c348>] ? getrusage+0x158/0x340
[12:15:20] [<ffffffff8154dbce>] ? do_page_fault+0x3e/0xa0
[12:15:20] [<ffffffff8154aed5>] ? page_fault+0x25/0x30
...
[12:15:20] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
[12:15:20] [19555]   100 19555  1168551      871   0       0             0 oracle  <--- PMON
...
[12:15:20] [19710]   100 19710  5072861  3995279   0       0             0 oracle  <--- 1st session
[12:15:20] [19713]   100 19713  2974770  1866502   5       0             0 oracle  <--- 3rd session
[12:15:20] [19715]   100 19715  1171449    60050   4       0             0 oracle  <--- 2nd session
...
[12:15:20] Out of memory: Kill process 19710 (oracle) score 647 or sacrifice child
[12:15:20] Killed process 19710, UID 100, (oracle) total-vm:20291444kB, anon-rss:15600524kB, file-rss:380592kB

The above log shows that the Linux OOM killer, invoked here in the context of Oracle PMON (Pid 19555), calls oom_kill_process() to kill the most memory-offending process (19710, the 1st session) so that memory can be freed for the newer process (19713, the 3rd session).

Process 19710 is killed due to its badness score of 647:
  Out of memory: Kill process 19710 (oracle) score 647 or sacrifice child

The baseline for the badness score is the proportion of RAM that each task's rss, pagetable and swap space use.

In Linux oom_kill.c, out_of_memory() calls select_bad_process() to find a process to be killed.
  If one is found, it is killed by oom_kill_process().
  If none is found, the system is panicked (halted, never returns) by:
     panic("Out of memory and no killable processes...\n");

and it is emphasized by the comment:

  out_of_memory - kill the "best" process when we run out of memory
  Found nothing?!?! Either we hang forever, or we panic.

The hidden parameter _pga_max_size does not limit a process's total size, only its work area memory.
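
For reference, the current values of such hidden parameters can be inspected (as SYS, since the X$ views are required) with the usual x$ksppi / x$ksppcv query; this is only a sketch:

 select i.ksppinm name, cv.ksppstvl current_value, i.ksppdesc description
   from x$ksppi i, x$ksppcv cv
  where i.indx = cv.indx
    and i.ksppinm in ('_realfree_heap_pagesize_hint', '_pga_max_size');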

3.  12c PGA_AGGREGATE_LIMIT


Oracle MOS:
   Doc ID 1520324.1: Limiting process size with database parameter PGA_AGGREGATE_LIMIT

While PGA_AGGREGATE_TARGET only controls allocations of tunable memory, PGA_AGGREGATE_LIMIT aborts or terminates the sessions or processes that are consuming the most untunable PGA memory, such as:
     PL/SQL memory areas
     session context, cursor caches (MOS Doc ID 284951.1).

This new initialization parameter dynamically sets an instance-wide hard limit for PGA memory.

If the value of PGA_AGGREGATE_LIMIT is reached, a new 12c error message is reported:
  ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT
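
For example, the limit can be set and checked like any other initialization parameter (the 8G figure below is purely illustrative; setting the parameter to 0 is documented to disable the limit):

 alter system set pga_aggregate_limit = 8G scope=both;

 select name, display_value
   from v$parameter
  where name in ('pga_aggregate_target', 'pga_aggregate_limit');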

4.  PGA Memory Components


The blog TUNING PGA : PART – I (Anju Garg) classifies PGA by function and by tunability:

PGA components:
    Stack space
        bind variables
        arrays (PL/SQL)
    UGA
        Session information, such as logon information and other information required by a database session
        SQL work areas: used for sorting, hash operations, etc.
        Private SQL area: contains open/closed cursors and cursor state information for open cursors, for example,
                          the number of rows retrieved so far in a full table scan

PGA areas:
    Untunable PGA
        context information of each session
        each open cursor
        PL/SQL, OLAP or Java memory
    Tunable PGA
        SQL work areas
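
This split can be watched per process in V$PROCESS_MEMORY (the same view used by the test code below): roughly, the 'SQL' category corresponds to tunable work area memory, while 'PL/SQL', 'Other', etc. belong to the untunable part. A rough instance-wide summary, for illustration only:

 select category
       ,round(sum(allocated)/1024/1024)     alloc_mb
       ,round(sum(used)/1024/1024)          used_mb
       ,round(sum(max_allocated)/1024/1024) max_alloc_mb
   from v$process_memory
  group by category
  order by alloc_mb desc;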

5.  PGA Overallocation vs. tunable and non-tunable areas


Oracle MOS: LOW PGA HIT RATIO THOUGH OVER ALLOCATION COUNT IS NONE (Doc ID 284951.1) said:

 Over-allocating PGA memory can happen if the value of PGA_AGGREGATE_TARGET is too small to accommodate the untunable PGA component (session context, cursor caches, etc) memory plus the minimum memory required to execute the work area workload. When this happens, Oracle cannot honor the initialization parameter PGA_AGGREGATE_TARGET, and extra PGA memory needs to be allocated.
 
The number of times Oracle had to allocate more PGA memory than the PGA_AGGREGATE_TARGET suggested. This indicates the PGA target was set too small to accommodate the untunable (session context, cursor caches, etc) memory plus the tunable component. Ideally this count should be zero.

Oracle MOS: How To Avoid ORA-04030/ORA-12500 In 32-bit Windows Environment [Video] (Doc ID 373602.1) wrote:

 Within the PGA we have "tunable" and "non-tunable" areas. The tunable part is memory allocated for intensive SQL operations such as sorts, hash-joins, bitmap merge, and bitmap index create. This memory can be shrunk and expanded in response to system load. However, the non-tunable part cannot be managed in the same way. Importantly the non-tunable part includes cursors. We can control the number of cursors by the init.ora OPEN_CURSORS parameter, but if this is exceeded we get an ORA-1000 error, which is clearly undesirable. See unpublished Note:1012266.6 - "Overview of ORA-1000 Maximum Number of Cursors Exceeded" for more info. More importantly, however, we have no control over the size of a cursor, and users may open very large cursors dependent on their SQL or PLSQL.
 
 Also note that if we set PGA_AGGREGATE_TARGET too small to accommodate the non-tunable part of the PGA plus the minimum memory required to execute the tunable part, then Oracle cannot honour the PGA_AGGREGATE_TARGET value, and will attempt to allocate extra memory. This is known as overallocation, and an estimation of this can be seen in the view V$PGA_TARGET_ADVICE under the column ESTD_OVERALLOC_COUNT.

PGA memory is divided into tunable and non-tunable areas. While the tunable part is constrained under PGA_AGGREGATE_TARGET, the non-tunable part can be over-allocated to any size (until ORA-04030), and Oracle records these events in the column ESTD_OVERALLOC_COUNT of V$PGA_TARGET_ADVICE. Therefore ESTD_OVERALLOC_COUNT is driven by excess requests for non-tunable areas.
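
The overallocation estimate can be read directly from V$PGA_TARGET_ADVICE, for example:

 select round(pga_target_for_estimate/1024/1024) target_mb
       ,pga_target_factor
       ,estd_pga_cache_hit_percentage
       ,estd_overalloc_count
   from v$pga_target_advice
  order by pga_target_for_estimate;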

Before each overallocation, Oracle probably first tries to deallocate less-used memory (for example, by an LRU algorithm); if that is not sufficient, new memory is allocated.

OPEN_CURSORS specifies the maximum number of open cursors (handles to private SQL areas) a session can have at once. Under overallocation, session cursor caches could be subject to memory deallocation.
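
How much each session's cursor cache is actually used and how large it has grown can be followed through the session statistics, for example (a sketch; the "> 0" filter is only to trim the output):

 select t.sid, n.name, t.value
   from v$sesstat t, v$statname n
  where t.statistic# = n.statistic#
    and n.name in ('session cursor cache count', 'session cursor cache hits')
    and t.value > 0
  order by t.sid, n.name;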

In Oracle, each child cursor is associated with one KGLH0 heap and one SQLA heap (both in the SGA), where KGLH0 stores environment information and SQLA stores the parse tree and execution plan (xplan). When memory is reclaimed, KGLH0 stays, whereas SQLA is deallocated.
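
The sizes of these two heaps can be watched in V$SGASTAT, where in 11gR2 they typically appear under the names 'SQLA' and 'KGLH0'; when cursor memory is reclaimed, the 'SQLA' figure shrinks while 'KGLH0' stays roughly constant:

 select pool, name, round(bytes/1024/1024) mb
   from v$sgastat
  where pool = 'shared pool'
    and name in ('SQLA', 'KGLH0');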

Once a child cursor is cleaned out of the session cursor caches and is no longer found in SQLA, a later re-use of the same child cursor will result in a hard parse, which Oracle SQL Trace normally shows as:

     Misses in library cache during parse:   0
     Misses in library cache during execute: 1


The parse "during execute" is probably a quick parse with less optimization than the parse "during parse", for example, limiting "_optimizer_max_permutations" to a small number so that it is faster and does not disturb the execution phase too much. On the negative side, it may not find the optimal xplan.

We have observed such sub-optimal xplans generated in the "during execute" phase, but in general we do not know how to determine whether an xplan was created "during parse" or "during execute".

To fix such a sub-optimal xplan generated in the "during execute" phase, PGA_AGGREGATE_TARGET is increased to avoid PGA overallocation, and at the same time the SQL statement text is altered so that it is forced to be hard parsed from scratch.
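
The first of those two actions is just a parameter change, for example (the 8G value is only illustrative):

 alter system set pga_aggregate_target = 8G scope=both;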

The problem SQL selects from a partitioned table whose partitions are unbalanced (some of them almost empty). Hence a single shared cursor is hardly optimal for every partition select. So we put a dummy hint containing the partition key into the statement to prevent cursor sharing (which sounds counterproductive); a hypothetical sketch is shown below. Since the SQL is heavy and only a few partitions are selected in any one time period, this trade-off seems justified.
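
A hypothetical illustration of the trick (table, partition and column names are invented): the otherwise identical statements differ only in a no-op hint carrying the partition key, so each partition gets its own child cursor and its own optimization:

 -- hypothetical statement: sales_fact, p_2015_12 and sale_date are made-up names
 select /*+ dummy_p_2015_12 */ sum(amount)
   from sales_fact partition (p_2015_12)
  where sale_date >= date '2015-12-01';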

Since the problem occurred after upgrading to 11.2.0.4 and many more "Parse Calls" than "Executions" are found in the AWR report, we probably hit:
     Oracle MOS: ORA-04030 occurred while executing PLSQL procedure (Doc ID 1953999.1)
which is supposed to be fixed by interim Patch 18384537:
     OPIPLS TAKES TOO MANY RETRIES TO LOAD CURSOR.

In summary, the above discussion reveals one aspect of the cooperation between SGA and PGA, bringing together two components that are commonly regarded as independent.

Test Code



create or replace package pga_mem_test as
  procedure allo (p_mb int, p_sleep number := 0);
end;
/

create or replace package body pga_mem_test as
  type t_tab_kb   is table of char(1024);   -- 1KB
  p_tab_1mb          t_tab_kb := t_tab_kb();
  type t_tab_mb   is table of t_tab_kb;     
  p_tab_mb           t_tab_mb := t_tab_mb();
  p_sid              number;  --   := sys.dbms_support.mysid;
  
  -------------------------------------------
  procedure rpt(l_name varchar) is
     l_v$process_mem            varchar2(4000);
     l_v$process_memory_mem     varchar2(4000);
  begin
   select 'Used/Alloc/Freeable/Max >>> '||
           round(pga_used_mem/1024/1024)    ||'/'||round(pga_alloc_mem/1024/1024)||'/'||
             round(pga_freeable_mem/1024/1024)||'/'||round(pga_max_mem/1024/1024)
       into l_v$process_mem
       from v$process 
       where addr = (select paddr from v$session where sid = p_sid);
      
    select 'Category(Alloc/Used/Max) >>> '||
             listagg(Category||'('||round(allocated/1024/1024)||'/'||
                     round(used/1024/1024)||'/'||round(max_allocated/1024/1024)||') > ')
     within group (order by Category desc) name_usage_list
       into l_v$process_memory_mem
       from v$process_memory
       where pid = (select pid from v$process
                     where addr = (select paddr from v$session where sid = p_sid));
  
    dbms_output.put_line(rpad(l_name, 20)||' > '||rpad(l_v$process_mem, 50));
    dbms_output.put_line('             ------ '||l_v$process_memory_mem);
  end rpt;
   
  -------------------------------------------   
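  -- allo: allocate p_mb MB of PL/SQL collection (untunable PGA) memory,
  -- report the process memory before and after, then hold it for p_sleep seconds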
  procedure allo (p_mb int, p_sleep number) is
  begin
   select sid into p_sid from v$mystat s where rownum <=1;
   
   rpt('Start allocate: '||p_mb||' MB');
   
   select 'M' bulk collect into p_tab_1mb from dual connect by level <= 1024;  -- 1MB
   
   for i in 1..p_mb loop   -- p_mb MB
    p_tab_mb.extend;
    p_tab_mb(i) := p_tab_1mb;
   end loop;
  
   rpt('End allocate: '||p_mb||' MB');
   dbms_lock.sleep(p_sleep);
  end allo;
  
end;
/

/*
   exec dbms_session.reset_package;
   set  serveroutput on
   exec pga_mem_test.allo(1024*1, 30);       -- allocate 1GB
*/

------------------------------------------- 
create or replace procedure pga_mem_test_jobs(p_job_cnt number, p_mb number, p_sleep number := 0)
as
   l_job_id pls_integer;
begin
    for i in 1.. p_job_cnt loop
      dbms_job.submit(l_job_id, 'begin pga_mem_test.allo('||p_mb||', '||p_sleep||'); end;');
    end loop;
    commit;
end;    
/

--exec pga_mem_test_jobs(4, 1024*2, 60);   -- 4 Jobs, each allocates 2 GB, sleeping 60 seconds