Thursday, December 15, 2022

OracleJVM JAVA_JIT_ENABLED Linux /dev/shm mount noexec EPERM and Performance

In Linux, when JAVA_JIT_ENABLED is enabled, the native compiled Java code is stored in /dev/shm/JOEZSHM_*.
However, if tmpfs /dev/shm/ is mounted with "noexec" option as follows:

mount | grep shm
   tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,noexec)
mmap of JOEZSHM_* gets error EPERM ("Operation not permitted"), and JAVA_JIT_ENABLED does not work.

In MZnn and session trace files, there are plenty of "Operation not permitted" (EPERM) for JOEZSHM_*.

As a consequence, Oracle falls back to JAVA_JIT_ENABLED=false and the performance is degraded upto 50 times slower than the functional case.

Note: Tested on Oracle 19.13.


1. Test-1 Simple Case


First we make a simple SQL Java call (dbms_java.getversion calls java.lang.System.getProperty via dbms_java.get_ojvm_property), and strace its Linux process:

SQL > select 'JServer version: '||dbms_java.getversion, 'JDK version:'||dbms_java.get_jdk_version from dual;  

$> strace -tT -o mmap_strace.log -p 16637
After the call, we can see the new created memory files: /dev/shm/JOEZSHM_*:

$> ls -l /dev/shm
  total 32768
  -rwxrwx--- 1 oracle dba 16777216 Dec 13 07:36 JOEZSHM_testdb_1_0_0_0_0_2084322508
  -rwxrwx--- 1 oracle dba 16777216 Dec 13 07:36 JOEZSHM_testdb_1_0_1_0_0_749348400
  
-- no entry found, no JOEZSHM_* mapped to process address space
$> pmap -X -p 16637 |grep -i JOEZSHM
mmap_strace.log shows:

  07:36:23 open("/dev/shm/JOEZSHM_testdb_1_0_0_0_0_2084322508", O_RDWR|O_CREAT|O_NOFOLLOW|O_CLOEXEC, 0770) = 7 <0.000456>
  07:36:23 lseek(7, 0, SEEK_CUR)          = 0 <0.000018>
  07:36:23 lseek(7, 0, SEEK_END)          = 16777216 <0.000023>
  07:36:23 lseek(7, 0, SEEK_SET)          = 0 <0.000030>
  07:36:23 mmap(NULL, 16777216, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED, 7, 0) = -1 EPERM (Operation not permitted) <0.000022>
MZ00 and session trace files are filled with:

sjoezshm_map_obj mmap failed for /JOEZSHM_testdb_1_0_0_0_0_2084322508: Operation not permitted
size  = 16777216, prot_mode  = 7, map_flag = 1
joez_shm_open_object failed: size = 16777216, extnam = /JOEZSHM_testdb_1_0_0_0_0_2084322508  flags = 0x34 
joez: Failed loading machine code: Unable to allocate code space

sjoezshm_map_obj mmap failed for /JOEZSHM_testdb_1_0_1_0_0_749348400: Operation not permitted
size  = 16777216, prot_mode  = 7, map_flag = 1
joez_shm_open_object failed: size = 16777216, extnam = /JOEZSHM_testdb_1_0_1_0_0_749348400  flags = 0x34 
joez: Failed loading machine code: Unable to allocate code space


2. Test-2 Complex Case


Use the same test code in Blog: Oracle 19.4 OracleJVM JAVA_JIT_ENABLED Not Working on AIX to read a 27MB signed JAR file testJar.jar (security signatures under META-INF directory).

$> ls -l /tmp/testJar.jar
  -rw-r--r-- 1 oracle dba 28944084 Dec  6 17:39 /tmp/testJar.jar

$> unzip -l /tmp/testJar.jar
  Archive:  /tmp/testJar.jar
    Length      Date    Time    Name
  ---------  ---------- -----   ----
        144  12-05-2019 08:27   META-INF/MANIFEST.MF
        306  12-05-2019 08:27   META-INF/KUNALIAS.SF
       1471  12-05-2019 08:27   META-INF/KUNALIAS.DSA
          0  12-05-2019 08:26   META-INF/
  110947240  12-05-2019 08:23   test1.txt
  ---------                     -------
  110949161                     5 files
First we invoke OracleJVMJarInputStream without signature verify, it takes 3 seconds:

Sql > set serveroutput on size 50000
Sql > exec dbms_java.set_output(50000); 

SQL > exec OracleJVMJarInputStream(p_verify => 'false', p_info => 'no');

********* getNextJarEntry *********
------ NextJarEntry: 1, Name: META-INF/KUNALIAS.SF ------
         getNextEntry ElapsedMills: 31, at: 1670863088140
         Insert DB 1 row, blob size 0 at Mon Dec 12 17:38:08 CET 2022
         readContent ElapsedMills: 5, at: 1670863088145
------ NextJarEntry: 2, Name: META-INF/KUNALIAS.DSA ------
         getNextEntry ElapsedMills: 0, at: 1670863088145
         Insert DB 1 row, blob size 0 at Mon Dec 12 17:38:08 CET 2022
         readContent ElapsedMills: 2, at: 1670863088147
------ NextJarEntry: 3, Name: META-INF/ ------
         getNextEntry ElapsedMills: 0, at: 1670863088148
         Insert DB 1 row, blob size 0 at Mon Dec 12 17:38:08 CET 2022
         readContent ElapsedMills: 2, at: 1670863088150
------ NextJarEntry: 4, Name: test1.txt ------
         getNextEntry ElapsedMills: 0, at: 1670863088150
         Insert DB 1 row, blob size 110920480 at Mon Dec 12 17:38:11 CET 2022
         readContent ElapsedMills: 3125, at: 1670863091275

PL/SQL procedure successfully completed.
Elapsed: 00:00:03.20
Then we invoke OracleJVMJarInputStream with signature verify, it takes 3 minutes:

Sql > exec OracleJVMJarInputStream(p_verify => 'true', p_info => 'no');

********* getNextJarEntry *********
------ NextJarEntry: 1, Name: META-INF/KUNALIAS.SF ------
         getNextEntry ElapsedMills: 1, at: 1670863115333
         Insert DB 1 row, blob size 0 at Mon Dec 12 17:38:35 CET 2022
         readContent ElapsedMills: 6, at: 1670863115339
------ NextJarEntry: 2, Name: META-INF/KUNALIAS.DSA ------
         getNextEntry ElapsedMills: 0, at: 1670863115339
         Insert DB 1 row, blob size 0 at Mon Dec 12 17:38:35 CET 2022
         readContent ElapsedMills: 423, at: 1670863115762
------ NextJarEntry: 3, Name: META-INF/ ------
         getNextEntry ElapsedMills: 0, at: 1670863115762
         Insert DB 1 row, blob size 0 at Mon Dec 12 17:38:35 CET 2022
         readContent ElapsedMills: 3, at: 1670863115765
------ NextJarEntry: 4, Name: test1.txt ------
         getNextEntry ElapsedMills: 0, at: 1670863115765
         Insert DB 1 row, blob size 110920480 at Mon Dec 12 17:41:35 CET 2022
         readContent ElapsedMills: 179651, at: 1670863295417

PL/SQL procedure successfully completed.
Elapsed: 00:03:00.12
If JAVA_JIT_ENABLED works (/dev/shm is mounted without "noexec"), the execution with signature verify is completed in 5 seconds (instead of 3 minutes).


3. Test-3 Standalone mmap.c Test


Web Page: mmap() fails on tmpfs (operation not permitted) #2974 provided a mmap.c test code and showed "mmap() failed: : Operation not permitted".

We can use the same code to simulate Oracle JIT /dev/shm/JOEZSHM_* test.
(with small adaptations of open and mmap arguments to Oracle parameters according to strace output in above Test-1 Simple Case).

// mmap-fail.cpp
#include <stdint.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

// $ g++ mmap-fail.cpp -o mmap-fail
// $ sudo ./mmap-fail
// mmap() failed: : Operation not permitted

int main(int argc, const char* argv[])
{
    //const char* path = "./eightbytes.bin"; // okay on lxfs
    //const char* path = "/run/eightbytes.bin";
    //int fd = open(path, O_RDWR|O_CREAT|O_CLOEXEC|O_NOCTTY|O_NOFOLLOW, 0644);
    int fd = open("/dev/shm/JOEZSHM_testdb_1_0_0_0_0_2084322508", O_RDWR|O_CREAT|O_NOFOLLOW|O_CLOEXEC, 0770);
    if (fd < 0) {
        perror("open() failed: ");
        return 1;
    }
    int r = posix_fallocate(fd, 0, sizeof(uint64_t));
    if (r < 0) {
        perror("posix_fallocate() failed: ");
        return 1;
    }
    //uint64_t *p = (uint64_t*)mmap(NULL, sizeof(uint64_t), PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    uint64_t *p = (uint64_t*)mmap(NULL, sizeof(uint64_t), PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        perror("mmap() failed: ");
        return 1;
    }
    int r2 = close(fd);

    printf("Whee! mmap SUC\n");
    return 0;
}
Run the test with strace:

$> strace -tT -o mmap-fail_strace.log ./mmap-fail
  mmap() failed: : Operation not permitted
mmap-fail_strace.log shows:

  07:42:37 open("/dev/shm/JOEZSHM_testdb_1_0_0_0_0_2084322508", O_RDWR|O_CREAT|O_NOFOLLOW|O_CLOEXEC, 0770) = 3 <0.000022>
  07:42:37 fallocate(3, 0, 0, 8)          = 0 <0.000018>
  07:42:37 mmap(NULL, 8, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED, 3, 0) = -1 EPERM (Operation not permitted) <0.000017>


4. mmap.c EPERM Operation not permitted


From source mm/mmap.c, we can see that EPERM (Operation not permitted) is thrown in do_mmap
(case MAP_SHARED fallthrough to case MAP_PRIVATE).

// mm/mmap.c
unsigned long do_mmap(struct file *file, unsigned long addr,
  ...
	if (file) {
		...
		switch (flags & MAP_TYPE) {
		case MAP_SHARED:
      ...
			fallthrough;
		case MAP_SHARED_VALIDATE:
      ...
			fallthrough;
		case MAP_PRIVATE:
			if (!(file->f_mode & FMODE_READ))
				return -EACCES;
			if (path_noexec(&file->f_path)) {
				if (vm_flags & VM_EXEC)
					return -EPERM;
				vm_flags &= ~VM_MAYEXEC;
			}
			
			
include/uapi/asm-generic/errno-base.h
  #define	EPERM		 1	/* Operation not permitted */


5. Fix


To fix EPERM problem on Linux for JAVA_JIT_ENABLED, remove noexec option for tmpfs /dev/shm and make it permanent in /etc/fstab:

$> mount -o remount,exec /dev/shm
So that it looks like:

$> mount |grep shm
     tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)

$> cat /etc/fstab |grep shm
     tmpfs /dev/shm tmpfs nosuid,nodev 0 0


6. Related Work and Other Observations


For further discussions on OracleJVM JAVA_JIT_ENABLED, see:
     What the heck are the /dev/shm/JOXSHM_EXT_x files on Linux?
     Oracle 19.4 OracleJVM JAVA_JIT_ENABLED Not Working on AIX

There is also contre-proposition:
     The Oracle Linux operating system must mount /dev/shm with the noexec option.

When JAVA_JIT_ENABLED works, and we strace MZ00 session for a while.
Trace file shows that MZ00 is continuously compiling Java, each time, only one method (not entire class).
The timestamp of /dev/shm/JOEZSHM_* shows that those memory files are also updated.

$> strace -tT -s 128 -o mz00_strace.txt -p 23037

-- (fd 6 is testdb_mz00_23037.trc)
$> grep "Done compiling.*locale*" mz00_strace.txt
  15:48:11 write(6, "Done compiling sun/util/locale/provider/TimeZoneNameUtility$TimeZoneNameGetter.getObject", 88) = 88 <0.000026>
  15:50:44 write(6, "Done compiling sun/util/locale/provider/TimeZoneNameUtility$TimeZoneNameGetter.getObject", 88) = 88 <0.000019>
  15:50:44 write(6, "Done compiling sun/util/locale/provider/TimeZoneNameProviderImpl.getDisplayNameArray", 84) = 84 <0.000025>
  15:50:44 write(6, "Done compiling sun/util/locale/provider/TimeZoneNameUtility$TimeZoneNameGetter.getName", 86) = 86 <0.000046>
  15:50:44 write(6, "Done compiling sun/util/locale/provider/LocaleResources.removeEmptyReferences", 77) = 77 <0.000035>
  15:50:45 write(6, "Done compiling sun/util/locale/provider/LocaleResources.getTimeZoneNames", 72) = 72 <0.000039>
  15:50:46 write(6, "Done compiling sun/util/locale/BaseLocale$Key.normalize", 55) = 55 <0.000030>
  15:50:47 write(6, "Done compiling sun/util/locale/provider/TimeZoneNameProviderImpl.getDisplayName", 79) = 79 <0.000057>
By the way, in the old Oracle release, there are many JIT compiled small files (KB) with name pattern like JOXSHM_EXT_*,
In Oracle 19c, there are only a few big files (16MB) with name pattern like JOEZSHM_*.

Update (2023-02-26)Oracle 19c Java Developer's Guide: 9.1 Oracle JVM Just-in-Time Compiler (JIT) documented this behaviour:
Note:
 
On Linux, Oracle JVM JIT uses POSIX shared memory that requires access to the /dev/shm directory.
The /dev/shm directory should be of type tmpfs and you must mount this directory as follows:
  -. With rw and execute permissions set on it
  -. Without noexec or nosuid set on it
If the correct mount options are not used, then the following failure may occur during installation of the database:
  ORA-29516: Aurora assertion failure: Assertion failure at joez.c:
             Bulk load of method java/lang/Object. failed; insufficient shm- object space

Sunday, December 4, 2022

One Test of Oracle too many parse errors with ORA-12850 and ORA-12872

In this Blog, we first make a test to demonstrate Oracle "too many parse errors" associated with ORA-12850 and ORA-12872.
Then we reveal both errors in Oracle trace files. It looks like a limitation of materialized WITH subquery in parallel statements.

Note: Reproduced in Oracle 19.9, 19.10, 19.11, 19.13, but not 19.17. All are non RAC DB.


1. Test Setup



create table test_tab as select level x from dual connect by level <= 10; 

create or replace procedure test_proc (p_cnt number, p_limit pls_integer, p_sleeps number) as
  type t_num_tab      is table of number;
  l_num_tab           t_num_tab := new t_num_tab();
begin
  for i in 1..p_cnt loop 
    dbms_session.sleep(p_sleeps);
    with sq as (select /*+ materialize  */ x from test_tab)
    select /*+ parallel(2) */ t.x  
      bulk collect into l_num_tab
      from test_tab t, sq 
     where rownum <= p_limit;
   end loop; 
   
   dbms_output.put_line('Selected Rows = '||l_num_tab.count);
end;
/


2. Test Run


(Step-1) Open a new Sqlplus session

(Step-2) Flush Shared Pool
            alter system flush shared_pool; 

(Step-3) Run test procedure
            exec test_proc(1000, 3, 0.1);

(Step-4) Check DB alert.log if there is message:
            WARNING: too many parse errors
   
(Step-5) If no such message, repeat Step-1 to Step-5.
         That means opening next new Sqlplus session to do the same test.
         Most of tests showed that one or two Sqlplus sessions are sufficient to reproduce the problem.
Here the message in DB alert.log. PARSE ERROR are signalled with error=12850 or error=12872.
Note that the message is only written for each 100 such errors (count=13300 jump to count=13400).

2022-12-01T20:01:08.221827+01:00
WARNING: too many parse errors, count=13300 SQL hash=0x49952973
PARSE ERROR: ospid=31115, error=12850 for statement: 
2022-12-01T20:01:08.221953+01:00
WITH SQ AS (SELECT /*+ materialize  */ X FROM TEST_TAB) SELECT /*+ parallel(2) */ T.X FROM TEST_TAB T, SQ WHERE ROWNUM <= :B1
Additional information: hd=0x97efe2e0 phd=0x9437c118 flg=0x110676 cisid=86 sid=86 ciuid=86 uid=86 sqlid=3z4qz1d4taabm
...Current username=K
...Application:  Action: 
2022-12-01T20:01:09.170414+01:00
WARNING: too many parse errors, count=13400 SQL hash=0x49952973
PARSE ERROR: ospid=31117, error=12872 for statement: 
2022-12-01T20:01:09.170531+01:00
WITH SQ AS (SELECT /*+ materialize  */ X FROM TEST_TAB) SELECT /*+ parallel(2) */ T.X FROM TEST_TAB T, SQ WHERE ROWNUM <= :B1
Additional information: hd=0x97efe2e0 phd=0x9437c118 flg=0x110676 cisid=86 sid=86 ciuid=86 uid=86 sqlid=3z4qz1d4taabm
...Current username=K
...Application: SQL*Plus Action:  
Appendix-1 lists a few related Oracle Error Documents.

Appendix-2 shows parallel xplan.


3. ORA-10388 Trace


Oracle (Doc ID 2180844.1) has a description of ORA-12850 (PX QC and worker signature-plan mismatch) and documents Troubleshooting Steps.

How to get plan signature tracing for ORA-12850 (Doc ID 2180844.1)
  Purpose
      ORA-12850 errors are usually caused by the worker getting a plan signature different from that of the QC.  
      These are typically, although not always, QRY OPTIMIZER bugs. This note provides the recommended tracing needed 
      in order to see if there is a signature-plan mismatch and diagnose the cause of ORA-12850.†

  Troubleshooting Steps
      alter system set†"_parallel_blackbox_size" =†524288;†
      -- prevents traces from being truncated
      alter session set max_dump_file_size = unlimited;                         
      -- customer will be prompted for a string that will appear in the tracefile name
      alter session set tracefile_identifier = '&TRACEFILE_IDENTIFIER';         
      alter session set events 'trace[SQL_Compiler.*] disk highest';
      alter session set events 'trace[Parallel_Execution.*] disk highest';
      
      --run test code, for example, in this Blog:
      exec test_proc(10, 3, 0.1);
      
      -- turn tracing off
      alter session set events 'trace[SQL_Compiler.*] off';
      alter session set events 'trace[Parallel_Execution.*] off';
    
    Look for the plan signatures in the QC vs. worker traces.  
    The worker that will have the mismatch should be the one with the 10388 error in it.  
    Notice that the plan signatures do NOT match, which causes the ORA-12850.
	
  3. Look for the plan signatures in the QC vs. worker traces.  
     The worker that will have the mismatch should be the one with the 10388 error in it.  
     Notice that the plan signatures do NOT match, which causes the ORA-12850.
     
        QC Plan Signature from Parse Msg:  SQL Plan Signature
        Slave Plan Signature:  SQL Plan Signature
Following above Oracle instructions, we get Plan Signature in worker (pnnn) traces like:

  Slave Plan Signature:  SQL Plan Signature (0x93838ff0):
  QC Plan Signature from Parse Msg:  SQL Plan Signature (0x7f0fed684134):
In Oracle Document, ORA-10388 Comment said: commands a slave to leave the server group (query failed).
So we will look for 10388 error in the trace files.

In worker (pnnn) trace files, internal oracle error=10388 (message to terminate a parallel query) is precised by "In signal-handler after catching 12850", and finalized by "Exiting parallel client parallel query execution(6) on error=10388".

====================== error 12850 ======================
2022-12-01 11:28:29.288*:PX_Control:kxfx.c@4849:kxfxsp1(begin): 
	Open & parse cursor(EXPRESS/-): 
	[123:WITH SQ AS (SELECT /*+ materialize  */ X FROM TEST_TAB) SELECT /*+ parallel(2) */ T.X FROM TEST_TAB T, SQ WHERE ROWNUM <= :B1 ]
2022-12-01 11:28:29.288*:PX_Control:kxfx.c@4853:kxfxsp1(end): 
2022-12-01 11:28:29.288*:PX_Control:kxfx.c@5885:kxfxsp(end): 
2022-12-01 11:28:29.327*:PX_Control:kxfx.c@6114:kxfxsStatus(): 
	joining FALSE ParseState KXFXUPARSE2 (3)
2022-12-01 11:28:29.331*:PX_Control:kxfx.c@10728:kxfxmai(): 
	In signal-handler after catching 12850
2022-12-01 11:28:29.333*:PX_Control:kxfx.c@10728:kxfxmai(): 
	In signal-handler after catching 10388
2022-12-01 11:28:29.333*:PX_Control:kxfx.c@5678:kxfxsuf(): 
	kxfx slave finish
2022-12-01 11:28:29.333*:PX_Messaging:kxfp.c@14665:kxfprdp_int(): 
	Exiting parallel client parallel query execution(6) on error=10388
In the same trace files, "Received error 12872" is noted with "First parse failed" during Semantic Analysis:

====================== error 12872 ======================
2022-12-01 11:29:30.411*:PX_Control:kxfx.c@6829:kxfxsExecute(): 
	Received error 12872 from first slave parse.
	First parse failed. On to outlined parse 
2022-12-01 11:29:30.411*:PX_Control:kxfx.c@13917:kxfxsStmtDump(): 
	kxfxsSmtExecute stmt:0x7f0fed69ff50
	 UGA: InUse:255KB Alloc:255KB
	 PGA: InUse:1665KB Alloc:1835KB
Enabling tracing for cur#=1 sqlid=3z4qz1d4taabm top-level
Parsing cur#=1 sqlid=3z4qz1d4taabm len=123 
sql=WITH SQ AS (SELECT /*+ materialize  */ X FROM TEST_TAB) SELECT /*+ parallel(2) */ T.X FROM TEST_TAB T, SQ WHERE ROWNUM <= :B1 
End parsing of cur#=1 sqlid=3z4qz1d4taabm
Semantic Analysis cur#=1 sqlid=3z4qz1d4taabm
OPTIMIZER INFORMATION


4. Performance Impacts


When multiple sessions frequently run such statement, especially with high degree of parallelism (DOP), DB performance degradation is often caused by following heavy wait events:

  cursor: pin S wait on X
  library cache lock
  kksfbc child completion
  PX Deq: Parse Reply
  PX Deq: Execution Msg
When looking the wait chains, often one PX particular worker is the blocker of QC or other workers. We also noticed that V$SQL.child_number can go over 1000 ("_cursor_obsolete_threshold" default 1024).

A quick workaround is to remove parallel hint to make serial execution.

Blog: 12c Parse listed a few activity stats to monitor:
  opened cursors cumulative         
  enqueue requests                  
  enqueue releases                  
  sql area purged                   
  sql area evicted                  
  parse count (total)               
  parse count (hard)                
  parse count (failures)            
Tests showed that the call (1000 executions of query):

  exec test_proc(1000, 3, 0.1); 
generated 6000 lines of "too many parse errors" in DB alert.log:
(the count seems reading from V$SQLAREA.invalidations for each sql_id)

  WARNING: too many parse errors, count=53700 SQL hash=0x49952973 ... sqlid=3z4qz1d4taabm
  ...
  WARNING: too many parse errors, count=59700 SQL hash=0x49952973 ... sqlid=3z4qz1d4taabm
v$sysstat query:

select v.value, v.* from v$sysstat v
where name in ('parse count (failures)', 'sql area purged', 'sql area evicted', 
               'parse count (hard)', 'parse count (total)', 
               'opened cursors cumulative', 'enqueue releases', 'enqueue requests') 
order by v.value;
shwed that:
   the value of 'parse count (failures)', 'sql area purged', 'sql area evicted' increased 6000
   the value of 'parse count (hard)', 'parse count (total)' increased a little more than 6000 (if no other DB activities)
So each test query execution created 6 "too many parse errors" related to 6 'parse count (failures)'.


5. ORA-00600 [xplUnCompact:magic]


In very rare case, QC session throws ORA-00600 as follows. It looks like that the error occurs when optimizer automatic sql reoptimisation (kkoarSalvageStats, COMPONENT: SQL_Costing) checks cardinality feedback (kkocfbCheckCardEstRws, COMPONENT: SQL_Costing) in xplan (xplUnCompact, COMPONENT: explain).

ORA-00600: internal error code, arguments: [xplUnCompact:magic], [0], [0x7FFE5C370D98]

----- Call Stack Trace -----

[12] (kgeasnmierr()+146 -> kgerinv())
[13] (xplUnCompact()+1320 -> kgeasnmierr())
[14] (kkocfbCheckCardEstRws()+815 -> xplUnCompact())
[15] (kkocfbCheckCardEstRws()+259 -> kkocfbCheckCardEstRws())
[16] (kkocfbCheckCardEstRws()+259 -> kkocfbCheckCardEstRws())
     -- a couple of same calling lines
[35] (kkocfbCheckCardEstRws()+259 -> kkocfbCheckCardEstRws())
[36] (kkocfbCheckCardEst()+571 -> kkocfbCheckCardEstRws())
[37] (kkoarSalvageStats()+910 -> kkocfbCheckCardEst())
[38] (kxsFreeWorkArea()+149 -> kkoarSalvageStats())
[39] (kxsFreeExecutionHeap()+60 -> kxsFreeWorkArea())
[40] (kksumc()+631 -> kxsFreeExecutionHeap())
[41] (opiexe()+10228 -> kksumc())

Session Wait History:
    elapsed time of 0.993485 sec since last wait
 0: waited for 'enq: PS - contention'


6. Related Oracle BUGs and Related Work


There are a few related BUGs in Oracle MOS, we tested them (patch installed or hidden parameter set. Test DB are not RAC).
All have no effect, and DB still throws the same "too many parse errors":

Bug 31602782 ††Contention on "CURSOR: PIN S WAIT ON X" when PQ slave's execution plan does not match with QC       
   Versions confirmed as being affected: 12.2.0.1 (Base Release) - 19.13.0
   Description
     When a parallel query is executed, QC sends parse message to all workers in all  
     instances. In a instance one worker tries to parse the cursor and other workers 
     waits (cursor pin s wait for x) for that cursor to parsed and built. 
      
     If this worker could not build the cursor with the same plan as QC and could not  
     join, another worker tries the same and so on until all workers get exhausted 
     and query runs in serial or without workers from that instance. 
      
     This can cause a huge contention on CURSOR: PIN S WAIT ON X .
      
     After this bug fix, if a worker on one instance fails to produce same plan as QC, 
     other workers on the same instance do not retry the parse and workers from 
     that instance are not allocted.
    
Bug 32121673††A Process May Crash With Ora-600 [xpluncompact:magic]
 
Warning: Too Many Parse Errors With ORA-12850/ORA-12872 (Doc ID 2816169.1)
  Cause
     It appears that this is due to Bug 31877812 : WHEN HIGH CONCURRENCY TOO MUCH CPU UTILIZATION AND PARALLEL 
     QUERY IS FAILING WITH ORA-12872, this bug is currently waiting on two other bugs to be resolved first. 
  Solution
    †Bug is not resolved as of yet but what has been able to get by the issue is setting the following parameter:
       alter system set "_nlj_batching_enabled"=0;
  
RAC: ORA-12801 and ORA-12850: Could Not Allocate Slaves On All Specified Instances: 2 Needed, 1 (Doc ID 1503358.1)
Blog: ORA-12850: Could not allocate slaves on all specified instances error and a workaround for my scripts talked about GV$ views for reading cluster-wide ASH data and PX slaves used for fetching remote data due to Xplan temp table transformations of materialized WITH subqueries, and provided workaround of "INLINE" hint in some TPT scripts.

The ORA-12850 indicates the limitation of materialized WITH subquery in Oracle 19c RAC installations because of GV$ view access.
This Blog test shows the limitation of materialized WITH subquery in parallel statements (non RAC).

Blog: Oracle WITH subquery_factoring_clause TEMP TABLE TRANSFORMATION and log file sync talked about "INLINE" hint in materialized WITH subqueries and log file sync.


Appendix-1. Oracle Error Document


ORA-12850, "Could not allocate slaves on all specified instances: %s needed, %s allocate d"
  *Cause: When executing a query on a gv$ fixed view, one or more                                                  
          instances failed to allocate a slave to process query.                                                   
  *Action:Check trace output for instances on which slaves failed to start.                                        
          GV$ query can only proceed if slaves can be allocated on all                                             
          instances.  
          
ORA-12872, "First slave parse gave different plan"                                                               
  *Cause: First hard parse on slave given QC-supplied environment and                                              
          parameters gave different plan from QC. Try again with outline.                                          
  *Action: No external action. Internally used for outline-based reparse.                                          

ORA-12842, "Cursor invalidated during parallel execution"
  *Cause:  The cursor was invalidated during the parse phase of deferred
           parallel processing, e.g. when set operands are parallelized.
  *Action: Depends on why the cursor was invalidated. Possible causes include
           DDL on a schema object and shared pool being flushed.

ORA-10388, "parallel query server interrupt (failure)"
  *Cause: internal use only
  *Action: this event should never be set externally
  *Comment: commands a slave to leave the server group (query failed)

  The ORA-10388 is an internal oracle message to terminate a parallel query when the operation is failed.
  This message should not be visible in alert.log

ORA-10387, "parallel query server interrupt (normal)"
  *Cause: internal use only
  *Action: this event should never be set externally
  *Comment: commands a slave to leave the server group normally

ORA-10382,  "parallel query server interrupt (reset)"            
  *Cause: internal use only                                       
  *Action: this event should never be set externally              
  *Comment: commands a slave to clear its buffers and reset itself


Appendix-2. Xplan



SQL_ID  3z4qz1d4taabm, child number 0
-------------------------------------
WITH SQ AS (SELECT /*+ materialize  */ X FROM TEST_TAB) SELECT /*+ 
parallel(2) */ T.X FROM TEST_TAB T, SQ WHERE ROWNUM <= :B1
 
Plan hash value: 1215440870
 
---------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name                        | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                             |       |       |   170 (100)|          |        |      |            |
|   1 |  TEMP TABLE TRANSFORMATION             |                             |       |       |            |          |        |      |            |
|   2 |   PX COORDINATOR                       |                             |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)                 | :TQ10000                    |   327 |  4251 |     2   (0)| 00:00:01 |  Q1,00 | P->S | QC (RAND)  |
|   4 |     LOAD AS SELECT (TEMP SEGMENT MERGE)| SYS_TEMP_0FD9D664E_DBF2D292 |       |       |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR                 |                             |   327 |  4251 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS FULL                | TEST_TAB                    |   327 |  4251 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |
|*  7 |   COUNT STOPKEY                        |                             |       |       |            |          |        |      |            |
|   8 |    PX COORDINATOR                      |                             |       |       |            |          |        |      |            |
|   9 |     PX SEND QC (RANDOM)                | :TQ20001                    |   106K|  1357K|   168   (0)| 00:00:01 |  Q2,01 | P->S | QC (RAND)  |
|* 10 |      COUNT STOPKEY                     |                             |       |       |            |          |  Q2,01 | PCWC |            |
|  11 |       MERGE JOIN CARTESIAN             |                             |   106K|  1357K|   168   (0)| 00:00:01 |  Q2,01 | PCWP |            |
|  12 |        PX BLOCK ITERATOR               |                             |   327 |  4251 |     2   (0)| 00:00:01 |  Q2,01 | PCWC |            |
|* 13 |         TABLE ACCESS FULL              | TEST_TAB                    |   327 |  4251 |     2   (0)| 00:00:01 |  Q2,01 | PCWP |            |
|  14 |        BUFFER SORT                     |                             |   327 |       |   166   (0)| 00:00:01 |  Q2,01 | PCWP |            |
|  15 |         PX RECEIVE                     |                             |   327 |       |            |          |  Q2,01 | PCWP |            |
|  16 |          PX SEND BROADCAST             | :TQ20000                    |   327 |       |            |          |  Q2,00 | P->P | BROADCAST  |
|  17 |           VIEW                         |                             |   327 |       |            |          |  Q2,00 | PCWP |            |
|  18 |            PX BLOCK ITERATOR           |                             |   327 |  4251 |     2   (0)| 00:00:01 |  Q2,00 | PCWC |            |
|* 19 |             TABLE ACCESS FULL          | SYS_TEMP_0FD9D664E_DBF2D292 |   327 |  4251 |     2   (0)| 00:00:01 |  Q2,00 | PCWP |            |
---------------------------------------------------------------------------------------------------------------------------------------------------
 
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
 
   1 - SEL$2       
   2 - SEL$1       
   6 - SEL$1        / TEST_TAB@SEL$1
  13 - SEL$2        / T@SEL$2
  17 - SEL$D67CB2D2 / SQ@SEL$2
  19 - SEL$D67CB2D2 / T1@SEL$D67CB2D2
 
Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      OPT_PARAM('optimizer_dynamic_sampling' 1)
      OPT_PARAM('_optimizer_use_feedback' 'false')
      OPT_PARAM('_optimizer_gather_stats_on_load' 'false')
      OPT_PARAM('_px_adaptive_dist_method' 'off')
      OPT_PARAM('_optimizer_strans_adaptive_pruning' 'false')
      OPT_PARAM('_optimizer_nlj_hj_adaptive_join' 'false')
      OPT_PARAM('optimizer_index_cost_adj' 50)
      OPT_PARAM('optimizer_index_caching' 95)
      OPT_PARAM('_fix_control' '23473108:0 20424684:0 22582700:0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      OUTLINE_LEAF(@"SEL$D67CB2D2")
      MATERIALIZE(@"SEL$1")
      OUTLINE_LEAF(@"SEL$2")
      FULL(@"SEL$2" "T"@"SEL$2")
      NO_ACCESS(@"SEL$2" "SQ"@"SEL$2")
      LEADING(@"SEL$2" "T"@"SEL$2" "SQ"@"SEL$2")
      USE_MERGE_CARTESIAN(@"SEL$2" "SQ"@"SEL$2")
      PQ_DISTRIBUTE(@"SEL$2" "SQ"@"SEL$2" NONE BROADCAST)
      FULL(@"SEL$D67CB2D2" "T1"@"SEL$D67CB2D2")
      FULL(@"SEL$1" "TEST_TAB"@"SEL$1")
      END_OUTLINE_DATA
  */
Peeked Binds (identified by position):
--------------------------------------
   1 - :B1 (NUMBER): 3
 
Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access(:Z>=:Z AND :Z<=:Z)
   7 - filter(ROWNUM<=:B1)
  10 - filter(ROWNUM<=:B1)
  13 - access(:Z>=:Z AND :Z<=:Z)
  19 - access(:Z>=:Z AND :Z<=:Z)
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
   0 -  STATEMENT
           -  parallel(2)
 
   2 -  SEL$1
           -  materialize
Note
-----
   - Degree of Parallelism is 2 because of hint

Monday, October 24, 2022

How many times SYSDATE is called in SQL and Plsql ?

In this Blog, we will try to count the number of SYSDATE and SYSTIMESTAMP executions in SQL and Plsql.

Note: Tested in Oracle 19.13


1. Test Setup


At first, we create a test table with 2 date columns and 2 timestamp columns, insert 2 rows:

drop table test_tab;

create table test_tab (id number, dt1 date, dt2 date, sts1 timestamp with time zone, sts2 timestamp with time zone);

insert into test_tab values (1, sysdate+1, sysdate+11, systimestamp+interval'1'day, systimestamp+interval'11'day); 
insert into test_tab values (2, sysdate+2, sysdate+22, systimestamp+interval'2'day, systimestamp+interval'22'day); 

commit;
Then create a Plsql function with 1 SQL sysdate call and 1 Plsql sysdate call:

--Each function execution makes 1 SQL sysdate call (from dual) and 1 Plsql sysdate call

create or replace function test_sysdate_fun return date as
  l_date date;
begin
  select sysdate into l_date from dual;     --SQL sysdate call
  l_date := sysdate;                        --Plsql sysdate call
  dbms_output.put_line(l_date);
  return l_date;
end;
/
It seems that Oracle subroutine kxsCaptureSysDate is executed once for each SQL sysdate call,
and pessdt is executed once for each Plsql sysdate call (see later section about package body SYS.STANDARD).
We can compose a GDB script to track both calls.

------- gdb_cmd_sysdate:   gdb -x gdb_cmd_sysdate -p <pid> -------

set pagination off
set logging file /tmp/gdb_cmd_sysdate.log
set logging overwrite on
set logging on
set $s = 0
set $p = 0

break kxsCaptureSysDate
command 
printf "<<< SQL sysdate Call === (%i).===>>>\n", ++$s
bt 4       
continue
end

break pessdt
command 
printf "<<< Plsql sysdate Call === (%i).===>>>\n", ++$p
bt 4       
continue
end


2. Test Run


Open a Sqlplus session, start above GDB script on its process (<pid>), then make the test.
Each sysdate call is reported as either from SQL, or from Plsql.

First, we test the function call in Plsql:

declare
  l_date date;
begin
  l_date := test_sysdate_fun;
end;
/
Here the GDB script output:

Breakpoint 1, 0x000000000372e1c0 in kxsCaptureSysDate ()
<<< SQL sysdate Call === (1).===>>>
  #0  0x000000000372e1c0 in kxsCaptureSysDate ()
  #1  0x0000000012c8bef0 in kxsSetDateTime ()
  #2  0x0000000012b11b44 in opiexe ()
  #3  0x0000000012b201eb in opipls ()

Breakpoint 2, 0x0000000005677200 in pessdt ()
<<< Plsql sysdate Call === (1).===>>>
  #0  0x0000000005677200 in pessdt ()
  #1  0x0000000012fa5f58 in pevm_icd_call_common ()
  #2  0x0000000012f9fe62 in pfrinstr_BCAL ()
  #3  0x0000000012f9ed5c in pfrrun_no_tool ()
Above GDB output shows that each test_sysdate_fun execution triggers 1 SQL sysdate call and 1 Plsql sysdate call.

Then, we execute a query involing SQL and Plsql sysdate calls:

select id, dt1, sysdate, sysdate
      ,test_sysdate_fun rs 
  from test_tab 
 where (dt1 > sysdate - 10)
   and (test_sysdate_fun > sysdate - 10)
   and (dt2 > test_sysdate_fun - 20);
GDB script output looks as follows (only top frame in each call is showed):

<<< SQL sysdate Call === (1).===>>>
  #0  0x000000000372e1c0 in kxsCaptureSysDate ()

<<< SQL sysdate Call === (2).===>>>
  #0  0x000000000372e1c0 in kxsCaptureSysDate ()

<<< Plsql sysdate Call === (1).===>>>
  #0  0x0000000005677200 in pessdt ()

<<< SQL sysdate Call === (3).===>>>
  #0  0x000000000372e1c0 in kxsCaptureSysDate ()

<<< Plsql sysdate Call === (2).===>>>
  #0  0x0000000005677200 in pessdt ()

<<< SQL sysdate Call === (4).===>>>
  #0  0x000000000372e1c0 in kxsCaptureSysDate ()

<<< Plsql sysdate Call === (3).===>>>
  #0  0x0000000005677200 in pessdt ()

<<< SQL sysdate Call === (5).===>>>
  #0  0x000000000372e1c0 in kxsCaptureSysDate ()

<<< Plsql sysdate Call === (4).===>>>
  #0  0x0000000005677200 in pessdt ()

<<< SQL sysdate Call === (6).===>>>
  #0  0x000000000372e1c0 in kxsCaptureSysDate ()

<<< Plsql sysdate Call === (5).===>>>
  #0  0x0000000005677200 in pessdt ()
There are 11 sysdate calls.
First 1 SQL sysdate call is from sql where condition dt1.
Then 2 sysdate calls are from 1 test_sysdate_fun where condition (no table column referred).
Then 4 sysdate calls are from 2 test_sysdate_fun where condition dt2 for 2 rows (table column dt2 referred).
Then 4 sysdate calls are from 2 test_sysdate_fun column expressions for 2 rows.
(Note that each test_sysdate_fun execution triggers 1 SQL sysdate call and 1 Plsql sysdate call)

Above test shows that for SQL sysdate, all multiple direct calls in where conditions and column values for multiple rows are merged (bound) to one single SQL sysdate call for each SQL statement execution.

But for Plsql function, when it appears in column unrelated where condition, it is called only once for multiple rows;
when it appears in column related where condition or expression, it is called once for each column and each row.

Here some more examples with their number of sysdate calls:

-- Only 1 SQL sysdate call in each query 
-- (it does not matter how many times in select columns or in where conditions) 

alter session set nls_date_format ='YYYY*MON*DD HH24:MI:SS';  

select id, dt1 
  from test_tab 
 where (dt1 > sysdate - 10);
   
select sysdate, sysdate + 1, sysdate + 2, id, dt1 
  from test_tab 
 where (dt1 > sysdate - 10);
   
select sysdate, sysdate + 1, sysdate + 2, id, dt1 
  from test_tab 
 where (dt1 > sysdate - 10) 
   and (dt2 > sysdate - 20);
   
with sq1 as (select sysdate d1 from dual),
     sq2 as (select sysdate d2 from dual),
     sq3 as (select sysdate d3 from dual)
select * from sq1, sq2, sq3;

-- 13 sysdate calls (first 1 SQL sysdate call from 2 sql where conditions, 
--                    then 2 SQL call and 2 Plsql call from test_sysdate_funn where conditions,
--                         2 SQL call and 2 Plsql call from test_sysdate_funn rs1 for 2 rows
--                         2 SQL call and 2 Plsql call from test_sysdate_funn rs2 for 2 rows) 

select sysdate, sysdate + 1, sysdate + 2, id, dt1
      ,test_sysdate_fun rs1
      ,test_sysdate_fun rs2
  from test_tab 
 where (dt1 > sysdate - 10) 
   and (dt2 > sysdate - 20) 
   and (test_sysdate_fun > sysdate - 10)
   and (test_sysdate_fun > sysdate - 20);

-- 1 SQL sysdate call, query takes more than 10 seconds, but all 4 column values are same. 

with sq1 as (select SYSDATE d1 from dual connect by level <= 1e3)
    ,sq2 as (select SYSDATE d2 from dual connect by level <= 1e5)
select min(d1) min_d1, max(d1) max_d1, min(d2) min_d2, max(d2) max_d2 from sq1, sq2;


3. systimestamp


In above gdb script, replacing kxsCaptureSysDate by kxsCaptureSysTime for SQL call,
and pessdt by pessts for Plsql call,
we can make the similar tests for systimestamp calls.

--Each function execution makes 1 SQL systimestamp call (from dual) and 1 Plsql systimestamp call 

create or replace function test_systimestamp_fun return timestamp with time zone as
  l_sts timestamp with time zone;
begin
  select systimestamp into l_sts from dual;
  l_sts := systimestamp;
  dbms_output.put_line(l_sts);
  return l_sts;
end;
/

alter session set nls_timestamp_tz_format ='YYYY*MON*DD HH24:MI:SS.FF3 TZR TZD';

-- 1 SQL systimestamp call 

select systimestamp, systimestamp + interval'10'day, systimestamp + interval'20'day, id, sts1 
  from test_tab 
 where (sts1 > systimestamp - interval'10'day) 
   and (sts2 > systimestamp - interval'20'day);
   
--There are 7 systimestamp calls. 
--First 1 SQL systimestamp call is from sql where condition.   
--Then 2 systimestamp calls are from 1 test_systimestamp_fun where condition.    
--Then 4 systimestamp calls are from 2 test_systimestamp_fun executions for 2 rows.
--Each test_systimestamp_fun execution triggers 1 SQL systimestamp call and 1 Plsql systimestamp call.

select id, sts1
      ,test_systimestamp_fun sts 
  from test_tab 
 where (sts1 >  systimestamp - interval'10'day)
   and (test_systimestamp_fun > systimestamp - interval'10'day);

-- 1 SQL systimestamp call, query takes more than 10 seconds, all 4 column values are same.

with sq1 as (select SYSTIMESTAMP d1 from dual connect by level <= 1e3)
    ,sq2 as (select SYSTIMESTAMP d2 from dual connect by level <= 1e5)
select min(d1) min_d1, max(d1) max_d1, min(d2) min_d2, max(d2) max_d2 from sq1, sq2;  


4. Plsql SYS.STANDARD sysdate/pessdt, systimestamp/pessts


Package body SYS.STANDARD contains code and comments about sysdate/pessdt and systimestamp/pessts:

  function pessdt return DATE;
    pragma interface (c,pessdt);

  -- Bug 1287775: back to calling ICD.
  -- Special: if the ICD raises ICD_UNABLE_TO_COMPUTE, that means we should do
  -- the old 'SELECT SYSDATE FROM DUAL;' thing.  This allows us to do the
  -- SELECT from PL/SQL rather than having to do it from C (within the ICD.)
  
  function sysdate return date is
    d date;
  begin
    d := pessdt;
    return d;
  exception
    when ICD_UNABLE_TO_COMPUTE then
      select sysdate into d from sys.dual;
      return d;
  end;

  function pessts return timestamp_tz_unconstrained;
    pragma interface (c,pessts);

  -- Special: if the ICD raises ICD_UNABLE_TO_COMPUTE, that means we should do
  -- the old 'SELECT systimestamp FROM dual;' thing.  This allows us to do the
  -- SELECT from PL/SQL rather than having to do it from C (within the ICD.)
  
  FUNCTION systimestamp RETURN timestamp_tz_unconstrained
  IS  t timestamp_tz_unconstrained;
  BEGIN
    t := pessts;
    RETURN t;
  EXCEPTION
    WHEN ICD_UNABLE_TO_COMPUTE THEN
      SELECT systimestamp INTO t FROM sys.dual;
      RETURN t;
  END;
By thw way, the mentioned Oracle MOS "Bug 1287775 - Server side PLSQL no longer uses DUAL to get SYSDATE (Doc ID 1287775.8)" wrote:
  This performance regression was introduced with the fix in Bug:616870.
  With this fix calling SYSDATE from PL/SQL is much slower as SYSDATE
  is obtained by selecting from DUAL. 
  This fix restores server side performance by allowing server side PLSQL
  to use internal calls rather than selecting from DUAL.

5. Related Work


Blog: Different SYSDATE behaviors in SQL and PL/SQL wrote:
      "In SQL, SYSDATE is called just once for the entire statement.
       In PL/SQL, SYSDATE is called every time it is invoked."
Its tests are based on the count of different sysdate returned values in multiple sysdate calls from SQL and Plsql.

In this Blog, we track each sysdate call instead of its returned value to show the different number of sysdate calls from SQL and Plsql.

Monday, October 3, 2022

Oracle WITH subquery_factoring_clause TEMP TABLE TRANSFORMATION and log file sync

If WITH subquery_factoring_clause in a query is transformed as a temporary table, the session can experience log file sync when Log Write (LGWR) has problem.

As further check, we also make tests without supsending LGWR, and show real system observations.

Note: Tested in Oracle 19c


1. Test


Open 3 Sqlplus sessions:
   Oradebug session
   Test session
   Monitor session


1.1 Oradebug session


In order to understand Oracle internals, run following script to suspend LGWR for 30 seconds:

------------------------ Oradebug session ------------------------
col pid new_value orapid
select pid from v$process where program like '%LGWR%';  

alter system switch logfile;

exec dbms_lock.sleep(1); 

oradebug setorapid &orapid
oradebug suspend
exec dbms_lock.sleep(30);
oradebug resume 


1.2 Test session


The test sql contains one WITH subquery_factoring_clause. Before and After the sql execution, we collect session redo/undo stats, and we also dump CURRENT redo logfile.

------------------------ Test session ------------------------
select 'BEFORE---', s.sid, n.name, s.value from v$statname n, v$mystat s 
 where n.statistic# = s.statistic#
   and name in ('redo entries', 'redo size', 'undo change vector size', 'redo synch time (usec)');

with sq as (select class, rownum r from v$waitstat)
  select session_id, 
         (select min(r) from sq) mir, 
         (select min(r) from sq) mar
    from dba_hist_active_sess_history
    --from gv$active_session_history
    where rownum <= 3;   
    
select 'AFTER---', s.sid, n.name, s.value from v$statname n, v$mystat s 
 where n.statistic# = s.statistic#  
   and name in ('redo entries', 'redo size', 'undo change vector size', 'redo synch time (usec)');
   

-- dump redo log
col member new_value redolog 
col trcfile new_value trcfile 
select member, 'redodump_G'||l.group#||'_'||to_char(sysdate,'hh24_mi_ss')  trcfile 
  from v$log l, v$logfile f where l.status = 'CURRENT' and f.group# = l.group#;

alter session set tracefile_identifier='&trcfile';
alter system dump logfile '&redolog'; 


1.3 Monitor session


During the test, run a query to check the existence of transaction and get its xid:

------------------------ Monitor session ------------------------
-- gv$transaction is defined on x$ktcxb with filter: bitand (ksspaflg, 1) != 0 and bitand (ktcxbflg, 2) != 0

set lines 300
col state for a10
col event for a30

select sid, state, event, last_call_et, kxidusn, kxidslt, kxidsqn -- ,x.*, s.*
  from x$ktcxb x, v$session s
 where ktcxbses = saddr and kxidusn != 0 and event = 'log file sync';  


2. Test Output



2.1 Test session Output



    BEFORE--        SID NAME                              VALUE
    --------- ---------- ---------------------------- ----------
    BEFORE---        383 redo entries                          5
    BEFORE---        383 redo size                          1140
    BEFORE---        383 redo synch time (usec)         45940025
    BEFORE---        383 undo change vector size             272
                                                       
                                                       
    AFTER--        SID NAME                              VALUE
    -------- ---------- ------------------------------ --------
    AFTER---        383 redo entries                          7
    AFTER---        383 redo size                          1516
    AFTER---        383 redo synch time (usec)         66985927
    AFTER---        383 undo change vector size             352
Comparing "BEFORE" and "AFTER" session stats, we can see:
   "redo entries"            Increase: 7-5       = 2
   "redo size"               Increase: 1516-1140 = 376
   "undo change vector size" Increase: 352-272   = 80                                                      
Later in redo logfile dump, we can see the two "redo entries" (REDO RECORD) and "undo change vector size".


2.2 Monitor session Output



    SID STATE      EVENT            LAST_CALL_ET    KXIDUSN    KXIDSLT    KXIDSQN
    --- ---------- ---------------- ------------ ---------- ---------- ----------
    383 WAITING    log file sync              15        106         19      48951
We can see that the session has one transaction with:

  xid=KXIDUSN.KXIDSLT.KXIDSQN=106.19.48951=0x6a.13.bf37
which can also be found in redo logfile dump.

If during the hanging, we printout short_stack on Test session, it looks like:

SQL> oradebug short_stack

  ksedsts()+426<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+200<-__sighandler()
  <-semtimedop()+10<-skgpwwait()+187<-ksliwat()+2293<-kslwaitctx()+200
  <-kcrf_commit_force_int()+1866<-ksupop()+3275<-opiodr()+1525<-ttcpip()+1246
  <-opitsk()+1900<-opiino()+936<-opiodr()+1202<-opidrv()+1094<-sou2o()+165
  <-opimai_real()+422<-ssthrdmain()+417<-main()+256<-__libc_start_main()+245
It seems that the transaction for CURSOR DURATION MEMORY temporary table is proceeded with Oracle internal subroutines:

  ktiTxnPoolAllocate ()
  kcrf_commit_force_int ()
  ktuIMTabCacheCommittedTxn ()
CURSOR DURATION MEMORY temporary table is created by "qks3tCrtTmpTabl" during parsing with "kksParseCursor" when first time executed in the system (the created temporary table is a DB-wide shared cursor, not session private).


2.3 redo logfile dump


The complete dump is:

REDO RECORD - Thread:1 RBA: 0x00008e.00000002.0010 LEN: 0x0110 VLD: 0x05 CON_UID: 0
SCN: 0x00000b7305de450c SUBSCN:  1 10/03/2022 08:21:38
(LWN RBA: 0x00008e.00000002.0010 LEN: 0x00000001 NST: 0x0001 SCN: 0x00000b7305de450c)
CHANGE #1 CON_ID:0 TYP:0 CLS:227 AFN:3 DBA:0x00c003d0 OBJ:4294967295 SCN:0x00000b7305de44f7 SEQ:1 OP:5.2 ENC:0 RBL:0 FLG:0x0000
ktudh redo: slt: 0x0013 sqn: 0x0000bf37 flg: 0x0411 siz: 80 fbi: 0
            uba: 0x00c001cb.1f32.07    pxid:  0x0000.000.00000000
CHANGE #2 CON_ID:0 TYP:0 CLS:228 AFN:3 DBA:0x00c001cb OBJ:4294967295 SCN:0x00000b7305de44f6 SEQ:1 OP:5.1 ENC:0 RBL:0 FLG:0x0000
ktudb redo: siz: 80 spc: 7454 flg: 0x0010 seq: 0x1f32 rec: 0x07
            xid:  0x006a.013.0000bf37  
ktubl redo: slt: 19 wrp: 1 flg: 0x0c08 prev dba:  0x00000000 rci: 0 opc: 5.7 [objn: 0 objd: 0 tsn: 0]
[Undo type  ] Regular undo  [User undo done   ]  No  [Last buffer split]  No 
[Temp object]           No  [Tablespace Undo  ]  No  [User only        ]  No 
Begin trans    
 prev ctl uba: 0x00c001cb.1f32.06 prev ctl max cmt scn:  0x00000b7305de40a5 
 prev tx cmt scn:  0x00000b7305de40a6 
 txn start scn:  0xffffffffffffffff  logon user: 49
 prev brb:  0x00c001bf  prev bcl:  0x00000000
BuExt idx: 0 flg2: 0
 
REDO RECORD - Thread:1 RBA: 0x00008e.00000002.0120 LEN: 0x0068 VLD: 0x01 CON_UID: 0
SCN: 0x00000b7305de450d SUBSCN:  1 10/03/2022 08:21:38
CHANGE #1 CON_ID:0 TYP:0 CLS:227 AFN:3 DBA:0x00c003d0 OBJ:4294967295 SCN:0x00000b7305de450c SEQ:1 OP:5.4 ENC:0 RBL:0 FLG:0x0000
ktucm redo: slt: 0x0013 sqn: 0x0000bf37 srt: 0 sta: 9 flg: 0x2 ktucf redo: uba: 0x00c001cb.1f32.07 ext: 2 spc: 7372 fbi: 0 
END OF REDO DUMP
There are two REDO RECORDs (text after ### are added comment), which are the two "redo entries" in session stats:

  First REDO RECORD with: 
     CHANGE #1 OP:5.2    ###  Update rollback segment header - KTURDH    ('undo header' CLS:227)
     CHANGE #2 OP:5.1    ###  Undo block or undo segment header - KTURDB ('undo block'  CLS:228, siz: 80)

  Second REDO RECORD with: 
     CHANGE #1 OP:5.4    ###  Commit transaction (transaction table update) - KTURCM (CLS:227)
For OP code, See Layer and corresponding Operation codes description.

  Layer 5 : Transaction Undo - KCOCOTUN [ktucts.h]
    Opcode 1 : Undo block or undo segment header - KTURDB
    Opcode 2 : Update rollback segment header - KTURDH
    Opcode 3 : Rollout a transaction begin - KTURBG
    Opcode 4 : Commit transaction (transaction table update) - KTURCM - no undo record


2.4 Sql Xplan


Xplan shows that WITH subquery_factoring_clause is treated as a temporary table load:

SQL_ID  gvrzr3dmans6k, child number 0
-------------------------------------
with sq as (select class, rownum r from v$waitstat)   select 
session_id,          (select min(r) from sq) mir,          (select 
min(r) from sq) mar     from dba_hist_active_sess_history     --from 
gv$active_session_history     where rownum <= 3
 
Plan hash value: 1419524593
 
----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                             |       |       |   453 (100)|          |       |       |
|   1 |  SORT AGGREGATE                          |                             |     1 |    13 |            |          |       |       |
|   2 |   VIEW                                   |                             |     1 |    13 |     2   (0)| 00:00:01 |       |       |
|   3 |    TABLE ACCESS FULL                     | SYS_TEMP_0FD9D6B1B_42ABBB2  |     1 |     6 |     2   (0)| 00:00:01 |       |       |
|   4 |  SORT AGGREGATE                          |                             |     1 |    13 |            |          |       |       |
|   5 |   VIEW                                   |                             |     1 |    13 |     2   (0)| 00:00:01 |       |       |
|   6 |    TABLE ACCESS FULL                     | SYS_TEMP_0FD9D6B1B_42ABBB2  |     1 |     6 |     2   (0)| 00:00:01 |       |       |
|   7 |  TEMP TABLE TRANSFORMATION               |                             |       |       |            |          |       |       |
|   8 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6B1B_42ABBB2  |       |       |            |          |       |       |
|   9 |    COUNT                                 |                             |       |       |            |          |       |       |
|* 10 |     FIXED TABLE FULL                     | X$KCBWAIT                   |     1 |     6 |     0   (0)|          |       |       |
|* 11 |   COUNT STOPKEY                          |                             |       |       |            |          |       |       |
|* 12 |    HASH JOIN RIGHT OUTER                 |                             | 29795 |   931K|   449   (0)| 00:00:01 |       |       |
|  13 |     INDEX FAST FULL SCAN                 | WRH$_EVENT_NAME_PK          |  2054 | 28756 |     5   (0)| 00:00:01 |       |       |
|  14 |     PARTITION RANGE ALL                  |                             | 29795 |   523K|   444   (0)| 00:00:01 |     1 |     8 |
|  15 |      TABLE ACCESS FULL                   | WRH$_ACTIVE_SESSION_HISTORY | 29795 |   523K|   444   (0)| 00:00:01 |     1 |     8 |
----------------------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
  10 - filter((MOD("INDX",19)<>0 AND "INST_ID"=USERENV('INSTANCE')))
  11 - filter(ROWNUM<=3)
  12 - access("ASH"."DBID"="EVT"."DBID" AND "ASH"."EVENT_ID"="EVT"."EVENT_ID")
Note that the name of temporary table (SYS_TEMP_0FD9D6B1B_42ABBB2) is changed after each new parsing of the same statement, but SQL_ID (gvrzr3dmans6k), child number (0) and Plan hash value (1419524593) are always the same. (the first number: 0FD9D6B1B increased by 1 after each new parsing, the second number: 42ABBB2 stays the same in each database)


3. INLINE hint


With an INLINE hint, the subquery is treated as an inline view instead of a temporary table. Then there is no more log file sync is observed.

with sq as (select /*+ INLINE */ class, rownum r from v$waitstat)
  select session_id, 
         (select min(r) from sq) mir, 
         (select min(r) from sq) mar
    from dba_hist_active_sess_history
    --from gv$active_session_history
    where rownum <= 3; 
Update (04-Dec-2022)
Blog: ORA-12850: Could not allocate slaves on all specified instances error and a workaround for my scripts talked about GV$ views for reading cluster-wide ASH data and PX slaves used for fetching remote data due to Xplan temp table transformations of materialized WITH subqueries, and provided workaround of "INLINE" hint in some TPT scripts.


4. TEMP TABLE TRANSFORMATION and Performance


As a further test without suspending LGWR, we can run following code for both cases of With/Without INLINE hint and compare their outcome:

create or replace procedure testp (p_sleep_seconds number, p_job_id number, p_hint varchar2 := null) as
  l_start_time    number := dbms_utility.get_time;
  l_cnt           number := 0;
  type num_tab    is table of number;
  l_num_tab_1     num_tab;
  l_num_tab_2     num_tab;
  l_num_tab_3     num_tab;
begin
  loop
    execute immediate q'[
      with sq as (select /*+ ]' ||p_hint||q'[ */ class, rownum r from v$waitstat)
      select session_id, 
            (select min(r) from sq) mir, 
            (select min(r) from sq) mar
       from dba_hist_active_sess_history
      where rownum <= ]' || p_job_id
      bulk collect into l_num_tab_1, l_num_tab_2, l_num_tab_3;

    l_cnt := l_cnt + 1;
    exit when mod(l_cnt, 1000) = 0 and (dbms_utility.get_time - l_start_time) >= 100*p_sleep_seconds;
  end loop;
end;
/

-- exec testp(3, 2);
-- exec testp(3, 2, 'INLINE');

create or replace procedure start_jobs (p_count number, p_sleep_seconds number, p_hint varchar2 := null) as
begin
  for i in 1..p_count loop
    dbms_scheduler.create_job (
      job_name        => 'TEST_JOB_'||i,
      job_type        => 'PLSQL_BLOCK',
      job_action      => 'begin testp('||p_sleep_seconds||','||i||','''||p_hint||'''); end;',    
      start_date      => systimestamp,
      auto_drop       => true,
      enabled         => true);
  end loop;
end;
/


-- Start Test for 10 minutes

exec start_jobs(16, 600);

--exec start_jobs(16, 600, 'INLINE');
During the test, we can check system behaviour with:

select program, event, p1, p2, p3, p1text, p2text, p3text, v.* 
  from v$session v 
 where program like '%(J0%'
 order by v.event; 

select program, event, p1, p2, p3, p1text, p2text, p3text, v.* 
  from v$active_session_history v
 order by sample_time desc, v.event; 
 
select * from v$sysstat 
 where name in ('redo entries', 'redo size', 'undo change vector size', 'redo synch time');
In real OLTP systems, which have many short running queries composed of WITH subquery_factoring_clause transformed as a temporary table, "SQL memory manager workarea list latch" ("Where": qesmmIRegisterWorkArea, qesmmIUnRegisterWorkArea) is observed in top Latch Miss Sources.

"log file sync" is listed in Top 10 Foreground Events with following redo stats:
   log file sync Avg Wait:  1.12 ms
   Redo size per second:      90 K
   Redo entries per second:  240
   Redo size per Redo entry: 380 Bytes
   log switches per Hour:    1

Monday, September 12, 2022

Update Restart and Undo Extends

In Blog: That old restart problem again... there is a comment about update restart and undo extends.

Thereafter two Blogs are composed:
     Update Restart and new Active Undo Extent
     Delete Restart and new Active Undo Extent
which demonstrated that the number of Update Restart (or Delete Restart) is equal to the number of times rollback segment size is extended (V$ROLLSTAT.extends).

In this Blog, we will add more flavors to the code of Blog: Update Restart and new Active Undo Extent to show one-to-one mapping of UpdateRestart-to-UndoExtend.

Furthermore, we track down each Update restart and Undo extend to one respective Oracle internal subrountine call of ktuAddExtent with Dtrace and GDB.

The probable reason of restart is when there is more Undo space requested by an update, it first allocates a new Undo extent, then restarts the update (it is also possible that is an Undo pre-allocation).

Note 1. Tested on Oracle 19.13, 19.7
     2. DB settings:
           temp_undo_enabled                    boolean     TRUE
           undo_management                      string      AUTO
           undo_retention                       integer     3600
           
           _rollback_segment_count              integer     1
           transactions_per_rollback_segment    integer     1

1. Test Code


See Appendix 1. Test Code


2. Test Run and Output


For each test, make a new Sqlplus connection. If not reproduced occasionally (due to "AUTO" undo_management), repeat the test.
We also set the diagnostic events from Blog: That Old Restart Problem Strikes Back: Setting the Stage .

conn k/s@test_db

truncate table delete_data;

insert into delete_data 
  select owner, object_name, rownum id from dba_objects where rownum  < 53001;

commit;

select blocks, bytes/power(2,20) mbytes from user_segments where segment_name = 'DELETE_DATA';
  -- BLOCKS = 384    MBYTES = 3

alter session set max_dump_file_size = UNLIMITED;
alter session set tracefile_identifier = "dml_trc_4a2";
--set diagnostic events in my session:
alter session set events 'trace[dml]:sql_trace wait=true';

exec testp;
Here the Sqlplus output:

--======= Begin Rollstat: USN=107, EXTENTS=24, EXTENDS=115 =======--
Restart#: 1 @ 03:49:44
            doing "AASqdcAAAAABKokABg(id = 5226)" again, trigger called count = 5227
            Undo: l_max_extent_id=21, extent count = 5, extent_id = 21, block_id = 16000, blocks = 128
Restart#: 2 @ 03:49:48
            doing "AASqdcAAAAABKpmABj(id = 16676)" again, trigger called count = 16678
            Undo: l_max_extent_id=22, extent count = 6, extent_id = 22, block_id = 16128, blocks = 128
Restart#: 3 @ 03:49:53
            doing "AASqdcAAAAABLLVABV(id = 28192)" again, trigger called count = 28195
            Undo: l_max_extent_id=23, extent count = 7, extent_id = 23, block_id = 16256, blocks = 128
Restart#: 4 @ 03:49:57
            doing "AASqdcAAAAABLKbACl(id = 39666)" again, trigger called count = 39670
            Undo: l_max_extent_id=24, extent count = 8, extent_id = 24, block_id = 16512, blocks = 128
Restart#: 5 @ 03:50:02
            doing "AASqdcAAAAABLNkAAm(id = 51186)" again, trigger called count = 51191
            Undo: l_max_extent_id=25, extent count = 9, extent_id = 25, block_id = 16640, blocks = 128
--======= End Rollstat: USN=107, EXTENTS=29, EXTENDS=120 =======--
    row     count = 53000
    trigger count = 53005
    restart count = 5
    extents delta = 5
    extends delta = 5
    
Elapsed: 00:00:21.98
The above output shows that 53000 single row updates, trigger fired 53005 times, hence 5 update restarts.
The mapping of DELETE_DATA.id to Undo extent_id / block_id in 5 row restarts is:

  id = 5226      extent_id = 21, block_id = 16000
  id = 16676     extent_id = 22, block_id = 16128
  id = 28192     extent_id = 23, block_id = 16256
  id = 39666     extent_id = 24, block_id = 16512
  id = 51186     extent_id = 25, block_id = 16640   
We can also see the EXTENT_ID / BLOCK_ID of 5 restarts in dba_undo_extents (EXTENT_ID 21 to 25):

select * from dba_undo_extents 
 where segment_name like'_SYSSMU107_%' and status = 'ACTIVE'
 order by extent_id, file_id, block_id;

  OWN SEGMENT_NAME            TABLESPACE_NAME EXTENT_ID FILE_ID  BLOCK_ID    BYTES  BLOCKS RELATIVE_FNO  STATUS
  --- ----------------------- --------------- --------- ------- --------- -------- ------- ------------  ------
  SYS _SYSSMU107_2079059358$  UNDO                   17       3     14208  1048576     128            3  ACTIVE
  SYS _SYSSMU107_2079059358$  UNDO                   18       3     15616  1048576     128            3  ACTIVE
  SYS _SYSSMU107_2079059358$  UNDO                   19       3     15744  1048576     128            3  ACTIVE
  SYS _SYSSMU107_2079059358$  UNDO                   20       3     15872  1048576     128            3  ACTIVE
  
  SYS _SYSSMU107_2079059358$  UNDO                   21       3     16000  1048576     128            3  ACTIVE
  SYS _SYSSMU107_2079059358$  UNDO                   22       3     16128  1048576     128            3  ACTIVE
  SYS _SYSSMU107_2079059358$  UNDO                   23       3     16256  1048576     128            3  ACTIVE
  SYS _SYSSMU107_2079059358$  UNDO                   24       3     16512  1048576     128            3  ACTIVE
  SYS _SYSSMU107_2079059358$  UNDO                   25       3     16640  1048576     128            3  ACTIVE
  
  SYS _SYSSMU107_2079059358$  UNDO                   26       3     16768  1048576     128            3  ACTIVE
  
  10 rows selected.
Further more, we can get Undo Segment info and make block dump:

select * from dba_rollback_segs where segment_id = 107;

  SEGMENT_NAME           OWNER  TABLESPACE_NAME SEGMENT_ID FILE_ID BLOCK_ID INITIAL_EXTENT NEXT_EXTENT MIN_EXTENTS MAX_EXTENTS PCT_INCREASE STATUS  RELATIVE_FNO
  ---------------------- ------ --------------- ---------- ------- -------- -------------- ----------- ----------- ----------- ------------ ------  ------------
  _SYSSMU107_2079059358$ PUBLIC UNDO                   107       3      992         131072       65536           2       32765              ONLINE             3


alter session set max_dump_file_size = UNLIMITED;
alter session set tracefile_identifier = "undo_seg_header_blk_992_4a1";  
alter system dump datafile  3 block 992;	
In the dump, we can also see the extent_id / block_id of 5 restarts in Extent Map
(added comments after "==#" are heximal to decimal conversion. Extent Number:21 to 25)

BH (0x10bf59580) file#: 3 rdba: 0x00c003e0 (3/992) class: 229 ba: 0x10b104000
  TRN TBL::
  index  state cflags  wrap#    uel         scn            dba            parent-xid    nub     stmt_num    cmt
  ------------------------------------------------------------------------------------------------
   0x00    9    0x00  0x5a5d  0x0013  0x00000b7302e37794  0x00c037f8   0x0000.000.00000000  0x00000418   0x00000000  1662040111
   0x01    9    0x00  0x5a59  0x0008  0x00000b7302e376ad  0x00000000   0x0000.000.00000000  0x00000000   0x00000000  1662039990
   0x02    9    0x00  0x5a5a  0x000c  0x00000b7302e377d3  0x00c037f9   0x0000.000.00000000  0x00000001   0x00000000  1662040112
   0x03    9    0x00  0x5a5e  0x0015  0x00000b7302e37675  0x00000000   0x0000.000.00000000  0x00000000   0x00000000  1662039990
   0x04   10    0x80  0x5a5e  0x0011  0x00000b7302e378c5  0x00c04191   0x0000.000.00000000  0x00000419   0x00000000  0
   0x05    9    0x00  0x5a5c  0x001f  0x00000b7302e376ff  0x00000000   0x0000.000.00000000  0x00000000   0x00000000  1662039990
   ....
   0x21    9    0x00  0x5a5d  0x001b  0x00000b7302e3772d  0x00000000   0x0000.000.00000000  0x00000000   0x00000000  1662039990
  EXT TRN CTL::

 Retention Table                               Extent Map                                                       
  ---------------------------------------      ------------------------
 Extent Number:0  Commit Time: 1662039334       0x00c003e1  length: 7                                           
 Extent Number:1  Commit Time: 1662039334       0x00c003d8  length: 8                                           
 Extent Number:2  Commit Time: 1662039334       0x00c01380  length: 128                                         
 ......                                      
 Extent Number:20  Commit Time: 0               0x00c03e00  length: 128  
                                        
 Extent Number:21  Commit Time: 0               0x00c03e80  length: 128      ==# 0x00c03e80 = 3/16000                                  
 Extent Number:22  Commit Time: 0               0x00c03f00  length: 128      ==# 0x00c03f00 = 3/16128              
 Extent Number:23  Commit Time: 0               0x00c03f80  length: 128      ==# 0x00c03f80 = 3/16256              
 Extent Number:24  Commit Time: 0               0x00c04080  length: 128      ==# 0x00c04080 = 3/16512              
 Extent Number:25  Commit Time: 0               0x00c04100  length: 128      ==# 0x00c04100 = 3/16640
                                     
 Extent Number:26  Commit Time: 0               0x00c04180  length: 128                                         
 Extent Number:27  Commit Time: 1662039334      0x00c04000  length: 128                                         
 Extent Number:28  Commit Time: 1662039334      0x00c0a800  length: 128   
Now look the trace file of diagnostic events.

Two related FETCH and UPDATE cursors are:

PARSING IN CURSOR #140024761145664 len=48 dep=1 uid=49 oct=3 lid=49 tim=8901333911119 hv=2815500193 ad='9b939860' sqlid='f4qphfamx27x1'
SELECT * FROM DELETE_DATA ORDER BY ID FOR UPDATE

PARSING IN CURSOR #140024754434376 len=62 dep=1 uid=49 oct=6 lid=49 tim=8901334052791 hv=4254591981 ad='9a87f850' sqlid='7k30h4gytguzd'
UPDATE DELETE_DATA SET OWNER = LOWER(OWNER) WHERE ROWID = :B1 
Pick 3 consecutive rows from trace file: DELETE_DATA.id: 5225 (c2 35 1a), 5226 (c2 35 1b), and 5227 (c2 35 1c).

For a normal row update, for example, DELETE_DATA.id: 5225 ('c2 35 1a') with rowid 004aa75c.0004aa24.5f = 4892508.305700.95, it looks like:

FETCH #140024761145664:c=2,e=2,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,plh=3817064686,tim=8901335815921
updSetExecCmpColInfo: not RHS: objn=4892505, cid=3
  kduukcmpf=0x7f5a0e3ac1a2, kduukcmpl=0x7f5a0e3ac1a0, kduukcmpp=(nil)
updThreePhaseExe: objn=4892505 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378df  xid: 0x006b.004.00005a5e  uba: 0x00c03efc.1732.57
updrow: objn=4892505 phase=NOT LOCKED
updicf: SAVE phase=NOT LOCKED, cid=3, skflag=x1
	kccflg=x20, kccfl2=x100
updicf: SAVE kduukcmpl 3
	kduukcmpp: c2 35 1a             
updrow: kauupd objn:4892505 table:0 rowMigrated:FALSE  rowid 004aa75c.0004aa24.5f code 0
EXEC #140024754434376:c=299,e=299,p=0,cr=1,cu=2,mis=0,r=1,dep=1,og=1,plh=1680159706,tim=8901335816231
CLOSE #140024754434376:c=0,e=0,dep=1,type=3,tim=8901335816253
For 5 update restart rows, there are 5 respective lines containing "error=1551", all of which are similar to following example of DELETE_DATA.id: 5226 ('c2 35 1b') with rowid: "AASqdcAAAAABKokABg" = 004aa75c.0004aa24.60 = 4892508.305700.96:

FETCH #140024761145664:c=3,e=2,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,plh=3817064686,tim=8901335816264
updSetExecCmpColInfo: not RHS: objn=4892505, cid=3
  kduukcmpf=0x7f5a0e3b12ba, kduukcmpl=0x7f5a0e3b12b8, kduukcmpp=(nil)
updThreePhaseExe: objn=4892505 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378df  xid: 0x006b.004.00005a5e  uba: 0x00c03efc.1732.58
updrow: objn=4892505 phase=NOT LOCKED
updicf: SAVE phase=NOT LOCKED, cid=3, skflag=x1
	kccflg=x20, kccfl2=x100
updicf: SAVE kduukcmpl 3
	kduukcmpp: c2 35 1b

-------------- START of extra lines for restart --------------
updrow: objn=4892505 error=1551
updrow: qecinvsub objn=4892505
updrow: setting ROW_RETRY objn=4892505
updrow: retry_this_row: ROW_RETRY set, objn= 4892505 phase=NOT LOCKED
updicf: RETRY phase=NOT LOCKED, cid=3, skflag=x1
	kccflg=x20, kccfl2=x100
updicf: RETRY kduukcmpl 3
	kduukcmpp: c2 35 1b

*** 15:49:44.051090
WAIT #140024754434376: nam='PL/SQL lock timer' ela= 999222 duration=0 p2=0 p3=0 obj#=-1 tim=8901336816048
doing "AASqdcAAAAABKokABg(id = 5226)" again, trigger called count = 5227
-------------- END of extra lines for restart --------------
......
updrow: kauupd objn:4892505 table:0 rowMigrated:FALSE  rowid 004aa75c.0004aa24.60 code 0
EXEC #140024754434376:c=3929,e=1002883,p=0,cr=69,cu=17,mis=0,r=1,dep=1,og=1,plh=1680159706,tim=8901336819159
CLOSE #140024754434376:c=0,e=0,dep=1,type=3,tim=8901336819249
Here trace for DELETE_DATA.id: 5227 ('c2 35 1c') with rowid 004aa75c.0004aa24.61 = 4892508.305700.97:

FETCH #140024761145664:c=6,e=6,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,plh=3817064686,tim=8901336819272
updSetExecCmpColInfo: not RHS: objn=4892505, cid=3
  kduukcmpf=0x7f5a0e3ac1a2, kduukcmpl=0x7f5a0e3ac1a0, kduukcmpp=(nil)
updThreePhaseExe: objn=4892505 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378df  xid: 0x006b.004.00005a5e  uba: 0x00c03efd.1732.01
updrow: objn=4892505 phase=NOT LOCKED
updicf: SAVE phase=NOT LOCKED, cid=3, skflag=x1
	kccflg=x20, kccfl2=x100
updicf: SAVE kduukcmpl 3
	kduukcmpp: c2 35 1c
updrow: kauupd objn:4892505 table:0 rowMigrated:FALSE  rowid 004aa75c.0004aa24.61 code 0
EXEC #140024754434376:c=352,e=351,p=0,cr=1,cu=2,mis=0,r=1,dep=1,og=1,plh=1680159706,tim=8901336819641
CLOSE #140024754434376:c=1,e=0,dep=1,type=3,tim=8901336819661
Comparing normal update and update restart, we mark extra lines between START and END. They are all about "RETRY" with cid=3 (dc_rollback_segments) in "phase=NOT LOCKED". One notable line is "updicf: RETRY kduukcmpl 3".

In both case, "FETCH #140024761145664" is executed only once to fetch one row (r=1).

Here the lines of 5 update restarts with uba, kduukcmpp, and dbms_system.ksdwrt messages:

updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378df  xid: 0x006b.004.00005a5e  uba: 0x00c03efc.1732.58
	kduukcmpp: c2 35 1b
*** 15:49:44.051090
doing "AASqdcAAAAABKokABg(id = 5226)" again, trigger called count = 5227

updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378e0  xid: 0x006b.004.00005a5e  uba: 0x00c03f7c.1733.5a
	kduukcmpp: c3 02 43 4d
*** 15:49:48.676064
doing "AASqdcAAAAABKpmABj(id = 16676)" again, trigger called count = 16678

updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378e5  xid: 0x006b.004.00005a5e  uba: 0x00c03ffc.1734.5a
	kduukcmpp: c3 03 52 5d
*** 15:49:53.312088
doing "AASqdcAAAAABLLVABV(id = 28192)" again, trigger called count = 28195

updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378e8  xid: 0x006b.004.00005a5e  uba: 0x00c040fc.1735.5a
	kduukcmpp: c3 04 61 43
*** 15:49:57.950063
doing "AASqdcAAAAABLKbACl(id = 39666)" again, trigger called count = 39670

updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e378eb  xid: 0x006b.004.00005a5e  uba: 0x00c0417c.1736.5a
	kduukcmpp: c3 06 0c 57
*** 15:50:02.548055
doing "AASqdcAAAAABLNkAAm(id = 51186)" again, trigger called count = 51191
We can see that all DELETE_DATA.id of 5 restarts are marked with kduukcmpp:

	kduukcmpp: c2 35 1b     = 5226	
	kduukcmpp: c3 02 43 4d  = 16676	
	kduukcmpp: c3 03 52 5d  = 28192	
	kduukcmpp: c3 04 61 43  = 39666	
	kduukcmpp: c3 06 0c 57  = 51186
Pick uba of 5 corresponding restarts:

  scn: 0x00000b7302e378df  xid: 0x006b.004.00005a5e  uba: 0x00c03efc.1732.58
  scn: 0x00000b7302e378e0  xid: 0x006b.004.00005a5e  uba: 0x00c03f7c.1733.5a
  scn: 0x00000b7302e378e5  xid: 0x006b.004.00005a5e  uba: 0x00c03ffc.1734.5a
  scn: 0x00000b7302e378e8  xid: 0x006b.004.00005a5e  uba: 0x00c040fc.1735.5a
  scn: 0x00000b7302e378eb  xid: 0x006b.004.00005a5e  uba: 0x00c0417c.1736.5a
and map them to file_id / block_id:
                                                                                                                          
  uba: 0x00c03efc = 3/16124
  uba: 0x00c03f7c = 3/16252
  uba: 0x00c03ffc = 3/16380
  uba: 0x00c040fc = 3/16636
  uba: 0x00c0417c = 3/16764          
then compare with respective file_id / block_id in dba_undo_extents (or Extent Map):

  0x00c03e80 = 3/16000 
  0x00c03f00 = 3/16128 
  0x00c03f80 = 3/16256 
  0x00c04080 = 3/16512 
  0x00c04100 = 3/16640 
all are with a difference of 124:

  16124 - 16000 = 124         
  16252 - 16128 = 124 
  16380 - 16256 = 124 
  16636 - 16512 = 124 
  16764 - 16640 = 124 
Above dba_undo_extents output shows that each extent contains 128 blocks. Probably when there are only 4 free blocks remaining (124 blocks are already used), Oracle makes an Undo pre-allocation (pre-extention).

Look above 3 consecutive row updates and their respective uba: row DELETE_DATA.id: 5225 without restart, 5226 ('c2 35 1b') with restart, 5227 ('c2 35 1c') without restart:

  DELETE_DATA.id: 5225 ('c2 35 1a')  uba: 0x00c03efc.1732.57
  DELETE_DATA.id: 5226 ('c2 35 1b')  uba: 0x00c03efc.1732.58
  DELETE_DATA.id: 5227 ('c2 35 1c')  uba: 0x00c03efd.1732.01
Both row 5225 and 5226 are in the same Undo block (uba: 0x00c03efc= 3/16124) of EXTENT_ID = 21, and row 5226 is stored as last slot (0x58).
But row 5227 is jumped to next Undo block (uba: 0x00c03efd= 3/16125) of same EXTENT_ID = 21 and stored in first slot(0x01).

In average, each Undo block is used for 53000(total rows)/5(Undo extents)/128(blocks per extent) = 83 rows, which is close to 0x58 = 88.

Above dba_undo_extents query output shows:

  EXTENT_ID = 21 has 128 BLOCKS, which are BLOCK_ID 16000 to 16127
  EXTENT_ID = 22 has 128 BLOCKS, which are BLOCK_ID 16128 to 16255
so both uba (0x00c03efc= 3/16124 and 0x00c03efd= 3/16125) are still in the same EXTENT_ID = 21. Hence we could probably say that Oracle makes an Undo pre-allocation (pre-extention) when there are only 4 free blocks remaining from 128 allocated blocks.

That seems exactly what ORA-01551 Docu wrote:

  $> oerr ora 1551
       01551, 00000, "extended rollback segment, pinned blocks released"
       // *Cause: Doing recursive extent of rollback segment, trapped internally
       //        by the system
       // *Action: None
Restart is caused by creating Undo recursive extent with internally trapped error "1551" (possibly restart is performed in "1551" exeception handler).

By the way, if there is no trigger (drop trigger delete_data_bt), 3 consecutive updates are as follows.

Update 1 has no "error=1551";

Update 2 has "error=1551", and three extra lines starting with "updrowFastPath", last extra line marked with "ROW_RETRY".
         Both Update 1 and Update 2 are in same Undo block "0x00c01efc", Update 2 is at last slot "5a" (0x5a=90) of "0x00c01efc".

Update 3 has no "error=1551"; It is in next Undo block "0x00c01efd". It is at first slot "01" of "0x00c01efd".

-- Update 1
FETCH #140562258761720:c=1,e=1,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,plh=3817064686,tim=9006876120842
updThreePhaseExe: objn=4892673 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e9971b  xid: 0x0069.000.00005af3  uba: 0x00c01efc.1757.59 
EXEC #140562256908616:c=186,e=186,p=0,cr=0,cu=1,mis=0,r=1,dep=1,og=1,plh=1680159706,tim=9006876121038
CLOSE #140562256908616:c=0,e=0,dep=1,type=3,tim=9006876121052

-- Update 2
FETCH #140562258761720:c=1,e=1,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,plh=3817064686,tim=9006876121062
updThreePhaseExe: objn=4892673 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e9971b  xid: 0x0069.000.00005af3  uba: 0x00c01efc.1757.5a 
	updrowFastPath: objn=4892673 error=1551
	updrowFastPath: qecinvsub objn=4892673
	updrowFastPath: setting ROW_RETRY objn=4892673
EXEC #140562256908616:c=118,e=379,p=0,cr=1,cu=13,mis=0,r=1,dep=1,og=1,plh=1680159706,tim=9006876121451
CLOSE #140562256908616:c=0,e=0,dep=1,type=3,tim=9006876121507

-- Update 3
FETCH #140562258761720:c=2,e=2,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,plh=3817064686,tim=9006876121521
updThreePhaseExe: objn=4892673 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e9971b  xid: 0x0069.000.00005af3  uba: 0x00c01efd.1757.01
EXEC #140562256908616:c=192,e=192,p=0,cr=0,cu=1,mis=0,r=1,dep=1,og=1,plh=1680159706,tim=9006876121726
CLOSE #140562256908616:c=1,e=1,dep=1,type=3,tim=9006876121739


3. Dtrace and GDB


We can further use Dtrace and GDB to reveal that each Update restart / Undo extend is one call of Oracle internal subroutine: ktuAddExtent.


3.1 Dtrace


First, build a clean test data in a new Sqlplus session:

conn k/s@test_db_solaris

truncate table delete_data;

insert into delete_data 
  select owner, object_name, rownum id from dba_objects where rownum  < 53001;

commit;
Then, find its UNIX process id (pid=1234), start Dtrace;

$ sudo dtrace -n 'pid$target:oracle:ktuAddExtent:entry 
{ 
  printf("\n---- at %Y ", walltimestamp);
  printf("DBA_ROLLBACK_SEGS.segment_id(arg0)=%d, DBA_UNDO_EXTENTS.extent_id(arg2)=%d\n", arg0, arg2);
  ustack();    
}'  -p 1234
Run test, the output shows 5 update restarts and its respective Undo USN/(extent_id, block_id)

19:55:21 SQL > exec testp;
--======= Begin Rollstat: USN=131, EXTENTS=6, EXTENDS=2 =======--
Restart#: 1 @ 07:55:55
            doing "AAQVWNAAAAAMeQGAAJ(id = 5139)" again, trigger called count = 5140
            Undo: l_max_extent_id=6, extent count = 6, extent_id = 6, block_id = 11264, blocks = 128
Restart#: 2 @ 07:55:56
            doing "AAQVWNAAAAAMeRHAAA(id = 16588)" again, trigger called count = 16590
            Undo: l_max_extent_id=7, extent count = 7, extent_id = 7, block_id = 11264, blocks = 128
Restart#: 3 @ 07:55:58
            doing "AAQVWNAAAAAMePDACw(id = 28105)" again, trigger called count = 28108
            Undo: l_max_extent_id=8, extent count = 8, extent_id = 8, block_id = 11264, blocks = 128
Restart#: 4 @ 07:55:59
            doing "AAQVWNAAAAAMeONAAr(id = 39578)" again, trigger called count = 39582
            Undo: l_max_extent_id=9, extent count = 9, extent_id = 9, block_id = 11264, blocks = 128
Restart#: 5 @ 07:56:01
            doing "AAQVWNAABAAAHpGAAB(id = 51098)" again, trigger called count = 51103
            Undo: l_max_extent_id=10, extent count = 10, extent_id = 10, block_id = 11264, blocks = 128
--======= End Rollstat: USN=131, EXTENTS=11, EXTENDS=7 =======--
    row     count = 53000
    trigger count = 53005
    restart count = 5
    extents delta = 5
    extends delta = 5
Elapsed: 00:00:07.45
Here dba_undo_extents rows after test (no ACTIVE rows before test):

SQL > select * from dba_undo_extents 
       where segment_name like'_SYSSMU131_%' and status = 'ACTIVE'
       order by extent_id, file_id, block_id;


OWN SEGMENT_NAME           TABLESPACE_NAME EXTENT_ID FILE_ID BLOCK_ID   BYTES BLOCKS RELATIVE_FNO COMMIT_WTIME STATUS
--- ---------------------- --------------- --------- ------- -------- ------- ------ ------------ ------------ ------
SYS _SYSSMU131_3111742082$ UNDO                    0       3     1264   65536      8            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    1       3     1256   65536      8            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    2       3    59520 1048576    128            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    3       3    59648 1048576    128            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    4       3    60928 1048576    128            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    5       3    61056 1048576    128            3              ACTIVE

SYS _SYSSMU131_3111742082$ UNDO                    6       3    62464 1048576    128            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    7       3    62592 1048576    128            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    8       3    63872 1048576    128            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                    9       3    10112 1048576    128            3              ACTIVE
SYS _SYSSMU131_3111742082$ UNDO                   10       3    11264 1048576    128            3              ACTIVE

11 rows selected.
Here the output of Dtrace (edited). Probe ktuAddExtent is fired 7 times, each of which results in one new extent (DBA_UNDO_EXTENTS.extent_id: 4-10).

The timestamp shows that extent_id=4 and 5 are at (almost) same time (19:55:54), but then (almost) each second one new extent_id since we call "dbms_session.sleep(1) in "trigger delete_data_bt".

 ID      FUNCTION:NAME
 89987   ktuAddExtent:entry
-------- at 19:55:54 DBA_ROLLBACK_SEGS.segment_id(arg0)=131, DBA_UNDO_EXTENTS.extent_id(arg2)=4
-------- at 19:55:54 DBA_ROLLBACK_SEGS.segment_id(arg0)=131, DBA_UNDO_EXTENTS.extent_id(arg2)=5

-------- at 19:55:54 DBA_ROLLBACK_SEGS.segment_id(arg0)=131, DBA_UNDO_EXTENTS.extent_id(arg2)=6
-------- at 19:55:55 DBA_ROLLBACK_SEGS.segment_id(arg0)=131, DBA_UNDO_EXTENTS.extent_id(arg2)=7
-------- at 19:55:57 DBA_ROLLBACK_SEGS.segment_id(arg0)=131, DBA_UNDO_EXTENTS.extent_id(arg2)=8
-------- at 19:55:58 DBA_ROLLBACK_SEGS.segment_id(arg0)=131, DBA_UNDO_EXTENTS.extent_id(arg2)=9
-------- at 19:56:00 DBA_ROLLBACK_SEGS.segment_id(arg0)=131, DBA_UNDO_EXTENTS.extent_id(arg2)=10
Now look ustack:

DBA_UNDO_EXTENTS.extent_id(arg2)=4/5    DBA_UNDO_EXTENTS.extent_id(arg2)=6/7/8/9/10
------------------------------------    -------------------------------------------
oracle`ktuAddExtent                     oracle`ktuAddExtent            
oracle`ktusmaex+0x170                   oracle`ktusmaex+0x170          
oracle`ktusmasp1r+0xf7                  oracle`ktusmasp1r+0xf7         
oracle`ktrsexec+0x228                   oracle`ktrsexec+0x228          
oracle`ktuaspResumableInPdb+0xb         oracle`ktuaspResumableInPdb+0xb
oracle`ktusmasp+0x290                   oracle`ktusmasp+0x290          
oracle`ktuchg2+0x633                    oracle`ktuchg2+0x633           
oracle`ktbchg2+0x113                    oracle`ktbchg2+0x113           
oracle`kddchg+0x397                     oracle`kddchg+0x397            
oracle`kddlok+0x922                     oracle`kduovw+0x1296           
a.out`kddlkr+0x12a                      oracle`kduurp+0x446            
a.out`qerfuProcessFetchedRow+0x2ec      oracle`kdusru+0x4f3            
a.out`qerfuLockingRowProcedure+0x45     oracle`kauupd+0x1ec            
a.out`qerstRowP+0x587                   oracle`updrow+0x6a0            
a.out`qersoFetchSimple+0x141            a.out`qerupUpdRow+0x355        
a.out`qersoFetch+0xbe                   oracle`qerupFetch+0x1e9        
a.out`qerstFetch+0x542                  a.out`qerstFetch+0x5c0         
a.out`qerfuStart+0x68c                  oracle`updaul+0x678            
a.out`qerstStart+0x4d2                  oracle`updThreePhaseExe+0x14e  
a.out`selexe0+0x87a                     oracle`updexe+0x1ec            
Left column is ustack for 2 extents (extent_id=4/5);
Right column is ustack for 5 (extent_id=6/7/8/9/10).

Left column is the code path for the execution of "select for update" in

  procedure testp 
    cursor l_delete_csr is select * from delete_data order by id for update;
which contains "selexe0" and "qerfuLockingRowProcedure" (no new extents is generated if select without "for update").

right column is the code path for the execution of "update" in:

  procedure testp
    update delete_data set owner = lower(owner) where current of l_delete_csr;
which shows "updexe" and "updThreePhaseExe".

Note*: On Solaris test with Dtrace, we use default setting of "transactions_per_rollback_segment" and "_rollback_segment_count"


3.2 GDB


Similar to Dtrace, at first, build clean test data in a new Sqlplus session:

conn k/s@test_db_linux

truncate table delete_data;

insert into delete_data 
  select owner, object_name, rownum id from dba_objects where rownum  < 53001;

commit;
Compose a GDB command script (gdb_cmd_undo_extend) as follows:

------- gdb_cmd_undo_extend:   gdb -x gdb_cmd_undo_extend -p 6789 -----
set pagination off
set logging file /tmp/gdb_cmd_undo_extend.log
set logging overwrite on
set logging on
set $k = 0

break *ktuAddExtent
command 
printf "<<<=== (%i). DBA_ROLLBACK_SEGS.segment_id ($rdi)=%i, DBA_UNDO_EXTENTS.extent_id($rax)=%i ===>>>\n", ++$k, $rdi, $rax
bt         
continue
end
Then, find its UNIX process id (pid=6789) of Sqlplus session, start GDB command script on it:

$gdb -x gdb_cmd_undo_extend -p 6789 
Run test:

SQL > exec testp;
The output shows 5 update restarts and its respective Undo USN/extent_id

21:25:26 SQL> exec testp;
--======= Begin Rollstat: USN=105, EXTENTS=8, EXTENDS=160 =======--
Restart#: 1 @ 09:25:50
            doing "AASqnfAAAAABKolAAP(id = 5374)" again, trigger called count = 5375
            Undo: l_max_extent_id=6, extent count = 5, extent_id = 6, block_id = 9472, blocks = 128
Restart#: 2 @ 09:25:51
            doing "AASqnfAAAAABKpnABz(id = 16828)" again, trigger called count = 16830
            Undo: l_max_extent_id=7, extent count = 6, extent_id = 7, block_id = 9728, blocks = 128
Restart#: 3 @ 09:25:52
            doing "AASqnfAAAAABLLZABd(id = 28344)" again, trigger called count = 28347
            Undo: l_max_extent_id=8, extent count = 7, extent_id = 8, block_id = 9856, blocks = 128
Restart#: 4 @ 09:25:54
            doing "AASqnfAAAAABLKfABz(id = 39815)" again, trigger called count = 39819
            Undo: l_max_extent_id=9, extent count = 8, extent_id = 9, block_id = 9984, blocks = 128
Restart#: 5 @ 09:25:55
            doing "AASqnfAAAAABLNkAC7(id = 51335)" again, trigger called count = 51340
            Undo: l_max_extent_id=10, extent count = 9, extent_id = 10, block_id = 10240, blocks = 128
--======= End Rollstat: USN=105, EXTENTS=13, EXTENDS=165 =======--
    row     count = 53000
    trigger count = 53005
    restart count = 5
    extents delta = 5
    extends delta = 5
Elapsed: 00:00:07.05
Here dba_undo_extents rows after test (no ACTIVE rows before test):

SQL > select * from dba_undo_extents 
       where segment_name like'_SYSSMU105_%' and status = 'ACTIVE'
       order by extent_id, file_id, block_id;
 
OWN SEGMENT_NAME          TABLESPACE_NAME EXTENT_ID FILE_ID BLOCK_ID   BYTES BLOCKS RELATIVE_FNO COMMIT_WTIME STATUS
--- --------------------- --------------- --------- ------- -------- ------- ------ ------------ ------------ ------
SYS _SYSSMU105_589038956$ UNDO                    2       3     7808 1048576    128            3              ACTIVE
SYS _SYSSMU105_589038956$ UNDO                    3       3    16640 1048576    128            3              ACTIVE
SYS _SYSSMU105_589038956$ UNDO                    4       3     8320 1048576    128            3              ACTIVE
SYS _SYSSMU105_589038956$ UNDO                    5       3     8448 1048576    128            3              ACTIVE

SYS _SYSSMU105_589038956$ UNDO                    6       3     9472 1048576    128            3              ACTIVE
SYS _SYSSMU105_589038956$ UNDO                    7       3     9728 1048576    128            3              ACTIVE
SYS _SYSSMU105_589038956$ UNDO                    8       3     9856 1048576    128            3              ACTIVE
SYS _SYSSMU105_589038956$ UNDO                    9       3     9984 1048576    128            3              ACTIVE
SYS _SYSSMU105_589038956$ UNDO                   10       3    10240 1048576    128            3              ACTIVE

SYS _SYSSMU105_589038956$ UNDO                   11       3    10752 1048576    128            3              ACTIVE

10 rows selected.
Here the output of GDB (edited). Breakpoint ktuAddExtent is hit 8 times, each of which results in one new extent (DBA_UNDO_EXTENTS.extent_id: 3-10).

Breakpoint 1, 0x0000000001b8ea20 in ktuAddExtent ()

<<<=== (1). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=3 ===>>>
<<<=== (2). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=4 ===>>>
<<<=== (3). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=5 ===>>>

<<<=== (4). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=6 ===>>>
<<<=== (5). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=7 ===>>>
<<<=== (6). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=8 ===>>>
<<<=== (7). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=9 ===>>>
<<<=== (8). DBA_ROLLBACK_SEGS.segment_id ($rdi)=105, DBA_UNDO_EXTENTS.extent_id($rax)=10 ===>>>
Now look GDB backtrace:

BA_UNDO_EXTENTS.extent_id($rax)=3/4/5    DBA_UNDO_EXTENTS.extent_id(arg2)=6/7/8/9/10
-------------------------------------    -------------------------------------------
#0  ktuAddExtent ()                      #0   ktuAddExtent ()              
#1  ktusmaex ()                          #1   ktusmaex ()                  
#2  ktusmasp1r ()                        #2   ktusmasp1r ()                
#3  ktrsexecExecuteInExcepHdlr ()        #3   ktrsexecExecuteInExcepHdlr ()
#4  ktrsexec ()                          #4   ktrsexec ()                  
#5  ktuaspResumableInPdb ()              #5   ktuaspResumableInPdb ()      
#6  ktusmasp ()                          #6   ktusmasp ()                  
#7  ktuchg2 ()                           #7   ktuchg2 ()                   
#8  ktbchg2 ()                           #8   ktbchg2 ()                   
#9  kddchg ()                            #9   kddchg ()                    
#10 kddlok ()                            #10  kduovw ()                    
#11 kddlkr ()                            #11  kduurp ()                    
#12 qerfuProcessFetchedRow ()            #12  kdusru ()                    
#13 qerfuLockingRowProcedure ()          #13  kauupd ()                    
#14 qersoFetchSimple ()                  #14  updrow ()                    
#15 qersoFetch ()                        #15  qerupUpdRow ()               
#16 qerfuStart ()                        #16  qerupFetch ()                
#17 selexe0 ()                           #17  updaul ()                    
#18 opiexe ()                            #18  updThreePhaseExe ()          
#19 opipls ()                            #19  updexe ()                    
#20 opiodr ()                            #20  opiexe ()                    
#21 rpidrus ()                           #21  opipls ()                    
#22 skgmstack ()                         #22  opiodr ()                    
#23 rpidru ()                            #23  rpidrus ()                   
#24 rpiswu2 ()                           #24  skgmstack ()                 
#25 rpidrv ()                            #25  rpidru ()                    
#26 psddr0 ()                            #26  rpiswu2 ()                   
#27 psdnal ()                            #27  rpidrv ()                    
#28 pevm_EXECC ()                        #28  psddr0 ()                    
#29 pfrinstr_EXECC ()                    #29  psdnal ()                    
#30 pfrrun_no_tool ()                    #30  pevm_EXECC ()                
#31 pfrrun ()                            #31  pfrinstr_EXECC ()            
#32 plsql_run ()                         #32  pfrrun_no_tool ()            
#33 peicnt ()                            #33  pfrrun ()                    
#34 kkxexe ()                            #34  plsql_run ()        
Left column is backtrace for 3 extents (extent_id=3/4/5);
right column is backtrace for 5 (extent_id=6/7/8/9/10).

Left column is the code path for "select for update" in

  procedure testp 
    cursor l_delete_csr is select * from delete_data order by id for update;
which contains "selexe0" and "qerfuLockingRowProcedure" (no new extents is generated if select without "for update").

right column is the code path for "update" in:

  procedure testp
    update delete_data set owner = lower(owner) where current of l_delete_csr;
which shows "updexe" and "updThreePhaseExe".


4. Discussions


Update Restart can be looked in different aspects, for example,
     -. Single vs. multiple rows;
     -. With vs. Without other sessions interfering;
     -. One vs. Three Phases.


4.1 Three Phase Update Restart


Two recent Blogs:
     Oracle Write Consistency and ORA-00600: [13030], [20]
     Oracle Write Consistency and ORA-30926: "unable to get a stable set of rows in the source tables"
showed the update restart of multiple rows in every update execution with other sessions interfering, in which we also listed a few well deep Related Works on this area.

Blog: That Old Restart Problem Strikes Back: Setting the Stage showed the update restart of multiple rows in every update execution without other sessions interfering (see repeated test in Appendix 2).

The test code in Book: Expert Oracle Database Architecture (Thomas Kyte, Darl Kuhn, 3rd Edition) Page: 270 shows the update restart of single row in every update execution with other sessions interfering (see test in Appendix 4).

All those tests showed three phases, which are ordered as "NOT LOCKED", "LOCK", "ALL LOCKED":

   updThreePhaseExe: objn=3122646 phase=NOT LOCKED
   updThreePhaseExe: objn=3122646 phase=LOCK
   updThreePhaseExe: objn=3122646 phase=ALL LOCKED


4.2 One Phase Update Restart


In this Blog, we looked the update restart of single row in every update execution without other session interfering.

In this Blog, DML UTS (Unified Tracing Service) only showed one phase, which is "NOT LOCKED":

   updThreePhaseExe: objn=4892505 phase=NOT LOCKED 
In fact, if we set 10200 Consistent Reads, and repeat the test:

alter session set tracefile_identifier = 'trace_10200_10046_test_2';
alter session set events '10046 trace name context forever, level 12';
-- alter session set events '10051 trace name context forever, level 1';
alter session set events '10200 trace name context forever, level 10';
--set diagnostic events in my session:
alter session set events 'trace[dml]:sql_trace wait=true';

exec testp;
The trace shows that every update (whatever restart or not) has only phase=NOT LOCKED, and has the same calls of ktrget2 and ktrgcm:

updThreePhaseExe: objn=4892505 phase=NOT LOCKED
updaul: phase is NOT LOCKED

ktrget2(): started for block  <0x0d8f : 0x0004aa35> objd: 0x004aa7fb
...
ktrgcm(): completed for block  <0x0d8f : 0x0004aa35> objd: 0x004aa7fb
ktrget2(): completed for  block <0x0d8f : 0x0004aa35> objd: 0x004aa7fb


5. Summary


This Blog shows the update restart of single row in every update execution without other sessions interfering, which is caused by new Undo extends.

Since every update execution has only one phase, i.e. "NOT LOCKED", and has only one "FETCH" for Consistent Read, it is hard to say that such update restart is a real "Restart".

It is different to the other update restart, which is caused by row locking from other sessions interfering.

By the way, we also use the same above diagnostic events to run the test in Blog: Delete Restart and new Active Undo Extent) the output shows update restarts, but the trace file does not contain anything like:


  updThreePhaseExe
  phase=NOT LOCKED
  error=1551


Appendix 1. Test Code


alter system set transactions_per_rollback_segment=1 scope=spfile;
alter system set "_rollback_segment_count"=1         scope=spfile;
      --alter system reset "_rollback_segment_count" scope=spfile;
  --Re-Start DB

create or replace function get_undo_extent_cnt return number
as
  l_cnt number;
begin   
 select count(*) into l_cnt from dba_undo_extents where status = 'ACTIVE';
 return l_cnt;
end;
/

create or replace function get_undo_extent_info return varchar2
as
  l_xid         varchar2(100);
  l_usn         number;
  l_extent_cnt  number;
  l_last_extent varchar2(100);
  l_max_extent_id number;
begin  
 l_xid := dbms_transaction.local_transaction_id;
 l_usn := substr(dbms_transaction.local_transaction_id, 1, instr(dbms_transaction.local_transaction_id, '.') -1);
 
 -- all stats only on ACTIVE extent
 select count(*) 
   into l_extent_cnt 
   from dba_undo_extents 
  where instr(segment_name, '_SYSSMU'||l_usn||'_') > 0 and status = 'ACTIVE';
 
 select max(extent_id) 
   into l_max_extent_id
   from dba_undo_extents 
  where instr(segment_name, '_SYSSMU'||l_usn||'_') > 0 and status = 'ACTIVE'; 
                 
 select ', extent_id = '||extent_id||', block_id = '||block_id||', blocks = '|| blocks 
   into l_last_extent
   from dba_undo_extents
  where instr(segment_name, '_SYSSMU'||l_usn||'_') > 0 and status = 'ACTIVE'
    and extent_id = l_max_extent_id;
 
 return 'l_max_extent_id='||l_max_extent_id ||', extent count = '||l_extent_cnt||l_last_extent;
end;
/

create or replace package pacepack
as
  type array is table of number index by varchar2(40);
  g_data                  array;
  g_cnt                   number;
  g_restart_cnt           number;
  g_undo_extent_cnt       number;
end pacepack;
/

drop table delete_data;

create table delete_data as
  select owner, object_name, rownum id from dba_objects where 0 = 1;
    
create or replace trigger delete_data_bt
  before delete or update on delete_data
  for each row
declare
  l_msg varchar2(100);
begin
  pacepack.g_cnt        := pacepack.g_cnt + 1;
  if ( pacepack.g_data.exists(rowidtochar(:old.rowid)))
  then
    dbms_session.sleep(1);
    pacepack.g_restart_cnt := pacepack.g_restart_cnt + 1;
    
    l_msg := 'doing "'||:old.rowid ||'(id = '||:old.id||')" again, '||'trigger called count = '||pacepack.g_cnt;
    sys.dbms_system.ksdwrt(3, l_msg);
    dbms_output.put_line('Restart#: '||pacepack.g_restart_cnt||' @ ' ||(to_char(sysdate, 'hh:mi:ss')));
    dbms_output.put_line(lpad(chr(32), 12, chr(32))||l_msg);
    dbms_output.put_line(lpad(chr(32), 12, chr(32))||'Undo: '||get_undo_extent_info);
  else
    pacepack.g_data(rowidtochar(:old.rowid)) := 1;
  end if;
end;
/

create or replace procedure testp as
  l_cnt           number := 0;
  cursor          l_delete_csr is select * from delete_data order by id for update;
  l_start_extents number;
  l_start_extends number;
  l_end_extents   number;
  l_end_extends   number;
begin
  pacepack.g_data.delete;
  pacepack.g_cnt         := 0;
  pacepack.g_restart_cnt := 0;
         
  for l_delete_row in l_delete_csr loop
    update delete_data set owner = lower(owner) where current of l_delete_csr;
    l_cnt := l_cnt + 1;
   
    if l_cnt = 1 then
     for c in (select usn, extents, extends from v$rollstat where xacts > 0) loop
      dbms_output.put_line('--======= Begin Rollstat: '||'USN='||c.usn||', EXTENTS='||c.extents||', EXTENDS='||c.extends||' =======--');
      l_start_extents := c.extents;
      l_start_extends := c.extends;
     end loop;
    end if;
  end loop;
      
  for c in (select usn, extents, extends from v$rollstat where xacts > 0) loop
    dbms_output.put_line('--======= End Rollstat: '||'USN='||c.usn||', EXTENTS='||c.extents||', EXTENDS='||c.extends||' =======--');
    l_end_extents := c.extents;
    l_end_extends := c.extends;
  end loop;
  
  dbms_output.put_line(lpad(chr(32), 4, chr(32))||'row     count = '||l_cnt); 
  dbms_output.put_line(lpad(chr(32), 4, chr(32))||'trigger count = '||pacepack.g_cnt);
  dbms_output.put_line(lpad(chr(32), 4, chr(32))||'restart count = '||(pacepack.g_cnt - l_cnt));
  
  dbms_output.put_line(lpad(chr(32), 4, chr(32))||'extents delta = '||(l_end_extents - l_start_extents));
  dbms_output.put_line(lpad(chr(32), 4, chr(32))||'extends delta = '||(l_end_extends - l_start_extends));
 end;
/


Appendix 2. error=1551 Test


We repeated the same test in Blog: That Old Restart Problem Strikes Back: Setting the Stage .

The test table (big) has all similar stats as the Blog (163840 blocks and 1280 MB).

alter session set max_dump_file_size = UNLIMITED;
alter session set tracefile_identifier = "big_upd_3"; 
alter session set events 'trace[dml]:sql_trace wait=true';

-- added MONITOR hint
update /*+ MONITOR */ big
      set val = val + 1
    where val = 1;

100000 rows updated.
Here the trace file (edited and comment added after "=====>"). It shows three phases, and Oracle stats in each phase.

test_db_ora_31526_big_upd_3.trc

=====>Line: 255
PARSING IN CURSOR #140473762595472 len=67 dep=0 uid=49 oct=6 lid=49 tim=8799656282217 hv=2426892631 ad='6a8460b0' sqlid='fm6dxja8afwar'
update /*+ MONITOR */ big
      set val = val + 1
    where val = 1
END OF STMT
PARSE #140473762595472:c=1131,e=1269,p=0,cr=1,cu=0,mis=1,r=0,dep=0,og=1,plh=3225303172,tim=8799656282216

updThreePhaseExe: objn=4892269 phase=NOT LOCKED
updaul: phase is NOT LOCKED

=====>Line: 5204. About 5000 lines (from 255 to 5204) with 'db file scattered read' blocks=32. that is about 150000 blocks.
WAIT #140473762595472: nam='db file scattered read' ela= 151 file#=22 block#=605928 blocks=32 obj#=4892269 tim=8799658444950
updaul: objn=4892269 error=1551

updThreePhaseExe: objn=4892269 phase=LOCK
updaul: phase is LOCK

=====>Line:10145
WAIT #140473762595472: nam='db file scattered read' ela= 149 file#=22 block#=605864 blocks=32 obj#=4892269 tim=8799660393207

=====>Line: 25003, 39864, 54722, 60581, 84439, 99390 have "updrowFastPath: objn=4892269 error=1551"
updrowFastPath: objn=4892269 error=1551

=====>Line:110218. About 100000 lines (from 10145 to 110218) with 'updrowFastPath'.
updrowFastPath: kddlkr objn 4892269 table 0  rowid 004aa66d.0009452f.2c code 0

updThreePhaseExe: objn=4892269 phase=ALL LOCKED
updaul: phase is ALL LOCKED

=====>Line:115173. About 5000 lines (from 110218 to 115173) with 'db file scattered read' blocks=32. that is about 150000 blocks.
WAIT #140473762595472: nam='db file scattered read' ela= 170 file#=22 block#=605896 blocks=32 obj#=4892269 tim=8799671146703
Here the monitor report. Execs = 3 reveals three phase updates, 3 times excessive I/O stats also indicates such three phases.

SQL > select dbms_sqltune.report_sql_monitor('fm6dxja8afwar') from dual;

------------------------------
update /*+ MONITOR */ big set val = val + 1 where val = 1

Global Information
------------------------------
 SQL ID              :  fm6dxja8afwar       
 Duration            :  38s                 
 Program             :  sqlplus.exe         

Global Stats
=========================================================
| Elapsed |   Cpu   |    IO    | Buffer | Read  | Read  |
| Time(s) | Time(s) | Waits(s) |  Gets  | Reqs  | Bytes |
=========================================================
|      39 |      29 |       10 |   793K | 15723 |   4GB |
=========================================================

SQL Plan Monitoring Details (Plan Hash Value=3225303172)
=====================================================================================================================================================
| Id |      Operation       | Name |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read  | Read  | Activity |      Activity Detail       |
|    |                      |      | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs  | Bytes |   (%)    |        (# samples)         |
=====================================================================================================================================================
|  0 | UPDATE STATEMENT     |      |         |      |        27 |    +12 |     3 |        0 |       |       |          |                            |
|  1 |   UPDATE             | BIG  |         |      |        27 |    +12 |     3 |        0 |     3 | 24576 |     2.56 | Cpu (1)                    |
|  2 |    TABLE ACCESS FULL | BIG  |       1 |    2 |        38 |     +2 |     3 |     205K | 15720 |   4GB |    94.87 | Cpu (28)                   |
|    |                      |      |         |      |           |        |       |          |       |       |          | db file scattered read (9) |
=====================================================================================================================================================


Appendix 3. Event 1551 Errorstack Trace.


We also make a 1551 Errorstack Trace:

alter session set max_dump_file_size = UNLIMITED;
alter session set tracefile_identifier = "trc_1551_1";
alter session set events='1551 trace name errorstack level 3';

exec testp;
Here the output:

----- Error Stack Dump -----
ORA-01551: extended rollback segment, pinned blocks released
----- Current SQL Statement for this session (sql_id=f4qphfamx27x1) -----
SELECT * FROM DELETE_DATA ORDER BY ID FOR UPDATE
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0x8bcaefe8         3  procedure K.TESTP
0x8bcaefe8        13  procedure K.TESTP
0x6aea90d8         1  anonymous block

----- Call Stack Trace -----
[12] (dbgePostErrorKGE()+1066       
[13] (dbkePostKGE_kgsf()+71         
[14] (kgeade()+392                  
[15] (kgeselv()+89                  
[16] (ksesecl0()+189                
[17] (ktugur()+7201                 
[18] (ktuchg2()+1063                
[19] (ktbchg2()+231                 
[20] (kddchg()+611                  
[21] (kddlok()+2350                 
[22] (kddlkr()+322                  
[23] (qerfuProcessFetchedRow()+669  
[24] (qerfuLockingRowProcedure()+68 
[25] (qersoFetchSimple()+420        
[26] (qersoFetch()+210              
[27] (qerfuStart()+602              
[28] (selexe0()+3409                
[29] (opiexe()+11896                
[30] (opipls()+2427                 
[31] (opiodr()+1202                 
[32] (rpidrus()+198                 
[33] (skgmstack()+65                
[34] (rpidru()+132                  
[35] (rpiswu2()+543                 
[36] (rpidrv()+1266                 
[37] (psddr0()+467                  
[38] (psdnal()+624                  
[39] (pevm_EXECC()+306              
[40] (pfrinstr_EXECC()+56           
[41] (pfrrun_no_tool()+60           
[42] (pfrrun()+902                  
[43] (plsql_run()+752  


Appendix 4. Single row update in every update execution with other sessions interfering


We also repeat the code test in Book: Expert Oracle Database Architecture (Thomas Kyte, Darl Kuhn, 3rd Edition) Page: 270.

drop table t;
create table t (x int, y int);
insert into t values (1, 5);
commit;

--SID-1@T1
Update t set y=10 where y=5;

--SID-2@T2
alter session set tracefile_identifier = 'trace_10200_10046_test_x3';
alter session set events '10046 trace name context forever, level 12';
alter session set events '10200 trace name context forever, level 10';
-- alter session set events '10051 trace name context off';
alter session set events 'trace[dml]:sql_trace wait=true';

Update t Set x=x+1 Where y=5;

--SID-1@T3
commit;
Trace file shows three phases:

updThreePhaseExe: objn=4892669 phase=NOT LOCKED
updaul: phase is NOT LOCKED snap oldsnap env: (scn: 0x00000b7302e8a526 

updThreePhaseExe: objn=4892669 phase=LOCK
updaul: phase is LOCK snap oldsnap env: (scn: 0x00000b7302e8a533 

updThreePhaseExe: objn=4892669 phase=ALL LOCKED
updaul: phase is ALL LOCKED snap oldsnap env: (scn: 0x00000b7302e8a533

and 13 occurrences of following call patterns:

ktrget2(): started for block  <0x0d8f : 0x0011cb7b> objd: 0x004aa7fd
ktrexf(): returning 9 on:  0x7fbd2ead6588  
ktrgcm(): completed for block  <0x0d8f : 0x0011cb7b> objd: 0x004aa7fd
ktrget2(): completed for  block <0x0d8f : 0x0011cb7b> objd: 0x004aa7fd
The xplan also reveals three phases by "starts=3":

SQL ID: gwkdp2wjjxxks Plan Hash: 931696821
Update t Set x=x+1 Where y=5

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00      17.23          0         21          2           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00      17.23          0         22          2           0

Rows (1st) Row Source Operation
---------- ---------------------------------------------------
         0 UPDATE  T (cr=21 pr=0 pw=0 time=17229569 us starts=3)
         1  TABLE ACCESS FULL T (cr=20 pr=0 pw=0 time=824 us starts=3 cost=2 size=26 card=1)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  enq: TX - row lock contention                   1       17.22         17.22