Wednesday, November 27, 2013

Remove Stale Native Code Files on AIX

3 days after posting Blog: PGA,
    SGA space usage watching from UNIX
about ipcs command for Oracle SGA, there was a problem reported on Oracle 11.2.0.3.0 caused by Stale Native Code Files.

Searching in Oracle Documentation, we found one hit:
    Oracle® Database Readme 11g Release 2 (11.2) E41331-02

    4.29.2 Stale Native Code Files Are Being Cached
  
    Natively compiled PL/SQL and native code generated by the JIT compiler for Oracle JVM, may be cached in operating system files. The SHUTDOWN ABORT and SHUTDOWN IMMEDIATE commands do not clean these cached native code files (reference Bug 8527383).

   The name patterns are as follows where sid_name is the system identifier name:
   JOXSHM_EXT_*_sid_name_*
   PESHM_EXT_*_sid_name_*
   PESLD_sid_name_*


However as observed with Oracle 11.2.0.3.0 on AIX 7.1.0.0, the name patterns looks like:

 /JOXSHM_EXT_*_instname_sidname
 /PESLD_instname_*_*


(sidname is the ID of shared memory segment)

(Update-2022-Dec-21: In Oracle 19c, the name pattern is like JOEZSHM_* instead of JOXSHM_EXT_*)

In fact, when summing up shared memory segments plus stale objects, it is more than Oracle reported SGA size since such stale objects are no more visible to Oracle. As a consequence, if there exist a large number of such Stale objects, Database can be made outages due to space starvation.

To this problem, Oracle provides certain workaround described in MOS: (Doc ID 1120143.1)

   Stale Native Code Files Are Being Cached with File Names Such as: JOXSHM_EXT*, PESHM_EXT*, PESLD* or SHMDJOXSHM_EXT*

but it is only applied to platforms except AIX since on AIX Stale Native Code Files such as (JOXSHM_EXT*) are kept as shared memory segments not as physical files, so there is no way to delete them by rm command.

Tests on AIX and Linux show the different shutdown behaviors:

 shutdown immediate: clean current stale files (created by this startup), but not previous ones.
                                           Probably because sidname does not belong to current running instance.
 shutdown normal:       same as shutdown immediate
 shutdown abort:         no clean

This Blog will try to provide a workaround for AIX by a small demo with detail steps.

(1). On AIX DB, run the appended TestCase.

(2). View all native and JIT compiled code by:

  select * from dba_plsql_object_settings where plsql_code_type = 'NATIVE';
 
  select * from dba_java_methods where is_native = 'YES';

  Here also two non-documented methods:

   select * from sys.ncomp_dll$ n, dba_objects o where n.obj# = o.object_id;

    call sys.dbms_feature_plsql_native to generate a plsqlNativeReport.
   
    declare
      l_is_used   number;
      l_aux_count number;
      l_report    clob;
    begin
      sys.dbms_feature_plsql_native(o_is_used=>l_is_used, o_aux_count=>l_aux_count, o_report=>l_report);
      dbms_output.put_line('o_is_used='||l_is_used);
      dbms_output.put_line('o_aux_count='||l_aux_count);
      dbms_output.put_line('o_report='||l_report);
    end;   

(3). Shutdown database by:

          shutdown abort

(4). On AIX, monitor memory usage:

 ipcs -ar |grep -e JOXSHM_EXT -e PESHM -e PESLD | awk '{cnt+=1; sum+=$10} END {print "Count=",cnt,"Sum=",sum,"Average=",sum/cnt}'

(5). remove them by:

 ipcs -ar |grep -e JOXSHM_EXT -e PESHM -e PESLD | awk ' {name = $17; cmd = "ipcrm -r -m " name; print "Run ", cmd; system( cmd )}'


Addendum (2016.01.21):

The above workaround seems also adopted by Oracle MOS: (Doc ID 1120143.1) in which there is one original text:
    There is no fix to remove the stale files.
that was seen in 06-Nov-2013.
Now one controversial text was added:
    On AIX these files can then be removed using the 'ipcrm' command.


There is one MOS Note on Solaris and Linux: Ora-7445 [Ioc_pin_shared_executable_object()] (Doc ID 1316906.1)
The error is most often related to an inconsistency that has been detected between the java shared object loaded in memory and the backing store image stored on disk as a result of calling java code and having the JIT compiler enabled.
    On Solaris these files are written to /tmp and have names like .SHMDJOXSHM_EXT_...
    On Linux these files are written to /dev/shm and have names like JOXSHM_EXT_...
Where IOC could be an abbreviation of Inversion of Control, something like callbacks.


Addendum (2015.12.28):

The Blog: What the heck are the /dev/shm/JOXSHM_EXT_x files on Linux? talked about such files on Linux.


Test Case


alter system set plsql_code_type=native;

create or replace package bla as
  procedure foo;
end;
/
create or replace package body bla as
  procedure foo is begin null; end;
end;
/
exec bla.foo;

alter system set java_jit_enabled=true;

create or replace and compile java source named "Hello" as
 public class Hello {
    public static String world ()
    {
       return "hello world";
    }
 }
/
create or replace function helloworld
 return varchar2 as language java
 name 'Hello.world () return java.lang.string';
/
select helloworld from dual;

Tuesday, November 19, 2013

PGA, SGA memory usage watching from UNIX

In the last two Blogs, we talked about how to measure PL/SQL package PGA memory usage and investigate ORA-04030 errors in incident file and alert.log:
    dbms_session package_memory_utilization
    ORA-04030 incident file and alert.log

In this Blog, we try to watch memory usage by UNIX commands on both AIX and Solaris systems at first, and then show sga_max_size impact on the memory allocation of both systems. All the tests are done on Oracle 11.2.0.3.0.

1. AIX


With the appended script, we can allocate 2GB PGA by:

    SQL >   exec pga_mem_test.allo(2048);

Then on OS level, you can watch it by (14090450 is process ID of the session):

   $>  svmon -P 14090450 -O filtercat=exclusive -O filterprop=data -O pgsz=on -O mpss=on
  -------------------------------------------------------------------------------
       Pid Command          Inuse      Pin     Pgsp  Virtual
  14090450 oracle          613027       48        0   613027

       PageSize                Inuse        Pin       Pgsp    Virtual
       s    4 KB              612947          0          0     612947
       m   64 KB                   5          3          0          5

      Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
    d90b51        12 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    ae0826        15 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    d30b5b        18 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    1d0b95        17 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    c20b4a        19 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    6c07e4        13 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    800b08        14 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    f00b78        16 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    240bac        11 work text data BSS heap           s    666     0    0   64058
                                                       m   3962     0    0       0
    3f0bb7        1a work text data BSS heap           s      8     0    0   22648
                                                       m   1415     0    0       0
     e0b86  80020014 work USLA heap                    s     44     0    0    1804
                                                       m    110     0    0       0
    3d0bb5  9001000a work shared library data          s    103     0    0     103
                                                       m      0     0    0       0
    420aca f00000002 work process private              m      5     3    0       5
    a10b29  ffffffff work application stack            s     11     0    0      43
                                                       m      2     0    0       0
    890b01  8001000a work private load data            s      3     0    0       3
                                                       m      0     0    0       0

                                                      
In Segment detail section: counter for mixed size segments are converted as s (small) size, and displayed under column Virtual. For example, segment: 240bac, 666 + 16*3962 = 64058.

That is why in PageSize section, s (4KB) is shown with 612947 Inuse and Virtual.

However counter for segments with only m (medium) size is not converted, for example, segment: 420aca. And in PageSize section, m (64KB) is shown with only 5 Inuse and Virtual.

In fact, among 613027 pages(each 4KB), only 835 (666 + 8 + 44 + 103 + 11 + 3) are s pages, and the rest are m pages. That means, s pages amounts to 3.2 MB, and m pages is about 2390MB. So this process is mainly allocated with 64KB pages.
 
With option unit=MB, we can see them in MB:

  $>  svmon -P 14090450 -O filtercat=exclusive -O filterprop=data -O pgsz=on -O mpss=on -O unit=MB
  -------------------------------------------------------------------------------
       Pid Command          Inuse      Pin     Pgsp  Virtual
  14090450 oracle         2394.64     0.19        0  2394.64


As we know, process space distribution is calculated by:

    Virtual (virtual space) = Inuse(physical memory) + Pgsp (paging space)

so this process is all allocated in physical memory due to Pgsp = 0.

If we take the option segment=category, we can see SYSTEM segments and SHARED segments (only part of Oracle SGA) (some details are removed).

  $svmon -P 14090450 -O segment=category -O filterprop=data -O pgsz=on -O mpss=on -O unit=MB
  -------------------------------------------------------------------------------
       Pid Command          Inuse      Pin     Pgsp  Virtual
  14090450 oracle         3542.34     36.6        0  3267.93
       PageSize                Inuse        Pin       Pgsp    Virtual
       s    4 KB             2690.65          0          0    2416.25
       m   64 KB              851.69       36.6          0     851.69
  ...............................................................................
  SYSTEM segments                      Inuse      Pin     Pgsp  Virtual
                                        47.3     36.4        0     47.3
  ...............................................................................
  EXCLUSIVE segments                   Inuse      Pin     Pgsp  Virtual
                                     2394.64     0.19        0  2394.64
  ...............................................................................
  SHARED segments                      Inuse      Pin     Pgsp  Virtual
                                     1100.40        0        0   826.00


and eventually segment detail can be shown by:
    $>  svmon -D 240bac -O frame=on
    
If we allocate a session with 12GB PGA, we can see that 3.77G Pgsp (paging space) is used.
But only part of EXCLUSIVE segments are in paging space, the whole SYSTEM segments and SHARED segments are still kept in memory. 

  $svmon -P 45940898 -O unit=auto
  ----------------------------------------------------------
        Pid Command          Inuse      Pin     Pgsp  Virtual
   45940898 oracle           10.8G    36.6M    3.77G    14.6G
   
   
Even though a session's whole PGA is initially allocated in real memory, AIX VMM can still page out of part of it into paging space when memory is required by other active sessions. 
   
The SHARED segments reported above seems only a part of entire SGA since it is much smaller than SEGSZ of ipcs:

  $ipcs -ma
    T        ID     KEY        MODE       OWNER    GROUP  CREATOR   CGROUP NATTCH     SEGSZ
    m   2268082   00000000 --rw-r-----   oracle      dba   oracle      dba     25 2113929216


We can also watch whole system memory usage by:

  $svmon -G -O unit=auto
 --------------------------------------------------------------------------------------
                   size       inuse        free         pin     virtual  available   mmode
    memory        60.0G       25.6G       15.7G       25.6G       42.8G         0K     Ded
    pg space      8.00G       4.22G

                   work        pers        clnt       other
    pin           12.2M          0K          0K       2.11G
    in use        25.5G          0K     119.00M    



By the way, Temp space used by Global Temporary Tables (GTT), or Temporary Table created by subquery_factoring_clause (implicit or materialize hint) are recorded in TEMP_SPACE_ALLOCATED of V$ACTIVE_SESSION_HISTORY / DBA_HIST_ACTIVE_SESS_HISTORY.

Oracle Doc said about it:
    Amount of TEMP memory (in bytes) consumed by this session at the time this sample was taken.

here the "memory" probably should be interpreted as underlined temporary segment on disk, not memory.

When we create a big GTT, we can neither observe memory consumption with SVMON for the particular session, nor system wide by VMSTAT. However, we can see the intensive I/O activities of disk on which temporary segment is located with IOSTAT.

A query on V$TEMPSEG_USAGE or V$TEMPSTAT can also justify this disk usage.


Update (September 17, 2020): Un-released shared memory segment after shutdown


Occasionally we noticed that shared memory segments (shmid) used by SGA are not released after DB shutdown (or restart), specially when using "startup force" (combination of shutdown abort and startup). Database alert.log contains the message ("attacheded" is Oracle typo):

Warning: 2 processes are still attacheded to shmid 362322707:
 (size: 65536 bytes, creator pid: 35810862, last attach/detach pid: 6080768)
ipcs command shows:

$ > ipcs -ma
T        ID     KEY        MODE       OWNER    GROUP  CREATOR   CGROUP NATTCH       SEGSZ     CPID    LPID    ATIME   DTIME     CTIME
m 362322707 0xffffffff D-rw-------   oracle      dba   oracle      dba      1       65536 35810862 6080768 21:20:14 21:20:14 21:20:14
m 163093320 0xffffffff D-rw-------   oracle      dba   oracle      dba      1 43597383680 35810862 6080768 21:20:14 21:20:14 21:20:14

Note: The first "D" in MODE "D-rw" means:
        If the associated shared memory segment has been removed. 
        It disappears when the last process attached to the segment detaches it.
        
   It seems there exist some processes attached on the shared memory segment to be removed (MODE "D-rw"),
   but it is pending on the termination of those attached processes.
 
   In Oracle DB, the problem seems caused by DB shutdown abort.
 
   There are two alternatives to remove them:
 
    (a). Kill all attached (and Stopped) processes on "D-rw" Schmid and check if the Schmid is removed.
         To perform it, at first you need to find all processes started before last DB start, and still attached on the DB.
    (b). Reboot Lpar to cleanup all Schmid.

#-- With T option, write the output with the date

$ > ipcs -Top
T        ID     KEY        NATTCH  CPID      LPID      ATIME             DTIME             CTIME           
m 362322707 0xffffffff     1       35810862  6080768   21:20:14 Sep 17   21:20:14 Sep 17   21:20:14 Sep 17  
m 163093320 0xffffffff     1       35810862  6080768   21:20:14 Sep 17   21:20:14 Sep 17   21:20:14 Sep 17  
With ipcs "S" Flags, we can find a list of SID attached to shared memory id, for exaample, above 163093320:

$ > ipcs -maS

m 163093320 0xffffffff D-rw-------   oracle      dba   oracle      dba      1 43597383680 35810862 6080768 21:20:14 21:20:14 21:20:14
SID :
0x235aa6 0x376a2f4  
With svmon "S" Flags, we can displays the memory-usage statistics for segments that the SIDs parameter specifies. The output also includes a list of OS PIDs which are still attacheded on the SID (belongs to a SHMID):

$> svmon -S 0x235aa6 0x376a2f4 -O unit=KB -O pidlist=on

Unit: KB
    Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
 3ddeb5f  70001003 work shmat/mmap                   m     64     0    0      64
                   pid(s)=32664296, 45810862
 2d49a54  70001001 work shmat/mmap                   m     64     0    0      64
                   pid(s)=32664296, 45810862
Then with ps command, we can find the details of OS processes (Oracle sessions). Their start time (or elapsed time) shows that they still exist even after DB shutdown:

$ > ps aux 32664296

USER          PID %CPU %MEM   SZ  RSS    TTY STAT    STIME  TIME COMMAND
oracle   32664296  0.0  2.0 253992 189560      - A      Aug 18  0:05 oracletestdb

$ > ps -o cp,pcpu,tcpu,pid,etime,state,time,tdiskio,user,pagein,rssize,vsz,ppid,wchan,shmpgsz,spgsz,thcount,tag,command,args=Args -p 32664296

 CP  %CPU        TCPU      PID     ELAPSED ST        TIME TDISKIO     USER PAGEIN   RSS   VSZ     PPID    WCHAN SHMPGSZ SPGSZ THCNT TAG             COMMAND  Args
  0   0.0           - 32664296 29-23:46:55 A     00:00:05       -   oracle      0 189560 10872        1        -      4K   64K     1 -               oracle   oracletestdb
Blog: Identifying shared memory segment users using lsof shows how to find the similar mapping in Linux with ipcs and lsof command to map OS process id to shmid. (In AIX, lsof seems not showing shmid, neither "procfiles -cn")

By the way, Oracle SYSRESV Utility (MOS Doc ID 123322.1) provides sysresv command to manage OS resources (not OS process id displayed):

$ > $ORACLE_HOME/bin/sysresv -d on -l $ORACLE_SID


Update (August 23, 2021): JAVA_JIT_ENABLED enabled compiled shared memory segments


From Oracle 19c, when JAVA_JIT_ENABLED enabled, java native compiled targets are stored in shared memory: JOEZSHM. Following are some example output from Linux and AIX.

Oracle command sysresv cannot show those JOEZSHM shared memory, hence it cannot remove them with -i or -f option.

--- Linux: ipcs, sysresv not display /dev/shm ---

linuxdb$ ipcs -m
  ------ Shared Memory Segments --------
  key        shmid      owner      perms      bytes      nattch     status
  0x00000000 3014656    oracle     600        8908800    112
  0x00000000 3047425    oracle     600        4563402752 56
  0x00000000 3080194    oracle     600        7868416    56
  0x8dbd5b2c 3112963    oracle     600        45056      56

linuxdb$ ls -ltr /dev/shm
  total 32768
  drwxr-x--- 2 oracle dba       40 Jun 14 07:23 orahpatch_linuxdb
  -rwxrwx--- 1 oracle dba 16777216 Aug 23 09:53 JOEZSHM_linuxdb_1_0_1_0_0_658830875
  -rwxrwx--- 1 oracle dba 16777216 Aug 23 09:54 JOEZSHM_linuxdb_1_0_0_0_0_2043113040
	
linuxdb$ $ORACLE_HOME/bin/sysresv -d on -l $ORACLE_SID
  ------ Shared Memory Segments --------
  key        shmid      owner      perms      bytes      nattch     status
  0x00000000 3014656    oracle     600        8908800    111
  0x00000000 3047425    oracle     600        4563402752 56
  0x00000000 3080194    oracle     600        7868416    56
  0x8dbd5b2c 3112963    oracle     600        45056      56


--- AIX: sysresv not display JOEZSHM, not show size of Shared Memory (less info than sysresv on Linux ) ---

aixdb$ ipcs -mar
  IPC status from /dev/mem as of Mon Aug 23 14:41:00 CEST 2021
  T        ID     KEY        MODE       OWNER    GROUP  CREATOR   CGROUP NATTCH     SEGSZ  CPID  LPID   ATIME    DTIME    CTIME  RTFLAGS NAME
  Shared Memory:
  m  20972544   00000000 --rw-------   oracle      dba   oracle      dba    157 5419040768 16777572 19071430 14:40:19 14:40:19 12:46:19
  m  20972545 0xecf3d44c --rw-------   oracle      dba   oracle      dba    157     65536 16777572 19071430 14:40:19 14:40:19 12:46:19
  m  20972546   00000000 --rw-------   oracle      dba   oracle      dba    157  16777216 16777572 19071430 14:40:19 14:40:19 12:46:19
  m  19923971   00000000 --rw-------   oracle      dba   oracle      dba    157  33554432 16777572 19071430 14:40:19 14:40:19 12:46:19
  m         - 0xffffffff --rw-rw----   oracle      dba   oracle      dba      0  16777216 11797042     0 no-entry no-entry 16:27:36    -    /JOEZSHM_aixdb_1_0_0_0_0_2042869279
  m         - 0xffffffff --rw-rw----   oracle      dba   oracle      dba      0  16777216 13304362     0 no-entry no-entry 16:35:53    -    /JOEZSHM_aixdb_1_0_0_0_0_2042708934

aixdb$ $ORACLE_HOME/bin/sysresv -d on -l $ORACLE_SID
  IPC status from /dev/mem as of Mon Aug 23 10:01:28 CEST 2021
  T        ID     KEY        MODE       OWNER    GROUP
  Shared Memory:
  m  20972544   00000000 --rw-------   oracle      dba
  m  20972545 0xecf3d44c --rw-------   oracle      dba
  m  20972546   00000000 --rw-------   oracle      dba
  m  19923971   00000000 --rw-------   oracle      dba


2. Solaris


On OS level, you can watch it by:

   $pmap -x 23739
                

             Address     Kbytes        RSS       Anon     Locked Mode   Mapped File
    ---------------- ---------- ---------- ---------- ---------- ------ -----------      
    FFFFFD7C4648D000         64         64         64          - rw---    [ anon ]
    0000000060000000      49152      49152          -      49152 rwxsR    [ ism shmid=0x800002d ]
    0000000080000000    7143424    7143424          -    7143424 rwxsR    [ ism shmid=0x800002e ]
    0000000240000000         20         20          -         20 rwxsR    [ ism shmid=0x900002f ]
    ---------------- ---------- ---------- ---------- ---------- ------ -----------
            total Kb    9947208    9936408    2536840    7192596       


Where pid: 25032 is process ID of the session with a PGA allocation of a 2GB memory table.

The above output shows that most of mapping is with a size of 64K marked as "[ anon ]".
The total Anon amounts to 2536840 (a little more than 2GB).
The total Locked is 7192596, which is the SGA size, and are marked like "[ ism shmid=0x800002e ]".

The SGA of this DB is allocated with ISM, therefore all SGA memory is locked, no swap space is needed to back it. We can roughly estimate virtual space, RSS (physical memory), and swap space by:

  virtual space   = 9947208 - 7192596 = 2,754,612 KB
  physical memory = 9936408 - 7192596 = 2,743,812 KB
  swap space      = 2754612 - 2743812 =    10,800 KB


since process 25032 is the only session with large PGA size and swap space is quite low,
we can say that PGA of this session is mainly allocated in physical memory.
     
Similarly we can watch SGA size by:

  $>  ipcs -ma
   T         ID      KEY        MODE        OWNER    GROUP  CREATOR   CGROUP NATTCH      SEGSZ 
   m   16777255   0x4a725828 --rw-rw----   oracle      dba   oracle      dba     84 7365218304
 

here SEGSZ of ipcs is correlated well with total Locked of pmap.

3. Linux



Run:
 SQL > exec pga_mem_test.allo(2048);

and watch PGA memory on Linux allocation by:

 $> pmap -x 46332 |grep -e zero |awk '{cnt+=1; Kbytes+=$2; RSS+=$3; Dirty+=$4} END {print "Count=",cnt,",Kbytes=",Kbytes,",RSS=",RSS,",Dirty=",Dirty,",Avg=",Kbytes/cnt}'
   Count= 37544 ,Kbytes= 2440768 ,RSS= 2403836 ,Dirty= 2403836 ,Avg= 65.0109

where Kbytes should be the allocated size.

Similarly we can watch SGA size in Linux by:

 $> ipcs -m
   ------ Shared Memory Segments --------
   key        shmid      owner      perms      bytes      nattch     status
   0x00000000 1455849472 oracle     640        33554432   21
   0x00000000 1455882241 oracle     640        4513071104 21
   0x19d100cc 1455915010 oracle     640        2097152    21

Pick one PID of Oracle session, and run:

 $> pmap -x 49169 |grep shmid
   0000000060000000   32768     448     448 rw-s-    [ shmid=0x56c68000 ]
   0000000062000000 4407296  276736  276736 rw-s-    [ shmid=0x56c70001 ]
   000000016f000000    2048       4       4 rw-s-    [ shmid=0x56c78002 ]

where 3 shared memory segments match each other:
  shmid=1455849472=0x56c68000  
  shmid=1455882241=0x56c70001
  shmid=1455915010=0x56c78002   
  
and total sga allocated is:
  4548722688 bytes (=33554432+4513071104+2097152)


4. sga_max_size on AIX


Recently a DB on AIX experienced performance problem due to shared pool memory resizing. The memory is configured as follows:

  sga_max_size       26'239'565'824

  shared_pool_size     2'214'592'512
  java_pool_size        268'435'456
  large_pool_size        536'870'912
  log_buffer              83'886'080
  db_cache_size        9'261'023'232
  db_keep_cache_size    1'006'632'960
  db_recycle_cache_size   1'006'632'960


and Automatic Memory Management is disabled (memory_target not specified).

AWR report shows 9 "Memory Resize Ops", where "DEFAULT buffer cache" is indicated by "SHR/IMM", and "shared pool" is signalized by "GRO/IMM".

It also shows the changes of db_cache_size and shared_pool_size:

  Parameter Name    Begin value   End value
  ----------------  ------------   -------------
  db_cache_size     9'261'023'232  8'657'043'456
  shared_pool_size  2'214'592'512  2'818'572'288


from which we can see 603'979'776 Bytes is moved from db_cache_size to shared_pool_size.
However, AIX "ipcs -a" shows:

  T        ID     KEY        MODE       OWNER    GROUP  CREATOR   CGROUP NATTCH     SEGSZ 
  ---------------------------------------------------------------------------------------
  m  20971562   00000000 --rw-rw----   oracle      dba   oracle      dba    379 201326592 
  m 143656025   00000000 --rw-rw----   oracle      dba   oracle      dba    379 26038239232
  m  20972634 0xc1bd22ac --rw-rw----   oracle      dba   oracle      dba    379     16384


so the AIX real allocated memory:

    201'326'592 + 26'038'239'232

is exactly equal to sga_max_size (26'239'565'824) (KEY marked with 0xc1bd22ac of SEGSZ = 16384 is omitted).

In conclusion, SGA_MAX_SIZE specifies the SGA maximum size of AIX allocated shared_memory, but not SGA effectively used size. It is advisable not to specify this parameter.

If sga_max_size is not specified, allocated shared_memory is determined by:

 select sum(value)/1024/1024
   from v$parameter
 where name in('shared_pool_size', 'java_pool_size', 'large_pool_size', 'streams_pool_size',

               'log_buffer', 'db_cache_size', 'db_keep_cache_size', 'db_recycle_cache_size'
               );


and sga_max_size is also deduced the above calculation, and AIX real allocated memory obeys to this value too. 


5. sga_max_size on Solaris x86-64


Repeated the test on a Solaris Operating System (x86-64) DB by specifying sga_max_size = 2600M,
it behaves the identical to AIX.

          Address     Kbytes        RSS       Anon     Locked Mode   Mapped File
  ---------------------------------------------------------------------------------------------
 0000000060000000      32768      26624          -          - rwxs-    [ dism shmid=0x5000020 ]
 0000000080000000    2629632     897024          -          - rwxs-    [ dism shmid=0x5000021 ]
 0000000160000000         12          4          -          - rwxs-    [ dism shmid=0x6000022 ]


Besides this, SGA memory allocation is changed to DISM.

Kbytes column confirms that 2600M = 32768K + 2629632K. However RSS shows that only 897024K of 2629632K (about 34%)  are allocated in physical memory, and no more memory is under Column Locked.

DISM is implemented with a dism daemon which is running oradism to dynamically manage memory allocation.

During database Starting up, DISM is started after all Mandatory Background Processes.

oradism ($ORACLE_HOME/bin/oradism) should be configured with permission:
            -rwsr-xr-x
in order to be running with root privileges.

If sga_max_size is not configured, it is computed as sum of all component groups in v$sga, and ISM is used.
If sga_max_size is configured greater than above computed value, DISM is used (over configured memory goes to Group "Variable Size")
If less or equal (same as not configured), it is still ISM.

Seeing the output of SWAP in Solaris running Oracle:

$ swap -l
swapfile   dev    swaplo   blocks     free
/dev/swap  181,1  8        100663288  100663288

it was asked why SWAP areas not used at all. Here is some essence:

A running Oracle instance is a "Cache of Bits (text, data, bss, stack)". Because of Cache, it is preferred in memory (RSS), whereas SWAP is a device of disk.

Even Oracle double accentuates it by the term "Buffer_Cache", where both "Buffer" and "Cache" are synonyms for memory.

In seldom case, Oracle SGA or PGA are using SWAP (disk), which could be an indication of performance degrade. For Solaris, one case is DISM.

So no use of SWAP conforms to Oracle's intention.

Update (2019-09-30)

With Optimized Shared Memory (OSM) in Solaris 11.3, the new mode for System V shared memory, we observed that Oracle 12c SGA can not go to swap space (same as ISM), but PGA can allocate in swap space (the non-swapped part in physical memory marked as "RSS").

If sga_max_size is bigger than available physical memory, instance can not be started and throws error: ORA-27125 ("unable to create shared memory segment").

Test Code 



create or replace package pga_mem_test as
  procedure allo (i_mb int);
end;
/
create or replace package body pga_mem_test as
  type t_tab_kb   is table of char(1024);   -- 1KB
  p_tab_1mb          t_tab_kb := t_tab_kb();
  type t_tab_mb   is table of t_tab_kb;    
  p_tab_mb           t_tab_mb := t_tab_mb();
  p_sid              number   := sys.dbms_support.mysid;
 
 -------------------------------------------
 procedure rpt(l_name varchar) is
    l_v$process_mem            varchar2(4000);
    l_v$process_memory_mem     varchar2(4000);
 begin
  select 'Used/Alloc/Freeable/Max >>> '||
          round(pga_used_mem/1024/1024)    ||'/'||round(pga_alloc_mem/1024/1024)||'/'||
            round(pga_freeable_mem/1024/1024)||'/'||round(pga_max_mem/1024/1024)
      into l_v$process_mem
      from v$process
      where addr = (select paddr from v$session where sid = p_sid);
    
   select 'Category(Alloc/Used/Max) >>> '||
            listagg(Category||'('||round(allocated/1024/1024)||'/'||
                    round(used/1024/1024)||'/'||round(max_allocated/1024/1024)||') > ')
    within group (order by Category desc) name_usage_list
      into l_v$process_memory_mem
      from v$process_memory
      where pid = (select pid from v$process
                    where addr = (select paddr from v$session where sid = p_sid));

   dbms_output.put_line(rpad(l_name, 20)||' > '||rpad(l_v$process_mem, 50));
   dbms_output.put_line('             ------ '||l_v$process_memory_mem);
 end rpt;
  
 -------------------------------------------  
  procedure allo (i_mb int) is
  begin
   rpt('Start allocate: '||i_mb||' MB');
 
   select 'M' bulk collect into p_tab_1mb from dual connect by level <= 1024;  -- 1MB
 
   for i in 1..i_mb loop   -- i_mb MB
    p_tab_mb.extend;
    p_tab_mb(i) := p_tab_1mb;
   end loop;
 
   rpt('End allocate: '||i_mb||' MB');
  end allo;
 
end;
/

Tuesday, November 5, 2013

ORA-04030 incident file and alert.log

In the last two Blogs, we talked about how to measure PL/SQL package PGA memory usage and different ways to reclaim unused memory:
      dbms_session package_memory_utilization
      Oracle PL/SQL Collection Memory Reclaim

In this Blog we will discuss its diagnostics.

ORA-04030 incident file


ORA-04030 is threw when PGA memory allocation is over certain limit.

16 GB in Rel 11.2.0.3.0


Quite often ORA-04030 is threw when PGA consumption reaches 16 GB, and the generated incident file contains following text:

   Dump of Real-Free Memory Allocator Heap [0x1108c1090]
   mag=0xfefe0001 flg=0x5000003 fds=0x0 blksz=65536
   blkdstbl=0x1108c10a0, iniblk=252928 maxblk=262144 numsegs=255
   In-use num=252073 siz=3639541760, Freeable num=0 siz=0, Free num=0 siz=0


The 16GB limit seems deduced by:
      262144(maxblk) * 65536(blksz) = 16GB
where blksz is controlled by DB hidden parameters (MOS [1506315.1], [1328489.1]):
  _use_realfree_heap=TRUE
  _realfree_heap_pagesize_hint = 65536


and both can be queried by:

col name format a30
col cur_val format a20


select i.ksppinm name, v.ksppstvl cur_val, v.ksppstdf default_val, v.ksppstvf
from x$ksppi i, x$ksppcv v
where i.indx = v.indx
  and i.ksppinm in ('_realfree_heap_pagesize_hint', '_use_realfree_heap');


For blksz=65536, in Blog: PGA, SGA memory usage watching from UNIX, we have also observed that majority of page (segment) is allocated in a unit of 64KB on both AIX and Solaris for high PGA sessions.

In the above example, the real allocated memory is:
     252073 (In-use num) * 65536(blksz) = 16'519'856'128 Bytes
but only siz=3,639,541,760 is reported. This is probably due to the 32 Bit Integer Overflow (maximum 4GB, as observed in Blog: dbms_session package_memory_utilization) in the process dumping.

Adding 3*4GB Overflow, the effective memory allocated should be:
   3*(4*1024*1024*1024) + 3639541760 = 16,524,443,648 (16GB)

16GB is upper limit. Sometimes a session exits by ORA-04030 with only 11GB memory as follows:
   Dump of Real-Free Memory Allocator Heap [0x1008b1080]
   mag=0xbefe0001 flg=0x5000003 fds=0x6 blksz=65536
   blkdstbl=0xffffffff7d700010, iniblk=183298 maxblk=262144 numsegs=226
   In-use num=178538 siz=3115057154, Freeable num=0 siz=0, Free num=1 siz=54726560     

  
   178538*65536 = 11,700,666,368 (11GB)
   2*(4*1024*1024*1024) + 3115057154 = 11,704,991,746 (11GB)


In such case, it could probably indicate memory capped on the UNIX layer or virtual space limitation.

Usually ORA-04030 only triggers single session exit. Occasionally ORA-04030 can cause entire DB crash when PSP (PSP0) background process is affected. This is probably due to either certain threshold on the number of ORA-04030 errors, or memory shortage which PSP and other Oracle mandatory background processes ask for.

32 GB in Rel 11.2.0.4.0


It seems that Oracle increased the limit to 32 GB in Rel 11.2.0.4.0, as ORA-04030 dumps showed:
  Dump of Real-Free Memory Allocator Heap [0xfffffd7ffdbad000]
  mag=0xfefe0001 flg=0x5000003 fds=0x6 blksz=65536
  blkdstbl=0xfffffd7ffdbad010, iniblk=523264 maxblk=524288 numsegs=321
  In-use num=522322 siz=4176347136, Freeable num=0 siz=0, Free num=0 siz=0

where maxblk is doubled from 262144 to 524288.

4 GB and max_map_count


However if "In-use num" in the incident file shows a number far less than maxblk, it could be due to some UNIX settings. For example, Linux vm.max_map_count, limits the VM memory map table for the process.

If setting vm.max_map_count=65536, the session throws ORA-04030 when approaching to 4GB (max_map_count * blksz), and "In-use num" (65007) shows a number close to 65536 in the generated incident file:

    Dump of Real-Free Memory Allocator Heap [0x7fcf69a3c000]
    mag=0xfefe0001 flg=0x5000003 fds=0x7 blksz=65536
    blkdstbl=0x7fcf69a3c010, iniblk=69632 maxblk=262144 numsegs=115
    In-use num=65007 siz=4264296448, Freeable num=0 siz=0, Free num=1 siz=46268416


In such case, "Process Map Dump" Section in incident dump file contains about 65007 (close to 65536) entries like:
         7f7cd86c3000-7f7cd86d3000 rw-p 00000000 00:05 1030     /dev/zero

which represents a data (bss) memory area (struct vm_area_struct) of 65536 Bytes (= 140174183706624 - 140174183641088 in decimal).
It is the information contained in /proc/[pid]/maps. pmap formats this information in a bit more readable manner.


alert.log


When DB crashed, one can get a quick evidence from alert.log, which contains something like:
    PMON (ospid: 21951): terminating the instance due to error 490
  Instance terminated by PMON, pid = 23951

      (ORA-00490 means: PSP process terminated with error, and its Cause: The process spawner died)
or

  DBW0 (ospid: 1310): terminating the instance due to error 4030
  Instance terminated by DBW0, pid = 1510


whereas a normal instance shutdown is marked by a message like:
    Instance terminated by USER, pid = 1407
in alert.log.


When OS system is under pressure of space allocation, we can also see some warning in alert.log and related trace files.

alert.log


  Errors in file /orabin/app/oracle/admin/testdb/diag/rdbms/testdb/testdb/trace/testdb_psp0_28835930.trc:
 ORA-27300: OS system dependent operation:fork failed with status: 12
 ORA-27301: OS failure message: Not enough space
 ORA-27302: failure occurred at: skgpspawn3
 Process W000 died, see its trace file
 Process startup failed, error stack:


testdb_psp0_28835930.trc


 Process startup failed, error stack:
 ORA-27300: OS system dependent operation:fork failed with status: 12
 ORA-27301: OS failure message: Not enough space
 ORA-27302: failure occurred at: skgpspawn3


testdb_w000_30212136.trc


  MMNL absent for 78398 secs; Foregrounds taking over