Tuesday, November 19, 2013

PGA, SGA space usage watching from UNIX

In the last two Blogs, we talked about how to measure PL/SQL package PGA memory usage and investigate ORA-04030 errors in incident file and alert.log:
    dbms_session package_memory_utilization
    ORA-04030 incident file and alert.log

In this Blog, we try to watch memory usage by UNIX commands on both AIX and Solaris systems at first, and then show sga_max_size impact on the memory allocation of both systems. All the tests are done on Oracle 11.2.0.3.0.

1. AIX


With the appended script, we can allocate 2GB PGA by:

    SQL >   exec pga_mem_test.allo(2048);

Then on OS level, you can watch it by (14090450 is process ID of the session):

   $>  svmon -P 14090450 -O filtercat=exclusive -O filterprop=data -O pgsz=on -O mpss=on
  -------------------------------------------------------------------------------
       Pid Command          Inuse      Pin     Pgsp  Virtual
  14090450 oracle          613027       48        0   613027

       PageSize                Inuse        Pin       Pgsp    Virtual
       s    4 KB              612947          0          0     612947
       m   64 KB                   5          3          0          5

      Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
    d90b51        12 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    ae0826        15 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    d30b5b        18 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    1d0b95        17 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    c20b4a        19 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    6c07e4        13 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    800b08        14 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    f00b78        16 work text data BSS heap           s      0     0    0   65536
                                                       m   4096     0    0       0
    240bac        11 work text data BSS heap           s    666     0    0   64058
                                                       m   3962     0    0       0
    3f0bb7        1a work text data BSS heap           s      8     0    0   22648
                                                       m   1415     0    0       0
     e0b86  80020014 work USLA heap                    s     44     0    0    1804
                                                       m    110     0    0       0
    3d0bb5  9001000a work shared library data          s    103     0    0     103
                                                       m      0     0    0       0
    420aca f00000002 work process private              m      5     3    0       5
    a10b29  ffffffff work application stack            s     11     0    0      43
                                                       m      2     0    0       0
    890b01  8001000a work private load data            s      3     0    0       3
                                                       m      0     0    0       0

                                                      
In Segment detail section: counter for mixed size segments are converted as s (small) size, and displayed under column Virtual. For example, segment: 240bac, 666 + 16*3962 = 64058.

That is why in PageSize section, s (4KB) is shown with 612947 Inuse and Virtual.

However counter for segments with only m (medium) size is not converted, for example, segment: 420aca. And in PageSize section, m (64KB) is shown with only 5 Inuse and Virtual.

In fact, among 613027 pages(each 4KB), only 835 (666 + 8 + 44 + 103 + 11 + 3) are s pages, and the rest are m pages. That means, s pages amounts to 3.2 MB, and m pages is about 2390MB. So this process is mainly allocated with 64KB pages.
 
With option unit=MB, we can see them in MB:

  $>  svmon -P 14090450 -O filtercat=exclusive -O filterprop=data -O pgsz=on -O mpss=on -O unit=MB
  -------------------------------------------------------------------------------
       Pid Command          Inuse      Pin     Pgsp  Virtual
  14090450 oracle         2394.64     0.19        0  2394.64


As we know, process space distribution is calculated by:

    Virtual (virtual space) = Inuse(physical memory) + Pgsp (paging space)

so this process is all allocated in physical memory due to Pgsp = 0.

If we take the option segment=category, we can see SYSTEM segments and SHARED segments (only part of Oracle SGA) (some details are removed).

  $svmon -P 14090450 -O segment=category -O filterprop=data -O pgsz=on -O mpss=on -O unit=MB
  -------------------------------------------------------------------------------
       Pid Command          Inuse      Pin     Pgsp  Virtual
  14090450 oracle         3542.34     36.6        0  3267.93
       PageSize                Inuse        Pin       Pgsp    Virtual
       s    4 KB             2690.65          0          0    2416.25
       m   64 KB              851.69       36.6          0     851.69
  ...............................................................................
  SYSTEM segments                      Inuse      Pin     Pgsp  Virtual
                                        47.3     36.4        0     47.3
  ...............................................................................
  EXCLUSIVE segments                   Inuse      Pin     Pgsp  Virtual
                                     2394.64     0.19        0  2394.64
  ...............................................................................
  SHARED segments                      Inuse      Pin     Pgsp  Virtual
                                     1100.40        0        0   826.00


and eventually segment detail can be shown by:
    $>  svmon -D 240bac -O frame=on
    
If we allocate a session with 12GB PGA, we can see that 3.77G Pgsp (paging space) is used.
But only part of EXCLUSIVE segments are in paging space, the whole SYSTEM segments and SHARED segments are still kept in memory. 

  $svmon -P 45940898 -O unit=auto
  ----------------------------------------------------------
        Pid Command          Inuse      Pin     Pgsp  Virtual
   45940898 oracle           10.8G    36.6M    3.77G    14.6G
   
   
Even though a session's whole PGA is initially allocated in real memory, AIX VMM can still page out of part of it into paging space when memory is required by other active sessions. 
   
The SHARED segments reported above seems only a part of entire SGA since it is much smaller than SEGSZ of ipcs:

  $ipcs -ma
    T        ID     KEY        MODE       OWNER    GROUP  CREATOR   CGROUP NATTCH     SEGSZ
    m   2268082   00000000 --rw-r-----   oracle      dba   oracle      dba     25 2113929216


We can also watch whole system memory usage by:

  $svmon -G -O unit=auto
 --------------------------------------------------------------------------------------
                   size       inuse        free         pin     virtual  available   mmode
    memory        60.0G       25.6G       15.7G       25.6G       42.8G         0K     Ded
    pg space      8.00G       4.22G

                   work        pers        clnt       other
    pin           12.2M          0K          0K       2.11G
    in use        25.5G          0K     119.00M    



By the way, Temp space used by Global Temporary Tables (GTT), or Temporary Table created by subquery_factoring_clause (implicit or materialize hint) are recorded in TEMP_SPACE_ALLOCATED of V$ACTIVE_SESSION_HISTORY / DBA_HIST_ACTIVE_SESS_HISTORY.

Oracle Doc said about it:
    Amount of TEMP memory (in bytes) consumed by this session at the time this sample was taken.

here the "memory" probably should be interpreted as underlined temporary segment on disk, not memory.

When we create a big GTT, we can neither observe memory consumption with SVMON for the particular session, nor system wide by VMSTAT. However, we can see the intensive I/O activities of disk on which temporary segment is located with IOSTAT.

A query on V$TEMPSEG_USAGE or V$TEMPSTAT can also justify this disk usage.


2. Solaris


On OS level, you can watch it by:

   $pmap -x 23739
                

             Address     Kbytes        RSS       Anon     Locked Mode   Mapped File
    ---------------- ---------- ---------- ---------- ---------- ------ -----------      
    FFFFFD7C4648D000         64         64         64          - rw---    [ anon ]
    0000000060000000      49152      49152          -      49152 rwxsR    [ ism shmid=0x800002d ]
    0000000080000000    7143424    7143424          -    7143424 rwxsR    [ ism shmid=0x800002e ]
    0000000240000000         20         20          -         20 rwxsR    [ ism shmid=0x900002f ]
    ---------------- ---------- ---------- ---------- ---------- ------ -----------
            total Kb    9947208    9936408    2536840    7192596       


Where pid: 25032 is process ID of the session with a PGA allocation of a 2GB memory table.

The above output shows that most of mapping is with a size of 64K marked as "[ anon ]".
The total Anon amounts to 2536840 (a little more than 2GB).
The total Locked is 7192596, which is the SGA size, and are marked like "[ ism shmid=0x800002e ]".

The SGA of this DB is allocated with ISM, therefore all SGA memory is locked, no swap space is needed to back it. We can roughly estimate virtual space, RSS (physical memory), and swap space by:

  virtual space   = 9947208 - 7192596 = 2,754,612 KB
  physical memory = 9936408 - 7192596 = 2,743,812 KB
  swap space      = 2754612 - 2743812 =    10,800 KB


since process 25032 is the only session with large PGA size and swap space is quite low,
we can say that PGA of this session is mainly allocated in physical memory.
     
Similarly we can watch SGA size by:

  $>  ipcs -ma
   T         ID      KEY        MODE        OWNER    GROUP  CREATOR   CGROUP NATTCH      SEGSZ 
   m   16777255   0x4a725828 --rw-rw----   oracle      dba   oracle      dba     84 7365218304
 

here SEGSZ of ipcs is correlated well with total Locked of pmap.

3. Linux



Run:
 SQL > exec pga_mem_test.allo(2048);

and watch PGA memory on Linux allocation by:

 $> pmap -x 46332 |grep -e zero |awk '{cnt+=1; Kbytes+=$2; RSS+=$3; Dirty+=$4} END {print "Count=",cnt,",Kbytes=",Kbytes,",RSS=",RSS,",Dirty=",Dirty,",Avg=",Kbytes/cnt}'
   Count= 37544 ,Kbytes= 2440768 ,RSS= 2403836 ,Dirty= 2403836 ,Avg= 65.0109

where Kbytes should be the allocated size.

Similarly we can watch SGA size in Linux by:

 $> ipcs -m
   ------ Shared Memory Segments --------
   key        shmid      owner      perms      bytes      nattch     status
   0x00000000 1455849472 oracle     640        33554432   21
   0x00000000 1455882241 oracle     640        4513071104 21
   0x19d100cc 1455915010 oracle     640        2097152    21

Pick one PID of Oracle session, and run:

 $> pmap -x 49169 |grep shmid
   0000000060000000   32768     448     448 rw-s-    [ shmid=0x56c68000 ]
   0000000062000000 4407296  276736  276736 rw-s-    [ shmid=0x56c70001 ]
   000000016f000000    2048       4       4 rw-s-    [ shmid=0x56c78002 ]

where 3 shared memory segments match each other:
  shmid=1455849472=0x56c68000  
  shmid=1455882241=0x56c70001
  shmid=1455915010=0x56c78002   
  
and total sga allocated is:
  4548722688 bytes (=33554432+4513071104+2097152)


4. sga_max_size on AIX


Recently a DB on AIX experienced performance problem due to shared pool memory resizing. The memory is configured as follows:

  sga_max_size       26'239'565'824

  shared_pool_size     2'214'592'512
  java_pool_size        268'435'456
  large_pool_size        536'870'912
  log_buffer              83'886'080
  db_cache_size        9'261'023'232
  db_keep_cache_size    1'006'632'960
  db_recycle_cache_size   1'006'632'960


and Automatic Memory Management is disabled (memory_target not specified).

AWR report shows 9 "Memory Resize Ops", where "DEFAULT buffer cache" is indicated by "SHR/IMM", and "shared pool" is signalized by "GRO/IMM".

It also shows the changes of db_cache_size and shared_pool_size:

  Parameter Name    Begin value   End value
  ----------------  ------------   -------------
  db_cache_size     9'261'023'232  8'657'043'456
  shared_pool_size  2'214'592'512  2'818'572'288


from which we can see 603'979'776 Bytes is moved from db_cache_size to shared_pool_size.
However, AIX "ipcs -a" shows:

  T        ID     KEY        MODE       OWNER    GROUP  CREATOR   CGROUP NATTCH     SEGSZ 
  ---------------------------------------------------------------------------------------
  m  20971562   00000000 --rw-rw----   oracle      dba   oracle      dba    379 201326592 
  m 143656025   00000000 --rw-rw----   oracle      dba   oracle      dba    379 26038239232
  m  20972634 0xc1bd22ac --rw-rw----   oracle      dba   oracle      dba    379     16384


so the AIX real allocated memory:

    201'326'592 + 26'038'239'232

is exactly equal to sga_max_size (26'239'565'824) (KEY marked with 0xc1bd22ac of SEGSZ = 16384 is omitted).

In conclusion, SGA_MAX_SIZE specifies the SGA maximum size of AIX allocated shared_memory, but not SGA effectively used size. It is advisable not to specify this parameter.

If sga_max_size is not specified, allocated shared_memory is determined by:

 select sum(value)/1024/1024
   from v$parameter
 where name in('shared_pool_size', 'java_pool_size', 'large_pool_size', 'streams_pool_size',

               'log_buffer', 'db_cache_size', 'db_keep_cache_size', 'db_recycle_cache_size'
               );


and sga_max_size is also deduced the above calculation, and AIX real allocated memory obeys to this value too. 


5. sga_max_size on Solaris x86-64


Repeated the test on a Solaris Operating System (x86-64) DB by specifying sga_max_size = 2600M,
it behaves the identical to AIX.

          Address     Kbytes        RSS       Anon     Locked Mode   Mapped File
  ---------------------------------------------------------------------------------------------
 0000000060000000      32768      26624          -          - rwxs-    [ dism shmid=0x5000020 ]
 0000000080000000    2629632     897024          -          - rwxs-    [ dism shmid=0x5000021 ]
 0000000160000000         12          4          -          - rwxs-    [ dism shmid=0x6000022 ]


Besides this, SGA memory allocation is changed to DISM.

Kbytes column confirms that 2600M = 32768K + 2629632K. However RSS shows that only 897024K of 2629632K (about 34%)  are allocated in physical memory, and no more memory is under Column Locked.

DISM is implemented with a dism daemon which is running oradism to dynamically manage memory allocation.

During database Starting up, DISM is started after all Mandatory Background Processes.

oradism ($ORACLE_HOME/bin/oradism) should be configured with permission:
            -rwsr-xr-x
in order to be running with root privileges.

If sga_max_size is not configured, it is computed as sum of all component groups in v$sga, and ISM is used.
If sga_max_size is configured greater than above computed value, DISM is used (over configured memory goes to Group "Variable Size")
If less or equal (same as not configured), it is still ISM.

Seeing the output of SWAP in Solaris running Oracle:

$ swap -l
swapfile   dev    swaplo   blocks     free
/dev/swap  181,1  8        100663288  100663288

it was asked why SWAP areas not used at all. Here is some essence:

A running Oracle instance is a "Cache of Bits (text, data, bss, stack)". Because of Cache, it is preferred in memory (RSS), whereas SWAP is a device of disk.

Even Oracle double accentuates it by the term "Buffer_Cache", where both "Buffer" and "Cache" are synonyms for memory.

In seldom case, Oracle SGA or PGA are using SWAP (disk), which could be an indication of performance degrade. For Solaris, one case is DISM.

So no use of SWAP conforms to Oracle's intention.

Test Code 



create or replace package pga_mem_test as
  procedure allo (i_mb int);
end;
/
create or replace package body pga_mem_test as
  type t_tab_kb   is table of char(1024);   -- 1KB
  p_tab_1mb          t_tab_kb := t_tab_kb();
  type t_tab_mb   is table of t_tab_kb;    
  p_tab_mb           t_tab_mb := t_tab_mb();
  p_sid              number   := sys.dbms_support.mysid;
 
 -------------------------------------------
 procedure rpt(l_name varchar) is
    l_v$process_mem            varchar2(4000);
    l_v$process_memory_mem     varchar2(4000);
 begin
  select 'Used/Alloc/Freeable/Max >>> '||
          round(pga_used_mem/1024/1024)    ||'/'||round(pga_alloc_mem/1024/1024)||'/'||
            round(pga_freeable_mem/1024/1024)||'/'||round(pga_max_mem/1024/1024)
      into l_v$process_mem
      from v$process
      where addr = (select paddr from v$session where sid = p_sid);
    
   select 'Category(Alloc/Used/Max) >>> '||
            listagg(Category||'('||round(allocated/1024/1024)||'/'||
                    round(used/1024/1024)||'/'||round(max_allocated/1024/1024)||') > ')
    within group (order by Category desc) name_usage_list
      into l_v$process_memory_mem
      from v$process_memory
      where pid = (select pid from v$process
                    where addr = (select paddr from v$session where sid = p_sid));

   dbms_output.put_line(rpad(l_name, 20)||' > '||rpad(l_v$process_mem, 50));
   dbms_output.put_line('             ------ '||l_v$process_memory_mem);
 end rpt;
  
 -------------------------------------------  
  procedure allo (i_mb int) is
  begin
   rpt('Start allocate: '||i_mb||' MB');
 
   select 'M' bulk collect into p_tab_1mb from dual connect by level <= 1024;  -- 1MB
 
   for i in 1..i_mb loop   -- i_mb MB
    p_tab_mb.extend;
    p_tab_mb(i) := p_tab_1mb;
   end loop;
 
   rpt('End allocate: '||i_mb||' MB');
  end allo;
 
end;
/