Thursday, March 28, 2013

Shared Pool - Memory Allocation - unbalanced

Continuing from my last field experience of ORA-04031 (Shared Pool - KKSSP),  I would like to share my further observation of Shared Pool Memory Allocation.

At first exposing some figures (facts), then trying to discuss the unbalanced memory allocation among subpools, eventual the cause of ORA-04031.

Figures


1. Areas (Subheaps)


There are 959 areas (Oracle 11.2.0.3.0) in shared pool dump.

The query:

   select count(*) from v$sgastat where pool = 'shared pool';

returns only 881, so not all areas are registered in v$sgastat, for example, "kokcd", "post agent".
(probably you will see 1062 in Oracle 12.1.0.1.0, and 35, 593, 729, 857 in the previous releases enumerated in Oracle Performance Firefighting)

2. Subpools


Subpool size can have more than 20% difference (range from 2'684'354KB to 3'355'443KB, all size in KB), areas in Subpools are also different. There are total 959 areas, but each subpool can have maximum 316 areas.

Name Subpool_1 Subpool_2 Subpool_3 Subpool_4 Subpool_5 Subpool_6 Subpool_7 Sum
Subpool_Size 2'684'354 3'355'443 3'355'443 2'684'354 3'355'443 2'684'354 3'355'443 21'474'836
Area_Count 295 303 308 306 306 308 316 959


Top 5 Areas Name Subpool_1 Subpool_2 Subpool_3 Subpool_4 Subpool_5 Subpool_6 Subpool_7 Sum
KKSSP 617'693 563'340 590'296 590'798 577'532 643'504 577'237 4'160'404
db_block_hash_buckets 529'039 535'429 529'039 539'525 535'429 529'044 535'429 3'732'935
KGLH0 554'532 485'312 464'006 353'634 450'528 346'045 332'455 2'986'514
SQLA 1'110 588'943 565'317 185'664 574'008 155'498 272'452 2'342'994
free memory 306'562 358'029 353'146 306'659 342'194 325'386 304'535 2'296'513


3. Single Area Allocation


There are 649 areas allocated only to one single subpool. 110 areas allocated to two subpools, but often one of two is allocated simply 24 Bytes, and other is allocated the rest, for example, FileOpenBlock is allocated 24 Bytes into Subpool_6, 510,025,424 Bytes into Subpool_7.

  Here are the top 10 (all size in Byte):
                                                                
Name                    Subpool_1   Subpool_2   Subpool_3   Subpool_4    Subpool_5   Subpool_6   Subpool_7    Sum         ALLOC_COUNT
----------------------- ----------- ----------- ----------- ------------ ----------- ----------- ------------ ----------- -----------
FileOpenBlock                                                                        24          510,025,424  510,025,448           2
enqueue                 39,257,104                                                               24           39,257,128            2
KSK VT POOL                                                 24           19,130,696                           19,130,720            2
ktlbk state objects                             17,971,200                                                    17,971,200            1
Wait History Segment                                                                 15,733,120               15,733,120            1
Global Context                                              11,017,024                                        11,017,024            1
call                    24           10,057,864                                                               10,057,888            2
keswx:plan en                        10,011,896                                                               10,011,896            1
KEWS sesstat values                                         9,802,944                                         9,802,944             1
FileIdentificatonBlock               24         7,711,368                                                     7,711,392             2

4. Unbalanced Areas


Following Areas allocations are not well balanced (difference to average > 30%)   

  The top 10 are (all size in Byte):

Name                 Subpool_1   Subpool_2   Subpool_3   Subpool_4    Subpool_5   Subpool_6   Subpool_7    Sum           ALLOC_COUNT
-------------------- ----------- ----------- ----------- ------------ ----------- ----------- ------------ ------------- -----------
SQLA                 1,110,321   588,943,273 565,317,412 185,664,243  574,008,432 155,498,158 272,452,386  2,342,994,419           7
FileOpenBlock                                                                     24          510,025,424  510,025,448             2
SQLP                 761,808     66,155,040  61,548,864  15,324,936   57,623,952  10,234,544  21,440,800   233,089,944             7
KGLS                 737,792     9,169,960   8,730,128   4,925,272    8,779,216   4,559,760   5,985,728    42,887,856              7
KQR M PO             30,720      799,232     512         255,896      30,568,128  138,680     8,590,256    40,383,424              7
enqueue              39,257,104                                                               24           39,257,128              2
PLDIA                0           7,246,864   8,072,928   4,285,000    7,129,088   3,520,016   5,066,408    35,320,304              7
write state object   4,662,208   4,662,184   48          9,324,368    24          4,662,208   4,662,208    27,973,248              7
KQR L PO                         924,032     21,262,248  117,768                                           22,304,048              3
PRTDS                241,720     4,668,976   4,023,352   2,747,344    5,001,776   2,203,384   1,311,744    20,198,296              7


Discussion


Based on the above figures, we can see that each subpool can have maximum 1/3 of total areas, and certain areas are only allocated to 1 or 2 subpools. For example,

  • "FileOpenBlock" is in Subpool_6 and Subpool_7, but Subpool_6 has only 24 Bytes.
  • "enqueue" is in Subpool_1 and Subpool_7, but Subpool_7 has only 24 Bytes.
  • "PLDIA" is in Subpool_1 and Subpool_7, but Subpool_1 has 0 Bytes.
  • "Global Context" is only in Subpool_4.

Now we have some more explanation of "Global Context" contention as experienced in the past since it is all only in one single subpool.

Certain areas are extremely unbalanced among subpools. For example,

  • Subpool_1 has   1'110 KB for SQLA.  
  • Subpool_2 has 588'943 KB for SQLA.

That probably results in the error:

       ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","SELECT MAX(XX) FROM...","SQLA","tmp")

(probably Oracle can improve the error message by including the Subpool number)

Now I wonder if this shared pool architecture is by design hardly to be balanced. For static area allocations, like "db_block_hash_buckets", it could be OK. However dynamic areas with frequent memory fragmentation would put tremendous pressure on memory management.

ORA-04031 is a cumulative outcome after certain time of ineffective usage of memory. Therefore it is hard to predict and reproduce. Quite often it is thrown by the victim session, which are not necessarily the cause of the error.

By what we learned from LISP (CLOS) and Java, which are using automatic memory management (garbage collection), we can also say that Shared Pool memory management puts remarkable challenge on this technique, and hence a long way to be perfect.

For example, Oracle shared_pool durations "instance", "session", "cursor", and "execution" (vs. LISP garbage collector generations), is probably the second parameter of Heap Dump title:
   HEAP DUMP heap name="sga heap(1,0)"
where 1 is subpool_number, 0 is duration.

In Oracle, if Redo is claimed as most critical mechanism in Oracle, probably shared pool should be declared as most sophisticated one.

BTW, to inspect number of Subpools, run following query (instead of checking "_kghdsidx_count").

    select * from v$latch_children where name = 'shared pool';

It returns 7 rows (maxmum 7 Subpools), but only the active ones are those with higher GETS (or MISSES).