All tests are done in Oracle 12.1.0.2.0.
1. Test Setup
create or replace java source named "OracleJavaOOM512" as
public class OracleJavaOOM512 {
// final static byte[] SB = new byte[1024*1024*500]; // workaround
public static void createBuffer(int bufferSize) {
// OracleRuntime.setMaxRunspaceSize(1024*1024*1024) // set 1GB, no help
// System.gc();
byte[] SB = new byte[bufferSize];
}
}
/
alter JAVA CLASS "OracleJavaOOM512" COMPILE;
create or replace procedure createBuffer512(p_buffer_size in number) is language java
name 'OracleJavaOOM512.createBuffer(int)';
/
create or replace procedure createBuffer512_loop
(p_cnt number, p_start number, p_delta number, p_sleep_seconds number := 0) as
l_size number := p_start;
begin
for i in 1..p_cnt loop
dbms_lock.sleep(p_sleep_seconds);
dbms_output.put_line('Step -- '||i||' --, Buffer Size (MB) = '||l_size);
createBuffer512(1024*1024*l_size);
l_size := l_size + p_delta;
end loop;
end;
/
2. Tests
Run first test to allocate 511MB:
Sqlplus > exec createBuffer512(1024*1024*511);
PL/SQL procedure successfully completed
succeeded without Error.Run second test for 512MB:
Sqlplus > exec createBuffer512(1024*1024*512);
ORA-29532: Java call terminated by uncaught Java exception: java.lang.OutOfMemoryError
ORA-06512: at "S.CREATEBUFFER512", line 1
It hits the general Java error: ORA-29532 java.lang.OutOfMemoryError, which probably indicates that the JVM is limited by 512MB
for one single object instance, in this test, it is "new byte[bufferSize]".Calling OracleRuntime.setMaxRunspaceSize(1024*1024*1024) to set MaxRunspaceSize to 1GB does not help.
If interested, a 29532 event trace can also be made:
Sqlplus > alter session set max_dump_file_size = UNLIMITED;
Sqlplus > alter session set tracefile_identifier='OOM512_29532';
Sqlplus > alter session set events '29532 trace name errorstack level 3';
Sqlplus > exec createBuffer512(1024*1024*512);
ORA-29532: Java call terminated by uncaught Java exception: java.lang.OutOfMemoryError
Open a new Oracle session, run third test case, which gradually allocates memory from 1MB to 511MB, each time increases 1MB per call.
Sqlplus > exec createBuffer512_loop(511, 1, 1);
Step -- 1 --, Buffer Size (MB) = 1
Step -- 2 --, Buffer Size (MB) = 2
...
Step -- 264 --, Buffer Size (MB) = 264
Step -- 265 --, Buffer Size (MB) = 265
BEGIN createBuffer512_loop(511, 1, 1); END;
*
ERROR at line 1:
ORA-29532: Java call terminated by uncaught Java exception: java.lang.OutOfMemoryError
ORA-04030: out of process memory when trying to allocate 282220624 bytes (joxu pga heap,f:OldSpace)
ORA-06512: at "S.CREATEBUFFER512", line 1
ORA-06512: at "S.CREATEBUFFER512_LOOP", line 8
ORA-06512: at line 1
It throws OutOfMemoryError when allocating 266MB (the number can be varied, but before reaching 511),
and generates trace file and incident file.
Note that in this test, besides the above general Java error: ORA-29532, it throws also ORA-04030.
3. Oracle JVM Garbage Collection
Oracle 12c Document: Database Java Developer's Guide Chapter 1 Introduction to Java in Oracle Database, Section Memory Spaces Management depicts a 3-Layer Java memory Management:
- New Space in Call space
- Old Space in Call space
- Session Memory in Session space
- Generational scavenging for short-lived objects
- Mark and lazy sweep collection for objects that exist for the life of a single call
- Copying collector for long-lived objects, that is, objects that live across calls within a session
Book: Oracle Database Programming using Java and Web Services (Kuassi Mensah) Section 2.2 Java Memory Management (Page 41-51) reveals internal GC mechanisms (probably written for Oracle 10g).
Garbage Collection Techniques
The OracleJVM memory manager uses a set of GC techniques for its various memory structures (listed in Table 2.1), including the generational GC, mark-sweep GC, and copy GC.
These 3 GC techniques in Oracle 10g look similar to GC algorithms in 12c.
Mark-sweep GC
A mark-sweep GC consists of two phases: (1) the mark phase (garbage detection) and the (2) sweep phase (Garbage Collection). The sweep phase places the “garbaged” memory on the freelists. The allocation is serviced out of the freelists. Lazy sweeping is a variation technique that marks objects normally, but, in the sweep phase, leaves it up to the allocation phase to determine whether the marked objects are live (no need to reclaim) or not (can be reclaimed), rather than sweeping the object memory at that time.
Java Memory Areas
Oldspace is an object memory used for holding long-lived or large objects (i.e., larger than 1-K Bytes) for the duration of a call. It is cleaned up using Marksweep GC (described earlier) for large objects; it uses a variation of “lazy sweeping” for small objects.
Looking up Table 2.1 "Summary of OracleJVM Memory Structure" (Page 50), since our test allocation is more than 1-K Bytes, it should be allocated into Old-space using Buddy memory allocation, and garbage collected by Mark-Sweep.
4. Reasoning
The error message in Sqlplus Window indicates that memory problem is caused by:
joxu pga heap,f:OldSpace
which also appears in trace file.
An excerpt of incident file looks like:
=======================================
TOP 10 MEMORY USES FOR THIS PROCESS
---------------------------------------
84% 91 MB, 17 chunks: "f:OldSpace " JAVA
joxu pga heap ds=fffffd7ffbe8bcc8 dsprt=fffffd7ffc0a9078
2% 2432 KB, 92 chunks: "free memory "
pga heap ds=fffffd7ffc345640 dsprt=0
2% 1767 KB, 437 chunks: "free memory "
session heap ds=fffffd7ffc02d728 dsprt=fffffd7ffc358350
1% 1064 KB, 2 chunks: "permanent memory " JAVA
joxu pga heap ds=fffffd7ffbe8bcc8 dsprt=fffffd7ffc0a9078
...
=======================================
PRIVATE MEMORY SUMMARY FOR THIS PROCESS
---------------------------------------
******************************************************
PRIVATE HEAP SUMMARY DUMP
108 MB total:
104 MB commented, 519 KB permanent
4125 KB free (0 KB in empty extents),
97 MB, 1 heap: "joxp heap " 2315 KB free held
8894 KB, 1 heap: "session heap " 679 KB free held
------------------------------------------------------
Summary of subheaps at depth 1
104 MB total:
101 MB commented, 952 KB permanent
1814 KB free (25 KB in empty extents),
94 MB, 1 heap: "joxu pga heap "
------------------------------------------------------
Summary of subheaps at depth 2
99 MB total:
97 MB commented, 1408 KB permanent
272 KB free (0 KB in empty extents),
91 MB, 17 chunks: "f:OldSpace "
=========================================
REAL-FREE ALLOCATOR DUMP FOR THIS PROCESS
-----------------------------------------
Dump of Real-Free Memory Allocator Heap [0xfffffd7ffbfed000]
mag=0xfefe0001 flg=0x5000003 fds=0x0 blksz=65536
blkdstbl=0xfffffd7ffbfed010, iniblk=523264 maxblk=524288 numsegs=255
In-use num=133 siz=112984064, Freeable num=224 siz=2914910208, Free num=15 siz=21889024
...
-------------------------
Top 10 processes:
-------------------------
(percentage is of 27 GB total allocated memory)
99% pid 66: 105 MB used of 27 GB allocated (27 GB freeable) <= CURRENT PROC
0% pid 11: 7792 KB used of 8294 KB allocated
...
======================================================
ESTIMATED MEMORY USES FOR ALL PROCESSES
------------------------------------------------------
(from 1 snapshot out of 67 processes)
99% 27 GB, 66 processes: "unsnapshotted "
1% 172 MB, 1 process : "free memory "
pga heap 95 chunks
0% 93 MB, 1 process : "f:OldSpace " JAVA
...
******************* Dumping process map ****************
Start addr - End addr Size PgSz Shmid Perms Object name
------------------ - ----------------- ------- ---- ---------- ------------------------------
...
0x80000000 - 0x1af000000 4964352K 4K 0x40000039 rwxs- [ anon ] osm
...
0xfffffd792326d000 - 0xfffffd79333bd000 263488K 4K 0xffffffff rw--- [ anon ]
0xfffffd793355d000 - 0xfffffd794365d000 263168K 4K 0xffffffff rw--- [ anon ]
0xfffffd794374d000 - 0xfffffd795374d000 262144K 4K 0xffffffff rw--- [ anon ]
0xfffffd795377d000 - 0xfffffd796293d000 247552K 4K 0xffffffff rw--- [ anon ]
0xfffffd7962aad000 - 0xfffffd7971b6d000 246528K 4K 0xffffffff rw--- [ anon ]
...
There are about 30GB segments marked as:rw--- [ anon ]
which are session's PGA.
Above output shows:
105 MB used of 27 GB allocated (27 GB freeable) <= CURRENT PROC
among 27 GB allocated, almost all of which is freeable.
However due to Oracle JVM lazy GC, they are not immediately released, and eventually leads to:
ORA-04030: out of process memory when trying to allocate 283285584 bytes (joxu pga heap,f:OldSpace)
(By the way, biggest chunk 4964352K is SGA, created with Optimized Shared Memory (OSM) available since Oracle 12c when sga_max_size is not configured, or is set greater than the computed value.
See ISM, DISM in Oracle 11g Blog: PGA, SGA memory usage watching from UNIX)
Concerning the freeable memory, V$PROCESS_MEMORY documentation said:
For the "Freeable" category, Column ALLOCATED is the amount of free PGA memory eligible to be released to the operating system.
Note that OutOfMemoryError process still holds the "Freeable" memory and does not immediately put them back to freelists since the process is still alive. It takes time for GC to reclaim them gradually. If the process (Oracle session) terminated (disconnected), it is immediately reusable by other processes.
You can query V$PROCESS_MEMORY or V$PROCESS to monitor it.
Crosschecking with Solaris pmap output:
oracle@test$ pmap -x 1122
1122: testdb (LOCAL=NO)
Address Kbytes RSS Anon Locked Mode Mapped File
...
0000000080000000 4964352 4964352 - 4964352 rwxs- [ osm shmid=0x40000039 ]
...
FFFFFD785FC1D000 306816 306812 306812 - rw--- [ anon ]
FFFFFD787296D000 305792 305788 305788 - rw--- [ anon ]
FFFFFD78854BD000 297472 297452 297452 - rw--- [ anon ]
FFFFFD7897F1D000 296448 296412 296412 - rw--- [ anon ]
FFFFFD78AA76D000 295424 295420 295420 - rw--- [ anon ]
FFFFFD78BCECD000 294336 294332 294332 - rw--- [ anon ]
...
---------------- ---------- ---------- ---------- ----------
total Kb 31990020 31933340 26615772 4982784
(you can even use Solaris MDB Formatting Dcmds to display content at one above virtual address space)Incident file and pmap output show that the majority of memory are allocated as "anon" pages, but not removed from the mappings, which means that there are more mmap than munmap.
We can see this un-matching mmap and munmap with:
oracle@test$ truss -cp -t mmap,munmap 1122
syscall seconds calls errors
mmap .056 1014
munmap 6.447 68
-- Linux trace mmap and munmap: strace -e trace=memory -p pid
The output shows that munmap is 1716 (=(6.447/68)/(0.056/1014)) times expensive. That is probably one reason why the GC strategy is lazy delayed.or watch the details by:
oracle@test$ truss -p -t mmap,munmap 1122
...
mmap(0xFFFFFD7C6E8ED000, 202375168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANON, 4294967295, 0) = 0xFFFFFD7C6E8ED000
mmap(0xFFFFFD7C625ED000, 203423744, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANON, 4294967295, 0) = 0xFFFFFD7C625ED000
mmap(0xFFFFFD7C561FD000, 204537856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANON, 4294967295, 0) = 0xFFFFFD7C561FD000
...
mmap(0xFFFFFD7A43ACD000, 243924992, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANON, 4294967295, 0) = 0xFFFFFD7A43ACD000
mmap(0xFFFFFD7A34F7D000, 244973568, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANON, 4294967295, 0) = 0xFFFFFD7A34F7D000
mmap(0xFFFFFD7A2632D000, 246022144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANON, 4294967295, 0) = 0xFFFFFD7A2632D000
mmap(0xFFFFFD7A175DD000, 247136256, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANON, 4294967295, 0) = 0xFFFFFD7A175DD000
munmap(0xFFFFFD7A2632D000, 246022144) = 0
munmap(0xFFFFFD7A34F7D000, 244973568) = 0
munmap(0xFFFFFD7A43ACD000, 243924992) = 0
...
It shows that not all mapped chunks are un-mapped, and unmapping occurs in reverse order (last mapping is first unmapped).With Solaris dtrace, we can count number of mmap and munmap, as well as sum of their amount (in MB). The large difference between mmap and munmap is probably pointing to the unreleased "Freeable" memory.
oracle@test$ dtrace -n 'syscall::mmap:entry,syscall::munmap:entry/pid==1122 && arg0 != 0x0/
{
@CNT[probefunc] = count();
@SUM[probefunc] = sum(arg1/1024/1024);
}
END
{
printf("\n%10s %10s %10s\n", "NAME", "COUNT", "SUM(MB)");
printa( "%10s %10@d %10@d\n", @CNT, @SUM);
}'
------------ output -------------
NAME COUNT SUM(MB)
munmap 44 3987
mmap 404 32400
There are 360 (404-44) more mmap, 32400 MB allocated, but only 3987 MB de-allocated.
which results in 28413 (32400-3987) MB unreleased allocation.With Section 7 appended Dtrace script, we can trace the details of each mmap and munmap. For the OutOfMemoryError execution:
Sqlplus > exec createBuffer512_loop(511, 1, 1);
Step -- 1 --, Buffer Size (MB) = 1
Step -- 2 --, Buffer Size (MB) = 2
...
Step -- 262 --, Buffer Size (MB) = 262
Step -- 263 --, Buffer Size (MB) = 263
Step -- 264 --, Buffer Size (MB) = 264
Step -- 265 --, Buffer Size (MB) = 265
BEGIN createBuffer512_loop(511, 1, 1); END;
*
ERROR at line 1:
ORA-29532: Java call terminated by uncaught Java exception: java.lang.OutOfMemoryError
ORA-04030: out of process memory when trying to allocate 282220624 bytes (joxu pga heap,f:OldSpace)
ORA-06512: at "S.CREATEBUFFER512", line 1
ORA-06512: at "S.CREATEBUFFER512_LOOP", line 8
ORA-06512: at line 1
Dtrace output looks like:
=> mmap 2017 Feb 5 14:16:01: mmap(addr=0xFFFF80FFBA5CF000, len=1114112, prot=0x3, flags=0x112, fildes=4294967295, off=0)
<= mmap 2017 Feb 5 14:16:01: 0xFFFF80FFBA5CF000
**++** increase = 1114112( 1(MB)), mmap# = 1, mmap_MB = 1, munmap# = 0, munmap_MB = 0, sum = 1
=> mmap 2017 Feb 5 14:16:01: mmap(addr=0xFFFF80FFBA1AF000, len=1114112, prot=0x3, flags=0x112, fildes=4294967295, off=0)
<= mmap 2017 Feb 5 14:16:01: 0xFFFF80FFBA1AF000
**++** increase = 1114112( 1(MB)), mmap# = 2, mmap_MB = 2, munmap# = 0, munmap_MB = 0, sum = 2
...
=> mmap 2017 Feb 5 14:19:43: mmap(addr=0xFFFF80FAA317F000, len=253493248, prot=0x3, flags=0x112, fildes=4294967295, off=0)
<= mmap 2017 Feb 5 14:19:43: 0xFFFF80FAA317F000
**++** increase = 253493248(241(MB)), mmap# = 359, mmap_MB = 28726, munmap# = 44, munmap_MB = 8671, sum = 20055
=> munmap 2017 Feb 5 14:19:54: munmap(addr=0xFFFF80FAB24AF000, len=252444672)
<= munmap 2017 Feb 5 14:19:54: 0.
**--** decrease = 252444672(240(MB)), mmap# = 359, mmap_MB = 28726, munmap# = 45, munmap_MB = 8911, sum = 19815
...
=> mmap 2017 Feb 5 14:20:04: mmap(addr=0xFFFF80FAE901F000, len=276955136, prot=0x3, flags=0x112, fildes=4294967295, off=0)
<= mmap 2017 Feb 5 14:20:04: 0xFFFF80FAE901F000
**++** increase = 276955136(264(MB)), mmap# = 391, mmap_MB = 31308, munmap# = 58, munmap_MB = 11940, sum = 19368
=> mmap 2017 Feb 5 14:20:04: mmap(addr=0xFFFF80FAD864F000, len=278003712, prot=0x3, flags=0x112, fildes=4294967295, off=0)
<= mmap 2017 Feb 5 14:20:04: 0xFFFF80FAD864F000
**++** increase = 278003712(265(MB)), mmap# = 393, mmap_MB = 31573, munmap# = 58, munmap_MB = 11940, sum = 19633
=> mmap 2017 Feb 5 14:20:06: mmap(addr=0xFFFF80FAB6FAF000, len=280100864, prot=0x3, flags=0x112, fildes=4294967295, off=0)
<= mmap 2017 Feb 5 14:20:06: 0xFFFF80FAB6FAF000
**++** increase = 280100864(267(MB)), mmap# = 397, mmap_MB = 32106, munmap# = 58, munmap_MB = 11940, sum = 20166
=> mmap 2017 Feb 5 14:20:09: mmap(addr=0xFFFF80FA924DF000, len=281214976, prot=0x3, flags=0x112, fildes=4294967295, off=0)
<= mmap 2017 Feb 5 14:20:09: 0xFFFF80FA924DF000
**++** increase = 281214976(268(MB)), mmap# = 399, mmap_MB = 32374, munmap# = 58, munmap_MB = 11940, sum = 20434
...
:END
NAME COUNT SUM(MB)
munmap 93 15793
mmap 405 32375
The error message indicates that OutOfMemoryError is raised when trying to
allocate 282,220,624 bytes (joxu pga heap,f:OldSpace), where 282,220,624 bytes is about 269 MB.
If we look the last line of Dtrace output, the last successful allocation is 268 MB,
the next allocation of one more MB (p_delta of createBuffer512_loop call is 1 MB) hits the error.Above createBuffer512_loop(511, 1, 1) is a loop of allocation with each step increasing 1 MB. It seems that each step requires a new memory chunk to be allocated since the previous step allocated memory is 1 MB smaller, which can not be reused and hence assigned to Category "Freeable". If such steps continue, in step 256, it will reach 32 GB (256*(1+256)/2=32,768 MB). But as shown above, RDBMS also makes un-allocation (munmap) to release certain allocated memory, that probably why it results in error in Step 265.
One worst speculation could also be that RDBMS only sums up allocated memory (ignore un-allocated), when reaching limit of 32 GB, it raises OutOfMemoryError.
The last line in Dtrace output shows: sum = 20,434 MB, which computed by 32,374 (mmap_MB) - 11,940 (munmap_MB). It corresponds to the UNIX command output:
top.RES, prstat.RSS, "ps aux".RSS, "pmap -x".Anon total
In most tests, it is about 15% more than v$process_memory Category "Freeable"
("Freeable" is almost total of session PGA memory), for example:
select round(allocated/1e6) allocated_mb, v.* from v$process_memory v
where pid in (22)
order by allocated desc;
ALLOCATED_MB PID SERIAL# CATEGORY ALLOCATED Used MAX_ALLOCATED
------------ ---- ------- -------- -------------- --------- -------------
17,180 22 53 Freeable 17,179,869,184 0
398 22 53 Other 397,914,636 3,533,074,452
4 22 53 JAVA 4,167,792 4,160,648 548,833,672
3 22 53 PL/SQL 2,622,904 2,533,008 2,924,192
0 22 53 SQL 224,248 17,432 7,021,552
Note that v$process_memory.MAX_ALLOCATED is not filled (missed) for Category "Freeable",
we will give a further look of this field in its historical view: dba_hist_process_mem_summary.Probably We can say that this OutOfMemoryError is indeed a PGA Freeable memory problem, instead of OraleJVM Java.
For mmap prot / flags arguments, we can find the mapping in truss command output and sys/mman.h.
---------- truss output and mmap prot/flags ----------
mmap(0xFFFF80FFA95D2000, 23461888, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANON, -1, 0) = 0xFFFF80FFA95D2000
prot=0x3 (PROT_READ|PROT_WRITE)
flags=0x112 (MAP_PRIVATE|MAP_FIXED|MAP_ANON)
mmap(0xFFFF80FFB99E2000, 2818048, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_NORESERVE|MAP_ANON, -1, 12460032) = 0xFFFF80FFB99E2000
prot=0x0 (PROT_NONE)
flags=0x152 (MAP_PRIVATE|MAP_FIXED|MAP_NORESERVE|MAP_ANON)
---------- /usr/include/sys/mman.h ----------
#define PROT_READ 0x1 /* pages can be read */
#define PROT_WRITE 0x2 /* pages can be written */
#define PROT_EXEC 0x4 /* pages can be executed */
#define PROT_NONE 0x0 /* pages cannot be accessed */
/* sharing types: must choose either SHARED or PRIVATE */
#define MAP_SHARED 1 /* share changes */
#define MAP_PRIVATE 2 /* changes are private */
#define MAP_TYPE 0xf /* mask for share type */
/* other flags to mmap (or-ed in to MAP_SHARED or MAP_PRIVATE) */
#define MAP_FIXED 0x10 /* user assigns address */
#define MAP_NORESERVE 0x40 /* don't reserve needed swap area */
#define MAP_ANON 0x100 /* map anonymous pages directly */
#define MAP_ANONYMOUS MAP_ANON /* (source compatibility) */
Other dtrace commands can help us understand largeobj (oldSpace) allocation and GC activity.
oracle@test$ dtrace -n 'pid$target::eoa_new_largeobj:entry {@[pid, ustack(40, 0)] = count(); }' -p 1122
oracle@test$ dtrace -n 'pid$target::eoa_oldspace_gc_scan_xt:entry {@[pid, ustack(40, 0)] = count(); }' -p 1122
Two tests below have no problem since the first step already allocated the max needed memory,
which is sufficient for the usage of all later steps (no more need to any allocation):
Sqlplus > exec createBuffer512_loop(511, 511, -1);
Sqlplus > exec createBuffer512_loop(511, 511, 0);
Both above tests propose us a quick workaround by introducing a class (static) variable and setting it to a enough big value as follows:final static byte[] SB = new byte[1024*1024*500];
Note that java class variable in Java Stored Procedure is stateful object, which survives across Java calls in the same RDBMS session (like PL/SQL Package variable), and can be reset by dbms_java.endsession (similar to dbms_session.reset_package), which clears any Java session state remaining from previous execution of Java in the current RDBMS session (introduced in 11g release 1, see Java Developer's Guide - 4.5 Two-Tier Duration for Java Session State ). Therefore, if JAVA VM memory is under pressure, dbms_java.endsession can be called after certain threshold to release the memory, a trade-off between Memory (JavaGC) and Performance.
Oracle Plsql starts OracleJVM by the call stack:
kkxmjexe
jox_invoke_java_
joe_run_vm
JVM initializes Java session, loads classes, and finally clears session with:
joevm_init_session_imcache -- imcache in-memory cache
joevm_in_clinit -- multi calls, one call for one class init
joevm_clear_session_imcache
If we call dbms_java.endsession after the excution of java stored procedure
(createBuffer512 in our example, see later test code),
Java session state is cleaned (hence Java memory freed).
The next excution of createBuffer512 makes kkxmjexe to re-execute above three steps
(initialize, load, clear). hence performance sacrifice.
It should be called in a controlled manner, see later example: procedure createBuffer512_loop
calling dbms_java.endsession.Here the call stacks of init_session and clear_session, which are re-run if Java session state is reset by dbms_java.endsession:
---------- joevm_init_session_imcache Call Stack ----------
#0 0x000000000585d5e0 in joevm_init_session_imcache ()
#1 0x00000000056ea471 in eoa_startup_default_objmems ()
#2 0x00000000056e9dd9 in ioesub_init_call ()
#3 0x00000000057304d2 in seoa_note_stack_outside ()
#4 0x00000000056e9ae1 in ioe_init_call ()
#5 0x0000000004d7ce51 in jox03_init_call ()
#6 0x0000000004d02f67 in jox_invoke_java_ ()
#7 0x0000000009c6db05 in kkxmjexe ()
#8 0x0000000004bed088 in kgmexcb ()
#9 0x000000000275a64b in kkxmswu ()
#10 0x0000000004beabc3 in kgmexwi ()
#11 0x0000000004be997a in kgmexec ()
#12 0x0000000011889684 in pefjavacal ()
#13 0x0000000005628dd1 in pefcal ()
#14 0x00000000054cda7b in pevm_FCAL ()
#15 0x00000000054aeb4e in pfrinstr_FCAL ()
#16 0x00000000129e612c in pfrrun_no_tool ()
#17 0x00000000129e4a96 in pfrrun ()
#18 0x00000000129edac0 in plsql_run ()
---------- joevm_clear_session_imcache Call Stack ----------
#0 0x0000000005859340 in joevm_clear_session_imcache ()
#1 0x00000000056e8209 in eoa_vm_call_cleanup ()
#2 0x00000000056e7b64 in ioesub_end_call ()
#3 0x00000000057304d2 in seoa_note_stack_outside ()
#4 0x00000000056e8809 in ioe_end_call ()
#5 0x0000000004c6a4b5 in joxenc ()
#6 0x0000000004c69609 in joxdlc_internal ()
#7 0x0000000004c68e73 in jox_flush_java_session ()
#8 0x0000000004d2cb10 in jox_ioe_call_java_ ()
#9 0x000000000c44c587 in pcklfun ()
#10 0x00000000056cb29b in spefcpfa ()
#11 0x000000000568d267 in spefmccallstd ()
#12 0x000000000562f45b in peftrusted ()
#13 0x0000000004113278 in psdexsp ()
#14 0x00000000126edae2 in rpiswu2 ()
#15 0x000000000373cdea in kxe_push_env_internal_pp_ ()
#16 0x00000000037a4bb5 in kkx_push_env_for_ICD_for_new_session ()
#17 0x0000000004112c63 in psdextp ()
#18 0x0000000005629427 in pefccal ()
#19 0x0000000005628cdf in pefcal ()
#20 0x00000000054cda7b in pevm_FCAL ()
#21 0x00000000054aeb4e in pfrinstr_FCAL ()
#22 0x00000000129e612c in pfrrun_no_tool ()
#23 0x00000000129e4a96 in pfrrun ()
#24 0x00000000129edac0 in plsql_run ()
5. Possible Causes and Fixes
When OracleJVM hitting OutOfMemoryError, it often falls into two v$process_memory categorie: Java or Freeable. Here two possible fixes.
5.1. Category JAVA
If v$process_memory for the session (Oracle PID) shows that high memory usage is Category JAVA, or incident file (ORA-04030: out of process memory) has the top consumer JAVA:
100% 32 GB, 292 chunks: "f:OldSpace " JAVA
joxu pga heap ds=ffff80ffbc1abc60 dsprt=ffff80ffbacb9078
we can call dbms_java.endsession to clear any Java session state,
which makes createBuffer512_loop(511, 1, 1) successfully completed:
create or replace procedure createBuffer512_loop
(p_cnt number, p_start number, p_delta number, p_sleep_seconds number := 0) as
l_size number := p_start;
l_ret varchar2(100);
l_chuncks number := 1; --10 -- control dbms_java.endsession call frequency
begin
for i in 1..p_cnt loop
dbms_lock.sleep(p_sleep_seconds);
dbms_output.put_line('Step -- '||i||' --, Buffer Size (MB) = '||l_size);
createBuffer512(1024*1024*l_size);
l_size := l_size + p_delta;
if mod(i, l_chuncks) = 0 then
l_ret := dbms_java.endsession;
--l_ret := dbms_java.endsession_and_related_state;
--dbms_session.free_unused_user_memory; -- no help
dbms_output.put_line('dbms_java.endsession return = '||l_ret);
end if;
end loop;
end;
/
Sqlplus > exec createBuffer512_loop(511, 1, 1);
Step -- 1 --, Buffer Size (MB) = 1
dbms_java.endsession return = java session ended
...
Step -- 510 --, Buffer Size (MB) = 510
dbms_java.endsession return = java session ended
Step -- 511 --, Buffer Size (MB) = 511
dbms_java.endsession return = java session ended
PL/SQL procedure successfully completed.
5.2. Category Freeable
If v$process_memory for the session (Oracle PID) shows that high memory usage is Category Freeable, or incident file (ORA-04030: out of process memory) has the top consumer "unsnapshotted":
100% 17 GB, 36 processes: "unsnapshotted "
0% 1247 KB, 1 process : "f:OldSpace " JAVA
joxu pga heap 14 chunks
We can try to allocate max reuired memory in the very first call of OracleJVM, for example, above test:
exec createBuffer512_loop(511, 511, 0);
We can also try to make certain pause (sleeping) and and expect that RDBMS can have time to collect the freeable space (not Java GC):
exec createBuffer512_loop(511, 1, 1, 0.4);
As tested, this does not always work since the freeable space is marked as "unsnapshotted",
and only released when the session disconnected.
6. Oracle Performance Views
Query GC statistics and process Java memory by:
select s.program, s.sid, n.name p_name, t.value, round(t.value/1e6, 2) mb
from v$session s, v$sesstat t, v$statname n
where 1=1
and s.sid=t.sid
and n.statistic# = t.statistic#
and name = 'java call heap gc count'
and s.sid in (920);
select round(allocated/1e6) mb, v.* from v$process_memory v
where pid in (66)
order by allocated desc;
select h.end_interval_time, allocated_max, num_processes, p.*
from dba_hist_process_mem_summary p, dba_hist_snapshot h
where h.snap_id = p.snap_id
and category ='JAVA'
order by h.end_interval_time desc;
Populate v$process_memory_detail by:(see dbms_session.get_package_memory_utilization and limitations)
Sqlplus > exec pga_sampling(920, 120, 1);
and then run query:
select round(bytes/1024/1024, 2) mb, v.* from process_memory_detail_v v
where 1=1
and name in ('f:OldSpace', 'free memory')
and round(bytes/1024/1024, 2) > 10
order by timestamp;
With following query, we can displays historical information about dynamic PGA memory usage
and map them to AWR Section "Process Memory Summary" column names.
One problem with this view is that max_allocated_max (AWR "Hist Max Alloc (MB)") is always missed for Category "Freeable"
(that could be a bug). As previously discussed, this "Freeable" is exactly the figures about unreleased memory (leaks).
select h.end_interval_time, m.category
-- total allocated for all processes in that Category at the snap_id
,round(allocated_total/1024/1024) "Alloc (MB)"
-- max allocated for one single process in that Category at the snap_id
,round(allocated_max/1024/1024) "Max Alloc (MB)"
-- max ever allocated for one single process in that Category during snap interval
-- max of v$process_memory.MAX_ALLOCATED. It is null for Category "Freeable"
,round(max_allocated_max/1024/1024) "Hist Max Alloc (MB)"
,m.*
from dba_hist_process_mem_summary m, dba_hist_snapshot h
where m.snap_id = h.snap_id
order by m.snap_id desc, m.category;
Oracle performance views v$process, v$process_memory reports "freeable" memory, whereas view:v$sesstat, v$active_session_history, dba_hist_active_sess_history
do not record this real figure, probably due to "Freeable" memory (27GB) which is signaled by the 27 GB "unsnapshotted" line in the incident file:
99% 27 GB, 66 processes: "unsnapshotted "
7. Linux Test
All above tests are performed on Solaris. Repeat the same test on a Linux with 24GB of total usable memory.
Sqlplus > exec createBuffer512_loop(511, 1, 1);
ORA-03113: end-of-file on communication channel
Process ID: 32251
-- Linux trace all memory mapping related system calls: mmap and munmap
-- strace -e trace=memory -p 32251
There are no trace file or incident file generated.
Checking dmesg out, it looks like that process is killed by PSP0 (call oom-killer) when reaching system limit of 24GB, but before reaching 32GB PGA limit.
[08:07:36] ora_psp0_td invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
[08:07:36] ora_psp0_td cpuset=/ mems_allowed=0
[08:07:36] Pid: 23201, comm: ora_psp0_td Not tainted 2.6.32-642.6.2.el6.x86_64 #1
[08:07:36] Call Trace:
[08:07:36] [] ? dump_header+0x90/0x1b0
[08:07:36] [] ? security_real_capable_noaudit+0x3c/0x70
[08:07:36] [] ? oom_kill_process+0x82/0x2a0
[08:07:36] [] ? select_bad_process+0xe1/0x120
[08:07:36] [] ? out_of_memory+0x220/0x3c0
[08:07:36] [] ? __alloc_pages_nodemask+0x93c/0x950
[08:07:36] [] ? alloc_pages_current+0xaa/0x110
[08:07:36] [] ? __page_cache_alloc+0x87/0x90
[08:07:36] [] ? find_get_page+0x1e/0xa0
[08:07:36] [] ? filemap_fault+0x1a7/0x500
[08:07:36] [] ? __do_fault+0x54/0x530
[08:07:36] [] ? handle_pte_fault+0xf7/0xb20
[08:07:36] [] ? sem_lock+0x6c/0x130
[08:07:36] [] ? sys_semtimedop+0x338/0xae0
[08:07:36] [] ? handle_mm_fault+0x299/0x3d0
[08:07:36] [] ? __do_page_fault+0x146/0x500
[08:07:36] [] ? wait_consider_task+0x7e6/0xb20
[08:07:36] [] ? read_tsc+0x9/0x10
[08:07:36] [] ? ktime_get_ts+0xbf/0x100
[08:07:36] [] ? poll_select_copy_remaining+0xf8/0x150
[08:07:36] [] ? do_page_fault+0x3e/0xa0
[08:07:36] [] ? page_fault+0x25/0x30
...
[08:07:36] Out of memory: Kill process 32251 (oracle_32251_td) score 619 or sacrifice child
[08:07:36] Killed process 32251, UID 100, (oracle_32251_td) total-vm:20111724kB, anon-rss:15083248kB, file-rss:215188kB
[08:07:36] oracle_32251_td: page allocation failure. order:0, mode:0x280da
[08:07:36] Pid: 32251, comm: oracle_32251_td Not tainted 2.6.32-642.6.2.el6.x86_64 #1
Note Buddy memory request "order=0", and oom-killer reason: "score 619"
7. Dtrace for mmap and munmap
sudo dtrace -F -n '
BEGIN {
delta_size = 0;
mmap_mb = 0;
munmap_mb = 0;
mmap_count = 0;
munmap_count = 0;
}
syscall::mmap:entry/pid == $1 && arg0 != 0x0/
{
delta_size = arg1; mmap_count++;
mmap_mb = mmap_mb + delta_size/1024/1024;
printf("%Y: %s(addr=0x%X, len=%d, prot=0x%X, flags=0x%X, fildes=%d, off=%d)",
walltimestamp, probefunc, arg0, arg1, arg2, arg3, arg4, arg5);
@CNT[probefunc] = count();
@SUM[probefunc] = sum(arg1/1024/1024);
}
syscall::mmap:return/pid == $1 && delta_size > 0/
{
printf("%Y: 0x%X\n", walltimestamp, arg0);
printf("**++** increase = %9d(%3d(MB)), mmap# = %d, mmap_MB = %d, munmap# = %d, munmap_MB = %d, sum = %d\n",
delta_size, delta_size/1024/1024, mmap_count, mmap_mb, munmap_count, munmap_mb, mmap_mb-munmap_mb);
delta_size = 0;
}
syscall::munmap:entry/pid == $1/
{
delta_size = arg1; munmap_count++;
munmap_mb = munmap_mb + delta_size/1024/1024;
printf("%Y: %s(addr=0x%X, len=%d)", walltimestamp, probefunc, arg0, arg1);
@CNT[probefunc] = count();
@SUM[probefunc] = sum(arg1/1024/1024);
}
syscall::munmap:return/pid == $1 && delta_size > 0/
{
printf("%Y: %d.\n", walltimestamp, arg0);
printf("**--** decrease = %9d(%3d(MB)), mmap# = %d, mmap_MB = %d, munmap# = %d, munmap_MB = %d, sum = %d\n",
delta_size, delta_size/1024/1024, mmap_count, mmap_mb, munmap_count, munmap_mb, mmap_mb-munmap_mb);
delta_size = 0;
}
END
{
printf("\n%10s %10s %10s\n", "NAME", "COUNT", "SUM(MB)");
printa( "%10s %10@d %10@d\n", @CNT, @SUM);
}' 1122
References
1. Database Java Developer's Guide Chapter 1 Introduction to Java in Oracle Database, Section Memory Spaces Management (Old Space)
2.Oracle Database Programming using Java and Web Services (Kuassi Mensah) Section 2.2 Java Memory Management (Page 41-49)
3. dbms_session.get_package_memory_utilization and limitations
4.Oracle Java OutOfMemoryError (11g Blog, no more reproducible in 12c)
5. PGA, SGA memory usage watching from UNIX