Friday, May 23, 2014

Oracle 11g AUTO_SAMPLE_SIZE performance on Small Tables

Oracle 11g enhanced DBMS_STATS.AUTO_SAMPLE_SIZE by a new algorithm in calculating NDV
(see Blog: Improvement of AUTO sampling statistics gathering feature in Oracle Database 11g ).

Recent experience on large Databases (a few TB with mixed objects' size) reveals the performance disadvantage on small tables when using AUTO_SAMPLE_SIZE. In most case, objects under 100 MB are 20% to 40% slower.

Such behaviour can be reproduced on a small test DB. At first, create a few tables with different size. Then call dbms_stats.gather_table_stats on them, and record session statistics for each case. (see appended TestCode).

Launch the test by:
     truncate table test_stats;
  exec test_tab_run(10);


then collect test result by:

 with sq as
   (select t.*, value - lag(value) over (partition by table_name, stat_name order by ts) val
      from test_stats t
      where stat_name in ('CPU used by this session', 'undo change vector size', 'sorts (rows)',
                          'db block gets', 'db block changes', 'physical reads', 'consistent gets'))
 select a.table_name, a.table_size, a.stat_name, a.val "AUTO_VAL", p.val "100%_VAL"
       ,(a.val - p.val) diff, round(100*(a.val - p.val)/p.val, 2) "DIFF_%"
 from  (select * from sq where estimate_percent = 'AUTO') a
      ,(select * from sq where estimate_percent = '100%') p
 where a.table_name = p.table_name
   and a.stat_name  = p.stat_name
 order by a.stat_name, a.table_size;


TABLE_NAME TABLE_SIZE STAT_NAME AUTO_VAL 100%_VAL DIFF DIFF_%
TEST_TAB_0KB
131072
CPU used by this session
62
32
30
93.75
TEST_TAB_200KB
262144
CPU used by this session
48
35
13
37.14
TEST_TAB_800KB
917504
CPU used by this session
51
54
-3
-5.56
TEST_TAB_2MB
2228224
CPU used by this session
58
91
-33
-36.26
TEST_TAB_8MB
8519680
CPU used by this session
89
289
-200
-69.2
TEST_TAB_20MB
20971520
CPU used by this session
150
702
-552
-78.63
TEST_TAB_80MB
83755008
CPU used by this session
436
2800
-2364
-84.43
TEST_TAB_0KB
131072
consistent gets
25595
14167
11428
80.67
TEST_TAB_200KB
262144
consistent gets
18976
13582
5394
39.71
TEST_TAB_800KB
917504
consistent gets
19730
14376
5354
37.24
TEST_TAB_2MB
2228224
consistent gets
21338
15990
5348
33.45
TEST_TAB_8MB
8519680
consistent gets
28935
23622
5313
22.49
TEST_TAB_20MB
20971520
consistent gets
44004
38464
5540
14.4
TEST_TAB_80MB
83755008
consistent gets
124858
114275
10583
9.26
TEST_TAB_0KB
131072
db block changes
3347
1748
1599
91.48
TEST_TAB_200KB
262144
db block changes
2277
1639
638
38.93
TEST_TAB_800KB
917504
db block changes
2308
1684
624
37.05
TEST_TAB_2MB
2228224
db block changes
2344
1640
704
42.93
TEST_TAB_8MB
8519680
db block changes
2302
1813
489
26.97
TEST_TAB_20MB
20971520
db block changes
2318
1685
633
37.57
TEST_TAB_80MB
83755008
db block changes
6998
1713
5285
308.52
TEST_TAB_0KB
131072
db block gets
3531
1611
1920
119.18
TEST_TAB_200KB
262144
db block gets
2785
1497
1288
86.04
TEST_TAB_800KB
917504
db block gets
2786
1555
1231
79.16
TEST_TAB_2MB
2228224
db block gets
2849
1499
1350
90.06
TEST_TAB_8MB
8519680
db block gets
2785
1659
1126
67.87
TEST_TAB_20MB
20971520
db block gets
2815
1540
1275
82.79
TEST_TAB_80MB
83755008
db block gets
2765
1577
1188
75.33
TEST_TAB_0KB
131072
physical reads
1885
988
897
90.79
TEST_TAB_200KB
262144
physical reads
1136
906
230
25.39
TEST_TAB_800KB
917504
physical reads
1208
965
243
25.18
TEST_TAB_2MB
2228224
physical reads
1382
1146
236
20.59
TEST_TAB_8MB
8519680
physical reads
2139
1903
236
12.4
TEST_TAB_20MB
20971520
physical reads
3639
25991
-22352
-86
TEST_TAB_80MB
83755008
physical reads
11188
101658
-90470
-88.99
TEST_TAB_0KB
131072
sorts (rows)
13623
9873
3750
37.98
TEST_TAB_200KB
262144
sorts (rows)
12609
69442
-56833
-81.84
TEST_TAB_800KB
917504
sorts (rows)
12609
248542
-235933
-94.93
TEST_TAB_2MB
2228224
sorts (rows)
12609
606742
-594133
-97.92
TEST_TAB_8MB
8519680
sorts (rows)
12609
2397742
-2385133
-99.47
TEST_TAB_20MB
20971520
sorts (rows)
12609
5979742
-5967133
-99.79
TEST_TAB_80MB
83755008
sorts (rows)
12609
23889742
-23877133
-99.95
TEST_TAB_0KB
131072
undo change vector size
143968
84360
59608
70.66
TEST_TAB_200KB
262144
undo change vector size
92536
81572
10964
13.44
TEST_TAB_800KB
917504
undo change vector size
102888
82036
20852
25.42
TEST_TAB_2MB
2228224
undo change vector size
103436
81240
22196
27.32
TEST_TAB_8MB
8519680
undo change vector size
104356
95100
9256
9.73
TEST_TAB_20MB
20971520
undo change vector size
94504
90772
3732
4.11
TEST_TAB_80MB
83755008
undo change vector size
94528
90664
3864
4.26


Above table points out:

  1. "CPU used by this session" is favorable to "AUTO_SAMPLE_SIZE" only when table size is over 800KB.
  2. "consistent gets", "physical reads" are increasing with table size.
          These reflect the real work of gather_table_stats.
  3. "db block changes", "db block gets", "undo change vector size" are almost constant.
     but "AUTO_SAMPLE_SIZE" is mostly higher than "100%".
          They stand for the base load of gather_table_stats, since gather_table_stats hardly makes any modifications on the tables.
          Regardless of table size, they pertain for each call of gather_table_stats.
  4. "sorts (rows)" is constant for "AUTO_SAMPLE_SIZE", but increasing rapidly for "100%".
         This exposes the effective advantage of new algorithm.
so we can say that only when base load in Point 3 is overwhelmed, AUTO_SAMPLE_SIZE exhibits the advantage.

The query on elapsed time (in millisecond) also confirms the variance of performance on object size for both methods:

  with sq as
    (select t.*
           ,(ts - lag(ts) over (partition by table_name, stat_name order by ts))dur
       from test_stats t
       where stat_name in ('CPU used by this session'))
  select a.table_name, a.table_size
        ,round(extract(minute from a.dur)*60*1000 + extract(second from a.dur)*1000) "AUTO_DUR"
        ,round(extract(minute from p.dur)*60*1000 + extract(second from p.dur)*1000) "100%_DUR"
        ,round(extract(minute from (a.dur-p.dur))*60*1000 + extract(second from (a.dur-p.dur))*1000) diff_dur
  from  (select * from sq where estimate_percent = 'AUTO') a
       ,(select * from sq where estimate_percent = '100%') p
  where a.table_name = p.table_name
    and a.stat_name  = p.stat_name
  order by a.stat_name, a.table_size;
 

TABLE_NAME TABLE_SIZE AUTO_DUR 100%_DUR DIFF_DUR
TEST_TAB_0KB
131072
1852
882
970
TEST_TAB_200KB
262144
986
748
239
TEST_TAB_800KB
917504
1066
1029
37
TEST_TAB_2MB
2228224
1179
1648
-469
TEST_TAB_8MB
8519680
1645
4819
-3174
TEST_TAB_20MB
20971520
2650
11407
-8756
TEST_TAB_80MB
83755008
7300
44910
-37610


As a conclusion, this Blog demonstrated the existence of crossing point on object size. Only apart from this point, AUTO_SAMPLE_SIZE is profitable, and benefice is proportional to the object size.

AUTO_SAMPLE_SIZE Internals


By a 10046 trace when running:
   exec dbms_stats.gather_table_stats(null, 'TEST_TAB_2MB', estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE);

we can see the new Operation: APPROXIMATE NDV:

 Rows     Row Source Operation
 -------  ---------------------------------------------------
       1  SORT AGGREGATE (cr=300 pr=296 pw=0 time=246170 us)
   30000   APPROXIMATE NDV AGGREGATE (cr=300 pr=296 pw=0 time=44330 us ...)
   30000    TABLE ACCESS FULL TEST_TAB_2MB (cr=300 pr=296 pw=0 time=23643 us ...)


As discussed in Amit Poddar's "One Pass Distinct Sampling" (comprising details of math foundations), the most expensive operation in the old method is due to aggregate function (count (distinct ..)) to calculate the NDVs, whereas the new algorithm scans the full table once to build a synopsis per column to calculate column NDV.

To explore APPROXIMATE NDV AGGREGATE internals, Oracle designated a special event:
   ORA-10832: Trace approximate NDV row source

The details of approximate NDV algorithm can be traced with event 10832 set at level 1.

The new algorithm is coded in DBMS_SQLTUNE_INTERNAL.GATHER_SQL_STATS and called by DBMS_STATS_INTERNAL.GATHER_SQL_STATS, which is again called by DBMS_STATS.GATHER_BASIC_USING_APPR_NDV, and finally by DBMS_STATS.GATHER_TABLE_STATS.

TestCode


drop view stats_v;

create view stats_v as
select n.name stat_name, s.value from v$statname n, v$mystat s where n.statistic# = s.statistic#;


drop table test_stats;

create table test_stats as
select timestamp'2013-11-22 10:20:30' ts, 'TEST_TAB_xxxxxxxxxx' table_name, 0 table_size, 'AUTO_or_100%' estimate_percent, v.*
from stats_v v where 1=2;


drop table test_tab_0kb;
create table test_tab_0kb(x number, y varchar2(100));
insert into test_tab_0kb(y) values(null);
commit;


drop table test_tab_200kb;
create table test_tab_200kb(x number, y varchar2(100));
insert into test_tab_200kb select level, rpad('abc', mod(level, 100), 'x') from dual connect by level <= 3000;
commit;


drop table test_tab_800kb;
create table test_tab_800kb(x number, y varchar2(100));
insert into test_tab_800kb select level, rpad('abc', mod(level, 100), 'x') from dual connect by level <= 12000;
commit;


drop table test_tab_1mb;
create table test_tab_1mb(x number, y varchar2(100));
insert into test_tab_1mb select level, rpad('abc', mod(level, 100), 'x') from dual connect by level <= 15000;
commit;


drop table test_tab_2mb;
create table test_tab_2mb(x number, y varchar2(100));
insert into test_tab_2mb select level, rpad('abc', mod(level, 100), 'x') from dual connect by level <= 30000;
commit;


drop table test_tab_8mb;
create table test_tab_8mb(x number, y varchar2(100));
insert into test_tab_8mb select level, rpad('abc', mod(level, 100), 'x') from dual connect by level <= 120000;
commit;


drop table test_tab_20mb;
create table test_tab_20mb(x number, y varchar2(100));
insert into test_tab_20mb select level, rpad('abc', mod(level, 100), 'x') from dual connect by level <= 300000;
commit;


drop table test_tab_80mb;
create table test_tab_80mb(x number, y varchar2(100));
insert into test_tab_80mb select level, rpad('abc', mod(level, 100), 'x') from dual connect by level <= 1200000;
commit;


create or replace procedure fill_tab(p_table_name varchar2, p_size_mb number) as
begin 
  for i in 1..p_size_mb loop
    execute immediate 'insert into ' || p_table_name||' select * from test_tab_1mb';
  end loop;
  commit;
end;
/


create or replace procedure test_tab_proc (p_table_name varchar2, p_cnt number) as
  l_table_size number;
 
  procedure flush as
    begin
      execute immediate 'alter system flush shared_pool';
      execute immediate 'alter system flush buffer_cache';   
    end;
begin
  execute immediate 'select bytes from dba_segments where segment_name = :s' into l_table_size using p_table_name;
  insert into test_stats select systimestamp, p_table_name, l_table_size, 'start', v.* from stats_v v;
  flush;
  for i in 1..p_cnt loop
    dbms_stats.gather_table_stats(null, p_table_name, estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE);   
  end loop;
  insert into test_stats select systimestamp, p_table_name, l_table_size, 'AUTO', v.* from stats_v v;
  flush;
  for i in 1..p_cnt loop
    dbms_stats.gather_table_stats(null, p_table_name, estimate_percent => 100);   
  end loop;
  insert into test_stats select systimestamp, p_table_name, l_table_size, '100%', v.* from stats_v v; 
  commit;
end;
/


create or replace procedure test_tab_run(p_cnt number) as
begin
 test_tab_proc('TEST_TAB_0KB',  p_cnt);
 test_tab_proc('TEST_TAB_200KB', p_cnt);
 test_tab_proc('TEST_TAB_800KB', p_cnt);
 test_tab_proc('TEST_TAB_2MB',  p_cnt);
 test_tab_proc('TEST_TAB_8MB',  p_cnt);
 test_tab_proc('TEST_TAB_20MB', p_cnt);
 test_tab_proc('TEST_TAB_80MB', p_cnt);
end;
/

Thursday, May 8, 2014

java stored procedure calls and latch: row cache objects, and performance

In the last Blog, we demonstrated java stored procedure calls and latch: row cache objects.

Is "latch: row cache objects" a real performance problem ?

Let's run some scalability tests for both cases (See TestCode in Blog: java stored procedure calls and latch: row cache objects) on AIX Power7 System with Entitled Capacity 4, SMT=4 and Oracle 11.2.0.3.0.

begin
  xpp_test.run_var(p_case => 1, p_job_var => 32, p_dur_seconds => 60);
  xpp_test.run_var(p_case => 2, p_job_var => 32, p_dur_seconds => 60);
end;
/


The above code runs Case_1 and Case_2 with Jobs varying from 1 to 32 for 60 seconds.

Scalability


Once the test terminated, run query: 

with run as
 (select case, job_cnt, sum(dur)*1000 elapsed,
         sum(run_cnt) run_cnt, sum(java_cnt) java_cnt, round(sum(java_nano)/1000/1000) java_elapsed
  from test_stats
  group by case, job_cnt)
select r1.job_cnt, r1.elapsed,
       r1.run_cnt c1_run_cnt, r2.run_cnt c2_run_cnt,
       r1.java_cnt c1_java_cnt, r2.java_cnt c2_java_cnt,
       r1.java_elapsed c1_java_elapsed, r2.java_elapsed c2_java_elapsed
from (select * from run where case = 1) r1
    ,(select * from run where case = 2) r2
where r1.job_cnt = r2.job_cnt(+)
order by r1.job_cnt;


The output shows that the throughput of both cases (C1_RUN_CNT and C2_RUN_CNT) is very closed.

JOB_CNT ELAPSED C1_RUN_CNT C2_RUN_CNT C1_JAVA_CNT C2_JAVA_CNT C1_JAVA_ELAPSED C2_JAVA_ELAPSED
1
60000
118
119
86612
119
390
1599
2
120000
240
242
176160
242
794
2944
3
180000
360
361
264240
361
1196
4439
4
240000
461
459
338374
459
1566
5756
5
300000
476
503
349384
503
1852
7055
6
360000
515
527
378010
527
2101
8040
7
420000
551
555
404434
555
2408
8952
8
480000
569
580
417646
580
2689
10464
9
540000
597
582
438198
582
2978
11459
10
600000
601
591
441134
591
3163
12797
11
660000
613
635
449942
635
3350
14150
12
720000
646
654
474164
654
3655
15019
13
780000
683
668
501322
668
3912
16619
14
840000
696
714
510864
714
4260
16985
15
900000
714
722
524076
722
4435
19440
16
960000
733
730
538022
730
4822
20250
17
1020000
730
728
535820
728
4947
22637
18
1080000
726
744
532884
744
5022
21034
19
1140000
743
727
545362
727
5274
23924
20
1200000
734
733
538756
733
5611
24402
21
1260000
737
729
540958
729
6075
26138
22
1320000
730
727
535820
727
6406
28011
23
1380000
737
735
540958
735
6555
27972
24
1440000
746
730
547564
730
7040
30529
25
1500000
733
742
538022
742
7181
29288
26
1560000
738
729
541692
729
6620
33637
27
1620000
740
735
543160
735
7269
32531
28
1680000
741
738
543894
738
7881
36467
29
1740000
749
747
549766
747
7652
34837
30
1800000
737
735
540958
735
8285
37758
31
1860000
746
743
547564
743
8759
38612
32
1920000
759
755
557106
755
8326
40002

                                                                     Table-1

If drawing a graph with above data, we can see that throughput is linear till JOB_CNT=4, and reach its peak point at JOB_CNT=16.


                                                                    Figure-1

Latch: row cache objects

                      
If we run the query for latch: row cache objects:

with rc as
  (select s.case, s.job_cnt, (e.gets-s.gets) gets, (e.misses-s.misses) misses, (e.sleeps-s.sleeps) sleeps,
        (e.spin_gets-s.spin_gets) spin_gets, (e.wait_time-s.wait_time) wait_time
  from  (select * from rc_latch_stats where step = 'start') s
       ,(select * from rc_latch_stats where step = 'end')   e
  where s.case = e.case and s.job_cnt = e.job_cnt)
select c1.job_cnt, c1.gets c1_gets, c2.gets c2_gets, c1.misses c1_misses, c2.misses c2_misses,
       c1.sleeps c1_sleeps, c2.sleeps c2_sleeps, c1.spin_gets c1_spin_gets, c2.spin_gets c2_spin_gets,
       round(c1.wait_time/1000) c1_wait_time, round(c2.wait_time/1000) c2_wait_time
from (select * from rc where case = 1) c1
    ,(select * from rc where case = 2) c2
where  c1.job_cnt = c2.job_cnt(+)
order by c1.job_cnt;


and look its output:

JOB_CNT C1_GETS C2_GETS C1_MISSES C2_MISSES C1_SLEEPS C2_SLEEPS C1_SPIN_GETS C2_SPIN_GETS C1_WAIT_TIME C2_WAIT_TIME
1
261925
38360
1
47
0
0
1
47
0
0
2
590567
4070
1950
0
0
0
1950
0
0
0
3
755390
2755
6342
0
0
0
6342
0
0
0
4
941682
3459
13772
0
2
0
13770
0
3
0
5
958751
4947
60853
0
4
0
60850
0
14
0
6
1128800
5224
92700
146
1
0
92699
146
1
0
7
1193847
4719
102209
31
1
0
102208
31
1
0
8
1219302
19716
151143
0
0
0
151143
0
0
0
9
1261478
5227
155337
8
13
0
155324
8
464
0
10
1256130
5404
195400
0
1
0
195399
0
0
0
11
1264005
7931
251793
27
0
0
251793
27
0
0
12
1320357
6014
254131
73
1
0
254130
73
0
0
13
1492126
7266
394229
121
0
0
394229
121
0
0
14
1496853
6948
408408
173
0
0
408408
173
0
0
15
1511372
7124
417898
472
13
0
417885
472
1795
0
16
1521562
8392
437642
283
11
0
437631
283
2271
0
17
1494286
20916
355035
179
36
0
354999
179
4324
0
18
1459280
7891
320098
420
53
0
320046
420
9841
0
19
1611370
9055
335870
212
37
0
335833
212
9564
0
20
1566755
9186
280001
218
100
5
279903
213
23946
530
21
1553970
8186
268234
52
70
14
268165
38
21438
3470
22
1499580
9350
230573
112
157
0
230420
112
40523
0
23
1474488
8538
244989
142
158
4
244832
138
36357
392
24
1628831
9677
229768
69
197
0
229573
69
54006
0
25
1568732
11112
226365
344
392
5
225984
339
103739
311
26
1548840
22742
229974
155
323
0
229656
155
76637
0
27
1519217
10326
204424
148
418
0
204014
148
86043
0
28
1479227
9322
200061
78
476
0
199596
78
102191
0
29
1641837
9841
226975
220
493
0
226486
220
107470
0
30
1556361
12102
210739
159
449
1
210305
158
93532
18
31
1518952
10004
191122
255
558
22
190572
233
104639
4785
32
1544149
10248
191564
254
628
9
190940
245
114982
447


                                                                   Table-2

There is a broad difference of latch gets and misses between Case_1 and Case_2.

Also drawing a graph for Case_1:


                                                                  Figure-2

For Case_1, latch misses reach its top at about JOB_CNT=16, and starting from JOB_CNT=15, there is a considerable latch wait_time.

Unbalanced Workload


One interesting query to reveal the unbalanced workload of Job sessions (UNIX processes) is:

select job_cnt, min(run_cnt) min, max(run_cnt) max, (max(run_cnt) - min(run_cnt)) delta_max
      ,round(avg(run_cnt), 2) avg, round(stddev(run_cnt), 2) stddev
from test_stats t
where case = 1
group by case, job_cnt
order by case, job_cnt;


which shows certain difference (STDDEV) starting from JOB_CNT = 4 on 4 Physical CPUs AIX (Entitled Capacity = 4) since at least one CPU is required to perform UNIX system and other Oracle tasks.
(Only Case_1 is showed, but Case_2 looks similar)


JOB_CNT MIN MAX DELTA_MAX AVG STDDEV
1
118
118
0
118
0
2
120
120
0
120
0
3
120
120
0
120
0
4
109
120
11
115.25
5.62
5
74
119
45
95.2
19.98
6
75
97
22
85.83
7.7
7
66
100
34
78.71
13.05
8
63
77
14
71.13
4.82
9
58
76
18
66.33
6.3
10
56
73
17
60.1
4.95
11
48
67
19
55.73
6.26
12
49
64
15
53.83
5.06
13
47
66
19
52.54
6.1
14
46
65
19
49.71
5.95
15
45
51
6
47.6
2.2
16
44
47
3
45.81
0.75
17
38
46
8
42.94
2.05
18
35
44
9
40.33
2.47
19
32
46
14
39.11
4.23
20
32
43
11
36.7
3.7
21
28
42
14
35.1
4.37
22
25
39
14
33.18
3.43
23
26
39
13
32.04
3.13
24
25
36
11
31.08
2.65
25
25
32
7
29.32
1.73
26
25
35
10
28.38
2.5
27
25
33
8
27.41
1.89
28
24
30
6
26.46
1.53
29
24
28
4
25.83
1.07
30
22
28
6
24.57
1.33
31
22
33
11
24.06
2.06
32
21
30
9
23.72
1.9

                                   Table-3

Wednesday, May 7, 2014

java stored procedure calls and latch: row cache objects

Frequent calls of java stored procedure by PL/SQL causes heavy latch: row cache objects.

(A previous Blog: dbms_aq.dequeue - latch: row cache objects on AIX  talked about dbms_aq.dequeue and latch: row cache objects)

As a concrete test example, appended code contains two different implementations of PL/SQL - Java interface when calling XML Pull Parser (XmlPullParser).
  1.  Case_1: for each TAG and text, make one java call.
  2.  Case_2: make one single java call for whole XML Document.
 
All tests are done on AIX Power7 System with Entitled Capacity 4, SMT=4 and Oracle 11.2.0.3.0.
It is also reproducible on Solaris Operating System (x86-64) with 16 virtual processors (2 8-Cores physical processors).

Case_1  32 Jobs


At first, run Case_1 for 10 minutes with 32 Jobs:
 
  exec xpp_test.run_job(p_case => 1, p_job_cnt => 32, p_dur_seconds => 600); 
 
and look the top 5 events in AWR report:
 

Top 5 Timed Foreground Events

EventWaitsTime(s)Avg wait (ms)% DB timeWait Class
DB CPU 2,410 6.70 
latch: row cache objects7,0261,3011853.62Concurrency
library cache: mutex X3,756111300.31Concurrency
log file sync321410.00Commit
control file sequential read3,700100.00System I/O

we can see a noticeable "latch: row cache objects" with average wait of 185 ms.

During the job running, pick the hex address of 'latch: row cache objects' by:

SQL> select p1raw from v$session v where p2=280 or event = 'latch: row cache objects';

    0X7000000B412F9F0

To look Oracle internal data structure, dump 16 bytes starting from this address by:

SQL> oradebug peek  0X7000000B412F9F0 16

   [7000000B412F9F0, 7000000B412FA10) = 00000000 00000026 2ED50834 01180000

where
    00000026 is Oracle PID (38 in hex) which is currently holding the latch.
    2ED50834 is v$latch.gets (785713204 in hex) where name='row cache objects'.
    0118     is v$latch.latch# (280 in hex).

If dumping address further, you can also find v$latch.misses, sleeps and other columns.

Case_1  16 Jobs


If reducing Jobs to 16,

 exec xpp_test.run_job(p_case => 1, p_job_cnt => 16, p_dur_seconds => 600);

AWR shows:

Top 5 Timed Foreground Events

EventWaitsTime(s)Avg wait (ms)% DB timeWait Class
DB CPU 2,385 14.19 
library cache: mutex X9,989100.01Concurrency
control file sequential read3,588100.00System I/O
db file sequential read839000.00User I/O
latch: row cache objects27020.00Concurrency


Case_2  32 Jobs


Then repeating the above two calls with Case_2:

 exec xpp_test.run_job(p_case => 2, p_job_cnt => 32, p_dur_seconds => 600);

Top 5 Timed Foreground Events

EventWaitsTime(s)Avg wait (ms)% DB timeWait Class
DB CPU 2,401 6.66 
log file sync402390.00Commit
control file sequential read3,700100.00System I/O
cursor: pin S wait on X201300.00Concurrency
SQL*Net break/reset to client54040.00Application


Case_2  16 Jobs


exec xpp_test.run_job(p_case => 2, p_job_cnt => 16, p_dur_seconds => 600);

Top 5 Timed Foreground Events

EventWaitsTime(s)Avg wait (ms)% DB timeWait Class
DB CPU 2,416 11.12 
log file sync885520.02Commit
latch: shared pool119780.00Concurrency
control file sequential read3,588100.00System I/O
library cache: mutex X169160.00Concurrency


We can hardly see any "latch: row cache objects" in other three AWR reports.


TestCode


create or replace type xpp_array_obj as object (
    evt_type    number(12)
   ,name        varchar2(32767)
   ,namespc     varchar2(4000)
   ,attr_cnt    number(12)
   ,text        varchar2(32767)
);
/


create or replace type xpp_array_tab as table of xpp_array_obj;
/


drop table test_stats;

create table test_stats (ts timestamp, case number, job_cnt number, dur number, run_cnt number, java_cnt number, java_nano number);

drop table rc_latch_stats;

create table rc_latch_stats as
select timestamp'2013-11-22 10:20:30' ts, 'start' step, 0 case, 0 job_cnt, v.*
from   v$latch v where 0=1 and name = 'row cache objects';


drop java source "XmlPullParserJava";

create or replace and resolve java source named "XmlPullParserJava" as
  import java.util.LinkedList;
  import java.util.Vector;
  import java.sql.Connection;
  import java.sql.Clob;

  import oracle.jdbc.driver.OracleDriver;
  import oracle.sql.ARRAY;
  import oracle.sql.ArrayDescriptor;
  import oracle.sql.STRUCT;
  import oracle.sql.StructDescriptor;

  import org.xmlpull.mxp1.MXParser;
  import org.xmlpull.v1.XmlPullParser;


public class XmlPullParserJava {
  private static XmlPullParser XPP = new MXParser();

  public static void setInput(Clob inputClob) {
    try {
      XPP.setInput(inputClob.getCharacterStream());
    } catch (Exception t) {
      System.out.println("\n Message: " + t.getMessage() + "\n Cause: " + t.getCause());
    }
  }

  public static int next(String[] name, String[] namespace, int[] attributeCount, String[] text, long[] elapsed) {
    try {
      long startTime    = System.nanoTime();
      int next          = XPP.next();
      name[0]           = XPP.getName();
      namespace[0]      = XPP.getNamespace();
      attributeCount[0] = XPP.getAttributeCount();
      text[0]           = XPP.getText();
      elapsed[0]        = System.nanoTime() - startTime;
      return next;
    } catch (Throwable t) {
      System.out.println("\n Message: " + t.getMessage() + "\n Cause: " + t.getCause());
      return -1;
    }
  }

  public static void nextAll(ARRAY retAllArray[], long[] elapsed) {
    try {
      Connection conn         = new OracleDriver().defaultConnection();
      ArrayDescriptor  aDescr = ArrayDescriptor.createDescriptor("XPP_ARRAY_TAB", conn);
      StructDescriptor sDescr = StructDescriptor.createDescriptor("XPP_ARRAY_OBJ", conn);
      String [] lineContent   = new String[5];
      STRUCT lineObj = null;
      Vector retAll  = new Vector();

      long startTime = System.nanoTime();
      String name;
      String nameSpace;
      int    attributeCount;
      String text;

      int eventType = XPP.next();
      while (eventType != XmlPullParser.END_DOCUMENT) {
        lineContent[0] = String.valueOf(eventType);
        lineContent[1] = XPP.getName();
        lineContent[2] = XPP.getNamespace();
        lineContent[3] = String.valueOf(XPP.getAttributeCount());
        lineContent[4] = XPP.getText();
        lineObj = new STRUCT(sDescr, conn, lineContent);
        retAll.add(lineObj);
        eventType = XPP.next();
      }

     elapsed[0] = System.nanoTime() - startTime;
     retAllArray[0] = new ARRAY(aDescr, conn, retAll.toArray());
    } catch (Throwable t) {
      System.out.println("\n Message: " + t.getMessage() + "\n Cause: " + t.getCause());
    }
  }
}
/


create or replace package k.xpp_test as
  type xpp_rec is record (
    evt_type   pls_integer
   ,name       varchar2(32767)
   ,namespc    varchar2(4000)
   ,attr_cnt   pls_integer := -1
   ,text       varchar2(32767)
   ,elapsed    pls_integer
  );

  v_xpp_rec        xpp_rec;
  v_xml_doc        varchar2(32767);
  v_java_call_nano number := 0;
  v_java_call_cnt  number := 0;

  ---------------------------------------------------------------------------
  function  xml_gen(p_limit number) return varchar2;
  procedure set_input(p_clob clob);
  function  next return integer;
  function  nextall return xpp_array_tab;
  procedure run_next;
  procedure run_nextall;
  procedure run_cnt(p_case number, p_job_cnt number, p_dur_seconds number);
  procedure run_job(p_case number, p_job_cnt number, p_dur_seconds number);
  procedure run_var(p_case number, p_job_var number, p_dur_seconds number);
end xpp_test;
/


create or replace package body k.xpp_test as
  procedure java_setInput(p_clob clob) as
  language java name 'XmlPullParserJava.setInput(java.sql.Clob)';

  function java_next(p_name out varchar2, p_namespc out varchar2, p_attr_cnt out number, p_text out varchar2, p_elapsed out number)
  return pls_integer as
  language java name 'XmlPullParserJava.next(java.lang.String[], java.lang.String[], int[], java.lang.String[], long[]) return int';

  procedure java_nextall(p_array out xpp_array_tab, p_elapsed out number) as
  language java name 'XmlPullParserJava.nextAll(oracle.sql.ARRAY[], long[])';

  ---------------------------------------------------------------------------
  function xml_gen(p_limit number) return varchar2 as
    l_level_1      varchar2(32767) := '';
    l_level_2      varchar2(32767) := '';
    l_level_3      varchar2(32767) := '';
    l_xml          varchar2(32767);
  begin
    for i in 1..p_limit loop
      l_level_1 := l_level_1||'<TAG_1_'||i||'>text_1_'||i||'</TAG_1_'||i||'>';
    end loop;
    for i in 1..p_limit loop
      l_level_2 := l_level_2||'<TAG_2_'||i||'>'||l_level_1||'</TAG_2_'||i||'>';
    end loop;
    for i in 1..p_limit loop
      l_level_3 := l_level_3||'<TAG_3_'||i||'>'||l_level_2||'</TAG_3_'||i||'>';
    end loop;

    l_xml := q'[<?xml version="1.0" encoding="UTF-8"?>]'||'<root>'||l_level_3||'</root>';
    return l_xml;
  end xml_gen;

  ----------------------------------------------------------------------------
  -- simulate XML processing
  procedure cpu_burning as
    l_cnt number := 10000;
    l_x   number := 0;
  begin
    for i in 1..l_cnt loop
      l_x := i;
    end loop;
  end cpu_burning;

  ----------------------------------------------------------------------------
  procedure set_input(p_clob clob) as
  begin
    java_setInput(p_clob);
  end set_input;

  ----------------------------------------------------------------------------
  function next return integer as
  begin
    return java_next(v_xpp_rec.name, v_xpp_rec.namespc, v_xpp_rec.attr_cnt, v_xpp_rec.text, v_xpp_rec.elapsed);
  end next;

  ---------------------------------------------------------------------------
  function nextall return xpp_array_tab as
    l_xpp_array_tab xpp_array_tab := new xpp_array_tab();
  begin
    java_nextall(l_xpp_array_tab, v_xpp_rec.elapsed);
    return l_xpp_array_tab;
  end nextall;

  ---------------------------------------------------------------------------
  procedure run_next as
    l_evt_type     pls_integer;
    l_name         varchar2(32767);
    l_text         varchar2(32767);
  begin
    set_input(v_xml_doc);
    l_evt_type := next;
    loop
      case l_evt_type
        when 0 then l_name := v_xpp_rec.name;
        when 1 then l_name := v_xpp_rec.name;
        when 2 then l_name := v_xpp_rec.name;
        when 3 then l_name := v_xpp_rec.name;
        when 4 then l_text := v_xpp_rec.text;
        else        null;
      end case;
      l_evt_type := next;
      cpu_burning;          -- generate high 'row cache objects'
      v_java_call_cnt  := v_java_call_cnt + 1;
      v_java_call_nano := v_java_call_nano + v_xpp_rec.elapsed;
      exit when l_evt_type = 1;   -- XPP END_DOCUMENT
    end loop;
  end run_next;

  ---------------------------------------------------------------------------
  procedure run_nextall as
    l_xpp_array_tab xpp_array_tab;
    l_name          varchar2(32767);
    l_text          varchar2(32767);   
  begin
    set_input(v_xml_doc);
    l_xpp_array_tab := nextall;
    v_java_call_cnt  := v_java_call_cnt + 1;
    v_java_call_nano := v_java_call_nano + v_xpp_rec.elapsed;
    for i in 1..l_xpp_array_tab.count loop
      case l_xpp_array_tab(i).evt_type
        when 0 then l_name := l_xpp_array_tab(i).name;
        when 1 then l_name := l_xpp_array_tab(i).name;
        when 2 then l_name := l_xpp_array_tab(i).name;
        when 3 then l_name := l_xpp_array_tab(i).name;
        when 4 then l_text := l_xpp_array_tab(i).text;
        else        null;
      end case;   
      cpu_burning;
    end loop;
  end run_nextall;

  ---------------------------------------------------------------------------
  procedure run_cnt(p_case number, p_job_cnt number, p_dur_seconds number)
  as
    l_cnt        number := 0;
    l_start_time number := dbms_utility.get_time;
  begin
    loop
      case p_case
        when 1 then run_next;
        when 2 then run_nextall;
        else        null;
      end case;
      l_cnt := l_cnt + 1;
      exit when (dbms_utility.get_time - l_start_time) > (p_dur_seconds * 100);
    end loop;

    insert into test_stats values(systimestamp, p_case, p_job_cnt, p_dur_seconds, l_cnt, v_java_call_cnt, v_java_call_nano);
    commit;
  end run_cnt;

  ----------------------------------------------------------------------------
  procedure run_job (p_case number, p_job_cnt number, p_dur_seconds number)
  as
    l_job_id pls_integer;
    l_stmt   varchar2(1000);
  begin

    insert into rc_latch_stats
    select systimestamp, 'start', p_case, p_job_cnt, v.*
    from   v$latch v
    where  name = 'row cache objects';
    commit;

    for i in 1.. p_job_cnt loop
      l_stmt := 'begin xpp_test.run_cnt(' || p_case        ||', '
                                          || p_job_cnt     ||', '
                                          || p_dur_seconds ||
                                        '); end;';
      dbms_output.put_line(l_stmt);
      dbms_job.submit(l_job_id, l_stmt);
    end loop;
    commit;

    dbms_lock.sleep(p_dur_seconds);
    insert into rc_latch_stats
    select systimestamp, 'end', p_case, p_job_cnt, v.*
    from   v$latch v
    where  name = 'row cache objects';
    commit;

    dbms_lock.sleep(5);
  end run_job;

  ----------------------------------------------------------------------------
  procedure run_var(p_case number, p_job_var number, p_dur_seconds number) as
  begin
    for job_cnt in 1..p_job_var loop
      run_job(p_case, job_cnt, p_dur_seconds);
    end loop;
  end run_var;

---------------------------------------------------------------------------
  begin
    v_xml_doc := xml_gen(6);   -- XML Doc with length=6681, Tag + Text = 734

end xpp_test;
/