Monday, January 8, 2018

Oracle Latch Misses, Spin_gets, Sleeps and Recurrent_Misses

After the correction of Blog: Is latch misses statistic gathered or deduced ? I re-examined a few hourly AWR reports in one heavily loaded system (Oracle 12.1.0.2.0).


1. Latch Statistics


We observed two distinctive Sets of Latch Statistics in AWR reports.

Here AWR Top Events Section and "Latch Sleep Breakdown" Section in both Cases.


1.1. Case-1: Recurrent Spin_gets


Top 10 Foreground Events by Total Wait Time

Event Waits Total Wait Time (sec) Avg wait (ms) % DB time Wait Class
latch: cache buffers chains 1,078,528 291.7K 270.50 37.6 Concurrency
latch: row cache objects 423,293 78.2K 184.86 10.1 Concurrency

Latch Sleep Breakdown

  • ordered by misses desc
  • Latch Name Get Requests Misses Sleeps Spin Gets
    cache buffers chains 15,134,617,472 176,939,147 1,040,507 314,665,205
    row cache objects 101,225,008 4,075,626 429,314 3,673,739
    Result Cache: RC Latch 7,876,403 44,622 25,380 19,440

    Look "cache buffers chains" (CBC) statistics:
         Misses    = 176,939,147
         Sleeps    =   1,040,507
         Spin Gets = 314,665,205
    
    Spin_Gets(314,665,205) > Misses(176,939,147) attested the existence of recurrent Spin_Gets.


    1.2. Case-2: Recurrent Sleeps


    Top 10 Foreground Events by Total Wait Time

    Event Waits Total Wait Time (sec) Avg wait (ms) % DB time Wait Class
    latch: cache buffers chains 1,630,692 748.3K 458.90 92.7 Concurrency
    latch: row cache objects 314,199 23.9K 76.14 3.0 Concurrency

    Latch Sleep Breakdown

  • ordered by misses desc
  • Latch Name Get Requests Misses Sleeps Spin Gets
    row cache objects 55,852,812 3,505,131 320,033 3,224,234
    cache buffers chains 1,722,527,053 2,564,938 2,819,003 1,026,077
    kokc descriptor allocation latch 5,984,155 343,396 17,985 326,266

    Look "cache buffers chains" (CBC) statistics:
         Misses    = 2,564,938
         Sleeps    = 2,819,003
         Spin Gets = 1,026,077
    
    Sleeps(2,819,003) > Misses(2,564,938) evidenced the existence of recurrent Sleeps.


    2. Recurrent_Misses


    Every unsuccessful Spin_Get will trigger a Sleep (or each Sleep follows a Spin_Get), every unsuccessful Sleep will result in a successive longer Sleep.

    Latch Sleep is started with an initial duration of 10ms (or 1ms), and progressively elevated after every unsuccessful Sleep (See later discussion of V$LATCH.sleep[1-11], V$EVENT_HISTOGRAM.wait_time_milli, and pseudo-code in Book: Oracle Performance Firefighting Page 79).

    Suppose initial duration being 10ms, Avg wait (ms) = 270.50 (Case-1) indicates that there exist Sleeps, which are not initial Sleeps, hence recurrent Sleeps.

    For OS scheduler Time Slice of 10ms, Avg wait (ms) = 270.50 signifies that majority of Sleeps spans more than 20 Ticks, probably recurrent Sleeps (also possible not awakened from waitqueue, i.e., not scheduled into runqueue).

    Recurrent_Misses can be computed by:
    Case-1:  
        spin_gets + sleeps - misses = 314,665,205 + 1,040,507 - 176,939,147 = 138,766,565 > 0
    
    Case-2:
        spin_gets + sleeps - misses = 1,026,077 + 2,819,003 - 2,564,938 = 1,280,142 > 0
    
    Therefore, above formula could be refined as:
           sleeps + spin_gets – misses 
         = recurrent_misses 
         = recurrent_sleeps + recurrent_spin_gets 
    
    Look proportion of recurrent Misses:
        Case-1: recurrent_misses / Misses amounts to 78.4% (138,766,565 / 176,939,147)
        Case-2: recurrent_misses / Misses amounts to 49.9% (1,280,142 / 2,564,938)
    
    both are serious cases of latch misses.


    3. Observations

    (1). Latch Sleep Breakdown Table shows:
             Case-1: "cache buffers chains": Spin_Gets(314,665,205) > Misses(176,939,147)
             Case-2: "cache buffers chains": Sleeps(2,819,003)           > Misses(2,564,938)
       
       Hence there exist Spin_Get and Sleep which are not followed by one success Get, that is, recurrent Miss.
       Or we can say that each Miss can trigger multi Spin_Gets or Sleeps.
       
    (2). Sleeps can be bigger than Spin_Gets, for example, 
          Case-1: "Result Cache: RC Latch": Sleeps(25,380) > Spin_Gets(19,440)
       
    (3). Usually when one Latch has very slight contention, 
           misses = sleeps + spin_gets
       It means that each Miss is resolved either by one Spin_Get, or by one Sleep.
       This also indicates that immediately after each Sleep (OS Process waking up and rescheduled), Oracle makes one Latch Get.
       
    (4). Top 10 Foreground Events Table shows:
          Latch Sleeps is approximate to Event Waits, i.e V$LATCH.sleeps = V$SYSTEM_EVENT.total_waits.
       It signifies that only Latch Sleeps is counted as Waits because Process is on the wait queue. 
       Latch Spin_Gets is not counted because Process is on CPU (runqueue).
    
    By the above observations, it is not clear how Oracle makes the counting of Misses, Sleeps and Spin_Gets in willing-to-wait mode. And it is hard to reconstruct a pseudo-code to match above countings.


    4. Discussions

    (1). Oracle session's Response Time is made of Service Time and Queue Time. Spin_Gets is counted as Service Time since it is on CPU.
       Whereas Sleeps is categorized as Queue Time since it is on waiting. For details, see Book: 
       Oracle Performance Firefighting Page 74-91.
    
    (2). As discussed, generally, Latch Spins is burning CPU; whereas Latch Sleeps yields CPU. Therefore when talking about Latch contentions, 
       it would be necessary to distinguish between Spin_gets and Sleeps. 
       As tested, usually Spin_gets of Latch Misses are caused by frequently concurrent access; whereas Sleeps of Latch Misses are triggered 
       by Invalidations or DML Modifications. 
       
    (3). Oracle V$LATCH SLEEP evaluated, which leaks the algorithm changing.
     
       In Oracle 9i, V$LATCH is documented as:
         SLEEP[1|2|3]           Waits that slept 1 time through 3 times, respectively
         SLEEP4                   Waits that slept 4 or more times
         SLEEP[5|6|7|8|9|10|11] present for compatibility with previous releases of Oracle. No data is accumulated.
         
       Since Oracle 10g, V$LATCH is updated as:
         SLEEP[1|2|3|4|5|6|7|8|9|10|11] deprecated. As a substitute, query V$EVENT_HISTOGRAM for EVENT like latch free or latch:%.                                 
       
       So the original design elevated SLEEP in 11 grades, with 9i, it was reduced to 4, and 10g regulated it in a new approach. 
       
    (4). Suppose OS scheduler Time Slice is 10ms (system clock tick frequency of 100 hertz), 
       Wait Event "Avg wait (ms)"/10 represents the paused ticks before placed back into the runqueue.
       
    (5). There are two sorts of Latches, one has children (see V$LATCH_CHILDREN), for example, "cache buffers chains";
       other one has no children and hence an instance_wide single Latch, for example, "Result Cache: RC Latch".
       Therefore, when comparing Sleeps and Spin_Gets among Latches, the number of children should be taken into account.
       For example, single Latch can serialize whole system.
    
    With following query, we can see number of Blocks protected per CBC Child Latch:
    
    with block_size as (select value b_val from v$parameter where name in ('db_block_size'))
        ,cache_size as (select sum(value) c_val from v$parameter where name in ('db_cache_size', 'db_keep_cache_size', 'db_recycle_cache_size'))
        ,latch_cnt  as (select count(*) cnt from v$latch_children where name in ('cache buffers chains'))
    select b_val, round(c_val/1024/1024) cache_size_mb, l.cnt, ceil(c.c_val/b_val/cnt) block_per_latch
    from block_size b, cache_size c, latch_cnt l;
    

    5. Queries

    -- v$latch
    select name, misses, sleeps, spin_gets
          ,sign((sleeps + spin_gets) - misses) miss_deduced_sign
          ,((sleeps + spin_gets) - misses)     recurrent_misses
          ,trunc((abs((sleeps + spin_gets) - misses) / nullif(misses, 0)) * 100 ,2)*100 recurrent_misses_perc
    from v$latch where misses > 1000
    order by misses desc;
    
    -- v$system_event
    select event, total_waits, round(time_waited_micro/1e6, 2) time_waited_sec, average_wait*1e4 average_wait_micro, e.*
    from v$system_event  e
    where event in ('latch free', 'latch: shared pool', 'latch: row cache objects', 'latch: cache buffers chains')
    order by e.event;
    
    -- v$session_event
    select sid, event, total_waits, round(time_waited_micro/1e6, 2) time_waited_sec, average_wait*1e4 average_wait_micro, e.*
    from v$session_event e
    where e.event in ('latch free', 'latch: shared pool', 'latch: row cache objects', 'latch: cache buffers chains')
     --and sid in (217, 1217)
    order by e.event, e.sid;
    
    -- dba_hist_latch (One can also query dba_hist_latch_parent, dba_hist_latch_children)
    select snap_id End_Snap, end_interval_time Snap_Time, latch_name Latch_Name
          ,gets_d       Get_request
          ,misses_d     Misses
          ,sleeps_d     Sleeps
          ,spin_gets_d  Spin_Gets
          --,v.*
    from (
    select gets-lag(gets) over(partition by latch_name order by e.snap_id) gets_d
          ,misses-lag(misses) over(partition by latch_name order by e.snap_id) misses_d
          ,sleeps-lag(sleeps) over(partition by latch_name order by e.snap_id) sleeps_d
          ,spin_gets-lag(spin_gets) over(partition by latch_name order by e.snap_id) spin_gets_d
          ,h.end_interval_time, e.* 
    from dba_hist_latch e, dba_hist_snapshot h
    where e.snap_id = h.snap_id 
      and e.latch_name in ('Result Cache: RC Latch', 'cache buffers chains', 'row cache objects')
      order by e.snap_id desc
    ) v
    --where snap_id = 1217
    order by snap_id desc, latch_name, gets_d desc;
    
    -- dba_hist_system_event. Similar to AWR "Top 10 Foreground Events by Total Wait Time"
    select snap_id End_Snap, end_interval_time Snap_Time,   event_name Event
          ,total_waits_d                                               Waits
          ,round(time_waited_micro_d/1e6,2)                            Total_Wait_Time_sec
          ,round(time_waited_micro_d/1e3/(nullif(total_waits_d, 0)),2) Wait_Avg_ms
          --,v.*
    from (
    select total_waits-lag(total_waits) over(partition by event_name order by e.snap_id) total_waits_d
          ,total_timeouts-lag(total_timeouts) over(partition by event_name order by e.snap_id) total_timeouts_d
          ,time_waited_micro-lag(time_waited_micro) over(partition by event_name order by e.snap_id) time_waited_micro_d
          ,h.end_interval_time, e.* 
    from dba_hist_system_event e, dba_hist_snapshot h
    where e.snap_id = h.snap_id 
      and e.event_name in ('latch free', 'latch: shared pool', 'latch: row cache objects', 'latch: cache buffers chains')
      order by e.snap_id desc
    ) v
    where snap_id = 1217;
    order by snap_id desc, total_waits_d desc;