Thursday, March 24, 2022

Index Skip Scan With First Column of Composite Index Specified

Oracle Docu: Index Skip Scans (9i introduced) wrote:
  An index skip scan occurs when the initial column of a composite index is "skipped" or not specified in the query. 

  The optimizer considers a skip scan when the following criteria are met:
    The leading column of a composite index is not specified in the query predicate.
   
    (first column, initial column, leading column are all used in different Oracle document)
This Blog will try to demonstrate that the Index Skip Scan can be chosen even though first column is specified.

Note: Tested in Oracle 19.13, 19.11, 19.7.


1. Test Setup


First we create a test table with 3 columns and a composite index including all 3 columns.

drop table test_tab_1;

create table test_tab_1 as select trunc(level/8) n1, level n2, rpad('ABC', 1000, 'X') a1 from dual connect by level <= 1e5; 

create index test_tab_1#i1 on test_tab_1(n1, a1, n2);

exec dbms_stats.gather_table_stats(null, 'TEST_TAB_1', cascade=>true);


2. Index Skip Scan


We run a query with conditions on first and last columns. Even though the first column is specified in the query, optimizer decides to use "INDEX SKIP SCAN" access path:

select * from test_tab_1 t where n1 = 1249 and n2 = 9999;

  ----------------------------------------------------------------------------------
  | Id  | Operation        | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
  ----------------------------------------------------------------------------------
  |   0 | SELECT STATEMENT |               |     1 |  1010 |     3   (0)| 00:00:01 |
  |*  1 |  INDEX SKIP SCAN | TEST_TAB_1#I1 |     1 |  1010 |     3   (0)| 00:00:01 |
  ----------------------------------------------------------------------------------
  Predicate Information (identified by operation id):
  ---------------------------------------------------
     1 - access("N1"=1249 AND "N2"=9999)
         filter("N2"=9999)
  
  Statistics
  ----------------------------------------------------------
           13  consistent gets
Now we get some meta data and then go to further looking:
  
select object_name, data_object_id, to_char(data_object_id, 'xxxxxx') data_object_id_x
  from dba_objects where object_name = 'TEST_TAB_1#I1';
     
  OBJECT_NAME   DATA_OBJECT_ID  DATA_OBJECT_ID_X
  ------------- --------------  ----------------
  TEST_TAB_1#I1        3111401  2f79e9


select index_name, blevel, leaf_blocks, distinct_keys, num_rows from dba_indexes v where index_name = 'TEST_TAB_1#I1';

  INDEX_NAME    BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS
  ------------- ------ ----------- ------------- ----------
  TEST_TAB_1#I1      5       14286         96343     100000


Note: 
  Blevel: the number of branch levels (including the root node) in a B-Tree index. 
  Height: the actual depth of the index. Height is BLEVEL plus one.
  kdxcolev (in index data block dump): index level (0 represents leaf blocks)
We run the query again and list all index blocks kept in buffer cache:

-- cleanup buffer cache
alter system flush buffer_cache; 

select * from test_tab_1 t where n1 = 1249 and n2 = 9999;
       
select file#, to_char(file#, 'xxx') file#x, block#, to_char(block#, 'xxxxxx') block#x, objd, to_char(objd, 'xxxxxx') objd_x --, b.* 
from v$bh 
where objd = 3111401 and status='xcur' 
order by block#;

  FILE# FILE#X   BLOCK# BLOCK#X     OBJD OBJD_X
  ----- ------ -------- ------- -------- -------
   1445    5a5  2286339  22e303  3111401  2f79e9    -- Blevel 5
   1445    5a5  2287974  22e966  3111401  2f79e9    -- Leaf, contains 5 rows with n1=1249, N2 from 9992 to 9998
   1445    5a5  2287976  22e968  3111401  2f79e9    -- Leaf, contains 3 rows with n1=1249, N2 from 9997 to 9999
   1445    5a5  2287985  22e971  3111401  2f79e9    -- Blevel 1
   1445    5a5  2288028  22e99c  3111401  2f79e9    -- Blevel 2
   1445    5a5  2288446  22eb3e  3111401  2f79e9    -- Blevel 3
   1445    5a5  2292458  22faea  3111401  2f79e9    -- Blevel 4
      
  Note: Blevel and Leaf info are obtained by following index treedump.
It shows 7 index blocks in buffer cache.

Then we can make an index treedump to see the index data structure (we can also make index data block dump to check kdxcolev flag and index entry content):

-- dump index treedump  
  alter session set tracefile_identifier = "index_treedump_TEST_TAB_1#I1";
  alter session set events 'immediate trace name treedump level 3111401';     -- 3111401 is TEST_TAB_1#I1 OBJECT_ID
   
  ----- begin tree dump
  branch: 0x22e303 2286339 (0: nrow: 4, level: 5)
     branch: 0x22faea 2292458 (-1: nrow: 8, level: 4)     
        branch: 0x22eb3e 2288446 (1: nrow: 8, level: 3)  
           branch: 0x22e99c 2288028 (2: nrow: 8, level: 2)    
              branch: 0x22e971 2287985 (3: nrow: 9, level: 1)
                 leaf: 0x22e966 2287974 (-1: row:7.7 avs:849)  -- Leaf, contains 5 rows with n1=1249, N2 from 9992 to 9998
                 leaf: 0x22e968 2287976 (0: row:7.7 avs:847)   -- Leaf, contains 3 rows with n1=1249, N2 from 9997 to 9999
                 leaf: 0x22e969 2287977 (1: row:7.7 avs:842)   
                 leaf: 0x22e96a 2287978 (2: row:7.7 avs:842)
                 leaf: 0x22e96b 2287979 (3: row:7.7 avs:842)
                 leaf: 0x22e96c 2287980 (4: row:7.7 avs:842)
                 leaf: 0x22e96d 2287981 (5: row:7.7 avs:842)
                 leaf: 0x22e96e 2287982 (6: row:7.7 avs:842)
                 leaf: 0x22e96f 2287983 (7: row:7.7 avs:842)
There are 5 branch nodes (Blevel=5), and 2 index data nodes involved in above xplan.

With following query, we can also list 2 index data blocks and their stored index values. 8 index entries with n1=1249 are distributed in 2 blocks (2287974 and 2287976).

with sq as (
 select /*+ materialize */ 
        dbms_rowid.rowid_block_number(sys_op_lbid (3111401, 'L', rowid)) block#
        ,t.*
    from test_tab_1 t)
select block#, n1, n2 from sq t
where block# in (2286339
                ,2287974
                ,2287976
                ,2287985
                ,2288028
                ,2288446
                ,2292458)
order by block# , n1, n2;

    BLOCK#         N1         N2
  -------- ---------- ----------
   2287974       1248       9990
   2287974       1248       9991
   2287974       1249       9992
   2287974       1249       9993
   2287974       1249       9994
   2287974       1249       9995
   2287974       1249       9996
   2287976       1249       9997
   2287976       1249       9998
   2287976       1249       9999
   2287976       1250      10000
   2287976       1250      10001
   2287976       1250      10002
   2287976       1250      10003
Above query execution shows "13 consistent gets". Now with event 10046 and 10200, we can display the xplan and each of 13 consistent gets.

alter session set tracefile_identifier = 'index_skip_scan_trc';
 				
alter session set events '10046 trace name context forever, level 12';  
alter session set events '10200 trace name context forever, level 10';  

select * from test_tab_1 t where n1 = 1249 and n2 = 9999;

alter session set events '10200 trace name context off';
alter session set events '10046 trace name context off';
xplan shows 13 consistent gets with cost=3:

SQL ID: 4xv9sg1x5sa5x Plan Hash: 1789529535

select * from test_tab_1 t where n1 = 1249 and n2 = 9999

  call     count       cpu    elapsed       disk      query    current        rows
  ------- ------  -------- ---------- ---------- ---------- ----------  ----------
  Parse        1      0.00       0.00          0          0          0           0
  Execute      1      0.00       0.00          0          0          0           0
  Fetch        2      0.00       0.00          0         13          0           1
  ------- ------  -------- ---------- ---------- ---------- ----------  ----------
  total        4      0.00       0.00          0         13          0           1
  
  Rows (1st) Row Source Operation
  ---------- ---------------------------------------------------
           1 INDEX SKIP SCAN TEST_TAB_1#I1 (cr=13 pr=0 pw=0 time=571 us starts=1 cost=3 size=1010 card=1)(object id 3111401)
From raw trace file, we can see two top-down index searches, first makes 7 Gets, second does 6 Gets (together 13 Gets).

-- First top-down index skip scan 
ktrget2(): started for block  <0x0841 : 0x0022e303> objd: 0x002f79e9  -- Blevel 5
ktrget2(): started for block  <0x0841 : 0x0022faea> objd: 0x002f79e9  -- Blevel 4
ktrget2(): started for block  <0x0841 : 0x0022eb3e> objd: 0x002f79e9  -- Blevel 3
ktrget2(): started for block  <0x0841 : 0x0022e99c> objd: 0x002f79e9  -- Blevel 2 
ktrget2(): started for block  <0x0841 : 0x0022e971> objd: 0x002f79e9  -- Blevel 1
ktrget2(): started for block  <0x0841 : 0x0022e966> objd: 0x002f79e9  -- First Leaf containing n1=1249 
ktrget2(): started for block  <0x0841 : 0x0022e968> objd: 0x002f79e9  -- Next  Leaf pointed by previous leaf

-- Last top-down index skip scan
ktrget2(): started for block  <0x0841 : 0x0022e303> objd: 0x002f79e9  -- Blevel 5
ktrget2(): started for block  <0x0841 : 0x0022faea> objd: 0x002f79e9  -- Blevel 4
ktrget2(): started for block  <0x0841 : 0x0022eb3e> objd: 0x002f79e9  -- Blevel 3
ktrget2(): started for block  <0x0841 : 0x0022e99c> objd: 0x002f79e9  -- Blevel 2
ktrget2(): started for block  <0x0841 : 0x0022e971> objd: 0x002f79e9  -- Blevel 1
ktrget2(): started for block  <0x0841 : 0x0022e968> objd: 0x002f79e9  -- Last Leaf containing n1=1249   

  Legend  
    0x0841:     tablespace number (v$tablespace.TS# = 2113 = 0x841)
    0x0022e303: index block number (branch or leaf)
    0x002f79e9: data_object_id
One 10053 trace shows the optimizer decision of SkipScan and its cost (Cost: 3.000003):

alter session set tracefile_identifier = "10053_index_skip_scan";
alter session set events '10053 trace name context forever, level 12';
explain plan for select * from test_tab_1 t where n1 = 1249 and n2 = 9999;
alter session set events '10053 trace name context OFF';

 ****** Costing Index TEST_TAB_1#I1
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_SCAN

  SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_FILTER
  ColGroup Usage:: PredCnt: 2  Matches Full:  Partial: #1 (1 2 )  Sel: 1.0380e-05
  Estimated selectivity: 7.9994e-05 , col: #1 
  Estimated selectivity: 1.0000e-05 , col: #2 
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_SKIP_SCAN
  ColGroup Usage:: PredCnt: 2  Matches Full:  Partial: #1 (1 2 )  Sel: 1.0380e-05
  Estimated selectivity: 7.9994e-05 , col: #1 
  Estimated selectivity: 1.0000e-05 , col: #2 
  Access Path: index (skip-scan)
    SS scan sel: 1.0380e-05  SS filter sel: 1.0380e-05  ANDV (#skips): 1.000000
    SS io: 1.000000 vs. index scan io: 2.000000
    Skip Scan chosen
  Access Path: index (SkipScan)
    Index: TEST_TAB_1#I1
    resc_io: 6.000000  resc_cpu: 44329
    ix_sel: 1.0380e-05  ix_sel_with_filters: 1.0380e-05 
    Cost: 3.000003  Resp: 3.000003  Degree: 1
  Best:: AccessPath: IndexRange
  Index: TEST_TAB_1#I1
         Cost: 3.000003  Degree: 1  Resp: 3.000003  Card: 1.037958  Bytes: 0.000000
We can also force an index range scan via a no_index_ss hint:

alter session set tracefile_identifier = 'index_range_scan_trc';
 				
alter session set events '10046 trace name context forever, level 12';  
alter session set events '10200 trace name context forever, level 10';  

select /*+ no_index_ss(t test_tab_1#i1) */ * from test_tab_1 t where n1 = 1249 and n2 = 9999;

alter session set events '10200 trace name context off';
alter session set events '10046 trace name context off';
xplan shows 8 consistent gets with cost=4, and raw trace shows one single top-down index search containing 8 index block gets.

select /*+ no_index_ss(t test_tab_1#i1) */ * from test_tab_1 t where n1 = 1249 and n2 = 9999

  call     count       cpu    elapsed       disk      query    current        rows
  ------- ------  -------- ---------- ---------- ---------- ----------  ----------
  Parse        1      0.00       0.00          0          0          0           0
  Execute      1      0.00       0.00          0          0          0           0
  Fetch        2      0.00       0.00          0          8          0           1
  ------- ------  -------- ---------- ---------- ---------- ----------  ----------
  total        4      0.00       0.00          0          8          0           1
  
  Rows (1st) Row Source Operation
  ---------- ---------------------------------------------------
           1 INDEX RANGE SCAN TEST_TAB_1#I1 (cr=8 pr=0 pw=0 time=539 us starts=1 cost=4 size=1010 card=1)(object id 3111401)


  -- top-down index range scan                                                                                    
  
  ktrgtc2(): started for block <0x0841 : 0x0022e303> objd: 0x002f79e9  -- Blevel 5                           
  ktrgtc2(): started for block <0x0841 : 0x0022faea> objd: 0x002f79e9  -- Blevel 4                           
  ktrgtc2(): started for block <0x0841 : 0x0022eb3e> objd: 0x002f79e9  -- Blevel 3                           
  ktrgtc2(): started for block <0x0841 : 0x0022e99c> objd: 0x002f79e9  -- Blevel 2                           
  ktrgtc2(): started for block <0x0841 : 0x0022e971> objd: 0x002f79e9  -- Blevel 1                           
  ktrget2(): started for block <0x0841 : 0x0022e966> objd: 0x002f79e9  -- First Leaf containing n1=1249      
  ktrget2(): started for block <0x0841 : 0x0022e968> objd: 0x002f79e9  -- Next  Leaf pointed by previous leaf
  ktrget2(): started for block <0x0841 : 0x0022e968> objd: 0x002f79e9  -- Last  Leaf 
If we also make a 10053 trace, we can see IndexRange with "Cost: 3.500003", which is rounded to 4 in xplan.

alter session set tracefile_identifier = "10053_index_range_scan";
alter session set events '10053 trace name context forever, level 12';
explain plan for select /*+ no_index_ss(t test_tab_1#i1) */ * from test_tab_1 t where n1 = 1249 and n2 = 9999;
alter session set events '10053 trace name context OFF';

 ****** Costing Index TEST_TAB_1#I1
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_SCAN

  SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_FILTER
  ColGroup Usage:: PredCnt: 2  Matches Full:  Partial: #1 (1 2 )  Sel: 1.0380e-05
  Estimated selectivity: 7.9994e-05 , col: #1 
  Estimated selectivity: 1.0000e-05 , col: #2 
  Access Path: index (IndexOnly)
    Index: TEST_TAB_1#I1
    resc_io: 7.000000  resc_cpu: 51450
    ix_sel: 7.9994e-05  ix_sel_with_filters: 1.0380e-05 
    Cost: 3.500003  Resp: 3.500003  Degree: 1
  Best:: AccessPath: IndexRange
  Index: TEST_TAB_1#I1
         Cost: 3.500003  Degree: 1  Resp: 3.500003  Card: 1.037958  Bytes: 0.000000
From above tests, we can see that index skip scan has a lower CBO cost (3), but its real execution requires 13 consistent gets,
whereas index range scan has a higher CBO cost (4), but its real execution requires 8 consistent gets.


3. Index Range Scan


We can create a second test case by slightly modifying the first column from "trunc(level/8)" to trunc(level/7), index range scan is chosen by optimizer, probably due to the higher selectivity
(test_tab_2 fist column has 14286 distinct values. test_tab_1 fist column has 12501 distinct values).

drop table test_tab_2;

create table test_tab_2 as select trunc(level/7) n1, level n2, rpad('ABC', 1000, 'X') a1 from dual connect by level <= 1e5; 

create index test_tab_2#i1 on test_tab_2(n1, a1, n2);

exec dbms_stats.gather_table_stats(null, 'TEST_TAB_2', cascade=>true);

select * from test_tab_2 t where n1 = 1428 and n2 = 9999;


  ----------------------------------------------------------------------------------
  | Id  | Operation        | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
  ----------------------------------------------------------------------------------
  |   0 | SELECT STATEMENT |               |     1 |  1010 |     3   (0)| 00:00:01 |
  |*  1 |  INDEX RANGE SCAN| TEST_TAB_2#I1 |     1 |  1010 |     3   (0)| 00:00:01 |
  ----------------------------------------------------------------------------------
  Predicate Information (identified by operation id):
  ---------------------------------------------------
     1 - access("N1"=1428 AND "N2"=9999)
         filter("N2"=9999)
  
  Statistics
  ----------------------------------------------------------
            8  consistent gets