Cardinality feedback for queries that use volatile tables with locked statistics.

April 27, 2012

As if dealing with volatile tables alone is not complicated enough, in Oracle 11gR2 we also have to consider how the new cardinality feedback feature (on by default) would affect us.

While there is a lot to be said about volatile tables (presentation, white paper)  , there are two mainstream approaches to dealing with them – dynamic sampling and locking table statistics.
In 11gR2, dynamic sampling turns off cardinality feedback, so there is no change in behavior from previous versions.
Locking table statistics, however, does not turn cardinality feedback off and that is a big concern. Oracle’s official blog entry acknowledges that carnality feedback is not useful for managing volatile tables: “Cardinality feedback is useful for queries where the data volume being processed is stable over time. For a query on volatile tables, the first execution statistics are not necessarily reliable. This feature is not meant to evolve plans over time as the data in the table changes”. The statement is a little vague though. The precise statement should be “Cardinality feedback should not be used for volatile tables with locked statistics”.

Here is why:
According to MOS Best Practices for Automatic Statistics Collection [ID 377152.1], we lock the table statistics when the table is at its maximum size. The idea is to use those stats regardless of what is actually in the table. Cardinality feedback essentially overwrites the estimates derived from the data dictionary with real-time statistics derived from the table at that time, which is exactly what we do not want to do when managing volatile tables.

This example will illustrate this point.

Let’s create volatile_tab and static_tab tables, with 50K each and an index on static_tab

create table volatile_tab
as
with generator as (
       select --+ materialize
                rownum id
       from dual
       connect by
                level < 10000
)
select
       mod(rownum,100) col1,
       rownum          col2
from
       generator v1,
       generator v2
where
       rownum < 50000;

Table created.

create table static_tab
as
with generator as (
        select --+ materialize
             rownum id
        from dual
        connect by
              level < 10000)
select
        mod(rownum,100) col1,
        rownum          col2
from
        generator v1,
        generator v2
where
        rownum < 50000 ;

Table created.

create index static_tab_i on static_tab (col2);

Index created.

Now, let’s gather tables statistics and lock the statistics of volatile_tab, the volatile table.

SQL>exec dbms_stats.gather_table_stats('TEST','static_tab');

PL/SQL procedure successfully completed.

SQL>exec dbms_stats.gather_table_stats('TEST','volatile_tab');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.lock_table_stats('TEST','volatile_tab');

PL/SQL procedure successfully completed.

Since volatile_tab is a volatile table with a max size of 50K, it could contain significantly fewer records at some point in time. That is a typical scenario for a volatile table.

SQL> truncate table volatile_tab ;

Table truncated.

SQL> insert into volatile_tab
with generator as (
        select --+ materialize
               rownum id
        from dual
        connect by
               level  < 10000 )
select
        mod(rownum,100) col1,
        rownum col2
from
        generator v1,
        generator v2
where
        rownum < 5 ;

commit;

Commit complete.

Now, let run this simple query:

SQL>
select
           /*+ gather_plan_statistics */ count(*)
from
        volatile_tab  v,
        static_tab    s
where
        v.col2 = s.col2
and     mod(v.col1,130) = 3 ;


COUNT(*)
----------
1

SQL> select * from table(dbms_xplan.display_cursor('','','ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 6dx6gbc8kc030, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from volatile_tab v,
static_tab s where v.col2 = s.col2 and mod(v.col1,130) = 3


Plan hash value: 2579530871


-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 98 | | | |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 98 | | | |
|* 2 | HASH JOIN | | 1 | 500 | 1 |00:00:00.01 | 98 | 1517K| 1517K| 371K (0)|
|* 3 | TABLE ACCESS FULL| VOLATILE_TAB | 1 | 500 | 1 |00:00:00.01 | 6 | | | |
| 4 | TABLE ACCESS FULL| STATIC_TAB | 1 | 50000 | 50000 |00:00:00.03 | 92 | | | |
-------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


2 - access("V"."COL2"="S"."COL2")
3 - filter(MOD("V"."COL1",130)=3)

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


23 rows selected.

SQL>

We got exactly what we want when dealing with volatile table – the robust HASH join. The plan would perform OK when volatile_tab is small and would work great when volatile_tab is large.

Let run this statement one more time:

SQL> select
        /*+ gather_plan_statistics */ count(*)
from
        volatile_tab   v,
        static_tab     s
where
        v.col2 = s.col2
and     mod(v.col1,130) = 3 ;


COUNT(*)
----------
1

SQL> select * from table(dbms_xplan.display_cursor('','','ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 6dx6gbc8kc030, child number 1
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from volatile_tab v,
static_tab s where v.col2 = s.col2 and mod(v.col1,130) = 3

Plan hash value: 3273547765

----------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 8 |


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 8 |
| 2 | NESTED LOOPS | | 1 | 1 | 1 |00:00:00.01 | 8 |
|* 3 | TABLE ACCESS FULL| VOLATILE_TAB | 1 | 1 | 1 |00:00:00.01 | 6 |
|* 4 | INDEX RANGE SCAN | STATIC_TAB_I | 1 | 1 | 1 |00:00:00.01 | 2 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter(MOD("V"."COL1",130)=3)
4 - access("V"."COL2"="S"."COL2")

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Note
-----
- cardinality feedback used for this statement

27 rows selected.

Now, the cardinality feedback option kicked in, and the plan uses nested loops (NL) – the less robust join method. Even though this plan is OK for the current volume of volatile_tab, it would be bad when volatile_tab grows.

Let’s see:

SQL> insert into volatile_tab
     with generator as (
select --+ materialize
     rownum id
from dual
connect by
     level < 10000 )
select
     mod(rownum,100) col1,
     rownum          col2
from
     generator v1,
     generator v2
where
     rownum < 50000 ;


commit ;

Commit complete.

SQL>
select
      /*+ gather_plan_statistics */ count(*)
from
       volatile_tab v,
       static_tab s
where
       v.col2 = s.col2
and    mod(v.col1,130) = 3 ;


COUNT(*)
----------
501


SQL> select * from table(dbms_xplan.display_cursor('','','ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 6dx6gbc8kc030, child number 2
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from volatile_tab v,
static_tab s where v.col2 = s.col2 and mod(v.col1,130) = 3


Plan hash value: 3273547765

----------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.02 | 643 |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.02 | 643 |
| 2 | NESTED LOOPS | | 1 | 1 | 501 |00:00:00.02 | 643 |
|* 3 | TABLE ACCESS FULL| VOLATILE_TAB | 1 | 1 | 501 |00:00:00.02 | 98 |
|* 4 | INDEX RANGE SCAN | STATIC_TAB_I | 501 | 1 | 501 |00:00:00.01 | 545 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


3 - filter(MOD("V"."COL1",130)=3)
4 - access("V"."COL2"="S"."COL2")

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Note
-----
- cardinality feedback used for this statement

27 rows selected.

The nested loops (NL) plans, a clearly bad choice, is used even when volatile_tab has plenty of records.

Cardinality feedback for volatile tables with locked statistics is can be one, although not the only,  reason for plan instability. Oracle should modify  MOS Best Practices for Automatic Statistics Collection [ID 377152.1] to specifically request cardinality feedback to be disabled for volatile tables.


Follow

Get every new post delivered to your Inbox.