Stabilize Oracle Bind Peeking Behaviour with Range-Based Predicates

Posted in: Technical Track

In my previous post, I described the most common cause for unstable plans due to bind peeking — histograms. It is now time to move forward and take a look at another case, namely range-based predicates. Strictly speaking, the cases I’m going to describe can appear without range-based predicates as well, you just need to remember that a range-based operation doesn’t necessarily imply a range-based predicate.

How Can Range-Based Predicates Cause an Unstable Plan?

Quite easy, take the following example:

SQL> create table t as
  2  	select level n, rpad('x', 200, 'x') v
  3  		from dual
  4  		connect by level <= 100000;

Table created

SQL> create index i_t_n on t (n);

Index created

SQL> exec dbms_stats.gather_table_stats(user, 't');

PL/SQL procedure successfully completed

Now, I’ll query the table using two different conditions:

SQL> set autot traceonly explain
SQL> select * from t where n <= 100;

Execution Plan
-----------------------------------------------------
Plan hash value: 2912310446

----------------------------------------------------
| Id  | Operation                   | Name  | Rows |
----------------------------------------------------
|   0 | SELECT STATEMENT            |       |   95 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |   95 |
|*  2 |   INDEX RANGE SCAN          | I_T_N |   95 |
----------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("N"<=100)

SQL> select * from t where n <= 25000;

Execution Plan
------------------------------------------
Plan hash value: 1601196873

------------------------------------------
| Id  | Operation         | Name | Rows  |
------------------------------------------
|   0 | SELECT STATEMENT  |      | 24998 |
|*  1 |  TABLE ACCESS FULL| T    | 24998 |
------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("N"<=25000)

(Note that I’ve trimmed the output for the sake of readability.)

The first query was executed using INDEX RANGE SCAN, the second one using TABLE ACCESS FULL. This makes perfect sense, since the first query is going to return only a small fraction of data in a table, while the second one is going to fetch substantially more data. As you probably already guessed, if you substitute a literal value for a bind variable, your plan will depend on what value was passed during a hard parse:

SQL> variable n number;
SQL> exec :n:=25000;

PL/SQL procedure successfully completed.

SQL> set autot traceonly stat
SQL> set arraysize 100
SQL> select * from t where n<=:n;

25000 rows selected.

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       3198  consistent gets
       2650  physical reads
          0  redo size
     271952  bytes sent via SQL*Net to client
       3135  bytes received via SQL*Net from client
        251  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      25000  rows processed

SQL> exec :n:=100;

PL/SQL procedure successfully completed.

SQL> select * from t where n<=:n;

100 rows selected.

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2949  consistent gets
       2647  physical reads
          0  redo size
       1479  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

You can already spot from the stats that both queries have used full table scans:

SQL> select sql_id, executions, child_number
  2   from v$sql
  3   where sql_text='select * from t where n<=:n';

SQL_ID        EXECUTIONS CHILD_NUMBER
------------- ---------- ------------
avj8fuq6s3j2q          2            0

SQL> select * from table(dbms_xplan.display_cursor('avj8fuq6s3j2q'));

PLAN_TABLE_OUTPUT
------------------------------------------
SQL_ID  avj8fuq6s3j2q, child number 0
-------------------------------------
select * from t where n<=:n

Plan hash value: 1601196873

------------------------------------------
| Id  | Operation         | Name | Rows  |
------------------------------------------
|   0 | SELECT STATEMENT  |      |       |
|*  1 |  TABLE ACCESS FULL| T    | 24998 |
------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("N"<=:N)

18 rows selected.

(Note that I’ve trimmed the output again.)

If I had started my example with n=100, both executions would have used index range scans instead. What if both plans are equally important to you and you can’t sacrifice one in favor of the other? There are a number of solutions to that; I’m going to describe my favorite one.

Tell the Database About It

You have to cooperate with the database, because if you do not, then the database will not cooperate with you. It is a fair game and the choice is yours.

After all, your are the only one who has the intimate knowledge of your data and how it is used. Let’s say your application is using the following function which returns a ref-cursor:

SQL> create or replace function get_data(
  2  	p_n	in number
  3  ) return sys_refcursor is
  4  	l_rf sys_refcursor;
  5  begin
  6  	open l_rf for 'select * from t where n<=:n' using p_n;
  7  	return l_rf;
  8  end;
  9  /

Function created

The above function will be subject to the same problem as described above: the plan will be developed on the first hard parse, and all subsequent executions will use the same plan regardless of the fact that it might be inefficient. The optimal execution plan is driven by how much data you request, which in turn depends on what bind variable value was supplied.

There is a point where Oracle should switch to an FTS instead of an IRS (and vise versa). Ultimately, we would like to make our function smart enough to be able to provide us different execution plans, and at the same time, leave the final decision to the optimizer. If we expect potentially different execution plans when p_n changes to the power of 2, then all we have to do is tell the database about it:

SQL> create or replace function get_data(
  2  	p_n	in number
  3  ) return sys_refcursor is
  4  	l_rf sys_refcursor;
  5  begin
  6  	open l_rf for 'select /* '||to_char(power(2, floor(log(2, p_n))))||' */ * from t where n<=:n' using p_n;
  7  	return l_rf;
  8  end;
  9  /

Function created

The above function will automatically add a comment into each SQL statement, embedding as many rows (to the power of 2) as we are expect:

SQL> variable rf refcursor;
SQL> exec :rf:=get_data(100);

PL/SQL procedure successfully completed.

SQL> print rf;

100 rows selected.

SQL> exec :rf:=get_data(25000);

PL/SQL procedure successfully completed.

SQL> print rf;

25000 rows selected.

SQL> select	s.sql_text,
  2  		max(p.operation||' '||p.options) keep (dense_rank last order by id) access_path
  3    	from v$sql s, v$sql_plan p
  4    	where s.SQL_ID=p.SQL_ID
  5  		and s.sql_text like 'select /* % */ * from t where n<=:n'
  6  	group by s.sql_id, s.sql_text;

SQL_TEXT                                 ACCESS_PATH
---------------------------------------- --------------------
select /* 16384 */ * from t where n<=:n  TABLE ACCESS FULL
select /* 64 */ * from t where n<=:n     INDEX RANGE SCAN

Since different comments will essentially lead to a different SQL statements, Oracle was able to develop appropriate plans.

This solution is very simple, and it provides the desired results. Its advantage is that, compared to replacing bind variables with literals, you still share most of your SQL while not sacrificing the quality of execution plans. Our function will result in only 17 different SQL statements for all possible values:

SQL> declare
  2   l_rf sys_refcursor;
  3  begin
  4   for i in 1 .. 100000
  5   loop
  6    l_rf:=get_data(i);
  7    close l_rf;
  8   end loop;
  9  end;
 10  /

PL/SQL procedure successfully completed.

SQL> select s.sql_text,
  2      max(p.operation||' '||p.options)
  3     keep (dense_rank last order by id) access_path,
  4    max(executions) execs
  5     from v$sql s, v$sql_plan p
  6     where s.SQL_ID=p.SQL_ID
  7      and s.sql_text like 'select /* % */ * from t where n<=:n'
  8     group by s.sql_text
  9   order by execs;

SQL_TEXT                                 ACCESS_PATH           EXECS
---------------------------------------- -------------------- ------
select /* 1 */ * from t where n<=:n      INDEX RANGE SCAN          1
select /* 2 */ * from t where n<=:n      INDEX RANGE SCAN          3
select /* 4 */ * from t where n<=:n      INDEX RANGE SCAN          4
select /* 8 */ * from t where n<=:n      INDEX RANGE SCAN          8
select /* 16 */ * from t where n<=:n     INDEX RANGE SCAN         16
select /* 32 */ * from t where n<=:n     INDEX RANGE SCAN         32
select /* 64 */ * from t where n<=:n     INDEX RANGE SCAN         64
select /* 128 */ * from t where n<=:n    INDEX RANGE SCAN        128
select /* 256 */ * from t where n<=:n    INDEX RANGE SCAN        256
select /* 512 */ * from t where n<=:n    INDEX RANGE SCAN        512
select /* 1024 */ * from t where n<=:n   INDEX RANGE SCAN       1024
select /* 2048 */ * from t where n<=:n   INDEX RANGE SCAN       2048
select /* 4096 */ * from t where n<=:n   INDEX RANGE SCAN       4096
select /* 8192 */ * from t where n<=:n   INDEX RANGE SCAN       8192
select /* 16384 */ * from t where n<=:n  INDEX RANGE SCAN      16384
select /* 32768 */ * from t where n<=:n  TABLE ACCESS FULL     32768
select /* 65536 */ * from t where n<=:n  TABLE ACCESS FULL     34464

17 rows selected.

Because the decision is still made by the optimizer, the above function will be able to automatically adjust to changing data volumes as well. If tomorrow our table contains 100M rows instead of 100K, our function will still be able to provide us good execution plans — which, by the way, might be different from what we see with 100K rows. This is a key difference from explicitly hinting the query.

Note that we still can have different execution plans inside a “transient” window. That should not be a problem, since this is where a plan switch should occur anyway, and doing the entire window one way or the other should not produce bad results. You can even go a bit further and make the base adjustable; this will allow us to properly balance SQL statements reusability while providing optimal execution plans:

SQL> create or replace function get_data(
  2  	p_n	in number,
  3  	p_f	in number default 2
  4  ) return sys_refcursor is
  5  	l_rf sys_refcursor;
  6  begin
  7  	open l_rf for 'select /* '||to_char(power(p_f, floor(log(p_f, p_n))))||' */ * from t where n<=:n' using p_n;
  8  	return l_rf;
  9  end;
 10  /

Function created

The above technique can be used virtually everywhere you have to have bind variables, and at the same time absolutely need different execution plans depending on bind variable values. You will only have to come up with a proper way to tell Oracle when you expect different plans, based on your domain knowledge and how your data is being used.

What About 11G?

There is a new feature introduced in 11G to help overcome the above problem — the ability to produce a different child cursor if the database detects a potentially suboptimal execution plan due to bind peeking. If you are interested how well it performs, then stay tuned for a next blog post.

email
Want to talk with an expert? Schedule a call with our team to get the conversation started.

No comments

Leave a Reply

Your email address will not be published. Required fields are marked *