Issues with Plan Cache Reuse & Row Goal Optimization

Posted in: Microsoft SQL Server, Technical Track

 

I am presenting here on behalf of my colleague  Fabiano Amorim (he is busy resolving other exciting performance issues…  :-D ) .

Fabiano had an interesting case with one of our customers that is very common in SQL Server.

The case is about a performance issue caused by two optimizer decisions not working well together:

 

Problem Description

Let’s review the following query:

select top 1 col_date from tab1
where col1 = 10
and col2 = 1
and col3 = 1
order by col_date asc

 

Table tab1 have two indexes:

  1. ix1 (col1, col_date, col2) include(col3)
  2. ix2 (col1, col2, col3) include(col_date)

 

The Query optimizer (QO) has two query plan options:

  1. select -> top -> filter -> index seek (ix1) Read the ordered index ix1 by b-tree seeking by “col1 = 10”, apply the residual predicates (filter) “col2 = 1 and col3=1”, after reading just 1 row (TOP 1) the execution is finished since the index is ordered by  col1, col_date, the first col_date returned is already the TOP1 ASC according to the index order.
  2. select -> top N sort -> index seek (ix2) Read the covered index ix2 b-tree (notice it has all needed columns), seeking by “col1 = 10 and col2 = 1 and col3=1”, get the col_date in the index leaf level (included column), use “top N sort” algorithm to sort and keep only TOP 1 row, finish execution.

The problem, is that, if the QO chooses the first option, this will be good for high selectivity predicates.
For instance, let’s suppose that “col1 = 10” returns 5 rows; remember that index ix1 is ordered by col1, col_date, col2:

 

col1 | col2| col3 | col_date

10    | 4      | 4      | 2015-12-01

10    | 3      | 3      | 2015-12-02

10    | 1      | 1      | 2015-12-03

10    | 5      | 5      | 2015-12-04

10    | 2      | 2      | 2015-12-05

 

After seeking the index, SQL will need to apply the residual predicate (“col2 = 1 and col3=1”) until it finds the “row goal”: TOP iterator is asking for just one row, in this case the third row will match the predicate and SQL Server will return the first row that matches the residual predicate.

So, in this case it has to read only 3 rows. So far so good…

Now, let’s supposed SQL created that plan, and now it’s going to reuse it for a new value on col1 filter:

 

select top 1 col_date from tab1
where col1 = 99
and col2 = 1
and col3 = 1
order by col_date asc

 

What if after the seek (“col1 = 99”) 2 million of rows are returned? Now this plan is not so good, since it will need to apply the predicate on many rows before it finds a match:

 

col1 | col2| col3 | col_date

99    | 2      | 2      | 2015-12-01

99    | 2      | 2      | 2015-12-02

…after a couple of million rows…

99    | 1      | 1      | 2015-12-03

99    | 2      | 2      | 2015-12-04

99    | 2      | 2      | 2015-12-05

 

In this case, using the second option is better. Just go and seek the b-tree for all values (col1 = 99 and col2 = 1 and col3 = 1), this will return 1 row… TOP n SORT will do almost nothing and execution will finish quickly.

Here is the problem: most of the times, SQL knows whether to use option 1 or option 2 based on the parameters values. But if it is reusing the plan from cache, the optimization path may already be set improperly resulting in the known issue called “parameter sniffing” (plan reuse that is wrong for the specific set of rows)… That means that the row goal optimization should not be used if there is a covered indx.

Unfortunately by default, QO “thinks” this is cheaper than “seek+top n sort”… Of course it all depends on the distribution of data…So in a nutshell,  QO chooses rowgoal optimization where this should not be used therefore we should pay extra attention to those kind of plans…

 

Possible Solutions

There are many alternatives to fix it.

Some examples:

  1. Force the index (index=ix2)
  2. Option(recompile)
  3. drop the index ix1, define ix2 as a unique (tells QO that only 1 row will be returned)

Each one of the above has advantages and disadvantages.

We also need to ensure that statistics are up to date!

 

Additional Resources

 

Discover our expertise in SQL Server. 

email

Interested in working with Michelle? Schedule a tech call.

About the Author

Microsoft Data Olatform Architect
Michelle has 30 years in IT, and has been working with SQL Server for the past 20 years. She has designed methodologies that consist of documentation, utilities, and scripts to automate architecture, design, and performance tuning initiatives for her clients. Michelle is able to see the wider vision of her clients’ business. She is passionate about solving problems quickly and providing value to her clients. She speaks English, Hebrew, Spanish, and a bit of French.

No comments

Leave a Reply

Your email address will not be published. Required fields are marked *