MySQL – No Index Used With ORDER BY + LIMIT and DISTINCT

Posted in: MySQL, Technical Track

Continuing MySQL saga…
This is actually a follow up on my previous post. Developers tried to rewrite all statements and even overdid it. As we say in Russia – “teach fool how to pray and he will break his forehead”. Note, I had to rename columns/tables as I write so sorry for possible typos

Query converted by developers from IN subqueries to joins:

SELECT distinct t.col_1, t.col_2, t.col_3
   FROM t LEFT JOIN t1 ON t.col_3 = t1.col_3
          LEFT JOIN t2 ON t.col_3 = t2.col_3
   WHERE TIMEDIFF(now(),t.col_date) >= '00:05:00'
     AND t1.col_3 is not null
     AND t2.col_3 is not null
     AND t.col_4 = 'string'
   ORDER BY t.col_5 desc
   LIMIT 0,6;

My first take was “smart” use of TimeDiff function in the WHERE clause. I was slightly surprised by the way it was compared – “>= '00:05:00'” but more about this later. I checked selectivity of that condition and found that it was very poor – returning large percent of total rows in the table. Column col_4 contained only 2 values either.

Next, I looked at ORDER BY – created index on col_5 but MySQL didn’t pick it up. Luckily, couple days ago Paul Moen posted about ORDER BY and LIMIT optimization so I knew it should work – MySQL is not that bad in the end. ;-) One more careful look and you notice DISTINCT keyword. Bingo! Obviously, MySQL couldn’t and shouldn’t use index in that case. Interesting that t.col_3 is the primary key so distinct was introduced during conversion from IN subqueries to joins. Well, there is no silver bullet so join is no good here.

In the end query was rewritten as

SELECT 1.col_1, t.col_2, t.col_3
    FROM t
    WHERE TIMEDIFF(now(),t.col_date) >= '00:05:00'
      AND t.col_4 = 'string'
      AND EXISTS (select 1 from t1 where t.col_3=t1.col_3)
      AND EXISTS (select 1 from t2 where t.col_3=t2.col_3)
   ORDER BY t.col_5 DESC
   LIMIT 6;

Note that DISTINCT disappeared and joins were converted to EXISTS conditions. Very handy! Index created on two columns to cover also t.col_4 = 'string':

create index col_4_col_5 on t (col_4,col5);

Result was very positive and response was down to less than 10 ms.

I noticed later that condition with TimeDiff is very tricky and string comparison in the end is not always appropriate. I would rather use AddTime and date comparison.

Just now I figured what kept me uneasy about this statement – DISTINCT causes grouping and ORDER BY column is not part of the select list! In this particular case there is no problem as primary key is in the select list. However, I can’t figure out how MySQL would group and sort if there are several col_5 values per distinct row. I can’t see the deterministic result. In another example today, I witnessed that I can use a column in ORDER BY that is not part of GROUP BY list. Seems like the same issue. If I lost you with this – stay tuned. I must be blogging about it soon.

email
Want to talk with an expert? Schedule a call with our team to get the conversation started.

About the Author

What does it take to be chief technology officer at a company of technology experts? Experience. Imagination. Passion. Alex Gorbachev has all three. He’s played a key role in taking the company global, having set up Pythian’s Asia Pacific operations. Today, the CTO office is an incubator of new services and technologies – a mini-startup inside Pythian. Most recently, Alex built a Big Data Engineering services team and established a Data Science practice. Highly sought after for his deep expertise and interest in emerging trends, Alex routinely speaks at industry events as a member of the OakTable.

4 Comments. Leave new

Go to bed!

(Said the man who is still playing around with parallel stuff at 02:26 ;-))

Reply
Alex Gorbachev
January 14, 2007 2:49 am

Well, still recovering “lost” standby. By the way, I recorded one soccer game yesterday (Arsenal with …. don’t remember who) so I expect you to come over tomorrow for beer and vobla.

Reply
Niall Litchfield
January 14, 2007 12:59 pm

Watch out for an absolutely fantastic piece of clinical finishing in the Arsenal game, even though as a Liverpool fan i’m not talking about Arsenal right now :(.

Reply

I actually caught the second half on Saturday :-(

It’s even better than I thought over here – three premiership games on normal TV on a Saturday, including one at 3 o’clock. Not very conducive to working on presentations, though ;-)

Reply

Leave a Reply

Your email address will not be published. Required fields are marked *