MyISAM Statistics Gathering

Posted in: MySQL, Technical Track

So, how does one gather statistics on indexes? With InnoDB it’s one thing, with MyISAM it’s another thing. This post however, will focus on MyISAM. There are various ways, each having their own drawbacks and positive sides.

What about ANALYZE TABLE? It’s great — very fast in InnoDB. But with MyISAM a lock occurs while scanning the table.

OPTIMIZE TABLE? Still relatively quick with MyISAM, unless the table in question is huge. InnoDB requires an ALTER TABLE — it might take forever, depending on the server configuration, as OPTIMIZE TABLE for InnoDB maps to a ALTER TABLE tableName ENGINE=InnoDB

We all know how ANALYZE TABLE, REPAIR TABLE, and OPTIMIZE TABLE work with MyISAM. However, there’s a less explored way, for MyISAM — the myisam_stats_method.

myisam_stats_method is used to rebuild the statistics of the indexes, which again is used by the optimizer to make smart choices, such as what index to use for a given query.

This allows you to change the way in which MyISAM table index statistics are gathered. The default is nulls_unequal, with which every NULL is not considered the same. Instead of being grouped together with other NULL values, each NULL forms its own group of size 1. This might make the optimizer look for ref when other joins are better than that, simply because of the amount of NULL groups with size 1. A group, in this context, is rows with the same value.

The other option, nulls_equal, is when all NULL values are treated as equals, which means they form a single value group which will (potentially) be much larger than the average value group, and might offset the optimizer, deeming the group less useful than it really is, and thus not use the ref method when it really should.

For index rebuilding (really regenerating the statistics), there is a quick way: change the variable myisam_stats_method from nulls_unequal to nulls_equal, and issue a check table <table>. This will rebuild the statistics of the indexes. Then change the stats method again, and go through the same process to get the normal process. Not the most efficient method, but in some cases, it might save you a lot of valuable time, and also downtime, by being a lot faster than other options.

email
Want to talk with an expert? Schedule a call with our team to get the conversation started.

4 Comments. Leave new

Alex Gorbachev
July 4, 2008 3:02 pm

We all know how ANALYZE TABLE, REPAIR TABLE, and OPTIMIZE TABLE work with MyISAM.

I wouldn’t be so sure if I’d been you. ;-)
Some of us are DBA’s for non-MySQL databases by enjoy the read so if you have pointers on how it actually works would be great. Or we just stay tuned for it to come on the blog. Keep it up!

Reply
Nicklas Westerlund
July 4, 2008 3:46 pm

Alex,

I can do a post with pointers on those different commands, thanks for the idea!

Reply
Sergei Golubchik
July 8, 2008 7:07 am

Just to mention that while REPAIR TABLE and CHECK TABLE are “known to work with MyISAM” – where “work” here means “they update the statistics of the indexes” – they are not guaranteed to.

The manual does not say that REPAIR or CHECK update statistics. They do, because it comes at no cost during, say, repair. In other words, it’s an undocumented side-effect of the REPAIR. And it could change anytime.

Reply
MySQL Back to Basics: Analyze, Check, Optimize, and Repair
July 25, 2008 11:53 am

[…] my post about gathering index statistics, I referred to OPTIMIZE TABLE, ANALYZE TABLE, and REPAIR TABLE — but I never explained in […]

Reply

Leave a Reply

Your email address will not be published. Required fields are marked *