How to find unused indexes and drop them safely in MongoDB

Posted in: DBA Lounge, MongoDB, Technical Track

Indexes increase read performance and allow queries to access data faster. But they also need to be updated when the indexed keys are created or modified, which consumes additional time to complete C*UD operations. In this blog post, we will discuss how to find unused indexes and safely remove them.

Querying index usage stats

In version 3.2, MongoDB introduced new instrumentation that tracks index usage. $indexStat is the new aggregation stage which gives information on index usage stats. Let’s take a look at the sample output below.

MongoDB Enterprise > db.idxsts.aggregate({$indexStats:{}}).pretty()
{
 "name" : "name_1_grade_1",
 "key" : {
  "name" : 1,
  "grade" : 1
 },
 "host" : "DarshanMBP:30008",
 "accesses" : {
  "ops" : NumberLong(1),
  "since" : ISODate("2018-07-25T10:08:06.109Z")
 }
}
{
 "name" : "name_1_slno_1",
 "key" : {
  "name" : 1,
  "slno" : 1
 },
 "host" : "DarshanMBP:30008",
 "accesses" : {
  "ops" : NumberLong(4),
  "since" : ISODate("2018-07-25T10:12:59.106Z")
 }
}
{
 "name" : "_id_",
 "key" : {
  "_id" : 1
 },
 "host" : "DarshanMBP:30008",
 "accesses" : {
  "ops" : NumberLong(0),
  "since" : ISODate("2018-07-25T09:29:00.245Z")
 }
}
MongoDB Enterprise > 


We can see that there are three indexes for the
idxsts collection, Let’s dive deeper into the output. Consider name_1_slno_1 Index

{
	"name" : "name_1_slno_1",
	"key" : {
		"name" : 1,
		"slno" : 1
	},
	"host" : "DarshanMBP:30008",
	"accesses" : {
		"ops" : NumberLong(4),
		"since" : ISODate("2018-07-25T10:12:59.106Z")
	}
}


The first part of the document shows basic information about the index and the MongoDB server. The key we are interested in is
accesses, which shows how many times the index was used since the server was started. The since field tells you when MongoDB started gathering statistics about this index. If accesses.ops is 0, it means that the index was never used since MongoDB was restarted or since the index was created.

Source: https://docs.mongodb.com/manual/reference/operator/aggregation/indexStats/

Retrieving unused indexes for all databases

As $indexStats is an aggregation stage, it needs to be executed individually for every collection. The script below will reduce the tediousness.

db.getMongo().getDBNames().forEach(function (dbname) {
	if (dbname != "admin") {
    db.getSiblingDB(dbname).getCollectionNames().forEach(function (cname) {
        output = db.getSiblingDB(dbname)[cname].aggregate({$indexStats:{} });  
        output.forEach(function(findUnused) { 
        if (findUnused.accesses.ops == 0 && findUnused.name != "_id_") {  
            print(dbname + " \t" + cname + " \t" + JSON.stringify(findUnused) );  
        } 
        })
    })
}}) 


The following  output is returned:

mjour 	marks 	{"name":"grade_1_name_-1","key":{"grade":1,"name":-1},"host":"DarshanMBP:30008","accesses":{"ops":{"$numberLong":"0"},"since":"2018-08-01T11:25:59.949Z"}}
mjour 	marks 	{"name":"grade_1","key":{"grade":1},"host":"DarshanMBP:30008","accesses":{"ops":{"$numberLong":"0"},"since":"2018-08-01T11:25:53.414Z"}}
mjour 	marks 	{"name":"std_id_1_name_-1","key":{"std_id":1,"name":-1},"host":"DarshanMBP:30008","accesses":{"ops":{"$numberLong":"0"},"since":"2018-08-01T11:26:09.201Z"}}
mjour 	petrol 	{"name":"shop_id_1","key":{"shop_id":1},"host":"DarshanMBP:30008","accesses":{"ops":{"$numberLong":"0"},"since":"2018-08-01T11:26:35.958Z"}}
njour 	idxsts 	{"name":"grade_1","key":{"grade":1},"host":"DarshanMBP:30008","accesses":{"ops":{"$numberLong":"0"},"since":"2018-08-01T11:33:45.822Z"}}


The script will query the index stats for each collection in each database and report those which were never used (accesses.ops = 0). Keep in mind that the uptime should be long enough to have all different queries executed at least once. Now that we have all unused indexes, we can look at dropping them.

Dropping unused indexes

Before you drop the indexes make sure that:

1. The instance was running long enough for the applications to run all possible queries and the query pattern no longer requires an index.  Uptime can be easily retrieved using db.serverStatus().uptime.

2. You have a backup of all affected collections.

3. Indexes are also not being used on SECONDARY nodes.

4. A write lock will be acquired for the database the index belongs to, blocking all operations.

Dropping an index in a production environment can be done using one of the 2 methods below:

a. Drop the index on the primary, which will replicate overall secondaries. This is only recommended for very small indexes and non-busy databases.

b. Drop as ReplSet maintenance method (Secondary, Secondary, stepDown Primary).

Conclusion

Unused indexes will decrease write performance (due to index maintenance) with no read performance benefits. Properly maintaining indexes will ensure optimal execution times for your queries. Another source of unused indexes is archived collections. We could consider dropping them if we can afford the time to recreate the indexes in case data needs to be queried.

 
email

Interested in working with Darshan? Schedule a tech call.

2 Comments. Leave new

Nice post

Reply

Very useful piece of info!

Reply

Leave a Reply

Your email address will not be published. Required fields are marked *