Postgres Covering Indexes and the Visibility Map

Posted in: PostgreSQL, Technical Track

Covering indexes in Postgres came up in our weekly tech call because of a query optimization review one of the teams has in progress. It seemed like some covering indexes weren’t working.

I’ve been reading a book called Curious Moon, a work of fiction about Postgres (it’s as delightful as it sounds). Since I had the enceladus dataset already loaded, I used it to answer some questions about this kind of index.

Postgres handles covering indexes through the “index-only scans” feature. The requirements are similar to those of most RDBMSs. The index type matters (currently supported only in btree or gist indexes) and the index has to fully cover the columns requested by the query.

However, there’s an additional requirement related to a query optimization. The page block has to be fully “visible” to the query’s mvcc snapshot. Data that hasn’t changed recently has a better chance of remaining stable among older and newer transactions on the table. This information is stored in a visibility map adjacent to the table data.

Postgres of course has a robust set of extensions, and pg_visibility helps give us insight into the stats around the visibility map.

No surprises on installing the extension. This is also one that is available in Cloud SQL for Postgres.

enceladus=# create extension pg_visibility;

I’ll import the master_plan data from the Cassini mission:

enceladus=# COPY master_plan FROM '.../data/master_plan.csv' WITH DELIMITER ',' HEADER CSV;
COPY 61873

enceladus=# select * from master_plan limit 1;
-[ RECORD 1 ]------+---------------------------
start_time_utc     | 2004-135T18:40:00
duration           | 000T09:22:00
date               | 14-May-04
team               | CAPS
spass_type         | Non-SPASS
target             | Saturn
request_name       | SURVEY
library_definition | Magnetospheric survey
title              | MAPS Survey
description        | MAPS magnetospheric survey

At this point, all the visibility flags are off. This makes sense, because we just performed a bunch of inserts. The vacuum process turns them all on:

enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
count | all_visible
-------+-------------
1959 | f

enceladus=# vacuum master_plan ;
VACUUM

enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
count | all_visible
-------+-------------
1959 | t

Let’s perform a lot of inserts. I’m using the “watch” option here so I don’t have to type as much. Afterwards, a lot of the visibility flags are marked as off:

enceladus=# COPY master_plan FROM '.../data/master_plan.csv' WITH DELIMITER ',' HEADER CSV;
COPY 61873

enceladus=# \watch
Fri Jul 24 16:26:52 2020 (every 2s)
...
enceladus=# select count(*) from master_plan;
-[ RECORD 1 ]--
count | 3093650

enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
count | all_visible
-------+-------------
97868 | f
1941 | t

Before I add the covering index, I’ll run my intended query. By the way, don’t take the absolute times into account here. This is a test on a puny laptop. I’m mainly trying to demonstrate the absence or presence of “index only scan” in the explain plan (notice the “Parallel Seq Scan” below):

time psql -U postgres -d enceladus -c"explain analyze select team, spass_type, target from master_plan where team='CDA' or team='RSS' order by team, target"

                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Gather Merge  (cost=133280.23..161724.62 rows=243792 width=22) (actual time=5075.047..5244.599 rows=286500 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Sort  (cost=132280.21..132584.95 rows=121896 width=22) (actual time=5069.007..5103.744 rows=95500 loops=3)
         Sort Key: team, target
         Sort Method: external merge  Disk: 3352kB
         Worker 0:  Sort Method: external merge  Disk: 3448kB
         Worker 1:  Sort Method: external merge  Disk: 3432kB
         ->  Parallel Seq Scan on master_plan  (cost=0.00..119480.37 rows=121896 width=22) (actual time=0.996..4877.452 rows=95500 loops=3)
               Filter: ((team = 'CDA'::text) OR (team = 'RSS'::text))
               Rows Removed by Filter: 935717
 Planning Time: 0.419 ms
 Execution Time: 5272.250 ms
(13 rows)
real    0m6.450s
user    0m0.012s
sys    0m0.027s

Let’s add an index we hope to provide index-only queries:

CREATE INDEX idx_tpt ON public.master_plan USING btree (team, spass_type, target);

It doesn’t look like it helped. I still see “Parallel Seq Scan” below:

time psql -U postgres -d enceladus -c"explain analyze select team, spass_type, target from master_plan where team='CDA' or team='RSS' order by team, target"

                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Gather Merge  (cost=132711.41..160669.96 rows=239628 width=22) (actual time=4746.679..4915.888 rows=286500 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Sort  (cost=131711.39..132010.92 rows=119814 width=22) (actual time=4722.466..4757.252 rows=95500 loops=3)
         Sort Key: team, target
         Sort Method: external merge  Disk: 3424kB
         Worker 0:  Sort Method: external merge  Disk: 3416kB
         Worker 1:  Sort Method: external merge  Disk: 3392kB
         ->  Parallel Seq Scan on master_plan  (cost=0.00..119144.31 rows=119814 width=22) (actual time=3.395..4533.381 rows=95500 loops=3)
               Filter: ((team = 'CDA'::text) OR (team = 'RSS'::text))
               Rows Removed by Filter: 935717
 Planning Time: 21.649 ms
 Execution Time: 4943.530 ms
(13 rows)
real    0m7.130s
user    0m0.013s
sys    0m0.038s

Remember that Postgres-specific requirement regarding the visibility flag for covering indexes?

enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count | all_visible
-------+-------------
 97868 | f
  1941 | t

Clean up the flags:

enceladus=# vacuum master_plan ;
VACUUM

enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count | all_visible
-------+-------------
 97947 | t

Now we see the index-only scan show up (notice “Index Only Scan” below):

time psql -U postgres -d enceladus -c"explain analyze select team, spass_type, target from master_plan where team='CDA' or team='RSS' order by team, target"

                                                                           QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather Merge  (cost=114722.12..142667.60 rows=239516 width=22) (actual time=461.861..612.337 rows=286500 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Sort  (cost=113722.09..114021.49 rows=119758 width=22) (actual time=455.925..482.057 rows=95500 loops=3)
         Sort Key: team, target
         Sort Method: external merge  Disk: 4080kB
         Worker 0:  Sort Method: external merge  Disk: 3080kB
         Worker 1:  Sort Method: external merge  Disk: 3072kB
         ->  Parallel Index Only Scan using idx_tpt on master_plan  (cost=0.43..101163.65 rows=119758 width=22) (actual time=24.874..358.271 rows=95500 loops=3)
               Filter: ((team = 'CDA'::text) OR (team = 'RSS'::text))
               Rows Removed by Filter: 935717
               Heap Fetches: 0
 Planning Time: 0.538 ms
 Execution Time: 640.655 ms
(14 rows)
real    0m1.065s
user    0m0.009s
sys    0m0.012s

The problem returns if you perform a lot of updates. Back to “Seq Scan” below:

enceladus=# update master_plan set target='Saturn2' where target='Saturn';
UPDATE 847900

enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count  | all_visible
--------+-------------
 119439 | f
   4627 | t

time psql -U postgres -d enceladus -c"explain analyze select team, spass_type, target from master_plan where team='CDA' or team='RSS' order by team, target"

                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Gather Merge  (cost=165711.17..201108.67 rows=303386 width=22) (actual time=14423.062..14960.570 rows=286500 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Sort  (cost=164711.14..165090.37 rows=151693 width=22) (actual time=14414.109..14449.936 rows=95500 loops=3)
         Sort Key: team, target
         Sort Method: external merge  Disk: 3456kB
         Worker 0:  Sort Method: external merge  Disk: 3432kB
         Worker 1:  Sort Method: external merge  Disk: 3360kB
         ->  Parallel Seq Scan on master_plan  (cost=0.00..148545.86 rows=151693 width=22) (actual time=0.113..14210.193 rows=95500 loops=3)
               Filter: ((team = 'CDA'::text) OR (team = 'RSS'::text))
               Rows Removed by Filter: 935717
 Planning Time: 8.331 ms
 Execution Time: 15011.649 ms
(13 rows)
real    0m17.167s
user    0m0.014s
sys    0m0.044s

We can clean it up again and get our index-only scans (and performance) back:

enceladus=# vacuum master_plan ;
VACUUM

enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count  | all_visible
--------+-------------
 124066 | t

time psql -U postgres -d enceladus -c"explain analyze select team, spass_type, target from master_plan where team='CDA' or team='RSS' order by team, target"

                                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather Merge  (cost=135747.09..163705.64 rows=239628 width=22) (actual time=1822.618..1963.040 rows=286500 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Sort  (cost=134747.07..135046.60 rows=119814 width=22) (actual time=1811.738..1835.136 rows=95500 loops=3)
         Sort Key: team, target
         Sort Method: external merge  Disk: 3640kB
         Worker 0:  Sort Method: external merge  Disk: 3336kB
         Worker 1:  Sort Method: external merge  Disk: 3280kB
         ->  Parallel Index Only Scan using idx_tpt on master_plan  (cost=0.43..122179.99 rows=119814 width=22) (actual time=50.554..1716.339 rows=95500 loops=3)
               Filter: ((team = 'CDA'::text) OR (team = 'RSS'::text))
               Rows Removed by Filter: 935717
               Heap Fetches: 0
 Planning Time: 4.707 ms
 Execution Time: 2005.764 ms
(14 rows)
real    0m2.710s
user    0m0.011s
sys    0m0.015s

Moral: You knew this already, but make sure you’re vacuuming frequently enough, especially if you’re doing a lot of inserts or updates. The default vacuum ratio is generally too high, so you could lower that and autovacuum will likely help you out. If you are seeing that covering indexes don’t seem to be working, a likely reason is that you aren’t running vacuums frequently enough on quickly changing datasets.

Here are some other queries to use with the extension. This one does an in-depth check to find out if the visibility map is corrupt (maybe due to a previous database crash). We want zero rows found:

enceladus=# select * from pg_check_visible('master_plan');

 t_ctid
--------
(0 rows)

If you do discover that a visibility map is corrupt, rebuild it using the pg_truncate_visibility_map function, and then vacuum.

enceladus=# select pg_truncate_visibility_map('master_plan');

enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count  | all_visible
--------+-------------
 124066 | f

enceladus=# vacuum master_plan ;
VACUUM

enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count  | all_visible
--------+-------------
 124066 | t

 

email

Interested in working with Valerie? Schedule a tech call.

About the Author

Internal Principal Consultant
With experience as an open-source DBA and developer for software-as-a-service environments, Valerie has expertise in web-scale data storage and data delivery, including MySQL, Cassandra, Postgres, and MongoDB.

No comments

Leave a Reply

Your email address will not be published. Required fields are marked *