Why is my index not being used

The age old question of why or why is my table index not being used is probably the most common question that ever gets asked even by expert database users.

In this brief article we will cover the most common reasons and try to order by statistical significance.

For the uninitiated, how do you know when an index is not being used? You use EXPLAIN, EXPLAIN ANALYZE or PgAdmin's nifty graphical Explain plan described in Reading PgAdmin Graphical Explain Plans.

Okay your query is not using an index, why and what can you do about it?

  1. Problem: Outdated stats. Now this is less common of an issue if you have auto-vacuuming turned on, but if you have recently bulk loaded a table or added new indexes and then try to do a query, then the statistics may be out of date.

    Solution: vacuum analyze verbose sometable
    I add the verbose in because it is fun to see what vacuuming is doing and if you are as impatient as I am and you tables are big, its nice to get some confirmation that something indeed is happening. You can also just
    vacuum analyze verbose
    if you want to run against all tables.
  2. Problem: The planner has decided its faster to do a table scan than an index scan : This can happen if a) your table is relatively small, or the field you are indexing has a lot of duplicates. Solution: Case in point, boolean fields are not terribly useful to index since 50% of your data is one thing and 50% is another. However they are good candidates to use for Partial indexes e.g. to only index data that is active.
  3. Problem: You set up an index that is incompatible with how you are actually filtering a field. There are a couple of variants of this situation. The old
    • LIKE '%me' will never use an index, but LIKE 'me%' can possibly use an index.
    • The upper lower trap - you defined your index like:
      CREATE INDEX idx_faults_name ON faults USING btree(fault_name);, But you are running a query like this:
      SELECT * FROM faults where UPPER(fault_name) LIKE 'CAR%' Possible fix:
      CREATE INDEX idx_faults_name ON faults USING btree(upper(fault_name));
    • This is one that I wasn't even aware of. It is always nice to read about other peoples problems via newgroups because you discover all these problems you never even knew you had. If your server was init with a non-C locale, doing the above still doesn't work. This came up in Pgsql-novice newsgroup recently and Tom Lane provided an answer which works. This I suspect bites more people than is known. The solution Possible fix:
      CREATE INDEX idx_faults_uname_varchar_pattern ON faults USING btree(upper(fault_name) varchar_pattern_ops);

      However even with the above solution it appears you may still need the variant below for exact matches and IN clauses:

      We haven't quite proven if this an issue with database encoding, data itself or difference in versions of 8.2 vs 8.3. It seems for 8.3 for our particular dataset in UTF-8 the below is still needed for exact matches, however for similar use-case in a 8.2 database in SQL-ASCII, the varchar_pattern_ops is enough for both exact and LIKE matches.

      CREATE INDEX idx_faults_uname ON faults USING btree(upper(fault_name));
      e.g.
      SELECT fault_name from  faults
      WHERE upper(fault_name) IN('CASCADIA ABDUCTION', 'CABIN FEVER');
    • Dum newbie user - Just incompatible data type. E.g. doing something like creating an index on a date field and then doing a text compare with dates by casting your date to text.
  4. Not all indexes can be used. Although PostgreSQL 8.1+ versions support what is known as Bitmap Index Scan, which allows multiple indexes on a table to be used in a query by creating in-memory bitmap indexes. If you have got many indexes, don't expect all possible candidate indexes to be used. Sometimes a table scan is just more efficient.
  5. Problem: The planner isn't perfect. Solution Cry and pray for a brighter day. Actually I've been pretty impressed with PostgreSQL's planning abilities compared to other databases. Some would say, If only I could provide hints, I could make this faster. I tend to think hints are a bad idea and the best solution is just to make the planner better. The problem with hints is they take away one beauty of databases. That is the idea that the database knows the state of the data better than you do and constantly updates that knowledge. Hints can quickly become stale where as a well-primed planner, will constantly be changing strategy as the database changes and that is really what makes database programming unique among various modes of programming.