You have a large table and the default planner stats collector underestimates distinct counts of a critical query column thus resulting in much less than optimal query plans.
How can you manually set this?
PostgreSQL 9.0 introduced ability to set two settings on table columns: n_distinct and n_distinct_inherited which are described a bit in ALTER TABLE help.
The n_distinct is the estimated number of distinct values for that column
with -1 or any negative number representing a percentage of estimated table count instead of a true count.
n_distinct_inherited is a setting useful for parent tables that denotes the estimated distinct count sum of all a parent's child tables.
By tweaking these settings when they are less than optimal, you can influence the query planner to produce better plans. Why this is necessary is mostly for large
tables where the stat collector will not query the whole table to determine stats. The stats collector generally queries at most 10-30% of a table.
Determine If you need to set counts
It's always nice to have the stat collector do all these things for you especially if you have a table that is constantly updated and distinct counts can fluctuate a lot.
For static tables you may just want to set them manually.
So how do you know whether you should bother or not. Well you can check the current values
the stats collector has with this query:
--determineifyourstatsarefine-compareestimateswithwhatyouknowtheyare--SELECT tablename, schemaname, attname As colname, n_distinct,
array_to_string(most_common_vals, E'\n')AS common_vals,
array_to_string(most_common_freqs, E'\n')As dist_freq
WHERE tablename ='table_of_interest'ORDERBY schemaname, tablename, attname;