You'll often hear the term planner statistics thrown around by database geeks. Did you update your statistics. This lingo isn't even limited
to PostgreSQL, but is part and parcel to how most decent databases work. For example in PostgreSQL you do a vacuum analyze
to update your planner statistics in addition
to cleaning up dead space. In SQL Server you do an UPDATE STATISTICS
. In MySQL you do an
ANALYZE TABLE
or a more invasive OPTIMIZE TABLE
.
Normally all this "update your stats so your planner can be happy" is usually unnecessary unless
you just did a bulk load or a bulk delete or you are noticing your queries are suddenly slowing down. These stat things are generally updated behind the scenes by most databases
on an as needed basis.
What makes SQL really interesting and a bit different from procedural languages is that it is declarative (like functional and logical programming languages) and relies on the database planner to come up with strategies for navigating the data. Its strategy is not fixed as it is in procedural languages.
A big part of this strategy is decided on by the query planner which looks at distributions of data. Given different WHERE conditions for similar queries, it could come up with vastly different strategies if one value has a significantly
higher distribution in a table than another. This is also the mystery of why it sometimes refuses to use an index on a field because it has decided a table scan is more efficient and also why some people consider HINTS evil because they pollute the imperative nature of the language.