Another is a recent one just committed by Alexander Korotkov which I just recently found out about on New node splitting algorithm for GIST and admit I don't know enough about to judge. I have to admit to being very clueless when it comes to the innards of index implementations so don't ask me any technical details. It's one of those short-comings among the trillion others I have that I have learned to accept will probably never change.
What it provides specifically for PostGIS is summarized in Paul's call for action noted below. As a passionate user of PostGIS
,ltree, tsearch, and hstore, I'm pretty excited about these patches and other GIST and general index enhancements and there potential use in GIST dependent extensions. I'm hoping to see
these spring to life in PostgreSQL 9.2 and think it will help to further push the envelope of where PostgreSQL can go as a defacto platform
for cutting-edge technology and scientific research. I think one of PostgreSQL's greatest strength is its extensible index API.
Thanks to the sponsorship of Michigan Technological University, we now
have 50% of the work complete. There is a working patch at the
which provides quad-tree and kd-tree indexes.
However, there is a problem: unless the patch is reviewed and goes
through more QA/QC, it'll never get into PostgreSQL proper. In case
you think I am kidding: we had a patch for KNN searching ready for the
9.0 release, but it wasn't reviewed in time, so we had to wait all the
way through the 9.1 cycle to get it.
I am looking for sponsors in the $5K to $10K range to complete this
work. If you use PostgreSQL in your business, this is a chance to add
a basic capability that may help you in all kinds of ways you don't
expect. We're talking about faster geospatial indexes here, but this
facility will also radically speed any partitioned space. (For
example, the suffix-tree, which can search through URLs incredibly
fast. Another example, you can use a suffix tree to very efficiently
index geohash strings. Interesting.)
If you think there's a possibility, please contact me and I will send
you a prospectus you can take to your manager. Let's make this happen
The Lisp programmer stood up, a bit in disgust, and said, "No no! You are doing it all wrong!" The Lisp Programmer then pulled out
a Polish calculator, punched in + 1 2
,and with a very serious face, explained "+ should be pushing those other two around."
I find this episode interesting because while the Lisp programmer I feel is more right, the Smalltalk programmer has managed to follow the rest of the crowd and still stick
to her core principle. This brings us to what does this have to do with trigrams
in PostgreSQL 9.1. Well just like 1 + 2 being a common mathematical expression, abc LIKE '%b%' is a common logical relational database expression that we have long taken for granted as not an indexable operation in most
databases (not any other database to I can think of) until PostgreSQL 9.1, which can utilize trigram indices (the Lisp programmer behind the curtain) to make it fast.
There are 2 main enhancements happening with trigrams in PostgreSQL 9.1
both of which depesz has already touched on in FASTER LIKE/ILIKE
and KNNGIST. This means you can have an even faster trigram search than you ever
have had before and you can do it in such a fashion that doesn't require any PostgreSQL trigram specific syntactical expressions. So while PostgreSQL 9.1 might be understanding LIKE much like all the other databases
you work with, if you have a trigram index in place, it will just be doing it a little faster and sometimes a lot faster using the more clever PostgreSQL 9.1 planner.
This is one example of how you can use applications designed for many databases and still be able to utilize advanced features in
your database of choice. In this article we'll demonstrate.
For this example we'll use a table of 490,000 someodd records consisting of Massachusetts street segments and their names excerpted from TIGER 2010 data. You can
download the trimmed data set from here if you want to play along.
In an earlier article Where is Soundex and other Fuzzy string things we covered the PostgreSQL contrib module fuzzstrmatch which contains the very popular function
soundex that is found in other popular relational databases. We also covered the more powerful levenshtein distance, metaphone and
dmetaphone functions included in fuzzstrmatch, but rarely found in other relational databases.
As far as fuzzy string matching goes, PostgreSQL has other functions up its sleeves. This time we will cover
the contrib module pg_trgm which was introduced in PostgreSQL 8.3. pgtrgm uses a concept called trigrams for doing string comparisons. The pg_trgm module has several functions and gist/gin operators.
Like other contrib modules, you just need to run the /share/contrib/pg_trgm.sql file packaged in your PostgreSQL install to enable it in your database.
The most useful are the similarity function and the
% operator. The % operator allows for using a GIST/GIN index and the similarity function allows for narrowing your filter similar to what
levenshtein did for us in fuzzstrmatch.