<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   >
<channel>
    
    <title>Postgres OnLine Journal - pgtrgm</title>
    <link>https://www.postgresonline.com/journal/</link>
    <description>Tips and tricks for PostgreSQL</description>
    <dc:language>en</dc:language>
    <generator>Serendipity 2.3.5 - http://www.s9y.org/</generator>
    <pubDate>Wed, 19 Oct 2011 02:09:41 GMT</pubDate>

    <image>
    <url>https://www.postgresonline.com/journal/templates/default/img/s9y_banner_small.png</url>
    <title>RSS: Postgres OnLine Journal - pgtrgm - Tips and tricks for PostgreSQL</title>
    <link>https://www.postgresonline.com/journal/</link>
    <width>100</width>
    <height>21</height>
</image>

<item>
    <title>Improving speed of GIST indexes in PostgreSQL 9.2</title>
    <link>https://www.postgresonline.com/journal/index.php?/archives/225-Improving-speed-of-GIST-indexes-in-PostgreSQL-9.2.html</link>
            <category>9.2</category>
            <category>editor note</category>
            <category>gis</category>
            <category>hstore</category>
            <category>intermediate</category>
            <category>ltree</category>
            <category>pgtrgm</category>
            <category>postgis</category>
            <category>postgresql versions</category>
            <category>tsearch</category>
    
    <comments>https://www.postgresonline.com/journal/index.php?/archives/225-Improving-speed-of-GIST-indexes-in-PostgreSQL-9.2.html#comments</comments>
    <wfw:comment>https://www.postgresonline.com/journal/wfwcomment.php?cid=225</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>https://www.postgresonline.com/journal/rss.php?version=2.0&amp;type=comments&amp;cid=225</wfw:commentRss>
    

    <author>nospam@example.com (Leo Hsu and Regina Obe)</author>
    <content:encoded>
    &lt;p&gt;This is about improvements to GIST indexes that I hope to see in PostgreSQL 9.2.  One is a patch for possible inclusion in PostgreSQL 9.2 called &lt;b&gt;&lt;a href=&quot;https://commitfest.postgresql.org/action/patch_view?id=631&quot; target=&quot;_blank&quot;&gt;SP-GiST, Space-Partitioned GiST&lt;/a&gt;&lt;/b&gt; created by 
&lt;a href=&quot;http://www.sigaev.ru/&quot; target=&quot;_blank&quot;&gt;Teodor Sigaev&lt;/a&gt; and &lt;a href=&quot;http://www.sai.msu.su/~megera/&quot; target=&quot;_blank&quot;&gt;Oleg Bartunov&lt;/a&gt; whose basic technique is described in &lt;a href=&quot;http://www.cs.purdue.edu/spgist/papers/W87R36P214137510.pdf&quot; target=&quot;_blank&quot;&gt;SP-GiST: An Extensible Database Index for Supporting Space Partitioning Trees&lt;/a&gt;. For those who don&#039;t know Teodor and Oleg,  they are the great fellows that brought us many other GiST and GIN goodnesses that many specialty PostgreSQL
extensions enjoy -- e.g. &lt;a href=&quot;http://postgis.net/documentation/manual-svn/&quot; target=&quot;_blank&quot;&gt;PostGIS&lt;/a&gt;, &lt;a href=&quot;http://developer.postgresql.org/pgdocs/postgres/pgtrgm.html&quot; target=&quot;_blank&quot;&gt;trigrams&lt;/a&gt;, &lt;a href=&quot;http://developer.postgresql.org/pgdocs/postgres/ltree.html&quot; target=&quot;_blank&quot;&gt;ltree&lt;/a&gt;, &lt;a href=&quot;http://pgsphere.projects.postgresql.org/&quot; target=&quot;_blank&quot;&gt;pgsphere&lt;/a&gt;, &lt;a href=&quot;http://developer.postgresql.org/pgdocs/postgres/hstore.html&quot; target=&quot;_blank&quot;&gt;hstore&lt;/a&gt;, &lt;a href=&quot;http://developer.postgresql.org/pgdocs/postgres/textsearch-intro.html&quot; target=&quot;_blank&quot;&gt;full-text search&lt;/a&gt; to name a few.&lt;/p&gt;
&lt;p&gt;Another is a patch just committed by Alexander Korotkov, which I only recently found out about from &lt;a href=&quot;http://postgis.net/pipermail/postgis-devel/2011-October/015561.html&quot; target=&quot;_blank&quot;&gt;New node splitting algorithm for GIST&lt;/a&gt; and admit I don&#039;t know enough about to judge. I am very clueless when it comes to the innards of index implementations, so don&#039;t ask me for any technical details.  It&#039;s one of those shortcomings, among the trillion others I have, that I have learned to accept will probably never change.&lt;/p&gt;
&lt;p&gt;What the SP-GIST patch will provide in terms of performance and speed was outlined in 
&lt;a href=&quot;http://www.pgcon.org/2011/schedule/events/309.en.html&quot; target=&quot;_blank&quot;&gt;PGCon 2011: SP-GiST - a new indexing infrastructure for PostgreSQL
Space-Partitioning trees in PostgreSQL&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;What it provides specifically for PostGIS is summarized in Paul&#039;s call for action noted below.  As a passionate user of PostGIS,
ltree, tsearch, and hstore, I&#039;m pretty excited about these patches and other GIST and general index enhancements and their potential use in GIST-dependent extensions. I&#039;m hoping to see
these spring to life in PostgreSQL 9.2 and think they will help to further push the envelope of where PostgreSQL can go as a de facto platform
for cutting-edge technology and scientific research.  I think one of PostgreSQL&#039;s greatest strengths is its extensible index API.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://postgis.net/pipermail/postgis-users/2011-October/031078.html&quot; target=&quot;_blank&quot;&gt;Paul&#039;s PostGIS newsgroup note about seeking funding for faster GIST indexes&lt;/a&gt;, covering the work done so far on SP-GIST and a call for further action, is rebroadcast in its entirety here.
&lt;pre&gt;Thanks to the sponsorship of &lt;a href=&quot;http://www.mtu.edu&quot; target=&quot;_blank&quot;&gt;Michigan Technological University&lt;/a&gt;, we now
have 50% of the work complete. There is a working patch at the
commitfest &lt;a href=&quot;https://commitfest.postgresql.org/action/patch_view?id=631&quot; target=&quot;_blank&quot;&gt;https://commitfest.postgresql.org/action/patch_view?id=631&lt;/a&gt;
which provides quad-tree and kd-tree indexes.

However, there is a problem: unless the patch is reviewed and goes
through more QA/QC, it&#039;ll never get into PostgreSQL proper. In case
you think I am kidding: we had a patch for KNN searching ready for the
9.0 release, but it wasn&#039;t reviewed in time, so we had to wait all the
way through the 9.1 cycle to get it.

I am looking for sponsors in the $5K to $10K range to complete this
work. If you use PostgreSQL in your business, this is a chance to add
a basic capability that may help you in all kinds of ways you don&#039;t
expect. We&#039;re talking about faster geospatial indexes here, but this
facility will also radically speed any partitioned space. (For
example, the suffix-tree, which can search through URLs incredibly
fast. Another example, you can use a suffix tree to very efficiently
index geohash strings. Interesting.)

If you think there&#039;s a possibility, please contact me and I will send
you a prospectus you can take to your manager. Let&#039;s make this happen
folks!

Paul
&lt;/pre&gt; &lt;a class=&quot;block_level&quot; href=&quot;https://www.postgresonline.com/journal/index.php?/archives/225-Improving-speed-of-GIST-indexes-in-PostgreSQL-9.2.html#extended&quot;&gt;Continue reading &quot;Improving speed of GIST indexes in PostgreSQL 9.2&quot;&lt;/a&gt;
    </content:encoded>

    <pubDate>Wed, 12 Oct 2011 18:24:00 -0400</pubDate>
    <guid isPermaLink="false">https://www.postgresonline.com/journal/index.php?/archives/225-guid.html</guid>
    
</item>
<item>
    <title>PostgreSQL 9.1 Trigrams teaching LIKE and ILIKE new tricks</title>
    <link>https://www.postgresonline.com/journal/index.php?/archives/212-PostgreSQL-9.1-Trigrams-teaching-LIKE-and-ILIKE-new-tricks.html</link>
            <category>9.1</category>
            <category>basics</category>
            <category>contrib spotlight</category>
            <category>intermediate</category>
            <category>pgtrgm</category>
            <category>postgresql versions</category>
    
    <comments>https://www.postgresonline.com/journal/index.php?/archives/212-PostgreSQL-9.1-Trigrams-teaching-LIKE-and-ILIKE-new-tricks.html#comments</comments>
    <wfw:comment>https://www.postgresonline.com/journal/wfwcomment.php?cid=212</wfw:comment>

    <slash:comments>1</slash:comments>
    <wfw:commentRss>https://www.postgresonline.com/journal/rss.php?version=2.0&amp;type=comments&amp;cid=212</wfw:commentRss>
    

    <author>nospam@example.com (Leo Hsu and Regina Obe)</author>
    <content:encoded>
    &lt;p&gt;There once existed programmers who were asked to explain this snippet of code: &lt;code&gt;1 + 2&lt;/code&gt; &lt;/p&gt;
&lt;ul&gt;&lt;li&gt;The C programmer explained &amp;quot;It&#039;s a common mathematical expression.&amp;quot;&lt;/li&gt;
    &lt;li&gt;The C++, Java, C# and other impure object-oriented programmers said &amp;quot;We concur.  It&#039;s a common mathematical expression.&amp;quot;&lt;/li&gt;
    &lt;li&gt;The &lt;a href=&quot;http://en.wikipedia.org/wiki/Smalltalk&quot; target=&quot;_blank&quot;&gt;Smalltalk&lt;/a&gt; programmer explained &amp;quot;1 adds 2.&amp;quot;&lt;/li&gt;
    &lt;li&gt;The &lt;a href=&quot;http://en.wikipedia.org/wiki/Lisp_(programming_language)&quot; target=&quot;_blank&quot;&gt;Lisp&lt;/a&gt; programmer stood up, a bit in disgust, and said, &amp;quot;No no! You are doing it all wrong!&amp;quot;&lt;br /&gt; The Lisp Programmer then pulled out
        a &lt;a href=&quot;http://en.wikipedia.org/wiki/Polish_notation&quot; target=&quot;_blank&quot;&gt;Polish calculator&lt;/a&gt;, punched in &lt;code&gt;+ 1 2&lt;/code&gt;
        ,and with a very serious face, explained &lt;br /&gt; &amp;quot;+ should be pushing those other two around.&amp;quot;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I find this episode interesting because, while I feel the Lisp programmer is more right, the Smalltalk programmer has managed to follow the rest of the crowd and still stick
to her core principle. So what does this have to do with &lt;a href=&quot;https://www.postgresonline.com/journal/archives/169-Fuzzy-string-matching-with-Trigram-and-Trigraphs.html&quot; target=&quot;_blank&quot;&gt;trigrams&lt;/a&gt;
in PostgreSQL 9.1?  Well, just as &lt;code&gt;1 + 2&lt;/code&gt; is a common mathematical expression, &lt;code&gt;abc LIKE &#039;%b%&#039;&lt;/code&gt; is a common relational database expression that we have long taken for granted as not indexable in most
databases (in fact, in no other database I can think of) until PostgreSQL 9.1, which can utilize trigram indices (the Lisp programmer behind the curtain) to make it fast.&lt;/p&gt;


&lt;p&gt;There are 2 main enhancements happening with &lt;a href=&quot;https://www.postgresonline.com/journal/archives/169-Fuzzy-string-matching-with-Trigram-and-Trigraphs.html&quot; target=&quot;_blank&quot;&gt;trigrams&lt;/a&gt; in PostgreSQL 9.1,
both of which depesz has already touched on in &lt;a href=&quot;http://www.depesz.com/index.php/2011/02/19/waiting-for-9-1-faster-likeilike/&quot; target=&quot;_blank&quot;&gt;FASTER LIKE/ILIKE&lt;/a&gt;
and &lt;a href=&quot;http://www.depesz.com/index.php/2010/12/11/waiting-for-9-1-knngist/&quot; target=&quot;_blank&quot;&gt;KNNGIST&lt;/a&gt;.  This means you can have an even faster trigram search than you have ever
had before, and you can do it in a fashion that doesn&#039;t require any PostgreSQL trigram-specific syntax.  So while PostgreSQL 9.1 understands LIKE much like all the other databases
you work with, if you have a trigram index in place, the more clever PostgreSQL 9.1 planner will evaluate it a little faster and sometimes a lot faster.
This is one example of how you can use applications designed for many databases and still be able to utilize advanced features in
your database of choice. In this article we&#039;ll demonstrate how.&lt;/p&gt;

&lt;p&gt;For this example we&#039;ll use a table of 490,000-odd records consisting of Massachusetts street segments and their names excerpted from &lt;a href=&quot;http://www.census.gov/geo/www/tiger/tgrshp2010/tgrshp2010.html&quot; target=&quot;_blank&quot;&gt;TIGER 2010&lt;/a&gt; data. You can
download the trimmed data set from &lt;a href=&quot;/downloads/featnames_short.zip&quot; target=&quot;_blank&quot;&gt;here&lt;/a&gt; if you want to play along.&lt;/p&gt; &lt;a class=&quot;block_level&quot; href=&quot;https://www.postgresonline.com/journal/index.php?/archives/212-PostgreSQL-9.1-Trigrams-teaching-LIKE-and-ILIKE-new-tricks.html#extended&quot;&gt;Continue reading &quot;PostgreSQL 9.1 Trigrams teaching LIKE and ILIKE new tricks&quot;&lt;/a&gt;
    </content:encoded>

    <pubDate>Mon, 06 Jun 2011 01:23:00 -0400</pubDate>
    <guid isPermaLink="false">https://www.postgresonline.com/journal/index.php?/archives/212-guid.html</guid>
    
</item>
<item>
    <title>Fuzzy string matching with Trigram and Trigraphs</title>
    <link>https://www.postgresonline.com/journal/index.php?/archives/169-Fuzzy-string-matching-with-Trigram-and-Trigraphs.html</link>
            <category>8.3</category>
            <category>8.4</category>
            <category>9.0</category>
            <category>contrib spotlight</category>
            <category>fuzzystrmatch</category>
            <category>intermediate</category>
            <category>pgtrgm</category>
            <category>postgresql versions</category>
    
    <comments>https://www.postgresonline.com/journal/index.php?/archives/169-Fuzzy-string-matching-with-Trigram-and-Trigraphs.html#comments</comments>
    <wfw:comment>https://www.postgresonline.com/journal/wfwcomment.php?cid=169</wfw:comment>

    <slash:comments>3</slash:comments>
    <wfw:commentRss>https://www.postgresonline.com/journal/rss.php?version=2.0&amp;type=comments&amp;cid=169</wfw:commentRss>
    

    <author>nospam@example.com (Leo Hsu and Regina Obe)</author>
    <content:encoded>
    &lt;p&gt;In an earlier article, &lt;a href=&quot;https://www.postgresonline.com/journal/archives/158-Where-is-soundex-and-other-warm-and-fuzzy-string-things.html&quot; target=&quot;_blank&quot;&gt;Where is Soundex and other Fuzzy string things&lt;/a&gt;, we covered the PostgreSQL contrib module fuzzystrmatch, which contains the very popular function
soundex that is found in other popular relational databases. We also covered the more powerful levenshtein distance, metaphone and
dmetaphone functions included in fuzzystrmatch, but rarely found in other relational databases.&lt;/p&gt;

&lt;p&gt;As far as fuzzy string matching goes, PostgreSQL has other functions up its sleeve.  This time we will cover
the contrib module &lt;a href=&quot;http://www.postgresql.org/docs/8.4/interactive/pgtrgm.html&quot; target=&quot;_blank&quot;&gt;pg_trgm&lt;/a&gt; which was introduced in PostgreSQL 8.3.  pg_trgm uses a concept called trigrams for doing string comparisons. The pg_trgm module has several functions and GIST/GIN operators.
Like other contrib modules, you just need to run the &lt;b&gt;/share/contrib/pg_trgm.sql&lt;/b&gt; file packaged in your PostgreSQL install to enable it in your database.
&lt;/p&gt;
&lt;p&gt;For this set of exercises, we&#039;ll use trigrams to compare words using the same set of data we tested 
with soundex and metaphones. For the next set of exercises, we will be using the places dataset we created in &lt;a href=&quot;https://www.postgresonline.com/journal/archives/157-Import-fixed-width-data-into-PostgreSQL-with-just-PSQL.html&quot; target=&quot;_blank&quot;&gt;Importing Fixed width data into PostgreSQL with just PSQL&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt; The most useful pieces of pg_trgm are the &lt;B&gt;similarity&lt;/B&gt; function and the
&lt;b&gt;%&lt;/b&gt; operator.  The &lt;b&gt;%&lt;/b&gt; operator allows for using a GIST/GIN index, and the similarity function allows for narrowing your filter, similar to what
levenshtein did for us in fuzzystrmatch.&lt;/p&gt; &lt;a class=&quot;block_level&quot; href=&quot;https://www.postgresonline.com/journal/index.php?/archives/169-Fuzzy-string-matching-with-Trigram-and-Trigraphs.html#extended&quot;&gt;Continue reading &quot;Fuzzy string matching with Trigram and Trigraphs&quot;&lt;/a&gt;
    </content:encoded>

    <pubDate>Wed, 21 Jul 2010 18:20:00 -0400</pubDate>
    <guid isPermaLink="false">https://www.postgresonline.com/journal/index.php?/archives/169-guid.html</guid>
    
</item>

</channel>
</rss>
