Sunday, January 08. 2012
Printer Friendly
PostgreSQL has this interesting placeholder called anyelement which it has had for a long time and its complement anyarray. They are used when you want to define a function that can handle many types arguments or can output many types of outputs. They are particularly useful for defining aggregates, which we demonstrated in
Who's on First and Who's on Last and several other aggregate articles.
Anyelement / anyarray can be used just as conveniently in other functions. The main gotcha is that when you pass in the first anyelement/anyarray all subsequent anyelement / anyarray must match the same data type as the first anyelement / anyarray.
Continue reading "The wonders of Any Element"
Wednesday, December 28. 2011
Printer Friendly
For those who aren't familiar with hstore, it's a key/value
storage type that is packaged as an extension or contrib in PostgreSQL 8.2+. In PostgreSQL 9.0 it got a little extra loving in several ways one of which was the introduction
of the hstore(record) casting function that converts a record to an hstore. In this article, I'll demonstrate how you can use this new casting function to do very sleek mail merges right in the database. The only caveat is that it seems to only correctly name the keys if it is fed a real table or view. Derived queries such as aggregates etc get keys named f1, f2, etc.
If you are on PostgreSQL 9.1 or above installing -- hstore is just a CREATE EXTENSION hstore; sql command away. If you are on a lower version of PostgreSQL,
you can usually find the hstore.sql in share/contribs.
Continue reading "Mail Merging using Hstore"
Friday, November 11. 2011
Printer Friendly
One of my favorite tools and I think that of many folks working with GIS and other kinds of Multimedia is the GDAL/OGR suite.
Though I started using it to conquer GIS ETL activities, I found myself using it for problems that are inherently not GIS at all. I talked
about the GDAL OGR2OGR component a while ago in GDAL OGR2OGR for Data Loading
and this time I'll talk tangentially about its raster capabilities. It is a fantastic tool for converting between various raster formats and applying various raster operations.
In PostGIS world the new 2.0 raster functionality puts an SQL wrapper around much of its power. I'm not going to talk about that though except as a fleeting comment to explore later (we've got cool 2 band Map Algebra in PostGIS 2.0 to flaunt its stuff).
So what does this have to do with XPathing XML data with PostgreSQL? Well that's what I'm going to talk about what to do with machine generated data that comes at you in XML format.
A lot of machine generated data is hitting us in an XML like form. I talked about GPX data and navigating that in Which by the way GDAL/OGR can load and export easily into/out of a PostGIS enabled database.
GDAL exposes another kind of machine generated data in XML format which turns out to be very useful for all kinds of things. This is Exchangeable image file format (EXIF) data. There are all kinds
of random text information embedded in pictures and this varies depending on what camera is taking it. Newer cameras like the ones you have built into your iphone or android
embed location based information into them sometimes like where you were standing when you took the picture. Most cameras these days embed the time the picture was taken.
This information is important because if you are taking electronic notes while you are snapping your pictures, it provides an easy way to match up your notes with the picture about the object. So what does this EXIF info look like when you point GDAL at it? We'll see.
Continue reading "XPathing XML data with PostgreSQL"
Thursday, November 03. 2011
Printer Friendly
As mentioned in Lessons Learned Packaging PostGIS extensions, I am working on PostGIS 2.0.0 extension packaging. One of the things I wanted to know was what objects, types, functions etc were installed by my extension. The new packaging system allows for cataloging this relatively easily, but I couldn't find a function or view for this and didn't see one mentioned in the manual, so I created this query which seems to work pretty well as far as I can tell. The basic idea being that any object that an extension depends on
that is not an extension is part of the extension package.
SELECT c.relname As item_type,
COALESCE(proc.proname,typ.typname, cd.relname, op.oprname,
'CAST(' || cs.typname || ' AS ' || ct.typname || ') ', opcname, opfname) As item_name,
COALESCE(proc.proisagg,false) As is_agg, oidvectortypes(proc.proargtypes) As arg_types
FROM pg_depend As d INNER JOIN pg_extension As e
ON d.refobjid = e.oid INNER JOIN pg_class As c ON
c.oid = d.classid
LEFT JOIN pg_proc AS proc ON proc.oid = d.objid
LEFT JOIN pg_type AS typ ON typ.oid = d.objid
LEFT JOIN pg_class As cd ON cd.oid = d.objid
LEFT JOIN pg_operator As op ON op.oid = d.objid
LEFT JOIN pg_cast AS ca ON ca.oid = d.objid
LEFT JOIN pg_type AS cs ON ca.castsource = cs.oid
LEFT JOIN pg_type AS ct ON ca.casttarget = ct.oid
LEFT JOIN pg_opclass As oc ON oc.oid = d.objid
LEFT JOIN pg_opfamily As ofa ON ofa.oid = d.objid
WHERE d.deptype = 'e' and e.extname = 'postgis'
ORDER BY item_type, item_name;
The output looks like:
Continue reading "What objects are in a PostgreSQL installed extension"
Tuesday, November 01. 2011
Printer Friendly
Ever have the need to create a holding table say spreadsheet data with say 100 columns. You need to create a table to hold this stuff. Or perhaps you were feeling in a sadist mood and wanted to abuse your PostgreSQL database to see how many columns you can create in a table of a specific data type.
Here is a quick script to do it:
SELECT 'CREATE TABLE data_import('
|| array_to_string(array_agg('field' || i::text || ' varchar(255)'), ',') || ');'
FROM generate_series(1,10) As i;
SELECT 'CREATE TABLE data_import('
|| string_agg('field' || i::text || ' varchar(255)', ',') || ');'
FROM generate_series(1,10) As i;
Both variants will return output that looks like this:
CREATE TABLE data_import(field1 varchar(255),field2 varchar(255),field3 varchar(255),field4 varchar(255)
,field5 varchar(255),field6 varchar(255),field7 varchar(255)
,field8 varchar(255),field9 varchar(255),field10 varchar(255));
Now if you want it to also execute because you are running it as part of an sql script, you could wrap it in an anonymous function.
DO language 'plpgsql'
$$
DECLARE var_sql text := 'CREATE TABLE data_import('
|| string_agg('field' || i::text || ' varchar(255)', ',') || ');'
FROM generate_series(1,10) As i;
BEGIN
EXECUTE var_sql;
END;
$$ ;
Tuesday, October 18. 2011
Printer Friendly
One of the great lessons learned in building PostGIS extensions is my rediscovery of SED. SED turned out to be
mighty useful in this regard and I'll explain a bit in this article. Unfortunately there is still a lot I need to learn about it
to take full advantage of it and most of my use can be summed up as monkey see, monkey scratch head, monkey do. In addition I came across what I shall refer to as Pain points with using
the PostgreSQL Extension model. Part of which has a lot to do with the non-granular management of changes in PostGIS,
the day to day major flux of changes happening in PostGIS 2.0 space, and my attempt at trying to creat upgrade freeze points amidst these changes.
When PostGIS 2.0 finally arrives, the freeze points will be better defined and not change from day to day. So some of these issues
may not be that big of a deal.
Continue reading "Lessons learned Packaging PostGIS Extensions: Part 2"
Thursday, October 06. 2011
Printer Friendly
In prior articles we talked about the new PostgreSQL 9.1 extension model
and upcoming PostGIS 2.0 extensions which we have experimental builds of so far.
In this article and other's to follow, we shall provide a documentary of our venture into this new extensions world. We'll discuss some of the obstacles we had with building
extensions, lessons learned, and foolishness exposed, with the hope that others can learn from our experience.
First off, the upcoming PostGIS 2.0 extensions will be packaged as at least two extensions -- postgis which will contain both PostGIS geometry/geography types, functions, meta views and tables as well as raster type and associated functions and tables. Topology support, while a part of upcoming PostGIS 2.0, will be packaged as a separate extension called postgis_topology. The main reason for breaking topology out as a separate extension is that it is always stored in a schema called topology and is not relocatable
to another schema. The way the current extension model works, all the parts of your extension should live in the same schema. Later we plan to package tiger geocoder as an extension, but this one probably makes more sense to live on http://pgxn.org/ since it is only of interest to United States users,
, is purely plpgsql with dependency on PostGIS, and we had beefed it up as part of a consulting contract for a company running PostGIS 1.5. It's the only piece documented in PostGIS 2.0 that works on 1.5 as well (aside from the tiger toplogy loader which has dependency on toplogy), although it has always lived as an extra in the PostGIS code base.
We'll probably package postgis_legacy_functions as an extension too for those people who badly need those 500 alias functions I chucked.
We mentioned in our prior article that we ran into some issues with how our extension worked -- e.g. topology referencing the postgis extension. Most of these turned out just to be ignorance on my part as to how the different pieces fit together and I'll elaborate on these.
Much of what will be described here is also documented in Packaging Related Objects into an Extension.
In the future I'm hoping we'll also see plr and pgrouting packaged as extensions which are common favorites of PostGIS users.
Continue reading "Lessons learned Packaging PostGIS Extensions: Part 1"
Wednesday, October 05. 2011
Printer Friendly
In most release notices, it's the big shiny sexy features that get all the glamor, but in reality on day to day use
it's the small usability enhancements that make the most difference. I'm reminded about this now that I'm working
on upgrade scripts and extensions for PostGIS. There are a couple of new features that make application upgrades easier that I
regret not having in older versions of PostgreSQL we support and additional ones I had in other databases that I find lacking in PostgreSQL. PostgreSQL 8.2 for example brought us DROP IF EXISTS ...
and all I can say is thank goodness we dropped support of prior versions of PostgreSQL in PostGIS 1.4 otherwise developing upgrade scripts would have been more of a nightmare.
PostgreSQL 8.4 introduced the ability to add additional columns to a view using CREATE OR REPLACE VIEW as
long as those columns were at the end of the view which Gabrielle Roth demonstrates an example of in This week’s find: CREATE OR REPLACE VIEW
If you were a MySQL user or application developer not having such features would be one reason to frown on PostgreSQL
and MySQL users and other database converts still have reasons to frown for lack of usability features they had
in their other database that they feel naked without in PostgreSQL.
In 9.1 we got two new DDL commands not much talked about that I am very excited about.
CREATE TABLE .. IF NOT EXISTS . I can't tell you how many times I've heard MySQL users whine about the lack of this in PostgreSQL
and I felt their pain. It would be really nice to have this feature for other things such as TYPES or even possibly a CREATE OR REPLACE TYPE which would allow
some alteration of types like adding attributes at the end.
- ALTER TYPE ..[ADD ATTRIBUTE] [RENAME ATTRIBUTE] [ADD VALUE].
The [ADD VALUE] is a clause specific to ENUM types which allows you to add new enum values before or after an existing. The lack of that feature
in prior versions was the major reason I stayed away from enums.
- And of cause my favorite CREATE EXTENSTION ALTER EXTENSION family which admittedly do get talked about a lot more often and which I'll discuss more in a later
article.
.
I know it sounds like I'm complaining. That's because I am. Honestly though, I think the first step to caring about something is really taking notice of its
flaws and wanting to change them. The strength of an open source project is the ease with which it allows its developers and users to have a great impact on its direction. This is something I do think PostgreSQL excels much much better than most open source projects. I find a ton of flaws in PostGIS I'd like to change and have and I am greatful that PostGIS, like PostgreSQL is not resistant to change if the community wants it. If you are going to take notice of flaws in other products without admitting to your own or admitting that some things are easier in other products and learning from them, then you are a hypocrite or living in a closet. Now getting back to my complaining. Things I miss in PostgreSQL that I had in others which I'm sure I'm not alone.
- Being able to change a table column type of a table column that is used in a VIEW and have PostgreSQL just correct the type in the view
or allow me the option to change it later. This is something we had in SQL Server which Leo whines about often. Actually Leo's whining is more annoying than
the actual problem itself. The notice is at least very descriptive which is more than I can say for other databases.
- Being able to reorder columns in a table. Again something fairly trivial to do in SQL Server and MySQL but not possible in PostgreSQL.
Thursday, September 29. 2011
Printer Friendly
Lots of people have been asking the never ending question of when PostGIS is going to get on the band wagon and support KNN GIST like other GIST based types trigrams, full text search etc. Well it's happened in PostGIS 2.0 and now committed. More of the gory details at Indexed Nearest Neighbour Search in PostGIS.
In short this will make point / point distance searches and rankings way way faster and help also with other distance searches by providing approximations to start with.
We are still preparing the PostgreSQL 9.1 2.0 32-bit windows builds that will have this functionality and should have that ready in the next day or so.
To summarize what you can expect. We spent a lot of time discussing and were torn between a box distance operator <#> and a centroid box distance operator <->, so we ended up having both. The reason being is that for some kinds of geometries e.g. streets that aren't diagonal a box distance operator seems to be a much better approximation of distance than a centroid box distance operator. For points of course the two concepts are the same and not an approximation so point / point distance you'd be better off using the new KNN sorting than ST_Distance + ST_DWithin as we have suggested in past. I'll be doing some benchmarks in the coming weeks comparing the old way and speed differences you can expect and perhaps throwing together box and centroid cocktails that combine the two weapons into thought provoking WTFs (or as Dave Fetter would say "That's very Rube Goldberg of you").
I suspect I'll probably be sticking with <#> because I like the symbol better and I was one of the ones fighting for it :).
Monday, September 26. 2011
Printer Friendly
UPDATE Turns out there is a simpler way of getting rid of roles that have explicit permissions to objects as Tom Lane pointed out in the comments.
DROP OWNED BY some_role;
Will drop the permissions to objects a user has rights to even if they don't own the objects. Of course this needs to be applied with caution since it will drop tables
and other things you don't want necessarily dropped. So it is best to first run a:
REASSIGN OWNED BY some_role TO new_role;
And then run the DROP OWNED BY.
The REASSIGN OWNED BY which is what we did originally is not sufficient since it doesn't drop the permissions or reassign
them as we assumed it would. This is noted in the docs.
And then you will be allowed to
DROP ROLE some_role
One of the things that is still tricky in PostgreSQL is permission management. Even though 9.0 brought us default privileges and the like, these permissions aren't
retroactive so still a pain to deal with if you already have objects defined in your database.
One of the annoyances we come across with is deleting roles. Lets say you have a role and it has explicit permissions to an object.
PostgreSQL won't allow you to delete this role if it owns objects or has explicit permissions to objects. In order to delete it seems you have
to go in and clear out all those permissions. To help with that -- we wrote a quickie script that will generate a script to revoke all permissions on objects
for a specific role. It looks like this:
Continue reading "Bulk Revoke of Permissions for Specific Group/User role"
Sunday, September 18. 2011
Printer Friendly
We attended FOSS4G this year in Denver, Colorado. Friday was a PostGIS bonanza with 5 PostGIS talks back to back including ours.
The crowd was huge. All the PostGIS talks as I recall were so packed that there were not enough seats to accommodate everyone. A more comprehensive
detail of the events is described on OpenGeo FOSS4G Day #5
We admit to overstuffing our slides with SQL and ran short on time at the end. Leo complained and vowed to do a better job next time.
We really weren't expecting such a large crowd. Admittedly I'm all for the after conference experience which is much longer than the conference which is why I tend to make slides that are very dense. WARNING: The following slides feature SQL doing unconventional things suitable only for mature audiences. Viewer discretion is adviced.. You can check out our slides here PostGIS 2.0 the new stuff.
Continue reading "FOSS4G 2011 PostGIS the new stuf"
Sunday, August 14. 2011
Printer Friendly
One of the new features I'm excited about in upcoming PostgreSQL 9.1 are extensions. It is also my hope
that for PostGIS 2.0, we'll be able to package PostGIS 2.0 as an extension.
Reinspired in my mission by David Wheeler's recent post and video on Building and Distributing Extensions without C, I decided to take some time to investigate how all the extension pieces fit together.
The three things I like most about extensions
are: - It has a companion sql CREATE EXTENSION and catalog of what's installed and available right from the db, which makes installing/uninstalling relatively painless
- Installed functions don't get backed up with data, which is really a bit of a nightmare for PostGIS folks and relief be much welcome as you can tell in Paul's PostGIS backup and restore
which gets a bit more of an adventure in PostGIS 2.0 now that we have raster and topology and many more fun fun GEOS functions.
- The ease with which you can uninstall,migrate your extension to another schema, and upgrade (in theory). There will be caveats here of course
with changes that require on disk format and involve data.
Of course the ease is all in the thoughtfulness of the packaging. To get some ideas of how we would go about packaging
PostGIS 2.0 as an extension (it could very well be 3 extensions if we decide to package the core postgis, raster, and topology (and even tiger geocoder) as separate extensions), I thought I would take a look at how others have packaged theirs, and how one goes about registering one of these packages
to make it available in CREATE EXTENSION.
Figuring out the extensions you have available ready to install
First I decided to start by doing a little snooping, by applying some lessons from our previous article Querying table, view, column and function descriptions
I wrote this query to figure out what useful functions are available to learn about extensions.
SELECT p.proname AS funcname, d.description
FROM pg_proc p
INNER JOIN pg_namespace n ON n.oid = p.pronamespace
LEFT JOIN pg_description As d ON (d.objoid = p.oid )
WHERE n.nspname = 'pg_catalog'
AND (d.description ILIKE '%extension%' or p.proname ILIKE '%extension%')
ORDER BY n.nspname, p.proname ;
-- which output this --
funcname | description
---------------------------------+-----------------------------------------------------
pg_available_extension_versions | list available extension versions
pg_available_extensions | list available extensions
pg_extension_config_dump | flag an extension's table contents to be emitted by pg_dump
pg_extension_update_paths | list an extension's version update paths
Continue reading "PostgreSQL 9.1 Exploring Extensions"
Monday, June 27. 2011
Printer Friendly
I'm one of those old-fashioned folks that debugs with print lines and raise notices. They're nice.
They always work, you can put clock time stops in there and don't require any fancy configuration.
At a certain point you do have to pull out a real debugger to see what is going on. This often
happens when your one-liners are no longer good enough and now you have to write 20 liners of plpgsql code.
Such is the case with geocoding and the PostGIS tiger geocoder specifically. Lots of interest has revived
on that with people submitting bug reports and we've got paying clients in need of a fairly easy and speedy drop-in geocoder
that can be molded to handle such things as road way locations, badly mis-spelled real estate data, or just simply
to get rid of their dependency on Google, Yahoo, MapQuest, ESRI and other online or pricey geocoding tools.
So I thought I'd take this opportunity to supplement our old-fashioned debugging with plpgsqldebugger goodness.
In this article, we'll show you how to configure the plpgsql debugger integrated in PgAdmin and run with it.
Continue reading "Using PgAdmin PLPgSQL Debugger"
Thursday, June 16. 2011
Printer Friendly
PostGIS 2.0.0 has inched a lot closer to completion. This past week, Paul enabled his gserialization work which changed the on disk-format of PostGIS and in return I think we'll have a much better platform to grow on. With this change we now have the 3D index and bounding box bindings in place. Say hello to the &&& operator which is like &&, but is 3D aware and comes with its own companion 3D spatial indexes. This will allow you to do true 3D bounding box searches with any of the new 2.5/3D geometries we have in place for PostGIS 2.0.0. We are still noodling out the semantics of boxes. Read Paul's call for action on the The Box Plan?, if you are interested. PostgreSQL 8.4 is the lowest supported version for PostGIS 2.0.0. It took a bit of squabbling between PSC members to make that decision, but I put my foot down and I think in the end was for the best to allow us to use new features, less platforms to test, and get rid of some unnecessary code.
PostGIS Windows 32-bit Experimental builds fresh off the presses
With all these changes, if you are running an earlier alpha release of PostGIS 2.0.0, you'll need to do a dump restore since the on disk format is now changed.
If you are on windows and want to give some of this all a test drive, you can download one of our PostGIS 2.0.0 Windows experimental builds. We still only have 32-bit builds. We have builds
for PostgreSQL 8.4, PostgreSQL 9.0, and PostgreSQL 9.1 beta 2. The problems we faced in PostgreSQL 9.1 beta 1 were resolved in beta 2 so that most regress tests past except some minor ones involving stupid things like difference in line number marking of errors. Complement your PostgreSQL 9.1 beta 2 meal with a yummy large helping of PostGIS 2.0.0 goodness.
Continue reading "State of PostGIS and PostGIS Windows binaries for 9.1 9.0 and 8.4"
Monday, June 06. 2011
Printer Friendly
There once existed programmers who were asked to explain this snippet of code: 1 + 2
- The C programmer explained "It's a common mathematical expression."
- The C++, Java, C# and other impure object-oriented programmers said "We concur. It's a common mathematical expression."
- The Smalltalk programmer explained "1 adds 2."
- The Lisp programmer stood up, a bit in disgust, and said, "No no! You are doing it all wrong!"
The Lisp Programmer then pulled out
a Polish calculator, punched in + 1 2
,and with a very serious face, explained "+ should be pushing those other two around."
I find this episode interesting because while the Lisp programmer I feel is more right, the Smalltalk programmer has managed to follow the rest of the crowd and still stick
to her core principle. This brings us to what does this have to do with trigrams
in PostgreSQL 9.1. Well just like 1 + 2 being a common mathematical expression, abc LIKE '%b%' is a common logical relational database expression that we have long taken for granted as not an indexable operation in most
databases (not any other database to I can think of) until PostgreSQL 9.1, which can utilize trigram indices (the Lisp programmer behind the curtain) to make it fast.
There are 2 main enhancements happening with trigrams in PostgreSQL 9.1
both of which depesz has already touched on in FASTER LIKE/ILIKE
and KNNGIST. This means you can have an even faster trigram search than you ever
have had before and you can do it in such a fashion that doesn't require any PostgreSQL trigram specific syntactical expressions. So while PostgreSQL 9.1 might be understanding LIKE much like all the other databases
you work with, if you have a trigram index in place, it will just be doing it a little faster and sometimes a lot faster using the more clever PostgreSQL 9.1 planner.
This is one example of how you can use applications designed for many databases and still be able to utilize advanced features in
your database of choice. In this article we'll demonstrate.
For this example we'll use a table of 490,000 someodd records consisting of Massachusetts street segments and their names excerpted from TIGER 2010 data. You can
download the trimmed data set from here if you want to play along.
Continue reading "PostgreSQL 9.1 Trigrams teaching LIKE and ILIKE new tricks"
|