Friday, October 28. 2011
Printer Friendly
FOSS 4G 2011 Videos.
My resident PostGIS developer strk says he can't see them because blip.tv is using some sort of proprietary video swf format. I can't really tell what he is talking about. Does anyone know if fosslic videos are available in other formats like ogg or a Gnash swf viewer compatible format?
We mentioned in prior article Our FOSS4G 2011 the new Stuff and provided the slides in that article. Now we have the video to go with it.
Here is a partial list of PostGIS videos:
Continue reading "Many PostGIS FOSS4G 2011 videos have landed"
Thursday, October 27. 2011
Printer Friendly
One of the main features I love about PostgreSQL is its array support. This is a
feature you won't find in most relational databases, and even databases that support some variant
of it, don't allow you to use it as easily. It is one of the features that makes building
aggregate functions wicked easy in PostgreSQL with no messy compiling required.
Aside from building aggregate functions, it has some other common day uses.
In this article, I'll cover two common ways we use them which I will refer to as the ANY
and Contains tricks.
I like to think of this approach as YeSQL programming style: how SQL can be augmented by more complex data types and index retrieval mechanisms.
Arrays and many other data types (spatial types, keyvalue (hstore), ltree etc) are far from relational structures, yet we can query them easily with SQL and can even relate them.
Continue reading "PostgreSQL Array: The ANY and Contains trick"
Tuesday, October 18. 2011
Printer Friendly
One of the great lessons learned in building PostGIS extensions is my rediscovery of SED. SED turned out to be
mighty useful in this regard and I'll explain a bit in this article. Unfortunately there is still a lot I need to learn about it
to take full advantage of it and most of my use can be summed up as monkey see, monkey scratch head, monkey do. In addition I came across what I shall refer to as Pain points with using
the PostgreSQL Extension model. Part of which has a lot to do with the non-granular management of changes in PostGIS,
the day to day major flux of changes happening in PostGIS 2.0 space, and my attempt at trying to creat upgrade freeze points amidst these changes.
When PostGIS 2.0 finally arrives, the freeze points will be better defined and not change from day to day. So some of these issues
may not be that big of a deal.
Continue reading "Lessons learned Packaging PostGIS Extensions: Part 2"
Friday, October 14. 2011
Printer Friendly
It is with sadness that I learned of the passing of Dennis Ritchie - inventor of C and who made much of Unix, other operating systems, and many software (including Postgres) possible.
More details at Remembering Dennis Ritchie: Software Pioneer
and Dennis Ritchie, in Memoriam.
Dennis Ritchie co-authored the book, The C Programming Language, a classic,
which many of my peers grew up with. It was one of the textbooks at MIT for Civil Engineering 1.00 when we were attending when the course was essentially an introduction to programming with C.
The harsh irony is that when Steve Jobs passed away I was probably the only one around me who
felt no remorse and hoped the curve fanaticism Jobs fueled would die with him. When Dennis Ritchie passed
away I was probably one of the few around me who knew who he was and appreciated the great contributions
he made to the computer industry.
Wednesday, October 12. 2011
Printer Friendly
This is about improvements to GIST indexes that I hope to see in PostgreSQL 9.2. One is a patch for possible inclusion in PostgreSQL 9.2 called SP-GiST, Space-Partitioned GiST created by
Teodor Sigaev and Oleg Bartunov whose basic technique is described in SP-GiST: An Extensible Database Index for Supporting Space Partitioning Trees. For those who don't know Teodor and Oleg, they are the great fellows that brought us many other GiST and GIN goodnesses that many specialty PostgreSQL
extensions enjoy -- e.g. PostGIS, trigrams, ltree, pgsphere, hstore, full-text search to name a few.
Another is a recent one just committed by Alexander Korotkov which I just recently found out about on New node splitting algorithm for GIST and admit I don't know enough about to judge. I have to admit to being very clueless when it comes to the innards of index implementations so don't ask me any technical details. It's one of those short-comings among the trillion others I have that I have learned to accept will probably never change.
What the SP-GIST patch will provide in terms of performance and speed was outlined in
PGCon 2011: SP-GiST - a new indexing infrastructure for PostgreSQL
Space-Partitioning trees in PostgreSQL.
What it provides specifically for PostGIS is summarized in Paul's call for action noted below. As a passionate user of PostGIS
,ltree, tsearch, and hstore, I'm pretty excited about these patches and other GIST and general index enhancements and there potential use in GIST dependent extensions. I'm hoping to see
these spring to life in PostgreSQL 9.2 and think it will help to further push the envelope of where PostgreSQL can go as a defacto platform
for cutting-edge technology and scientific research. I think one of PostgreSQL's greatest strength is its extensible index API.
Paul's PostGIS newsgroup note about seeking funding for faster GIST indexes , work done so far on SP-GIST and call for further action is rebroadcast in it's entirety here.
Thanks to the sponsorship of Michigan Technological University, we now
have 50% of the work complete. There is a working patch at the
commitfest https://commitfest.postgresql.org/action/patch_view?id=631
which provides quad-tree and kd-tree indexes.
However, there is a problem: unless the patch is reviewed and goes
through more QA/QC, it'll never get into PostgreSQL proper. In case
you think I am kidding: we had a patch for KNN searching ready for the
9.0 release, but it wasn't reviewed in time, so we had to wait all the
way through the 9.1 cycle to get it.
I am looking for sponsors in the $5K to $10K range to complete this
work. If you use PostgreSQL in your business, this is a chance to add
a basic capability that may help you in all kinds of ways you don't
expect. We're talking about faster geospatial indexes here, but this
facility will also radically speed any partitioned space. (For
example, the suffix-tree, which can search through URLs incredibly
fast. Another example, you can use a suffix tree to very efficiently
index geohash strings. Interesting.)
If you think there's a possibility, please contact me and I will send
you a prospectus you can take to your manager. Let's make this happen
folks!
Paul
Continue reading "Improving speed of GIST indexes in PostgreSQL 9.2"
Thursday, October 06. 2011
Printer Friendly
In prior articles we talked about the new PostgreSQL 9.1 extension model
and upcoming PostGIS 2.0 extensions which we have experimental builds of so far.
In this article and other's to follow, we shall provide a documentary of our venture into this new extensions world. We'll discuss some of the obstacles we had with building
extensions, lessons learned, and foolishness exposed, with the hope that others can learn from our experience.
First off, the upcoming PostGIS 2.0 extensions will be packaged as at least two extensions -- postgis which will contain both PostGIS geometry/geography types, functions, meta views and tables as well as raster type and associated functions and tables. Topology support, while a part of upcoming PostGIS 2.0, will be packaged as a separate extension called postgis_topology. The main reason for breaking topology out as a separate extension is that it is always stored in a schema called topology and is not relocatable
to another schema. The way the current extension model works, all the parts of your extension should live in the same schema. Later we plan to package tiger geocoder as an extension, but this one probably makes more sense to live on http://pgxn.org/ since it is only of interest to United States users,
, is purely plpgsql with dependency on PostGIS, and we had beefed it up as part of a consulting contract for a company running PostGIS 1.5. It's the only piece documented in PostGIS 2.0 that works on 1.5 as well (aside from the tiger toplogy loader which has dependency on toplogy), although it has always lived as an extra in the PostGIS code base.
We'll probably package postgis_legacy_functions as an extension too for those people who badly need those 500 alias functions I chucked.
We mentioned in our prior article that we ran into some issues with how our extension worked -- e.g. topology referencing the postgis extension. Most of these turned out just to be ignorance on my part as to how the different pieces fit together and I'll elaborate on these.
Much of what will be described here is also documented in Packaging Related Objects into an Extension.
In the future I'm hoping we'll also see plr and pgrouting packaged as extensions which are common favorites of PostGIS users.
Continue reading "Lessons learned Packaging PostGIS Extensions: Part 1"
Wednesday, October 05. 2011
Printer Friendly
In most release notices, it's the big shiny sexy features that get all the glamor, but in reality on day to day use
it's the small usability enhancements that make the most difference. I'm reminded about this now that I'm working
on upgrade scripts and extensions for PostGIS. There are a couple of new features that make application upgrades easier that I
regret not having in older versions of PostgreSQL we support and additional ones I had in other databases that I find lacking in PostgreSQL. PostgreSQL 8.2 for example brought us DROP IF EXISTS ...
and all I can say is thank goodness we dropped support of prior versions of PostgreSQL in PostGIS 1.4 otherwise developing upgrade scripts would have been more of a nightmare.
PostgreSQL 8.4 introduced the ability to add additional columns to a view using CREATE OR REPLACE VIEW as
long as those columns were at the end of the view which Gabrielle Roth demonstrates an example of in This week’s find: CREATE OR REPLACE VIEW
If you were a MySQL user or application developer not having such features would be one reason to frown on PostgreSQL
and MySQL users and other database converts still have reasons to frown for lack of usability features they had
in their other database that they feel naked without in PostgreSQL.
In 9.1 we got two new DDL commands not much talked about that I am very excited about.
CREATE TABLE .. IF NOT EXISTS . I can't tell you how many times I've heard MySQL users whine about the lack of this in PostgreSQL
and I felt their pain. It would be really nice to have this feature for other things such as TYPES or even possibly a CREATE OR REPLACE TYPE which would allow
some alteration of types like adding attributes at the end.
- ALTER TYPE ..[ADD ATTRIBUTE] [RENAME ATTRIBUTE] [ADD VALUE].
The [ADD VALUE] is a clause specific to ENUM types which allows you to add new enum values before or after an existing. The lack of that feature
in prior versions was the major reason I stayed away from enums.
- And of cause my favorite CREATE EXTENSTION ALTER EXTENSION family which admittedly do get talked about a lot more often and which I'll discuss more in a later
article.
.
I know it sounds like I'm complaining. That's because I am. Honestly though, I think the first step to caring about something is really taking notice of its
flaws and wanting to change them. The strength of an open source project is the ease with which it allows its developers and users to have a great impact on its direction. This is something I do think PostgreSQL excels much much better than most open source projects. I find a ton of flaws in PostGIS I'd like to change and have and I am greatful that PostGIS, like PostgreSQL is not resistant to change if the community wants it. If you are going to take notice of flaws in other products without admitting to your own or admitting that some things are easier in other products and learning from them, then you are a hypocrite or living in a closet. Now getting back to my complaining. Things I miss in PostgreSQL that I had in others which I'm sure I'm not alone.
- Being able to change a table column type of a table column that is used in a VIEW and have PostgreSQL just correct the type in the view
or allow me the option to change it later. This is something we had in SQL Server which Leo whines about often. Actually Leo's whining is more annoying than
the actual problem itself. The notice is at least very descriptive which is more than I can say for other databases.
- Being able to reorder columns in a table. Again something fairly trivial to do in SQL Server and MySQL but not possible in PostgreSQL.
Thursday, September 29. 2011
Printer Friendly
Lots of people have been asking the never ending question of when PostGIS is going to get on the band wagon and support KNN GIST like other GIST based types trigrams, full text search etc. Well it's happened in PostGIS 2.0 and now committed. More of the gory details at Indexed Nearest Neighbour Search in PostGIS.
In short this will make point / point distance searches and rankings way way faster and help also with other distance searches by providing approximations to start with.
We are still preparing the PostgreSQL 9.1 2.0 32-bit windows builds that will have this functionality and should have that ready in the next day or so.
To summarize what you can expect. We spent a lot of time discussing and were torn between a box distance operator <#> and a centroid box distance operator <->, so we ended up having both. The reason being is that for some kinds of geometries e.g. streets that aren't diagonal a box distance operator seems to be a much better approximation of distance than a centroid box distance operator. For points of course the two concepts are the same and not an approximation so point / point distance you'd be better off using the new KNN sorting than ST_Distance + ST_DWithin as we have suggested in past. I'll be doing some benchmarks in the coming weeks comparing the old way and speed differences you can expect and perhaps throwing together box and centroid cocktails that combine the two weapons into thought provoking WTFs (or as Dave Fetter would say "That's very Rube Goldberg of you").
I suspect I'll probably be sticking with <#> because I like the symbol better and I was one of the ones fighting for it :).
Monday, September 26. 2011
Printer Friendly
UPDATE Turns out there is a simpler way of getting rid of roles that have explicit permissions to objects as Tom Lane pointed out in the comments.
DROP OWNED BY some_role;
Will drop the permissions to objects a user has rights to even if they don't own the objects. Of course this needs to be applied with caution since it will drop tables
and other things you don't want necessarily dropped. So it is best to first run a:
REASSIGN OWNED BY some_role TO new_role;
And then run the DROP OWNED BY.
The REASSIGN OWNED BY which is what we did originally is not sufficient since it doesn't drop the permissions or reassign
them as we assumed it would. This is noted in the docs.
And then you will be allowed to
DROP ROLE some_role
One of the things that is still tricky in PostgreSQL is permission management. Even though 9.0 brought us default privileges and the like, these permissions aren't
retroactive so still a pain to deal with if you already have objects defined in your database.
One of the annoyances we come across with is deleting roles. Lets say you have a role and it has explicit permissions to an object.
PostgreSQL won't allow you to delete this role if it owns objects or has explicit permissions to objects. In order to delete it seems you have
to go in and clear out all those permissions. To help with that -- we wrote a quickie script that will generate a script to revoke all permissions on objects
for a specific role. It looks like this:
Continue reading "Bulk Revoke of Permissions for Specific Group/User role"
Sunday, September 18. 2011
Printer Friendly
We attended FOSS4G this year in Denver, Colorado. Friday was a PostGIS bonanza with 5 PostGIS talks back to back including ours.
The crowd was huge. All the PostGIS talks as I recall were so packed that there were not enough seats to accommodate everyone. A more comprehensive
detail of the events is described on OpenGeo FOSS4G Day #5
We admit to overstuffing our slides with SQL and ran short on time at the end. Leo complained and vowed to do a better job next time.
We really weren't expecting such a large crowd. Admittedly I'm all for the after conference experience which is much longer than the conference which is why I tend to make slides that are very dense. WARNING: The following slides feature SQL doing unconventional things suitable only for mature audiences. Viewer discretion is adviced.. You can check out our slides here PostGIS 2.0 the new stuff.
Continue reading "FOSS4G 2011 PostGIS the new stuf"
Sunday, September 04. 2011
Printer Friendly
We've been working on converting some of our SQL Server apps to PostgreSQL. In this
article we'll describe some things to watch out for and provide a function we wrote to automate some of
the conversion.
Although both databases are fairly ANSI-SQL compliant, there are still differences
with their CREATE TABLE statements, data types, and how they handle other things that makes porting applications not so trivial.
Continue reading "SQL Server to PostgreSQL: Converting table structure"
Friday, August 26. 2011
Printer Friendly
A while back in New Additions and Promotions in PostGIS Development Team, I mentioned that the new addition to our team Bborie Park was working on image output functions for raster support, among other things. His last addition was ST_AsRaster which allows a PostGIS geometry to cross the line to the raster world, all in the database. This new addition almost completes the basic cycle of making PostGIS not only a spatial analytical tool, but also a rendering engine.
To test out these new functions, I whipped up a quick ASP.NET/JQuery app as described in Minimalist Web-based ASP.NET PostGIS 2.0 Spatial Geometry/Raster Viewer, and Bborie followed up with the PHP version which you can download from http://www.postgis.us/downloads/postgis_webviewer_php.zip.
There is still much room for improvement, e.g. intersection of 2 rasters, faster response, etc, but I can see all the lights flickering and the connections coming together like a self-orchestrating organism. From chaos comes order.
Wednesday, August 24. 2011
Printer Friendly
The Free and Open Source Software for Geospatial (FOSS4G) is schedule for September 12-16, 2011 in Denver, CO. PostGIS is going
to be making a big showing at this event. Paul Ramsey's popular Introduction to PostGIS workshop is already sold out. Check out the schedule of other PostGIS related talks FOSS4G 2011 PostGIS related talks.
We'll be presenting on
Friday PostGIS 2.0, the new stuff and showcasing some of the new features in upcoming PostGIS 2.0. In fact Friday seems to be a day jam packed with PostGIS talks back to back in the Windows room. We probably won't even have to leave the room to get our fill of PostGIS.
I'm particularly looking forward to Steven Singer's PostGIS replication talk
and Jim Mlodgenski's Scaling PostGIS Queries with Stado since these
are becoming critical areas as we take on larger and more complex work.
Sunday, August 14. 2011
Printer Friendly
One of the new features I'm excited about in upcoming PostgreSQL 9.1 are extensions. It is also my hope
that for PostGIS 2.0, we'll be able to package PostGIS 2.0 as an extension.
Reinspired in my mission by David Wheeler's recent post and video on Building and Distributing Extensions without C, I decided to take some time to investigate how all the extension pieces fit together.
The three things I like most about extensions
are: - It has a companion sql CREATE EXTENSION and catalog of what's installed and available right from the db, which makes installing/uninstalling relatively painless
- Installed functions don't get backed up with data, which is really a bit of a nightmare for PostGIS folks and relief be much welcome as you can tell in Paul's PostGIS backup and restore
which gets a bit more of an adventure in PostGIS 2.0 now that we have raster and topology and many more fun fun GEOS functions.
- The ease with which you can uninstall,migrate your extension to another schema, and upgrade (in theory). There will be caveats here of course
with changes that require on disk format and involve data.
Of course the ease is all in the thoughtfulness of the packaging. To get some ideas of how we would go about packaging
PostGIS 2.0 as an extension (it could very well be 3 extensions if we decide to package the core postgis, raster, and topology (and even tiger geocoder) as separate extensions), I thought I would take a look at how others have packaged theirs, and how one goes about registering one of these packages
to make it available in CREATE EXTENSION.
Figuring out the extensions you have available ready to install
First I decided to start by doing a little snooping, by applying some lessons from our previous article Querying table, view, column and function descriptions
I wrote this query to figure out what useful functions are available to learn about extensions.
SELECT p.proname AS funcname, d.description
FROM pg_proc p
INNER JOIN pg_namespace n ON n.oid = p.pronamespace
LEFT JOIN pg_description As d ON (d.objoid = p.oid )
WHERE n.nspname = 'pg_catalog'
AND (d.description ILIKE '%extension%' or p.proname ILIKE '%extension%')
ORDER BY n.nspname, p.proname ;
-- which output this --
funcname | description
---------------------------------+-----------------------------------------------------
pg_available_extension_versions | list available extension versions
pg_available_extensions | list available extensions
pg_extension_config_dump | flag an extension's table contents to be emitted by pg_dump
pg_extension_update_paths | list an extension's version update paths
Continue reading "PostgreSQL 9.1 Exploring Extensions"
Sunday, July 31. 2011
Printer Friendly
One of the biggest complaints about software and perhaps even more specifically FLOSS software is the lack of documentation. I'm not talking about those
small little dialogs we throw in our code that rarely see the light of day. I'm talking about stuff you throw in user docs or specifications and so forth
that an analyst or end-user reads.
The main reasons for this lack of documentation is that while everyone seems to want documentation, few really want to pay for it, and it's time consuming to keep
documentation in synch with what the software actually does. Having documentation that is very inconsistent with the software is worse than not having any documentation at all.
A good way to keep documentation up to date is to fold it into the process of developing and testing the software such that changes in software expose errors in the documentation and vice-versa
or that the act of changing the software corrects the documentation.
Part of the way we try to do that on the PostGIS project is to require each function patch to include documentation. We also try to make our documentation executable
by auto-generating PostgreSQL help and
some test suites from the documentation. For example the process of installing documentation-generated function and type descriptions
in PostgreSQL catches inconsistencies between
the implementation and what we actually publish about how the functions work. Even the image generator that builds most of the images in the manual is built with PostGIS as a core of it
so that when our build-bot is rebuilding the documentation it is exercising some PostGIS output functions.
I can't tell you how many times I've flagged changes in the current PostGIS 2.0 release
simply by trying to install the docbook generated PostgreSQL comment descriptions and PostgreSQL complains that the function signature no longer exists that the documentations says should be there.
So then I go back and correct
the documentation or yell at the programmer if the documentation makes more sense than what they coded. On the other side,
its easy to catch what we neglected to document simply by scanning the functions in PostgreSQL and seeing which ones don't have descriptions.
Okay as far as databases goes, many a DB Programmer/Analyst has waltzed into a project only to be befuddled about the meanings of all these tables used by the application.
If you are like us, you don't even want to look at any database documentation that is not part of the definition of the database because you know 99% of the time it's so obsolete
or was just a pipe dream of someone working in a vacuum
that its more useless than not having any
documentation at all. It is nice to have nicely formatted documentation you can read separate from the code,
but even sweeter if you can query the documentation just as easily as you can query the data.
A good way of having up to date documentation is to weave it in as part of the process of defining the structure. For databases
this means using foreign keys, primary keys, and using the commenting features that most relational databases offer these days. For example the databases we commonly work with,
PostgreSQL, SQL Server, MS Access, and even MySQL all allow you to provide descriptions for tables, table columns, and sometimes other objects such as functions and stored procs right in the database. PostgreSQL even allows you to provide descriptions of columns in
views though that's a bit messier to do. Sadly there isn't a consistent way of pulling these descriptions out of the database that will work for all of these. Each has differently defined meta tables it stores these descriptions in.
For thise article, we'll demonstrate how to pull this information from PostgreSQL.
PostgreSQL does offer many useful switches in psql for querying this data, but we'll focus our attention on pulling this data via SQL. It's much easier to incorporate this information
in auto-generated documentation with SQL because you can have more control what you include and how to format it.
Continue reading "Querying table, view, column and function descriptions"
|