Sunday, July 01. 2012
Printer Friendly
In last article Finding Contiguous primary keys we detailed one of many ways of finding continuous ranges in data, but the approach would only work on higher-end dbs like Oracle 11G, SQL Server 2012, and PostgreSQL 8.4+. Oracle you'd have to replace the EXCEPT I think with MINUS. It wouldn't work on lower Oracle because of use of CTEs. It wouldn't work on lower SQL Server because it uses window LEAD function which wasn't introduced into SQL Server until SQL Server 2012. Someone on reddit provided a Microsoft SQL Server implementation which we found particularly interesting because - it's a bit shorter and it's more cross-platform. You can make it work with minor tweaks on any version of PostgreSQL, MySQL, SQL Server and even MS Access. The only downside I see with this approach is that it uses correlated subqueries which tend to be slower than window functions. I was curious which one would be faster, and to my surprise, this version beats the window one we described in the prior article. It's in fact a bit embarrassing how well this one performs. This one finished in 462 ms on this dataset and the prior one we proposed took 11seconds on this dataset. Without further ado. To test with we created a table:
CREATE TABLE s(n int primary key);
INSERT INTO s(n)
SELECT n
FROM generate_series(1,100000) As n
WHERE n % 200 != 0;
Continue reading "Contiguous Ranges of primary keys: a more cross platform and faster approach"
Saturday, April 07. 2012
Printer Friendly
One of the things people have complained about for quite some time is that postgis is installed in the public schema by default and it's difficult to move after the fact. With now over 900 functions types, etc, in the 2.0.0 release that is a lot of cluttering of workspace. Now that postgis 2.0.0 is packaged as an extension, you can move all those functions etc. to another schema with the ALTER EXTENSION command. PgAdmin even throws a nice GUI on top to allow you to do this with some mouse maneuvering if you prefer the guided way. This might very well be my most favorite usability feature, because if things don't work out you can just move it back to public. I've been hesitant to do this before because well it was harder and I have a lot of 3rd party apps I work with and fear one of them hard-coded public.geometry somewhere. With extensions I can easily revert if it doesn't work out.
I've done this with some of my databases and been testing out how it works. So far so good. Here is how you do it.
CREATE SCHEMA postgis;
ALTER DATABASE your_db_goes_here SET search_path="$user", public, postgis,topology;
GRANT ALL ON SCHEMA postgis TO public;
ALTER EXTENSION postgis SET SCHEMA postgis;
On a somewhat unrelated side note aside from the fact it has to do with postgis not being in same schema as geometry table is someone mentioned in PostGIS newsgroup recently that is an issue if you are using conditional triggers. That is that if you have a conditional when trigger it can't find the geometry when you restore the database because of the way the restore process changes search_path.
I'm expecting the extension model to significantly simplify PostGIS upgrades in the future, because since the functions don't get backed up, they don't get in the way when you do a hard upgrade. Hard upgrade will simply reduce to just restoring your database.
Tuesday, February 14. 2012
Printer Friendly
Have you ever tried to output a file stored in a bytea column from PostgreSQL using PSQL? It ain't pretty. Andrew Dunstan
has a couple of examples of doing this in his article Clever Trick Challenge. This issue has now become very relevant for PostGIS raster use. The reason being is that now that PostGIS in it's 2.0 encarnation can do clipping of rasters, unions of rasters,
map algebra, convert geometries to rasters and convert those rasters to various imagery type file formats, people naturally assume it should be trivial
to output your new fangled raster query as a jpeg or png etc via, PSQL. Turns out it is not. For those who need to, I've documented an example using
one suggestion from Andrew's blog which utilizes the Large object support built into PostgreSQL. It would be nice if this were simpler.
I chose this approach because it was the only one that didn't assume a user has Perl installed or is on a Unix system so should work fairly much the same regardless the client OS in use. You can find the example in Outputting Rasters with PSQL. Suggestions on how to improve on this are welcome.
Friday, November 11. 2011
Printer Friendly
One of my favorite tools and I think that of many folks working with GIS and other kinds of Multimedia is the GDAL/OGR suite.
Though I started using it to conquer GIS ETL activities, I found myself using it for problems that are inherently not GIS at all. I talked
about the GDAL OGR2OGR component a while ago in GDAL OGR2OGR for Data Loading
and this time I'll talk tangentially about its raster capabilities. It is a fantastic tool for converting between various raster formats and applying various raster operations.
In PostGIS world the new 2.0 raster functionality puts an SQL wrapper around much of its power. I'm not going to talk about that though except as a fleeting comment to explore later (we've got cool 2 band Map Algebra in PostGIS 2.0 to flaunt its stuff).
So what does this have to do with XPathing XML data with PostgreSQL? Well that's what I'm going to talk about what to do with machine generated data that comes at you in XML format.
A lot of machine generated data is hitting us in an XML like form. I talked about GPX data and navigating that in Which by the way GDAL/OGR can load and export easily into/out of a PostGIS enabled database.
GDAL exposes another kind of machine generated data in XML format which turns out to be very useful for all kinds of things. This is Exchangeable image file format (EXIF) data. There are all kinds
of random text information embedded in pictures and this varies depending on what camera is taking it. Newer cameras like the ones you have built into your iphone or android
embed location based information into them sometimes like where you were standing when you took the picture. Most cameras these days embed the time the picture was taken.
This information is important because if you are taking electronic notes while you are snapping your pictures, it provides an easy way to match up your notes with the picture about the object. So what does this EXIF info look like when you point GDAL at it? We'll see.
Continue reading "XPathing XML data with PostgreSQL"
Wednesday, October 05. 2011
Printer Friendly
In most release notices, it's the big shiny sexy features that get all the glamor, but in reality on day to day use
it's the small usability enhancements that make the most difference. I'm reminded about this now that I'm working
on upgrade scripts and extensions for PostGIS. There are a couple of new features that make application upgrades easier that I
regret not having in older versions of PostgreSQL we support and additional ones I had in other databases that I find lacking in PostgreSQL. PostgreSQL 8.2 for example brought us DROP IF EXISTS ...
and all I can say is thank goodness we dropped support of prior versions of PostgreSQL in PostGIS 1.4 otherwise developing upgrade scripts would have been more of a nightmare.
PostgreSQL 8.4 introduced the ability to add additional columns to a view using CREATE OR REPLACE VIEW as
long as those columns were at the end of the view which Gabrielle Roth demonstrates an example of in This week’s find: CREATE OR REPLACE VIEW
If you were a MySQL user or application developer not having such features would be one reason to frown on PostgreSQL
and MySQL users and other database converts still have reasons to frown for lack of usability features they had
in their other database that they feel naked without in PostgreSQL.
In 9.1 we got two new DDL commands not much talked about that I am very excited about.
CREATE TABLE .. IF NOT EXISTS . I can't tell you how many times I've heard MySQL users whine about the lack of this in PostgreSQL
and I felt their pain. It would be really nice to have this feature for other things such as TYPES or even possibly a CREATE OR REPLACE TYPE which would allow
some alteration of types like adding attributes at the end.
- ALTER TYPE ..[ADD ATTRIBUTE] [RENAME ATTRIBUTE] [ADD VALUE].
The [ADD VALUE] is a clause specific to ENUM types which allows you to add new enum values before or after an existing. The lack of that feature
in prior versions was the major reason I stayed away from enums.
- And of cause my favorite CREATE EXTENSTION ALTER EXTENSION family which admittedly do get talked about a lot more often and which I'll discuss more in a later
article.
.
I know it sounds like I'm complaining. That's because I am. Honestly though, I think the first step to caring about something is really taking notice of its
flaws and wanting to change them. The strength of an open source project is the ease with which it allows its developers and users to have a great impact on its direction. This is something I do think PostgreSQL excels much much better than most open source projects. I find a ton of flaws in PostGIS I'd like to change and have and I am greatful that PostGIS, like PostgreSQL is not resistant to change if the community wants it. If you are going to take notice of flaws in other products without admitting to your own or admitting that some things are easier in other products and learning from them, then you are a hypocrite or living in a closet. Now getting back to my complaining. Things I miss in PostgreSQL that I had in others which I'm sure I'm not alone.
- Being able to change a table column type of a table column that is used in a VIEW and have PostgreSQL just correct the type in the view
or allow me the option to change it later. This is something we had in SQL Server which Leo whines about often. Actually Leo's whining is more annoying than
the actual problem itself. The notice is at least very descriptive which is more than I can say for other databases.
- Being able to reorder columns in a table. Again something fairly trivial to do in SQL Server and MySQL but not possible in PostgreSQL.
Sunday, September 04. 2011
Printer Friendly
We've been working on converting some of our SQL Server apps to PostgreSQL. In this
article we'll describe some things to watch out for and provide a function we wrote to automate some of
the conversion.
Although both databases are fairly ANSI-SQL compliant, there are still differences
with their CREATE TABLE statements, data types, and how they handle other things that makes porting applications not so trivial.
Continue reading "SQL Server to PostgreSQL: Converting table structure"
Monday, June 27. 2011
Printer Friendly
I'm one of those old-fashioned folks that debugs with print lines and raise notices. They're nice.
They always work, you can put clock time stops in there and don't require any fancy configuration.
At a certain point you do have to pull out a real debugger to see what is going on. This often
happens when your one-liners are no longer good enough and now you have to write 20 liners of plpgsql code.
Such is the case with geocoding and the PostGIS tiger geocoder specifically. Lots of interest has revived
on that with people submitting bug reports and we've got paying clients in need of a fairly easy and speedy drop-in geocoder
that can be molded to handle such things as road way locations, badly mis-spelled real estate data, or just simply
to get rid of their dependency on Google, Yahoo, MapQuest, ESRI and other online or pricey geocoding tools.
So I thought I'd take this opportunity to supplement our old-fashioned debugging with plpgsqldebugger goodness.
In this article, we'll show you how to configure the plpgsql debugger integrated in PgAdmin and run with it.
Continue reading "Using PgAdmin PLPgSQL Debugger"
Monday, June 06. 2011
Printer Friendly
There once existed programmers who were asked to explain this snippet of code: 1 + 2
- The C programmer explained "It's a common mathematical expression."
- The C++, Java, C# and other impure object-oriented programmers said "We concur. It's a common mathematical expression."
- The Smalltalk programmer explained "1 adds 2."
- The Lisp programmer stood up, a bit in disgust, and said, "No no! You are doing it all wrong!"
The Lisp Programmer then pulled out
a Polish calculator, punched in + 1 2
,and with a very serious face, explained "+ should be pushing those other two around."
I find this episode interesting because while the Lisp programmer I feel is more right, the Smalltalk programmer has managed to follow the rest of the crowd and still stick
to her core principle. This brings us to what does this have to do with trigrams
in PostgreSQL 9.1. Well just like 1 + 2 being a common mathematical expression, abc LIKE '%b%' is a common logical relational database expression that we have long taken for granted as not an indexable operation in most
databases (not any other database to I can think of) until PostgreSQL 9.1, which can utilize trigram indices (the Lisp programmer behind the curtain) to make it fast.
There are 2 main enhancements happening with trigrams in PostgreSQL 9.1
both of which depesz has already touched on in FASTER LIKE/ILIKE
and KNNGIST. This means you can have an even faster trigram search than you ever
have had before and you can do it in such a fashion that doesn't require any PostgreSQL trigram specific syntactical expressions. So while PostgreSQL 9.1 might be understanding LIKE much like all the other databases
you work with, if you have a trigram index in place, it will just be doing it a little faster and sometimes a lot faster using the more clever PostgreSQL 9.1 planner.
This is one example of how you can use applications designed for many databases and still be able to utilize advanced features in
your database of choice. In this article we'll demonstrate.
For this example we'll use a table of 490,000 someodd records consisting of Massachusetts street segments and their names excerpted from TIGER 2010 data. You can
download the trimmed data set from here if you want to play along.
Continue reading "PostgreSQL 9.1 Trigrams teaching LIKE and ILIKE new tricks"
Friday, May 06. 2011
Printer Friendly
We've started to play with PostgreSQL 9.1beta and the PgAdmin III 1.14.0 Beta 1. We'll briefly go over the cool gems
found in PgAdmin III beta 1. Most of the new features are for navigating the upcoming PostgreSQL 9.1. Well first obstacle we ran into
was we can't get our favorite extension, PostGIS, to compile against PostgreSQL 9.1beta though it did with the alphas, so you won't be seeing any windows experimental builds until we resolve this issue.
Details of ticket here? PostGIS 2.0 won't compile for PostgreSQL 9.1 beta1
Despite that minor set back, we decided to push on and navigate the new features by using PgAdmin III 1.14.0 as our Tour Guide. Below is a list of new features you can experience
via PgAdmin III 1.14.0 Beta 1. I'm sure there are more we missed, but these are the ones that were most flashing.
Continue reading "Navigating PostgreSQL 9.1 beta 1 with PgAdmin III 1.14.0 Beta 1"
Saturday, April 30. 2011
Printer Friendly
We like to enforce business rules at the database level wherever
we can, for the simple reason, particularly the business we are in, most database update happens
outside the end-user application layer.
That is not to say you shouldn't enforce at the application level too, but that the database is the last
line of defense, is usually more self-documenting than application code can be, and also protects you from your
programmers, even when that your programmers is you.
Domains are objects that you will find in many high-end
standards-compliant databases. They exist in SQL Server, Oracle, IBM Db2, Firebird, and PostgreSQL to name a few.
Domains have existed for a really long time in PostgreSQL. In PostGIS topology, Sandro Santilli (usually known as strk), takes advantage of them for fleshing out the topology support, and I got turned on to them by him.
With that said - let's dive into domains.
What are domains?
Domains are essentially a reusable packaging of check constraints. You use them as if they were a custom data type.
The nice thing about them is that they are usually transparent to applications that
don't understand them.
Example 1: Enforce pay ending/pay day happens only on certain days of the week
Here is an example -- suppose you had a payment system, and you had a rule that the pay thru end date has to
fall on a Friday. You could create a domain such as the following:
CREATE DOMAIN dom_payday
AS date
CONSTRAINT check_dow CHECK (trim(to_char(VALUE, 'day')) = 'friday');
COMMENT ON DOMAIN dom_payday IS 'Company payday rules';
Continue reading "Using Domains to Enforce Business Rules"
Thursday, April 21. 2011
Printer Friendly
In the past I have always chosen to compile my own PostGIS because the GEOS library that came with the regular PostgreSQL yum install, was a bit antiquated.
This has changed, so this time around I figured I'd give it a go at using the Yum repository 1.5.2 release of PostGIS available via Yum Package List.
Before you can follow these directions, make sure you have your PostgreSQL 9.0 setup via our An almost idiot's guide to Install PostgreSQL 9.0 with Yum.
PostGIS in Action has started shipping from Amazon and we already have 3 positive reviews. We are hoping to write another book sometime soon, but haven't decided yet on the topic. Will definitely have something to do with databases and probably a lot of PostgreSQL in it.
Continue reading "An Almost Idiot's Guide to Installing PostGIS 1.5 on PostgreSQL 9.0 via Yum"
Monday, April 18. 2011
Printer Friendly
In this exercise, we'll go thru installing PostgreSQL 9.0 on a Cent OS 5.5 32-bit box. This I'm testing on a GoGrid Cloud server so I can do parallel benchmarks between my windows GoGrid
and Linux GoGrid server.
Upgrading from PostgreSQL 8.* to PostgreSQL 9.0
If you are upgrading from a PostgreSQL 8.4 to PostgreSQL 9.0, please refer to Devrim's article:
Upgrading from 8.4 to 9.0 on Fedora / Red Hat / CentOS using RPMs.
For the rest of this article, we'll go over configuring your yum to use the PostgreSQL PGDG Yum repository managed by Devrim Gunduz, which has the latest and greatest of
9.0 as well as the 9.1 latest development release. We'll also demonstrate how to have two instances of PostgreSQL running so you can experiment with the new features of
PostgreSQL 9.1 while reminiscing about the soon to be old-hat features of PostgreSQL 9.0.
Continue reading "An almost idiot's guide to Install PostgreSQL 9.0 with Yum"
Monday, February 21. 2011
Printer Friendly
We were setting up another SQL Server 2005 64-bit where we needed a linked server connection to our PostgreSQL 9.0 server. This is something we've done before so not new and something we documented in
Setting up PostgreSQL as a Linked Server in Microsoft SQL Server 64-bit.
What was different this time is that we decided to use the latest version of the new PostgreSQL 64-bit drivers now available main PostgreSQL site http://www.postgresql.org/ftp/odbc/versions/msi/.
Sadly these did not work for us. They seemed to work fine in our MS Access 2010 64-bit install, but when used via SQL Server, SQL Server would choke with a message:
Msg 7350, Level 16, State 2, Line 1 Cannot get the column information from OLE DB provider "MSDASQL"
If you tried to do a query with them. You can however see all the tables via the linked server tab.
Continue reading "SQL Server 64-bit Linked Server woes"
Friday, October 29. 2010
Printer Friendly
pgAdmin has this feature called a pgScript. Its a very simple scripting language for running
tasks in a pgAdmin SQL window. The documentation is PgScript manual.
Why would you use it over say writing a plpgsql function?
One main reason we use it is to run quick ad-hoc batch jobs such as geocoding addresses and so forth. The
benefit it has over running a stored function is that you don't have to run it as a single transaction.
This is important for certain kinds of tasks where you just want to run something in a loop and have each loop commit
separately. To us the syntax with the @ resembles SQL Server Transact-SQL more than it does any PostgreSQL language. WhenI first
saw pgScript I thought Wow PgAdmin talks Transact-SQL; now -- what will they think of next :).
Continue reading "pgAdmin pgScript"
Monday, October 11. 2010
Printer Friendly
One of the neat changes already present in the PgAdmin III 1.13dev, is the change in plugin architecture. Version 1.13 dev allows for multiple plugin*.ini files. How does this work.
Well if you have a plugins.d folder in your PgAdmin III version folder, it will read all the inis in that folder and load them as plugins.
Recall in PgAdmin III Plug-in Registration: PostGIS Shapefile and DBF Loader,
we demonstrated how to load the PostGIS shapefile and dbf loader as a plugin in PgAdmin III, well this time we will demonstrate how to do it using PgAdmin version 1.13. Better yet, we'll show you the new and improved
PgAdmin III Shapefile and DBF Loader in the works for PostGIS 2.0.
Continue reading "PgAdmin III 1.13 - change in plugin architecture and PostGIS Plugins"
|