Wednesday, May 09. 2012
Last time we demonstrated how to query delimited text files using the fdw_file that comes pacakged with PostgreSQL 9.1+, this time we'll continue our journey into Flat file querying Foreign Data Wrapper using an experimental foreign data wrapper designed for also querying delimited data, but outputting it as a single column text array table.
This one is called file_textarray_fdw and developed by Andrew Dunstan. It's useful if you are dealing with for example jagged files, where not all columns are not properly filled in for each record or there are just a ton of columns you don't want to bother itemizing before you bring in. The benefit is you can still query and decide how you want to break it apart. You can grab the source code from file_text_array_fdw source code. If you are on windows, we have compiled binaries in our Bag o' FDWs for both PostgreSQL 9.1 32-bit FDW for Windows bag and PostgreSQL 9.1 64-bit FDW for Windows bag that should work fine with the EDB installed windows binaries.
For other systems, the compile is fairly easy if you have the postgresql development libraries installed.
Continue reading "File FDW Family: Part 2 file_textarray_fdw Foreign Data Wrapper"
Thursday, May 03. 2012
Last time we demonstrated how to use the ODBC Foreign Data wrapper, this time we'll continue our journey into Foreign Data Wrapper land by demonstrating what I'll call the File FDW family of Foreign Data Wrappers. There is one that usually comes packaged with PostgreSQL 9.1 which is called fdw_file but there are two other experimental ones I find very useful which are
developed by Andrew Dunstan both of which Andrew demoed in PostgreSQL Foreign Data Wrappers and talked
about a little bit Text files from a remote source. As people who have to deal with text data files day in and out, especially ones from mainframes, these satisfy a certain itch.
- file_fdw - for querying delimited text files.
- file_fixed_length_fdw - this one deals with fixed length data. We discussed methods of importing fixed length data in Import Fixed width data. This is yet another approach but has the benefit that you can also use it to import just a subset of a file.
- file_text_array_fdw - this one queries a delimited file as if each delimiete row was a text array. It is ideal for those less than perfect moments when someone gives you a file with a 1000 columns and you don't have patience to look at what the hell those columns mean just yet.
In this article, we'll just cover the file_fdw one, but will follow up in subsequent articles, demonstrating the array and fixed length record ones.
Continue reading "File FDW Family: Part 1 file_fdw"
Sunday, April 15. 2012
As promised in our prior article: ODBC Foreign Data wrapper on windows, we'll demonstrate how to query SQL Server using the Foreign Data Wrapper. This we are testing on windows.
As far as querying SQL Server / PostgreSQL goes, the Foreign Data Wrapper still lacks many features that the SQL Server Linked Server approach provides.
The key ones we find currently lacking: ability to do updates and reference a table directly from server without knowing underlying structure. That said
the Foreign data Wrapper approach has possiblity to support a lot more data sources with ease. We'll demonstrate in subsequent articles using the www_fdw to query
web services which we've been playing a lot with and the often packaged in file_fdw. Enough of that let's start with a concrete example.
Warning, this is not production ready, but seems like a very promising start and with more testing can become very robust. Although we are demonstrating odbc_fdw on windows,
it is supported on Unix via the UnixODBC, but the data sources you can query will probably be different.
I'm really looking forward to how the FDW technology in PostgreSQL will push the envelop. I've been playing around with the www_fdw as well and been impressed how easily it is to
query webservices with SQL. A very ah-hah moment.
Continue reading "ODBC Foreign Data wrapper to query SQL Server on Window - Part 2"
Saturday, April 07. 2012
One of the things people have complained about for quite some time is that postgis is installed in the public schema by default and it's difficult to move after the fact. With now over 900 functions types, etc, in the 2.0.0 release that is a lot of cluttering of workspace. Now that postgis 2.0.0 is packaged as an extension, you can move all those functions etc. to another schema with the ALTER EXTENSION command. PgAdmin even throws a nice GUI on top to allow you to do this with some mouse maneuvering if you prefer the guided way. This might very well be my most favorite usability feature, because if things don't work out you can just move it back to public. I've been hesitant to do this before because well it was harder and I have a lot of 3rd party apps I work with and fear one of them hard-coded public.geometry somewhere. With extensions I can easily revert if it doesn't work out.
I've done this with some of my databases and been testing out how it works. So far so good. Here is how you do it.
CREATE SCHEMA postgis;
ALTER DATABASE your_db_goes_here SET search_path="$user", public, postgis,topology;
GRANT ALL ON SCHEMA postgis TO public;
ALTER EXTENSION postgis SET SCHEMA postgis;
On a somewhat unrelated side note aside from the fact it has to do with postgis not being in same schema as geometry table is someone mentioned in PostGIS newsgroup recently that is an issue if you are using conditional triggers. That is that if you have a conditional when trigger it can't find the geometry when you restore the database because of the way the restore process changes search_path.
I'm expecting the extension model to significantly simplify PostGIS upgrades in the future, because since the functions don't get backed up, they don't get in the way when you do a hard upgrade. Hard upgrade will simply reduce to just restoring your database.
Tuesday, April 03. 2012
Yap that's right. PostGIS 2.0.0 is finally out the door. It took us Two years and 2 months, a super long incubation for us, but we did it and just in time for Javier's Where 2.0 2.0 Talk.. Paul has some border-line R rated pictures of the birthing process.
We have windows 32 binaries posted for those adventurous enough to taste the cookies while they are hot. We are working on the windows 64-bit binaries. Those should be out tomorrow. We'll be working in the coming week to get the installers ready to put up so they are available via Stack Builder. We'll probably put up the 32-bit ones first, hopefully followed shortly by the 64-bit ones. You should see PostGIS 2.0.0 soon on Yum as well. Devrim is cooking :).
Tuesday, March 27. 2012
One of the new features in PostgreSQL 9.1 that we've been meaning to try is the new foreign data wrapper support.
Now that we are in compile mode gearing up for releasing PostGIS 2.0.0 for windows (both 32 and 64-bit), we thought we'd give the odbc_fdw a try trying to compile on windows. Last we tried we weren't successful because
we couldn't get past the -lodbc required step.
It turns out there is an easy fix to the ODBC dependency issue and I'm not sure I changed the line right. In the makefile we changed -lodbc to -lodbc32. This was needed for both compiling 32-bit as well as the 64-bit. We compiled the 64-bit version under our Mingw-64 chain
and 32-bit under our old Mingw gcc 3.4.5. Sadly we still don't have our mingw64 (compile for windows 32-bit compile up yet). Our ming64 for windows 32 can compile the 9.2 development branch but not the 9.1.3. Go figure. Anyrate to make a long story short -- we have 32-bit binaries for PostgreSQL (you can use in VC++ builds) and 64-bit binaries as well that you can use for the VC++ EDB builds for those who are interested in experimenting.
PostgreSQL 9.1 Windows 32-bit ODBC FDW PostgreSQL 9.1 Windows 64-bit ODBC FDW
So far we've tried the PostgreSQL 64-bit data wrapper against a SQL Server 2005 DSN and it seems to work fine. Have yet to try it on other ODBC sources.
We'll write up a more detailed article describing how to make the connections.
There is one trick to getting Mingw64 compiled PostgreSQL extensions to work with the Windows 64-bit EDB builds, and that is that when you compile your PostgreSQL under mingw64,
you have to configure with option ----disable-float8-byval as we noted in our PostGIS Window 64 build instructions.
Saturday, March 10. 2012
UPDATE We have PostGIS 2.0.0 available for both 32-bit and 64-bit windows PostgreSQL. We are wroking on getting the installers out
This past week has been very nerve racking but also exciting. We have successfully compiled PostGIS under the mingw64 chain and built a PostGIS windows 64-bit
for 2.0 (and 1.5), that can install under the Enterprise Db VC++ 64-bit builds of PostgreSQL 9.1. We haven't tried on 9.0, but we assume that should be fairly trivial.
Note only that, but it passes most of the PostGIS battery of tests. We first want to thank a group of people which made this all possible:
- Andrew Dunstan we are greatly indebted to for making it possible to compile PostgreSQL under mingw64 tool chain. As much as people have whined
about wanting to compile PostGIS under a pure VC chain, this is not possible at this juncture just because a lot of the tests and other tool chains PostGIS uses for building
are too tied to the Unix build environment.
- We want to thank the generous folks who provided money for our campaign so that we could funnel time from paid consulting work to focus on this effort and to prove that every little bit counts.
- SpatiaLite developer Alessandro Furieri whose mingw64 compile instructions were invaluable to helping us overcome our GEOS and other compile obstacles. SpatiaLite (the OGC spatial extender for SQLite),uses much of the same plumbing that PostGIS uses under the hood, so many of the lessons he learned an provide could be put to use with our problems.
- To Paul Ramsey especially and other PostGIS devs for general moral support and helping us tackle some PostGIS specific issues when compiled with mingw64. Paul demonstrated that yes you can mix VC++ built components with MingW and steps on how to do it. Part of the reason for that is the newer mingw32 seemed to crash with GEOS compiled under mingw32. Though the mingw64 chain didn't have this issue once we overcame our compile obstacle. We may in the future compare and see if compiling Geos under VC++ provides better performance and will also get us closer to having it possible to compile PostGIS fully under VC++ if people choose to. For the time being having a single tool chain that we can extract and run with is most important. We are preparing a self-standing Mingw64 tool chain with all the components needed to build PostGIS already compiled so that windows users who want to help with PostGIS need only extract to have a fully functioning postGIS dev environment and we also plan to move our mingw32 build to mingw64 chain of tools.
We hope to have a 64-bit compiled download ready next week for PostGIS 2.0.0 beta3 for people to try out. We are working on some issues with the raster2pgsql and loader/dumper guis we compiled not working right, but the core PostGIS works just fine in 64-bit and the 32-bit loader tools work fine against a 64-bit install. One thing we did notice with the 64-bit PostgreSQL is that we
can set shared_buffers much higher than the 32-bit PostgreSQL windows. On windows we could never go beyond ~700MB without it not being able to start or crashing. With the 64-bit we were able to go to 2GB. Haven't tried higher yet. We hope this will prove to be a performance boost for tasks such as geocoding that reuse a lot of the same datasets and benefit a lot from share memory.
Friday, February 17. 2012
I was excited to learn from Pasha Golub's blog LibreOffice Base 3.5 now comes packaged with native PostgreSQL driver so no separate configuration is required.
The connection string syntax follows the old SBC native driver of prior OpenOffice versions we itemized in Using OpenOffice Base with PostgeSQL.
What I really wanted to do with it is experiment with its graphical rendering capabilities. As discussed in PSQL needs a better way of outputting bytea
one of the frequently asked questions on the PostGIS list by folks using the new not yet officially released (alpha5 recently released) functionality in PostGIS 2.0 is how to render rasters
with common variety tools. I suspected Base was a capable option, but had never tested it to confirm. Since I was installing new LibreOffice 3.5, I thought this might be a good test of its metal.
Continue reading "Rendering PostGIS Raster graphics with LibreOffice Base Reports"
Tuesday, February 14. 2012
Have you ever tried to output a file stored in a bytea column from PostgreSQL using PSQL? It ain't pretty. Andrew Dunstan
has a couple of examples of doing this in his article Clever Trick Challenge. This issue has now become very relevant for PostGIS raster use. The reason being is that now that PostGIS in it's 2.0 encarnation can do clipping of rasters, unions of rasters,
map algebra, convert geometries to rasters and convert those rasters to various imagery type file formats, people naturally assume it should be trivial
to output your new fangled raster query as a jpeg or png etc via, PSQL. Turns out it is not. For those who need to, I've documented an example using
one suggestion from Andrew's blog which utilizes the Large object support built into PostgreSQL. It would be nice if this were simpler.
I chose this approach because it was the only one that didn't assume a user has Perl installed or is on a Unix system so should work fairly much the same regardless the client OS in use. You can find the example in Outputting Rasters with PSQL. Suggestions on how to improve on this are welcome.
Sunday, February 05. 2012
As many may have noticed, PostgresOnline.com has been down for the past week or so and probably is still not reachable from many parts of the world since our DNS server was also taken down as a result of a Distributed Denial of Service (DDOS) attack instigated by an Activision Call of Duty Game exploit that turned thousands of Call of Duty game servers into Zombies launching an attack on us.
We have a small confession to make. One of the businesses we co-own is an e-Commerce site that sells condoms. You never know how people will react when you say that in mixed company so we only mention it in closer company. Some people are glad we are in a business protecting against venereal diseases or unwanted pregnancies and some feel strongly we are violating a mother nature creed of conduct. WowCondoms was the site that was under attack on a UDP port and we are not sure if it was a malicious intent or not since the root instigator has not been found yet. The attack was higher up from our servers so it knocked our ISP who in turn blamed us for their outage. We never saw the traffic.
The tragic thing is that it can happen to any site and does all the time. It really hit home when it happened to us.
Details of our fight are described here: WowCondoms plugs hole in Activision's Call of Duty Game Servers
Wednesday, January 25. 2012
PostgreSQL has aggregate functions called bool_and and bool_or which it's had for as far back as I can remember.
What do they do? given rows of premises (things that resolve to booleans), bool_and will return true if all of the premises are true. Similarly bool_or
will return true if any of the premises in the set of rows is true. What if however your boolean expressions are not in rows, but instead passed in
as a sequence of arbitrary statements of questionable fact. We want a function like bool_or or bool_and that takes an arbitrary number of boolean arguments. Are there functions
that fit the bill. Indeed there are, but they don't scream out and say I work with booleans because they fit into a class of function we
discussed in The wonders of Any Element and that also happen to be variadic functions.
These are none other than
greatest and least and they are old timer functions that you can find in most versions of PostgreSQL. We'll demonstrate how to use
all 4 with booleans in this article. It must be said that greatest and least are much more useful when applied to other data types like dates
and numbers, but we were amused at the parallel with booleans.
Side note: we've started to write our book on PostgreSQL that will be published by O'Reilly. We hope to finish this book within the next 5-7 months but have preliminary e-Book drafts before then for early purchasers to scan.
The focus of the book will be PostgreSQL 9.1 with some highlights the upcoming PostgreSQL 9.2. Of course oldie but goodie topics are in the book too. It's a thrill to be writing again.
Continue reading "True or False every which way"
Monday, January 16. 2012
If I could name a number one feature I love most about PostgreSQL, it's the table inheritance feature which we described in How to Inherit and Uninherit. A lot of people use it for table partitioning using CONSTRAINT EXCLUSION. Aside from that, in combination with PostgreSQL schema search_path (customizable by user and/or database) it makes for a very flexible abstraction tool. For example, for many of our web apps that service many departments where each department/client wants to keep a high level of autonomy, we have a schema set aside for each
that inherits from a master template schema. Each department site uses a different set of accounts with the primary schema being that of the department/client so that they are hitting their own tables.
Inheritance allows us to keep data separate,do roll-up reports if we need to, use the same application front-end, and yet allows us the ability to add new columns in just one place (the master template schema). It is more flexible than other approaches because for example we may have a city organization that need to share tables, like for example a system loaded list of funding source shared across the agency. We can set aside these shared tables in a separate schema visible to all or have some have their own copy they can change if they don't want to use the shared one.
Every once in a while, we find ourselves needing to query the whole hierarchy and needing to know which table the results of the query are coming from. To help
solve that issue, we employ the use of the system column tableoid which all user tables have. The tableoid is the the object id of a table. PostgreSQL has many system columns that you have to explicitly select
and can't be accessed with a SELECT * with the tableoid being one of them. These are: tableoid, cmax,cmin, xmin,xmax,ctid which are all described in System Columns. The PostgreSQL docs on inheritance have examples of using it, but we thought it worthwile to repeat the exercise since it's not that common knowledge and is unique enough feature of PostgreSQL that others coming from other relational databases, may miss the treat. I've often demonstrated
it to non-PostgreSQL users who use for example SQL Server or MySQL, and they literally fall out of their chair when I show the feature to them and its endless possibilities.
Continue reading "Table Inheritance and the tableoid"
Sunday, January 08. 2012
PostgreSQL has this interesting placeholder called anyelement which it has had for a long time and its complement anyarray. They are used when you want to define a function that can handle many types arguments or can output many types of outputs. They are particularly useful for defining aggregates, which we demonstrated in
Who's on First and Who's on Last and several other aggregate articles.
Anyelement / anyarray can be used just as conveniently in other functions. The main gotcha is that when you pass in the first anyelement/anyarray all subsequent anyelement / anyarray must match the same data type as the first anyelement / anyarray.
Continue reading "The wonders of Any Element"
Wednesday, December 28. 2011
For those who aren't familiar with hstore, it's a key/value
storage type that is packaged as an extension or contrib in PostgreSQL 8.2+. In PostgreSQL 9.0 it got a little extra loving in several ways one of which was the introduction
of the hstore(record) casting function that converts a record to an hstore. In this article, I'll demonstrate how you can use this new casting function to do very sleek mail merges right in the database. The only caveat is that it seems to only correctly name the keys if it is fed a real table or view. Derived queries such as aggregates etc get keys named f1, f2, etc.
If you are on PostgreSQL 9.1 or above installing -- hstore is just a CREATE EXTENSION hstore; sql command away. If you are on a lower version of PostgreSQL,
you can usually find the hstore.sql in share/contribs.
Continue reading "Mail Merging using Hstore"
Wednesday, December 14. 2011
Two exciting things happened this past month.
- Our article on upcoming PostGIS 2.0 recently came out in GeoInformatics Magazine December 2011 Issue 8. Check it out here. Starts on page 30.
- We just got notice that our book proposal has been accepted and this time it's not about PostGIS.
Continue reading "GeoInformatics article and new book in the works"
|