intermediate

Friday, May 30. 2014

psql watch for batch processing

A while back, we discussed using pgAdmin pgScript as a quick way to run a repetitive update script where you want each loop to commit right away. Since a stored function commits as a whole, you can't use stored functions alone for this kind of processing.

Question: Can you do something similar easily with psql?
Answer: Yes, with the \watch command, described nicely by Michael Paquier a while back.

If you are using the psql client packaged with PostgreSQL 9.3 or above, then you can take advantage of the \watch command that was introduced in that version of psql. We'll demonstrate that by doing a batch geocoding exercise with PostGIS tiger geocoder and also revise our example from the prior article to use the more efficient and terser LATERAL construct introduced in PostgreSQL 9.3.
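As a rough sketch of the idea (not the code from the full article), the batch below geocodes up to 100 pending rows per run, and \watch re-runs it every second. The table addr_to_geocode and its columns are hypothetical names used for illustration, and the sketch assumes the postgis and postgis_tiger_geocoder extensions are installed with tiger data loaded. With psql's default autocommit, each run commits on its own.

```sql
-- Hypothetical work table: rating stays NULL until a row has been processed.
CREATE TABLE addr_to_geocode (
    addid       serial PRIMARY KEY,
    address     text NOT NULL,
    rating      integer,
    new_address text,
    pt          geometry
);

-- Geocode the next 100 unprocessed addresses.
-- LEFT JOIN LATERAL keeps addresses that return no match; they get rating -1
-- so they are not retried on the next run.
-- There is no trailing semicolon: \watch executes the current query buffer.
UPDATE addr_to_geocode AS t
   SET rating      = COALESCE(g.rating, -1),
       new_address = pprint_addy(g.addy),
       pt          = g.geomout
  FROM (SELECT addid, address
          FROM addr_to_geocode
         WHERE rating IS NULL
         ORDER BY addid
         LIMIT 100) AS a
       LEFT JOIN LATERAL geocode(a.address, 1) AS g ON true
 WHERE a.addid = t.addid
\watch 1
```

\watch keeps going until the query fails or you interrupt it, so press Ctrl+C once the update starts reporting UPDATE 0.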

Continue reading "psql watch for batch processing"

Posted by Leo Hsu and Regina Obe in 9.3, 9.4, gis, intermediate, postgis, postgresql versions, q&a at 23:24 | Comments (3) | Trackbacks (0)

Friday, April 25. 2014

Using HStore for Archiving

I'm not a big proponent of schemaless designs, but they have their place. One place where I think they are useful is archiving data: even though the underlying table structure of the data you need to archive keeps changing, you want each archived record to keep the same fields it had back then. This is a case where I think hstore, and the way PostgreSQL has it implemented, works pretty nicely.
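A minimal sketch of that archiving pattern follows. The tables customers and archive_log are hypothetical names invented for illustration, and this is not necessarily the layout used in the full article. The hstore(record) function turns a whole row into key/value pairs, so the archive keeps whatever columns existed at the time the row was archived.

```sql
CREATE EXTENSION IF NOT EXISTS hstore;

-- Hypothetical source table whose structure may change over time.
CREATE TABLE customers (
    customer_id serial PRIMARY KEY,
    name        text NOT NULL,
    is_active   boolean NOT NULL DEFAULT true
);

-- One archive table can hold rows from any table, whatever their columns.
CREATE TABLE archive_log (
    archive_id  serial PRIMARY KEY,
    table_name  text NOT NULL,
    archived_at timestamptz NOT NULL DEFAULT now(),
    row_data    hstore NOT NULL
);

-- Archive inactive customers; hstore(c) captures every column of the row as text.
INSERT INTO archive_log (table_name, row_data)
SELECT 'customers', hstore(c)
  FROM customers AS c
 WHERE NOT c.is_active;

-- Read a field back; rows archived before a column existed simply lack that key.
SELECT archived_at, row_data -> 'name' AS name
  FROM archive_log
 WHERE table_name = 'customers'
   AND row_data ? 'name';
```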

Side note: one of the new features of PostgreSQL 9.4 is improved GIN indexes (faster and smaller), which are very often used with hstore data (and the new jsonb type). We're looking forward to the GIN improvements more than to the jsonb feature. We're hoping to test this improved index functionality with OpenStreetMap data soon and compare it with our existing PostgreSQL 9.3. The OpenStreetMap pbf and osm extract loaders (osm2pgsql, imposm) provide an option for loading tagged data into PostgreSQL hstore fields, in addition to PostGIS geometry and other attribute fields. So the 9.4 enhancements should be a nice gift for OSM data users. More on that later.

Continue reading "Using HStore for Archiving"

Posted by Leo Hsu and Regina Obe in application development, hstore, intermediate, postgis at 00:24 | Comments (0) | Trackbacks (0)

Saturday, December 21. 2013

Using wget directly from PostgreSQL using COPY FROM PROGRAM

One of the features new to PostgreSQL 9.3 is the COPY FROM PROGRAM SQL construct and, in psql, the equivalent \copy from program. Michael Paquier covered COPY TO/FROM PROGRAM in his Postgres 9.3 feature highlight: COPY TO/FROM PROGRAM. Depesz covered the companion psql construct in Support for piping copy to from an external program. Michael demonstrated an example using curl. I wanted to try something similar using wget, since I have wget readily available on all my Linux and Unix boxes. For this example I'll demonstrate doing it on Windows, but doing it on Linux is much the same and simpler, since wget and curl are generally already in the default path on Linux.
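For orientation, here is a minimal sketch of the construct. The table name and URL are placeholders rather than the ones used in the full article. It assumes wget is on the path of the operating system account that runs the PostgreSQL server, and that you are connected as a superuser, which server-side COPY FROM PROGRAM requires.

```sql
-- Hypothetical staging table: one row per line of the downloaded file.
CREATE TABLE web_lines (line text);

-- The server runs wget; -q silences progress output and -O - writes the
-- download to stdout, which COPY then reads as its input.
-- Note: the default text format treats tabs as column separators and
-- backslashes as escapes, so this suits simple single-column text.
COPY web_lines FROM PROGRAM 'wget -q -O - "http://www.example.com/data.txt"';

SELECT count(*) FROM web_lines;
```

The psql \copy ... from program variant runs the command on the client machine instead, so it does not need superuser rights on the server.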

Continue reading "Using wget directly from PostgreSQL using COPY FROM PROGRAM"

Posted by Leo Hsu and Regina Obe in 9.3, basics, intermediate, postgresql versions at 23:12 | Comments (2) | Trackbacks (0)

Thursday, August 15. 2013

Materialized geometry_columns using Event Triggers

One of the big changes in PostGIS 2.0 was that the geometry_columns table became a read-only view. This change was not without sacrifice. On the plus side, it meant a table no longer needed to be maintained, which was a relief for those who had to constantly ensure the sideline table was up to date. It also meant we could base the view on the current user, so that a connected user would only see tables they had access to. The sacrifice made was a bit of speed. In most use cases the speed difference is hardly noticeable, and it only matters if you have a query constantly polling this view. However, if you have a database of 15,000 geometry tables it could take as long as 450ms, as noted in this ticket.

So just playing with ideas, could the new PostgreSQL 9.3 event triggers and materialized views provide a solution? Keep in mind this is just food for thought. We're not suggesting people go out and hack their PostGIS install. We suspect that if we implement such a thing in PostGIS core, it will differ from what we propose here. If you saw our prior article Caching data with materialized views and statement level triggers, you know where this is going. We'll add yet another concept to this cocktail, which we'll call schema hiding: counting on the order of the search_path to hide a named table or view you don't want.
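The three ingredients can be sketched roughly as follows. This is an illustration under assumptions, not the code from the full article: the schema name cache and the function and trigger names are made up, PostGIS is assumed to be installed in the public schema, and the list of command tags is only a plausible starting set. Note that a materialized copy no longer filters rows by the current user's privileges the way the view does.

```sql
CREATE SCHEMA cache;

-- Materialized copy of the PostGIS view.
CREATE MATERIALIZED VIEW cache.geometry_columns AS
SELECT * FROM public.geometry_columns;

-- Event trigger function: rebuild the cached copy after relevant DDL.
CREATE OR REPLACE FUNCTION cache.refresh_geometry_columns()
RETURNS event_trigger AS
$$
BEGIN
    REFRESH MATERIALIZED VIEW cache.geometry_columns;
END;
$$ LANGUAGE plpgsql;

-- Fire once a table or view has been created, altered, or dropped.
CREATE EVENT TRIGGER trg_refresh_geometry_columns
    ON ddl_command_end
    WHEN TAG IN ('CREATE TABLE', 'CREATE TABLE AS', 'ALTER TABLE',
                 'DROP TABLE', 'CREATE VIEW', 'DROP VIEW')
    EXECUTE PROCEDURE cache.refresh_geometry_columns();

-- Schema hiding: with cache listed first, an unqualified geometry_columns
-- resolves to the materialized view instead of public.geometry_columns.
SET search_path = cache, public;
SELECT count(*) FROM geometry_columns;
```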

Continue reading "Materialized geometry_columns using Event Triggers"

Posted by Leo Hsu and Regina Obe in 9.3, intermediate, new in postgresql, postgis, postgresql versions at 00:18 | Comment (1) | Trackbacks (0)

Tuesday, August 13. 2013

Caching data with materialized views and statement level triggers

One exciting new feature coming in PostgreSQL 9.3 is materialized views. A materialized view is really a mechanism for caching the results of a query. It is especially useful if you have long running queries where the answers change infrequently. Unfortunately, there are many things materialized views won't do, and for those you are still better off with regular views.

They don't refresh themselves automatically, so you need a cron job, a pgAgent job, or a trigger on something to refresh them.
They can't be user dependent or time dependent. For example, if you have a view that does something like WHERE user_name = current_user, then a materialized view is out of the question.
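To make the first limitation concrete, here is a minimal sketch of a statement-level trigger that refreshes a materialized view whenever its source table changes. The table, view, and function names are hypothetical and chosen only for illustration. A full refresh on every statement is only sensible when writes are rare compared to reads.

```sql
-- Hypothetical source table.
CREATE TABLE orders (
    order_id    serial PRIMARY KEY,
    customer_id integer NOT NULL,
    amount      numeric(12,2) NOT NULL
);

-- Cached aggregate over the source table.
CREATE MATERIALIZED VIEW mv_customer_totals AS
SELECT customer_id, sum(amount) AS total_amount
  FROM orders
 GROUP BY customer_id;

-- Trigger function: recompute the cached result.
CREATE OR REPLACE FUNCTION trig_refresh_customer_totals()
RETURNS trigger AS
$$
BEGIN
    REFRESH MATERIALIZED VIEW mv_customer_totals;
    RETURN NULL;  -- the return value is ignored for statement-level AFTER triggers
END;
$$ LANGUAGE plpgsql;

-- Runs once per statement, not once per row.
CREATE TRIGGER trig_orders_refresh_totals
    AFTER INSERT OR UPDATE OR DELETE ON orders
    FOR EACH STATEMENT
    EXECUTE PROCEDURE trig_refresh_customer_totals();
```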

Continue reading "Caching data with materialized views and statement level triggers"

Posted by Leo Hsu and Regina Obe in 9.3, intermediate, new in postgresql, postgis, postgresql versions at 22:49 | Comment (1) | Trackback (1)

Saturday, April 13. 2013

Determine which version of PostGIS each database is running

One of the features of PostGIS (a pain to some, however you look at it) is that the PostGIS library file is versioned by minor version. The library will be named, for example, postgis-2.0 or postgis-1.5 (with a .dll or .so extension) to denote the version. Each version of PostGIS can usually be compiled to run on about 3 or 4 versions of PostgreSQL.

Since PostGIS is not part of PostgreSQL proper and has to be installed separately, it is possible to run a different version of PostGIS in each database of a cluster. This is a great feature for PostGIS developers, and also for users who want to keep their legacy PostGIS apps running while testing or creating new apps with PostGIS 2.0 or experimenting with the 2.1 development series. It does, however, pose some obvious challenges.

For example, you can't simply upgrade your cluster to a new version of PostgreSQL. You need to make sure the new cluster has the various versions of PostGIS compiled and available. One step to that end is figuring out exactly what version of PostGIS each database in your cluster is running. Here is a quick psql script I wrote up to help with that.
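The script itself is in the full article. As a simpler sketch of the underlying queries, assuming you connect to each database in turn yourself, the first statement lists the databases worth checking and the second reports the PostGIS versions of the database you are connected to.

```sql
-- Databases that accept connections and are not templates.
SELECT datname
  FROM pg_database
 WHERE datallowconn
   AND NOT datistemplate
 ORDER BY datname;

-- Run this in each database that has PostGIS installed.
-- postgis_lib_version() reports the version of the loaded library;
-- postgis_scripts_installed() reports the version of the SQL scripts in the database.
SELECT current_database()          AS db,
       postgis_lib_version()       AS lib_version,
       postgis_scripts_installed() AS scripts_version;
```

In a database without PostGIS the second query fails with a "function does not exist" error, which is itself a quick way to tell that PostGIS is not installed there.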

Continue reading "Determine which version of PostGIS each database is running"

Posted by Leo Hsu and Regina Obe in basics, intermediate, postgis at 18:32 | Comments (4) | Trackbacks (0)

Monday, January 28. 2013

How to map MS Access auto number to PostgreSQL serial

This exercise is a continuation of our How to bulk export tables from MS Access and How to recreate MS Access primary keys and indexes in PostgreSQL. As mentioned in the first article, when you use the built-in export feature of MS Access, it exports autonumber fields as integers instead of the more appropriate PostgreSQL pseudo serial type.

The serial type in PostgreSQL is not really a type though, so this behavior is expected. The serial type is really shorthand for defining an integer column and a corresponding sequence object with a specific naming convention, and setting the column's default value to the next value of the sequence. So this is what we will do in this exercise, after we have already exported our data to PostgreSQL. The script we are about to demonstrate will generate an SQL script containing all the CREATE SEQUENCE and ALTER TABLE ... ALTER COLUMN ... SET DEFAULT statements needed to convert our integer columns to serial columns.

As a side note, Mark mentioned a similar approach to what we've been describing here, but one that builds all the table and key structures using Python. You can check out the Python script at http://code.activestate.com/recipes/52267-reverse-engineer-ms-accessjet-databases
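For a single column, the statements such a script would need to produce look roughly like this. The table contacts and its column contact_id are hypothetical stand-ins for an exported Access table with an autonumber field; the sequence name follows the tablename_columnname_seq convention that serial uses.

```sql
-- Stand-in for a table exported from MS Access: the autonumber arrived as a plain integer.
CREATE TABLE contacts (
    contact_id integer PRIMARY KEY,
    full_name  text
);

-- 1. Create the sequence using the serial naming convention.
CREATE SEQUENCE contacts_contact_id_seq;

-- 2. Make the column default to the next sequence value.
ALTER TABLE contacts
    ALTER COLUMN contact_id SET DEFAULT nextval('contacts_contact_id_seq');

-- 3. Tie the sequence to the column so it is dropped along with it, as with serial.
ALTER SEQUENCE contacts_contact_id_seq OWNED BY contacts.contact_id;

-- 4. Start numbering after the highest exported id (or at 1 if the table is empty).
SELECT setval('contacts_contact_id_seq', COALESCE(max(contact_id), 0) + 1, false)
  FROM contacts;
```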

Continue reading "How to map MS Access auto number to PostgreSQL serial"

Posted by Leo Hsu and Regina Obe in intermediate, ms access, other dbms, q&a at 17:56 | Comments (0) | Trackbacks (0)

Sunday, January 27. 2013

How to recreate MS Access primary keys and indexes in PG

This exercise is a continuation of our How to bulk export tables from MS Access. Now while this approach will work for other databases besides PostgreSQL, you'll probably need to fiddle with the subroutine to make it work for some other databases. PostgreSQL is fairly ANSI-SQL standard so not too much fiddling should be required to port to SQL Server, MySQL, Oracle etc.

Unlike the prior Visual Basic subroutine we showed that exports the tables, this one just creates an SQL script that you run on the already created PostgreSQL database that contains the exported data. We didn't test the quote option, though we coded it in the subroutine, since as we said we hate having to quote fields. If perchance you are one of those folks who like to put spaces in field names to make them read more like English, then you'll need to quote them or revise the other routine to convert your spaces to _ or some other character.

Continue reading "How to recreate MS Access primary keys and indexes in PG"

Posted by Leo Hsu and Regina Obe in intermediate, ms access, mysql, oracle, other dbms, q&a, sql server at 13:49 | Comments (3) | Trackback (1)

How to bulk export tables from MS Access

Despite all the nasty things people say about MS Access, it does have fairly database-agnostic, easy-to-use import and export tools and a not too shabby query wizard and query tool. But of course, it's not a server-side database, so at a certain point it won't scale for database storage. You can still use it as a front end to a server-side database such as PostgreSQL or SQL Server. So once you outgrow it for data storage, you'll probably want a quick way to export your data. Unfortunately, at least in the 2003 version, while you can bulk link tables from an ODBC connection, you can only export one table at a time using the "select table" right-click export. In this article we'll show a quick and dirty MS Access Visual Basic subroutine we use to export all tables. We've only tested this on MS Access 2003, so if you are using a later version, you might need to customize it a bit. This script should work fine for exporting to any database connection once you modify the connection string.

Continue reading "How to bulk export tables from MS Access"

Posted by Leo Hsu and Regina Obe in intermediate, ms access, other dbms, q&a, sql server at 13:36 | Comments (6) | Trackbacks (0)

Sunday, January 06. 2013

PostgreSQL 9.3 Lateral Part2: The Lateral Left Join

In the last article we said you can't have a LEFT JOIN with LATERAL. Turns out we were mistaken and YES indeed you can and when you do it is equivalent or more powerful than SQL Server's OUTER APPLY.

Let us say that in our query we want all our zips returned, but for the ones that have keys whose value is in a certain range, we also want those keys returned. We'd do the following. Because we need all zips, even those with no such keys, the condition has to go in the ON clause rather than the WHERE clause.

SELECT zip, (h).key, (h).value As val
 FROM zcta5 AS foo 
   LEFT JOIN LATERAL each(hstore(foo) - 'zip'::text)  As h 
    ON ((h).value BETWEEN '12345' and '14567')
    ORDER BY zip
   limit 5;

Output would be

 zip  |  key   |   val
------+--------+----------
00601 |        |
00602 |        |
00603 |        |
00606 | awater | 12487.00
00610 | hu10   | 12618

Posted by Leo Hsu and Regina Obe in 9.3, intermediate, new in postgresql, other dbms, postgis, postgresql versions, sql server at 15:35 | Comments (0) | Trackbacks (0)

Monday, November 12. 2012

Searching for phrases embedded in stored function and procedures

Issue: Have you ever decided to delete a table, function, or type without being absolutely sure whether other functions in the database depended on it? For things like tables and views that depend on a function, type, or other table/view, PostgreSQL won't allow you to delete the object they depend on without doing a DROP ... CASCADE. For stored procedures though, while it will prevent you from deleting a type that the function returns or takes as input, it doesn't save you if only the body of the function references the object. This dependency information is not always known and may in fact be dynamic, with dynamically generated SQL or schema search path settings. So how do you inspect functions for usage of other items?

Solution: PostgreSQL has a table called pg_catalog.pg_proc which stores the source code of functions (non-C) in the prosrc column.

So let's say you had a table called employees that you want to get rid of or simply rename. To find all functions that reference the term "employees", you would do something like this:

SELECT proname, proargnames, prosrc 
 FROM pg_proc
 WHERE prosrc ILIKE '%employees%';

Posted by Leo Hsu and Regina Obe in intermediate, q&a at 17:59 | Comments (0) | Trackbacks (0)

Friday, August 10. 2012

PLV8JS and PLCoffee Part 2B: PHP JQuery App

In our last article, PL/V8JS and PL/Coffee JSON search requests, we demonstrated how to create a PostgreSQL PL/Javascript stored function that takes as input a JSON-wrapped search request. We generated the search request using PostgreSQL. As mentioned, in practice the JSON search request would be generated by a client-side JavaScript API such as JQuery. This time we'll put our stored function to use in a real web app built using PHP and JQuery. The PHP part is fairly minimalistic, just involving a call to the database that returns a single row. Normally we use a database abstraction layer such as ADODB or PearDB, but this is so simple that we are just going to use the raw PHP PostgreSQL connection library directly. This example requires PHP 5.1+ since it uses the pg_query_params function introduced in PHP 5.1. Most of the work happens in the JQuery client-side tier and in the database part we already saw. That said, the PHP part is fairly trivial to swap out for something like ASP.NET or most other server-side web languages.

Continue reading "PLV8JS and PLCoffee Part 2B: PHP JQuery App"

Posted by Leo Hsu and Regina Obe in application development, intermediate at 13:32 | Comments (0) | Trackbacks (0)

Thursday, August 09. 2012

PLV8JS and PLCoffee Part 2: JSON search requests

PostgreSQL 9.2 beta3 got released this week and so we inch ever closer to final in another 2 months or so. One of the great new features is the built-in JSON type and companion PLV8/PLCoffee languages that allow for easy processing of JSON objects. One of the use cases we had in mind is to take as input a JSON search request that in turn returns a JSON dataset.

We'll use our table from PLV8 and PLCoffee Upserting. Keep in mind that in practice the json search request would be generated by a client side javascript API such as our favorite JQuery, but for quick prototyping, we'll generate the request in the database with some SQL.

If you are on windows and don't have plv8 available we have PostgreSQL 9.2 64-bit and 32-bit plv8/plcoffee experimental binaries and instructions. We haven't recompiled against 9.2beta3, but our existing binaries seem to work fine on our beta3 install.

Continue reading "PLV8JS and PLCoffee Part 2: JSON search requests"

Posted by Leo Hsu and Regina Obe in intermediate, pl programming, plcoffee, plv8js at 01:35 | Comments (0) | Trackbacks (0)

Saturday, July 21. 2012

PLV8JS and PLCoffee Part 1: Upserting

Today's modern web application workflow in its simplest form looks something like this:

Get dataset as JSON object usually using yet another JSON query object to pass the request using a javascript framework like JQuery/ExtJS/OpenLayers/Leaflet etc.
Make changes to JSON dataset object and send back to the web server.
On webserver unravel the JSON object and save to respective database tables. This part is really yucky as it often involves the web application server side language doing the unraveling and then yet another step of setting up stored procedures or other update logic to consume it.

We hate the way people build tiers for the same reason Cartman hates lines at the amusement park. Sure tiers are great for certain things like building connected microcosms, but most of the time they are overkill and if applied too early make your application needlessly complicated. In the end all we care about is data: serving data, analyzing data, getting good data and everything else is just peacock feathers.

The introduction of JSON type support in PostgreSQL 9.2 and languages PL/V8 (PL/Javascript) and its Pythoness-like twin PL/Coffee provides several options for bringing your data and application closer together since they have native support for JSON. In this first part we'll demonstrate one: An upsert stored procedure that takes a single JSON object instead of separate args and updates existing data and adds missing records. In later articles we'll show you the front end app and also add a sprinkle of PostGIS in there to demonstrate working with custom types.

Continue reading "PLV8JS and PLCoffee Part 1: Upserting"

Posted by Leo Hsu and Regina Obe in 9.2, intermediate, new in postgresql, pl programming, plcoffee, plv8js, postgis, postgresql versions at 13:13 | Comments (3) | Trackback (1)

Tuesday, July 10. 2012

Working with Timezones

UPDATE TO UPDATE: Bruce Momjian suggested replacing the dynamic set local sql with set_config. We've revised further to incorporate this suggestion. That got rid of our last pet peeve about this function. Thanks all.

Simon Bertrang proposed using set local which seems much nicer. We've updated our function using his revision.

One of PostgreSQL's nice features is its great support for temporal data. In fact it probably has better support for temporal data than any other database. We'll see more of this power in PostgreSQL 9.2 with the introduction of date time range types. One of the features we've appreciated and leveraged quite a bit in our applications is its numerous time zone aware functions. In PostgreSQL, the timestamp with time zone data type always stores the time in UTC but by default displays it in the time zone of the server, session, or user. Now one of the helper functions we've grown to depend on is to_char(), which supports timestamp and timestamp with time zone among many other types and allows you to format the pieces of a timestamp any way you like. This function is great except for one small problem: it doesn't allow you to designate the time zone of the output and always defaults to the TimeZone setting of the currently running session. This is normally just fine, since you can combine it with AT TIME ZONE to get a timestamp without time zone that will return the right date parts. The exception is when you want your display to output the time zone itself, e.g. EDT, EST, PST, PDT (timestamp without time zone is time zone unaware). In this article we'll demonstrate a quick hack to get around this issue. First let's take to_char for a spin.
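The general shape of such a workaround, using set_config as mentioned in the update notes above, might look like the sketch below. The function name to_char_in_tz and its signature are hypothetical and this is not the authors' final function. It switches the TimeZone setting for the current transaction only, formats the value, then restores the previous setting.

```sql
CREATE OR REPLACE FUNCTION to_char_in_tz(ts timestamptz, fmt text, tz text)
RETURNS text AS
$$
DECLARE
    old_tz text := current_setting('TimeZone');  -- remember the caller's setting
    result text;
BEGIN
    -- Third argument true = local: the change lasts only for the current transaction.
    PERFORM set_config('TimeZone', tz, true);
    result := to_char(ts, fmt);
    -- Restore the caller's time zone for the rest of the transaction.
    PERFORM set_config('TimeZone', old_tz, true);
    RETURN result;
END;
$$ LANGUAGE plpgsql;

-- The TZ pattern prints the zone abbreviation of the requested zone
-- rather than that of the session.
SELECT to_char_in_tz('2012-07-10 12:00:00+00'::timestamptz,
                     'YYYY-MM-DD HH24:MI TZ',
                     'America/New_York') AS new_york_time;
```

The function is left at the default VOLATILE setting because it changes a configuration parameter while it runs.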

Continue reading "Working with Timezones"

Posted by Leo Hsu and Regina Obe in basics, intermediate at 01:09 | Comments (4) | Trackbacks (0)

Postgres OnLine Journal
