A while ago we demonstrated how to create cross tabulation tables using tablefunc extension aka (Pivot Tables) (basically collapsing
rows into columns).
Recently someone asked me how to do the reverse (convert columns to rows). He found a solution to the problem here: http://stackoverflow.com/questions/1128737/unpivot-and-postgresql
using a combination of array and unnest. That approach is very similar to SQL Server's built-in Pivot SQL predicate.
The solution seemed
nice enough except similar to the SQL Server Unpivot, it required knowing the column names beforehand so very hard to genericize.
So would it be possible to accomplish this feat without knowing the columns names (except for the key) and be able to do it with one SQL statement. I realized that the
PostgreSQL hstore extension fit the bill nicely. In this article I'll demonstrate both approaches by creating a view
If you don't have hstore installed and are running PostgreSQL 9.1+, you can use the handy
CREATE EXTENSION hstore;
For lower versions, look for the hstore.sql file in share/contrib and run in your database.
The second approach is to use array and unnest. As mentioned, the thing I didn't like about this
approach is that it's not dynamic - you have to itemize the columns. It also on the downside requires you to cast all the column
values before hand if they are not all the same type. It does have a couple of benefits:
It uses built-in types so no need to install an extension though it requires PostgreSQL 8.4+ because of unnest
You can grab just a subset very efficiently and if you only care about numeric columns would probably be best.
It's generally faster.
To test performance I ran this:
Create the view
, 'awater', 'intptlat', 'intptlong', 'aland_sqmi', 'awater_sqmi']
, awater::text, intptlat::text, intptlong::text
, aland_sqmi::text, awater_sqmi::text]
Test some queries
The single record one takes same time and uses index but shorter plan:
--where filter 10 ms --
WHERE zip = '02109';
Output the whole view
-- network effects --
-- 3120 ms (264960 records)
SELECT * from vw_zcta_unpivot_ary;
-- no network effects --
-- 200 ms
SELECT count(*) FROM vw_zcta_unpivot_ary;
So there you have it, two very different approaches for accomplishing the same task.