Thursday, February 14. 2013
A while back I mentioned to Bborie (aka dustymugs) and Sandro (aka strk): We're missing ability to label our images with text. Picture this: What if someone has got a road or a parcel of land and they want to stamp the parcel id or the road name on it and they don't want to have to depend on some piece of mapping software. Many report writers and databases can talk to each other directly such as we demonstrated in Rendering PostGIS raster graphics with LibreOffice and we've got raster functionality in the database already. Can we cut out the middleman? Some ideas came to mind. One we could embed a font lib into PostGIS thus having the ultimate bloatware minus the built-in coffee maker. strk screeched. He's still recovering from my constant demands of having him upgrade his PostgreSQL version. Okay fair enough.
Bborie suggested why don't you import your letters as rasters and then vectorize them. So that's exactly what we are going to demonstrate in this article and in doing so we will combine some of the new features coming in PostGIS 2.1 we've been talking about in Waiting for PostGIS 2.1 series.
Warning this article will have a hint of Rube Goldbergishness in it. If you are easily offended by seeing stupid tricks done with SQL, stop reading now. We are going to take a natural vector product and rasterize it just so we can vectorize it again so we can then rasterize it again. Don't think about it too much. It may trigger activity in parts of your brain you didn't know were there thus resulting in stabbing pains similar to what you experience by quickly guplping down a handful of Wasabi peas.So here are the steps for creating your own font set you can overlay on your geometries and rasters.
See if you can spot the use of window functions and CTEs in these examples.
Continue reading "Saying Happy Valentine in PostGIS"
Wednesday, December 28. 2011
For those who aren't familiar with hstore, it's a key/value storage type that is packaged as an extension or contrib in PostgreSQL 8.2+. In PostgreSQL 9.0 it got a little extra loving in several ways one of which was the introduction of the hstore(record) casting function that converts a record to an hstore. In this article, I'll demonstrate how you can use this new casting function to do very sleek mail merges right in the database. The only caveat is that it seems to only correctly name the keys if it is fed a real table or view. Derived queries such as aggregates etc get keys named f1, f2, etc.
If you are on PostgreSQL 9.1 or above installing -- hstore is just a
Continue reading "Mail Merging using Hstore"
Sunday, August 14. 2011
One of the new features I'm excited about in upcoming PostgreSQL 9.1 are extensions. It is also my hope that for PostGIS 2.0, we'll be able to package PostGIS 2.0 as an extension. Reinspired in my mission by David Wheeler's recent post and video on Building and Distributing Extensions without C, I decided to take some time to investigate how all the extension pieces fit together.
The three things I like most about extensions are:
Of course the ease is all in the thoughtfulness of the packaging. To get some ideas of how we would go about packaging PostGIS 2.0 as an extension (it could very well be 3 extensions if we decide to package the core postgis, raster, and topology (and even tiger geocoder) as separate extensions), I thought I would take a look at how others have packaged theirs, and how one goes about registering one of these packages to make it available in CREATE EXTENSION.
Figuring out the extensions you have available ready to install
First I decided to start by doing a little snooping, by applying some lessons from our previous article Querying table, view, column and function descriptions I wrote this query to figure out what useful functions are available to learn about extensions.
Continue reading "PostgreSQL 9.1 Exploring Extensions"
Tuesday, May 10. 2011
Recommended Books: PostGIS in Action PostgreSQL 9.0 Manual - Volume 1A: SQL Reference PostgreSQL 9.0 Reference - Volume 1B: SQL Command Reference
PostgreSQL 9 High Performance / Admin Cookbook combo
What is the difference between CURRENT_TIMESTAMP and clock_timestamp()Answer:
CURRENT_TIMESTAMP is an ANSI-SQL Standard variable you will find in many relational databases including PostgreSQL, SQL Server, Firebird, IBM DB2 and MySQL to name a few that records the start of the transaction. The important thing to keep in mind about it is there is only one entry per transaction so if you have a long running transaction, you won't be seeing it changing as you go along.
clock_timestamp() is a PostgreSQL function that always returns the current clock's timestamp. I don't think I'm alone in using it for doing simple benchmarking and other things where for example I need to record the timings of each part of a function within the function using pedestrian RAISE NOTICE debug print statements.
There is another cool way I like using it, and that is for a batch of records each with an expensive function call, benchmarking how long it takes to process each record. One of the things I'm working on is improving the speed of the tiger_geocoder packaged in PostGIS 2.0. The first root of attack seemed to me would be the normalize_address function which I was noticing was taking anywhere from 10% to 50% of my time in the geocode process. That's a ton of time if you are trying to batch geocode a ton of records. The thing is the function is very particular to how badly formed the address is so a whole batch could be held up by one bad apple and since the batch doesn't return until all are processed, it makes the whole thing seem to take a while.
So rather than looping thru each, I thought it would be cool if I could run the batch, but for each record have it tell me how long it took to process relative to the rest so I could get a sense of what a problem address looks like. So I wrote this query:
Which returned an output something like this:
address_1 | city | state | zip | pp_addr | the_time | process_time | diff_from_start ------------------+------------+-------+------- +-------------------------------------------+--------------+------------------ 48 MAIN ST .. | S.. | MA | 021.. | 48 MAIN .. | 2011-05-10 03:24:43.078-04 | 00:00:00.032 | 00:00:00.032 15 ... | | MA | 018... | 15 GREN... | 2011-05-10 03:24:50.796-04 | 00:00:00.031 | 00:00:07.75
Wednesday, March 30. 2011
Recommended Books: PostGIS in Action
I am happy to report, that the final proof of the PostGIS in Action E-Book got released today and the printed version is scheduled for release Aprill 11th, 2011 and should be available on Amazon and other locations around then. The other e-Reader formats will come after that. You can buy from here or download the two free chapters, if you haven't already.
Each hard-copy purchase comes with a free E-Book version. There is a coupon in the back of the book when you get it to get the E-Book versions.
Yes, I know it's been a really really long time. On the bright side, we produced twice as much content as we had set out to do and that was with keeping things as concise as we could get away with, still managing to cover more than we set out to cover, and stripping out as many unnecessary words as we could muster. So 520 pages and almost 2 years later, this is where we are.
A good chunk of the additional bulk of the book was the appendices which are about 150 pages total and focus strictly on PostgreSQL and SQL. After many comments from early reviewers, we thought it unfair not to have a good chunk of PostgreSQL and just general relational database content to familiarize programmers and GIS folks with the RDBMS that PostGIS lives in. Most GIS folk unfortunately have the hardest time with getting up to speed with SQL and just standard RDBMS management.
Two free chapters and accompanying code for all chapters
The two free chapters we selectively picked because we thought they would be most beneficial to newcomers and people new to relational databases. So the free chapters are:
So even if you don't buy our book, we hope you find the free chapters useful.
You can get a more detailed listing of all the chapters from the PostGIS in Action book site.
We'd like to thank all those who supported us through this long and unpredictable journey. Hopefully we'll have several more, though hopefully a bit less nerve-racking than this first one.
Friday, December 24. 2010
Recommended Books: PostgreSQL 9 SQL Reference Manual SQL Server MVP Deep Dives High Performance MySQL
Question: You have a table of people and a table that specifies the activities each person is involved in. You want to return a result that has one record per person and a column that has a listing of activities for each person separated by semicolons and alphabetically sorted by activity. You also want the whole set alphabetically sorted by person's name.
This is a question we are always asked and since we mentor on various flavors of databases, we need to be able to switch gears and provide an answer that works on the client's database. Most often the additional requirement is that you can't install new functions in the database. This means that for PostgreSQL/SQL Server that both support defining custom aggregates, that is out as an option.
Normally we try to come up with an answer that works in most databases, but sadly the only solution that works in most is to push the problem off to the client front end and throw up your hands and proclaim -- "This ain't something that should be done in the database and is a reporting problem." That is in fact what many database purists do, and all I can say to them is wake up and smell the coffee before you are out of a job. We feel that data transformation is an important function of a database, and if your database is incapable of massaging the data into a format your various client apps can easily digest, WELL THAT's A PROBLEM.
We shall now document this answer rather than trying to answer for the nteenth time. For starter's PostgreSQL has a lot of answers to this question, probably more so than any other, though some are easier to execute than others and many depend on the version of PostgreSQL you are using. SQL Server has 2 classes of answers neither of which is terribly appealing, but we'll go over the ones that don't require you to be able to install .NET stored functions in your database since we said that is often a requirement. MySQL has a fairly simple, elegant and very portable way that it has had for a really long time.
Continue reading "String Aggregation in PostgreSQL, SQL Server, and MySQL"
Posted by Leo Hsu and Regina Obe in 8.2, 8.3, 8.4, 9.0, cte, db2, intermediate, mysql, oracle, postgresql versions, q&a, sql server, window functions at 11:24 | Comments (14) | Trackbacks (0)
Monday, October 05. 2009
In our prior story about allocating people with the power of window aggregation, we saw our valiant hero and heroine trying to sort people into elevators to ensure that each elevator ride was not over capacity. All was good in the world until someone named Frank came along and spoiled the party. Frank rightfully pointed out that our algorithm was flawed because should Charlie double his weight, then we could have one elevator ride over capacity. We have a plan.
Continue reading "Allocating People into Groups with SQL the Sequel"
Monday, September 28. 2009
Recommended Books: PostGIS In Action PostgreSQL 8.4 Official The SQL Language PostgreSQL 8.4 Server Administration
This is along the lines of more stupid window function fun and how many ways can we abuse this technology in PostgreSQL. Well actually we were using this approach to allocate geographic areas such that each area has approximately the same population of things. So you can imagine densely populated areas would have smaller regions and more of them and less dense areas will have larger regions but fewer of them (kind of like US Census tracts). So you have to think about ways of allocating your regions so you don't have a multipolygon where one part is in one part of the world and the other in another etc. Using window aggregation is one approach in conjunction with spatial sorting algorithms.
The non-spatial equivalent of this problem is how do you shove people in an elevator and ensure you don't exceed the capacity of the elevator for each ride. Below is a somewhat naive way of doing this. The idea being you keep on summing the weights until you reach capacity and then start a new grouping.
Continue reading "Allocating People into Groups with Window aggregation"
Wednesday, July 01. 2009
PostgreSQL 8.4 has ANSI SQL:2003 window functions support. These are often classified under the umbrella terms of basic Analytical or Online Application Processing (OLAP) functions. They are used most commonly for producing cumulative sums, moving averages and generally rolling calculations that need to look at a subset of the overall dataset (a window frame of data) often relative to a particular row. For users who use SQL window constructs extensively, this may have been one reason in the past to not to give PostgreSQL a second look. While you may not consider PostgreSQL as a replacement for existing projects because of the cost of migration, recoding and testing, this added new feature is definitely a selling point for new project consideration.
If you rely heavily on windowing functions, the things you probably want to know most about the new PostgreSQL 8.4 offering are:
To make this an easier exercise we have curled thru the documents of the other database vendors to distill what the SQL Windowing functionality they provide in their core product. If you find any mistakes or ambiguities in the below please don't hesitate to let us know and we will gladly amend.
For those who are not sure what this is and what all the big fuss is about, please read our rich commentary on the topic of window functions.
Continue reading "Window Functions Comparison Between PostgreSQL 8.4, SQL Server 2008, Oracle, IBM DB2"
Wednesday, May 27. 2009
One thing that is pretty neat about windowing functions in PostgreSQL 8.4 aside from built-in windowing functions (row_number(), rank(), lead(), lag(), dense_rank(), percent_rank(), cume_dist(), first_value, last_value, nth_value) as documented in the manual Windowing Functions is that you can use windows with most aggregate functions (built-in or custom defined) as well as define your own specific windowing functions. In a later article, we'll demonstrate creating custom windowing functions.
In our PGCon 2009 PostGIS presentation one of the last slides demonstrates using lead() and lag() windowing functions to show a family's income level in the same record with the income levels of the next door neighbors in the fictitious town we created. This is not terribly useful unless you live in a somewhat dysfunctional neighborhood where everyone is concerned about how rich their neighbors are compared to themselves. Our town was very dysfunctional but mostly geographically dysfunctional. We will have much more useful use cases of this as applied to GIS in our upcoming PostGIS in Action book.
Hitoshi Harada and David Fetter did a presentation of this in PGCon 2009 which sadly we missed since we were giving our own presentation. Check out the PGCon2009 PostgreSQL 8.4 Windowing Functions Video. Also check out the slides at Introducing Windowing Functions.
Those who have used SQL Server 2005+, Oracle or IBM DBII are probably familar or have run into examples of Windowing functions in those products. Windowing in PostgreSQL 8.4 works more or less the same way. In a prior article, we demonstrated how to return running totals and sums using rudimentary SQL. To precelebrate the eminent arrival of PostgreSQL 8.4 and the current PostgreSQL 8.4 beta 2 release, we shall demonstrate the same exercise using the new ANSI SQL:2003 Windowing functionality built into the upcoming PostgreSQL 8.4.
Continue reading "Running totals and sums using PostgreSQL 8.4 Windowing functions"
Sunday, April 05. 2009
PostgresQL 8.4 beta will be out any day and 8.4 official release will hopefully not be too far behind. As we wait patiently for the official release, Robert Treat has summarized nicely all the new features you can expect in 8.4. PostgreSQL 8.4 is what I like to call an earth-shattering release because it has so many big ticket items in there, but also some long-needed usability features in it.
While we all know about the Windowing functions and CTEs and Recursive CTEs, there are a couple of usability features that we always get beat up on, which I am glad to see will be in 8.4. these are
Now the other niceties and usuability features which are nice but not quite as top of our list as the aforementioned. Note this far from an exhaustive list, but Robert Treat's 8.4 slide presentaton is pretty exhaustive:
(Page 1 of 1, totaling 11 entries)
Syndicate This Blog
Show tagged entries