A while back I mentioned to Bborie (aka dustymugs) and Sandro (aka strk): We're missing ability to label our images with text.
Picture this: What if someone has got a road or a parcel of land and they want to stamp
the parcel id or the road name on it and they don't want to have to depend on some piece of mapping software. Many report writers and databases can talk to each other directly
such as we demonstrated in Rendering PostGIS raster graphics with LibreOffice
and we've got raster functionality in the database already. Can we cut out the middleman?
Some ideas came to mind. One we could embed a font lib into PostGIS thus having the ultimate bloatware minus the built-in coffee maker.
strk screeched. He's still recovering from my constant demands of having him upgrade his PostgreSQL version. Okay fair enough.
Bborie suggested why don't you import your letters as rasters and then vectorize them. So that's exactly what we are going to demonstrate in this article
and in doing so we will combine some of the new features coming in PostGIS 2.1 we've been talking about in Waiting for PostGIS 2.1 series.
Warning this article will have a hint of Rube Goldbergishness in it.
If you are easily offended by seeing stupid tricks done with SQL, stop reading now.
We are going to take a natural vector product and rasterize it just so
we can vectorize it again so we can then rasterize it again. Don't think about it too much. It may trigger activity in parts of your brain you didn't know were there
thus resulting in stabbing pains similar to what you experience by quickly guplping down a handful of Wasabi peas.
So here are the steps for creating your own font set you can
overlay on your geometries and rasters.
See if you can spot the use of window functions and CTEs in these examples.
Step 1: Find a font you like
I haven't tried, but in theory you can just write out letters evenly spaced on a piece of paper and scan the paper so you have a single image.
This would allow you to automate your own writing in the database thus allowing your database be able to impersonate you. A very exhilarating, but slightly disturbing idea.
Too many years of using a keyboard has made my technical drawing skills less than legible, so instead:
Opened up a spreadsheet and in each cell typed in a different letter across with the font.
Took a snapshot image of masterpiece spreadsheet. Which gave this image:
Imported this image into PostGIS with raster2pgsql using command-line packaged tool: raster2pgsql -F -Y pics/fonts/kankin.png font_pics | psql -U postgres -d testpostgis210 -h localhost -p 5432
For the rest of these steps you are going to need to be running a fairly recent PostGIS 2.1.0 development version which has the ubber-cool ST_Tile function which we demonstrated here
and is just well super cool.
Step 2: Chop your image into letter bits, reclass, vectorize
For this exercise I'm assuming the letters are evenly spaced. One of the changes in 2.1.0 is that ST_Polygon function ALWAYS returns a multipolygon.
Prior versions sometimes returned Polygons.
Do a quick histogram using the raster ST_Histogram function to get a sense of where the pixel values that repesent letters. In a simple world every thing would be as clear as black and white
. You'd only have 2 values: that which represents
the drawing of a letter and that which represents whitespace. But our world has all this noise caused by differences in pressure you put on your pen, the ink running out and so forth,
and the imperfectness of your snapshotting. Our histogram demonstates the issue.
WITH fp AS(SELECT ST_Tile(ST_Band(rast,1),ST_Width(rast)/36, ST_Height(rast)) As rast
FROM font_pics LIMIT 1 )
FROM (SELECT ST_Histogram(rast) As h
FROM fp) As foo;
The output of the above query is shown below and was so fast I didn't even bother timing it - so as you can see our letter drawing probably falls in the 0-30 range.
Looking at the above and eyeballing looks like we want to keep 0 - 38 and chuck the rest. What we need is a way to reclassify our world into black and white.
Luckily PostGIS 2+
has a function ST_Reclass that allows us to chuck the pixel values we don't want.
Now we reclassify and vectorize. See the Seamless Vector / Raster operations flying around like a lab experiment gone nuts.
We leave it as an exercise to the reader to figure out what we are doing in this snippet -- or you can read the Second edition of PostGIS in Action book in our upcoming Chapter 7.
Chapter 7 first draft will be available on MEAP next month and we hope to explain some of these operations.
INSERTINTOfont_set(letter, geom, font_name)WITH fp AS(SELECTST_Tile(ST_Band(rast,1),ST_Width(rast)/36, ST_Height(rast))As rast
FROM font_pics WHERE filename ='kankin.png')
, fpi As(SELECT(ROW_NUMBER()OVER())::integerAs pos, rast
FROM fp)SELECTsubstring('ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', pos,1)As letter
, 'kankin'FROM fpi;
Putting our fontset to work
So now we've got a table of vectorized letters. To make our new vectorized fontset easier to use, we create a word function.
CREATEORREPLACEFUNCTIONget_word_vector(param_word text, param_font_name text)RETURNS geometry
WITH letters As(SELECT geom, width, sum(width +1)OVER(ORDERBy pos)As run_length
FROM(SELECT geom, letter,ROW_NUMBER()OVER()As pos, ST_XMax(geom)As width
, ST_YMax(geom)As height
FROM font_set CROSSJOINregexp_split_to_table($1, E'')As l
WHERE letter = l AND font_name = $2)As foo
)SELECTST_Union(ST_Translate(geom, run_length - width,0))As word_geom
And now we use it to write a somewhat complicated looking SQL statement that can probably be wrapped into a function and be significantly shorter.
This utilizes the ST_Union aggregate function for raster, which is much improved in speed in upcoming 2.1.
You know what happens if you could take this (in a cleaned up or more elegant manner) and took it to it's endpoint? PDF reports all generated through SQL using GDAL's Geospatial PDF and fonts as vectors loaded using Apache's Batik's SVG Font Converter. I'm currently generating PDF reports programmatically but having to call command line tools sucks.
That would be cool. I was thinking about that that I have to do PDFs the old yucky way at the moment and woudl be nice to have my PDF just come out of the statement and dump to browser like any other query, but for complex reports I'm still attached to my report writer, but at least the companion image would just be a query which would be huge improvement :)