Saturday, April 17. 2010
Printer Friendly
We just finished the first draft of the last chapter of our book: First look at PostGIS WKT Raster. This completes our hard-core writing and now on to more drafting,
polishing all the chapters.
In Chapter 13 we demonstrate how to use PostGIS WKT Raster functions by example and cross breed with PostGIS geometry functionality. I was pleasantly surprised to see how nicely the raster and geometry functions play together.
We had intended this chapter to be short about 20 pages in length, because how much can one say about pixels and pictures. As it turns out, a lot.
Rasters are more versatile than their picture portrayal on a screen. Rasters are a class of structured storage suitable for representing any numeric,
cell based data where each cell has one or more numeric properties (the bands). This covers quite a bit of data you collect with remote sensing and other electronic instrumentation. We had to stretch to over 30 pages; even then we felt we were missing some critical examples.
There is a lot of useful functionality in PostGIS WKT Raster
already and should make a lot of people looking for raster support in PostgreSQL very happy. Although the chapter may portray some scenes of violence and torture inflicted on elephants, you can rest assured
that it is pure illusion and no real elephants or blue elephant dolls were harmed in the making of this chapter.
As a side note -- our book is now listed on Amazon PostGIS in Action.
It is not available in hard-copy yet,but you can pre-order and of course you can order from PostGIS in Action from Manning directly
to get the chapter drafts we have posted, updates as we polish them, and the final book when it comes out in hard print.
The Amazon listing would have been so much more exciting, had they not stripped me of my last name or had Leo married to himself.
UPDATE: It appears I now have a last name again
In hind sight, I suppose OBE is more commonly seen as a title of honor rather than a last
name, so its only fitting that I should be stripped of mine and Tim Berners-Lee gets it tacked on at the end of his name.
To find out more about PostGIS WKT Raster, we encourage you to check out these links.
Now we'll itemize 10 things you can do now with PostGIS WKT Raster. In order to use PostGIS WKT Raster, you need PostGIS 1.3.5 or above. Preferably 1.4 or 1.5 or 2.0 alpha.
PostGIS WKT Raster is currently packaged as a separate library and we have windows binaries available.
Continue reading "PostGIS Raster its on: 10 things you can do NOW with raster"
Sunday, February 07. 2010
Printer Friendly
PostGIS 1.5.0 is finally out
I'm happy to report that after a long haul, we have finally released PostGIS 1.5.0.
Two months late, but there it is, and its a really great release I think; Perhaps the best release ever.
Details on what makes this release so special. The geodetic support.
Summary excerpted from Paul's slightly premature announcement
February 4, 2010
The PostGIS development team has, after a long course of reflection
and a detailed self-examination of our various personal failings,
decided to release PostGIS 1.5.0 to the public.
http://postgis.org/download/postgis-1.5.0.tar.gz
This new version of PostGIS includes a new "geography" type for
managing geodetic (lat/lon) data, performance-enhanced distance
calculations, GML and KML format readers, an improved shape loading
GUI, and other new features as well.
Especial thanks to:
* Dave Skea for algorithms and mathematics necessary to support
spherical geometry
* Nicklas Avén for the new performance enhanced distance calculations
and other distance-related functions
* Sandro Santilli for new buffering features (end caps and style options)
* Olivier Courtin for GML/KML input functions
* Guillaume Lelarge for support for the upcoming PgSQL 9.0
* George Silva for an example implementation of history tables
* Vincent Picavet for Hausdorff distance calculations
* The maintainers of GEOS, Proj4, and LibXML, without whom we would
have less of a spatial database
Love, the PostGIS project steering committee,
Mark Cave-Ayland
Kevin Neufeld
Regina Obe
Paul Ramsey
Continue reading "PostGIS 1.5.0 out and PLR working on Windows 8.3-8.4 installs"
Thursday, January 21. 2010
Printer Friendly
Every once in a while, especially if you have a fairly large database, you may find the need to do select backups of certain tables.
Your criteria might be based on name or how relatively recently data has changed in the table.
Below are some of the tricks we use. Some use our favorite hack of scripting command line scripts with SQL.
Continue reading "Making backups of select tables"
Saturday, January 09. 2010
Printer Friendly
UPDATE: Thanks all for the suggestions. For now we ended up increasing the
seq_page_cost from 1 to 2 in the database. That has gotten us back to our old much much faster speeds without change in code and seems to have
improved the speeds of other queries as well, without reducing speed of any.
ALTER DATABASE mydb SET seq_page_cost=2;
As Jeff suggested, we'll try to come up with a standalone example that exhibits the behavior. The below example was more to demonstrate the construct. Table names and fields were changed to protect the innocent so that is why we didn't bother showing explain plans. The behavior also seems to do
with the distribution of data and gets worse when stats are updated (via vacuum analyze). Didn't see this in PostgreSQL 8.3 and this was a system recently upgraded from 8.3 to 8.4
---ORIGINAL ARTICLE HERE --
This is a very odd thing and I think has happened to us perhaps once before.
Its a bit puzzling, and we aren't particularly happy with our work around because its
something that looks to a casual observer as a bit bizarre. The hack is setting the enable_seqscan setting
off for a particular query to force the planner to use indexes available to it.
What is particularly troubling about this problem, is that it wasn't always this way.
This is a piece of query code we've had in an application for a while, and its worked shall
I say really fast. Response times in 300 ms - 1 sec, for what is not a trivial query against a not
so trivially sized hierarchy of tables.
Anyrate, one day -- this query that we were very happy with, suddenly started
hanging taking 5 minutes to run. Sure data had been added and so forth, but that didn't
completely explain this sudden change of behavior. The plan it had taken had changed drastically.
It just suddenly decided to stop using a critical index it had always used. Well it was still using it but just on
the root table, not the children. Though querying a child directly proved that it still refused to use it,
so it didn't seem to be the hierarchy at fault here.
Continue reading "Forcing the planner's hand with set enable_seqscan off WTF"
Saturday, November 28. 2009
Printer Friendly
In this exercise, we'll go thru our steps for upgrading a Redhat Enterprise Linux 64-bit PostgreSQL
box from PostgreSQL 8.3 to PostgreSQL 8.4. If you don't have any kind of PostgreSQL installed on your box,
you can skip the Upgrade step.
UPDATE - since Devrim's move from Command prompt - he has started a new yum repository. You may want to use
this one instead since it seems more up to date than the other. http://yum.pgrpms.org/
UPDATEWe have instructions for installing PostgreSQL 9.0 via yum.
Updgrading from PostgreSQL 8.* to PostgreSQL 8.4
If you are starting from scratch -- just skip this section.
If you are upgrading from 8.4.0 to 8.4.1 you can get away with a simple
yum update postgresql and skip the rest of this article.
If you are upgrading from PostgreSQL 8.3 to 8.4, in theory you can use PgMigrator, but in practice,
particularly with a Yum install, you are bound to run into obstacles. If you are running an older version, you must dump and restore.
Continue reading "An almost idiot's guide to Install and Upgrade to PostgreSQL 8.4 with Yum"
Friday, November 06. 2009
Printer Friendly
The upcoming version of PostGIS - PostGIS 1.5 will be an exciting one. It has native geodetic support in the form of the new
geography type, similar in concept to SQL Server's geography support. For windows users, we have experimental binary builds hot off the presses for PostgreSQL 8.3 and 8.4
Continue reading "PostGIS does Geography"
Monday, October 26. 2009
Printer Friendly
This is an unfortunate predicament that many people find themselves in and does cause a bit of frustration. You bring in some tables into your PostgreSQL
database using some column name preserving application, and the casings are all preserved from the source data store. So now you have to quote all the fields
everytime you need to use them. In these cases, we usually rename the columns to be all lower case using a script. There are two approaches we have seen/can think of for doing this
one to run a script that generates the appropriate alter table statements and the other is to update the pg_attribute system catalog table directly.
Continue reading "Lowercasing table and column names"
Sunday, September 20. 2009
Printer Friendly
A while ago we wrote about DZone RefCards cheatsheets and how its a shame there isn't one for PostgreSQL. They are a very attractive and useful vehicle for learning and brushing up on the most important pieces
of a piece of software or framework. Since that time we have been diligently working on one for PostgreSQL to fill the missing PostgreSQL slot. The fruits of
our labor are finally out, and a bit quicker than we expected. The cheatsheet covers both old features and new features introduced in PostgreSQL 8.4. We hope its useful to many old and new PostgreSQL users.
The Essential PostgreSQL Refcard can be downloaded from Essential PostgreSQL http://refcardz.dzone.com/refcardz/essential-postgresql?oid=hom12841
Thursday, July 30. 2009
Printer Friendly
PostgreSQL has supported what are called Out (output) parameters since version 8.1.
We were surprised it has been that long since we always thought of it as a feature from 8.2+
until it recently came up for discussion on PostGIS newsgroup and we decided to investigate how
long it has been supported.
What are OUT parameters? These are parameters you define as part of the function argument
list that get returned back as part of the result. When you create functions, the arguments
are defaulted to IN parameters when not explicitly specified (which means they are passed in and not returned) which
is why you sometimes see PgAdmin do something like IN somevariable variabletype when you use
the function wizard.
You can have INOUT parameters as well which are function inputs that both get passed in, can be modified
by the function and also get returned.
As a side note - In 8.4, PostgreSQL was enhanced to allow dynamic sql RETURN QUERY using RETURN QUERY EXECUTE syntax for plpgsql queries and also
allow set returning functions being called in the SELECT part for any pl language. In prior versions,
this was only a feature of PL functions written in SQL.
8.3 introduced RETURN query which required a static sql statement, but did make things a bit easier.
One of the common use cases for using OUT parameters is to be able to return multiple outputs from a function without having
to declare a PostgreSQL type as output of the function. In this article we shall cover all variants of this. We'll just focus on sql and plpgsql for this
discussion, since we are not sure to what extent other pl languages (if at all) support IN OUT.
Continue reading "Use of OUT and INOUT Parameters"
Tuesday, June 09. 2009
Printer Friendly
You'll often hear the term planner statistics thrown around by database geeks. Did you update your statistics. This lingo isn't even limited
to PostgreSQL, but is part and parcel to how most decent databases work. For example in PostgreSQL you do a vacuum analyze to update your planner statistics in addition
to cleaning up dead space. In SQL Server you do an UPDATE STATISTICS . In MySQL you do an
ANALYZE TABLE or a more invasive OPTIMIZE TABLE .
Normally all this "update your stats so your planner can be happy" is usually unnecessary unless
you just did a bulk load or a bulk delete or you are noticing your queries are suddenly slowing down. These stat things are generally updated behind the scenes by most databases
on an as needed basis.
What makes SQL really interesting and a bit different from procedural languages is that it is declarative (like functional and logical programming languages) and relies on the database planner to come up with strategies for navigating the data. Its strategy is not fixed as it is in procedural languages.
A big part of this strategy is decided on by the query planner which looks at distributions of data. Given different WHERE conditions for similar queries, it could come up with vastly different strategies if one value has a significantly
higher distribution in a table than another. This is also the mystery of why it sometimes refuses to use an index on a field because it has decided a table scan is more efficient and also why some people consider HINTS evil because they pollute the imperative nature of the language.
Continue reading "Planner Statistics"
Thursday, May 21. 2009
Printer Friendly
Someone asked me this recently and not playing with custom types much, I'm not sure this is
the best solution. Anyrate here it goes. Note this works for PostgreSQL 8.2 and above. Note sure about 8.1 etc.
I assume it probably does.
Let us say you created a custom type something like this:
CREATE TYPE my_type1 AS
(name varchar(150),
rotation_x double precision,
rotation_y double precision,
x_pos integer,
y_pos integer
);
Continue reading "Creating instance of custom type"
Wednesday, April 29. 2009
Printer Friendly
Simon Greener, wrote an article on how to load GPX xml files into Oracle XMLDB. That
got me thinking that I haven't really explored all the XML features that PostgreSQL has to offer
and to some extent I've been reticent about XML processed in any database for that matter.
In this article we shall attempt to perform the same feats that Simon did, but with PostgreSQL instead of
Oracle XMLDB. Note while we are demonstrating this with a GPX file, the same XPath approach can be used to process any XML file.
Continue reading "Loading and Processing GPX XML files using PostgreSQL"
Sunday, April 26. 2009
Printer Friendly
Different Linux distros have their preferred place of where stuff goes and of course the default location on windows is completely different from that too. So there isn't really one default location where you can find PostgreSQL data cluster. Of course user's can pick their
locations as well. So what is a casual DBA supposed to do?
The pg_settings table
PostgreSQL has a convenient system table view called pg_settings that stores a lot of information. It stores the location of the data cluster, the pg_hbafile and other conf files.
In additon to that you can interogate it to find out information you will find in the postgresql.conf file. Why sift thru that postgresql.conf file (assuming you can already query your postgresql server) when you can find the answers you are looking
for with an SQL query?
Continue reading "Where is my data and other stuff"
Saturday, April 18. 2009
Printer Friendly
This question is one that has come up a number of times in PostGIS newsgroups worded in many different ways. The situation is that if you use a function a number of times
not changing the arguments that go into the function, PostgreSQL still insists on recalculating the value even when the function is marked IMMUTABLE. I have tested this on
8.2 and 8.3 with similarly awful results.
This issue is not so much a problem if function calculations are fast, but spatial function calculations relative to most other functions you will use are pretty
slow especially when dealing with large geometries. As a result your query could end up twice as slow. Even setting the costs of these functions to relatively high does not help the situation.
To demonstrate here is a non-PostGIS version of the issue that everyone should be able to run and demonstrates its not a PostGIS only issue.
Continue reading "How to force PostgreSQL to use a pre-calculated value"
Saturday, March 28. 2009
Printer Friendly
Even though others have blogged about this in the past and its well-documented in the docs, its a frequently enough asked question, that we thought we'd post it here again
with a couple of additional twists.
How to determine the size of a database on disk
SELECT pg_size_pretty(pg_database_size('somedatabase')) As fulldbsize;
How to determine the size of a database table on disk
NOTE: There are two functions in PostgreSQL - pg_relation_size and pg_total_relation_size. The pg_relation_size just measures the size of the actual table where
as the pg_total_relation_size includes both the table and all its toasted tables and indexes.
SELECT pg_size_pretty(pg_total_relation_size('someschema.sometable')) As fulltblsize,
pg_size_pretty(pg_relation_size('someschema.sometable')) As justthetblsize;
How to determine the size of a database schema
When you are doing a lot of data loading for later massaging to dump into your production tables, you will often generate junk tables in the process and worse yet other
people may generate junk tables period. As a general practice, we like to create a schema just for junk. The reason for that is that it is extremely easy to
exclude schemas from being backed up and to load individual schemas. So with this approach we create or move all tables we are not quite sure if they are needed or we only need temporarily
into a schema
called junk, scratch, possiblejunk or whatever you want to call the schema and exclude it from our backups.
Something of the form:
CREATE SCHEMA scratch;
ALTER TABLE someschema.joeyplaying SET SCHEMA scratch;
After saving the world from junk, you would like to know how much space you have saved your backups from backing up. So you would employ a query something of the form:
SELECT pg_size_pretty(pg_database_size('mycurrentdb')) As fullprod,
pg_size_pretty(CAST(pg_database_size('mycurrentdb') - (SELECT SUM(pg_total_relation_size(table_schema || '.' || table_name) )
FROM information_schema.tables WHERE table_schema = 'scratch') As bigint)) As tobebackedup_size,
pg_size_pretty(CAST((SELECT SUM(pg_total_relation_size(table_schema || '.' || table_name) )
FROM information_schema.tables
WHERE table_schema = 'scratch') As bigint) ) As junk_size;
Size of Geometries in PostGIS tables
PostGIS has a companion function for measuring geometry size which is useful when you want to get a sense of how much space your geometries are taking up on disk
for a set of records.
SELECT ST_Mem_Size(ST_GeomFromText('LINESTRING(220268 150415,220227 150505,220227 150406)'));
SELECT pg_size_pretty(CAST(SUM(ST_Mem_Size(the_geom)) As bigint) ) as totgeomsum
FROM sometable WHERE state = 'MA';
|