Wednesday, September 24. 2008
Printer Friendly
We've been playing around with the snapshot builds of PgAdmin III 1.9 and would like to summarize some
of the new nice features added. PgAdmin III 1.9 has not been released yet, but has a couple of neat features brewing.
For those interested in experimenting with the snapshot builds and src tarballs, you can download them from http://www.pgadmin.org/snapshots/
Continue reading "PgAdmin III 1.9 First Glance"
Sunday, September 14. 2008
Printer Friendly
OpenJump is a Java Based, Cross-Platform open source GIS analysis and query tool. We've been using it a lot lately, and I would
say out of all the open source tools (and even compared to many commercial tools) for geospatial analysis, it is one of the best out there.
While it is fairly rich in functionality in terms of doing statistical analysis on ESRI shapefile as well as PostGIS and other formats and also has numerous geometry manipulation features and plugins in its tool belt,
we like the ad-hoc query ability the most. The ease and simplicity of that one tool makes it stand out from the pack. People not comfortable with SQL may not appreciate that feature as much as we do though.
In this excerpt we will quickly go thru the history of project and the ties between the PostGIS group and OpenJump group,
how to install, setup a connection to a PostGIS enabled PostgreSQL database and doing some ad-hoc queries.
Quick History Lesson
- OpenJump is descended from Java Unified Mapping Platform - JUMP which was incubated by Vivid Solutions.
- OpenJump and the whole JUMP family tree have Java Topology Suite (JTS) as a core foundation of their functionality.
- GEOS which is a core foundation of PostGIS functionality and numerous other projects, is a C++ port of JTS. New Enhancements often are created in JTS and ported to GEOS and a large body of GEOS work has been incubated
by Refractions Research, the PostGIS incubation company.
- For more gory details about how all these things are intertwined, check out Martin Davis' recount of the history of GEOS and JTS.
Continue reading "OpenJump for PostGIS Spatial Ad-Hoc Queries"
Sunday, September 07. 2008
Printer Friendly
One of the nice things about the PostgreSQL command-line restore tool is the ease with which you can restore
select objects from a backup. We tend to use schemas for logical groupings which are partitioned by context, time, geography etc.
Often times when we are testing things, we just want to restore one schema or set of tables from our backup because restoring a 100 gigabyte database
takes a lot of space, takes more time and is unnecessary for our needs. In order to be able to accomplish such a feat, you need to
create tar or compressed (PG custom format) backups. We usually maintain PG custom backups of each of our databases.
Continue reading "How to restore select tables, select objects, and schemas from Pg Backup"
Printer Friendly
As we mentioned in a previous article, RECURSIVE queries, often referred to in SQL ANSI specs and by DB2 and SQL Server as
Common Table Expressions (CTE) will make it into the 8.4 release and can already be found in the dev source. Technically CTE is a
superset and RECURSIVE queries are a subclass of CTE. Looks like basic windowing functionality will make it in 8.4 as well.
A summary of where your favorite patches are at can be found at the September 2008 PostgreSQL 8.4 commit-fest summary page http://wiki.postgresql.org/wiki/CommitFest:2008-09.
Continue reading "CTEs and Windowing Functions in 8.4"
Printer Friendly
One thing I'm really looking forward to have in the upcoming PostgreSQL 8.4 is the introduction
of the WITH RECURSIVE feature that IBM DB2 and SQL Server 2005 already have. Oracle has it too but in a non-standard CONNECT BY so is much less portable.
This is a feature that is perhaps more important to
us for the kind of work we do than the much complained about lack of windowing functions.
I was recently taking a snoop at IBM DB2 newsletter. Why I read magazines and newsletters on databases I don't even use I guess is to see what I'm missing out on
and to sound remotely educated on the topic when I run into one of those people. I also have a general fascination with magazines.
In it their latest newsletter they had examples of doing Fibonacci and Graphs with Common Table Expressions (CTEs).
Robert Mala's Fibonacci CTE
Robert Mala's Graph CTE
Compare the above to David Fetter's Fibonacci Memoizing
example he posted in our comments way back when.
I'd be interested in seeing what solutions David and others come out with using new features of 8.4. We can see a before 8.4 and after 8.4 recipe.
As a slightly off-topic side note - of all the Database magazines I have read - Oracle Magazine is the absolute worst. SQL Server Magazine and IBM DB2 are pretty decent.
The real problem is that Oracle's magazine is not even a database magazine.
Its a mishmash of every Oracle offering known to man squashed into a compendium that can satisfy no one. You would think that Oracle as big as their database is
would have a magazine dedicated to just that.
Perhaps there is another magazine besides Oracle Magazine, but haven't found it so I would be interested to know if I missed something.
Sunday, August 31. 2008
Printer Friendly
Common Case Scenario:
You have a very aggravated person who demands you purge their email from any table you have in your system.
You have lots of tables that have email addresses. How do you find which tables have this person's email address.
Below is a handy plpgsql function we wrote that does the following. Given a search criteria, field name pattern, table_name pattern,
schema name pattern, data type pattern, and max length of field to check, it will search all fields in the database fitting those
patterns and return to you the names of these schema.table.field names that contain the search phrase.
To use the below you would do something like:
SELECT pc_search_tablefield('%john@hotmail%', '%email%', '%', '%', '%', null);
The above will return all database field names that have the phrase email in the field name and that contain the term john@hotmail
Continue reading "How to determine if text phrase exists in a table column"
Tuesday, August 12. 2008
Printer Friendly
Microsoft Access has these peculiar set of aggregates called First and Last. We try to avoid them because while the concept is useful, we find Microsoft Access's implementation of them
a bit broken. MS Access power users we know moving over to something like MySQL, SQL Server, and PostgreSQL often ask - where's first and where's last?
First we shall go over what exactly these aggregates do in MS Access and how they are different from MIN and MAX and what they should do in an ideal world. Then we shall create our ideal
world in PostgreSQL.
Continue reading "More Aggregate Fun: Who's on First and Who's on Last"
Sunday, August 10. 2008
Printer Friendly
One of the things we love most about PostgreSQL is the ease with which one can define new aggregate functions with even a language as
succinct as SQL. Normally when we have needed a median function, we've just used the built-in median function in PL/R as we briefly demonstrated in
Language Architecture in PostgreSQL.
If all you demand is a simple median aggregate function ever then installing the whole R statistical environment so you can use PL/R is overkill and much less
portable.
In this article we will demonstrate how to create a Median function with nothing but the built-in PostgreSQL SQL language, array constructs,
and functions.
Continue reading "Build Median Aggregate Function in SQL"
Friday, August 01. 2008
Printer Friendly
Every once in a while - particularly if you are using inherited tables, you forget to put an important index on one of your tables
which bogs down critical queries. Its sometimes convenient to inspect the index catalog to see what tables are missing indexes or
what tables are missing a critical index. Normally we try to stick with querying the information_schema because queries against that
schema work pretty much the same in PostgreSQL as they do in SQL Server and MySQL. For most of the examples below we had to delve into pg_catalog schema territory
since there was no view we could find in information_schema that would give us enough detail about indexes.
Continue reading "How to determine which tables are missing indexes"
Sunday, July 27. 2008
Printer Friendly
Backup and Restore is probably the most important thing to know how to do when you have a database with data you care about.
The utilities in PostgreSQL that accomplish these tasks are pg_restore, pg_dump, pg_dumpall, and for restore of plain text dumps - psql.
A lot of the switches used by pg_dump, pg_restore, pg_dumpall are common to all three and on rare cases, the switches used by each overlap but mean different things.
pg_dump and pg_restore are complementary. You use pg_dump to do hot backups of a database and pg_restore to restore it either to another database or to recover portions of a database.
Rather than trying to keep track of which switch works with which, we decided to combine all into a single cheat sheet with a column denoting which utility the switch is supported in.
Pretty much all the text is compiled from the --help switch of each.
Below is a Thumbnail view of the PostgreSQL 8.3 Dump Restore cheat sheet
that covers PostgreSQL 8.3 pg_dump, pg_dumpall, pg_restore utilities.
PDF Portrait version 8.5 x 11" of this cheatsheet is available at PostgreSQL 8.3 Dump Restore 8.5 x 11 and also available in
PDF A4 format and HTML.
Saturday, July 19. 2008
Printer Friendly
As many people who know us know we sit on several camps especially when it comes to databases.
The camps we sit on are growing rather than shrinking.
While we do have our favorites, we understand that peoples needs and comfort levels are different from ours and we try to take that into
consideration when making recommendations to people. The only thing that is generally true about the clientele we consult for is that they
fit one of the following features:
- Very minimal bureaucratic structure - this generally rules out most fortune 500 companies
and shall we say smaller companies who are too bureaucratic for their own good
- Dot com startups/Niche product developers who are looking to keep costs down to a minimum without too much fuss and are trying to produce a product to change the world
- Small companies who have a relatively low IT budget, but are predominantly windows-based
- Mid-sized companies predominantly windows-based or departments with decent IT staff,
who are looking for something their staff can easily maintain rather than simply keeping licensing costs down
It has come up as a topic of discussion, now that SQL Server 2008 is coming out soon and with its new fangled geodetic spatial support,
how does this change things?
The short answer is - not much except to increase awareness of spatial databases and to give us more business. As part of our due diligence work
we have put together a comparison of the 3 databases spatial functionality -
Cross Compare SQL Server 2008 Spatial, PostgreSQL/PostGIS 1.3-1.4, MySQL 5-6
to compliment our Cross Compare of SQL Server, MySQL, and PostgreSQL
Continue reading "More Database Comparisons"
Monday, July 14. 2008
Printer Friendly
Programming Design Patterns define recommended approaches of solving common application problems. Within design patterns is a subset of design patterns called Idioms.
Idioms you can think of as a strategy for expressing recurring constructs or if you will sub-problems and often take advantage of the special features of a language.
They tend to be specific to a programming language and can not be reused
in other languages they were not specifically designed for. To demonstrate the differences lets compare two design patterns we commonly use.
Continue reading "SQL Idiom Design Patterns"
Sunday, July 06. 2008
Printer Friendly
In our April Issue An Almost Idiot's Guide to PostgreSQL YUM
we covered using the new PostgreSQL Yum repository to install the PostgreSQL 8.3.1 release on Fedora, RedHat Enterprise, and CentOS. We also received numerous useful feedback from others on issues they
ran into and how they overcame them. The blog comments are definitely worth a read.
Now that 8.3.3 has come out, many of you should be considering upgrading if you haven't already since there are a couple of bug fixes as outlined in
http://www.postgresql.org/docs/8.3/static/release-8-3-2.html, http://www.postgresql.org/docs/8.3/static/release-8-3-3.html,
and for those running 8.3.0 you will need to reindex your tables after as noted in http://www.postgresql.org/docs/8.3/static/release-8-3-1.html.
If you are running version 8.3.1 and above then
this is a fairly painless upgrade that just requires you to backup your data as a precautionary measure, but doesn't require a dump reload.
Continue reading "YUM 2: Almost Idiot's Guide to upgrade from PostgreSQL 8.3.1 to 8.3.3"
Wednesday, July 02. 2008
Printer Friendly
A lot of this information is nicely tucked away in the PostgreSQL docs in http://www.postgresql.org/docs/8.3/interactive/ddl-inherit.html,
but since the docs are so huge and rich, one may tend to miss
these things.
While there are numerous interesting use cases for the PostgreSQL inheritance structure, one of the key reasons people use it is for table partitioning strategies.
How do you make a stand-alone table a child of another table?
The first question that comes to mind is why would you ever need a table to adopt another table. There are 2 reasons that come to mind.
- When you are loading huge amounts of data especially of a read only nature - its often convenient to not have that table be visible to your applications until
you are done with the loading process. So you may want to make it a child after the loading.
- Your tables seemed fairly unrelated when you started out and then one day you realized you really were talking about apples and apples and need to report on them together at a higher level.
One situation like this to give a somewhat real-world perspective - lets say you developed a timesheet app for an organization and each department insisted on having their own version of the app and each along with
the basic fields needed to track some additional ones of their own. Then higher forces
came in and said I need to know what everyone is doing, but I don't need to see all that other crap they keep track of.. Two options come to mind - create a bunch of views
that union stuff together or institute a round-up-the-children-and-adopt-them program.
This fits into one of the categories of things that PostgreSQL lets you do that PgAdmin III doesn't have a graphical way to let you do it.
If you try to inherit in PgAdmin III from a table that already exists, that option is just greyed out. So you have to resort to DDL SQL statements. Luckily its fairly trivial. Well this really only works
for PostgreSQL 8.2+. I don't think PostgreSQL 8.1 and below supported INHERIT/NO INHERIT in the ALTER TABLE statement.
Continue reading "How to Inherit, Unherit and Merge Inherit"
Wednesday, June 18. 2008
Printer Friendly
First this is a windows only package, but nevertheless sweet. In our article
What can PostgreSQL learn from MySQL?
we complained about the fact that there is nothing like Server2GO pre-packaged with PostgreSQL. Low and behold comes this thing
called Portable GIS 1.2 which can be downloaded from http://www.archaeogeek.com/blog/portable-gis/.
This is similar in architecture to Portable Apps. Its a suite of applications you can run from your USB drive without having to
reboot your windows computer.
I'm not sure if a similar thing exists for Linux, but would be nice to
know if it does. Note: all the packages this portable tool set comes with work on Linux and most started life on Linux, so it seems to me it should
not be too hard to make a Linux port of this if it doesn't already exist. Also most of these tools work on Mac OSX as well so a similar package can be made for Mac OSX.
Continue reading "Portable GIS: PostgreSQL and PostGIS on a USB Stick"
|