postgresql versions

Sunday, November 28. 2010

Quick Intro to R and PL/R - Part 1

Recommended Books: Free R reference card / Cheat Sheet, PostGIS In Action, R in Action, R In a Nutshell

In this article we'll provide a summary of what PL/R is and how to get running with it. Since we don't like repeating ourselves, we'll refer you to an article we wrote a while ago that is still fairly relevant today, Up and Running with PL/R (PLR) in PostgreSQL: An almost Idiot's Guide, and just fill in the parts that have changed. We should note that that series was geared more toward the spatial database programmer (PostGIS in particular). There is a lot of overlap between the PL/R, R, and PostGIS user bases, which consist largely of environmental scientists and researchers who need powerful charting and stats tools to analyse their data and who sit high on the smart but low on the money end of the human spectrum.

This series will take more of a general PL/R user's perspective. We'll follow much the same style we used in Quick Intro to PL/Python, and we'll end the series with a PL/R cheatsheet similar to the one we made for PL/Python.

As stated in our State of PostGIS article, we'll be using log files we generated from our PostGIS stress tests. These stress tests were auto-generated from the official PostGIS documentation. The raster tests comprise 2,095 query executions exercising all the supported pixel types. The geometry/geography tests comprise 65,892 spatial SQL queries exercising every geometry/geography type supported in PostGIS 2.0 -- yes, this includes TINs, Triangles, Polyhedral Surfaces, Curved geometries and all dimensions of them. Most queries are unique. If you are curious to see what these log tables look like or want to follow along with these exercises, you can download the tables from here.

What is R and PL/R and why should you care?

R is both a language and an environment for doing statistics and generating graphs and plots. It is licensed under the GNU GPL and is a common favorite of universities and research institutions. PL/R is a procedural language for PostgreSQL that allows you to write database stored functions in R. R is a set-based, domain-specific language similar to SQL, except that unlike relational databases, it thinks of data as matrices, lists and vectors. I tend to think of it as a cross between LISP and SQL, though more experienced Lisp and R users will probably disagree with me on that. This makes it easier in many cases to tabulate data both across columns and across rows. The examples we will show in these exercises could be done in SQL, but they are much more succinct to write in R. In addition to the language itself, there is a whole wealth of statistical and graphing functions available in R that you will not find in any relational database, and these keep growing as more people contribute packages. Its packaging system, the Comprehensive R Archive Network (CRAN), is similar in concept to Perl's CPAN and the in-the-works PGXN for PostgreSQL.
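To make the idea of a stored function written in R a little more concrete, here is a minimal sketch. It assumes PL/R is already installed in the current database; the function name r_median and the sample values are made up for illustration and do not come from the series.

```sql
-- Hypothetical example: expose R's built-in median() as a PostgreSQL function.
-- Assumes the plr language is already installed in this database.
CREATE OR REPLACE FUNCTION r_median(vals float8[])
RETURNS float8 AS
$$
    # The body is plain R; the float8[] argument is handed to R as a numeric vector.
    median(vals)
$$
LANGUAGE plr;

-- Call it like any other SQL function.
SELECT r_median(ARRAY[1, 5, 3]::float8[]);
```

The point of the sketch is only that the function body is R while the declaration and the call are ordinary SQL.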

Continue reading "Quick Intro to R and PL/R - Part 1"

Posted by Leo Hsu and Regina Obe in 9.0, gis, intermediate, pl programming, PLR at 14:19 | Comments (4) | Trackback (1)

Tuesday, November 23. 2010

The State of PostGIS, Joys of Testing, and PLR the Prequel

Recommended Books: PostGIS In Action, R in Action

I've always enjoyed dismantling things. Deconstruction was a good way of analyzing how things were built by cataloging all the ways I could dismantle or destroy them. I experimented with mechanical systems, electrical circuitry, chemicals and biological systems sometimes coming close to bodily harm. In later years I decided to play it safe and just stick with programming and computer simulation as a convenient channel to enjoy my destructive pursuits. Now getting to the point of this article.

In later articles, I'll start to demonstrate the use of PL/R, the procedural language for PostgreSQL that allows you to program functions in the statistical language and environment R. To make these examples more useful, I'll be analyzing data generated from PostGIS tests I've been working on for stress testing the upcoming PostGIS 2.0. PostGIS 2.0 is a major release and probably the most exciting one for us. Paul Ramsey recently gave a summary talk on the past, present and future of PostGIS, State of PostGIS, at FOSS4G Japan (http://www.ustream.tv/recorded/10667125), which provides a brief glimpse of what's in store in 2.0.

Continue reading "The State of PostGIS, Joys of Testing, and PLR the Prequel"

Posted by Leo Hsu and Regina Obe in 9.0, editor note, gis, PLR, postgis at 18:22 | Comments (0) | Trackbacks (0)

Sunday, November 21. 2010

PostgreSQL 9.0 pg_dump, pg_dumpall, pg_restore cheatsheet overview

Recommended Books:

PostgreSQL 9.0 High Performance

PostgreSQL 9.0 Admin Cookbook

PostGIS in Action

Backup and Restore is probably the most important thing to know how to do when you have a database with data you care about.

The utilities in PostgreSQL that accomplish these tasks are pg_restore, pg_dump, pg_dumpall, and, for restoring plain text dumps, psql.

Many of the switches used by pg_dump, pg_restore and pg_dumpall are common to all three. You use pg_dump to back up a single database or selected database objects, and pg_restore to restore that backup either to another database or to recover portions of a database. You use pg_dumpall to dump all your databases in plain text format.

Rather than trying to keep track of which switch works with which, we decided to combine all into a single cheat sheet with a column denoting which utility the switch is supported in. Pretty much all the text is compiled from the --help switch of each.

We created a similar Backup and Restore cheatsheet for PostgreSQL 8.3, and since then some new features have been added, such as the parallel restore (jobs) feature in 8.4. We have now created an updated sheet covering all features present in the pg_dump, pg_restore and pg_dumpall command line utilities packaged with PostgreSQL 9.0.

A PDF portrait version (8.5 x 11") of this cheatsheet is available at PostgreSQL 9.0 Dump Restore 8.5 x 11; it is also available in PDF A4 format and HTML.

As usual, please let us know if you find any errors or omissions and we'll be happy to correct them.

Posted by Leo Hsu and Regina Obe in 9.0, beginner, special feature at 20:35 | Comments (7) | Trackbacks (0)

Friday, November 05. 2010

greatest and least - oldies but goodies

Problem: You have a set of numbers, or characters, or whatever, and you are trying to find the max or min of this set.

If the values are separate records in a table or query, the answer is well known and respected across all relational databases -- use the aggregate MAX and MIN functions.

But what if you have a set of free-wheeling numbers or text values that are not in separate records, and you want the max or min of them? Here is where the least and greatest functions come in handy.

PostgreSQL has had these functions for as far back as I can remember and is not the only database to sport them. Our beloved MySQL and Oracle databases have them as well. Sadly, our more beloved SQL Server, even in its SQL Server 2008 variant, lacks them.

Okay, so how do you use these functions? Like this:

    SELECT least(1,-2,5) As num_least, greatest('Bobby', 'Catty', 'Kitty') As greatest_cat;

Yields:

     num_least | greatest_cat
    -----------+--------------
            -2 | Kitty
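
The difference from the aggregates is easiest to see when the values sit side by side in columns. The following is a small sketch using made-up data (the scores derived table and its column names are hypothetical): greatest and least compare values within each row, while MAX and MIN compare one column across rows.

```sql
-- Hypothetical data: three test scores per student, stored as columns.
-- greatest/least compare across the columns of each row.
SELECT student,
       greatest(test1, test2, test3) AS best_score,
       least(test1, test2, test3)    AS worst_score
FROM (VALUES ('Ann', 70, 85, 90),
             ('Bob', 95, 60, 75)) AS scores(student, test1, test2, test3);

-- MAX/MIN instead aggregate a single column across all rows.
SELECT max(test1) AS max_test1, min(test1) AS min_test1
FROM (VALUES ('Ann', 70, 85, 90),
             ('Bob', 95, 60, 75)) AS scores(student, test1, test2, test3);
```

The two can also be combined, for example max(greatest(test1, test2, test3)) to get the single highest score in the whole set.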

We would classify these functions along the lines of COALESCE. They are like COALESCE because they take an arbitrary number of arguments, and the datatype returned is the highest datatype that all the arguments can be autocast to. If there is no autocast, well, you get an error. To demonstrate, guess what happens when you do this:

    SELECT least(-1, 'Kitty');

Do this in PostgreSQL, at least in 8.3+, and you get a nice slap if you haven't installed any deprecated autocasts:

    ERROR: invalid input syntax for integer: "Kitty"
    LINE 1: SELECT least('Kitty', -1)

Do this in MySQL, so friendly and forgiving and such a great reader of minds, and you get:

-1
OF COURSE

I apologize for the ambiguous sarcasm; it's just that sometimes I want my mind read and sometimes I don't, and I just can't figure out whether today is one of those days or the other.
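
Back on the PostgreSQL side, here is a short sketch of the COALESCE-like type behavior described above. The NULL handling noted in the comments is what the PostgreSQL documentation describes; other databases may behave differently, so treat it as an assumption anywhere else.

```sql
-- Integer and numeric arguments resolve to a common type (numeric).
SELECT least(1, 2.5) AS num_least;

-- In PostgreSQL, NULL arguments are ignored; the result is NULL
-- only if every argument is NULL.
SELECT greatest(1, NULL, 3) AS ignores_null;

-- If a text comparison is really what you want, say so with an explicit cast.
-- The winner then depends on text ordering (collation), not numeric value.
SELECT least(CAST(-1 AS text), 'Kitty') AS text_least;
```

Spelling out the cast is the PostgreSQL way of saying which of the two readings you meant, rather than leaving the database to guess.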

Posted by Leo Hsu and Regina Obe in 8.2, beginner, mysql, oracle, other dbms, q&a, sql server at 01:50 | Comments (0) | Trackbacks (0)

Wednesday, November 03. 2010

PostgreSQL 9 Admin Cookbook Book Review

I was excited when PostgreSQL 9 Admin Cookbook by Simon Riggs and Hannu Krosing and its companion book, PostgreSQL 9 High Performance by Greg Smith, became available. All three authors are well-known experts in the PostgreSQL community and consultants at 2ndQuadrant, so you are sure to learn a lot from both books. Both books are published by Packt Publishing and can be bought directly from Packt Publishing or via Amazon. Packt is currently running a 50% off sale if you buy both books (e-Book version) directly from Packt. In addition, Packt offers free shipping for the US, UK, Europe and select Asian countries. The pair of books make attractive companions.

The main thing I felt was missing in this duo was a book dedicated to "PostgreSQL: The Platform", one that would cover all the various PL languages and the various neat ways PostgreSQL is used and has been extended by many to do things one would not normally expect of a database. Some day perhaps someone will write such a book.

This article is a review of PostgreSQL 9 Admin Cookbook, and we'll be following up later with PostgreSQL 9 High Performance.

This is my first book review. I have a lot of patience for writing, but little patience when it comes to reading. That said, I found PostgreSQL 9 Admin Cookbook an easy and enjoyable read, and a book that I managed to learn more tricks from than I care to admit. It is a handy book to have for reference regardless of whether you consider yourself a novice, intermediate or advanced user.

As the book title suggests, it's a cookbook, but a cookbook that combines a question and answer style with a discussion style of writing. The tasks are neatly categorized into 12 chapters and each task smoothly builds on previous tasks discussed. It is still categorized in such a way that you can jump to a particular task you are currently having problems with without having read the other parts of the book.

Although it is titled PostgreSQL 9 -- it covers earlier versions as well.

Continue reading "PostgreSQL 9 Admin Cookbook Book Review"

Posted by Leo Hsu and Regina Obe in 9.0, product showcase at 17:49 | Comments (4) | Trackbacks (0)

Entries from November 2010
