beginner - 3 - Postgres OnLine Journal

Monday, July 05. 2010

What is new in PgAdmin III 1.12.0

Recommended Books: PostgreSQL 8.4 SQL Reference

Judging from the number of hits on our What's new in PostgreSQL 9.0 article, the large number of Slashdot responses, and the fair number of Reddit responses, a lot of people are really excited about the upcoming PostgreSQL 9.0, or at least have a lot of opinions about what is still missing in it.

For this discussion, we would like to point out one of the companion administration tools that will be packaged with PostgreSQL 9.0 (and is currently packaged with beta 2). This is PgAdmin III, which we will affectionately refer to as the administrative tool for mere mortals. It is the first administrative tool that most users new to PostgreSQL use, and it gives them a user-friendly interface to the power behind PostgreSQL. I would say that if it were not for this tool and its web cousin PhpPgAdmin, many a scared newbie would run away from the vast unencumbered freedom that PostgreSQL/psql and the sibling command-line tools offer.

Continue reading "What is new in PgAdmin III 1.12.0"

Posted by Leo Hsu and Regina Obe in 9.0, basics, beginner, pgadmin, postgresql versions at 03:19 | Comments (0) | Trackbacks (0)

Monday, June 28. 2010

Importing data into PostgreSQL using Open Office Base 3.2

Recommended Books: Getting Started with OpenOffice.org 3 (written by OO.org group); Database Programming with OO Base; Beginning OpenOffice 3 from Novice to Professional

A while ago we demonstrated how to use Open Office Base to connect to a PostgreSQL server using both the native PostgreSQL SDBC driver and the PostgreSQL JDBC driver.

The routine for doing the same in Open Office Base 3.2 is pretty much the same as it was in the 2.3 incarnation. In this excerpt, we'll demonstrate how to import data into PostgreSQL using Open Office Base, as we promised in Database Administration, Reporting, and Light Application Development, and point out some stumbling blocks to watch out for.

Use Case

Command line lovers are probably scratching their heads, wondering why you would want to do this. After all, stumbling your way through a command line and typing stuff is much more fun, and you can automate it after you are done. For our needs, we get a stupid Excel file or some other kind of tab-delimited data from somebody, and we just want to cut and paste that data into our database. These files are usually small (under 5000 records) and the column names are never consistent. We don't want to fiddle with writing code for these one-off exercises.

For other people, who are used to using GUIs or training people afraid of command lines, the use cases are painfully obvious, so we won't bore you.

Importing Data with Open Office Base Using copy and paste

Open Office has this fantastic feature called Copy and Paste (no kidding), and we will demonstrate in a bit why their copy and paste is better than Microsoft Access's Copy and Paste, particularly when you want to paste into some database other than a Microsoft one. It is worthy of a medal, if I dare say.

Continue reading "Importing data into PostgreSQL using Open Office Base 3.2"

Posted by Leo Hsu and Regina Obe in beginner, ms access, oobase, oracle, product showcase at 03:42 | Comments (6) | Trackbacks (0)

Tuesday, June 22. 2010

NOT IN NULL Uniqueness trickery

Recommended Books: SQL Cookbook; SQL Hacks

I know a lot has been said about this beautiful value we affectionately call NULL, which is neither here nor there and manages to catch many of us off guard with its casual neither-here-nor-thereness. Database analysts, who are really just back seat mathematicians in disguise, like to philosophize about the unknown and pat themselves on the back when they feel they have mastered the unknown better than anyone else. Of course database spatial analysts, the worst kind of back seat mathematicians, like to talk not only about NULL but about EMPTY, compare notes with their brethren, and write dissertations about what to do about something that is neither here nor there but is more known than the unknown, though not quite as known as the empty string.

Okay, getting to the point: one of our clients asked us about a peculiar problem they had with a query and the strange results they were getting. We admit this still manages to catch us off guard every once in a while.
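The teaser does not show the client's query, so what follows is only a guess at the kind of surprise involved. It is a minimal sketch, using made-up tables (customers and shipments are not from the article), of how a single NULL in a NOT IN list keeps the predicate from ever being true.

```sql
-- Hypothetical tables, invented for this illustration only.
CREATE TABLE customers (id integer PRIMARY KEY, name text);
CREATE TABLE shipments (id integer PRIMARY KEY, customer_id integer); -- customer_id may be NULL

INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO shipments VALUES (10, 1), (11, NULL);

-- Intuition says this lists Bob and Cy, the customers without shipments.
-- It returns no rows: "id NOT IN (1, NULL)" is false for Ann and unknown
-- for everyone else, because "id <> NULL" is unknown rather than true.
SELECT name
FROM customers
WHERE id NOT IN (SELECT customer_id FROM shipments);

-- Two common rewrites that do return Bob and Cy.
SELECT name
FROM customers
WHERE id NOT IN (SELECT customer_id FROM shipments WHERE customer_id IS NOT NULL);

SELECT c.name
FROM customers AS c
WHERE NOT EXISTS (SELECT 1 FROM shipments AS s WHERE s.customer_id = c.id);
```

The NOT EXISTS form is usually the safer habit, since it behaves the same whether or not the column allows NULLs.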

Continue reading "NOT IN NULL Uniqueness trickery"

Posted by Leo Hsu and Regina Obe in basics, beginner, postgresql versions, sql server at 04:19 | Comments (2)

Monday, May 17. 2010

Where is soundex and other warm and fuzzy string things

For those people coming from Oracle, SQL Server and MySQL or other databases that have soundex functionality, you may be puzzled, or even frustrated when you try to do something like
WHERE soundex('Wushington') = soundex('Washington')
in PostgreSQL and get a function does not exist error.

Well, it so happens that there is a soundex function in PostgreSQL, and yes it is also called soundex, but it is offered as a contrib module and not installed by default. The module has other fuzzy string matching functions in addition to soundex. One of my favorites, the Levenshtein distance function, is included as well. In this article we'll be covering the contrib module packaged as fuzzystrmatch.sql. Details of the module can be found in FuzzyStrMatch. The contrib module has been around for some time, but has changed slightly from PostgreSQL version to PostgreSQL version. We are covering the 8.4 version in this article.

For those unfamiliar with soundex, it's a basic approach, patented in 1918 and put to use by the US Census in the 1930s, for sorting names by pronunciation. Read Census and Soundex for more gory history details.

Given that it is an approach designed primarily for the English alphabet, it sort of makes sense why it's not built in to PostgreSQL, which has more of a diverse international concern. For example, if you used it to compare two words in Japanese or Chinese, we don't think it would fare too well in any of the database platforms that support this function.

The original soundex algorithm has been improved over the years. Though it's still the most commonly used today, newer variants exist: Metaphone, developed in the 1990s, and Double Metaphone (DMetaPhone), developed in 2000, which support additional consonants in other languages such as Slavic, Celtic, Italian, Spanish etc. These two variants are also included in the fuzzystrmatch contrib library. The soundex function still seems to be the most popular, at least in the U.S. This is perhaps because most of the other databases (Oracle, SQL Server, MySQL) have soundex built in but not the metaphone variants. So in a sense soundex is a more portable function. The other reason is that metaphone and dmetaphone take up a bit more space and are also more processor intensive to compute than soundex. We'll demonstrate some differences between them in this article.

To enable soundex and the other fuzzy string matching functions included, just run the share/contrib/fuzzystrmatch.sql located in your PostgreSQL install folder. This library is an important piece of arsenal for geocoding and genealogy tracking, particularly with U.S. street and surname data sets. I come from a long line of Minors, Miners, Burnettes and Burnets.
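As a quick taste of what the module offers once fuzzystrmatch.sql has been run in a database, here is a small sketch using the functions the 8.4 module provides. The sample strings are ours, chosen to echo the examples above; the full article works with a real dataset instead.

```sql
-- Assumes fuzzystrmatch.sql has already been run in the current database.

-- Both spellings reduce to the same soundex code, so these return true.
SELECT soundex('Wushington') = soundex('Washington') AS same_sound;
SELECT soundex('Minor') = soundex('Miner') AS same_surname_sound;

-- difference() compares the two soundex codes and returns how many of the
-- four positions agree, from 0 (no match) to 4 (same code).
SELECT difference('Wushington', 'Washington') AS soundex_agreement;

-- Levenshtein distance counts single-character edits; one substitution
-- (u -> a) separates these two strings, so the result is 1.
SELECT levenshtein('Wushington', 'Washington') AS edits;

-- metaphone takes a maximum output length; dmetaphone returns the primary
-- code and dmetaphone_alt the alternate one.
SELECT metaphone('Washington', 10)   AS mphone,
       dmetaphone('Washington')      AS dmphone,
       dmetaphone_alt('Washington')  AS dmphone_alt;
```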

For the next set of exercises, we will be using the places dataset we created in Importing Fixed width data into PostgreSQL with just PSQL.

Continue reading "Where is soundex and other warm and fuzzy string things"

Posted by Leo Hsu and Regina Obe in 8.2, 8.3, 8.4, 9.0, beginner, contrib spotlight, fuzzystrmatch, mysql, oracle, postgresql versions, sql server at 16:53 | Comments (2) | Trackbacks (3)

Friday, April 23. 2010

Import fixed width data into PostgreSQL with just PSQL

Fixed width data is probably the most annoying data to import because you need some mechanism to break the columns at the column boundaries. A lot of people bring this kind of data into a tool such as OpenOffice, Excel or MS Access, massage it into a delimited format, and then pull it in with the PostgreSQL copy command or some other means. There is another way, and one that doesn't require anything aside from what gets packaged with PostgreSQL. We will demonstrate this way.

It's quite simple. Pull each record in as a single column and then split it into the columns you want with plain old SQL. We'll demonstrate this by importing the Census places fixed width file.

Although this technique is focused on PostgreSQL, it's pretty easy to do the same steps in any other relational database.
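The two steps can be sketched roughly as follows. This is not the article's actual script: the file path, table names, and column offsets are placeholders made up for illustration, and the real positions have to come from the record layout of the file you are loading.

```sql
-- Step 1: a staging table with one text column holding each raw line.
CREATE TABLE places_staging (data text);

-- Load the file as-is. The path is a placeholder. COPY reads a file on the
-- server; psql's \copy does the same with a file on the client machine.
-- Caveat: in COPY's default text format, tabs and backslashes are special,
-- so this assumes the file contains neither.
COPY places_staging FROM '/path/to/places.txt';

-- Step 2: carve columns out by position with plain SQL.
-- These offsets and widths are invented for the example.
CREATE TABLE places AS
SELECT substring(data from 1 for 2)                               AS state_code,
       trim(substring(data from 10 for 64))                       AS place_name,
       CAST(NULLIF(trim(substring(data from 74 for 9)), '') AS integer) AS population
FROM places_staging;
```

Keeping the raw line in a staging table means a wrong offset only costs you a rerun of the second statement, not a reload of the file.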

Both David Fetter and Dimitri Fontaine have demonstrated other approaches of doing this as well so check theirs out.

UPDATE

David Fetter - psql, Paste, Perl: Pefficiency!
Dimitri Fontaine - Import fixed width data with pgloader

Continue reading "Import fixed width data into PostgreSQL with just PSQL"

Posted by Leo Hsu and Regina Obe in basics, beginner at 17:38 | Comment (1) | Trackbacks (0)

Sunday, February 14. 2010

Regular Expressions in PostgreSQL

Recommended Books: Regular Expressions Pocket Reference; Mastering Regular Expressions; Sed and Awk

Every programmer should embrace and use regular expressions (INCLUDING database programmers). There are many places where regular expressions can be used to reduce a 20 line piece of code into a 1 liner. Why write 20 lines of code when you can write 1?

Regular expressions are a domain language just like SQL. Just like SQL they are embedded in many places. You have them in your program editor. You see them in sed, grep, perl, PHP, Python, VB.NET, C#, in ASP.NET validators and JavaScript for checking correctness of input. You have them in PostgreSQL as well, where you can use them in SQL statements, domain definitions and check constraints. You can mix regular expressions with SQL. When you mix the two domain languages, you can do enchanting things with a flip of a wrist that would amaze your less informed friends. Embrace the power of domain languages and mix it up. PostgreSQL makes that much easier than any other DBMS we can think of.

For more details on using regular expressions in PostgreSQL, check out the manual pages Pattern Matching in PostgreSQL.

The problem with regular expressions is that they are slightly different depending on what language environment you are running them in. Different enough to be frustrating. We'll just focus on their use in PostgreSQL, though these lessons are applicable to other environments.
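To make the three places mentioned above concrete (SQL statements, domain definitions, and check constraints), here is a minimal sketch. The patterns, the us_zip domain, and the contacts table are our own illustrations, not taken from the article, and the patterns deliberately avoid backslashes so they read the same regardless of string-escaping settings.

```sql
-- In a SQL statement: ~ is the case-sensitive match operator (~* ignores case).
SELECT 'pg84' ~ '^[a-z]+[0-9]+$' AS matches;                            -- true

-- Strip everything that is not a digit; the 'g' flag replaces every match.
SELECT regexp_replace('(617) 555-0100', '[^0-9]', '', 'g') AS digits;   -- 6175550100

-- Pull out the first run of digits.
SELECT substring('Order #12345 shipped' from '[0-9]+') AS order_no;     -- 12345

-- In a domain definition: a hypothetical ZIP code type.
CREATE DOMAIN us_zip AS text
    CHECK (VALUE ~ '^[0-9]{5}(-[0-9]{4})?$');

-- In a check constraint: a deliberately loose e-mail sanity check.
CREATE TABLE contacts (
    name  text,
    zip   us_zip,
    email text CHECK (email ~* '^[^@ ]+@[^@ ]+$')
);
```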

Continue reading "Regular Expressions in PostgreSQL"

Posted by Leo Hsu and Regina Obe in 8.4, beginner, q&a at 11:43 | Comments (12) | Trackbacks (3)

Thursday, January 21. 2010

Making backups of select tables

Every once in a while, especially if you have a fairly large database, you may find the need to do selective backups of certain tables. Your criteria might be based on name or on how recently data has changed in the table. Below are some of the tricks we use. Some use our favorite hack of generating command line scripts with SQL.
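The "generate command line scripts with SQL" idea can be sketched like this for the name-based case. The database name mydb, the pay prefix, and the output file naming are assumptions made for the example; the article's own scripts may differ.

```sql
-- Emits one pg_dump command per base table whose name starts with 'pay'.
-- Save the output to a file and run it as a shell or batch script.
SELECT 'pg_dump -t ' || quote_ident(table_schema::text) || '.' || quote_ident(table_name::text)
       || ' -f ' || table_schema::text || '_' || table_name::text || '.sql mydb' AS dump_command
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_schema NOT IN ('pg_catalog', 'information_schema')
  AND table_name LIKE 'pay%'
ORDER BY table_schema, table_name;
```

Running the query through psql with unaligned, tuples-only output (the -A and -t switches) gives bare command lines. Mixed-case names come out double-quoted by quote_ident and will need extra quoting for the shell.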

Continue reading "Making backups of select tables"

Posted by Leo Hsu and Regina Obe in 8.2, 8.3, 8.4, 9.0, basics, beginner at 00:55 | Comments (0) | Trackbacks (0)

Thursday, December 24. 2009

PgAdmin III Plug-in Registration: PostGIS Shapefile and DBF Loader

PgAdmin 1.9+ has a simple plug-in architecture which makes it relatively simple to introduce new plugins. It is pretty much all controlled by the file plugins.ini, in which you can register any executable you want. If you want the executable to be passed database configuration settings, there is an option for that; you just have to build your executable to accept command-line switches.

You can download the Windows version from http://postgis.net/windows_downloads. If you just want it without the PostGIS 1.5 binaries -- just download the one labeled PostGIS ESRI Shapefile GUI.

Continue reading "PgAdmin III Plug-in Registration: PostGIS Shapefile and DBF Loader"

Posted by Leo Hsu and Regina Obe in beginner, contrib spotlight, pgadmin, postgis at 14:19 | Comments (23) | Trackbacks (4)

Saturday, November 28. 2009

An almost idiot's guide to Install and Upgrade to PostgreSQL 8.4 with Yum

In this exercise, we'll go through our steps for upgrading a Red Hat Enterprise Linux 64-bit PostgreSQL box from PostgreSQL 8.3 to PostgreSQL 8.4. If you don't have any kind of PostgreSQL installed on your box, you can skip the Upgrade step.

UPDATE - since Devrim's move from Command Prompt, he has started a new yum repository. You may want to use this one instead, since it seems more up to date than the other: http://yum.pgrpms.org/

UPDATE

We have instructions for installing PostgreSQL 9.0 via yum.

Upgrading from PostgreSQL 8.* to PostgreSQL 8.4

If you are starting from scratch -- just skip this section.

If you are upgrading from 8.4.0 to 8.4.1 you can get away with a simple
yum update postgresql
and skip the rest of this article.

If you are upgrading from PostgreSQL 8.3 to 8.4, in theory you can use pg_migrator, but in practice, particularly with a Yum install, you are bound to run into obstacles. If you are running an older version, you must dump and restore.

Continue reading "An almost idiot's guide to Install and Upgrade to PostgreSQL 8.4 with Yum"

Posted by Leo Hsu and Regina Obe in 8.2, 8.3, 8.4, basics, beginner, yum at 04:48 | Comments (18) | Trackback (1)

Monday, October 26. 2009

Lowercasing table and column names

This is an unfortunate predicament that many people find themselves in, and it does cause a bit of frustration. You bring some tables into your PostgreSQL database using some column-name-preserving application, and the casings are all preserved from the source data store. So now you have to quote all the fields every time you need to use them. In these cases, we usually rename the columns to be all lower case using a script. There are two approaches we have seen or can think of for doing this: one is to run a script that generates the appropriate ALTER TABLE statements, and the other is to update the pg_attribute system catalog table directly.
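The first approach, generating ALTER TABLE statements, might look something like the sketch below. It is written against information_schema and is our own illustration rather than the article's script. It only handles column names, and it assumes no table has two columns that differ only by case.

```sql
-- Emits one rename statement for every column of a user table whose
-- name is not already all lower case. Review the output, then run it.
SELECT 'ALTER TABLE ' || quote_ident(c.table_schema::text) || '.' || quote_ident(c.table_name::text)
       || ' RENAME COLUMN ' || quote_ident(c.column_name::text)
       || ' TO ' || quote_ident(lower(c.column_name::text)) || ';' AS rename_sql
FROM information_schema.columns AS c
     INNER JOIN information_schema.tables AS t
         ON t.table_schema = c.table_schema AND t.table_name = c.table_name
WHERE t.table_type = 'BASE TABLE'
  AND c.table_schema NOT IN ('pg_catalog', 'information_schema')
  AND c.column_name::text <> lower(c.column_name::text)
ORDER BY c.table_schema, c.table_name, c.ordinal_position;
```

Generating statements keeps the change inside supported DDL, which is why we would reach for it before touching a system catalog by hand.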

Continue reading "Lowercasing table and column names"

Posted by Leo Hsu and Regina Obe in 8.2, 8.3, 8.4, basics, beginner at 02:44 | Comments (6) | Trackbacks (0)

Wednesday, October 21. 2009

Enable and Disable Vacuum per table

Vacuuming and analyzing are the processes that remove dead rows and update the statistics of a table. As of PostgreSQL 8.3, auto vacuuming (the process that runs around cleaning up tables) is on by default. If you are creating a lot of tables and bulk loading data, the vacuumer sometimes gets in your way. One way to get around that is to disable auto vacuuming on the tables you are currently working on and then re-enable it afterward. You can also do this from the PgAdmin III management console.
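In SQL, the disable-then-re-enable routine can be sketched as follows, using the per-table storage parameter available from PostgreSQL 8.4 onward. The table name big_load is a placeholder.

```sql
-- Turn autovacuum off for one table (storage parameter syntax, 8.4+).
ALTER TABLE big_load SET (autovacuum_enabled = false);

-- ... do the bulk load here ...

-- Clean up and refresh statistics by hand, then turn autovacuum back on.
VACUUM ANALYZE big_load;
ALTER TABLE big_load SET (autovacuum_enabled = true);

-- See which tables currently carry per-table settings.
SELECT relname, reloptions
FROM pg_class
WHERE reloptions IS NOT NULL;
```

The final query is a handy check that no table was left with autovacuum switched off after the load.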

Continue reading "Enable and Disable Vacuum per table"

Posted by Leo Hsu and Regina Obe in 8.4, beginner, q&a at 00:31 | Comments (4) | Trackback (1)

Wednesday, September 09. 2009

Terminating Annoying Back Ends

One of the little treats provided in PostgreSQL 8.4 is the new pg_terminate_backend function. In the past, when we wanted to kill runaway PostgreSQL queries issued by a database or user or, hmm, us, we would call the pg_cancel_backend function. The problem is that it would simply cancel the query in the backend process, and oftentimes the offending application would simply launch the same query again.

In PostgreSQL 8.4 a new function was introduced called pg_terminate_backend. This doesn't completely replace pg_cancel_backend, but basically does what you do when you go into, say, the Windows Task Manager and kill the offending postgres process, or on Linux call a kill command on a postgres process. It's nicer in the sense that you can do it all within PostgreSQL, and you can use the pg_stat_activity view to help you out a bit. Also, you don't as easily run the risk of killing the root postgres process and taking down the postgres service altogether.
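A rough sketch of how the two functions and pg_stat_activity fit together is shown below. The column names are the 8.4 ones (procpid and current_query; later releases renamed them to pid and query), and the database name and process id are placeholders.

```sql
-- See who is connected and what they are running (8.4 column names).
SELECT procpid, usename, datname, query_start, current_query
FROM pg_stat_activity
ORDER BY query_start;

-- Cancel only the query currently running in one backend.
-- 12345 is a placeholder process id taken from the listing above.
SELECT pg_cancel_backend(12345);

-- Terminate every connection to one database except our own session.
SELECT pg_terminate_backend(procpid)
FROM pg_stat_activity
WHERE datname = 'somedb'
  AND procpid <> pg_backend_pid();
```

Excluding pg_backend_pid() keeps the statement from cutting off the very session that is issuing it.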

Continue reading "Terminating Annoying Back Ends"

Posted by Leo Hsu and Regina Obe in 8.4, beginner, q&a at 20:47 | Comments (2) | Trackbacks (0)

Monday, September 07. 2009

Database Administration, Reporting, and Light application development

One of the most common questions people ask is which tools work with PostgreSQL. In a sense, the measure of a database's maturity and popularity is the number of vendors willing to produce management and development tools for it. Luckily there are a lot of vendors producing tools for PostgreSQL and the list is growing. One set of tools people are interested in is database administration, ER diagramming, query tools, and quickie application generators (RAD).

For this issue of our product showcase, we will not talk about one product, but several that fit in the aforementioned category. All the listed products work with PostgreSQL and can be used for database administration and/or architecting or provide some sort of light reporting/rapid application building suite. By light reporting/application building, we mean a tool with a simple wizard that a novice can use to build somewhat functional applications in minutes or days. This rules out all-purpose development things like raw PHP, .NET, Visual Studio, database drivers etc. Things we consider in this realm are things like OpenOffice Base and MS Access. Most of these tools are either free or have 30-day try before you buy options.

You can't really say one tool is absolutely better than another, since each has its own strengths and caters to a slightly different audience. You may also like the way one tool does one important thing really well, though it may be mediocre in other respects. We also left out a lot of products we are not familiar with and may have gotten some things wrong.

If we left out your favorite product and you feel it meets these criteria, or you feel we made any errors, please let us know, and we'll add or correct it. We will be including free open source as well as proprietary products in this mix. If we left out what you consider an important criterion, please let us know and we'll try to squeeze it in somewhere.

Continue reading "Database Administration, Reporting, and Light application development"

Posted by Leo Hsu and Regina Obe in beginner, db2, Dbase, firebird, informix, ms access, mysql, oobase, oracle, other dbms, pgadmin, product showcase, sql server, sqlite at 01:54 | Comments (21) | Trackback (1)

Tuesday, June 30. 2009

Managing disk space using table spaces

We have covered this briefly before, but it's an important enough concept to cover again in more detail.

Problem: You are running out of disk space on the drive you keep PostgreSQL data on

Solution:

Create a new tablespace on a separate drive and move existing tables to it, or create a new tablespace and use for future tables.

What is a tablespace and how to create a tablespace

A tablespace in PostgreSQL is similar to a tablespace in Oracle and a filegroup in SQL Server. It segments a piece of physical disk space for use by the PostgreSQL process for holding data. Below are steps to creating a new tablespace. Tablespaces have existed since PostgreSQL 8.0.

More about tablespaces in PostgreSQL is outlined in the manual: PostgreSQL 8.3 tablespaces.

While it is possible to create a table index on a different tablespace from the table, we won't be covering that.
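The solution outlined above boils down to a handful of statements. The sketch below uses a made-up tablespace name, directory, and table names; substitute your own.

```sql
-- The directory must already exist, be empty, and be owned by the operating
-- system account that runs PostgreSQL. Creating a tablespace requires
-- superuser rights. Name and path are placeholders.
CREATE TABLESPACE bigdisk LOCATION '/mnt/bigdisk/pgdata';

-- Move an existing table. The table is locked while its files are copied.
ALTER TABLE census_places SET TABLESPACE bigdisk;

-- Put a new table there from the start.
CREATE TABLE new_stuff (id integer, note text) TABLESPACE bigdisk;

-- Or make it the default for tables created later in this session.
SET default_tablespace = bigdisk;

-- List the tablespaces the server knows about.
SELECT spcname FROM pg_tablespace;
```

Moving a large table copies all of its data, so it is best scheduled for a quiet period.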

Continue reading "Managing disk space using table spaces"

Posted by Leo Hsu and Regina Obe in basics, beginner at 00:00 | Comment (1) | Trackbacks (0)

Tuesday, June 09. 2009

Planner Statistics

You'll often hear the term planner statistics thrown around by database geeks. Did you update your statistics? This lingo isn't even limited to PostgreSQL, but is part and parcel of how most decent databases work. For example, in PostgreSQL you do a vacuum analyze to update your planner statistics in addition to cleaning up dead space. In SQL Server you do an UPDATE STATISTICS. In MySQL you do an ANALYZE TABLE or a more invasive OPTIMIZE TABLE.

All this "update your stats so your planner can be happy" is usually unnecessary unless you just did a bulk load or a bulk delete, or you notice your queries are suddenly slowing down. These stat things are generally updated behind the scenes by most databases on an as-needed basis.

What makes SQL really interesting and a bit different from procedural languages is that it is declarative (like functional and logic programming languages) and relies on the database planner to come up with strategies for navigating the data. Its strategy is not fixed as it is in procedural languages. A big part of this strategy is decided by the query planner, which looks at distributions of data. Given different WHERE conditions for similar queries, it could come up with vastly different strategies if one value is far more common in a table than another. This is also the mystery of why it sometimes refuses to use an index on a field, because it has decided a table scan is more efficient, and also why some people consider HINTS evil, because they pollute the declarative nature of the language.
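To see the planner's view of a table for yourself, something along these lines works in PostgreSQL. The shipments table, its status column, and the two status values are hypothetical; the point is only that a common value and a rare value can produce different plans.

```sql
-- Refresh planner statistics for one table
-- (VACUUM ANALYZE would also reclaim dead rows).
ANALYZE shipments;

-- Peek at what the planner knows about a column's distribution.
SELECT attname, null_frac, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'shipments'
  AND attname = 'status';

-- Compare the plans chosen for a common value and a rare one. With an index
-- on status, the planner may scan the whole table for the first and use the
-- index for the second.
EXPLAIN SELECT * FROM shipments WHERE status = 'delivered';
EXPLAIN SELECT * FROM shipments WHERE status = 'lost';
```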

Continue reading "Planner Statistics"

Posted by Leo Hsu and Regina Obe in 8.2, 8.3, 8.4, basics, beginner, mysql, sql server at 09:10 | Comments (0) | Trackbacks (0)
