Wednesday, July 01. 2009
PostgreSQL 8.4 has ANSI SQL:2003 window function support. These are often classified under the umbrella terms of basic Analytical or Online Analytical Processing (OLAP) functions.
They are used most commonly for producing cumulative sums, moving averages and generally rolling calculations that need to look at a subset of the overall dataset (a window frame of data) often relative to a particular row.
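As a rough illustration of the kind of rolling calculation described above, here is a minimal sketch of a cumulative sum and a share-of-total column. The `daily_sales` table and its columns are hypothetical and exist only for this example. The frame is written out in a form that PostgreSQL 8.4 accepts; to our knowledge, frames such as `ROWS 2 PRECEDING` only arrived in a later release, so a fixed-width moving average is left out here.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE daily_sales (
    sale_date date PRIMARY KEY,
    amount    numeric(10,2) NOT NULL
);

INSERT INTO daily_sales (sale_date, amount) VALUES
    ('2009-06-01', 100.00),
    ('2009-06-02', 150.00),
    ('2009-06-03',  50.00);

SELECT sale_date,
       amount,
       -- Cumulative sum: the frame runs from the first row up to the current row.
       SUM(amount) OVER (ORDER BY sale_date
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total,
       -- An empty OVER () treats the whole result set as one window.
       amount / SUM(amount) OVER () AS share_of_total
FROM daily_sales
ORDER BY sale_date;
```

With the three sample rows, the running totals come out as 100.00, 250.00 and 300.00.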
For users who use SQL window constructs extensively, this may have been one reason in the past not to give PostgreSQL a second look. While you may not
consider PostgreSQL as a replacement for existing projects because of the cost of migration, recoding and testing, this new feature is definitely a selling point
for new project consideration.
If you rely heavily on windowing functions, the things you probably want to know most about the new PostgreSQL 8.4 offering are:
- What SQL window functionality is supported?
- How does the PostgreSQL 8.4 offering compare to that of the database you are currently using?
- Is the subset of functionality you use supported?
To make this an easier exercise, we have combed through the documentation of the other database vendors to distill what SQL windowing functionality they provide in their core products.
If you find any mistakes or ambiguities below, please don't hesitate to let us know and we will gladly amend.
For those who are not sure what this is and what all the big fuss is about, please read our rich commentary on the topic of window functions.
Continue reading "Window Functions Comparison Between PostgreSQL 8.4, SQL Server 2008, Oracle, IBM DB2"
Tuesday, June 30. 2009
We have covered this briefly before, but it's an important enough concept to cover again in more detail.
Problem: You are running out of disk space on the drive you keep PostgreSQL data on.
Solution:
Create a new tablespace on a separate drive and move existing tables to it, or create a new tablespace and use it for future tables.
What is a tablespace and how to create a tablespace
A tablespace in PostgreSQL is similar to a tablespace in Oracle and a filegroup in SQL Server. It sets aside a piece of physical disk space
for use by the PostgreSQL process for holding data. Below are the steps for creating a new tablespace. Tablespaces have existed since PostgreSQL 8.0.
More about tablespaces in PostgreSQL is outlined in
the manual: PostgreSQL 8.3 tablespaces.
While it is possible to create a table's index in a different tablespace from the table, we won't be covering that.
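The solution described above can be sketched roughly as follows. The tablespace name, directory path and table names are all hypothetical. The sketch assumes the directory already exists, is empty, and is owned by the operating system account that runs PostgreSQL.

```sql
-- Assumption: /mnt/disk2/pgdata is an existing, empty directory on the
-- second drive, owned by the OS user that runs the PostgreSQL server.
CREATE TABLESPACE extra_space LOCATION '/mnt/disk2/pgdata';

-- Move an existing (hypothetical) table to the new tablespace.
-- The table is locked while its files are copied; its indexes stay where they are.
ALTER TABLE big_table SET TABLESPACE extra_space;

-- Or create future tables directly in the new tablespace.
CREATE TABLE new_table (
    id      serial PRIMARY KEY,
    payload text
) TABLESPACE extra_space;

-- List the tablespaces the server knows about.
SELECT spcname FROM pg_tablespace;
```

Creating a tablespace normally requires superuser rights, so run the first statement as a superuser such as postgres.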
Continue reading "Managing disk space using table spaces"
Tuesday, June 09. 2009
You'll often hear the term planner statistics thrown around by database geeks: "Did you update your statistics?" This lingo isn't even limited
to PostgreSQL, but is part and parcel of how most decent databases work. For example, in PostgreSQL you run VACUUM ANALYZE to update your planner statistics in addition
to cleaning up dead space. In SQL Server you run UPDATE STATISTICS. In MySQL you run
ANALYZE TABLE or the more invasive OPTIMIZE TABLE.
All this "update your stats so your planner can be happy" is usually unnecessary unless
you just did a bulk load or a bulk delete, or you notice your queries suddenly slowing down. These statistics are generally updated behind the scenes by most databases
on an as-needed basis.
What makes SQL really interesting and a bit different from procedural languages is that it is declarative (like functional and logic programming languages) and relies on the database planner to come up with strategies for navigating the data. Its strategy is not fixed as it is in procedural languages.
A big part of this strategy is decided by the query planner, which looks at the distribution of data. Given different WHERE conditions for similar queries, it could come up with vastly different strategies if one value occurs significantly
more often in a table than another. This is also the mystery of why it sometimes refuses to use an index on a field (it has decided a table scan is more efficient) and also why some people consider HINTS evil: they pollute the declarative nature of the language.
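To see the planner's view of a table for yourself, something along these lines may help. The `orders` table, its `status` column and the two status values are hypothetical. The plans you get depend entirely on your own data, so no expected plan is shown.

```sql
-- Refresh planner statistics (and reclaim dead space) for one table.
VACUUM ANALYZE orders;

-- Refresh statistics only, without vacuuming.
ANALYZE orders;

-- Inspect what the planner believes about the distribution of one column.
SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'orders'
  AND attname = 'status';

-- Compare the strategies chosen for a common value and a rare value.
-- If 'shipped' covers most rows, a sequential scan is a likely choice;
-- for a rare value an index scan becomes more attractive.
EXPLAIN SELECT * FROM orders WHERE status = 'shipped';
EXPLAIN SELECT * FROM orders WHERE status = 'returned';
```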
Continue reading "Planner Statistics"
Wednesday, May 27. 2009
One thing that is pretty neat about windowing functions in PostgreSQL 8.4, aside from the built-in windowing functions (row_number(), rank(), lead(), lag(), dense_rank(), percent_rank(), cume_dist(), first_value(), last_value(), nth_value()) documented in the manual under Windowing Functions, is that you can use windows with most aggregate functions (built-in or custom defined) as well as define your own specific windowing functions. In a later article, we'll demonstrate creating custom windowing functions.
In our PGCon 2009 PostGIS presentation one of the last slides demonstrates using lead() and lag() windowing functions
to show a family's income level in the same record with the income levels of the next door neighbors in the fictitious town we created. This is not terribly useful unless you live
in a somewhat dysfunctional neighborhood where everyone is concerned about how rich their neighbors are compared to themselves. Our town was very dysfunctional but mostly geographically dysfunctional. We will have much more useful use cases of this as applied to GIS in our upcoming PostGIS in Action book.
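A stripped-down sketch of the neighbor comparison might look like the following. It is not the query from the presentation: the `households` table is hypothetical, and "next door" is approximated by house number on a single street rather than by any spatial relationship.

```sql
-- Hypothetical table: one row per house on a single street.
CREATE TABLE households (
    house_no integer PRIMARY KEY,
    family   text    NOT NULL,
    income   integer NOT NULL
);

SELECT house_no,
       family,
       income,
       -- Income of the house just before this one; NULL for the first house.
       lag(income)  OVER (ORDER BY house_no) AS prev_neighbor_income,
       -- Income of the house just after this one; NULL for the last house.
       lead(income) OVER (ORDER BY house_no) AS next_neighbor_income
FROM households
ORDER BY house_no;
```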
Hitoshi Harada and David Fetter gave a presentation on this at PGCon 2009, which sadly we missed since we were giving our own presentation.
Check out the PGCon2009 PostgreSQL 8.4 Windowing Functions Video. Also check out the slides at Introducing Windowing Functions.
Those who have used SQL Server 2005+, Oracle or IBM DB2 are probably familiar with, or have run into examples of, windowing functions in those products. Windowing in PostgreSQL 8.4 works more or less the same way. In a prior article, we demonstrated how to return running totals and sums using rudimentary SQL.
To precelebrate the imminent arrival of PostgreSQL 8.4 and the current PostgreSQL 8.4 beta 2 release, we shall demonstrate the same exercise using the new ANSI SQL:2003 windowing functionality built
into the upcoming PostgreSQL 8.4.
Continue reading "Running totals and sums using PostgreSQL 8.4 Windowing functions"
Thursday, March 05. 2009
PostgreSQL 8.2 and above have this pretty neat feature of allowing you to define aggregate functions
that take more than one column as input. First we'll start off with a rather pointless but easy-to-relate-to example, and then we'll follow up with something
a bit more interesting.
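For a feel of what a multi-column aggregate looks like, here is a minimal sketch of our own (not the example from the full article): a weighted sum that multiplies two columns row by row and adds up the products. The function, aggregate and table names are all hypothetical.

```sql
-- State transition function: running state + (value * weight).
-- STRICT means rows where either input is NULL are skipped by the aggregate.
CREATE OR REPLACE FUNCTION weighted_sum_step(numeric, numeric, numeric)
RETURNS numeric AS
$$ SELECT $1 + $2 * $3 $$
LANGUAGE sql IMMUTABLE STRICT;

-- Two-argument aggregate built on the step function, starting from zero.
CREATE AGGREGATE weighted_sum(numeric, numeric) (
    SFUNC    = weighted_sum_step,
    STYPE    = numeric,
    INITCOND = '0'
);

-- Usage against a hypothetical order_lines table with numeric
-- price and quantity columns.
SELECT order_id,
       weighted_sum(price, quantity) AS order_total
FROM order_lines
GROUP BY order_id;
```

The same result could of course be had with SUM(price * quantity), which is part of why an example like this is easy to relate to but rather pointless.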
For more examples of creating aggregates in PostgreSQL, check out our other articles:
Continue reading "How to create multi-column aggregates"
Tuesday, January 06. 2009
Both Josh Berkus and Hubert made blog entries about our last excerpt. In general I will start off by saying that we are more or less in agreement on what is good SQL coding format.
Here are the things I think we can all agree on:
- SQL keywords should be uppercase
- Field names should be prefixed with their tables especially when you have a multi-join statement involved
- Use JOIN syntax instead of stuffing everything in the WHERE, though we are strongly against just saying JOIN. We prefer INNER JOIN
The major points of contention I think are:
- Should you use aliases over tables, and if you use aliases, should you keep them short or more descriptive? Josh thinks table names should be used where possible and when aliases are used they should be longer than a few characters,
and Hubert agrees with us that short aliases are fine and in fact desirable. I think we all agree aliases should be meaningful when used, but our idea of what it means to be meaningful is a little different.
- In use of JOIN syntax -- we prefer using INNER JOIN instead of using just JOIN and in fact find it quite irritating that PostgreSQL rewrites our INNERs as JOIN. I suspect Hubert and Josh and many other PostgreSQL folk are on
the other side of the fence on this. The reason we feel strongly about this is there are so many kinds of JOINS - INNER JOIN, RIGHT JOIN, LEFT JOIN, CROSS JOIN, FULL JOIN, and the horrid NATURAL JOIN (that should be shot and put out of its misery). To just say JOIN to us is just confusing.
- While you can write LEFT OUTER JOIN, the OUTER is kind of pointless because no one goes around writing LEFT INNER JOIN
- Use well-supported standards where possible. This means CURRENT_TIMESTAMP instead of now(). now() is not in all databases, but most relational databases support CURRENT_TIMESTAMP
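Put together, a query written to the preferences listed above might look something like this. The tables and columns are hypothetical. The style shown is our reading of the guidelines, not a rule anyone is obliged to follow.

```sql
-- Uppercase keywords, short but meaningful aliases, table-prefixed columns,
-- explicit INNER JOIN, LEFT JOIN without the redundant OUTER,
-- and CURRENT_TIMESTAMP rather than now().
SELECT c.customer_name,
       o.order_id,
       o.order_date,
       CURRENT_TIMESTAMP AS report_run_at
FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.customer_id
    LEFT JOIN shipments AS s ON s.order_id = o.order_id
WHERE s.shipment_id IS NULL   -- orders that have not shipped yet
ORDER BY o.order_date;
```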
Continue reading "SQL Coding Standards To Each His Own Part II"
Wednesday, December 31. 2008
I was reading Josh Berkus's latest blog post,
Writing Maintainable Queries Part I, and was intrigued by it.
He is right that lots has been said about coding standards in other languages and even outright holy wars have been launched on the
subject, but as far as SQL goes, not quite enough has been said on the subject for us to have a great war to end all wars.
I was also happy to see that we agreed with all his points except his first one. Yes I felt dissed, and thought hmm
if someone as important as Josh thinks our aliases should be very descriptive and we should use the table name rather than
the alias where possible, surely there must be something wrong with me for not believing in this fundamental philosophy.
In the rest of this excerpt I shall make fun of Josh and also set forth some of our own SQL Coding guidelines. Hopefully
Josh won't take too much offense at this small jibe.
Continue reading "SQL Coding Standards To Each His Own"
Wednesday, November 26. 2008
We had the pleasure of doing a fresh install of PostgreSQL 8.3.5 on a RedHat EL4 box, and when using the Yum repository we noticed a couple of changes from the last time we did this.
This could have been an oversight in our documentation before.
Changes to Yum Install for 8.3.5?
In our April 2008 issue we had An Almost Idiot's Guide to PostgreSQL YUM
and that article still seems to be surprisingly popular.
In the first step we had:
yum install postgresql
and that, as I recall, installed the postgresql server in addition to some client libraries.
For a fresh 8.3.5 install it seems they are separated, and to get the postgresql server you need to do:
yum install postgresql
yum install postgresql-server
Continue reading "Yum addendum for 8.3.5 and PgAgent"
Monday, November 17. 2008
Sometimes when you are testing or setting up a server or just porting things to another server, you just want to install the same set of users as you had before without restoring any databases.
In PostgreSQL, the users (Login Roles) and group roles are stored at the server level and only the permissions to objects are stored at the database level.
Question: How do you restore just the users and roles without having to do a full pg_dumpall of your server?
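The point that roles live at the server level can be checked from any database on the server. The queries below do not perform a backup or a restore. They are only a sketch for listing what a roles-only backup would need to capture, using the standard pg_roles and pg_auth_members catalogs.

```sql
-- Login roles (users) and group roles defined on the whole server.
-- The result is the same no matter which database you are connected to.
SELECT rolname, rolcanlogin, rolsuper, rolcreatedb
FROM pg_roles
ORDER BY rolname;

-- Which roles are members of which group roles.
SELECT g.rolname AS group_role,
       m.rolname AS member_role
FROM pg_auth_members AS am
    INNER JOIN pg_roles AS g ON g.oid = am.roleid
    INNER JOIN pg_roles AS m ON m.oid = am.member
ORDER BY g.rolname, m.rolname;
```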
Continue reading "Backing up Login Roles aka Users and Group Roles"
Wednesday, September 24. 2008
We've been playing around with the snapshot builds of PgAdmin III 1.9 and would like to summarize some
of the nice new features added. PgAdmin III 1.9 has not been released yet, but has a couple of neat features brewing.
For those interested in experimenting with the snapshot builds and src tarballs, you can download them from http://www.pgadmin.org/snapshots/
Continue reading "PgAdmin III 1.9 First Glance"
Sunday, September 07. 2008
One of the nice things about the PostgreSQL command-line restore tool is the ease with which you can restore
select objects from a backup. We tend to use schemas for logical groupings which are partitioned by context, time, geography etc.
Oftentimes when we are testing things, we just want to restore one schema or set of tables from our backup, because restoring a 100 gigabyte database
takes a lot of space, takes more time and is unnecessary for our needs. To accomplish such a feat, you need to
create tar or compressed (PG custom format) backups. We usually maintain PG custom backups of each of our databases.
Continue reading "How to restore select tables, select objects, and schemas from Pg Backup"
Monday, July 14. 2008
Programming Design Patterns define recommended approaches to solving common application problems. Within design patterns is a subset called Idioms.
You can think of an idiom as a strategy for expressing recurring constructs (sub-problems, if you will), one that often takes advantage of the special features of a language.
Idioms tend to be specific to a programming language and cannot be reused
in languages they were not specifically designed for. To demonstrate the differences, let's compare two design patterns we commonly use.
Continue reading "SQL Idiom Design Patterns"
Sunday, July 06. 2008
In our April issue, An Almost Idiot's Guide to PostgreSQL YUM,
we covered using the new PostgreSQL Yum repository to install the PostgreSQL 8.3.1 release on Fedora, RedHat Enterprise, and CentOS. We also received a lot of useful feedback from others on issues they
ran into and how they overcame them. The blog comments are definitely worth a read.
Now that 8.3.3 has come out, many of you should be considering upgrading if you haven't already since there are a couple of bug fixes as outlined in
http://www.postgresql.org/docs/8.3/static/release-8-3-2.html, http://www.postgresql.org/docs/8.3/static/release-8-3-3.html,
and for those running 8.3.0 you will need to reindex your tables after as noted in http://www.postgresql.org/docs/8.3/static/release-8-3-1.html.
If you are running version 8.3.1 or above, then
this is a fairly painless upgrade that just requires you to back up your data as a precautionary measure, but doesn't require a dump and reload.
Continue reading "YUM 2: Almost Idiot's Guide to upgrade from PostgreSQL 8.3.1 to 8.3.3"
Sunday, May 18. 2008
Comparison of Microsoft SQL Server 2005, MySQL 5, and PostgreSQL 8.3
The below is by no means an exhaustive comparison of these 3 databases, and the functionality is not
necessarily listed in order of importance. These are just our experiences with using these 3 databases.
These are the databases we use most often. If we left your favorite database out, please don't take offense. Firebird for one has some neat features such as its small footprint and extensive SQL support, but we have not explored that database.
People ask us time and time again: what's the difference, and why should you care which database you use? We will
try to be very fair in our comparison. We will show equally how PostgreSQL sucks compared to the others. These are the items we
most care about or think others most care about. There are numerous other differences if you get deep into the trenches of each.
For those looking to compare MySQL and PostgreSQL you may want to also check out http://www.wikivs.com/wiki/MySQL_vs_PostgreSQL
If you really want to get into the guts of a relational database, the various parts that make it up, and how the various databases differ in their implementations,
we suggest reading Architecture of a Database System by
Joseph M. Hellerstein, Michael Stonebraker, and James Hamilton. Architecture of a Database System focuses mostly on Oracle, DB2, and SQL Server but does provide some insight into MySQL and PostgreSQL.
Continue reading "Cross Compare of SQL Server, MySQL, and PostgreSQL"
Sunday, April 20. 2008
First we'd like to thank Devrim of Command Prompt for working hard on making this new YUM repository available. In this article we will go over using the new PostgreSQL YUM repository for Redhat Fedora, Enterprise Linux and CentOS distros that is available at http://yum.pgsqlrpms.org/.
UPDATE - since Devrim's move from Command Prompt, he has started a new yum repository. You may want to use
this one instead since it seems more up to date than the other: http://yum.pgrpms.org/.
We are gearing the content of this article to the following user profile:
- Person has SSH root access to their PostgreSQL box
- Person is new to using Red Hat Linux
- Person may be new to using Linux in general
NOTE: We have a newer article on Installing PostgreSQL 8.4 with Yum and an even newer one for PostgreSQL 9.0;
please refer to Installing PostgreSQL 9.0 via Yum. This old article is for PostgreSQL 8.3.
Continue reading "An Almost Idiot's Guide to PostgreSQL YUM"