other dbms - 3 - Postgres OnLine Journal

Friday, July 23. 2010

Of Camels and People: Converting back and forth from Camel Case, Pascal Case to underscore lower case

When it comes to naming things in databases and languages, there are various common standards. For many languages the camel family of namings is very popular. For Unix-based databases, UPPER or lower case with underscores is usually the choice. Databases such as SQL Server and MySQL let you name your columns with mixed casing but couldn't care less what case you use in selects, so you get a mishmash of styles depending on which camp the database user originated from.

So to summarize the key styles and the family of people who use each:

camelCase : lastName - employed by Smalltalk, Java, Flex, C++ and various C derivative languages.
Pascal Case : LastName (a variant of Camel Case) - employed by C#, VB.NET, Pascal (and Delphi), and SQL Server (and some MySQL Windows converts). Also often used for class names by languages that use standard camelCase for function names.
lower case _ : last_name - often found in C, a favorite among PostgreSQL database users (and some MySQL users).
upper case _ : LAST_NAME - a favorite among Oracle users (and some MySQL Oracle defectors).

Being at the crossroads of all the above, we often have to deal with all of these conventions, along with internal schizophrenic strife and external fights. The internal turmoil is the worst, worse than an ambidextrous person trying to figure out which hand to use in battle. For these exercises, we'll demonstrate one way to convert between the various conventions. These are the first thoughts that came to mind, so they may not be the most elegant.
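As a rough illustration of the conversions described above, here is a minimal Python sketch. It is not the approach from the full article; the function names camel_to_snake, snake_to_camel and snake_to_pascal are our own, and it assumes plain ASCII identifiers without runs of capitals (an embedded acronym would not be split).

```python
import re


def camel_to_snake(name):
    """lastName or LastName -> last_name (hypothetical helper)."""
    # Put an underscore before any capital that follows a lowercase
    # letter or digit, then lower-case the whole thing.
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def snake_to_pascal(name):
    """last_name or LAST_NAME -> LastName (hypothetical helper)."""
    return "".join(part.capitalize() for part in name.lower().split("_"))


def snake_to_camel(name):
    """last_name or LAST_NAME -> lastName (hypothetical helper)."""
    parts = name.lower().split("_")
    return parts[0] + "".join(part.capitalize() for part in parts[1:])


print(camel_to_snake("lastName"))    # last_name
print(camel_to_snake("LastName"))    # last_name
print(snake_to_camel("last_name"))   # lastName
print(snake_to_pascal("LAST_NAME"))  # LastName
```

Going from either underscore style to upper case is just a call to upper() on the result, so it is left out here.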

Continue reading "Of Camels and People: Converting back and forth from Camel Case, Pascal Case to underscore lower case"

Posted by Leo Hsu and Regina Obe in 8.2, 8.3, 8.4, 9.0, beginner, mysql, oracle, postgresql versions, q&a, sql server at 17:17 | Comments (4) | Trackbacks (0)

Monday, June 28. 2010

Importing data into PostgreSQL using Open Office Base 3.2

A while ago we demonstrated how to use Open Office Base to connect to a PostgreSQL server using both the native PostgreSQL SDBC driver and the PostgreSQL JDBC driver.

The routine for doing the same in Open Office Base 3.2 is pretty much the same as it was in the 2.3 incarnation. In this excerpt, we'll demonstrate how to import data into PostgreSQL using Open Office Base, as we had promised to do in Database Administration, Reporting, and Light Application Development, and point out some stumbling blocks to watch out for.

Use Case

Command line lovers are probably scratching their heads, wondering why you would want to do this. After all, stumbling your way through a command line and typing stuff is much more fun, and you can automate it after you are done. For our needs, we get stupid Excel or some other kind of tab-delimited data from somebody, and we just want to cut and paste that data into our database. These files are usually small (under 5000 records) and the column names are never consistent. We don't want to fiddle with writing code to do these one-off exercises.

For other people, who are used to using GUIs or training people afraid of command lines, the use cases are painfully obvious, so we won't bore you.

Importing Data with Open Office Base Using copy and paste

Open Office has this fantastic feature called Copy and Paste (no kidding), and we will demonstrate in a bit why their copy and paste is better than Microsoft Access's Copy and Paste, particularly when you want to paste into some database other than a Microsoft one. It is worthy of a medal, if I dare say.

Continue reading "Importing data into PostgreSQL using Open Office Base 3.2"

Posted by Leo Hsu and Regina Obe in beginner, ms access, oobase, oracle, product showcase at 03:42 | Comments (6) | Trackbacks (0)

Tuesday, June 22. 2010

NOT IN NULL Uniqueness trickery

I know a lot has been said about this beautiful value we affectionately call NULL, which is neither here nor there and which manages to catch many of us off guard with its casual neither here nor thereness. Database analysts, who are really just back seat mathematicians in disguise, like to philosophize about the unknown and pat themselves on the back when they feel they have mastered the unknown better than anyone else. Of course database spatial analysts, the worst kind of back seat mathematicians, like to talk not only about NULL but about EMPTY, compare notes with their brethren, and write dissertations about what to do about something that is neither here nor there but is more known than the unknown, though not quite as known as the empty string.

Okay getting to the point, one of our clients asked us about a peculiar problem they had with a query, and the strange results they were getting. We admit this still manages to catch us off guard every once in a while.
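The excerpt stops before showing the client's query, so the following is only our guess at the kind of surprise the title hints at: the classic case where a NOT IN subquery returns a NULL. This minimal sketch uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the customers and orders tables are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER);
    CREATE TABLE orders (customer_id INTEGER);
    INSERT INTO customers VALUES (1), (2), (3);
    INSERT INTO orders VALUES (1), (NULL);
""")

# Customers 2 and 3 have no orders, but the NULL in the subquery makes
# "id NOT IN (...)" evaluate to unknown for them, so no rows come back.
not_in_rows = conn.execute("""
    SELECT id FROM customers
    WHERE id NOT IN (SELECT customer_id FROM orders)
    ORDER BY id
""").fetchall()

# NOT EXISTS compares row by row and is not tripped up by the NULL.
not_exists_rows = conn.execute("""
    SELECT c.id FROM customers c
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
    ORDER BY c.id
""").fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [(2,), (3,)]
```

Filtering the NULLs out of the subquery (WHERE customer_id IS NOT NULL) is another common way to get the intuitive answer.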

Continue reading "NOT IN NULL Uniqueness trickery"

Posted by Leo Hsu and Regina Obe in basics, beginner, postgresql versions, sql server at 04:19 | Comments (2)

Wednesday, June 02. 2010

STRICT on SQL Function Breaks In-lining Gotcha

One of the coolest features of PostgreSQL is the ability to write functions using plain old SQL. It has had this feature for a long time, even before PostgreSQL 8.2. No other database to our knowledge has this feature. By SQL we mean sans procedural mumbo jumbo like loops and what not. This is cool for two reasons:

Plain old SQL is the simplest to write, most anyone can write it, and it is just what the doctor ordered in many cases. PostgreSQL even allows you to write aggregate functions with plain old SQL. Try to write an aggregate function in SQL Server and you've got to pull out your Visual Studio this and that, do some compiling and loading, and you had better know C# or VB.NET. Try in MySQL and you had better learn C. Do the same in PostgreSQL (you have a large choice of languages including SQL) and the code is simple to write. Never mind that with MySQL and SQL Server, you aren't even allowed to do those kinds of things on a shared server or a server where the IT department is paranoid. The closest with this much ease would be Oracle, which is unnecessarily verbose.
Most importantly -- since it is just SQL, for simple user-defined functions, a PostgreSQL sql function can often be in-lined into the overall query plan since it only uses what is legal in plain old SQL.

This inlining feature is part of the secret sauce that makes PostGIS fast and easy to use. So instead of writing geom1 && geom2 AND Intersects(geom1,geom2), a user can write ST_Intersects(geom1,geom2). The short-hand is even more striking when you think of the ST_DWithin function.

With an inlined function, the planner has visibility into the function and breaks apart the spatial index short-circuit test && from the more exhaustive absolute test Intersects(geom1,geom2) and has great flexibility in reordering the clauses in the plan.

Continue reading "STRICT on SQL Function Breaks In-lining Gotcha"

Posted by Leo Hsu and Regina Obe in 8.3, 8.4, 9.0, basics, intermediate, mysql, oracle, postgis, postgresql versions, sql functions, sql server at 05:06 | Comments (3) | Trackback (1)

Saturday, May 29. 2010

PostGIS, SQL Server, Oracle spatial compares and other news

PostGIS, SQL Server 2008 R2, Oracle 11G R2

We just completed our compare of the spatial functionality of PostgreSQL 8.4/PostGIS 1.5, SQL Server 2008 R2, Oracle 11G R2 (both its built-in Locator and Spatial add-on). Most of the compare is focused on what can be gleaned from the manual of each product.

In summary, all products have changed a bit since their prior versions. The core changes:

PostGIS 1.5 has geodetic support now in the form of geography as well as some beefed up functions and additional distance functions like ST_ClosestPoint, ST_MaxDistance, ST_ShortestLine/LongestLine
SQL Server 2008 R2 basic spatial support hasn't changed much when compared to SQL Server 2008, but there is a lot more integration of Spatial into Reporting Services, SharePoint, and in general across SQL Server 2008 R2 and the Office 2010 stack.
Oracle 11G R2 has finally offered an uninstall script for Locator folks who do not care to break the law by accidentally using functions only licensed in Oracle Spatial, but innocently exposed in Oracle Locator. If all that were not great enough, you are now allowed to legally do a centroid if you are using Oracle Locator. Doing unions, intersections, and differences is still a legal no-no for Oracle Locator folks. Oracle now provides Affine transform functions, which have long been provided by PostGIS and have been available via the MPL licensed CLR Spatial package of SQL Server 2008.

I still haven't figured out where this R2 convention started. I thought it was just a Microsoft thing, but I see Oracle follows the same convention as well.

Continue reading "PostGIS, SQL Server, Oracle spatial compares and other news"

Posted by Leo Hsu and Regina Obe in 8.4, editor note, oracle, postgis, sql server at 20:56 | Comments (3) | Trackbacks (0)

Monday, May 17. 2010

Where is soundex and other warm and fuzzy string things

For those people coming from Oracle, SQL Server and MySQL or other databases that have soundex functionality, you may be puzzled, or even frustrated when you try to do something like
WHERE soundex('Wushington') = soundex('Washington')
in PostgreSQL and get a function does not exist error.

Well it does so happen that there is a soundex function in PostgreSQL, and yes it is also called soundex, but it is offered as a contrib module and not installed by default. The module also has other fuzzy string matching functions in addition to soundex. One of my favorites, the Levenshtein distance function, is included as well. In this article we'll be covering the contrib module packaged as fuzzystrmatch.sql. Details of the module can be found in FuzzyStrMatch. The contrib module has been around for some time, but has changed slightly from PostgreSQL version to PostgreSQL version. We are covering the 8.4 version in this article.
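For readers who have not met Levenshtein distance before, it counts the minimum number of single-character inserts, deletes and substitutions needed to turn one string into another. Below is a minimal Python sketch of the textbook dynamic-programming version; it illustrates the idea only and is not the fuzzystrmatch C code, and the function name is our own.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (textbook DP, one row at a time)."""
    # previous[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                        # delete char_a
                current[j - 1] + 1,                     # insert char_b
                previous[j - 1] + (char_a != char_b),   # substitute if different
            ))
        previous = current
    return previous[-1]


print(levenshtein("Wushington", "Washington"))  # 1
print(levenshtein("kitten", "sitting"))         # 3
```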

For those unfamiliar with soundex, it's a basic approach developed by the US Census in the 1930s as a way of sorting names by pronunciation. Read Census and Soundex for more gory history details.
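To give a feel for how soundex groups names, here is a minimal Python sketch of the commonly described American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse repeats, and pad to four characters. This is an assumption-laden illustration, not the fuzzystrmatch implementation, and database implementations can differ on edge cases such as how H and W are handled.

```python
# Consonant groups that share a digit in the commonly described rules.
_GROUPS = [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
           ("L", "4"), ("MN", "5"), ("R", "6")]
_CODES = {letter: digit for letters, digit in _GROUPS for letter in letters}


def soundex(name):
    """Return a 4-character soundex code for an ASCII name (sketch only)."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    previous = _CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(letter, "")  # vowels and Y have no code
        if code and code != previous:
            result += code
        previous = code
        if len(result) == 4:
            break
    return result.ljust(4, "0")


print(soundex("Washington"), soundex("Wushington"))  # W252 W252
print(soundex("Minor"), soundex("Miner"))            # M560 M560
```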

Given that it is an approach designed primarily for the English alphabet, it sort of makes sense why it's not built into PostgreSQL, which has more of a diverse international concern. For example, if you used it to compare two words in Japanese or Chinese, we don't think it would fare too well in any of the database platforms that support this function.

The original soundex algorithm has been improved over the years. Though it's still the most commonly used today, newer variants exist: Metaphone, developed in the 1990s, and Double Metaphone (dmetaphone), developed in 2000, which support additional consonants in other languages such as Slavic, Celtic, Italian, Spanish etc. These two variants are also included in the fuzzystrmatch contrib library. The soundex function still seems to be the most popularly used, at least in the U.S. This is perhaps because most of the other databases (Oracle, SQL Server, MySQL) have soundex built in but not the metaphone variants. So in a sense soundex is a more portable function. The other reason is that metaphone and dmetaphone take up a bit more space and are also more processor intensive to compute than soundex. We'll demonstrate some differences between them in this article.

To enable soundex and the other fuzzy string matching functions included, just run the share/contrib/fuzzystrmatch.sql located in your PostgreSQL install folder. This library is an important piece of arsenal for geocoding and genealogy tracking, particularly with U.S. street and surname data sets. I come from a long line of Minors, Miners, Burnettes and Burnets.

For the next set of exercises, we will be using the places dataset we created in Importing Fixed width data into PostgreSQL with just PSQL.

Continue reading "Where is soundex and other warm and fuzzy string things"

Posted by Leo Hsu and Regina Obe in 8.2, 8.3, 8.4, 9.0, beginner, contrib spotlight, fuzzystrmatch, mysql, oracle, postgresql versions, sql server at 16:53 | Comments (2) | Trackbacks (3)

Thursday, April 01. 2010

CatchMe - Microsoft SQL Server for Unix and Linux

Today Microsoft unveiled their top secret project code named CatchMe. This is their new flagship database for Linux and Unix based on predominantly the PostgreSQL 9.0 code base, but with an emulation layer that makes it behave like SQL Server 2008 R2. Unlike the Windows SQL Server 2008 R2 product, this version is completely free and open source under the Microsoft Public License (Ms-PL). Downloads for the RCs of these will be available soon. Please stay tuned.

Reporter Dat A. Base managed to get an exclusive interview with the head of the project, Quasi Modo. The transcript follows:

Continue reading "CatchMe - Microsoft SQL Server for Unix and Linux"

Posted by Leo Hsu and Regina Obe in 9.0, joke, new in postgresql, oracle, sql server at 12:42 | Comments (2) | Trackback (1)

Thursday, March 04. 2010

In Defense of varchar(x)

This is a rebuttal to depesz's charx, varcharx, varchar, and text and David Fetter's varchar(n) considered harmful. I respect both depesz and David and in fact enjoy reading their blogs. We just have differing opinions on the topic.

For starters, I am pretty tired of the following sentiments from some PostgreSQL people:

99% of the people who choose varchar(x) over text in PostgreSQL in most cases are just ignorant folk and don't realize that text is just as fast if not faster than varchar in PostgreSQL.
[stuff your most despised database here] compatibility is not high on my priority list.
It is unfortunate you have to work with the crappy tools you work with that can't see the beauty in PostgreSQL text implementation. Just get something better that treats PostgreSQL as the superior creature it is.

Continue reading "In Defense of varchar(x)"

Posted by Leo Hsu and Regina Obe in basics, mysql, oracle, sql server at 19:23 | Comments (15) | Trackbacks (0)

Wednesday, January 06. 2010

Looking forward to PostgreSQL 8.5

Ah a new year, a new PostgreSQL release in the works. Beware -- this post is a bit sappy as we are going to highlight those that have made our lives and lives of many a little easier.

These are people we consider the most important because they provide the first impression that newcomers get when first starting off with PostgreSQL. The newcomer that quickly walks out the door unimpressed, is the easy sale you've lost. Make your pitch short and sweet.

As always Hubert does a really good job of taste testing the new treats in the oven and detailing how cool they are. I highly suggest his posts if people have not read them already or are looking at PostgreSQL for the first time. You can catch his Waiting for PostgreSQL 8.5 series which is in progress. Surely gives us a list of things to test drive.

Then there are those who document. The volumes of PostgreSQL documentation are just great, up to date and rich with content. There are probably too many of these people to call out, and sadly we don't know them by name.

Of course it's not enough just to announce releases, document them and talk about them; you must make it really easy for people to try them out. If people have to compile stuff, especially Windows users, forget about it. You won't hear complaints, you won't hear whispers, you'll hear dust blowing. The biggest audience you have is the one you just lost because you didn't make it easy for them to try your stuff. The apple hit me on the head one day when a very dear friend said to me, and here is a slight paraphrase: "You don't actually expect me to compile this myself, do you? How much time do you think I have? It is not about you, it is about me." This was especially surprising coming from a guy I always thought of as selfless. This, I realized, is the biggest problem with many open source projects: they are lost in the flawed mentality that it's about scratching their own itch and the rest will come. It is not. Always concentrating on your own itch and scratching it is a sure way of guaranteeing that no one will scratch your itch for you. Think of it like a pool game. Do you aim at the ball you are trying to hit, or at nearby balls that will knock down the others? So in short, don't be a complete wuss that people can walk all over, but look past your nose and choose your balls wisely; make sure all your balls are not focused on software development.

Continue reading "Looking forward to PostgreSQL 8.5"

Posted by Leo Hsu and Regina Obe in 9.0, mysql, new in postgresql, postgis, sql server at 04:14 | Comments (5) | Trackbacks (0)

Friday, November 06. 2009

PostGIS does Geography

The upcoming version of PostGIS, PostGIS 1.5, will be an exciting one. It has native geodetic support in the form of the new geography type, similar in concept to SQL Server's geography support. For Windows users, we have experimental binary builds hot off the presses for PostgreSQL 8.3 and 8.4.

Continue reading "PostGIS does Geography"

Posted by Leo Hsu and Regina Obe in 8.3, 8.4, gis, new in postgresql, oracle, postgis, sql server at 06:42 | Comments (3) | Trackback (1)

Monday, September 07. 2009

Database Administration, Reporting, and Light application development

One of the most common questions people ask is "Which tools work with PostgreSQL?" In a sense the measure of a database's maturity/popularity is the number of vendors willing to produce management and development tools for it. Luckily there are a lot of vendors producing tools for PostgreSQL and the list is growing. One set of tools people are interested in is database administration, ER diagramming, query tools, and quickie application generators (RAD).

For this issue of our product showcase, we will not talk about one product, but several that fit in the aforementioned category. All the listed products work with PostgreSQL and can be used for database administration and/or architecting or provide some sort of light reporting/rapid application building suite. By light reporting/application building, we mean a tool with a simple wizard that a novice can use to build somewhat functional applications in minutes or days. This rules out all-purpose development things like raw PHP, .NET, Visual Studio, database drivers etc. Things we consider in this realm are things like OpenOffice Base and MS Access. Most of these tools are either free or have 30-day try before you buy options.

You can't really say one tool is absolutely better than another since each has its own strengths and caters to slightly different audiences and also you may like the way one tool does one important thing really well, though it may be mediocre in other respects. We also left out a lot of products we are not familiar with and may have gotten some things wrong.

If we left out your favorite product and you feel it meets these criteria, or you feel we made any errors, please let us know, and we'll add or correct it. We will be including free open source as well as proprietary products in this mix. If we left out what you consider an important criterion, please let us know and we'll try to squeeze it in somewhere.

Continue reading "Database Administration, Reporting, and Light application development"

Posted by Leo Hsu and Regina Obe in beginner, db2, Dbase, firebird, informix, ms access, mysql, oobase, oracle, other dbms, pgadmin, product showcase, sql server, sqlite at 01:54 | Comments (21) | Trackback (1)

Saturday, August 15. 2009

Cross Compare of PostgreSQL 8.4, SQL Server 2008, MySQL 5.1

Comparison of PostgreSQL 8.4, Microsoft SQL Server 2008, MySQL 5.1

In our May 2008 issue of Postgres OnLine Journal, we cross compared Microsoft SQL Server 2005, MySQL 5, and PostgreSQL 8.3. Some people suggested that since 8.4 has now come out, we should go back and update the reference. We deliberated and decided not to. To be fair, all 3 products have released new versions, so it would seem unfair to compare a newer PostgreSQL against older versions of MS SQL Server and MySQL. We have therefore decided to repeat our exercise and include parts people felt we should have covered, as well as comparing the latest and greatest stable release of each product.

People ask us time and time again what the difference is and why you should care which database you use. We will try to be very fair in our comparison. We will show equally how PostgreSQL sucks compared to the others. These are the items we most care about or think others most care about. There are numerous other differences if you get deep into the trenches of each.

Continue reading "Cross Compare of PostgreSQL 8.4, SQL Server 2008, MySQL 5.1"

Posted by Leo Hsu and Regina Obe in basics, mysql, other dbms, sql server at 09:48 | Comments (26) | Trackbacks (2)

Thursday, July 16. 2009

PostgreSQL 8.4: Common Table Expressions (CTE), performance improvement, precalculated functions revisited

Common table expressions are perhaps our favorite feature in PostgreSQL 8.4, even more so than windowing functions. Strangely enough I find myself using them more in SQL Server too now that PostgreSQL supports them.

CTEs are not only nice syntactic sugar, but they also produce better, more efficient queries. To our knowledge only Firebird (see note below), PostgreSQL, SQL Server, and IBM DB2 support this, though I heard somewhere that Oracle does too or is planning to. UPDATE: As noted below, Oracle as of version 9 supports non-recursive CTEs. For recursion you need to use the Oracle proprietary CONNECT BY syntax.

As far as CTEs go, the syntax between PostgreSQL, SQL Server 2005/2008, IBM DB2 and Firebird is pretty much the same when not using recursive queries. When using recursive queries, PostgreSQL and Firebird use WITH RECURSIVE to denote a recursive CTE, whereas in SQL Server and IBM DB2 it's just WITH.

All 4 databases allow you to have multiple table expressions within one WITH clause, and a WITH RECURSIVE clause can contain both recursive and non-recursive CTEs. This makes complex queries, especially ones where the same expressions are used multiple times, a lot easier to write and debug, and also more performant.
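To make the syntax concrete, here is a minimal sketch that puts one recursive and one non-recursive CTE in the same WITH clause. It runs against Python's built-in sqlite3 module purely as a convenient stand-in (SQLite also uses the WITH RECURSIVE spelling); the CTE names counter and squares are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

rows = conn.execute("""
    WITH RECURSIVE
    counter(n) AS (            -- recursive CTE: generates 1..5
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM counter WHERE n < 5
    ),
    squares AS (               -- non-recursive CTE that reuses the first
        SELECT n, n * n AS sq FROM counter
    )
    SELECT n, sq FROM squares ORDER BY n
""").fetchall()

print(rows)  # [(1, 1), (2, 4), (3, 9), (4, 16), (5, 25)]
```

On SQL Server or IBM DB2 the same query would, per the note above, drop the RECURSIVE keyword.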

In our article on How to force PostgreSQL to use a pre-calculated value we talked about techniques for forcing PostgreSQL to cache a highly costly function. For PostgreSQL 8.3 and below, the winning solution was using OFFSET, which is not terribly cross platform and has the disadvantage of materializing the subselect. David Fetter had suggested for 8.4, why not try CTEs. Yes, CTEs are not only syntactically nice and more portable, but they also help you write more efficient queries. To demonstrate, we shall repeat the same exercise we did in that article, but using CTEs instead.

Continue reading "PostgreSQL 8.4: Common Table Expressions (CTE), performance improvement, precalculated functions revisited"

Posted by Leo Hsu and Regina Obe in basics, cte, db2, firebird, intermediate, oracle, sql server at 22:40 | Comments (6) | Trackbacks (0)

Monday, July 13. 2009

PostgreSQL 8.4 Faster array building with array_agg

One of the very handy features introduced in PostgreSQL 8.4 is the new aggregate function called array_agg, which is a companion function to the unnest function we discussed earlier. This takes a set of elements, similar to what COUNT, SUM etc. do, and builds an array out of them. This approach is faster than the older array_append and array_accum approaches since it does not rebuild the array on each iteration.

Sadly it does not appear to be completely swappable with array_append as there does not seem to be a mechanism to use it to build your own custom aggregate functions that need to maintain the set of objects flowing thru the aggregate without venturing into C land. This we tried to do in our median example but were unsuccessful.

In PostGIS 1.4 Paul borrowed some of this array_agg logic to make the PostGIS spatial aggregates much much faster with large numbers of geometries. So collecting polygons or making a line out of say 30,000 geometries which normally would have taken 2 minutes or more (just accumulating), got reduced to under 10 seconds in many cases. That did require C code even when installed against PostgreSQL 8.4. Though in PostGIS you reap the benefits as far as geometries go even if you are running lower than 8.4.

We had originally thought array_agg was a PostgreSQL only creation, but it turns out that array_agg is a function defined in the ANSI SQL:2008 specs and for one appears to exist in IBM DB2 as well. I don't think Oracle or any other database supports it as of yet.

As we had demonstrated in the other article, we shall demonstrate the olden days and what array_agg brings to the table to make your life easier.

Continue reading "PostgreSQL 8.4 Faster array building with array_agg"

Posted by Leo Hsu and Regina Obe in 8.4, basics, db2, intermediate, postgis at 22:55 | Comments (2) | Trackback (1)

Wednesday, July 01. 2009

Window Functions Comparison Between PostgreSQL 8.4, SQL Server 2008, Oracle, IBM DB2

PostgreSQL 8.4 has ANSI SQL:2003 window functions support. These are often classified under the umbrella terms of basic Analytical or Online Analytical Processing (OLAP) functions. They are used most commonly for producing cumulative sums, moving averages and generally rolling calculations that need to look at a subset of the overall dataset (a window frame of data), often relative to a particular row. For users who use SQL window constructs extensively, their absence may have been one reason in the past not to give PostgreSQL a second look. While you may not consider PostgreSQL as a replacement for existing projects because of the cost of migration, recoding and testing, this added new feature is definitely a selling point for new project consideration.
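For anyone who has not seen a window function, here is a minimal sketch of the cumulative sum and moving average mentioned above. It uses Python's built-in sqlite3 module as a stand-in, so it assumes the bundled SQLite is version 3.25 or later and says nothing about which frame clauses a given PostgreSQL release accepts. The sales table and its columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (day INTEGER, amount INTEGER);
    INSERT INTO sales VALUES (1, 10), (2, 20), (3, 30), (4, 40);
""")

rows = conn.execute("""
    SELECT day,
           amount,
           -- cumulative sum: every row up to and including this one
           SUM(amount) OVER (ORDER BY day) AS running_total,
           -- moving average over this row and the one before it
           AVG(amount) OVER (ORDER BY day
                             ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS moving_avg
    FROM sales
    ORDER BY day
""").fetchall()

for row in rows:
    print(row)
# (1, 10, 10, 10.0)
# (2, 20, 30, 15.0)
# (3, 30, 60, 25.0)
# (4, 40, 100, 35.0)
```

Unlike a GROUP BY, each input row survives in the output; the OVER clause only decides which neighboring rows feed the calculation.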

If you rely heavily on windowing functions, the things you probably want to know most about the new PostgreSQL 8.4 offering are:

What SQL window functionality is supported?
How does the PostgreSQL 8.4 offering compare to that of the database you are currently using?
Is the subset of functionality you use supported?

To make this an easier exercise we have culled through the documents of the other database vendors to distill what SQL windowing functionality they provide in their core product. If you find any mistakes or ambiguities in the below, please don't hesitate to let us know and we will gladly amend.

For those who are not sure what this is and what all the big fuss is about, please read our rich commentary on the topic of window functions.

Continue reading "Window Functions Comparison Between PostgreSQL 8.4, SQL Server 2008, Oracle, IBM DB2"

Posted by Leo Hsu and Regina Obe in 8.4, advanced, basics, db2, firebird, oracle, sql server, window functions at 13:00 | Comment (1) | Trackbacks (0)
