Thursday, March 04. 2010
Printer Friendly
This is a rebuttal to depesz's charx, varcharx, varchar, and text
and David Fetter's varchar(n) considered harmful.
I respect both depesz and David and in fact enjoy reading their blogs. We just have deferring opinions on the topic.
For starters, I am pretty tired of the following sentiments from some PostgreSQL people:
- 99% of the people who choose varchar(x) over text in PostgreSQL in most cases
are just ignorant folk and don't realize that text is just as fast if not faster than varchar in PostgreSQL.
- stuff your most despised database here compatibility is not high on my priority list.
- It is unfortunate you have to work with the crappy tools you work with that can't see the beauty in PostgreSQL text implementation.
Just get something better that treats PostgreSQL as the superior creature it is.
Continue reading "In Defense of varchar(x)"
Wednesday, January 06. 2010
Printer Friendly
Ah a new year, a new PostgreSQL release in the works. Beware -- this post is a bit sappy as we are going to highlight those that have made our lives and lives of many a little easier.
These are people we consider the most important because they provide the first impression that newcomers get
when first starting off with PostgreSQL. The newcomer that quickly walks out the door unimpressed, is the easy sale you've lost. Make your pitch short and sweet.
As always Hubert does a really good job of taste testing the new treats
in the oven and detailing how cool they are. I highly suggest his posts if people have not read them already or are
looking at PostgreSQL for the first time.
You can catch his Waiting for PostgreSQL 8.5 series which is in progress.
Surely gives us a list of things to test drive.
Then there are those that document, the volumes of PostgreSQL documentation which are just great, up to date and rich with content. Probably too many of these
people to call out, and sadly we don't know them by name.
Of course its not just enough to announce releases, document them and talk about them, you must make it really easy for people to try them out.
If people have to compile stuff, especially windows users, forget about it.
You won't hear complaints, you won't hear whispers, you'll hear dust blowing. The biggest audience you have is the one you just lost
because you didn't make it easy for them to try your stuff. The apple hit me on the head one day when a very dear friend said to me
and here is a slight paraphrase.
You don't actually expect me to compile this myself do you? How much time do you think I have? It is not about you, it is about me..
This was especially surprising coming from a guy I always thought of as selfless.
This I realized is the biggest problem with many open source projects, that they are lost in the flawed mentality that its about scratching
their own itch and the rest will come. It is not. Always concentrating on your own itch and scratching it is a sure way of guaranteeing that no one will scratch your itch for you.
Think of it like a pool game. Do you target the aim at the ball you are trying to hit, or balls near by that will knock down the others.
So in short don't be a complete wuss that people can walk all over, but look past your nose and choose your balls wisely; make sure all your balls are not
focused on software development.
Continue reading "Looking forward to PostgreSQL 8.5"
Monday, September 07. 2009
Printer Friendly
One of the most common questions people ask is Which tools work with PostgreSQL. In a sense the measure of a database's
maturity/popularity are the number of vendors willing to produce management and development tools for it. Luckily there are a lot of vendors producing tools for PostgreSQL and the list is growing.
One set of tools people are interested in are Database administration, ER diagramming, Query tools, and quickie application generators (RAD).
For this issue of our product showcase, we will not talk about one product, but several that fit in the aforementioned category.
All the listed products work with PostgreSQL and can be used for database administration and/or architecting or provide some sort of
light reporting/rapid application building suite. By light reporting/application building, we mean
a tool with a simple wizard that a novice can use to build somewhat functional applications in minutes or days. This rules out all-purpose development
things like raw PHP, .NET, Visual Studio, database drivers etc. Things we consider in this realm are things like OpenOffice Base and
MS Access. Most of these tools are either free or have 30-day try before you buy options.
You can't really say one tool is absolutely better than another since each has its own strengths and caters to slightly different audiences and also
you may like the way one tool does one important thing really well, though it may be mediocre in other respects. We also left out a lot of products we are not familiar with and may have gotten
some things wrong.
If we left out your favorite product and you feel it meets these criteria, or you feel we made any errors, please let us know, and we'll add or correct it.
We will be including Free open source as well as proprietary products in this mix. If we left out what you consider an
important criteria, please let us know and we'll try to squeeze it in somewhere.
Continue reading "Database Administration, Reporting, and Light application development"
Saturday, August 15. 2009
Printer Friendly
Comparison of PostgreSQL 8.4, Microsoft SQL Server 2008, MySQL 5.1
In our May 2008 issue of Postgres OnLine Journal, we cross compared Microsoft SQL Server 2005, MySQL 5, and PostgreSQL 8.3.
Some people mentioned well since 8.4 has now come out, shouldn't we go back and update the reference. We deliberated and decided not to.
To be fair all 3 products have released new versions, so it would seem unfair to compare a newer PostgreSQL against older versions of MS SQL Server and MySQL.
We have therefore decided to repeat our exercise and include parts people felt we should have covered, as well as comparing the latest and greatest stable release of each product.
People ask us time and time again what's the difference why should you care which database you use. We will
try to be very fair in our comparison. We will show equally how PostgreSQL sucks compared to the others. These are the items we
most care about or think others most care about. There are numerous other differences if you get deep into the trenches of each.
Continue reading "Cross Compare of PostgreSQL 8.4, SQL Server 2008, MySQL 5.1"
Tuesday, June 09. 2009
Printer Friendly
You'll often hear the term planner statistics thrown around by database geeks. Did you update your statistics. This lingo isn't even limited
to PostgreSQL, but is part and parcel to how most decent databases work. For example in PostgreSQL you do a vacuum analyze to update your planner statistics in addition
to cleaning up dead space. In SQL Server you do an UPDATE STATISTICS . In MySQL you do an
ANALYZE TABLE or a more invasive OPTIMIZE TABLE .
Normally all this "update your stats so your planner can be happy" is usually unnecessary unless
you just did a bulk load or a bulk delete or you are noticing your queries are suddenly slowing down. These stat things are generally updated behind the scenes by most databases
on an as needed basis.
What makes SQL really interesting and a bit different from procedural languages is that it is declarative (like functional and logical programming languages) and relies on the database planner to come up with strategies for navigating the data. Its strategy is not fixed as it is in procedural languages.
A big part of this strategy is decided on by the query planner which looks at distributions of data. Given different WHERE conditions for similar queries, it could come up with vastly different strategies if one value has a significantly
higher distribution in a table than another. This is also the mystery of why it sometimes refuses to use an index on a field because it has decided a table scan is more efficient and also why some people consider HINTS evil because they pollute the imperative nature of the language.
Continue reading "Planner Statistics"
Thursday, February 19. 2009
Printer Friendly
This article is a bit of a companion to our article on Setting up PostgreSQL as a Linked Server in Microsoft SQL Server 64-bit
In this article we shall demonstrate using Microsoft SQL Server 2005/2008 OPENQUERY AND OPENROWSET to add, delete and update data in PostgreSQL.
First we must start by saying there are a number of ways of copying data between databases. While OPENROWSET is not necessarily the fasted,
in certain cases such as when you are wrapping this in a stored procedure, it is one of the most convenient ways of doing this.
Why on earth would you want to copy data back and forth between 2 servers and 2 disparate DBMS systems for that matter?
We all would like to think we are an island and live in a world with one DBMS system, but we don't. There are many reasons for having multiple DBMS providers in
an organization. Some are better for some things than others, some are more integrated in an environment -- (for example in a windows shop the SQL Server drivers are already loaded on all
windows machines, but PostgreSQL provides the advantage of being able to run on more platforms such a FreeBSD/Unix/Linux box and with cheaper cost and more options for PL programming so is often better for a front-facing DMZ accessible database),
and there are numerous other reasons that are too hard to itemize. The other question of why triggering from SQL Server instead of PostgreSQL is because
its just a little easier from Microsoft SQL Server. The OPENROWSET and OPENQUERY logic that SQL Server provides is just simply better and easier to use than the dblink provided for PostgreSQL. Anyrate with that said lets move on with the show.
Although this example is focused on using PostgreSQL with Microsoft SQL Server, the same technique applies when
copying retrieving updating data from other databases such as MySQL or Oracle or DB II.
Continue reading "Using Microsoft SQL Server to Update PostgreSQL Data"
Sunday, January 04. 2009
Printer Friendly
Problem
You are developing a consultant search application where by a user looking for a consultant can check all the items in the list they require a consultant to have.
You have 2 tables. consultant and consultant_qual. The consultant_qual has primary key formed by 2 fields consultant_id and qual_id.
qual_id contains a constrained list
with things like 'OpenLayers', 'PostGIS', 'Python', 'C#' etc.
How do you devise a query that given a list of checked options by the user, only returns consultants with not 1 but all of those qualifications?
Continue reading "How to require all checked conditions are met by a result"
Friday, October 31. 2008
Printer Friendly
PostgreSQL 8.4 will have a ROW_NUMBER() windowing function so this little hack will
hopefully be unnecessary when 8.4 is in production.
Getting back to this exercise, this was actually inspired by Hubert's recent article Tips N’ Tricks - setting field based on order. Why this inspired me, I guess because it stirred
up memories about the often forgotten utility of arrays in PostgreSQL and I thought it would answer a question that was haunting
me - How do I assign sequential numbers to a list?. The article just didn't quite read the way I expected it to and actually
was answering another question I cared much less about, but it did get the
juices flowing. So without much further ado.
Continue reading "Simulating Row Number in PostgreSQL Pre 8.4"
Sunday, October 05. 2008
Printer Friendly
MySQL is turning out to be one big soap opera as far as I can tell and as Bruce Momjian has mentioned.
Lets go over some of the interesting episodes of this saga:
- First Sun buys MySQL
- Falcon storage engine creator Jim Starkey leaves MySQL/Sun
- Brian Aker heads Drizzle which is a fork of MySQL that hopes to be a stream-lined
implementation of MySQL that leaves out all that nonsense we don't need such as views and stored procs
and targeting it self for running on the cloud, optimizing for massive concurrency, and ease of install. I presume Brian still works for MySQL/Sun though.
- Michael "Monty" Widenius Leaves MySQL and Sun
to help with this Drizzle thing, but will still be working on Maria
according to Brian (see comment below).
- And now Jay Pipes leaves as well to work on Drizzle. He will still be a Sun staff engineer on Drizzle, but not the MySQL community leader.
I'm not sure what all these things say about the stability of the MySQL core. I mean should I stick with MySQL 5 or run for the Drizzle, but I think I'll stick with PostgreSQL
where ever I can. PostgreSQL may not be quite as interesting from a soap opera perspective,
but it seems a tad bit more dependable and I really like my views and stored functions.
Tuesday, August 12. 2008
Printer Friendly
Microsoft Access has these peculiar set of aggregates called First and Last. We try to avoid them because while the concept is useful, we find Microsoft Access's implementation of them
a bit broken. MS Access power users we know moving over to something like MySQL, SQL Server, and PostgreSQL often ask - where's first and where's last?
First we shall go over what exactly these aggregates do in MS Access and how they are different from MIN and MAX and what they should do in an ideal world. Then we shall create our ideal
world in PostgreSQL.
Continue reading "More Aggregate Fun: Who's on First and Who's on Last"
Friday, August 01. 2008
Printer Friendly
Every once in a while - particularly if you are using inherited tables, you forget to put an important index on one of your tables
which bogs down critical queries. Its sometimes convenient to inspect the index catalog to see what tables are missing indexes or
what tables are missing a critical index. Normally we try to stick with querying the information_schema because queries against that
schema work pretty much the same in PostgreSQL as they do in SQL Server and MySQL. For most of the examples below we had to delve into pg_catalog schema territory
since there was no view we could find in information_schema that would give us enough detail about indexes.
Continue reading "How to determine which tables are missing indexes"
Saturday, July 19. 2008
Printer Friendly
As many people who know us know we sit on several camps especially when it comes to databases.
The camps we sit on are growing rather than shrinking.
While we do have our favorites, we understand that peoples needs and comfort levels are different from ours and we try to take that into
consideration when making recommendations to people. The only thing that is generally true about the clientele we consult for is that they
fit one of the following features:
- Very minimal bureaucratic structure - this generally rules out most fortune 500 companies
and shall we say smaller companies who are too bureaucratic for their own good
- Dot com startups/Niche product developers who are looking to keep costs down to a minimum without too much fuss and are trying to produce a product to change the world
- Small companies who have a relatively low IT budget, but are predominantly windows-based
- Mid-sized companies predominantly windows-based or departments with decent IT staff,
who are looking for something their staff can easily maintain rather than simply keeping licensing costs down
It has come up as a topic of discussion, now that SQL Server 2008 is coming out soon and with its new fangled geodetic spatial support,
how does this change things?
The short answer is - not much except to increase awareness of spatial databases and to give us more business. As part of our due diligence work
we have put together a comparison of the 3 databases spatial functionality -
Cross Compare SQL Server 2008 Spatial, PostgreSQL/PostGIS 1.3-1.4, MySQL 5-6
to compliment our Cross Compare of SQL Server, MySQL, and PostgreSQL
Continue reading "More Database Comparisons"
Sunday, June 08. 2008
Printer Friendly
The PostgreSQL 8.4 planned release is March 1, 2009 and is outlined in the PostgreSQL 8.4 Development plan.
It has just passed its May 2008 commit fest milestone and is currently in its July 2008 Commit Fest. Lots of PostgreSQL Planet bloggers have started showcasing some of the new features in store.
We will briefly list our favorite planned and already committed patches.
Continue reading "PostgreSQL 8.4 goodies in store"
Sunday, May 18. 2008
Printer Friendly
Comparison of Microsoft SQL Server 2005, MySQL 5, and PostgreSQL 8.3
The below is by no means an exhaustive comparison of these 3 databases and functionality may not
be necessarily ordered in order of importance. These are just our experiences with using these 3 databases.
These are the databases we use most often. If we left your favorite database out - please don't take offense. Firebird for one has some neat features such as its small footprint and extensive SQL support, but we have not explored that Db.
People ask us time and time again what's the difference why should you care which database you use. We will
try to be very fair in our comparison. We will show equally how PostgreSQL sucks compared to the others. These are the items we
most care about or think others most care about. There are numerous other differences if you get deep into the trenches of each.
For those looking to compare MySQL and PostgreSQL you may want to also check out http://www.wikivs.com/wiki/MySQL_vs_PostgreSQL
If you really want to get into the guts of a relational database and the various parts that make it up and how the various databases differentiate in their implementations,
we suggest reading Architecture of a Database System by
Joseph M. Hellerstein, Michael Stonebraker, and James Hamilton. Architecture of a Database System focuses mostly on Oracle, DB2, and SQL Server but does provide some insight into MySQL and PostgreSQL.
Continue reading "Cross Compare of SQL Server, MySQL, and PostgreSQL"
Saturday, May 10. 2008
Printer Friendly
One of the great selling points of PostgreSQL is its pluggable PL language architecture. MySQL is known for its pluggable storage and
PostgreSQL is known for its pluggable PL language architecture. From Monty's notes on slide 12 looks like MySQL
may be working on a pluggable PL language architecture of their own.
The most common of these languages are the all-purpose languages SQL and C (these are built-in and not really PLs like the others, but we'll throw them in there),
PLPgSQL which is also built-in but not always enabled, PL/Perl, PL/Python, and the domain specific languages PL/R, PL/SH and gaining popularity Skype released PL/Proxy.
There are others in the family such as PL/Tcl, PL/PHP, PL/Ruby, PL/Scheme (a dialect of Lisp), PL/Java, PL/Lua and PL/LOLCode (for kicks and as a reference implementation. Think of LOLCode as PostgreSQL Pluggable PL equivalent of MySQL's BLACK HOLE storage engine.) .
The other interesting thing about the PostgreSQL PL language architecture is that it is a fairly thin wrapper around these languages.
This means the kind of code you write in those languages is pretty much what you would write if you were doing general programming
in those languages minus some spi calls. Since the handler is a just a thin wrapper around the environment, the language environment must be installed on the database server before you can use the PL language handler. This means you can have these functions utilized in your SQL statements and you can write
in a language you feel comfortable with if you can get the darn PL compiled for your environment or someone has already kindly compiled it for your environment or that it is even compilable for your environment. The pluggable PL architecture means you can
write a PL Handler for your favorite language or invent your own language that you can run in the database. In the end
the barrier between code,data, and semantic constructs is more of a constraint imposed by compilers. If you have any doubts about the above statement, you need only look at some javascript injection attacks
to bring the statement home. One of my fantasies is
developing a language that morphs itself, that utilizes the database as its morphing engine and its OS and that breaks the illusion of data being data, code being code, and lacks rigid semantics.
Of the languages we have worked with, SmallTalk comes closest to a language that satisfies these ideals and Lisp to a much lesser extent. Lisp lacked the semantic elegance of SmallTalk among other things.
Most people are used to having their procedural language push their data around. PL code living in PostgreSQL allows your data to push your procedural code around in a set-based way. This is a simple but pretty powerful feature since data is in general more fluid than code. For interpretated/just-in time compiled languages it can live in the database, for compiled it has to call compiled functions.
Continue reading "Choosing the right Database Procedural Language PL"
|