Saturday, August 15. 2009
Recommended Books: MySQL Administrator's Bible SQL Server 2008 Administration in Action PostgreSQL 8.4 - Server Administration
Comparison of PostgreSQL 8.4, Microsoft SQL Server 2008, MySQL 5.1
In our May 2008 issue of Postgres OnLine Journal, we cross compared Microsoft SQL Server 2005, MySQL 5, and PostgreSQL 8.3. Some people mentioned well since 8.4 has now come out, shouldn't we go back and update the reference. We deliberated and decided not to. To be fair all 3 products have released new versions, so it would seem unfair to compare a newer PostgreSQL against older versions of MS SQL Server and MySQL. We have therefore decided to repeat our exercise and include parts people felt we should have covered, as well as comparing the latest and greatest stable release of each product.
People ask us time and time again what's the difference why should you care which database you use. We will try to be very fair in our comparison. We will show equally how PostgreSQL sucks compared to the others. These are the items we most care about or think others most care about. There are numerous other differences if you get deep into the trenches of each.
People have been posting comments on Reddit as well
SQL Server 2008 R2 CTP
One of the interesting things to come out this week is the SQL Server 2008 R2 CTP One of the enhancements we are really looking forward to and interested in experimenting with is the Report Builder 3.0/Reporting Services support for Geospatial data. We
Weblog: BostonGIS Blog
Tracked: Aug 15, 09:37
SQL-andmebaaside võrdlus: MySQL, Postgresql, SQL Server ja muud
Teinekord on vaja omavahel võrrelda erinevaid andmebaasiservereid. Olen selleks kasutanud näiteks selliseid lehti: enamus põhilistest andmebaasidest MySQL, Postgresql, SQL Server
Weblog: Kirjutisi seinal
Tracked: Dec 07, 09:56
Recommended Books: MySQL Administrator's Bible SQL Server 2008 Administration in Action PostgreSQL 8.4 - Server Administration
Display comments as (Linear | Threaded)
For whatever it is worth...
Date Time support in MySQL I think exists better than you think in the datatype TIMESTAMP. It is stored in gmt and converted automatically to the tz of the session of the db connections. It is also possible to do conversion. Is this the same as storing a tz, no. I think it is better.
Table Partitioning is very much in 5.1 (and it is very much ugly). So I suggest that you revise "Yes?".
Access tables from other databases on same server ( but not easily across servers): on MySQL this exists in Federated (and Federated-X) and is easy, though not recommended for many work loads.
The TIMESTAMP data type of MySQL is very limited in its date range: it cannot store dates before 1970 and not after 2038...
Comparing DBs is not an easy task, indeed.
Nonetheless, I'd have divided the comparison in sections: costs (money, human, operational and maintenance), features, performance, reliability.
Then I'd have added Oracle.
Finally I'd have cut away all those topics with the same values (like readonly views or stored procedures.
Finally, the comments on the MySQL column seem too skinny when compared to with the ones in PG and MS columns. This could lead the reader to perceive some kind of bias, which is human anyway.
A referecne to the online documentation for each technical topic would moreover give more confidence in the comparison.
good ideas -- we'll change to include links to the online docs where relevant.
You are right its hard not to be biased in these things. Though we work with all 3, we have worked more intimately with PostgreSQL and SQL Server.
To add we wanted to include all the Yes Yes Yes for 2 reasons
1) these answers are a bit different from last time.
2) This is also meant for people migrating from some other db, who aren't sure if the other covers this important feature. So even though in some cases the answers are yes in all, we thought it important to include.
The discussion about maintenance is difficult. I don't think there is a clear cut answer. If you are a unix user and someone slaps you with a choice of SQL Server or PostgreSQL or MySQL. Pg or MySQL are obviously more comfortable and easier for you to deploy.
In case of Microsoft users, its even less clear cut. If you talk about hard-core database users who are very into security, programming in the db and so forth. They would find MySQL extemely cumbersome. But for someone who knows little about databases and just needs one -- MySQL's simplicity is a feature -- as all the features in SQL Server and PostgreSQL would just be noise. Then there is the issue of ISP support and so forth. I'm not sure how to categorize all those things into a cohesive chart.
Some remarks, for the sake of completeness:
"Partial Indexes - e.g. you want to create a unique index but only consider non-null values"
Yes. MySQL supports it. It's called UNIQUE, and differs from PRIMARY because it allows NULL values.
"ACID compliance": you mention MariaDB, which is not an engine but a fork of MySQL that includes the "maria" engine. The "Maria" engine is like MyISAM with crash recovery, but it is far from being ACID. So, only InnoDB at the moment is ACID. You may instead mention PBXT, a community developed engine, which is ACID.
"UPSERT logic". MySQL also supports INSERT ... ON DUPLICATE KEY UPDATE ...
"Table partitioning" is supported in MySQL 5.1, which has been GA since November 2008
"case insensitivity" in MySQL it depends on the character set used.
Partial index is not described in that way. Just a FYI.
Can you compare perfomance against all three? I always feel that mysql joins are extremly slow and would love to see real world comparisons against the other two. I've always been happy with ms-sql but would like to see how postgres performs.
Its hard to give these tests without coming up with some example that everyone will claim is biased against one database or that you didn't optimize the database right. So I tend not to like to do that.
So my off the cuff answer -- which is biased based on working with all 3.
Yes Mysql join is kind of slow compared to SQL Server and PostgreSQL especially when you have a lot of joins. I think many people agree with me on that.
In terms of the speed difference with complex queries between SQL Server and PostgreSQL. I don't think there is a clear winner. See my note to Ryan -- who says he has better luck with SQL Server. My gut sense is that just looking at the architecture and the fact that SQL Server has a more complex planner strategy, that if you have say 4 or more cpus and have license for all 4 your sql server, SQL Server will perform better on an individual query because of its parallelism. However my general impression (which I haven't timed) is if you have more people hitting the server with complex queries, PostgreSQL seems to perform better (not sure if its because of its free threading model or if that makes a difference), but it seems to handle more queries at a time a bit better than SQL Server. I'll need to experiment with that a bit to turn feeling into hard numbers. In the multi case the processors are used to server different users. In the parallelism the processors are used to divide the work of one query.
You missed "GUI Front-end that does 90% of what you need to do without you having to remember the f-in SQL"
2008 - YES, and it's fantastic
Everything else - what's a GUI? Command-lines-roolz.
What makes you claim PostgreSQL's planner is the best? Personally, I get far better plans with Microsoft SQL's planner than Postgres' for complex queries.
I'm pretty sure IronPython cannot be used for stored procedures because it relies on runtime code generation which is prohibited, even in unsafe mode.
You should add in a row for Index Organized Tables/Clustered Indexes.
I guess Best wasn't quite right. We should have said Best for us.
Which version of SQL Server are you running? With Standard for example you don't get partitioning -- so can't talk about how well SQL Server handles that without a big disclaimer.
With PostgreSQL you get simple partitioning you get that and it works pretty well with constraint_exclusion -- so that we are generally able to run more complex bigger db workloads on PostgreSQL without shelling out a lot of money.
The other thing that makes PostgreSQL planner better in many cases is the IN LINING of SQL functions. SQL Server sees all functions as Opaque so doesn't take advantage of indexes that can be used within a function that can be folded into the overall plan. You can wrap pretty complicated constructs in an sql function.
PostgreSQL also allows you to cost functions differently which I don't think SQL Server has yet so in conditions where the planner has to decide which function to test first -- it can look a the cost of each and use the less costly first. Lets just say when I started using PostgreSQL, I was very impressed with its speed and ease with which I could use my SQL Server knowledge right away.
For our spatial queries, PostGIS is just better. The SQL Server 2008 spatial selectivity still seems pretty weak and it often brain-deadly doesn't use a spatial index when it obviously should, but even when it does it is still a bit slower (unless you fiddle around with gridding and so forth). This all may change in the SQL Server 2008 R2 (which the CTP1 has just been released). So of course we are a bit biased in this regard.
IronPython. I haven't tried writing a stored proc in IronPython, but I gathered it was possible by this
http://ironpython.codeplex.com/WorkItem/View.aspx?WorkItemId=21740. I should probably give it a spin since it seems rot with problems from above.
As far as PostgreSQl, well its almost brain dead easy to write a python stored procs and lots of people do.
-- Ah yes clustered indexes -- we'll add that to our list.
One of the coolest Postgres features that you don't mention is that in Postgres almost all DDL is transactional, so you can roll it back safely. This is terrific for things like upgrade scripts - if the script fails somewhere you are not left with the database in a half upgraded state. It's also a great boon for developers. Most databases don't have this.
There are also several other replication methods for Postgres that you didn't mention. One I like is Londiste, from the Skype people, and the soon to be (I hope) new release of which looks very cool indeed.
Andrew -- good points. The transaction one I think we should include too. Its important feature for people packaging applications. For the replication thing. I think that may deserve another article since there are many replication products out there and many that will help with cross replication against disparate products too.
So it seems a bit unfair to talk about things that aren't built into the core product in this article.
Granted part of what makes Microsoft SQL server much longer install -- is that its really 5 or 6 products (and some that you can't just not check) and if you are like most who just check most everything (I want reporting services, ssis, database server, management tools, notification this and that) -- Its like installing a whole OS, but at least you can say its part of the same product package even though a lot of that stuff has not much to do with what makes a database good.
I believe SQL Server has Partial Indexes in 2008 - SQL Server calls them Filtered Indexes. You can index on non-null values, or numbers 1, etc http://sqlfool.com/2009/04/filtered-indexes-what-you-need-to-know/
PostgreSQL does support Materialized Views, but they're not automatic... You have to write your own Triggers maintain the table(s).
The cool thing about Oracle's MViews is that they are used even if the original SQL doesn't include a reference to them. If Oracle's optimizer sees that a SQL statement (or even part of it) could be replaced with a select to a MV it does that in the background.
That is something that is not available with either "manual" MVs in Postgres (or any other DBMS) nor the indexed views of SQL Server.
Great article, and apparently great site. But design of the main page is quite confusing.
I mean I wouldn't guessed the menu on the left and the links on the top are The Main Thing on the page. It took me while to figure out all the golden! articles are hidden under the crude menu on the left. (Not minutes ofcourse, but I went back and forth couple of times when looking for "the category the article belonged to".)
On 1600x1200 res. the main screen looks like your site has partial crash or the page is broken, there is tons of white space and some crude plain-text links.
PLZ don't get this like I am spitting on your great site. It is likely to become my PostgreSQL knowledge-source anyways, I just think there is lot to polish UI-wise.
Great comparism, thanks! We mostly use MS SQL though and have good results with it.
At least some mention of 64-bit OS seems appropriate nowadays. Most people developing new apps are probably considering that question. And in that vein, it seems like postgresql is way behind. We'd like to use postgresql, but definitely going to use .NET in 64 bit Windows OS. So it makes us pull back.
PostgreSQL runs natively 64-bit on Linux fine and has since I think pretty much the existence of 64-bit Linus. We also run it on our 64-bit Windows 2003 and 2008 servers.
The only caveat is that it runs in WOW mode in Windows, however it utilizes the 64-bit memory allocation since it delegates that to the OS so can use more than the 4 gig or I forget limit that 32-bit has. So in short it runs better on 64-bit windows than it does on 32-bit windows.
The database drivers come in 32-bit and 64-bit variants and those run in native 64-bit space. So as far as .NET goes its pretty much a non-issue regardless of if you are building a 32-bit or 64-bit application. Magnus post probably does a better job of explaining the 64-bit windows question.
We've been really happy with its performance on both Linux and Windows.
Hope that answers your questions. :)
Why Move to MySQL from Microsoft SQL Server?
Although installing software isn’t a major factor when looking at databases, it can still come into play if you have many servers that you commission and upgrade. The last SQL Server 2008 Enterprise download I utilized was 1.6GB in size with another download required for a new .NET framework install, which ironically, was the same size as the full MySQL 5.1 GA install for Windows – 150MB. I can install MySQL on my WIN boxes and be at a MySQL command utility prompt in under 5 minutes, but a SQL Server install takes much longer (with your mileage varying depending on the server you use). Just the .NET framework install took more than 5 times the installation time of the MySQL Server on one of my test machines.
But as I said above, installation isn’t a big consideration with databases. And some may think that they small size of the MySQL WIN download is indicative of the fact that the database server isn’t very feature rich and therefore that’s why it’s so tiny. We’ll more fully dispel that myth later in this article, but suffice it to say that such an assertion is false. Good things do come in small packages, as they say. One of my favorite examples to illustrate this point is one of our partner storage engines – InfoBright. Their MySQL-based data warehouse engine is only a 17MB download, installs in about 3 minutes, and can mange up to 30TB of data on a single server with incredibly fast response times.
Fairly fast indexes and effective planer
You can wrap pretty complicated constructs in an sql function.
combination with advanced aggregate and "windowed" functions, you can make really complex calculations within single query between many tables in plain sql language, where in other
databases you are forced to write complex stored procedures to achieve same result with many times slower performance.
Those are the things I cherish most in Postgres.
If you need a database for "static" data storage i would recommend MySQL.
But if you want database with complex "data calculations calculations" on the fly (for example FIFO,LIFO calculations), definitely Postgre,
none of other databases can do what postgre can, not even close.(except oracle a little)
Syndicate This Blog
Show tagged entries
Remote RSS/OPML-Blogroll Feed
No RSS/OPML feed selected