PostgreSQL vs. MySQL comparison
prostoalex writes "Ever find yourself wondering which open source database is the best tool for the job? Well, wonder no more, and let your tax dollars do the work in the form of Fermi National Accelerator Laboratory publishing this unbiased review of MySQL vs. PostgreSQL. After reading it, however, it seems that MySQL ranks the same or better on most of the accounts." My poor sleepy eyes misread the date of posting on here; caveat that this is more then 15 months old.
From TFA:
Foreign keys are nice, I have to say; I implement them in mysql anyway, in spite of the fact that they're ignored for MyISAM.
1. There's no such thing as unbiased. Especially on a page that gives a fairly abstract review.
2. This article is 2 years old. Everything in its comparisons is out of date.
Javascript + Nintendo DSi = DSiCade
From the site:
"Last modified: February 15, 2005."
Having foreign keys, views, subselects, and transactions can all be very attractive in PostgreSql -
if you need them and you will make any use of them. If you don't need them or won't use them, then
you're probably better off with MySQL and its superior performance.
PostgreSql is more stable than MySQL, (and has better performance when saturated), shouldn't you take that into consideration?
thank God the internet isn't a human right.
postgresql has a native Win32 version, complete with an installer, service support and does not depend on cygwin.
Just take a look at the description per item. I couldn't possibly call this unbiased in any way.
I have been involved with a smallish ("hundreds") installation of Movable Type using a mysql backend.
One comment spammer can completely annihilate it.
One developer I talked to once did some testing. On one simultaneous connection, mysql was way faster. By five or so, they were close. At ten, PostgreSQL was definitely winning. At a hundred, he was simply unable to get a single MySQL server to complete the test successfully, let alone do it quickly.
The impression I get is that PostgreSQL uses more robust algorithms, with higher constant costs and lower quadratic costs. In any event, never had any problems.
As noted elsewhere, these comparisons are quite old...
But in any event, in my own experience, mysql is a lot easier to blow up by overloading than postgres is, at least if you have a lot of writes going on. For pure-lookup functions, it might do better -- but a lot of modern database apps are pretty compulsive about saving at least something every time someone touches them. (For instance, modern vBulletin saves last visits, threads seen, and so on; all of that adds up to a huge load on the database server.)
My blog: http://www.seebs.net/log/ --- My iPhone/iPad app: http://www.seebs.net/seebsfrac/
Glad to see the comparison, but I would really like to see is a comparison that includes the new 2.0 release of Firebird. Their new release is impressive, but I dont know how the features pan out with MySQL or Postgres. Including Derby would also be nice.
The most important factor to me in any comparison is the licensing agreement. I like a very open agreement and the MySQL license requires you to release the source code to your product in some cases, or you have to purchase a license from them.
You have to give the MySQL guys credit for the fact that it is an incredibly easy product when it comes to configuring it for your needs. For me, out of college, going to Oracle was a culture shock because the process of configuring Oracle was so convoluted and drawn out for simple stuff. I know that Oracle and PostgreSQL can be much more powerful than MySQL, but there is something to be said for how easy it is for a developer to install MySQL and just start working with it.
1 -- This article is years old.
2 -- This article is posted solely to stir up (repetitive) discussion.
3 -- This article pretends that MySQL is a real database, even though in order to do so it has to make gigantic leaps like considering data integrity to be not really all that important in a database.
4 -- This article trolled me.
Whence? Hence. Whither? Thither.
I'd rather have a new (not THAT old) comparison between Oracle and MySQL
It's been a long time since any of their PostgreSQL statements were true. It's a very happy native windows service with a nice installer, and the administrative interface is very easy to use. Let's try posting current reviews of software, rather than reruns from a year or two ago.
Easy Online Role Playing Campaign Management
February 15, 2005
My karma is not a Chameleon.
MySQL speed will really depend on the database engine you use (MyISAM or InnoDB do not perform the same!). PostgreSQL performance is pretty much consistent across platforms.
On the HA side, PostgreSQL has maybe less options: Slony/I (http://gborg.postgresql.org/project/slony1/) for master/slave or Sequoia (http://sequoia.continuent.org/) for multi-master.o n.html) for master/slave, MySQL cluster (http://dev.mysql.com/doc/refman/5.0/en/mysql-clus ter.html) for those who want to switch to a new storage engine (NDB) or Sequoia (URL:http://sequoia.continuent.org/) for multi-master with transparent failover.
MySQL offers MySQL replication (http://dev.mysql.com/doc/refman/5.0/en/replicati
You don't need to setup anything to run it for the first time, only if you want to play with performance, you can start to modify parameters as memory, max connections, etc. PostgreSQL is easy and powerful, just give it a try.
ajf
I call pure, unadulterated crap on this one.
One of the major new features in Postgresql 8 was native Windows support. It runs just fine as a service.
This comparision is either very old news, incompetence in action, or, um! strongly biased.
ich bin der musikant
mit taschenrechner in der hand
kraftwerk
This almost seems like the same comparisons we've been hearing for years.
1) Postgresql is more full featured than MySQL
2) MySQL is faster in a read-mostly environment
That's pretty much the same as the anecdotal arguments have been for years.
In my job, we moved from mysql to postgres several years ago (around PG 7.0). At the time, we needed to make the move for performance reasons. We are in a read-write system, and MySQL's locking was killing us (this was before InnoDB was well established). The features are better too, as our developers were used to having data integrity features, server side programming, and all of the SQL92 constructs available. We also learned a bit about PG performance, which I'll share.
1) Run EXPLAIN ANALYZE on everything. Postgresql is touchier about query performance than MySQL was. This just needs to be a habit if you're using PG. (You really should do performance analysis no matter your DB. It's just a good practice). The biggest gain will be making sure you're using index scans rather than sequential scans.
2) Use persistent connections. Everyone likes to point out the forking issue with PG vs. MySQL's threaded. PG's connection handling is slow, there's no doubt about it. But there's an easy answer. Just limit how often you connect. If you can keep a connection pool, and just reuse those connections, you'll save this big hit.
3) Full vacuum and reindex regularly. We've found the docs to be a bit off on this. It indicates that you should run these occasionally. If you're in a read-write system, a full vacuum on a regular basis is very important. It really doesn't take that long if you do it regularly. Also, we've had trouble with indexes getting unbalanced (we see 50->90% tuple turnover daily). This has gotten better, but it doesn't hurt to let your maintenance scripts make things ideal for you. So, we run a full vacuum and reindex of our tables nightly through cron.
4) Get your shared memory right. PG's shared buffers is probably the most important config attribute. It controls how much of your DB is memory resident vs disk resident. Avoiding disk hits is a big deal for any DB, so get this right. If you can fit your whole DB in memory, then do it. If not, make sure your primary tables will fit. The more you use the shared memory, and the less you have to page data in/out, the better your overall performance will be.
Most DB systems seem to be read-mostly, so I can understand the performance comparisons focusing on that. In our read-write system though, the locking was the biggest issue and it tilted the performance comparison toward PG.
MySQL/MyISAM is the one with the massive legacy code base, the one that your open-source blogging software uses and probably the one that your web host supports. It beautifully hits the "sweet spot" for data-driven web sites with infrequent and simple updates, where trading integrity for "read only" performance is sensible. It does not even purport to compete with PostgreSQL on features - but it does offer fulltext searches, again
MySQL/InnoDB is the one that offers transactions, foreign keys etc. (ISTR it doesn't do fulltext indexes, though) - this is the "version" that bears comparison with PostgreSQL. I wonder how its user base compares?
(OK - you can mix InnoDB and MyISAM tables in a single database, but you can't use InnoDB if your web host hasn't installed it - heck, one provider I use is still on MySQL V3.23)
Flamewars have tended to pit PostgreSQL against a mythical database with the performance of MyISAM and the features of InnoDB...
As for the GUI software, the MySQL GUI Admin/query browser stuff is shinier than PgAdmin3 - but the MacOS version of the former is a complete crashfest! Neither of them steps up to the plate of providing a FOSS equivalent of (the good bits) of MS Access.
In a survey of 100 programmers, 111111 thought that duck-typing was a good idea.
Ok, this is yet another outdated report comparing three mainstream RDBMS'es - MySQL, PostgreSQL and ORACLE. It was done for yet another physical experiment - for choosing the proper system for storing data about the construction process of one of the LHC detectors - ALICE.y k/db_compare/db_compare.html
And this report is at least professional, which cannot be said about the one mentioned in the article.
http://dcdbappl1.cern.ch:8080/dcdb/archive/ttracz
http://tweakers.net/reviews/657
They compare PostgreSQL 8.2 vs MySQL 4.1.20 and MySQL 5.1.20a.
Interactive Visual Medical Dictionary
I did a presentation at the Atlanta Unix Users' Group this month that is a more up-to-date comparison. It's available in Open Office format. You can also get to it from my home page. I did a similar talk almost four years ago. My conclusion is that MySQL has closed the feature gap with PostgreSQL in recent years. I still give PostgreSQL the edge in features, and MySQL the edge in out-of-the-box untuned performance. I also discuss replication and clustering.
Here's a comprehensive performance review between PostgreSQL and MySQL. It compares both DB's under load as well as comparing Intel/AMD chips. http://tweakers.net/reviews/657/6
I had to make a decision recently between Mysql and Postgresql for a database composed of many tables with greater than 50,000,000 rows. While going through the decision making process, I loaded a sample table with 50,000,000 rows to do some benchmarks. The first thing I had to do to run my tests was index the table. I started with Mysql using a InnoDB table type. I had both database servers relatively tuned to the hardware they were running on. I ran the create index with MySQL and detached my screen session. I came back several hours later to find MySQL was doing something along the lines of INDEX via REPAIR SORT. After some reading, I learned that this takes an order of magnitude longer than building an index the "normal" way and is caused by the index becoming corrupted during the creation. Okay... so, I restarted this process several times and encountered the same problem. This is clean data mind you that has already been exported from an existing SQL server. I duplicated my install on a second server and had the same problems. Very annoyed with MySQL, I gave Postgres a try. It worked on the first time in less than 25 minutes without issue. Since then I've been using it on 250,000,000 row datasets without issue. It's always reliable, and as long as you remember to use CURSORS for huge SELECT statement, it's painless to work with.
A musician without the RIAA, is like a fish without a bicycle.
The first screen that it says MySQL supports ODBC and doesn't mention that PostgreSQL does as well -- so why should I read further? Either sloppy, ignorant, or biased writing.
There were a couple comparisons a couple years ago. It was my understanding that PostgreSQL did better with large data sets in a P vs. M match. In getting hammered with connections, another test between MySQL, PostgreSQL, DB2, Oracle, and SQLServer, if I remember, Microsoft's offering started to crap out along a power curve at maybe just 200(?) hits and the others degraded pretty equally along a straight line.
My client/server experience started with some Oracle classes and managing a department server. I must say I am _much_ more comfortable with PostgreSQL and find MySQL a little alien no matter how popular it is. Just my 2 cents.
From the comparison table:
These statements convey the same information but that the author has presented them in different lights suggests to me a premeditated bias in favour of MySQL.
How can you find a trivial, severely out of date comparison useful?
Does the Internet's favorite DBMS have an IP address datatype yet?
How about MAC address? CIDR block?
"An IP address is just a 32-bit unsigned int, duh. Any DBMS can store those."
Wrong. A datatype isn't just about storage, but also about operations. In PostgreSQL, when you do a SELECT across a table with IP addresses in it, you get them formatted and displayed as IP addresses, not as opaque ints. Likewise with CIDR blocks, like "192.168.42.0/23". There's also a comparison operator for asking whether an IP address is within a CIDR block.
If you're implementing a network registration system or an incident logging system, how much of your time do you want to waste staring at opaque ints like 3232246272 rather than IP addresses like 192.168.42.0 when you're trying to debug it?
MySQL is a bimbo, a fratboy: it's easy, but so shallow! The amount of time you save in one-time setup, you will lose many times over in all the little annoyances and deficiencies of a DBMS that was originally designed by folks who didn't really believe in DBMSes. Over time they've slowly been shamed into including many of the features they used to despise: transactions, relational integrity checks, and so on. But there's still so much missing ... not just essential integrity features, but little fiddly bits like good datatype support, the kinds of things that make your life easier (as a programmer or as a DBA) in the long run.
How about an article on that neat analysis framework Fermi (and others) use for the terabytes of data they generate and have to sift through?
http://root.cern.ch/ (large scale repository for data analysis)
And how would you use Postgres for something like that anyway? Maybe something like:
http://www.greenplum.com/ (biggy-sized Postgres based data warehousing)
i was using pgsql 8.x back in 2005 - so it existed. they just cherry picked versions to get the result they wanted. postgresql.org linked to a new study where both current versions (at the time, and it was recent - pgsql is 8.2 now) of mysql and postgresql were put throught the paces... http://www.postgresql.org/about/news.691
MysSQL has a much larger user base than PostgreSQL, therefore the code is more tested and has historically been more stable than PostgreSQL and more used in production environments.
"Claiming that your RDBMS is the best in the world because more people use it is like saying McDonalds makes the best food in the world."
Sorry, just an old joke that deserved retreading... ;)
"I can be self-referential if I want to," said Tom, swiftly.
I've used Oracle for about 3 years and Postgres for about 1. Both are good databases. I may write what you're asking for eventually, but don't hold your breath. :) Here are a few thoughts off the top of my head that might help.
In a way-oversimplified nutshell, Postgres seems sorta-kinda modelled after Oracle. But many times I look into a feature that I've used on Oracle and find it doesn't exist in Postgres or it's not up to my expectations. The Postgres feature might be a little half-assed. One example is the multi-version concurrency control. If you update a row in Oracle, the undo tablespace contains a snapshot of the old version. All other concurrent users are directed to the undo tablespace until the transaction completes. They get the old version of the uncommitted row from the undo tablespace. Postgres on the other hand makes a new row in the table for your update, and the old one is marked as a "dead tuple" whose data can be retrieved by other transactions.
Might not seem like a major difference, but you'll have to "vacuum" your Postgres database periodically to clean up your dead tuples. Otherwise, table scans have to scan over all the dead tuples in addition to the "real" data. Some DBAs simply run a vacuum job every night, or use the auto-vacuum daemon. It's not always so simple however. Imagine you have to update data every morning and get it done ASAP. Well, now that you've updated your 500,000 rows, you have to vacuum the table to keep queries running at a decent speed. When I was still a Postgres newbie, I would just vacuum whenever I remembered. One time I forgot about it for a few days, when I finally ran the vacuum it took 17 hours to complete - granted it was a one disk server. It's just another thing to worry about.
Once I looked for a replication solution for Postgres. Some were incomplete. Slony-I - the most popular Postgres replication solution - seemed like the way to go. But upon delving into it, I found it imposed limitations because it sort of "corrupts" the Postgres data dictionary. I believe this corruption broke triggers on all replicated tables (don't quote me on that). Also Slony-I does not support multi-master replication at all.
Sometimes a feature is just great. Rarely is the Postgres feature better, except in terms of user-friendliness. For instance the psql command line tool blows sqlplus out of the water IMO. (Wow, paginated results, tab completion!)
Oracle has a wealth of OLAP/business intelligence features that are lacking in Postgres. There is a flavor of Postgres called Bizgres that is a bit better in this area, though still lacking in comparison. Bizgres is more cutting edge and doesn't seem to have nearly as strong a community as Postgres. Go to http://www.bizgres.com/bb/ and you'll see there are very few posts.
Oracle has an expert named Tom - asktom.oracle.com - who is a genius and answers all my questions. Postgres has an expert, also named Tom, who is a genius and answers all my questions on the newsgroup.
When would I use one over the other? Personally I would use Oracle if there were no budgetary constraints, but if Postgres definitely met all my requirements, I'd go with that. (Yes yes, I am Captain Obvious.) If you need Oracle's enterprise-level features then Postgres is out of the question. There is however EnterpriseDB, a commercial, Postgres-based enterprise-level database. Several months ago Sony "snubbed" Oracle and went with EnterpriseDB for its MMORPGs. I have not looked into EnterpriseDB myself so I can't comment further.
Well I'm rambling and not getting work done. Hope that helps just a little bit.
I suspect you're thinking of relatively simple cases where you have a single application working against a database. Which of course is the standard MySQL scenario.
Where foreign keys and the other referential integrity features really shine is in true enterprise scenarios, when you may have hundreds or thousands of applications, written in multiple languages, working against the same shared database(s).
In that scenario, the only viable way to duplicate the functionality of foreign keys at the application level is to have a middle layer which all other applications are required to go through. Realistically, that middle layer has to be implemented as a server, serving requests for object/record creation, update and delete over the network. Implementing it as a library to be linked into applications doesn't work well, because there are multiple applications accessing the database, and integrity enforcement needs to be centrally coordinated.
Implementing a middleware data server for an application isn't all that difficult, but integrating it into applications can be. Most application development environments know how to talk to databases, but don't automatically know how to talk to your application-specific, language-independent, data server. So now you're writing a client library for each app dev platform used in the enterprise, and dealing with things like integrating your custom interface with data-bound controls in the user interface. BTW, this is where people start resorting to e.g. SOAP, and projects start going off the rails (no pun intended, Ruby fans).
Luckily, as it turns out, there are already standardized, widely-available, well-supported systems that implement a centralized data serving service which enforces referential integrity. They're called databases. And foreign keys are an essential part of the service they provide.