Slashdot Mirror


PostgreSQL vs. MySQL comparison

prostoalex writes "Ever find yourself wondering which open source database is the best tool for the job? Well, wonder no more, and let your tax dollars do the work in the form of Fermi National Accelerator Laboratory publishing this unbiased review of MySQL vs. PostgreSQL. After reading it, however, it seems that MySQL ranks the same or better on most of the accounts." My poor sleepy eyes misread the date of posting on here; caveat that this is more then 15 months old.

13 of 390 comments (clear)

  1. No Digg by AKAImBatman · · Score: 5, Informative

    1. There's no such thing as unbiased. Especially on a page that gives a fairly abstract review.

    2. This article is 2 years old. Everything in its comparisons is out of date.

    1. Re:No Digg by electroniceric · · Score: 5, Informative

      Just to continue on your good points, especially troubling is the fact that this article compares the then-unreleased MySQL 5 to the Postgres 7.x series. Nearly all the drawbacks to Postgres that this article highlights have been addressed in the 8.x series.

      We run Postgres for our main business application and the main limitations are of two forms:
      1) Depth of community
      The Postgres community is great - very responsive and knowledgeable, but its size is a limitation in a number of ways. The ODBC driver is a bit of stepchild to the main project, and some key functions like dblink that address missing features like cross-database selects are relegated to /contrib, and rely on their individual authors for nearly all maintenance. This means that as a user you are more likely to bump up against the bleeding edge earlier than in communities where these outside-the-core projects are more supported.

      For the same reason a key subset of its documentation is very sparse. Documentation for the core system is thorough, clear and concise, but anything in contrib or any projects like the ODBC or .NET drivers are much less like to have the same quality of documentaton. Postgres' extremely powerful GIST indexes are unparalled as a feature, but you need a background in theoretical CompSci to figure them out, thanks to limited documentation (note to aspire database index geeks - I would gladly buy a book on GIST aimed at proficient DBAs who are not giants of theoretical CS). Likewise its procedural languages: thanks to its architecture and open codebase, Postgres offers more server-side languages than any other database that I know of, but few of them have more than basic documentation, let alone the stacks of books you'd find with other procedural languages.

      2) Postgres is very close to being a true enterprise contender (unlike MySQL, which is evolving that direction but distinctly further off), but lacks some key features like XML handling, a more comprehensible approach to result sets (anyone who's dealt with rowtypes and casting resultsets can attest to the steep learning curve), and a userbase that has put the product through the wringer. Now that some corporate heads are getting interested (e.g. Sun, Red Hat, EnterpriseDB) hopefully some of these shortcomings will be addressed in short order.

      Don't let this outdated, apples to oranges comparison fool you: Postgres is a very solid and usable database.

  2. Old news by daffmeister · · Score: 5, Informative

    From the site:

    "Last modified: February 15, 2005."

    1. Re:Old news by suv4x4 · · Score: 5, Funny

      Next article on Slashdot: 486 SX vs 486 DX

  3. Old and wrong by ldapboy · · Score: 5, Informative

    postgresql has a native Win32 version, complete with an installer, service support and does not depend on cygwin.

  4. Re:Foreign Keys by mwanaheri · · Score: 5, Insightful

    Foreign keys are more than nice, they are essential. Unless, maybe you don't care about the integrity of your data or want to make the necessary checks in their application. The latter should keep their eyes down and their mouth shut if the talk is about 'speed' of any rdbms, off course.

    --
    Idha khatabahum lijahiluna qalu salaman
  5. Re:Foreign Keys by ShieldW0lf · · Score: 5, Informative

    This is unbiased? Give me a break.

    WTF is with putting up an "unbiased comparison" between Postgres 7.2 and MySQL 5.0 when Postgres is now up to 8.2 and has most of their concerns addressed in that release, whereas MySQL is still at 5.0?

    MySQL is a great database, if you need clustering but not referencial integrity or ACID compliance, that is.

    --
    -1 Uncomfortable Truth
  6. Re:MySQL is ridiculously easy to configure by Schraegstrichpunkt · · Score: 5, Insightful

    You have to give the Notepad guys credit for the fact that it is an incredibly easy product when it comes to configuring it for your needs. For me, out of college, going to Vim was a culture shock because the process of learning Vim was so convoluted and drawn out for simple stuff. I know that Vim and Emacs can be much more powerful than Notepad, but there is something to be said for how easy it is for a developer to install Notepad and just start working with it.

  7. Re:Foreign Keys by CaptainZapp · · Score: 5, Insightful
    Foreign keys are more than nice, they are essential.

    Bingo!

    It doesn't cease to amaze me, when the Mysql croud argues that "you don't really need those pesky integrity stuff, it just slows down the database."

    Guess what guys; You're dead wrong!

    Any DBA worth his salary will enforce data integrity on the lowest possible level, which means constraints (however implemented) on the object level.

    Sure, you can let your coders in Bengaluru ensure that the primary key is unique instead of just applying a unique index and the same goes for referential constraints between tables. You can implement them in the application just fine until somebody overlooks some minor detail in the code and you're royally fucked!

    Again! Foreign keys or triggers are not "niceties". They are essential in implementing an industry strength database; period!

    --
    ich bin der musikant

    mit taschenrechner in der hand

    kraftwerk

  8. Why we moved from MySQL to PG by punker · · Score: 5, Interesting

    This almost seems like the same comparisons we've been hearing for years.
    1) Postgresql is more full featured than MySQL
    2) MySQL is faster in a read-mostly environment
    That's pretty much the same as the anecdotal arguments have been for years.

              In my job, we moved from mysql to postgres several years ago (around PG 7.0). At the time, we needed to make the move for performance reasons. We are in a read-write system, and MySQL's locking was killing us (this was before InnoDB was well established). The features are better too, as our developers were used to having data integrity features, server side programming, and all of the SQL92 constructs available. We also learned a bit about PG performance, which I'll share.

    1) Run EXPLAIN ANALYZE on everything. Postgresql is touchier about query performance than MySQL was. This just needs to be a habit if you're using PG. (You really should do performance analysis no matter your DB. It's just a good practice). The biggest gain will be making sure you're using index scans rather than sequential scans.

    2) Use persistent connections. Everyone likes to point out the forking issue with PG vs. MySQL's threaded. PG's connection handling is slow, there's no doubt about it. But there's an easy answer. Just limit how often you connect. If you can keep a connection pool, and just reuse those connections, you'll save this big hit.

    3) Full vacuum and reindex regularly. We've found the docs to be a bit off on this. It indicates that you should run these occasionally. If you're in a read-write system, a full vacuum on a regular basis is very important. It really doesn't take that long if you do it regularly. Also, we've had trouble with indexes getting unbalanced (we see 50->90% tuple turnover daily). This has gotten better, but it doesn't hurt to let your maintenance scripts make things ideal for you. So, we run a full vacuum and reindex of our tables nightly through cron.

    4) Get your shared memory right. PG's shared buffers is probably the most important config attribute. It controls how much of your DB is memory resident vs disk resident. Avoiding disk hits is a big deal for any DB, so get this right. If you can fit your whole DB in memory, then do it. If not, make sure your primary tables will fit. The more you use the shared memory, and the less you have to page data in/out, the better your overall performance will be.

    Most DB systems seem to be read-mostly, so I can understand the performance comparisons focusing on that. In our read-write system though, the locking was the biggest issue and it tilted the performance comparison toward PG.

  9. Re:Foreign Keys by Tony+Hoyle · · Score: 5, Informative

    Untrue.

    The client library is GPL. That means you cannot create a commercial program that uses it without using the commercial licensed version. Which is $200 per client

    You can't even create a library and not ship mysql - the mysql site is very clear that they consider distributing a program that *uses* mysql as being exactly the same as distributing mysql itself:

    http://www.mysql.com/company/legal/licensing/comme rcial-license.html

    Typical examples of MySQL distribution include: ...
            * Selling software that requires customers to install MySQL themselves on their own machines.

    Specifically:

            * If you develop and distribute a commercial application and as part of utilizing your application, the end-user must download a copy of MySQL; for each derivative work, you (or, in some cases, your end-user) need a commercial license for the MySQL server and/or MySQL client libraries.

    This makes mysql unusable for anything except large products. Our entire product only cost $70 for the single user version. No way in hell we're upping the price by $200 a copy.

  10. Re:Foreign Keys by Branko · · Score: 5, Insightful
    Your app should be checking itself anyway.

    Actually it shouldn't (in this context). Typically, one database will have several client applications attached to it. If data consistency is not checked at DB level, then:

    • Bug in single application might compromise the data consistency of the whole system.
    • You must keep all of your applications precisely synchronized.
    • You are repeating the job of implementing the same consistency logic across all applications instead of implementing it only once - in database.
    • Implementing these sorts of checks can be difficult to do correctly at the application level in a concurrent environment typical for a DBMS.
    • Data consistency at DB level is directly supported by modeling tools, so you can plan for it and visualize it early enough to spot problems and communicate it to the other team members more easily.
  11. Postgres For Larger Datasets by Hornsby · · Score: 5, Interesting

    I had to make a decision recently between Mysql and Postgresql for a database composed of many tables with greater than 50,000,000 rows. While going through the decision making process, I loaded a sample table with 50,000,000 rows to do some benchmarks. The first thing I had to do to run my tests was index the table. I started with Mysql using a InnoDB table type. I had both database servers relatively tuned to the hardware they were running on. I ran the create index with MySQL and detached my screen session. I came back several hours later to find MySQL was doing something along the lines of INDEX via REPAIR SORT. After some reading, I learned that this takes an order of magnitude longer than building an index the "normal" way and is caused by the index becoming corrupted during the creation. Okay... so, I restarted this process several times and encountered the same problem. This is clean data mind you that has already been exported from an existing SQL server. I duplicated my install on a second server and had the same problems. Very annoyed with MySQL, I gave Postgres a try. It worked on the first time in less than 25 minutes without issue. Since then I've been using it on 250,000,000 row datasets without issue. It's always reliable, and as long as you remember to use CURSORS for huge SELECT statement, it's painless to work with.

    --
    A musician without the RIAA, is like a fish without a bicycle.