Slashdot Mirror


PostgreSQL vs. MySQL comparison

prostoalex writes "Ever find yourself wondering which open source database is the best tool for the job? Well, wonder no more, and let your tax dollars do the work in the form of Fermi National Accelerator Laboratory publishing this unbiased review of MySQL vs. PostgreSQL. After reading it, however, it seems that MySQL ranks the same or better on most of the accounts." My poor sleepy eyes misread the date of posting on here; caveat that this is more then 15 months old.

92 of 390 comments (clear)

  1. Foreign Keys by P(0)(!P(k)+P(k+1)) · · Score: 2, Insightful

    From TFA:

    Having foreign keys [...] can all be very attractive in PostgreSql—if you need them and you will make any use of them.

    Foreign keys are nice, I have to say; I implement them in mysql anyway, in spite of the fact that they're ignored for MyISAM.

    1. Re:Foreign Keys by mwanaheri · · Score: 5, Insightful

      Foreign keys are more than nice, they are essential. Unless, maybe you don't care about the integrity of your data or want to make the necessary checks in their application. The latter should keep their eyes down and their mouth shut if the talk is about 'speed' of any rdbms, off course.

      --
      Idha khatabahum lijahiluna qalu salaman
    2. Re:Foreign Keys by ShieldW0lf · · Score: 5, Informative

      This is unbiased? Give me a break.

      WTF is with putting up an "unbiased comparison" between Postgres 7.2 and MySQL 5.0 when Postgres is now up to 8.2 and has most of their concerns addressed in that release, whereas MySQL is still at 5.0?

      MySQL is a great database, if you need clustering but not referencial integrity or ACID compliance, that is.

      --
      -1 Uncomfortable Truth
    3. Re:Foreign Keys by Tet · · Score: 3, Informative
      WTF is with putting up an "unbiased comparison" between Postgres 7.2 and MySQL 5.0 when Postgres is now up to 8.2

      That'd be because the article was written in 2005. Unbiased? Maybe. Vague, unscientific and out of date? Definitely.

      --
      "The invisible and the non-existent look very much alike." -- Delos B. McKown
    4. Re:Foreign Keys by Phil+John · · Score: 2, Informative
      MySQL is a great database, if you need clustering but not referencial(sic) integrity or ACID compliance, that is.

      Is that the same referential integrity and ACID compliance afforded by using INNOdb as your table type in MySQL? ;o)

      --
      I am NaN
    5. Re:Foreign Keys by Brummund · · Score: 3, Insightful

      Foreign keys aren't "nice", THEY ARE ESSENTIAL TO A RDBMS.

      It is the same thinking that probably made the retards at MySQL AB make a datatype that accepts 30th February as a date. (At least did, a few years ago.) Why EVEN include a datetime datatype if it isnt capable of the SIMPLEST validations ever.

      Yes, I'm fuming. Those MySQL retards has made a generation of programmers think they can do SQL when they manage to put crap into MySQL. Gahhh, I hope their puny webapps will haunt them down sometime.

      (I was once searching for a simple webbased forum, and tested phpnuke. It had the following gem to display the 5 most recent articles in the database:

      1. "SELECT * FROM ARTICLES ORDER BY ID DESC"
      2. Retrieve all articles from the database
      3. Then a for loop printing out the 5 first entries.

      They basically transferred all data in the articles database everytime, just to iterate over the 5 first rows. Gahhhhhh)

    6. Re:Foreign Keys by Tony+Hoyle · · Score: 2, Interesting

      Foreign keys don't speed anything up, they just add an extra layer of checks on your database. Your app should be checking itself anyway.

      It's the subselects that get me - without them you have to jump through a lot of hoops. The sentence quoted basically translates as 'as long as you are only storing your CD collection and not doing anything serious with a database, then use mysql'.

      TFA also fails to mention that mysql cannot be used in commercial development without paying $200 per client - which makes it more expensive than most other solutions (except maybe oracle, and even they have cheap licenses for some uses).

    7. Re:Foreign Keys by CaptainZapp · · Score: 5, Insightful
      Foreign keys are more than nice, they are essential.

      Bingo!

      It doesn't cease to amaze me, when the Mysql croud argues that "you don't really need those pesky integrity stuff, it just slows down the database."

      Guess what guys; You're dead wrong!

      Any DBA worth his salary will enforce data integrity on the lowest possible level, which means constraints (however implemented) on the object level.

      Sure, you can let your coders in Bengaluru ensure that the primary key is unique instead of just applying a unique index and the same goes for referential constraints between tables. You can implement them in the application just fine until somebody overlooks some minor detail in the code and you're royally fucked!

      Again! Foreign keys or triggers are not "niceties". They are essential in implementing an industry strength database; period!

      --
      ich bin der musikant

      mit taschenrechner in der hand

      kraftwerk

    8. Re:Foreign Keys by mwanaheri · · Score: 3, Insightful

      >Foreign keys don't speed anything up, they just add an extra layer of checks on your database. Right, they even make the dbms slower. But the dbms certainly does it faster than the application you write. So, to rely on the checks being made in the application results in a waste of speed on the application side. If I don't care about the speed in the application, why make a fuss about the speed of the rdbms? By the way: I rather have things reliable than fast. Subselects are something I also heavily use (and thus mainly stay away from MySQL) although I think it's better to use views for queries performed more than once in a while. Probably one of the main reasons for the spread of MySQL is the fact that is is frequently pre-installed.

      --
      Idha khatabahum lijahiluna qalu salaman
    9. Re:Foreign Keys by Tony+Hoyle · · Score: 5, Informative

      Untrue.

      The client library is GPL. That means you cannot create a commercial program that uses it without using the commercial licensed version. Which is $200 per client

      You can't even create a library and not ship mysql - the mysql site is very clear that they consider distributing a program that *uses* mysql as being exactly the same as distributing mysql itself:

      http://www.mysql.com/company/legal/licensing/comme rcial-license.html

      Typical examples of MySQL distribution include: ...
              * Selling software that requires customers to install MySQL themselves on their own machines.

      Specifically:

              * If you develop and distribute a commercial application and as part of utilizing your application, the end-user must download a copy of MySQL; for each derivative work, you (or, in some cases, your end-user) need a commercial license for the MySQL server and/or MySQL client libraries.

      This makes mysql unusable for anything except large products. Our entire product only cost $70 for the single user version. No way in hell we're upping the price by $200 a copy.

    10. Re:Foreign Keys by Tony+Hoyle · · Score: 2, Informative

      An app can do its checks with full knowlege of the structure of data it's writing, and often it's just the comparison of a couple of integers anyway and have no impact on speed. You don't want to rely solely on the DB to do that... you end up having to handle a lot of nasty exception cases. Far better to avoid them first. Put foreign keys in, but definately don't make them your first line of defence.

    11. Re:Foreign Keys by Brummund · · Score: 2, Insightful

      A database allowing even simple datatypes to contain crap which is totally inconsistent with any calendar in use for the last 5000 years is responsible for some of the crap .

    12. Re:Foreign Keys by Branko · · Score: 5, Insightful
      Your app should be checking itself anyway.

      Actually it shouldn't (in this context). Typically, one database will have several client applications attached to it. If data consistency is not checked at DB level, then:

      • Bug in single application might compromise the data consistency of the whole system.
      • You must keep all of your applications precisely synchronized.
      • You are repeating the job of implementing the same consistency logic across all applications instead of implementing it only once - in database.
      • Implementing these sorts of checks can be difficult to do correctly at the application level in a concurrent environment typical for a DBMS.
      • Data consistency at DB level is directly supported by modeling tools, so you can plan for it and visualize it early enough to spot problems and communicate it to the other team members more easily.
    13. Re:Foreign Keys by Smidge204 · · Score: 3, Insightful
      I wans't aware that "commercial application" and "GPL licensed" were mutually exclusive. From the page you linked:

      The Commercial License is an agreement with MySQL AB for organizations that do not want to release their application source code. Commercially licensed customers get a commercially supported product with assurances from MySQL. Commercially licensed users are also free from the requirement of making their own application open source.

      When your application is not licensed under either the GPL-compatible Free Software License as defined by the Free Software Foundation or approved by OSI, and you intend to or you may distribute MySQL software, you must first obtain a commercial license to the MySQL product.


      Emphasis mine. In other words, You don't have to pay the $200 if your project is itself compliant with the GPL or similar license scheme.

      "Comply with the GPL or pay us $200 to legally use our code or libraries" is not the same as saying "You have to pay us $200 if you plan to sell software you made using our code or libraries."
      =Smidge=
    14. Re:Foreign Keys by vadim_t · · Score: 4, Insightful

      Well, I sure hope you never work on anything serious.

      The database's function is to provide a RELIABLE storage for your data. Part of the whole reliability thing is making sure crap can't get in, because once it's there everything goes to heck.

      For instance, let's take a shopping cart. Can an order be for a negative quantity? If your app doesn't work that way (it could, using a negative amount for returns for example), and you still allow it in the DB, then all your reporting goes to heck, as SELECT SUM... now returns the wrong thing.

      A proper database is set up in such a way that every piece of data in it makese sense. This means for instance not having things like orders hanging around without in the void without being linked to some client. This is something easily ensured by foreign keys. Otherwise you have an utter mess - the total of the orders in the database doesn't match the sum of the orders of all clients!

      If you put your checks in the database, you have a guarantee that when somebody else codes another frontend to it (say, you had a website and now are making a special version for PDAs), if the application does the wrong thing, the database simply won't let it happen. This may cost a bit of speed, but I assure you that peace, your sanity and your ASS (if you have a boss and he's got any sense, he's not going to like it at ALL if it turns out that reports don't match reality, and that reality can't be even easily extracted) is far, far more valuable.

    15. Re:Foreign Keys by Trifthen · · Score: 2, Interesting

      I think the concern here was that even 15 months ago, Postgres was at least in the 8.0 tree, a vast improvement over the 7.4 tree which itself offered not insignificant advances over 7.2 and 7.3. The linked table of pros/cons for each database is dated February 2005, and browsing the postgres archives informs me that 8.0.1 was released by that point. If they waited two months, 8.0.2 would have been available. As it stands, the 8.0 tree was deprecated back in November of 2005, and the future now sits squarely on 8.2 after 8.1 experienced a short life of roughly a year.

      The truth of the matter is that postgres is a rapidly fluxuating target. I remember waiting five years for mysql 5.0 to finally see the light of day, and during that time postgres has truly experienced a ridiculous flurry of revisions and tweaks sometimes requiring alterations to their core storage format. That they do this all while providing full ACID compliance is a Godsend.

      I don't begrudge MySQL their success, but the roots of that database were thanks to simplicity and speed, like where SQLite is now. It's also an evolving product, but I think its lost the original focus in the mad rush of (friendly) competition all products eventually engage.

      --
      Read: Rabbit Rue - Free serial nove
    16. Re:Foreign Keys by Phil+John · · Score: 2, Informative
      1. MySQL isn't the only database where you have to declare foreign keys, so that one's a non issue. I don't want a database to "guess" what my FK constraints are, I want to tell it explicitly.
      2. True, but most of the tools I use to create DB schemas do this automatically now (phpMyAdmin et al)
      3. It's a good idea to have indexes on foreign keys, as you'll often be joining on them. As for silently ignoring, I've never seen that - I get an error message if I try to add a foreign key with no indexes created.
      4. Not so much of an issue these days, every build I've used recently from ones for Mac OS X and those for Fedora/ES/CentOS always have it compiled in. Even cheaper hosting accounts now very often have InnoDB support compiled into their MySQL installs.

      Don't get me wrong, MySQL is far from perfect, but if used properly (like any tool) it can power some very cool stuff.

      We always make sure that our apps work on multiple database servers as we realise our clients don't need a one-size-fits-all solution. We used to have to work around MySQL's limitations, but these days that's no longer as big an issue.

      --
      I am NaN
    17. Re:Foreign Keys by ThePiMan2003 · · Score: 2

      This only works if you are the only person touching the database. If you are working on any large scale application where multiple different programs use your database ensuring referential integrity is a must. Sure if only one application is touching your data you can forget about all the nice things that databases do, and gain a considerable speed advantage, but at that point why not just use flat files they are even faster.

    18. Re:Foreign Keys by Crizp · · Score: 2, Informative

      Yes. You can sell it for as much as you like. Just remember to make the source available :)

    19. Re:Foreign Keys by cloudmaster · · Score: 3, Insightful

      Umm, I'm pretty sure that MySQL had had subselects for several years. It's not clear if you knew that or not.

    20. Re:Foreign Keys by insignis · · Score: 2, Informative

      As both a DBA and programmer, I enforce referential integrity at the database level. I cannot fathom why you'd implement it at a code level--if you don't trust your DBMS, why are you using it? The database is the only common chokehold in a multi-user, multi-application environment. To implement it at an application level, you'd have to try to somehow synchronize communication between all those apps, some of which you might not maintain, and this would only be complicated by each app instance having its own transaction open. The database is already handling those transactions, which makes enforcing referential integrity easiest to do at the database level.

    21. Re:Foreign Keys by emurphy42 · · Score: 4, Informative

      It's worth pointing out that the parent is not being sarcastic. You can sell it for whatever price you like - but you still have to make the source available, and you can't place any restrictions on your customers selling copies for whatever price they like. In practice, this tends to drive the cost of GPLed software down to $0.

    22. Re:Foreign Keys by anothy · · Score: 2, Insightful

      sort of. you can sell GPL software, technically, but it's not really a sustainable business: you're required to ship the source with it, and anyone you ship it to can resell it or give it away.

      --

      i speak for myself and those who like what i say.
    23. Re:Foreign Keys by Eivind+Eklund · · Score: 2, Insightful
      (1) wasn't meant as a problem; it's the only reasonable step. The problem is that MySQL silently mess up in so many cases. When I have a database, I do not want it to silently mess up. This is doubly true for foreign key constraints, as it is non-obvious when these are messed up. Whether it is a good idea or not to have an index on a particular column is irrelevant: What I am protesting is SILENT IGNORE. This seems to be something that's missed by MySQL fundamentalists, who always come up with some set of random excuses addressing something else.

      I also happen to believe I am better qualified than anybody else for selecting what indexes I want in a particular database I'm designing, which none of you others know the purpose of nor the update frequency of nor the join frequency of. It's a good rule of thumb; it's a lousy requirement.

      Oh, and I'm perfectly aware that MySQL can power cool stuff - I have used it a ton myself (as an inherited database too expensive to replace, mostly). That doesn't mean that it doesn't suck compared to PostgreSQL (in my experience), and IMO is popular mostly because of being insecure by default (thus easy to install), being incompatible with the rest in subtly icky ways ("embrance and extend"), and due to semi-falsified benchmarks a long time ago (MySQL AB published only the benchmarks where they were best, varying what benchmarks they displayed by what database they were comparing against, giving the impression that the they were "as good or better" in all performance areas.)

      Eivind.

      --
      Doubting the existence of evolution is like doubting the existence of China: It just shows that you're uninformed.
    24. Re:Foreign Keys by anothy · · Score: 3, Interesting

      good clarification. but still: does this really hold water? i mean, unless they're using a modified GPL or whatever (is the GPL open-source?), they don't really get to dictate the terms. "derivative work", for example, isn't something they get to define. if i'm shipping a closed-source app that simply requires MySQL be also installed, and not something that's statically linked to their binaries, where's the "derivative work" come in?

      --

      i speak for myself and those who like what i say.
    25. Re:Foreign Keys by jadavis · · Score: 2, Insightful

      The maintenance costs are higher if there are constraints in the database, because it is duplication

      The maintenance costs are usually lower, and the reason for that is *when* you catch the error. If the database enforces some simple constraints, you catch the error *before* it goes into your database, and you know exactly which application tried to insert bad data. It's the best kind of error report a programmer can see.

      If you do everything at the application level, any kind of bug can result in bad data being inserted. When a later point in your app finds the bad data, you have no idea where it came from. Good luck tracking that down. The part that inserted the bad data might not have even been written by you, maybe it can't be replaced by you, and maybe you don't even have the source to it. Maybe it's due to a security flaw.

      Not only that, but some constraints are nearly impossible to enforce from the application without going way overboard on locking. UNIQUE constraints are a good example.

      The "duplication" argument is just not true. The constraints offer your application an API in a way: it shows the app developer the nature of the data they can get from the database, and the nature of the information you can add to your database. In an application, almost all non-static functions do some sanity checking on their inputs, and by your definition that would be "duplication". However, it might be almost impossible for the caller to know it would cause an error, and there are so many callers that, if you don't do sanity checking, it would be impossible to trace the error backwards. It's exactly the same with a database, unless your application is so simple that tracing backwards is possible (i.e. only a couple of points can modify data). How would you like it if API calls didn't return errors? After all, it's just duplicating code, since you should check the data first anyway. And comments are the ultimate in "duplication"; I'm sure you don't write any of those, right?

      Errors from constraint violations are a part of normal operating procedure for databases; a database error does not necessarily indicate an application bug.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    26. Re:Foreign Keys by poot_rootbeer · · Score: 3, Insightful

      Foreign keys don't speed anything up, they just add an extra layer of checks on your database.

      Correct. That extra layer of checks will probably actually slow things down a bit.

      But foreign keys aren't about performance. They're about data integrity, which I would hope every database administrator or developer is more concerned with anyway. It doesn't matter how many requests/second your DBMS can handle if the data is fuxxored.

      Your app should be checking itself anyway.

      Yes, it should be catching "foreign key constraint violation" exceptions thrown by the DB interface and handling them appropriately. I hope that's what you meant.

    27. Re:Foreign Keys by dubious9 · · Score: 2, Informative

      I didn't think you could sell GPL software, only support. I thought with GPL it also had to be free as in beer. Am I wrong? Yep, your wrong. It's a common misunderstanding though. You can charge 14 billionty dollars for GPL'ed software, you just have to be able to provide the source code for said software for a nominal fee.
      --
      Why, o why must the sky fall when I've learned to fly?
    28. Re:Foreign Keys by thedave · · Score: 2, Interesting
      Don't get to excited about accurate calendaring. Our calendar has only been in use for a few hundred years.

      Very few (except special purpose) databases do ancient dates correctly.

      AD alone is highly revised. BC is all over the map,

      This bug report for DEC VMS is amazing in its analysis:

      D I G I T A L

      SPR ANSWER FORM

      SPR NO. 11-60903

      SYSTEM VERSION PRODUCT VERSION COMPONENT
      SOFTWARE: VAX/VMS V3.2 VAX/VMS V3.2 Run-Time Library

      PROBLEM:

      The LIB$DAY Run-Time Library service "incorrectly" assumes the year
      2000 is a leap year.

      RESPONSE:

      Thank you for your forward-looking SPR.

      Various system services, such as SYS$ASCTIM assume that the year 2000
      will be a leap year. Although one can never be sure of what will
      happen at some future time, there is strong historical precedent for
      presuming that the present Gregorian calendar will still be in affect
      by the year 2000. Since we also hope that VMS will still be around by
      then, we have chosen to adhere to these precedents.

      The purpose of a calendar is to reckon time in advance, to show how
      many days have to elapse until a certain event takes place in the
      future, such as the harvest or the release of VMS V4. The earliest
      calendars, naturally, were crude and tended to be based upon the
      seasons or the lunar cycle.

      The calendar of the Assyrians, for example, was based upon the phases
      of the moon. They knew that a lunation (the time from one full moon
      to the next) was 29 1/2 days long, so their lunar year had a duration
      of 364 days. This fell short of the solar year by about 11 days.
      (The exact time for the solar year is approximately 365 days, 5 hours,
      48 minutes, and 46 seconds.) After 3 years, such a lunar calendar
      would be off by a whole month, so the Assyrians added an extra month
      from time to time to keep their calendar in synchronization with the
      seasons.

      The best approximation that was possible in antiquity was a 19-year
      period, with 7 of these 19 years having 13 months (leap months). This
      scheme was adopted as the basis for the religious calendar used by the
      Jews. (The Arabs also used this calendar until Mohammed forbade
      shifting from 12 months to 13 months.)

      When Rome emerged as a world power, the difficulties of making a
      calendar were well known, but the Romans complicated their lives
      because of their superstition that even numbers were unlucky. Hence
      their months were 29 or 31 days long, with the exception of February,
      which had 28 days. Every second year, the Roman calendar included an
      extra month called Mercedonius of 22 or 23 days to keep up with the
      solar year.

      Even this algorithm was very poor, so that in 45 BC, Caesar, advised
      by the astronomer Sosigenes, ordered a sweeping reform. By imperial
      decree, one year was made 445 days long to bring the calendar back in
      step with the seasons. The new calendar, similar to the one we now
      use was called the Julian calendar (named after Julius Caesar). It's
      months were 30 or 31 days in length and every fourth year was made a
      leap year (having 366 days). Caesar also decreed that the year would
      start with the first of January, not the vernal equinox in late March.

      Caesar's year was 11 1/2 minutes short of the calculations recommended
      by Sosigenes and eventually the date of the vernal equinox began to
      drift. Roger Bacon became alarmed and sent a note to Pope Clement IV,
      who apparently was not impressed. Pope Sixtus IV later became
      convinced that another reform was neede

      --
      [ .sig removed due to death threats from zealots who seek to control me out of fear for their hidden d
    29. Re:Foreign Keys by tacocat · · Score: 4, Insightful

      When are you non-database types going to stop saying "Your app should be checking itself anyway."

      This is an insanely inneficient method of execution. It's also highly presumptive.

      Inneficient: If you are going to insert a record you have to first check to make sure it's not there. Then if it is there you have to change your INSERT to an UPDATE. This is dumb. Some databases do a INSERT OR UPDATE. but if they don't, it's faster to do an INSERT, handle failure, UPDATE. Alternatively -- UPDATE and INSERT on ZERO ROWS CHANGED. This means you have to run less than 2 queries on average. Your app should check method guarantees two SQL statements are executed every single time.

      Dumb. Say you check for a record to exist. You get a "NO" answer. While you are preparing and executing your next INSERT, some other process or a thread inserts that same record into the databse. Now you have an error and you still don't know what to do. In short, you're in a pretty bad way.

      Presumptive. In all my years of living I've never seen any company happy with the only interface to the data being through the application interface. Especially with a database on the back end. The business types, Marketing in partitular, love to screw with database information to try and identify trends, patterns, and correlations between the customer behaviour, product representation, and sales metrics. It is presumptive that the application can safely contain all of the business logic and you can assume that no one will ever come in the back end and change something -- thereby breaking all your business rules.

      The other consideration is that the business logic contained in a database is going to run a heck of a lot faster on the database than anything you can dream up in your application, unless the application is written in C. Databases are generally written in C/C++. Applications are generally written in Java,Perl,Python,Ruby. None of these can compete with C. Add to that the fact that databases have been designed for years to do only one thing -- manage data. Do you seriously think you can out perform a decade of database optimization in a ruby script?

      If you are going to base an application on data it would be useful to know how to capitalize on the features of a database rather than trying to repeat it. At the very least, you are less likely to introduce bugs.

    30. Re:Foreign Keys by tacocat · · Score: 4, Insightful

      Additionally, databases generally can do this faster than the application code. I can say this because databases are written in C and optimized and debugged for years. Applications are rarely (relatively) written in C and have not been debugged for years when released.

      This is something that actually really pisses me off about Ruby, Rails, and ActiveRecord. ActiveRecord is an insane violation of everything that a database has been built to do. It breaks consistency, violates keys, ignores so many rules... And it's beats the crap out of a database to do what a database is designed to do and can handle much faster.

      This is regardless of the flame wars of Postgres vs MySQL.

    31. Re:Foreign Keys by ShieldW0lf · · Score: 2, Insightful

      I don't agree with you regarding the use of triggers, I find they end up making optimization a lot more difficult. But I do agree regarding stored procs.

      Personally, I tend to steer towards procs that are complex internally with a simple external signature rather than using triggers. I find triggers are a real pain in the ass when you're trying to figure out how to optimize a slow running query.

      When I develop, I usually put all my data access functionality into stored procedures, deny access to all tables and views, then selectively grant access to the stored procs.

      Makes securing your data a lot easier, prevents most sql injection attacks, avoids the whole "magic quotes" mess, makes centrally managing your data a lot easier, and keeps the code-jockeys from screwing things up when they're in a rush.

      It's also a big advantage when you're changing your schema. You don't even need to touch the codebase in a lot of cases.

      On top of all this, it's more efficient. You send a lot less data back and forth across the wire, which most people don't think of until things start to bog down and it's time to move your db off the webserver and onto its own box on the network. And most dbs support some level or another of precompilation, which saves even more resources.

      If you can save a trip across the wire to the db by doing data validation in code, checking that that email address has an @ symbol and all that jazz, well that's good. But if you need to hit the db to do that validation, as you'd have to do when you're enforcing integrity in the middle tier, you just wasted network resources. You shouldn't have bothered.

      Even if you don't need all these sorts of benefits right now, there's still value in doing things the right way. Aside from building good working habits in yourself, you're building something that has a value external to the application.

      A well designed database generally has value that goes beyond the application that prompted its initial design. In the absense of the middle and client tier, it can still be utilized to generate projections and answer questions, and it's trivial to slap a new UI onto it. This is generally not true for dbs that are tightly bound to the web tier.

      To throw my 2c into the Postgres vs MySQL debate, there is one thing that stands out between the two that has nothing to do with the technology.

      MySQL developers have demonstrated time and again through their history that they have no problems selling the ignorant a bunch of bullshit to spur adoptation of their product. They do not concede that their product is unsuitable for some niches because of its limitations, instead, they knowingly advise new users to use poor development techniques even as they struggle to fix those limitations in their product.

      I do not trust them not to lie to me, and I would not stake my reputation on their products for that reason. That's something that is most likely never going to change regardless of what any "feature set" charts say.

      --
      -1 Uncomfortable Truth
    32. Re:Foreign Keys by vadim_t · · Score: 3, Insightful

      That's the reason OOP is there :-) Database abstraction, DAO's and Model's (POxO/DTO/whatever) can be responsible for storing/validating data and reused in other app's (as webservices, libraries, ...)


      None of this applies when somebody logs in with psql/enterprise manager/whatever and updates something in the database by hand. You can have all the OO and libraries you want, but it doesn't help if the new application doesn't use it. Yesterday we had code in VB6, today we have it in C#. Application is completely different. Guaranteeing that all the VB code will be exactly translated to C# is very, very hard.

      On the other hand, the database remains being the same, and all the constraints it has don't care about which language, methodology or whatever is being used. VB, C#, Perl, PHP, are all automatically held to the same constraints.


        You can have N account types (customer, broker, corporate, ...) Each account type have it's own set of "valid data" constraints. And even inside the same "account type", the validation can change (if an account was opened before date XX, it's permitted to do bla). You just can't do that using simple foreign keys. And if you want to ensure your data is consistent, you *will* need Stored Procedures and Triggers.


      And what's the problem with that? Use stored procedures and triggers then. Seriously, in a database of any size, forget about any attempts at compatibility with other databases. It only works on very, very trivial applications.

      Just take postgres and mysql. PostgreSQL loves big transactions. The overhead for a transaction is high, but it's perfectly happy with large, long running transactions, and the more the better. PostgreSQL will be slow if you have a transaction per statement.

      On the other hand, databases like mySQL want tiny transactions because the locks are really problematic. Leave a transaction uncommitted, and quickly things will grind to a halt. On the other hand, on postgresql the worst problem will be the lack of vacuum, which will gradually slow things down, but doesn't cause immediate problems.

      If you make it for mySQL, without a redesign it'll suck on postgres and viceversa. If you try to make it for both, it'll be suboptimal on both.

    33. Re:Foreign Keys by nuzak · · Score: 2, Insightful

      > Your app should be checking itself anyway.

      No it shouldn't -- the purpose of a database is to make it happen whether the client apps care about data integrity or not. Now a good client will gracefully handle the errors that a database throws back at it, but the database is supposed to take care of the checks in the first place.

      TFA reminds me of the anti-transaction FUD in old MySQL docs, which suddenly disappeared as soon as MySQL got a transactional backend. But hey, its system tables are still MyISAM, so you'd best be careful with those admin apps.

      --
      Done with slashdot, done with nerds, getting a life.
    34. Re:Foreign Keys by Branko · · Score: 2, Interesting
      You are correct about everything except the most important part. I'm not sure your wrong, but I question it. That is that it's typical to have multiple applications attached to the same db.


      Good point. Al lot of "small" Web applications are actually "single application - single database". In addition to that, many are heavily read oriented with very few writes and manipulate data that is not very critical. In environment like this, having "application level" data consistency might be quick and dirty solution that actually works quite well in practice.


      However, I'm under impression that many people learn these quick and dirty solutions first and never quite take the time u understand why the "slow and clean" solutions were invented in the first place.


      In addition to that, I would argue that most development goes into "large" Web applications (they are fewer, but require significantly more development). Also, Web is by no means the only area where databases find their application.

    35. Re:Foreign Keys by vadim_t · · Score: 2, Insightful

      However, I'd absolutely rather have messy data: imagine some type of glitch occurs (yes, unavoidable when working on anything with "serious" volume) and you end up performing the credit card charge but some part of the insert fails. In that case I want a partial transaction because there really was a partial transaction. It will aid in identifiying that something happened, and also in figuring out what it was.

      So you do it in multiple transactions then. Transaction 1 inserts the order data, transaction 2 processes the payment, transaction 3 updates statistics, with each saving a note somewhere of how far it got.

      This way you have both things: consistency, and the possibility to have a partially complete (but cut off at a well determined point) operations.


      Yes, I do log errors elsewhere. But if you've worked on anything "serious" you'll know that there's always an error case that can come up aside from what you're able to log well.

      That's easy, in my application I just log all of the requests headers and POST data. That's step 1, and will always succeed unless the database is down, as it's nice and simple.

      Step 2 is processing it, in one transaction. If it fails, I can retry the operation.


      I've come accross other examples, like making inventory records that don't have a foreign key because it's better than having no record at all.

      Why? In a well designed database, things don't just vanish. What happens is that it returns an error to the user, who knows it wasn't saved at all, instead of being in some half-saved state the user may not be able to recover from. Then the user can retry saving it, knowing that unless the database says it's good, nothing gets written, and so there won't be 20 half-written records in the database due to the previous attempts.

    36. Re:Foreign Keys by Ugot2BkidNme · · Score: 2, Insightful

      Actually I have to disagree with both of you I believe Foreign keys are useful in some cases but not in others. As far as the insertion of data and data control All database access should go through "Stored Procedures". No user of any application should have direct access to any table for insert or view. your stored procedure should handle all the checks you need. You can use a foreign key on top of it if you like. However if your right your procedures properly there is no need for foreign keys. No matter what application your using you have one access point the Stored Procedures. This controls what you can an cannot do and is a way better check on data integrity. Just remember to remove all access from the idiot developers to anything but your stored procedures. If they need something you don't have then you or whoever is your DBA writes it for them.

    37. Re:Foreign Keys by Haeleth · · Score: 2, Interesting
      i mean, unless they're using a modified GPL or whatever (is the GPL open-source?), they don't really get to dictate the terms. "derivative work", for example, isn't something they get to define.
      MySQL AB's interpretation of the terms matches the FSF's interpretation, and has also been supported by courts in various parts of the world. It holds water. Nobody's found a loophole yet, and believe me, there are a hell of a lot of people who have been looking.

      if i'm shipping a closed-source app that simply requires MySQL be also installed, and not something that's statically linked to their binaries, where's the "derivative work" come in?
      The FSF's interpretation of their own license holds that both static and dynamic linking create a derivative-work relationship. I don't know what their legal argument is, but the advice they issue is that if a binary does not work unless a specific shared library available, it uses that library and is therefore required to respect that library's license. Note that a even dynamically linked executable does generally have specific references to parts of the shared library compiled into it, and on some platforms is even statically linked to an import library, so there is some logic in this position.

      Now, if what you had in mind was an application that wasn't linked to MySQL libraries at all, but interacted with MySQL by running the mysql program with specific command-line options and parsing the output, then you would be correct: that does not create a derivative-work relationship, and even RMS himself is of the opinion that a program that works that way would not infringe on the GPL and could be licensed however you please. Of course, there would probably be something of a performance hit.
    38. Re:Foreign Keys by killjoe · · Score: 2, Informative

      Personally I think putting business logic in the database is wrong. It's harder to debug, harder to version control, harder to write.

      Putting some code to handle data integrity is OK but not business logic. Performance isn't everything. I would rather have an application that runs slower and is easier to maintain then the other way around.

      "unless the application is written in C. Databases are generally written in C/C++. Applications are generally written in Java,Perl,Python,Ruby. None of these can compete with C."

      This is nonsensical. If I write in python the python VM compiles and then runs my code. The VM is written in C. If I write in PL/PSQL then the postgres interprets my PL/PSQL line by line (it never gets compiled). Running your business logic in java should be significantly faster then writing it in PL/PSQL except when the application needs to transfer an ungodly amount of data over the network.

      --
      evil is as evil does
    39. Re:Foreign Keys by cyberon22 · · Score: 2, Interesting

      This is a really interesting post. What are your thoughts on a program that could hot-swap between backend database systems, optionally choosing to interact with MySQL or Postgresql or SQLite or whatever backend database storage system was required. If the choice was left to the user in terms of implementing the database backend, would this still be considered a derivative work for sales purposes?

      Not looking for legal advice, just curious because you have some interesting thoughts on things.

  2. No Digg by AKAImBatman · · Score: 5, Informative

    1. There's no such thing as unbiased. Especially on a page that gives a fairly abstract review.

    2. This article is 2 years old. Everything in its comparisons is out of date.

    1. Re:No Digg by electroniceric · · Score: 5, Informative

      Just to continue on your good points, especially troubling is the fact that this article compares the then-unreleased MySQL 5 to the Postgres 7.x series. Nearly all the drawbacks to Postgres that this article highlights have been addressed in the 8.x series.

      We run Postgres for our main business application and the main limitations are of two forms:
      1) Depth of community
      The Postgres community is great - very responsive and knowledgeable, but its size is a limitation in a number of ways. The ODBC driver is a bit of stepchild to the main project, and some key functions like dblink that address missing features like cross-database selects are relegated to /contrib, and rely on their individual authors for nearly all maintenance. This means that as a user you are more likely to bump up against the bleeding edge earlier than in communities where these outside-the-core projects are more supported.

      For the same reason a key subset of its documentation is very sparse. Documentation for the core system is thorough, clear and concise, but anything in contrib or any projects like the ODBC or .NET drivers are much less like to have the same quality of documentaton. Postgres' extremely powerful GIST indexes are unparalled as a feature, but you need a background in theoretical CompSci to figure them out, thanks to limited documentation (note to aspire database index geeks - I would gladly buy a book on GIST aimed at proficient DBAs who are not giants of theoretical CS). Likewise its procedural languages: thanks to its architecture and open codebase, Postgres offers more server-side languages than any other database that I know of, but few of them have more than basic documentation, let alone the stacks of books you'd find with other procedural languages.

      2) Postgres is very close to being a true enterprise contender (unlike MySQL, which is evolving that direction but distinctly further off), but lacks some key features like XML handling, a more comprehensible approach to result sets (anyone who's dealt with rowtypes and casting resultsets can attest to the steep learning curve), and a userbase that has put the product through the wringer. Now that some corporate heads are getting interested (e.g. Sun, Red Hat, EnterpriseDB) hopefully some of these shortcomings will be addressed in short order.

      Don't let this outdated, apples to oranges comparison fool you: Postgres is a very solid and usable database.

    2. Re:No Digg by leonmergen · · Score: 2, Informative

      Does it have clustering yet? How about load ballancing? Those were the main issues that stopped our deployment of postgres.

      Yep, it does.

      --
      - Leon Mergen
      http://www.solatis.com
  3. Old news by daffmeister · · Score: 5, Informative

    From the site:

    "Last modified: February 15, 2005."

    1. Re:Old news by suv4x4 · · Score: 5, Funny

      Next article on Slashdot: 486 SX vs 486 DX

    2. Re:Old news by cloudmaster · · Score: 3, Funny

      To paraphrase a comment from farther up, "yeah, floating point is nice. I implement it in my application on the SX anyway."

  4. stability by oliverthered · · Score: 2, Informative

    Having foreign keys, views, subselects, and transactions can all be very attractive in PostgreSql -
    if you need them and you will make any use of them. If you don't need them or won't use them, then
    you're probably better off with MySQL and its superior performance.


    PostgreSql is more stable than MySQL, (and has better performance when saturated), shouldn't you take that into consideration?

    --
    thank God the internet isn't a human right.
    1. Re:stability by TheRaven64 · · Score: 3, Insightful

      According to TFA, 'MySQL does very good job even on the busiest sites,' while for PostgreSQL 'Random disconnects, core dumps and memory leaks are usual.' This flies in the face of my own experience and testing results I have seen. Under heavy load, PostgreSQL has a habit of slowing to a crawl, while MySQL just dies. How many web pages have you seen where the entire text was a PHP error saying it was unable to connect to the MySQL server?

      --
      I am TheRaven on Soylent News
    2. Re:stability by scribblej · · Score: 2, Interesting

      This is what blows my mind, too. I've been using postgresql 7.x (yeah, I'm a Debian user) for years and I have never, ever seen it crash, disconnect, dump core, or leak memory. Ever.

      I've never seen MySQL do those things, either, to be fair, but I don't use it as much, and I *did* have it destroy a bunch of my data once when a machine was rebooted without being properly shut down.

      The comment that Postgresql is unstable, even in the older version numbers, seems to indicate some kind of separation from reality. I didn't see anything in the article about their testing methods. I wonder, given some of the other complaints, if they tested them on *WINDOWS* despite the lousy support for Windows, and the outright recommendation that you don't try it, from the devs.

      Seriously. Anyone else with me on that? Even outdated as this review is, it STILL doesn't make sense... unless you think maybe they did it on Windows. Then it makes more sense. Postgresql on Windows used to suck, and for good reason. It wasn't supported, either, and for good reason.

    3. Re:stability by Blimey85 · · Score: 2, Interesting

      Just because you see that error doesn't mean that MySQL has "died". If the number of people hammering your site exceeds the number of connections you have allowed, that message will come up. Granted by that point the server is probably crawling but maybe not. You may be doing a dump from a particular table causing it to remain locked while the dump is in progress. All queries to that table will stack waiting for it to get unlocked and that can cause your connections to max which would in turn cause that particular error message. Not the only caue of course but the main one I've come across.

      --
      How is it that one careless match can start a forest fire, but it takes a whole box to start a campfire?
    4. Re:stability by rycamor · · Score: 2, Interesting

      It flies in the face of my experience, too. Even with the older 7.x series, if I accidentally ran a stupid query, (for example, a join on multiple tables, forgetting the WHERE clause, resulting ridiculous multiplication of returned rows) PostgreSQL would literally run that query for 24 hours without dying, until I finally killed the query or re-started the server.

      Come to think of it, I haven't had PostgreSQL ever die on me, once. Even when my server crashed and rebooted, PostgreSQL would just happily pick up where it left off and get back to work.

      Meanwhile, on more than one Linux server I had the pleasure to maintain, MySQL was notorious for randomly dying.

      Something tells me that the person who wrote this report simply read the MySQL mailling lists, and compiled his report from the discussions, without any real testing, much less considering that a checklist is meaningless unless you also evaluate how WELL each DBMS implements said feature.

    5. Re:stability by localman · · Score: 2, Insightful

      Any DB misconfigured is going to die under load. MySQL can be configured to be extremely stable -- we've been running the fastest & most reliable retail site online for the past year now with MySQL as the DB.

      I've got nothing against PostgreSQL -- just never used it. I'm sure it's a fine piece of software, but please don't spread falsehoods about MySQL just because people don't know how to configure it. That would be like me claiming PostgreSQL sucks because I couldn't get it working easily. It's all about knowing what you're doing in any case.

      Specifically, most of the errors you're seeing are because they've got it configured to use more memory than their 32 bit arcitecture supports. It's fairly easy to misconfiure so that in a high traffic situation the MySQL process will use over 2GB and then the OS shuts it down. The options are to go 64 bit or to configure it to use less memory for performance or limit the connections -- just like with Apache's MaxClients option.

      Cheers.

  5. Old and wrong by ldapboy · · Score: 5, Informative

    postgresql has a native Win32 version, complete with an installer, service support and does not depend on cygwin.

    1. Re:Old and wrong by Anonymous Coward · · Score: 3, Funny

      Yeah but In 2005 it didn't!

      You can't say it is "old" and "wrong" when it is wrong because it is old.

  6. Unbiased ? by UncleH · · Score: 2, Interesting

    Just take a look at the description per item. I couldn't possibly call this unbiased in any way.

  7. Not similar to my experience by seebs · · Score: 4, Informative

    I have been involved with a smallish ("hundreds") installation of Movable Type using a mysql backend.

    One comment spammer can completely annihilate it.

    One developer I talked to once did some testing. On one simultaneous connection, mysql was way faster. By five or so, they were close. At ten, PostgreSQL was definitely winning. At a hundred, he was simply unable to get a single MySQL server to complete the test successfully, let alone do it quickly.

    The impression I get is that PostgreSQL uses more robust algorithms, with higher constant costs and lower quadratic costs. In any event, never had any problems.

    As noted elsewhere, these comparisons are quite old...

    But in any event, in my own experience, mysql is a lot easier to blow up by overloading than postgres is, at least if you have a lot of writes going on. For pure-lookup functions, it might do better -- but a lot of modern database apps are pretty compulsive about saving at least something every time someone touches them. (For instance, modern vBulletin saves last visits, threads seen, and so on; all of that adds up to a huge load on the database server.)

    --
    My blog: http://www.seebs.net/log/ --- My iPhone/iPad app: http://www.seebs.net/seebsfrac/
    1. Re:Not similar to my experience by walt-sjc · · Score: 2, Informative

      We use both PostgreSQL and MySQL for a large web-based application that does a reasonable mix of reads / writes (sessions / profiles are in MySQL so it gets MANY MANY writes.) Neither MySQL nor PostgreSQL has problems handling many many connections. Our load frequently hits around 1000 connections on Postgres and 4000 on MySQL on individual database servers (we replicate too.)

      Obviously you need to tune your environment (there are a plethora of options including table types which can impact things a LOT) to the load, so if you are running into problems at 100 connections, something is wrong.

    2. Re:Not similar to my experience by hey! · · Score: 2, Insightful

      Somewhere in here, there's a tortoise and hare analogy trying to get out.

      It seems to me that if you step back from the details, there is a fundamental difference in style between the two systems that could be summarized thus:

      Postgres: emphasizes completeness, correctness, and conformance.

      MySQL: emphasizes immediate practicality.

      One style is not intrinsically better than the other. Given time their results may begin to converge, which I think is starting to happen. However, I am not surprised that many people are starting to give Postgres a second look after having dismissed several years ago. The Postgres strategy is a long term one. Early adopters of Postgres were a minority with a particular interest in the relational model and for whom conformance was a relatively high priority. Pragmatists who wanted to cherry pick a few of the model's most important advantages were drawn to MySQL.

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
    3. Re:Not similar to my experience by Wdomburg · · Score: 2, Informative

      For all it's faults, MySQL does scale with a largely read-only data set. We currently have twenty-eight production servers running and about twenty development and testing machines. On the busiest servers we're pushing somewhere in the neighborhood of 4000 queries a second sustained.

      Write performance can certainly be an issue, but it depends largely on the application and the table backend. For example, if you can avoid doing deletes on a MyISAM table INSERTs get appended, allowing concurrent reads.

      I've not looked at how MT uses the database, but blog software would imply large variable length writes. Definitely not an ideal application for MySQL.

    4. Re:Not similar to my experience by AKAImBatman · · Score: 2, Informative
      THERE IS NO ODBC FOR LINUX (or equivalent).

      What the hell? If that were true, then what is this? And this? Not to mention this.

      Are you making a reference to Windows-specific APIs or something? Because I don't understand how you can shout that ODBC doesn't exist for Linux/Unix?
  8. I want to see more databases - Firebird, Derby by Anonymous Coward · · Score: 2, Insightful

    Glad to see the comparison, but I would really like to see is a comparison that includes the new 2.0 release of Firebird. Their new release is impressive, but I dont know how the features pan out with MySQL or Postgres. Including Derby would also be nice.

    The most important factor to me in any comparison is the licensing agreement. I like a very open agreement and the MySQL license requires you to release the source code to your product in some cases, or you have to purchase a license from them.

  9. MySQL is ridiculously easy to configure by MikeRT · · Score: 2, Insightful

    You have to give the MySQL guys credit for the fact that it is an incredibly easy product when it comes to configuring it for your needs. For me, out of college, going to Oracle was a culture shock because the process of configuring Oracle was so convoluted and drawn out for simple stuff. I know that Oracle and PostgreSQL can be much more powerful than MySQL, but there is something to be said for how easy it is for a developer to install MySQL and just start working with it.

    1. Re:MySQL is ridiculously easy to configure by Schraegstrichpunkt · · Score: 5, Insightful

      You have to give the Notepad guys credit for the fact that it is an incredibly easy product when it comes to configuring it for your needs. For me, out of college, going to Vim was a culture shock because the process of learning Vim was so convoluted and drawn out for simple stuff. I know that Vim and Emacs can be much more powerful than Notepad, but there is something to be said for how easy it is for a developer to install Notepad and just start working with it.

    2. Re:MySQL is ridiculously easy to configure by MP3Chuck · · Score: 2, Insightful

      Part of me sees the point you're making, but another part of me say "Yea, and ... what?" Does Notepad, embarrassingly simple though it may be, not still have appropriate uses?

    3. Re:MySQL is ridiculously easy to configure by value_added · · Score: 4, Funny

      Part of me sees the point you're making, but another part of me say "Yea, and ... what?" Does Notepad, embarrassingly simple though it may be, not still have appropriate uses?

      Short answer: No.

      Longer answer: None at all.

    4. Re:MySQL is ridiculously easy to configure by multipartmixed · · Score: 2, Insightful

      Dude, his post is 100% on the money.

      The problem is that you apparently missed his point entirely.

      --

      Do daemons dream of electric sleep()?
    5. Re:MySQL is ridiculously easy to configure by ThJ · · Score: 2, Funny

      I use it for taking notes. :P

  10. There are several problems by kahei · · Score: 4, Informative


    1 -- This article is years old.

    2 -- This article is posted solely to stir up (repetitive) discussion.

    3 -- This article pretends that MySQL is a real database, even though in order to do so it has to make gigantic leaps like considering data integrity to be not really all that important in a database.

    4 -- This article trolled me.

    --
    Whence? Hence. Whither? Thither.
  11. I'd rather by Klaidas · · Score: 2, Insightful

    I'd rather have a new (not THAT old) comparison between Oracle and MySQL

  12. Outdated and Silly by ClayDowling · · Score: 3, Informative

    It's been a long time since any of their PostgreSQL statements were true. It's a very happy native windows service with a nice installer, and the administrative interface is very easy to use. Let's try posting current reviews of software, rather than reruns from a year or two ago.

  13. Doesn't mean a thing and this is why ...... by Stumbles · · Score: 2, Insightful

    February 15, 2005

    --
    My karma is not a Chameleon.
  14. What about clustering? by cecchet · · Score: 2, Insightful
    Clustering and High-Availability aspects are not mentioned at all.

    MySQL speed will really depend on the database engine you use (MyISAM or InnoDB do not perform the same!). PostgreSQL performance is pretty much consistent across platforms.

    On the HA side, PostgreSQL has maybe less options: Slony/I (http://gborg.postgresql.org/project/slony1/) for master/slave or Sequoia (http://sequoia.continuent.org/) for multi-master.
    MySQL offers MySQL replication (http://dev.mysql.com/doc/refman/5.0/en/replicatio n.html) for master/slave, MySQL cluster (http://dev.mysql.com/doc/refman/5.0/en/mysql-clus ter.html) for those who want to switch to a new storage engine (NDB) or Sequoia (URL:http://sequoia.continuent.org/) for multi-master with transparent failover.

  15. Re: PostgreSQL is easy too by ajaf · · Score: 2, Informative

    You don't need to setup anything to run it for the first time, only if you want to play with performance, you can start to modify parameters as memory, max connections, etc. PostgreSQL is easy and powerful, just give it a try.

    --
    ajf
  16. Crap! by CaptainZapp · · Score: 3, Informative
    MySQL runs as a native Windows application (a service on NT/Win2000/WinXP), while PostgreSQL is run under the cygwin emulation.

    I call pure, unadulterated crap on this one.

    One of the major new features in Postgresql 8 was native Windows support. It runs just fine as a service.

    This comparision is either very old news, incompetence in action, or, um! strongly biased.

    --
    ich bin der musikant

    mit taschenrechner in der hand

    kraftwerk

  17. Why we moved from MySQL to PG by punker · · Score: 5, Interesting

    This almost seems like the same comparisons we've been hearing for years.
    1) Postgresql is more full featured than MySQL
    2) MySQL is faster in a read-mostly environment
    That's pretty much the same as the anecdotal arguments have been for years.

              In my job, we moved from mysql to postgres several years ago (around PG 7.0). At the time, we needed to make the move for performance reasons. We are in a read-write system, and MySQL's locking was killing us (this was before InnoDB was well established). The features are better too, as our developers were used to having data integrity features, server side programming, and all of the SQL92 constructs available. We also learned a bit about PG performance, which I'll share.

    1) Run EXPLAIN ANALYZE on everything. Postgresql is touchier about query performance than MySQL was. This just needs to be a habit if you're using PG. (You really should do performance analysis no matter your DB. It's just a good practice). The biggest gain will be making sure you're using index scans rather than sequential scans.

    2) Use persistent connections. Everyone likes to point out the forking issue with PG vs. MySQL's threaded. PG's connection handling is slow, there's no doubt about it. But there's an easy answer. Just limit how often you connect. If you can keep a connection pool, and just reuse those connections, you'll save this big hit.

    3) Full vacuum and reindex regularly. We've found the docs to be a bit off on this. It indicates that you should run these occasionally. If you're in a read-write system, a full vacuum on a regular basis is very important. It really doesn't take that long if you do it regularly. Also, we've had trouble with indexes getting unbalanced (we see 50->90% tuple turnover daily). This has gotten better, but it doesn't hurt to let your maintenance scripts make things ideal for you. So, we run a full vacuum and reindex of our tables nightly through cron.

    4) Get your shared memory right. PG's shared buffers is probably the most important config attribute. It controls how much of your DB is memory resident vs disk resident. Avoiding disk hits is a big deal for any DB, so get this right. If you can fit your whole DB in memory, then do it. If not, make sure your primary tables will fit. The more you use the shared memory, and the less you have to page data in/out, the better your overall performance will be.

    Most DB systems seem to be read-mostly, so I can understand the performance comparisons focusing on that. In our read-write system though, the locking was the biggest issue and it tilted the performance comparison toward PG.

    1. Re:Why we moved from MySQL to PG by tcopeland · · Score: 4, Informative

      > So, we run a full vacuum and reindex
      > of our tables nightly through cron.

      I've found that just enabling autovacuum seems to keep things in order. And you can tweak it for individual tables if you're so inclined.

  18. Of course, MySQL is effectively two products... by itsdapead · · Score: 3, Insightful

    MySQL/MyISAM is the one with the massive legacy code base, the one that your open-source blogging software uses and probably the one that your web host supports. It beautifully hits the "sweet spot" for data-driven web sites with infrequent and simple updates, where trading integrity for "read only" performance is sensible. It does not even purport to compete with PostgreSQL on features - but it does offer fulltext searches, again

    MySQL/InnoDB is the one that offers transactions, foreign keys etc. (ISTR it doesn't do fulltext indexes, though) - this is the "version" that bears comparison with PostgreSQL. I wonder how its user base compares?

    (OK - you can mix InnoDB and MyISAM tables in a single database, but you can't use InnoDB if your web host hasn't installed it - heck, one provider I use is still on MySQL V3.23)

    Flamewars have tended to pit PostgreSQL against a mythical database with the performance of MyISAM and the features of InnoDB...

    As for the GUI software, the MySQL GUI Admin/query browser stuff is shinier than PgAdmin3 - but the MacOS version of the former is a complete crashfest! Neither of them steps up to the plate of providing a FOSS equivalent of (the good bits) of MS Access.

    --
    In a survey of 100 programmers, 111111 thought that duck-typing was a good idea.
  19. Ok, here is another outdated test by xyvimur · · Score: 3, Informative

    Ok, this is yet another outdated report comparing three mainstream RDBMS'es - MySQL, PostgreSQL and ORACLE. It was done for yet another physical experiment - for choosing the proper system for storing data about the construction process of one of the LHC detectors - ALICE.
    And this report is at least professional, which cannot be said about the one mentioned in the article.
    http://dcdbappl1.cern.ch:8080/dcdb/archive/ttraczy k/db_compare/db_compare.html

  20. more recent benchmarks by darekana · · Score: 4, Interesting

    http://tweakers.net/reviews/657

    They compare PostgreSQL 8.2 vs MySQL 4.1.20 and MySQL 5.1.20a.

    1. Re:more recent benchmarks by daBass · · Score: 3, Insightful

      That is a great review; an actual real-world scenario buy guys who seem to know their database, not some spotty teenager who thinks a database benchmark is seeing how fast you can load 1 million records and then see how long a "select *" on that table takes...

      As the article shows, every time they double the number of cores, Postgres gains 75% in performance - like any good application should do. At 4 cores, it is already twice as fast as MySQL under reasonable concurrency; I'd like to see this test on a 8-core server - my guess is MySQL wouldn't be much faster than it is now and Postgres would perform at least 3 times better than MySQL.

      Oh, and Postgres doesn't think 0000-00-00 is a valid date, which is nice too.

  21. a more up-to-date comparison by brentlaminack · · Score: 2, Informative

    I did a presentation at the Atlanta Unix Users' Group this month that is a more up-to-date comparison. It's available in Open Office format. You can also get to it from my home page. I did a similar talk almost four years ago. My conclusion is that MySQL has closed the feature gap with PostgreSQL in recent years. I still give PostgreSQL the edge in features, and MySQL the edge in out-of-the-box untuned performance. I also discuss replication and clustering.

  22. Here's a recent (Nov. 2006) Performance Review by drake · · Score: 2, Informative

    Here's a comprehensive performance review between PostgreSQL and MySQL. It compares both DB's under load as well as comparing Intel/AMD chips. http://tweakers.net/reviews/657/6

  23. Postgres For Larger Datasets by Hornsby · · Score: 5, Interesting

    I had to make a decision recently between Mysql and Postgresql for a database composed of many tables with greater than 50,000,000 rows. While going through the decision making process, I loaded a sample table with 50,000,000 rows to do some benchmarks. The first thing I had to do to run my tests was index the table. I started with Mysql using a InnoDB table type. I had both database servers relatively tuned to the hardware they were running on. I ran the create index with MySQL and detached my screen session. I came back several hours later to find MySQL was doing something along the lines of INDEX via REPAIR SORT. After some reading, I learned that this takes an order of magnitude longer than building an index the "normal" way and is caused by the index becoming corrupted during the creation. Okay... so, I restarted this process several times and encountered the same problem. This is clean data mind you that has already been exported from an existing SQL server. I duplicated my install on a second server and had the same problems. Very annoyed with MySQL, I gave Postgres a try. It worked on the first time in less than 25 minutes without issue. Since then I've been using it on 250,000,000 row datasets without issue. It's always reliable, and as long as you remember to use CURSORS for huge SELECT statement, it's painless to work with.

    --
    A musician without the RIAA, is like a fish without a bicycle.
  24. Disapponting start by smchris · · Score: 2, Insightful

    The first screen that it says MySQL supports ODBC and doesn't mention that PostgreSQL does as well -- so why should I read further? Either sloppy, ignorant, or biased writing.

    There were a couple comparisons a couple years ago. It was my understanding that PostgreSQL did better with large data sets in a P vs. M match. In getting hammered with connections, another test between MySQL, PostgreSQL, DB2, Oracle, and SQLServer, if I remember, Microsoft's offering started to crap out along a power curve at maybe just 200(?) hits and the others degraded pretty equally along a straight line.

    My client/server experience started with some Oracle classes and managing a department server. I must say I am _much_ more comfortable with PostgreSQL and find MySQL a little alien no matter how popular it is. Just my 2 cents.

  25. Unbiased? by evrybodygonsurfin · · Score: 3, Insightful

    From the comparison table:

    • Postgres: Lacks binary distribution for all the supported plataforms.[sic]
    • MySQL: There are binary distribution for most of the supported plataforms.

    These statements convey the same information but that the author has presented them in different lights suggests to me a premeditated bias in favour of MySQL.

  26. Re:Whoa... this was actually helpful! by bgalbrecht · · Score: 2

    How can you find a trivial, severely out of date comparison useful?

  27. MySQL short on features by Frater+219 · · Score: 4, Informative

    Does the Internet's favorite DBMS have an IP address datatype yet?

    How about MAC address? CIDR block?

    "An IP address is just a 32-bit unsigned int, duh. Any DBMS can store those."

    Wrong. A datatype isn't just about storage, but also about operations. In PostgreSQL, when you do a SELECT across a table with IP addresses in it, you get them formatted and displayed as IP addresses, not as opaque ints. Likewise with CIDR blocks, like "192.168.42.0/23". There's also a comparison operator for asking whether an IP address is within a CIDR block.

    If you're implementing a network registration system or an incident logging system, how much of your time do you want to waste staring at opaque ints like 3232246272 rather than IP addresses like 192.168.42.0 when you're trying to debug it?

    MySQL is a bimbo, a fratboy: it's easy, but so shallow! The amount of time you save in one-time setup, you will lose many times over in all the little annoyances and deficiencies of a DBMS that was originally designed by folks who didn't really believe in DBMSes. Over time they've slowly been shamed into including many of the features they used to despise: transactions, relational integrity checks, and so on. But there's still so much missing ... not just essential integrity features, but little fiddly bits like good datatype support, the kinds of things that make your life easier (as a programmer or as a DBA) in the long run.

  28. I'd like to see... by spurioustruth · · Score: 2, Informative

    How about an article on that neat analysis framework Fermi (and others) use for the terabytes of data they generate and have to sift through?

    http://root.cern.ch/ (large scale repository for data analysis)

    And how would you use Postgres for something like that anyway? Maybe something like:

    http://www.greenplum.com/ (biggy-sized Postgres based data warehousing)

  29. This Was a Hit Piece - Better Data Here... by skeeterbug · · Score: 2, Informative

    i was using pgsql 8.x back in 2005 - so it existed. they just cherry picked versions to get the result they wanted. postgresql.org linked to a new study where both current versions (at the time, and it was recent - pgsql is 8.2 now) of mysql and postgresql were put throught the paces... http://www.postgresql.org/about/news.691

  30. I love this part... by naChoZ · · Score: 3, Funny

    MysSQL has a much larger user base than PostgreSQL, therefore the code is more tested and has historically been more stable than PostgreSQL and more used in production environments.

    "Claiming that your RDBMS is the best in the world because more people use it is like saying McDonalds makes the best food in the world."

    Sorry, just an old joke that deserved retreading... ;)

    --
    "I can be self-referential if I want to," said Tom, swiftly.
  31. Re:PostgreSQL vs. Oracle? by Angvaw · · Score: 2, Interesting

    I've used Oracle for about 3 years and Postgres for about 1. Both are good databases. I may write what you're asking for eventually, but don't hold your breath. :) Here are a few thoughts off the top of my head that might help.

    In a way-oversimplified nutshell, Postgres seems sorta-kinda modelled after Oracle. But many times I look into a feature that I've used on Oracle and find it doesn't exist in Postgres or it's not up to my expectations. The Postgres feature might be a little half-assed. One example is the multi-version concurrency control. If you update a row in Oracle, the undo tablespace contains a snapshot of the old version. All other concurrent users are directed to the undo tablespace until the transaction completes. They get the old version of the uncommitted row from the undo tablespace. Postgres on the other hand makes a new row in the table for your update, and the old one is marked as a "dead tuple" whose data can be retrieved by other transactions.

    Might not seem like a major difference, but you'll have to "vacuum" your Postgres database periodically to clean up your dead tuples. Otherwise, table scans have to scan over all the dead tuples in addition to the "real" data. Some DBAs simply run a vacuum job every night, or use the auto-vacuum daemon. It's not always so simple however. Imagine you have to update data every morning and get it done ASAP. Well, now that you've updated your 500,000 rows, you have to vacuum the table to keep queries running at a decent speed. When I was still a Postgres newbie, I would just vacuum whenever I remembered. One time I forgot about it for a few days, when I finally ran the vacuum it took 17 hours to complete - granted it was a one disk server. It's just another thing to worry about.

    Once I looked for a replication solution for Postgres. Some were incomplete. Slony-I - the most popular Postgres replication solution - seemed like the way to go. But upon delving into it, I found it imposed limitations because it sort of "corrupts" the Postgres data dictionary. I believe this corruption broke triggers on all replicated tables (don't quote me on that). Also Slony-I does not support multi-master replication at all.

    Sometimes a feature is just great. Rarely is the Postgres feature better, except in terms of user-friendliness. For instance the psql command line tool blows sqlplus out of the water IMO. (Wow, paginated results, tab completion!)

    Oracle has a wealth of OLAP/business intelligence features that are lacking in Postgres. There is a flavor of Postgres called Bizgres that is a bit better in this area, though still lacking in comparison. Bizgres is more cutting edge and doesn't seem to have nearly as strong a community as Postgres. Go to http://www.bizgres.com/bb/ and you'll see there are very few posts.

    Oracle has an expert named Tom - asktom.oracle.com - who is a genius and answers all my questions. Postgres has an expert, also named Tom, who is a genius and answers all my questions on the newsgroup.

    When would I use one over the other? Personally I would use Oracle if there were no budgetary constraints, but if Postgres definitely met all my requirements, I'd go with that. (Yes yes, I am Captain Obvious.) If you need Oracle's enterprise-level features then Postgres is out of the question. There is however EnterpriseDB, a commercial, Postgres-based enterprise-level database. Several months ago Sony "snubbed" Oracle and went with EnterpriseDB for its MMORPGs. I have not looked into EnterpriseDB myself so I can't comment further.

    Well I'm rambling and not getting work done. Hope that helps just a little bit.

  32. Foreign keys are an enterprise feature by alienmole · · Score: 4, Interesting

    I suspect you're thinking of relatively simple cases where you have a single application working against a database. Which of course is the standard MySQL scenario.

    Where foreign keys and the other referential integrity features really shine is in true enterprise scenarios, when you may have hundreds or thousands of applications, written in multiple languages, working against the same shared database(s).

    In that scenario, the only viable way to duplicate the functionality of foreign keys at the application level is to have a middle layer which all other applications are required to go through. Realistically, that middle layer has to be implemented as a server, serving requests for object/record creation, update and delete over the network. Implementing it as a library to be linked into applications doesn't work well, because there are multiple applications accessing the database, and integrity enforcement needs to be centrally coordinated.

    Implementing a middleware data server for an application isn't all that difficult, but integrating it into applications can be. Most application development environments know how to talk to databases, but don't automatically know how to talk to your application-specific, language-independent, data server. So now you're writing a client library for each app dev platform used in the enterprise, and dealing with things like integrating your custom interface with data-bound controls in the user interface. BTW, this is where people start resorting to e.g. SOAP, and projects start going off the rails (no pun intended, Ruby fans).

    Luckily, as it turns out, there are already standardized, widely-available, well-supported systems that implement a centralized data serving service which enforces referential integrity. They're called databases. And foreign keys are an essential part of the service they provide.