Slashdot Mirror


MySQL And PostgreSQL Compared

unicron writes: "PHPBuilder has got an article MySQL and PostgreSQL Compared. " Everyone who has used these DBs knows the differences between them, and now that licensing isn't one of them, let's try to talk about where each excels and the other fails. I know people get almost as religious about their DBs as they do about OSs and programming languages, but let's try to get somewhere here and not just needlessly flame and rant, mmkay?

8 of 147 comments (clear)

  1. Tool for the Job by Greyfox · · Score: 5
    DBase III wasn't a very good database program, but in its heyday millions of people used it and it got the job done for them. Even relatively inexperienced users could make use of it and write simple programs to manipulated their data. Even though it sucked, it was the right tool for a lot of jobs at the time.

    Compared to DBase III, both of these tools are excellent. I wish I'd had either one a decade ago when I started work doing clipper programming for a dog track related publishing company.

    For the dog track application I would have preferred Postgres; the rollback support would be pretty compelling for an application like the one we were doing. For something where I was just kicking around a database (Which I've also done a lot of) MySQL would be perfect. MySQL would be ideal in something like the RHS Orchid Registry, for instance.

    If application bigotry keeps you from choosing the right tool for a job, you will be a less valuable resource to those who employ you. Not too many people seem to "Get" this. People are often surprised that I will, on occasion, suggest that Microsoft products are the best tool for what they're trying to do. Usually those people asked me expecting a "Windows sucks use Linux" spiel, but if I think their situation warrants it (Inexperienced user, just wants to browse the web, word process and send E-Mail or wants to play games at all) I'll tell them to use Windows.

    --

    I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

  2. Re:Diferences? by Jerry · · Score: 4
    When I was comparing the two (MySQL 3.21.33 and PostgreSQL 6.4) I noticed that the MySQL manual gave a list of SQL functionalities that were missing in it.
    • Sub-selects
    • Selects into a table
    • Transactions
    • Triggers
    • Foreign Keys
    • Views


    These aren't trivial weaknesses, and says nothing about syntactical differences between MySQL and ANSI SQL 92. I use these features everyday at work. To impliment them in code is possible but would be a royal pain. TRANSACTIONS, for example, can be worked around by creating duplicates of tables and batch processing into them. If all the updates on all the tables goes ok then copy the dups over the original tables. Not practical if the original tables are huge, but can be done at night after backups are done.

    It also listed functions which 'exist' only for compatibility (i.e., not functional)
    • GRANT
    • CREATE INDEX
    • DROP INDEX
    • ALTER TABLE

    After that list of don'ts I decided to check PostgreSQL, especially since TRANSACTIONS was missing and MySQL said it would require a 'completely different table layout than *MySQL* uses today.' I understand that they have employed a wrapper technique to supply a 'TRANSACTION'-like functionality. I've had experiences with extending fuctionality with wrappers and they are generally not a stable or acceptable solution.

    PostgreSQL has everything MySQL is missing. It has a very powerful set of operators (std, numerical, time-interval, geometric) which can be evoked by name. It also has a similar set of fuctions which can be used to increase the functionality of the operators. Postgres alows attributes of an instance to be defined as fixed-length or variable-length multi-dimenstional arrays, either of base type or user-defined types. (Not exactly like Pick's AMV, but close). I seldom see folks mention a powerful feature of PostgreSQL that MySQL doesn't have - Inheritance.
    From their documentation:


    CREATE TABLE cities (
    name text,
    population float,
    altitude int -- (in ft)
    );

    CREATE TABLE capitals (
    state char2
    ) INHERITS (cities);



    That's why I chose to PostgreSQL over MySQL.

    --

    Running with Linux for over 20 years!

  3. Re:You know... by wik · · Score: 5
    The article was listed as a column, not a full-disclosure benchmark report. I'll let the author slide on those points, but here are some things to think about when running a benchmark:

    • He gave a very vague description of the layout of his database and the nature of the queries. Without that, it is impossible to reproduce his results.
    • He did not describe his client/driver configuration, for all we know, they could have been programs running on the same machine as the DB server
    • For database benchmarks such as TPC-C (you can see the stringent reporting specifications at www.tpc.org) it is common to allow the driver program to warm up the database for a long amount of time (45 - 60 minutes seems to be common now, depending on the size of the database) then allow the database to reach steady state after at least 20 minutes. He does not mention whether he gave time for the database to stablize. Transient numbers are not a good indication of actual system performance.
    • His reported statistics are lacking. Saying that PostgreSQL is 2 - 3 times faster than MySQL leaves a 50% uncertainty in his measurements. At times, he reports 2 - 3, and others he reports simply 3. In fact, I don't see any solid measured numbers for queries/second, transactions a second or some other metric.
    Granted, most websites running either of these software packages are probably not going to see very heavy loads, but if you're deciding between two vendors, it's good to know exactly *what* you are comparing.
    --
    / \
    \ / ASCII ribbon campaign for peace
    x
    / \
  4. Re:Flame by wik · · Score: 5

    Try the TPC-C reports here. These are not specifically comparisons between vendors, but they are all reports of systems running a common benchmark. The reported benchmarks all went through approved TPC audits before being released. (On an aside, Microsoft just withdrew their results on SQL Server 2000 because they failed the audit, it's on ZDNet somewhere, I don't have the link).

    --
    / \
    \ / ASCII ribbon campaign for peace
    x
    / \
  5. Re:8K limit ? by paulrussell · · Score: 5

    > my sites all have whole articles, pictures, etc... into a MySQL database.

    (Warning: DBA in a previous life - expect pain ;)

    I really wouldn't suggest placing text articles, images or other large objects into a database tuple. Modern databases (including PostgreSQL; not sure about MySQL) have support for 'BLOBs' (Binary Large Objects), which allow you to associate very large files and objects with a particular tuple.

    The BLOBs are stored as part of the database, but not within the relation itself. This works a *lot* faster than putting the images or text inline with the tuple, and is I suspect partly why the folks at PostgreSQL haven't yet fixed the 8kb limit.

  6. He missed Stored Procedures! by Ami+Ganguli · · Score: 5

    The reviewer mentioned Postgres triggers, but didn't make it clear that triggers use a more general feature called stored procedures.

    By coding lots of low-level functions as stored procedures that are stored _inside_ the database I can drastically reduce the number of database calls my application has to make.

    The benchmarks he used can't show this (because he said himself that he doesn't use these features) but proper use of stored procedures should improve performance drastically while simplifying application code.

    --
    It is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail. - Abraham Maslow
  7. Re:Diferences? by Carnage4Life · · Score: 4

    The difference isn't just the SQL support but more the transaction handling abilities of the DB specifically the ACID properties of the database. The ACIDity of a database is what makes it a RDBMS. Acid properties are

    Atomicity:
    This means that transactions are either fully completed or never begun at all. Any updates that a transaction might make on a system are completed in its entirety. If any error ccurs during the transaction keeping it from completing all its steps then the DB is rolled back to its previous state before the transaction began. E.g. Let's say a transaction consists of money being removed from a checking account and stored in a savings account. If there is an error after the money is removed from the checking account but not stored in the savings account then the transaction can be rolled back to when the money was still in the checking account.

    Consistency:
    This means that the DB is always in a valid state after a transaction. Thus if an error occurs during a transaction, the DB can be rolled back to the last valid state. E.g. From the above example the total amount of money in both accounts is constant and is equal to the sum of both amounts. If an error occurs while transferring money from the checking account to the savings account then the total of both accounts is not consistent with the amounts actually in the accounts. This inconsistency is handled by rolling back the DB to the last consistent state.

    Isolation:
    Each transaction appears to the only one being carried out by the system at that time. If there are two transactions both performing the same function and both running at the same time they will be invisible to each other. This ensures that the system is consistent because if transactions do not run in isolation the they may access data from the DB that is inconsistent due to the fact that some other transaction is in the middle of performing its tasks. E.g. if my paycheck is being deposited in my checking account at the same time as when I am viewing an account summary I will not be able to see the changes being made to my account until the deposit transaction is completed. This stops unsavory things like seeing the total amount in the account remaining constant while the recent deposits field may already contain the deposit.

    Durability:
    Once changes have een made to the DB they are permanent. There are safeguards that will prevent loss of information even in the case of system failure. By logging the steps that each transaction takes the state of the system can be recreated even if the hardware has failed.


    Here's an article that explains why mySQL is not a real RDBMS since it's support for transactions are lacking. PostrgreSQL on the other hand supports transactions. For instance, people who use a DB that supports transactions don't have to lock the tables themselves when accessing the DB from their code instead stuff like that is handled by the DB.

    PS: Also not all of SQL is supported by mySQL (e.g. Foreign Keys for specifying integrity constraints).

  8. Re:both have serious limitations by laertes · · Score: 4

    Don't you think that your comments are more than a little unsubstatiated? Have you ever used an ATM? If so, you are using transactions. Have you ever paid with something using a credit cart? If so, you are using a transaction application. Have you ever made an insurance claim? If so, you again have used transactions.

    While preventing dataloss is an important application for databases, it is not the only one. Instead, try creating a large, complicated system using no transactions. In order to execute most data manipulations, you would have to modify more than one table. Even something like a bank account transfer would access at least three tables. How would you do this using something with no transactions? You would lock the tables as you needed them. How would you do this with a transaction safe system? You would forget about locking the tables and commit or rollback, depending on the success of the action.

    The fact is, for simple joins, MySQL will always be faster, and simpler. For more complicated joins, it's naive query optimization is incredibly limited, with joins involving more than seven tables, my experience has shown that MySQL just cannot optimize correctly, even PostgreSQL was even faster on some of my most complicated joins.

    Simple insert/update actions likewise are much faster, but with the lack of foreign keys, you have to write select statements yourself to maintain referential integrety.

    MySQL is faster for simple actions that wouldn't need foreign key checks, and complex manipulations. However, consider when you have to make a modification to a large number of tables. in MySQL, you would have to lock each table, blocking out any other access to the same data, while something like PostgreSQL allows concurrent access to the data.

    You talk about the performance killing features of full SQL, they do indeed have a performance hit. These are overkill for glorified flat file databases, but for a normalised database, transactions give you an order of magnintude more speed.

    Consider the ACID test, durability is only one fourth of the equation, atomicity, durablity and isolation make up the rest of the equation. These features not only make the system more stable, they also make life for a developer much easier. They also allow a far more robust, scalable and flexible system to be developed in a shorter amount of time. Every big website that uses a DMBS uses something with transactions, not because they are a performance killer, but because they are a more scalable solution.

    At the low end, big databases are slower, but as the size of the system grows, MySQL gets slower faster, untill at Enterprise level, it becomes slower. Lest you think I don't know what I'm talking abount, I work for a fairly large e-commerce site as the senior programmer.

    BTW: I do agress that SQL is not the ultimate solution, I wouldn't go so far as to say it's awful, though. Instead of bitching, why don't you design something better, I am designing something better, it's just a long way down the road.

    --

    Yes, I'm still a junky. Are you still a bitch?