Slashdot Mirror


What Would You Want to See in Database Benchmarks?

David Lang asks: "With the release of MySQL 5.0 and PostgreSQL 8.1, and the flap over Oracle purchasing InnoDB, the age-old question of performance is coming up again. I've got some boxes that were purchased for a data warehouse project that isn't going to be installed for a month or two, and I could probably squeeze in some time to run benchmarks on the machines. However, the question is: what should be done that's reasonably fair to both MySQL and PostgreSQL? We all know that careful selection of the benchmark can seriously skew the results, and I want to avoid that (in fact I would consider it close to ideal if the results came out that each database won in some tests). I would also rather not spend time generating the benchmarks only to have the losing side accuse me of being unfair. So, for both MySQL and PostgreSQL advocates, what would you like to see in a series of benchmarks?"

"The hardware I have available is as follows:
  • 2x dual Opteron, 8G RAM, 2x144G 15Krpm SCSI
  • 2x dual Opteron, 8G RAM, 2x72G 15Krpm SCSI
  • 1x dual Opteron, 16G RAM, 2x36G 15Krpm SCSI, 16x400G 7200rpm SATA
I would prefer to use Debian Sarge as the base install of the systems (with custom-built kernels), but would compile the databases from source rather than using binary packages.

For my own interests, I would like to cover at least the following bases: 32-bit vs. 64-bit vs. 64-bit kernel + 32-bit user space; data-warehouse-type tests (data >> memory); and web performance tests (active data << RAM).

What specific benchmarks should be run, and what other things should be tested? Where should I go for assistance on tuning each database, evaluating the benchmark results, and re-tuning them?"
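
The bases listed in the question multiply out quickly, so it may help to enumerate the runs before starting. The following is a minimal sketch, assuming each database is run once per platform and workload; the label strings and the `benchmark_matrix` helper are illustrative and not part of the original question.

```python
from itertools import product

# Dimensions taken from the question; the exact label strings are illustrative.
DATABASES = ["MySQL 5.0", "PostgreSQL 8.1"]
PLATFORMS = ["32-bit", "64-bit", "64-bit kernel + 32-bit user space"]
WORKLOADS = ["data warehouse (data >> memory)", "web (active data in RAM)"]


def benchmark_matrix():
    """Return every (database, platform, workload) combination to be run."""
    return list(product(DATABASES, PLATFORMS, WORKLOADS))


if __name__ == "__main__":
    for run_id, (db, platform, workload) in enumerate(benchmark_matrix(), start=1):
        print(f"{run_id:2d}. {db} | {platform} | {workload}")
```

Listing the combinations up front makes it harder to skip a configuration for one database and not the other by accident.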

9 of 42 comments

  1. Ask the dev teams for both! by iamsure · · Score: 3, Insightful

    Send an open email to the dev teams on both projects, and ask for their opinions on what should be tested. It might take 3-4 rounds of back and forth to settle on a set of reasonable benchmarks and settings, but at least that way both sides are involved from the beginning.

  2. Don't bother by Anonymous Coward · · Score: 4, Insightful

    in fact I would consider it close to ideal if the results came out that each database won in some tests

    With an attitude like that, there's no point running benchmarks. The idea is that you run the benchmarks to get an idea of how the databases perform. But it seems you are already rejecting one possible result (that one database performs worse than others in all respects) because you don't consider it "fair".

    Well life isn't fair. I'm sure people worked hard on all databases, but that doesn't mean they all have value. Sometimes people try hard and fail. And you want to ignore the numbers that tell you this because you think it's fairer that way? Give me a break, you don't want to run a real benchmark, you want to run something that will tell you what you have already decided upon is the best.

  3. This is simple by MerlynEmrys67 · · Score: 3, Insightful

    Take your current application that you need the database for.
    Compile your application to use each database.
    Now go and compare which database runs fastest on your application.

    Anything else just doesn't matter - your application is going to be different than every benchmark, so what you need is to run your application on the database and see what happens.

    What I have usually found is that while you can highly tune the database and get great database benchmarks, most of those gains are ruined by completely brain-dead applications that do very stupid things, wrecking any kind of performance the database will give you.
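
    The approach above can be sketched as a small timing harness. This is only a rough sketch, assuming the application workload can be wrapped in a zero-argument function per database; `time_workload`, `compare`, and the placeholder workloads are hypothetical names, not part of any real tool.

```python
import statistics
import time


def time_workload(workload, repeats=5):
    """Run a zero-argument callable several times; return the median wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(backends, repeats=5):
    """Return (name, median_seconds) pairs for each backend, fastest first."""
    results = {name: time_workload(fn, repeats) for name, fn in backends.items()}
    return sorted(results.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Placeholders: replace each lambda with a function that drives the
    # real application against that database.
    backends = {
        "mysql": lambda: sum(range(100_000)),
        "postgresql": lambda: sum(range(100_000)),
    }
    for name, seconds in compare(backends):
        print(f"{name}: {seconds:.4f}s")
```

    The median is used rather than the mean so that a single slow run (cold cache, background job) does not dominate the result.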

    --
    I have mod points and I am not afraid to use them
  4. Re:What I'd really like to see... by dtfinch · · Score: 3, Interesting

    What you do is:
    The person publishing the benchmark does not use Oracle.
    The Oracle user running the benchmark remains anonymous.

    But there are many ways that Oracle (or any other database software) can be made to perform badly in a benchmark that would be no fault of the software. If someone wants to benchmark against Oracle, Oracle wants to make sure they do it correctly, or else not at all. If they didn't have that clause, Microsoft would have dozens of studies and benchmarks saying that Oracle is slower than SQL Server under certain setups, just like those bullshit VeriTest benchmarks they have against crippled setups of Red Hat, Apache, and Samba.

  5. easy, TPC by ZuggZugg · · Score: 2, Informative

    TPC-C and especially TPC-H for DW benchmarking. Buy the benchmark kit and run it... done. If you're really serious, have it independently audited and submit the results to TPC.org. You could probably wrangle up some sponsors to help foot the bill.

    Good luck.

  6. Very large database operations by eyeball · · Score: 3, Interesting

    I would love to see operations on very large databases, say 100 million or 1 billion records (or even more). Operations like bulk loading, inserting, querying, deleting; against indexed and un-indexed tables; reindexing a whole table (*).

    (*) Reindexing caused me a ton of grief. I inherited a huge mysql db once that required an emergency reindex. Unfortunately mysql locked the table while it did a full table copy, which took hours.
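
    A rough sketch of timing those operations one by one follows. It assumes Python's built-in sqlite3 module purely as a stand-in so the example runs without a server; it is not one of the databases under discussion, the `records` table and helper names are made up, and statements such as REINDEX are spelled differently in MySQL and PostgreSQL.

```python
import sqlite3
import time


def timed(conn, sql, params=()):
    """Execute one statement, commit, and return (elapsed_seconds, rows)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    conn.commit()
    return time.perf_counter() - start, rows


def run_suite(row_count=100_000):
    """Bulk load, query without and with an index, reindex, then delete."""
    conn = sqlite3.connect(":memory:")  # stand-in database, see note above
    conn.execute("CREATE TABLE records (id INTEGER, payload TEXT)")
    timings = {}

    # Bulk load: one executemany call inside a single transaction.
    start = time.perf_counter()
    conn.executemany(
        "INSERT INTO records VALUES (?, ?)",
        ((i, f"row-{i}") for i in range(row_count)),
    )
    conn.commit()
    timings["bulk_load"] = time.perf_counter() - start

    lookup = "SELECT payload FROM records WHERE id = ?"
    last_id = (row_count - 1,)
    timings["query_unindexed"], _ = timed(conn, lookup, last_id)
    timings["create_index"], _ = timed(conn, "CREATE INDEX idx_records_id ON records (id)")
    timings["query_indexed"], rows = timed(conn, lookup, last_id)
    timings["reindex"], _ = timed(conn, "REINDEX")
    timings["delete"], _ = timed(conn, "DELETE FROM records WHERE id % 2 = 0")

    remaining = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
    conn.close()
    return timings, rows, remaining


if __name__ == "__main__":
    timings, _, _ = run_suite()
    for step, seconds in timings.items():
        print(f"{step}: {seconds:.4f}s")
```

    For a real run the row count would need to be in the 100 million to 1 billion range mentioned above, on disk rather than in memory, and with concurrent readers to expose table locking during the reindex.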

    --

    _______
    2B1ASK1
    1. Re:Very large database operations by captainclever · · Score: 2, Informative

      Oh yes.. mysql makes a copy of the table for every ALTER TABLE command I think, even if you only want to drop an index. This sucks royally. Sometimes it's quicker to add and remove indexes when you want to run a few queries on a large table (at least it is with postgres). Locking the table whilst dropping/creating indexes is a huge pain in the ass - but won't show up in benchmark results. This also means that if you don't have enough disk space to hold a copy of the table, you can't easily alter it :(

      --
      Last.fm - join the social music revolution
  7. Re:single-user - long query by Nutria · · Score: 2, Insightful

    Although I highly suspect that mysql is not suited for that sort of usage anyway.

    On the contrary, I bet that MySQL w/ MyISAM would be well suited to that task.

    --
    "I don't know, therefore Aliens" Wafflebox1
  8. Contact us on the pgsql-performance list by jeffroe · · Score: 2, Informative

    We'd love to see some benchmarks run on this equipment. It's a great chance for us to evaluate and boost PostgreSQL performance in general. Can you contact us directly? You can find a subscription link here: http://archives.postgresql.org/pgsql-performance/ as well as the thread regarding your Ask Slashdot question here: http://archives.postgresql.org/pgsql-performance/2005-11/msg00514.php