Slashdot Mirror


First "Real" Benchmark for PostgreSQL

anticlimate writes "A new benchmark published on SPEC shows PostgreSQL's performance approaching that of Oracle's and surpassing or on par with MySQL (however the test-hardwares of the other DB systems are somewhat different). The test was put together by PostgreSQL's core developers working at Sun. They certainly are not unbiased, but this is the first 'real' benchmark with PostgreSQL — according to Josh Berkus's blog. The main difference compared to earlier benchmarks (and anecdotes) seems to be the tuning of PostgreSQL."

17 of 275 comments (clear)

  1. The best way to truly compare by CaptainPatent · · Score: 3, Interesting

    Because Sun systems will always be different from the x86 based cores that run MySQL and Oracle, I think the best way to compare such software would be by constructing servers of equal price and seeing how PostgreSQL fares. The true question on any business person's mind is "how much to implement?"

    --
    Well, back to rejecting software patent applications.
    1. Re:The best way to truly compare by Doctor+Memory · · Score: 3, Interesting

      Sun systems will always be different from the x86 based cores that run MySQL and Oracle Umm, wrong both ways. Oracle runs really well on Sun SPARC hardware (and I suspect MySQL at least runs), and Sun also makes x86-based servers (built with AMD's Opteron chips). It shouldn't be any trouble to benchmark all three on the same hardware.

      Well, no technical trouble, anyway — I doubt Oracle would like to have its performance compared to two free-as-in-beer competitors. Even if it comes out on top, people will still be tempted to think "Jeez, with the money I save on Oracle licenses, I can buy a faster server and make up the speed difference"...
      --
      Just junk food for thought...
  2. What are the tuning parameters? by Anonymous Coward · · Score: 2, Interesting

    For those of us who don't have dozens of hours to do the necessary research, can some postgresql gurus sum up some of the most significant tuning parameters so us mere mortals can see similar performance gains?

    I realize that a large part of the answer is going be "it depends on application, your hardware, and you query types", but surely there must be some general tips that we can follow given various typical setups. MySQL, for example, ships with several different configuration files: One suitable for a small installation, one for a mid-sized installation, one for large installation, etc.

    What tuning can someone do to tune postgresql's default (conservative) config file to make it perform better?

  3. Good, but it can be improved. by khasim · · Score: 5, Interesting

    Get people from each group, give them the requirements and 5 different dollar amounts.

    Let each team setup their systems how ever they want at each price point. Some will go with clustered servers. Some will go a single monster server. They know their products best so they'll be the ones best suited to choosing the configuration.

    Then run the benchmarks. And keep hammering on them until AFTER the next patch release.

    Yeah, it might run fast, but still be a bitch to patch/upgrade.

    At $5,000 you might find that a cluster of MySQL boxes beats everything.

    At $10,000 maybe something else is best.

    $25,000

    $50,000

    $100,000

    etc.

    And finally, break it. Break it bad. What happens when something goes wrong? Oracle might cost a lot, but if they can come through with your data they might just be worth it.

    If nothing else, you'll get the "best practices" nicely demonstrated by each group. :)

  4. Re:Mod parent way up! by Anonymous Coward · · Score: 2, Interesting

    I can't speak for complex queries, but here are some simple findings from my testing:

    Inserting 20 million rows, all simple inserts, only one primary key (int) with autoincrement for mysql and a sequence for postgres:
    Avg Mysql time per 1000 inserts: 3 seconds
    Avg Postgres time per 1000 inserts: 15 seconds (and gets worse over time)

    That was after increasing the caches and disabling fsync on postgres too.

    I also did a delete then insert for both (to flush out already existing rows), with similar results.

    Ended up just inserting into a mysql table with no keys/indexes at all for maximum speed (average of .4 seconds per 1000 inserts), but didn't test that with postgres. Decided I only needed a static table anyway, so mysql was more ideal for it's speed in my situation. Sure with I had used postgres on a few more complicated projects though...

    Posting AC to avoid flames.

  5. performance isn't the issue by SEAL · · Score: 1, Interesting

    Performance isn't what causes a lack of acceptance in the marketplace for PostgreSQL.

    The problem is twofold:

    MySQL, as others have pointed out, has better developer support and they know their target audience. They supply a fast, easy to use database for those who don't need a whole lot.

    Oracle supplies an enterprise level database that MySQL doesn't aspire to. PostgreSQL doesn't know where to fit in.

    Do a little investigation on setting up PostgreSQL with fault tolerance and replication and you'll quickly see why large corporations cough up money for Oracle. Performance is one aspect of the price tag, but it is certainly not the only factor.

    1. Re:performance isn't the issue by aled · · Score: 2, Interesting

      MySQL, as others have pointed out, has better developer support and they know their target audience. They supply a fast, easy to use database for those who don't need a whole lot.


      What is exactly easier in MySql?
      --

      "I think this line is mostly filler"
    2. Re:performance isn't the issue by Anonymous Coward · · Score: 1, Interesting

      PostgreSQL's command-line shell is far better than mysql. In addition to what mysql does, psql supports tab-completion and other nice things.
      MySQL has supported tab-completion for a long time (2000 maybe?). I know it's there in the 3.x series. What are the "other nice things" that you speak off? You chide the parent because he's "sure" yet you won't list any specifics (ok, just one) yourself. Step up and tell us.
  6. Which MySQL? by itsdapead · · Score: 4, Interesting

    MySQL is modular - pick'n'mix data storage engines sharing a SQL front end. I can't find the bit in TFA that says which one they compared.

    I've always suspected that most MySQL vs Postgres flame wars are based on comparing Postgres with the speed of MySQL/MyISAM (No transactions or relational integrity checks - so, big surprise, dead fast for simple queries) and then waving MySQL/InnoDB around when the functionality issue is raised.

    MySQL/MyISAM hits the speed/functionality sweet spot for LAMP data-driven websites, is supported by lots of free webapp software and offered by most decent web hosting services. Comparing it speedwise with Postgres has always been pointless, though. If Postgres has caught up, colour me impressed, but if they're pro-Postgres I bet they're comparing with MySQL/InnoDB (which is a bit closer to like-with-like).

    Never quite seen the point of MySQL/InnoDB really - all the advantages of MySQL/MyISAM minus the speed, support by popular webapps, availbility on low-cost hosts... and still lacks the features of Postgres.

    --
    In a survey of 100 programmers, 111111 thought that duck-typing was a good idea.
  7. Re:on the playground... by Fozzyuw · · Score: 2, Interesting

    Intelligent people use SQLite (or Berkeley DB) for apps that don't require a heavyweight DBMS.

    Databases are even being used in many situations that would be better addressed using flat files.

    Now there's a good point. Is there any good documentation out there on databasing on what is the best solution for what kind of problem? Particularly, I would like to see something between flat-files and DBMS and then between different DBMS.

    Though, technically, I'm stuck between whatever our 'host' is giving us (currently PostgreSQL, but usually MySQL, and I dabbled with MSSQL when an intern for the Government), I'd like to know when it might be better to actually just store some stuff in a flat file or even use an XML DB (which one collage professor loved to talked about as a holy grail).

    Outside of "Basic Database Theory and Generic SQL" books, a programming class on Databases (how to build tree structures and navigate them, etc), I've never really got any 411 on the benefits of using MSSQL, Oracal, MySQL, PostgreSQL, etc and I've never gotten the change to learn more about advanced database features such as stored procedures or functions or whatever they might be called and what they're good for.

    Recommended books, anyone?

    Cheers,
    Fozzy

    p.s. I work mostly with internet applications.

    --
    "The past was erased, the erasure was forgotten, the lie became truth." ~1984 George Orwell
  8. Re:Mod parent way up! by Doctor+Memory · · Score: 5, Interesting

    Inserting 20 million rows, all simple inserts, only one primary key (int) with autoincrement for mysql and a sequence for postgres:
    Avg Mysql time per 1000 inserts: 3 seconds
    Avg Postgres time per 1000 inserts: 15 seconds (and gets worse over time) OK, now do a seven-table join, including a self-join with a correlated subquery (MySQL does those now, right?). I think everybody knows by now that MySQL is pretty much untouchable as long as all you're doing is simple single-table stuff. Kind of like comparing a pickup truck to a moving van: if all you're doing is moving a couple of boxes around, then the pickup kicks. But when you need to move serious loads, then it's the pickup that gets to sit by the curb...
    --
    Just junk food for thought...
  9. Re:Mod parent way up! by jaredmauch · · Score: 5, Interesting
    AC, sorry, but I have a postgres install working where I get 70k inserts/second or more with a single index on the table during the day. The first insert of course is faster as the index doesn't exist yet. I'm not sure what you're doing, but I can tell you that I have tuned postgres by increasing some simple parameters. If you're using some Linux package, you're likely not seeing the benefits that are possible by stuff like changing the block size parameter in the source. Yes, it's kinda lame you have to do this, but at the same time, it's not too unreasonable. I'd like it if I could set this larger.

    This is on 'decent' hardware running Solaris 10 (amd64). Obviously you need to tweak stuff like wal size, checkpoints, etc.. But getting this type of performance is not hard to do. I can scan an hours worth of data in a short amount of time. Each one of these 'hourly' tables contains roughly 30-32M rows. this is nothing to sneeze at from what I can tell. I haven't had a reason to re-evaluate mysql to see if there are enough tweaks to make it perform similarly, but if you're getting the crappy insert rate that you're talking about, you clearly need to change something as you're doing it wrong if you truly care about performance. E-Mail me if you're interested in my postgresql config files. I'm happy to share to minimize the FUD out there.

  10. I once did benchmarking by Stinking+Pig · · Score: 5, Interesting

    I worked for a company whose product ran on MS-SQL, PostgreSQL, and Oracle. Should I explain why we didn't support MySQL or not? It'll draw fanboys either way. I used the same server, reinstalled the OS (Red Hat Enterprise 3 or Windows 2000) and database between each test, and rebuilt the application server to be extra sure.

    Since it was more difficult to write Oracle-compliant SQL and we didn't have a lot of Oracle customers, the developers didn't care to spend time on it, and our stuff ran about 20 percent slower there. That's after a lot of tuning time, it was 50% slower on a default install. Oracle 9 took two days to install and tune, plus another two days of preparation. I was particularly underwhelmed that I had to deal with stupid errors like tarballs that extracted onto themselves and assumptions about the shell being used. At the time, Oracle was a very Solaris-like experience; user-unfriendly to the extreme.

    Postgresql 7 ran great; it was neck and neck with MS-SQL in all tests, after proper tuning, and 30 percent slower on a default install. Postgres took half a day to install and tune, but it took me a week and conversation with the postgres mailing lists to find out what needed tuning. Still, we were able to put together a document that took users from bare metal to RHEL+Postgres in four hours if they had all their media handy.

    Microsoft SQL Server 2000 ran great with no tuning at all, and took fifteen minutes to install. It also cost as much as paying me to do the entire set of tests. OS installation/patching times and tested workloads were the same for all three tests.

    YMMV.

    --
    "Nothing was broken, and it's been fixed." -- Jon Carroll
  11. Version 8.1 :-) by Ayanami+Rei · · Score: 2, Interesting

    8.1 automatically does table maintenance (vacuum, stats, etc.)
    http://www.postgresql.org/docs/8.2/static/routine- vacuuming.html#AUTOVACUUM

    You can tune it as to when to do such operations, or what max percent of normal I/O can be used for those tasks.

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
  12. Re:Mod parent way up! by Gorshkov · · Score: 2, Interesting

    If you need one of Sun's key luminaries in order to get that level of performance it is also of limited value. I don't want to have to drag Burleson or Niemiec out to my shop just to get my database to run well. I need to be able to get the database to run well on my own. My company won't spring for Niemiec's pool boy, nevermind Niemiec himself.
    You're missing the entire point of what he's saying.

    When you see benchmarks run and comparisons made between different databases that are conducted by a single person or company, all things are seldom equal. The people involved have more expertiese with this db package and less with that ... things are not set up equally well.

    For these published tests, Sun spent a lot of time optimizing postgress to make it fly .... just as the people at oracle spent a lot of time optimizing *their* database configuration for the tests that *they* publish.

    So, all things ARE equal, in relation to the database tuning & setup - both were done by experts with the respective packages, and we don't have to wonder if some yahoo screwed things up, intentionally or otherwise, by missing a basic configuration option.
  13. Re:on the playground... by Dan+Ost · · Score: 2, Interesting

    The guidelines are pretty simple, but highly subjective.

    Here are your options:
    A. flat file
    B. key-indexed (berkley db, dbm, gdbm)
    C. feature/config light db (sqlite)
    D. feature/config heavy db (postgrsql, oracle)

    * Use A for configuration
    * Use A if you have a small amount of data and can slurp it all into memory.
    * Use B if you only access data via a known unique value and never compare records.
    * Never use either A or B if the data is shared between processes unless file locking is okay.

    For anything else, use C or D. If you've got a lot of familiarity with D, sometimes it's easier to use D than C, even if D seems like overkill. My recommendation is to get familiar with C and then worry about D once you understand the limitations of C. It'll help you appreciate what D gives you when you're ready for it.

    --

    *sigh* back to work...
  14. Re:Mod parent way up! by arivanov · · Score: 4, Interesting
    Who cares if MySQL does them or not. Show me the developers that can both develop applications and do SQL. That is a dieing breed. Most developers nowdays go for a really trivial schema and an abstraction layer. At that point the only thing that matters is row speed on simple table operations and there MySQL or in-memory OO database frameworks with a simple backing store wipe the floor. This is the reality of life. And it is not going to get better. If you look at the books on the market the only book that used to teach "proper" SQL (with joins and the lot) strictly from the context of application development was the old DB2 bible. It has not been reprinted since the late 90-es. All the rest that is out there is either heavily slanted toward the app side or towards the DB side (usually the latter). Add to that the fact that many universities try to teach "real life software engineering skills" instead of proper data structure and data manipulation classes and the picture is complete: http://www.joelonsoftware.com/articles/ThePerilsof JavaSchools.html. Add to that the fact that DBDs when you actually corner them to ask something meaningfull answer with SQL technobabble like in your post. To the average developer it sounds like fortran. And if it looks like fortran, walks like fortran and talks like fortran it gotta be fortran. From the point of view of a average software engineer SQL and especially stored procedures look like a blast from the past. He expects to see objects, constructors, destructors, private and public structures. And what does he see? He sees something that looks like written by his grandparents. As a result he turns around and starts doing delete/insert/last_insert_id instead of replace and sequential deletions in software instead of foreign keys. I have tried in the past to work with developers who write commercial apps on top of SQL to optimise their code. And I have wanted to scream all along. In 95% of the cases you deal with either one of the following:
    • A nice schema designed once upon a time properly by a proper DBD that is vandalised in the application abstraction layer because the developers are are sorely pissed off by the endless wingeing of the SQL server and/or its abissmal performance. So they take the matters in their own hands and violate ACID by cashing and bypassing restrictions in the app. Sooner or later someone comes around and says - WTF, why don't we rewrite this all in software and sod off the expensive database. And surprise surprise it ends up being done in MySQL.
    • An abissmal schema or no schema at all where all restrictions are done in the app. That is MySQL country all the way.
    MySQL is a result of the way current software development is taught and done. Unless Jo the Average Developer starts understanding how to use SQL in his application (and he does not) and unless SQL data representation grows up to modern non-fortran-like OO semantics MySQL will proliferate. And if you think that MySQL is bad think twice. There are the object persistence frameworks and in-memory crap that follow in its wake.
    --
    Baker's Law: Misery no longer loves company. Nowadays it insists on it
    http://www.sigsegv.cx/