First "Real" Benchmark for PostgreSQL
anticlimate writes "A new benchmark published on SPEC shows PostgreSQL's performance approaching that of Oracle's and surpassing or on par with MySQL (however the test-hardwares of the other DB systems are somewhat different). The test was put together by PostgreSQL's core developers working at Sun. They certainly are not unbiased, but this is the first 'real' benchmark with PostgreSQL — according to Josh Berkus's blog. The main difference compared to earlier benchmarks (and anecdotes) seems to be the tuning of PostgreSQL."
however the test-hardwares of the other DB systems are somewhat different
Which makes the results pretty much useless. But, being the intrepid slashdotter I am, I went ahead and R'ed the FA anyway, in case I could glean some useful information from it.
Which revealed that the linked article doesn't actually contain any information whatsoever about Oracle* or MySQL, much less benchmarks on named hardware.
So...what am I supposed to get out of this, again? Or is this just supposed to be some kind of PostgreSQL love-in, so I should take my wet blanket elsewhere?
*Well, the second link contains someone claiming that Oracle is only 15% faster...but without providing any actual data.
Reality has a conservative bias: it conserves mass, energy, momentum...
------
i o o
agree >
wit -------
da
op oOo g0d oOoOo4
They're all shitty.
boys use mysql
men use PostgreSQL,
and _____ use MSSQL.
flame on... =)
Website Hosting
You cannot compare benchmarks without SOMETHING standard between them.
Okay, if they can't match the hardware (why not?) then focus on price points. I notice that they're looking at "$65,500 for the hardware". That's a LOT of hardware at today's prices.
I'm sure MySQL would (and will) come back with a "benchmark" on hardware costing $10,000.
There is nothing "real" about this "benchmark".
Because Sun systems will always be different from the x86 based cores that run MySQL and Oracle, I think the best way to compare such software would be by constructing servers of equal price and seeing how PostgreSQL fares. The true question on any business person's mind is "how much to implement?"
Well, back to rejecting software patent applications.
Uh... WTF?
To paraphrase an old saying:
There are lies, damned lies and benchmarks.
Bearded Dragon
Why this emaciated post made it while mine didn't I'll never know...here's how I submitted this story:
The current version of PostgreSQL now has its first real benchmark, a SPECjAppServer2004 submission from Sun Microsystems. The results required substantial tuning of many performance-related PostgreSQL parameters, some of which are set to extremely low values in the default configuration — a known issue that contributes to why many untuned PostgreSQL installations appear sluggish compared to its rivals. The speed result is close but slightly faster than an earlier Sun submission using MySQL 5 (with enough hardware differences to make a direct comparison of those results unfair), and comes close to keeping up with Oracle on similarly priced hardware — but with a large software savings. Having a published result on the level playing field of an industry-standard benchmark like SPECjAppServer2004, with documentation on all the tuning required to reach that performance level, should make PostgreSQL an easier sell to corporate customers who are wary of adopting open-source applications for their critical databases.
For those of us who don't have dozens of hours to do the necessary research, can some postgresql gurus sum up some of the most significant tuning parameters so us mere mortals can see similar performance gains?
I realize that a large part of the answer is going be "it depends on application, your hardware, and you query types", but surely there must be some general tips that we can follow given various typical setups. MySQL, for example, ships with several different configuration files: One suitable for a small installation, one for a mid-sized installation, one for large installation, etc.
What tuning can someone do to tune postgresql's default (conservative) config file to make it perform better?
The plural of hardware is hardware, not hardwares.
Does slashdot not have a "check spelling" feature on the submission page?
I think that somebody sent the wrong link and (surprise!) the editors didn't even follow it to check.
Here's a more useful one: All SPEC jAppServer2004 Results Published by SPEC
The benchmarks aren't standardized enough for any useful comparison. The hardware and configurations vary in almost every one.
however the test-hardwares of the other DB systems are somewhat different
Which makes the results pretty much useless.
Not necessarily.
It's essentially useless for separating out how much of the performance difference is the result of the software's design, implementation, and tuning versus how much is due to the platform differences.
But such tests CAN be used to examine the performance of competing ENTIRE SYSTEMS, to inform choices between them.
They say: "Oracle on does THIS well, PostgreSQL on can be tuned so it does THAT well on the same benchmark."
This lets administrators (presuming they have access to the hardware info) get a bang-for-the-buck comparison.
For the rest of us, the interesting point is that PostgreSQL, running on its team's idea of realistic hardware, can produce performance in the same ballpark as Oracle running on Oracle's choice of hardware.
(Whether the necessary remaining data (what are hardwares x and y? how was PostgreSQL tunde) is published now, later, or never, is a separate issue. B-) )
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
That should have read:
They say: "Oracle on {hardware x} does THIS well, PostgreSQL on {hardware y} can be tuned so it does THAT well on the same benchmark."
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
SQLite and Postgres are current my choices. What I really want are language bindings that preserve the structures, syntax and semantics of the language I'm using without the performance penalty of abstraction layers.
Has anyone used the PostgreSQL open source to refactor the DB to support just a subset of SQL and features (the most popular stuff that eg. "LAMP" uses), then benchmarked it vs the default distro, to show higher performance?
For that matter, has anyone merged any open source Java server container with PostgreSQL for higher performance of that use case, in an integrated architecture without network and other overhead for messages, and more atomic transactions?
--
make install -not war
Get people from each group, give them the requirements and 5 different dollar amounts.
:)
Let each team setup their systems how ever they want at each price point. Some will go with clustered servers. Some will go a single monster server. They know their products best so they'll be the ones best suited to choosing the configuration.
Then run the benchmarks. And keep hammering on them until AFTER the next patch release.
Yeah, it might run fast, but still be a bitch to patch/upgrade.
At $5,000 you might find that a cluster of MySQL boxes beats everything.
At $10,000 maybe something else is best.
$25,000
$50,000
$100,000
etc.
And finally, break it. Break it bad. What happens when something goes wrong? Oracle might cost a lot, but if they can come through with your data they might just be worth it.
If nothing else, you'll get the "best practices" nicely demonstrated by each group.
SQLite is faster than Postgres for many types of workloads.
http://www.spec.org/jAppServer2004/results/jAppSer ver2004.html contains all the results. Unfortunately the different hardware configurations make it rather hard to draw any conclusions. Which begs the question, how did the submiter knew that these specific guys where biased or not? From what I can see, the whole setup is inherently biased.
MySQL 5 on AMD Opteron 285
The UltraSPARC has 8 cores on 1 chip and 16GB of memory.
The Opteron has 4 cores and 8GB of memory.
The UltraSPARC should smoke it every time.
They certainly are not unbiased
I guess that's a deal breake right there, no?
The test was put together by PostgreSQL core developers at Sun. Didn't we agree earlier, when talking about Intel/AMD benchmarks, that vendor supplied tests are wildly inaccurate?
PostgreSQL should concentrate on more developer tools and better marketing. The "it's got a ton of features you don't need" on a cryptic site doesn't help its cause.
People use MySQL because there's a wide support and lots of dev tools for it, and because the kind of people going for MySQL needs to do simple selects and inserts most of the time.
As I recall, Oracle's choice of hardware consists of a very large chequebook and a stamp with your signature on it.
Won't you guys agree, "elephant" doesn't exactly communicate "fast and modern" very well.
"Dolphin" comes a bit closer.
Who's coming up with those logos?
Oh, I agree. A benchmark of whole systems can be just as (or more) useful as a benchmark of individual pieces of software, depending on what your goals are.
But what's been presented here isn't even that. Links #1 takes us to a SPEC benchmark of PostgreSQL. It doesn't provide any information about anything else; there isn't anything to compare the benchmark to. Link #2 provides an unreferenced statement about Oracle's marginally superior performance on much more expensive equipment.
So, perhaps, one can begin to draw conclusions about PostgreSQL vs Oracle in the contexts of full systems. But neither link #1 nor link #2 provide any information about MySQL (except the quote: "[t]his publication shows that a properly tuned PostgreSQL is not only as fast or faster than MySQL").
Really, my criticism isn't of the benchmark (the data are the data, after all) or of the blog (one expects a vested PostgreSQL interest to comment on such a benchmark), but of the blurb here that either a) draws totally unwarranted conclusions, or b) depends on information it doesn't bother sharing.
Reality has a conservative bias: it conserves mass, energy, momentum...
Performance isn't what causes a lack of acceptance in the marketplace for PostgreSQL.
The problem is twofold:
MySQL, as others have pointed out, has better developer support and they know their target audience. They supply a fast, easy to use database for those who don't need a whole lot.
Oracle supplies an enterprise level database that MySQL doesn't aspire to. PostgreSQL doesn't know where to fit in.
Do a little investigation on setting up PostgreSQL with fault tolerance and replication and you'll quickly see why large corporations cough up money for Oracle. Performance is one aspect of the price tag, but it is certainly not the only factor.
If you want to setup a dedicated database server, you want to know what software with what hardware will run the fastest. So while the benchmarks may not be useful to people wanting to setup a small multi-purpose server, it can still be useful for some people.
No question of bias there...
If you want news from today, you have to come back tomorrow.
They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
Led me to...
http://blogs.digitar.com/media/2/T2000_Experience
where a UltraSPARC T1 customer states that MySQL 5 is up to 13.5 times faster than MySQL 5 on an AMD Opteron.
The results of the original article's data were as follows:
Postresql 8.2 on UltraSPARC T1 - 778.14
MySLQ 5 on AMD Opteron - 720.56
So using my blatantly biased (yet at the same time factual) numbers, I correct the MySQL performance number to reflect what it would be on an UltraSPARC T1: 720.56 x 13.5 = 9727.56
9727.56 > 778.15
MySQL wins!
From the link in the article you can find the results, or just look here.
Hardware, kernel changes, PostgreSQL configuration, etc.
Its all on that page.
MySQL is modular - pick'n'mix data storage engines sharing a SQL front end. I can't find the bit in TFA that says which one they compared.
I've always suspected that most MySQL vs Postgres flame wars are based on comparing Postgres with the speed of MySQL/MyISAM (No transactions or relational integrity checks - so, big surprise, dead fast for simple queries) and then waving MySQL/InnoDB around when the functionality issue is raised.
MySQL/MyISAM hits the speed/functionality sweet spot for LAMP data-driven websites, is supported by lots of free webapp software and offered by most decent web hosting services. Comparing it speedwise with Postgres has always been pointless, though. If Postgres has caught up, colour me impressed, but if they're pro-Postgres I bet they're comparing with MySQL/InnoDB (which is a bit closer to like-with-like).
Never quite seen the point of MySQL/InnoDB really - all the advantages of MySQL/MyISAM minus the speed, support by popular webapps, availbility on low-cost hosts... and still lacks the features of Postgres.
In a survey of 100 programmers, 111111 thought that duck-typing was a good idea.
If I have been able to see further than others, it is because I bought a pair of binoculars.
It's like getting married. "How much money do you have?" "Well, that's what it costs."
I have used Postgres in two different production environments over the past 6 years, but they have both been environments where it was possible to effectively take the system down for a little while each night while a vacuum is done. The vacuums seem to be quite necessary, as without them, performance slowly and steadily degrades as the trees get funky and the tables grow larger and larger with unused update holes in them. It would probably be possible to hold off and do them weekly rather than daily, but the point is that they seem to be necessary, and they seem to take the system down while they're running. Between vacuums, the database is fast.
This leads me to believe that Postgres isn't suitable as part of a system that needs to be running 24/7, such as an online order processor. What am I missing?
I worked for a company whose product ran on MS-SQL, PostgreSQL, and Oracle. Should I explain why we didn't support MySQL or not? It'll draw fanboys either way. I used the same server, reinstalled the OS (Red Hat Enterprise 3 or Windows 2000) and database between each test, and rebuilt the application server to be extra sure.
Since it was more difficult to write Oracle-compliant SQL and we didn't have a lot of Oracle customers, the developers didn't care to spend time on it, and our stuff ran about 20 percent slower there. That's after a lot of tuning time, it was 50% slower on a default install. Oracle 9 took two days to install and tune, plus another two days of preparation. I was particularly underwhelmed that I had to deal with stupid errors like tarballs that extracted onto themselves and assumptions about the shell being used. At the time, Oracle was a very Solaris-like experience; user-unfriendly to the extreme.
Postgresql 7 ran great; it was neck and neck with MS-SQL in all tests, after proper tuning, and 30 percent slower on a default install. Postgres took half a day to install and tune, but it took me a week and conversation with the postgres mailing lists to find out what needed tuning. Still, we were able to put together a document that took users from bare metal to RHEL+Postgres in four hours if they had all their media handy.
Microsoft SQL Server 2000 ran great with no tuning at all, and took fifteen minutes to install. It also cost as much as paying me to do the entire set of tests. OS installation/patching times and tested workloads were the same for all three tests.
YMMV.
"Nothing was broken, and it's been fixed." -- Jon Carroll
Back when mysql was faster than postgres for simple selects, mysql proponents were jumping up and down about performance. And, in fact, a very narrow definition of performance, with only simple reads. No writes. Single CPU. One client. You can't step too far away from those parameters because then mysql performance starts to suck.
Now that there is a respectable benchmark showing postgresql to be faster, performance is suddenly not the issue.
Yep. It's right up there with mysql developers claiming that transactions are useless back when mysql didn't have them.
___
If you think big enough, you'll never have to do it.
I am more interested in how it handles an axe wielding maniac going postal in the datacenter, or what would happen of a mile wide tornado struck town, or what would happen if there's a bug, or if the project becomes wildly successful and does 10x or 100x the processing.
A Pirate and a Puritan look the same on a balance sheet.
8.1 automatically does table maintenance (vacuum, stats, etc.)- vacuuming.html#AUTOVACUUM
http://www.postgresql.org/docs/8.2/static/routine
You can tune it as to when to do such operations, or what max percent of normal I/O can be used for those tasks.
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
Not necessarily.
Oracle will happily run on the cheap crap wintel servers that Postgres will. It will even do so under the typical OS of choice for running Postgres on cheap crap wintel servers.
Oracle will also theoretically run on some 64 cpu monster (from Sun, IBM or SGI).
A Pirate and a Puritan look the same on a balance sheet.
I thought Sun paid Pervasive for their distribution in Solaris 10. Has this changed or by "core developers" do we really mean "people who commit the pkgs to the WOS" or "people who actually work at Sun who also just happen to work on Pg in their free time"?
PostgresSQL: 778.14
s 2006q4/jAppServer2004-20061121-00046.html
Oracle 10gR2: 6,812.79
Real close, I agree.. Postgres had a 40GB cache, Oracle used 21GB
See http://www.spec.org/osg/jAppServer2004/results/re
If you mod me down, I *will* introduce you to my sister!
That is all.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
I have heard from several different sources that postgres is much better than mysql etc, and all you have to do is to properly tune it. I have spent days converting databases, and trying to tune the parameters and queries. I have examined the query plans, and they seem to be pretty optimal to me, but still - I can't seem to beat MySQL.
So, why is it so hard to tune postgres? Would it be possible to make it autotune some parameters? Is there any outstanding tutorials out there somewhere were I can learn all this magic (yes, I have looked high and low but all I have found seem pretty basic)?
Pretty much, it does.
Running on an Xbox versus an IPX, hmm. With no data, you assume the worst case scenario because any given variable is UNKNOWN. RAM, Processor, disk IO?
Correction, it certainly does.
Often wrong but never in doubt.
I am Jack9.
Everyone knows me.
PostgreSQL setup: 8 cores, 16GB RAM.
Oracle setup: 64 cores, 512GB RAM.
- Sw Usr
Are you adequate?
I think he is reffering to larry's insane licensing polices, such as dual core = 2 cpu's
If you mod me down, I will become more powerful than you can imagine....
Ooops, sorry, 1133 is the comparable result.
More than 50% higher performance.
If you mod me down, I *will* introduce you to my sister!
And, my personal favourite, "Oh, you bought all of your servers under our old licensing plan? Well we just changed it so please give us more money."
Well, for people who consider prices I guess it isn't since the oracle system cost more (exclusive software cost), thought then the choice of systems might have been "optimized" aswell.
I never needed a 7 table join in my life. If you can't design a database without having to do a 7 table join then you need to get a different occupation because you are fucking up the system everyone else has to work on after you.
Have a nice day.
Try following the HTML link - http://www.spec.org/jAppServer2004/results/res2007 q3/jAppServer2004-20070606-00065.html
Is more indepth on the systems and tuning parameters used.
I think this sounds pretty obvious, probably has been done:
Have an open source suite that allows anybody to verify the test results. Let the experts tweak the configuration for each database. Provide free downloads to databases and/or easy install. This way entire community is tweaking the configurations for different servers and posting back results.
Not only this would be a more or less "proper" benchmark, but also the secrets of db tuning, if there are any, will be revealed to public. Vendors get to brag if they are in top3, rest get a nice motivation to improve.
Just my $2,000,000
Lone Gunmen crew.
This is the best post I have read in ages. I am old school. I like DB's and when doing requirements gathering am already thinking in terms of tables and relational schema. In a really well designed relational schema it is possible to do very complex operations without ever resorting to plsql/transact etc. simply using plain sql (insert/update etc). I hate cursors/recordsets. As soon as I see these constructs in the db I generally find a VB/java programmer messing where he is not wanted.
PS: Oracle is not Fortran - PL/SQL is derived from Ada and therefore shares much of the Pascal/Modula feel.
http://blogs.sun.com/tomdaly/
At first, many situations simply don't need a complex database schema. Think of a contact or email list in a simple web application. Such applications often need only one table. And therefore MySQL suffices. I would not call that bad design. Moreover, don't underestimate MySQL! Recent versions are not so much behind PostgreSQL and they perform reasonably well.
It's true that the SQL knowledge of many developers is quite limited. But for this reason, and, more important, due to the bad compatibility of SQL and OO design, ORM libraries such as Java Hibernate or Python's SQL Alchemy were developed that handle the SQL part. In this case we still need a decent database schema, but it may be described in the framework, which looks more familiar to developers. However, don't underestimate the generated SQL code! Complicated SQL statements may be generated and ORM libaries may make use of referential integrity, check constraints etc. and many other features a mature database has to offer. If designed right, a ORM-based design will have a decent database schema and will make use of many "database goodies".
ORM-generated SQL code may therefore stress the database in many ways - and therefore we still need good databases - regardless of the developer's SQL knowledge.
One pitfall we always seem to step into is that in case a good, new technology is invented, we think of it as a solution to all our problems. SQL is no different here, in the past we were so proud of designing normalized schemas that we forgot about the scalability. Such schemas - although perfect - are often today the reason for performance problems. Sadly, we often overreact and tend to throw away old technology and follow the new hype. And that's what we can see today, such as thoughtlessly vandalizing database schemas due to performance problems. Nevertheless, it is often appropriate to denormalize a database schema to improve the performance. If done right, I would not call this "vandalizing".
Hardware concerns are valid. But more importantly, SPEC jAppServer is not a database benchmark. It's a Java App Server benchmark with a database component. To be fair, you have to change only the database, not the app server, hardware, and everything else too. Not to say that the conclusion is wrong, but it's very weird to say anything about database performance in this circumstance.
In our tests, which are database-only and data-warehouse focused, PG seems to be 3x-5x slower than Oracle on the exact same box, as it does not do data compression and is not as CPU-efficient. Which is quite respectable given the price difference, but may be a big enough gap to keep people who have money to spend coming back to commercial databases.
Obviously the vendor puts together the best submission they can for their own product. Oracle did the same for theirs.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....