First "Real" Benchmark for PostgreSQL
anticlimate writes "A new benchmark published on SPEC shows PostgreSQL's performance approaching that of Oracle's and surpassing or on par with MySQL (however the test-hardwares of the other DB systems are somewhat different). The test was put together by PostgreSQL's core developers working at Sun. They certainly are not unbiased, but this is the first 'real' benchmark with PostgreSQL — according to Josh Berkus's blog. The main difference compared to earlier benchmarks (and anecdotes) seems to be the tuning of PostgreSQL."
however the test-hardwares of the other DB systems are somewhat different
Which makes the results pretty much useless. But, being the intrepid slashdotter I am, I went ahead and R'ed the FA anyway, in case I could glean some useful information from it.
Which revealed that the linked article doesn't actually contain any information whatsoever about Oracle* or MySQL, much less benchmarks on named hardware.
So...what am I supposed to get out of this, again? Or is this just supposed to be some kind of PostgreSQL love-in, so I should take my wet blanket elsewhere?
*Well, the second link contains someone claiming that Oracle is only 15% faster...but without providing any actual data.
Reality has a conservative bias: it conserves mass, energy, momentum...
To paraphrase an old saying:
There are lies, damned lies and benchmarks.
Bearded Dragon
I like yours better. The Slashdot editors need to have their balls cut off if they think the post that beat your onto the front page is better. Feel free to mod me down any time for bitching about this, but seriously, this post is SO much better than the one that made it.
I hate printers.
however the test-hardwares of the other DB systems are somewhat different
Which makes the results pretty much useless.
Not necessarily.
It's essentially useless for separating out how much of the performance difference is the result of the software's design, implementation, and tuning versus how much is due to the platform differences.
But such tests CAN be used to examine the performance of competing ENTIRE SYSTEMS, to inform choices between them.
They say: "Oracle on does THIS well, PostgreSQL on can be tuned so it does THAT well on the same benchmark."
This lets administrators (presuming they have access to the hardware info) get a bang-for-the-buck comparison.
For the rest of us, the interesting point is that PostgreSQL, running on its team's idea of realistic hardware, can produce performance in the same ballpark as Oracle running on Oracle's choice of hardware.
(Whether the necessary remaining data (what are hardwares x and y? how was PostgreSQL tunde) is published now, later, or never, is a separate issue. B-) )
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
You cannot compare benchmarks without SOMETHING standard between them.
The thing that's standard is the benchmarking software.
If I were to buy a database server, do I really care which component of the solution is providing me with the great performance, or do I just want the performance? At the end of the day the only thing that really matters is the performance that comes out of the box.
It doesn't really matter if "Postgresql" is faster than "MySQL", because they always run on a certain physical computer. What matters is "I need to accomplish X,Y and Z. I have A dollars to spend. Which solutions accomplishes X, Y and Z the best within my budget? You can't separate the software from the hardware and get an answer that's very meaningful.
This benchmark isn't the last word on anything. Even a benchmark run on the exact same hardware means very little if you have a 2 core machine instead of 8.
AccountKiller
Oh, I agree. A benchmark of whole systems can be just as (or more) useful as a benchmark of individual pieces of software, depending on what your goals are.
But what's been presented here isn't even that. Links #1 takes us to a SPEC benchmark of PostgreSQL. It doesn't provide any information about anything else; there isn't anything to compare the benchmark to. Link #2 provides an unreferenced statement about Oracle's marginally superior performance on much more expensive equipment.
So, perhaps, one can begin to draw conclusions about PostgreSQL vs Oracle in the contexts of full systems. But neither link #1 nor link #2 provide any information about MySQL (except the quote: "[t]his publication shows that a properly tuned PostgreSQL is not only as fast or faster than MySQL").
Really, my criticism isn't of the benchmark (the data are the data, after all) or of the blog (one expects a vested PostgreSQL interest to comment on such a benchmark), but of the blurb here that either a) draws totally unwarranted conclusions, or b) depends on information it doesn't bother sharing.
Reality has a conservative bias: it conserves mass, energy, momentum...
They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
"Dolphin" also conveys "fun play thing" to me...
I'd prefer the elephant that never forgets.
Actually I think PostgreSQL might have eaten a lot more of the MySQL market if they'd simply been faster to market with better admin tools and Windows support.
Lots of folks went with MySQL early because of those factors. They also then tended to write all their PHP, etc, applications to only talk to MySQL, thus making folks who might have preferred PostgreSQL use MySQL to run the app that they needed to run. Once that happens you are kind of in a Catch-22 place. Folks won't write the apps for PostgreSQL until it's used by a larger chunk of the market, but it won't take that large chunk because all the 'cool' apps were written MySQL only, so they have to run MySQL instead of PostgreSQL
PostgreSQL's command-line shell is far better than mysql. In addition to what mysql does, psql supports tab-completion and other nice things. It's been a while since I looked at mysql, but when I did the difference between the two respective command-line interfaces was like the difference between bash and sh.
2) Extensive developer community. We use python and the MySQL/python integration is great. We have a few UDFs that are home-grown but some of them were just downloaded off the net and installed. I'm sure you can find far more for MySQL than for Postgres.
Uh-huh. You are "sure". Several times I asked for support on postgresql mailing lists and the response has always been excellent. Usually I got answers within hours.
I think one of the reasons that mysql became ubiquitous is that it had proper windows support early on. So, just like windows, everyone uses mysql because everyone else does, and they are willing to jump through hoops to work around all the deficiencies the platform has, simply because they don't know any better.
___
If you think big enough, you'll never have to do it.
That is the problem - it does not in real life. Application works in developer hands, goes out in the field and breaks (seen that one time too many). Millions if not billions of dollars have been put to make sure that RDBMS transactions are atomic and preserve data integrity. No application level interface abstraction has ever afforded the expense and could ever afford the expense to do that. In every single instance I have looked at application developers replacing SQL ACID with "bake-their-own" system I have found cases of data integrity violations. In modern multithreaded (or web server based) apps the most common result from this is race conditions which are probably the hardest to debug problem in software.
The other common problem in using application level abstractions is performance. Once again - works in developer hands, goes in the field, gets real data loaded in it and all hell breaks lose. Similar reasons to ACID as the next biggest investment after data integrity in a database is in its ability to fine-grain lock data objects. If a developer tries to replace RDBMS locking in the application layer, he usually ends up with higher granularity lock that is more contended. In addition to that to avoid race conditions, developers usually deliberately create a bottleneck by muxing all RDBMs access to a single thread and a single access point to simplify locking. In fact probably one of the most beneficial uses of MySQL is its ability to support server-based fine-grained locks that are not tied to a specific data object. You can use these in global semaphors and global locking even in cases where POSIX locks do not work (f.e. across clusters).
Overall, yeah, MySQL and your app "already works". For Proof of Concept - maybe (in fact I use it myself). For real stuff - no, not really, unless you put a lot of work in the application layer. I have done that on quite a few occasions and the performance gains can be staggering compared to ACIDising your brain with a proper RDBMS, but the effort is hardly worth it in most real life scenarios. It also makes it considerably less maintainable.
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/