First "Real" Benchmark for PostgreSQL
anticlimate writes "A new benchmark published on SPEC shows PostgreSQL's performance approaching that of Oracle's and surpassing or on par with MySQL (however the test-hardwares of the other DB systems are somewhat different). The test was put together by PostgreSQL's core developers working at Sun. They certainly are not unbiased, but this is the first 'real' benchmark with PostgreSQL — according to Josh Berkus's blog. The main difference compared to earlier benchmarks (and anecdotes) seems to be the tuning of PostgreSQL."
however the test-hardwares of the other DB systems are somewhat different
Which makes the results pretty much useless. But, being the intrepid slashdotter I am, I went ahead and R'ed the FA anyway, in case I could glean some useful information from it.
Which revealed that the linked article doesn't actually contain any information whatsoever about Oracle* or MySQL, much less benchmarks on named hardware.
So...what am I supposed to get out of this, again? Or is this just supposed to be some kind of PostgreSQL love-in, so I should take my wet blanket elsewhere?
*Well, the second link contains someone claiming that Oracle is only 15% faster...but without providing any actual data.
Reality has a conservative bias: it conserves mass, energy, momentum...
You cannot compare benchmarks without SOMETHING standard between them.
Okay, if they can't match the hardware (why not?) then focus on price points. I notice that they're looking at "$65,500 for the hardware". That's a LOT of hardware at today's prices.
I'm sure MySQL would (and will) come back with a "benchmark" on hardware costing $10,000.
There is nothing "real" about this "benchmark".
Because Sun systems will always be different from the x86 based cores that run MySQL and Oracle, I think the best way to compare such software would be by constructing servers of equal price and seeing how PostgreSQL fares. The true question on any business person's mind is "how much to implement?"
Well, back to rejecting software patent applications.
men use PostgreSQL,
and _____ use MSSQL.
flame on... =) flaming is ok... but stupid. and even more stupid is people who can code only for one db platform.
Why UNIX?
Yeah, why the hell is this tag (and others like it) showing up today? It's not coming from jokes in the comments. So what's up with these useless tags?
No, that link you posted to a web comic we've all seen a hundred times is not "obligatory."
wussy bedwetter MSFT fanbois?
My blog
To paraphrase an old saying:
There are lies, damned lies and benchmarks.
Bearded Dragon
and real DBA's use informix..
like BSD.. no it's not dead yet.
What about IBM's DB? I hear it has 30% of the market however i have never seen it in action!
Why this emaciated post made it while mine didn't I'll never know...here's how I submitted this story:
The current version of PostgreSQL now has its first real benchmark, a SPECjAppServer2004 submission from Sun Microsystems. The results required substantial tuning of many performance-related PostgreSQL parameters, some of which are set to extremely low values in the default configuration — a known issue that contributes to why many untuned PostgreSQL installations appear sluggish compared to its rivals. The speed result is close but slightly faster than an earlier Sun submission using MySQL 5 (with enough hardware differences to make a direct comparison of those results unfair), and comes close to keeping up with Oracle on similarly priced hardware — but with a large software savings. Having a published result on the level playing field of an industry-standard benchmark like SPECjAppServer2004, with documentation on all the tuning required to reach that performance level, should make PostgreSQL an easier sell to corporate customers who are wary of adopting open-source applications for their critical databases.
No, that link you posted to a web comic we've all seen a hundred times is not "obligatory."
For those of us who don't have dozens of hours to do the necessary research, can some postgresql gurus sum up some of the most significant tuning parameters so us mere mortals can see similar performance gains?
I realize that a large part of the answer is going be "it depends on application, your hardware, and you query types", but surely there must be some general tips that we can follow given various typical setups. MySQL, for example, ships with several different configuration files: One suitable for a small installation, one for a mid-sized installation, one for large installation, etc.
What tuning can someone do to tune postgresql's default (conservative) config file to make it perform better?
DB2? It's only useable on large mainframes (big iron, so to speak) from what I understand. Generally speaking only the REALLY large shops would use it, so I wouldn't be surprised you'd never seen it - neither have I. We're a pretty big organisation where I work, and we have a mix of Oracle, MSSQL, and Sybase servers.
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
I think that somebody sent the wrong link and (surprise!) the editors didn't even follow it to check.
Here's a more useful one: All SPEC jAppServer2004 Results Published by SPEC
The benchmarks aren't standardized enough for any useful comparison. The hardware and configurations vary in almost every one.
however the test-hardwares of the other DB systems are somewhat different
Which makes the results pretty much useless.
Not necessarily.
It's essentially useless for separating out how much of the performance difference is the result of the software's design, implementation, and tuning versus how much is due to the platform differences.
But such tests CAN be used to examine the performance of competing ENTIRE SYSTEMS, to inform choices between them.
They say: "Oracle on does THIS well, PostgreSQL on can be tuned so it does THAT well on the same benchmark."
This lets administrators (presuming they have access to the hardware info) get a bang-for-the-buck comparison.
For the rest of us, the interesting point is that PostgreSQL, running on its team's idea of realistic hardware, can produce performance in the same ballpark as Oracle running on Oracle's choice of hardware.
(Whether the necessary remaining data (what are hardwares x and y? how was PostgreSQL tunde) is published now, later, or never, is a separate issue. B-) )
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
I dunno, I kinda like MSSQL. Hell, I use it alongside MySQL servers for my own projects (that, and having support for multiple platforms in your product is kinda a good idea). Sure, it's got horrific licensing (nowhere near as bad as Oracle's, though) but other than that, it's pretty good and reliable. I get the impression that the core of it wasn't written by MS way back when, though. And it sure wasn't built by the Windows team.
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
That should have read:
They say: "Oracle on {hardware x} does THIS well, PostgreSQL on {hardware y} can be tuned so it does THAT well on the same benchmark."
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Get people from each group, give them the requirements and 5 different dollar amounts.
:)
Let each team setup their systems how ever they want at each price point. Some will go with clustered servers. Some will go a single monster server. They know their products best so they'll be the ones best suited to choosing the configuration.
Then run the benchmarks. And keep hammering on them until AFTER the next patch release.
Yeah, it might run fast, but still be a bitch to patch/upgrade.
At $5,000 you might find that a cluster of MySQL boxes beats everything.
At $10,000 maybe something else is best.
$25,000
$50,000
$100,000
etc.
And finally, break it. Break it bad. What happens when something goes wrong? Oracle might cost a lot, but if they can come through with your data they might just be worth it.
If nothing else, you'll get the "best practices" nicely demonstrated by each group.
big iron?
I have something in common with Stephen Hawking...
http://www.spec.org/jAppServer2004/results/jAppSer ver2004.html contains all the results. Unfortunately the different hardware configurations make it rather hard to draw any conclusions. Which begs the question, how did the submiter knew that these specific guys where biased or not? From what I can see, the whole setup is inherently biased.
Um, no. DB2 these days runs on most major UNIX variants (HP-UX, Solaris, AIX, IRIX, etc.), Linux and Windows. It's used quite often, in fact. Most Enovia/VPM installations use DB2 backends, for instance. Modern versions use XML along with regular relational database stores and are very, very up-to-date technology-wise. Very scalable.
My blog
MySQL 5 on AMD Opteron 285
The UltraSPARC has 8 cores on 1 chip and 16GB of memory.
The Opteron has 4 cores and 8GB of memory.
The UltraSPARC should smoke it every time.
Intelligent people use SQLite (or Berkeley DB) for apps that don't require a heavyweight DBMS.
Databases are even being used in many situations that would be better addressed using flat files.
They certainly are not unbiased
I guess that's a deal breake right there, no?
The test was put together by PostgreSQL core developers at Sun. Didn't we agree earlier, when talking about Intel/AMD benchmarks, that vendor supplied tests are wildly inaccurate?
PostgreSQL should concentrate on more developer tools and better marketing. The "it's got a ton of features you don't need" on a cryptic site doesn't help its cause.
People use MySQL because there's a wide support and lots of dev tools for it, and because the kind of people going for MySQL needs to do simple selects and inserts most of the time.
nope. It's h4rdw4r3z!
My blog
As I recall, Oracle's choice of hardware consists of a very large chequebook and a stamp with your signature on it.
Won't you guys agree, "elephant" doesn't exactly communicate "fast and modern" very well.
"Dolphin" comes a bit closer.
Who's coming up with those logos?
At where I work we've got a number of clients running DB2 on x86 Linux platforms, as well as larger AIX platforms and of course the mainframe. Seems to have a lot of capabilities but is a bit more finicky than Oracle and MS SQL. It seems to like to dead lock unless the DBA has really made sure it is set up properly.
DB2 runs on distributed systems too - I'm using it on a project right now. It's not bad at all.
Website Hosting
Agree, and they're all SQL based anyway.
Oh, I agree. A benchmark of whole systems can be just as (or more) useful as a benchmark of individual pieces of software, depending on what your goals are.
But what's been presented here isn't even that. Links #1 takes us to a SPEC benchmark of PostgreSQL. It doesn't provide any information about anything else; there isn't anything to compare the benchmark to. Link #2 provides an unreferenced statement about Oracle's marginally superior performance on much more expensive equipment.
So, perhaps, one can begin to draw conclusions about PostgreSQL vs Oracle in the contexts of full systems. But neither link #1 nor link #2 provide any information about MySQL (except the quote: "[t]his publication shows that a properly tuned PostgreSQL is not only as fast or faster than MySQL").
Really, my criticism isn't of the benchmark (the data are the data, after all) or of the blog (one expects a vested PostgreSQL interest to comment on such a benchmark), but of the blurb here that either a) draws totally unwarranted conclusions, or b) depends on information it doesn't bother sharing.
Reality has a conservative bias: it conserves mass, energy, momentum...
Absolutely! Of course, they don't really live in the same solution space so it's equally true to say that PostgreSQL is much faster than SQLite for many workloads.
Dewey, what part of this looks like authorities should be involved?
Performance isn't what causes a lack of acceptance in the marketplace for PostgreSQL.
The problem is twofold:
MySQL, as others have pointed out, has better developer support and they know their target audience. They supply a fast, easy to use database for those who don't need a whole lot.
Oracle supplies an enterprise level database that MySQL doesn't aspire to. PostgreSQL doesn't know where to fit in.
Do a little investigation on setting up PostgreSQL with fault tolerance and replication and you'll quickly see why large corporations cough up money for Oracle. Performance is one aspect of the price tag, but it is certainly not the only factor.
Databases are even being used in many situations that would be better addressed using flat files. That kind of mainframe/JCL talk is not welcome here...
Website Hosting
If you want to setup a dedicated database server, you want to know what software with what hardware will run the fastest. So while the benchmarks may not be useful to people wanting to setup a small multi-purpose server, it can still be useful for some people.
then you're a good candidate to use an object-database. However, I don't know yet how well they scale. Apart from that, managing concurrent transaction might be a pain. I'm playing with db4o in my next project. Interesting database, but I will restrict it to single-user mode.
Idha khatabahum lijahiluna qalu salaman
No! gad damn it!
"h4rdw4r3" is however an acceptable alternate spelling.
They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
Led me to...
http://blogs.digitar.com/media/2/T2000_Experience
where a UltraSPARC T1 customer states that MySQL 5 is up to 13.5 times faster than MySQL 5 on an AMD Opteron.
The results of the original article's data were as follows:
Postresql 8.2 on UltraSPARC T1 - 778.14
MySLQ 5 on AMD Opteron - 720.56
So using my blatantly biased (yet at the same time factual) numbers, I correct the MySQL performance number to reflect what it would be on an UltraSPARC T1: 720.56 x 13.5 = 9727.56
9727.56 > 778.15
MySQL wins!
From the link in the article you can find the results, or just look here.
Hardware, kernel changes, PostgreSQL configuration, etc.
Its all on that page.
> The plural of hardware is hardware, not hardwares.
> Does slashdot not have a "check spelling" feature on the submission page?
You must be new here.
Windows 2000 - from the guys who brought us edlin
Ok, apparently I'm somewhat wrong there. I wasn't aware it ran nowadays on your regular desktop PCs with Linux. Though, I must admit I can't see it running very fast compared to software DESIGNED for that type of platform.
How exactly did that get modded informative if it's wrong?
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
Just try to hammer each database with 50 concurrent transactions to see which one scales better!!
please excuse my apathy
I started getting problems just rapidly F5ing a php script that used an SQLite DB. Can't say anything about PgSQL, because I couldn't figure out how to set that up.
MySQL is modular - pick'n'mix data storage engines sharing a SQL front end. I can't find the bit in TFA that says which one they compared.
I've always suspected that most MySQL vs Postgres flame wars are based on comparing Postgres with the speed of MySQL/MyISAM (No transactions or relational integrity checks - so, big surprise, dead fast for simple queries) and then waving MySQL/InnoDB around when the functionality issue is raised.
MySQL/MyISAM hits the speed/functionality sweet spot for LAMP data-driven websites, is supported by lots of free webapp software and offered by most decent web hosting services. Comparing it speedwise with Postgres has always been pointless, though. If Postgres has caught up, colour me impressed, but if they're pro-Postgres I bet they're comparing with MySQL/InnoDB (which is a bit closer to like-with-like).
Never quite seen the point of MySQL/InnoDB really - all the advantages of MySQL/MyISAM minus the speed, support by popular webapps, availbility on low-cost hosts... and still lacks the features of Postgres.
In a survey of 100 programmers, 111111 thought that duck-typing was a good idea.
If I have been able to see further than others, it is because I bought a pair of binoculars.
Now there's a good point. Is there any good documentation out there on databasing on what is the best solution for what kind of problem? Particularly, I would like to see something between flat-files and DBMS and then between different DBMS.
Though, technically, I'm stuck between whatever our 'host' is giving us (currently PostgreSQL, but usually MySQL, and I dabbled with MSSQL when an intern for the Government), I'd like to know when it might be better to actually just store some stuff in a flat file or even use an XML DB (which one collage professor loved to talked about as a holy grail).
Outside of "Basic Database Theory and Generic SQL" books, a programming class on Databases (how to build tree structures and navigate them, etc), I've never really got any 411 on the benefits of using MSSQL, Oracal, MySQL, PostgreSQL, etc and I've never gotten the change to learn more about advanced database features such as stored procedures or functions or whatever they might be called and what they're good for.
Recommended books, anyone?
Cheers,
Fozzy
p.s. I work mostly with internet applications.
"The past was erased, the erasure was forgotten, the lie became truth." ~1984 George Orwell
> Though, I must admit I can't see it running very fast compared to software DESIGNED for that type of platform.
Actually, it's probably the fastest solution out there for heavy reporting/analytical workloads on windows, linux or unix. Not sure about teradata and informix - haven't looked at them in quite a while.
Teradata, informix and db2 were doing the 'beowulf' thing years before anyone on slashdot asked what a beowulf cluster of these would be like. So, you can easily distribute data across a hundred db2 server blades to get very fast response times over terabytes of data. This can be far faster at the high-ends than the typical oracle scenario of range partitioning on a 32-way smp. And oracle grid is more failover oriented than performance oriented, so that doesn't really compete either, last I looked.
I've built and managed very large databases on oracle, db2 and sql server in the last five years. Of them all, db2 on unix was the easiest to build and manage. It did take a little extra time - since some of the tools are a little finicky, but the manageability and performance was far better than with either oracle or sql server. And with very large data - far, far better than with mysql or postgresql.
Please mod parent down. DB2 is not only for mainframes. I worked at a place which used a bunch of DB2 boxes running AIX.
While DB2 does run on mainframes, it also runs on many other operating systems, including Linux.
http://www-306.ibm.com/software/data/db2/9/
As you can see, DB2 runs on at least Linux, Windows, and Solaris. It also runs on AIX, even though I didn't see it mentioned in a quick glance on the page. Please check your facts next time.
It's like getting married. "How much money do you have?" "Well, that's what it costs."
I get the impression that the core of it wasn't written by MS way back when, though.
MS licensed Sybase's database for Windows back in the late eighties/early nineties and eventually MS SQL Server grew out of that.
Every time you post an article on Slashdot, I kill a server. Think of the servers!
Microsoft bought Sybase SQL Server and released it as MSSQL 6. Since then, Microsoft has maintained the code independently of Sybase. So yes, it wasn't written by MS way back in the days. MS did what they always do and bought one of the underdogs, then marketed the snot out of it.
:)
If you prefer MSSQL, then I'm assuming that you haven't developed any sophisticated transactions in a high-concurrency environment. It's painful to work around the limitations of the old system of hierarchal locks that MSSQL uses (as does DB2, and MySQL with anything except for InnoDB tables). It can be done, and the app can be coerced into working well if you base your design around MSSQL's limitations, but it requires careful planning. (This is somewhat outdated, since MSSQL did legitimately get much better in this regard with 2005)
From a development point of view though, once you've used any system which supports multi-versioning concurrency control PostGres, Oracle, MySQL with InnoDB, or even MSSQL 2005 using the new snapshot isolation features they bolted on, you'll never want to go back
Berkeley is extremely fast for simple key/value pairs. It's often embedded and has bindings to every major scripting language.
SQLite is a small and fast database that works from local files. Again it's often embedded and there are bindings for all major scripting hosts. You could run into concurrency issues using it under heavy load.
Flat files are simple and fast, you have the concurrency issue and need to lock the file for writes.
XML is typically flat files but sleepycat and others have XML databases. XML is mainly useful for interchange where several apps need access to the same data. Leaner formats like YAML and JSON are gaining ground for serialized data.
Then you get the full blown RDBMS which are first choice for large scale apps with multiple clients. Multi user ERP, accounts packages, busy web forums - it's the right choice for these apps. Using MySQL or Postgres to catalog your personal DVD collection or for a personal blog/forum with only a handful of users is worse than overkill - it's ridiculous.
I couldn't figure out how to set that up
Better hope you never have to use Oracle.
sic transit gloria mundi
I don't see how stripping out nused features would lead to a big increase in performance. Unused code just sits there in the RAM and does nothing to slow you down (_IF_ the application is well-designed)
For that matter, has anyone merged any open source Java server container with PostgreSQL for higher performance of that use case, in an integrated architecture without network and other overhead for messages, and more atomic transactions? Again, i think this would not increase the performance by a notable amount. The bottleneck nowadays lies in I/O; the two copy-commands needed to pipe something through the tcp/ip-stack are negligable.
I worked for a company whose product ran on MS-SQL, PostgreSQL, and Oracle. Should I explain why we didn't support MySQL or not? It'll draw fanboys either way. I used the same server, reinstalled the OS (Red Hat Enterprise 3 or Windows 2000) and database between each test, and rebuilt the application server to be extra sure.
Since it was more difficult to write Oracle-compliant SQL and we didn't have a lot of Oracle customers, the developers didn't care to spend time on it, and our stuff ran about 20 percent slower there. That's after a lot of tuning time, it was 50% slower on a default install. Oracle 9 took two days to install and tune, plus another two days of preparation. I was particularly underwhelmed that I had to deal with stupid errors like tarballs that extracted onto themselves and assumptions about the shell being used. At the time, Oracle was a very Solaris-like experience; user-unfriendly to the extreme.
Postgresql 7 ran great; it was neck and neck with MS-SQL in all tests, after proper tuning, and 30 percent slower on a default install. Postgres took half a day to install and tune, but it took me a week and conversation with the postgres mailing lists to find out what needed tuning. Still, we were able to put together a document that took users from bare metal to RHEL+Postgres in four hours if they had all their media handy.
Microsoft SQL Server 2000 ran great with no tuning at all, and took fifteen minutes to install. It also cost as much as paying me to do the entire set of tests. OS installation/patching times and tested workloads were the same for all three tests.
YMMV.
"Nothing was broken, and it's been fixed." -- Jon Carroll
I AM collecting paychecks, and I use mySQL.
Of course... with InnoDB.
Stored procedures are bits of code (previously database-specific, but there's often support for java, perl and python now) that are stored in the database and can be executed by the server. There are at least three times when they're a Really Good Idea:
1. when you want to make sure something is always done despite lazy or even hostile users. You attach the SP to standard operations. E.g., it's common to set a 'last modified date' on every insert and update to a table. Depending on the DB you could even do fancy stuff like sending an alarm when certain actions are performed. You can even support things like updateable views this way.
2. when you want to really lock down your database. You can revoke 'insert' and 'update' rights from everyone and force them to go through a stored procedure to insert/update data. No risk of SQL injection, you can do arbitrarily complex input validation, etc.
3. when you want to perform complex aggregation operations. I had a problem where I needed to create a table that contained the rolling average of the 30 most recent values for each date. It was much faster (orders of magnitude faster) to do this in a SP than programmatically on a client.
Many DB also allow you to create 'user-defined functions' and 'user-defined types', albeit at a far higher integration cost. The first allows you to add C functions to the database server itself. This would allow you to perform fairly complex operations, but the need for it has dropped significantly with direct support for perl/java/etc., but you can still use them when you don't want to make it easy for somebody to alter your procedures.
The latter (with appropriate user-defined functions) allow you to create new first-order objects in your database. E.g., I've written a user-defined type that can hold digital certificates and wrapped encryption keys. Standard databases can always store them as blobs, but with the UDT I can provide an extremely powerful set of tools to the stored procedures and sql queries.
For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
Back when mysql was faster than postgres for simple selects, mysql proponents were jumping up and down about performance. And, in fact, a very narrow definition of performance, with only simple reads. No writes. Single CPU. One client. You can't step too far away from those parameters because then mysql performance starts to suck.
Now that there is a respectable benchmark showing postgresql to be faster, performance is suddenly not the issue.
Yep. It's right up there with mysql developers claiming that transactions are useless back when mysql didn't have them.
___
If you think big enough, you'll never have to do it.
I'm sure that's true, but that page is crap:
"SQLite 2.7.6 is significantly faster (sometimes as much as 10 or 20 times faster) than the default PostgreSQL 7.1.3 installation on RedHat 7.2 for most common operations."
A drunken badger is 10 to 20 times faster than a default Postgres 7.1.3 installation. The 7.x branch, and particularly the early releases, had pretty bad performance (for this type of queries, at least) - they didn't really start sorting that out until 7.4
Isn't SQLite a bit of a toy though? Something like HSQLDB or (better yet) Berkeley DB usually seems more appealing when you don't need a standalone DB server.
sic transit gloria mundi
I am more interested in how it handles an axe wielding maniac going postal in the datacenter, or what would happen of a mile wide tornado struck town, or what would happen if there's a bug, or if the project becomes wildly successful and does 10x or 100x the processing.
A Pirate and a Puritan look the same on a balance sheet.
8.1 automatically does table maintenance (vacuum, stats, etc.)- vacuuming.html#AUTOVACUUM
http://www.postgresql.org/docs/8.2/static/routine
You can tune it as to when to do such operations, or what max percent of normal I/O can be used for those tasks.
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
Not necessarily.
Oracle will happily run on the cheap crap wintel servers that Postgres will. It will even do so under the typical OS of choice for running Postgres on cheap crap wintel servers.
Oracle will also theoretically run on some 64 cpu monster (from Sun, IBM or SGI).
A Pirate and a Puritan look the same on a balance sheet.
My other account has a 3-digit UID.
I thought Sun paid Pervasive for their distribution in Solaris 10. Has this changed or by "core developers" do we really mean "people who commit the pkgs to the WOS" or "people who actually work at Sun who also just happen to work on Pg in their free time"?
I believe you no longer need to take the database offline and it now has an autovacuum daemon which automatically performs the operation when needed.
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
You talk as if this were a trivial thing to do just because you have access to the sources.
Are you adequate?
PostgresSQL: 778.14
s 2006q4/jAppServer2004-20061121-00046.html
Oracle 10gR2: 6,812.79
Real close, I agree.. Postgres had a 40GB cache, Oracle used 21GB
See http://www.spec.org/osg/jAppServer2004/results/re
If you mod me down, I *will* introduce you to my sister!
> Can't say anything about PgSQL, because I couldn't figure out how to set that up.
.MSI installer?
apt-get install postgresql-server too hard for you? Or are you having trouble double-clicking on the
Done with slashdot, done with nerds, getting a life.
Do you know how relational algebra and propositional calculus are implemented? There are many logic "ifs" and other evaluations in the code path to accommodate optional features. It's one reason why caching queries and results is so efficient. Those computations wouldn't even have to be run on uncached logic if the optional features were stripped out. It's not a matter of extra code stored in RAM, but more complex code paths.
And again with the Java/SQL integration: the extra layers to return from a Java scope, then jump the CPU to the SQL code, is inefficient. In tight loops, the least context switching shows real efficiency gains. Any simplification, even reducing handles to pointers for CPU address calculation, can improve performance. If the task is simple. Which is what most Web DB/object transactions are.
DB performance is a very tough game. Tiny gains result in many more sales. PostgreSQL is not as popular as others partly because there haven't been good performance metrics, so there hasn't been a big investment in performance metrics, so it stays relatively unpopular. But if it were tailored for the simplicity of most Web apps, it might have not only good performance, but the kind of marketable gains that Oracle and others will not revise themselves to produce.
--
make install -not war
That is all.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
No, I never said it was trivial, or any degree of difficulty or ease. I merely asked if it had been done, since the prohibition from proprietary source does not stop it.
You are projecting your inferences on what I said. Projecting a binary worldview, where coding is only either "impossible" or "trivial". All I said was it's not impossible, and wondered if anyone had done it.
--
make install -not war
I have an intense dislike of Oracle. PL/SQL is a nasty language. And MSSQL and MySQL aren't even on the same level. MySQL is generally seen as one of the lower tier of "low-range database servers" while MSSQL is considered "mid-range". Oracle is considered "high-range", and it's ease of use is pretty much testament to that.
That being, the higher up the chain the harder to bloody configure and use.
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
Pretty much, it does.
Running on an Xbox versus an IPX, hmm. With no data, you assume the worst case scenario because any given variable is UNKNOWN. RAM, Processor, disk IO?
Correction, it certainly does.
Often wrong but never in doubt.
I am Jack9.
Everyone knows me.
What sort of tables are you using in MySQL... InnoDB or something else? Only if you were using InnoDB tables would it be an apples to apples comparison.
If I have seen further it is by stealing the Intellectual Property of giants.
I get the impression that the core of it wasn't written by MS way back when, though. And it sure wasn't built by the Windows team.
It was written by Sybase, bought by MS and forked somewhere around Release 10. MS then ripped out the good parts (i.e. the ability to run on real operating systems) and rebranded it. Can you tell I come from a Sybase shop?
PostgreSQL setup: 8 cores, 16GB RAM.
Oracle setup: 64 cores, 512GB RAM.
- Sw Usr
Are you adequate?
I think he is reffering to larry's insane licensing polices, such as dual core = 2 cpu's
If you mod me down, I will become more powerful than you can imagine....
Ooops, sorry, 1133 is the comparable result.
More than 50% higher performance.
If you mod me down, I *will* introduce you to my sister!
Laissez lire, et laissez danser; ces deux amusements ne feront jamais de mal au monde. - Voltaire
The guidelines are pretty simple, but highly subjective.
Here are your options:
A. flat file
B. key-indexed (berkley db, dbm, gdbm)
C. feature/config light db (sqlite)
D. feature/config heavy db (postgrsql, oracle)
* Use A for configuration
* Use A if you have a small amount of data and can slurp it all into memory.
* Use B if you only access data via a known unique value and never compare records.
* Never use either A or B if the data is shared between processes unless file locking is okay.
For anything else, use C or D. If you've got a lot of familiarity with D, sometimes it's easier to use D than C, even if D seems like overkill. My recommendation is to get familiar with C and then worry about D once you understand the limitations of C. It'll help you appreciate what D gives you when you're ready for it.
*sigh* back to work...
I've many times gotten support from the core devs who pop into the IRC channel to give support. Also, the idea that PG is hard to set up is a 5 year old story, long since obsoleted. PG now comes in RPMs, can be apt-gotten in Debian and has a .msi installer that is far easier than even MySQL's Win32 installer. If you are having trouble installing PostgreSQL, then you're obviously not worth anything as a sysadmin.
I hate printers.
Yes it is, and no it isn't. It's really useful for "databases" that are only storing a small amount of data, or mostly static data. While I generally use Postgresql for any serious work, I've made trivial apps out of php+sqlite, because I can send someone the entire application as a single tarball. It's mostly useful to avoid having to write your own storage routines (which will invariably be buggier then something with lots of eyeballs on it).
Incidentally SQLite also requires regular VACUUMs.
// MD_Update(&m,buf,j);
You should still run a "vacuum full analyze" after schema changes
Do not run VACUUM FULL under normal circumstances. It's a special command that's used to return free space in a table to the operating system by compacting the file and eliminating internal free space. PostgreSQL will reuse this space anyway, so normal operation does not require VACUUM FULL, a plain VACUUM will suffice.
Schema changes may or may not affect the need to run VACUUM FULL, but if you are worried about the performance impact of VACUUM FULL, read the docs, because chances are you don't need it.
Social scientists are inspired by theories; scientists are humbled by facts.
Intelligent people use SQLite (or Berkeley DB) for apps that don't require a heavyweight DBMS.
Databases aren't just about performance. Type checking is a big reason why RDBMSs are more prevalent.
Type checking databases are more useful in the same way that a type-checked language is more useful than assembly. Assembly is very simple to program in, but all variables are just bytes. SQLite can't order by date, because dates are strings and don't order the same way. I'm not saying SQLite is bad (not at all, I think it's an interesting project), but simpler does not always mean better.
Social scientists are inspired by theories; scientists are humbled by facts.
Well, for people who consider prices I guess it isn't since the oracle system cost more (exclusive software cost), thought then the choice of systems might have been "optimized" aswell.
I'm with the parent in terms of when to choose what...
For anything "small" (and you can put your own definition to that) and Web-based, SQLite is my first choice. Works well, solid, runs on any platform, and there's only a single file to worry about when it comes time to backup/restore so development version control is a non-issue. In particular, I'd suggest that SQLite would be a fine choice for just about any small-scale DB-backed Web site out there (again, choosing your own definition for the word "small").
For bigger stuff, I'd choose Postgres over MySQL, but that's more a personal preference than anything else at this point. There's pros and cons to both - coming from an Oracle background, I find the mentality behind Postgres easier to work with, but I'll happily work with MySQL when the situation requires it.
Oracle, DB2, MS-SQL - you've got to use them when the situation demands it. They all work great, scale well and have excellent support, but you pay for it. I wouldn't even think about using them with a startup, but many big corps won't consider anything else.
SQLite has before insert triggers. Why doesn't MSSQL?
Try following the HTML link - http://www.spec.org/jAppServer2004/results/res2007 q3/jAppServer2004-20070606-00065.html
Is more indepth on the systems and tuning parameters used.
DB2 is used as (optional) storage engine on later versions of Lotus Domino, which makes it widely used in larger enterprises, not necessarily on large irons. http://www-142.ibm.com/software/sw-lotus/products/ product4.nsf/wdocs/nsfdb2
I think this sounds pretty obvious, probably has been done:
Have an open source suite that allows anybody to verify the test results. Let the experts tweak the configuration for each database. Provide free downloads to databases and/or easy install. This way entire community is tweaking the configurations for different servers and posting back results.
Not only this would be a more or less "proper" benchmark, but also the secrets of db tuning, if there are any, will be revealed to public. Vendors get to brag if they are in top3, rest get a nice motivation to improve.
Just my $2,000,000
Lone Gunmen crew.
Sadly has a certain amount of truth to it. I've been trying to convince my company to use PostgreSQL for years. Yet I'm still stuck having to use MSSQL. The it's free arguement doesn't work very well. MSSQL is free for developers when you already have the rather expensive MSDN subscription, and you can package MSDE with a runtime for free.
Sure it has a pile of restrictions and performance sucks but management doesn't see it that way. Nobody every got fired for buying IBM^W Microsoft...
Don't blame me - this
This is the best post I have read in ages. I am old school. I like DB's and when doing requirements gathering am already thinking in terms of tables and relational schema. In a really well designed relational schema it is possible to do very complex operations without ever resorting to plsql/transact etc. simply using plain sql (insert/update etc). I hate cursors/recordsets. As soon as I see these constructs in the db I generally find a VB/java programmer messing where he is not wanted.
PS: Oracle is not Fortran - PL/SQL is derived from Ada and therefore shares much of the Pascal/Modula feel.
http://blogs.sun.com/tomdaly/
Why UNIX?
At first, many situations simply don't need a complex database schema. Think of a contact or email list in a simple web application. Such applications often need only one table. And therefore MySQL suffices. I would not call that bad design. Moreover, don't underestimate MySQL! Recent versions are not so much behind PostgreSQL and they perform reasonably well.
It's true that the SQL knowledge of many developers is quite limited. But for this reason, and, more important, due to the bad compatibility of SQL and OO design, ORM libraries such as Java Hibernate or Python's SQL Alchemy were developed that handle the SQL part. In this case we still need a decent database schema, but it may be described in the framework, which looks more familiar to developers. However, don't underestimate the generated SQL code! Complicated SQL statements may be generated and ORM libaries may make use of referential integrity, check constraints etc. and many other features a mature database has to offer. If designed right, a ORM-based design will have a decent database schema and will make use of many "database goodies".
ORM-generated SQL code may therefore stress the database in many ways - and therefore we still need good databases - regardless of the developer's SQL knowledge.
One pitfall we always seem to step into is that in case a good, new technology is invented, we think of it as a solution to all our problems. SQL is no different here, in the past we were so proud of designing normalized schemas that we forgot about the scalability. Such schemas - although perfect - are often today the reason for performance problems. Sadly, we often overreact and tend to throw away old technology and follow the new hype. And that's what we can see today, such as thoughtlessly vandalizing database schemas due to performance problems. Nevertheless, it is often appropriate to denormalize a database schema to improve the performance. If done right, I would not call this "vandalizing".
Hardware concerns are valid. But more importantly, SPEC jAppServer is not a database benchmark. It's a Java App Server benchmark with a database component. To be fair, you have to change only the database, not the app server, hardware, and everything else too. Not to say that the conclusion is wrong, but it's very weird to say anything about database performance in this circumstance.
In our tests, which are database-only and data-warehouse focused, PG seems to be 3x-5x slower than Oracle on the exact same box, as it does not do data compression and is not as CPU-efficient. Which is quite respectable given the price difference, but may be a big enough gap to keep people who have money to spend coming back to commercial databases.
Obviously the vendor puts together the best submission they can for their own product. Oracle did the same for theirs.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
Get real, that does happen. We have a product, which is a set of surveys patients or their caregivers can take. Each survey event represents a survey taken by a patient or caregiver regarding one particular patient. Each survey event results in one associated report getting printed. Survey event completion also generates scores, and score values can trigger up to 2 additional reports. The additional report behavior is governed by a score to report mapping table. Also, each survey can have alternate reports specified based upon site and product settings.
When an administrator logs into the system, an 11 table join is done to generate the list of survey_reports. For sites with 3000 patients and 4500 surveys, it takes Postgres less than a second on old hardware.
So, you can't read then?
MySQL's docs are set up as a tutorial, and I can't find anything easily in them. If you know anything about databases then reading MySQL's docs is painfully slow.
PostgreSQLs docs are set up with one section being the tutorial, one a user's guide, and one an admin's guide. I find it MUCH easier to find what I want with it, plus it's got a search engine sitting on top of the docs that's fast and accurate for finding the relevant information.
How someone can find MySQL easy and PostgreSQL difficult (or vice versa really) is beyond me. They're both dirt simple to set up and use. Just one is better at keeping your data coherent.
I repeat, if you couldn't find your answer in the pg docs, you can't read.
--- It is not the things we do which we regret the most, but the things which we don't do.