First "Real" Benchmark for PostgreSQL
anticlimate writes "A new benchmark published on SPEC shows PostgreSQL's performance approaching that of Oracle's and surpassing or on par with MySQL (however the test-hardwares of the other DB systems are somewhat different). The test was put together by PostgreSQL's core developers working at Sun. They certainly are not unbiased, but this is the first 'real' benchmark with PostgreSQL — according to Josh Berkus's blog. The main difference compared to earlier benchmarks (and anecdotes) seems to be the tuning of PostgreSQL."
however the test-hardwares of the other DB systems are somewhat different
Which makes the results pretty much useless. But, being the intrepid slashdotter I am, I went ahead and R'ed the FA anyway, in case I could glean some useful information from it.
Which revealed that the linked article doesn't actually contain any information whatsoever about Oracle* or MySQL, much less benchmarks on named hardware.
So...what am I supposed to get out of this, again? Or is this just supposed to be some kind of PostgreSQL love-in, so I should take my wet blanket elsewhere?
*Well, the second link contains someone claiming that Oracle is only 15% faster...but without providing any actual data.
Reality has a conservative bias: it conserves mass, energy, momentum...
They're all shitty.
You cannot compare benchmarks without SOMETHING standard between them.
Okay, if they can't match the hardware (why not?) then focus on price points. I notice that they're looking at "$65,500 for the hardware". That's a LOT of hardware at today's prices.
I'm sure MySQL would (and will) come back with a "benchmark" on hardware costing $10,000.
There is nothing "real" about this "benchmark".
Because Sun systems will always be different from the x86 based cores that run MySQL and Oracle, I think the best way to compare such software would be by constructing servers of equal price and seeing how PostgreSQL fares. The true question on any business person's mind is "how much to implement?"
Well, back to rejecting software patent applications.
men use PostgreSQL,
and _____ use MSSQL.
flame on... =) flaming is ok... but stupid. and even more stupid is people who can code only for one db platform.
Why UNIX?
Yeah, why the hell is this tag (and others like it) showing up today? It's not coming from jokes in the comments. So what's up with these useless tags?
No, that link you posted to a web comic we've all seen a hundred times is not "obligatory."
wussy bedwetter MSFT fanbois?
My blog
Uh... WTF?
To paraphrase an old saying:
There are lies, damned lies and benchmarks.
Bearded Dragon
and real DBA's use informix..
like BSD.. no it's not dead yet.
What about IBM's DB? I hear it has 30% of the market however i have never seen it in action!
Why this emaciated post made it while mine didn't I'll never know...here's how I submitted this story:
The current version of PostgreSQL now has its first real benchmark, a SPECjAppServer2004 submission from Sun Microsystems. The results required substantial tuning of many performance-related PostgreSQL parameters, some of which are set to extremely low values in the default configuration — a known issue that contributes to why many untuned PostgreSQL installations appear sluggish compared to its rivals. The speed result is close but slightly faster than an earlier Sun submission using MySQL 5 (with enough hardware differences to make a direct comparison of those results unfair), and comes close to keeping up with Oracle on similarly priced hardware — but with a large software savings. Having a published result on the level playing field of an industry-standard benchmark like SPECjAppServer2004, with documentation on all the tuning required to reach that performance level, should make PostgreSQL an easier sell to corporate customers who are wary of adopting open-source applications for their critical databases.
No, that link you posted to a web comic we've all seen a hundred times is not "obligatory."
For those of us who don't have dozens of hours to do the necessary research, can some postgresql gurus sum up some of the most significant tuning parameters so us mere mortals can see similar performance gains?
I realize that a large part of the answer is going be "it depends on application, your hardware, and you query types", but surely there must be some general tips that we can follow given various typical setups. MySQL, for example, ships with several different configuration files: One suitable for a small installation, one for a mid-sized installation, one for large installation, etc.
What tuning can someone do to tune postgresql's default (conservative) config file to make it perform better?
DB2? It's only useable on large mainframes (big iron, so to speak) from what I understand. Generally speaking only the REALLY large shops would use it, so I wouldn't be surprised you'd never seen it - neither have I. We're a pretty big organisation where I work, and we have a mix of Oracle, MSSQL, and Sybase servers.
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
The plural of hardware is hardware, not hardwares.
Does slashdot not have a "check spelling" feature on the submission page?
I think that somebody sent the wrong link and (surprise!) the editors didn't even follow it to check.
Here's a more useful one: All SPEC jAppServer2004 Results Published by SPEC
The benchmarks aren't standardized enough for any useful comparison. The hardware and configurations vary in almost every one.
however the test-hardwares of the other DB systems are somewhat different
Which makes the results pretty much useless.
Not necessarily.
It's essentially useless for separating out how much of the performance difference is the result of the software's design, implementation, and tuning versus how much is due to the platform differences.
But such tests CAN be used to examine the performance of competing ENTIRE SYSTEMS, to inform choices between them.
They say: "Oracle on does THIS well, PostgreSQL on can be tuned so it does THAT well on the same benchmark."
This lets administrators (presuming they have access to the hardware info) get a bang-for-the-buck comparison.
For the rest of us, the interesting point is that PostgreSQL, running on its team's idea of realistic hardware, can produce performance in the same ballpark as Oracle running on Oracle's choice of hardware.
(Whether the necessary remaining data (what are hardwares x and y? how was PostgreSQL tunde) is published now, later, or never, is a separate issue. B-) )
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Runs on Linux too...
I dunno, I kinda like MSSQL. Hell, I use it alongside MySQL servers for my own projects (that, and having support for multiple platforms in your product is kinda a good idea). Sure, it's got horrific licensing (nowhere near as bad as Oracle's, though) but other than that, it's pretty good and reliable. I get the impression that the core of it wasn't written by MS way back when, though. And it sure wasn't built by the Windows team.
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
That should have read:
They say: "Oracle on {hardware x} does THIS well, PostgreSQL on {hardware y} can be tuned so it does THAT well on the same benchmark."
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
SQLite and Postgres are current my choices. What I really want are language bindings that preserve the structures, syntax and semantics of the language I'm using without the performance penalty of abstraction layers.
Has anyone used the PostgreSQL open source to refactor the DB to support just a subset of SQL and features (the most popular stuff that eg. "LAMP" uses), then benchmarked it vs the default distro, to show higher performance?
For that matter, has anyone merged any open source Java server container with PostgreSQL for higher performance of that use case, in an integrated architecture without network and other overhead for messages, and more atomic transactions?
--
make install -not war
Get people from each group, give them the requirements and 5 different dollar amounts.
:)
Let each team setup their systems how ever they want at each price point. Some will go with clustered servers. Some will go a single monster server. They know their products best so they'll be the ones best suited to choosing the configuration.
Then run the benchmarks. And keep hammering on them until AFTER the next patch release.
Yeah, it might run fast, but still be a bitch to patch/upgrade.
At $5,000 you might find that a cluster of MySQL boxes beats everything.
At $10,000 maybe something else is best.
$25,000
$50,000
$100,000
etc.
And finally, break it. Break it bad. What happens when something goes wrong? Oracle might cost a lot, but if they can come through with your data they might just be worth it.
If nothing else, you'll get the "best practices" nicely demonstrated by each group.
SQLite is faster than Postgres for many types of workloads.
big iron?
I have something in common with Stephen Hawking...
http://www.spec.org/jAppServer2004/results/jAppSer ver2004.html contains all the results. Unfortunately the different hardware configurations make it rather hard to draw any conclusions. Which begs the question, how did the submiter knew that these specific guys where biased or not? From what I can see, the whole setup is inherently biased.
Um, no. DB2 these days runs on most major UNIX variants (HP-UX, Solaris, AIX, IRIX, etc.), Linux and Windows. It's used quite often, in fact. Most Enovia/VPM installations use DB2 backends, for instance. Modern versions use XML along with regular relational database stores and are very, very up-to-date technology-wise. Very scalable.
My blog
MySQL 5 on AMD Opteron 285
The UltraSPARC has 8 cores on 1 chip and 16GB of memory.
The Opteron has 4 cores and 8GB of memory.
The UltraSPARC should smoke it every time.
Intelligent people use SQLite (or Berkeley DB) for apps that don't require a heavyweight DBMS.
Databases are even being used in many situations that would be better addressed using flat files.
They certainly are not unbiased
I guess that's a deal breake right there, no?
The test was put together by PostgreSQL core developers at Sun. Didn't we agree earlier, when talking about Intel/AMD benchmarks, that vendor supplied tests are wildly inaccurate?
PostgreSQL should concentrate on more developer tools and better marketing. The "it's got a ton of features you don't need" on a cryptic site doesn't help its cause.
People use MySQL because there's a wide support and lots of dev tools for it, and because the kind of people going for MySQL needs to do simple selects and inserts most of the time.
As I recall, Oracle's choice of hardware consists of a very large chequebook and a stamp with your signature on it.
Won't you guys agree, "elephant" doesn't exactly communicate "fast and modern" very well.
"Dolphin" comes a bit closer.
Who's coming up with those logos?
At where I work we've got a number of clients running DB2 on x86 Linux platforms, as well as larger AIX platforms and of course the mainframe. Seems to have a lot of capabilities but is a bit more finicky than Oracle and MS SQL. It seems to like to dead lock unless the DBA has really made sure it is set up properly.
DB2 runs on distributed systems too - I'm using it on a project right now. It's not bad at all.
Website Hosting
Agree, and they're all SQL based anyway.
Oh, I agree. A benchmark of whole systems can be just as (or more) useful as a benchmark of individual pieces of software, depending on what your goals are.
But what's been presented here isn't even that. Links #1 takes us to a SPEC benchmark of PostgreSQL. It doesn't provide any information about anything else; there isn't anything to compare the benchmark to. Link #2 provides an unreferenced statement about Oracle's marginally superior performance on much more expensive equipment.
So, perhaps, one can begin to draw conclusions about PostgreSQL vs Oracle in the contexts of full systems. But neither link #1 nor link #2 provide any information about MySQL (except the quote: "[t]his publication shows that a properly tuned PostgreSQL is not only as fast or faster than MySQL").
Really, my criticism isn't of the benchmark (the data are the data, after all) or of the blog (one expects a vested PostgreSQL interest to comment on such a benchmark), but of the blurb here that either a) draws totally unwarranted conclusions, or b) depends on information it doesn't bother sharing.
Reality has a conservative bias: it conserves mass, energy, momentum...
Performance isn't what causes a lack of acceptance in the marketplace for PostgreSQL.
The problem is twofold:
MySQL, as others have pointed out, has better developer support and they know their target audience. They supply a fast, easy to use database for those who don't need a whole lot.
Oracle supplies an enterprise level database that MySQL doesn't aspire to. PostgreSQL doesn't know where to fit in.
Do a little investigation on setting up PostgreSQL with fault tolerance and replication and you'll quickly see why large corporations cough up money for Oracle. Performance is one aspect of the price tag, but it is certainly not the only factor.
Databases are even being used in many situations that would be better addressed using flat files. That kind of mainframe/JCL talk is not welcome here...
Website Hosting
If you want to setup a dedicated database server, you want to know what software with what hardware will run the fastest. So while the benchmarks may not be useful to people wanting to setup a small multi-purpose server, it can still be useful for some people.
No question of bias there...
If you want news from today, you have to come back tomorrow.
They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
Led me to...
http://blogs.digitar.com/media/2/T2000_Experience
where a UltraSPARC T1 customer states that MySQL 5 is up to 13.5 times faster than MySQL 5 on an AMD Opteron.
The results of the original article's data were as follows:
Postresql 8.2 on UltraSPARC T1 - 778.14
MySLQ 5 on AMD Opteron - 720.56
So using my blatantly biased (yet at the same time factual) numbers, I correct the MySQL performance number to reflect what it would be on an UltraSPARC T1: 720.56 x 13.5 = 9727.56
9727.56 > 778.15
MySQL wins!
From the link in the article you can find the results, or just look here.
Hardware, kernel changes, PostgreSQL configuration, etc.
Its all on that page.
Ok, apparently I'm somewhat wrong there. I wasn't aware it ran nowadays on your regular desktop PCs with Linux. Though, I must admit I can't see it running very fast compared to software DESIGNED for that type of platform.
How exactly did that get modded informative if it's wrong?
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
MySQL is modular - pick'n'mix data storage engines sharing a SQL front end. I can't find the bit in TFA that says which one they compared.
I've always suspected that most MySQL vs Postgres flame wars are based on comparing Postgres with the speed of MySQL/MyISAM (No transactions or relational integrity checks - so, big surprise, dead fast for simple queries) and then waving MySQL/InnoDB around when the functionality issue is raised.
MySQL/MyISAM hits the speed/functionality sweet spot for LAMP data-driven websites, is supported by lots of free webapp software and offered by most decent web hosting services. Comparing it speedwise with Postgres has always been pointless, though. If Postgres has caught up, colour me impressed, but if they're pro-Postgres I bet they're comparing with MySQL/InnoDB (which is a bit closer to like-with-like).
Never quite seen the point of MySQL/InnoDB really - all the advantages of MySQL/MyISAM minus the speed, support by popular webapps, availbility on low-cost hosts... and still lacks the features of Postgres.
In a survey of 100 programmers, 111111 thought that duck-typing was a good idea.
If I have been able to see further than others, it is because I bought a pair of binoculars.
Now there's a good point. Is there any good documentation out there on databasing on what is the best solution for what kind of problem? Particularly, I would like to see something between flat-files and DBMS and then between different DBMS.
Though, technically, I'm stuck between whatever our 'host' is giving us (currently PostgreSQL, but usually MySQL, and I dabbled with MSSQL when an intern for the Government), I'd like to know when it might be better to actually just store some stuff in a flat file or even use an XML DB (which one collage professor loved to talked about as a holy grail).
Outside of "Basic Database Theory and Generic SQL" books, a programming class on Databases (how to build tree structures and navigate them, etc), I've never really got any 411 on the benefits of using MSSQL, Oracal, MySQL, PostgreSQL, etc and I've never gotten the change to learn more about advanced database features such as stored procedures or functions or whatever they might be called and what they're good for.
Recommended books, anyone?
Cheers,
Fozzy
p.s. I work mostly with internet applications.
"The past was erased, the erasure was forgotten, the lie became truth." ~1984 George Orwell
> Though, I must admit I can't see it running very fast compared to software DESIGNED for that type of platform.
Actually, it's probably the fastest solution out there for heavy reporting/analytical workloads on windows, linux or unix. Not sure about teradata and informix - haven't looked at them in quite a while.
Teradata, informix and db2 were doing the 'beowulf' thing years before anyone on slashdot asked what a beowulf cluster of these would be like. So, you can easily distribute data across a hundred db2 server blades to get very fast response times over terabytes of data. This can be far faster at the high-ends than the typical oracle scenario of range partitioning on a 32-way smp. And oracle grid is more failover oriented than performance oriented, so that doesn't really compete either, last I looked.
I've built and managed very large databases on oracle, db2 and sql server in the last five years. Of them all, db2 on unix was the easiest to build and manage. It did take a little extra time - since some of the tools are a little finicky, but the manageability and performance was far better than with either oracle or sql server. And with very large data - far, far better than with mysql or postgresql.
Please mod parent down. DB2 is not only for mainframes. I worked at a place which used a bunch of DB2 boxes running AIX.
While DB2 does run on mainframes, it also runs on many other operating systems, including Linux.
http://www-306.ibm.com/software/data/db2/9/
As you can see, DB2 runs on at least Linux, Windows, and Solaris. It also runs on AIX, even though I didn't see it mentioned in a quick glance on the page. Please check your facts next time.
It's like getting married. "How much money do you have?" "Well, that's what it costs."
I get the impression that the core of it wasn't written by MS way back when, though.
MS licensed Sybase's database for Windows back in the late eighties/early nineties and eventually MS SQL Server grew out of that.
Every time you post an article on Slashdot, I kill a server. Think of the servers!
I have used Postgres in two different production environments over the past 6 years, but they have both been environments where it was possible to effectively take the system down for a little while each night while a vacuum is done. The vacuums seem to be quite necessary, as without them, performance slowly and steadily degrades as the trees get funky and the tables grow larger and larger with unused update holes in them. It would probably be possible to hold off and do them weekly rather than daily, but the point is that they seem to be necessary, and they seem to take the system down while they're running. Between vacuums, the database is fast.
This leads me to believe that Postgres isn't suitable as part of a system that needs to be running 24/7, such as an online order processor. What am I missing?
Microsoft bought Sybase SQL Server and released it as MSSQL 6. Since then, Microsoft has maintained the code independently of Sybase. So yes, it wasn't written by MS way back in the days. MS did what they always do and bought one of the underdogs, then marketed the snot out of it.
:)
If you prefer MSSQL, then I'm assuming that you haven't developed any sophisticated transactions in a high-concurrency environment. It's painful to work around the limitations of the old system of hierarchal locks that MSSQL uses (as does DB2, and MySQL with anything except for InnoDB tables). It can be done, and the app can be coerced into working well if you base your design around MSSQL's limitations, but it requires careful planning. (This is somewhat outdated, since MSSQL did legitimately get much better in this regard with 2005)
From a development point of view though, once you've used any system which supports multi-versioning concurrency control PostGres, Oracle, MySQL with InnoDB, or even MSSQL 2005 using the new snapshot isolation features they bolted on, you'll never want to go back
Yes, those folks at Yahoo, Ebay, Amazon, Google, and even Hotmail all run that lightening fast MSsql. You were talking about all those systems for tracking games scores with, yes? Because if you meant REAL work, then time to chuckle.
Berkeley is extremely fast for simple key/value pairs. It's often embedded and has bindings to every major scripting language.
SQLite is a small and fast database that works from local files. Again it's often embedded and there are bindings for all major scripting hosts. You could run into concurrency issues using it under heavy load.
Flat files are simple and fast, you have the concurrency issue and need to lock the file for writes.
XML is typically flat files but sleepycat and others have XML databases. XML is mainly useful for interchange where several apps need access to the same data. Leaner formats like YAML and JSON are gaining ground for serialized data.
Then you get the full blown RDBMS which are first choice for large scale apps with multiple clients. Multi user ERP, accounts packages, busy web forums - it's the right choice for these apps. Using MySQL or Postgres to catalog your personal DVD collection or for a personal blog/forum with only a handful of users is worse than overkill - it's ridiculous.
I worked for a company whose product ran on MS-SQL, PostgreSQL, and Oracle. Should I explain why we didn't support MySQL or not? It'll draw fanboys either way. I used the same server, reinstalled the OS (Red Hat Enterprise 3 or Windows 2000) and database between each test, and rebuilt the application server to be extra sure.
Since it was more difficult to write Oracle-compliant SQL and we didn't have a lot of Oracle customers, the developers didn't care to spend time on it, and our stuff ran about 20 percent slower there. That's after a lot of tuning time, it was 50% slower on a default install. Oracle 9 took two days to install and tune, plus another two days of preparation. I was particularly underwhelmed that I had to deal with stupid errors like tarballs that extracted onto themselves and assumptions about the shell being used. At the time, Oracle was a very Solaris-like experience; user-unfriendly to the extreme.
Postgresql 7 ran great; it was neck and neck with MS-SQL in all tests, after proper tuning, and 30 percent slower on a default install. Postgres took half a day to install and tune, but it took me a week and conversation with the postgres mailing lists to find out what needed tuning. Still, we were able to put together a document that took users from bare metal to RHEL+Postgres in four hours if they had all their media handy.
Microsoft SQL Server 2000 ran great with no tuning at all, and took fifteen minutes to install. It also cost as much as paying me to do the entire set of tests. OS installation/patching times and tested workloads were the same for all three tests.
YMMV.
"Nothing was broken, and it's been fixed." -- Jon Carroll
I AM collecting paychecks, and I use mySQL.
Of course... with InnoDB.
Stored procedures are bits of code (previously database-specific, but there's often support for java, perl and python now) that are stored in the database and can be executed by the server. There are at least three times when they're a Really Good Idea:
1. when you want to make sure something is always done despite lazy or even hostile users. You attach the SP to standard operations. E.g., it's common to set a 'last modified date' on every insert and update to a table. Depending on the DB you could even do fancy stuff like sending an alarm when certain actions are performed. You can even support things like updateable views this way.
2. when you want to really lock down your database. You can revoke 'insert' and 'update' rights from everyone and force them to go through a stored procedure to insert/update data. No risk of SQL injection, you can do arbitrarily complex input validation, etc.
3. when you want to perform complex aggregation operations. I had a problem where I needed to create a table that contained the rolling average of the 30 most recent values for each date. It was much faster (orders of magnitude faster) to do this in a SP than programmatically on a client.
Many DB also allow you to create 'user-defined functions' and 'user-defined types', albeit at a far higher integration cost. The first allows you to add C functions to the database server itself. This would allow you to perform fairly complex operations, but the need for it has dropped significantly with direct support for perl/java/etc., but you can still use them when you don't want to make it easy for somebody to alter your procedures.
The latter (with appropriate user-defined functions) allow you to create new first-order objects in your database. E.g., I've written a user-defined type that can hold digital certificates and wrapped encryption keys. Standard databases can always store them as blobs, but with the UDT I can provide an extremely powerful set of tools to the stored procedures and sql queries.
For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
Back when mysql was faster than postgres for simple selects, mysql proponents were jumping up and down about performance. And, in fact, a very narrow definition of performance, with only simple reads. No writes. Single CPU. One client. You can't step too far away from those parameters because then mysql performance starts to suck.
Now that there is a respectable benchmark showing postgresql to be faster, performance is suddenly not the issue.
Yep. It's right up there with mysql developers claiming that transactions are useless back when mysql didn't have them.
___
If you think big enough, you'll never have to do it.
I am more interested in how it handles an axe wielding maniac going postal in the datacenter, or what would happen of a mile wide tornado struck town, or what would happen if there's a bug, or if the project becomes wildly successful and does 10x or 100x the processing.
A Pirate and a Puritan look the same on a balance sheet.
8.1 automatically does table maintenance (vacuum, stats, etc.)- vacuuming.html#AUTOVACUUM
http://www.postgresql.org/docs/8.2/static/routine
You can tune it as to when to do such operations, or what max percent of normal I/O can be used for those tasks.
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
Not necessarily.
Oracle will happily run on the cheap crap wintel servers that Postgres will. It will even do so under the typical OS of choice for running Postgres on cheap crap wintel servers.
Oracle will also theoretically run on some 64 cpu monster (from Sun, IBM or SGI).
A Pirate and a Puritan look the same on a balance sheet.
I thought Sun paid Pervasive for their distribution in Solaris 10. Has this changed or by "core developers" do we really mean "people who commit the pkgs to the WOS" or "people who actually work at Sun who also just happen to work on Pg in their free time"?
PostgresSQL: 778.14
s 2006q4/jAppServer2004-20061121-00046.html
Oracle 10gR2: 6,812.79
Real close, I agree.. Postgres had a 40GB cache, Oracle used 21GB
See http://www.spec.org/osg/jAppServer2004/results/re
If you mod me down, I *will* introduce you to my sister!
That is all.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
I have heard from several different sources that postgres is much better than mysql etc, and all you have to do is to properly tune it. I have spent days converting databases, and trying to tune the parameters and queries. I have examined the query plans, and they seem to be pretty optimal to me, but still - I can't seem to beat MySQL.
So, why is it so hard to tune postgres? Would it be possible to make it autotune some parameters? Is there any outstanding tutorials out there somewhere were I can learn all this magic (yes, I have looked high and low but all I have found seem pretty basic)?
I have an intense dislike of Oracle. PL/SQL is a nasty language. And MSSQL and MySQL aren't even on the same level. MySQL is generally seen as one of the lower tier of "low-range database servers" while MSSQL is considered "mid-range". Oracle is considered "high-range", and it's ease of use is pretty much testament to that.
That being, the higher up the chain the harder to bloody configure and use.
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
Pretty much, it does.
Running on an Xbox versus an IPX, hmm. With no data, you assume the worst case scenario because any given variable is UNKNOWN. RAM, Processor, disk IO?
Correction, it certainly does.
Often wrong but never in doubt.
I am Jack9.
Everyone knows me.
I'd add
4. You're supporting multiple front ends (e.g., web and native apps) or providing an SQL API and want to enforce business/data model logic at a common point, but generic SQL constraints aren't enough.
John.
I get the impression that the core of it wasn't written by MS way back when, though. And it sure wasn't built by the Windows team.
It was written by Sybase, bought by MS and forked somewhere around Release 10. MS then ripped out the good parts (i.e. the ability to run on real operating systems) and rebranded it. Can you tell I come from a Sybase shop?
PostgreSQL setup: 8 cores, 16GB RAM.
Oracle setup: 64 cores, 512GB RAM.
- Sw Usr
Are you adequate?
I think he is reffering to larry's insane licensing polices, such as dual core = 2 cpu's
If you mod me down, I will become more powerful than you can imagine....
Ooops, sorry, 1133 is the comparable result.
More than 50% higher performance.
If you mod me down, I *will* introduce you to my sister!
The guidelines are pretty simple, but highly subjective.
Here are your options:
A. flat file
B. key-indexed (berkley db, dbm, gdbm)
C. feature/config light db (sqlite)
D. feature/config heavy db (postgrsql, oracle)
* Use A for configuration
* Use A if you have a small amount of data and can slurp it all into memory.
* Use B if you only access data via a known unique value and never compare records.
* Never use either A or B if the data is shared between processes unless file locking is okay.
For anything else, use C or D. If you've got a lot of familiarity with D, sometimes it's easier to use D than C, even if D seems like overkill. My recommendation is to get familiar with C and then worry about D once you understand the limitations of C. It'll help you appreciate what D gives you when you're ready for it.
*sigh* back to work...
And, my personal favourite, "Oh, you bought all of your servers under our old licensing plan? Well we just changed it so please give us more money."
Intelligent people use SQLite (or Berkeley DB) for apps that don't require a heavyweight DBMS.
Databases aren't just about performance. Type checking is a big reason why RDBMSs are more prevalent.
Type checking databases are more useful in the same way that a type-checked language is more useful than assembly. Assembly is very simple to program in, but all variables are just bytes. SQLite can't order by date, because dates are strings and don't order the same way. I'm not saying SQLite is bad (not at all, I think it's an interesting project), but simpler does not always mean better.
Social scientists are inspired by theories; scientists are humbled by facts.
Well, for people who consider prices I guess it isn't since the oracle system cost more (exclusive software cost), thought then the choice of systems might have been "optimized" aswell.
I'm with the parent in terms of when to choose what...
For anything "small" (and you can put your own definition to that) and Web-based, SQLite is my first choice. Works well, solid, runs on any platform, and there's only a single file to worry about when it comes time to backup/restore so development version control is a non-issue. In particular, I'd suggest that SQLite would be a fine choice for just about any small-scale DB-backed Web site out there (again, choosing your own definition for the word "small").
For bigger stuff, I'd choose Postgres over MySQL, but that's more a personal preference than anything else at this point. There's pros and cons to both - coming from an Oracle background, I find the mentality behind Postgres easier to work with, but I'll happily work with MySQL when the situation requires it.
Oracle, DB2, MS-SQL - you've got to use them when the situation demands it. They all work great, scale well and have excellent support, but you pay for it. I wouldn't even think about using them with a startup, but many big corps won't consider anything else.
SQLite has before insert triggers. Why doesn't MSSQL?
Try following the HTML link - http://www.spec.org/jAppServer2004/results/res2007 q3/jAppServer2004-20070606-00065.html
Is more indepth on the systems and tuning parameters used.
DB2 is used as (optional) storage engine on later versions of Lotus Domino, which makes it widely used in larger enterprises, not necessarily on large irons. http://www-142.ibm.com/software/sw-lotus/products/ product4.nsf/wdocs/nsfdb2
I think this sounds pretty obvious, probably has been done:
Have an open source suite that allows anybody to verify the test results. Let the experts tweak the configuration for each database. Provide free downloads to databases and/or easy install. This way entire community is tweaking the configurations for different servers and posting back results.
Not only this would be a more or less "proper" benchmark, but also the secrets of db tuning, if there are any, will be revealed to public. Vendors get to brag if they are in top3, rest get a nice motivation to improve.
Just my $2,000,000
Lone Gunmen crew.
Sadly has a certain amount of truth to it. I've been trying to convince my company to use PostgreSQL for years. Yet I'm still stuck having to use MSSQL. The it's free arguement doesn't work very well. MSSQL is free for developers when you already have the rather expensive MSDN subscription, and you can package MSDE with a runtime for free.
Sure it has a pile of restrictions and performance sucks but management doesn't see it that way. Nobody every got fired for buying IBM^W Microsoft...
Don't blame me - this
This is the best post I have read in ages. I am old school. I like DB's and when doing requirements gathering am already thinking in terms of tables and relational schema. In a really well designed relational schema it is possible to do very complex operations without ever resorting to plsql/transact etc. simply using plain sql (insert/update etc). I hate cursors/recordsets. As soon as I see these constructs in the db I generally find a VB/java programmer messing where he is not wanted.
PS: Oracle is not Fortran - PL/SQL is derived from Ada and therefore shares much of the Pascal/Modula feel.
http://blogs.sun.com/tomdaly/
Why UNIX?
At first, many situations simply don't need a complex database schema. Think of a contact or email list in a simple web application. Such applications often need only one table. And therefore MySQL suffices. I would not call that bad design. Moreover, don't underestimate MySQL! Recent versions are not so much behind PostgreSQL and they perform reasonably well.
It's true that the SQL knowledge of many developers is quite limited. But for this reason, and, more important, due to the bad compatibility of SQL and OO design, ORM libraries such as Java Hibernate or Python's SQL Alchemy were developed that handle the SQL part. In this case we still need a decent database schema, but it may be described in the framework, which looks more familiar to developers. However, don't underestimate the generated SQL code! Complicated SQL statements may be generated and ORM libaries may make use of referential integrity, check constraints etc. and many other features a mature database has to offer. If designed right, a ORM-based design will have a decent database schema and will make use of many "database goodies".
ORM-generated SQL code may therefore stress the database in many ways - and therefore we still need good databases - regardless of the developer's SQL knowledge.
One pitfall we always seem to step into is that in case a good, new technology is invented, we think of it as a solution to all our problems. SQL is no different here, in the past we were so proud of designing normalized schemas that we forgot about the scalability. Such schemas - although perfect - are often today the reason for performance problems. Sadly, we often overreact and tend to throw away old technology and follow the new hype. And that's what we can see today, such as thoughtlessly vandalizing database schemas due to performance problems. Nevertheless, it is often appropriate to denormalize a database schema to improve the performance. If done right, I would not call this "vandalizing".
Hardware concerns are valid. But more importantly, SPEC jAppServer is not a database benchmark. It's a Java App Server benchmark with a database component. To be fair, you have to change only the database, not the app server, hardware, and everything else too. Not to say that the conclusion is wrong, but it's very weird to say anything about database performance in this circumstance.
In our tests, which are database-only and data-warehouse focused, PG seems to be 3x-5x slower than Oracle on the exact same box, as it does not do data compression and is not as CPU-efficient. Which is quite respectable given the price difference, but may be a big enough gap to keep people who have money to spend coming back to commercial databases.
Obviously the vendor puts together the best submission they can for their own product. Oracle did the same for theirs.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
Get real, that does happen. We have a product, which is a set of surveys patients or their caregivers can take. Each survey event represents a survey taken by a patient or caregiver regarding one particular patient. Each survey event results in one associated report getting printed. Survey event completion also generates scores, and score values can trigger up to 2 additional reports. The additional report behavior is governed by a score to report mapping table. Also, each survey can have alternate reports specified based upon site and product settings.
When an administrator logs into the system, an 11 table join is done to generate the list of survey_reports. For sites with 3000 patients and 4500 surveys, it takes Postgres less than a second on old hardware.