New Linux TPC-H Record Set
prostoalex writes: "A new TPC-H world record for performance and scalability of database software on the Linux platform has been set. The winner: Oracle 10g running on a four-node Lenovo DeepComp 6800 cluster server, each node with four Intel Itanium 2 1.3 GHz processors. Oracle also emphasizes that this is 3.5 times the performance of a similar IBM DB2 benchmark. TPC-H benchmarks are available at the TPC Web site."
Sun is behind windows by such a huge margin? I thought solaris sets standards for stability.
New year Resolution: Don't change sig this year
Now SCO's gonna want $2800 for a license.
"I might have made a tactical error in not going to a physician for 20 years." -- Warren Zevon
Linux is clearly being taken seriously. It's pounding the competition in the server space, and it's beginning to make serious inroads to the desktop.
Desktop Linux stories carry some interest to me, but on a server? That's old hat, old news, and very much humdrum.
This article really should be more about the cluster of Itanium chips, which actually determine the speed of the system, rather than "it runs Linux!" which in this case is largely irrelevant.
Linux is as responsible for the success of this as a dog is responsible for the bus that hit it. Similar results could easily be obtained, I'm sure, with any number of BSD variants, or other *nixes compiled to run on Itanium.
This would have been news 3 years ago, but today? Bah!
I have no problem with your religion until you decide it's reason to deprive others of the truth.
But it is still number 5 on the list. IBM DB2 on a Windows cluster is number 4, and if you are going to go with Oracle, it looks like you really want to be running it on Solaris if performance is your main objective.
"She's a West Texas girl, just like me" - G.W Bush Iraqis
Is it because I'm high... ... or, why doesn't it strike me as obvious that the "1,000 GB Results" is the one that matters?
Maybe it's just the link that is confusing/wrong...
zuhl
are probably comparing this system to some old ibm benchmark. They didn't say in the press release, so I'd assume the worst.
IBM appears to dominate the TPC-Hs at the top & bottom, with Oracle owning it in the middle.
The only really interesting benchmark out there at the moment is the IBM DB2 ICE configuration, in which they spread DB2 across dozens of low-end AMD Opteron dual-CPU servers. DB2 (and Informix, god bless them) partition differently than Oracle, more like a database implementation of Beowulf (which they've been doing for 8+ years). Way cheaper than anything from Oracle, and you can toss up to 1000 servers into it. Their benchmark is in the 300 GB range, not 1000, but it'll scale way beyond Oracle, and is cheap for that kind of power: http://www.tpc.org/tpch/results/tpch_result_detail.asp?id=103073001
Makes me wonder how many pcs I've got laying around the house...
There are basically three types of clusters:
1) Shared nothing: each computer is connected to the others only via a simple IP network. No disks are shared, and each machine serves part of the data. These clusters don't work reliably when you have to do aggregations. E.g. if the data is spread across machines and you try to do "avg()" while one of the machines has failed, the query fails, since that machine's data is not available. Most enterprise apps cannot work in this config without degradation. E.g. an IBM study showed that a 2-node cluster is slower and less reliable than a 1-node system when running SAP.
IBM (on Windows and Unix) and MS use this type of clustering (also called the federated database approach or the shared nothing approach).
2) Shared disk between two computers: in this case, there are multiple machines and multiple disks. Each disk is connected to at least two computers. If one of the computers fails, the other takes over. No mainstream database uses this mode, but it is used by HP NonStop. Still, each machine serves up part of the data, and hence standard enterprise apps like SAP cannot take advantage of clustering without a lot of modification.
3) Shared everything: each disk is connected to all the machines in the cluster. Any number of machines can fail and the system keeps running as long as at least one machine is up. This is used by Oracle. All the machines see all the data. Standard apps like SAP can be run in this kind of config with minor modification or none at all. This method is also used by IBM in their mainframe database (which outsells their Windows and Unix database by a huge margin). Most enterprise apps are deployed in this type of cluster configuration.
Approach 1 is simpler from a hardware point of view. It is also the easiest for database kernel writers to implement. However, the user needs to break up the data judiciously and spread it across machines. Adding or removing a node also requires re-partitioning the data. Mostly only custom apps which are fully aware of your partitioning will be able to take advantage.
It is also easy to make it scale for a simple custom app, so most TPC-C benchmarks are published in this configuration.
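To put a rough number on the re-partitioning cost mentioned above, here is a minimal sketch. It assumes the simplest possible scheme, where a row lives on node `key % node_count`; that scheme and the function names are my own illustration, not how DB2 or any other product actually places rows.

```python
def node_for(key: int, node_count: int) -> int:
    """Owner of a row under plain modulo partitioning (an assumed scheme)."""
    return key % node_count


def moved_fraction(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of rows whose owning node changes when the node count changes."""
    keys = list(keys)
    moved = sum(
        1 for k in keys if node_for(k, old_nodes) != node_for(k, new_nodes)
    )
    return moved / len(keys)


if __name__ == "__main__":
    # Grow a hypothetical 4-node shared-nothing cluster to 5 nodes.
    fraction = moved_fraction(range(1000), 4, 5)
    print(f"{fraction:.0%} of rows change owner going from 4 to 5 nodes")
```

With naive modulo placement most rows have to move when a node is added, which is why smarter placement schemes exist. The point is only that the data layout is tied to the node count in a shared nothing setup.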
Approach 3 requires a special shared disk system, and the database implementation is very complex. The kernel writers have to worry about two computers simultaneously accessing disks, overwriting each other's data, and so on. This is what Oracle is pushing across all platforms and IBM is pushing for its mainframes.
Approach 2 is similar to approach 1 except that it adds redundancy and hence is more reliable.
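The avg() failure case from approach 1 can be sketched in a few lines. This is a toy model, not any vendor's implementation: `NodeDown`, the node names and the in-memory "partitions" are all made up for illustration, and real systems add redundancy that this ignores.

```python
class NodeDown(Exception):
    """Raised when the node that owns a partition cannot be reached."""


def shared_nothing_avg(partitions, up_nodes):
    """Average over data that is split across nodes; every owner must be up."""
    total, count = 0, 0
    for node, rows in partitions.items():
        if node not in up_nodes:
            # Nobody else can read this node's disks, so the query fails.
            raise NodeDown(node)
        total += sum(rows)
        count += len(rows)
    return total / count


def shared_everything_avg(all_rows, up_nodes):
    """Average when every node sees every disk; any surviving node can answer."""
    if not up_nodes:
        raise NodeDown("no nodes left")
    return sum(all_rows) / len(all_rows)


# Hypothetical two-node cluster with two rows per node.
partitions = {"node1": [10, 20], "node2": [30, 40]}
all_rows = [r for rows in partitions.values() for r in rows]

if __name__ == "__main__":
    print(shared_everything_avg(all_rows, {"node1"}))
    try:
        shared_nothing_avg(partitions, {"node1"})
    except NodeDown as exc:
        print(f"query failed, {exc} is down")
```

With node2 down, the shared everything version still answers from node1, while the shared nothing version has no way to reach node2's rows.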
Oracle walks a dangerous path running on Linux. Sure, the money saved by using Linux/x86/Oracle vs Solaris/SPARC/Oracle is significant, but Linux can be a gateway drug to other Open Source/FREE software. Once PHBs realize that the OS is a commodity, the next step is realizing the DB is also a commodity. PostgreSQL or MySQL isn't suitable for enterprise-level work, but it's more than suitable for small internal projects that used to mean extra Oracle seats.
Do you even lift?
These aren't the 'roids you're looking for.
Shouldn't that read "New TPC-H Record Set Using Oracle?"
The article didn't give much detail, but how much of this performance is directly attributable to Linux (specifically Red Hat AS3)? What was the OS of the system it beat? Could that also have been Linux? How much of the performance can be attributed to the (suspiciously un-Beowulf) Lenovo cluster?
From what I know of benchmarks, the numbers given reflect real-world performance, to within one order of magnitude.
At first, I thought, It's just a press release, big deal... But wait, they used Linux, so it must be another straw on the back of the camel known as the Closed Source Business Model. But wait, it's running Oracle, so it must therefore be evil. Aieeeeeeeeeeeeee!!!!
"Myaaa, Did you get the memo? We're now using the new cover on all TPC Reports. If you could just do that, that would be great. Thanks."
Linux is king of the hill. Bravo.
Looks like linux is moving up in the world.
I'm still impressed by teradata.
But what is the MP-RAS 3.02.00 OS that the 3,000 GB results on Teradata ran on?
Well.. maybe. Or Maybe not. But Definitely not sort of.
Linux is almost double that of the MS solutions or Solaris solutions when you compare price per QphH. Anyway, this is for data warehousing; the real test is TPC-C, which is the relational database test.
Have you ever been to a turkish prison?
Anyone else read that as THC??
He painted a unicorn in outer space. I'm askin' ya, what's it breathin'?
Holy crap this story is useless! Go to the TPC-H site and actually look at the results, it really is nothing even remotely impressive.
- It's NOT the fastest TPC-H result, it's the fastest LINUX TPC-H 1000GB result. Actually it's the ONLY Linux TPC-H 1000GB result. 5th of 8 overall.
- It's not even offering very good bang for your buck, coming in 5th of 8 for Price/QphH ($156 US according to today's currency exchange). The only systems it managed to beat are two outdated systems (both from HP) and an old price for a Fujitsu system, quoted in euro (the same system offers the same performance but a lower price on a newer entry quoted in US $).
In short, if anything this suggests that Linux is a BAD choice for this work! The performance isn't there and the cost is high.
Where things get REALLY bad though is the claim that this is "3.5 times faster" than a system running IBM's DB2. This is just 100% pure bullshit! The new Linux/Oracle system runs 1.3GHz Itanium2 processors and Oracle 10g. The HP/Windows/DB2 system runs 900MHz Xeon processors and runs DB2 7.2 (8.1 is current version). What's more, the Oracle/Linux system isn't even 3.5 times faster, it's just 3.5 times faster PER PROCESSOR! Great, your brand-spanking new Itanium2 is 3.5 times faster than four year old Xeon 900MHz chips. Whoopie!
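To see how a "3.5 times faster per processor" claim can coexist with a lower overall score, here is the arithmetic as a small sketch. The QphH figures and the 128-processor count are made-up placeholders chosen to make the ratio come out to 3.5; only the 16 processors (four nodes, four CPUs each) come from the story. Look up the real numbers on the TPC site before quoting anything.

```python
def per_processor(qphh: float, processors: int) -> float:
    """Normalize a composite QphH score by the number of processors."""
    return qphh / processors


# HYPOTHETICAL placeholder figures, not actual TPC-H results.
new_qphh, new_cpus = 7000.0, 16      # 4 nodes x 4 CPUs, per the story
old_qphh, old_cpus = 16000.0, 128    # assumed older, much larger cluster

per_cpu_ratio = per_processor(new_qphh, new_cpus) / per_processor(old_qphh, old_cpus)
total_ratio = new_qphh / old_qphh

if __name__ == "__main__":
    print(f"{per_cpu_ratio:.1f}x per processor, {total_ratio:.2f}x overall")
```

With these placeholder inputs the newer system wins per processor while losing on the overall score, so which ratio a press release quotes matters a lot.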
Note: if you do want to see impressive Linux results, look at what IBM is doing with their Opteron cluster and DB2 running under SuSE Linux. They turned in the top results in the two TPC-H tests they entered (100GB and 300GB).
SGI have built the largest Linux machine (512 processor machine at NASA) and managed to destroy the previous memory bandwidth record held by NEC, by achieving 1 terabyte/s.
:P
Who else read that as "New Linux THC" and got overly excited?
The setup is impressive, considering it's a 4x4 cluster of 1.3 GHz CPUs. It's not earth shattering, but it does make Linux look like a serious contender for large clustered deployments. Looking at the detail, the setup isn't really optimal, especially if you compare it to clustered setups by HP and NEC.
So looking at the chart, the only thing you can conclude for Linux is that it reigns supreme in the 100 GB section, with double the score of the runner-up. The middle class is really a money choice, and since the results don't all use one currency (how much is a yuan anyway?), it is hard to compare. But if you wanna go to the top, neither Windows nor Linux will do.
So to conclude from this that Linux has won is completely and totally wrong. Sure, it owns the light database side. This indeed is hardly news. But then this story wasn't about Linux, was it?
MMO Quests are like orgasms:
You may solo them, I prefer them in a group.
heh? Don't want to compete against open software? I guess the TPC/$ would be, err, off the chart?
I was wondering who bought all the Itanium 2s....
So many zealots from both the PostgreSQL and MySQL sides are publishing their thoughts on which OSS DBMS is faster, but I do not see any high-end test results from them. Why don't TPC results include anything from OSS DBMSs? Are there no results, does nobody care, or can OSS DBMSs not have TPC results for some political reason? Can someone explain?
Less is more !
Let's give some credit to the underlying architecture.
Bravo
The OS is largely irrelevant to speed tests which never swap or do I/O, like generating graphics. But servers show weaknesses in an OS like nothing else, since they really hammer context switches and I/O.
This IS significant. It shows the suits that Linux can handle swap intensive tasks, even tho they don't know that is what it shows.
Infuriate left and right
Truthfully, I'm not a Sun fanboy (I just think they make cool shiny toys that cost a lot). Despite their corporate issues of late, they can still flex when it comes time to move things. Given any of those systems built into a decent cluster (note that no pure Sun solutions were clustered), I think something worthwhile might show up.
Even if you disagree with me on those points though, you do have to agree that the /. article itself just sucked.
SIG: HUP
As you can see here, the DB2 systems they seem to be comparing themselves with scored more than double what this one did.
I would expect a larger system to score lower on a per-processor basis just from scaling issues, even if the processors were identical.
While the 3.5x ratio is impressive, the manner of its announcement is very misleading.
"The worst tyrannies were the ones where a governance required its own logic on every embedded node." - Vernor Vinge
For other odd results, Oracle doesn't seem to go up as high as DB2 and Teradata. I thought Oracle was supposed to be the heavy hitter?
Anyone got a clue as to how the opensource databases would fare in these tests?
MMO Quests are like orgasms:
You may solo them, I prefer them in a group.
and bitch about how the MS solution is better, here is a little secret. If you look at the current #3 from HP, http://www.tpc.org/tpcc/results/tpcc_result_detail.asp?id=103082701,
you see it says COM+. Well, that's not the whole truth. If you look at the actual source code, you will see references to Tuxedo. It's a C++ port of Tuxedo. The original TUX/TUXEDO was created by AT&T: http://www.middleware.net/tuxedo/articles/tuxedo_history.html. Microsoft isn't stupid, but it's hardly surprising. It doesn't make any sense for anyone to re-invent transaction management, but it is lame that Microsoft tries to pass it off as their innovative technology. I don't know if MS is the one who wrote the COM+ scheduler for the clients, but that is the reason for the good results the last few years. I'm guessing HP is the one who wrote the COM+ port of Tuxedo, since they have lots of experience with Unix and MS doesn't. Don't take my word for it, read the full disclosure yourself.
But - db2 (and I think Oracle) can only access around 1 gbyte of memory on a 32-bit linux OS - without reliance upon extended memory functionality.
So - even if the Xeon is fast, the impact of 1 gbyte of memory per CPU - vs 8+ - more than makes up for it.
I'm not even sure if MySQL is even capable of running these tests, what with needing transactions and all. Do they have that yet?
autopr0n is like, down and stuff.
mmmm.... no.
No cheer leading from me...
Seriously, isn't Oracle involved in the Great Firewall of China?
What are they storing, a list of all the sites that AREN'T allowed to be viewed?
A list of the 1 billion hotmail addresses used by mainland Chinese?
It is nice that they've pushed the tech to a new level, but you've got to think...
http://jesus.everdense.com/
You left out the military type of cluster.
Sorry, the only reason that this resulted in a new record was because of the artificially controlled exchange rate between the Yuan and the dollar.
Sorry, try again.
[RIAA] says its concern is artists. That's true, in just the sense that a cattle rancher is concerned about its cattle.
...I really care about database app benchmarks...
"The TPC Benchmark(TM)H (TPC-H) is a decision support benchmark." i.e. for management accountants.
"The TPC-C benchmark continues to be a popular yardstick for comparing OLTP performance on various hardware and software configurations." i.e. for me to get cash from an ATM
There's only one result (http://www.tpc.org/tpcc/results/tpcc_result_detail.asp?id=103090501) for TPC-C, which looks OK but not stunning. The Full Disclosure Report (http://www.tpc.org/results/FDR/TPCC/HP%20Integrity%20rx5670%20Linux%20FDR.pdf) shows horrendous maximum response times. This would kill a real system.
Linux is good, but 2.6 will be better!
"You fell victim to one of the classic blunders. The most famous is "Never get involved in a land war in Asia." But only slightly less well known is this: "Never go in against a Sicilian when death is on the line." Ahahahahaha! Ahahahaha! Ahahaha, (choke, die). Vizzini - The Princess Bride in a battle of wits with the man in black (Westley). (Vizzini died of the iocaine powder laced drink).
Long ago in another world when I worked for a db vendor, we looked at publishing TPC benchmarks for our product, and it would have cost us 2x as much as our annual sales in bribes to do so. There's no way it's worth that much money to anyone to publish almost meaningless benchmark numbers for PostgreSQL or MySQL.
"...but linux can be a gateway drug to other Open Source/FREE software"
For now, I don't think this is a (serious) risk. Oracle has been distributing Apache for a number of years now, for example. If you know anything about the history of Oracle, the success it has achieved is more about sales and marketing than about having a superior (or cheaper) product (remember Ingres??). If you're a CTO for a Fortune 500, are you going to move your corporate databases to MySQL? I don't think so. You are going to stick with the database vendor that's running corporate databases for most of the rest of the Fortune 500. If you're the kind of company that has a budget so tight that you NEED to run MySQL or Postgres for core systems, Oracle doesn't want or need you. Maybe the best weapon Oracle has against MySQL and Postgres is the fact that you are able to download the complete version of Oracle from OTN. There are many unlicenced Oracle implementations around the world as a result of this free download facility.
I'm all for Postgres and MySQL pushing into the enterprise world, but MSSQL should be the first target. If Oracle are prepared to put real money into backing Linux, let's support them...
No experience with Unix?
Hmm they even wrote one years ago. 1980 to be exact. Some of us even remember our xenix floppies.
How about Xenix, which, it could be argued quite well, morphed into SCO?
MS has quite a bit of experience with Unix, do not fool yourself. They even ported IE 4.0 over way back in the day.
Though MS has some evil qualities, do not overlook the fact that they have many smart people working there. And they are not stupid.
Puto
The Revolution Will Not Be Televised
So let me get this straight. NASA has installed a big new Linux-based supercomputer, consuming only 5.2kW of power per rack, that has broken previous transaction processing benchmarks running Oracle 10g.
-- Ed Avis ed@membled.com
True, but it's not a bad sign. It seems a number of those benchmarks were made with outdated processors. I imagine at some point someone is going to try to get a benchmark on a Win03 server that runs on Opteron or Itanium and Oracle. Then we can sorta compare the relative strengths of the OSes running the benchmark.
Well.. maybe. Or Maybe not. But Definitely not sort of.