New Linux TPC-H Record Set
prostoalex writes: "A new TPC-H world record for performance and scalability of database software on the Linux platform has been set. The winner - Oracle 10g running on a four-node Lenovo Cluster Server DeepComp 6800, each node with four Intel Itanium 2 1.3 GHz processors. Oracle also emphasizes that it delivers 3.5 times the performance of a similar IBM DB2 benchmark. TPC-H benchmarks are available at the TPC Web site."
Sun is behind Windows by such a huge margin? I thought Solaris sets standards for stability.
New year Resolution: Don't change sig this year
Now SCO's gonna want $2800 for a license.
"I might have made a tactical error in not going to a physician for 20 years." -- Warren Zevon
Linux is clearly being taken seriously. It's pounding the competition in the server space, and it's beginning to make serious inroads to the desktop.
Desktop Linux stories carry some interest to me, but on a server? That's old hat, old news, and very much humdrum.
This article really should be more about the cluster of Itanium chips, which actually determine the speed of the system, rather than "it runs Linux!" which in this case is largely irrelevant.
Linux is as responsible for the success of this as a dog is responsible for the bus that hit it. Similar results could easily be obtained, I'm sure, with any number of BSD variants, or other *nixes compiled to run on Itanium.
This would have been news 3 years ago, but today? Bah!
I have no problem with your religion until you decide it's reason to deprive others of the truth.
They are probably comparing this system to some old IBM benchmark. They didn't say in the press release, so I'd assume the worst.
l .asp?id=103073001
IBM appears to dominate the TPC-H results at the top & bottom, with Oracle owning the middle.
The only really interesting benchmark out there at the moment is the IBM DB2 ICE configuration, in which they spread DB2 across dozens of low-end dual-CPU AMD Opteron servers. DB2 (and Informix, god bless them) partition differently than Oracle - more like a database implementation of Beowulf (which they've been doing for 8+ years). Way cheaper than anything from Oracle, and you can toss up to 1000 servers into it. Their benchmark is in the 300 GB range, not 1000 - but it'll scale way beyond Oracle, and is cheap for that kind of power: http://www.tpc.org/tpch/results/tpch_result_detai
Makes me wonder how many PCs I've got lying around the house...
There are basically three types of clusters:
1) Shared nothing: each computer is connected to the others only via a simple IP network. No disks are shared. Each machine serves part of the data. These clusters don't work reliably when you have to do aggregations. E.g. if the data is spread across machines and one machine fails, a query that does "avg()" would fail, since one of the machines is not available. Most enterprise apps cannot work in this config without degradation. E.g. an IBM study showed that a 2-node cluster is slower and less reliable than a 1-node system when running SAP.
IBM (on Windows and Unix) and MS use this type of clustering (also called the federated database approach or shared-nothing approach).
2) Shared disk between two computers: there are multiple machines and multiple disks. Each disk is connected to at least two computers. If one of the computers fails, the other takes over. No mainstream database uses this mode, but it is used by HP NonStop. Still, each machine serves up part of the data, and hence standard enterprise apps like SAP cannot take advantage of clustering without a lot of modification.
3) Shared everything: each disk is connected to all the machines in the cluster. Any number of machines can fail and the system keeps running as long as at least one machine is up. This is used by Oracle. All the machines see all the data. Standard apps like SAP can be run in this kind of config with minor or no modification. This method is also used by IBM in their mainframe database (which outsells their Windows and Unix database by a huge margin). Most enterprise apps are deployed in this type of cluster configuration.
Approach 1 is simpler from a hardware point of view. Also, for database kernel writers, it is the easiest to implement. However, the user needs to break up the data judiciously and spread it across machines. Also, adding or removing a node requires re-partitioning the data. Mostly only custom apps that are fully aware of your partitioning will be able to take advantage.
It is also easy to make it scale for a simple custom app, and so most TPC-C benchmarks are published in this configuration.
Approach 3 requires a special shared disk system. The database implementation is very complex. The kernel writers have to worry about two computers simultaneously accessing disks or overwriting each other's data, etc. This is what Oracle is pushing across all platforms and IBM is pushing for its mainframes.
Approach 2 is similar to approach 1 except that it adds redundancy and hence is more reliable.
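To make the avg() point about shared-nothing setups concrete, here is a minimal Python sketch. Everything in it (the Node class, the modulo partitioning, the function names) is made up for illustration and is not how DB2, Oracle, or any real database does it. It only shows that each node can answer for its own slice, so one dead node breaks a cluster-wide aggregate, and that changing the node count moves most rows under a naive scheme.

```python
from dataclasses import dataclass, field


class NodeDown(Exception):
    """Raised when a partition's only owner is unreachable."""


@dataclass
class Node:
    name: str
    rows: list = field(default_factory=list)
    up: bool = True

    def partial_avg(self):
        # Each node can only report (sum, count) for its own slice of the data.
        if not self.up:
            raise NodeDown(self.name)
        return sum(self.rows), len(self.rows)


def partition(values, n_nodes):
    """Spread integer values across nodes by modulo (hypothetical scheme)."""
    nodes = [Node(f"node{i}") for i in range(n_nodes)]
    for v in values:
        nodes[v % n_nodes].rows.append(v)
    return nodes


def cluster_avg(nodes):
    """Combine per-node (sum, count) pairs; fails if any single node is down."""
    total, count = 0, 0
    for node in nodes:
        s, c = node.partial_avg()
        total += s
        count += c
    return total / count


def moved_on_resize(values, old_n, new_n):
    """Count values whose owning node changes when the node count changes."""
    return sum(1 for v in values if v % old_n != v % new_n)
```

With the values 1..100 spread over 4 nodes, cluster_avg works until any one node is marked down, at which point it raises NodeDown. Going from 4 to 5 nodes under this naive modulo scheme reassigns 80 of the keys 0..99, which is the re-partitioning cost described above.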
Oracle walks a dangerous path running on linux. Sure, the money saved by using linux/x86/oracle vs solaris/SPARC/oracle is significant, but linux can be a gateway drug to other Open Source/FREE software. Once PHBs realize that the OS is a commodity, the next step is realizing the DB is also a commodity. PostgreSQL or MySQL isn't suitable for enterprise-level work, but it's more than suitable for small internal projects that used to mean extra Oracle seats.
Do you even lift?
These aren't the 'roids you're looking for.
Shouldn't that read "New TPC-H Record Set Using Oracle?"
The article didn't give many details, but how much of this performance is directly attributable to Linux (specifically Red Hat AS3)? What was the OS of the system it beat? Could that also have been Linux? How much of the performance can be attributed to the (suspiciously un-Beowulf) Lenovo cluster?
From what I know of benchmarks, the numbers given reflect real-world performance, to within one order of magnitude.
At first, I thought, It's just a press release, big deal... But wait, they used Linux, so it must be another straw on the back of the camel known as the Closed Source Business Model. But wait, it's running Oracle, so it must therefore be evil. Aieeeeeeeeeeeeee!!!!
"Myaaa, Did you get the memo? We're now using the new cover on all TPC Reports. If you could just do that, that would be great. Thanks."
Anyone else read that as THC??
He painted a unicorn in outer space. I'm askin' ya, what's it breathin'?
Holy crap, this story is useless! Go to the TPC-H site and actually look at the results; it really is nothing even remotely impressive.
- It's NOT the fastest TPC-H result, it's the fastest LINUX TPC-H 1000GB result. Actually it's the ONLY Linux TPC-H 1000GB result. 5th of 8 overall.
- It's not even offering very good bang for your buck, coming in 5th of 8 for Price/QphH ($156 US according to today's currency exchange). The only systems it managed to beat are two outdated systems (both from HP) and an old price for a Fujitsu system, quoted in euro (the same system offers the same performance but a lower price on a newer entry quoted in US $).
In short, if anything this suggests that Linux is a BAD choice for this work! The performance isn't there and the cost is high.
Where things get REALLY bad though is the claim that this is "3.5 times faster" than a system running IBM's DB2. This is just 100% pure bullshit! The new Linux/Oracle system runs 1.3GHz Itanium2 processors and Oracle 10g. The HP/Windows/DB2 system runs 900MHz Xeon processors and runs DB2 7.2 (8.1 is the current version). What's more, the Oracle/Linux system isn't even 3.5 times faster, it's just 3.5 times faster PER PROCESSOR! Great, your brand-spanking new Itanium2 is 3.5 times faster than four-year-old 900MHz Xeon chips. Whoopie!
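The whole-system vs per-processor distinction is easy to see with a few lines of Python. The numbers below are placeholders picked only so the ratios come out round, NOT the published TPC-H figures. The only value taken from the story is the 16 processors (four nodes with four each); the function name and the 112-CPU comparison system are hypothetical.

```python
def compare(qphh_a, cpus_a, qphh_b, cpus_b):
    """Return (whole-system ratio, per-processor ratio) of system A over system B."""
    whole_system = qphh_a / qphh_b
    per_processor = (qphh_a / cpus_a) / (qphh_b / cpus_b)
    return whole_system, per_processor


# Placeholder inputs for illustration only, not real benchmark results.
system_ratio, cpu_ratio = compare(qphh_a=10_000, cpus_a=16,
                                  qphh_b=20_000, cpus_b=112)
print(f"whole system: {system_ratio:.2f}x, per processor: {cpu_ratio:.2f}x")
```

With these made-up inputs a system comes out 3.5 times faster per processor while delivering half the total throughput, which is why a headline ratio needs its denominator stated.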
Note: if you do want to see impressive Linux results, look at what IBM is doing with their Opteron cluster and DB2 running under SuSE Linux. They turned in the top results in the two TPC-H tests they entered (100GB and 300GB).
SGI have built the largest Linux machine (512 processor machine at NASA) and managed to destroy the previous memory bandwidth record held by NEC, by achieving 1 terabyte/s.
I was wondering who bought all the Itanium 2s....
The OS is largely irrelevant to speed tests which never swap or do I/O, like generating graphics. But servers show weaknesses in an OS like nothing else, since they really hammer context switches and I/O.
This IS significant. It shows the suits that Linux can handle swap intensive tasks, even tho they don't know that is what it shows.
Infuriate left and right
Truthfully, I'm not a Sun fanboy (I just think they make cool shiny toys that cost a lot). Despite their corporate issues of late, they can still flex when it comes time to move things. Given any of those systems built into a decent cluster (note that no pure Sun solutions were clustered), I think something worthwhile might show up.
Even if you disagree with me on those points though, you do have to agree that the /. article itself just sucked.
SIG: HUP
As you can see here, the DB2 systems they seem to be comparing themselves with scored more than double what this one did.
I would expect a larger system to score lower on a per-processor basis just from scaling issues, even if the processors were identical.
While the 3.5x ratio is impressive, the manner of its announcement is very misleading.
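One way to picture the scaling effect mentioned above is Amdahl's law. I am bringing it in purely as an illustration; nothing in the benchmark disclosure says the systems follow this model. The 95% parallel fraction is an arbitrary assumption and the function names are hypothetical.

```python
def amdahl_speedup(n_cpus, parallel_fraction=0.95):
    """Amdahl's law: speedup over one CPU when only part of the work parallelizes."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cpus)


def per_cpu_efficiency(n_cpus, parallel_fraction=0.95):
    """Speedup divided by CPU count; 1.0 would mean perfect scaling."""
    return amdahl_speedup(n_cpus, parallel_fraction) / n_cpus


# Assumed CPU counts, chosen only to show the trend.
for n in (1, 16, 64, 128):
    print(n, round(per_cpu_efficiency(n), 3))
```

Under this model per-processor efficiency only goes down as processors are added, so identical CPUs in a bigger box would post a lower per-processor score.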
"The worst tyrannies were the ones where a governance required its own logic on every embedded node." - Vernor Vinge
and bitch about how the MS solution is better, here is a little secret. If you look at the current #3 from HP http://www.tpc.org/tpcc/results/tpcc_result_detail.asp?id=103082701,
you see it says COM+. Well, that's not the whole truth. If you look at the actual source code, you will see references to Tuxedo. It's a C++ port of Tuxedo. The original TUX/TUXEDO was created by AT&T http://www.middleware.net/tuxedo/articles/tuxedo_history.html. Microsoft isn't stupid, but it's hardly surprising. It doesn't make any sense for anyone to re-invent transaction management, but it is lame that Microsoft tries to pass it off as their innovative technology. I don't know if MS is the one who wrote the COM+ scheduler for the clients, but that is the reason for the good results the last few years. I'm guessing HP is the one who wrote the COM+ port of Tuxedo, since they have lots of experience with Unix and MS doesn't. Don't take my word, read the full disclosure yourself.
Sorry, the only reason that this resulted in a new record was because of the artificially controlled exchange rate between the Yuan and the dollar.
Sorry, try again.
[RIAA] says its concern is artists. That's true, in just the sense that a cattle rancher is concerned about its cattle.
"...but linux can be a gateway drug to other Open Source/FREE software"
For now, I don't think this is a (serious) risk. Oracle has been distributing Apache now for a number of years, for example. If you know anything about the history of Oracle, the success it has achieved is more about sales and marketing than about having a superior (or cheaper) product (remember Ingres??). If you're a CTO for a Fortune 500, are you going to move your corporate databases to MySQL? I don't think so. You are going to stick with the database vendor that's running corporate databases for most of the rest of the Fortune 500. If you're the kind of company that has a budget so tight that you NEED to run MySQL or Postgres for core systems, Oracle doesn't want or need you. Maybe the best weapon Oracle has against MySQL and Postgres is the fact that you are able to download the complete version of Oracle from OTN. There are many unlicenced Oracle implementations around the world as a result of this free download facility.
I'm all for Postgres and MySQL pushing into the enterprise world, but MSSQL should be the first target. If Oracle are prepared to put real money into backing Linux, let's support them...