Slashdot Mirror


New Linux TPC-H Record Set

prostoalex writes: "A new TPC-H world record for performance and scalability of database software on the Linux platform has been set. The winner: Oracle 10g running on a four-node Lenovo DeepComp 6800 cluster server, each node with four 1.3 GHz Intel Itanium 2 processors. Oracle also emphasizes that this is 3.5 times the performance of a similar IBM DB2 benchmark. TPC-H benchmarks are available at the TPC Web site."

24 of 130 comments

  1. Sun is 9th? by civilengineer · · Score: 2, Interesting

    Sun is behind Windows by such a huge margin? I thought Solaris set the standard for stability.

    --

    New year Resolution: Don't change sig this year
  2. Great... by RiffRafff · · Score: 4, Funny

    Now SCO's gonna want $2800 for a license.

    --
    "I might have made a tactical error in not going to a physician for 20 years." -- Warren Zevon
    1. Re:Great... by Soko · · Score: 3, Funny

      OMG...Imagine if SCO is stupid enough ^W^W^W has the cojones to take on Oracle too. O_O

      "Haha, you fool! You fell victim to one of the classic blunders. The most famous is never get involved in a litigation war with a company who has more lawyers than Linden, Utah has farmers. But only slightly less well-known is this: Never go in against a megalomaniac when big bucks are on the line! Ahahahahaha! Ahahahaha! Ahahaha--" ~ Larry Elliston

      Soko

      --
      "Depression is merely anger without enthusiasm." - Anonymous
  3. Hey guys, we WON already... by mcrbids · · Score: 3, Interesting

    Linux is clearly being taken seriously. It's pounding the competition in the server space, and it's beginning to make serious inroads to the desktop.

    Desktop Linux stories carry some interest to me, but on a server? That's old hat, old news, and very much humdrum.

    This article really should be more about the cluster of Itanium chips, which actually determine the speed of the system, rather than "it runs Linux!" which in this case is largely irrelevant.

    Linux is as responsible for the success of this as a dog is responsible for the bus that hit it. Similar results could easily be obtained, I'm sure, with any number of BSD variants, or other *nixes compiled to run on Itanium.

    This would have been news 3 years ago, but today? Bah!

    --
    I have no problem with your religion until you decide it's reason to deprive others of the truth.
  4. those oracle guys and their bag of tricks... by kpharmer · · Score: 3, Informative

    are probably comparing this system to some old IBM benchmark. They didn't say in the press release, so I'd assume the worst.

    IBM appears to dominate the TPC-Hs at the top & bottom, with Oracle owning it in the middle.

    The only really interesting benchmark out there at the moment is the IBM DB2 ICE configuration, in which they spread DB2 across dozens of low-end dual-CPU AMD Opteron servers. DB2 (and Informix, god bless them) partition differently than Oracle: more like a database implementation of Beowulf (which they've been doing for 8+ years). Way cheaper than anything from Oracle, and you can toss up to 1000 servers into it. Their benchmark is in the 300 GB range, not 1000, but it'll scale way beyond Oracle, and is cheap for that kind of power: http://www.tpc.org/tpch/results/tpch_result_detail.asp?id=103073001

    Makes me wonder how many PCs I've got lying around the house...

  5. Three types of clusters by Preach+the+Good+Word · · Score: 5, Informative

    There are basically three types of clusters:

    1) Shared nothing: the computers are connected to each other only via a simple IP network. No disks are shared, and each machine serves part of the data. These clusters don't work reliably when you have to do aggregations. E.g., if the data is spread across machines and one of the machines fails, a query that tries to do "avg()" would fail, since one of the machines is not available. Most enterprise apps cannot work in this config without degradation. E.g., an IBM study showed that a 2-node cluster is slower and less reliable than a 1-node system when running SAP.

    IBM on Windows and Unix, and MS, use this type of clustering (also called the federated database approach or shared-nothing approach).

    2) Shared disk between two computers: in this case, there are multiple machines and multiple disks. Each disk is connected to at least two computers. If one of the computers fails, the other takes over. No mainstream database uses this mode, but it is used by HP NonStop. Still, each machine serves up part of the data, and hence standard enterprise apps like SAP cannot take advantage of clustering without a lot of modification.

    3) Shared everything: each disk is connected to all the machines in the cluster. Any number of machines can fail and yet the system keeps running as long as at least one machine is up. This is used by Oracle. All the machines see all the data. Standard apps like SAP can be run in this kind of config with minor modification or no modification at all. This method is also used by IBM in their mainframe database (which outsells their Windows and Unix database by a huge margin). Most enterprise apps are deployed in this type of cluster configuration.

    Approach 1 is simpler from a hardware point of view. Also, for database kernel writers, it is the easiest to implement. However, the user needs to break up the data judiciously and spread it across machines. Also, adding or removing a node requires re-partitioning the data. Mostly, only custom apps that are fully aware of your partitioning will be able to take advantage.
    It is also easy to make it scale for a simple custom app, so most TPC-C benchmarks are published in this configuration.
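To make the re-partitioning point concrete, here is a minimal sketch in Python. It assumes rows are assigned to nodes by a plain hash-modulo scheme, which is an illustration only and not how DB2, SQL Server, or any other product mentioned here necessarily partitions data. The names `node_for` and `rows_moved` and the "order-N" keys are hypothetical.

```python
import zlib


def node_for(key: str, node_count: int) -> int:
    """Map a row key to a node index using a stable hash modulo the node count."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def rows_moved(keys, old_nodes: int, new_nodes: int) -> int:
    """Count keys whose owning node changes when the cluster is resized."""
    return sum(
        1 for k in keys if node_for(k, old_nodes) != node_for(k, new_nodes)
    )


# Hypothetical row keys, purely for illustration.
keys = [f"order-{i}" for i in range(10_000)]
moved = rows_moved(keys, old_nodes=4, new_nodes=5)
print(f"{moved} of {len(keys)} rows change nodes when going from 4 to 5 nodes")
```

With naive modulo placement, most rows end up on a different node after a resize, which is one way to read the complaint above that adding or removing a node forces a re-partition.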

    Approach 3 requires a special shared disk system. The database implementation is very complex: the kernel writers have to worry about two computers simultaneously accessing disks, overwriting each other's data, etc. This is what Oracle is pushing across all platforms and IBM is pushing for its mainframes.

    Approach 2 is similar to approach 1 except that it adds redundancy and hence is more reliable.
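The avg() failure described for approach 1, and the contrast with approach 3, can be sketched as a toy model in Python. This is only an illustration of the comment's argument under simplifying assumptions: no replicas, no failover, and data held in plain lists. It does not reflect how any real database behaves, and every name in it (`NodeDown`, `avg_shared_nothing`, `avg_shared_everything`) is hypothetical.

```python
class NodeDown(Exception):
    """Raised when a query needs data that no live node can reach."""


def avg_shared_nothing(partitions: dict, up: set) -> float:
    """Average over all rows when each node owns exactly one partition.

    Every partition is needed for a global average, so a single
    unavailable node makes the whole query fail in this toy model.
    """
    total, count = 0.0, 0
    for node, rows in partitions.items():
        if node not in up:
            raise NodeDown(f"node {node} is down; its partition is unreachable")
        total += sum(rows)
        count += len(rows)
    return total / count


def avg_shared_everything(shared_rows: list, nodes: list, up: set) -> float:
    """Average over all rows when every node can read the same shared disks.

    Any single live node can answer, because it sees all the data.
    """
    for node in nodes:
        if node in up:
            return sum(shared_rows) / len(shared_rows)
    raise NodeDown("no nodes are up")


partitions = {"a": [10, 20], "b": [30, 40]}   # shared nothing: data split by node
shared_rows = [10, 20, 30, 40]                # shared everything: one data set
up = {"a"}                                    # node "b" has failed

print(avg_shared_everything(shared_rows, ["a", "b"], up))
try:
    avg_shared_nothing(partitions, up)
except NodeDown as err:
    print("shared-nothing query failed:", err)
```

Approach 2 would sit between the two: the shared-nothing function, but with a second node able to take over a failed node's partition.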

    1. Re:Three types of clusters by borgboy · · Score: 2, Informative

      Actually, Microsoft employs both federated and failover clustering, and the two are not mutually exclusive - you can build a federated cluster with failover nodes - to one another or to a hot spare. Federating and failover aren't really related. Federation is a way of dividing up your large tables across multiple database servers for performance; failover is for redundancy.

      most enterprise apps are deployed in [the shared everything] type of cluster configuration.

      Really? Wow. You sure about that? I would disagree. Many, perhaps. We don't, and we're a Fortune 500 with over 60k employees. Most requires a majority, and I'd have to see numbers to be convinced.

      --
      meh.
  6. oracle and linux by larry+bagina · · Score: 3, Insightful

    Oracle walks a dangerous path running on linux. Sure, the money saved by using linux/x86/oracle vs solaris/SPARC/oracle is significant, but linux can be a gateway drug to other Open Source/FREE software. Once PHBs realize that the OS is a commodity, the next step is realizing the DB is also a commodity. PostgreSQL and MySQL aren't suitable for enterprise-level work, but they're more than suitable for small internal projects that used to mean extra Oracle seats.

    --
    Do you even lift?

    These aren't the 'roids you're looking for.

    1. Re:oracle and linux by nemesisj · · Score: 2, Insightful

      This is not an insightful post. Serious people use Oracle. Poor people use PostgreSQL. If you need Oracle, you will have to use it or DB2 - PostgreSQL just can't cut it on several levels with Oracle, currently.

    2. Re:oracle and linux by alannon · · Score: 2, Informative

      Apparently the American Chemical Society has a PostgreSQL database in use that's over a terabyte in size. I don't know if this is the largest one currently in use.

      Also, the largest commercial database is about 23 TB and runs on Oracle.

      What these numbers don't say anything about, though, is how much of these databases are taken up by BLOBs, and how much is actual field data. Having most of your data in BLOBs is really just making your database a fancy file system, since BLOBs reside in a different part of the database, cannot be indexed (at least not like normal fields), cannot be used in SELECT statements, etc.

      Actually, this is what Oracle has been trying to get companies to do for a few years now. Put EVERYTHING in the database.

      For that matter, Microsoft plans to take this approach by actually placing the filesystem in the database in an upcoming Windows release.

      Give me access to a 50 terabyte disc array and I'll gladly build you a 50 terabyte Postgres database.

  7. Should we be happy or sad? by Qrlx · · Score: 4, Interesting

    Shouldn't that read "New TPC-H Record Set Using Oracle?"

    The article didn't give many details, but how much of this performance is directly attributable to Linux (specifically Red Hat AS3)? What was the OS of the system it beat? Could that also have been Linux? How much of the performance can be attributed to the (suspiciously un-Beowulf) Lenovo cluster?

    From what I know of benchmarks, the numbers given reflect real-world performance, to within one order of magnitude.

    At first, I thought, It's just a press release, big deal... But wait, they used Linux, so it must be another straw on the back of the camel known as the Closed Source Business Model. But wait, it's running Oracle, so it must therefore be evil. Aieeeeeeeeeeeeee!!!!

  8. These TPC reports by Fnkmaster · · Score: 3, Funny

    "Myaaa, Did you get the memo? We're now using the new cover on all TPC Reports. If you could just do that, that would be great. Thanks."

  9. Whooa (-1 Redundant)(Probably)(Hopefully =) by D+iz+a+n+k+Meister · · Score: 2, Funny

    Anyone else read that as THC??

    --

    He painted a unicorn in outer space. I'm askin' ya, what's it breathin'?
  10. What kind of marketing garbage is this crap?! by Hoser+McMoose · · Score: 4, Insightful

    Holy crap this story is useless! Go to the TPC-H site and actually look at the results; it really is nothing even remotely impressive.

    - It's NOT the fastest TPC-H result, it's the fastest LINUX TPC-H 1000GB result. Actually it's the ONLY Linux TPC-H 1000GB result. It's 5th of 8 overall.

    - It's not even offering very good bang for your buck, coming in 5th of 8 for Price/QphH ($156 US according to today's currency exchange). The only systems it managed to beat are two outdated systems (both from HP) and an old price for a Fujitsu system, quoted in euros (the same system offers the same performance but a lower price on a newer entry quoted in US $).

    In short, if anything this suggests that Linux is a BAD choice for this work! The performance isn't there and the cost is high.

    Where things get REALLY bad though is the claim that this is "3.5 times faster" than a system running IBM's DB2. This is just 100% pure bullshit! The new Linux/Oracle system runs 1.3GHz Itanium2 processors and Oracle 10g. The HP/Windows/DB2 system runs 900MHz Xeon processors and runs DB2 7.2 (8.1 is the current version). What's more, the Oracle/Linux system isn't even 3.5 times faster, it's just 3.5 times faster PER PROCESSOR! Great, your brand-spanking new Itanium2 is 3.5 times faster than four-year-old Xeon 900MHz chips. Whoopie!

    Note: if you do want to see impressive Linux results, look at what IBM is doing with their Opteron cluster and DB2 running under SuSE Linux. They turned in the top results in the two TPC-H tests they entered (100GB and 300GB).

    1. Re:What kind of marketing garbage is this crap?! by Anonymous Coward · · Score: 3, Interesting

      have to point out the Xeon processors have 2 MB L2 cache, whereas the Itanium2s have 512 KB. That makes a huge difference for TPC-H queries. Plus, don't believe Intel's hype about Itanium. Xeon is still a kick ass CPU.

    2. Re:What kind of marketing garbage is this crap?! by Hoser+McMoose · · Score: 5, Informative

      The Itaniums have 512KB of L2 cache and 3MB of L3 cache, with the L3 cache being faster and having lower latency than the L2 cache of the old Xeons.

      Xeons are fine chips, but the 900MHz Xeon is totally outdated. A new 2.8GHz XeonMP system with 2MB of L3 cache would probably also be about 3.5 times faster on this test than the old 900MHz Xeon.

  11. And on other linux benchmarking news... by Anonymous Coward · · Score: 2, Interesting

    SGI have built the largest Linux machine (512 processor machine at NASA) and managed to destroy the previous memory bandwidth record held by NEC, by achieving 1 terabyte/s.

  12. Ahh, so that's it! by jjeffries · · Score: 4, Funny

    I was wondering who bought all the Itanium 2s....

  13. Oh but this IS significant by A+nonymous+Coward · · Score: 4, Insightful

    The OS is largely irrelevant to speed tests which never swap or do I/O, like generating graphics. But servers show weaknesses in an OS like nothing else, since they really hammer context switches and I/O.

    This IS significant. It shows the suits that Linux can handle swap intensive tasks, even tho they don't know that is what it shows.

  14. Better summary by autocracy · · Score: 3, Insightful

    Sun Micro kicked everybody's ass. Read across the board, they had the cheapest cost per performance, and though Fujitsu systems really shone through on the 1000GB test, they're still SPARC architecture and still running Solaris.

    Truthfully, I'm not a Sun fanboy (I just think they make cool shiny toys that cost a lot). Despite their corporate issues of late, they can still flex when it comes time to move things. Given any of those systems built into a decent cluster (note that no pure Sun solutions were clustered), I think something worthwhile might show up.

    Even if you disagree with me on those points though, you do have to agree that the /. article itself just sucked.

    --
    SIG: HUP
  15. That's 3.5 times more performance per processor by RalphBNumbers · · Score: 2, Informative

    As you can see here, the DB2 systems they seem to be comparing themselves with scored more than double what this one did.

    I would expect a larger system to score lower on a per-processor basis just from scaling issues, even if the processors were identical.

    While the 3.5x ratio is impressive, the manner of its announcement is very misleading.
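The gap between a per-processor ratio and a total-score ratio is easy to see with a little arithmetic. The Python sketch below uses made-up numbers: only the 16-processor count (four nodes with four processors each) comes from the story summary, while the scores and the 128-processor count for the other system are hypothetical values chosen so that the per-processor ratio comes out at 3.5. They are not actual TPC-H results. The `compare` helper is likewise hypothetical.

```python
def compare(a_score: float, a_procs: int, b_score: float, b_procs: int):
    """Return (total_ratio, per_processor_ratio) of system A relative to system B."""
    total_ratio = a_score / b_score
    per_processor_ratio = (a_score / a_procs) / (b_score / b_procs)
    return total_ratio, per_processor_ratio


# Hypothetical scores: a 16-processor system vs a 128-processor system.
total_ratio, per_processor_ratio = compare(
    a_score=1400.0, a_procs=16,
    b_score=3200.0, b_procs=128,
)

print(f"total score ratio (A/B):   {total_ratio:.2f}")
print(f"per-processor ratio (A/B): {per_processor_ratio:.2f}")
```

In this made-up case system A wins per processor while system B still posts more than double the total score, which is the pattern described above.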

    --
    "The worst tyrannies were the ones where a governance required its own logic on every embedded node." - Vernor Vinge
  16. for those who don't read the full disclosure by Anonymous Coward · · Score: 2, Interesting

    and bitch about how the MS solution is better, here is a little secret. If you look at the current #3 from HP http://www.tpc.org/tpcc/results/tpcc_result_detail.asp?id=103082701, you see it says COM+. Well, that's not the whole truth. If you look at the actual source code, you will see references to Tuxedo. It's a C++ port of Tuxedo. The original TUX/TUXEDO was created by AT&T http://www.middleware.net/tuxedo/articles/tuxedo_history.html. Microsoft isn't stupid, but it's hardly surprising. It doesn't make any sense for anyone to reinvent transaction management, but it is lame that Microsoft tries to pass it off as their innovative technology. I don't know if MS is the one who wrote the COM+ scheduler for the clients, but that is the reason for the good results the last few years. I'm guessing HP is the one who wrote the COM+ port of Tuxedo, since they have lots of experience with Unix and MS doesn't. Don't take my word, read the full disclosure yourself.

  17. Exchange Rate by charnov · · Score: 2, Insightful

    Sorry, the only reason that this resulted in a new record was because of the artificially controlled exchange rate between the Yuan and the dollar.

    Sorry, try again.

    --
    [RIAA] says its concern is artists. That's true, in just the sense that a cattle rancher is concerned about its cattle.
  18. Oracle and Open Source??? by markxsd · · Score: 2, Insightful

    As somebody who worked for O$ for many years, I'm interested by Larry's jump into bed with Linux. I'm out of the loop now, but I wouldn't be surprised if this is part of a "testing the OSS water" strategy. I also wouldn't be surprised to see some low revenue Oracle products (e.g. JDeveloper, Developer, Application Server) being turned open source.

    "...but linux can be a gateway drug to other Open Source/FREE software"

    For now, I don't think this is a (serious) risk. Oracle has been distributing Apache now for a number of years, for example. If you know anything about the history of Oracle, the success it has achieved is more about sales and marketing than about having a superior (or cheaper) product (remember Ingres??). If you're a CTO for a Fortune 500, are you going to move your corporate databases to MySQL? I don't think so. You are going to stick with the database vendor that's running corporate databases for most of the rest of the Fortune 500. If you're the kind of company that has a budget so tight that you NEED to run MySQL or Postgres for core systems, Oracle doesn't want or need you. Maybe the best weapon Oracle has against MySQL and Postgres is the fact that you are able to download the complete version of Oracle from OTN. There are many unlicenced Oracle implementations around the world as a result of this free download facility.

    I'm all for Postgres and MySQL pushing into the enterprise world, but MSSQL should be the first target. If Oracle are prepared to put real money into backing Linux, let's support them...