Slashdot Mirror


NCSA and IBM Part Ways Over Blue Waters

An anonymous reader writes "IBM has terminated its contract with NCSA for the petascale Blue Waters system that was expected to go online in the next year. The reason stated was that NCSA found IBM's technology 'was more complex and required significantly increased financial and technical support by IBM beyond its original expectations.' The IT community is now wondering if NCSA will be renting out space in the new data center that is being built to house Blue Waters or if they will go with another vendor."

50 of 76 comments

  1. Translation by andydread · · Score: 2, Insightful

    The reason stated was that NCSA found IBM's technology 'was more complex and required significantly increased financial and technical support by IBM beyond its original expectations.'

    Translation: NCSA found that IBM was trying to lock them in with ultra proprietary technology that would have required IBM's expensive services for the life of the installation.

    1. Re:Translation by Chris+Burke · · Score: 1

      Translation: NCSA found that IBM was trying to lock them in with ultra proprietary technology that would have required IBM's expensive services for the life of the installation.

      They only just found out about IBM's business model?!

      --

      The enemies of Democracy are
    2. Re:Translation by Anonymous Coward · · Score: 2, Informative

      The reason stated was that NCSA found IBM's technology 'was more complex and required significantly increased financial and technical support by IBM beyond its original expectations.'

      As usual the /. summary is misleading at best. The actual language used was:

      The innovative technology that IBM ultimately developed was more complex and required significantly increased financial and technical support by IBM beyond its original expectations. NCSA and IBM worked closely on various proposals to retain IBM's participation in the project but could not come to a mutually agreed-on plan concerning the path forward.

      Other tidbits from the real press release are that IBM terminated the contract, not NCSA, IBM is refunding the money paid to date, and NCSA is giving back the hardware delivered to date.

      Translation:
      NCSA found that IBM was trying to lock them in with ultra proprietary technology that would have required IBM's expensive services for the life of the installation.

      That's a really dumb translation. Nobody expects a supercomputer to be commodity hardware. Just the opposite, as there is no such thing as a commodity supercomputer. Especially this kind of supercomputer, built in part to attain new performance records. When you buy something like that, you thoroughly expect vendor lock-in, expensive services, etc. There's only two or three vendors you can buy it from, and they're all going to be doing a lot of custom engineering for you, so proprietary is by definition what you're buying.

      The real translation here is: IBM realized there was no way to deliver on the original contract without taking a huge loss, and tried to negotiate with NCSA for more budget, or maybe reduced system capability, but NCSA couldn't or wouldn't do that. (Probably couldn't, I doubt they can just scare up more money at the drop of a hat. As for backing off, when your project was funded to build a "petascale" computer, you're pretty committed to delivering a petaflop, so scaling back capabilities was probably not an option.)

      Since the sides couldn't come to terms, IBM took a huge hit by terminating the contract. Yeah, they get their hardware back, but it's probably not very easy to sell to anybody other than NCSA. And they have to return all the money, which means they did a lot of engineering work for $0, once again with few prospects of monetizing the work in a future deal.

      As for NCSA, even though they get the money back they still lost a lot too. Years of development down the tubes, and they have to start over (if at all) with a new supercomputer capable vendor. From scratch. At 2011 prices instead of 2007 prices. Which might well be a disaster for them if they couldn't afford to give IBM enough money to finish the original system.

    3. Re:Translation by That+Guy+From+Mrktng · · Score: 1

      Stop trying to plant common sense and facts in our IBM bashing

      That's why you get modded 0! Now, to make things even, let me say one thing: "IBM helped the Nazis". There... now everything is normal again, just like we want it.

    4. Re:Translation by imsabbel · · Score: 1

      If you read between the lines, we are of course back to the point of the headline:

      a) IBM wanted to significantly increase the price ("required significantly increased financial support...", which they would of course have passed on), which they could not get through. So they declined delivery under the initially contracted conditions.

      --
      HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
    5. Re:Translation by dbo42 · · Score: 1

      Since the sides couldn't come to terms, IBM took a huge hit by terminating the contract. Yeah, they get their hardware back, but it's probably not very easy to sell to anybody other than NCSA. And they have to return all the money, which means they did a lot of engineering work for $0, once again with few prospects of monetizing the work in a future deal.

      Not to mention the marketing disaster.

    6. Re:Translation by petermgreen · · Score: 1

      My experience with Microsoft in academia is that they like to get us on programs that are a lot cheaper than paying for the software normally, but are based on paying a subscription sized to the whole institution rather than paying for each individual machine.

      The result is that there is no motivation to gradually migrate away from MS software, since the only way to reduce the amount paid to MS would be to virtually eliminate MS software from the institution (which is not realistically going to happen).

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    7. Re:Translation by Old+Sparky · · Score: 1

      Once again a Microsoft shill masquerading as a Slashdot moderator mods down a comment that is both insightful AND funny!

    8. Re:Translation by rgviza · · Score: 1

      Yea you don't buy a mainframe from IBM, you rent it and never stop paying them monthly until the project is terminated. It's not exactly small potatoes either.

      --
      Don't kid yourself. It's the size of the regexp AND how you use it that counts.
    9. Re:Translation by flaming-opus · · Score: 1

      One of the big problems here is that this system was a one-off that was not meant to be one. IBM developed the system under the DARPA HPCS contract. They made a very capable system that is also very expensive. They hoped to sell a bunch of them; it looks like they sold just one. As such, all of the engineering costs are being amortised across just one machine. They couldn't leverage a bunch of smaller systems at other customer sites to stabilize the technology before deploying the monster big one at NCSA. Some of this is due to the success of their iDataPlex offerings, which have stolen the smaller sites away from Power7 machines.

      I agree, though, that vendor lock-in is the name of the game in these sorts of systems. However, vendors do care about competing for the next contract, and try to keep engineering costs down. One of the ways you do that, of course, is to not make one-off systems.

    10. Re:Translation by jc42 · · Score: 1

      I'm still waiting for them to find out about Microsoft's [business model].

      It might be interesting to look through the flock of Microsoft patents (thousands? millions?) with the idea of listing the patents for things published by NCSA people. More generally, how many patent violations there will be in the new super-computer, and how much will NCSA have to pay for licenses to use the things discovered/invented by their own researchers?

      And how many companies in addition to Microsoft will be filing infringement suits against the NCSA? Yeah, we know that IV will be there, but how many others will file in their own names?

      --
      Those who do study history are doomed to stand helplessly by while everyone else repeats it.
    11. Re:Translation by treeves · · Score: 1

      I'm sure any downmod was due to the lack of the possessive apostrophe in "Microsoft's".

      --
      ...the future crusty old bastards are already drinking the Kool-Aid.
    12. Re:Translation by flaming-opus · · Score: 1

      Not sell a system as big as Blue Waters, but one using the same technology. The Power 755, of which Blue Waters was supposed to be the prime example, is very powerful per node, has a lot of bandwidth both within a node and between nodes, and could be quite useful in much smaller configurations. Tim Morgan at The Register indicates that IBM will still be selling smaller configurations of this machine. It's just hard to keep up that level of per-node performance across so large a machine for the agreed-upon cost.

  2. Why did IBM do this, and what next for NCSA? by bridges · · Score: 5, Interesting

    Pretty surprising development, given the length of time that IBM and NCSA had been working on this. Dropping a contract like this essentially puts into question IBM's costing on future contract bids, so it's not something that they'd do lightly. It'll be interesting to see the scuttlebutt that comes out afterward to see how much of this was technical shortcomings and how much pure financial considerations from IBM. Maybe since IBM already got their big publicity for Power7 from Watson, they're being more profit-conscious on future Power systems so they don't tie themselves to margins that are too low.

    From the NCSA side, there will certainly be a fallback of some sort - NSF and NCSA are already working out those details according to recent reports. I'd guess that they go with a large Cray XE6 system, given that a pretty sizeable version of that system is already being stood up and ironed out (the Sandia/Los Alamos Cielo system), and Cray has a lot of history successfully standing up big systems (e.g. ORNL Jaguar, Sandia Red Storm, etc.). SGI Altix is the other alternative, I guess, and there's a pretty big one up at NASA now, though that'd probably be a riskier proposition than Cray IMO, and I expect that NCSA and NSF are going to be pretty risk averse on following up on this.

    1. Re:Why did IBM do this, and what next for NCSA? by flaming-opus · · Score: 1

      I'm sure Cray can get up to speed in this time frame. They've done it before for the Jaguar deployment. However, if they go with Cray, why install it at NCSA? The NSF already has a big Cray running at University of Tennessee (Kraken). Why not just upgrade the existing Cray? They already have the bugs worked out; they would just have to add more cabinets, and probably upgrade the processors.

    2. Re:Why did IBM do this, and what next for NCSA? by flaming-opus · · Score: 1

      NSF already has a big Cray XT5: Kraken at UofTenn. So the risk averse would probably say get a next generation XE6. Cray has announced an integrated GPGPU option, so NCSA could get a few cabinets of GPUs to play with, but integrated into a more traditional x86 super. The fact that NSF is already familiar with the machine could make this less risky.

      However, this machine is not run by NSF, it's run by NCSA, who have no recent experience with Crays. Mostly they've been running whitebox clusters. They had SGI stuff half a decade ago, but nothing on the scale of what we're talking about here. I'd rule out SGI Altix, because it is not built to compete on price/performance, and not designed to scale this large as a single system. If SGI is in the running, it's probably an ICE cluster that would be used. If the problem with the IBM was cost, I don't think Altix is going to fix that problem.

    3. Re:Why did IBM do this, and what next for NCSA? by Dop · · Score: 1

      Anonymous Coward, eh? You must still work there.

    4. Re:Why did IBM do this, and what next for NCSA? by Bill+Barth · · Score: 1

      Not that it changes your argument, but you should know that NCSA has a brand new Altix.

      --
      Yes...I am a rocket scientist.
    5. Re:Why did IBM do this, and what next for NCSA? by bridges · · Score: 1

      As opposed to inexpensive IBM maintenance contracts? All of the big HPC machines are expensive to run and maintain, and NCSA/NSF would be incredibly foolish if they haven't already budgeted for this.

    6. Re:Why did IBM do this, and what next for NCSA? by bridges · · Score: 1

      Why build it at NCSA instead of just upgrading Kraken? Because:

      1) Kraken is an XT5, not an XE system - the associated changes of an upgrade from XT to XE would be very large.
      2) NCSA already has a big machine room (that they just built) to support that scale of a system. Does ORNL have enough additional power and cooling capacity to support Keeneland, Jaguar, and growing Kraken by an order of magnitude in size?
      3) ORNL is already installing Keeneland, an NSF track 2 system this coming year
      4) The larger political implications to NSF of failing the $200M track 1 grant that was awarded to NCSA would probably be catastrophic.

    7. Re:Why did IBM do this, and what next for NCSA? by Durinia · · Score: 1

      5) ...and Cray is already installing a 20-ish PF machine at ORNL in the next year named "Titan".

    8. Re:Why did IBM do this, and what next for NCSA? by flaming-opus · · Score: 1

      Yes. Good find. However, that sort of system speaks to the Altix's strengths. You program it like it's an SMP, you have one coherent memory space, and several hundred processor cores. This is the perfect use of an Altix. Of course SGI would rather you use your pre/post processing Altix next to a big ICE cluster, rather than a big IBM.

    9. Re:Why did IBM do this, and what next for NCSA? by ptrifoliata · · Score: 1

      Does anyone have any idea on where HP stands here?

      --
      -mL
  3. I.B.M. by Anonymous Coward · · Score: 1

    I've Been Mugged

  4. NNSA and IBM Blue Gene by 1729 · · Score: 1

    Good for NCSA! I just wish that the NNSA had the guts to do the same with the Blue Gene/Q.

    1. Re:NNSA and IBM Blue Gene by Anonymous Coward · · Score: 1

      Absolutely. RIKEN in Japan got torn a new one when Fujitsu blew out the schedule (thereby jacking up the price) of the "K computer" by a couple of years, but being the ever trusting society Japan is, nobody made a fuss. Even talking about cancelling such a project would have been considered the height of rudeness, not to mention an admission of incompetence.

      It's great to see academic institutions stand up for a change instead of just bending over and taking it.

    2. Re:NNSA and IBM Blue Gene by halfdan+the+black · · Score: 2

      IBM does need to drop the price of Blue Gene, BUT Blue Gene is absolutely awesome to work on (I use Intrepid). Almost all the rest of the "supercomputers" out there like Cray are basically just PC clusters.

    3. Re:NNSA and IBM Blue Gene by 1729 · · Score: 3, Interesting

      Blue Gene is absolutely awesome to work on (I use Intrepid).

      Seriously? That's the first time I've heard that. What do you like about it? The buggy toolchain and CNK? The joys of (sort-of) cross-compiling? The I/O bottlenecks? The blazing fast (for 1999) CPUs?

      The only way I can see BG/P being a useful machine is either:
      1) All you need to do is run LINPACK
      2) You're booting Linux on the compute nodes (in which case a commodity Linux cluster would probably be a lot cheaper)

    4. Re:NNSA and IBM Blue Gene by dbo42 · · Score: 1

      What were you trying to run on there, a web server?
      One of the advantages of Blue Gene is precisely that its compute nodes do not run some full-featured OS that gets in your way. As HPC platform, the Blue Gene line is pretty much unrivaled in terms of energy efficiency and reliability.

    5. Re:NNSA and IBM Blue Gene by erikscott · · Score: 1

      If your code is pure MPI C or Fortran, then the BG is a decent idea. Remember, the original name of the machine was "QCDOC", or "QCD On a Chip" - if you're running QCD, it rocks. Other things, not so good. Let's say you have a big code in Java and you want to run it on your Blue Gene. Well, you're screwed - there's no JVM for the worker nodes. Let's say you have a big code in Perl (and don't laugh - Perl is what about half of computational biology gets done in). That's a problem, because there's no OS on the nodes, so there's no way to run Perl. Couple that with the bugginess of the software, the brittleness of the hardware, and desktop-class I/O and you have a machine that basically is just good for QCD and LINPACK. So, yeah, running a real OS on the nodes isn't all that bad an idea. Which is probably why Slashdot reported on the port of Plan 9 for Blue Gene back in 2007. Links to the "official" IBM site are down in there, and it is now throwing Lotus Notes' version of a 404 - and did we expect anything else from IBM?

  5. Cue the PERL / Beowulf cluster posts! by FlyingGuy · · Score: 2

    As in I could do what they do with a few lines of PERL and a Beowulf Cluster!!

    --
    Hey KID! Yeah you, get the fuck off my lawn!
  6. Re:Cue the PERL / Beowulf cluster posts! by bigtrike · · Score: 2

    How many gigabytes long would each line of perl be?

  7. Not really shocking news. by Zero1za · · Score: 3, Interesting

    'was more complex and required significantly increased financial and technical support by IBM beyond its original expectations.'

    Sounds about normal for an IBM gig then...

  8. Ever heard the saying: by Anonymous Coward · · Score: 1

    Go away or I will replace you with a very small shell script!

    I think: 0.0000001GB would do it.

  9. Job application by wirelesslayers · · Score: 2

    Now I know why they canceled the job exams (C, Perl and Linux Admin) I was about to do this week for a position at IBM-LTC. =(

  10. Mosaic and Netscape redux by Anonymous Coward · · Score: 1

    I hope we'll have a thread here rehashing how the Mosaic browser was developed at NCSA in the early '90s by a group of grad students informally led by Marc Andreessen, and how the university sued after Andreessen and most of the original team took off for Silicon Valley to form Netscape.

    1. Re:Mosaic and Netscape redux by lucm · · Score: 1

      Netscape was a crime against the internet and especially against web developers of late 90s early 2000s. If you ever had to design a form in Netscape 4.7 you know what I mean - having textboxes that can only be sized in characters is significantly painful. And I won't even talk about layers because already my blood pressure is getting too high.

      --
      lucm, indeed.
    2. Re:Mosaic and Netscape redux by Bing+Tsher+E · · Score: 1

      Netscape the company was a crime against the Internet. Their aim was to introduce proprietary tags into Navigator and serve up those proprietary tags with their server technology. They were a genuine threat to Microsoft. That doesn't absolve Microsoft for crushing them, but it explains it. And things wouldn't automatically be 'better' if Netscape had won 'the browser war.' We wouldn't have Mozilla in its present state. And I would really miss my SeaMonkey.

    3. Re:Mosaic and Netscape redux by TWX · · Score: 1

      Heh. I downloaded and installed NCSA Mosaic about twenty minutes ago, and unfortunately it no longer appears to work on Windows 7. I don't know if there's something missing in the TCP/IP stack, something in the Windows Socket Services implementation, or what, but it crashes on trying to load URLs. And yes, I did add the "http://" to the front of the URL like you used to have to do.

      --
      Do not look into laser with remaining eye.
    4. Re:Mosaic and Netscape redux by petermgreen · · Score: 1

      IIRC when I tried it on XP it ran ok but you couldn't get very far on the modern web because it doesn't work with servers that use name based virtual hosting.
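      For anyone wondering why that breaks: with name-based virtual hosting, many sites share one IP address, and the server picks the site from the Host request header that HTTP/1.1 requires. A rough sketch of the difference, assuming the old browser sends an HTTP/1.0-style request with no Host header (the build_request helper and www.example.com are made up for illustration, not taken from Mosaic):

```python
def build_request(host: str, path: str = "/", send_host: bool = True) -> bytes:
    """Build a bare-bones HTTP GET request as raw bytes."""
    if send_host:
        # HTTP/1.1 style: the Host header names the site wanted on a shared IP.
        lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close"]
    else:
        # Old-browser style (assumed): no Host header, so the server only
        # knows which IP address it was reached on, not which site was meant.
        lines = [f"GET {path} HTTP/1.0"]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


old = build_request("www.example.com", send_host=False)
new = build_request("www.example.com", send_host=True)
print(old)
print(new)
```

      A server hosting several names on one address has nothing in the first request to tell them apart, so it typically falls back to its default site.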

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
  11. What CPU? by Arakageeta · · Score: 1

    It took a custom CPU to knock out the Tianhe (GPU-based) supercomputer. Did IBM plan to use an existing POWER chip, or were they trying to develop a new Cell-like (or other boutique) processor? IBM keeps saying that the future of Cell isn't dead. I wonder if NCSA thought they'd get more bang for their buck with a GPU-based solution?

    1. Re:What CPU? by That+Guy+From+Mrktng · · Score: 1

      This RIG is not for bitcoin mining this is for natural language wiretapping and crypto cracking *rolleyes*

  12. Typical by lucm · · Score: 3, Interesting

    My experience with IBM is that every new software or equipment setup is painful, complicated and goes over-budget, but once things are up and running, it is rock-solid, so in the long run it is still the vendor I would trust the most for enterprise projects. Knowing them, I always take into account the extra oil and time that will be needed to make things go smoothly at first.

    This is very different from a vendor like Dell, who takes good care of its new customers (especially the ones with deep pockets) and makes sure that the delivery is on time and budget, but after a while problems start to appear (wrong firmware, obsolete drivers, etc) and pretty soon they tend to ignore you if they feel you won't bring new business in the next quarter.

    In this case with the NCSA thing, it's a typical situation where budgets have no room for the fudge factor because the organization has a price-driven selection process, which is wrong.

    --
    lucm, indeed.
    1. Re:Typical by bill_mcgonigle · · Score: 1

      In this case with the NCSA thing, it's a typical situation where budgets have no room for the fudge factor because the organization has a price-driven selection process, which is wrong.

      As in they don't have an infinite slush fund to tap into? That would be most organizations.

      You'd think by now IBM would know how to develop a specification, price it, and honor the contract price. I have to and I've only been in business 7 years. Yeah, once in a while I take a haircut, but that's called honoring your contracts.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    2. Re:Typical by lucm · · Score: 1

      A price-driven selection is an incentive for bidders to go in very lowball, and this only leads to nightmares for both parties. It's a silly approach based on obsolete purchasing practices (such as requiring three quotes for any important purchase - which over the long run drives off the vendors who usually don't win; those could be a very good match in a specific situation, but after a while they won't even bother trying to win the business because they know that most of the time they are contacted just to make the quota).

      This is why more and more organizations are doing a selection process where the price is sealed at first, so they can identify which bids are a match for the requirement and score them accordingly. Then when the prices are revealed a simple dollars per point formula shows which bids are off, and the ratio helps the selection committee to justify not taking the lowest bid if they feel it does not offer good value. In such a situation, IBM will shine because they can offer a lot of value without cutting prices like a flea market operator.
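      A minimal sketch of that dollars-per-point comparison, assuming each bid already has a technical score assigned while prices were sealed. The vendor labels, scores, prices, and the min_score cutoff are all invented for illustration and do not come from any real procurement:

```python
from dataclasses import dataclass


@dataclass
class Bid:
    vendor: str   # hypothetical vendor label
    score: float  # technical points awarded while prices were still sealed
    price: float  # price revealed after scoring, in dollars


def dollars_per_point(bid: Bid) -> float:
    """Cost of each technical point; lower means better value."""
    if bid.score <= 0:
        raise ValueError("score must be positive")
    return bid.price / bid.score


def rank_by_value(bids, min_score=0.0):
    """Drop bids that miss the requirement, then sort by dollars per point."""
    qualified = [b for b in bids if b.score >= min_score]
    return sorted(qualified, key=dollars_per_point)


bids = [
    Bid("A", score=60, price=900_000),    # lower price, weaker match
    Bid("B", score=90, price=1_080_000),  # higher price, stronger match
    Bid("C", score=40, price=800_000),    # lowest price, below the cutoff
]
ranking = rank_by_value(bids, min_score=50)
for b in ranking:
    print(f"{b.vendor}: ${dollars_per_point(b):,.0f} per point")
```

      With these made-up numbers the pricier bid B comes out ahead of A on value, and the cheapest bid C never reaches the comparison, which is the kind of justification the ratio is meant to give the committee.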

      --
      lucm, indeed.
    3. Re:Typical by rgviza · · Score: 1

      To be fair, Dell is limited with driver support by what their vendors provide. You can reasonably expect your hardware to be supported until the next version of Windows is released. At that time, if the drivers aren't compatible with the new version of Windows, you will be upgrading your hardware.

      Pretty much an x86/x86_64 given.

      Hardware companies don't make any money maintaining drivers for 4 year old hardware for which they will never see revenue again. Their margins are so thin there's no way they could afford to.

      --
      Don't kid yourself. It's the size of the regexp AND how you use it that counts.
    4. Re:Typical by bill_mcgonigle · · Score: 1

      That sounds like a wise way to score proposals, but it still sounds like one of the following is true:

      1) the spec was insufficient
      2) IBM isn't honoring its contract

      I'm assuming here that IBM's contract said they'd complete the spec for a fixed price.

      Somehow these government contracts seem to allow for a fixed price bid that doesn't actually work, and then more money appears out of nowhere to make the contractor happy.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    5. Re:Typical by Bill+Barth · · Score: 1

      It appears to be the latter. The spec is available here. NCSA negotiated a system with IBM, proposed it to NSF under the above linked RFP, went through a peer-reviewed awards process, negotiated an award with NSF, and started working on the delivery and other aspects with IBM and NCSA's other partners. Something went wrong in the last several months, and IBM's pull out was the result. I doubt that there is any more money to be found, and all parties knew what was asked of them in order for the project to be successful.

      --
      Yes...I am a rocket scientist.
    6. Re:Typical by lucm · · Score: 1

      > To be fair, Dell is limited with driver support by what their vendors provide.

      When you have some equipment installed and "certified" by Dell, you don't expect them to use obsolete drivers while there are three or four more recent versions on their own website. This happened to me twice, and almost a third time, but by then I knew the drill, so when the setup was completed I asked for a complete driver inventory and did the comparison with the available versions myself - thankfully I caught them before they left. If this was for a video card or a USB port it would not be so bad, but when it is for a storage adapter it is another ballgame.

      This does not happen with IBM. Before they certify a setup they will do extensive tests and validation. What sucks is that when you have a problem with new equipment at IBM you end up in the support queue with everyone else, while with Dell usually having a technician on-site will speed things up.

      --
      lucm, indeed.
  13. Where's your SparcStation now? by Douglas+Goodall · · Score: 1

    For forty years I dreamed passionately of having my ultimate computer at home. The Apple ][ was my first "workstation", and I invested heavily and actually had two floppy drives. Then I wanted to wire wrap myself an 8086 multitasking computer. Then I had to have an IBM PC/AT. But I knew in my heart that there were these special "expensive" machines called "workstations" that ran on some strange OS called UNIX. I discovered the RISC philosophy, and began dreaming of owning a RISC workstation. I found out about Sun Microsystems, and SunOS 3.1. I began to dream of having one of these 68xxx based Sun workstations someday.

    Intel based PCs continued evolving, and every time Intel coughed, machines sped up considerably. But each time the architecture sped up, Microsoft released another version of their OS (if you can call it that) that took most of the RAM, and ate most of the cycles. So no matter how fast the Intel boxes were, Windows based machines were not showing the performance I expected out of a "workstation".

    In the mid-nineties, I took a contract to set up a demonstration of an application running on a contemporary Intel Windows box and a contemporary Sun SparcStation. I had to open the Sun box to add memory, and I was stunned by how little electronics was on the circuit board for the ten thousand dollars they wanted for this "workstation". I just couldn't mortgage the house to buy something with such trivial hardware. Besides, I just hated the look and feel of OpenLook. I worked on contract at Autodesk briefly, and was exposed to a number of contemporary workstations: HP, SGI, MIPS,... But as cool as the X-Window system was, it still seemed somewhat raw, even with Motif. At the commodity level, I began to see computers with multiple CPUs, and operating system support in Windows NT and 386BSD as well as Linux.

    The day came that I heard about Apple bringing out a new operating system based on the Mach kernel, with 386BSD on top, and their GUI layer on top of that. I was intrigued, and it didn't take me long to realize I was getting old and grey trying to compute with Microsoft software. Eventually I invested in the workstation I had been waiting for all those years. I bought a Mac Pro 8-core 3.0GHz 16GB-RAM machine, and almost four years later it is still kicking ass, and I haven't seriously considered the need to upgrade to a newer Mac Pro, as my current one still has computing capacity to spare, and plenty of memory for what I do.

    Sure, I paid a little more for an Apple branded Intel box, but almost four years later, processors are not significantly faster (clock rate wise). The newer processors are said to be more efficient internally, but as I said, I haven't found the need. My entire suite of software I work on compiles in 58 milliseconds. What more can I say? So it never turned out to be a SPARC, or some HP CPU, or an IBM Power. I fell in love with an x86 workstation. To me, a supercomputer. To me a cluster (8-cores).