World's Fastest Supercomputer to be Linux
xinit was one of the people who pointed us to the CNET story about the possibility that the supercomputer in a current SGI bid could run Linux. The supercomputer could be the fastest in the world at the time of its production. SGI has confirmed the bid, saying it is targeted for 2001 if the bid is accepted. The machine would be installed at Los Alamos National Laboratory.
Unicos runs on the Cray family of supercomputers such as the T90 and SV1 families of vector processing supercomputers (which use custom designed vector processors) or the T3E family of massively parallel supercomputers (which use DEC Alpha processors). SGI is spinning off Cray or something effectively similar to that. SGI won't be releasing any Unicos-running computers in the future.
SGI also makes massively parallel computers, which if properly configured (read: lots and lots of processors) are supercomputer class machines. These machines presently run, hold on to your hats, IRIX. These machines presently use MIPS processors. One of these machines is part of the ASCI contract (Accelerated Strategic Computing Initiative) and is based on an Origin 2000 system.
Right now SGI is developing CC-NUMA computers (the same multiprocessing technology behind the Origin 2000 computers) using Intel IA-64 processors. Rather than attempting to port IRIX to an Intel processor or pretending that Windows NT will scale, SGI is relying on Linux. Right now Linux can't do it, but SGI is working on improving that aspect of Linux. This is all stuff that's been posted to Slashdot before. Here's a blurb to that effect.
There are several Linux clusters in the top 500, but they are clusters, not individual computers, and certainly not anything that could easily replace an E10K. Beowulf is great for a certain class of parallelizable problems, but a Big Iron database server such as the E10K is useful for its single system image and huge memory access, as well as fault tolerance and failover capabilities in hardware and software. I'm not convinced that Linux will ever be used in such a system, since the advantages of open source are not as great in an environment where there are few end users, and where the end users already are spending enormous amounts of money on hardware, software, and support contracts. So the cost of an OS (really just the price of the OS support contract) is minimal in comparison to other costs. And the ability of the hardware to work tightly with the OS is a major selling point.
Linux will continue to thrive in the low-end and will migrate up to more and more powerful servers as they get cheaper and used more generally. High Availability solutions are already beginning to surface, as with TurboLinux. This will probably be the way to go for most modest sized enterprise applications.
The only way Linux will get onto a Big Iron box is for SGI, IBM or Sun to put it on there. The only good reason to do this would be to ease migration from low-end solutions running on Linux, or to appease the PHBs that demand a Linux-based solution (just wait... it'll happen). Since they wouldn't abandon their current customers, they would be supporting two OSes in the same space with the same developers for quite some time. While the Linux solution would be open source, there would not be great advantages to this since the user community would be so small.
--
"L'IT c'est moi!"
When Steven Chen left Cray to start (SuperComputer systems?) they took over the old PC Board facilities in Eau Claire.
This was back in the days when the model for a fast machine was a big processor. And lo and behold, with the funding they had (read: NSA) they were able to produce a big Al clad box that DID run.
The OS they used for that project?
Linux.
What happened? Well, processors got faster and cheaper. So the need for a hi dollar mondo machine has fallen off. And today, the whole supercomputer industry is hurting.
So, a Cray/Chen/SGI/Linux connection. And I'm betting the 'linux champions' from Chen's venture are now back working for Cray/SGI. (and Intel picked up the other stragglers)
Something else to think about:
The OS is nothing more than a way to make the hardware useful. And, if the company can use the work of others, it lowers their development costs. Thus, it is now a race to the bottom (cost wise) with OpenSourced BSD and GNU/Linux being the lowest development costs and licencing fees.
SGI is fighting to exist, and OpenSource will help them do just that.
If it was said on slashdot, it MUST be true!
I've read about computations which determine the mass of a proton within the context of a particular physics theory.
The result would be a single floating point number! Or maybe I'm just simplifying things...
If tits were wings it'd be flying around.
You know, I have to wonder, when will you people realize that it's a pipedream?
If this system is to be ready by 2001, and is to be faster than ASCI Red (9,282 Intel Pentium Pro 200MHz with 1024k cache each, using proprietary interconnects (IP over SCSI, IIRC)) then it's not going to happen.
Even if the gov't is willing to sacrifice the reliability of the system, Linux is not ready and will not be for many years. Period. You're talking of going from 2 processors, which only works on x86 and Alpha currently, to over a thousand.
Sorry, folks, it isn't going to happen. It's a matter of rapid development, and availability. Sure, SGI could probably do it, but not by due date. Nor would it perform as necessary due to the requirement of extensive assembly-level optimization in both compilers and kernels.
ASCI Blue Pacific, IBM's entry, is one of the most powerful computers in the world. And what I'm about to say will probably send most of you into denial and/or shock.
Blue Pacific isn't customized all that much. In fact, barely customized. It's nearly the same machine you can order for your business today. Perhaps even slightly slower.
That's right - ASCI Blue Pacific CTR SP Silver and ASCI Blue Pacific, #11 and #2 respectively on the Top 500 list, are retail systems with some additional software. IBM's SP Silver in Poughkeepsie is a retail SP.
Shocking, isn't it? That someone can build a supercomputer that any business can buy. IBM holds a *lot* of the first 50. You don't see Sun till 54 with a machine that was totally custom built. Hell, look at #20! IBM SP Power3 200MHz. *200MHz!* And it smokes 480 supercomputers! That should tell you something right there. Now, SGI's talking about, more than likely, an x86 supercomputer or IA64 supercomputer, that's supposed to run Linux, have more than a thousand processors, and outperform ASCI Red? Nope, 'fraid not, folks. Maybe in 5 or 6 years, but not one and a half. Sorry. Deal with it.
-RISCy Business | Rabid unix guy, networking guru
your company here.
shelby != ford
Blue Mountain has two parts, an open and a secure side. As far as I can tell (and this is not terribly surprising), only details on the open side are available from press releases, etc. Anyway, it's a big beefy SGI Origin2000 system, with lots and lots of boxes each holding lots and lots of processors. (Sorry about the vagueness here -- you can probably find details if you look hard enough.) We're talking thousands of processors here, in case that wasn't abundantly clear.
My slightly-biased opinion would be that, in light of the many millions of dollars which were undoubtedly spent on said machines, it is extremely unlikely that the cluster would be ditched anytime in the near future, even if we end up getting a faster cluster -- you can always use more computing power. :-)
For lots more info, check out this press release, which gives some (now outdated) details on nirvana, the open part of blue mountain. Also, the ACL site at Los Alamos is pretty good, though a bit PR-y. It also has details about the (currently extant) Linux cluster, in case you're interested. Finally, if you're curious about the real details of the Blue Mountain operating environment, you can take a look at this page, which has lots of good info.
Have fun.
No, they didn't even confirm the possibility of using Linux, contrary to what the Slashdot summary stated. I suspect that they will use a tried and true method that demonstrates their strengths in high end Cray and Irix technology. After all, the #1 supercomputer is not built for profit, but as an advertising tool.
--
"L'IT c'est moi!"
All I can find in the actual article (not the /. summary) is that they confirm that they are making a bid. I can't find anything from SGI which confirms the possibility of Linux. I would be quite surprised by this if it were the case.
--
"L'IT c'est moi!"
SGI has more than one line of supercomputers. ASCI Blue Mountain is an SGI Origin2000 machine. I don't think we can expect to see Linux on Crays in the near future. (And didn't SGI just divest Cray again?)
It's true that Linux hasn't currently "mastered" 16-CPU SMPs. If Larry McVoy is to be believed, that's probably a good thing for the correctness and stability of the kernel.
CPlant is number 129 on the TOP500 list; it's the fastest machine currently listed that runs Linux. It used to be ranked in the top 100, but newer machines have since been added.
The 1000-node genetic-programming cluster mentioned recently on Slashdot, and distributed.net, are not on the list at all; to get on the TOP500 list, you need to run LINPACK fast. This (a) does not interest some people, and (b) is not well-suited to the structure of some clusters. A parallel machine that is very fast for some tasks may be very poor at others.
With regard to DES cracking: the EFF's DES cracker, which cost less than a quarter of a million dollars to build, cracks DES keys in a matter of days. Such a machine can scale linearly. The fact that distributed.net takes a month to crack a single DES key does not demonstrate that the NSA requires months to do the same.
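To make the linear-scaling point concrete, here is a rough sketch of the arithmetic. The per-unit search rate below is a made-up figure for illustration only, not the EFF machine's actual rate, and the function name is hypothetical. It assumes a pure brute-force search that, on average, covers half of the 56-bit keyspace before hitting the key.

```python
DES_KEY_BITS = 56
DES_KEYSPACE = 2 ** DES_KEY_BITS


def expected_search_seconds(keys_per_second_per_unit, units):
    """Average brute-force time: half the keyspace divided by the
    combined search rate of all units working in parallel."""
    combined_rate = keys_per_second_per_unit * units
    return (DES_KEYSPACE / 2) / combined_rate


# Hypothetical per-unit rate, chosen only to show the scaling.
ASSUMED_RATE = 1e9  # keys per second per unit (an assumption)

for units in (1, 10, 100):
    days = expected_search_seconds(ASSUMED_RATE, units) / 86400
    print(f"{units:>3} units: about {days:.1f} days on average")
```

Ten times the hardware gives one tenth the time, which is why the month that distributed.net needs says little about what a better-funded attacker would need.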
Generally, "secure" DoD sites are not connected to the Internet, auditing or no.
"supercomputer" and "enterprise server" are very different categories. "enterprise server" means "mainframe killer" -- that is, reasonable CPU speed, fast I/O, but above all, reliability. Linux is definitely fit for supercomputing, and is being used for supercomputing all over the world. Linux is probably not quite yet fit for being an "enterprise server".
However, many supercomputers do indeed need lots of disk storage.
With regard to http://www.gapcon.org/listg.html: someone said, "You will notice there are no Linux installations in that list." Actually, they list a bunch of machines from Atipa Linux Solutions at LANL, the Avalon Beowulf at LANL, the Parnass2 Beowulf at the University of Bonn, the LoBoS Beowulf at the National Institutes of Health, the Centurion Beowulf at the University of Virginia, and possibly some others. They're a minority, but they're way cheap, and they're growing fast.
With regard to the GPL: if I hack something proprietary into Linux, I need to give source, licensed under the GPL, only to people whom I give binaries to. I am under no obligation to give source to anyone else. However, the person to whom I give it can put it up on their FTP site if they want.
Kragen Sitaker, current Beowulf FAQ maintainer
Actually no, nobody is saying that the NSA is using a general purpose supercomputer to crack DES. Specialized hardware is clearly the way to go. Witness the DES cracker built by the EFF for US$250,000. This is a purely brute force attack. Even so, along with Distributed.net, they broke DES in 22 hours. The NSA could probably use more efficient techniques of cryptanalysis and more expensive hardware to be faster.
As far as I know, no one in the private sector has built such a beast for cracking RC5, but it could certainly be done.
The NSA would typically use a supercomputer only for the last step in certain factoring algorithms for breaking RSA or other such problems that require a single system image and huge amounts of memory. This was the technique used by a group of researchers that cracked a 512 bit RSA key. The first phase was distributed, but the last step required a single supercomputer.
--
"L'IT c'est moi!"
I remember it was originally 7,000 - 7,500, and then the steps for upgrading to 9,000 (or so) processors were going to take place. Of course, that was the idea when it launched, so I'm sure it's been at 9,000 for a long time.
ASCI Red is at Livermore Labs, right? Too bad, I'm told, it hasn't been too useful for NIF...
"Open Source?" - Press any key to continue
The second fastest current supercomputer (ASCI Blue) resides at LANL as well. Will they be running these concurrently, or are they scrapping their current cluster to put this one in?
The current fastest, ASCI Red, is located in Sandia National Labs. Both these systems were built by Intel I believe, and are gigantic clusters running some custom software.
The biggest Linux box/cluster/whatever is Avalon I believe, currently ranked #160, and also resides in LANL. Wasn't there one in the 50-60 range as well?
Personally I can't see this being anything but a good thing for Linux, both in terms of another selling point (Hey, it's good enough to be on the world's fastest computer!) as well as (hopefully) advancements in scalability (I can't imagine SGI implementing a massive cluster of single CPU boxen, meaning they may take a long hard look at SMP code and optimize it for whatever platform they're considering rolling out for this).
And, it has to be said:
Imagine a Beowulf of these things! Heh...
Also, I fail to see how incredible CPU power would be used to enhance pr0n-downloading speeds. That's generally a bandwidth issue, not a CPU issue.
---
"'Is not a quine' is not a quine" is a quine.
"'Is not a quine' is not a quine" is a quine.
Quine "quine?
No, I'm sorry folks. That line is to be read:
to be installed in Nate Fox's garage (NFg) in suburban Los Angeles.
You'd think a major publication like C|Net would get their facts straight ;)
...Nothing is so smiple it cant get screwed up.
-----
If Bill Gates had a nickel for every time Windows crashed...
Don't get me wrong, I love Linux. But let's be realistic here. That article contained little to no factual data. The only thing they're running on is conjecture -- SGI has expressed interest in Linux in the past, so they're assuming this hundred-million-dollar multi-teraflop machine will run it? I'll believe it when I see it.
I'll bet you, unless SGI come up with some sort of Beowulf solution instead of their time-tested Cray supercomputers, we'll be seeing yet another Unicos machine at the top of the "World's Fastest Supercomputers" list in a few years.
- A.P.
--
"One World, one Web, one Program" - Microsoft promotional ad
"Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
Wakko is absolutely right when it comes to the relative merits of Linux and Unicos on SMP or parallel machines - Linux wasn't designed for that job and won't do it. Not now, not in the near future. Unicos currently runs machines with up to 2048 processors. Linux hasn't really mastered 16.
However, the proposed machine is bound to be a cluster, not a single machine. Linux will do this just fine. In fact Linux would run on ASCI Red if there were drivers for the networking hardware. (They've booted individual nodes with both Linux and NT).
So there is a possibility of Linux. I guess it might be easier to use Linux on IA64 than port IRIX/Unicos.
My reservation here is that neither the Itanic nor MIPS chips offer cutting edge performance.
First of all, there's quite a bit of difference between fastest and fastest known. I can't imagine that both the Chinese AND the American governments don't have some exceedingly classified hardware that blows the pants off the open stuff (read: governmental phallus-phlashing).
Second, the meaning of fastest is very unclear. I'd go so far as to say that any system that implements a given function in software instead of hardware is going to be orders of magnitude slower than the state of the art. Witness the EFF DES cracking machine, 3D Graphics Accelerators, even Math Coprocessors. Fitting a square peg into a round hole is actually a pretty common occurrence in the computer world, but it proceeds at a tortoise-like rate compared to what can be pulled off with raw gates.
That's why XISC--Extensible Instruction Set Computing--is probably the upcoming processor paradigm. Programmers need the ability to redefine round holes into square ones, so the square pegs fit right in.
Yours Truly,
Dan Kaminsky
DoxPara Research
http://www.doxpara.com
First off, this has absolutely nothing to do with the Cray division (which several people, including Hemos, seem to think). This project, and the current ASCI Blue Mountain project, are both built from SGI's Origin line of servers (the SN1 is the next generation of this). Cray's unit only works with the Vector supercomputers. Also remember (from August) that SGI is going to be getting rid of this division.
Second, I'd like to point out that this article is really just speculation. Read it closely, they take a couple of facts - SGI is trying to get this contract, they are working on ramping up the scalability of Linux, and the new SN1 servers will be eventually based on the Itanium - and they try to draw a conclusion that Linux might be what is run on this supercomputer.
Now, I'm not saying that this isn't a valid argument - but Linux as an operating system has a LONG way to go before it supports the massive number of processors and amount of memory we're talking about here (Blue Mountain has 6144 processors). There is still a lot that Linux is missing.
This does not necessarily mean that SGI can't get it that far, especially with its experience in scalable OSes. I would love to see them do it. But when you already have an architecture and OS that works running on the Blue Mountain configuration, it would be going quite a bit out of their way. So, until I hear SGI themselves say "we're running Linux on T30", I'm going to be skeptical.
If they DID - hey, that'd be a GREAT push for Linux. Let's hope they go for it.
Yes there will be hard and soft faults in a large experimental system. That is understood by you and me and SGI and the customer. You don't push the limits with a tried and true platform. You push the envelope when you first try a technology.
SGI will likely be able to bring redundant processors and subsystems online (hot-swappable) as needed. Software becomes stable over time on these types of systems. The key is that SGI support will have access to all source whenever they need it.
I'm one of the SysAdmins for the Centurion cluster at UVa, as well as a student in the Operating Systems class taught by Andrew Grimshaw (the professor who built Centurion). First of all, I'll play the part of Greg Lindahl briefly and say that Centurion is technically not a Beowulf cluster. AFAIK, part of what defines a Beowulf-class machine is one or more head nodes -- usually one -- which dispatch jobs to multiple client nodes. This is somewhat like Asymmetric Multiprocessing, in which there is a master processor which runs the OS and dispatches jobs to the slave processors. The head node(s) have more processing power, memory, etc., in order to be able to manage the other nodes. The nodes in the cluster itself are usually of homogeneous composition running some freenix, usually Linux.
Centurion itself consists of 128 DEC 21164 Alphas and 128 dual PII-400s, all running Linux. There are a few of us just itching to try and run LINPACK on it. :) There are also several assorted machines which serve as front ends, running anything from FreeBSD to Solaris to (ick) IRIX. There is no head node which dispatches jobs, and each node is independent of the others (no sub-clusters within the cluster).
I know I'm currently stepping on a lot of toes and re-hashing a lot of info, so visit the Beowulf FAQ. Kragen's done a great job of gathering info, and it's a good read.
Secondly, Professor Grimshaw discussed the PetaFLOP project the other day, for which the LANL project is a stepping stone. If you shell out enough money, you can have a GigaFLOP machine on your desk. If you shell out even more money, you can have a TeraFLOP machine in the raised-floor room with tons of A/C at your research center. The challenge now is to bump it up another 3 orders of magnitude. By combining SMP nodes with a message passing interface or some other form of managing distributed memory, LANL hopes to build this 30 TF machine.
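For anyone who wants the scale spelled out, a small sketch of the arithmetic follows. The constant and function names are mine, and the figures are just the standard SI prefixes plus the 30 TF target mentioned above.

```python
import math

GIGAFLOP = 1e9   # floating point operations per second
TERAFLOP = 1e12
PETAFLOP = 1e15


def orders_of_magnitude(smaller, larger):
    """Number of powers of ten separating two performance levels."""
    return math.log10(larger / smaller)


# The 30 TF machine discussed in the thread.
target = 30 * TERAFLOP

print("TeraFLOP to PetaFLOP:", orders_of_magnitude(TERAFLOP, PETAFLOP), "orders")
print("PetaFLOP / 30 TF:", round(PETAFLOP / target, 1), "times")
```

So even a 30 TF machine is still a factor of roughly 33 short of a PetaFLOP, which is why it counts as a stepping stone rather than the destination.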
However, SGI may not get the bid as C|NET reports. When the gov't spec'ed out the machines they want to have as nodes, they requested 16-processor nodes. So let's look at the Big Boys of horsepower:
Which leaves
Just my 2 drachmas
-OWJones
...that Linux has now successfully captured the "high-end" computing marketplace? ;)
30 Teraflops sounds nice and crunchy.
Linux: It's what's for dinner.
SGI has not announced that they are going to put Linux on this beast, only that it was "a possibility". Given that IRIX currently scales well beyond Linux, I doubt they'd replace it on such short notice. The problems with IRIX, namely nonstandard library layout and bad security, are non-issues in a supercomputer environment.
In related news, Microsoft has confirmed that it is "possible" that WindowsCE will be used for critical life support systems in upcoming space missions. I mean come on. Neither is strictly speaking impossible, but I'd take it with a large grain of NaCl.
--
"L'IT c'est moi!"
It depends on the dimension you're scaling in. Parallel processing power is the easiest to scale in.
Mainframes aren't that much more powerful than desktops in processing power, but they are much more powerful in terms of I/O bandwidth and storage capacity.
The scalability issue that most people are talking about is scalability on an enterprise network in number of users and diversity of missions. The scalability challenge in those dimensions is manageability. The MS argument about scalability is basically that an enterprise can manage its IT assets more cheaply on NT (and manifestly looks easier to a PHB because you use the familiar windows GUI).
However, I think most people have figured out by now that the "user friendliness" of Windows is basically a cardboard facade put up on a big honking hunk of complexity. I think Linux (as well as other Unices) has the opposite problem in that a lot of its utilities appear unnecessarily complicated, but the underlying system is much cleaner and more modular. It would be cool if every package adopted the same scheme for its configuration files, perhaps XML based:
<netdef>
  <netname>sales-dept-subnet</netname>
  <ipnetdef>
    <ipnetaddr>192.168.0.64</ipnetaddr>
    <ipnetbits>26</ipnetbits>
  </ipnetdef>
</netdef>
These could be manipulated with any combination of GUI tools, Web tools, command line tools, or even special YACC grammars purpose built to your enterprise network.
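As a sketch of how little tooling such a format would need, here is one way a command line tool might read that fragment using only the Python standard library. The element names come from the example above; the function name `load_netdef` and the idea of turning the address and bit count into a network object are my own assumptions, not part of any existing package.

```python
import ipaddress
import xml.etree.ElementTree as ET

# The example config from the post, with its closing tags written out.
CONFIG = """
<netdef>
  <netname>sales-dept-subnet</netname>
  <ipnetdef>
    <ipnetaddr>192.168.0.64</ipnetaddr>
    <ipnetbits>26</ipnetbits>
  </ipnetdef>
</netdef>
"""


def load_netdef(xml_text):
    """Parse one <netdef> element into a (name, network) pair."""
    root = ET.fromstring(xml_text.strip())
    name = root.findtext("netname").strip()
    addr = root.findtext("ipnetdef/ipnetaddr").strip()
    bits = int(root.findtext("ipnetdef/ipnetbits"))
    # ip_network raises ValueError if the address has host bits set,
    # which doubles as a cheap sanity check on the config.
    return name, ipaddress.ip_network(f"{addr}/{bits}")


name, network = load_netdef(CONFIG)
print(name, network, network.num_addresses)
```

The same parsed structure could just as easily feed a GUI or a web front end, which is the point of agreeing on one format.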
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
Check out http://www.gapcon.com/listg.html
This list of the top supercomputer sites is as close as you can get to up-to-date and authoritative in that field.
You will notice that there are no Linux installations in that list. Linux on a supercomputer has not been proven to be viable for the highest end systems yet. What happens if SGI fails to deliver? The box may be installed, it may boot, but what happens to Linux's reputation if the system can't fulfill its mission?
Also, keep in mind that SGI does need some good press. You could say that they are desperate for good press right now.
SGI: We are making a bid for the T30 supercomputer
CNET: What OS will you be using?
SGI: No comment.
CNET: (aha!) So what CPU will you be using?
SGI: No comment.
CNET: (Aha!) Can you confirm that you will be using Linux.
SGI: No comment.
CNET: AHA!!!!! (Whoops did I say that out loud?)
Later...
CNET: (Damn, I forgot to ask about alien technology, I guess I'll just go with the Linux angle. I'll throw some "experts" in there for "balance". Slashdot readers will read it....I'll be rich) Rich I tell you!! haHaHAHAHAHAHAH!!!!!! (cough)
--
"L'IT c'est moi!"