Transmeta Meets Blades
The Griller writes "Gordon Bell, one of the creators of VAX, and Linus Torvalds were at the launch of a new supercomputing platform at the Los Alamos National Laboratory. Based on Crusoe processors from Transmeta and running a version of Linux, it is aimed at being cheaper than conventional supercomputers by requiring no cooling and lower maintenance."
Basically, it's blade clustering, using Beowulf.
It all comes down to "power consumption, size, reliability and ease of administration", apparently.
And the marketing people at RLX Technologies should be shot for not having a press release up for this, as it's all based on their product...
StrongARM has no FPU and is not as fast as a Crusoe.
A Pentium III still needs way more energy than a Crusoe. You have to keep in mind that the energy savings of Intel chips are usually accomplished by lowering the processor clock rate, which will not help very much if you need processing power. The Crusoe also changes the clock rate, but does so dynamically, so that you always have the speed you need. Additionally, it has far fewer transistors and therefore needs less energy even at full speed.
***Quis custodiet ipsos custodes***
Obviously you are not familiar with the ARM family of processors - they are very similar to the Crusoes, and in particular they don't require any active cooling either.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Friends don't let friends enable ecmascript.
I was always amused at how a unibus PDP-11 with 512K of main memory could beat the snot out of a 386 at real-world tasks. I/O is critical for so many applications...
However, don't write off clusters yet; have you looked at The AGGREGATE? The link points to Klat2 (Gort, klaatu barada nikto! Sorry) which is a very photogenic aggregate-based machine. The techniques these guys are developing may bring high I/O throughput into clustering at mainframe levels eventually.
Ok, finally that's a legitimate response. It's true ARM doesn't include an FPU. However, the last I checked (and I'm not real up to date on it) using libfloat it had emulation good enough to keep up with IA32 fairly well on FP.
I imagine, though, that this is probably the reason. It seems reasonable that supercomputer work would require some FP, although I don't know for sure.
fast 128-bit vector processing unit (very nice for scientific calculation)
Actually, vector processing is essentially useless for most scientists as long as the compiler doesn't autovectorize the code.
First, most algorithms are NOT trivially vectorizable.
Second, most scientific code is Fortran-77 that has been developed over decades. If there are trivial function calls where you can use an Altivec library it's fine, but there is no way people are going to rewrite all their code in Altivec since it would destroy portability (and Altivec primitives only exist for C/C++ anyway).
Third, almost all scientific software uses double precision, and Altivec's floating-point instructions are single precision only.
There are a handful of cases where vector processing is wonderful, but it's a very limited subset (and although that subset might be important to you, it doesn't suffice for most users). Just look at x86; you can argue that SSE/SSE2 isn't as capable as Altivec, but it definitely accelerates performance significantly. Still, very few programs are handcoded with those instructions even though the x86 market is 20 times larger and SSE2 supports double precision - it simply isn't worth the effort.
The G4 Altivec might be wonderful, but I want my code to run fast on all platforms, and have a lifetime of at least 10-20 years. If we are to invest any time in handcoding vector instructions it will be SSE and not Altivec, since that userbase is 20 times larger...
Actually I found some specs, SA@600MHz uses 450mW, so you could power 4 for the same price as one Crusoe. Perhaps it's a political thing, I'd certainly rather Transmeta get the business than Intel, but I still don't see the technical justification.
Your math is wrong. It should be:
1,555,200 * 0.00175 kW * 0.10 (dollars per kWh power cost) = $272.16 electricity cost/year (Crusoe)
1,555,200 * 0.00575 kW * 0.10 (dollars per kWh power cost) = $894.24 electricity cost/year (Intel)
The savings in absolute dollars are much less significant when you divide by 10.
Actually, vector processing is essentially useless for most scientists as long as the compiler doesn't autovectorize the code.
That's wrong, and so is the rest of your post.
double x[veclen]; // init it somehow
double y[veclen]; // init it somehow
double scalar_product = 0;
for (int i = 0; i < veclen; i++) {
scalar_product += x[i] * y[i];
}
The above is scalar code. Any compiler aware of a vector processor compiles it to a single vector instruction. At least that was the case 14 years ago when I worked on vector processors.
I'm not sure if Altivec is a true vector processor; I think, like MMX, it supports only very limited SIMD processing, but as I say, I'm not sure.
Operations on "arrays" (that is, vectors) are very easy to map onto vector processing units,
regardless of whether the loop is as simple as the one above or has offsets or gaps like i += 3.
The same is true if the result is again a vector, of course.
Manual vector processing instructions get interesting if the loop above calculated a vector and that vector was the input for a further stage.
Like this:
Vector a, b, c, d, e;
Scalar i, j, k;
a = i*b + j*c; // result is a vector
e = a + k*d;
Usually you would have loops calculating that, and the second loop would run after a is completely calculated.
If there is a second vector processor (or just a second unit on the processor), you can feed a directly into it to calculate e.
AND THIS is hard for a compiler to figure out. That's probably what you meant. As all vector units differ in that respect, there exist Fortran libraries with standard subroutines for that.
angel'o'sphere
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.