Efficient Supercomputing with Green Destiny
gManZboy writes: "Is it an oxymoron to have an efficient supercomputer? Wu-Chun Feng (Los Alamos National Laboratory) doesn't believe so - Green Destiny and its children are Transmeta-based supercomputers that Wu thinks are fast enough, at a fraction of the heat/energy/cost, according to ACM Queue." 240 processors running under 5.2kW (or less!) is nothing to sneeze at. The article offers up this question: might there be other metrics that might be important to supercomputing, rather than relying solely on processing speed?
I knew that sword was beefy, but that's insane!
Most definitely nothing to sneeze at. I have to ask, though: how long ago was it insane for a supercomputer to put out as much heat as the average enthusiast PC puts out today?
"Aye, and if my grandmother had wheels, she'd be a wagon!" -- Montgomery Scott, ST:III
How about how much footprint and weight they take up as a metric to consider? ;)
Join the TWIT army now!
The MHz war has been going on for soooo long that everyone just accepted that more MHz meant a faster machine. Well, 64-bit computers are placing chip manufacturers in a position where they have to market on a platform that declares that MHz doesn't really matter.
I think the question is a bit naive, though, as everyone knows a hundred software tools that rate CPU performance rather than just relying on MHz.
Nick Powers
Encryption: I may not agree with what you say, but I will defend your right to encrypt it...
a (kinder, gentler) Beowulf cluster of these!
Speaking of which, why hasn't anyone made an OpenMosix cluster-in-a-box yet?
Imagine a beowulf cluster of these! Might set my house on fire..
Why bother? If you have to sacrifice computational power for energy efficiency, then what is the point of having a supercomputer? Isn't compute power the whole purpose of having a supercomputer?
RE:"Doesn't matter if the inside contains a computer that's half the speed of competing x86 boxes LALL!!!"
or if it just has a squirrel cage and a rat...
in spite of my rage i am still just a rat in a cage
...would be how large a Quake Death Match it could host. Now if they could make a Beowulf cluster with these... /. observations are out of the way. Rational discussion can^H^H^H may begin.
OK, the obligatory
Rick
This was on The Register AGES ago... I remember it well, because that's where I learned of the Gelsinger Coefficient.
For this specific application, scientific supercomputing with a blade architecture, would native VLIW offer any performance benefits?
If it would only be a few percent, it wouldn't be worth it. Transmeta has picked the game they want to play, and it would be a big deal at this point to engineer a special version of their chips that make it possible to run native VLIW code.
I'm guessing that typical scientific processing involves a lot of loops that run many iterations, which is the ideal situation for the code morphing engine; so I'll go out on a limb and predict that it is not worth it to make a special version of the chip.
steveha
lf(1): it's like ls(1) but sorts filenames by extension, tersely
I was talking to a friend the other day about a bunch of lab computers that my school is getting rid of - a bunch of old Pentium MMXs. He suggested turning them into a cluster. But after thinking about it, I realized that the group of about 10 old computers we had would consume more power, and would likely be considerably slower, than a single one of the 2.4GHz Dells that replaced them. "What's the point?" I said.
Applying that here, the little VIA chips run at roughly the speed of a Celeron 500 or so, so I'd say something like a 3GHz AMD Athlon would be about as fast as 6 of the VIA chips. So you are still saving some power, but not as much as it would seem at first, as you need many low-power chips to equal the speed of one faster chip. Not to mention the power consumed by having more motherboards, network cards, switches, and other associated hardware.
Something to really look at is the cluster of G5s. The G5 chips use a lot less power than their x86 counterparts. I bet that cluster of G5s is right up there with this VIA supercomputer in terms of processing power per watt. And it's way more cool to boot.
While the points that the author makes are true about the "frugal consumer", those aspects are not applicable to supercomputing.
Overall performance is much more important than efficiency. While efficiency is commendable at all computing levels, if efficiency is a very important aspect, then a supercomputer is probably not for you.
Post #7517209 (above) is plagiarized from Post #7214390. Don't mod up plagiarism!
The other point is: how expensive is it to support a cluster? Not only the energy consumption, but also the infrastructure. It is pretty darn difficult to keep a thousand processors cold. You may need a special building, a special power supply for it, etc.
A final point: as far as I know, the rule of thumb is that the floating-point performance of these energy-efficient processors is of the same order of magnitude as that of regular processors, maybe a factor of 2 difference.
You do the math ... :-)
If your supercomputer cluster nodes are cheaper/more energy efficient, then given a fixed budget, your supercomputer can be bigger!
This makes me wonder if such a configuration might find its way into ordinary extreme performance desktop/laptop computers.
Especially with the new wave of Media Center based PCs...small small machines that are very powerful....is THIS the future of servers? Perhaps in a few years my web pages will all be served up from something like a handheld PC, with several processors and always-on WiFi? The possibilities are endless, but I see this DEFINITELY making it into laptops of some creed...those ultra-high-performance ones that nobody seems to buy.
Vivan los pocos!
Robert Cringely pointed out the benefits of this tradeoff (pure speed vs. low heat/high maintainability), pointing to Google's use of Pentium IIIs for their server farms.
Mod parent DOWN -- was copied from another post!!
Why are supercomputers primarily benchmarked by their speed? The answer comes when you consider that almost all labour-saving devices are measured in the work they perform in a given period of time.
Time is the only truly finite resource from a human perspective. As technology has progressed, distances have been conquered, vast energies harnessed, but old Father Time is still inescapable.
As a result, we place great value on just how much time is taken to accomplish anything.
with the Centaur C5P processor core. Draws about 8W for the chip @ 1GHz. Let's assume 12W total for network boot.
[ see image here: peertech.org/hardware/viarng/image/nano-itx-c5p.jpg ]
With 5,200 Watts for Green Destiny, you could use 433 of these boards for the same power consumption.
The on chip AES is clocked at 12.5Gbps, Entropy at 10Mbps (whitened). Thus you would have
422Ghz of C5 processor power
5.412TB/s of AES (yes, terabytes)
4.22Gbps of true random number generation.
Yeah, these are really rough estimates, but that is a lot of bang for your kilowatt buck no matter how you slice it.
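For anyone who wants to redo the rough estimate above, here is a minimal sketch of the arithmetic. The inputs are just the figures quoted in the post (12 W per board, 1 GHz, 12.5 Gbps AES, 10 Mbps entropy), not measurements, and the variable names are made up for illustration. Note that because the per-chip AES rate is quoted in gigabits, the aggregate works out to about 5.4 terabits per second, which is roughly 0.68 terabytes per second.

```python
# Back-of-the-envelope totals for a pile of Nano-ITX boards.
# All inputs are the rough figures quoted in the post, not measurements.
POWER_BUDGET_W = 5200   # Green Destiny's quoted power draw
BOARD_W = 12            # assumed total draw per network-booted board
CLOCK_GHZ = 1.0         # per-chip clock
AES_GBPS = 12.5         # per-chip AES throughput, gigabits per second
RNG_MBPS = 10           # per-chip whitened entropy, megabits per second

boards = POWER_BUDGET_W // BOARD_W          # whole boards within the budget
total_ghz = boards * CLOCK_GHZ              # aggregate clock (a crude proxy)
aes_tbits = boards * AES_GBPS / 1000        # aggregate AES, terabits per second
aes_tbytes = aes_tbits / 8                  # same figure in terabytes per second
rng_gbits = boards * RNG_MBPS / 1000        # aggregate entropy, gigabits per second

print(boards, total_ghz, aes_tbits, aes_tbytes, rng_gbits)
```

Summed clock speed says little about real throughput, so treat the GHz total as a curiosity rather than a benchmark.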
With a cutting edge P4 approaching 100W the efficiency of these less powerful but fully capable systems will become increasingly attractive.
I would not be surprised to find bleeding edge processors relegated to gamers and workstations as most computing tasks start migrating towards small, silent, low power systems that simply *work* without eating up desk space, filling a room with fan noise and driving the electricity bill higher with continuous 100's of W draw.
No shit. Long live tinfoil hats.
might there be other metrics that might be important to supercomputing, rather than relying solely on processing speed?
Yes, people often consider flops/watt to operate, and flops/dollar to buy.
Speed alone means nothing. All these atoms in my apartment can do billions of operations per second, but they can't even play mp3s.
------DO NOT WRITE BELOW THIS LINE------
remember when computers were big enough to be in warehouses? and 10 years ago, people theorized that computers would be small enough to fit in your watch or hand? and that was just theory and considered fiction.
now we have palm pilots and watches that can store data (see the usb wrist watch)
so, really, a supercomputer that doesnt use that much energy isnt impossible.
anything's possible, one just has to break through the set barriers technology has made. if no one did that, we still would be sitting around in caves.
What does he want, affirmative action for slow CPUs?
Power is a consideration in TCO.. At the end of the day it comes down to $$/TFLOP (etc).
Lottsa years ago I used to maintain a CDC 7600. Not only did it need full refrigeration, but its original design spec was for an MTBF of 15 hours! The designers reckoned that it was so fast that the biggest job imaginable could be run in that time. Of course it did better than that in the end, but it was a bugger of a job to fix, and the backplane was 6 inches deep in twisted pair wires. Just imagine making wiring changes.
The article offers up this question: might there be other metrics that might be important to supercomputing, rather than relying solely on processing speed?
Um, yes?
sounds like some kind of floor cleaner!
yeah but he got modded +1. 0wned
Insightful like a motherfucker! Highfive!
If you do the math with X (10,280 instead of 13,880 performance, 1,000 sq ft instead of 21,000 sq ft, and 800 kW instead of 3,000 kW) you get a 337-fold increase in performance per square foot, rather than 65-fold, and an 832-fold increase in performance per watt, rather than 300-fold, vs. the Cray.
Of course I dunno the numbers for the Transmeta solution yet!
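The scaling in the post above can be sketched like this: take the article's ASCI Q vs. Cray improvement and multiply it by how much better "X" does per unit of space or power than ASCI Q. The dictionary and function names are hypothetical; the numbers are the ones quoted in the post and the article. With these inputs the per-watt result lands close to the quoted 832, while the per-square-foot result comes out nearer 1,000 than 337, so the poster may have used a different footprint than the 1,000 figure.

```python
# Figures as quoted in the post; "X" is the poster's unnamed alternative system.
ASCI_Q = {"perf": 13880, "sqft": 21000, "kw": 3000}
X = {"perf": 10280, "sqft": 1000, "kw": 800}

# The article's quoted improvement of ASCI Q over the Cray C90.
ASCI_Q_VS_CRAY = {"sqft": 65, "kw": 300}


def fold_vs_cray(system, resource):
    """Scale ASCI Q's fold-improvement by the system's advantage per unit resource."""
    system_density = system["perf"] / system[resource]
    asci_q_density = ASCI_Q["perf"] / ASCI_Q[resource]
    return (system_density / asci_q_density) * ASCI_Q_VS_CRAY[resource]


per_watt = fold_vs_cray(X, "kw")
per_sqft = fold_vs_cray(X, "sqft")
print(round(per_watt), round(per_sqft))
```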
GPL Deconstructed
. . . might there be other metrics that might be important to supercomputing, rather than relying solely on processing speed?
No. I have a rock that can sit and do nothing, consuming considerably less than even 5.2kW. You can talk efficiency and bang-for-buck all you like, but if you don't benchmark faster than (roughly) 100 common desktop machines, you don't get to call yourself a supercomputer.
The article offers up this question: might there be other metrics that might be important to supercomputing, rather than relying solely on processing speed?
Yeah. Does the supercomputer do what the customer needed it to do? Nobody in the world lays down money for a "supercomputer" these days so that they can be the fastest kid on the block ... or at least they shouldn't. Ostensibly, there are massive amounts of computing work that they need done, and they need something that can do it in a reasonable amount of time. Beyond that ... was it worth their money? If the answer to the last two questions was yes, then it was a success. Supercomputers are a tool; a very large tool, but a tool nonetheless.
Let me get this straight, you're comparing overall computing performance of a chip based solely on its clock speed? You must be new around here.
Well, there's spam egg sausage and spam, that's not got much spam in it.
Anyone interested in this sort of thing should check out the Beowulf mailing list - go to www.beowulf.org, and read through the (recent) archives. There's been some talk lately on different metrics.
Oh bullshit. This is a total troll, are you kidding me?
5.2kW cannot be sucked out of "a normal building power strip." And you are sure as heck going to notice 5.2kW of heat, and the regular everyday HVAC is most definitely not all it requires. "Uncooled ordinary room" my ass.
But if you do the math (clipped from another post of mine)
If you do the math with X (10,280 instead of 13,880 performance, 1,000 sq ft instead of 21,000 sq ft, and 800 kW instead of 3,000 kW) you get a 337-fold increase in performance per square foot, rather than 65-fold, and an 832-fold increase in performance per watt, rather than 300-fold, vs. the Cray.
And I don't know what the numbers for the Transmeta solution are.
GPL Deconstructed
It's not CPU speed that is important in supercomputers/clusters; it is the speed at which you can get data from one node to another, especially for memory access. If you have a 512-node system and node 3 needs a copy of node 40's memory, it has to copy it over.
Even if it's just 512MB over Gigabit Ethernet, and assuming 100% performance, it would still take about 5 seconds, which is many orders of magnitude slower than a local memory access. Just look at SGI machines which use NUMA and their Cray-Linux are 3.2 TeraBytes (bytes not bits). Now that's how you want to shift data
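A quick sketch of the transfer-time estimate above, assuming the 512 figure means megabytes and the link is a 1 Gbit/s Ethernet with no protocol overhead. The function name and the `efficiency` parameter are illustrative, not from the post.

```python
def transfer_seconds(size_bytes, link_bits_per_sec, efficiency=1.0):
    """Time to move size_bytes over a link, ignoring latency and contention."""
    return size_bytes * 8 / (link_bits_per_sec * efficiency)


MB = 10**6
GIGABIT = 10**9

ideal = transfer_seconds(512 * MB, GIGABIT)            # link at 100% efficiency
realistic = transfer_seconds(512 * MB, GIGABIT, 0.8)   # assumed 80% efficiency

print(ideal, realistic)
```

At a perfect line rate this is a little over 4 seconds, and any realistic overhead pushes it toward the 5 seconds quoted, which is why interconnect bandwidth dominates for workloads that shuffle memory between nodes.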
Rus
Cheap UK and US VPS
Money is usually finite, too. Especially in research. Power costs money. Cooling also costs money.
Is it an oxymoron to have an efficient supercomputer?
I thought heat was the real poison of "ultimate" computing...
So it seems likely computers will move towards those limits and become 'greener' with respect to how much energy they use...
To do otherwise would be counterproductive in terms of both efficiency and ecology.
That's needed given how much energy the US is using
Not a bad thing - but I wonder when green will move towards a technology that means less polluting in terms of hardware that gets trashed every year.
Subduction leads to orogeny
thanks for noticing and giving me the credit. the original post was indeed mine. this pig copied it.
Some drink at the fountain of knowledge. Others just gargle.
Especially when simulating nuclear weapons.
-Shane
I love teh int4rw3b!!!!!111one1
The only metric that really matters: total cost / instruction.
Total cost equals the sum of the following:
1) cost for the CPU+memory+I/O,
2) cost of energy to power the CPU over its lifetime,
3) cost of floorspace to house the CPU including cooling,
4) cost to write/purchase software for that particular CPU.
The system with the lowest total cost/instruction wins.
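The four cost terms above translate fairly directly into code. This is only a sketch: the `System` class, its field names, and every dollar and instruction count below are hypothetical placeholders, chosen to show the mechanics rather than to describe any real machine.

```python
from dataclasses import dataclass


@dataclass
class System:
    name: str
    hardware_cost: float          # 1) CPU + memory + I/O
    energy_cost: float            # 2) energy over the system's lifetime
    floorspace_cost: float        # 3) floorspace, including cooling
    software_cost: float          # 4) writing or buying software for it
    lifetime_instructions: float  # useful instructions executed over its life

    def total_cost(self):
        return (self.hardware_cost + self.energy_cost
                + self.floorspace_cost + self.software_cost)

    def cost_per_instruction(self):
        return self.total_cost() / self.lifetime_instructions


# Made-up candidates, purely for illustration.
big_iron = System("big_iron", 1_000_000, 400_000, 300_000, 300_000, 2e18)
blades = System("blades", 600_000, 100_000, 50_000, 250_000, 5e17)

winner = min([big_iron, blades], key=System.cost_per_instruction)
print(winner.name, winner.cost_per_instruction())
```

The hard part in practice is the denominator: "useful instructions over the lifetime" depends on uptime and on how well the workload actually runs on each machine.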
It's still valuable to have one or a few really friggin' fast processors versus a whole lot of smaller processors if you're running tasks that can't easily be subdivided. This is why people are still buying single processor PCs rather than multiprocessor boxen. If you're buying the setup for a specific purpose and multiple slower CPUs will do the job for you, then that's great; but you'll get more flexibility with speedy processors.
I'm considering building a server off the Nano-ITX, depending on price. Something the size of a small book to sit in a corner or on my desk, serving files over SSH with all that crypto acceleration. It will be a really cool platform... when it comes out.
Nano-ITX was announced only a month and a half ago, and hasn't been released yet. So at least wait until the end of the year when they get it out before suggesting building beowulf clusters out of it.
I hereby place the above post in the public domain.
let us do some math... .. :-D
5.3 kW
let's use a power factor of 0.75...
then the figure in VA is about 7 kVA.
now, use a standard 220 V (RMS), so that's about 32 amps. I'm sure that a standard #10 AWG power cable can handle that current (60 amps, in fact).
SO, maybe not your normal power strip.. but a circuit no bigger than the one you use for your electric clothes dryer can surely provide that power.
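The same calculation as a small sketch, assuming a single-phase supply and the 0.75 power factor picked above. The function name is made up, and the 110 V line is an added comparison, not part of the original math.

```python
def line_current_amps(real_power_w, power_factor, volts_rms):
    """Single-phase current: apparent power (VA) divided by RMS voltage."""
    apparent_va = real_power_w / power_factor
    return apparent_va / volts_rms


amps_220 = line_current_amps(5300, 0.75, 220)
amps_110 = line_current_amps(5300, 0.75, 110)

print(round(amps_220, 1), round(amps_110, 1))
```

Halving the voltage doubles the current, so the answer to "can a normal circuit feed it?" depends heavily on which side of the Atlantic the machine sits.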
Here is my 2 cents.
Buy In-Win MicroATX BT553 ($38), PC Chips M810CDLU with on board duron +2000 ($69)
and 3NET NIC ($3) from newegg.
Add 1GB Memory from crucial ($140)
You will have a node for around $250.
Make 20 of these guys and you have a supercomputer for $5000.
If one can pack the processors more densely, it would cut down on the wiring etc., or allow much shorter paths between nodes (better still, one might be able to put many processors on the same board), thereby increasing bandwidth (when you try to increase bus speed, path length and related current leakage do pose problems). This in turn means computations that require more 'random' communication between nodes can speed up. I suppose that's definitely worth pursuing for the more fine-grained computation where communication bandwidth is the bottleneck.
Yes, yes, those numbers are impressive but can it be used to destroy other weapons and conquer the Chinese underworld in the hands of a rebellious Manchurian girl? (Reference: Crouching Tiger Hidden Dragon)
EvilCON - Made Famous by
That is the only advantage of using a Transmeta CPU. Wouldn't it be more efficient to just use a regular VLIW CPU without all the x86 code morphing stuff?
I'd like to see an analysis that allows you to cost (I'd say price, but it's not just about money) the different components of a supercomputer and account for things like power, cooling, weight, size, infrastructure, etc. The factors would have to be weightable so that you can assign varying levels of importance (like if space is more precious than money). It wouldn't need to be in-depth or terribly exact, but I think it would help bring out the best possible choices.
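One crude way to get the weightable analysis described above is a weighted sum over normalized factors. Everything here is an assumption for illustration: the factor names, the two candidate systems and their numbers, and the choice of a simple linear score (lower is better).

```python
def normalize(candidates):
    """Scale each factor so the worst candidate on that factor scores 1.0."""
    names = next(iter(candidates.values())).keys()
    worst = {n: max(c[n] for c in candidates.values()) for n in names}
    return {label: {n: c[n] / worst[n] for n in names}
            for label, c in candidates.items()}


def best_choice(candidates, weights):
    """Return the label with the lowest weighted cost."""
    scaled = normalize(candidates)
    scores = {label: sum(weights[n] * v for n, v in factors.items())
              for label, factors in scaled.items()}
    return min(scores, key=scores.get)


# Hypothetical systems: price in $M, power in kW, floor area in sq ft,
# and hours to finish some reference job.
candidates = {
    "A": {"price": 10.0, "power_kw": 3000, "floor_sqft": 20000, "runtime_h": 1},
    "B": {"price": 1.0, "power_kw": 10, "floor_sqft": 10, "runtime_h": 20},
}

speed_first = {"price": 0.05, "power_kw": 0.05, "floor_sqft": 0.05, "runtime_h": 0.85}
space_first = {"price": 0.25, "power_kw": 0.25, "floor_sqft": 0.40, "runtime_h": 0.10}

print(best_choice(candidates, speed_first), best_choice(candidates, space_first))
```

The same two machines swap places depending on the weights, which is really the point: the "best" system is a statement about what the buyer values.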
[Fuck Beta]
o0t!
Most regular circuits are rated at 20 amps maximum. And that's with 110V that we have here in the US, where this thing is located. That means you'd need double the current, 64 A. Sorry, not from a "Normal building power strip."
Have you ever tried picking one of those things up? I have. I worked on the G5 cluster. Those SOBs were heavy. Nice to look at, but they suck to bring to LAN parties.
Being a smartass is a much better thing than being the alternative.
Maybe it would be possible to convert the heat from the processor into more electricity somewhere else. That would be cool.
READY.
PRINT ""+-0
Now most people who buy a car are more concerned with other features - passenger comfort, style, efficiency.
I don't know anything about you, but I'm now absolutely certain you're not a resident of the United States.
No.
Faster, ever faster!
I browse Slashdot at +3, Funny
might there be other metrics that might be important to supercomputing, rather than relying solely on processing speed?
A supercomputer needs lots of blinkylights that flash in random patterns that people can try to find a pattern in...
Making a case for Efficient Supercomputing
From Power
Vol. 1, No. 7 - October 2003
by Wu-Chun Feng, Los Alamos National Laboratory
It's time for the computing community to use alternative metrics for evaluating performance.
A supercomputer evokes images of big iron and speed; it is the Formula 1 racecar of computing. As we venture forth into the new millennium, however, I argue that efficiency, reliability, and availability will become the dominant issues by the end of this decade, not only for supercomputing, but also for computing in general.
Over the past few decades, the supercomputing industry has focused on and continues to focus on performance in terms of speed and horsepower, as evidenced by the annual Gordon Bell Awards for performance at Supercomputing (SC). Such a view is akin to deciding to purchase an automobile based primarily on its top speed and horsepower. Although this narrow view is useful in the context of achieving performance at any cost, it is not necessarily the view that one should use to purchase a vehicle. The frugal consumer might consider fuel efficiency, reliability, and acquisition cost. Translation: Buy a Honda Civic, not a Formula 1 racecar. The outdoor adventurer would likely consider off-road prowess (or off-road efficiency). Translation: Buy a Ford Explorer sport-utility vehicle, not a Formula 1 racecar. Correspondingly, I believe that the supercomputing (or more generally, computing) community ought to have alternative metrics to evaluate supercomputers, specifically metrics that relate to efficiency, reliability, and availability, such as the total cost of ownership (TCO), performance/power ratio, performance/space ratio, failure rate, and uptime.
Motivation
In 1991, a Cray C90 vector supercomputer occupied about 600 square feet (sf) and required 500 kilowatts (kW) of power. The ASCI Q supercomputer at Los Alamos National Laboratory will ultimately occupy more than 21,000 sf and require 3,000 kW. Although the performance between these two systems has increased by nearly a factor of 2,000, the performance per watt has increased only 300-fold, and the performance per square foot has increased by a paltry factor of 65. This latter number implies that supercomputers are making less efficient use of the space that they occupy, which often results in the design and construction of new machine rooms, as shown in figure 1, and in some cases, requires the construction of entirely new buildings. The primary reason for this less efficient use of space is the exponentially increasing power requirements of compute nodes, a phenomenon I refer to as Moore's law for power consumption (see figure 2); that is, the power consumption of compute nodes doubles every 18 months. This is a corollary to Moore's law, which states that the number of transistors per square inch on a processor doubles every 18 months [1]. When nodes consume and dissipate more power, they must be spaced out and aggressively cooled.
Figure 1
Without the exotic housing facilities in figure 1, traditional (inefficient) supercomputers would be so unreliable (due to overheating) that they would never be available for use by the application scientist. In fact, unpublished empirical data from two leading vendors corroborates that the failure rate of a compute node doubles with every 10-degree C (18-degree F) increase in temperature, as per Arrhenius' equation when applied to microelectronics; and temperature is proportional to power consumption.
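The doubling rule quoted above is easy to play with in code. This sketch implements only the rule of thumb as stated (failure rate doubles per 10-degree C rise), not the full Arrhenius equation; the function name and the `doubling_step` parameter are illustrative.

```python
def relative_failure_rate(delta_t_celsius, doubling_step=10.0):
    """Failure-rate multiplier for a temperature change, per the doubling rule."""
    return 2.0 ** (delta_t_celsius / doubling_step)


for delta in (-10, 0, 10, 20, 30):
    print(delta, relative_failure_rate(delta))
```

Under this rule a node running 30 degrees C hotter fails eight times as often, and one running 10 degrees cooler fails half as often, which is the reliability argument for low-power nodes in a nutshell.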
We can then extend this argument to the more general computing community. For example, for e-businesses such as Amazon.com that use multiple compute systems to process online orders, the cost of downtime resulting from the unreliability and unavailability of computer systems can be astronomical, as shown in table 1: millions of dollars per hour for brokerages an
[Fuck Beta]
o0t!
Well, it's about time they realize clock speed is the only thing to look at. Most people don't use the cycles they have on a 4 or 8 cpu box! Imagine if they started to clock disk I/O to a shared source! What would happen to the supercomputing industry then?!?
yes, cost...
imagination is more important than knowledge --Albert Einstein-
No. Transmeta chips were chosen because they're low-power, not because they're VLIW. shItaniums are VLIW, but they're certainly not low-power! As it turns out, the code morphing is good, because each task can run as efficiently as possible given the data it's being run against.
aQazaQa
Supercomputers should include the cost of the building and cooling system needed to house them. Power, heat and size are big factors in this cost. NERSC, LANL, etc. have all built new computer centers just to deal with these new systems. These buildings are just as expensive as the supercomputers themselves. And then you add the recurring costs of the additional staff and maintenance of these centers.
The school computer store where I go to school is having to find ways to dump their single processor G5's because of the recent price drop in dual processor 1.8's.
:).
I think Apple might have to buy back some stock or give discounts to retailers. It's pretty vicious to drop in a system that completely obsoletes the retailers' stock so soon. Vicious to the retailers at least
I want a G5, but I don't have a place to put it. My wife has officially limited the number of computers to 5.
The Ro Factor - Jeep/Linux Weblog
Footprint now seems to be measured in "tennis courts" ;-)
Paul B.
I couldn't care less about the prissy Daphne. I want a sex story with Velma in it!
Sure, if one has the budget to build a better supercomputer, like the NCSA, then all one would be interested in is speed. That's great... unless you can't afford it. I am sure that there are many places, like community colleges, small-scale research labs, and other considerably lower-budget places that would greatly appreciate having a supercomputer brought down to a possibly acceptable level. Of course these places would love a top-notch computer, just as you lust at the Ferrari dealership... on your way to the Honda dealership to see about that cute little Civic. Same thing for the smaller institutions. Who cares that your computer is insanely powerful when you can't afford to turn it on? I think that's the aim of the whole efficient supercomputer idea. An OK-speed supercomputer that an institution can afford is better than a powerhouse that it can't.
I wish I could write clever and witty sigs.
There has been a lot of research into getting transistors to switch faster. Silicon-on-insulator systems are used to reduce junction capacitance (if there is less electrical charge to strip off every time you want to change state, you can do it faster and with less power). There is also the work being done with diamonds (see Wired: http://www.wired.com/wired/archive/11.09/diamond_pr.html), where private researchers and the U.S. Navy are working to produce better semiconducting materials that can withstand very high thermal loads (silicon on diamond) and building diamonds using chemical vapor deposition. There is also a lot of research on using biology to do computing, and on building system-on-chip computing: instead of a computer with many printed circuit boards (computers of the 80's and 90's), or one with sound, network, I/O and processor all integrated onto one motherboard (today), everything is on one chip (and chips snap together like Lego bricks). We need supercomputers to push this along.
Parent has lots of text, but no numbers.
Page 4 of the article says that Green Destiny does 11.6 Mflops/watt. This looks like the computing performance of a laptop computer. So have they really done something?
It would seem to make more sense to shut down and start up individual nodes for power saving. A supercomputer has relatively little down time and most of the time jobs come in large batches, you can shut down or heat up additional nodes as needed. It should be relatively easy to implement this kind of power saving inside single "computers" with many processors, and of course absolutely trivial to do it with a cluster. Especially if you use WoL NICs.
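For the Wake-on-LAN part, waking a sleeping node is just a matter of broadcasting a "magic packet": six 0xFF bytes followed by the target MAC address repeated 16 times. Here is a minimal sketch; the function names are made up, UDP port 9 is a common convention rather than a requirement, and the MAC in the usage note is a placeholder. Deciding when to power nodes up or down would be the job of a scheduler, which is not shown.

```python
import socket


def magic_packet(mac):
    """Build a Wake-on-LAN magic packet: 6 bytes of 0xFF, then the MAC 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be exactly 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the magic packet over UDP to wake the node with this MAC."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(magic_packet(mac), (broadcast, port))
```

Calling `wake("00:11:22:33:44:55")` would send the packet on the local network; the target NIC and BIOS have to have WoL enabled for anything to happen.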
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
"might there be other metrics that might be important to supercomputing, rather than relying solely on processing speed?"
Sure. How fast Quake can run on a Beowulf cluster of them.
To a lot of people, nothing else matters. Me, I like processing power for its own sake.
Intolerance for ambiguity is the mark of the authoritarian personality.
No-one seems to have noticed that SGI's R16k CPU uses a grand total of 12W power, and does 1.4 GFLOPs. That's 512 of them for your budget of 5kw. And that's single-system-image, and it'll all fit in 4-5 full racks.
[yes I work for SGI, but I'm not in marketing.. since marketing never DO anything useful I thought I'd better..]
read the freakin article! besides, I've been there and I've seen it. it most definitely is drawing common building power and is not cooled other than by the building air. sheesh, RTFA, they say so too
My desktop machine is faster than a Cray 1, and it'll never be labelled "Supercomputer" by any rational being.
Unless their architecture actually hits the Top Ten, I'm not going to be impressed that it's overcoming its handicap. Unless you're running a Special Olympics for computers and "everyone's a winner."
Green Destiny is yesterday's news. Blue Gene, in its baby stage, is slightly smaller and already 25-50 times faster, with only maybe two or three times the power consumption.
Well, the parent may have been wondering what I've wondered about Transmeta chips: If they're so efficient at running "morphed" x86 machine code, wouldn't they be even more efficient at running code compiled to its native ISA that didn't need translation?
Li Mu Bai wants his sword back
OTOH, a power-efficient, low-heat cluster with a cooling system that doesn't cost more than the machines that make the cluster is nothing to laugh at. Buy a transmeta-based cluster and you don't need a special contract with the electricity company! You don't need five-feet-thick walls for sound insulation! You don't need to pipe liquid nitrogen to your supercomputer!
Probably, if you have an optimizing compiler for exactly that chip. On the other hand, the code-morphing engine is already quite efficient, and by keeping the internal ISA "hidden" or "secret", each new model can have a completely new ISA, and they never have to worry about backwards compatibility.
By the way, every x86 CPU since the Pentium has had its own "internal" instruction set, where x86 instructions were translated into the internal ISA before being executed. None of these, except an early Cyrix Pentium clone, gave you access to their internal ISA, for much the same reasons as mentioned above. Transmeta's code-morphing just works on a larger scale than individual instructions.
Yes it's a statement of the obvious but I'm amongst what I suspect is the majority that loves the idea of having a multi 64 bit CPU as soon as I have spare cash.
I don't need that amount of power. The reality is my 700 Duron is still adequate though not totally adequate.
Nice Product
my password really is 'stinkypants'
If MIPS/Watt is the focus, why not use Intel's StrongARM, XScale or other ARM-based cores rather than Transmeta's stuff? After all, ARM was designed specifically with the MIPS/Watt ratio as the objective, starting a whole new architecture from scratch, whereas Transmeta has focused on efficient x86 "emulation".
--
Real computer scientists despise the idea of actual hardware.
Hardware has limitations, software doesn't.
It's a real shame that Turing machines are so poor at I/O.
- programmability and debugability (how easy/hard it is to write and/or debug programs on the supercomputer)
- portability (how easy is it to port programs to/from the supercomputer)
- applicability (how wide a range of problems can be solved more easily than with alternative systems)
- adherence to standards (e.g. compatibility with IEEE 754 binary floating point arithmetic standard)
- peak performance (what is the "guaranteed not to exceed" speed of the system)
- easily obtained performance (what do you get without having to try too hard)
- typical performance (what can you realistically expect to achieve)
Of course, we don't live in an ideal world. In the real world, the only metric that matters is how fast the system solves the problem or problems which are important to the person paying the bills. Supercomputers are, by definition, used to solve BIG problems. Nobody really cares how hard it is to implement or debug the program as long as the program runs REALLY fast once it is working (this is a bit of an exaggeration but not as much as you might think). The reason is quite simple - compared to the time that the typical production supercomputing application will consume, the time required to implement/port/debug the program is, for all practical purposes, irrelevant.
That said, writing high performance software is hard enough without having to screw around with message passing, and/or shared memory/semaphores. Folks have proposed (and companies have been born and died trying to implement/sell) tools/technologies which make writing high performance software easier. They, basically without exception, fail because the marketplace values performance so much more than ease-of-use that they are simply not prepared to trade off even a few percentage points of performance for any meaningful improvements in ease-of-use. This was true back in the 1980s and is still true today. A consequence of this is that today's state of the art (e.g. OpenMP and MPI) are little more than "standardized" versions of the tools which were in common use twenty years ago.
P.S. This is the "voice of experience" talking as I've been involved in the supercomputing industry on and off for the past twenty years. The opinions expressed above should probably be nuanced a bit but, hey, this is Slashdot and I'm already way WAY over my word limit!
P.P.S. Peak performance really is the "guaranteed not to exceed" speed of the system. Basically, when the vendor quotes you peak performance numbers, they are telling you that "no matter what you do, the computer will not go any faster than that". They are NOT telling you anything even remotely useful regarding how fast the computer will go when running your application. The gap between peak performance and typical performance can be and often is VERY wide.
Surely it just boils down to cost.
Power costs money and so does cooling. A more efficient supercomputer will be cheaper to run. If you can afford to supply your supercomputer with the cooling requirements of a nuclear reactor, fine.
People tend to focus on the initial outlay for a big system, forgetting that (like a car) it will have ongoing running costs.
"My cat's breath smells like cat food." - The Tao of Ralph Wiggum.
This is a metric that could be used. How much performance can you get per watt?
For supercomputing, I would imagine that something like SPECfp/watt or SPECrate/watt would be a decent metric.
If your limitation is a finite power budget, then you pick the CPUs with the highest perf/watt.
P4 3.2 EE = 18.44 SPECfp/Watt (80 watts)
Crusoe = ?? No performance numbers published, but I'll bet you it's lower
Building larger caches (which can be made low power) is a good way to achieve high power/performance efficiency. Crippling your performance with a VLIW seems like a bad choice. You can voltage scale down a Pentium-M to the power levels of a Crusoe and easily get 2-3x the performance.
Using a specialized instruction set would probably be more efficient but, sadly, Transmeta doesn't have enough influence to pull off something like that - they need the compatibility.
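To make the perf/watt idea above concrete, here is a rough Python sketch of picking a CPU under a fixed power budget. The P4 score is simply back-derived from the 18.44 SPECfp/Watt and 80 watt figures quoted above; the second part and its numbers are purely hypothetical (no Crusoe figures are published), and the 5.2 kW budget is borrowed from the story summary. It ignores cooling, interconnect and memory power, so treat it as an illustration rather than a sizing tool.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    name: str
    specfp: float  # benchmark score per CPU
    watts: float   # power draw per CPU

    @property
    def perf_per_watt(self) -> float:
        return self.specfp / self.watts


def best_under_budget(cpus, budget_watts):
    """Pick the CPU with the highest perf/watt and fill the budget with it.

    Returns (cpu, how many fit in the budget, aggregate score).
    """
    best = max(cpus, key=lambda c: c.perf_per_watt)
    count = int(budget_watts // best.watts)
    return best, count, count * best.specfp


# Score back-derived from the figures quoted in the comment: 18.44 * 80.
p4 = Cpu("P4 3.2 EE", 18.44 * 80, 80.0)
# Hypothetical low-power part; these numbers are made up for illustration.
low_power = Cpu("hypothetical low-power CPU", 500.0, 10.0)

cpu, count, total = best_under_budget([p4, low_power], budget_watts=5200)
print(f"{cpu.name}: {count} CPUs, aggregate score {total:.0f}")
```

Note that this treats aggregate benchmark score as if it scaled linearly with CPU count, which real applications rarely manage.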
The article offers up this question: might there be other metrics that might be important to supercomputing, rather than relying solely on processing speed?
Not to sound flippant, but...duh. Okay, I know I sound flippant. But seriously, why has it taken so long to realize that processing speed is not of the utmost importance? It's like saying one car is better than another because it has a top speed of 180MPH and the other 174MPH, ignoring that the "slower" car gets 30% better mileage. There's such a thing as total cost of ownership.
Now if only Slashdotters would realize that this applies to home systems and not just supercomputers.
Yeah, just wait until Li Mu Bai finds out, he is going to be pissed.
My beliefs do not require that you agree with them.
The article offers up this question: might there be other metrics that might be important to supercomputing, rather than relying solely on processing speed?
Embedded systems (and I'm not just talking about microcontrollers in your phones or microwaves - I'm talking about 100s of processors connected together in VME cages and the like - see Mercury, CSPI, Sky for examples) have always had the metrics FLOPS/W (FLOPS per Watt) and FLOPS/m^3 (FLOPS per volume). These were critical measurements because applications required certain performance and the machines themselves had to meet size/weight requirements depending on where they were being deployed. Jamming many processors in the space of a microwave oven to meet performance requirements (like 64+ processors), being less than a certain weight, having power consumption constraints, and requiring high performance without melting down because of the heat has always been an issue in certain sectors.
In the past, the embedded systems were typically special purpose - ran "special" OSs and were basically big set-top boxes that did only the one thing they were programmed to do. However, a few years back (like 5+), companies like CSPI started doing things like running Linux (or a realtime Linux variant) on their nodes instead of VxWorks and such, turning the box into a general purpose machine. I guess it just takes a while for some things to get enough attention to where someone would post it on /.
The sole unfortunate drawback of this highly efficient supercomputer is that it can only be properly wielded by Chow Yun Fat.
Anything you might ever need to say about anything has already been said better by Penny Arcade.
The Blue Gene Lite system by IBM is actually running even cooler.
The 440PPC processor being used is designed for embedded computing, so each node (2 processors and 4 FPUs) uses only 15 watts. That means that the 1024 processor system (512 nodes in normal configuration), now at #73 in top500, only uses about 7.7kW of power. At 240 processors or 120 nodes, power consumption would only be 1.8kW. This is far better than the Transmeta numbers.
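For anyone who wants to check the arithmetic, a quick Python sketch follows. The 15 watts per node and 2 processors per node come from the figures above, and the 5.2 kW for 240 processors is the Green Destiny number from the story summary. The function name is just made up for illustration, and this only compares power draw, not how much work each processor gets done.

```python
WATTS_PER_NODE = 15        # per-node figure quoted above
PROCESSORS_PER_NODE = 2    # 2 processors (and 4 FPUs) per node


def system_power_kw(processors):
    """Total power in kW for a system with the given processor count."""
    nodes = processors // PROCESSORS_PER_NODE
    return nodes * WATTS_PER_NODE / 1000


full_system_kw = system_power_kw(1024)   # 512 nodes
same_size_kw = system_power_kw(240)      # 120 nodes

# Green Destiny figure from the story summary: 240 processors under 5.2 kW.
green_destiny_w_per_cpu = 5200 / 240
blue_gene_w_per_cpu = WATTS_PER_NODE / PROCESSORS_PER_NODE

print(f"1024 processors: {full_system_kw:.2f} kW")
print(f" 240 processors: {same_size_kw:.2f} kW")
print(f"watts per processor: {blue_gene_w_per_cpu:.1f} vs {green_destiny_w_per_cpu:.1f}")
```

So the "about 7.7kW" and "1.8kW" figures both follow directly from 15 watts per node.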
Umm..there was *never* a time when the primary consideration that John Q. Public had wrt the purchase of a car was cubic inches and horsepower. That was only true for a small subset of hot rodders and street racers, and is still primarily true for that niche.
Style, passenger comfort, features, etc. have always been part of the equation, even going back to the earliest days of the automobile. Evidence: Remember that Ford's Model T was available in "any color you want, so long as it's black"? Well, holding on to that sort of thinking opened the door for other companies to eat into Ford's dominant market share, and eventually let the GM companies supplant Ford as the #1 producer of cars. And Ford has yet to reclaim the top spot.
What has changed is that the consumer added fuel economy and safety to the list of things they consider when purchasing a car. Fuel economy only became a consideration due to outside market forces (the price of petroleum in the '70s), and safety was due to consumer advocacy (Ralph Nader's "Unsafe at Any Speed" testimony), as well as some pressure from the insurance agencies.
Computer manufacturers are operating under similar market forces. Power consumption, die size and noisy fans fall far down John Q. Public's list of considerations when purchasing a computer. I'd say at the top of the list, in no particular order, are cost of acquisition, cost of transition (compatibility), and perceived speed. The only time power consumption/efficiency becomes a factor is when one is purchasing laptops, and that metric is only reported as "battery life", which doesn't really capture power consumption/efficiency.
Even large companies, purchasing electricity to operate hundreds/thousands/tens of thousands of computers 24/7, frequently fail to consider whether their computers are "green". No. It's not until the price of electricity becomes a significant line item on someone's bottom line that anyone really considers efficiency as an important feature to track on a decision making matrix.
---anactofgod---
"Equal opportunity swindling - *that* is the true test of a sustainable democracy."
That's not exactly trivial. Does anyone know anything more about this? What are the trade-offs?
And, will this code be made available to the general public? I imagine there are Transmeta owners that would also like to double their floating point performance.
Jon Acheson
All opinions expressed herein are my own, and not those of my employers, who are appalled.
Your basic premise - ``time was the only thing many people would look at is cubic inches or horsepower'' - is flawed. This refers only to a subset of people who were interested in muscle cars, hot rods, or the term du jour. It's as true today of such people as it was then.
Muscle cars are the better metaphor for supercomputers; the term "car" just compares to "computer", or at best "individual's computer".
A hot rod or muscle car is rated in terms of horsepower and torque, or acceleration and speed. In other words, sheer performance. If it's more of a racing vehicle, throw in handling. Everything else falls by the wayside - comfort, fuel efficiency, etc. Well, style matters to lots of folks.
So if your metaphor has much validity, then the only things that matter with supercomputers are performance, and in some cases style. And that about covers where things are today, doesn't it?
I'm not saying that energy efficiency shouldn't matter in supercomputers. Just that you didn't quite make your case. 8^)
Little did we know that that blade was actually a blade server. Yet another scientific first for China :)
QED