Has Supercomputing Hit a Brick Wall?
anzha writes "Horst Simon, Deputy Director of Lawrence Berkeley National Laboratory, has stood up at conferences of late and said the unthinkable: supercomputing is hitting a wall, and no one will build an exaFLOPS HPC system by 2020. Such a system is defined as one that passes LINPACK with a sustained performance of one exaFLOPS or better. He's even placed money on it. You can read the original presentation here."
You can't really make factor-of-10 improvements indefinitely. Eventually the numbers overwhelm you and you hit roadblocks. The only real solution will ultimately be new computing technology, such as quantum computers.
much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
"Japan to develop new exaflop computer by 2020" ... why not? And if it's even a few microseconds into 2021 I suppose that supercomputing has failed, will pack up, and go home.
Don't think of it as a flame---it's more like an argument that does 3d6 fire damage
Does that mean we get to play super breakout with them?
Just saying sounds like a dare....
He wouldn't be the first to declare the death of Moore's law. And I'd be willing to bet he won't be the first to be wrong.
The first time I've heard that metaphor for competing technologies. Nice.
If it did, it would rapidly calculate a way over, under, around or through it.
Let's make like a bird... and get the flock outta here.
It's confident, dipshit.
Supercomputing has hit a brick wall, and yet for some reason we keep blaming the construction company that built the wall, not the reckless driver who hit it.
...but I can pretty much guess where this is going. If you look at the massive parallelization improvements we've witnessed among supercomputers over the past couple of decades, you could predict that at some point most of the low-hanging fruit would be picked, at which point the underlying latency of the interconnects would start to become a limiting factor. Couple that with the fact that there's been a complete lack of significant performance improvement in the desktop/server CPU space in, say, the past 5 years, and you could predict that it wouldn't be long before we'd see a leveling off of the supercomputer performance curve.
He doesn't say it's not possible, rather we can't get there by just extending current technology. So by extension, 2020 is too soon to expect exaflops. He also presents arguments why exaflops is important and work to get there should continue.
I eat only the real part of complex carbohydrates.
... I guess you may be excused to think it hit a brick wall. Alternative technology has fortunately already matured, and is commercially available.
Clarke's Three Laws are three "laws" of prediction formulated by the British writer Arthur C. Clarke. They are:
1. When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
2. The only way of discovering the limits of the possible is to venture a little way past them into the impossible.
3. Any sufficiently advanced technology is indistinguishable from magic.
Prove anything by multiplying Huge Number times Tiny Number
A 30 MB Google Docs document. Oh joy. It even appears to break my iPad. Yes, it's worth reading, but would it kill you to write an interesting summary? Even a pithy one, such as "by 2020, the energy costs associated with moving bits around will exceed the costs of actually processing them"?
I wish there was more discussion on the interconnect and routing challenge of these systems. I used to work on an InfiniBand SubnetManager. Exascale will require more complex topologies and more complex routing. Does anyone think today's systems are up to the task?
-- soldack
Hybrid Memory Cube, memristors, and other technologies will make it work. I think Horst will "loose" his bet.
And still a little fuzzy headed, but the first thing I thought of was arranging the racks for shortest maximum path: instead of one big football-field-sized room, stacking the datacenter into a cube shape... Then I thought, "That's probably why Borg ships are Cubes."
Cronglebaun
I believe in karma, which is why, when I do something bad to people, I assume they deserve it.
Lie you ignored the article? And nothing in that link shows they are using it to build a supercomputer.
The Kruger Dunning explains most post on
confidante
Latency isn't so much of an issue, though: the #1 systems of the last ~3 years all had torus networks (all Blue Genes, all Crays, K computer, too). These networks only perform well for nearest-neighbor communication -- which is fine since most codes running on these machines are simulation codes and they only need this type of communication. If you scale up the system, you'll typically also scale the size of the simulation instance (this is known as "weak scaling").
This means that your program can still spend the same time waiting for the network as it could on a smaller machine. The cables do not need to become shorter.
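The weak-scaling point can be illustrated with a rough sketch. It assumes a hypothetical 3D simulation in which every node owns a fixed cube of cells and sends the six one-cell-thick faces of that cube to its torus neighbors each step. The function names and the cube size of 256 are made up for illustration and are not taken from any real code.

```python
def halo_cells_per_node(cells_per_side: int) -> int:
    """Cells one node sends per step: the six faces of its local cube."""
    return 6 * cells_per_side ** 2


def weak_scaling_table(cells_per_side: int, node_counts):
    """Under weak scaling the local cube stays fixed while the node count grows."""
    rows = []
    for nodes in node_counts:
        total_cells = nodes * cells_per_side ** 3     # global problem grows with the machine
        halo = halo_cells_per_node(cells_per_side)    # per-node traffic does not
        rows.append((nodes, total_cells, halo))
    return rows


if __name__ == "__main__":
    for nodes, total, halo in weak_scaling_table(256, [20_000, 200_000]):
        print(f"{nodes:>7} nodes  {total:.3e} cells total  {halo} halo cells per node per step")
```

In this simplified model the per-node communication volume is the same at 20k and at 200k nodes; only the total problem size changes.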
Computer simulation made easy -- LibGeoDecomp
WHAT are we computing that we need to keep pushing this envelope along?
That's part of the troll, dipshit. And it's confidant anyway.
Have problems parsing your question "Lie you ignored the article?"
Was that supposed to be "Like"?
My point is that conventional supercomputing is indeed facing a crisis, but that non-CMOS-based technologies may save the day.
He sells the exaFLOP dream; but it's x1,000 faster than today. At Moore's faux Law, that takes 15 years - so, due in 2028, not 2020.
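The 15-year figure can be checked with a short sketch, assuming performance doubles every 18 months (the popular reading of Moore's law) and taking 2013 as the starting year, which is inferred from the "due in 2028" in the comment rather than stated in it.

```python
import math


def years_to_speedup(factor: float, doubling_months: float = 18.0) -> float:
    """Years needed for a given speedup if performance doubles every `doubling_months`."""
    doublings = math.log2(factor)
    return doublings * doubling_months / 12.0


if __name__ == "__main__":
    start_year = 2013  # assumption: inferred from the comment, not stated in it
    years = years_to_speedup(1000)
    print(f"x1,000 takes about {years:.1f} years, i.e. around {start_year + years:.0f}")
```

A factor of 1,000 is just under ten doublings, so the result lands just under 15 years.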
Back in the early 80's I got the opportunity to hear Grace Hopper speak. One of the stories she used to like to tell at her talks was about the time that she was having trouble visualizing a nanosecond. Eventually she sent a memo to her engineers which said, "Please send up one nanosecond." She waited, curious as to how they would respond. After a couple of days a response came back in the form of a metal rod 11-3/4 inches in length with the note attached, "One Nanosecond", and no other explanation. After puzzling over the metal rod she called down to the engineering department and asked, "I give up, what is it?" "That's the distance light travels in a nanosecond," was the response. Later, she sent another memo to the engineers with the request, "Please send up one picosecond." The engineers immediately responded with a memo instructing her to "put the nanosecond in a pepper grinder and you can make picoseconds all over your desk."
Grace Hopper's humorous anecdote underlines the serious problems faced by researchers when they push the boundaries. In her case, it was a real concern over how far a bit can travel at the speed of light. I have no idea if that has any bearing on the exascale problem, but it might illustrate the kinds of problems they might be running into.
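For what it's worth, the rod length is easy to check, and the same arithmetic gives a feel for signal delays across a machine room. This is only a sketch: it uses the speed of light in vacuum (signals in real cables are slower), and the 50 m distance is a made-up example, not a figure from the presentation.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre
METRES_PER_INCH = 0.0254              # exact


def light_travel_inches(seconds: float) -> float:
    """Distance light covers in vacuum in the given time, in inches."""
    return SPEED_OF_LIGHT_M_PER_S * seconds / METRES_PER_INCH


def one_way_latency_ns(distance_m: float) -> float:
    """Lower bound on one-way signal time over a distance, in nanoseconds."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S * 1e9


if __name__ == "__main__":
    print(f"1 ns  -> {light_travel_inches(1e-9):.2f} inches")
    print(f"50 m  -> at least {one_way_latency_ns(50):.0f} ns one way")
```

One nanosecond comes out at about 11.8 inches, close to the 11-3/4 inches in the story.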
Proverbs 21:19
I'm an HPC professional, and do not see much value in these "hero" machines. Yes, you can go on all you want about the march of progress and tier-1 and grand challenges, but you're just reiterating an unquestioned manifest destiny-based view of history. Why do we need an Exaflop machine? is it because some particular set of applications need it? where is the threshold for those applications where the compute facility will be fast enough to achieve some breakthrough?
it's hard to find areas that are primarily limited by compute facilities. for instance, genetics/proteomics/metabolomics/whatever are *not* compute-limited, especially at the high end. they're laboratory-limited, the same way weather simulations are good and getting better, but not past the quality of their input data.
we need more compute in general, but not necessarily in one machine. a single exaflop machine will cost much more than a thousand petaflop machines. letting a thousand flowers bloom is much prettier than one excruciatingly beautiful flower...
and no, hero machines do not provide an efficient way to improve the tech of lesser or later machines. they have to be justified by their own need.
Consumer computing kept changing in interesting ways after clock rates stopped shooting upwards. If this guy is right about not hitting exaFLOPS, there might be an analogous situation for high-end computing.
Admittedly, there may be particular fields -- weather forecasting? fluid dynamics? -- that are going to have to pull a rabbit out of a hat to make progress without scaled-up supercomputers. But, in computing in general, we've found plenty of interesting things to do besides faster conventional supercomputing, like using huge clusters on relatively slow interconnects to do new, cool things with data. There may also still be big improvements to make on dimensions like cost (imagine we don't get exaflops by 2020, but every largish university can have a petaflops deployment) and ease of programming. Point is, interestingness is a much more textured, varied thing than just a FLOPS score.
Even if you ignore all the controversy over D-Wave's system and its nature, and take it all at face value, it is still only applicable to a narrow class of problems. CMOS or not, it amounts to something similar in principle to an ASIC. It is no surprise that a custom-built chip can solve a specific class of problems orders of magnitude faster than a general-purpose processor. This was slightly more popular for a while in the 80s, when a few custom computers were built that were specifically designed for doing things like orbital calculations. And it pops up every so often, like custom chips for playing chess, and now Bitcoin mining chips. That is great for a small computer, but when your price gets into the millions or billions of dollars, the people bankrolling it will probably want to build a system that can be used for a wider class of problems even if it means running slower.
I'm pretty sure at one point, someone stood up in a meeting and said "No one will ever make a 1MB memory chip" or "No one will ever achieve a 64 bit processor", so how about sit down and just wait.
Do I?
Yes, in a way. We'll probably never be able to improve the hardware far enough that we can simply rely on it to fail gracefully (i.e. announce its impending death a few seconds in advance). The reason is that ATM our systems contain approx. 20k nodes. Exascale systems will likely push this to 200k. Even if you assume a node will live 10 years on average, you can estimate that every ~26 minutes one node of the system will fail.
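A back-of-the-envelope sketch of this estimate, assuming node failures are independent and spread evenly over each node's lifetime (a simplification; real failure rates are not constant). The function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def minutes_between_failures(nodes: int, node_lifetime_years: float) -> float:
    """Mean time between node failures for the whole system, in minutes."""
    return node_lifetime_years * MINUTES_PER_YEAR / nodes


if __name__ == "__main__":
    for nodes in (20_000, 100_000, 200_000):
        gap = minutes_between_failures(nodes, 10)
        print(f"{nodes:>7} nodes -> one failure every {gap:.0f} minutes")
```

Under these assumptions, 200k nodes with a 10-year average lifetime give a failure roughly every 26 minutes; an interval of about 53 minutes corresponds to 100k nodes, and today's 20k nodes to a bit over four hours.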
My money is on the software: we'll need some kind of redundancy (e.g. a simulation code would need to store its mesh so that each part is held by multiple nodes, a bit like the redundancy we see in BitTorrent and other P2P networks). But that will require applications to be reengineered, and that will be really, really expensive. Considering how the industry is struggling with the (comparatively easy) adoption of GPUs, I don't see this happening anytime soon. Interesting times ahead!
Computer simulation made easy -- LibGeoDecomp
Computers have been increasing in speed/power/capability for the past 70 years. In the mid-1940s, mechanical devices (gears/levers/streaming paper tape) gave way to thermionic devices (vacuum tubes: diodes, triodes, pentodes, etc.). With the creation of the transfer resistor (transistor) in 1947, the somewhat reliable thermionic devices that relied on physics were replaced with solid-state devices that relied on chemistry. The Apollo space program and the US Air Force (and the Minuteman missile) pressed the need for complete interchangeable circuits on an easily replaceable chip. Integrated circuits were born. In 1971 the Intel Corporation released the 4004 chipset for the Busicom 141-PF calculator. Based on different inputs, the 4004 chip could produce different outputs, making it 'programmable'. Gordon Moore's law about semiconductors doubling in capability every 18 months has remained true since 1970. If Intel can cut the power to its 'big iron' CPUs (the 4/6/8 core chips), then just increasing the number of processors in supercomputers from 10,000 to 100,000 will give you a 10x increase in speed while using the same or less power. Continuing from 4/6/8 core chips up to 64 core chips is the next step, giving a further 8x increase, and that's all immediately available without doing anything revolutionary. An 80x increase at the same size/power as what we have now puts us into exaflops range. In 10 years, more improvements will likely come along, and hey "1 Exaflop ought to be enough for anyone(tm)".
Think you miss the bigger picture here, in that they are pioneering non-silicon-based LSI circuits that operate adiabatically (no heat production). This technology could very well be extended to include conventional logic in parallel with their quantum circuitry.
Sure, you just won a million dollars! I will just need your full name, SSN, address, date of birth, copy of your birth certificate and driver's license, bank account number and your PIN for processing before I send your check!
Tomorrow is another day...
Woosh, dipshit. The double twist.
Sweet! It's 123-45-6....wait a second!! You did not ask for my mother's maiden name, this has to be a scam.
I don't see anything about this in the PDF, so I'll ask the Hive Mind here:
How does this affect distributed computing efforts such as Folding@Home and the BOINC project?
These have very little node-to-server and zero node-to-node communication. With F@H already on the petaFLOP scale I wouldn't think it all that unlikely that it would reach exaFLOP level in less than a decade if interest keeps up.
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
Memory latency. Beowulf clusters are good for things that are highly parallel *and* have a high degree of memory locality, i.e. you rarely need to make memory calls between boxes.
True supercomputers use high-speed interconnects between systems for this reason, usually using something like Infiniband or a weird proprietary system, and usually with some network topology with numerous inter-system links. This gives them much lower latency when one system uses data in memory in another system.
True supercomputers can solve non-highly-parallel problems.
The wall got smacked in the 1980's.
Where does CMOS even enter the question? All modern fast processors use dynamic NMOS devices in the signal path; PMOS is only used to recharge nodes to start a new cycle. For a particular set of process dimensions, CMOS is about 1/4 the speed of dynamic NMOS.
Contribute to civilization: ari.aynrand.org/donate
What you really need is more like error correcting codes. A snapshot is a diminishing-returns strategy (see the discussion on the Beowulf mailing list a few months ago). As you point out, at some point you spend more time snapshotting than computing.
But with error correcting codes, you can tolerate some fraction of the computations failing. ECC memory is at the finest grain, but the concept does scale somewhat. Triple Modular Redundancy (TMR) is a crude code (also known as a rate 1/3 code) that works at higher levels but can have issues with the voter, because there's only one.
And it has some granularity issues of its own: you can't just buy 3 supercomputers with a failure rate of 1 per day, run your 5 day computation, vote on the outputs and hope for success.
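To make the voter and the granularity issue concrete, here is a minimal sketch. The voter is a plain two-out-of-three majority; the success estimate assumes each replica fails independently as a Poisson process at the stated rate, which is an assumption added here for illustration, not something from the mailing-list discussion.

```python
import math
from collections import Counter


def tmr_vote(a, b, c):
    """Return the value at least two of three replicas agree on, or raise if none do."""
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority: all three replicas disagree")
    return value


def p_majority_ok(failures_per_day: float, days: float) -> float:
    """Chance that at least two of three replicas finish a run of `days` without failing."""
    p = math.exp(-failures_per_day * days)  # one replica survives the whole run
    return 3 * p ** 2 * (1 - p) + p ** 3


if __name__ == "__main__":
    print(tmr_vote(42, 42, 7))
    print(f"vote once after 5 days:     {p_majority_ok(1, 5):.6f}")
    print(f"vote on a 0.005-day slice:  {p_majority_ok(1, 0.005):.6f}")
```

Voting once at the end of the 5-day run succeeds well under one percent of the time in this model, while voting on short slices succeeds almost always, which is why the voting has to happen at a much finer grain than the whole job.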
Fair enough, I used CMOS as sloppy shorthand for all current silicon based field effect transistor integrated circuit technology. (See how much longer that is?)
Whenever someone on /. likens Google's network to a supercomputer God kills a Pokemon. But honestly: the reason why Google can cope with these massive outages is that they're doing totally different computations from supercomputers. Google's compute jobs are loosely coupled. They do data mining. That is fundamentally different from supercomputing where all compute jobs are tightly coupled. To give you a car analogy:
Not a good analogy, but I hope to correct the picture of Google being lightyears ahead of the supercomputing industry: they're simply working on very different problems. I wonder what makes you think that Google/Amazon/Facebook were 10 years ahead of Cray and academia? If they were, they'd simply take over Cray's market. And since Cray competes with IBM and Fujitsu, they'd probably try and claim parts of their market shares, too. This is not happening.
Computer simulation made easy -- LibGeoDecomp
...and this is the problem: the time we need to get all the data to disk is closing in on the MTBF. With the current technology an exascale system would suffer node failures even while taking a snapshot.
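One common way to reason about this trade-off is Young's first-order approximation for the checkpoint interval, sqrt(2 * C * MTBF), where C is the time to write one snapshot. The sketch below uses that textbook approximation, not anything from the presentation, and the 5-minute snapshot cost is a made-up figure. The approximation is only meant for C much smaller than the MTBF, so the second case mainly shows the trend.

```python
import math


def young_interval(checkpoint_cost: float, mtbf: float) -> float:
    """Young's first-order optimum for compute time between checkpoints."""
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


def overhead_fraction(checkpoint_cost: float, mtbf: float) -> float:
    """Rough fraction of time lost to writing checkpoints plus redoing lost work."""
    tau = young_interval(checkpoint_cost, mtbf)
    return checkpoint_cost / tau + tau / (2.0 * mtbf)


if __name__ == "__main__":
    cost = 5.0  # minutes per snapshot, hypothetical
    for mtbf in (262.8, 26.28):  # system MTBF in minutes
        print(f"MTBF {mtbf:6.1f} min -> checkpoint every {young_interval(cost, mtbf):.0f} min, "
              f"overhead about {overhead_fraction(cost, mtbf):.0%}")
```

As the MTBF shrinks toward the snapshot time, the estimated overhead climbs from a modest fraction to more than half of the machine's time.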
Computer simulation made easy -- LibGeoDecomp
You don't seem to understand the effect of "FTL". It doesn't matter if you warped, teleported or used some wormhole - for an external observer you have exceeded the speed of light, and that isn't any different from violating causality or time travel.
It wouldn't affect causality if you traveled completely out of your initial light cone. Ditto for time travel - if it is possible to travel into the past, then it is impossible to cause change in the present, i.e. what happens in the "new past" stays there, unable to propagate back to the "old present" faster than it itself propagates into the future.
I guess your major misunderstanding is that the applications running on supercomputers could somehow be done in the (loosely coupled) way that Google does its data mining. Since you're a professional, too, please refer to this Wiki article on stencil codes, one of the major classes of codes that run on supercomputers. If you find a way (or at least a pseudo-code formulation) to transform these applications into loosely coupled codes, then I would not be the only one to be curious to hear about it. You'd transform the whole industry. In fact this is not possible, though.
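A toy example of why stencil codes are tightly coupled: in the sketch below, a 1D three-point averaging stencil (chosen for illustration, not the code of any real application) is run once on a whole grid and once on two halves that never exchange boundary values. The results differ at the cut, because each cell needs its neighbors' values at every step.

```python
def jacobi_step(grid):
    """One sweep of a 1D three-point averaging stencil; the two end cells are held fixed."""
    new = list(grid)
    for i in range(1, len(grid) - 1):
        new[i] = (grid[i - 1] + grid[i] + grid[i + 1]) / 3.0
    return new


if __name__ == "__main__":
    grid = [0.0, 0.0, 0.0, 0.0, 9.0, 0.0, 0.0, 0.0]

    coupled = jacobi_step(grid)
    # Pretend the two halves live on different nodes that never talk to each other.
    uncoupled = jacobi_step(grid[:4]) + jacobi_step(grid[4:])

    print("with neighbor exchange:   ", coupled)
    print("without neighbor exchange:", uncoupled)
```

Real codes avoid the wrong answer by exchanging the boundary ("halo") cells between nodes on every step, which is exactly the communication a loosely coupled system does not provide cheaply.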
But I agree that software will need to help with reliability and will have to actively manage node eviction/addition.
BTW: comparing Google and Cray is really like comparing apples and oranges: they're in different markets. The market for supercomputers is extremely small, the market for (online) advertising is gigantic.
Computer simulation made easy -- LibGeoDecomp
how long would it take the bestest supercomputer to mine all bitcoins by itself? ... is it possible i am the first person in the world to think this? I hope not, since that would not only mean i'm seriously underpaid, but it would mostly mean a lot of overpaid people are seriously overpaid. let's just pretend i didn't post this before someone fed-side takes it seriously
i always say things i shouldn't, that's why neither side available in any layer of existence ever likes me. So that's how the government will eventually take control of bitcoin? I'm raving mad, aren't i
Free speech was meant to be free for all... how can anyone grow up in a nanny state ?