Warning At SC13 That Supercomputing Will Plateau Without a Disruptive Technology
dcblogs writes "At this year's supercomputing conference, SC13, there is worry that supercomputing faces a performance plateau unless a disruptive processing tech emerges. 'We have reached the end of the technological era' of CMOS, said William Gropp, chairman of the SC13 conference and a computer science professor at the University of Illinois at Urbana-Champaign. Gropp likened the supercomputer development terrain today to the advent of CMOS, the foundation of today's standard semiconductor technology. The arrival of CMOS was disruptive, but it fostered an expansive age of computing. The problem is 'we don't have a technology that is ready to be adopted as a replacement for CMOS,' said Gropp. 'We don't have anything at the level of maturity that allows you to bet your company on.' Peter Beckman, a top computer scientist at the Department of Energy's Argonne National Laboratory, and head of an international exascale software effort, said large supercomputer system prices have topped off at about $100 million 'so performance gains are not going to come from getting more expensive machines, because these are already incredibly expensive and powerful. So unless the technology really has some breakthroughs, we are imagining a slowing down.'"
Carbon nanotube-based processors are showing promise, though (Stanford project page; the group is at SC13 giving a talk about their MIPS CNT processor).
MIPS CNT... how do you pronounce that?
We've had silicon-germanium CPUs that can scale to 1000+ GHz for years. Graphene is another interesting possibility.
The question is: at what price can you make that power affordable?
For 99% of people, computers are good enough. For the other 1% they never will be.
My intuition tells me that disruptive technologies are precisely that because people don't anticipate them coming along, nor do they anticipate the changes that will follow their introduction. Not that people can't see disruptive tech ramping up, but often they don't.
My God can beat up your God. Just kidding...don't take offense. I know there's no God.
There are actually a half-decent number of 'supercomputers' -depending on how you define that term- in the private sector. From 'simple' ones that do rendering for animation companies to ones that model airflow for vehicles to ones that crunch financial numbers to.. well, lots of things, really. Are they as large as the biggest national facilities? Of course not - that's where the next generation of business-focused systems gets designed and tested, and where models and methods get developed and tested.
It is indeed the case that far simpler systems ran early nuclear weapon design, yes, but that's like saying far simpler desktops had 'car racing games' -- when, in reality, the quality of those applications has changed incredibly. Try playing an old racing game on a C64 vs. a new one now and you'd probably not get that much out of the old one. Try doing useful, region-specific climate models with an old system and you're not going to get much out of it. Put a newer model with much higher resolution, better subgrid models and physics options, and the ability to accurately and quickly do ensemble runs for a sensitivity analysis and, well, you're in much better territory scientifically.
So, in answer to "So what?", I say: "Without improvements in our tools (supercomputers), our progress in multiple scientific -and business- endeavors slows down. That's a pretty big thing."
Of course these people are talking about supercomputers and the relevance to supercomputers, but you have to be pretty daft to not see the implications for everything else. In the last few years almost all the improvements have been in power states and frequency/voltage scaling; if you're doing something at 100% CPU load (and it isn't a corner case that benefits from a new instruction), the power efficiency has been almost unchanged. Top-of-the-line graphics cards have gone constantly upwards and are pushing 250-300W, even Intel's got Xeons pushing 150W, not to mention AMD's 220W beast, though that's a special oddity. The point is that we need more power to do more, and for hardware running 24x7 that's a non-trivial part of the cost that's not going down.
We know CMOS scaling is coming to an end, maybe not at 14nm or 10nm, but by the end of this decade we'll be approaching the size of silicon atoms and lattices. There's no way we can sustain the current rate of scaling in the 2020s. And it wouldn't be the end of the world; computers would go roughly the same speed they did ten or twenty years earlier, the way cars and jet planes do now. Your phone would never become as fast as your computer, which would never become as fast as a supercomputer again. We could get smarter at using that power of course, but fundamentally hard problems that require a lot of processing power would go nowhere, and it won't be terahertz processors, terabytes of RAM and petabytes of storage for the average man. It was a good run while it lasted.
Live today, because you never know what tomorrow brings
The problem is that there are many interesting problems which don't parallelize *well*. I emphasize *well* because many of these problems do parallelize, it's just that the scaling falls off by an amount that matters the more thousands of processors you add. For these sorts of problems (of which there are many important ones), you can take Latest_Processor_X and use it efficiently in a cluster of, say, 1,000 nodes, but probably not 100,000. At some point the latency and communication and whatnot just take over the equation. Maybe for a given problem of this sort you can solve it in 10 days on 10,000 nodes, but the runtime only drops to 8 days on 100,000 nodes. It just doesn't make fiscal sense to scale beyond a certain limit in these cases. For these sorts of problems, single-processor speed still matters, because they can't be infinitely scaled by throwing more processors at the problem, but they can be infinitely scaled (well, within information-theoretic bounds dealing with entropy and heat-density) by faster single CPUs (which are still clustered to the degree it makes sense).
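To put rough numbers on that, here is a minimal sketch that treats the 10-days-vs-8-days example as if it followed Amdahl's law. That is an assumption on my part: the comment talks about latency and communication overhead, which Amdahl's law only approximates as a fixed non-parallel chunk of work. The function names are made up for illustration.

```python
def amdahl_runtime(t1, serial_fraction, nodes):
    """Runtime on `nodes` nodes under Amdahl's law, given single-node runtime t1."""
    return t1 * (serial_fraction + (1.0 - serial_fraction) / nodes)


def fit_two_points(n_a, t_a, n_b, t_b):
    """Solve for (t1, serial_fraction) from two (nodes, runtime) measurements.

    Model: T(N) = serial + parallel / N,
    where serial = t1 * s and parallel = t1 * (1 - s).
    """
    parallel = (t_a - t_b) / (1.0 / n_a - 1.0 / n_b)
    serial = t_a - parallel / n_a
    t1 = serial + parallel
    return t1, serial / t1


# The hypothetical figures from the comment: 10 days on 10,000 nodes,
# 8 days on 100,000 nodes.
t1, s = fit_two_points(10_000, 10.0, 100_000, 8.0)
print(f"implied serial fraction: {s:.5%}")

for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9} nodes: {amdahl_runtime(t1, s, n):6.2f} days")
```

Under this toy model the implied non-parallel part is only about 0.035% of the work, yet it puts a floor of roughly 7.8 days on the run no matter how many nodes you add. Ten times the hardware past 100,000 nodes buys you a few hours.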
CMOS basically ran out of real steam on this front several years ago. It's just been taking a while for everyone to soak up the "easy" optimizations that were lying around elsewhere to keep making gains. Now we're really starting to feel the brick wall...
BZZZT, WRONG.
This is where you can stop reading, folks.