Supercomputer On-a-Chip Prototype Unveiled
An anonymous reader writes "Researchers at University of Maryland have developed a prototype of what may be the next generation of personal computers. The new technology is based on parallel processing on a single chip and is 'capable of computing speeds up to 100 times faster than current desktops.' The prototype 'uses rich algorithmic theory to address the practical problem of building an easy-to-program multicore computer.' Readers can win $500 in cash and write their names in the history of computer science by naming the new technology."
What's wrong with Supercomputer On-a-Chip (c) ?
~
I call the "supercomputer on a chip" the "Cell microprocessor". Of course, next year, it won't be so super. But there will be a new one that's really super.
--
make install -not war
"Readers can win $500 in cash and write their names in the history of computer science by naming the new technology."
Is "Clippy" taken?
Vote monkeys into Congress. They are cheaper and more trustworthy.
We have microcomputers and supercomputers and nothing in between? Seems to be a bit of hyperbole involved here.
"National Security is the chief cause of national insecurity." - Celine's First Law
'Space Heater'
I RTFA... It seems to handwave so much about parallel computing, that it seems they haven't discovered anything. All i see is "clock frequency can't increase, so we're going parallel'.... Surely, this can't be the extent of their research. The article claims its 'easy to program', but there are zero specifics about why that would be the case. Can anyone tell me what they've done here (if anything)?
Vaporac. Vaporlon. Vaporium. Whatever...
Strange things are afoot at the Circle-K.
Bob
Oh, for god's sake. I don't understand why this is getting so much press. It was stupid when it went up on Digg, and it's stupid that it's showing up here. This isn't substantially different from any of the other parallel architecture and programming work that's been going on for the last two decades. Their benchmarks are against embarrassingly parallelizable algorithms like matrix multiplies and randomized quicksort, things that any half-intelligent lemur (with a math and cs class or two) could get to run quickly. The hard part is speeding up your average desktop application which, I guarantee you, is not spending the majority of its time doing matrix multiplies.
On top of that, their "parallel extension of von Neumann" amounts to adding primitives to start and stop threads into the language. Again, any half-intelligent lemur (with a slightly different skill set from the first) could have done that. And I think a few actually have (at the risk of comparing language researchers to lemurs). It doesn't solve the underlying problem.
Oh, and did we mention no floating point and the lack of any memory bandwidth to get data into and out of this thing?
This is over-hyped research and shameless self-promotion, and for some weird reason the press seems to be buying it. Stop it.
You know, autovectorization looks good on paper. But for most tasks, it really doesn't net you any benefit unless you can separate all your work into non-overlapping chunks. You can't have any interdependancies on your working set (or risk expensive, non-scalable locking), and if you're all pulling from a single data source to split up the analysis work you'll spend a lot of time in contention for the pipe to that resource.
For example, it wouldn't make searching a database (scratch that, searching any data set) any faster unless the index was already pre-split among the processing units.
In this architecture the processing units have the same bus to RAM and disk on the front and back ends and have to deal with contention.
Your system is only as fast as the slowest serial part. Typically this is storage media, a network connection, or a memory crossbar. Processors really are fast enough for the non-embarrasingly parallel stuff. They are at the right ratio with respect to the other slower busses to do most general purpose work.
If you want to do more than that then its other things; storage media, memory, I/O busses -- that need to be multiplied in density and number. Only then can we see higher throughput.
Autovectorization is only good for things we already have offloading for anyway (TCP encryption, graphics, sound)... and for those general purpose cases like in Game AI where you might want a linear algebra boost NVidia has beaten these guys to the punch with the GP stream processing in the newest chips and the very flexible Cg language/environment.
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
Second paragraph of the rules:
THE FOLLOWING CONTEST IS INTENDED FOR PLAY IN THE UNITED STATES AND SHALL ONLY BE CONSTRUED AND EVALUATED ACCORDING TO UNITED STATES LAW. DO NOT ENTER THIS CONTEST IF YOU ARE NOT LOCATED IN THE UNITED STATES.
Even though there is a country field in the form. WTF?
They don't mention that on the form page, either. It peeves me just a little bit that they would do that, I mean, how many people actually read these conditions things, anyway? Can't say I'm surprised, though.
While I agree there are certain leaps to be made before this can be a mass market item, I disagree fundamentally with point 1 that you make. You could have made the exact argument about the old DOS Lotus office suite way back, 15 years ago. Those things still word process, and a 386 33MHz is certainly no slouch - I never had to sat around waiting for the software to respond to me or finish some ridiculously long task.
I'm sure you'd agree that these newfangled Pentiums and Core Duos are quite useful, even for the end user.
Think about features like predictive and contextual actions. Desktop search? Search-as-you-type? There are many ways to improve the usability of computers thyat require more and more performance. Honestly, if we can invent faster computers, we will invent ways to put the power to use in a productive, tangible way.
Pretty much the same with any multi-processor technology: shared resources like buses are the major limitation.
Engineering is the art of compromise.
3. As has been mentioned time and again, until developers actually embrace multi-threading this will be relatively useless. Tests from various hardware sites have shown that going from the Core 2 Duo to the Core 2 Quad offers very little benefit except for a very small subset of users... who should probably be running workstations anyway (Video editing, 3D rendering, etc.)
RTFA. The article claims:
"The 'software' challenge is: Can you manage all the different tasks and workers so that the job is completed in 3 minutes instead of 300?" Vishkin continued. "Our algorithms make that feasible for general-purpose computing tasks for the first time."
To show how easy it is to program, Vishkin is also providing access to the prototype to students at Montgomery Blair High School in Montgomery County, Md.
Parallel computing has been around for a while. One of the challenges of parallel computing has always been that it is inherently harder to code. These guys acknowledged this, but they say their prototype is "easy" to program. We'll see if they're right.
I'm going to make an assumption and say that you don't do a lot of system programming. Threaded applications depend... heavily... on synchronizing data access. You simply can't take a single threaded application and break it out across threads without having some context of how it's accessing it's data and why. Imagine landing planes at an airport. It's a serial process... you just can't arbitrarily run it in parallel... "bad things" (tm) happen. The "algorithms" Mr. Vishkin is speaking of have no way of determining the context of code being executed and trying to break it out is a disaster waiting to happen.
There are applications where massive parallelism like this is fantastic... using my initial example... encoding video. Throw each frame off to one of the processors and you're processing 300 at a time (even there there are limitations because each frame requires information from the previous).
But I stand my statement.. anyone who says they can take a serial application and run it in parallel is full of sh*t and they know it. In certain, limited circumstances, yes... but in general. NO.
I will either nominate the name, "Giant Douche," or, "Turd Sandwich," depending on which one slashdotters vote for.
iPerbole©
http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
http://en.wikipedia.org/wiki/Transputer
"It doesn't cost enough, and it makes too much sense."
Anyone remember the hype of the i860? Great on paper, but not so great in reality. I really hope this works though, von Neuman architecture was always supposed to be a stop-gap (even vN said so I think).
Bitter and proud of it.
It appears to be a few FPGAs. With FPGAs, you can optimize the logic to represent algorithms for faster execution that on general purpose processors. Simply, you use more of the gates available on the chip. That appears to be what these guys are doing. It also appears that there is a single memory controller (I think that is what the QuickLogic chip is) and there is only one DRAM module installed on the board. It would be interesting if the board had a unified memory architecture. There is a separate Xilinx Spartan FPGA on the board that does who-knows-what, but I wouldn't be surprised if it was involved in communication with the processing chips. Of course, this is speculation, but it would seem logical for a board layout.
Just my thoughts.
If everything in the chip is lining up so nicely, how about calling it
THE SYZYGY
no, I'm not making up the word. If you don't believe me, http://dictionary.reference.com/browse/syzygy
I wouldn't consider the mad hatter mad. Just reality impaired. He sure can make a mean cup of tea.
But I want those $500. Maybe I could use it to buy a board
Don't lie. You'll actually spend it on 2 computer games, lots of mountain dew and some pizzas.
Seven puppies were harmed during the making of this post.
Since of course that breaks down. Actually maybe it isn't so retarded since the same thing is true in many computing problems.
For example if you take the cleaning situation sure, adding a second cleaner will nearly double the speed it gets cleaned at. Adding four will probably close to quadruple it. However, it starts to break down after a while. At first the gains just start slowing down, as there's more people they have to spend more time talking and dividing up who does what than actually working, as well as doing work others have done because of a miscommunication. Eventually you have so many people that you start actually slowing down with each person you add, because they are getting in each other's way and taking up too much time with non-work.
That's fairly similar to what you get with a lot of problems in computation. You split the task in half, you can have 2 processors/cores/whatever execute it and nearly double your speed. However after a point you find that you can't split the task more, or that even if you can, it takes more time getting it all sync'd up than you gain from the multiple execution, or that contention in other parts of the system (like memory) holds things back.
The concept of "two is better than one to 100 must be better than 10" doesn't hold up. There are almost always limits to how much you can divide up a task. Sometimes those limits are extremely high, but they are there. Unfortunately, for many tasks, the limits are pretty low.
Suppose you hire one person to clean your home, and it takes five hours, or 300 minutes, for the person to perform each task, one after the other," Vishkin said. "That's analogous to the current serial processing method. Now imagine that you have 100 cleaning people who can work on your home at the same time! That's the parallel processing method.
100 people trying to clean my house at the same time would be slower than 1, because no one would be able to move or breathe. Which is exactly what makes parallel computing hard.
It's not wasting time, I'm educating myself.
Wow. You got half way with your idea, but didn't make it all the way.
Right now, with most programming languages, we tell the computer how to compute the result. We generally do this with a linear list of steps for the computer to take. But that's not the only way to write a program. Another way is to tell the computer what we want it to compute, and let it figure out the best way how to do that. This sounds pretty crazy at first, but it's actually been done. Take a look at the Prolog and Haskell programming languages. They're much more descriptive than iterative. They can parallelize things a lot better than the languages we're used to using.
Software sucks. Open Source sucks less.