Slashdot Mirror


Supercomputer On-a-Chip Prototype Unveiled

An anonymous reader writes "Researchers at University of Maryland have developed a prototype of what may be the next generation of personal computers. The new technology is based on parallel processing on a single chip and is 'capable of computing speeds up to 100 times faster than current desktops.' The prototype 'uses rich algorithmic theory to address the practical problem of building an easy-to-program multicore computer.' Readers can win $500 in cash and write their names in the history of computer science by naming the new technology."

42 of 214 comments

  1. Name ? by Hsensei · · Score: 2, Insightful

    What's wrong with Supercomputer On-a-Chip (c) ?

    --
    ~
    1. Re:Name ? by Anonymous Coward · · Score: 2, Funny

      What about people-ready chip?

    2. Re:Name ? by DigiShaman · · Score: 3, Funny

      Supercomputer-On-a-Chip, or SOAC (pronounced soak).

      "Need your data processed in a jiffy? Then SOAC your data on our new chip. All yours for $19.95*!"

      *sorry, no CODS accepted

      --
      Life is not for the lazy.
    3. Re:Name ? by OctoberSky · · Score: 4, Funny

      Babywulf Cluster

    4. Re:Name ? by hAckz0r · · Score: 4, Funny

      What's wrong with Supercomputer On-a-Chip (c) ?

      Oh great, I can hear the PR advertisements already; "Put a SOC in it".

  2. "Cell" by Doc+Ruby · · Score: 3, Insightful

    I call the "supercomputer on a chip" the "Cell microprocessor". Of course, next year, it won't be so super. But there will be a new one that's really super.

    --
    make install -not war

    1. Re:"Cell" by Doc+Ruby · · Score: 2, Insightful

      How is that "fair"? By the time this new chip is even properly named, IBM will have Cell chips in 45nm silicon. Partly because their engine is simpler. And the Cell is designed for scalable multicore/chip parallelism. Its main magic is its coherent, superfast "elements" bus, which retains coherency even at 1.6Tbps across multiple cores and chips. IBM has 4-core chips in pairs already deployed in public, and 128-core chips in the lab, where a massive new top-predator supercomputer is being built on the new architecture.

      There are other, more parallel processors. The PS3's Cell at 204GFLOPS is matched to a 128-shader RSX at 1.8TFLOPS. But you can't run Linux, or anything else so general purpose, on an RSX - not without a prohibitively difficult development process, if at all retaining the speed.

      The Cell has builtin allocation facilities, so app code doesn't have to schedule or otherwise closely manage the fast SPEs, just send tasks to a generic pool. Which SPEs just DMA into a unified memory model. That kind of simplicity makes Cell programming harder than, say, PowerPC programming, but much easier than other parallel programming, without losing its speed. Once there are some basic libraries for programming "common" new parallel tasks on the Cell, it won't be considered any harder than it was to program x86 "Protected Mode", Extended vs Expanded Memory, word alignment, etc.

      --
      make install -not war

  3. Taken? by bryan1945 · · Score: 3, Funny

    "Readers can win $500 in cash and write their names in the history of computer science by naming the new technology."

    Is "Clippy" taken?

    --
    Vote monkeys into Congress. They are cheaper and more trustworthy.
    1. Re:Taken? by trolltalk.com · · Score: 3, Funny

      Chipzilla would be good, except that's what everyone calls Intel. I guess we'll have to settle for "CowboyNealOnAChip". Or "theChipThatCanActuallyRunJavaProgramsWithinTheUniversesLifetime"

      What gets me is that there's a dropdown in the entry form to choose your country, as well as asking you for your state or province, but the rules state:

      WHO MAY ENTER: Open to all legal residents of the 50 United States (including the District of Columbia) who are 18 years or older in their respective US state at time of entry. Individuals employed by the University of Maryland, College Park. ("University") as faculty, exempt or non-exempt employees, and members of their immediate family or persons living in the same household, are not eligible to enter or win.

      I hope their chip design is better thought out than the contest form.

  4. WTF? by msauve · · Score: 4, Insightful

    We have microcomputers and supercomputers and nothing in between? Seems to be a bit of hyperbole involved here.

    --
    "National Security is the chief cause of national insecurity." - Celine's First Law
    1. Re:WTF? by gardyloo · · Score: 3, Funny

      We have microcomputers and supercomputers and nothing in between? Seems to be a bit of hyperbole involved here.

      Most. Insightful. Post. Ever. ;)
    2. Re:WTF? by Kadin2048 · · Score: 5, Insightful

      but what is even the high-end gamer going to need a chip 100 times faster than today's machines for any time in the next decade?

      If you compare megahertz-cores (number of megahertz times number of cores at that speed), I suspect that there's been almost a 100x increase in the past 10 years, at least if you look from the low end a decade ago to the high end of personal computers now.

      I don't see why the next ten years would be any different. Operating systems will continue to get more bloated, software packages will get more feature-stuffed, games will continue to demand just slightly more than whatever's available to most people with expenses and regular lives, and most people will buy a new machine every few years based on whatever's on sale for $500 at Best Buy when their old one gets clogged with spyware.

      Sure, 100x might be a bit of a stretch (I'm not sure whether silicon will go that much further and I'm not totally convinced that parallelism is the solution for general-purpose computing), but if that kind of power was available, it would be put to use.

      Software expands to fill the resources made available to it, and then some. Always has and always will.

      --
      "Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
    3. Re:WTF? by jacksonj04 · · Score: 2, Insightful

      Build it, and they will come.

      Remember getting your first 1GB drive and going "Wow, I'm *never* gonna be able to fill this up"? A few years later people are throwing around files in excess of 1GB with no worries.

      --
      How many people can read hex if only you and dead people can read hex?
    4. Re:WTF? by eggnoglatte · · Score: 2, Insightful

      Once you have compute power, you'll find a way to use it. If you are a gamer, then this kind of performance gain will be used to roll the GPU back into the CPU, increase your screen resolution to 10-20 MPixels, increase the rendering quality, and improve the game AI.

      Once you are done with all that, you are going to be back asking for more.

  5. My Name by the+eric+conspiracy · · Score: 5, Funny

    'Space Heater'

  6. There's nothing here by IlliniECE · · Score: 2, Insightful

    I RTFA... It seems to handwave so much about parallel computing that it seems they haven't discovered anything. All I see is "clock frequency can't increase, so we're going parallel"... Surely this can't be the extent of their research. The article claims it's "easy to program", but there are zero specifics about why that would be the case. Can anyone tell me what they've done here (if anything)?

    1. Re:There's nothing here by Holi · · Score: 2, Interesting

      Well, you should learn to follow links.
      It was quite easy from the article to find more information about the project.

      --
      Sorry, teleporters just kill you and then make a copy. A perfect, soul-less copy.
    2. Re:There's nothing here by James+McP · · Score: 4, Informative

      Here's the deal.

      Up 'til now, Parallel Random Access Machine (PRAM) computing has been a theory of parallel processing that was a thought model. It hadn't been built. Some people had written programs to emulate a PRAM computer but they were not complete versions.

      It could work at a snail's pace and still be a technological accomplishment as it is the very first, complete, working, hardware PRAM computer. It's on par with the Z3, Colossus and ENIAC, the first programmable computers (German, English, American, in historical order).

      Fortunately, they made the algorithms work well, or at least, if the press release is to be believed, work so that 64 75MHz computers could produce 100x the performance of a current desktop on at least one particular function. Which is pretty impressive in first-time hardware even if it turns out to be an obscurely used math function known only to about a dozen coders.

      --
      I've been on slashdot so long I'm starting to get out of touch with the cool stuff if it ain't on slashdot.
  7. Confidence: Low by Lije+Baley · · Score: 5, Funny

    Vaporac. Vaporlon. Vaporium. Whatever...

    --
    Strange things are afoot at the Circle-K.
    1. Re:Confidence: Low by Refenestrator · · Score: 2, Funny

      Or you could add in a temperature joke and call it the Vaporizer.

  8. I name it by Kohath · · Score: 3, Funny

    Bob

  9. Overhyped by rivenmyst137 · · Score: 5, Insightful

    Oh, for god's sake. I don't understand why this is getting so much press. It was stupid when it went up on Digg, and it's stupid that it's showing up here. This isn't substantially different from any of the other parallel architecture and programming work that's been going on for the last two decades. Their benchmarks are against embarrassingly parallelizable algorithms like matrix multiplies and randomized quicksort, things that any half-intelligent lemur (with a math and cs class or two) could get to run quickly. The hard part is speeding up your average desktop application which, I guarantee you, is not spending the majority of its time doing matrix multiplies.
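
A rough sketch of why matrix multiply counts as embarrassingly parallel: each output row depends only on one input row and the second matrix, so rows can be handed out with no coordination. The function names are illustrative and not taken from the article, and plain Python threads are used only to show the structure; in CPython they will not speed up arithmetic like this.

```python
from concurrent.futures import ThreadPoolExecutor


def row_times_matrix(row, b):
    """Compute one output row; it reads only `row` and `b`, never another output row."""
    inner = len(b)
    cols = len(b[0])
    return [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]


def parallel_matmul(a, b, workers=4):
    """Multiply a (n x m) by b (m x p), giving each output row to a worker."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so rows land in the right place.
        return list(pool.map(lambda row: row_times_matrix(row, b), a))


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(parallel_matmul(a, b))  # [[19, 22], [43, 50]]
```

No locks or messages are needed because no worker ever touches another worker's row, which is exactly what makes this class of benchmark easy.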

    On top of that, their "parallel extension of von Neumann" amounts to adding primitives to start and stop threads into the language. Again, any half-intelligent lemur (with a slightly different skill set from the first) could have done that. And I think a few actually have (at the risk of comparing language researchers to lemurs). It doesn't solve the underlying problem.

    Oh, and did we mention no floating point and the lack of any memory bandwidth to get data into and out of this thing?

    This is over-hyped research and shameless self-promotion, and for some weird reason the press seems to be buying it. Stop it.

    1. Re:Overhyped by Doppler00 · · Score: 4, Informative

      Yeah this article is pretty weak. "Woohoo! Look we took a picture of a last generation FPGA development board and wrote some nifty programs for it that prove our pet project!" I think very little of things like this make it outside of academia. I'm not saying this research is unworthy, just not newsworthy.

      And "parallel extension of von Neumann" exists. It's called OpenMP and it still takes a skilled programmer to understand.

      Look at that board... it uses "SmartMedia" yeah... that means that:

      1. This is OLD research
      2. The board developers didn't have a clue
      3. A very old development board is being used.

    2. Re:Overhyped by uarch · · Score: 2, Funny

      Actually, the more I think about it they could have made a better whitepaper using this:

      http://pdos.csail.mit.edu/scigen/

  10. Human-guided autovectorization. by Ayanami+Rei · · Score: 3, Interesting

    You know, autovectorization looks good on paper. But for most tasks, it really doesn't net you any benefit unless you can separate all your work into non-overlapping chunks. You can't have any interdependencies on your working set (or risk expensive, non-scalable locking), and if you're all pulling from a single data source to split up the analysis work you'll spend a lot of time in contention for the pipe to that resource.

    For example, it wouldn't make searching a database (scratch that, searching any data set) any faster unless the index was already pre-split among the processing units.

    In this architecture the processing units have the same bus to RAM and disk on the front and back ends and have to deal with contention.

    Your system is only as fast as the slowest serial part. Typically this is storage media, a network connection, or a memory crossbar. Processors really are fast enough for the non-embarrassingly parallel stuff. They are at the right ratio with respect to the other slower buses to do most general purpose work.
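
The "slowest serial part" point is usually stated as Amdahl's law, which the comment does not name; the sketch below assumes that formulation. `amdahl_speedup` is an illustrative helper, and the parallel fractions in the loop are made-up inputs, not measurements of this chip.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Best-case speedup when only `parallel_fraction` of the work can be split."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at full cost; only the parallel part is divided.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # 64 processing units, as in the prototype described in the thread.
    for fraction in (0.5, 0.9, 0.99):
        print(fraction, round(amdahl_speedup(fraction, 64), 2))
```

Under this model even a job that is 99% parallel stays below a 40x speedup on 64 units, and the serial 1% (disk, network, memory bus) sets the ceiling.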

    If you want to do more than that, then it's other things -- storage media, memory, I/O buses -- that need to be multiplied in density and number. Only then can we see higher throughput.

    Autovectorization is only good for things we already have offloading for anyway (TCP encryption, graphics, sound)... and for those general purpose cases like in Game AI where you might want a linear algebra boost NVidia has beaten these guys to the punch with the GP stream processing in the newest chips and the very flexible Cg language/environment.

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
  11. Non-US residents ineligible to enter by bh_doc · · Score: 2, Informative

    Second paragraph of the rules:

    THE FOLLOWING CONTEST IS INTENDED FOR PLAY IN THE UNITED STATES AND SHALL ONLY BE CONSTRUED AND EVALUATED ACCORDING TO UNITED STATES LAW. DO NOT ENTER THIS CONTEST IF YOU ARE NOT LOCATED IN THE UNITED STATES.

    Even though there is a country field in the form. WTF?

    They don't mention that on the form page, either. It peeves me just a little bit that they would do that, I mean, how many people actually read these conditions things, anyway? Can't say I'm surprised, though.

  12. Re:Limited Practical Applications (for now) by p0tat03 · · Score: 4, Insightful

    While I agree there are certain leaps to be made before this can be a mass market item, I disagree fundamentally with point 1 that you make. You could have made the exact argument about the old DOS Lotus office suite way back, 15 years ago. Those things still word process, and a 386 33MHz is certainly no slouch - I never had to sit around waiting for the software to respond to me or finish some ridiculously long task.

    I'm sure you'd agree that these newfangled Pentiums and Core Duos are quite useful, even for the end user.

    Think about features like predictive and contextual actions. Desktop search? Search-as-you-type? There are many ways to improve the usability of computers that require more and more performance. Honestly, if we can invent faster computers, we will invent ways to put the power to use in a productive, tangible way.

  13. Where parallelisms break down by EmbeddedJanitor · · Score: 2

    Suppose you had 100 cleaners in your house. They'd all be tripping over each other and all unplugging each other's vacuum cleaners to plug in their own. And all their minivans would cause a traffic jam in your driveway.

    Pretty much the same with any multi-processor technology: shared resources like buses are the major limitation.

    --
    Engineering is the art of compromise.
    1. Re:Where parallelisms break down by rbanffy · · Score: 2, Interesting

      Sun had something with tiny radio interconnects between chips. This way, they could have thousands of "pins" on the chip and the only metal pins you would need would be power and ground. If I remember correctly, I had a server whose memory had to be upgraded about 8 (or 9) modules-with-lots-of-pins at a time, so, wide buses are nothing new.

      Intel also had something about optical interconnects, which are also nice, since you can place your "connectors" anywhere in the chip and not just around the borders and, if you can aim properly, the receivers can be much smaller than the pads around a current chip (or, by properly spreading the signals, one could synchronize many receivers to a single source very efficiently).

      We may not be constrained by the number of pins a connector has for that much longer.

  14. Re:Limited Practical Applications (for now) by Morty · · Score: 2, Informative

    3. As has been mentioned time and again, until developers actually embrace multi-threading this will be relatively useless. Tests from various hardware sites have shown that going from the Core 2 Duo to the Core 2 Quad offers very little benefit except for a very small subset of users... who should probably be running workstations anyway (Video editing, 3D rendering, etc.)

    RTFA. The article claims:

        "The 'software' challenge is: Can you manage all the different tasks and workers so that the job is completed in 3 minutes instead of 300?" Vishkin continued. "Our algorithms make that feasible for general-purpose computing tasks for the first time." ...

        To show how easy it is to program, Vishkin is also providing access to the prototype to students at Montgomery Blair High School in Montgomery County, Md.

    Parallel computing has been around for a while. One of the challenges of parallel computing has always been that it is inherently harder to code. These guys acknowledged this, but they say their prototype is "easy" to program. We'll see if they're right.

  15. Re:Limited Practical Applications (for now) by thesandbender · · Score: 4, Informative

    I'm going to make an assumption and say that you don't do a lot of system programming. Threaded applications depend... heavily... on synchronizing data access. You simply can't take a single threaded application and break it out across threads without having some context of how it's accessing its data and why. Imagine landing planes at an airport. It's a serial process... you just can't arbitrarily run it in parallel... "bad things" (tm) happen. The "algorithms" Mr. Vishkin is speaking of have no way of determining the context of code being executed and trying to break it out is a disaster waiting to happen.
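
A minimal sketch of the synchronization point above, assuming Python's standard `threading` module stands in for whatever a real platform offers. `Counter` and `run_workers` are illustrative names, not anything from the article.

```python
import threading


class Counter:
    """Shared counter whose update is guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Read-modify-write on shared data: hold the lock so two threads
        # cannot interleave between the read and the write.
        with self._lock:
            self.value += 1


def run_workers(workers=8, increments=10_000):
    """Start `workers` threads that each bump one shared counter."""
    counter = Counter()

    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers())  # 80000: every increment is accounted for
```

The lock makes the result correct by forcing the updates to happen one at a time, which is the airport-runway situation in miniature: the shared resource turns that part of the program back into a serial process.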

    There are applications where massive parallelism like this is fantastic... using my initial example... encoding video. Throw each frame off to one of the processors and you're processing 300 at a time (even there, there are limitations because each frame requires information from the previous).

    But I stand by my statement... anyone who says they can take a serial application and run it in parallel is full of sh*t and they know it. In certain, limited circumstances, yes... but in general, NO.

  16. Please vote on the new name by cashman73 · · Score: 3, Funny

    I will either nominate the name, "Giant Douche," or, "Turd Sandwich," depending on which one slashdotters vote for.

  17. Hand over the $500 right now by Enderandrew · · Score: 5, Funny

    iPerbole©

    --
    http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
  18. Transputer? by MadMidnightBomber · · Score: 3, Informative
    --
    "It doesn't cost enough, and it makes too much sense."
  19. i860? by Evil+Pete · · Score: 2, Interesting

    Anyone remember the hype of the i860? Great on paper, but not so great in reality. I really hope this works though, von Neumann architecture was always supposed to be a stop-gap (even vN said so I think).

    --
    Bitter and proud of it.
    1. Re:i860? by julesh · · Score: 2, Interesting

      Anyone remember the hype of the i860? Great on paper, but not so great in reality. I really hope this works though, von Neumann architecture was always supposed to be a stop-gap (even vN said so I think).

      As far as I can tell, there's no really significant departure from von Neumann architecture here. They have a processor capable of executing 64 concurrent threads, 'fork' and 'join' instructions, and a version of C that has been extended to be able to use them. I'm not sure I really see what's so revolutionary here -- I've been reading about prototypes of similar ideas to this since the late 90s.
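
For readers unfamiliar with the fork/join pattern mentioned above, here is a minimal sketch of the general idea. It is written in Python with the standard `concurrent.futures` module, not in the project's extended C, and `fork_join_sum` is a hypothetical helper used only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def fork_join_sum(values, threads=4):
    """Split `values` into chunks, sum each chunk in its own task, then combine."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # Ceiling division, with a floor of 1 so an empty input is handled.
    chunk = max(1, -(-len(values) // threads))
    pieces = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # "fork": hand each piece to a worker and keep going.
        futures = [pool.submit(sum, piece) for piece in pieces]
        # "join": wait for every worker before combining the partial sums.
        return sum(future.result() for future in futures)


if __name__ == "__main__":
    print(fork_join_sum(list(range(1, 101))))  # 5050
```

The serial code before the fork and after the join is ordinary von Neumann-style code, which fits the comment's observation that the model extends the familiar architecture instead of replacing it.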

  20. FPGAs by CompMD · · Score: 2, Informative

    It appears to be a few FPGAs. With FPGAs, you can optimize the logic to represent algorithms for faster execution than on general purpose processors. Simply, you use more of the gates available on the chip. That appears to be what these guys are doing. It also appears that there is a single memory controller (I think that is what the QuickLogic chip is) and there is only one DRAM module installed on the board. It would be interesting if the board had a unified memory architecture. There is a separate Xilinx Spartan FPGA on the board that does who-knows-what, but I wouldn't be surprised if it was involved in communication with the processing chips. Of course, this is speculation, but it would seem logical for a board layout.

    Just my thoughts.

  21. Re:Limited Practical Applications (for now) by eonlabs · · Score: 2, Interesting

    If everything in the chip is lining up so nicely, how about calling it

    THE SYZYGY

    no, I'm not making up the word. If you don't believe me, http://dictionary.reference.com/browse/syzygy

    --
    I wouldn't consider the mad hatter mad. Just reality impaired. He sure can make a mean cup of tea.
  22. Re:I don't know about you guys... by Dunbal · · Score: 2, Funny

    But I want those $500. Maybe I could use it to buy a board

    Don't lie. You'll actually spend it on 2 computer games, lots of mountain dew and some pizzas.

    --
    Seven puppies were harmed during the making of this post.
  23. It's also retarded by Sycraft-fu · · Score: 2, Insightful

    Since of course that breaks down. Actually maybe it isn't so retarded since the same thing is true in many computing problems.

    For example if you take the cleaning situation sure, adding a second cleaner will nearly double the speed it gets cleaned at. Adding four will probably close to quadruple it. However, it starts to break down after a while. At first the gains just start slowing down, as there's more people they have to spend more time talking and dividing up who does what than actually working, as well as doing work others have done because of a miscommunication. Eventually you have so many people that you start actually slowing down with each person you add, because they are getting in each other's way and taking up too much time with non-work.

    That's fairly similar to what you get with a lot of problems in computation. You split the task in half, you can have 2 processors/cores/whatever execute it and nearly double your speed. However after a point you find that you can't split the task more, or that even if you can, it takes more time getting it all sync'd up than you gain from the multiple execution, or that contention in other parts of the system (like memory) holds things back.

    The concept of "two is better than one, so 100 must be better than 10" doesn't hold up. There are almost always limits to how much you can divide up a task. Sometimes those limits are extremely high, but they are there. Unfortunately, for many tasks, the limits are pretty low.
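
A toy model of the diminishing returns described above. It assumes the work divides evenly and that each extra worker adds a fixed coordination cost; both assumptions and both function names are made up for illustration. The 300-minute job is the figure from the article's cleaning analogy, and the 1 minute of overhead per extra worker is an arbitrary input.

```python
def cleaning_time(total_work, workers, overhead_per_worker):
    """Minutes to finish: evenly divided work plus coordination overhead."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return total_work / workers + overhead_per_worker * (workers - 1)


def best_worker_count(total_work, overhead_per_worker, max_workers):
    """Worker count (1..max_workers) that minimizes cleaning_time."""
    return min(
        range(1, max_workers + 1),
        key=lambda n: cleaning_time(total_work, n, overhead_per_worker),
    )


if __name__ == "__main__":
    for n in (1, 2, 4, 17, 50, 100):
        print(n, round(cleaning_time(300, n, 1.0), 1))
    print("best:", best_worker_count(300, 1.0, 100))
```

With these inputs the best crew is 17 people, and every cleaner added after that makes the job take longer, even though 100 cleaners still beat one. A larger overhead per worker pulls the optimum down further.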

  24. Worst Analogy Ever by FuzzyDaddy · · Score: 2, Insightful

    From TFA:

    "Suppose you hire one person to clean your home, and it takes five hours, or 300 minutes, for the person to perform each task, one after the other," Vishkin said. "That's analogous to the current serial processing method. Now imagine that you have 100 cleaning people who can work on your home at the same time! That's the parallel processing method."

    100 people trying to clean my house at the same time would be slower than 1, because no one would be able to move or breathe. Which is exactly what makes parallel computing hard.

    --
    It's not wasting time, I'm educating myself.
  25. Re:Isn't the solution to reverse the concept? by booch · · Score: 2, Insightful

    Wow. You got half way with your idea, but didn't make it all the way.

    Right now, with most programming languages, we tell the computer how to compute the result. We generally do this with a linear list of steps for the computer to take. But that's not the only way to write a program. Another way is to tell the computer what we want it to compute, and let it figure out the best way how to do that. This sounds pretty crazy at first, but it's actually been done. Take a look at the Prolog and Haskell programming languages. They're much more descriptive than iterative. They can parallelize things a lot better than the languages we're used to using.
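
A small sketch of the "say what, not how" idea. Python is neither Prolog nor Haskell, so this only approximates the point: when the computation is described as a pure function applied to every input, the same description can be run serially or in parallel without changing the program's meaning. All names here are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """Pure function: the result depends only on x and nothing shared is modified."""
    return x * x


def squares_serial(data):
    """Evaluate the description one element at a time."""
    return list(map(square, data))


def squares_parallel(data, workers=4):
    """Evaluate the same description with a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, data))


if __name__ == "__main__":
    data = list(range(10))
    # Same answer either way; only the evaluation strategy differs.
    print(squares_serial(data) == squares_parallel(data))  # True
```

The program states what to compute (the square of each element) and leaves the order of evaluation to the runtime, which is the property that makes declarative code easier to parallelize.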

    --
    Software sucks. Open Source sucks less.