Slashdot Mirror


Supercomputer On-a-Chip Prototype Unveiled

An anonymous reader writes "Researchers at University of Maryland have developed a prototype of what may be the next generation of personal computers. The new technology is based on parallel processing on a single chip and is 'capable of computing speeds up to 100 times faster than current desktops.' The prototype 'uses rich algorithmic theory to address the practical problem of building an easy-to-program multicore computer.' Readers can win $500 in cash and write their names in the history of computer science by naming the new technology."

214 comments

  1. Name ? by Hsensei · · Score: 2, Insightful

    What's wrong with Supercomputer On-a-Chip (c) ?

    --
    ~
    1. Re:Name ? by Aranykai · · Score: 1, Funny

      I call it the Gargantu-Hertz Processor :P

      --
      If sharing a song makes you a pirate, what do I have to share to be a ninja?
    2. Re:Name ? by Anonymous Coward · · Score: 2, Funny

      What about people-ready chip?

    3. Re:Name ? by DigiShaman · · Score: 3, Funny

      Supercomputer-On-a-Chip, or SOAC (pronounced soak).

      "Need your data processed in a jiffy? Then SOAC your data on our new chip. All yours for $19.95*!

      *sorry, no CODS accepted

      --
      Life is not for the lazy.
    4. Re:Name ? by OctoberSky · · Score: 4, Funny

      Babywulf Cluster

    5. Re:Name ? by ozmanjusri · · Score: 1

      How much did you earn for that?

      --
      "I've got more toys than Teruhisa Kitahara."
    6. Re:Name ? by hAckz0r · · Score: 4, Funny

      What's wrong with Supercomputer On-a-Chip (c) ?

      Oh great, I can hear the PR advertisements already; "Put a SOC in it".

    7. Re:Name ? by Anonymous Coward · · Score: 0

      Super Lucky Besto Computing Chip

    8. Re:Name ? by IdleTime · · Score: 1

      I like, for obvious reasons and it's quite appropriate here, "Deep Thought"

      --
      If you mod me down, I *will* introduce you to my sister!
    9. Re:Name ? by KDR_11k · · Score: 1

      SOC is already taken for System On Chip? Maybe ScOC. No idea what's the difference between an ScOC and an MPSOC.

      --
      Justice is the sheep getting arrested while an impartial judge declares the vote void.
    10. Re:Name ? by blackicye · · Score: 1

      My vote is for CLustered Units of Multiple Processors.

      CLUMP :P

    11. Re:Name ? by Opportunist · · Score: 1

      Awww... that's cute! And the logo and icon animal almost comes along by itself.

      --
      We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
    12. Re:Name ? by moeinvt · · Score: 1

      Huh???

      Looks to me like it's a "supercomputer" on a PCB? They wired a bunch of processors together on a circuit board (the size of a license plate). That isn't a "chip". How about SOB?

    13. Re:Name ? by mikael · · Score: 1

      But be careful not to get confused with:

      Spearmint Oil Administrative Committee
      Sons of Alpha Centauri (band)
      State of the Art Car
      Submarine Officer Advance Course
      System-On-A-Chip

      (From SOAC Acronym)

      --
      Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
    14. Re:Name ? by moeinvt · · Score: 1

      I did RTFA, or more like "skimmed through it" before my comment.

      With a more careful read however, I noticed that they explicitly called this a "prototype". Not many universities have their own wafer fabs, so it makes sense. More importantly, they didn't give all of the specs on the processors used. If they're small enough, maybe this could be implemented on a single chip.

    15. Re:Name ? by Capt+James+McCarthy · · Score: 1

      'Oh great, I can hear the PR advertisements already; "Put a SOC in it"'

      Better say that instead of "Computer On-a-Chip"

      ServiceDesk Tech: "Sir, I think your COC is over heating and needs to be replaced."

      --
      There are no loopholes. It's either legal or it's not.
    16. Re:Name ? by clang_jangle · · Score: 1

      No no, you forgot the POW!
      Super Lucky POW! A-Number One ...

      --
      Caveat Utilitor
    17. Re:Name ? by Neo_piper · · Score: 1

      Wasn't that already used as the name of IBM's first chess playing supercomputer, you know, Deep Blue's little brother?

  2. "Cell" by Doc+Ruby · · Score: 3, Insightful

    I call the "supercomputer on a chip" the "Cell microprocessor". Of course, next year, it won't be so super. But there will be a new one that's really super.

    --
    make install -not war

    1. Re:"Cell" by Spy+der+Mann · · Score: 1

      I know, I know! Let's call it the Goku(TM) microprocessor! :D

    2. Re:"Cell" by Anonymous Coward · · Score: 0

      So that'd be a Super Cell chip, right?
      If Sony loses funding for it, they can always sell it to the Canadians and tell them it gives them an edge in high altitude weather forecasting!

    3. Re:"Cell" by b0101101001010000 · · Score: 1

      It's very interesting reading the paper linked at http://www.umiacs.umd.edu/users/vishkin/XMT/spaa07paper.pdf. It reminds me of Mercury Computing Programming Toolkit for Cell Processor Programming. They too have a spawn and join method of concurrent programming, see: http://www.mc.com/uploadedImages/MCF-FOE-model.jpg at http://www.mc.com/microsites/cell/ProductDetails.aspx?id=2824. Notice the worker/manager similarity to the spawn/join semantic. It would appear that this chip is fundamentally the same, but provides implicit engine allocation. Very interesting....
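
For anyone who has not met the spawn/join style mentioned in the comment above, here is a minimal Python sketch of the general pattern. It is not XMT or Mercury code: `spawn_join` and `square` are hypothetical names, and an ordinary thread pool stands in for whatever the hardware does to assign work to free engines.

```python
from concurrent.futures import ThreadPoolExecutor


def square(i):
    # Hypothetical per-task work: each spawned task handles one index.
    return i * i


def spawn_join(task, n, workers=4):
    """Spawn n independent tasks, then join: return only when all are done."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands each task to whichever worker is free, so the caller
        # never names a worker, and returns results in index order.
        return list(pool.map(task, range(n)))


if __name__ == "__main__":
    print(spawn_join(square, 8))
```

The point of the pattern is that the caller only states how many tasks exist; which engine runs which task is left to the runtime.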

    4. Re:"Cell" by dwarfsoft · · Score: 1

      "Buy! Buy! BUY! The Cell, Cell, CELL!"

      --
      Cheers, Chris
    5. Re:"Cell" by julesh · · Score: 1

      To be fair, if this crowd had a version of their chip implemented on 65nm silicon, it would probably outperform the Cell in several key areas. For a start, it has a maximum parallelism of 64 simultaneous instructions -- I believe the Cell can only reach 10 (?). Of course, writing a real program that takes advantage of that much parallelism is a little tricky...

    6. Re:"Cell" by Doc+Ruby · · Score: 1

      Before I start to wade through that UMD paper, are you saying that its model might have been an inspiration for the Cell, or that it is just similar to, but designed after, the Cell? Are you saying the UMD design provides implicit engine allocation, or just confirming the fact that the Cell does?

      --
      make install -not war

    7. Re:"Cell" by Doc+Ruby · · Score: 2, Insightful

      How is that "fair"? By the time this new chip is even properly named, IBM will have Cell chips in 45nm silicon. Partly because their engine is simpler. And the Cell is designed for scalable multicore/chip parallelism. Its main magic is its coherent, superfast "elements" bus, which retains coherency even at 1.6Tbps across multiple cores and chips. IBM has 4-core chips in pairs already deployed in public, and 128-core chips in the lab, where a massive new top-predator supercomputer is being built on the new architecture.

      There are other, more parallel processors. The PS3's Cell at 204GFLOPS is matched to a 128-shader RSX at 1.8TFLOPS. But you can't run Linux, or anything else so general purpose, on an RSX - not without a prohibitively difficult development process, if at all retaining the speed.

      The Cell has builtin allocation facilities, so app code doesn't have to schedule or otherwise closely manage the fast SPEs, just send tasks to a generic pool. Which SPEs just DMA into a unified memory model. That kind of simplicity makes Cell programming harder than, say, PowerPC programming, but much easier than other parallel programming, without losing its speed. Once there are some basic libraries for programming "common" new parallel tasks on the Cell, it won't be considered any harder than it was to program x86 "Protected Mode", Extended vs Expanded Memory, word alignment, etc.

      --
      make install -not war

    8. Re:"Cell" by somersault · · Score: 1

      First slashdot comment that's given me a bit of a chuckle for weeks :P Could call it the SSJ2? Though I'm not up to date with my Cell saga (stopped watching DBZ cuz I didn't have satellite at Uni), so maybe it's SSJ3 or whatever..

      --
      which is totally what she said
    9. Re:"Cell" by b0101101001010000 · · Score: 1

      I really can't say whether Cell was created as a result of the UMD work. I do think that the programming paradigm of micro tasks has now been found in 2 separate places that I know of and that each architecture has a main controller: the Cell's Power Processing Element and the XMT's Master Thread Control Unit and Global Register File and discrete offload engines: Synergistic Processor Elements (SPEs) and the XMT's Processing Clusters. What I find very novel here is that the Maryland guys have explicitly shot for ease of programming - leading to implicit offload engine allocation guided by an explicit task primitive. The Cell on the other hand uses a more complicated DMA model in lieu of the higher level "taskSpawn" that is made into a taskSpawn via an API. So here we're just trading bandwidth for ease of use it seems. I'm looking at some of the PMU and Cell programs right now....

    10. Re:"Cell" by Doc+Ruby · · Score: 1

      In 1990, at Array Technologies on the SF Bay, we invented a multi-DSP (scalable chip count) parallel processor for image processing. 3-9+ AT&T DSP32Cs were connected by Xilinx FPGAs, with several buses to host PCs, including EISA, SCSI, GPIB/IEEE-488, and later a RAM bus. One DSP was the master, distributing tasks around the board to resources it monitored for availability. That master DSP also reprogrammed the FPGA in realtime. Tasks were selected and distributed for execution, and data routed, and logic configured, to match parameters in incoming data streams, which came from a custom parallel bus from a 4-color-channel video sensor chip behind a Nikkor mount (for Nikon 35mm lenses). The DSP array also controlled the microposition of a subsampling stepper array on which the videochip was mounted, giving 8Kx8K (at 40b color) real data, interpolated by the DSPs into 16Kx16K.

      We invented everything ourselves, starting with a raw Hitachi videochip (for TV), blank FPGAs, and DSP32Cs. In 1990. We produced all the software, from DSP and FPGA to host apps to drivers to Photoshop plugins etc. The programming model used "spaces" of data with attached operations and dataflow dependencies, implemented in C structures (before C++ was viable), which we preprocessed to parallelize into the spatial processors (including color spaces, edge enhancing convolution spaces, etc). So both the compiler and runtime maximized parallelism while presenting a simple model of "the computer" to coders, who could reuse existing C code (including DSP libraries optimized for the AT&T chips). The hardest part was writing a single-stepping debugger - really hard was its UI. But we did it, even going to the extent of writing a "C interpreter" in it that could simulate C well enough that many programmers used it as a "command line" to solve simple offline programs, like a really fancy scientific calculator.

      Again, that was in 1990. We promoted a lot at parallel computing forums and imaging forums worldwide. And we were in Silicon Valley (up the street, towards Berkeley), so lots of other people in the area got turned on to us. Our model has turned up quite a lot. The Cell task management architecture reads like a direct descendant. And I can tell you that we didn't use any existing architecture, implemented or even just planned, when we made our own. But I'm not sure that the Cell or any other architecture or approach is based on any explicitly transmitted info. I think that architecture is just one of the best ways to do it, and we just thought of it (and pulled it off) very early. But we were ahead of our time: the company died after shipping only a few working systems. Partly because just fixing images in Photoshop for a few hundred bucks was cheap and easy (with lots of staff available), while the much better results we got cost a whole lot more. Meanwhile, people's expectations of image quality in most publishing were lowered by the booming popularity at the low end of digital shooting and retouching.

      But I do like seeing other work that validates how smart we were then. And keeps me current on ways to exploit what we developed as even faster, more complex HW comes across. That Array machine used 12.5MFLOPS DSPs, for up to something like 200MFLOPS total at $100K. Today my PS3 has 200GFLOPS just on its Cell, running Linux and all its apps (though needing porting to SPEs to really fly), for about $500. That's 1000x the power at 1/200th the price, or 200,000x the bang per buck. Finally getting to the point where everyone will have to learn to do what we made up almost 20 years ago.
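
The bang-per-buck figure in the paragraph above can be checked with a few lines of Python. This is only a sketch using the round numbers quoted there (roughly 200 MFLOPS at $100K versus roughly 200 GFLOPS at about $500); the function and constant names are made up for illustration.

```python
def bang_per_buck_ratio(old_flops, old_price, new_flops, new_price):
    """Return how many times more FLOPS per dollar the new system delivers."""
    return (new_flops / new_price) / (old_flops / old_price)


# Approximate figures as quoted in the comment, not measured values.
ARRAY_FLOPS, ARRAY_PRICE = 200e6, 100_000   # ~200 MFLOPS at $100K
CELL_FLOPS, CELL_PRICE = 200e9, 500         # ~200 GFLOPS at about $500

if __name__ == "__main__":
    speedup = CELL_FLOPS / ARRAY_FLOPS      # raw performance ratio
    cheaper = ARRAY_PRICE / CELL_PRICE      # price ratio
    ratio = bang_per_buck_ratio(ARRAY_FLOPS, ARRAY_PRICE, CELL_FLOPS, CELL_PRICE)
    print(speedup, cheaper, ratio)
```

The performance ratio (1000) times the price ratio (200) gives the 200,000-fold improvement in FLOPS per dollar.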

      If only I could plug a Xilinx Virtex II board running Microblaze/uCLinux directly into the Cell bus, I might have to learn something new again ;).

      --
      make install -not war

    11. Re:"Cell" by Caktus · · Score: 1

      I think that you are mixing chips and cores together.

      And the Cell is designed for scalable multicore/chip parallelism.
      The Cell is a heterogeneous multicore design with very good bandwidth between its cores, but that does not mean that it has been designed for scalable multichip parallelism. In fact there is a paper that shows that the bandwidth between chips is not that great.

      Its main magic is its coherent, superfast "elements" bus, which retains coherency even at 1.6Tbps across multiple cores and chips.
      Again, the Element Interconnect Bus has quite a lot of bandwidth, but it is only available between the cores of a single chip. Interchip communication must be performed through the IO port, which has much less bandwidth.

      IBM has 4-core chips in pairs already deployed in public, and 128-core chips in the lab, where a massive new top-predator supercomputer is being built on the new architecture.
      That's interesting. Could you provide a link to that information, please?

      The Cell has builtin allocation facilities, so app code doesn't have to schedule or otherwise closely manage the fast SPEs, just send tasks to a generic pool.
      The Cell does not have those facilities in hardware. All that is implemented in software.

      Which SPEs just DMA into a unified memory model.
      That is a bit confusing. The PPE operates on main memory and it is accessible to the SPEs, but only through DMA operations. They operate on their own memory (Local Store in the literature). I consider that a non unified model.
      Nevertheless, this model can be altered by memory mapping the SPE Local Stores onto the memory of the PPE. But that still does not allow the SPEs to operate directly on main memory.
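
The local-store model described in the reply above can be illustrated with a toy Python sketch. This is not Cell SDK code: `ToyWorker`, `process`, and the tiny capacity are hypothetical, and plain list copies stand in for DMA transfers. The only thing it tries to show is that compute touches local data only, and every move to or from main memory is an explicit copy.

```python
class ToyWorker:
    """Toy model of a core that can only compute on its own small local store."""

    def __init__(self, capacity):
        self.capacity = capacity      # hypothetical local-store size, in elements
        self.local = []               # the only memory compute() may touch

    def dma_get(self, main_memory, start, count):
        # Explicit copy from main memory into the local store.
        if count > self.capacity:
            raise ValueError("transfer does not fit in local store")
        self.local = main_memory[start:start + count]

    def compute(self):
        # Work happens purely on local data (here: double every element).
        self.local = [2 * x for x in self.local]

    def dma_put(self, main_memory, start):
        # Explicit copy of the results back to main memory.
        main_memory[start:start + len(self.local)] = self.local


def process(main_memory, capacity=4):
    """Stream main memory through one worker, one local-store-sized chunk at a time."""
    worker = ToyWorker(capacity)
    for start in range(0, len(main_memory), capacity):
        count = min(capacity, len(main_memory) - start)
        worker.dma_get(main_memory, start, count)
        worker.compute()
        worker.dma_put(main_memory, start)
    return main_memory


if __name__ == "__main__":
    print(process(list(range(10))))
```

A unified model would let `compute()` index `main_memory` directly; here it never can, which is the distinction the reply is drawing.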

      That kind of simplicity makes Cell programming harder than, say, PowerPC programming, but much easier than other parallel programming, without losing its speed. Once there are some basic libraries for programming "common" new parallel tasks on the Cell, it won't be considered any harder than it was to program x86 "Protected Mode", Extended vs Expanded Memory, word alignment, etc.
      I think that in general, programming for the Cell is much more complicated than programming for an SMP, and even in some cases MPI.
      • There is very little storage on the SPE side, which must be shared by the code, the data and the stack.
      • The SPEs do not have memory protection on their Local Store, which means that smashing your data or code with the stack is not detected and handled automatically.
      • The SPEs have a pure vector ISA, which forces the programmer to vectorize the SPE code in order to obtain good performance. In fact having a pure vector ISA forces the compiler to emit lots of additional instructions (rotating and masking) for non vectorized code (compared to scalar ISAs), making the LS space limitations tougher.
      • The PPU, although multithreaded, is not as powerful as a traditional PPC (i.e. no OoO execution), which in practice means that you cannot spend too many cycles on scheduling work for the SPEs, otherwise your SPEs will be starved.
      Without the help of tools and libraries that hide those low level details from the programmer, programming the Cell can be quite hard.
      I think that programming any non embedded processor should be simple and for that reason libraries, compilers and other tools are going to be as important for the Cell processor as the compiler is for Itanium.
    12. Re:"Cell" by b0101101001010000 · · Score: 1

      Sounds like a very cool start-up. If nothing else this little article has led me to some interesting reading. I've done some architecture research myself. I invented a Cog Computer. It was really just a thought experiment that turned into a paper and a simulator but it was novel. In essence, the cog computer contains "cogs" which are abstractly while(1) loops where the programmer is not allowed to modify the PC value (no traditional conditionals, loops, etc...). Each cog communicates through ports to other cogs, has its own memory and program store. The whole architecture aims for complete determinism with synchronous behavior. It accomplishes this by having each cog's PC run in lock step. This way data can be produced for ports before it is needed...completely alleviating the need for any cog (task) to block. The WWII enigma machine was implemented using the cog simulator.
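
To make the lock-step idea in the comment above concrete, here is a toy Python sketch. It is not the commenter's simulator, and everything in it is an assumption made for illustration: `run_cogs`, `produce`, `consume`, the port named `"n"`, and the rule that a port write becomes visible on the next tick. Each cog's program is a fixed list of instructions that repeats forever, and no instruction can change the program counter.

```python
def run_cogs(programs, steps):
    """Run every cog's straight-line program in lock step for `steps` ticks.

    programs maps a cog name to a list of instructions. Each instruction is
    a function (memory, ports) -> dict of port writes (or None). The program
    counter is derived from the shared tick, so no cog can branch or stall.
    """
    memories = {name: {} for name in programs}   # private memory per cog
    ports = {}                                   # values readable this tick
    for tick in range(steps):
        pending = {}                             # writes land on the next tick
        for name, program in programs.items():
            instr = program[tick % len(program)]  # shared, lock-step PC
            pending.update(instr(memories[name], ports) or {})
        ports = {**ports, **pending}
    return ports, memories


def produce(mem, ports):
    # Hypothetical producer cog: count upward and publish the count.
    mem["count"] = mem.get("count", 0) + 1
    return {"n": mem["count"]}


def consume(mem, ports):
    # Hypothetical consumer cog: add whatever was published last tick.
    mem["total"] = mem.get("total", 0) + ports.get("n", 0)


if __name__ == "__main__":
    ports, memories = run_cogs({"producer": [produce], "consumer": [consume]}, 4)
    print(ports, memories)
```

Because every write is published one tick before it can be read, the consumer never waits on the producer, which is the no-blocking property the comment describes.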

    13. Re:"Cell" by Doc+Ruby · · Score: 1

      Well, the guy I worked with, who has run Cell seminars for NASA and IBM, exploring some Cell programming prototypes this year explained to me that the EIB is indeed coherent across chips. And Mercury makes its modules in pairs of Cells because they're already well integrated by EIB in that config. The 128-core chip is actually something of my conjecture, because the Cell guy I mentioned told me he'd seen IBM prototypes with 1024 SPEs - which, at 8 per core, would be 128 cores, though I suppose a single core could have 1024 SPEs. I don't think 1 PPE or 1 SPE pumping data for all of them (like on PS3's 1:6) would work well.

      The SPEs work in a unified memory model across all local stores and the PPE memory, by DMA across the EIB, even though they're physically segregated. That is really the basic magic of the Cell's speeds.

      Programming the SPEs is a different model than more familiar CPUs. Its architecture of small addressable onchip memory, per independent SPE, and superfast bus (with crossbars for some extra optimizations) make its model stream processing. The network is supposed to pump data really fast across the Cell, which is mainly concerned (like most DSP) with keeping its pipeline full. Its high cost for failed branch prediction (relative to ALU) means that it really is supposed to process multimedia, which is already tagged for processing, with little control logic per bit processed.

      The saving grace of SPE programming is supposed to be the auto scheduling of tasks to available SPEs. You can send a task to the pool with a mask saying any SPE, or multiple SPEs (MISD), etc. Which novelty makes it harder to program, even if just because there's no existing code or code generators that use that facility.

      I think the Cell really is a new architecture. The x86 model had a fair amount of advantages, but many disadvantages. PPC too, though fewer disadvantages (other than less existing SW and smaller dev community), because it was newer, but still limited in many ways, mostly large binaries because of RISC and low clock rate (despite RISC). Having taken a stab at it, it's more like an extremely fast LAN of dedicated engineering workstations.

      I think a dataflow "language", like a flowchart with topological/graph analytics and techniques, will help programmers best describe to the chip what it is to do with data. But maybe I'm just projecting my own feelings about the new SW paradigm that's been overdue since at least 1995 or so. I thought Java would work like that, and even saw some of its first IDEs demonstrating runtime flowcharts with dynamic properties, even inheritance. But for some reason, programmers hate flowcharts. Maybe the benefit of flowcharting will finally be the best way to use these seductively powerful chips. And maybe the PS3 community will find it in their style, without whatever baggage keeps regular programmers away from it.

      IBM is really betting a lot on the Cell. They are responsible for its Linux kernel and gcc. I hope we see IBM finally get in the lead on mainstream products again with an IDE for this chip that makes the "Pentium" era look like the way the PC made the Zilog era look.

      --
      make install -not war

    14. Re:"Cell" by Doc+Ruby · · Score: 1

      The genius behind Array Tech really wanted to make orbital (around the Earth, in microgravity and near vacuum) rod logic nanocomputers. Now he works at Xilinx. A lot of these mechanical paradigms will come back again as nanomachines are interfaced with optical networks fast enough to feed and care for them.

      The time we spent getting away from his wife to brainstorm his "pipe dreams" was by far the most educational time I've spent in my life. I expect it will guide me for decades, though it only lasted across about a year or two, whether it guides my inventions or just understanding the new ones that come across.

      --
      make install -not war

    15. Re:"Cell" by Caktus · · Score: 1

      Well, my point was that you were not clearly separating what happens inside a chip and what happens between chips. For example, the EIB bandwidth figures you were mentioning are for the aggregate for the whole chip. Chip to chip communication has much less bandwidth.

      About unified memory, PPU to PPU is coherent (plus load/store queues), but inside the chip the instructions are not running on the same memory space unless you do some tricks, and then it is still not a unified memory architecture. This is one of the main benefits of this architecture: SPUs operate on local data, without the effects of false sharing, ping pongs, etc.

  3. Taken? by bryan1945 · · Score: 3, Funny

    "Readers can win $500 in cash and write their names in the history of computer science by naming the new technology."

    Is "Clippy" taken?

    --
    Vote monkeys into Congress. They are cheaper and more trustworthy.
    1. Re:Taken? by trolltalk.com · · Score: 3, Funny

      Chipzilla would be good, except that's what everyone calls Intel. I guess we'll have to settle for "CowboyNealOnAChip". Or "theChipThatCanActuallyRunJavaProgramsWithinTheUniversesLifetime"

      What gets me is that there's a dropdown in the entry form to choose your country, as well as asking you for your state or province, but the rules state:

      WHO MAY ENTER: Open to all legal residents of the 50 United States (including the District of Columbia) who are 18 years or older in their respective US state at time of entry. Individuals employed by the University of Maryland, College Park. ("University") as faculty, exempt or non-exempt employees, and members of their immediate family or persons living in the same household, are not eligible to enter or win.

      I hope their chip design is better thought out than the contest form.

    2. Re:Taken? by crgrace · · Score: 1

      There actually was an innovative microprocessor called "Clipper". It was a nice architecture...

      http://en.wikipedia.org/wiki/Clipper_architecture

    3. Re:Taken? by Anonymous Coward · · Score: 0

      who are 18 years or older in their respective US state at time of entry.

      So wait, you can be 18 in one state but not another? Wow, I guess these time zones are more complicated than I thought.
    4. Re:Taken? by trolltalk.com · · Score: 1

      That's nothing - pity the person born on February 29th ... who only gets a birthday once every 4 years.

    5. Re:Taken? by bryan1945 · · Score: 1

      Sir, are you trying to take my humor away? :)

      (yeah, I knew about Clipper, but so much less fun than a dancing paper clip....)

      --
      Vote monkeys into Congress. They are cheaper and more trustworthy.
  4. WTF? by msauve · · Score: 4, Insightful

    We have microcomputers and supercomputers and nothing in between? Seems to be a bit of hyperbole involved here.

    --
    "National Security is the chief cause of national insecurity." - Celine's First Law
    1. Re:WTF? by gardyloo · · Score: 3, Funny

      We have microcomputers and supercomputers and nothing in between? Seems to be a bit of hyperbole involved here.

      Most. Insightful. Post. Ever. ;)
    2. Re:WTF? by Seumas · · Score: 0

      Agreed. They are obviously presenting this as a user/consumer chip for the desktop. Hence the comparison to its speed over a desktop. This might be of great interest to the NSA and other government agencies that do domestic spying and for companies like Google, but what is even the high-end gamer going to need a chip 100 times faster than today's machines for any time in the next decade? And of course, it will be about a decade before this is even affordable for a consumer, anyway.

      Maybe we can call it "blackout", since that's what these will probably do after sucking the power they need.

      And wouldn't it be appropriate to label this story as the press release that it is?

    3. Re:WTF? by EEPROMS · · Score: 1

      Micro->Mini->Supercomputer, Minis used to be small business systems with more than one cpu (not always) that interacts with a group of terminals or PC's.

    4. Re:WTF? by Kadin2048 · · Score: 5, Insightful

      but what is even the high-end gamer going to need a chip 100 times faster than today's machines for any time in the next decade?

      If you compare megahertz-cores (number of megahertz times number of cores at that speed), I suspect that there's been almost a 100x increase in the past 10 years, at least if you look from the low end a decade ago to the high end of personal computers now.

      I don't see why the next ten years would be any different. Operating systems will continue to get more bloated, software packages will get more feature-stuffed, games will continue to demand just slightly more than whatever's available to most people with expenses and regular lives, and most people will buy a new machine every few years based on whatever's on sale for $500 at Best Buy when their old one gets clogged with spyware.

      Sure, 100x might be a bit of a stretch (I'm not sure whether silicon will go that much further and I'm not totally convinced that parallelism is the solution for general-purpose computing), but if that kind of power was available, it would be put to use.

      Software expands to fill the resources made available to it, and then some. Always has and always will.

      --
      "Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
    5. Re:WTF? by jacksonj04 · · Score: 2, Insightful

      Build it, and they will come.

      Remember getting your first 1gb drive and going "Wow, I'm *never* gonna be able to fill this up"? A few years later people are throwing around files in excess of 1gb with no worries.

      --
      How many people can read hex if only you and dead people can read hex?
    6. Re:WTF? by LarsG · · Score: 1

      [Voice = Old Man]
      Forgotten all about minicomputers and mainframes, have you?

      Get off my LAwN kids!
      [/Voice]

      --
      If J.K.R wrote Windows: Puteulanus fenestra mortalis!
    7. Re:WTF? by eggnoglatte · · Score: 2, Insightful

      Once you have compute power, you'll find a way to use it. If you are a gamer, then this kind of performance gain will be used to roll the GPU back into the CPU, increase your screen resolution to 10-20 MPixels, increase the rendering quality, and improve the game AI.

      Once you are done with all that, you are going to be back asking for more.

    8. Re:WTF? by Jedi+Alec · · Score: 1

      but what is even the high-end gamer going to need a chip 100 times faster than today's machines for any time in the next decade?

      2 letters: AI

      --

      People replying to my sig annoy me. That's why I change it all the time.
    9. Re:WTF? by RingDev · · Score: 1

      Your comparison, based on mainstream availability:

      1997 Pentium 1: 1 core 200MHz - 200
      2007 Core 2 Duo: 2 core 2.6GHz - 5200

      A 26x improvement is quite impressive, but it's a good ways short of 100x. Even with the quad-core chips, you're still only looking at a 50x improvement.
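
The "megahertz-cores" comparison above is easy to reproduce. A short Python sketch, using only the two data points listed in the comment plus an assumed quad-core part at the same 2.6GHz clock (that clock for the quad is an assumption, not something the comment states):

```python
def megahertz_cores(cores, mhz):
    """Crude throughput proxy from the thread: cores times clock in MHz."""
    return cores * mhz


PENTIUM_1997 = megahertz_cores(1, 200)      # 200
CORE2_DUO_2007 = megahertz_cores(2, 2600)   # 5200
QUAD_2007 = megahertz_cores(4, 2600)        # assumed quad at the same clock

if __name__ == "__main__":
    print(CORE2_DUO_2007 / PENTIUM_1997)    # 26.0
    print(QUAD_2007 / PENTIUM_1997)         # 52.0
```

This metric ignores work done per clock cycle, so it understates the real gap between the two generations.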

      Now if you compare cycles per power consumption... Then I bet you would pull 100x, but I don't have those kinds of numbers.

      -Rick

      --
      "Most people in the U.S. wouldn't know they live in a tyrannical state if it walked up and grabbed their junk." - MyFirs
    10. Re:WTF? by booch · · Score: 1

      Damn. WTF is a much better name than I was going to suggest.

      --
      Software sucks. Open Source sucks less.
    11. Re:WTF? by Kadin2048 · · Score: 1

      True. But my point was more, "if there was some technology that could give us 100x performance in 10 years, it would be put to use."

      Technology has given us (by your numbers, which look fine to me) a 26-50x improvement in the last decade, and modern applications and games are still pushing that right along. If the engineers had managed 100x, then we'd all be using faster machines, and our applications would be that much more resource-intensive.

      So anyway, what I was really disputing was the GGP's claim that, in essence, there's some limit above which most people won't care about having a faster computer. I think that's false, and there are enough broken predictions ("640k should be good enough.." etc.) surrounding it to show that people will buy hardware based on applications and features, and applications will be designed for cutting edge hardware, and thus the upgrade treadmill will continue with no signs of stopping, regardless of how many times more powerful computers get for the same price.

      --
      "Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
    12. Re:WTF? by RingDev · · Score: 1

      I agree entirely. Software will always grow to require n+1 resources where n is the resources provided by current hardware.

      -Rick

      --
      "Most people in the U.S. wouldn't know they live in a tyrannical state if it walked up and grabbed their junk." - MyFirs
    13. Re:WTF? by Arterion · · Score: 1

      Well, how much work is each clock cycle doing? I'd think a Core 2 duo with one core disabled running at 200mhz would probably blow the Pentium 1 away. That's just speculation, though. But I do know the Core 2 gets more work done per cycle than previous generations. The same is true for AMD chips, too. So we haven't only made them faster in terms of clock, but we've made each clock cycle do more work.

      So maybe they are 100x "faster" in terms of how quickly they can compute x thing.

      --
      "That which does not kill us makes us stranger." -Trevor Goodchild
    14. Re:WTF? by Anonymous Coward · · Score: 0

      We also have apples and oranges and nothing in between.

    15. Re:WTF? by Anonymous Coward · · Score: 0

      Computer.

    16. Re:WTF? by Seumas · · Score: 1

      One hundred times in the last ten years is an enormous stretch. And note that I said "in the next decade". Even if the computing needs of a high-end gamer increase at double or triple the current rate over the next ten years, we won't need this kind of speed for more than a decade from now.

      I would suggest that the statement that software expands to fill the resources made available to it is largely backwards. Or at least should be clarified: The demands of new software that consumers want to use drives the demand for production and availability of more powerful hardware.

      I think you will find that it will be far more than ten years before the gaming and mainstream markets need a "500ghz CPU" and that no manufacturer will release anything significantly faster than today's performance benchmarks. Even if they could provide a desktop with a "500ghz CPU" type of experience today, they would not. The money isn't in the end of maximizing the total product potential. It's in exploiting the profits to be made from each annual incremental addition to the product line. Just enough to show up the competition, but not enough to blow them away by a factor of five, ten or a hundred.

    17. Re:WTF? by Seumas · · Score: 1

      That statement would seem to contradict itself. Software drives the demand for more computing power. Not the other way around. There will not be software that demands a 500ghz system for a very long time. I'm not a game developer, but I sincerely doubt you could find any developer who could make use of and peg-out a full 500ghz on the end-user client for a game right now. And of course, that's presuming that everything else (memory, FSB, drives, etc.) has evolved to avoid bottlenecking the massive system.

      Oh, and if you increase the screen resolution that much, you're going to be talking about a $20,000 monitor. Not something anyone but the most insanely rich gamer will buy -- even beyond the hardcore and well-funded gamer. In order for such speed to be utilized and useful, developers have to have a current or near need for it and the rest of the software and hardware has to improve significantly to keep up with it.

      Until this thread here, I have never heard the suggestion that hardware drives the need for bigger software. Ever.

    18. Re:WTF? by eggnoglatte · · Score: 1

      Hardware doesn't "drive" software, but it enables it. Trust me, there are plenty of things people would like to do, except they are too expensive right now. If you take a game engine and reimplement it to use full ray-traced global illumination rather than DirectX, you'll easily absorb a factor of 20-100 increase in compute power.

      Oh, and if you increase the screen resolution that much, you're going to be talking about a $20,000 monitor.

      Actually, I paid around $10k for a 9 MPixel IBM Big Bertha display in 2000. And that cost would come down quite a bit once economies of scale kick in. It is not fundamentally more expensive to manufacture a 24'' diagonal display with 300DPI resolution than it is to manufacture a 24'' 75DPI display, since most of the manufacturing cost depends on the screen area, not the pixel count. The reason why high resolution displays are more expensive right now is that there is no market, which in turn is because you pretty much need multiple graphics cards to serve that display with any kind of decent refresh rate. As a result, the only customers for these displays are research labs and other specialty users. Once you have the compute power to do 3D graphics with a cheapish GPU on such displays, demand will go up drastically, and prices will come down.
    19. Re:WTF? by RingDev · · Score: 1

      Well, the Hz rating is the speed of the clock. So far as I know there are still only 16 interrupts (albeit much less visible to us users). The Core 2 Duo is still only a 32-bit processor, same as the Pentium 1 chip set. So each clock cycle you are still only able to push 32 bits of instructions to the CPU. So if you disabled the second core and dropped the clock to 200 MHz, you would still have the same number of floating point operations per second, although the memory interface has been greatly improved, so IF the old 200 MHz chipset was being bottlenecked by memory, you could see an improvement there. All this being of course completely hypothetical, as I can't imagine that dropping a board's clock speed that much would be stable, if it is even possible.

      -Rick

      --
      "Most people in the U.S. wouldn't know they live in a tyrannical state if it walked up and grabbed their junk." - MyFirs
    20. Re:WTF? by Anonymous Coward · · Score: 0

      Nah. It's just like how we have "pro-life" and "pro-choice" but not "anti-life" or "anti-choice".

      Supercomputers don't want to be called "megacomputers", and microcomputers don't want to be called "lame-o-puters".

      But that's what they are, right?

  5. The Cowardly Lion says.......... by Anonymous Coward · · Score: 0

    Looks like a cluster on a single board. The cleaning analogy is kind of stupid. If I had 100 people cleaning my house at the same time, they wouldn't get shit done. New twist on old technology.

    I'm not sleeping, I passed out from holding my breath.

    Signed,

    The Cowardly Lion

    1. Re:The Cowardly Lion says.......... by Meostro · · Score: 1

      The cleaning analogy is perfectly apt!

      If 100 people cleaned your house, they "wouldn't get shit done".

      If 100 people cleaned Prof. Vishkin's house, they would be finished in about 3 minutes.

      How this is better than Intel's 80-core processor remains to be seen. This "technology" looks like it's an overhyped version of GPGPU or PhysX.

    2. Re:The Cowardly Lion says.......... by Anonymous Coward · · Score: 0

      So how does this new chip rate? Between Snakes on-a-Plane, and Lasers on-a-Shark?

  6. My Name by the+eric+conspiracy · · Score: 5, Funny

    'Space Heater'

    1. Re:My Name by Ice+Wewe · · Score: 1

      Nah, that was the nickname for the Pentium 4 chip. I think we should hail the new, more energy efficient chips, besides, they can't exactly heat that much space anymore. How about a term more fitting to the amount of heat they put out, 'Hobo Heaters'? Then they'll stop begging for money, and start begging for large data files to process.

    2. Re:My Name by the+eric+conspiracy · · Score: 1, Interesting

      Actually power consumption per instruction has remained pretty constant over the years if you exclude the Pentium 4. The Yonah uses about the same amount of power per instruction as the Pentium. So if you are running 100 times more instructions per second, well you will be using 100 times more energy.

    3. Re:My Name by Just+Some+Guy · · Score: 1

      Actually power consumption per instruction has remained pretty constant over the years if you exclude the Pentium 4.

      Um, not even close. A MC68000 from 1979 drew 1.35 watts and yielded about 1 MIPS (.74MIPS/W). An Intel Core 2 Extreme QX6700 (shoot that marketer!) dissipates about 110W at 57063 MIPS (518MIPS/W).

      The Core 2 is about 700 times more efficient than the 68K. You could probably argue some of those numbers a few percent either way, but that's not going to explain away the nearly three orders of magnitude of improvement.
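The arithmetic behind that comparison is easy to check. A minimal sketch in Python, using only the wattage and MIPS figures quoted in the comment (they are taken as given here, not independently verified; the function and variable names are made up for illustration):

```python
# Figures as quoted in the comment above; treated as assumptions, not verified.
mc68000_watts = 1.35
mc68000_mips = 1.0
qx6700_watts = 110.0
qx6700_mips = 57063.0


def mips_per_watt(mips, watts):
    """Throughput per unit of power; higher means more efficient."""
    return mips / watts


old_eff = mips_per_watt(mc68000_mips, mc68000_watts)
new_eff = mips_per_watt(qx6700_mips, qx6700_watts)
ratio = new_eff / old_eff

print(f"MC68000: {old_eff:.2f} MIPS/W")
print(f"QX6700:  {new_eff:.1f} MIPS/W")
print(f"ratio:   {ratio:.0f}x")
```

With those inputs the ratio comes out at roughly 700, which is where the "nearly three orders of magnitude" comes from.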

      --
      Dewey, what part of this looks like authorities should be involved?
    4. Re:My Name by the+eric+conspiracy · · Score: 1

      Current implementations (i.e. Freescale Coldfire) of the MC68000 draw more like 1 mW per MIPS.

    5. Re:My Name by Just+Some+Guy · · Score: 1

      Current implementations (i.e. Freescale Coldfire) of the MC68000 draw more like 1 mW per MIPS.

      So, the modern Coldfire is about 1400 times more efficient than its 68K predecessor. That would seem to strengthen my point that power consumption per instruction is much lower now than in bygone years (which is the opposite of your earlier post).

      --
      Dewey, what part of this looks like authorities should be involved?
  7. Name by christurkel · · Score: 1

    Future Slashdotting in the Waiting (FSW).

    --

    CDE open sourced! https://sourceforge.net/projects/cdesktopenv/
  8. There's nothing here by IlliniECE · · Score: 2, Insightful

    I RTFA... It handwaves so much about parallel computing that it seems they haven't discovered anything. All I see is "clock frequency can't increase, so we're going parallel".... Surely, this can't be the extent of their research. The article claims it's 'easy to program', but there are zero specifics about why that would be the case. Can anyone tell me what they've done here (if anything)?

    1. Re:There's nothing here by Holi · · Score: 2, Interesting

      Well, you should learn to follow links.
      It was quite easy from the article to find more information about the project.

      --
      Sorry, teleporters just kill you and then make a copy. A perfect, soul-less copy.
    2. Re:There's nothing here by Anonymous Coward · · Score: 0

      haha, well I don't know about their hardware but the programming model isn't really anything new, and it certainly isn't parallel programming made easy. there are other parallel C extensions like UPC and Cilk that work similarly. this sort of thing is very convenient for data-parallel applications, but an elegant solution to the critical section problem it is not.

    3. Re:There's nothing here by IlliniECE · · Score: 1, Insightful

      And people who write articles should learn to write them more thoroughly. If the article doesn't look promising, I'm not going to spider across the web collecting as much as I can on it.

    4. Re:There's nothing here by James+McP · · Score: 4, Informative

      Here's the deal.

      Up 'til now, Parallel Random Access Machine (PRAM) computing has been a theory of parallel processing that existed only as a thought model. It hadn't been built. Some people had written programs to emulate a PRAM computer but they were not complete versions.

      It could work at a snail's pace and still be a technological accomplishment as it is the very first, complete, working, hardware PRAM computer. It's on par with the Z3, Colossus and Eniac, the first programmable computers (German, English, American, in historical order).

      Fortunately, they made the algorithms work well, or at least, if the press release is to be believed, work so that 64 75 MHz computers could produce 100x the performance of a current desktop on at least one particular function. Which is pretty impressive in first-time hardware even if it turns out to be an obscurely used math function known only to about a dozen coders.

      --
      I've been on slashdot so long I'm starting to get out of touch with the cool stuff if it ain't on slashdot.
    5. Re:There's nothing here by raftpeople · · Score: 1

      I read most of the article, until it started repeating itself and I still hadn't read anything new. I see in some other comments references to PRAM so I'll check that out, but they sure didn't do themselves any favors with that article.

    6. Re:There's nothing here by timalewis · · Score: 1

      But the PRAM model only requires that the parallel memory accesses occur at constant time - that constant could still be huge and it would run PRAM programs.

      If they have got those times down to something useful that is clearly a step forward, but why then are they claiming superlinear speedup (64 CPUs performing as 100) and "desktop applications"?

    7. Re:There's nothing here by James+McP · · Score: 1

      But the PRAM model only requires that the parallel memory accesses occur at constant time - that constant could still be huge and it would run PRAM programs.

      Right, which is why I said that it would be significant even if it wasn't fast. It's a prototype, first of its kind. It could run like a dog and still be "news for geeks." Kind of like the first quantum computer is/will be even if it doesn't really go faster.

      If they have got those times down to something useful that is clearly a step forward, but why then are they claiming superlinear speedup (64 CPUs performing as 100) and "desktop applications"?

      I'm honestly not sure what metric they are using. I'd imagine they could run custom software and I'd be surprised if they didn't have a PRAM algorithm interpreter, so they probably ran a suite of functions that followed some hypothetical "desktop software" equivalence to get some performance indexing.

      I know they had the hardware available for public access (public meaning the attendees at the ACM International Conference on Supercomputing) so it probably isn't a load of hooey, but it could still be market speak for "does some stuff horrifically fast and could be on the market in 5 years." I'd really expect this to show up as a coprocessor or series of subunits on a traditional CPU at first. I figure it would be like the GPU-based protein folding software or the physics processor, taking tasks flagged as "PRAM-friendly" and throwing in some overdrive. Ironically, gaming would get a huge boost from a good parallel processing CPU and API. Many aspects of games could be parallelized (each bot/unit gets its own process, for instance) although I really have no idea if a PRAM processor cluster would be better than using a multicore x86 CPU.

      --
      I've been on slashdot so long I'm starting to get out of touch with the cool stuff if it ain't on slashdot.
    8. Re:There's nothing here by Putmank4 · · Score: 1

      If you check out their papers, they really have solved the parallel programming problem. They've shown impressive speedup on such impossible-to-parallelize algorithms as Matrix Multiply and Randomized Quicksort....... oh wait, those are really easy to parallelize. Well, maybe I can still get my desktop supercomputer so long as I want to compute really, really big matrices on my desktop. Now the only trick is finding a way to encode Quake as a series of giant matrix multiplies....

      http://www.umiacs.umd.edu/users/vishkin/XMT/spaa07paper.pdf

      Oh, and feel free to breeze over the part about no floating point support. Everyone knows that the really hard programming problems involve integer-only matrices. :-P

      And just to throw one more fun fact in there, this thing lives in a PCI slot (not even PCI-x), so getting data in and out of this thing is going to take an obscene amount of time.

      I don't want to bash it too much, but saying that this thing is a desktop supercomputer capable of a 100x performance boost is pure BS. It's yet another example of a way to decelerate single-threaded applications while leaving the hard part of parallel programming and program partitioning up in the air. PRAM isn't really an answer to that. The only reason this is getting any publicity is the $500 naming game.
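To show why matrix multiply counts as "really easy to parallelize": every output row depends on one row of the left matrix and on the whole right matrix, and on nothing any other worker produces. A rough sketch in Python, with integer-only matrices to match the no-floating-point remark. The function names are invented for this example and have nothing to do with the XMT toolchain, and in CPython the threads only illustrate the decomposition, they will not give a real speedup because of the interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor


def row_times_matrix(row, b):
    """Compute one output row; needs only one row of A plus all of B."""
    cols = len(b[0])
    return [sum(row[k] * b[k][j] for k in range(len(b))) for j in range(cols)]


def parallel_matmul(a, b, workers=4):
    """Multiply integer matrices a (n x m) and b (m x p), one task per row of a."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so result rows line up with rows of a.
        return list(pool.map(lambda row: row_times_matrix(row, b), a))


a = [[1, 2], [3, 4], [5, 6]]
b = [[7, 8, 9], [10, 11, 12]]
c = parallel_matmul(a, b)
print(c)
```

No locks and no ordering constraints between tasks are needed, which is what makes this a weak benchmark for claims about general desktop code.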

  9. Limited Practical Applications (for now) by thesandbender · · Score: 1, Interesting

    Assuming this actually works as detailed and the fine print on the claim isn't too onerous, there are three practical problems:

    1. Many applications are limited by the speed of the user, not the computer. You can only type or click so fast.
    2. Hardware would have to catch up to drive this beast. This would max out all known memory and storage systems. Not to mention your internet connection.
    3. As has been mentioned time and again, until developers actually embrace multi-threading this will be relatively useless. Tests from various hardware sites have shown that going from the Core 2 Duo to the Core 2 Quad offers very little benefit except for a very small subset of users... who should probably be running workstations anyway (Video editing, 3D rendering, etc.)

    However, I have a ton of HD content on my MythTV box that I would like to turn this processor and h264 loose on :) Maybe by the time this is a viable commercial product it will have more practical uses. (Remembering LOGO on my TI-99/4A... we've come a long way baby)

    1. Re:Limited Practical Applications (for now) by p0tat03 · · Score: 4, Insightful

      While I agree there are certain leaps to be made before this can be a mass market item, I disagree fundamentally with point 1 that you make. You could have made the exact argument about the old DOS Lotus office suite way back, 15 years ago. Those things still word process, and a 386 33MHz is certainly no slouch - I never had to sit around waiting for the software to respond to me or finish some ridiculously long task.

      I'm sure you'd agree that these newfangled Pentiums and Core Duos are quite useful, even for the end user.

      Think about features like predictive and contextual actions. Desktop search? Search-as-you-type? There are many ways to improve the usability of computers that require more and more performance. Honestly, if we can invent faster computers, we will invent ways to put the power to use in a productive, tangible way.

    2. Re:Limited Practical Applications (for now) by Morty · · Score: 2, Informative


      3. As has been mentioned time and again, until developers actually embrace multi-threading this will be relatively useless. Tests from various hardware sites have shown that going from the Core 2 Duo to the Core 2 Quad offers very little benefit except for a very small subset of users... who should probably be running workstations anyway (Video editing, 3D rendering, etc.)


      RTFA. The article claims:


          "The 'software' challenge is: Can you manage all the different tasks and workers so that the job is completed in 3 minutes instead of 300?" Vishkin continued. "Our algorithms make that feasible for general-purpose computing tasks for the first time." ...
      To show how easy it is to program, Vishkin is also providing access to the prototype to students at Montgomery Blair High School in Montgomery County, Md.


      Parallel computing has been around for a while. One of the challenges of parallel computing has always been that it is inherently harder to code. These guys acknowledged this, but they say their prototype is "easy" to program. We'll see if they're right.

    3. Re:Limited Practical Applications (for now) by thesandbender · · Score: 4, Informative

      I'm going to make an assumption and say that you don't do a lot of system programming. Threaded applications depend... heavily... on synchronizing data access. You simply can't take a single threaded application and break it out across threads without having some context of how it's accessing its data and why. Imagine landing planes at an airport. It's a serial process... you just can't arbitrarily run it in parallel... "bad things" (tm) happen. The "algorithms" Mr. Vishkin is speaking of have no way of determining the context of code being executed, and trying to break it out is a disaster waiting to happen.

      There are applications where massive parallelism like this is fantastic... using my initial example... encoding video. Throw each frame off to one of the processors and you're processing 300 at a time (even there there are limitations because each frame requires information from the previous).

      But I stand by my statement.. anyone who says they can take a serial application and run it in parallel is full of sh*t and they know it. In certain, limited circumstances, yes... but in general. NO.
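The video case above can be sketched in a few lines. This is a toy model and not how any real codec works: each "frame" is a single integer, and a frame is stored as its difference from the previous one, which is the serial dependency mentioned. The usual way around that, assumed here, is to cut the stream into independent chunks that each start with a full keyframe. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def encode_chunk(frames):
    """Delta-encode one chunk: first frame is a keyframe, the rest are differences."""
    encoded = [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        encoded.append(cur - prev)  # serial dependency on the previous frame
    return encoded


def decode_chunk(encoded):
    """Invert encode_chunk by accumulating the differences."""
    frames = [encoded[0]]
    for delta in encoded[1:]:
        frames.append(frames[-1] + delta)
    return frames


def split(frames, size):
    """Cut the frame list into independent chunks of at most `size` frames."""
    return [frames[i:i + size] for i in range(0, len(frames), size)]


frames = [10, 12, 15, 15, 20, 18, 25, 30]  # toy "frames": one integer each
chunks = split(frames, 3)

# Chunks share no state, so they can be encoded concurrently.
with ThreadPoolExecutor(max_workers=4) as pool:
    encoded_chunks = list(pool.map(encode_chunk, chunks))

decoded = [f for chunk in encoded_chunks for f in decode_chunk(chunk)]
print(encoded_chunks)
print(decoded == frames)
```

The parallelism comes from a human deciding where the chunk boundaries go. Inside a chunk the work is still strictly one frame after another, which is the point being made about context.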

    4. Re:Limited Practical Applications (for now) by thesandbender · · Score: 1

      I agree... to a point... but I'm wondering where the limit is. You mentioned four possible applications. Let's be generous and say we broke that off to four threads for each task... sixteen threads. Let's be even more generous and say there were four more tasks you didn't consider. All told that's thirty-two threads... a tenth of the power we're talking about here. And... I'll go back to my second point. Currently, there are no memory or storage systems that are capable of feeding this. If it really is a 300x increase in processing power then Moore's law predicts it will be almost a decade before current approaches can actually support this.
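For what it's worth, the back-of-the-envelope numbers can be reproduced like this. The 300x figure is the commenter's own, and the doubling periods are assumptions, since "Moore's law" gets quoted with anything from 12 to 24 months:

```python
import math

# Thread count from the comment: four apps plus four more tasks, four threads each.
threads = 4 * 4 + 4 * 4
fraction_used = threads / 300  # share of a hypothetical 300-way machine

# Number of doublings needed for a 300x improvement.
doublings = math.log2(300)

# Years to catch up under several assumed doubling periods (in months).
years = {months: doublings * months / 12 for months in (12, 18, 24)}

print(f"threads: {threads}, fraction of 300: {fraction_used:.2f}")
print(f"doublings needed: {doublings:.2f}")
for months, y in years.items():
    print(f"doubling every {months} months: {y:.1f} years")
```

"Almost a decade" matches a doubling period of about a year. With the more commonly quoted 18 months it is closer to twelve years.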

    5. Re:Limited Practical Applications (for now) by Anonymous Coward · · Score: 0

      are you claiming that 64 cores ought to be enough for anyone?

    6. Re:Limited Practical Applications (for now) by eonlabs · · Score: 1

      You clearly overlook all the problems we brute force our way through.
      There are many uses for HUGE COMPUTATIONAL THROUGHPUT in the business world.
      This value extends far beyond grand challenge problems, and touches corporate databases, data analysis, and automation.

      For a fun and amusing example, how many weblogs do you think the RIAA would have to go through to actually be able to prove that one single illegal transaction occurred? Now what is that computer going to do that involves user input? I would imagine the compute time is greater in this case (it doesn't even need to be network IO bound if they're archived records).

      --
      I wouldn't consider the mad hatter mad. Just reality impaired. He sure can make a mean cup of tea.
    7. Re:Limited Practical Applications (for now) by eonlabs · · Score: 2, Interesting

      If everything in the chip is lining up so nicely, how about calling it

      THE SYZYGY

      no, I'm not making up the word. If you don't believe me, http://dictionary.reference.com/browse/syzygy

      --
      I wouldn't consider the mad hatter mad. Just reality impaired. He sure can make a mean cup of tea.
    8. Re:Limited Practical Applications (for now) by zymano · · Score: 1

      itanium

    9. Re:Limited Practical Applications (for now) by blincoln · · Score: 1

      All told that's thirty-two threads

      There are 41 processes and 512 threads in use on my system right now, and all I have open in terms of interactive applications are two Firefox windows and CDex.

      --
      "...always new atoms but always doing the same dance, remembering what the dance was yesterday." -Richard Feynman
    10. Re:Limited Practical Applications (for now) by Mikkeles · · Score: 1
      fTFA:

      "Suppose you hire one person to clean your home, and it takes five hours, or 300 minutes, for the person to perform each task, one after the other," Vishkin said. "That's analogous to the current serial processing method. Now imagine that you have 100 cleaning people who can work on your home at the same time! That's the parallel processing method.

      As in: the two window cleaners, the guy mopping the floor, and the wall-washer who all need one of the two buckets available and access to the (single) sink which is currently being monopolized by the person washing the dishes?

      OR

      If one woman can have a baby in nine months, why can't nine women have a baby in one month?
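Both objections are what Amdahl's law formalizes: if only a fraction of the job can be spread across workers, the serial remainder caps the speedup no matter how many workers show up. A small sketch (the fractions below are illustrative, not measurements of anything in the article):

```python
def amdahl_speedup(parallel_fraction, workers):
    """Speedup predicted by Amdahl's law for a given parallelizable fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# 100 cleaners, with varying amounts of the job that can actually be shared.
for p in (0.50, 0.90, 0.99, 1.00):
    print(f"parallel fraction {p:.2f}: {amdahl_speedup(p, 100):.1f}x with 100 workers")

# The nine-women case: nothing can be shared, so nine workers buy nothing.
print(f"baby: {amdahl_speedup(0.0, 9):.1f}x with 9 workers")
```

The press release's "3 minutes instead of 300" corresponds to the last row of the loop, where the whole job parallelizes perfectly. The bucket-and-sink scenario adds contention on top, which this simple formula does not model.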

      --
      Great minds think alike; fools seldom differ.
    11. Re:Limited Practical Applications (for now) by CarpetShark · · Score: 1

      why can't nine women have a baby in one month?


      Well, you know, if you let them all talk amongst themselves for long enough, they'll soon believe they can ;)
    12. Re:Limited Practical Applications (for now) by arthernan · · Score: 1

      Sure there are many applications that need to wait for user input, and as the statement implies there are many that don't need to.

      Max out all known memory and storage systems? Not to mention your internet connection? I think you are getting too creative in your arguments.

      The only good point in your post is the lack of programmer support today. And hasn't it occurred to you that part of this lack of support exists because most programmers don't have access to a supercomputer? I know I don't. The minute I can have a desktop with a superprocessor like that one, I'll try to get one. In my case I think I need it to cost $2000 or less to justify buying it.

      Nobody can stop progress, and lack of faster clock speeds IS a HUGE deal. So if you are a programmer you better look at what you need to learn to be proficient in parallel computer programming.

    13. Re:Limited Practical Applications (for now) by Ephemeriis · · Score: 1

      Many applications are limited by the speed of the user, not the computer. You can only type or click so fast.
      While this is certainly true, think of all the things you can do while you're waiting for the user to click or type. Realtime spellchecking is a fairly useful feature that is only possible because modern computers are fast enough to look up the words as you type them.

      And look at the amount of multi-tasking your average user does these days... Usually an email program open, a web browser, a word processor, maybe an mp3 player... I remember when simply running an mp3 player alone slowed my PC down. The reason we can multi-task today is because the hardware has grown enough to allow it.

      Hardware would have to catch up to drive this beast. This would max out all known memory and storage systems. Not to mention your internet connection.
      True, but hardware is constantly improving anyway. Broadband is, more or less, widely available. Gb networking is almost commonplace. Folks are stringing up fiber left and right. SATA and SAS are replacing their parallel predecessors. We've done DDR, and now DDR2. ISA, to PCI, to PCI Express. Hardware is always changing, improving, growing. Sure, there'll be bottlenecks somewhere...but that's nothing new.

      As has been mentioned time and again, until developers actually embrace multi-threading this will be relatively useless. Tests from various hardware sites have shown that going from the Core 2 Duo to the Core 2 Quad offers very little benefit except for a very small subset of users... who should probably be running workstations anyway (Video editing, 3D rendering, etc.)
      I'd say this really depends on how the cores are implemented and used. Sure, a word processor isn't going to speed up much from 2 cores to 4 or 8 or 16... But what if you're running a word processor, an mp3 player, some antivirus, VOIP softphone, IM client, and email? What if your OS is automatically defragmenting your files in the background, maybe compressing them too, or encrypting them? Single applications and tasks may not benefit much from tons of cores...but what if you break up your workload so that each application or task has, more or less, a core to itself?

      I can't say that having a big pile o' cores is actually going to make anything better. But claiming that lots of cores is useless based on current workloads is just silly...you need to match the workloads to the technology. Look at how different our workloads are today compared to 10 years ago. In another 10 years it may make perfect sense to have 100+ cores in a single PC.
      --
      "Work is the curse of the drinking classes." -Oscar Wilde
    14. Re:Limited Practical Applications (for now) by James+McP · · Score: 1
      But I stand by my statement.. anyone who says they can take a serial application and run it in parallel is full of sh*t and they know it. In certain, limited circumstances, yes... but in general. NO.

      Then you miss the point of the PRAM concept. It is designed to evaluate concurrency and to identify parallel-friendly routines. Part of this project is funded by the NSF and includes developing compiler modifications and an API.

      --
      I've been on slashdot so long I'm starting to get out of touch with the cool stuff if it ain't on slashdot.
  10. Confidence: Low by Lije+Baley · · Score: 5, Funny

    Vaporac. Vaporlon. Vaporium. Whatever...

    --
    Strange things are afoot at the Circle-K.
    1. Re:Confidence: Low by edwardpickman · · Score: 1

      This brings up a good point. Will Duke Nukem Forever require this chip? It's likely to be on the minimum specs for Windows 2012.

    2. Re:Confidence: Low by ScrewMaster · · Score: 1

      Only if you have the Smokum Mirrorum add-on.

      --
      The higher the technology, the sharper that two-edged sword.
    3. Re:Confidence: Low by Anonymous Coward · · Score: 0

      Good god man, is your sig a subtle reference to Spongebob Squarepants?

      ...If it's not, I am deeply ashamed of myself. Of course, if it is, I may bring shame on my entire family for recognizing it.

      U is for uranium! ...BOMB!

    4. Re:Confidence: Low by Refenestrator · · Score: 2, Funny

      Or you could add in a temperature joke and call it the Vaporizer.

    5. Re:Confidence: Low by Lije+Baley · · Score: 1

      All...Hail...Plankton.

      --
      Strange things are afoot at the Circle-K.
    6. Re:Confidence: Low by cli_rules! · · Score: 1

      Unobtanium!

    7. Re:Confidence: Low by Dunbal · · Score: 1

      It's likely to be on the minimum specs for Windows 2012.

      The header says it's 100 times faster than current desktops, so I doubt this chip will be powerful enough to run Windows 2012 anyway.

      --
      Seven puppies were harmed during the making of this post.
    8. Re:Confidence: Low by inKubus · · Score: 1

      Cirrus, Cumulonimbus, Stratus. Uh, clouds. Made of vapor.

      --
      Cool! Amazing Toys.
  11. Duh. by Anonymous Coward · · Score: 0

    Supercomputing 2.0. Now, I'd like that 500 bucks in twenties, please.

  12. Uhm, whatever it's always been called? by Cafe+Alpha · · Score: 1

    Hard to tell from some of those "papers" since they seem to be written for kindergarteners - or journalists. But with that much parallelism I'm guessing that these computers basically allow "dataflow" style programming, with a certain amount of automatic decomposition, similar to the way PC chips decompose assembly into a simpler language on-chip.

    1. Re:Uhm, whatever it's always been called? by JimXugle · · Score: 1

      Hard to tell from some of those "papers" since they seem to be written for kindergarteners - or journalists.


      wait... there's a difference?
      --
      -jX

      Don't you just love politics? It's like a comedy of errors.
  13. I don't know about you guys... by Ub3rT3Rr0R1St · · Score: 1

    But I want those $500. Maybe I could use it to buy a board with a chip that will actually provide some routine functionality on a short-term scale. Wouldn't that be the ultimate irony?

    1. Re:I don't know about you guys... by Dunbal · · Score: 2, Funny

      But I want those $500. Maybe I could use it to buy a board

      Don't lie. You'll actually spend it on 2 computer games, lots of mountain dew and some pizzas.

      --
      Seven puppies were harmed during the making of this post.
    2. Re:I don't know about you guys... by Alsee · · Score: 1

      Scratch the pizza and just double up on the Mountain Dew.

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
  14. I name it by Kohath · · Score: 3, Funny

    Bob

    1. Re:I name it by Anonymous Coward · · Score: 0

      Larry

    2. Re:I name it by flakeman2 · · Score: 1

      I name this here chip "Pittsburgh Nellie";

    3. Re:I name it by Anonymous Coward · · Score: 0

      I'd like to thank some guy named.. 'Earl'

  15. Contention Management Issues by MarkPNeyer · · Score: 1

    All the processors in the world won't do you any good if you can't write the software to harness them, and conventional lock-based techniques are really really easy to screw up. I'm really curious to see what those 'rich algorithmic' solutions they've got are.
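For anyone who hasn't been bitten yet, this is the kind of lock-based code in question. A minimal Python sketch with made-up names: eight threads bump a shared counter, and the lock is what keeps the read-modify-write from interleaving.

```python
import threading

counter = 0
lock = threading.Lock()


def add_many(n):
    """Increment the shared counter n times, holding the lock for each update."""
    global counter
    for _ in range(n):
        with lock:  # without this, concurrent updates may be lost
            counter += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

The easy-to-screw-up part is everything this example leaves out: forgetting the lock on one code path, taking two locks in different orders in different threads, or holding a lock across slow work. None of those mistakes fail reliably, which is why they are hard to find.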

    --

    My blog
  16. Overhyped by rivenmyst137 · · Score: 5, Insightful

    Oh, for god's sake. I don't understand why this is getting so much press. It was stupid when it went up on Digg, and it's stupid that it's showing up here. This isn't substantially different from any of the other parallel architecture and programming work that's been going on for the last two decades. Their benchmarks are against embarrassingly parallelizable algorithms like matrix multiplies and randomized quicksort, things that any half-intelligent lemur (with a math and cs class or two) could get to run quickly. The hard part is speeding up your average desktop application which, I guarantee you, is not spending the majority of its time doing matrix multiplies.

    On top of that, their "parallel extension of von Neumann" amounts to adding primitives to start and stop threads into the language. Again, any half-intelligent lemur (with a slightly different skill set from the first) could have done that. And I think a few actually have (at the risk of comparing language researchers to lemurs). It doesn't solve the underlying problem.

    Oh, and did we mention no floating point and the lack of any memory bandwidth to get data into and out of this thing?

    This is over-hyped research and shameless self-promotion, and for some weird reason the press seems to be buying it. Stop it.

    1. Re:Overhyped by Anonymous Coward · · Score: 0

      You make matrix multiplies and randomized quicksort sound like trivial implementations on parallel hardware. I promise you, however, it's not as simple as you make it sound.

      I've never, not even once, met a lemur that could do that. Ocelots: Yes, ocelots could do it, but not lemurs. Even ocelots would need some remedial linear algebra and algorithms tutoring.

    2. Re:Overhyped by phantomfive · · Score: 1

      This is over-hyped research and shameless self-promotion, and for some weird reason the press seems to be buying it

      Because it's a contest. Free publicity. Hooray!

      Their benchmarks are against embarrassingly parallelizable algorithms like matrix multiplies and randomized quicksort, things that any half-intelligent lemur (with a math and cs class or two) could get to run quickly

      Dang what kind of lemurs do they have where you're from? We must find them and make them our president! Oh wait, you say we already did?

      OK I admit it, that was low.

      --
      Qxe4
    3. Re:Overhyped by Doppler00 · · Score: 4, Informative

      Yeah this article is pretty weak. "Woohoo! Look we took a picture of a last generation FPGA development board and wrote some nifty programs for it that prove our pet project!" I think very little of things like this make it outside of academia. I'm not saying this research is unworthy, just not newsworthy.

      And "parallel extension of von Neumann" exists. It's called OpenMP and it still takes a skilled programmer to understand.

      Look at that board... it uses "SmartMedia" yeah... that means that:

      1. This is OLD research
      2. The board developers didn't have a clue
      3. A very old development board is being used.

    4. Re:Overhyped by uarch · · Score: 1

      After skimming through the whitepapers I have to agree with you.

      It reminds me a little of the dataflow architectures of the 70's. A quick google search will probably give you several reasons why it wasn't very effective in the real world. This design will suffer from many of the same problems.

      These are the types of white papers we used to tear apart for fun when I was in grad school. They boast all these breakthroughs that aren't very different from anything else that's done (not uncommon even when great work has been done) and they avoid any mention of (let alone solutions to) all the problems associated with their approach. The benchmarks they're using to gauge performance just make it even funnier.

    5. Re:Overhyped by uarch · · Score: 2, Funny

      Actually, the more I think about it they could have made a better whitepaper using this:

      http://pdos.csail.mit.edu/scigen/

    6. Re:Overhyped by kwikrick · · Score: 1

      The hard part is speeding up your average desktop application which, I guarantee you, is not spending the majority of its time doing matrix multiplies.

      Your average desktop application doesn't need speeding up. If it runs slow, it's because it's programmed inefficiently, using a ton of memory, swapping data in and out of core. Shoot the developer.

      The only place where speedup is needed is in scientific computing. Of course, scientists already use mass parallel machines (grid computing etc) and they are already designing their algorithms for parallel execution.

      And gamers of course want fast graphics, for which they have GPUs, which are specialized parallel hardware. And for small parallelizable calculations needed in desktop apps, there's MMX and similar extensions.

      So really, I don't see the need for this 'new' technology. Except that I, as a computer scientist and hobbyist would love to have a cheap parallel machine at home to play with.

      What we do need is better programmers.

      But wait, there's one thing: low frequency, high bandwidth processors are much more power-efficient than the current gigahertz toasters. Once we have better programmers, it might be a good idea to switch to mass-parallel designs, to save energy.

      --
      assignment != equality != identity
    7. Re:Overhyped by arthernan · · Score: 1

      Man, I can't believe the urge for bashing!!!

      There are a handful of ways to increase computing power, optical computing and parallel computing being the most likely as far as I know. And it may very well be that they could be combined to attain even more power.

      One day we might all be using superparallel computers.

      Using an old board might even have some merit. Maybe they realized that for their architecture, that old board is as good as any.

      Plus, and this is a huge plus, older boards are also cheaper boards. And if you are thinking of mass production you can save some serious money if you are able to make it work. And if they are thinking of making it for a desktop environment, they need to make it as cheap as they possibly can.

    8. Re:Overhyped by Anonymous Coward · · Score: 0

      >Look at that board... it uses "SmartMedia" yeah... that means that:
      >
      >1. This is OLD research
      >2. The board developers didn't have a clue
      >3. A very old development board is being used.

      It's probably storing the Xilinx FPGA configuration bitstream on that card. Why is this a bad thing? What would you suggest they use instead?

      http://www.xilinx.com/products/silicon_solutions/proms/system_ace/

    9. Re:Overhyped by SolexaJoe875 · · Score: 1

      This is a development board made by the DINI Group. It looks like it's the DN8000K10PCI, and it's typically used for logic emulation.
      Here is a web-link to the manufacturer's website:
      http://www.dinigroup.com/index.php?product=DN8000k10pci

    10. Re:Overhyped by Doppler00 · · Score: 1

      Actually, I later found the website for the development board and found that it was the xilinx 4. I'm just surprised they used SmartMedia on the board instead of CF or SD. Hardly anyone sells SmartMedia anymore and it has limited capacity.

  17. I don't know much about marketing... by NotQuiteReal · · Score: 1

    I think SOC would SUCK as a product name.

    --
    This issue is a bit more complicated than you think.
    1. Re:I don't know much about marketing... by normuser · · Score: 1

      I think SOC would SUCK as a product name.

      I agree. Since words are linked in my head by how they sound, SOC would just make me think of all the nasty clothes I have yet to wash.
      --
      09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
      XXX#######
  18. They should call it... by kobatan · · Score: 1

    kobatan.

    I wonder if they can get the domain cheaply?

    --
    "Pulling together is the aim of despotism and tyranny. Free men pull in all kinds of directions." -TP
  19. Analogy at work... by RuBLed · · Score: 1

    "Suppose you hire one person to clean your home, and it takes five hours, or 300 minutes, for the person to perform each task, one after the other," Vishkin said. "That's analogous to the current serial processing method. Now imagine that you have 100 cleaning people who can work on your home at the same time! That's the parallel processing method."


    Brilliant! Even my mother had not thought of such an idea.
    1. Re:Analogy at work... by Repton · · Score: 1

      "Suppose you hire one person to clean your home, and it takes five hours, or 300 minutes, for the person to perform each task, one after the other," Vishkin said. "That's analogous to the current serial processing method. Now imagine that you have 100 cleaning people who can work on your home at the same time! That's the parallel processing method."

      The kitchen cleaner will grab the bucket and the bathroom cleaner will grab the mop, and neither will be able to get any work done. The rest will be tripping over each other in the hallways, and spend half their time queueing for the toilets...

      --
      Repton.
      They say that only an experienced wizard can do the tengu shuffle.
  20. How about by jshriverWVU · · Score: 1

    "OMG I gotta have It (TM)" or Deep Silicon :)

  21. Human-guided autovectorization. by Ayanami+Rei · · Score: 3, Interesting

    You know, autovectorization looks good on paper. But for most tasks, it really doesn't net you any benefit unless you can separate all your work into non-overlapping chunks. You can't have any interdependencies in your working set (or you risk expensive, non-scalable locking), and if you're all pulling from a single data source to split up the analysis work you'll spend a lot of time in contention for the pipe to that resource.

    For example, it wouldn't make searching a database (scratch that, searching any data set) any faster unless the index was already pre-split among the processing units.

    In this architecture the processing units have the same bus to RAM and disk on the front and back ends and have to deal with contention.

    Your system is only as fast as the slowest serial part. Typically this is storage media, a network connection, or a memory crossbar. Processors really are fast enough for the non-embarrassingly parallel stuff. They are at the right ratio with respect to the other slower busses to do most general purpose work.

    If you want to do more than that then it's other things -- storage media, memory, I/O busses -- that need to be multiplied in density and number. Only then can we see higher throughput.

    Autovectorization is only good for things we already have offloading for anyway (TCP encryption, graphics, sound)... and for those general purpose cases like in Game AI where you might want a linear algebra boost NVidia has beaten these guys to the punch with the GP stream processing in the newest chips and the very flexible Cg language/environment.
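
    To put rough numbers on the "slowest serial part" point, here is a minimal Python sketch. The stage names and MB/s figures are made up purely for illustration, and `pipeline_throughput` is a hypothetical helper, not anything from the XMT work. End-to-end throughput is taken as the rate of the slowest stage, so multiplying the CPU rate by 64 changes nothing while the disk is the limit.

```python
def pipeline_throughput(stage_rates: dict[str, float]) -> tuple[str, float]:
    """Return the (name, rate) of the slowest stage; it caps the whole pipeline."""
    if not stage_rates:
        raise ValueError("need at least one stage")
    name = min(stage_rates, key=stage_rates.get)
    return name, stage_rates[name]


# Hypothetical sustained rates in MB/s, chosen only for illustration.
baseline = {"disk": 80.0, "memory_bus": 6000.0, "cpu": 2000.0}

# Multiply compute by 64 (e.g. 64 cores) while leaving the data path alone.
more_cores = dict(baseline, cpu=baseline["cpu"] * 64)

if __name__ == "__main__":
    for label, rates in (("baseline", baseline), ("64x cpu", more_cores)):
        stage, rate = pipeline_throughput(rates)
        print(f"{label}: limited by {stage} at {rate:.0f} MB/s")
```

    Raising the hypothetical disk figure instead is the only change that moves the result in this toy model, which is the argument for multiplying storage and I/O rather than cores.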

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
  22. MISCELLANEOUS CONDITIONS: by DrunkenTerror · · Score: 0, Offtopic

    All entries become the property of the University and will not be returned. By participating, entrants agree to abide by and be bound by these Official Rules and the decisions of the University, which shall be final and binding with respect to all issues relating to this Contest. It is your responsibility to ensure that you have complied with all of the conditions contained in the Official Rules. The University is not responsible for any lost, late, misdirected, stolen, illegible, incomplete entries, or for any computer, online, telephone or technical malfunctions that may occur. The University is not responsible for any incorrect or inaccurate information, whether caused by website users, any of the equipment or programming associated with or utilized in the Contest, or any technical or human error which may occur in the processing of submissions in the Contest. The University assumes no responsibility for any error, omission, interruption, deletion, defect, delay in operation or transmission, communications line failure, theft or destruction or unauthorized access to, or alteration of, entries. The University is not responsible for any problems, failures or technical malfunction of any telephone network or lines, computer online systems, servers, providers, computer equipment, software, email, players or browsers, on account of technical problems or traffic congestion on the Internet, at any website, or on account of any combination of the foregoing. The University is not responsible for any injury or damage to participants or to any computer related to or resulting from participating or downloading materials in this Contest. 
If, for any reason, the Contest is not capable of running as planned, including infection by computer virus, bugs, tampering, unauthorized intervention, fraud, technical failures, or any other causes beyond the control of Contest which corrupt or affect the administration, security, fairness, integrity or proper conduct of this Contest, the University reserves the right at its sole discretion to cancel, terminate, modify or suspend the Contest and select winners from among all eligible entries received prior to the cancellation. Persons found tampering with or abusing any aspect of this Contest, or whom the University believes to be causing malfunction, error, disruption or damage will be disqualified. CAUTION: ANY ATTEMPT BY AN ENTRANT OR ANY OTHER INDIVIDUAL TO DELIBERATELY DAMAGE ANY WEBSITE OR UNDERMINE THE LEGITIMATE OPERATION OF THE CONTEST MAY BE A VIOLATION OF CRIMINAL AND CIVIL LAWS. SHOULD SUCH AN ATTEMPT BE MADE, SPONSOR RESERVES THE RIGHT TO SEEK DAMAGES FROM ANY SUCH PERSON TO THE FULLEST EXTENT PERMITTED BY LAW. The University reserves the right to correct any typographical, printing, computer programming or operator errors.

  23. Parallel programming made easy ... by Anonymous Coward · · Score: 0

    By redefining it.

    Data parallel programming is a significant subset of parallel programming in general but it is relatively easy to get right to start with, so I don't see how XMT-C is such an advance.

  24. Non-US residents ineligible to enter by bh_doc · · Score: 2, Informative

    Second paragraph of the rules:

    THE FOLLOWING CONTEST IS INTENDED FOR PLAY IN THE UNITED STATES AND SHALL ONLY BE CONSTRUED AND EVALUATED ACCORDING TO UNITED STATES LAW. DO NOT ENTER THIS CONTEST IF YOU ARE NOT LOCATED IN THE UNITED STATES.

    Even though there is a country field in the form. WTF?

    They don't mention that on the form page, either. It peeves me just a little bit that they would do that, I mean, how many people actually read these conditions things, anyway? Can't say I'm surprised, though.

    1. Re:Non-US residents ineligible to enter by Anonymous Coward · · Score: 0

      I'm guessing that they don't want liability for all the stuff they disclaim against in other countries.

  25. A modest proposal by Anonymous Coward · · Score: 1, Funny

    Call it Grendel - it has no ARM

  26. hmm by hansoloaf · · Score: 1

    how about naming it Vizi?

    1. Re:hmm by Ecuador · · Score: 1

      If they run "Vizi" through their Greek department, they will find out it means "boob" in Greek.

      --
      Violence is the last refuge of the incompetent. Polar Scope Align for iOS
    2. Re:hmm by KDR_11k · · Score: 1

      Yeah but in computer science that means an immediate approval.

      --
      Justice is the sheep getting arrested while an impartial judge declares the vote void.
  27. $500??? by Anonymous Coward · · Score: 0

    I could get more than that for naming my neopet.

    How about....Glumphoof

    An algorithm came up with it.

  28. Where parallelisms break down by EmbeddedJanitor · · Score: 2
    Suppose you had 100 cleaners in your house. They'd all be tripping over each other and all unplugging each other's vacuum cleaners to plug in their own. And all their minivans would cause a traffic jam in your driveway.

    Pretty much the same with any multi-processor technology: shared resources like buses are the major limitation.

    --
    Engineering is the art of compromise.
    1. Re:Where parallelisms break down by rbanffy · · Score: 2, Interesting

      Sun had something with tiny radio interconnects between chips. This way, they could have thousands of "pins" on the chip and the only metal pins you would need would be power and ground. If I remember correctly, I had a server whose memory had to be upgraded about 8 (or 9) modules-with-lots-of-pins at a time, so wide buses are nothing new.

      Intel also had something about optical interconnects, which are also nice, since you can place your "connectors" anywhere in the chip and not just around the borders and, if you can aim properly, the receivers can be much smaller than the pads around a current chip (or, by properly spreading the signals, one could synchronize many receivers to a single source very efficiently).

      We may not be constrained by the number of pins a connector has for that much longer.

    2. Re:Where parallelisms break down by EmbeddedJanitor · · Score: 1
      IIRC, they used capacitive coupling, not radio.

      Still, the coupling mechanism is not the bottleneck. The true bottleneck is being able to access shared resources such as RAM. Without being able to do that, Amdahl's Law http://en.wikipedia.org/wiki/Amdahl's_law is a killer.
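
      For the curious, Amdahl's Law says that if a fraction p of the work can be spread over n workers, the best possible speedup is 1 / ((1 - p) + p / n). A minimal Python sketch follows; the function names are invented here, and 64 is used only because the article mentions 64 processors.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's Law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def speedup_limit(parallel_fraction: float) -> float:
    """Upper bound on speedup as the worker count goes to infinity."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    for p in (0.50, 0.90, 0.99):
        print(f"p={p:.2f}: 64 workers -> {amdahl_speedup(p, 64):.1f}x, "
              f"limit {speedup_limit(p):.0f}x")
```

      Even with 90% of the work parallelizable, the formula caps 64 workers at under 9x, and no number of workers gets past 10x.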

      --
      Engineering is the art of compromise.
  29. How About "Almost Fast Enough For Vista"? by NeverVotedBush · · Score: 1

    But I doubt that's worth $500...

  30. Skynet or Borg by detain · · Score: 1

    Skynet or Borg: both great, recognizable names referring to a massive supercomputer, or perhaps a massive cluster of nodes. Either way, both those names would pwn. Resistance is futile.

    --
    http://interserver.net/
  31. Here's your name: by Anonymous Coward · · Score: 0

    I dub thee: SKYNET

  32. WTF?-another history lesson? by Anonymous Coward · · Score: 0

    "Most. Insightful. Post. Ever. ;)"

    *smirk*

    For all you youngsters, there is minicomputer.

  33. I call it the iStove by ILuvRamen · · Score: 0
    okay first....

    The prototype developed by Uzi Vishkin and his Clark School colleagues uses a circuit board about the size of a license plate on which they have mounted 64 parallel processors
    Sell the rights to mac, make the processor board a remote device, and cook some eggs on that sucker. Seriously, they say software is the hardest challenge? How about keeping 64 processors on a license plate sized board cool.
    --
    Google's Super Secret Search Algorithm: SELECT @search_results FROM internet WHERE @search_results = 'good'
    1. Re:I call it the iStove by Dunbal · · Score: 1

      How about keeping 64 processors on a license plate sized board cool.

      Ahh but that's the brilliant part, see, if you have the processors on opposite sides of the board, then the heat just cancels itself out ... oh wait, nevermind!

      --
      Seven puppies were harmed during the making of this post.
  34. Transparent Parallelism? by gandracu · · Score: 1

    Transparel.

    1. Re:Transparent Parallelism? by zrq · · Score: 1
  35. Please vote on the new name by cashman73 · · Score: 3, Funny

    I will either nominate the name, "Giant Douche," or, "Turd Sandwich," depending on which one slashdotters vote for.

    1. Re:Please vote on the new name by LS · · Score: 1

      Definitely "Giant Douche"

      --
      There is a fine line between being a cultivated citizen and being someone else's crop. - A. J. Patrick Liszkie
  36. This is just an old FPGA development board by Doppler00 · · Score: 1

    http://www.dinigroup.com/index.php?product=DN8000k10pci
    There you go! It's just a Virtex-4 development board. Nothing special. I mean, if they had used this graphic http://www.dinigroup.com/DN9000k10PCI.php it would have been a little more impressive.

  37. Some possible names by YU+Nicks+NE+Way · · Score: 1

    "VaporWire"

    "Parallel Lies Processor"

    "iProcessor"

  38. Hand over the $500 right now by Enderandrew · · Score: 5, Funny

    iPerbole©

    --
    http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
    1. Re:Hand over the $500 right now by Anonymous Coward · · Score: 0

      iPerbole©

      F^cking Genius!!

      I wonder if you made that up yourself or stole it from somewhere else.

      Either way, I'm going to steal it and claim that I made it up.

    2. Re:Hand over the $500 right now by Anonymous Coward · · Score: 0

      The iphone seems to be ifizzling I want my non existent 50,000 in stock gains back :( I know, I know off topic why do you think I am a "coward?"

  39. I think I've got a name for it... by Whuffo · · Score: 1
    How about "Wishful Thinking"?

    They describe the same old massively parallel computing idea but gloss over the problems involved. This old chestnut keeps coming to the surface every few years but nobody ever seems to show any working hardware...

  40. What about iCPU? by trunks14 · · Score: 1

    What about iCPU? has some other company already done the 'i' prefix thing? i mean like iPod or something like that?

    Deep CPU
    MP (Moon MacroProcessor)
    MPU (Macrohard Processing Unit)

    1. Re:What about iCPU? by maxwell+demon · · Score: 1

      has some other company already done the 'i' prefix thing?

      Yes.
      --
      The Tao of math: The numbers you can count are not the real numbers.
  41. Transputer? by MadMidnightBomber · · Score: 3, Informative
    --
    "It doesn't cost enough, and it makes too much sense."
  42. Power by fr4nk · · Score: 1

    'capable of computing speeds up to 100 times faster than current desktops.'

    So, how many laptop miles are this? If it has more power than one laptop mile, they could name it 'Milestone Computer'!

  43. name? by Anonymous Coward · · Score: 0

    xyzzy
    that word no one can pronounce.
    from advent :)

  44. Oblig by Anonymous Coward · · Score: 0

    How about "Puter," as in, "What? Did your mother purchase for you a PUTER for Christmas?"

  45. Hmmm.... by bjackson1 · · Score: 1

    A supercomputer on a chip....so it should be named Altivec?

  46. i860? by Evil+Pete · · Score: 2, Interesting

    Anyone remember the hype of the i860? Great on paper, but not so great in reality. I really hope this works though, von Neumann architecture was always supposed to be a stop-gap (even vN said so I think).

    --
    Bitter and proud of it.
    1. Re:i860? by julesh · · Score: 2, Interesting

      Anyone remember the hype of the i860? Great on paper, but not so great in reality. I really hope this works though, von Neumann architecture was always supposed to be a stop-gap (even vN said so I think).

      As far as I can tell, there's no really significant departure from von Neumann architecture here. They have a processor capable of executing 64 concurrent threads, 'fork' and 'join' instructions, and a version of C that has been extended to be able to use them. I'm not sure I really see what's so revolutionary here -- I've been reading about prototypes of similar ideas to this since the late 90s.
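
      For anyone who hasn't seen the fork/join pattern, here is a minimal sketch in Python rather than their extended C (so this is not XMT-C, and `chunk_bounds` and `fork_join_sum` are names invented here). It forks one task per non-overlapping slice, then joins by combining the partial results. In CPython the GIL keeps these threads from running the arithmetic in parallel, so this only shows the shape of the pattern, not a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(length: int, parts: int) -> list[tuple[int, int]]:
    """Split range(length) into `parts` contiguous, non-overlapping slices."""
    step, extra = divmod(length, parts)
    bounds, start = [], 0
    for i in range(parts):
        end = start + step + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def fork_join_sum(values: list[int], workers: int = 64) -> int:
    """Fork one task per slice, then join by summing the partial results."""
    workers = max(1, min(workers, len(values) or 1))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # "fork": each task reads only its own slice, so no locking is needed.
        futures = [pool.submit(sum, values[lo:hi])
                   for lo, hi in chunk_bounds(len(values), workers)]
        # "join": wait for every task and combine the partial sums.
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    data = list(range(1_000_000))
    print(fork_join_sum(data) == sum(data))
```

      The default of 64 workers just mirrors the 64 threads mentioned above; the point is that the hard part is finding work that splits this cleanly, not spelling the fork and the join.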

    2. Re:i860? by abertoll · · Score: 1

      I tend to agree. I don't think there's anything conceptually or architecturally new here. If there is anything to this, it's an improved algorithm for parallel processing.

      --
      "he drew his sword Ringil that glittered like ice... and he wounded Morgoth with seven wounds..."
  47. FPGAs by CompMD · · Score: 2, Informative

    It appears to be a few FPGAs. With FPGAs, you can optimize the logic to represent algorithms for faster execution than on general purpose processors. Simply, you use more of the gates available on the chip. That appears to be what these guys are doing. It also appears that there is a single memory controller (I think that is what the QuickLogic chip is) and there is only one DRAM module installed on the board. It would be interesting if the board had a unified memory architecture. There is a separate Xilinx Spartan FPGA on the board that does who-knows-what, but I wouldn't be surprised if it was involved in communication with the processing chips. Of course, this is speculation, but it would seem logical for a board layout.

    Just my thoughts.

  48. Calculon! by Werrismys · · Score: 1

    The stupid web form always complained about illegal characters in a field without specifying which one.

    --
    'Once scientists, even the dim-witted social scientists, get muzzled, the Western Civilization is finished.' - oldhack
  49. Why can you patent NSF funded research??? by Anonymous Coward · · Score: 0

    Why is this guy able to get patents on research financed by NSF and DOD? These should be assigned to the US Government or simply become public domain.

  50. Forgot Transputers by CarpetShark · · Score: 1

    I think it goes Micro->Transputer->Mini->Supercomputer. Could be wrong.

    On the other hand, most of these technologies seem kind of obsolete now, as distinctions are falling away.

  51. And what I'd love to see by Sycraft-fu · · Score: 1

    Is how it benchmarks against, say, an nVidia Tesla (a GeForce 8800, with more, faster memory and no DVI connectors). I mean ok, you want to limit to just parallel kinds of benchmarks I can live with that, after all it is ok to design more specialized chips. However then let's see it go against a chip designed for that. Ya, an 8800 will eat shit on a calculation that's a single thread with a lot of branching. However you give it a task that can be highly parallelized and is straight through computation (like, say, 3D graphics hence the reason it is designed this way) and it flies like you can't believe. We are talking in the realm of 400-500 gigaflops (single precision) when it is crunching an ideal problem. That's your competition if you want to make a specialized parallel processor. As you noted, a desktop processor is a different animal, hence why a system that has an 8800 would still want a Core 2. What the Core 2 is good at, the 8800 is not.

    1. Re:And what I'd love to see by UtilityFog · · Score: 1

      Well, I don't have a Tesla, but I have a GeForce 8800 running CUDA, NVIDIA's general programming interface for it. Their timings from the slides are XMT 63.7 sec, Opteron 113.83 for a 2kx2k matrix mult. The 8800 does a 2kx2k mat mult in 0.511 sec.

      On the other hand, the 8800 is immensely painful to program because there isn't enough communication between processors; there's a 500-cycle latency to go to the on-board memory that is common to all processors. So you have to get really ingenious with your data formats and dependencies.

      The best way to look at the XMT research is to say: hey, these guys basically equalled the performance of an Opteron with three FPGAs at 75 MHz. If you can't do better, don't carp at them.
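
      To make the ratios explicit, a quick Python sketch using only the three timings quoted above (the `speedup` helper and the dictionary labels are just illustrative names):

```python
# Timings quoted above for a 2k x 2k matrix multiply, in seconds.
timings = {"XMT (FPGA prototype)": 63.7, "Opteron": 113.83, "GeForce 8800": 0.511}


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    opteron = timings["Opteron"]
    for name, seconds in timings.items():
        print(f"{name}: {speedup(opteron, seconds):.2f}x vs Opteron")
```

      On those figures the FPGA prototype is a bit under 2x the Opteron, and the 8800 is two orders of magnitude ahead of both, on a problem that suits it.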

  52. It's also retarded by Sycraft-fu · · Score: 2, Insightful

    Since of course that breaks down. Actually maybe it isn't so retarded since the same thing is true in many computing problems.

    For example if you take the cleaning situation sure, adding a second cleaner will nearly double the speed it gets cleaned at. Adding four will probably close to quadruple it. However, it starts to break down after a while. At first the gains just start slowing down, as there's more people they have to spend more time talking and dividing up who does what than actually working, as well as doing work others have done because of a miscommunication. Eventually you have so many people that you start actually slowing down with each person you add, because they are getting in each other's way and taking up too much time with non-work.

    That's fairly similar to what you get with a lot of problems in computation. You split the task in half, you can have 2 processors/cores/whatever execute it and nearly double your speed. However after a point you find that you can't split the task more, or that even if you can, it takes more time getting it all sync'd up than you gain from the multiple execution, or that contention in other parts of the system (like memory) holds things back.

    The concept of "two is better than one to 100 must be better than 10" doesn't hold up. There are almost always limits to how much you can divide up a task. Sometimes those limits are extremely high, but they are there. Unfortunately, for many tasks, the limits are pretty low.
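
    A toy model of that curve, as a hedged Python sketch: run time is a fixed serial part, plus a divisible part shared among the workers, plus a coordination cost that grows with every worker added. The 300 minutes come from the article's analogy; the 10-minute serial part and the 1 minute of coordination per cleaner are invented for illustration, as are the function names.

```python
def run_time(workers: int, serial: float, parallel: float, overhead: float) -> float:
    """Toy model: fixed serial part, divisible part, and per-worker coordination cost."""
    return serial + parallel / workers + overhead * workers


def best_worker_count(serial: float, parallel: float, overhead: float,
                      max_workers: int = 100) -> int:
    """Worker count in 1..max_workers that minimizes the modeled run time."""
    return min(range(1, max_workers + 1),
               key=lambda n: run_time(n, serial, parallel, overhead))


if __name__ == "__main__":
    # Hypothetical numbers: 300 minutes of cleaning, 10 of which cannot be
    # shared, plus 1 minute of coordination per cleaner.
    serial, parallel, overhead = 10.0, 290.0, 1.0
    for n in (1, 2, 4, 17, 100):
        print(n, round(run_time(n, serial, parallel, overhead), 1))
    print("best:", best_worker_count(serial, parallel, overhead))
```

    With these made-up numbers the modeled time bottoms out at 17 cleaners and climbs again after that, so 100 cleaners end up slower than 17.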

  53. No market for that box by dmlpat · · Score: 1

    There is nothing fundamentally new or effective here. First, there is no indication of performance, but like many other similar solutions it can probably only solve DGEMM, FFT, and crypto effectively; sorry, for the thousands of other algorithms there is no future for you. The number one problem you need to solve to claim to be 100 times faster is to increase bandwidth by 100, and Amdahl's Law works against you. My bottleneck is the network or the disk and, if I could look at the memory bus, the memory. So do you deliver 100 GBytes of network? 5 GB/s disks? 600 GB/s memory DIMMs? In fact you need all of that, since I stress all of that on my home PC. My 2 CPUs are idle 99% of the time, and when not idle they consume only 10% of the cycles on average. Now, maybe you want me to buy this machine and give the cycles to SETI@home.

    A real alternative is that you made a major breakthrough in compiler technology. But without massive data flow, I don't see what new problem you can solve, assuming you made a breakthrough at least on the 3 algorithms I cited above. So next time, instead of photos of prototypes that prove nothing, please offer diagrams of the architecture and details on bandwidth paths, latency, ... and explain where the innovation is.

    For people interested in massive multicores, look at what Intel is doing; it is more serious, and they are trying to some extent to solve the memory bandwidth issue. This is a horribly hard problem, and it can ONLY be solved with massive investment and research that only a very few large companies can afford. All the others will disappear, because there is no market for a processor that costs more than $10 when we plan to have 32 on a motherboard. Who can develop that in 32nm?

    Again, this is NOT just hardware; it is even more a software problem. How many people know how to program 2 Woodcrests effectively? How many people will be capable of extracting more than 1% from those future massive multicores? Does Windows use multicores effectively? And why not? So what is the market for massive multicores?

    There is a chance that Moore's Law becomes ineffective, not because it cannot deliver on the promise of doubling the number of transistors every 2 years, but because it is useless to do so.

  54. My Name by Anonymous Coward · · Score: 0

    I Can't Believe It's Not A Beowulf Cluster

  55. name by hernyo · · Score: 1

    Of course, next year, it won't be so super. But there will be a new one that's really super. Then name it simply Computer-On-a-Chip: COC. For easy pronunciation, make it COCK.

    ...two geeks are chatting in year 2020. "How big is your cock?" "Quarter inch."
  56. Flew right over your head... by msauve · · Score: 1

    DEC and DG are no longer. Today's marketeers allow no middleground - it's either a microcomputer (implying small), or a supercomputer (implying powerful). The terms used to have some meaning - now it's just marketing fluff.

    --
    "National Security is the chief cause of national insecurity." - Celine's First Law
  57. suggestion by tabrown · · Score: 1

    How about Sybil, based on the chick with 16 personalities. Personality one: I am the world's fastest parallel processing supercomputer. Personality two: I am a toaster. Personality three: I am a yo-yo. Personality four: syntax error. Personality five: yes, but do I run Linux? etc.......

  58. Easy. by McGurk · · Score: 0

    The Tits. "You get that new supercomputer on a chip?" "Yeah, it's The Tits." "Awesome."

    --
    You're doing it wrong--http://youredoingitwrong.mee.nu
  59. But... by Tolkien · · Score: 1

    Oh fer...

    An article about a supercomputer on a chip and nobody memes the hell out of this one? The article doesn't even answer what I thought must be the most OBVIOUS question!

    "But does it run linux?"

    Yeesh!

  60. Sick of "breakthroughs" by Anonymous Coward · · Score: 0

    Let's program an FPGA and write a cheesy 'spawn' scatter/join function to allow "desktop applications" to benefit from parallel processing.

    And then let's tell the world we actually accomplished something that's in any way useful or new by doing this...

    Someone's ego is in serious need of a parallel deflating algorithm.

  61. Terpsichore by dschuetz · · Score: 1
    I've just submitted "Terpsichore" as a name (so nobody else do it, dammit!).
    • Terpsichore is a Muse -- helping inspire the creative process of mere mortals.
    • Muse of dancing, delight, and chorus [chorus might imply parallel voices, though really in this context it's a dramatic chorus, but I'm hoping they overlook that :) ]
    • The last syllable is pronounced "core"
    • It starts with "Terp" (the short name of the Maryland mascot -- Terrapin)


    Plus, as a bonus, it connects to Monty Python via the Cheese Sketch.

    Wish me luck. :)
    1. Re:Terpsichore by BJZQ8 · · Score: 1

      Terpsichore...that's the chick from Xanadu. Aight, I put on my roller skates and wizard hat...

  62. not single chip by Anonymous Coward · · Score: 0

    How the hell is this single chip? TFA says the prototype has 64 processors. Speculating that the prototype may eventually be produced on a single piece of silicon is commonplace. People have speculated that just about everything would end up on a single chip by now.

  63. The second reason "the singularity" won't happen by Colin+Smith · · Score: 1

    Software expands to fill the resources made available to it, and then some. Always has and always will. People will always be able to make the text editor that little bit bigger... Ask the Emacs guys how.

    --
    Deleted
  64. Worst Analogy Ever by FuzzyDaddy · · Score: 2, Insightful
    From TFA:

    "Suppose you hire one person to clean your home, and it takes five hours, or 300 minutes, for the person to perform each task, one after the other," Vishkin said. "That's analogous to the current serial processing method. Now imagine that you have 100 cleaning people who can work on your home at the same time! That's the parallel processing method."

    100 people trying to clean my house at the same time would be slower than 1, because no one would be able to move or breathe. Which is exactly what makes parallel computing hard.

    --
    It's not wasting time, I'm educating myself.
    1. Re:Worst Analogy Ever by Anonymous Coward · · Score: 0

      Depends on how big the house (or problem size) is. 100 people wouldn't be a problem at my house, but it would surely take more than 300 minutes for 1 person to do it all.

  65. Isn't the solution to reverse the concept? by Colin+Smith · · Score: 1

    At the moment, our software is mostly designed as a script. 1, 2, 3 we push the instructions onto the CPU. As you say, sequential.

    But we already have a different way of thinking about getting information: client/server. With the Internet, millions of people get the information they need by asking a server somewhere. Instead of applications running sequentially on a CPU, shouldn't they be parallel by default, with little bits of client code querying and updating little bits of server code?

    --
    Deleted
    1. Re:Isn't the solution to reverse the concept? by booch · · Score: 2, Insightful

      Wow. You got halfway with your idea, but didn't make it all the way.

      Right now, with most programming languages, we tell the computer how to compute the result. We generally do this with a linear list of steps for the computer to take. But that's not the only way to write a program. Another way is to tell the computer what we want it to compute, and let it figure out the best way to do that. This sounds pretty crazy at first, but it's actually been done. Take a look at the Prolog and Haskell programming languages. They're much more declarative than imperative. Because they describe the result rather than an ordered list of steps, there is more room to parallelize things than in the languages we're used to using.

      --
      Software sucks. Open Source sucks less.
  66. Uhuh. by julesh · · Score: 1
    From their PDF introduction:

    The number of cores is expected to double every 18 months for the next decade and
    reach 256 in a decade.

    Right. Not sure I'm with you there. 256 cores is a lot, and I doubt that the infrastructure of (e.g.) memory bandwidth and power supply would be able to keep up with such demands.
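
    A quick check of the quoted projection, as a minimal sketch: how long does it take to reach 256 cores at one doubling every 18 months? The starting core counts are assumptions for illustration, since the quoted excerpt does not state one, and the function name is made up.

```python
import math


def years_to_reach(target_cores: int, start_cores: int,
                   months_per_doubling: float = 18.0) -> float:
    """Years needed to grow from start_cores to target_cores by repeated doubling."""
    doublings = math.log2(target_cores / start_cores)
    return doublings * months_per_doubling / 12.0


if __name__ == "__main__":
    # Assumed starting points: 1, 2, or 4 cores per chip.
    for start in (1, 2, 4):
        print(f"from {start} core(s): {years_to_reach(256, start):.1f} years")
```

    Starting from 2 or 4 cores, 256 lands at roughly the decade mark, so the two halves of the quoted sentence are at least consistent with each other. Whether memory bandwidth and power keep up is a separate question.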

    Clock rates of commodity processors have stopped improving since mid-2003. This
    followed several decades in which clock rates have doubled every 18 months.

    Right. You know, I'm sure the fastest desktop processor you could buy in June 2003 had a clock speed of about 3GHz. Clearly I'm imagining the availability of 4GHz chips on the market today. Yeah, sure, it's slowed down. It hasn't stopped, though. I'm also clearly imagining that Core2 chips achieve more calculations per core-second than Pentium 4 chips running at significantly faster clock speeds.

    Basically, the entire thesis here is that improvement in individual processor core performance has been halted for the last 4 years. This blatantly is not the case.

    [Vishkin] opined that a mathematical
    model, called PRAM for Parallel Random-Access Model (or Machine), would be a
    proper framework. The PRAM is a simple extension to the standard RAM (for random access machine)
    model used to teach serial algorithms in every standard Computer Science curriculum.

    Can somebody help me out here? I've never heard of this "random access machine" model. Are we talking about a von Neumann machine, or something else?

    In his PhD thesis, Vishkin proposed a simple "work-depth"
    methodology for designing parallel algorithms: Formulate your algorithm in the form of
    rounds, where each round can include any number of operations that could all be
    performed concurrently had there been enough hardware to execute them. For
    performance, the design objective should be to minimize two parameters: (i) work - the
    total number of operations over all rounds, and (ii) depth - the number of rounds. A
    simple example for such a parallel algorithm follows.

    Well, duh. Thanks for enlightening us. To improve performance, minimize the number of steps that must be taken sequentially, and perform as many as possible in parallel, but don't make too many parallel ones either. Clearly revolutionary thinking, there.
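
    For anyone who wants to see the two parameters concretely, here is a minimal sketch of a summation organized in rounds. It is not the example from the PDF (which isn't reproduced here); it is a standard pairwise tree sum, simulated serially, and the function name is made up.

```python
from typing import Iterable, Tuple


def tree_sum(values: Iterable[int]) -> Tuple[int, int, int]:
    """Sum values in rounds of pairwise additions.

    Returns (total, work, depth), where work is the number of additions
    over all rounds and depth is the number of rounds.
    """
    level = list(values)
    if not level:
        return 0, 0, 0
    work = 0
    depth = 0
    while len(level) > 1:
        next_level = []
        # Every addition in this round is independent of the others, so with
        # enough hardware they could all execute concurrently.
        for i in range(0, len(level) - 1, 2):
            next_level.append(level[i] + level[i + 1])
            work += 1
        if len(level) % 2:
            next_level.append(level[-1])  # odd element carries over unchanged
        level = next_level
        depth += 1
    return level[0], work, depth


if __name__ == "__main__":
    print(tree_sum(range(1, 9)))
```

    For 8 inputs this does 7 additions, the same work as a serial loop, but in 3 rounds instead of 7 sequential steps.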

    PRAM algorithms often allow an "arbitrary concurrent write" resolution, where several
    concurrent attempts to write into the same memory location result in one of these writes,
    but we don't know in advance which one. Note also that such "semantics" cannot even be
    expressed in any of the common serial programming languages.

    ...

    Err, OK. Me, I thought starting several threads and making them all write to the same location would result in an unpredictable choice of the values written being stored in that location in standard languages. But then I don't have a PhD in parallel programming techniques, just ten years of industry experience writing multithreaded software, so what do I know?
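
    A minimal sketch of the threads version, assuming CPython's standard threading module; the function and variable names are made up for illustration. Several threads write to the same slot, and which write survives is not specified by the program.

```python
import threading
from typing import List


def arbitrary_concurrent_write(candidates: List[int]) -> int:
    """Let one thread per candidate write into the same cell; return the survivor."""
    cell = {"value": None}

    def writer(v: int) -> None:
        # No lock on purpose: the last write to land wins, and the program
        # does not say which thread that will be.
        cell["value"] = v

    threads = [threading.Thread(target=writer, args=(v,)) for v in candidates]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell["value"]


if __name__ == "__main__":
    print(arbitrary_concurrent_write([10, 20, 30, 40]))
```

    In CPython a single dict store is atomic, so the result is always one of the written values, just not a predictable one. In C or C++ the same unsynchronized pattern is a data race, so there you would want an atomic store to get the "one of them wins" behaviour.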

    All successful general-purpose computers since the 1940s rely on the so-called Von-Neumann apparatus.
    Is there a way to upgrade, rather than completely replace, this successful apparatus to handle parallelism?


    Yes. You place multiple von Neumann machines with access to the same memory, and provide structures for sending control signals between them so that a thread on one processor can start a new thread on another one. This is generally called SMP, and it had been used extensively for a long time before this paper was written, so why are you even asking the question? There are of course alternative approaches (e.g. NUMA) that can provide better efficiency in some cases, but the basic question is answered already.

  67. Awesome! I submitted "Steve" by UID30 · · Score: 1

    I've always wanted a computer named Steve...

    --
    "Glory is fleeting, but obscurity is forever." - Napoleon Bonaparte
  68. Obvious name : Chuck Norris by UID30 · · Score: 1

    Chuck Norris does not sleep. He waits.

    This could be the bestest thing in supercomputing EVAR!!1!one!1

    --
    "Glory is fleeting, but obscurity is forever." - Napoleon Bonaparte
  69. Really innovative work at Berkeley by arrrrg · · Score: 1

    Here's the website of a class at Berkeley that is designing a totally new chip architecture, something actually innovative and quite interesting in my opinion. http://research.cs.berkeley.edu/class/fleet/ It's still a few years away from being practical, but they are hoping to have in-silicon test chips very soon now.

  70. valueless by rodentia · · Score: 1


    Your claims are valueless because 'yer anonymous, coward. I made it up!

    --
    illegitimii non ingravare
  71. They call me... by WheresMyDingo · · Score: 1

    Tim?

  72. All That Really Matters is... by Nom+du+Keyboard · · Score: 1

    All that really matters here is how fast it runs Microsoft Word and Excel. You may not like it. You may want to mod me Troll or Flamebait, but to 80%+ of the population, as long as their PC brings up e-mail faster than they can type, shows movies without dropped frames, and quickly runs Word and Excel, that's all they care about. Blazing Folding@Home scores simply don't translate to a computing experience improvement. It's either faster enough in MSOffice, or it isn't. Sad, but very true.

    --
    "It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
  73. dog by jhutchens · · Score: 0

    what do you call a dog with no ears..... "it dont matter cuz he aint hear you no how"

    --as told by my co-worker

  74. I can't wait! by Impy+the+Impiuos+Imp · · Score: 1

    > parallel processing on a single chip and is 'capable of computing
    > speeds up to 100 times faster than current desktops.'

    Toshiba plans on releasing a laptop in six months, complete with 448MB of RAM (512 - 64MB for shared video RAM).

    --
    (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
  75. new killer apps by David+Gould · · Score: 1

    > I don't see why the next ten years would be any different. Operating systems will continue to get more bloated, software packages will get more feature-stuffed, games will continue to demand just slightly more than whatever's available [...]

    That's not all there is to it. The thing is, remember, there hasn't been a really new "killer app" in closer to fifteen years (i.e., since email and the WWW went mainstream). Games have been pushing constantly, but incrementally, better graphics (plus improving physics and AI). Web browsers have grown more feature-rich and CPU/memory-hungry. Desktops have transparent menus. What else has changed?

    AJAX / Web 2.0 stuff is mostly about re-creating the exact same apps that used to be built natively on the desktop, i.e., a new, more CPU/memory-hungry way of doing the same old stuff, but nothing fundamentally new. A blog is a web page that someone else hosts for you and that you can create without needing to know HTML. Social networking actually is looking to be pretty revolutionary, sociologically speaking, but technologically, it's just email and web pages.

    It's (probably) true that there's a limit to how many GHz and GBs we can soak up by finding new, less-efficient ways to do the same things over and over again. So yes, a 100x speedup might seem unneeded. But that's failing to take into account the really new technologies that [could|will] come along -- stuff that would be as revolutionary as the CLUI->GUI change.

    It's not even that hard to guess what sort of app this might be: offhand, I can suggest fully-immersive virtual reality and speech-controlled UIs, which have been predicted in sci-fi for decades. The only reason they haven't arrived yet is that our computers are still too damn slow to do it properly. (Voice recognition is just about now crossing the threshold of being "accurate enough", but that's just for recognizing the words you're speaking -- for a UI revolution to happen, we also need some major natural-language processing and AI advances.)

    In short: Yes, there's plenty that we could do with a 100x faster CPU.

    --
    David Gould
    main(i){putchar(340056100>>(i-1)*5&31|!!(i<6)<< 6)&&main(++i);}
  76. How about "Niagara" or "ClearSpeed" by Anonymous Coward · · Score: 0

    ... oops, sorry; those are already taken...

  77. Rich algorithmic theory? by HTH+NE1 · · Score: 1

    "Rich algorithmic theory" probably means "you cannae afford it".

    --
    Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
  78. name - die on fire by Tynin · · Score: 1

    I'm thinking Die on Fire would work nicely.