Slashdot Mirror


Intel Squeezes 1.8 TFlops Out of One Processor

Jagdeep Poonian writes "It appears as though Intel has been able to squeeze 1.8 TFlops out of one processor and with a power consumption of 62 watts." The AP version of the story is mostly the same; a more technical examination of TeraScale is also available.

168 comments

  1. Oblig. by Anonymous Coward · · Score: 2, Funny

    Imagine a Beowulf cluster of those!!

    1. Re:Oblig. by niconorsk · · Score: 5, Interesting

      It's quite fun to consider that when the original joke was made, the processing power of that Beowulf cluster would probably have been quite close to the processing power of the processor discussed in the article.

      --
      Nothing is impossible. We just haven't quite worked out how to do it yet.
    2. Re:Oblig. by Vandilzer · · Score: 1

      They did RTFA:

      "However, considering the fact that just 202 of these 80-core processors could replicate the floating point performance of today's highest performing supercomputer, those power consumption numbers appear even more convincing: The Department of Energy's BlueGene/L system, rated at a peak performance of 367 TFlops, houses 65,536 dual core processors."

    3. Re:Oblig. by Anonymous Coward · · Score: 2, Interesting

      It is entirely not true that you could replace today's fastest computer with this kind of technology and get the same performance. These new Intel CPUs are really difficult to program efficiently. You would only get good performance on certain problem sets.

    4. Re:Oblig. by utopianfiat · · Score: 1

      Christopher Lloyd was really freaking out about the fact that it required 1.21 Jiggawatts of power.

      --
      +5, Truth
    5. Re:Oblig. by PitaBred · · Score: 2, Interesting

      Because it doesn't take special problem sets and programming on the current supercomputers?

    6. Re:Oblig. by adam31 · · Score: 1

      Certain problems like transforming vertices and shading pixels, where programming efficiently is easily achieved with HLSL.

      These aren't meant for supercomputers. Those aren't DP flops they are talking about, and it doesn't seem like it is Intel's intent to change that. Of course, there are already GPUs doing SP teraflop. Sony bragged that RSX in PS3 was 1.8 TFlop, and newer GPUs are even faster. But hotter, probably.

  2. Both cool and useless for 99% of computing by tomstdenis · · Score: 4, Insightful

    The trick, as with SPEs, is finding ways to use them efficiently in as many tasks as possible.

    I'm glad to see Intel is using their size for more than x86 core production though.

    Tom

    --
    Someday, I'll have a real sig.
    1. Re:Both cool and useless for 99% of computing by billsoxs · · Score: 1

      I read an article in the morning paper (probably AP) where they said it might not make it out of the development stage. As I understand it, what they have done is add high-k to the gate stack, greatly reducing power consumption. So it might still be x86 architecture, but it will run a lot for very little power at standard (3 GHz) frequencies.

      --
      This message was brought to you by "Lack of Sleep."
    2. Re:Both cool and useless for 99% of computing by Anonymous Coward · · Score: 0

      I will take a dual-core version of it, with 1.6W consumption.

    3. Re:Both cool and useless for 99% of computing by distributed · · Score: 1

      So finally the tile processor architecture makes it to the industry. People in the comp arch group at MIT envisioned and prototyped something pretty similar to this years ago as the RAW processor.
      http://www.cag.lcs.mit.edu/raw/
      http://portal.acm.org/citation.cfm?id=624515
      http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=13382&arnumber=612254

      --
      [all generalizations are untrue except this one]
  3. The title is misleading by xoyoboxoyobo · · Score: 5, Informative

    That's not 62 watts at 1.8 teraflops. That's 62 watts at 3.16 GHz. FTFA: "Intel claims that it can scale the voltage and clock speed of the processor to gain even more floating point performance. For example, at 5.1 GHz, the chip reaches 1.63 TFlops (2.61 Tb/s) and at 5.7 GHz the processor hits 1.81 TFlops (2.91 Tb/s). However, power consumption rises quickly as well: Intel measured 175 watts at 5.1 GHz and 265 watts at 5.7 GHz. However, considering the fact that just 202 of these 80-core processors could replicate the floating point performance of today's highest performing supercomputer, those power consumption numbers appear even more convincing: The Department of Energy's BlueGene/L system, rated at a peak performance of 367 TFlops, houses 65,536 dual core processors."

    1. Re:The title is misleading by Anonymous Coward · · Score: 1, Insightful

      Good for you for understanding that. Now if only you would make an effort to try to understand what xoyoboxoyobo wrote. (Hint: Nowhere in his comment does he equate flops with hertz.)

    2. Re:The title is misleading by Anonymous Coward · · Score: 0

      Sure the title is misleading!

      My calculations are:

      The chip appears to be 2.5" on a side... that's about 6.25 sq. in. of area... which means that a flop is about...
      3.47 pico-square-inches! Which is nowhere in the article!

      Boy those flops are small these days!

    3. Re: The title is misleading by Dolda2000 · · Score: 1

      Furthermore, I think it's kind of weird to say that it's "one processor". It may be one chip, but is a processor defined by its die? Since it's an 80-core chip, isn't it more accurate to say that it's 80 CPUs on one die, just as a dual-core chip is rather two CPUs on one die? It's not as if it isn't impressive, but I think it's kind of misleading to say that it's just one processor.

    4. Re:The title is misleading by Anonymous Coward · · Score: 0

      no, flops obviously don't equal hertz. what the warlock meant by what he said, and hopefully what you inferred, is that there is not some static linear relationship (like a coefficient, i.e. gallons to liters)... however, for a given software procedure, implemented to a given ABI, with given hardware implementing said ABI, where the software procedure operates completely within circuitry controlled by the clock in question (i.e. Intel's majestically scheduled benchmark on a radically different ABI deep within their secret testing facilities, and the proc that's the subject of TFA), there is mighty-darn close to a straight line. AND

      if you had read xoyoboxoyobo's excerpt of TFA, "For example, at 5.1 GHz, the chip reaches 1.63 TFlops and at 5.7 GHz the processor hits 1.81 TFlops", you would have seen where he equates flops with hertz.

      all xoyoboxoyobo was trying to do was point out that the article description as posted was wrong/misleading because it quoted the power consumption at 3.16 GHz and the work output at 5.7 GHz, which takes more than 4 times the power. But on /. everything is fish tales anyway; 4x is nothing. we're used to seeing articles about what Vista was supposed to do for the last 4+ years oooooh BURN!

    5. Re:The title is misleading by Anonymous Coward · · Score: 0

      aww, did he hurt your feelings there, The Warlock? :(

    6. Re: The title is misleading by TeknoHog · · Score: 1

      I wonder the same whenever some marketing genius mentions a dual-core processor. Of course, processors didn't have cores until Intel innovated the Core architecture ;)

      --
      Escher was the first MC and Giger invented the HR department.
    7. Re:The title is misleading by Anonymous Coward · · Score: 0

      Actually, for a given processor, flops should scale fairly linearly with hertz. If you increase the processor speed by a specific fraction, it does everything that it does faster by that fraction, so the amount of calculations it completes in a given time will scale by roughly that fraction. (Note that this does not account for speed changes external to the processor, so that is why it is only approximately linear and only within a certain range.)
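
      A rough sketch of that scaling in Python. It assumes (the quoted article does not state this) that each of the 80 cores has two floating point units and that each unit retires one multiply-add, counted as 2 flops, per clock. The function name is made up for illustration. Under those assumptions the peak is 320 flops per clock, which comes out close to the quoted TFlops figures:

```python
# Back-of-the-envelope peak throughput for an 80-core chip.
# Assumptions (not confirmed by the quoted article): 2 FPUs per core,
# each retiring one multiply-add (counted as 2 flops) per clock cycle.
CORES = 80
FPUS_PER_CORE = 2
FLOPS_PER_FPU_PER_CYCLE = 2


def peak_tflops(clock_ghz: float) -> float:
    """Theoretical peak in TFlops at the given clock; linear in frequency."""
    flops_per_cycle = CORES * FPUS_PER_CORE * FLOPS_PER_FPU_PER_CYCLE
    return flops_per_cycle * clock_ghz * 1e9 / 1e12


for ghz in (3.16, 5.1, 5.7):
    print(f"{ghz} GHz -> {peak_tflops(ghz):.2f} TFlops peak")
```

      This is only the compute-bound ceiling. It says nothing about whether memory can feed the units, which is why the scaling is only approximately linear in practice.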

    8. Re: The title is misleading by juergen · · Score: 1

      And if you read the articles, you can see those 80 cores are not much more than 80 dual floating point units with a bit of dedicated RAM to each, to supply them with a little bit of data for benchmarking. There is no way yet to provide real data to these 80 cores. And the cores can't do much else but single precision multiply-add instructions, so calling them proper cores (instead of FP/vector units) is exaggerated.

      Even for pure floating point applications, without a data pipe to main memory which can keep up, aggregate TFLOPS are nothing more than snake oil in practice.

    9. Re:The title is misleading by ZonkerWilliam · · Score: 1

      Somewhat correct: it wasn't 62 watts for 1.8 teraflops, it was 62 watts for 1 teraflop. Taken from eetimes.com: "The 80-core chip crunches 1 trillion floating-point operations/second when running at a 3.2-GHz clock speed and consumes 62 watts, to yield a record 16 Gflops/watt. And by cranking the clock up to 5.6 GHz, the chip bested 1.8 teraflops--that's 80 percent faster--albeit by increasing power consumption fourfold to 265 W, or 3.7 Gflops/W."

    10. Re:The title is misleading by Anonymous Coward · · Score: 0

      202 of these 1.81 TFlops single precision http://www.pcper.com/article.php?aid=363

      for BlueGene 367 TFlops double precision http://www.netlib.org/utk/people/JackDongarra/faq-linpack.html#_Toc27885741

      no thanks

    11. Re:The title is misleading by xoyoboxoyobo · · Score: 1

      Right - thanks for clarifying for me. That's what I get for posting in a hurry. :) If I'd taken the time to copy/paste one more paragraph it would've been more clear... (as I bang out a post on my way out the door)

    12. Re: The title is misleading by afidel · · Score: 1

      Uh, processors had cores WAY before Intel came out with the Core architecture. In fact you could buy a MIPS, ARM, etc. core for your system-on-chip design as far back as the early 90's that I'm familiar with, and probably further back than that. Just because something has been recycled by marketing doesn't mean it didn't start out in the technical realm =)

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  4. Take it a step further..... by bostons1337 · · Score: 0

    Now if they can only find a way to lessen its thirst for volts they could make it useful for the masses.

  5. Just imagine by andyck · · Score: 2, Insightful

    "Intel" "Introducing the NEW CORE 80, personal laptop supercomputer running Windows waste my ram and cpu cycles SP2 edition" But seriously this looks interesting for the future. Now we just need software to fully utilize multicore processors.

    1. Re:Just imagine by TheUni · · Score: 1

      Core 80? Psh. I'm waiting for Core2 80...

      Though I'm tempted to wait for Core-Quad 80 extreme.... 320 cores!

    2. Re:Just imagine by Anonymous Coward · · Score: 0

      Yeah, we need to make software that can utilize multiple processors/cores. We can call it ... "Same time Many Processors" ... "SMP" for short...

    3. Re:Just imagine by Anpheus · · Score: 1

      I thought SMP was Symmetric Multi-Processing? Is there a history behind that acronym, and was it changed from its more humble roots? - Just curious.

  6. bus speed by Anonymous Coward · · Score: 0

    bus speed *cough* bus speed *cough* bus speed

    1. Re:bus speed by Anonymous Coward · · Score: 0

      That was informative. Now would you care to elaborate on which bus you are referring to? Are you saying "wow, it has a 256GB/s internal bus"? Or are you saying "well regardless of how many Tflops the thing can do, it's still faced with dealing with an outside world saddled by slow buses"? If it's the latter, fine, this is just one of their research projects. One can probably safely assume that the creators of PCI are busily working on more advanced buses on one front, and optimizing compilers to minimize bus latency effects on the software front.

  7. What kinds of apps does this make reasonable? by DoofusOfDeath · · Score: 4, Interesting

    Does this permit the practical use of any truly breakthrough apps?

    Does it suddenly make previously crappy technologies worthwhile? I.e., does image recognition or untrained speech recognition become a mainstream technology with this new processing power?

    1. Re:What kinds of apps does this make reasonable? by truthsearch · · Score: 5, Funny

      Does it suddenly make previously crappy technologies worthwhile?

      Vista?

      (Sorry, couldn't resist.)

    2. Re:What kinds of apps does this make reasonable? by DoofusOfDeath · · Score: 5, Funny

      Clippy?

      "It looks like you're writing a five-page essay on the role of the Judicial branch during periods of famine in the late 1850's."

    3. Re:What kinds of apps does this make reasonable? by Frumious+Wombat · · Score: 4, Interesting

      Atomistic simulations of biomolecules. Chain a bunch of those together, and you begin to simulate systems on realistic time scales. Higher-resolution weather models, or faster and better processing of seismic data for exploration. Same reason that we perked up when the R8000 came out with its (for the time) aggressive FPU. 125 MFlops/proc@75MHz was nothing to sneeze at 15 years ago. If they can get this chip into production in usable quantities, and if it has the throughput, then they're on to something this time.

      Of course, this could just be a single-chip CM2; blazingly fast but almost impossible to program.

      --
      the more accurate the calculations became, the more the concepts tended to vanish into thin air. R. S. Mulliken
    4. Re:What kinds of apps does this make reasonable? by Intron · · Score: 2, Interesting

      Realtime, photorealistic animation and speech processing? Too bad AI software still sucks or this could pass a Turing test where you look at someone on a screen and don't know whether they are real or not.

      --
      Intron: the portion of DNA which expresses nothing useful.
    5. Re:What kinds of apps does this make reasonable? by CubicleView · · Score: 1

      I "seriously" doubt that this could be used to pass a Turing test. The noise and heat from the heatsink fan keeping a 250+ watt processor cool would be a dead giveaway. If I recall correctly, though, I don't think you need a fancy avatar for the robot/computer/whatever to pass the Turing test. It's more of a black box approach where all that matters is what the box says, not how it says it.

    6. Re:What kinds of apps does this make reasonable? by Intron · · Score: 5, Funny

      Sorry, your post made me realize that a sophisticated processor is unnecessary. It's already difficult to tell whether a message is from a human or just a randomly generated string of nonsense.

      --
      Intron: the portion of DNA which expresses nothing useful.
    7. Re:What kinds of apps does this make reasonable? by vertinox · · Score: 2, Insightful

      Does this permit the practical use of any truly breakthrough apps?

      From my understanding perhaps with that many cores, the OS could simply allocate one application per core.

      But the OS has to support that feature or have applications that know how to call unused cores.

      From my understanding Parallels for OS X only uses one core and picks the second core to run on for the best performance.

      Of course then there are applications that could be programmed to use all the cores at once if they needed to do scientific calculations or something like Ray Tracing.

      --
      "I am the king of the Romans, and am superior to rules of grammar!"
      -Sigismund, Holy Roman Emperor (1368-1437)
    8. Re:What kinds of apps does this make reasonable? by Heembo · · Score: 2, Informative

      I used to teach 5th grade computer class, and please do not underestimate the power of Clippy(tm). I would instruct my students to remove Clippy, as I have done per habit for so long, but they would rebel. I recall at least several classes where Clippy hypnotized my class (and kept them preoccupied and easy to deal with.)

      --
      Horns are really just a broken halo.
    9. Re:What kinds of apps does this make reasonable? by riskeetee · · Score: 1

      Duke Nukem Forever!

    10. Re:What kinds of apps does this make reasonable? by hackstraw · · Score: 1

      I recall at least several classes where Clippy hypnotized my class (and kept them preoccupied and easy to deal with.)

      Leave that to the experts.

      That is what drugs and TV are for.

    11. Re:What kinds of apps does this make reasonable? by danlock4 · · Score: 0, Offtopic

      I used to teach 5th grade computer class, and please do not underestimate the power of Clippy(tm). I would instruct my students to remove Clippy, as I have done per habit for so long, but they would rebel. I recall at least several classes where Clippy hypnotized my class (and kept them preoccupied and easy to deal with.)

      *sniffle* The things 5th graders get to use these days! Why, when I was a lad, we didn't get access to computers until 6th grade, but we learned BASIC programming, darn it! (The school had about four CBM Pet machines with built-in green monochrome CRTs.) There was no Clippy to waste our time!

      --
      To .sig or not to .sig, that is the question.
    12. Re:What kinds of apps does this make reasonable? by Heembo · · Score: 1

      Sorry man, I was teaching them VBA programming, but they wanted clippy. Parents rule in private schools.

      --
      Horns are really just a broken halo.
    13. Re:What kinds of apps does this make reasonable? by bill_kress · · Score: 1

      It will be interesting when the ability to merge and analyze multiple images becomes possible, even better if it can be done in real-time.

      "Vision" can give computers the ability to correct themselves. With visual feedback, suddenly robotic arms don't have to be told what to do via a long stream of coordinates, you could pretty much point.

      It could also enable a new form of GUI control where the camera just watches your hand--eliminating the need for a mouse.

      Pointing a single camera out the side window of a moving vehicle could give you an accurate, very detailed 3-d representation of the landscape since a video from a car contains many images of the same objects from different locations (required for 3-d image creation).

      Comparing such shots to each other over a period of days would allow you to locate changes in huge areas of environment. Chicago could have identified their Bomb Scare two weeks earlier and not endangered their entire population by their inability to notice what they are calling "bomb-like devices".

      They could also identify growth of vegetation, new construction, problems in roads and buildings and any other changes that we tend not to notice because they happen over long periods of time.

      Because of the ability to only recognize differences, the storage space should be MUCH better than any existing compression mechanisms currently used.

      There are an awful lot of bad things this kind of technology will enable as well, but it's going to happen, and the changes to our world should be as massive as the changes brought about by the internet itself. I'm sure I've barely scratched the surface.

    14. Re:What kinds of apps does this make reasonable? by Frank+T.+Lofaro+Jr. · · Score: 1

      The kids rule in public schools - they have most of the firepower.

      --
      Just because it CAN be done, doesn't mean it should!
    15. Re:What kinds of apps does this make reasonable? by Anonymous Coward · · Score: 1, Funny

      Is it because of your mother that you say it's already difficult to tell whether a message is from a human or just a randomly generated string of nonsense?

    16. Re:What kinds of apps does this make reasonable? by Marsell · · Score: 1

      Atomistic simulations of biomolecules. Chain a bunch of those together, and you begin to simulate systems on realistic time scales.

      Considering the algorithmic complexity of these types of problems, I'm curious what you mean by this? Throwing more processors at protein-folding simulations, for example, is pointless (factorial time). That's why they use heuristics.

    17. Re:What kinds of apps does this make reasonable? by Aussie+Osbourne · · Score: 1

      A trillion flops per second!!!!! wow, M$ really needs to get up to speed, they've only managed one complete flop in the last five freakin' years.

      (Sorry, couldn't resist either.)

    18. Re:What kinds of apps does this make reasonable? by Nefarious+Wheel · · Score: 1

      Is it because of your mother that you say it's already difficult to tell whether a message is from a human or just a randomly generated string of nonsense?

      *snort* dang, another coffee in the keyboard.

      Smile when you say that, Eliza.

      I've known people, however, who spoke in a close approximation of Racter http://en.wikipedia.org/wiki/Racter/. Didn't need teraflops for that.

      --
      Do not mock my vision of impractical footwear
    19. Re:What kinds of apps does this make reasonable? by CubicleView · · Score: 1

      Did you actually read up on what a Turing test is, or did you just learn about it from the experts here at Slashdot? http://en.wikipedia.org/wiki/Turing_test

    20. Re:What kinds of apps does this make reasonable? by Intron · · Score: 1

      How do you know that I'm a real person? If you look at my posting history you will see that I don't make any original posts, I just reply to other people. My response could be generated by a program which assigns weights to words in your post and selects the reply which most closely matches from an enormous but finite list of stored statements.
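
      A toy sketch of such a program in Python, purely for illustration. The stored replies, the function names, and the choice of weighting (rarer words count for more) are all assumptions, and a real list would be far larger:

```python
# Toy reply selector: weight each word by how rare it is across the stored
# statements, then pick the stored statement with the highest total weight
# of words it shares with the incoming post.
import math
import re
from collections import Counter

# Hypothetical canned replies, made up for this sketch.
STORED = [
    "Imagine a Beowulf cluster of those!",
    "But does it run Linux?",
    "The memory bus will be the bottleneck, not the cores.",
]


def words(text):
    """Lowercased set of the words in text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def pick_reply(post, stored=STORED):
    # Count how many stored statements contain each word.
    doc_freq = Counter(w for s in stored for w in words(s))

    def weight(w):
        # Words that appear in fewer statements get a larger weight.
        return math.log(1 + len(stored) / doc_freq[w])

    post_words = words(post)

    def score(statement):
        return sum(weight(w) for w in words(statement) & post_words)

    return max(stored, key=score)


print(pick_reply("How fast is the memory bus on 80 cores?"))
```

      With no overlapping words at all, every statement scores zero and max() simply returns the first one in the list.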

      --
      Intron: the portion of DNA which expresses nothing useful.
    21. Re:What kinds of apps does this make reasonable? by CubicleView · · Score: 1

      I never said you were (never proved I wasn't one either, I guess), but from the limited amount of info I have, no, I can't tell if you're a slashbot or not. I'm not impressed if you're just running from a finite list, though. Augmenting your list with posts from other users might be a start; Google and wikis seem to be a common source for that kind of info these days as well.

    22. Re:What kinds of apps does this make reasonable? by Frumious+Wombat · · Score: 1

      I'm thinking in terms of MD simulations, and time-steps/hr. (or Nanoseconds/day). Even more so, I'm really thinking of QM/MM simulations with better than semi-empirical for the QM part becoming more routinely available.

      --
      the more accurate the calculations became, the more the concepts tended to vanish into thin air. R. S. Mulliken
  8. wahoo by DaMattster · · Score: 1

    I gotta get me one of these. This lends new credence to the Staples Red Button for major scientific and engineering problems. "That was easy!"

  9. 99% is exagerated by Anonymous Coward · · Score: 4, Interesting

    The first thing that jumped out at me was the presence of MACs. They are the heart of any DSP. So, this chip is good for computation although not necessarily processing. As other posters have pointed out, this chip could become a very cool GPU. It should also be awesome for encryption and compression. Given that the processor is already an array, it should be a natural for spreadsheets and math programs such as Matlab and Scilab. Having a chip like this in my computer just might obviate the need for a Beowulf cluster. :-)

    1. Re:99% is exagerated by trigeek · · Score: 1

      The new IEEE Floating Point standard as proposed (last time I looked at it) is going to require a MAC mode of operation.

      --
      Sometimes I doubt your committment to SparkleMotion!
    2. Re:99% is exagerated by Dzonatas · · Score: 1

      So, this chip is good for computation although not necessarily processing.

      Notice how Intel has stated it is set up for general-purpose use. Also, it could be said that a computation is a process that happens over time. Here, with this cast, it obviously parallelizes the overall computation within one CPU.

      We already know where Intel plans to take this. They have released early statements:

      An earlier article compared extra cores to rayseg computation:

      Add Another Core for Faster Graphics

      That previous article stated that a 3.2 GHz P4 could give 100 raysegs. At that point, it predicted a potential of 450 raysegs for what is now the current quad-core technology. The 450 raysegs are enough for real-time raytracing at a common screen size (assume 1024x768). With a little math, you can see that 800 raysegs would make for smooth real-time raytracing at a 1024x768 screen size.

      Let's speculate: The cores inside the 80 core cast don't seem to have HT or dual SSE units. We can presume each core is able to achieve 50 raysegs. Multiply that by 80 and you have a potential of 4000 raysegs. That is over eight times the quad-core estimate for rayseg computation.
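
      The arithmetic behind that guess, as a small Python sketch. The 50 raysegs per core is the speculation above, not a measurement, and the variable names are made up for illustration:

```python
# Rayseg estimates taken from the comment above; all are rough guesses.
RAYSEGS_PER_CORE = 50      # speculated per-core throughput
CORES = 80
QUAD_CORE_RAYSEGS = 450    # figure predicted for a quad-core part
SMOOTH_1024X768 = 800      # figure suggested for smooth 1024x768 rendering

total = RAYSEGS_PER_CORE * CORES
print(f"80-core estimate: {total} raysegs")
print(f"vs quad-core:     {total / QUAD_CORE_RAYSEGS:.1f}x")
print(f"vs smooth target: {total / SMOOTH_1024X768:.1f}x")
```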

      Another article states a five-year plan to bring this technology to market:

      Intel demonstrates 80-core processor

      That article states: "The long time frame is required because current operating systems and software don't take full advantage of the benefits of multi-core processors. In order for Intel to successfully market processors with CPUs that have more than say, 4 cores, there needs to be an equal effort from software programmers, which is why producing an 80-core processor is only half the battle. On paper, 80-cores sounds impressive, but when the software isn't doing anything imaginative with them it's actually rather disappointing: during a demonstration, Intel could only manage to get 1 Teraflop out of the chip, a figure which many medium- to high-end graphics cards are easily capable of."

      In other words, we'll see a new implementation of SMP at the OS level before most applications will be able to take advantage of the 80 core cast. Most SMP programs were written with full access and bandwidth to all memory. Now, they will have to be modified to handle distributed memory. The past typical one kernel per core design may not be the most efficient anymore.

  10. EIGHTY Cores??? by rwyoder · · Score: 4, Funny

    64 cores should be enough for anybody.

    1. Re:EIGHTY Cores??? by Anonymous Coward · · Score: 0

      How many times has this dead horse been beaten? 64 times should be enough for anybody.

    2. Re:EIGHTY Cores??? by mr_mischief · · Score: 1

      Build me a home computer that supports five to forty times the memory of all its competitors and then make fun of the PC.

      Seriously, when IBM and Microsoft released the IBM 5150 PC and MS-DOS/PC-DOS, the Apple II+ had 48k expandable to 64k, the Atari 600XL had 16k and the 800XL had 64k, Commodore hadn't yet released the Commodore 64 leaving them with the 5k VIC-20, and the Tandy Color Computer 1 had 32k. Most of these systems had 6800-series processors in the one megahertz range. The IBM had a processor which did similar work per clock as the Motorola chip, and was 4.77 MHz.

      Sure, IBM and Microsoft could have had the foresight to support 16 megabytes or even 64 gigabytes of memory using the same software as their 256k and 512k offerings that were upgradeable to 640k. I doubt they knew how successful the platform would be. At the time, computers leapfrogged each other and people bought whole new platforms from completely different companies. Data was moved from one system to another by floppy or cassette tape if you were lucky, or by hand if you weren't. IBM probably had no idea the same platform would be overhauled so many times back in 1981.

      IBM was probably mostly interested in getting companies using their mainframes and midrange systems to stick with the brand anyway. It's like Harley Davidson and John Deere selling golf carts or like Ralph Lauren making bedsheets. Those companies want you to think of them whenever you think of anything related to their core business. I'm betting Harley Davidson doesn't make a lot of money on golf carts, but even at break-even it's better than seeing a Kawasaki golf cart owner go and buy a Kawasaki motorcycle.

      IBM had the right product at the right time in the 5150, and it took off. The 640k limit seemed silly when people knew what EMS and XMS were. Since OS/2, Windows, Linux, and most other operating systems put the processor into protected mode and don't use the BIOS after they finish loading, it's no longer an issue. Now it's two or four cores, 32 or 64 bits, 100Mb or 1000Mb networking, and hundreds of gigabytes on your hard drive. It's Windows XP vs. Vista vs. Linux vs. OS X vs. BSD vs. whatever. It's no longer 64k vs. 640k, twin 360k floppies vs. single 160k floppies, 40 columns vs. 80 columns, RS-232 vs. RS-485 serial ports (with many manufacturers not doing either), 4 colors vs. 16, and a 5-meg hard drive being an expensive option.

      So please, can we give some credit where it's due, and get over a bit of shortsightedness on a product that's five years older than Blake Ross?

    3. Re:EIGHTY Cores??? by 644bd346996 · · Score: 1

      Once your software can take advantage of about 8 cores, it is probably scalable enough to take advantage of core increases almost as well as clock speed increases.

    4. Re:EIGHTY Cores??? by julesh · · Score: 1

      Seriously, though, I have been wondering about this. With a design where each core connects only to its neighbours, surely a square array (i.e. either 64 or 128 cores) makes much more sense than the rectangular 8x10 array that this chip appears to be based on. Anyone?

  11. I see that as a feature... by StressGuy · · Score: 3, Funny

    Get the bugs worked out by Xmas and you could sell a 1.81 TFlop Easy-Bake Oven

    {...I need more sleep...}

    --
    A goal is a dream with a deadline
  12. Wow, I can't wait! by cciRRus · · Score: 3, Funny

    Gonna get one of these. That should bump up my Vista Experience score.

    --
    w00t
    1. Re:Wow, I can't wait! by daeg · · Score: 2, Funny

      Except there won't be any Vista drivers. Damn!

  13. Real-time Ray Tracing? by Dr.+Spork · · Score: 5, Interesting

    When I read about this I didn't get all worked up, since I imagine that it will be almost impossible for realistic applications to keep all 80 cores busy and get the teraflop benefits. But then I read about the possibility of using this for real-time ray tracing, and got very intrigued!

    Ray tracing is embarrassingly parallelizable, and while I'm no expert, two teraflops might just be enough calculating power to do a pretty good job at scene rendering, maybe even in real time. To think this performance would be available from a standard 65nm die that uses 65 watts... that really could make a difference to gamers!

    1. Re:Real-time Ray Tracing? by Vigile · · Score: 2, Interesting

      Yep, that's one of the things that got me excited about it as well. Did you also read this article on ray tracing on the same pcper.com site by a German guy that made a Quake 4 ray tracing engine?

      http://www.pcper.com/article.php?aid=334

    2. Re:Real-time Ray Tracing? by Anonymous Coward · · Score: 0

      Firstly, as addressed elsewhere, this chip can do 1.8 TFlop, and can run at 65W, but not both at the same time. At full speed it uses a lot more power.

      Secondly, "a big difference"? I play a lot of games and I always complain that CGI lighting (in movies too) looks pretty crap, but I can't really say that the lighting in games truly affects the enjoyment. It also seems like raytracing just isn't worth it in a value per FLOPs sense - even if these chips were available they'd probably be better used in the current graphics way, rendering pixel shader effects and the like.

    3. Re:Real-time Ray Tracing? by tcas · · Score: 1

      I'm sorry, but this comment is really crazy:

      Firstly, there are hundreds of computation-intensive applications that can keep 80 cores busy: environmental modeling, protein folding... anything that currently uses a supercomputer.

      Secondly, why is the parallelizable nature of ray tracing embarrassing?! It's parallelizable exactly because each ray is computed independently of other rays - I don't see what is embarrassing or surprising about that.

      Finally, talking about the application to consumer gaming shows that you completely missed the point of this story. Way down the line... another whole generation of hardware later, maybe you'll have 80 cores on your home computer. But it's a little early to be thinking about that right now.

    4. Re:Real-time Ray Tracing? by ispeters · · Score: 5, Informative

      Secondly, why is the parallelizable nature of ray tracing embarrassing?! It's parallelizable exactly because each ray is computed independently of other rays - I don't see what is embarrassing or surprising about that.

      It's embarrassing because "Embarrassingly parallel" is the technical term for problems like ray tracing. It's a parallelizable problem wherein the concurrently-executing threads don't need to communicate with each other in order to complete their tasks so the performance of a parallel solution scales almost perfectly linearly with the number of processors that you throw at the problem.
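
      A minimal sketch of that independence, under loud assumptions: the one-sphere scene, the image size and the names trace_pixel, render_serial and render_parallel are all made up for illustration, and a thread pool stands in for real cores (in CPython it shows the structure rather than a real speedup). Each pixel is shaded from its own index and read-only scene data, so the work can be handed out in any order.

```python
from concurrent.futures import ThreadPoolExecutor
import math

# Hypothetical scene: a tiny image and a single sphere in front of the camera.
WIDTH, HEIGHT = 32, 24
SPHERE_CENTER = (0.0, 0.0, 3.0)
SPHERE_RADIUS = 1.0


def trace_pixel(index):
    """Shade one pixel: 1.0 if its primary ray hits the sphere, else 0.0.

    Uses only the pixel index and read-only constants, so no pixel
    ever needs a result computed for another pixel.
    """
    x, y = index % WIDTH, index // WIDTH
    # Ray from the origin through the pixel centre on an image plane at z = 1.
    dx = (x + 0.5) / WIDTH * 2.0 - 1.0
    dy = (y + 0.5) / HEIGHT * 2.0 - 1.0
    dz = 1.0
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    dx, dy, dz = dx / norm, dy / norm, dz / norm
    # Ray-sphere test: |t*d - c|^2 = r^2 has a real root when disc >= 0.
    cx, cy, cz = SPHERE_CENTER
    b = dx * cx + dy * cy + dz * cz
    disc = b * b - (cx * cx + cy * cy + cz * cz) + SPHERE_RADIUS ** 2
    return 1.0 if disc >= 0.0 and b > 0.0 else 0.0


def render_serial():
    """Reference render: one pixel after another."""
    return [trace_pixel(i) for i in range(WIDTH * HEIGHT)]


def render_parallel(workers=8):
    """Same render with pixels handed to a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in pixel order, so the only "stitching" is list().
        return list(pool.map(trace_pixel, range(WIDTH * HEIGHT)))


if __name__ == "__main__":
    print(render_parallel() == render_serial())
```

      The only coordination is collecting the finished pixels into one image. Shared scene data such as textures is read-only here, which is why the memory-bandwidth worry raised elsewhere in the thread is a separate question from the algorithm itself.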

      Ian

    5. Re:Real-time Ray Tracing? by VE3MTM · · Score: 1

      "Embarrassingly parallel" is a term for such problems, where each step or component is independent and requires no communication.

      http://en.wikipedia.org/wiki/Embarrassingly_parallel

      --
      09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0 Whoops, silly middle mouse button...
    6. Re:Real-time Ray Tracing? by fitten · · Score: 1

      It's parallelizable exactly because each ray is computed independently of other rays - I don't see what is embarrassing or surprising about that.


      As others have said, "embarrassingly parallel" isn't a derogatory term any more than "greedy algorithm" is.
    7. Re:Real-time Ray Tracing? by Anonymous Coward · · Score: 0

      As others have said, "embarrassingly parallel" isn't a derogatory term any more than "greedy algorithm" is.

      So, if others have said it, why did you feel the need to reply and add nothing to the conversation?

    8. Re:Real-time Ray Tracing? by Anonymous Coward · · Score: 1, Insightful

      While the ray tracing algorithm is embarrassingly parallel, I would imagine memory access is not. Having 80 cores accessing pretty much the same data (mainly textures) could be a problem. Perhaps procedurally generating textures would solve this. Perhaps caching is enough. I'm no ray tracing expert so please correct me if I'm wrong here.

    9. Re:Real-time Ray Tracing? by Linker3000 · · Score: 1

      Ooh look - three replies in a row - parallel!! - explaining the definition of a term related to parallel processing.

      Is something going to explode now?

      --
      AT&ROFLMAO
    10. Re:Real-time Ray Tracing? by schmiddy · · Score: 1

      I'd just like to point out, that yes, it would be great to do real-time raytracing with such powerful processors. Last week I was up until 6 in the morning waiting for a 2+ hour render of a reasonably simple scene to finish. Yeah, these procs would be great... if someone could just write a parallelizable version of POV-ray for Linux. Before someone jumps in to point to the few ports out there, let me head you off:

      A distributed version of POV-ray exists using the MPI library, but it's based on the pretty old 3.1 branch (POV-ray is on 3.6beta right now). This is important because even the newest POV-ray betas have pretty vanilla features compared to some of the other experimental branches (like Mega POV) that include things like motion blur to simulate moving objects, etc. I haven't even tried MPI Pov because I like playing around with the must-have toys like radiosity.

      A version that looks really good for Windows (bleh..) and is based off the 3.6 branch is SMPov. I really, really, really wish someone would port this to Linux so that I could have a chance to play..

      And, finally, there is a patch to POV-ray that will work on Linux using the PVM library -- and it will work with the 3.5 branch. Sounds good, until you read the Howto. Quoting directly: Radiosity is not working. The resulting image looks like a mosaic. The energy bias for each block is different because the radiosity equation is not globally resolved correctly.

      I suppose someone's going to tell me I should just do it myself. *Sigh*. I'm actually learning Erlang right now to learn more about distributed processing. Maybe, someday..

      --
      http://cltracker.net -- powerful craigslist multi-city search
    11. Re:Real-time Ray Tracing? by fitten · · Score: 1

      Dunno... perhaps you could answer that question yourself, now ;)

    12. Re:Real-time Ray Tracing? by aicrules · · Score: 1

      You totally wtfpwned him with facts...

    13. Re:Real-time Ray Tracing? by NiceRoundNumber · · Score: 1

      "Embarrassingly parallel" is the technical term for problems like ray tracing.

      Otherwise known as "Embarrallel."

      --
      Diplomacy is the art of letting other people have your way.
    14. Re:Real-time Ray Tracing? by Anonymous Coward · · Score: 0

      No communications? I highly doubt that. More like no communication between each 'unit of work'. However, even these types of problems suffer from communication overhead, as you still need something to look over the results or stitch them back together. Also, if you look at it, they used a crossbar switch. Not the best way to have processors talk to each other, or move data out to the processor. But it is 'decent' enough because of space issues with other switching network designs out there, especially in a 2D layout problem like the one they have.

      Would be interesting to see how they keep processors busy with work though. What sort of scheduling do they use? How do they prevent work stall/bubbles? What sort of process do they use to communicate between processors (it is needed)?

      Sure, some problems chop up very nicely; however, there is a bit more to it than that. That is, if you want to use the hardware to its fullest. If you just want to throw more hardware at it later, then don't worry too much about it. But do not be too surprised when it does not scale.

      Also why 80? Why not 64 or 32 or 128? Why 80? It seems a strange number.

    15. Re:Real-time Ray Tracing? by mindriot · · Score: 1

      Well, there's nothing stopping you from making redundant copies of the needed data. It's read-only.

    16. Re:Real-time Ray Tracing? by Anonymous Coward · · Score: 0

      Repeat after me: 3d graphics is all about "bandwidth, bandwidth, bandwidth"... compute has pretty much always been the cheapest part of any 3d hardware. Unless the scene you're trying to raytrace (including all the textures, by the way) fits into the 2kb of memory local to each pair of MACs, this design isn't going to do anything revolutionary for raytracing. It might make a reasonable conventional rasterizer with the addition of a lot of specialized hardware and a ton of extra local memory, but you can already buy one (it's called an 8800).

    17. Re:Real-time Ray Tracing? by brsmith4 · · Score: 1

      Tachyon [http://jedi.ks.uiuc.edu/~johns/raytracer/]

      Site may be down so look for mirrors.

    18. Re:Real-time Ray Tracing? by goldenpanda · · Score: 0

      A lot of replies but nobody has explained the "embarrassing" to parent. It's so *easy* to parallelize correctly, it would be *embarrassing* to miss it and not take advantage.

    19. Re:Real-time Ray Tracing? by julesh · · Score: 1

      Firstly, as addressed elsewhere, this chip can do 1.8 TFlop, and can run at 65W, but not both at the same time. At full speed it uses a lot more power.

      Yes. At 65W it "only" manages 1TFlop.

    20. Re:Real-time Ray Tracing? by VE3MTM · · Score: 1

      That's exactly what I meant by communication: "Embarrassingly parallel" algorithms do not require communication between work units to complete. Of course you still need to stitch the units together into a final solution.

      --
      09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0 Whoops, silly middle mouse button...
    21. Re:Real-time Ray Tracing? by Dr.+Spork · · Score: 2, Interesting
      I'd heard about the Quake3 thing somewhere else. It's pretty cool with Quake4. What really impressed me, though, is that when they multiplied the number of polygons in the scene by several orders of magnitude, rendering performance fell only 60% or so. This makes it seem like an increase in processing power will accommodate an exponential improvement in scene detail. This confirms my suspicion that real-time ray tracing is the future of game graphics.

      The fact that ray-traced Quake3 works OK in real time on present (though big - but not specialized) hardware makes me think that Intel's chip might be able to do some impressive real-time ray-tracing already, and a 2012 version of the chip would render nicer scenes through ray-tracing than would conventional GPUs made with 2012 technology.

  14. Sounds like a cellular automata machine by Anonymous Coward · · Score: 0

    The architecture is very much like how one might build a cellular automata machine, albeit with FPUs instead of lookup tables.

    As an example, check out CAM-8: http://www.ai.mit.edu/projects/im/cam8/
    This dates from 1993 or so; it took at least a 1 GHz Pentium III to match its cellular automata performance, if I recall correctly.

  15. Tflops all over the place. by tocs · · Score: 2, Funny

    I hope they can get them back in.

  16. I for one welcome our new Android overlords... by doomy · · Score: 5, Informative

    33 of these CPU's should be more than enough to construct Lt. Cmdr Data.

    --
    ...free your source and the rest would follow...
    1. Re:I for one welcome our new Android overlords... by pimpimpim · · Score: 1

      If I follow the wikilink, most of the "information" there seems to come from an episode from 1989. Apart from the sad thing that some people actually treat these data as real, the fun thing is that apparently the scriptwriters who made up these fictional data did a pretty good job of making up computer specifications that would still be out of reach for normal PCs 20 years later.

      --
      molmod.com - computing tips from a molecular modeling
    2. Re:I for one welcome our new Android overlords... by Kjella · · Score: 0

      Apart from the sad thing that some people actually treat these data as real, the fun thing is that apparently the scriptwriters who made up these fictional data did a pretty good job to make up computer specifications that would still be out of reach for normal PCs 20 years later.

      Well for one it's 28 by now then, secondly they're supposed to be 300 years in the future. How's being off by a factor of 10 particularly impressive? In another 20 years, you'll see old sci-fi nerds making gags about having more processor power than Data.

      --
      Live today, because you never know what tomorrow brings
    3. Re:I for one welcome our new Android overlords... by RealGrouchy · · Score: 1

      33 of these CPU's should be more than enough to construct Lt. Cmdr Data [wikipedia.org].

      Yes, but the portable power supply would make him look more like Jabba the Hutt.

      - RG>
      --
      Hey pal, this isn't a pleasantforest, so don't waste my time with pleasantries!
  17. exaflop computers? by peter303 · · Score: 2, Insightful

    Since petaflops are likely by the end of the decade, it's time to imagine exaflops in 2020.

  18. Pinch one loose by jrmiller84 · · Score: 0, Offtopic

    "Intel Squeezes 1.8 TFlops Out of One Processor"

    In other news, AMD pinches a 1.9 TFlops loaf out of one processor

    --
    I will forever be a student.
  19. What is the point for 80 cores on the FSB by Joe+The+Dragon · · Score: 2, Insightful

    The FSB will be a big bottleneck even more so with the cpu needing to use to get to ram. You would need about 3-4 FSBs with 1-2 mb per core of L2 to make it fast.

    1. Re:What is the point for 80 cores on the FSB by Anonymous Coward · · Score: 0

      I don't see any mention of what type of memory or bus architecture they are using with this, but I think it's fair to assume that their chip architects aren't complete asshats, and that they've given it sufficient memory bandwidth to keep the cores all fed! ;-)

    2. Re:What is the point for 80 cores on the FSB by Anonymous Coward · · Score: 0

      Hello fool? Did you read any of the articles?

    3. Re:What is the point for 80 cores on the FSB by daniel_gustafsson · · Score: 1

      What is the point of commenting on an article you haven't read?

    4. Re:What is the point for 80 cores on the FSB by RightSaidFred99 · · Score: 1

      Umm, guh? This chip is an experimental chip and won't see the light of day for years. The FSB doesn't have years left. Ergo, this is a non sequitur - FSB has nothing to do with this chip.

    5. Re:What is the point for 80 cores on the FSB by julesh · · Score: 1

      RTFA. The cores have onboard RAM. There isn't an FSB, only a network between cores. They're working on a 3D interconnect to stack memory on top of the cores (i.e., 80 FSBs -- or perhaps TSBs).

  20. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  21. More room for bloatware..... by madhatter256 · · Score: 1, Insightful

    Yep. The only way to really use this effectively is to load it up with lots of bloatware. Imagine the tons of ads one can finally get with this type of CPU! doubleclick.net would seriously love this.

    People still effectively use processing power equivalent to that of an 800 MHz Pentium 3 for basic stuff (and I'm just talking about word processing, email, internet, no gaming) on average. Why would someone need a quad core CPU, and a crappy videocard, just for surfing the net, typing, etc?

    In reality, that is what will ultimately happen. Just lots of stuff running in the background without us really noticing it. The speed and cores can make it easier to hide spyware in the background because you won't notice any slowdown in your system when the spyware loads, whereas if you have an older PC you will notice when something is running in the background as it will slow it down considerably. Bloatware will end up becoming tolerable when these types of CPUs start being put in desktop PCs. People will get used to it as much as most people tolerate spam in their email.

    --
    Previewing comments are for sissies!
    1. Re:More room for bloatware..... by Udderdude · · Score: 1

      Yes, people will really tolerate random popups and keyloggers that steal passwords/credit card information. What?

    2. Re:More room for bloatware..... by iampiti · · Score: 1

      That's funny since I use a PIII 800Mhz as my main computer.
      I know it's time to buy a new one and I intend to do that soon, but for "office" tasks (word processing, music playing, web surfing) it works just fine.
      My only problem are H264 videos, although videolan allows me to watch some 448x336 ones without problems.

  22. Narrow Minded by Deltronica · · Score: 4, Insightful

    Many comments on this post are centered around the processor's use as a personal computing solution. There is much more to computing than PCs! When viewed alongside specialized programming technology, bioinformatics, neurology, and psychology, this (rather large) leap in processing power brings AI to yet another level, and continues the law of accelerating returns. I'm not saying "oh wow now we can have human-like AI", I'm just saying that the ability to process 1.8 Tflops is nothing to scoff at. Personal computing is inane and almost moot when compared to the other applications that new processors may pave the way for. Know your facts, but use your imagination.

    1. Re:Narrow Minded by Anonymous Coward · · Score: 0

      Spot on. As well as climate simulations, think brain simulations. (Sorry no mod points.)

    2. Re:Narrow Minded by j2xs · · Score: 1

      Yes! Thank you! And here's some help with those apps on 80-core chips...

      http://www.pervasivedatarush.com/

      --
      Java To Excess
  23. You won't notice a performance difference... by Dekortage · · Score: 4, Funny

    They've already allocated 40 cores to the RIAA and MPAA for DRM processing, 30 cores to NSA/Homeland Security surveillance of all your computing activities, and 6 cores to combat spam and phishing. In the end, there is no net gain in performance over today's processors. Sorry.

    (tongue firmly planted in cheek)

    --
    $nice = $webHosting + $domainNames + $sslCerts
    1. Re:You won't notice a performance difference... by Sri+Ramkrishna · · Score: 1


      Sounds more like a Lord of the Rings parallel

      We need the One Core! Kudos for someone to write the poem from LoTR. :-)

      sri

  24. About time... by nadamucho · · Score: 5, Funny

    Looks like Intel finally put the "80" in 80x86.

    1. Re:About time... by grimJester · · Score: 1

      Meh. Sounds like an 8x10 to me. Lame.

  25. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  26. Nope, think about what the other IHV's are doing. by Anonymous Coward · · Score: 1, Interesting

    This clearly isn't for CPU's. It's for building GPU's and more importantly for Intel to get a part of the huge growing market demand for general purpose programming on GPU's. We'll have to call them something other than GPU's in 5-10 years as they'll do all sorts of other jobs too.

    IBM saw this coming and went with the Cell, AMD saw this coming and bought ATi, NVidia already has a card that has all these shader units. Intel would be stupid not to respond. They've already admitted a discrete GPU part is on the way (http://www.reghardware.co.uk/2007/01/23/intel_discrete_gpu_return).

    Only the other day there was a story (in either The Register or The Inquirer, and AFAIK it has now been deleted...) about their GPU part being a whole chunk of in-order x86 parts on a chip. Pieces of the jigsaw are slotting together. It makes programming GPGPU stuff easy for many. Intel want to move x86 architecture onto GPU's.

    Ah well, I wonder when we'll get that story confirmed. Intel are clearly up to something... I think we'll know what shortly. All in all, it spells trouble for NVidia, which is left out of the CPU part of the equation while Intel, AMD and in some respects IBM all have combos.

    Anon because I've signed way too many NDA's...

  27. to stuff with FLOPs! by Anonymous+Cowpat · · Score: 1

    I want something that will do 1.8 trillion integer operations per second (single threaded). This simulation is taking 5 hours per run with this A64 3200+. Give me 1.8 TIOPs and I'll be listening.

    --
    FGD 135
  28. The /. conundrum by Sebastopol · · Score: 1

    You can't say this is useless, and support nVidia or ATI's stream computing, they are the same thing.

    This is the future of CPUs: everyone is doing it, and with GFX manufacturers heading down this path, it proves to be a very interesting future.

    --
    https://www.accountkiller.com/removal-requested
    1. Re:The /. conundrum by default+luser · · Score: 1

      Sadly, this is the only possible future for CPUs. Massively parallel single cores with support for simultaneous multi-threading will replace complex cores with out-of-order execution; it's just a matter of time.

      Three reasons why we're reversing a 15-year trend toward more complex CPUs:

      1. Single-thread performance using current processes and clock speeds is "good enough" for most desktop applications, even when you take away all the out-of-order execution goodies.

      2. Programmers are beginning to understand SMT, and this is essential because it basically replaces out-of-order execution: if a thread blocks on I/O or dependencies, the processor just switches to another thread.

      3. Out-of-order processor design is an area of decreasing returns - it took the major CPU manufacturers TEN YEARS to move from a 3-issue to 4-issue design, just because of the complexity. As you add more issue pipes, the complexity gets unmanageable, and the chips take too long to develop and verify.

      Unfortunately, the move to simplistic cores with SMT puts the burden on the compiler and the programmer, but that's the cheapest solution for continued performance increases. If you can cut-and-paste a design a few dozen times, you save time verifying it and shipping it.

      The good thing is, as I mentioned before, single-threaded performance (of simple cores) is "good enough" for most desktop applications, so the OS can handle multi-tasking for the user. Really, only performance-hungry apps (like compilers, renderers, AI/scientific and codecs) need that extra power. Most apps can remain single-threaded and simple, and still perform quite well.

      --

      Man is the animal that laughs.
      And occasionally whores for Karma.

  29. not first; Connection Machine, Masspar by peter303 · · Score: 1

    Others have built large scale parallelism in the past such as Thinking Machines and Masspar. They were not fully general CPUs, i.e. floating point. Plus the companies could only develop new generations on a 3-5 year time scale, so the general purpose workstations and clusters almost caught up by then. Having a "major" back large scale parallelism may finally lift the curse.

    1. Re:not first; Connection Machine, Masspar by convolvatron · · Score: 1

      or maybe it's actually really hard, or arguably impossible, to use data-parallel machines for general purpose computing?

  30. Anyone notice? by treeves · · Score: 1

    One year later, and /. has updated their Intel logo to the new one?

    --
    ...the future crusty old bastards are already drinking the Kool-Aid.
  31. Comment removed by account_deleted · · Score: 1, Funny

    Comment removed based on user account deletion

  32. Re:WOW. How do you program it? by Splab · · Score: 1

    Easy, you use something like CSP where just about everything is a thread.

  33. Only on slashdot... by icydog · · Score: 1

    Only on slashdot will you find a post complaining about how bad of an idea an 80-core processor is. (On a side note, I'll finally be able to open PDFs in less time than it takes to go to the bathroom and back.)

  34. Re:WOW. How do you program it? by caffeinemessiah · · Score: 1
    There exists a moderately sized computing world outside of games. 80 cores, as you have pointed out, are clearly not directed towards gamers, or even personal computing at the moment. I would personally love one of these for my simulations, but I can use up absolutely any number of cores without too much trouble. If you want to extend it to games, it isn't very hard to imagine. As someone else mentioned, with a handful of cores you could probably do real-time ray tracing, which is naively parallel and can eat up any number of cores. You'd get photorealistic graphics, probably indistinguishable from real life. Throw in 40 cores there to run that big-screen plasma TV. Throw in a couple of cores for AI, which is much more than path-finding. You can have any number of learning algorithms that monitor every aspect of your gameplay and build better agents. In fact, throw in ..say..a dozen cores and you'd have enemies that might actually seem intelligent. A few more cores to track millions of particles and objects in the game.

    Remember that a lot of algorithms can be parallelized to use any number of cores (it gets inefficient after a point, but there definitely is an initial speedup).
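
    One way to put numbers on "it gets inefficient after a point" is Amdahl's law, which the comment does not name but which describes exactly this effect. The sketch below assumes a fixed fraction of the work parallelizes perfectly and the rest stays serial; the function names and the 95% figure are illustrative assumptions, not measurements of any real game or simulation.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup over one core when only parallel_fraction of the work scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # Serial part takes the same time; parallel part is divided among the cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def efficiency(parallel_fraction, cores):
    """Fraction of the ideal (linear) speedup actually achieved."""
    return amdahl_speedup(parallel_fraction, cores) / cores


if __name__ == "__main__":
    # Assumed workload: 95% of the run time parallelizes perfectly.
    for cores in (1, 2, 8, 80):
        s = amdahl_speedup(0.95, cores)
        e = efficiency(0.95, cores)
        print(f"{cores:3d} cores: speedup {s:5.2f}, efficiency {e:.0%}")
```

    Under that assumption 80 cores give roughly a 16x speedup rather than 80x: there is a clear initial gain, and then the serial 5% dominates. Embarrassingly parallel work like ray tracing sits near a parallel fraction of 1, which is why it keeps scaling.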

    --
    An old-timer with old-timey ideas.
  35. Dear Intel: by rbarreira · · Score: 1

    please use a power of two for the number of cores. Base 10 sucks.

    Sincerely, /. nerds

    --

    The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
  36. Re:WOW. How do you program it? by pizza_milkshake · · Score: 1

    with compilers/tools meant for programming it. before virtual memory programmers had to program for their machine's RAM size and manually manage their memory using "overlays" (or so i've read), but now this concept seems horrid to younger programmers. a generation from now, programmers will read about how computers used to only have one logical core and think it ludicrous.

    my uninformed, amateur guess is that functional languages will become more popular for programming massively multi-core machines (this coming from a C programmer). they will start to become faster than imperative languages because their workloads can be more easily recognized and farmed out to multiple cores.

  37. OpenMP by S3D · · Score: 1

    I'm not the best programmer in the world, but how the heck would you utilize 80 cores?

    OpenMP hides multithreading from the developer and makes parallelization completely transparent. A couple of OpenMP directives can parallelize a complex loop, with no effort from the developer at all. That is especially easy in physical simulation and AI. http://en.wikipedia.org/wiki/OpenMP
  38. Sorry by rbarreira · · Score: 2, Funny

    Sorry, I obviously meant "Base 1010 sucks"...

    --

    The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
  39. Re:WOW. How do you program it? by MajinBlayze · · Score: 1

    Take a game like starcraft:
    As mentioned earlier, ray tracing is embarrassingly parallel; each core can render a few hundred pixels, making real-time ray tracing possible at 30fps.

    AI: some strategy applications of AI are parallel: i.e. figuring out several possible paths at once; as the path branches, more cores can be used to determine the best possible approach.

    each unit can have an AI (probably more useful in FPS games)

    and finally: there is more to computing than starcraft. (sorry, Korea)

    --
    "Hate is baggage. Life's too short to be pissed off all the time." Danny Vinyard -American History X
  40. Wonder why this hasn't been mentioned yet ?? by ashren · · Score: 1

    anyone for Duke Nukem forever??

  41. Re:Nope, think about what the other IHV's are doin by ThosLives · · Score: 1

    The interesting question is, if you take a special-purpose processor (GPU) and turn it into a general-purpose processor, which was the wrong classification initially?

    --
    "There are a dozen opinions on a matter until you know the truth. Then there is only one." - CS Lewis (paraprhase)
  42. Re:WOW. How do you program it? by deviousalex · · Score: 1

    Who says each game would have to utilize 80 cores? These multi core processors make other things possible such as encoding videos and such in the background while playing a video game. Imagine with things like this everyone could easily run a game server while playing the game on the computer and having no slowdown. You could run a CS, UT, and a couple other game servers for friends all while playing one of these games!

  43. Is 1.8TFLOPS really that much though? by OfNoAccount · · Score: 0, Troll

    Xbox360 = ~1TFLOPS
    PS3 = ~2.18TFLOPS
    According to Wikipedia

    Also, why does the article compare to a BlueGene variant, when in supercomputer terms it's really competing against things like MDGRAPE-3 which are already in the PFLOP range?

    1. Re:Is 1.8TFLOPS really that much though? by RightSaidFred99 · · Score: 1

      1.8 _Real_ TFLOPS is a lot. The 360 and PS3 have 1/2.18 "fake" TFLOPS.

  44. 110C??! by Some_Llama · · Score: 1

    Did anyone else see that?

    "Even more impressive, this chip is able to achieve incredibly high clock speeds on modest power usage. Running on a 1.0v current at 110 degrees C the tile maximum frequency is 3.13 GHz while at 1.2v the tiles can run at 4.0 GHz."

    That would be about 230F; would Peltier coolers be mandatory?

    1. Re:110C??! by D3m0n0fTh3Fall · · Score: 1

      You're thinking of the temperature they're running at, not the amount of heat they're putting out. The maximum clockspeed the processor can manage with some stability is dictated by its temperature. Intel are saying that if you keep the chip at that temperature it will run happily at that frequency. I think the 5.something GHz power consumption was something like 250 Watts; for that you had better have either some very, very serious air cooling or else water/Peltier etc.

    2. Re:110C??! by jonjay · · Score: 1

      At 0.95 volts and 3.16 GHz - the clock speed that was indicated at the fall developer forum - the processor provides a data bandwidth of 1.62 Tb/s and a floating point performance of 1.01 TFlops, according to Intel. This will fit nicely into the Green 500, http://www.green500.org/Home.html, lineup, eventually.

    3. Re:110C??! by Some_Llama · · Score: 1

      "The maximum clockspeed the processor can manage with some stability is dictated by it's temperature."

      Yes, but they are saying at 110C (note Celsius) it runs at 3.13 GHz, so in order to run at that speed you would need liquid cooling or such, same for 5 GHz. I mean it's great the power consumption is relatively low, but that is still a scorcher.

      I mean I can run my Opteron 1.8 GHz at 3 GHz too if my thermal limit is 110C, but would I want to, or is it going to be stable without a shit load of cooling? Probably not.

    4. Re:110C??! by D3m0n0fTh3Fall · · Score: 1

      Cooling is not related to temperature, it is related to the amount of heat being created by the chip which is roughly equal to the amount of power being drawn by the chip.
      Heat != Temperature. Schools really need to start teaching people something.
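
      A rough sketch of that distinction, assuming the usual steady-state lumped model (die temperature = ambient + power x thermal resistance of the cooler). The 25C ambient is an assumption, 62 W is the figure from the story summary, 250 W is the guess made earlier in this sub-thread, and both function names are made up for illustration.

```python
def required_thermal_resistance(t_max_c, t_ambient_c, power_w):
    """Largest cooler thermal resistance (deg C per watt) that keeps the die at or below t_max_c."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return (t_max_c - t_ambient_c) / power_w


def die_temperature(t_ambient_c, power_w, r_theta_c_per_w):
    """Steady-state die temperature for a given cooler (lumped model)."""
    return t_ambient_c + power_w * r_theta_c_per_w


if __name__ == "__main__":
    ambient = 25.0  # assumed room temperature, deg C
    for watts in (62.0, 250.0):
        r = required_thermal_resistance(110.0, ambient, watts)
        print(f"{watts:5.0f} W at a 110C limit: cooler must be <= {r:.2f} C/W")
```

      In this model the same 110C limit asks for about 1.4 C/W at 62 W but about 0.34 C/W at 250 W. A high allowed temperature makes cooling easier, not harder; it is the wattage that decides how serious the cooler has to be.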

  45. Re:WOW. How do you program it? by Anonymous Coward · · Score: 0

    OS X automatically uses as many cores as it can find, splitting the job between them. They swapped out the dual Core2Duos on a Mac Pro for two four-core processors and it was able to see and use every core it had.

  46. Good luck on the compiler by cfulmer · · Score: 1

    As the article points out, this is a VLIW (Very Long Instruction Word) design -- in effect, each instruction word will be broken up into chunks, with a chunk going to each processor. This means that you can end up with some bizarre situations -- what happens, for example, if one processor needs to jump to one location in memory and the other 79 don't? Effectively, your compiler would need to be able to realize this, and have the instructions at that memory location for the 79 processors be the same. (In reality, I don't think you'd do this -- that processor would probably just sit and wait for the others.) This is not the equivalent of having two cores, with each able to run independently.

    The real bottleneck here is the compiler, not the processor, because the compiler has to be able to pick up on implicit parallelism in the code and dole it out among the available cores. While it's possible to help the compiler by using a language where the programmer specifies the parallelism, if you think about it, that's the opposite direction from the progress of computer languages in the last 20 years.

    The biggest problem that this technology has is that it is expensive when compared with a compute cluster, which can scale easily and can be more easily programmed. The main time the cluster won't do better are the instances where each core needs results from other cores so frequently that the overhead in message passing is too high.

    1. Re:Good luck on the compiler by Glasswire · · Score: 1

      The biggest problem that this technology has is that it is expensive when compared with a compute cluster, which can scale easily and can be more easily programmed. The main time the cluster won't do better are the instances where each core needs results from other cores so frequently that the overhead in message passing is too high.
      Surely you're joking? A single box constructed with this processor will be vastly less expensive than a compute cluster. Even modern quad-core DPs would still require 10 nodes to equal this. The cluster would be more versatile, but it certainly would not be cheaper than this processor.

    2. Re:Good luck on the compiler by smallfries · · Score: 1

      Tell me, what does 2+2 add up to on your world? VLIW is not usually used across cores, it tends to be used to exploit parallelism within a core. Assuming a 3 GHz clock and 1 TFlop throughput, we are averaging 333 operations per cycle. That's a little over 4 operations per core, per clock. Guess where the VLIW is going to be?
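
      For anyone who wants to check that arithmetic, here is a rough sketch. The 3 GHz clock, 1 TFlop throughput and 80 cores are the figures assumed in this thread, not measurements, and the function name is made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted in the comment.
# Assumed inputs (from the thread, not measured): 3 GHz, 1 TFlop, 80 cores.

def ops_per_core_per_cycle(flops: float, clock_hz: float, cores: int) -> float:
    """Average floating-point operations each core must complete per clock."""
    ops_per_cycle = flops / clock_hz  # whole-chip operations per cycle
    return ops_per_cycle / cores


if __name__ == "__main__":
    chip_wide = 1e12 / 3e9
    per_core = ops_per_core_per_cycle(1e12, 3e9, 80)
    print(f"{chip_wide:.0f} ops/cycle chip-wide, {per_core:.2f} per core")
```

      A few operations per core per clock is the kind of width a VLIW or multi-issue core handles internally, which is the point being made.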

      Given your other vague ballsup in understanding where the tradeoff lies between a tightly coupled array like this and a loosely coupled cluster - how is the second year of your degree?

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    3. Re:Good luck on the compiler by cfulmer · · Score: 1

      Yeah, so I screwed up a bit: Intel's new chip, which has a somewhat similar architecture (bunch of less capable units working in parallel), does not actually run off a VLIW. However, my core points still hold: (1) there is a class of applications which will run well on this sort of processor, but the majority won't, (2) effective use of the processor will either need re-coding of applications in a parallel-conscious language or a very smart compiler, and (3) for many applications, a cluster of general-purpose machines will be more cost-efficient.

      Incidentally, the sarcasm is not appreciated. I have a full degree followed by 13 years designing parallel and distributed systems. I admit that I confused Intel's research with stuff going on at IBM.

      In response to the sibling comment, the cost is in the design and manufacture of a new chip. The fixed cost to create a new processor is enormous, which means that you have to produce millions in order to drive the per-unit price down. If I have an application that runs about as well on an 80-core processor at $500,000 as it does on 80 $1000 machines, guess what I'm going to choose? Heck, even if it takes 200 such machines, I'm still better off. (Excluding power considerations.) People have been building massively parallel machines for years, but they have never achieved broad commercial success.
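
      To put numbers on that comparison, a tiny sketch. The $500,000 chip and $1000 machine prices are the hypothetical figures from this comment, not real pricing, and both function names are invented for the example. Power, space and networking costs are ignored, as in the comment.

```python
# Hypothetical hardware cost comparison using the prices quoted in the comment.

def cluster_cost(nodes: int, price_per_node: int) -> int:
    """Purchase cost of a cluster of identical commodity machines."""
    return nodes * price_per_node


def break_even_nodes(chip_price: int, price_per_node: int) -> int:
    """Number of commodity machines that cost as much as the single chip."""
    return chip_price // price_per_node


if __name__ == "__main__":
    for nodes in (80, 200):
        print(nodes, "machines cost", cluster_cost(nodes, 1000))
    print("break-even at", break_even_nodes(500_000, 1000), "machines")
```

      On those assumed prices the cluster stays cheaper until it needs around 500 machines to match the chip.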

    4. Re:Good luck on the compiler by smallfries · · Score: 1

      Your points are generally accepted wisdom in the parallel community. Each generation of hardware gives us a chance to argue over them again. There must be a parallel-processing equivalent to the graphic "wheel of invention" that describes this phenomenon.

      The dig about being a 2nd-year student was just intended to get a rise, I guess that it worked. ;^)

      A lot of people on this discussion are seeing this as a branch away from multi-core x86. I don't think that would be Intel or AMD's strategy. When the fab technology to put 80 cores on a single chip goes mainstream I think that we'll see a lot of mixed architectures. If I get a chip with 4 full x86 cores and 20-30 smaller vector cores then I don't need a graphics accelerator anymore. That will be the mass-market driver, and then the comparison is not some exotic $500,000 chip. It's actually a cluster of commodity vector arrays, with a tightly bound multi-core architecture for I/O and control logic. That is where it gets really interesting....

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
  47. Re:WOW. How do you program it? by radish · · Score: 1

    Thread-based programming really isn't that hard, particularly where you have a problem space which can be split up into discrete chunks of work. Example - a Photoshop blur filter. Just divide the image up into (overlapping) chunks and blur each piece on a different thread. Another example - digital audio. Put each VST instrument on its own thread. Once your apps are well threaded (and in many cases they already are) you can simply rely on the OS to schedule them over however many cores are available. For example, I write server code on my desktop box (single core w/hyperthreading) and it runs perfectly happily on the 64 core production servers, just faster.

    Of course this is simplifying things a bit, and it is hard to get the very best performance from any given environment, but you can make a big difference quite easily.
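
    A minimal sketch of the blur example, assuming a grayscale image stored as a list of rows and a plain 3x3 box blur (not what Photoshop actually does). The function names are made up. Each worker gets a band of rows and reads one extra row on either side, which is the "overlapping" part.

```python
from concurrent.futures import ThreadPoolExecutor


def blur_rows(src, row_start, row_end):
    """3x3 box blur for rows [row_start, row_end).

    Reads one neighbouring row above and below the band (the overlap),
    but only produces output for the band itself.
    """
    height, width = len(src), len(src[0])
    out = []
    for y in range(row_start, row_end):
        row = []
        for x in range(width):
            total = 0.0
            count = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        total += src[yy][xx]
                        count += 1
            row.append(total / count)
        out.append(row)
    return out


def parallel_blur(src, workers=4):
    """Split the image into row bands and blur each band on its own thread."""
    height = len(src)
    band = max(1, (height + workers - 1) // workers)
    ranges = [(start, min(start + band, height))
              for start in range(0, height, band)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The source image is only read, never written, so no locking is needed.
        bands = list(pool.map(lambda r: blur_rows(src, r[0], r[1]), ranges))
    return [row for b in bands for row in b]
```

    One caveat: in CPython the global interpreter lock keeps pure-Python threads like these from running in parallel, so this shows the decomposition rather than a real speedup. The same split works with processes or with a native blur routine that releases the lock.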

    --

    ---- One button is the power button, the other is the Bender voice button: "Bite My Shiny Metal Ass"

  48. So much detail by ProfessionalCookie · · Score: 1

    Now that's a summary I'm willing to read- Bravo editors!!!111

  49. Java/.NET by Anonymous Coward · · Score: 1, Funny

    Ya know, at those clock speeds/flops Java and .NET start to look attractive! ;-)

  50. Haha! by alfredw · · Score: 1

    I have one too... It does 2 Tflops on the same amount of power. As long as all of the opcodes are "NOOP".

    --
    In Soviet Russia, sig types you!
  51. Processor Scaling by GRyanNZ · · Score: 1

    I'm sure this has already been pointed out, but unless you're working with a well-documented, highly scalable and very specific problem like, say, Navier-Stokes simulations, you're not going to utilise many of those 80 cores. Running a single-threaded program will still only use one core in all likelihood. (But hey, Vista needs more cores to run all those DRM checks 30 times a second!) And most parallel programs don't scale well past 8 processors anyway, so until some new programming paradigm and better compilers are available, this is just another "Everest", a.k.a. "Because it was there and we could".
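
    One common way to reason about that scaling limit is Amdahl's law, which the comment doesn't name but which fits the argument. A quick sketch, where the 90% parallel fraction is purely an assumed figure for illustration:

```python
# Amdahl's law: speedup on n cores when only a fraction p of the work
# can run in parallel. The value p = 0.9 used below is an assumption.

def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup with no communication or scheduling overhead."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    for cores in (1, 2, 8, 80):
        print(cores, "cores:", round(amdahl_speedup(0.9, cores), 2))
```

    With that assumed fraction, 8 cores give roughly a 4.7x speedup and 80 cores still come in just under 9x, so the extra 72 cores buy very little.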

  52. Uh oh. by chihowa · · Score: 3, Funny

    Are we allowed to imagine a Beowulf cluster of chips that obviate the need for a Beowulf cluster?

    --
    If you want a vision of the future, imagine a youtube comments section scrolling - forever.
  53. Re:WOW. How do you program it? by Anonymous Coward · · Score: 0

    I'm not the best programmer in the world, but how the heck would you utilize 80 cores?


    With 80 threads.

    Would there be one core assigned to AI path-finding...


    Stop thinking sequentially. How about a thread FOR EACH AI that needs to find a path. What's the branch factor in that tree? How about a thread FOR EACH top-level branch FOR EACH thing that needs to find a path.

    Got 200 units in the game? How about a thread FOR EACH one's AI, a thread to update screen data FOR EACH one, a thread FOR EACH one to accept new commands.

    Gaming with autonomous AIs is embarrassingly parallel.
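
    A rough sketch of the per-unit idea, under some loud assumptions: the map is a small grid of 0 (free) and 1 (wall) cells, path-finding is a plain breadth-first search, and every name here is invented. It hands one task per unit to a thread pool rather than literally creating one thread per unit, which is a swapped-in but closely related approach.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def shortest_path_length(grid, start, goal):
    """Breadth-first search; returns the number of steps or None if unreachable.

    Assumes the start cell itself is free.
    """
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None


def plan_all(grid, unit_positions, goal, workers=8):
    """Run one independent path search per unit on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Searches share only the read-only grid, so they need no locking.
        return list(pool.map(
            lambda start: shortest_path_length(grid, start, goal),
            unit_positions))
```

    The searches never touch each other's state, which is what makes this kind of work easy to spread over many cores. In CPython the interpreter lock limits the real gain from threads, so treat this as the shape of the solution, not a benchmark.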
  54. Real size is a 3D thing ! by da5idnetlimit.com · · Score: 1

    I didn't (yet) read the Fine Article, but I did envision an 80 cores "Intel Hasty Style" :

    (Warning, this vision's description could definitely damage a Hardware Geek's brain, or cause him to get an erection, depending)

    80 centimeters of stacked Very Expensive Interconnected Xeons (tm) with a 2 cubic meter deep-frozen radiator enclosing them.

    --
    It takes 40+ muscles to frown, but only four to extend your arm and bitchslap the motherfucker
  55. TFlops by Anonymous Coward · · Score: 0

    Isn't this the same number of flops it takes before Microsoft gets a product right?

  56. Imagine .... by snoggeramus · · Score: 0

    Imagine a Beowolf cluster on a single chip! Oh, never mind ...

  57. looks like the old inmos transputer T800 ... by ooo00ooo · · Score: 1

    But a lot more up to date :-) I can wait to build a hypercube with those ...

    1. Re:looks like the old inmos transputer T800 ... by ooo00ooo · · Score: 1

      oooppppss I can'T wait .... :-P

    2. Re:looks like the old inmos transputer T800 ... by SemanticPhilosopher · · Score: 2, Interesting

      Or more like the T9s... So the 32-way crossbar switch, with 32 processors that I have working in the garage is coming back into fashion... Now all the work that we did on interconnect topologies and their performance in networks up to size 1024 nodes might be useful. Hey we might even make something from the book!.... Welcome back to the late '80s Intel - do yourselves a favour - read the literature - we've done the painful stuff already - you don't need to waste money on the fundamental research - it's been done!

    3. Re:looks like the old inmos transputer T800 ... by ooo00ooo · · Score: 1

      Yeah, since yesterday, I found my old Occam book :o) and my book on CSP (Communicating Sequential Processes) on top of occam. That was good stuff ;-) Hey if you ever want to get rid of those T9s I will be pleased to help you :-)

  58. What I'd like to see... by petrus4 · · Score: 2, Interesting

    ...is a version of the Sims 2 rewritten so that the Sims have a much greater degree of genuine autonomy, and for said version to be run without human intervention (and recorded) for a period of months or years on a multiple TFlop system. If the environment was made a lot more detailed than it is in the retail version of the game, and if the Sims were given somewhat more capacity for learning than what they've currently got, something tells me the results of such an experiment might be extremely interesting, given enough time.

  59. Great info, thank you! by Dr.+Spork · · Score: 1
    Well, if I hadn't posted in this thread I would have modded you informative.

    So did I understand correctly that POV-ray at this point doesn't support parallel processing? If that's so, it would be a shame and it must really limit its usefulness in big projects.

    It would be cool if, just as the routines got more sophisticated, they'd get a consumer-grade processor that could run them in real-time.

  60. Software for 80-core chips by j2xs · · Score: 1

    Well, it doesn't solve world hunger, but parallel processing on an 80-core chip?
    Yup...can eat this sucker for breakfast... well, when it can support a JVM that is :-)

    http://www.pervasivedatarush.com/

    --
    Java To Excess
  61. DataRush Framework by j2xs · · Score: 1

    Well, if you are building data-intensive apps (gigs and terabytes of data in batch processes), then you use http://www.pervasivedatarush.com/ and not OpenMP....or COBOL !!

    --
    Java To Excess
  62. Re:Good luck on the compiler HERE'S ONE by j2xs · · Score: 1

    Totally disagree with the notion that compilers need to somehow glean implied parallelism from the source code. If .NET and J2EE taught us anything, it is that developers can indeed consume relatively complex frameworks to achieve threaded/concurrent application design. In this case, .NET and J2EE take care of OLTP computing architectures.

    For data-intensive batch processing, we have http://www.pervasivedatarush.com/ and other frameworks. Yes, you have to be able to build self-contained components that can be assembled in a dataflow graph of sorts... but developers had to learn the same craft when EJBs were invented.
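
    To give a flavour of what "self-contained components in a dataflow graph" can mean, here is a generic sketch using only Python's standard library. It is not the DataRush API or any other framework's API; the names `stage` and `run_pipeline` are invented, and the graph is the simplest possible one, a straight line of stages joined by queues.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the data stream


def stage(fn, inbox, outbox):
    """Start a thread that applies fn to every item from inbox, in order."""
    def run():
        while True:
            item = inbox.get()
            if item is _DONE:
                outbox.put(_DONE)  # pass the end marker downstream
                return
            outbox.put(fn(item))

    worker = threading.Thread(target=run)
    worker.start()
    return worker


def run_pipeline(items, *fns):
    """Wire the given functions into a linear dataflow graph and run it."""
    queues = [queue.Queue() for _ in range(len(fns) + 1)]
    workers = [stage(fn, queues[i], queues[i + 1])
               for i, fn in enumerate(fns)]
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)

    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for worker in workers:
        worker.join()
    return results
```

    Each component only knows about its input and output queue, so the stages can run at the same time on different items. A real framework would add branching, partitioning and back-pressure on top of this basic shape.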

    Bring on the cores, Intel. Do you feel lucky? Do ya?

    --
    Java To Excess