Slashdot Mirror


Intel - Market Doesn't Need Eight Cores

PeterK writes "TG Daily has posted an interesting interview with Intel's top mobility executive David Perlmutter. While he sideswipes AMD very carefully ('I am not underestimating the competition, but..'), he shares some details about the successor of Core, which goes by the name 'Nehalem.' Especially interesting are his remarks about power consumption, which he believes will 'dramatically' decrease in the next years as well as the number of cores in processors: Two are enough for now, four will be mainstream in three years and eight is something the desktop market does not need." From the article: "Core scales and it will be scaling to the level we expect it to. That also applies to the upcoming generations - they all will come with the right scaling factors. But, of course, I would be lying if I said that it scales from here to eternity. In general, I believe that we will be able to do very well against what AMD will be able to do. I want everybody to go from a frequency world to a number-of-cores-world. But especially in the client space, we have to be very careful with overloading the market with a number of cores and see what is useful."

82 of 548 comments

  1. We've heard that before. by GundamFan · · Score: 5, Insightful

    I don't doubt an "8 core" desktop will exist in the near future. Then again he has a point... we won't likely need it.

    --
    I don't give a damn for a man that can only spell a word one way.
    Mark Twain
    1. Re:We've heard that before. by seanmb15 · · Score: 2, Insightful

      Isn't this the same thing they said about 64bit chips?

    2. Re:We've heard that before. by kfg · · Score: 5, Funny

      I don't "need" an R sticker and turbo sound synthesizer either, but they sure make my FIAT 500 go faster.

      The Little Mouse that Roars!

      KFG

    3. Re:We've heard that before. by tomstdenis · · Score: 4, Insightful

      If you're basing that on some logical sense of "need" I may remind you the average consumer doesn't need a quarter the computer they already have.

      Tom

      --
      Someday, I'll have a real sig.
    4. Re:We've heard that before. by mrxak · · Score: 5, Interesting

      I frequently run as many as 8 programs at a time, sometimes more, but I seriously doubt each program would know what to do with its own core. With my two-CPU set-up, I find RAM to be almost the biggest limiting factor (although with 2GB, I've never actually run out). There's really no need for 8 cores until my brain is able to take multitasking to the next level, doing many many complex tasks that would gain benefit from (essentially) unlimited CPU power for each program.

      They say the biggest bottleneck of any modern computer is its user...

    5. Re:We've heard that before. by rbgaynor · · Score: 5, Funny

      eight is something the desktop market does not need

      So is he the only person on the planet who has not tried the Vista beta?

      --
      "Good things don't end with eum, they end with mania or teria." - H. Simpson
    6. Re:We've heard that before. by mrxak · · Score: 2, Insightful

      At least 64-bit chips let us address a lot more RAM, and everybody knows that programs are gobbling up more and more RAM these days. Millions of cores aren't quite as useful, at least for the time being, for your typical home PC.

    7. Re:We've heard that before. by Neon+Spiral+Injector · · Score: 2, Interesting

      That board has the HyperTransport slots that allow a daughterboard to be connected. It can form the root of a 16-core system with 128 GB of RAM.

    8. Re:We've heard that before. by mrchaotica · · Score: 2, Informative

      No, because 64-bit doesn't have the same kind of diminishing returns increasing the number of cores does. We don't need eight cores, at least in the short-to-medium term, because it would require fundamentally rewriting all our software to be more parallel (unlike 64-bit support, which only requires fixing code that assumes 4-byte pointers).

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    9. Re:We've heard that before. by Mayhem178 · · Score: 5, Funny

      And that will almost let you run Oblivion at max settings.

      --

      "You will pay for your lack of vision..." - Emperor Palpatine to Ray Charles

    10. Re:We've heard that before. by hackstraw · · Score: 4, Insightful

      I don't doubt an "8 core" desktop will exist in the near future. Then again he has a point... we won't likely need it.

      My crystal ball is not always crystal clear, but I believe that 8+ cores will exist and are needed in the near future, at least for desktop systems.

      History here. I'm an HPC admin which translates into I run beowulf stuff where pretty much OTS computers are connected together to work as one big computer. I'm also a desktop computer user who is anal retentive about having realtime info regarding the status of my computer with respect to CPU utilization and whatnot.

      Now, in the many years of running desktop systems and anal-retentively monitoring them, I've noticed that CPU utilization is very often bursty. That is, it's common for the CPU to hover around zero and spike when doing something like rendering a webpage, printing, compiling code, etc. But most of the time (> 90%, or well more if you include when I sleep and stuff), the CPU is doing nothing.

      So, what is my point? Give me cores out the wazoo, and let them completely power down when not needed and crank up to all 8 or more when needed. This will greatly reduce power requirements and improve performance at the same time. Evidence of similar behavior in nature and in other technologies is plentiful. 1) Hybrid gas/electric cars. They use both for higher performance when needed, then back off and oscillate between the two when that's optimal for efficiency. 2) Animal tissue like muscles and nerves. Muscles are pretty much idle most of the time, and use only a few fibers for a light contraction, but all of the available fibers become active when exerting maximum effort. Nervous systems are similar, but different. 3) Human workloads. Some industries are not really constant, and even the seemingly constant ones have bursts, but think of seasonal things like retail, taxes, or vacation spots. These kinds of jobs bring in more human bodies to handle the peak loads and let them go when the peaks are over. It's nuts that in many places in the US, seasonal vacation spots are frequently staffed by people from halfway across the world!

      Now, is my 8+ core pipe dream going to happen tomorrow? No. But I believe this is where computing is going. Another thing that will have to change is that RAM should not be as random. In other words, memory, like CPU cores, should go dormant when not needed in order to conserve power as well, and of course there is the memory bandwidth issue as well.

    11. Re:We've heard that before. by AuMatar · · Score: 2, Interesting

      The cores will only help if the action being performed is parallel in nature. Rendering a webpage is not parallel, you have to parse the file serially. Printing is not parallel, the instructions need to arrive in order. Compiling is the only example of a parallelizable action (and there's serial bottlenecks there as well).
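
      To put a rough number on that serial bottleneck, here is a small sketch in C using Amdahl's law. The comment above doesn't name it, but it formalizes the same point. The function name `amdahl_speedup` is made up for illustration, and the fractions fed to it are guesses, not measurements of any real browser, printer driver, or compiler.

```c
/* Amdahl's law: upper bound on speedup when only part of a job can be
 * spread across cores. parallel_fraction is expected to be in [0, 1]. */
double amdahl_speedup(double parallel_fraction, unsigned cores)
{
    if (cores == 0)
        return 0.0;  /* no cores, nothing runs */
    double serial_fraction = 1.0 - parallel_fraction;
    return 1.0 / (serial_fraction + parallel_fraction / (double)cores);
}
```

      If half the work is parallelizable, `amdahl_speedup(0.5, 2)` comes out around 1.33 and `amdahl_speedup(0.5, 8)` around 1.78, so the last six cores buy very little. A fully serial job stays at 1.0 no matter how many cores you add.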

      --
      I still have more fans than freaks. WTF is wrong with you people?
    12. Re:We've heard that before. by indy_Muad'Dib · · Score: 2, Insightful

      they also don't need their $200 shoes, or their multi million dollar homes, or their $60k cars.

      their $500 clothes sets, their personal shoppers, their $100 hair cuts.

      but hey this is america, the land of capitalism.

      overindulgence is expected here.

      shame we are now the fattest, laziest, most uneducated country in the world but at least you have that 50" plasma TV right?

      that's all that matters, you have more "stuff" than somebody else.

    13. Re:We've heard that before. by 'nother+poster · · Score: 2, Insightful

      Well, as we all know, "He who dies with the most toys... Is still dead."

    14. Re:We've heard that before. by avronius · · Score: 4, Interesting

      See, here's where I have to disagree.

      Imagine an RPG that has hundreds of 'computer' competitors that are "developing" along the same lines as you and your character(s). Or perhaps an MMORPG with thousands of players, competing against hundreds of thousands of virtual characters that are developing along the same lines as your own and the MMORPG's characters. Say goodbye to random encounters with stale NPCs - and hello to enemies with unique names and playing styles - all due to the computer's ability to handle such incredible virtualization.

      Adding more RAM and a minor increase in speed wouldn't help in either of these scenarios. Bring on the cores, man, and don't stop at 8...

    15. Re:We've heard that before. by fmoliveira · · Score: 3, Insightful

      You can distribute different pages for printing, and different frames, divs or something similar for HTML rendering. Your browser could also be decompressing PNGs and JPEGs on other CPUs while one parses the HTML.

      Web browsing is still limited by the network anyway, so increasing CPU power to browse the web doesn't make any sense to me. At least not with my 400kbps DSL.

      But I at least would not want to increase cpu power for these trivial tasks. I would prefer that it happens when I do something heavier, like a game, or at least something that takes more than 1sec.

    16. Re:We've heard that before. by guaigean · · Score: 5, Insightful

      The home consumer market isn't exactly the goal for technology like this initially, and the price won't be in line with home consumers anyhow. This is the kind of stuff used in High Performance Computing, as a single computing node can maintain a large amount of CPU performance with no transfer between nodes. 2GB is nothing in the HPC world, and 8 cores get filled up fast. While it may be easy to assume "I can't fill 1 CPU, what would I do with 8?", you have to remember that there are people out there running huge simulations, which could very easily use up many thousands of CPUs.

      Utility is in the eye of the user.

      --
      Microsoft Sucks, F/OSS Rocks. I get mod points now right?
    17. Re:We've heard that before. by mrchaotica · · Score: 2, Insightful
      x86-64 in particular

      Well, AMD did come up with a way to make their 64-bit CPUs immediately useful: they increased the number of registers at the same time (but could only make them available in 64-bit mode, to avoid breaking stuff when running in 32-bit mode). Aside from that, 64-bit isn't intrinsically useful unless you want a virtual memory address space bigger than 4 gigabytes (which, at the moment, tends not to be true for casually-used PCs).

      In any case, i'm really hoping that these multi core consoles translate to more experience in multithreading programming moving to the PC side of things, whether it's games or something else.

      I'm betting on just that -- I'm a CS undergrad, and I took a parallel programming class specifically for that reason.

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    18. Re:We've heard that before. by soft_guy · · Score: 2, Funny

      I've been doing my part to help increase memory usage with the following handy function:

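      #include <stdlib.h>
      #include <time.h>

      /* Allocate more than was asked for; the waste grows every year past 2000. */
      void * allocateMemory(size_t bytesNeeded)
      {
          time_t myTime;
          time(&myTime);
          struct tm * myTm = localtime(&myTime);
          /* tm_year counts years since 1900, so 100 means the year 2000. */
          unsigned int ramWastingFactor = myTm->tm_year > 100 ? (unsigned int)(myTm->tm_year - 100) : 1;

          return malloc(bytesNeeded * ramWastingFactor);
      }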

      --
      Avoid Missing Ball for High Score
    19. Re:We've heard that before. by 'nother+poster · · Score: 2, Insightful

      Actually I could benefit from 4 or 8 cores right now. On my desktop at work I currently have 4 browsers, 5 X windows, a mail client, and PCanywhere. For large portions of the work day I am at 100% CPU util. I could use those cores.

      In the future if they have some decent logic to handle context switching and thread migration then shifting to lots of small parallel operations could make computing even better as far as I'm concerned. Parallel operations on multiple cores could really benefit some types of desktop apps. Others simply wouldn't benefit because of the simple logical linear progression required by their nature.

    20. Re:We've heard that before. by tashanna · · Score: 2, Funny

      I'm not so sure. I think there's more demand than you suspect
      CPU 1: User
      CPU 2: Windows Vista (Swap baby swap)
      CPU 3: Outlook Anti-spam filter
      CPU 4: Norton Anti-virus scanner
      CPU 5: Web-security system
      CPU 6: Sony "DRM Enabling" root-kit

      Now, if you had said the average Linux user...
      ***Ducking And Covering***

      - Tash
      Yippie... Hybrids!

    21. Re:We've heard that before. by samkass · · Score: 5, Insightful

      Isn't this the same thing they said about 64bit chips?

      Good point... yes, Intel said this about 64bit chips, and they were right. Almost nobody needs 64bit chips. But now virtually all chips are 64bits, wasting a lot of die real estate and engineering effort because of the perceived benefits driven more by AMD's marketing than reality. It's quite possible 8 cores could end up in the same boat-- AMD pushing it for no valid technological reason and Intel being forced to follow suit.

      --
      E pluribus unum
    22. Re:We've heard that before. by synx · · Score: 5, Interesting

      Interestingly enough, once you recompile for 64 bits you also need more memory, just as you gain access to more memory. Your memory alignment is now 8 bytes, not 4 bytes, and your pointers are much larger.
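
      A quick way to see the effect is to work out the size of a pointer-heavy struct by hand. The sketch below is only an illustration under stated assumptions: a 4-byte int, pointers aligned to their own size (true of the common 32-bit and 64-bit ABIs, but not guaranteed by the C standard), and a made-up doubly linked list node. The helper names `align_up` and `node_size` are hypothetical.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align (align must be > 0). */
static size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}

/* Size of a hypothetical node laid out as
 *   struct node { int value; struct node *next; struct node *prev; };
 * assuming a 4-byte int and pointers aligned to their own size. */
size_t node_size(size_t pointer_bytes)
{
    size_t int_bytes = 4;
    size_t offset = int_bytes;                 /* value */
    offset = align_up(offset, pointer_bytes);  /* padding before next */
    offset += pointer_bytes;                   /* next */
    offset += pointer_bytes;                   /* prev */
    size_t max_align = pointer_bytes > int_bytes ? pointer_bytes : int_bytes;
    return align_up(offset, max_align);        /* tail padding */
}
```

      Under those assumptions `node_size(4)` is 12 bytes and `node_size(8)` is 24 bytes: the pointers double, and the int picks up 4 bytes of padding, so the same list costs twice the RAM after the recompile.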

      I'd like to take a moment to rail against most commonly accepted forms of parallel education. I'm sure you were taught about threads, critical sections, semaphores, shared memory, etc.

      These are all inherently dangerous and difficult to program concepts. Write some application that is flexible and can run with N threads - usually this is hard, the best solution from Java-land is the concurrency toolkit which defines units of work which can be parallelized by a thread pool.

      However, there _is_ another way: "CSP", communicating sequential processes. This is a method of writing naturally parallel systems that do not have the disadvantages of all of the above. (Standard concurrency debugging suggestion in Java: "make the method synchronized".) A practical example of this is the programming language Erlang. Ericsson invented this language to write high performance telco gear. Their ATM switch line is written in it. In Erlang you have many, many 'processes' (not traditional OS processes, but defined in the VM) which cannot share memory - the only way they can communicate is via async messages. You can build a synchronous call on top of async messages pretty trivially (after all, all synchronous network protocols are based on IP, which is asynchronous). You never have to worry about memory stomps or critical sections. You _do_ have to design your applications differently, but it is most definitely worth it.
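
      For anyone who hasn't seen the style, here is a minimal single-threaded sketch in C of the share-nothing idea. It is not Erlang, and it says nothing about how the Erlang VM works internally. The names (`mailbox`, `counter_process`, `MSG_ADD`, `MSG_GET`) are hypothetical, and the "scheduler" is just the caller invoking `counter_run` by hand.

```c
#include <stddef.h>

#define MAILBOX_CAPACITY 16

/* A message is plain data that gets copied; it never points into the
 * sender's state. */
struct message { int kind; int value; };

struct mailbox {
    struct message slots[MAILBOX_CAPACITY];
    size_t head;   /* index of the oldest queued message */
    size_t count;  /* number of queued messages */
};

/* Asynchronous send: copy the message in and return immediately.
 * Returns 0 on success, -1 if the mailbox is full. */
int mailbox_send(struct mailbox *mb, struct message msg)
{
    if (mb->count == MAILBOX_CAPACITY)
        return -1;
    mb->slots[(mb->head + mb->count) % MAILBOX_CAPACITY] = msg;
    mb->count++;
    return 0;
}

/* Receive the oldest message. Returns 0 on success, -1 if empty. */
int mailbox_receive(struct mailbox *mb, struct message *out)
{
    if (mb->count == 0)
        return -1;
    *out = mb->slots[mb->head];
    mb->head = (mb->head + 1) % MAILBOX_CAPACITY;
    mb->count--;
    return 0;
}

enum { MSG_ADD = 1, MSG_GET = 2 };

/* A "process": private state plus an inbox. Only counter_run touches total. */
struct counter_process {
    struct mailbox inbox;
    int total;
};

/* Drain the inbox. MSG_ADD updates the private total; MSG_GET answers by
 * sending the current total to the reply mailbox. */
void counter_run(struct counter_process *p, struct mailbox *reply_to)
{
    struct message msg;
    while (mailbox_receive(&p->inbox, &msg) == 0) {
        if (msg.kind == MSG_ADD) {
            p->total += msg.value;
        } else if (msg.kind == MSG_GET) {
            struct message reply = { MSG_GET, p->total };
            mailbox_send(reply_to, reply);
        }
    }
}
```

      Sending two `MSG_ADD` messages and a `MSG_GET`, running the process, and then reading the reply mailbox is the synchronous-call-on-top-of-async-messages pattern in miniature. Since the only way in or out is a copied message, there is no shared variable to lock.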

      Another interesting thing about this is your applications naturally parallelize. The "R11" release was just put out, which included SMP support. The previous versions would only use 1 CPU, but this version will use all your CPUs, which means if you have multiple processes ready to run, they'll run on as many CPUs you have! Instant SMP support, no redesign, no RECOMPILE necessary.

      This kind of language technology is what is necessary to get us to the next level. A similar thing is possible with Functional languages such as OCaml, Haskell, etc.

      I've been working in the industry for 5 years and I'm currently working on an Erlang project. My company was fairly conservative in terms of languages; there was a standing order (until about 2000) of "no C++".

    23. Re:We've heard that before. by pseudorand · · Score: 2, Funny

      > Another thing that will have to change is that RAM should not be as random.

      I agree. All these years we've been suffering with RANDOM access memory, a crutch of an antiquated technology whose time is over. Considering that computers do a whole bunch of searching, and a binary search is so much faster than a sequential search, and that you can only do a binary search on sorted data, if we could just get SORTED access memory instead of the end-of-its-useful-life RANDOM crap, computers would be much faster.

    24. Re:We've heard that before. by 2nd+Post! · · Score: 3, Interesting

      Uh, all your examples are only serial in implementation, not serial in nature.

      A webpage, for example, need not be parsed serially, though the performance of current systems is high enough that you gain nothing by attempting to parallelize the renderer. A printer, however, can trivially be designed to be parallel, especially if you have unusually high DPI. Think of a printer rendering to paper in the same way that a graphics card renders to a framebuffer. If you can use multiple pipelines, GPUs, and cards to accelerate video display, why wouldn't the same be possible for printing? The neat thing about printers and printed data is that there is no dependence: the image in the upper right exists independently of the image in the lower right, and so on. In theory you could have a core assigned to every PIXEL printed on a page, and a corresponding printer with a printhead for each core, and you would be able to print an entire page in a single CPU cycle. Technically.

      So there are plenty of other things that could be executed on multiple cores:

      Decoding video (playback)
      Encoding video (storage, rendering, chat)
      AI for games (imagine simulating a multitasking AI on multiple cores)
      Physics for games (uncoupled events can be processed independently and coupled events require access to the same data)

      Yes, everything has a serial bottleneck, such as data access, but once properly set up most things can also be set up to be multicore as well. Saving a file, for example, can be multicore if you imagine the write as happening all at once, rather than serially, with each core assigned to a write head, each write head then operating independently... Etc.

    25. Re:We've heard that before. by steveo777 · · Score: 3, Funny

      I, for one, plan on living forever... so far, so good.

      --
      This sig isn't original enough, it's time to come up with something witty...
    26. Re:We've heard that before. by LordKronos · · Score: 2, Insightful

      There's really no need for 8 cores until my brain is able to take multitasking to the next level

      Video processing
      photo processing
      Multitrack digital audio recording with multiple real time DSP effects

      And that's just what I thought about in 10 seconds. Not to mention what video games could do with all that processing power.

    27. Re:We've heard that before. by timeOday · · Score: 2, Interesting
      We don't need eight cores, at least in the short-to-medium term, because it would require fundamentally rewriting all our software to be more parallel
      I think that's somewhat of an exaggeration. Not all software has to be rewritten, just software where 1) speed is a driving concern and 2) it isn't already multithreaded. In other words, normal office and web software doesn't need rewriting because it runs fine on 1 core. Raytracers and video effects software don't need rewriting because they're already multithreaded. That leaves things like games, where we would want multithreaded collision detection, physics engines, etc.
    28. Re:We've heard that before. by synx · · Score: 2, Informative

      Interesting class. Looks pretty standard for a US/Canada education.

      No, in fact I mean CSP - see http://www.usingcsp.com/

      Have a look at yaws: http://yaws.hyber.org/ a high performance webserver written in Erlang.

    29. Re:We've heard that before. by bomanbot · · Score: 2, Insightful

      Well, I RTFA and to me it looks like he is talking about the near future (emphasis mine):

      But especially in the client space, we have to be very careful with overloading the market with a number of cores and see what is useful. I believe '2' is a good number. '4' will be an interesting number for the high-end. Will we see eight cores in the client in the next two years? If someone chooses to do that, engineering-wise that is possible. But I doubt this is something the market needs.

      and

      I think that it will be two or three years until you are going to see four cores entering the mainstream.

      So according to him, for the next few years anything more than four cores will not be mainstream. Sounds pretty reasonable to me.

    30. Re:We've heard that before. by Paradox · · Score: 2
      they also don't need their $200 shoes, or their multi million dollar homes, or their $60k cars. their $500 clothes sets, their personal shoppers, their $100 hair cuts. but hey this is america, the land of capitalism.
      If economies were only based on need, then they wouldn't be economies. They'd be the mechanism of a welfare state. America rewards those who have money, and in general you get money by doing something successfully, or being the relative of a successful person. This is capitalism. Got a problem with it? Check out how the alternatives fare.
      overindulgence is expected here.

      They should be wearing sackcloth! They are guilty of being successful! Damn those Hollywood actors for receiving a small portion on the proceeds from movies that most of america goes to see, then using them to hire someone to do their errands so that they don't get molested by creepy fans. Damn them to hell!

      shame we are now the fattest, laziest, most uneducated country in the world but at least you have that 50" plasma TV right?
      "Fattest"? Maybe. "Laziest?" Definitely not. Hang out in Europe for awhile. I am positive you can find a lazier country.


      But when you say, "most uneducated" I must take issue. Do you live in some sort of fantasy world? Do you realize that the US literacy rate is one of the top 10 highest in the world? That even people without high school educations know how to do simple math? America has education problems, but they are on a completely different scale from other parts of the world. It's grossly insulting to countries that are struggling with genuine basic education problems when you say crap like this.


      If America is such an uneducated country, why do people come from around the world to study technical disciplines here? Most masters programs are full of foreigners on a student visa. They aren't doing it for the life-experience, they're doing it because America has some of the best higher education the world has to offer.

      that's all that matters, you have more "stuff" than somebody else.
      Maybe for you. I dunno about you, but I look at where American culture has gone as a result of the proliferation of computing power and I see good things. Even poor school districts have computers that let children browse wikipedia (if they ever wanted to). We have realtime news and entertainment. Relatives can communicate with their families on a much more personal level even if they are spread across the country.

      Americans now communicate on a broader scale, and are aware of world events in a much more educated fashion than even 25 years ago. Hell, my cousin knows every capital in Europe, because he talks to people online from other countries and considers it important. When was the last time you met an American who cared about geography?

      --
      Slashdot. It's Not For Common Sense
    31. Re:We've heard that before. by dgatwood · · Score: 2, Interesting

      Audio processing is the classic example of trivially massively parallel processing on the desktop. Let's say you have 40 tracks with four or more effects slots, plus a few effects on a bus, plus a few effects on the master output. Each one of those is a separate computation unit.

      That's not to say that there aren't dependencies, of course. Within each channel chain, each plug-in has a dependency on data from the previous plug-in. Each of the bus mixers (including the master) has a dependency on having received output from all of its inputs. The effects that follow have a dependency on the bus mixer.

      The point is that with a small amount of communication, it is possible to run each of these plug-ins on a separate thread and process it as a separate execution unit, and start processing the next batch far enough ahead to generally keep the pipeline completely full on a very large number of CPU cores. The I/O is still relatively high performance compared to the amount of processing being done (particularly if you include things like convolution plug-ins). In a well-written audio app, you are limited primarily by CPU speed. Of course, some folks offload this to specialized DSP hardware, but for those of us who aren't made of money, it's nice to be able to get a really beefy stock desktop computer that can handle it.
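
      The dependency structure described above can be sketched as a small graph: each plug-in or mixer is a unit, and a unit can only run once its inputs are done. The C below is a hedged illustration, not how any particular audio app schedules its work. The names `unit`, `MAX_INPUTS`, and `assign_stages` are made up, and it assumes units are listed with their inputs first.

```c
#include <stddef.h>

#define MAX_INPUTS 8

/* One processing unit (a plug-in or a mixer) and the units it waits for. */
struct unit {
    size_t input_count;
    size_t inputs[MAX_INPUTS];  /* indices of earlier units */
};

/* Units must be ordered so every input index is smaller than the unit's
 * own index. Writes each unit's stage into stage[] (0 means no
 * dependencies) and returns the number of stages, which is the length of
 * the longest dependency chain. */
size_t assign_stages(const struct unit *units, size_t n, size_t *stage)
{
    size_t stages = 0;
    for (size_t i = 0; i < n; i++) {
        stage[i] = 0;
        for (size_t k = 0; k < units[i].input_count; k++) {
            size_t dep = units[i].inputs[k];
            if (stage[dep] + 1 > stage[i])
                stage[i] = stage[dep] + 1;
        }
        if (stage[i] + 1 > stages)
            stages = stage[i] + 1;
    }
    return stages;
}
```

      Units that land in the same stage have no dependencies on each other, so within one buffer they could each go to their own core. With two tracks of two plug-ins each feeding a master mixer, you get three stages: both first plug-ins, then both second plug-ins, then the mixer. Starting the next buffer early, as described above, is what fills in the cores that would otherwise idle during the later stages.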

      I do my audio work on a quad G5. It will do for now, but I'm guessing I'll want more than four cores within a low single digit number of years....

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    32. Re:We've heard that before. by IamTheRealMike · · Score: 3, Insightful

      Almost no desktop programs actually use 4 gigabytes of RAM. Not even allowing for rapid expansion will we reach that bottleneck anytime soon.

      The Intel guys were right. What are the uses of 64 bit systems? They are removing a bottleneck that very few were hitting. The AMD64 instruction set fixes (more registers etc) are nice but not worth the hassle of losing binary compatibility. Result? Hardly anybody uses a pure64 system. Only enthusiasts.

    33. Re:We've heard that before. by vadim_t · · Score: 4, Interesting

      Actually, I've been discussing this with a friend recently.

      Take NWN for instance. How about making a game where things are REALLY happening? So far most worlds are extremely static. MMORPGs are static in that nothing ever changes, you kill the Lord of Evil and he's back on his dark throne 5 minutes later. And in most RPGs things just stay there and wait for you to appear (say, you never miss a battle in progress, as they just stay there until you appear nearby so that you can conveniently join the battle).

      For example, in NWN it's very clear that there are multiple factions living in the area. How about having kobolds, gnolls, wolves, etc. move around on their own, gather food, kill each other, reproduce, try to invade, etc.? Wouldn't it be neat if you could defeat the gnolls, then wander off for whatever reason, and when you return find that the kobolds took over the gnoll cave, increased their population, and Tymofarrar got out of the cave and set fire to the town?

      Of course, make it too realistic and it gets a bit weird... imagine having to kill kobold children and walking on gnolls having sex.

    34. Re:We've heard that before. by drinkypoo · · Score: 2, Insightful
      do you think 8 cores sharing a memory pool are really going to help you out

      If those 8 cores are from AMD, then they'll be utilizing a NUMA architecture, and provided the OS "does the right thing" then no, you won't be waiting for memory, at least not any more than you are now, and probably less so.

      If those 8 cores are from intel, they'd better have improved their bullshit bus, or no, it won't help.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    35. Re:We've heard that before. by metaforth · · Score: 2, Funny

      Nice. Except Moore's law will kill you unless you grow the factor exponentially. Here's what I recommend:

      unsigned int ramWastingFactor = myTm->tm_year > 100 ? (int)pow(2,(myTm->tm_year - 100.0)) : 1;

      --
      Clown Car Discounts

    36. Re:We've heard that before. by JamesTRexx · · Score: 3, Insightful

      Although I also have plenty of programs running, I could really use them for all the virtual machines I have running. Having each one run on its own cpu would speed things up considerably.

      --
      home
    37. Re:We've heard that before. by ArbitraryConstant · · Score: 2, Interesting
      I frequently run as many as 8 programs at a time, sometimes more, but I seriously doubt each program would know what to do with its own core.

      Indeed. Even with multiple applications, there is rarely more than 1 CPU-bound thread on a desktop system.

      Not that I doubt the ability of the industry to produce applications that warrant increased numbers of cores, but this will take time and in the current landscape anything more than 2 goes to waste for most people. There are some applications that need it already, and those people will no doubt be happy about increasing the core count, but I think this is going to be one of those "build it and they will come" things for the most part.
      --
      I rarely criticize things I don't care about.
    38. Re:We've heard that before. by 2nd+Post! · · Score: 3, Interesting

      The problem we are running into is that webpages were designed to be parsed serially; if they were designed to be parsed in parallel, then they would be woefully inefficient being parsed serially, which until now has been the norm. The same with your printer example.

      So imagine a situation where a webpage was DESIGNED to be parsed in parallel. The page hierarchy would be formatted into independent chunks that could be assigned to different threads and cores without first preparsing it. It would be like having an index built into the webpage such that different elements on the page could automatically, without additional effort, be spun off to different cores. A navigation bar, a banner, the main content, a link-box, and a footer, for example, could all be defined in a webpage such that as soon as the renderer saw that there are five elements on the page, each element is spun off to a different core to be handled.

      The same with a printer; if the printing language were designed up front to be parallel, rather than serial, you could see speedups in rendering, though such gains are probably negligible. An image, such as an embedded JPEG in a document, would be split into four, for four cores, and then rendered into the appropriate printing language, which might come in handy when 10 megapixel pictures become common. Now imagine a printer with four print heads. You could conceivably send four streams of data at once, which again could be fed by four cores (or a single core of course, if it pre-computed the data needed to be sent to the printer).
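
      Splitting an image for four cores mostly comes down to deciding which rows each worker gets. Here is a minimal sketch in C that only does that bookkeeping; it doesn't start any threads or talk to a printer. The names `row_range` and `strip_for_worker` are hypothetical, and horizontal strips are just one assumed way to cut the image.

```c
#include <stddef.h>

struct row_range { size_t first; size_t count; };

/* Rows [first, first + count) handled by worker `index` out of `workers`
 * when an image of `rows` rows is cut into near-equal horizontal strips.
 * The first (rows % workers) strips get one extra row. An out-of-range
 * index or zero workers yields an empty range. */
struct row_range strip_for_worker(size_t rows, size_t workers, size_t index)
{
    struct row_range r = { 0, 0 };
    if (workers == 0 || index >= workers)
        return r;
    size_t base = rows / workers;
    size_t extra = rows % workers;
    r.count = base + (index < extra ? 1 : 0);
    r.first = index * base + (index < extra ? index : extra);
    return r;
}
```

      For a 10-row image and four workers this gives strips of 3, 3, 2, and 2 rows starting at rows 0, 3, 6, and 8. Because no strip overlaps another, each core can render its piece without coordinating with the rest.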

      Decoding video: Uh, take a look at HD... that's pretty hardcore :) Now imagine decoding HD video chat on the fly; unlike HD DVD, video encoded on the fly will not have the best encoding/compression/compute values, but rather average ones. Now imagine multi-chat, in which four people are talking, and four HD video streams are being decoded at once.

      Encoding video: Imagine now encoding an HD video chat on the fly :) My parents, for example, talking to their grand-daughter in HD, is clearly superior to seeing a 640x480 image, which is again clearly superior to 320x240. Multiple cores would allow for a much nicer, cleaner, 30fps 1024x768 video stream, especially if background tasks are occurring.

      AI: I think you misunderstand. One AI enemy which formulates, simultaneously, five DIFFERENT responses from the same data structures... in other words, an AI of split mind. In the same way I can imagine writing four different responses to you, but only acting on one of them, an AI with multiple responses, but only a single action, becomes much richer, more unpredictable, and unbelievably more complex.

      Physics: Physics is really a generic superset of graphics. Graphics is merely how light interacts with the data structures. Throw in gravity, sound, friction, and mass, and you have physics. For the same reason that graphics can use multiple cores, physics can too. Imagine if the 3d sound effects were split among two CPUs, just like frames are. Sound can be trivially represented as frames, much like graphics. Imagine the same with gravity being calculated by two CPUs every other frame, or friction, etc. You can calculate, for example, the spray pattern of a shotgun in time; the trajectory is known, the number of pellets is known, and the environment is known. Right now we approximate the intersection of a shotgun blast with a player or a structure, but with additional compute resources you can actually trace each pellet individually!

      The same with falling rocks, a flooding room, etc.
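
      To make the per-pellet idea concrete, here is a rough 2D sketch under heavy assumptions: the shooter sits at the origin facing a target at a known distance, pellets fly in straight lines, and the spread is evenly spaced rather than random. All function names and numbers are invented for illustration; a real engine would trace against actual geometry.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def pellet_angles(count, spread_deg):
    """Evenly spaced pellet directions across the cone (assumes count >= 2)."""
    half = math.radians(spread_deg) / 2
    step = 2 * half / (count - 1)
    return [-half + i * step for i in range(count)]


def trace_pellet(angle, distance, target_lo, target_hi):
    """Trace one pellet in a straight line and test it against the target."""
    y = distance * math.tan(angle)
    return target_lo <= y <= target_hi


def count_hits(count, spread_deg, distance, target_lo, target_hi, workers=4):
    angles = pellet_angles(count, spread_deg)
    # Every pellet is independent of the others, so each trace can be
    # handed to whichever worker is free.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = pool.map(
            lambda a: trace_pellet(a, distance, target_lo, target_hi), angles
        )
        return sum(hits)


if __name__ == "__main__":
    # 9 pellets, 20 degree cone, target one unit wide.
    print(count_hits(9, 20.0, 10.0, -0.5, 0.5))
    print(count_hits(9, 20.0, 20.0, -0.5, 0.5))
```

      Doubling the distance in the second call shows why the approximation matters: fewer individual pellets land on the same target.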

      Most problems ARE parallelizable, I think; the only real question is approaching the problem from the outset with multiple cores in mind.

    39. Re:We've heard that before. by Squalish · · Score: 2, Insightful

      Highend game engines are already at a point where 3-4gb can be a major improvement on 2gb. Battlefield 2 was one of the first that showed a difference, and Oblivion's outdoor scenes are really begging for at least 2gb.

      As with every other technological yardstick of computers, entertainment is driving the platform, not "desktop programs," for which the technology of a decade ago was adequate.

      --
      People in Soviet Russia, however, appear to be afflicted with amusing juxtapositions of the aforementioned situation
    40. Re:We've heard that before. by GaryOlson · · Score: 3, Insightful
      ...only enthusiasts ?!

      Obviously, you have never tried to simulate or graph propagation of an organic virus with a 4 million node set using Matlab x64 on a desktop system.

      We would be pleased to take your enthusiast money and 128 of your gaming buddies' money and build a Linux computational cluster to solve a problem that will likely save your life or the life of someone you know.

      --
      Every mans' island needs an ocean; choose your ocean carefully.
    41. Re:We've heard that before. by master_p · · Score: 2, Interesting

      What happened that your company allowed C++ after 2000 anyway?

  2. The desktop market is the largest market. by khasim · · Score: 4, Insightful

    If you put 8 core procs in desktop machines, software will be written that will take advantage of them. Which means you'll sell more 8 core procs.

    Are you going to lead or follow?

    1. Re:The desktop market is the largest market. by Henry+V+.009 · · Score: 2, Informative

      If they double the speed of my CPU, I can take advantage of that just by not trying as hard and letting my code bloat.

      If they double the number of cores, I can only take advantage of that if I have a problem that can be parallelized and then if I work very very hard to multi-thread my project.

    2. Re:The desktop market is the largest market. by Lonewolf666 · · Score: 2, Insightful

      First, the hardest part is going from 1 to 2 cores. For that, you have to figure out the principle of how to split the workload. Going from 2 cores to n cores will usually be easier. And since dual cores are already becoming mainstream, professional programmers will be forced to take the step from 1 to 2 cores anyway.

      Second, the makers of multimedia applications already go ahead with multithreading, because it really works for that type of application. This will drive the market for more cores. In the long run, I expect the mainstream market to settle at the number of cores that works best for multimedia applications and games.
      I think this will be at least four cores (in that I agree with David Perlmutter) but it may be more, depending on the progress of computer science in parallelization of the above applications. Personally, I would not be surprised to see 16 core CPUs in mainstream computers someday.

      --
      C - the footgun of programming languages
  3. Translation by lisaparratt · · Score: 4, Insightful

    "Our multiprocessor technology doesn't scale, but we don't want to scare investors away, so we'll pretend it doesn't matter."

  4. Question. by Max_Abernethy · · Score: 3, Insightful

    Does having multiple cores do anything about the memory bottleneck? Does this make the machine balance better or worse?

    1. Re:Question. by Aadain2001 · · Score: 5, Informative
      It depends on which memory bottleneck you are talking about. There is a memory hierarchy in computers, with the fastest also being the closest to the processor, the level 1 or L1 cache (usually split into separate data and instruction caches). This is then tied into a much larger, but slower, L2 cache (combined instruction and data lines). Some processors use an L3 cache, but not many these days. Current processors have L1 and L2 directly on the chip. If you see those die pictures they show off to the press, the largest areas of the chip are the caches. Finally, the chip can go across the front side bus and access the main system memory, which is very large compared to the L2 and L1 caches, but much slower in terms of number of cycles to access.

      So which bottleneck are you referring to? The new Core 2 Duo chips of Intel's share the L2 cache and, as far as I can tell from the reviews I have read, this setup works very well. Both cores can share data very quickly, or when executing a single sequential program one of the cores can use all of the L2 cache (which in the Extreme Edition version is up to 4MB!). Or are you referring to the main memory? It is possible for both cores to need to access the main memory at the same time, but modern pre-fetching and aggressive speculation techniques reduce how often that occurs and the timing penalties when it does occur. And of course, the larger the L2 cache the more memory can be stored on the chip at once, reducing the need to access the main memory very often. According to Intel's own internal testing, they had a very hard time using all of the bandwidth the current front side bus and memory offers, which means the main memory shouldn't be a bottleneck.

      So what is the bottleneck you are referring to?

      --
      Space for rent, inquire within
    2. Re:Question. by NovaX · · Score: 3, Insightful

      Worse.

      For Intel, they are currently using a shared bus approach. It makes sense for a lot of reasons (mainly by being very cost effective), and they are developing a point-to-point bus for the near future. In a shared-bus system, each CPU uses the same bus to retrieve data. This means that they lock the bus, make their calls, finish, and unlock. The total bandwidth available is split between all parties, so if there are multiple active members (e.g. CPUs) then their effective bandwidth is split N ways. The only solution to this is to have multiple shared busses, which is expensive.

      A point-to-point bus gives each member their own bus to memory. Thus, there is NxBW effective bandwidth available. As memory cells are independent, the memory system can feed multiple calls. You'll only run into issues if multiple CPUs are accessing the same memory, but models for that have been around for a long time. There might be a slightly higher latency, but not by much.
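
      The two cases above work out like this as back-of-the-envelope arithmetic. The 8 GB/s figure is a made-up number for illustration, not a spec for any real bus, and the model ignores arbitration overhead and contention on the same memory.

```python
def shared_bus_per_cpu(bus_bw_gbs, active_cpus):
    """Shared bus: one pipe, split N ways between the active CPUs."""
    return bus_bw_gbs / active_cpus


def point_to_point_total(link_bw_gbs, cpus):
    """Point-to-point: every CPU has its own link, so the total is N x BW."""
    return link_bw_gbs * cpus


if __name__ == "__main__":
    BW = 8.0  # hypothetical bandwidth in GB/s
    for n in (1, 2, 4, 8):
        print(
            f"{n} CPUs: shared bus gives {shared_bus_per_cpu(BW, n):.1f} GB/s each, "
            f"point-to-point gives {point_to_point_total(BW, n):.1f} GB/s total"
        )
```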

      With multiple cores, you may get the benefit of shared caches which could remove a memory hit.

      Overall, I would assume a multi-core system would scale fairly similarly to a multi-processor system.

      --

      "Open Source?" - Press any key to continue
    3. Re:Question. by powerlord · · Score: 2

      I'm curious about this (in part at least), because of the expectations on the new Apple Desktop. Currently the 'high end' desktop is the PowerMac G5 Quad (Two dual-core 2.5GHz PowerPC G5 processors).

      The speculation is that Apple will want to keep that Quad line by putting in two Core 2 Duos. Therefore 4 cores for the desktop may be announced as early as two weeks from now (and one would assume that once it's released for Apple, it will filter into other Intel shops like Dell). There was also the speculation about AMD's new 4x4 systems. With 4 chips, each dual core, that would put 8 cores on the desktop (hence probably Intel's wish to minimize the fact that they aren't about to announce that).

      I am just wondering at what point each system will start to hit the limitations of their CPU/Main Memory bandwidth. :) (and of course if AMD can keep up the pressure on Intel ... and vice-versa)

      --
      This space for rent. All reasonable inquiries will be entertained at proprietors discretion.
    4. Re:Question. by larien · · Score: 2, Interesting
      Well, if you believe Sun's marketing, it's great for throughput. In the new Niagara chips (in the T1000/2000 servers), each core has 4 compute threads. As thread 1 waits for RAM, thread 2 kicks in, repeat until we get back to thread 1, which now has its data from memory and gets a chance to do some work, before passing on to thread 2, etc.

      However, these chips are designed for throughput of multiple threads; for a desktop, single threaded app, you will still have the same memory bottlenecks we have now.
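
      A toy model of why that helps throughput and not a single thread. It assumes an idealized round-robin where every thread alternates a fixed burst of compute with a fixed memory stall; the cycle counts are invented, not Niagara's real numbers.

```python
def core_utilization(threads, compute_cycles, stall_cycles):
    """Fraction of cycles the core does useful work.

    Idealized model: each thread computes for compute_cycles, then waits
    stall_cycles on memory; other threads fill the gap while it waits.
    """
    busy = threads * compute_cycles
    period = compute_cycles + stall_cycles
    return min(1.0, busy / period)


if __name__ == "__main__":
    # Hypothetical workload: 10 cycles of work per 30-cycle memory stall.
    for t in (1, 2, 4):
        print(f"{t} thread(s): core busy {core_utilization(t, 10, 30):.0%}")
```

      With one thread the core still idles through every stall, which is the desktop single-threaded case: the extra hardware threads hide latency only if there is other work to run.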

    5. Re:Question. by NovaX · · Score: 3, Insightful

      One thing to remember: Sun has a lot more expertise on memory busses than Intel does. The UltraSparc chips have never been great performers, but are wonderful at scaling in multiprocessor systems. Intel has never put too much effort into their bus system, because the economics favor cheaper solutions. Their shared bus approach reduces costs for a mass market, but they even use it for ultra high-end systems like Itanium. Those systems really need a better system bus, but simply use a tweaked version of the standard Xeon one. I believe Intel is targeting 2007 for the release of their new bus architecture.

      --

      "Open Source?" - Press any key to continue
  5. well, by joe+155 · · Score: 3, Insightful

    I don't want to insult the person, but saying that 8 is something that will not be needed seems very short-sighted. People were saying only a few years ago "1GB is too big for a hard-drive"... Never underestimate the increasing need for power in computers, even for home users.

    --
    *''I can't believe it's not a hyperlink.''
    1. Re:well, by man_of_mr_e · · Score: 5, Insightful

      I think he was talking about the foreseeable future.

      1 core is really enough for most users. 2 cores is enough for most power users. 4 cores will be enough for all but the most demanding jobs. Workstations are different, however and are not usually considered part of the "desktop". For example, I could see 3D artists using 4 or 8 cores easily. In fact, there's simply no such thing as a computer that's "too fast" for certain purposes.

      The issue, though, is one of moderation. Why would a desktop user want 8 cores, which are drawing insane amounts of power, when they're not even utilizing 4 to full advantage? Word processing, accounting, and surfing the web don't need any of this. Games? I can imagine in 10+ years we'll have some photo-realistic 3D games that run in real-time, but the vast majority of the work will likely be handled by GPU's and won't need 8 cores to deal with it.

      I simply cannot fathom a purpose for 8 cores for any "desktop" application that isn't in the "workstation" class.

    2. Re:well, by Waffle+Iron · · Score: 2, Insightful
      People were saying only a few years ago "1GB is too big for a hard-drive"

      But this time the new hardware would be dependent on a major overhaul of the software industry. Any programmer can write code to fill up a 1GB hard drive, but effectively using 8 cores usually requires talented programmers who have mastered multithreaded programming. This is a small fraction of the software developer population, so apps that can take advantage of an 8-core CPU will probably be few and far between for a good long while. (Not to mention, not every computing task can even be parallelized in the first place.)

      Changes in software architecture seem to have a huge amount of inertia. It took almost a decade to transition from 16-bit to 32-bit desktops even though it was *easier* to program a 32-bit app than a 16-bit one. Who knows how long it would take to get most apps taking advantage of large numbers of cores when the coding will be much harder than most developers are used to?

    3. Re:well, by koreth · · Score: 4, Insightful
      effectively using 8 cores usually requires talented programmers who have mastered multithreaded programming.

      But ineffectively using 8 cores can be done by any dumbass with a C# compiler or a book on the pthreads library. Which is why we actually will need 8 cores.

  6. 640K cores ought to be enough for anybody... by Jerf · · Score: 2, Informative

    Of course it's a bit of a chicken and egg problem right now, isn't it? If more software used multiple cores, then we'd have a greater need for more cores. Or you could start programming in Erlang and sort of automatically use those cores.

    On the other hand, to be fair, the scaling issues start getting odd. I'd expect that we're going to have to move from a multi-core to a multi-"computer" model, where each set of, say, 4 cores works the way it does now, but each set of 4 gets its own memory and any other relevant pieces. (You can still share the video and audio, though at least initially there will presumably be a privileged core set that gets set as the owner.)

    Still, as my post title says, this does strike me as rather a 640KB-style pronouncement. (The original quote may be apocryphal, but the sentiment it describes has always been with us.)

    1. Re:640K cores ought to be enough for anybody... by mrchaotica · · Score: 3, Informative
      a multi-"computer" model, where each set of, say, 4 cores works the way it does now, but each set of 4 gets its own memory and any other relevant pieces.

      That's called NUMA.

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

  7. Silly Perlmutter by ackthpt · · Score: 5, Funny

    If the home user can justify (even indirectly due to demands of the operating system or changes in software architecture) 4 cores then 8 is eminently logical. Seems some minds at Intel are falling back to the dubious position they held regarding home users never needing 64 bit CPUs. Then again, maybe they're just playing dumb and are slaving away, burning midnight oil by the drum, to make 8 and 16 core processors.

    Three Cores for the Clippy, but I don't know why,
    Seven for the Vista kernel which is defect prone,
    Nine for Bloat which will make the cooling fry,
    One for the Screensaver to toil alone,
    In the Land of Redmond where Marketing lies.
    One Core to rule them all, One Core to find them,
    One Core to bring them all and in the darkness bind them
    In the Land of Redmond where Marketing lies.

    --

    A feeling of having made the same mistake before: Deja Foobar
    1. Re:Silly Perlmutter by WuphonsReach · · Score: 2, Insightful

      Not necessarily. A dual-core system is more expensive, per-core-GHz, than a single-core system. That is, $300 might buy you a 2.0GHz dual-core CPU or a 3.0GHz single-core CPU (apples-and-apples GHz here, so AMD and not Intel).

      $154 - AMD Athlon 64 X2 3800+ Dual Core 2GHz
      $86 - AMD Athlon 64 3200+ 2GHz

      Looks pretty close to a wash in my book.

      $327 AMD Opteron 165 Dual Core 1.8GHz
      $170 AMD Opteron 144 - Box 1.8GHz

      Not much difference here on $ per-core-GHz either.
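
      For anyone who wants to check the math, here is the $ per-core-GHz arithmetic for the four prices quoted above. The prices, core counts, and clocks come straight from the list; the metric itself (price divided by cores times GHz) is just one simple way to compare, and the short names are mine.

```python
# (name, price in USD, cores, clock in GHz), taken from the prices quoted above
CPUS = [
    ("Athlon 64 X2 3800+", 154, 2, 2.0),
    ("Athlon 64 3200+", 86, 1, 2.0),
    ("Opteron 165", 327, 2, 1.8),
    ("Opteron 144", 170, 1, 1.8),
]


def dollars_per_core_ghz(price, cores, ghz):
    """Price divided by the total clock across all cores."""
    return price / (cores * ghz)


if __name__ == "__main__":
    for name, price, cores, ghz in CPUS:
        value = dollars_per_core_ghz(price, cores, ghz)
        print(f"{name:20s} ${value:.2f} per core-GHz")
```

      By this measure the dual-cores actually come out slightly cheaper per core-GHz than their single-core siblings.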

      Your statement might have been true last week, prior to the AMD price cuts. But things are a lot nicer now (and the low-end dual-cores are almost an automatic choice). $68 for the 2nd core makes a lot of sense, even for a low-end CPU because it will add a few years of usability onto the lifespan of the machine. Or at least the machine will feel snappier for a few years longer than the single-core.

      And the primary reason that AMD 64bit CPUs get so much goodwill? Unlike the Itanic, AMD came up with a 64bit design that provides for the future while still providing excellent performance for 32bit applications. So why not buy a 64bit chip even if you're still running 32bit? There's no performance hit and if the landscape changes and we all need to move to 64bit, you're already there.

      Pretty much a no-risk decision as a result. You're not betting on 32bit or 64bit, you're simply prepared for either.

      --
      Wolde you bothe eate your cake, and have your cake?
  8. Translation by growse · · Score: 2, Insightful

    Intel saying "The market doesn't need 8 cores" = Intel saying "We can't really engineer 8 cores right now, we've hit some trouble". Of course the market would like 8 cores. Markets are greedy for new stuff, that's how you keep on making money. Intel's covering their ass for not putting 8 cores on their roadmap anytime soon.

    --
    There is nothing interesting going on at my blog
  9. Neither four nor eight. by MarkByers · · Score: 5, Funny

    I think there is a world market for maybe five cores.

    --
    I'll probably be modded down for this...
  10. Classic mistake by ajs · · Score: 4, Insightful

    He's right. Current desktops don't need 8 cores. However, as four cores become widely available, desktops will begin to change. They will become more threaded, and more processing that would have been avoided previously will begin to happen passively. Constantly streaming video in multiple thumbnail size icons on taskbars, stronger and more pervasive encryption on everything that enters or leaves the machine, smarter background filtering on multiple RSS sources, MUCH beefier JIT on virtual machines, on-the-fly JIT for dynamic languages, more complex client-side rendering of Web content (SVG, etc), these will all start to become more practical for constant use. Other things that we haven't even thought of because they're impractical now will also spring up. By the time 8-core systems are available, the market will already be over-taxing 4-core systems.

  11. Do all cores have to be smart? by spyrochaete · · Score: 5, Interesting

    I recently read about a 1024-core chip for small devices like cell phones. Each core ran on a simplified instruction set and specialized in a certain task like muting the microphone when incoming sounds are too quiet, smoothing text on the low resolution screen, and other minute tasks. Individual cores could be placed in low power sleep mode until the software dictated a need for that instruction set.

    Is it possible to couple CISC and RISC cores on one die? Is this how the math coprocessors of the 386 era worked? This sounds like an ideal solution to me since nobody needs 4 or 8 cores to be fully powered and ready to pounce at all times.

    1. Re:Do all cores have to be smart? by Kjella · · Score: 3, Informative

      Is it possible to couple CISC and RISC cores on one die? Is this how the math coprocessors of the 386 era worked?

      It's essentially how all modern processors are. I think the old coprocessors were the last that weren't on the same die (except the fake "coprocessors" that actually took over and completely ignored the old CPU, which were more like a CPU upgrade in drag). Modern processors have a CISC instruction set which gets translated to a ton of micro-ops (RISC) internally, and with parallel execution you in essence have multiple cores on one die - they're just not exposed to the user.

      The limitation compared to a cell phone, which has an extremely fixed feature set, is trying to find workable dedicated circuits that are meaningful for a general purpose computer. That's essentially what the SSE[1-4] instruction sets are, along with dedicated encryption chips (on a few VIA boards, plus the new TCPA chips), dedicated video decoding circuitry (mostly found on GPUs) and maybe a few more. But on the whole, we've not found very many tasks that are of that nature.

      In addition, there are many drawbacks. New formats keep popping up, and your old circuitry becomes meaningless, or CPU technology speeds on and makes it redundant. The newest CPUs can only just barely decode 1080p H.264/VC-1 content, but I expect that to be the hardest task any average desktop computer will face. What more is there a market for? I don't think too much.

      --
      Live today, because you never know what tomorrow brings
  12. People Will Always "Need" More by ausoleil · · Score: 4, Insightful

    "Need" is subjective.

    Once upon a time, Bill Gates said we would never "need" more than 640K.

    Once upon a time, mainframes only had 32K of RAM -- and that was a vast amount more than their predecessors.

    The '286 came out and was primarily aimed at the server and workstation market. "No one will ever need all of that power."

    Thing is, people always "need" more speed, more RAM and more storage. And they'll pay for it too, so Intel may "need" to sell 8X cores.

    1. Re:People Will Always "Need" More by not5150 · · Score: 2, Informative

      "Once upon a time, Bill Gates said we would never "need" more than 640K."

      Why the hell do people still bring this up? Gates never said this.

      Do a Google search for Gates and 640K and be enlightened. Wired did an article about this bogus attribution and Wikipedia has an entry about it under Bill Gates.

  13. 6 Coors enough by CrazyJim1 · · Score: 5, Funny

    The 6 pack has been tried and true, why try and stuff an additional 2 Coors into it.

  14. Re:Translation by ssista537 · · Score: 4, Insightful

    Seems like people don't RTFA. Let me quote: "Will we see eight cores in the client in the next two years? If someone chooses to do that, engineering-wise that is possible. But I doubt this is something the market needs." He is talking about the next two years, not ever. We just have an abundance of dual core machines in the market now and the apps to take advantage of it. Tell me how different the software we had two years ago is from today's. Given that, there is no way the desktop market needs 8 cores two years from now. Geez, we have so many fanbois and script kiddies here with absolutely no knowledge of the industry, it is sickening.

  15. Comparisons to 640K misguided ... by AHumbleOpinion · · Score: 3, Insightful

    What quite a few other posters are failing to understand is that he is referring to diminishing returns. 1 to 2 gives you some fractional improvement, 2 to 4 gives you a smaller fractional improvement, 4 to 8 gives you an even smaller fractional improvement, etc. At some point the cost, size, heat, noise (for the cooling), etc. is not worth the fractional improvement. For most users that will probably be dual or quad.

    For those extremely rare apps and jobs that are highly parallelizable, 8 and above will be useful. However, this will be very rare, and this is why the comparisons to the infamous 640K quote are misguided. Increasing RAM is easy; software naturally consumes RAM with no additional work necessary, just do more of what you are already doing. Multiprocessing is something completely different: the code must be designed and written quite differently, and it is often very difficult to retrofit existing code for multiprocessing. On top of that you have the practical problem that not all problems are parallelizable.
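
    The usual way to put numbers on those diminishing returns is Amdahl's law. A small sketch follows; the 80% parallel fraction is purely an assumption picked for illustration, and real apps will differ.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup when only part of the work can be spread out."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


if __name__ == "__main__":
    P = 0.8  # assumed: 80% of the work parallelizes, 20% stays serial
    previous = amdahl_speedup(P, 1)
    for cores in (2, 4, 8):
        current = amdahl_speedup(P, cores)
        gain = current / previous - 1.0
        print(
            f"{cores} cores: {current:.2f}x overall, "
            f"{gain:.0%} better than the previous step"
        )
        previous = current
```

    Under that assumption each doubling buys less than the one before it, and no number of cores gets past 5x, because the serial 20% never shrinks.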

    Strangely enough, I think one case where 8 cores could be useful in a home environment would be a bit retro. A multiuser/centralized system. One PC with the computational power for the entire family, dumb terminals for individual users, connections to appliances for movies, music, etc. Such a machine might go into the basement, garage, closet, or other location where noise is not an issue. Of course, I'm not sure such a centralized machine would be cost effective.

  16. Do they not need it? Really? by Aeomer · · Score: 2, Informative

    Transputers used up to sixteen cores twenty years ago and had plenty of scientific applications. The physical arrangement of the inter-core buses was important depending on the application. Scientists of all types (except maybe Christian Scientist) use desktop machines for experimental work. Expect to see 'dynamic bus reconfiguration' in the next gen of multi-core processors and motherboards. One config for your word-pro, another for your server app, another for your First Person Shooter. Can I patent that idea??? Oh well too late. But it is interesting to see Intel and AMD reinventing the wheel. ;-)

  17. Same mistake by LParks · · Score: 2

    "I want everybody to go from a frequency world to a number-of-cores-world."

    That's the same thing they tried against AMD before. Higher frequency is a bigger number so it will sell better, right? Now, more cores is a bigger number so it will sell better, right?

    If Intel wants to not repeat the same mistake and let AMD gain ground on them, they should go to a "performance world." Actual performance, in many different environments.

  18. I smell a fundamental software change coming... by PotatoHead · · Score: 2, Interesting

    While an 8 core desktop is gonna be overkill for a lot of people, it still leaves us with a nasty problem.

    Peak CPU speed.

    For now we have topped out on this, meaning our existing software is either gonna have to get more efficient, or it's going to have to change, unless we want to just deal with the level of performance and features we currently have.

    (like that's gonna ever happen --how else would the closed corps sell upgrades then?)

    Additionally, some application areas do not have enough CPU power to fully realize their potential. MCAD is one of these, by way of example. Take the fastest CPUs we have today and they are still not fast enough to fully render a solid model without wasting the operator's time. Current software offerings are all working toward smarter data, creative ways to manage the number of in-memory and in-computation models, better kernel solves, etc...

    But it's just not enough for the larger projects to work in the way they could be working.

    Most of the MCAD stuff currently is built in a linear way. That's largely because of the parametric system used by almost all software producers today. With a few changes to how we do MCAD, I could see many cores becoming very important for larger datasets.

    Peak CPU and RAM are the two primary bottlenecks that constrain how engineering CAD software develops and what features it can evolve for its users. It's not the only example either.

    The bitch is that most of the software we have is more than adequate for most of the people. For those that lie outside the norm, dependence on this software (both development and just use value need), constrains their ability to make use of multi-core CPU capabilities...

    Messy.

    Will be interesting to see how this all goes. Will the established players evolve multi-core transitional software that can bridge the gap, or will new players arise, doing things differently to take advantage of the next tech wave?

    IMHO, there is a strong case for Intel doing the "If we build it, they will come" thing. For the higher demand computing needs, there really isn't any other way to improve, but through very aggressive code optimization.

  19. I'll be the first to say it... by Junior+J.+Junior+III · · Score: 5, Funny

    640 cores ought to be enough for anybody.

    --
    You see? You see? Your stupid minds! Stupid! Stupid!
    1. Re:I'll be the first to say it... by ManuelKelly · · Score: 2, Funny

      I guess nobody with mod points caught that one.

      Here is another.

      I estimate the global market for cores at about 4.

  20. Parallelization and cache coherency by QuantumFTL · · Score: 2, Interesting

    The custom rendering software I work on at Maas Digital (used for things like the IMAX Mars Rover film) is very cache sensitive. I've been mulling this over recently, because in computer graphics, memory is almost always the bottleneck, and it's led me to conclude we really need some different languages, or at least language constructs.

    Pixar's Photorealistic Renderman (perhaps one of the greatest pieces of software ever written, from an engineering point of view) is very odd in that its shading language, while interpreted, is actually much faster at accomplishing its goals than other compliant renderers which compile down to the machine level. I believe this is because of memory bottlenecks, and despite the fact that computer graphics is an "embarrassingly parallel" problem, eight cores are likely to aggravate this much more than they are to help.

    What I think is needed is a more functional-programming approach to a lot of these problems, where the mathematics of an operation is more purely expressed, leaving things like ordering/scheduling up to the compiler/runtime environment. Runtime compiled languages, like Java, can sometimes outperform even the best hand-optimized C due to the fact that the runtime compiler can optimize to the cache size and specific chip family.

    Also, this type of language would benefit multi-core processing because it would help expose the most possible parallelization opportunities, and let the compiler (perhaps even through trial and error) determine exactly when and how much parallel code to create.

    Currently all of my parallel supercomputing code uses Fortran and the Message Passing Interface, but it's clear that this approach leads to code that is often very hard to debug and is very programmer-intensive. Hopefully the future of programming languages will help ease us into general purpose computing on highly parallel architectures like Cell.

  21. Yes and no : depends on the brand by DrYak · · Score: 5, Informative

    Not quite exactly. Things depend on the brand.

    For Intel that's exactly the case :
    With the current Intel architecture, memory is interfaced with the NorthBridge.
    With multicore and multiproc systems, all chips communicate with the NorthBridge and get their memory access from there.
    So more cores and processors means the same pipe must be shared by more, and therefore memory bandwidth per core is lower.
    Intel must modify their motherboard design. They must invent a QUAD-channel memory bus, they must push newer and faster memory types (that's what happened with DDR-II! They needed the faster data rates, even if those come at the cost of latency), etc...

    But the more they pursue this direction, the more latency they add to the system, which in the end will put them in a dead end. (Somewhat like how the deeper pipeline of their quest for Gigahertz put them in the dead end of the burning-hot and power-hungry P4.)

    For AMD that's not quite the same :
    With the architecture that AMD started with the AMD64 series, memory is directly interfaced with a memory controller that is on-die with the chip.
    The multiple procs and the rest of the motherboard communicate using standardized HyperTransport.
    The rest of the motherboard doesn't even know what's happening up there with the memory.
    And with the advent of HyperTransport plugs (HTX) the motherboard doesn't even really need to know it.
    Riser cards with both memory and CPU on them (à la Slot 1) are possible (and highly anticipated, because they'll make it possible to plug in a much wider range of specialized accelerators than currently with the AM2 socket).

    The most widely publicised advantage of this structure is the lower latency.
    But it also makes it easier to scale up memory bandwidth: just add another on-board memory controller and voilà, you have dual-channel. That was the difference between the first generations of entry-level AMD64 (Athlon 64 for 7## socket: one controller, single channel; Athlon FX for 9## socket: 2 controllers, dual channel).
    By the time 8-core processors come out, and if CPU riser boards with a standard HTX connector appear, nothing will prevent AMD from just building a riser board designed for 8-core chips with 4 memory controllers (and quad-channel speed). Just change the riser board and memory speed will scale. The motherboard doesn't need to be redesigned. In fact, the same motherboard could be kept.
    And this won't come at the price of latency or whatever: the memory controller is ON the CPU die, and doesn't have to be shared with anything.
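
    A back-of-the-envelope sketch of that scaling argument, in Python. The per-channel figure and the "one controller per two cores" ratio are made-up assumptions for illustration, not measured numbers for any real chipset. The function names are hypothetical too.

```python
# Hypothetical numbers, for illustration only; not measured figures.
CHANNEL_GBPS = 6.4  # assumed bandwidth of one memory channel, in GB/s


def shared_bus_per_core(cores, channels=2):
    """All cores share one northbridge with a fixed number of channels."""
    return channels * CHANNEL_GBPS / cores


def on_die_per_core(cores, cores_per_controller=2):
    """One single-channel on-die controller is added per group of cores."""
    controllers = max(1, cores // cores_per_controller)
    return controllers * CHANNEL_GBPS / cores


# Print per-core bandwidth for both layouts as the core count grows.
for n in (2, 4, 8):
    print(n, round(shared_bus_per_core(n), 2), round(on_die_per_core(n), 2))
```

    Under these assumptions the shared pipe gives each core less as cores are added, while the per-core share stays flat when controllers are added along with the cores. Latency is ignored entirely here.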

    In fact, that's partially already happening:
    In multiprocessor systems, instead of all procs sharing the same pipe through the NorthBridge, each chip has its own controller going at full speed.
    And this memory can be shared over the HT bus (albeit with some latency).
    It's basically 4 memory controllers (2 per proc) working together. Achieving something quad-channel-like shouldn't be that difficult.
    Especially when Intel is pushing the memory standard towards chips with higher latency: asking for more bandwidth in parallel over the HT bus won't be that much of a penalty.

    So I think AMD will be faster than Intel at developing solutions that scale to higher numbers of cores, thanks to a better architecture.

    Maybe it's no coincidence that AMD is working on technology to "bind together" cores and present them as a single proc to software that isn't sufficiently SMP-optimized, while at the same time Intel is telling whoever wants to listen that 4 cores is enough and 8 is too much. (Yeah, sure, just tell that to the database and Sun Niagara people. Or even to older BeOS users. This just sounds like "640k is enough for everyone".)

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
  22. Past 4 cores, you want NUMA anyway by iabervon · · Score: 2, Insightful

    Once you have 8 cores, it becomes advantageous to have memory which is faster for each group of 4. At 8, you're on the edge where the advantage exists, but isn't sufficient to justify the additional architectural complexity. For 16 and up, it's much better to have 4-processor nodes each with its own memory (and slower access to memory on other nodes). It's unlikely that improvements in chip technology will change this. It's also not something about desktop computers; existing large machines use 4-processor nodes.
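
    To put rough numbers on the local-versus-remote trade-off, here is a minimal Python sketch. The two latency constants are invented placeholders, and the even-spread assumption in `uniform_local_fraction` is the worst case for a NUMA-unaware program, not a claim about any real machine.

```python
# Hypothetical latencies in nanoseconds; assumptions, not measurements.
LOCAL_NS = 60.0
REMOTE_NS = 120.0


def average_latency(local_fraction):
    """Expected access time when a given fraction of accesses hit local memory."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * LOCAL_NS + (1.0 - local_fraction) * REMOTE_NS


def uniform_local_fraction(nodes):
    """Fraction of accesses that are local if data is spread evenly over nodes."""
    return 1.0 / nodes
```

    For example, `average_latency(uniform_local_fraction(4))` models four nodes with no locality at all, while `average_latency(1.0)` models a workload that stays entirely on its own node. The gap between the two is what the extra architectural complexity buys.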

    So he's right; before it makes sense to have more than 4 cores on a chip, you'll want multiple chips of 4 cores each with separate memory busses, and then system RAM on the processor chip (at which point the architecture is significantly different, because the system is asking the processor for memory values, rather than the opposite), and only then does it become efficient again to put more cores on the chip, as you can have a multiple-node chip.

  23. Re:The point is... by timeOday · · Score: 3, Insightful
    Those running huge simulations and using far more than 2GB of RAM are not doing so on a desktop.
    That's obviously because a desktop can't do the job. I run cluster jobs, and I assure you I'd prefer to run them on my laptop, if only I could put 100 cores in there.
  24. People It's JUST Marketing*Speak by Jherek+Carnelian · · Score: 2, Interesting

    All this discussion over some BS from a marketingdroid? Are you really all such suckers?

    Let me translate from marketing-speak to plain English for y'all:

    droid: two are enough for now, four will be mainstream in three years and eight is something the desktop market does not need.
    translation: We have a two core product available now, we will have a 4 core product available in three years but we don't yet have a plan for an eight core product.

  25. Re:Main memory of course. by Aadain2001 · · Score: 3, Insightful
    I'll give you that the data sets programs are using today are getting gigantic, which can easily lead to constant memory block swapping between the main memory and the caches. But when it comes to instruction caches, you obviously haven't heard of the 90/10 Locality rule of thumb: a program executes about 90% of its instructions in 10% of its code. That's because of branches, loops, the fact that there are large sections of code that are run only once, during initialization, and never run again, etc. So while the Java run time engine is larger than the L2 cache in all but the most expensive workstation processors, the majority of the instructions that are executed come from only a small subset of the actual code, which can fit easily in typical L2 caches.
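
    A toy Python illustration of the 90/10 rule. The region sizes and execution counts in `toy_program` are invented for the example (a big one-shot init, rarely used error paths, one small hot loop), and `hot_code_fraction` is a hypothetical helper, not a real profiling tool.

```python
def hot_code_fraction(regions, coverage=0.9):
    """Fraction of static code needed to cover `coverage` of executed instructions.

    `regions` is a list of (size_in_instructions, times_executed) pairs.
    """
    total_size = sum(size for size, _ in regions)
    total_executed = sum(size * runs for size, runs in regions)
    covered = 0
    hot_size = 0
    # Take the most frequently executed regions first.
    for size, runs in sorted(regions, key=lambda r: r[1], reverse=True):
        if covered >= coverage * total_executed:
            break
        covered += size * runs
        hot_size += size
    return hot_size / total_size


# Made-up program layout, for illustration only.
toy_program = [
    (5000, 1),       # initialization, runs once
    (4000, 2),       # error handling, almost never runs
    (1000, 100000),  # inner loop
]

print(hot_code_fraction(toy_program))
```

    Only the hot fraction has to stay resident in the cache for the program to run at speed, which is why the static size of the code says little about how much cache it needs.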

    If you look at Intel's Core 2 Duo, the cache space is not "divided" as the number of cores increases. Each core, if running at full load, will have 2MB of cache (extreme edition anyway). That is a very respectable cache size, even for a single core processor. When one core is not needed (like when running only Word), that core sleeps while the other core is given all of the cache.

    Past marketing ploys (GHz) were definitely wrong, and trying to directly replace those metrics with the number of cores is also a bad choice. But don't you see that that is exactly what Intel is trying to prevent? The interviewee in the article is saying that more cores != more performance. Hence why desktop users will have no need for 8 cores or more. Most of the posts on this topic are along the lines of "ya right, more cores FTW!", which is a very uninformed mentality.

    --
    Space for rent, inquire within
  26. It's called dataflow by Mateorabi · · Score: 2, Insightful
    The problem with the asynch message passing method is that you have to explicitly send/copy your data to the next process, which causes the bandwidth to skyrocket. (Better to pass by reference, with some mechanism to ensure the producer can't make more modifications after the consumer process gets the pointer.)

    Actually what you described is a very specific instance of dataflow programs, where the flow can best be described by a directed "dataflow" graph. Technically it's macrodataflow, since you pass data between processes; true dataflow reduces the granularity all the way down to individual instructions passing each other operands.
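
    For anyone who hasn't seen the model, here is a minimal Python sketch of a dataflow graph where a node fires as soon as all its operands have arrived. It runs sequentially and only mimics the firing rule. `run_dataflow` and the graph layout are made up for this post and are not taken from any of the real dataflow machines.

```python
import operator
from collections import deque


def run_dataflow(nodes, inputs):
    """Evaluate a dataflow graph.

    `nodes` maps a node name to (function, [names of operands]).
    `inputs` maps source names to their values.
    A node fires as soon as all of its operands have arrived.
    """
    values = dict(inputs)
    consumers = {}  # operand name -> nodes waiting on it
    missing = {}    # node name -> number of operands not yet available
    for name, (_, operands) in nodes.items():
        missing[name] = sum(1 for op in operands if op not in values)
        for op in operands:
            consumers.setdefault(op, []).append(name)

    ready = deque(name for name, count in missing.items() if count == 0)
    while ready:
        name = ready.popleft()
        func, operands = nodes[name]
        values[name] = func(*(values[op] for op in operands))
        # The new result may enable downstream nodes.
        for consumer in consumers.get(name, []):
            missing[consumer] -= 1
            if missing[consumer] == 0:
                ready.append(consumer)
    return values


# (a + b) * (a - b) as a graph; "sum" and "diff" do not depend on each other.
graph = {
    "sum": (operator.add, ["a", "b"]),
    "diff": (operator.sub, ["a", "b"]),
    "prod": (operator.mul, ["sum", "diff"]),
}
result = run_dataflow(graph, {"a": 5, "b": 3})
```

    Nothing orders "sum" against "diff" except the availability of their inputs, so a real dataflow machine could fire both at once. That is the parallelism the graph exposes for free.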

    The reason "applications naturally parallelize" is that the language forces the programmer to be explicit about the parallelism, something that doesn't come naturally to your Freshman CS101 coder. Imperative languages like C, Fortran, Java, etc. that students are taught first are geared towards von Neumann machines and are incredibly hard for the compiler to parallelize.

    Interestingly, functional languages like the ones you mentioned (also try 'Id') map quite well to dataflow. This is directly due to their lack of side effects (i.e. manipulating structures in memory, which must be inherently sequential in order for the programmer to reason well about program correctness).

    Dataflow had a much bigger following in the 80s and early 90s. One problem was actually an explosion of too much parallelism exposed in the application, more than the functional units could handle. The overflow then had to be shuttled back and forth to memory, making the apps bandwidth limited. Look at the MIT TTDA, Monsoon, *T, TERA, TAM, WaveScalar, and other projects. The recent (last 5 years) ability to put many functional units (cores) and sufficient memory to keep them fed on a single die relaxes this limit and may allow the field to have a bit of a resurgence.

    --
    "You saved 1968." - Ms. Valerie Pringle to the crew of Apollo 8