Intel - Market Doesn't Need Eight Cores
PeterK writes "TG Daily has posted an interesting interview with Intel's top mobility executive David Perlmutter. While he sideswipes AMD very carefully ('I am not underestimating the competition, but..'), he shares some details about the successor of Core, which goes by the name 'Nehalem.' Especially interesting are his remarks about power consumption, which he believes will 'dramatically' decrease in the next years as well as the number of cores in processors: Two are enough for now, four will be mainstream in three years and eight is something the desktop market does not need." From the article: "Core scales and it will be scaling to the level we expect it to. That also applies to the upcoming generations - they all will come with the right scaling factors. But, of course, I would be lying if I said that it scales from here to eternity. In general, I believe that we will be able to do very well against what AMD will be able to do. I want everybody to go from a frequency world to a number-of-cores-world. But especially in the client space, we have to be very careful with overloading the market with a number of cores and see what is useful."
I don't doubt an "8 core" desktop will exist in the near future. Then again he has a point... we won't likely need it.
I don't give a damn for a man that can only spell a word one way.
Mark Twain
If you put 8 core procs in desktop machines, software will be written that will take advantage of them. Which means you'll sell more 8 core procs.
Are you going to lead or follow?
"Our multiprocessor technology doesn't scale, but we don't want to scare investors away, so we'll pretend it doesn't matter."
Does having multiple cores do anything about the memory bottleneck? Does this make the machine balance better or worse?
I don't want to insult the person but saying that 8 is something that will not be needed seems very short-sighted. People were saying only a few years ago "1GB is too big for a hard drive"... Never underestimate the increasing need for power in computers, even for home users.
*''I can't believe it's not a hyperlink.''
Of course it's a bit of a chicken and egg problem right now, isn't it? If more software used multiple cores, then we'd have a greater need for more cores. Or you could start programming in Erlang and sort of automatically use those cores.
On the other hand, to be fair, the scaling issues start getting odd. I'd expect that we're going to have to move from a multi-core to a multi-"computer" model, where each set of, say, 4 cores works the way it does now, but each set of 4 gets its own memory and any other relevant pieces. (You can still share the video and audio, though at least initially there will presumably be a privileged core set that gets set as the owner.)
Still, as my post title says, this does strike me as rather a 640KB-style pronouncement. (The original quote may be apocryphal, but the sentiment it describes has always been with us.)
If the home user can justify (even indirectly due to demands of the operating system or changes in software architecture) 4 cores then 8 is eminently logical. Seems some minds at Intel are falling back to the dubious position they held regarding home users never needing 64 bit CPUs. Then again, maybe they're just playing dumb and are slaving away, burning midnight oil by the drum, to make 8 and 16 core processors.
Three Cores for the Clippy, but I don't know why,
Seven for the Vista kernel which is defect prone,
Nine for Bloat which will make the cooling fry,
One for the Screensaver to toil alone,
In the Land of Redmond where Marketing lies.
One Core to rule them all, One Core to find them,
One Core to bring them all and in the darkness bind them
In the Land of Redmond where Marketing lies.
A feeling of having made the same mistake before: Deja Foobar
Intel saying "The market doesn't need 8 cores" = Intel saying "We can't really engineer 8 cores right now, we've hit some trouble". Of course the market would like 8 cores. Markets are greedy for new stuff, that's how you keep on making money. Intel's covering their ass for putting 8 cores on their roadmap for anytime soon.
There is nothing interesting going on at my blog
I think there is a world market for maybe five cores.
I'll probably be modded down for this...
He's right. Current desktops don't need 8 cores. However, as four cores become widely available, desktops will begin to change. They will become more threaded, and more processing that would have been avoided previously will begin to happen passively. Constantly streaming video in multiple thumbnail size icons on taskbars, stronger and more pervasive encryption on everything that enters or leaves the machine, smarter background filtering on multiple RSS sources, MUCH beefier JIT on virtual machines, on-the-fly JIT for dynamic languages, more complex client-side rendering of Web content (SVG, etc), these will all start to become more practical for constant use. Other things that we haven't even thought of because they're impractical now will also spring up. By the time 8-core systems are available, the market will already be over-taxing 4-core systems.
I recently read about a 1024-core chip for small devices like cell phones. Each core ran on a simplified instruction set and specialized in a certain task like muting the microphone when incoming sounds are too quiet, smoothing text on the low resolution screen, and other minute tasks. Individual cores could be placed in low power sleep mode until the software dictated a need for that instruction set.
Is it possible to couple CISC and RISC cores on one die? Is this how the math coprocessors of the 386 era worked? This sounds like an ideal solution to me since nobody needs 4 or 8 cores to be fully powered and ready to pounce at all times.
"Need" is subjective.
Once upon a time, Bill Gates said we would never "need" more than 640K.
Once upon a time, mainframes only had 32K of RAM -- and that was a vast amount more than their predecessors.
The '286 came out and was primarily aimed at the server and workstation market. "No one will ever need all of that power."
Thing is, people always "need" more speed, more RAM and more storage. And they'll pay for it too, so Intel may "need" to sell 8X cores.
The 6 pack has been tried and true, why try and stuff an additional 2 Coors into it.
God spoke to me.
Seems like people don't RTFA. Let me quote: "Will we see eight cores in the client in the next two years? If someone chooses to do that, engineering-wise that is possible. But I doubt this is something the market needs." He is talking about the next two years, not forever. We have only just got an abundance of dual-core machines in the market and the apps to take advantage of them. Tell me how different the software we had two years ago is from today's. Given that, there is no way the desktop market needs 8 cores two years from now. Geez, we have so many fanbois and script kiddies here with absolutely no knowledge of the industry, it is sickening.
What quite a few other posters are failing to understand is that he is referring to diminishing returns. 1 to 2 give you some fractional improvement, 2 to 4 gives you a smaller fractional improvement, 4 to 8 gives you an even smaller fractional improvement, etc. At some point the cost, size, heat, noise (for the cooling), etc is not worth the fractional improvement. For most users that will probably be dual or quad.
For those extremely rare apps and jobs that are highly parallelizable, 8 and above will be useful. However this will be very rare and this is why the comparisons to the infamous 640K quote are misguided. Increasing RAM is easy, software naturally consumes RAM with no additional work necessary, just do more of what you are already doing. Multiprocessing is something completely different, the code must be designed and written quite differently, and it is often very difficult to retrofit existing code for multiprocessing. Now you have the practical problem that not all problems are parallelizable.
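To put rough numbers on the diminishing-returns argument, here is a minimal sketch using Amdahl's law. The post above doesn't name it, so treat that as my framing, and the 80% parallel fraction is an assumed figure picked purely for illustration.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup under Amdahl's law.

    parallel_fraction is the share of the work that can run in parallel;
    the rest stays serial no matter how many cores are added.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    p = 0.8  # assumption: 80% of the work parallelizes
    previous = 1.0
    for n in (1, 2, 4, 8, 16):
        s = amdahl_speedup(p, n)
        # Each doubling of cores buys a smaller relative gain than the last.
        print(f"{n:2d} cores: {s:.2f}x total, {s / previous:.2f}x over the previous step")
        previous = s
```

With these assumed inputs every doubling helps less than the one before, and the total can never reach 5x however many cores are added, because the serial 20% sets the ceiling.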
Strangely enough, I think one case where 8 cores could be useful in a home environment would be a bit retro. A multiuser/centralized system. One PC with the computational power for the entire family, dumb terminals for individual users, connections to appliances for movies, music, etc. Such a machine might go into the basement, garage, closet, or other location where noise is not an issue. Of course, I'm not sure such a centralized machine would be cost effective.
Transputers used up to sixteen cores twenty years ago and had plenty of scientific applications. The physical arrangement of the inter-core buses was important depending on the application. Scientists of all types (except maybe Christian Scientist) use desktop machines for experimental work. Expect to see 'dynamic bus reconfiguration' in the next gen of multi-core processors and motherboards. One config for your word-pro, another for your server app, another for your First Person Shooter. Can I patent that idea??? Oh well too late. But it is interesting to see Intel and AMD reinventing the wheel. ;-)
"I want everybody to go from a frequency world to a number-of-cores-world."
That's the same thing they tried against AMD before. Higher frequency is a bigger number so it will sell better, right? Now, more cores is a bigger number so it will sell better, right?
If Intel wants to not repeat the same mistake and let AMD gain ground on them, they should go to a "performance world." Actual performance, in many different environments.
While an 8 core desktop is gonna be overkill for a lot of people, it still leaves us with a nasty problem.
Peak CPU speed.
For now we have topped out on this, meaning our existing software is either gonna have to get more efficient, or it's going to have to change, unless we want to just deal with the level of performance and features we currently have.
(like that's gonna ever happen --how else would the closed corps sell upgrades then?)
Additionally, some application areas do not have enough CPU power to fully realize their potential. MCAD is one of these, by way of example. Take the fastest CPUs we have today and they are still not fast enough to fully render a solid model without wasting the operator's time. Current software offerings are all working toward smarter data, creative ways to manage the number of in-memory and in-computation models, better kernel solves, etc...
But it's just not enough for the larger projects to work in the way they could be working.
Most of the MCAD stuff currently is built in a linear way. That's largely because of the parametric system used by almost all software producers today. With a few changes to how we do MCAD, I could see many cores becoming very important for larger datasets.
Peak CPU and RAM are the two primary bottlenecks that constrain how engineering CAD software develops and what features it can evolve for its users. It's not the only example either.
The bitch is that most of the software we have is more than adequate for most of the people. For those that lie outside the norm, dependence on this software (both for development and for plain everyday use) constrains their ability to make use of multi-core CPU capabilities...
Messy.
Will be interesting to see how this all goes. Will the established players evolve multi-core transitional software that can bridge the gap, or will new players arise, doing things differently to take advantage of the next tech wave?
IMHO, there is a strong case for Intel doing the "If we build it, they will come" thing. For the higher demand computing needs, there really isn't any other way to improve, but through very aggressive code optimization.
Blogging because I can...
640 cores ought to be enough for anybody.
You see? You see? Your stupid minds! Stupid! Stupid!
The custom rendering software I work on at Maas Digital (used for things like the IMAX Mars Rover film) is very cache sensitive. I've been mulling this over recently, because in computer graphics, memory is almost always the bottleneck, and it's led me to conclude we really need some different languages, or at least language constructs.
Pixar's Photorealistic Renderman (perhaps one of the greatest pieces of software ever written, from an engineering point of view) is very odd in that its shading language, while interpreted, is actually much faster at accomplishing its goals than other compliant renderers which compile down to the machine level. I believe this is because of memory bottlenecks, and despite the fact that computer graphics is an "embarrassingly parallel" problem, eight cores is likely to aggravate this much more than it is to help.
What I think is needed is a more functional-programming approach to a lot of these problems, where the mathematics of an operation is more purely expressed, leaving things like ordering/scheduling up to the compiler/runtime environment. Runtime compiled languages, like Java, can sometimes outperform even the best hand-optimized C due to the fact that the runtime compiler can optimize to the cache size and specific chip family.
Also, this type of language would benefit multi-core processing because it would help expose the most possible parallelization opportunities, and let the compiler (perhaps even through trial and error) determine exactly when and how much parallel code to create.
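As a rough illustration of that idea (this is not the renderer described above, and Python is not a functional language): when the per-pixel work is a pure function, the program only states what to compute, and whatever executor it is handed decides how to schedule it. The `shade` function and its arithmetic are made up for this example.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import List, Optional, Tuple


def shade(pixel: Tuple[int, int]) -> float:
    """Hypothetical pure shading function: the result depends only on the input."""
    x, y = pixel
    return ((x * 31 + y * 17) % 256) / 255.0


def render(width: int, height: int, executor: Optional[Executor] = None) -> List[float]:
    """Shade every pixel; ordering and scheduling are left to the executor."""
    pixels = [(x, y) for y in range(height) for x in range(width)]
    if executor is None:
        return list(map(shade, pixels))  # plain serial evaluation
    # Executor.map returns results in input order regardless of execution order.
    return list(executor.map(shade, pixels))


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Because shade has no side effects, both schedules give the same image.
        print(render(64, 64, pool) == render(64, 64))
```

In CPython a thread pool will not actually speed up pure-Python arithmetic; swapping in `ProcessPoolExecutor` would spread the work over cores. The point of the sketch is only that `render` does not have to change when the scheduling does.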
Currently all of my parallel supercomputing code uses Fortran and the Message Passing Interface, but it's clear that this approach leads to code that is often very hard to debug and is very programmer-intensive. Hopefully the future of programming languages will help ease us into general purpose computing on highly parallel architectures like Cell.
Not quite. It depends on the brand.
For Intel that's exactly the case:
With the current Intel architecture, memory is interfaced through the NorthBridge.
With multicore and multiprocessor systems, all chips communicate with the NorthBridge and get their memory access from there.
So more cores and processors means the same pipe must be shared by more of them, and therefore memory bandwidth per core is lower.
Intel must modify their motherboard design. They must invent a QUAD-channel memory bus, they must push newer and faster memory types (that's what happened with DDR-II! They needed the faster data rates, even if those come at the cost of latency), etc...
But the further they pursue this direction, the more latency they add to the system. Which in the end will put them in a dead end. (Somewhat like the deeper pipeline of their quest for gigahertz put them in the dead end of the burning-hot and power-hungry P4.)
For AMD that's not quite the same:
With the architecture that AMD started with the AMD64 series, memory is directly interfaced with a memory controller that is on-die with the chip.
The multiple processors and the rest of the motherboard communicate using standardized HyperTransport.
The rest of the motherboard doesn't even know what's happening up there with the memory.
And with the advent of HyperTransport plugs (HTX) the motherboard doesn't even really need to know.
Riser cards carrying both memory and CPU (à la Slot 1) are possible (and highly anticipated, because they would allow a much wider range of specialized accelerators to be plugged in than is currently possible with the AM2 socket).
The most widely publicised advantage of this structure is the lower latency.
But this also makes it easier to scale up memory bandwidth: just add another on-board memory controller and voilà, you have dual-channel. That was the difference between the first generations of entry-level AMD64 (Athlon 64 for 7## socket: one controller, single channel; Athlon FX for 9## socket: 2 controllers, dual channel).
By the time 8-core processors come out, and if CPU riser boards with a standard HTX connector appear, nothing will prevent AMD from just building a riser board designed for 8-core chips with 4 memory controllers (and quad-channel speed). Just change the riser board and memory speed will scale. The motherboard doesn't need to be redesigned. In fact, the same motherboard could be kept.
And this won't come at the price of latency or whatever: the memory controller is ON the CPU die, and does not have to be shared with anything.
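The bandwidth half of this argument is simple arithmetic, so here is a back-of-the-envelope sketch. The 6.4 GB/s per-channel figure, the core counts and the function name are placeholders for illustration, not measured numbers for any real chipset, and the model ignores the extra latency of reaching another chip's memory.

```python
def per_core_bandwidth(channel_gbs: float, channels: int, chips: int,
                       cores_per_chip: int, on_die: bool) -> float:
    """Memory bandwidth available per core, in GB/s.

    on_die=False: one shared controller (northbridge style), so the channel
                  count stays fixed however many chips are installed.
    on_die=True:  every chip brings its own controller and channels.
    """
    total_channels = channels * chips if on_die else channels
    return channel_gbs * total_channels / (chips * cores_per_chip)


if __name__ == "__main__":
    for chips in (1, 2, 4):
        shared = per_core_bandwidth(6.4, 2, chips, 2, on_die=False)
        local = per_core_bandwidth(6.4, 2, chips, 2, on_die=True)
        print(f"{chips} chip(s): shared {shared:.1f} GB/s per core, on-die {local:.1f} GB/s per core")
```

Under these assumptions the shared-controller figure halves every time the chip count doubles, while the on-die figure stays flat because the channels scale with the chips.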
In fact, that's partially already happening:
In the case of multiprocessor systems, instead of all processors sharing the same pipe through the NorthBridge, each chip has its own controller going at full speed.
And this memory can be shared over the HT bus (albeit with some latency).
It's basically 4 memory controllers (2 per processor) working together. Achieving something quad-channel-like shouldn't be that difficult.
Especially when Intel is pushing the memory standard toward chips with higher latency: asking for more bandwidth in parallel over the HT bus won't be much of a penalty.
So I think AMD will be faster at developing solutions that scale to higher numbers of cores than Intel, due to its better architecture.
Maybe it's not a coincidence that AMD is working on technology to "bind together" cores and present them as a single processor to software that isn't sufficiently SMP-optimized, and that at the same time Intel is telling whoever wants to listen that 4 cores is enough and 8 is too much. (Yeah, sure, just tell that to the database and Sun Niagara people. Or even to older BeOS users. This just sounds like "640k is enough for everyone".)
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
Once you have 8 cores, it becomes advantageous to have memory which is faster for each group of 4. At 8, you're on the edge where the advantage exists, but isn't sufficient to justify the additional architectural complexity. For 16 and up, it's much better to have 4-processor nodes each with its own memory (and slower access to memory on other nodes). It's unlikely that improvements in chip technology will change this. It's also not something about desktop computers; existing large machines use 4-processor nodes.
So he's right; before it makes sense to have more than 4 cores on a chip, you'll want multiple chips of 4 cores each with separate memory busses, and then system RAM on the processor chip (at which point the architecture is significantly different, because the system is asking the processor for memory values, rather than the opposite), and only then does it become efficient again to put more cores on the chip, as you can have a multiple-node chip.
All this discussion over some BS from a marketingdroid? Are you really all such suckers?
Let me translate from marketing-speak to plain English for y'all:
droid: two are enough for now, four will be mainstream in three years and eight is something the desktop market does not need.
translation: We have a two core product available now, we will have a 4 core product available in three years but we don't yet have a plan for an eight core product.
If you look at Intel's Core 2 Duo, the cache space is not "divided" as the number of cores increases. Each core, if running at full load, will have 2MB of cache (extreme edition anyway). That is a very respectable cache size and would make for a respectable single-core processor. When one core is not running (like when running only Word), one core sleeps while the other core is given all of the cache.
Past marketing ploys (GHz) were definitely wrong, and trying to directly replace those metrics with the number of cores is also a bad choice. But don't you see that that is exactly what Intel is trying to prevent? The interviewee in the article is saying that more cores != more performance. Hence desktop users will have no need for 8 cores or more. Most of the posts on this topic are along the lines of "ya right, more cores FTW!", which is a very uninformed mentality.
Space for rent, inquire within
Actually what you described is a very specific instance of dataflow programs, where the flow can best be described by a directed "dataflow" graph. Technically macrodataflow since you pass data between processes; true dataflow reduces the granularity all the way down to individual instructions passing each other operands.
The reason "applications naturally parallelize" is because the language is forcing the programmer to be explicit about the parallelism, something that doesn't come naturally to your Freshman CS101 coder. Imperative languages like C, Fortran, Java, etc. that students are taught first are geared towards von Neumann machines and are incredibly hard for the compiler to parallelize.
Interestingly, functional languages like you mentioned (also try 'Id') map quite well to dataflow. This is directly due to their lack of side effects (i.e. manipulating structures in memory, which must be inherently sequential in order for the programmer to reason well about program correctness.)
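For anyone who hasn't met the model, here is a toy sketch of dataflow execution: a node fires as soon as all of its operands have arrived, rather than when a program counter reaches it. The graph format and the names are invented for this example and are not taken from any real dataflow machine or language.

```python
import operator


def run_dataflow(graph, inputs):
    """Evaluate a dataflow graph.

    graph:  {node_name: (function, [operand_names])}
    inputs: {name: value} for the initial tokens.
    Returns (values, waves). Each wave lists the nodes that were ready at the
    same moment, i.e. the ones that could have fired in parallel.
    """
    values = dict(inputs)
    pending = dict(graph)
    waves = []
    while pending:
        ready = sorted(name for name, (_, deps) in pending.items()
                       if all(d in values for d in deps))
        if not ready:
            raise ValueError("graph has a cycle or a missing input")
        # Nodes in a wave only read values produced by earlier waves, and have
        # no side effects, so their relative order inside the wave is irrelevant.
        results = {name: pending[name][0](*(values[d] for d in pending[name][1]))
                   for name in ready}
        values.update(results)
        for name in ready:
            del pending[name]
        waves.append(ready)
    return values, waves


if __name__ == "__main__":
    # (a + b) * (c - d): the add and the subtract do not depend on each other.
    graph = {
        "sum": (operator.add, ["a", "b"]),
        "diff": (operator.sub, ["c", "d"]),
        "prod": (operator.mul, ["sum", "diff"]),
    }
    values, waves = run_dataflow(graph, {"a": 1, "b": 2, "c": 7, "d": 3})
    print(values["prod"], waves)
```

The parallelism falls out of the graph itself: nothing in the program says "run these two together", the scheduler simply finds that `sum` and `diff` are ready in the same wave.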
Dataflow had a lot more following in the 80s and early 90s. One problem was actually an explosion of too much parallelism exposed in the application, more than the functional units could handle. The overflow then had to be shuttled back and forth to memory, making the apps bandwidth limited. Look at the MIT TTDA, Monsoon, *T, TERA, TAM, WaveScalar, and other projects. The ability to put many functional units (cores) and sufficient memory to keep them fed on a single die recently (last 5 years) reduces this limit and may allow the field to have a bit of resurgence.
"You saved 1968." - Ms. Valerie Pringle to the crew of Apollo 8