Dual Caches for Dual-core Chips
DominoTree writes "The dual-core chips that AMD and Intel plan to bring to market next year won't be sharing their memories. A version of Opteron coming in 2005 and Montecito, a future member of Intel's Itanium family also slated for next year, will both have two processor cores, the actual unit inside a processor that performs the calculations, and each core will have separate caches."
Can I have a 64bit OS too please? (no not linux)
In case it's not obvious to those who didn't read the article all the way through, it's a better thing when the memory is shared (single cache) rather than separate (dual cache). But that is harder to design, so for these first-generation dual-core chips from Intel and AMD, they are using separate caches for each core. (IBM's dual core Power4 processor has a unified cache.) At some point down the road, they will likely unify them to increase performance.
I'm not a hardware pro, but is this basically the same as having two separate chips, or am I missing the point here?
What will happen to those who must pay a royalty fee per CPU? Will companies that charge for each CPU begin to charge for two, or will it still be viewed as one...?
Real programmers can write assembly code in any language. -- Larry Wall
Sigs cause cancer.
"Montecito", a Spanish word, literally translates as "a small monte".
Thus I predict that this will be followed by a quad-core chip called the "monte", an 8-core chip called the "montote" (the big monte), and finally a 16-core chip known as "The Full Monte".
Norman Cook's Ode to Sl
With that logic, you'll always be holding off for some new development.
You probably don't want to have both chips fighting over the cache, and slowing things down; I'm sure doing The Right Thing[tm] will take a while for them to work out. Until then, just pretend that they're mostly separate chips on the same silicon.
Maybe in the future they'll come up with some more advanced cache designs that can share some cache and improve performance. But until then, expect to see it in the next generation of value chips. (Overclocked dual-core Celerons? Nifty!)
pb Reply or e-mail; don't vaguely moderate.
I saw this article at another website earlier today, and I thought this wasn't really important. Each core should have its own cache; that's exactly what a dual-core chip is. It's not twice as many execution units crammed into the same space, or some other funny configuration. It's two separate chips on the same die, perhaps with some modifications for inter-processor communication, but that's about it. With AMD's core design, you have only the physical layer of the HyperTransport bus to connect the chips, and the integrated memory controller has one or two ports to talk to memory (single/dual channel) and two ports to talk to the two separate chips. It will be interesting to see if AMD couples dual-core chips with DDR2-667 or DDR2-800. That would make the most sense, so as to keep the memory controller from being the bottleneck, as opposed to the system bus on the Intel side.
The Doormat
If you're not outraged, then you're not paying attention.
It's not much different; that's the point. Two processors in a single socket saves a lot of money production-wise, and that should pass on to the consumer. AMD has said theirs is backward compatible, and that's huge. You already have a single-CPU Opteron workstation? Well, now you can have a dual-CPU one for the price of a single-CPU upgrade. That kicks ass.
Actually, the left core will be verbal, creative and be really good at processing visual information, while the right core will be logical, good at number crunching and have no style sense whatsoever.
Despite what Sun has to say on the matter, Itanium system and processor sales have been increasing steadily since 2H 2000. Prior to that, there was a big lull in demand because few wanted to buy underperforming Itanium 1 machines when the Itanium 2 was expected rather soon (and announced relatively early).
Today, in contrast, there _doesn't_ appear to be a lull in demand for Itanium 2 machines, even though Montecito (Itanium 3) has been announced in a fair bit of detail. That's because for some applications (in HPC, high-end database work, certain EDA/CAD/CAE work, and ultra-high-reliability computing) Itanium 2 systems are basically unbeatable. They also run some OSes which are very important to some organizations, such as HP-UX and OpenVMS.
Long story short, the Itanium 1 was something of a flop, the Itanium 2 is really pretty decent, and everyone is expecting the Itanium 3 to offer pretty decent _price/performance_, in addition to best-bar-none performance when it is released next year.
Luckily for AMD, the Opteron/A64 was designed with dual-core in mind. As I understand it, both cores will talk to each other via an internal HyperTransport link and (as with current Opterons) together with the internal memory controller will eliminate the need for an external northbridge. It is also expected that upon release they will drop directly into existing motherboards with nothing more than a BIOS upgrade.
Intel will find things more challenging. Both cores will have to contend for the GTL bus, currently the Achilles heel of their MP solutions, by communicating via an external northbridge.
For all intensive porpoises your a bunch of rediculous loosers
Well I would buy a computer now but I have no cash
Is that a pun?
Javascript + Nintendo DSi = DSiCade
Can someone who speaks Idiot please translate for me?
Translated: Please mod me +5 funny.
-- n
Dual-core microprocessors are not a new development. IBM with their POWER4 and POWER5, HP and the PA-RISC 8800, and TI with their OMAP processors are definitive proof that multi-core solutions are not just a stopgap in increasing the performance delta of modern silicon.
Dual-core processors are a natural evolution in the development of general-purpose and even specialized computing devices. SMT was to be a boon for the EV8, but later found its way into the Pentium 4. Multiple logical processors were just a first step.
It should be interesting to see just what AMD can do with both SMT and a dual-core design.
It just had better run BSD. = )
The benefits of HT, as currently implemented, are pretty insignificant compared to the benefits of multiprocessing, as the possible performance boost is very small, it certainly doesn't give you the ability to handle more interrupts, and it doesn't let you decrease the number of context-switches.
As for building a more intelligent core to take advantage of the extra transistors, that just might make sense - but it would also take hundreds of millions (or billions) of dollars in development, and the chip wouldn't appear for a good number of years (look at the Itanium). It's a lot easier and cheaper to slap two cores on the same die and call it done. Because Intel is scurrying to try and play catch-up to AMD in the high-end market, time-to-market is critical for them.
steve
Oh, you're not stuck, you're just unable to let go of the onion rings.
It's "RISC CPI for the CISC guy"
I can't wait to see what they do to his nonorthogonal register file.
--Rob
The downside is that as the AMD chips are going to be backward-compatible with older boards, I imagine that the dual-core chip will still only have the single 128-bit memory controller.
While that will still give you twice as many available CPU cycles, it means that the two cores will be fighting for memory bandwidth. In the case of Intel's chips, that's business as usual. But for the Opterons, where each processor brings its own memory controller, it just doesn't feel right. : (
Been that way for many years. Is rock stable and secure.
Granted, it is on a mini, but we have enjoyed 64-bit computing for nearly 10 years. We even have some POWER5s in production.
There are great OSes other than the ones used on PC hardware... too many "geeks" forget that.
* Winners compare their achievements to their goals, losers compare theirs to that of others.
The problem with the Itanium was that Intel didn't release an optimizing compiler with or before the Itanium. I believe (corrections welcome) that instructions are grouped in 'packets' (I forget the term used) that the Itanium can run in parallel. The problem is that only certain instructions can be bundled together. When older compilers are used, the instructions are generated in such a way that only a few instructions, or even just one, end up in a 'packet'. So the problem was that the processor wasn't being used to its fullest potential. I have never compared the Itanium 1 and 2, but I would guess that the Itanium 2 was primarily released to give the Itanium line a fresh start with an optimizing compiler.
Kernel Panic Core Dumped... Still Panicking Dumping Second Core...
but will it make coffee? I didn't think so.
Given that the power output of a single-core Prescott is 100 watts or more, a dual-core with separate caches will put out 200+ watts. Clock up the speed a bit more, and you'll be at about 300 watts.
I figure that's probably enough to boil a cup of coffee.
"They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
I like Itanium. It's a pretty neat architecture which crushes most before it in FP intensive tasks. It is clear why it has done well in HPC. But HPC is nothing more than a niche.
Now here are the problems:
32-bit (x86) performance sucks. All those apps you've spent years developing will need rewriting (a simple recompile is often out of the question).
HP (in collusion with Intel) killed perfectly good archs in Alpha and PA-RISC in an effort to get people to migrate to IA-64. A few may have made the move, but this has mostly served to push people towards the vastly cheaper x86. HP, and to a lesser extent Intel, should provide what their customers want, not what they think is best for them.
It still uses a shared bus architecture. There are diminishing returns as you add more processors.
Itanium requires massive caches to get the best from it. Cache = Silicon = Cost. It is clear that a large scale seeding exercise is still underway with Itanium systems being provided at or below cost. Looks like it will be a long time before there will be any return on the billions invested in Itanium.
A friend purchased a 3GHz (yes, 3) Intel Pentium 4 with HyperThreading a few months back. I asked why he didn't purchase an AMD CPU and he said he needed x86 compatibility... So much for informed hardware engineers. Anyway, I recently asked him about the system since I just built an AMD 2600+ based system and wanted to know if he had some code he wanted to compare/test. Well, he told me that his 3GHz CPU really only runs most applications at 1.5GHz unless they are multi-threaded or hyperthread aware.
Is this true? Does Intel put a 3GHz label on 1.5GHz dual-core CPUs, or whatever this hyperthreading is? Sounds dual-core-ish to me...
It's funny how that 1.5GHz number shows up again in an Intel product. I remember when they could not build anything faster than 7xxMHz and then, all of a sudden, they had a "new technology" that got them 1.5GHz (2x 750MHz), and it was found out later that only PART of the CPU was running at 2x. This all happened when AMD beat Intel past the 1GHz barrier. Are they again playing "tricks" to get a big GHz label on their parts?
So any of you people up on this dual-core and hyperthreading thing and feel like explaining to the rest of us what's going on? TIA.
LoB
"Anyone who stands out in the middle of a road looks like roadkill to me." --Linus
VMS went 64-bit at least a decade ago.
Great OS for English-speaking folk, despite Linus's hatred for it.
Hyperthreading is not a better solution, particularly when dealing with the Intel implementation. Unless it's very carefully done, all it does is keep the cache from working effectively. Linux and FreeBSD actually got performance improvements from leaving one of the virtual processors idle when there were more processes scheduled to run. When there's two threads of the same process, they let them both run because those tend to have better locality of reference and therefore don't thrash the cache so much.
Processor designers are in a different situation now than 10 years ago. They've got more transistors than they know what to do with, so adding cache and adding another core are cheap. Streamlining one core to run faster is much harder, as evidenced by Intel's unending troubles with anything faster than 3.2 GHz.
I rarely criticize things I don't care about.
Hyperthreading is simply a second context. It lets you run a second thread at the same time by using the unutilized capacity of existing functional units, and is largely useful only when Intel's branch prediction fails and the chip would otherwise be paying the ultimate penalty for its long, long, LONG pipeline.
In other words, HT is an ingenious method for making up for the fact that the Pentium 4 is horribly inefficient.
It would be better to stick a whole bunch of simple cores on a single chip at a lower clock rate and have them work cooperatively, if only we used more multithreading. This is pretty much where Intel is planning to go, with their multiple-core chips based on the Pentium-M. Or so the rumors say.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
It's not entirely true that single is better. It depends on what the system is used for. If both cores are accessing the same memory (likely the case in a multi-threaded webserver, for instance), then they can benefit from sharing a cache and effectively doubling the cache size. However, if both cores are accessing different memory (almost any situation where different applications are running on the different cores), then sharing a cache could have devastating effects on performance. As the process running on each core would be likely to evict the other process's cached memory, there would be a plethora of cache misses. In the worst case, this could effectively make the system as slow as if there were no cache at all. In the average case there would likely be a significant performance hit. A better strategy than unified or separate caches would be to have a read/write cache for each core and allow each core to read the other core's cache. This would allow the benefits of the shared cache in the case where both cores were accessing the same memory, without the major performance hit when each process is accessing different memory. Unfortunately, the hardware for this would be even more complicated than for the unified or separate cache techniques.
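To put rough numbers on the two cases above, here is a toy simulation. It is only a sketch under loud assumptions: a fully associative LRU cache, a hypothetical 64 lines per core (128 when unified), and two made-up access patterns. None of the names or sizes come from the article or from any real chip.

```python
from collections import OrderedDict

class LRUCache:
    """Toy fully associative cache with least-recently-used eviction."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.hits = 0

    def access(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)        # mark as most recently used
            self.hits += 1
        else:
            self.lines[addr] = True
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict least recently used

def hit_rate(trace, per_core_lines, shared):
    """Replay (core, address) pairs; return the overall hit rate."""
    if shared:
        cache = LRUCache(2 * per_core_lines)    # one unified cache, twice the size
        caches = {0: cache, 1: cache}
    else:
        caches = {0: LRUCache(per_core_lines), 1: LRUCache(per_core_lines)}
    for core, addr in trace:
        caches[core].access(addr)
    distinct = {id(c): c for c in caches.values()}.values()
    return sum(c.hits for c in distinct) / len(trace)

LINES = 64  # hypothetical per-core cache size, in lines

# Case 1: both cores loop over the same 100 lines (the webserver-like case).
same_data = [(core, i) for _ in range(50) for i in range(100) for core in (0, 1)]

# Case 2: core 0 loops over 64 lines of its own while core 1 streams through
# fresh addresses three times as fast (two unrelated applications).
different_data = []
stream = 1_000_000
for step in range(6400):
    different_data.append((0, step % 64))
    for _ in range(3):
        different_data.append((1, stream))
        stream += 1

for name, trace in (("same data", same_data), ("different data", different_data)):
    print(name,
          "| unified:", round(hit_rate(trace, LINES, shared=True), 3),
          "| separate:", round(hit_rate(trace, LINES, shared=False), 3))
```

In this model the unified cache wins easily when the cores share data, and the separate caches win when one core streams through memory and flushes the other core's working set. Real caches are set associative and real workloads are messier, so treat the exact figures as illustrative only.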
hence the block of RAM per CPU.
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
While dual cores on a chip might be nice, it won't produce any serious performance increases.
The underlying problem with Intel and AMD's processors is that they are at the mercy of the architecture:
The ironic thing is that even though AMD and Intel are out-clocking mainframe processors by factors of 2 and 3, mainframes still get more work done simply because they aren't choked by a slow and overcrowded system bus.
The society for a thought-free internet welcomes you.
Are the dual cores on the same piece of silicon? This would require both cores to be defect free. If only one core is defect free, is it possible to disable the dud and sell it as a single core CPU? This would make it a much more attractive proposition for the manufacturers.
E.g. if a single core has a yield (probability of being defect free) of 80%, then the dual-core chips will have a yield of 0.8^2 = 64%. (Actually slightly lower, because whatever interconnect they have also has to be free of defects.) 64% will have two good cores, 4% will have two bad cores, and the remaining 32% will have one good core. The manufacturer would obviously like to make use of that 32% if they can.
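The arithmetic above is just a binomial distribution, and a few lines of Python reproduce it. This is a sketch that assumes defects hit each core independently and ignores the interconnect; the function name is made up for illustration.

```python
from math import comb

def good_core_distribution(core_yield, cores=2):
    """Probability that exactly k of `cores` cores are defect free,
    assuming each core fails independently of the others."""
    return {
        k: comb(cores, k) * core_yield**k * (1 - core_yield)**(cores - k)
        for k in range(cores + 1)
    }

dist = good_core_distribution(0.8)
for good in sorted(dist, reverse=True):
    print(f"{good} good core(s): {dist[good]:.0%}")
```

With an 80% per-core yield this prints 64%, 32% and 4% for two, one and zero good cores. Changing `cores` to 4 shows how quickly the all-good fraction shrinks as more cores share a die.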
Quattuor res in hoc mundo sanctae sunt: libri, liberi, libertas et liberalitas.
From what I know of the current architectures, AMD's solution to main memory access woes (point-to-point bus) seems more sane as soon as more than a couple of processors are installed in the system. Shared bus (as in Intel's solution) seems to require huge caches to operate efficiently, and as we all know, Pentium 4 really does not like pipeline stalls or branch mispredictions.
Let's take a hypothetical example: quad processor systems utilising dual core processors from Intel and AMD.
AMD: each processor (core) talks directly to its local memory block, and via HT links to adjacent processors' memories. Processors do not have to contest for access to the bus and thus memory access is always low-latency, even when accessing remote memory. If built today, HT links would operate at 1 GHz.
Intel: processors share the same bus with each other and the memory controller. Any time a processor needs to access memory, it has to wait until the bus is free before it can ask the memory controller for access to main memory. Pipeline stalls happen here if the bus is not free when needed. This is compensated for with huge L3 caches. As far as I know, current quad-processor systems from Intel have bus speeds of around 533 MHz.
So in a nutshell, Intel competes with AMD on a quite level field when the system has 1-2 processors, but as soon as processor count goes up, bus bandwidth becomes an issue with Intel. It shall be interesting to see how Intel attempts to counter that.
What am I getting at with this? Well, those huge 12 MB L3 caches in Intel's future processors sure aren't cheap. They take up lots of silicon and WILL decrease core yields, since they've got lots and lots of points of failure. So manufacturing processes really have to be ramped up to allow that at reasonable cost.
"Intellectual Property" should be an affront to anyone capable of independent thought.
It's not a question of if there will be 64-bit OS's to go with these things. Eventually, it's sure to happen in multiple flavors.
The real question is what ELSE will be on the motherboards and in the chip by the time these things hit the market? Specifically, what DRM hardware will come with these things? What will the BIOS look like?
That's why I think that the current generation of 64-bit desktops are probably one of the best values for a machine you might be using 4 years from now. It's risky to wait 6 months or a year with the current views of the US Congress and FCC. This generation of 64-bit machines might be one of the last to be multi-purpose Turing/Von Neumann devices.
Don't wait for dual-cores if you have the cash and want to be the one in control of your 64-bit machine. Eventually the OS's will catch up.
"Let him go, Ralph. He knows what he's doing." --Otto Mann (simpsons)
...will both have two processor cores, the actual unit inside a processor that performs the calculations...
Oh, so that's what a processor does! Can you remind me again what "RAM" is?