Dual Caches for Dual-core Chips
DominoTree writes "The dual-core chips that AMD and Intel plan to bring to market next year won't be sharing their memories. A version of Opteron coming in 2005 and Montecito, a future member of Intel's Itanium family also slated for next year, will both have two processor cores, the actual unit inside a processor that performs the calculations, and each core will have separate caches."
In case it's not obvious to those who didn't read the article all the way through, it's a better thing when the memory is shared (single cache) rather than separate (dual cache). But that is harder to design, so for these first-generation dual-core chips from Intel and AMD, they are using separate caches for each core. (IBM's dual core Power4 processor has a unified cache.) At some point down the road, they will likely unify them to increase performance.
What will happen to those who must pay a royalty fee per CPU? Will companies that charge for each CPU begin to charge for two, or will it still be viewed as one...?
Real programmers can write assembly code in any language. -- Larry Wall
Sigs cause cancer.
"Montecito", a spanish word, literally translates as "a small monte".
Thus I predict that this will be followed by a quad-core chip called the "monte", an 8-core chip called the "montote" (the big monte), and finally a 16-core chip known as "The Full Monte".
Norman Cook's Ode to Sl
You probably don't want to have both chips fighting over the cache, and slowing things down; I'm sure doing The Right Thing[tm] will take a while for them to work out. Until then, just pretend that they're mostly separate chips on the same silicon.
Maybe in the future they'll come up with some more advanced cache designs that can share some cache and improve performance. But until then, expect to see it in the next generation of value chips. (Overclocked dual-core Celerons? Nifty!)
pb Reply or e-mail; don't vaguely moderate.
Can I have a 64bit OS too please? (no not linux)
Didn't you hear? According to SCO, Linux doesn't even exist!
Slashdot = ((Technology + Politics) / Trolls) % Grammar Nazis
The benefit is that you get two CPUs in less space. You might even be able to get two CPUs in a system designed to support only one (because it has only one slot.) And if your system already has two CPU slots, this might give you four CPUs.
It might also use less power than two CPUs, but I wouldn't hold my breath on that one.
Yes. Actually, I would have thought that the reverse (shared cache) would have been news instead.
The point is that you can have very fast inter-CPU communication, the motherboard gets cheaper to produce, you don't have to double the cooling machinery... and they're probably cheaper to produce also (one package instead of two).
I assume the cores are actually produced one-by-one or it'd get big and very expensive.
Belief is the currency of delusion.
Kinda. I could see a couple advantages though:
1) Fast interconnect between chips. Instead of having to transfer data over the bus, if the CPU needed info from the other CPU it could transfer over a high speed connection without having to involve other parts of the machine (bus). AMD already has a sort of high speed interconnect to their multi-cpu motherboards instead of splitting like intel does but I would imagine that this would still be faster.
2) Less motherboard room needed. You don't need dual cooling fans, dual power / interface lines and have more room overall on the motherboard.
It's not much different - that's the point. 2 processors in a single socket saves a lot of money production-wise, and that should pass on to the consumer. AMD has said theirs is backward compatible, and that's huge. You already got a single cpu opteron workstation? Well now you can have a dual cpu one for the price of a single cpu upgrade. That kicks ass.
Actually, the left core will be verbal, creative and be really good at processing visual information, while the right core will be logical, good at number crunching and have no style sense whatsoever.
Despite what Sun has to say on the matter, Itanium system and processor sales have been increasing steadily since 2H 2000; prior to that, there was a big lull in demand because few wanted to buy underperforming Itanium 1 machines when the Itanium 2 was expected rather soon (and announced relatively early).
Today, in contrast, there _doesn't_ appear to be a lull in demand for Itanium 2 machines, even though Montecito (Itanium 3) has been announced in a fair bit of detail. That's because for some applications (in HPC, high-end database work, certain EDA/CAD/CAE work, and ultra-high-reliability computing) Itanium 2 systems are basically unbeatable. They also run some OSes which are very important to some organizations, such as HP-UX and OpenVMS.
Long story short, the Itanium 1 was something of a flop, the Itanium 2 is really pretty decent, and everyone is expecting the Itanium 3 to offer pretty decent _price/performance_, in addition to best-bar-none performance when it is released next year.
Well I would buy a computer now but I have no cash
Is that a pun?
Javascript + Nintendo DSi = DSiCade
Sure you can
Oh you want one for the AMD64?
How about these?
When encryption is outlawed, ou++1!@(93j++js-d9298yIUH(*Y24JKB!~
Dual core microprocessors are not a new development. IBM with their POWER4 and POWER5, HP and the PA-RISC 8800, and TI with their OMAP processors are definitive proof that multi-core solutions are not just a stop gap in increasing the performance delta of modern silicon.
Dual core processors are a natural evolution in the development of general purpose and even specialized computing devices. SMT was to be a boon for the EV8, but later found its way into the Pentium 4. Multiple logical processors were just a first step.
It should be interesting to see just what AMD can do with both SMT and a dual core design.
It just had better run BSD. = )
Wrong. The PS2 has a 64bit MIPS CPU with *128bit extensions*. Think MMX or SSE.
It's "RISC CPI for the CISC guy"
I can't wait to see what they do to his nonorthogonal register file.
--Rob
The downside is that as the AMD chips are going to be backward-compatible with older boards, I imagine that the dual-core chip will still only have the single 128-bit memory controller.
While that will still give you twice as many available CPU iterations, that means that the two cores will be fighting for memory bandwidth. In the case of Intel's chips, that's business as usual. But for the Opterons, where each processor brings its own memory controller, it just doesn't feel right. : (
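To put a rough number on that contention, here is a back-of-the-envelope sketch. The 128-bit controller width comes from the post above; the DDR400 transfer rate (400 million transfers per second) and the even split between cores are assumptions made purely for illustration.

```python
def peak_bandwidth_gbs(bus_bits, mega_transfers_per_s):
    """Theoretical peak bandwidth in GB/s for a memory bus."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9

# Assumption: DDR400 memory behind the single 128-bit controller.
total = peak_bandwidth_gbs(128, 400)

# Assumption: both cores are busy and split the bandwidth evenly.
per_core = total / 2

print(f"shared controller: {total:.1f} GB/s, per core: {per_core:.1f} GB/s")
```

Real workloads rarely hit the theoretical peak, and a core that mostly stays in cache takes far less than its half, so treat this as an upper bound on the worst case.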
steve
Oh, you're not stuck, you're just unable to let go of the onion rings.
Been that way for many years. Is rock stable and secure.
Granted it is on a mini, but we have enjoyed 64bit computing for nearly 10 years. Even have some power5s in production.
There are great OSes other than the ones used on PC hardware... too many "geeks" forget that.
* Winners compare their achievements to their goals, losers compare theirs to that of others.
Kernel Panic Core Dumped... Still Panicking Dumping Second Core...
I like Itanium. It's a pretty neat architecture which crushes most before it in FP intensive tasks. It is clear why it has done well in HPC. But HPC is nothing more than a niche.
Now here are the problems:
32 bit (x86) performance sucks. All those apps you've spent years developing will need re-writing (a simple recompile is often out of the question).
HP (in collusion with Intel) killed perfectly good architectures in Alpha and PA-RISC in an effort to get people to migrate to IA-64. A few may have made the move but this has mostly served to push people towards the vastly cheaper x86. HP, and to a lesser extent Intel, should provide what their customers want, not what they think is best for them.
It still uses a shared bus architecture. There are diminishing returns as you add more processors.
Itanium requires massive caches to get the best from it. Cache = Silicon = Cost. It is clear that a large scale seeding exercise is still underway with Itanium systems being provided at or below cost. Looks like it will be a long time before there will be any return on the billions invested in Itanium.
For all intensive porpoises your a bunch of rediculous loosers
Hyperthreading is simply a second context. It lets you run a second thread at the same time by using the unutilized capacity of existing functional units and is largely useful only when Intel's branch prediction fails and the chip would otherwise be paying the ultimate penalty for its long, long, LONG pipeline.
In other words, HT is an ingenious method for making up for the fact that the Pentium 4 is horribly inefficient.
It would be better to stick a whole bunch of simple cores on a single chip at a lower clock rate and have them work cooperatively, if only we used more multithreading. This is pretty much where Intel is planning to go, with their multiple-core chips based on the Pentium-M. Or, so the rumors say.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
hence the block of RAM per CPU.
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
Pulling in a post of mine from a completely different forum...
:)
The G5 is a 64 bit processor and OSX Panther is a 64 bit OS.
Panther is not a true 64 bit OS in the traditional sense of the word. It does not support 64 bit addressing[1]. It does however support the use of 64 bit math operations and the saving of related registers on the CPU.
Tiger (Mac OS 10.4) will have the first steps towards a true 64 bit OS by allowing 64 bit addressing (virtual addressing) to be used for libSystem only based tools (command line applications, no GUIs, etc.). At least that is all that Apple has so far committed to doing in Tiger at this time (cannot say more because of NDA).
[1] Note the Panther kernel has support for 64 bit physical addressing so the system can utilize more than 4 GB of RAM (hardware-wise supporting up to 16 GB of RAM) but it does not support 64 bit virtual addressing (what applications use) at this time.
With a 32-bit OS and 32-bit applications you can only access a maximum of 2 or 3GB of data at a time (possibly even less due to memory fragmentation). This may or may not affect what you do.
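For reference, the arithmetic behind those 2 or 3GB figures can be sketched like this. The 2GB and 1GB kernel reservations are the commonly cited 32-bit user/kernel splits, assumed here for illustration rather than taken from any particular OS in this thread.

```python
GIB = 2**30

def user_space_gib(address_bits, kernel_reserved_gib):
    """Virtual address space left for one application, in GiB."""
    total_bytes = 2**address_bits
    return (total_bytes - kernel_reserved_gib * GIB) / GIB

# A 32-bit pointer can name 2**32 bytes = 4 GiB in total.
print(user_space_gib(32, 2))  # kernel keeps 2 GiB (assumed split)
print(user_space_gib(32, 1))  # kernel keeps 1 GiB (assumed split)
```

Fragmentation of that address space is what pushes the largest single allocation below even these numbers.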
If you do indeed have files as big as DVDs, it would certainly help with editing those files. You CAN break those up into chunks, only having 2GB or less in memory at any given time, and for the most part this works ok, however it does tend to be a bit of a kludge at the best of times, and sometimes it just flat out doesn't work.
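The chunking workaround described above might look roughly like this minimal sketch. The function name and the 64MB chunk size are made up for illustration, and a checksum stands in for whatever per-chunk processing an editor would really do.

```python
import hashlib

def checksum_in_chunks(path, chunk_size=64 * 1024 * 1024):
    """Hash a file of any size while holding only one chunk in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)   # at most chunk_size bytes at a time
            if not chunk:                # empty read means end of file
                break
            digest.update(chunk)
    return digest.hexdigest()
```

This works nicely for a streaming job like a checksum. It gets kludgy as soon as the operation needs random access across chunk boundaries, which is where a flat 64-bit address space saves the bookkeeping.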
As you correctly guess, servers are the first situation where this really makes sense. If you've got a database that is more than 2GB in size, you REALLY want a 64-bit system, otherwise you'll tend to take a big performance hit. Many high-end workstations require 64-bit systems as well to process all the data.
So, where is the benefit for the end-user? Well that depends on the user. First off, having more than 2GB of physical memory on a 32-bit processor requires some really ugly hacks to make things work. They do work, but it is a really dumb idea. It was annoying and crappy when we were forced to do it back in the 16-bit days, and it hasn't gotten any better. Secondly, people are using bigger and bigger data files on their home PC, editing larger pictures and videos, playing games with more graphics and sound; some even run into issues with types of databases (I know my Usenet newsreader sometimes craps out when I'm downloading too much pr0n because of database limits). Basically you might not need it, but someone else might. The best part about it though is that 64 bits is "free".
Basically you've got a 64-bit CPU that is no more expensive than competing 32-bit chips and Microsoft has said that 64-bit WinXP Pro will sell for the same price as 32-bit WinXP Pro, so really the question is not so much "Why" do we need 64-bit, but "why not?"
No... on AMD chips the memory bus is dedicated. Intel chips have a very different system architecture (which does saturate at ~2 CPUs), but AMD gives each chip its own memory controller and memory - scales perfectly. (By the way, this isn't new ... big iron (e.g. Sparc) has been doing this for years).
Currently, the fastest FSB to date is 1033MHz - almost 1/3 of the max clock speed of the processor. Given that Intel's integer units operate at twice the clock speed, the fastest parts of the chip operate at 6 times faster than memory.
That's why modern processors use pipelining (in x86, since 486's) and caches (since, uh, 8086s ?). FSB only comes into play in 1-2% of the memory accesses. But those memory accesses are pipelined, interleaved, with multiple outstanding requests issued by the out-of-order pipeline ... processor designers have been working around a slow bus for years, and the FSB is only the bottleneck in extreme, pathological cases.
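A rough way to see why a 1-2% miss rate takes most of the sting out of a slow bus is the standard average-access-time estimate. The latencies below (3 cycles for a cache hit, 200 cycles for a trip over the FSB) are invented round numbers, not measurements of any real chip.

```python
def average_access_cycles(miss_rate, cache_cycles, memory_cycles):
    """Average cost of one memory access, weighted by where it is served."""
    hit_rate = 1.0 - miss_rate
    return hit_rate * cache_cycles + miss_rate * memory_cycles

# Assumed latencies: 3 cycles from cache, 200 cycles from main memory.
for miss_rate in (0.01, 0.02, 1.0):
    cycles = average_access_cycles(miss_rate, 3, 200)
    print(f"miss rate {miss_rate:.0%}: {cycles:.2f} cycles on average")
```

Under those assumptions the average access costs a handful of cycles instead of 200, and that is before counting the overlap from multiple outstanding requests.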
The monolithic, synchronous, central-processing-unit design of the architecture prohibits optimizations such as using memory controllers for block moves and having dedicated IO processors
Ever heard of DMA? A DMA controller does that memory transfer ... there are 2 DMA controllers with 8 channels on your current x86 PC. Heck, high-end PCI cards even have their own onboard DMA engines (it's called bus-mastering). I/O offload? You've obviously never written a device driver... modern drivers issue a few "start" instructions, then sleep; eventually the device completes the I/O and issues an interrupt to inform the CPU it's done. The last computer I had that stalled on disk I/O was running MS-DOS - nine years ago.
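For anyone who hasn't written a driver, the "start, sleep, interrupt" pattern can be mimicked in user space. This is only a toy: FakeDevice and driver_read are hypothetical names, a thread stands in for the hardware, and a threading.Event stands in for the interrupt line.

```python
import threading
import time

class FakeDevice:
    """Hypothetical device that finishes I/O on its own, then 'interrupts'."""

    def __init__(self):
        self.done = threading.Event()    # stands in for the interrupt line
        self.result = None

    def start_read(self, nbytes):
        # Returns immediately; the transfer proceeds without the caller.
        threading.Thread(target=self._transfer, args=(nbytes,)).start()

    def _transfer(self, nbytes):
        time.sleep(0.05)                 # pretend the hardware is busy
        self.result = b"\x00" * nbytes   # data lands in memory (the DMA part)
        self.done.set()                  # raise the 'interrupt'

def driver_read(device, nbytes):
    device.start_read(nbytes)            # issue the "start" command
    device.done.wait()                   # sleep until the interrupt arrives
    return device.result

print(len(driver_read(FakeDevice(), 4096)))
```

While the caller is blocked in wait(), the scheduler is free to run something else, which is the whole point: the CPU is not spinning on the transfer.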
In all fairness, I thought exactly the same things four years ago. Then I learned about modern computer architecture. And in today's world (and, in fact, all PCs for the past ten years), your points are completely - and utterly - irrelevant.
A witty [sig] proves nothing. --Voltaire