Quad Core Battle, Intel Yorkfield vs AMD Altair
Joe writes "Yorkfield Extreme Edition based on the 45nm Penryn core architecture will meet
head-on with AMD Altair based on the 65nm K8L core in Q3 2007 as
reported by VR-Zone. Due to its
advanced 45nm process technology, Yorkfield XE is able to pack a total of 12MB
L2 cache (2 x 6MB L2) and still achieve a much smaller die size and higher
clock speed of 3.43-3.73Ghz. Yorkfield will feature Penryn
New Instructions (PNI), more officially known as SSE4, with 50 new
instructions. Yorkfield XE will pair up nicely with the
Bearlake-X chipset supporting DDR3
1333, PCI Express 2.0 and ICH9x coming in the Q3 '07 timeframe as well."
I for one... Will... wait for those 80 core CPUs Intel said they will have in a 'few' years. I'll refuse to upgrade till I get one! :D
Mod me down im a newf (wiki)
Ooooh. Blinkenlights on a processor!
This guy's the limit!
I mean, frankly... isn't 12MB L2 overkill? We're barely putting today's 2-4MB to good use.
I've said it before, I'll say it again: This is exactly why competition rocks. Soon, we'll say Moore was no prophet, he was a pessimist!
Ok, so we have all this neat info about the Intel chip; what about the AMD processor (it gets a whole sentence and a half)? If this is supposed to be a "battle", it seems that most of the comparison has already been done in favor of Intel before the event even takes place, if this article is any reference. :P
I don't reply to Anonymous posts; if you have something to say to me, identify yourself or I won't reply.
Processor speed as well as cores are just numbers to me. The only thing high processor speed means to me is that I am able to write inefficient code and get away with it. for(int i = 0; i < 9000; i++){for(int j = 0; j < 9000; j++){for(int l = 0; l < 9000; l++){System.out.println("More Cores");}}}
" I think that freedom is Americas biggest export. Atleast untill China can stamp it out for 20 cents a unit."
I've mocked intel in previous threads for not beating AMD by a much larger margin with core2 than they actually did. This stuff (mentioned in post) is the kind of performance jump I was expecting to see. Bravo! If they get this stuff out the door ontime, Intel just might make it back onto my vendor list.
Ground invasion is where it's at. Space battles can be equalized with sufficient technology taken from captured planets.
Intel is going to need that HUGE cache because of its limited FSB. It will be interesting to see how they do side by side.
The AMD with its HyperTransport could have an advantage over the Intel chip, but right now it is all pie in the sky.
I wish that AMD had access to the Intel fab tech. Just how fast and low power would their chips be if they were 65nm right now like Intel's?
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
So I strongly doubt that it will come out at the same time as K8L unless AMD mess up and their schedule slips. Sadly that is a likely possibility for AMD at the moment, their fault for sitting back on their asses (well, they didn't but Intel have performed extremely well recently after a few years adrift and caught AMD by surprise).
K8L's core is meant to be quite a bit faster at some tasks than the K8 core, over 40% apparently, although I'm sure that is not in general.
Please, no more!
Intel has MORE than 10 times the market capitalization of AMD, and they're still going "head on"? You gotta love the cache size and clock speed on Intel's desktop though, just so... big. I mean, that's better right?
I've often wondered, what are these new instructions Intel keeps thinking up? Are they some sort of fancy array processing, new addressing modes? I'm curious. Whatever happened to RISC?
Penryn new instructions = PNI = SSE4
Prescott new instructions = PNI = SSE3
Therefore SSE3 = SSE4...?
Strikes me that Intel is running out of buzzwords! Was the marketing dept. severely depleted in the last round of purges?
The next 12 months or so are going to be a very interesting time for the CPU world. All Intel needs to do is get their chips' idling power down into the same ballpark as AMD, and AMD need that 65nm process in volume *now*! I've actually been forcing myself not to look at computer stores and upgrade my workstation because I know that six months down the line there'll be something orders of magnitude better on the scene...
Moderation Total: -1 Troll, +3 Goat
Like many laws, people mention Moore's without actually knowing what it says.
Pining for the fjords
Just like the Duos are "faster" on the benchmarks that have been run- they've got 2-4 times the L2
and as long as you sit in the L2 for the large part, you're going to be "faster". Take the L2
advantage away from the Duo and it's a slower system overall than the Athlons, still.
I'll consider an Intel right now, mostly because the Jury's still out on the GMA X3000 display
GPU (All the reviews are using the old drivers which run it in the older mode of operation with
no T&L, etc.- I've not seen ANY benchmarks using the enhanced drivers or the Tungsten Graphics
developed Linux drivers for the GPU yet...); but otherwise, it's not really as compelling as
the review sites make them out to be.
I am not merely a "consumer" or a "taxpayer". I am a Citizen of the State of Texas
From the quote of the article "and higher clock speed of 3.43-3.73Ghz"
/me shakes head disapprovingly
Let the Ghz versus instructions per cycle war begin, yet again...
You've got to be kidding - a secure OS?
The only way to have a secure computer is to have it separately firewalled from the net for worms, and to run with a lowest-privileged user account, using non-MS software.
Modern games are another ball of wax, and I've actually gotten to the point of creating a separate OS installed partition for any new games.
The cesspool just got a check and balance.
No more AMD for me !! AMD is going back to the "follow the leader" mold. The leader is INTeL !! and I only follow the leader !!
I used to upgrade systems about every 3 years when CPU speed typically tripled or more.
So my first system was a 486-25.
Second system was a P-90.
Third was a 300MHz AMD.
Fourth was 1.2 GHz AMD.
Current system is a P4 2.7 GHz and it's at least 3 years old. And I don't feel any urgency to upgrade my basic system, perhaps a video card and some more RAM instead.
I simply don't see that CPU horsepower increasing in the steps like it used to. Yes, I understand multicore, more-cache, hyperthreaded CPUs are going to offer performance not indicated by something as simple as CPU speed, but is it THAT much?
-Styopa
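For what it's worth, the "tripled or more" claim in the upgrade list above can be checked with a few lines of Java. This is only a rough sketch: the class and method names are made up for illustration, and raw MHz says nothing about per-clock gains, cache, or extra cores.

```java
// Clock-speed ratios between the successive systems listed in the post above.
// Arithmetic only: raw clock speed is not a measure of real performance.
class UpgradeRatios {
    // Returns out[i] = mhz[i + 1] / mhz[i] for each consecutive pair.
    static double[] ratios(double[] mhz) {
        double[] out = new double[mhz.length - 1];
        for (int i = 0; i < out.length; i++) {
            out[i] = mhz[i + 1] / mhz[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // 486-25, P-90, 300MHz AMD, 1.2 GHz AMD, P4 2.7 GHz
        double[] mhz = {25, 90, 300, 1200, 2700};
        for (double r : ratios(mhz)) {
            System.out.println(r + "x");
        }
    }
}
```

The first three jumps are 3.6x, about 3.3x and 4x, while the last one (1.2 GHz to 2.7 GHz) is only 2.25x, which fits the feeling that the clock-speed steps have flattened out.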
I hope they come up with a new acronym. PNI is already used for Prescott New Instructions.
Support SETI@home
The Altair, AMD's Quad-Core CPU, being named for the first widely available home computer, the Altair 8800, is just too fun.
Let's hope AMD's altair is more useful.
You like your new Mac more than you like me, don't you, Dave? Dave? I asked...She said Yes.
I built myself a new system last year to replace my ancient Micron 486. It was so old that the CPU didn't even use a fan (had plenty of dust in the heatsink though), the VLB graphics board had 2 MB vram, 48 MB RAM, the monitor was a 14" CRT that had ghosting probs, and the hard drive space was less than most high-end MP3 players. Even the mouse and keyboard barely worked anymore. I pretty much milked it for every penny I paid.
I did a 100% rebuild. Now I've got a AMD 64 X2 3800+, Lian-Li case, UPS, 19" LCD, 2 GB RAM, 500 GB total SATA HD space, NVidia 7800GT, etc. I have no idea how many generations I must've jumped. I felt like a hermit walking out of his cave and blinking at the sun. I've even gone from dial-in to high-end DSL. I'll replace the CPU/MB when apps start grinding on it. 5 years prob?
It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning
Hey, did you notice? You were drooling. Stop.
True, though they're to be lauded for creating an architecture that can take advantage of a ginormous cache as well as it does. They really did learn to "work smarter, not harder" in their current chip. And of course, in the end, the consumer doesn't care how performance is achieved. :)
In the case of Intel's quad core solution, they seem to be achieving higher overall performance, as expected, but at the expense of pushing their thermal envelope back up to 130W even after the die shrink. AMD, on the other hand, has committed to shipping a 68W quad part and that's including the integrated memory controller.
Still, it's not only a few months of bragging rights that Intel is buying - Woodcrest only supports dual socket configs which means four cores at most unless you fall back to Dempsey, which simply isn't price or performance competitive with the Opteron. This at least scales them to eight core configurations. Of course on the AMD side of the house you can pick up an eight-way board, which gets you to sixteen cores today, and will scale up to a whopping thirty-two cores next year. Between that and Intel's lackluster bus technology, they've got serious problems at the high end of the server market.
I know it's not considered sexy, but I find it hard to get excited about this when I can't even justify upgrading to dual-core.
It's not that I couldn't use the speed -- I could, even on multiple cores. It's that I've outgrown wanting to spend a lot of time and space on a computer. Any computer is fast enough to work on, these days. I want something that takes little power, and little space.
For Linux, the Intel 950 graphics seems to be the way to go, which means Core 2 Duo. But if I want a flatscreen, that means DVI, and there aren't even *any* Core 2 Duo motherboards with Intel 950 and DVI.
Thanks to Core 2 Duo and AMD64, I can get a new CPU that uses as little power as my old (sub-1GHz) one. Unfortunately, as long as it needs a full-height graphics card, it'll take up just as much space.
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
That's just a stupid statement. Local cache is an integral part of modern processors and increasing cache size has been one of the more common strategies for increasing processor performance. Intel has designed this particular architecture to rely on a large L2 cache and to suggest crippling it in order to put it on a more "equal" basis with AMD is just stupid. You might as well complain about AMD's pipeline advantage or onboard memory controller advantage.
I don't think losing that segment of the server market is that big a deal. There are very few cases where you can't get a better price/performance ratio by scaling out instead of up. The only one I can think of off the top of my head is an OLTP database server. Even with that, some recent HP studies indicate that scaling out may be viable in many cases. They pitted a 32-way cluster of blades vs a 32-way Superdome server. For some operations the blades got trounced, but in others they defeated the Superdome handily.
2 socket servers start around $1k. 4 socket servers start around $7k. 8 socket servers start around $20k.
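To put those starting prices on a per-socket basis, here is a quick Java sketch. The class and method names are invented for illustration, and it uses only the rough entry prices quoted above, nothing about memory, storage or licensing.

```java
// Entry price per socket, using the rough starting prices quoted in the post above.
class SocketCost {
    // Price of the box divided by the number of CPU sockets it holds.
    static double perSocket(double priceUsd, int sockets) {
        return priceUsd / sockets;
    }

    public static void main(String[] args) {
        int[] sockets = {2, 4, 8};
        double[] priceUsd = {1000, 7000, 20000};
        for (int i = 0; i < sockets.length; i++) {
            System.out.println(sockets[i] + " sockets: $"
                    + perSocket(priceUsd[i], sockets[i]) + " per socket");
        }
    }
}
```

That works out to $500, $1750 and $2500 per socket, so on purchase price alone the per-socket cost goes up fivefold between the 2-socket and 8-socket boxes.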
The biggest diff between the two, in my opinion, is the drastically different methodology they took to achieve 4-core status.
:) I'm looking forward to the 80 core CPU another slashdotter mentioned.
Intel took two dual cores and packaged them in one unit (but inside that unit they are actually just two separate dual core CPUs) whereas AMD has made an actual quad core single die CPU.
I'm not saying Intel's method is wrong or even disadvantaged, just that it's quite different. Intel will therefore get to market much quicker than AMD, I believe, but once both are on the shelves (sans benchmarks, which we don't have yet) my money is on AMD's solution being the performance winner. Still, getting to market first is a huge bonus and will give Intel the breathing room to go back and make a true quad core single die CPU. Who knows how this will end? All I know is we win!
Tom Caudron
http://tom.digitalelite.com/
-Tom
AMD Opteron chip sets also have more pci-e lanes than Woodcrest systems have.
aka in most systems each cpu links to the 2 chip halves.
      amd                   intel
r |aaa|  |aaa| r         |aaa|  |aaa|
a-|cpu|--|cpu|-a         |cpu|  |cpu|
m |aaa|  |aaa| m         |aaa|  |aaa|
    |      |               |      |
  |chip| |chip|           |chip Set|-ram
                              |
                            |chip|
Why can't Intel give its on board video some of its own ram like ATI HyperMemory and NVIDIA TurboCache?
It's not so stupid a remark.
L2 is what helps a CPU compensate the disparity of the FSB to the main system memory's speed.
The larger it is, the faster the CPU will run- so long as the data and executables remain there.
If you halve the L2 on the Core CPUs (Matching the AMD's Cache size...), you will see a 20% or
more drop in overall performance.
If you drop about 15-20% of the performance, you see that the Core Duo is actually SLOWER than
the comparable AMD and that the only real edge is the overall TDP which goes to Intel on this
round. The architecture itself isn't as good as AMD's and "wins" speed-wise only because
Intel can jam double or triple the L2 on die because of process shrink. If you run an app that
forces L2 thrash (which is a hell of a lot of them, actually...) you'll see the Cores running
roughly neck and neck with the AMD chips in the same class- and only there because of the larger
L2.
You pop off terms, but do you HONESTLY know what they all mean? From your comment, I'd say not-
I could be wrong, but it strongly looks like you don't get it. I do- it's sort of what I studied
in my Master's studies when I was working on my MSCS years ago.
Ultrasparc T1 (Niagara) may be the last chance for server RISC CPUs.
http://revj.sourceforge.net
It is called a design decision... Intel's Core 2 Duo family is designed with a large L2 cache coupled with higher latency memory access, just as AMD's is designed with a smaller L2 cache and lower latency memory access.
Anyway, making statements like "well, if you cut this, change that, or run this specific task, then you see..." makes little sense when trying to truthfully compare real devices. Compare the devices as they are, running the work load YOU need to run, and see which is best for what YOU want to do with them.
Not always the faster one, or better one, but the one the average customer decides to buy.
"To be is to do." --Socrates
"To do is to be." -- Aristotle
"Do-Be-Do-Be-Do..." --Sinatra
There is a ton of money to be made in the high end of the market. Fewer boxes, but much higher revenue (and profit margin) per box.
:)
Whether it makes sense... depends. Some applications really do benefit from big boxes - OLTP, ERP, datamining, application servers, directory servers. Commodity virtualization has made consolidation a buzzword again.
The other thing to keep in mind is that when you move into a large environment the initial purchase price is not necessarily the primary cost - rack space costs money, as do power, switch ports, battery backed power, power distribution, cooling.
Some of the pitfalls of horizontal scaling are addressed by blade centers, but then you're looking at a cost of more like $2-3k for a dual processor system, depending on the vendor and architecture. You're also going to pay more for storage (2.5" disks), and be limited in capacity (typically no more than 2 x 74GB). There's a fair number of blades out there that will only take one disk, which precludes mirrored storage without external storage of some sort. The cost of the chassis also needs to be taken into account. Usually only a few grand, but then the switch module is a few grand more, redundant power supplies another grand or two, storage module if you need it another grand or two. If you're not already running high power equipment, getting new power circuits and equipment will cost thousands more.
Then of course there's the soft cost - aggregate support costs are likely lower with multiple boxes, operating system licensing, software licensing (especially when we're talking about stuff that runs into six figures per machine), system administrators, etc.
Not knocking horizontal scaling, mind you. We should be picking up at least seventy more blades before the end of the year. The demand for big boxes isn't going to disappear anytime soon, though.
(Not to mention that $20k is still firmly in the "volume server" category. They may sound expensive next to a $1-2k x86 machine, but compare it to $200k for an eight-way Sparc or PA-RISC system and it looks like a mighty fine bargain.)
They made a lot of smart decisions in their design. Svartalf is likely right in one sense though - the actual execution pipelines of the Opteron are theoretically more capable than those of the Core architecture. Of course that doesn't mean anything if the pipeline isn't kept fed.
Something that I might be concerned about at Intel is that some of their optimizations aren't necessarily coupled to their design decision, such as the intelligent pre-fetch in their memory controller. It was more necessary in their design, since they have inherently higher latency to overcome. There's no reason the idea couldn't be adapted to do pre-fetch into L3 cache on the new AMD design, though, to further cut down on latency (especially when they add support for FB-DIMMs).
Nope. Not dead. SPARC is up to 8 cores, and unlike above-mentioned quads, it is actually a real processor shipping and running today for reasonable $. I was configuring one a couple weeks ago. Sweeeeeeet box ....
-- Windows is not simply installed on a computer; it is inflicted.
Ok. Maybe my response was overly harsh, but I don't see anything in your analysis that contradicts what I wrote. You are making the argument that cutting L2 cache on the Core Duo erases its advantage over an equivalent AMD chip, something I don't necessarily disagree with. My argument is simply that increasing L2 cache is a valid design decision that is just as important to overall processor performance as pipeline length, branch prediction techniques, or any other component of modern processor design.
Arguing for cutting L2 in order to make the two processors more comparable is as stupid (sorry) as arguing to cut any of these other components since all components are designed to work together efficiently. You could just as well argue for increasing the pipeline length of the Athlon64 or removing the memory controller and using an Intel FSB instead of HT to make them more comparable to the Core Duo. I'm sure that making these changes would negatively impact the Athlon's performance relative to the Core Duo because AMD designed the chip around those components.
The point I'm trying to make in my long-winded way is that the complex designs of modern processors mean that changing any of them will impact performance. Intel didn't just cram more L2 cache on their new chips. They crammed more L2 cache on and then optimized the rest of the chip to use it as efficiently as possible. Of course it's going to hurt if you cut it in half.
If you run an app that forces L2 thrash (which is a hell of a lot of them, actually...)
I find this hard to believe. Not the L2 thrashing part, but the assertion that there are a hell of a lot of programs that would cause it. I've seen quite a number of performance tests covering more than just the standard benchmark scenarios and the new Intel chips still win. Not necessarily by the same margins, but they still win.
You pop off terms, but do you HONESTLY know what they all mean? From your comment, I'd say not
Why not? Because I didn't go into an in-depth analysis of processor performance scenarios? I assure you that I know what the terms mean, although only BSCS on my part, so you have me beat there.
"some recent totally unbiased HP studies indicate"
You can't be serious...
Show me any possible way to design a microprocessor that can do number crunching faster if the memory subsystem is hampered. Do you realize what would happen to that Athlon if we disabled the on-board memory controller and forced it to communicate with memory through an intermediate chipset?
All, I repeat *all* processors need fast memory accesses. It's simply how they chose to make memory accesses faster that differs. The Athlon went with an on-board memory controller and the Core uses a load of cache and more intelligent pre-fetch algorithms. They *all* need some form of "cheating" to overcome the fact that they have to go to memory.
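The point that both camps are attacking the same number from different ends can be sketched with the textbook average memory access time formula (hit time + miss rate x miss penalty). The Java below is only an illustration: every figure in it is a made-up assumption, not a measurement of any Core or Athlon part, and the class and method names are hypothetical.

```java
// Average memory access time (AMAT) = hit time + miss rate * miss penalty.
// All numbers below are illustrative assumptions, not measurements of any real CPU.
class AmatSketch {
    static double amat(double hitNs, double missRate, double missPenaltyNs) {
        return hitNs + missRate * missPenaltyNs;
    }

    public static void main(String[] args) {
        // Hypothetical "big cache" design: fewer misses, slower trip to DRAM.
        double bigCache = amat(5.0, 0.02, 100.0);
        // Hypothetical "on-die controller" design: more misses, faster trip to DRAM.
        double onDieController = amat(5.0, 0.04, 50.0);
        System.out.println("big cache:         " + bigCache + " ns");
        System.out.println("on-die controller: " + onDieController + " ns");
    }
}
```

With these assumed inputs both designs land at about 7 ns on average: halving the miss rate buys the same thing as halving the miss penalty. Which one wins in practice depends on the workload's real miss rate, which is exactly what the thread is arguing about.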
(C) You know who.
- Penryn is a quad-core notebook processor with 6MB of L2 cache. Unlike today's quad-core Kentsfield, Penryn is a "true/real" quad-core CPU with all four cores on one die. The 6MB of L2 cache is shared among the four cores.
- Yorkfield is an 8-core desktop processor with 2x6MB of L2 cache. Like Kentsfield, Yorkfield is a "cheat" and is really just two quad-core dies (two Penryns) on a single package. However, Kentsfield's early benchmarks have looked pretty good for a "cheat," so I'm reserving judgement on this design decision. I've also read that, similar to how Kentsfield trailed Conroe by a few months, Yorkfield will be released several months after Penryn.
Intel may be "cheating" with Yorkfield, but it looks like Intel will ship their "8 cores in one socket" CPU long before AMD can release their "real" 8-core CPU. Some links:
TO START
PRESS ANY KEY
Where's the 'ANY' key? I see Esk, Kitarl, and Pig-Up...
Grammer burn...
Are you running a distributed microkernel OS on it?
The rumours of yorkfield being eight core have largely fallen by the wayside. See here for more recent speculation. (The link you provided points to rumours from 2005-12, mine are from 2006-09.)
One thing you guys need to understand is that memory latency is a secondary issue in PC architecture (much as AMD would love you to believe otherwise).
Fact 1: while CPUs have gone from 2GHz, commodity DRAM latency has gone from ~100ns to ~50ns. Clearly density, cost, and power are the priorities.
Fact 2: memory vendors could reduce latency drastically if there was a desire to. Take RLDRAM-2 from Micron... 20ns latency, slightly larger die. Current cost: 10X the price of DDR2 because the market is so much smaller. I have actually asked DDR2 program managers whether they would make the part 20% lower latency at 5% more cost and they replied "absolutely not".
Fact 3: Intel may have an off-chip memory controller, but in a four-socket AMD board, 75% of RAM is attached to an off-chip controller also. 8-way is worse.
Conclusion: Winning CPU architectures have been dealing with memory latency for ages. Architectures that work around memory latency, versus needing less of it, will scale better given facts #1 and #2. AMD and Intel have just chosen different bus/cache solutions. They both deal with horrendous memory latencies. AMD added a bunch of pins to their CPU, Intel added a bunch of cache; both add cost. IMHO, AMD's N-way socket story (good due to HT) is hurt by the fact that most memory is now off-chip, while they may not have enough cache to deal with that.
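The 75% figure in Fact 3 is just (N-1)/N for N sockets. Here is a small Java sketch of that, assuming RAM is split evenly across sockets and accesses are spread uniformly over all of it. The local and remote latencies in main are invented placeholders, and the class and method names are hypothetical.

```java
// Share of RAM that sits behind another socket's memory controller,
// assuming RAM is split evenly across all sockets.
class RemoteMemory {
    static double remoteFraction(int sockets) {
        return (sockets - 1) / (double) sockets;
    }

    // Expected latency if accesses are spread uniformly over all RAM.
    // localNs and remoteNs are caller-supplied assumptions, not measured values.
    static double expectedLatencyNs(int sockets, double localNs, double remoteNs) {
        double remote = remoteFraction(sockets);
        return (1.0 - remote) * localNs + remote * remoteNs;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 4, 8}) {
            System.out.println(n + " sockets: remote fraction " + remoteFraction(n)
                    + ", expected latency " + expectedLatencyNs(n, 50.0, 100.0) + " ns");
        }
    }
}
```

Four sockets gives 0.75 and eight gives 0.875, matching the "8-way is worse" remark. An OS that places memory close to the thread using it would do better than this uniform-access assumption.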