Intel Harpertown (Penryn) Quad CPUs Benchmarked
unts writes "The Intel Developer Forum is currently running in San Francisco, and Intel is showing off the up-coming Harpertown processors based on the Penryn core. HEXUS got hands on with a test system and ran some performance tests: 'Harpertown is a better quad-core processor than Clovertown: it's as simple as that. More L2 cache will gobble-up larger application data-sets and a higher FSB, on select models, will ensure that per-CPU bandwidth is less of a concern.'"
While the comparisons will invariably be made between this and AMD, let us not forget that Intel is getting stiff competition from left field as well. The arrival of the SPARC Niagara II processor is about to make the realm of high-end computing a lot more competitive than it has been in years. I, for one, can't wait to see a real head-to-head-to-head, AMD and Intel quads vs the 8-core monstrosity that is SPARC.
Karma Whoring for Fun and Profit.
This article goes into some of the juicy technical details about Penryn/Nehalem and covers a lot of ground about what Intel had to show at the IDF.
The article is also relevant to this discussion, "End of Moore's Law in 10-15 years?". FTA:
A much more in-depth review is available at The Tech Report: http://techreport.com/articles.x/13224
Bottom line: The Niagara microarchitecture is meant for a particular niche.
The Raven
Visit some of the standard sites (AnandTech, Hardware info, TechReport, etc.) for various reviews. Here are some to get started on:
link
link
link
link
link
link
Quote from a poster at another site that I found interesting: What's really sad is that more people have benchmarked Harpertown than Barcelona, and yet one of these chips has "launched", and the other is ~2 months away.
Another interesting quip:
from: link
Processors have had multiple layers of interconnect for decades.
Transistors, however, have generally been on one layer since the advent of the planar integrated circuit. Although there have been some advances in putting passive components such as capacitors and floating gates (for DRAM and flash, respectively) on top of active transistors, or in orienting the transistors themselves vertically instead of in the plane, a general 3D circuit is very much a future technology that's only presently being researched.
As a hack, people have tried "stacking" layers of pre-fabricated planar chips (usually DRAM or flash memory chips), but there have generally been problems with evacuating the heat from the inner layers of these types of devices, which is why to date they have been restricted to low-cycle-time devices. Although all parts of a processor are generally doing something all the time, only a small part of a memory device is active at any moment. This gives memory fewer heat issues than processing-type devices, which is why stacking is really only being worked on for memory first.
Soon people will get 3d circuits going, but they certainly haven't been doing 3d circuits for decades...
Here's an old image which shows Intel's current roadmap: http://img366.imageshack.us/img366/5313/1775largelongtermroadmap7fs.png
Basically, Intel releases a new architecture every 2 years and in between that they release a die shrink/derivative.
Penryn is mainly just a die shrink of Merom (codename for the laptop version of the Core 2). Merom was a 65nm chip and Penryn is a 45nm chip using the same architecture. Next they will release a new architecture using 45nm (codename Nehalem), then they will release a die shrink of Nehalem using 32nm, and so on and so forth...
Here's a quick rundown:
2006: Core 2 architecture released at 65nm
2007: Die shrink of the Core 2 architecture from 65nm to 45nm
2008: New architecture (code name Nehalem) released at 45nm
2009: Die shrink of the Nehalem architecture from 45nm to 32nm
2010: New architecture (code name Sandy Bridge, formerly known as Gesher) released at 32nm
2011: Die shrink of the Sandy Bridge architecture from 32nm to 22nm
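For anyone who wants to see the alternating pattern spelled out, here is a minimal Python sketch that regenerates the rundown above. The years, process nodes and code names come straight from the list; the function name `tick_tock` and its parameters are made up for illustration and are not anything Intel publishes.

```python
# Sketch of the alternating cadence described above: a new architecture one
# year, a die shrink of that architecture the next. All names here
# (tick_tock, its parameters) are hypothetical, chosen for illustration.

def tick_tock(start_year, first_node, shrink_nodes, architectures):
    """Yield (year, description) pairs: new architecture, then its die shrink."""
    year = start_year
    node = first_node
    for arch, next_node in zip(architectures, shrink_nodes):
        # Even step: new architecture on the current process node.
        yield year, f"{arch} architecture released at {node}nm"
        # Odd step: the same architecture shrunk to the next node.
        yield year + 1, f"Die shrink of {arch} from {node}nm to {next_node}nm"
        year += 2
        node = next_node


# Values taken from the rundown in the post.
roadmap = list(
    tick_tock(2006, 65, [45, 32, 22], ["Core 2", "Nehalem", "Sandy Bridge"])
)

for year, step in roadmap:
    print(year, step)
```

The only point of the sketch is that each process node is used twice: once for the shrink of the old architecture and once for the launch of the next one.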
Just look at the STREAM benchmark numbers and you'll see clearly that AMD has been way ahead of Intel when it comes to RAM bandwidth. I just benchmarked a dual quad-core Xeon myself (Dell 2900) and I could not believe the poor results I got. One app running on the system can get up to around 3,500 MB/s. Put just two tasks running together (taskset'ed to different chips), and they will each get around 2,600 MB/s. From there on, total aggregate bandwidth tops out at 5,200 MB/s and stays there, no matter how many simultaneous tasks you run (it will of course degrade if you run more than eight tasks, but you get the point).
Dual-socket Opteron machines from two years ago can get to 15,000 MB/s aggregate, easily.
So, I'd really like to know if Intel is planning to improve things in this department.
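To make the arithmetic in the figures above explicit, here is a rough Python sketch. It is not a benchmark: it only plugs in the numbers quoted in the post (3,500 MB/s for one task, a 5,200 MB/s aggregate ceiling, 15,000 MB/s for the dual-socket Opteron) and assumes, as a simplification, that the ceiling is shared evenly between tasks. The constant and function names are hypothetical.

```python
# Back-of-the-envelope model of the figures quoted in the post.
# Assumption (not measured): the aggregate ceiling is split evenly among
# tasks, and no task can exceed the single-task figure.

SINGLE_TASK_MBPS = 3500        # one task alone on the dual quad-core Xeon
AGGREGATE_CAP_MBPS = 5200      # observed ceiling with two or more tasks
OPTERON_AGGREGATE_MBPS = 15000 # dual-socket Opteron figure from the post


def per_task_bandwidth(n_tasks):
    """Estimated MB/s each task sees when n_tasks run at once."""
    if n_tasks < 1:
        raise ValueError("n_tasks must be at least 1")
    return min(SINGLE_TASK_MBPS, AGGREGATE_CAP_MBPS / n_tasks)


def aggregate_bandwidth(n_tasks):
    """Estimated total MB/s across all running tasks."""
    return per_task_bandwidth(n_tasks) * n_tasks


for n in (1, 2, 4, 8):
    print(f"{n} task(s): {per_task_bandwidth(n):.0f} MB/s each, "
          f"{aggregate_bandwidth(n):.0f} MB/s total")

# How far ahead the Opteron aggregate figure is, per the quoted numbers.
ratio = OPTERON_AGGREGATE_MBPS / AGGREGATE_CAP_MBPS
print(f"Opteron / Xeon aggregate ratio: {ratio:.2f}x")
```

Under that even-split assumption, eight tasks would each see roughly 650 MB/s on the Xeon box, and the Opteron aggregate figure works out to a bit under three times the Xeon ceiling.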
Nehalem, the next CPU, uses DDR3 RDIMMS with ECC.