AMD Showcases Quad-Core Barcelona CPU
Gr8Apes writes "AMD has showcased their new 65nm Barcelona quad-core CPU. It is labeled a quad-core Opteron, but according to InfoWorld's Tom Yager, it is really a redefinition of x86. Each core has a new vector math processing unit (SSE128), separate integer and floating point schedulers, and new nested paging tables (to vastly improve hardware virtualization). According to AMD, the new vector math units alone should improve floating point performance by 80%. Some analysts are skeptical, waiting for benchmarks. Will AMD dethrone Intel again? Only time will tell."
SSE+ operations up until now were processed 64 bits at a time within the processor
Hmm...do you mean specifically on AMD's hardware? That stopped being true for Intel starting with the Core, which has 1-cycle latency on SSE instructions.
Keeping in scientific fact, how much heat has to be generated for 1 MIPS?
The fact is, absolutely none. It has been shown that only the destruction of information via AND and similar instructions creates entropy (heat). As long as you use only 3 types of gates (pass through, not, xor), you can create a heat-free CPU. Provided we do want to check for bit errors, we could maintain a very low heat via ECC-like checking. Estimates on that are 10^8 times lower than at present.
We could keep 98% of our efficiency of current day chips if we switched to this method.
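The claim above that heat comes only from destroying information is usually stated as Landauer's principle: erasing one bit costs at least k*T*ln(2) of energy. Here is a rough sketch of what that floor works out to for the "1 MIPS" question. The 32 bits erased per instruction and the 300 K temperature are my own assumptions for illustration, not figures from the thread. The second half just counts outputs to show why AND loses information while XOR, in its controlled-NOT form, does not.

```python
import math
from itertools import product

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)

def landauer_joules_per_bit(temp_kelvin=300.0):
    """Minimum energy dissipated when one bit of information is erased."""
    return BOLTZMANN_J_PER_K * temp_kelvin * math.log(2)

def min_watts(bit_erasures_per_second, temp_kelvin=300.0):
    """Theoretical lower bound on power for a given bit-erasure rate."""
    return bit_erasures_per_second * landauer_joules_per_bit(temp_kelvin)

# ASSUMPTION (not from the thread): every instruction erases 32 bits.
ASSUMED_BITS_ERASED_PER_INSTRUCTION = 32
watts_for_1_mips = min_watts(1e6 * ASSUMED_BITS_ERASED_PER_INSTRUCTION)

def and_gate(a, b):
    # Two input bits in, one bit out: the inputs cannot be recovered.
    return (a & b,)

def cnot_gate(a, b):
    # XOR that keeps one input around (controlled-NOT): fully reversible.
    return (a, a ^ b)

def distinct_outputs(gate):
    """Count distinct outputs over all four 2-bit inputs."""
    return len({gate(a, b) for a, b in product((0, 1), repeat=2)})

if __name__ == "__main__":
    print(f"Landauer limit at 300 K: {landauer_joules_per_bit():.3e} J per bit")
    print(f"Floor for 1 MIPS under the assumption above: {watts_for_1_mips:.3e} W")
    print("AND distinct outputs: ", distinct_outputs(and_gate))
    print("CNOT distinct outputs:", distinct_outputs(cnot_gate))
```

A gate that maps four possible inputs onto fewer than four outputs has thrown information away, and that is the step the bound charges for. This only illustrates the lower bound; it says nothing about how close a real chip could get.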
In my own benchmarks (generic C integer and floating point scientific code) I have found that the Core Duo and Core 2 Duo aren't all that quick compared with an AMD64. Clock for clock the AMD64 Opterons we have are about 50% quicker than an equivalent Core 2 Duo for integer work. I know this doesn't agree with all the usual magazine benchmarks but they are heavily biased towards using SSE instructions where possible and it is SSE where the Core 2 Duo has been a real improvement over previous Intel designs and also bests the AMD chips. Hopefully, AMD has recognised this and the new SSE implementation will bring them back on par with Intel for these benchmarks but even today an AMD64 processor is a beast and more than a match for anything Intel produces.
"I have the attention span of a strobe lit goldfish, please get to the point quickly!"
"Lets make a Octa-core processor!"
Oh, here's one. Though it's been out since before Intel had quad-core chips.
No, but a good, hard, well-aimed, holding-nothing-back kick in the nuts can leave them impotent, so they'll have to do some ugly procedures to survive it in the long run. A couple of identical blows in the meantime could leave them sterile, so if the current setups begin to die out and Intel has no more babies waiting, they will not be dethroned, but will get an honorable mention in the history books.
If you don't like my sig then don't read it.
Clock speed doesn't mean crap anyways. It's all in the code. I see guitar tuning programs for the computer... TEN megs in size, slow as hell, and inaccurate! I believe APTuner is FAR smaller than most, faster and far more accurate. People just don't know how to code, plus the fastest ways to code are copyrighted, which they shouldn't be since they'd be utterly obvious to any programmer with that standard "ordinary" knowledge in that language, so one has to make workarounds that inevitably end up being slower. No more oldskool hacker ethic, now it's greed.
Please explain this. Do I understand correctly that you think some SSE instructions are 16 bytes? Issuing is one thing, and latency another. In most cases I've found AMD/Intel can issue 1 mulps/shufps/adds per cycle, the *ss instructions at 2 per cycle (AMD sometimes 3 per cycle). If you mean that only the first 64 bits, 2 components, are computed in a cycle and the next 2 components in the next cycle, okay. Except that VMX does a 4-component multiply-add in a single cycle, which would mean that SSE sucks at its GHz.
What really interests me is how it compares in single and double precision calculations. If AMD gets into the range of Itanium performance, will Intel follow and kill their own Itanium by boosting Core 2 FP?
If AMD can produce a better performing chip at 65nm, then who the hell cares if Intel - or anyone else - move to a 45nm process?
They care. Just moving the chip from 65 nm to 45 nm means you can produce twice as much on the same silicon wafer. Also, if a 65 nm chip performs well, then a 45 nm version of it (with slight modifications of course) will work even better.
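For what it's worth, the "twice as much on the same wafer" figure falls straight out of the geometry. A quick sketch, assuming an ideal shrink where every feature scales with the node size and ignoring yield, wafer-edge loss, and the parts of a chip that don't shrink (all simplifications on my part):

```python
def shrink_area_ratio(old_nm, new_nm):
    """Area of the shrunk die relative to the original, assuming ideal linear scaling."""
    return (new_nm / old_nm) ** 2

def dies_per_wafer_gain(old_nm, new_nm):
    """How many times more dies fit on the same wafer area after the shrink."""
    return 1.0 / shrink_area_ratio(old_nm, new_nm)

if __name__ == "__main__":
    ratio = shrink_area_ratio(65, 45)
    gain = dies_per_wafer_gain(65, 45)
    print(f"65 nm -> 45 nm: die area x{ratio:.2f}, dies per wafer x{gain:.2f}")
```

Under those assumptions the die shrinks to a bit under half its area, so roughly twice as many fit per wafer.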
Opus: the Swiss army knife of audio codec
Likely Intel has an edge because they are [almost] ready for 45nm process, while AMD is just getting started on 65nm.
But it is interesting to see the two companies approach the problem from different ends. Do you improve the silicon process or do you alter the architecture and instruction set? I bet you the best answer will be to do both.
quad cores that actually share cache would be nice. these double duals kind of suck because architecturally they can never share cache. although AMD and Intel don't have very many dual cores that can even share cache within themselves. (although I think Intel is releasing one soon?)
“Common sense is not so common.” — Voltaire
The problem I have with performance/watt is that it distorts the true "value" to the system owner. You NEED to break it down, because while power usage is important, the real issue comes down to "is the higher performance WORTH the extra power the chip draws". I personally don't CARE about performance/watt, except when the power draw is excessive, and I believe that is how MOST people will look at it.
Most laptop processors have a higher performance/watt than desktop processors because they are designed with battery life in mind. What people want is a processor that goes faster, but doesn't suck a huge amount of power to get that performance increase. The Pentium 4 got a LOT of flak toward the end with Prescott because the power demand was so far above the benefits that extra power provided. If it had drawn only ten percent more than an Athlon 64 at the time, then no one would have been bothered by it, unless you are talking about a data center where the price for electric power is a very important consideration.
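To put the "break it down" point in numbers, here is a small sketch. The two chips and all of their scores and wattages are made up purely for illustration; they are not measurements of any real processor.

```python
def perf_per_watt(score, watts):
    """Benchmark score divided by power draw."""
    return score / watts

def extra_watts_per_extra_point(slow, fast):
    """Marginal cost: additional watts paid for each additional benchmark point."""
    (slow_score, slow_watts), (fast_score, fast_watts) = slow, fast
    return (fast_watts - slow_watts) / (fast_score - slow_score)

# HYPOTHETICAL chips as (score, watts); invented numbers, not real data.
laptop_chip = (50.0, 25.0)
desktop_chip = (100.0, 80.0)

if __name__ == "__main__":
    for name, (score, watts) in (("laptop", laptop_chip), ("desktop", desktop_chip)):
        print(f"{name}: score {score}, {watts} W, {perf_per_watt(score, watts):.2f} points/W")
    cost = extra_watts_per_extra_point(laptop_chip, desktop_chip)
    print(f"going desktop costs {cost:.2f} extra W per extra point")
```

On performance/watt alone the made-up laptop part wins, even though the desktop part is twice as fast. Looking at the extra watts per extra point is one way to ask whether the speedup is worth the power.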
The only reason the whole topic of fab process improvements is even brought up is because Intel is afraid of AMD. Intel has amazing resources when it comes to money and the ability to pay a lot more into their R&D, but in spite of this, AMD was seen as the performance leader before the Core 2 Duo came out, and AMD has the potential to come back and beat Intel again once K8L is released. It goes to show that if you spend some time looking at how to improve the overall system design and how things fit together, performance will go up a LOT.
Actually, from the important perspective of the difficulty of building a new machine around it, the Intel "dual-dual" core chips really are quad core -- they drop into the same socket as the previous dual core chip, placing four cores into the socket. That certainly helped speed the time to market for the chip.
If you mod me down, I shall become more powerful than you could possibly imagine.
I will not be surprised if AMD dethrones Intel again. It is a classic Intel vs. AMD battle...
I am not sure Intel ever did beat out AMD.
I went down to Best Buy where the Intel rep was hard peddling a Core 2 Duo machine and compared his $1500 machine to an AMD X2 clearance one for $600. I had nothing to do that day but be a clown, so I went and got a DVD with software on it, and said these are both XP, right? Copy the contents to the hard drive and compress it. I am going to measure it. The Core 2 Duo results were almost the same, less than 1% different, at more than twice the price. Not only that, I put my hand over the back of the machine to see how warm the exhaust was. The AMD was noticeably cooler. So I walked out with an AMD X2.
So while my benchmark and testing were less than perfect, it was an end-to-end test factoring in everything. I still bought AMD. Nice machine too, runs much cooler than my P4 2.8HT. Certainly a lot faster.
The proper fix is to run multiple copies of the benchmark.
I'm using Linux, with single-threaded apps, but so what? I run lots of things at once:
X, window manager, xterm, editor -- that is 4, plus the kernel
X, xterm, tar, gzip -- that is 4, plus the kernel
X, xterm, make, bash, cc1, cc1, cc1, gas, gas, ld... -- that's a lot of things!
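A minimal sketch of the "run multiple copies" idea: start N instances of the same single-threaded job at once and see how total throughput scales. The workload here is a hypothetical stand-in (a small Python loop run in a fresh interpreter); swap in whatever benchmark command you actually care about.

```python
import subprocess
import sys
import time

# HYPOTHETICAL stand-in workload: a short CPU-bound loop in a separate interpreter.
WORKLOAD = [sys.executable, "-c", "sum(i * i for i in range(200000))"]

def run_copies(cmd, copies):
    """Launch `copies` instances of cmd together; return wall-clock seconds until all exit."""
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for _ in range(copies)]
    exit_codes = [p.wait() for p in procs]
    elapsed = time.perf_counter() - start
    if any(exit_codes):
        raise RuntimeError(f"a benchmark copy failed: {exit_codes}")
    return elapsed

def throughput(cmd, copies):
    """Completed jobs per second when `copies` instances run concurrently."""
    return copies / run_copies(cmd, copies)

if __name__ == "__main__":
    for n in (1, 2, 4):
        print(f"{n} copies: {throughput(WORKLOAD, n):.2f} jobs/s")
```

If jobs per second keeps climbing as you add copies, the extra cores are doing real work even though each program is single-threaded. With a workload this short, process startup time dominates, so use a longer-running job for real numbers.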