Domain: spec.org
Stories and comments across the archive that link to spec.org.
Comments · 448
-
Re:SPECviewperf numbers?
nVidia have a popular mid-range line of "professional" 3D chips, the Quadro series (sold by Elsa in its Gloria series). These are basically GeForce chips with a couple of extra features enabled, like hardware anti-aliased lines & two-sided lighting. They're quite a bit more expensive than a consumer GeForce, but a LOT cheaper than most workstation cards.
There's a few places you can look for benchmarks on GeForce, Quadro and mid- to high-end workstation gfx cards. Currently the Wildcat 5110 pretty much rules the roost (at around $3k), with the Quadro2 Pro (under $1k) & FireGL4 (over $1k) competing hotly below that. Lesser cards (FireGL 1 & 2, Quadro, Quadro2 MXR & EX, and the older Oxygen models) can be had for well under $1k. Prices are only from memory, and are probably wildly inaccurate.
Even a standard GF2/GF3 or Radeon does pretty well, impressively so for the price. Rendering quality has been compared (for the GeForce at least), and is roughly equivalent - no major texture or polygon errors, all cards generating the occasional off pixel.
Bottom line: The majority of my customers (2D/3D FX) are switching to GeForce or sometimes Quadro cards - sought-after features include decent (not necessarily superlative) 3D app performance, dual monitor support (WITH hardware accel on both monitors!), and bang for the buck. Good DirectX support doesn't hurt either (very few cards from 3Dlabs support DirectDraw well, and some serious apps do need this).
-
SPEC2000
Yes. There is a comparison of IA64 -vs- The Rest.
You can check out the details at spec.org,
but I've found Ace's Hardware Top 20 Review to be a more concise and readable version.
You can see that IBM's POWER4 has a fp peak of 1169,
while Intel's Itanium has a fp peak of 701. In int performance, the Itanium is pretty dismal.
-
The PowerPC "G4"
At the heart of the current high-end Macs, routers, and switches is the PowerPC G4, which is what Apple and Motorola claim to be their "fourth generation" CPU that is the result of the three-way AIM alliance, which has been designing and fabbing chips in various PowerPC families since 1991.
I contend that the "G4" is a blatant misnomer by Apple and Motorola to spur sales and compete with Intel's Pentium 4 product and nomenclature. Below I'll give some historical background, technical information, and plain facts that support my claim that the PowerPC G4 is really a second-generation processor, and the broader notion that the PowerPC family has not evolved significantly since 1995-- something Apple and Motorola propaganda has repeatedly accused the competition of in recent years. But first, the background...
By 1991, the AIM alliance (Apple, IBM, and Motorola) had begun working on a single-chip implementation of IBM's RSA chipset. This was IBM R&D's attempt to hack the POWER architecture into one chip instead of several. Imagine, instead of having a 64mm PowerPC chip, having to use a 64cm PowerPC *board*. That's unacceptable to the desktop market.
Motorola brought bussing technology to the table, which had previously been intended for the "Ripfire" 88k RISC series (displaced by the PowerPC) and Apple brought years of motherboard knowledge and operating systems (A/UX, Mac OS 7, and the new, mysterious Copland project). Between these three giants, the PowerPC 601 was realized. It ranged from 50-125MHz but was soon replaced by a quartet of newer, second-generation (G2) parts-- the 602, 603, 604, and 620.
The 602 was an embedded chip, being used for satellite descramblers, stadium scoreboards, and the Nintendo64. It lacked an FPU. The 604 was a workstation-class chip that was an absolute monster. Performance was above the Pentium Pro's. The 620 was a 64-bit godhead beast that trounced all known microprocessors of the day-- but was mysteriously canned after it had been included in only a handful of beta motherboards by the Bull Group. The 603 was designed to draw little power and be cheap to manufacture, but AIM had hobbled it a bit too much-- beta testing sent it back to the lab to add L1 caches and the ability to access L2 cache. Performance afterwards was dismal, but acceptable for cheap consumer devices for the time being.
It was this enhanced PowerPC 603 that would be the basis of its own savior. Apple and Mot only admitted that the 603 was subpar along its whole production run when they had a replacement ready. By taking the L2 caching of the 620 and adding it to the 603, they had created the PowerPC 750L. And to Apple and Mot, this small change justified dubbing it a whole new generation of processor. Say Hello to the G3.
Fast forward a few years. By 2001, Motorola was shipping 800MHz PowerPC 7450s, a "G4" series part. The "G4" stands for "Generation 4," which is totally misleading. Look at it this way: the entire 74xx / G4 family is based on the "G3" family, its prime "advances" over the G3 being an FPU ripped from a PowerPC 604, and AltiVec, a questionable technology meant to operate on multiple pieces of data at once (MMX, anyone?). To get a better look at the crawl from 603 to 7450, let's look at a chart.
[censored by SLashdot Lameness Filter]
As you can see, the "G4" is really just an evolution of the 603. The more "features" Mot adds to the creaky, second-generation 603 core, the slower the chip goes. Don't believe me? Visit SPEC's site and read the numbers. A 500MHz PowerPC 7400 is just as fast as an 800MHz PowerPC 7450 at the same clockspeed. And why are IBM *and* Mot still continuing PowerPC 750 development!? Mot can no longer expect to push this aging family on to 1GHz. It's clear that for PowerPC to survive, something drastic must be done. To this end I suggest two possible courses of action.
First, since its initial run with the PowerPC 604, Motorola has introduced 3 new fabrication processes. I suggest applying these latest fabrication processes, as well as Silicon-on-Insulator and Copper wiring, to the 604e. It's highly probable that such a part could reach GHz speed. Seeing that the "G3" began at 200MHz and will top 1GHz soon, the 604e could do much better-- it started at 100MHz and made it all the way to 400MHz (not in any Mac, but in an MCG motherboard).
The other, more expensive option is to resurrect the PowerPC 620 and include all of today's latest enhancements. Give it AltiVec, a copper process, Silicon-on-Insulator, on-chip L2 cache up to 4 megs in size, the ability to address up to 8 megs of L3 cache, SpeedStep technology, etc. etc. and you'd have a chip that nothing from Intel or AMD could touch. The MHz myth would be null and void, the MHz war would be over-- and a solution to using dodgy G2 technology to drive Macs and networks the world over would be achieved.
-
Re:Already being sold...
Sorry, but a 1.0 GHz P3 does not beat a 1.4 GHz P4 on most benchmarks. You may want to check your favorite hardware news site again. Also, a 500 MHz G3 is not nearly as fast as a 1.0 GHz P3. It's not even close.
I found these specs on IBM's PowerPC site. A PowerPC 750 (G3), 500 MHz, 100 MHz bus, 1MB external L2 cache:
SPECint95: 23.8
SPECfp95: 14.5
Keep in mind that the iMac you quoted has a 256k L2 cache and a 66 MHz bus speed, so the SPEC results above are actually higher than you'll get with the iMac. Then I went over to www.spec.org and found some SPEC CPU95 results for a Dell Precision Workstation 410, 500 MHz P3, 100 MHz bus, 512k L2 cache:
SPECint95: 20.5
SPECfp95: 14.2
Unfortunately, since most of the world has moved on to the SPEC CPU2000 benchmark, I didn't find anything on the 1 GHz P3 there. So, in order to establish some sort of conversion factor between SPEC CPU95 and SPEC CPU2000 within the same processor line, I looked at the same Dell workstation (410) with a 700 MHz P3, for which both results are available:
SPECint95: 33.8
SPEC CINT2000: 307 (base)
SPECfp95: 24.3
SPEC CFP2000: 205 (base)
Now, I found two P3 1 GHz results for the Dell 420 (the difference between the two is the result of using a different compiler version):
SPEC CINT2000: 418 and 454 (both base)
SPEC CFP2000: 292 and 329 (both base)
Using the conversion factor established from the Dell 700 MHz P3 result and the results from above, I estimate the P3 1 GHz SPEC CPU95 results to be:
SPECint95: 46.0 and 50.0 (estimate)
SPECfp95: 34.6 and 39.0 (estimate)
The reason these results are more than double those of the 500 MHz P3 is that the P3 1 GHz has an on-die L2 cache and, more importantly, the P3 tested above came with RDRAM instead of SDRAM. Since I think most P3 1 GHz systems shipped with PC133 SDRAM, the results are a bit optimistic. However, they're useful for comparing to the P4, which is almost always equipped with RDRAM. Finally, I had a look at the results for a P4 1.4 GHz Dell 330 (again, two results from two compiler versions):
SPEC CINT2000: 493 and 501 (both base)
SPEC CFP2000: 524 and 527 (both base)
That's about 9% faster than the 1 GHz P3 result in CINT2000 and 79% faster in CFP2000, which, in turn, are already about double the G3/500 performance results found above. The conclusion I draw is that a Dell 1.4 GHz P4 system with RDRAM is at least twice as fast as a 500 MHz iMac in integer performance and more like three times as fast in floating point. And since this is the G3 we're talking about, which doesn't have AltiVec, you can't point to Photoshop and signal processing benchmarks to save your argument.
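The CPU95 estimates in this comment come from a simple proportional scaling, which can be reproduced as a minimal sketch. The function and variable names below are illustrative only, and the approach assumes, as the comment does, that the CPU95-to-CPU2000 ratio measured on the 700 MHz P3 carries over to the 1 GHz part.

```python
# Estimate SPEC CPU95 scores from SPEC CPU2000 base scores, using one system
# (Dell 410, 700 MHz P3) that has published results on both suites.
# Assumption: the ratio between the two suites is constant within the P3 line.

def conversion_factor(cpu95_score, cpu2000_score):
    """Ratio that maps a CPU2000 score onto the CPU95 scale."""
    return cpu95_score / cpu2000_score

def estimate_cpu95(cpu2000_score, factor):
    """Scale a CPU2000 score and round to one decimal, as in the comment."""
    return round(cpu2000_score * factor, 1)

# Calibration point quoted in the comment (700 MHz P3).
INT_FACTOR = conversion_factor(33.8, 307)
FP_FACTOR = conversion_factor(24.3, 205)

# The two 1 GHz P3 results (two compiler versions) quoted in the comment.
int_estimates = [estimate_cpu95(s, INT_FACTOR) for s in (418, 454)]
fp_estimates = [estimate_cpu95(s, FP_FACTOR) for s in (292, 329)]

print("SPECint95 estimates:", int_estimates)
print("SPECfp95 estimates:", fp_estimates)
```

The two suites use different programs and a different reference machine, so a single scaling factor is only a rough bridge between them.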
-
Re:Here's another news flash
Yep, even AMD uses the Intel compiler v5.0.1 for its SPEC 2000 Athlon MP 1800+ benchmarks.
-
Re:Model Numbers
"SPEC benchmarks are all legitimate, real world applications (or do you consider gcc to be a toy benchmark?). Where are you getting your information from?"
Check out the Anandtech review of the new Athlons referenced in the Slashdot article, or the HardOCP review, or the Tom's Hardware review.
There you'll see the AthlonXP1800 beating or matching the P4 2GHz in the majority of real-world benchmarks. When I say "real-world" benchmarks, I mean games, office applications, and graphics apps, which are what the majority of people use 99% of the time. As seen in SPEC's own FAQ, "Typically, the best measure of a system is your own application with your own workload".
SPEC benchmarks are designed to be purely CPU intensive, although I'm sure they stress the memory subsystem somewhat as well. Unlike "real-world benchmarks", they're designed specifically to stress the rest of the system as little as possible. This makes SPEC benchmarks valuable in the sense that you can compare one CPU to another more-or-less directly, but this has the downside of not making SPEC results directly relevant to day-to-day computing tasks.
Talking about SPEC benchmarks is sort of like talking about the "potential" that athletes had before they entered the big leagues. It's interesting, but doesn't really matter. What matters is how they actually perform in real situations.
-
Spec scores - more detail is available!
It's nice to have a SPEC2001 test to look at and compare, but the SPECINT and SPECFP aren't the only results of this test. If you look at the whole reports for the P4 2GHz and the Athlon 1.4GHz, you'll see that the score is based on the 12 programs. If you're looking for performance in only one type of application (as I am), you can see how the two processors compare:
Timberwolf (300.twolf) is closest to what I do:
Athlon=703, Intel=683 --> a 3% difference - it's fairly even.
GCC (164.gcc) is something else I use a lot:
Athlon=254, Intel=197 --> a 29% difference - bigger difference
To select what test matches what you do best, you can get more info on the individual integer tests here, and the floating point tests here
Still, these two applications show that the variations from the composite 18% SPECINT and 56% SPECFP advantage the P4 has can be great.
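The per-benchmark percentages quoted above can be checked with a few lines of arithmetic. This is only a sketch: the helper name is made up for illustration, and it reports the relative gap between the two quoted figures without assuming which direction counts as better.

```python
# Relative gap between two per-benchmark figures, expressed as a percentage
# of the smaller one. The numbers are the ones quoted in the comment.

def percent_gap(a, b):
    """Percentage by which the larger of a and b exceeds the smaller."""
    larger, smaller = max(a, b), min(a, b)
    return (larger / smaller - 1.0) * 100.0

twolf_gap = percent_gap(703, 683)  # 300.twolf: Athlon vs Intel
gcc_gap = percent_gap(254, 197)    # 164.gcc: Athlon vs Intel

print(f"300.twolf gap: {twolf_gap:.1f}%")
print(f"164.gcc gap:   {gcc_gap:.1f}%")
```

Both results round to the 3% and 29% given in the comment.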
Also, these pages detail the hardware setup used to reproduce these tests. We can see the Athlon was tested with 256MB and an ATA66/7200 rpm drive. The Intel was tested with the same amount of RAM and the faster ATA100/7200 rpm infamous 75GXP drive. That may explain some of the gcc differences. Also included are the compilers used to build these test programs - if you're not (or your software vendor isn't) using the Intel 5.0 compiler, then these results probably aren't as applicable to you. Still, you've got to wonder why AMD is using the Intel compiler... (it has K7 optimizations, but how much work is Intel going to put into them?)
Lots more info on SPEC2001 here.
FYI - the difference between peak and base - from the SPEC run rules:
"Peak" metrics are produced by building each benchmark in the suite with a set of optimizations individually tailored for that benchmark. The optimizations selected must adhere to the set of general benchmark optimization rules described in section 2.1 below. This may also be referred to as "aggressive compilation".
"Base" metrics are produced by building all the benchmarks in the suite with a common set of optimizations. In addition to the general benchmark optimization rules (section 2.1), base optimizations must adhere to a stricter set of rules described in section 2.2. These additional rules serve to form a "baseline" of recommended performance optimizations for a given system.
-
Re:What will they advertise now?
What a novel idea!
Sarcasm aside -- the SPEC benchmarks have been around for a long time and are well respected. You can see some SPEC CPU 2000 results here.
-
SPEC
In the end, virtually ALL the units used for measuring processor performance have died ugly, brutal deaths.
Um, no. And I'm not even talking about Q3 640x480. ;-)
Try the SPEC website, and look at CPU2000 results.
And, you know what? Within a week, we all sigh with relief, because the old units never worked anyway!
Disagree again. What do you find wrong with SPEC? It's a very useful tool for measuring CPU and memory subsystem performance (bearing in mind that for many applications other factors are important as well - which is why application benchmarks are necessary).
When was the last time you heard the MIPS or FLOPS rating for a processor? When the RISC processors came out, and scored 100 x the nearest CISC chip, we suddenly started hearing how worthless those ratings really were. (Which was true, only the people saying it had been using them to crush the competition under their feet, the previous week.)
Coincidentally, I looked at Dell's 800 MHz Itanium SPECint and SPECfp numbers this morning, along with Athlon and P4. Quite interesting. The integer performance of Itanium has been pretty abysmal so far (Athlon and P4 are both about twice as fast on average) - I can't wait for the Hammer processors myself...
What's the FLOPS rating for a Pentium IV? Anyone seen it listed on any of Intel's adverts? Curious, that.
Athlon 1.4, DDR: CINT2000 495, CFP2000 426
Pentium 4 1.8, RDRAM: CINT2000 596, CFP2000 618
Before screaming about how non-representative these scores are, you should read about the SPEC2000 methodology. It's fairly rigorous, and the benchmarks are actual real-world programs.
The benchmark scores reflect the ratio of the tested system to a Sun Ultra5_10 with a 300MHz processor. This means the P4 is (on average) 6.18 times faster at running the SPECfp benchmark suite.
Nothing is perfect, but SPEC is a useful CPU/memory benchmark.
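The "6.18 times faster" reading follows from how SPEC CPU2000 scores are normalized: the reference machine is assigned a score of 100, and the composite is a geometric mean of per-benchmark ratios. Here is a minimal sketch of that arithmetic; the function names are illustrative, and the two per-benchmark ratios in the last example are hypothetical, not taken from any published result.

```python
import math

REFERENCE_SCORE = 100.0  # the reference machine scores 100 by definition

def speedup_over_reference(score):
    """How many times faster than the reference machine a score implies."""
    return score / REFERENCE_SCORE

def composite_score(benchmark_ratios):
    """Geometric mean of per-benchmark ratios, as used for the composite."""
    log_sum = sum(math.log(r) for r in benchmark_ratios)
    return math.exp(log_sum / len(benchmark_ratios))

# Scores quoted in the comment.
p4_fp = speedup_over_reference(618)      # Pentium 4 1.8, CFP2000
athlon_fp = speedup_over_reference(426)  # Athlon 1.4, CFP2000
print(f"P4 vs reference: {p4_fp:.2f}x, Athlon vs reference: {athlon_fp:.2f}x")

# Hypothetical ratios, only to show how the geometric mean behaves.
print(composite_score([100.0, 400.0]))
```

Because the composite is a geometric mean, one very high per-benchmark ratio raises it less than it would raise an arithmetic average.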
186,282 mi/s...not just a good idea, it's the law!
-
Re:Get a "REAL" Computer
By placing these right in the firmware Sun is able to smoke x86 performancewise ALWAYS.
While I agree that Sun's computer architecture is more clean than that of the PC, SPARC CPU performance really isn't that good. Using the best benchmark (SPEC CINT 2000) with which I am familiar, an Athlon 1.4GHz can get up to 495, while the fastest Sun computer (a Sun Blade 1000 model 1900) gets a respectable 439. A Pentium 4 1.8GHz can score as high as 596.
While CINT2000 is a great measure of typical application performance, it used to be true that Intel PCs had terrible floating point performance compared to RISC architectures such as Sun, SGI, and Alpha. However, according to SPEC CFP2000, the Sun Blade scores 369, the Athlon scores 433, and the P4 goes up to a shocking 618.
Sadly, most of today's software isn't CPU limited, so your best choice is to get a balance of performance between your disk, memory, operating system, graphics card, and CPU.
-
Re:What did they use to generate 400k users?
You did that with ONE system? Interesting - the Mirapoint results for 400K users needed:
- 2 POP server machines
- 5 SMTP router machines
- 10 message store machines
- 1 benchmark manager machine
- 1 mail sink machine
- 5 load generator machines
Dealing with 300K outbound postings is no biggie - I've been able to deal with that level on an old IBM RS6000-F30 (166MHz 604). You don't need really big iron for outbound mail until you have more than 500K or so RCPT TO's on one piece of mail. It's mostly a matter of good queue management, and Sendmail 8.12 has new queue management code that makes it even easier (I should know, I tested it). The only real magic is not getting logjammed due to DNS waits and unreachable destinations.
On the other hand, having 40K people doing POP accesses while you're dumping mail into their mailboxes is trickier. Some of the more obvious issues:
- The obvious popd solution leaves you 40K processes running at once. This could be bad.
- Locking issues get interesting.
- Even with a journaling filesystem, you can get killed on the I/O. Remember that writing to a file means you also need to do stuff with the inode....
-
Re:PPC != POWER
IBM uses the PPC architecture in their RS/6000 and AS/400 boxen.
This is not entirely correct. IBM for the most part uses POWER processors in the pSeries and iSeries machines. The POWER line is a direct descendant of the arch that spawned the PPC. The processors used in these boxes are 64-bit implementations of the ISA and for the most part are a LOT faster than the PPCs that Apple sells. These machines have numbers listed on the spec.org page (the only benchmark organization whose sole goal is to provide a cross-platform level playing field). You would do well to look at SpecINT/SpecFP if you're interested in processor-bound workstation-type system loads.
-
MHz to MHZ
Just because the MHz on the Sun equipment (900MHz) is lower than the current Pentium (1.5GHz), don't be fooled into thinking the Intel hardware is better. What matters, after all, is throughput and pumping that data. Check your specs!
Check this 4 CPU Intel vs the 1 CPU Sun considering plain speed...
CINT2000: Intel Corporation Intel D850GB motherboard(1.5 GHz, Pentium 4 processor) - 536 524
CFP2000: Intel Corporation Intel D850GB motherboard(1.5 GHz, Pentium 4 processor) - 558 549
CINT2000: Sun Microsystems Sun Blade 1000 Model 1900 - 467 438
CFP2000: Sun Microsystems Sun Blade 1000 Model 1900 - 482 427
CINT2000: Advanced Micro Devices Tyan Thunder K7 Motherboard, 1.2GHz Athlon MP Processor - 522 495
CFP2000: Advanced Micro Devices Tyan Thunder K7 Motherboard, 1.2GHz Athlon MP Processor - 481 433
Throughput results exist for the Sun with 2 CPUs but, strangely enough, none for any Intel hardware. Throw a 2 CPU AMD in there, though...
CINT2000 rate: Sun Microsystems Sun Blade 1000 Model 2900 - 10.7 9.97
CFP2000 rate: Sun Microsystems Sun Blade 1000 Model 2900 - 10.2 9.09
CINT2000 rate: Advanced Micro Devices Tyan Thunder K7 Motherboard, 1.2GHz 2CPU - 10.8 11.1
CFP2000 rate: Advanced Micro Devices Tyan Thunder K7 Motherboard, 1.2GHz 2CPU - 8.30 9.14
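To make the clock-speed point concrete, here is a rough Python sketch that divides the first CINT2000 figure quoted above for each single-CPU system by its clock speed. "Score per MHz" is just an illustrative normalization picked for this example, not a SPEC metric, and the variable and function names are made up here.

```python
# First CINT2000 figure quoted above for each single-CPU system, with its clock in MHz.
# "Score per MHz" is an illustrative normalization, not an official SPEC metric.
cint2000 = {
    "Pentium 4 1.5GHz (Intel D850GB)": (536, 1500),
    "Sun Blade 1000 Model 1900 (900MHz)": (467, 900),
    "Athlon MP 1.2GHz (Tyan Thunder K7)": (522, 1200),
}


def score_per_mhz(score, mhz):
    """Return the benchmark score divided by the clock speed in MHz."""
    return score / mhz


per_mhz = {name: score_per_mhz(*vals) for name, vals in cint2000.items()}

# Print the systems from most to least work done per clock.
for name, value in sorted(per_mhz.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.3f} CINT2000 points per MHz")
```

On these numbers the 900MHz Sun does the most per clock and the 1.5GHz Pentium 4 the least, even though the Pentium 4 has the highest absolute score.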
-
Re:Funny how /. editors miss things
IIS (which pulled off 5,137 transactions per second) beat Apache (4,602 tps)
Not only that, but IIS (with SWC) also beat Tux 2.0 on nearly identical hardware. The question is whether or not the difference in hard drives was enough to account for the difference in performance. My hunch is that it is, but that's just a hunch. Either way, it's obvious that MS have been shocked into action by the performance of Tux, and have come up with something comparable. I'd like to see a real comparison of Tux, IIS/SWC and X15 on identical hardware.
-
Re:However...
X15 is still 2-3 times faster than Tux 2.0, and Cheetah (from MIT) is 2-3 ORDERS OF MAGNITUDE faster than either.
Er, let me get this straight: TUX can saturate multiple GigE cards per CPU, so Cheetah can saturate 200-2000 GigE cards per CPU? Today's systems don't even have that much memory bandwidth. -
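The arithmetic behind the TUX/Cheetah objection above can be sketched in a few lines of Python. It assumes "multiple" means two GigE cards per CPU and takes 1 Gbit/s per card; both are assumptions made here for illustration, not figures from the post.

```python
# Assumptions (not from the original post): "multiple" = 2 cards, 1 Gbit/s per card.
TUX_CARDS_PER_CPU = 2
GBIT_PER_CARD = 1


def implied_cards(base_cards, orders_of_magnitude):
    """Cards a server would saturate if it were 10**n times faster."""
    return base_cards * 10 ** orders_of_magnitude


low = implied_cards(TUX_CARDS_PER_CPU, 2)
high = implied_cards(TUX_CARDS_PER_CPU, 3)

# Convert Gbit/s to GByte/s (8 bits per byte).
low_gbytes = low * GBIT_PER_CARD / 8
high_gbytes = high * GBIT_PER_CARD / 8

print(f"{low}-{high} GigE cards per CPU")
print(f"{low_gbytes:.0f}-{high_gbytes:.0f} GB/s of network traffic per CPU")
```

Under those assumptions the claim works out to 200-2000 cards, i.e. 25-250 GB/s per CPU, which is the range the reply is poking at.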
Check out SPECweb
The SPECweb99 benchmark uses both static and dynamic content, and there are results from a variety of servers.
-
My quiet case project : it's an answer ... sort of
Well, it seems these days most power users just care about getting something like 200fps in Quake III. Why? Beats me! I'm not on a quest for the ultimate frame rate; I just want my box to be as quiet as it possibly can be.
To help you understand my take on the subject, here is the background:
My PC has the following components:
- An OEM case
- A 235W OEM power supply
- ASUS P3B-F
- Intel Pentium II rated 400MHz @ 400MHz
- A cheap OEM SECC2 heat sink made of aluminum
- A 128MB CAS2 no-name DIMM
- Two 32MB CAS3 Samsung DIMMs slowing down my memory timing, but preventing the appearance of the almighty evil SwaP
- An ATI All-In-Wonder Rage128 16MB
- A Creative SoundBlaster Live! Value
- A Realtek 8139 Ethernet NIC
- My beloved USR 56Kbps ISA Real Modem. Sorry, but to me a component that uses CPU power to do its processing instead of taking the load off is not worthy of being in my computer. Not to mention the M$ Win part...
- A Creative 48x CD-ROM drive. It's the loudest damned thing in my computer when it's spinning
- A Quantum Fireball AS PLUS 40GB (7200RPM) in a removable tray
- A Quantum Fireball CX1 10GB (5400RPM) mounted inside the case
- Of course the stupid old 1.44 MB floppy drive, only used for booting tomsrtbt in case of emergency
Soon to be:
- An Adaptec 2940UW
- A Diamond Monster 3D II for Glide games
It turns out that the Quantum Fireball AS makes less noise than the Quantum Fireball CX1. I still have to figure that one out...
I use my PC for:
- Running Linux and learning as much as time allows me (Jeez, I had so much time when I was a student... Think of all the time I wasted in high school running the evil W monster)
- Doing some gaming, e.g. Diablo II, Unreal, UT, Undying (although that thing is going to cost me a new box)
- Spending numerous nights filling my brain @ Slashdot, Tomshardware, Anandtech, Arstechnica, StorageReview, developer.intel.com, and most importantly, scouring the web for all the case manufacturers and their take on a quiet box.
Since I'm writing this post, which is probably going to be the base documentation for my Silent Case Project, you can guess that my sleepless nights of browsing have not yielded the desired result.
I've checked out many options such as water cooling, moving the PC to the closet, or returning to the forest, where a PC is pretty far from your everyday quest for survival. None of them suits me.
The objective of my project is to build a case that meets the following criteria:
- As silent as possible
- Accessible
- Provides sufficient ventilation to keep all the components running within thermal specs
- Light enough to be easily transportable (Let's not forget the LAN parties ;-)
To attain those goals I have to:
- Read all I can about noise, sound, aerodynamics, PC specs
- Find suitable materials: a case is not just protection against unwanted fingers and dust; it must provide EMI shielding and proper grounding, resist impacts, and fit my conception of the kind of object you want in your bedroom (If you were thinking about plywood and a box of rusted leftover nails, forget it)
- Find the tools, or the companies or individuals with the means, to work the materials I choose to build the casing
For the sound insulation I was thinking about some kind of foam. Mineral wool would be effective, but it takes too much space and it's not the kind of thing I want beside my bed. For the casing itself, metal is almost inevitable if you want EMI shielding and grounding. And for those of you who wonder why I have not mentioned water cooling yet: the greatest source of noise is not my CPU cooler, and you're just moving the problem out of the case (Nice; you have water heating up, but unless your reservoir is like a bathtub or something you will have to transfer the heat from the water to the air).
That's about as far as I am. If you have any ideas that might help me, please feel free to send me some bits forming ASCII characters at Prozzaks@operamail.com
To finish up, here is a list of things that might help people wanting to achieve similar goals:
- http://www.formfactors.org/ You should be able to find all the documents regarding the ATX form factor and thermal design guides. A must if you want to build a quiet PC.
- http://developer.intel.com/ Intel has contributed a great deal to the ATX definition ; here you will find many relevant documents including thermal design guides for all Intel processors.
- Extract from my favorites:
Hardware\cases PC CASE
Fong Kai
PowerOn
Enlight Corporation
dir.yahoo Enclosures Manufacturers
procase
YY Computer
Psi
IN WIN
Amtrade
American Suntek
Addtronics
A-Top Technology, Inc
Nikao
Palo Alto Products
Antec
Lian-Li
amaquest
Koolance
Quietpc
PC Power & Cooling
Hardware\Heat Sinks ALPHA
Cooler Master
AVC
ekl
GlobalWIN
globefan
RDJD
Foxconn
Spring Spread
Sanyo Denki
TITAN
TaiSol
ChipCoolers
Orb a
ElanVital
Hardware\Info\Form Factor Platform Development Support
SSI
WTX
Hardware\Info\Standards Fibre Channel Industry Association
PCI SIG
RAB
serialata
SPEC
Hardware\Info\Storage RAID.edu
Hardware\Info\Cours CS 252 - Graduate Computer Architecture
Hardware\Info The PC Guide!
Hardware Bible
FullOn3D
developer.intel.com
HwB The Hardware Book
United Overclockers
Ars Technica
Tech-Junkie
HardwarePub
Webopedia
Illustrated Guide to the PC Hardware
SysOpt
2CPU
Ace's Hardware
Technical Support - RaidHelp v1.0 - Free RAID Technology Guide
Computer Architecture
OPENCORES.ORG
TechFest
MidWest Micro Support
Hardware\Resalers GeekTek!
Micro-Bytes
ALCO
ABC Micro
2CoolTek
Plycon Computers
TCWO
ABC Micro - Lprix
Case Outlet
The Chip Merchant, Inc
Cimsys
OrdiGros
ALIENWARE
SHENTECH
FireStorm
Hyper Microsystems
TWEAKBOX
Hardware\Reviews Tom's Hardware Guide
Sharky Extreme
StorageReview
HardOCP
AnandTech
SystemLogic
x-bit labs
Active-Hardware
FiringSquad
SocketA
Overclockers Australia
HEXUS
dansdata
SysReview
Hardware\Manufacturers AMD
ASUS
Belkin
MassMultiples
Promise
StarTech
VIA Technologies, Inc
ABIT Computer Corp
Comcase
Micron Semiconductor
ECS
Hardware Freeboxen
-
Re:specCPU?
Hmmm.. That's interesting (CINT=404, CFP=711). What I want to know is why they are not on the official SPEC CPU page. Comparing them with the PIII, P4 gives
CINT Dell PIII 1.0GHz - 418 base (not bad)
CINT Dell P4 1.7GHz - 575 base (oh, not looking so good)
CFP Dell PIII 1.0GHz - 292 base (really nice)
CFP Dell P4 1.7GHz - 593 base (nice)
Which is pretty decent. The integer performance is about what a decent RISC processor (not Sparc crap, the bottom of the performance heap) should get at the same clock rate, while the FP performance is quite stellar! Not bad, the stream looks pretty good too. I would expect more from a brand spanking new arch, but it's not bad...
-
I would NOT take the Sun
Yes, but the servers under consideration don't have the US-III. The entry-level Netra isn't even really a serious US-II, but rather a IIe, which was originally designed for embedded applications. In other words, it's the Celeron of the Sparc world, and it doesn't have anywhere near the performance of its big, 8MB-cache brethren.
You can see SPEC CPU results for a 500 MHz UltraSparc IIe on this page. Yep, that's a base CINT of 165, which is a helluva lot lower than the 307 result of a mere Pentium-III at 700 MHz (results can be seen here). In fact, no Intel chip has ever turned in a performance nearly that bad since the SPEC CPU2000 benchmark was created in late 1999. Ouch.
Since these Netras also have IDE drives, you won't be improving performance along that axis either. I'd definitely go with the Intel options as far better bang-per-buck.
--JRZ -
Re:What's the deal with Intel?
Intel is on the ropes and only by tuning small parts of the core OS to run at the "marketed speed" can they keep fooling the public into thinking it's a faster chip.
The P4 probably has many flaws, but for compute-intensive applications, the fastest P4 is significantly faster than the fastest Athlon. If you haven't benchmarked it yourself, check out the benchmarks at spec.org.
-
Re:What's this "Tux"?
This page on the spec.org site shows a comparison of the three servers:
- TUX 1.0 wins with scores of 1270, 2200, and 4200 (1, 2, and 4 CPU) on Dell x86 servers
- Zeus 3.3.5 in second place with 1050, 2200, and 3216 on Alpha and RS6000 (2, 6, and 8 CPU)
- poor Microsoft IIS 5.0 in last place, pulling scores of about 700, 1180, and 1600 on x86 (1, 2, and 4 CPU).
So TUX 1.0 comes along with the 4 CPU Dell PC outperforming an 8 CPU RS6000 box, the 2 CPU PC equalling the 6 CPU RS6000, and TUX 1.0 on just two x86 CPUs greatly outperforming Microsoft IIS 5.0 running on four x86 CPUs!!
Talk about an upset!
Of course, this is all just benchmarking taken to the extreme... does it really matter if your web server can fill a gigabit ethernet pipe?
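For anyone who wants the per-CPU picture, here is a small Python sketch that divides each SPECweb99 score quoted above by its CPU count. The per-CPU figure is an illustration added here, not something SPEC reports, and since the servers ran on different hardware it is only a rough guide.

```python
# SPECweb99 scores quoted above, keyed by server, as (cpu_count, score) pairs.
results = {
    "TUX 1.0": [(1, 1270), (2, 2200), (4, 4200)],
    "Zeus 3.3.5": [(2, 1050), (6, 2200), (8, 3216)],
    "IIS 5.0": [(1, 700), (2, 1180), (4, 1600)],
}


def per_cpu(entries):
    """Return (cpu_count, score / cpu_count) for each result."""
    return [(cpus, score / cpus) for cpus, score in entries]


for server, entries in results.items():
    cells = ", ".join(f"{cpus} CPU: {value:.0f}" for cpus, value in per_cpu(entries))
    print(f"{server}: {cells}")
```

On these figures TUX stays above 1000 per CPU at every size, while the 4 CPU IIS result and the 8 CPU Zeus result both land at around 400 per CPU.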
-
Re:What's this "Tux"?
Well, TUX is indeed one of the fastest web servers; it's written by Ingo Molnar.
What makes it special? Well, it runs in kernel space, that's why it's so fast. It's also not meant to completely replace a full-fledged web server like Apache.
Check out this older Slashdot article.
-
Re:But we already have sub $1k for AMD 1.333
From the current SPEC CPU2000 listing. The numbers for each proc are integer base, int peak, fp base, and fp peak.
P4 1.5GHz: 524 536 549 558
Athlon 1.33GHz: 482 539 414 445
Alpha 833MHz (Compaq AlphaServer ES40): 518 544 590 658
As you can see, the Alpha has a slight lead in peak int, a very close base int, and, of course, is far ahead in fp. The Athlon is all over the place, managing to beat the P4 in peak integer speed. The really important lesson, though, is that benchmarks are crap. The P4 is nowhere near 20% faster (or whatever, math is the first thing to go when my brain shuts off for the night) than an Athlon on any other benchmarks I've seen, nor in the experience of anyone I've heard from. In the "real world" benchmarks -- that is, games for the most part -- the P4 has been closer to 20% slower (except in Quake III, for whatever reason). Those results are probably crap too.
My own uninformed belief is that the relationship is more like P4 > Athlon > Alpha in integer, and Alpha > P4 == Athlon in fp, and that somewhere out there are chips that beat all of them, from manufacturers that have better things to do than make up numbers about their products.
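Since the math was left as an exercise, here is a quick Python sketch of the P4-versus-Athlon gap using only the four SPEC CPU2000 figures quoted above for each chip. The function and variable names are made up for this example.

```python
# SPEC CPU2000 figures quoted above: (int base, int peak, fp base, fp peak).
scores = {
    "P4 1.5GHz": (524, 536, 549, 558),
    "Athlon 1.33GHz": (482, 539, 414, 445),
}
labels = ("int base", "int peak", "fp base", "fp peak")


def percent_faster(a, b):
    """How much larger a is than b, in percent (negative if a is smaller)."""
    return (a - b) / b * 100


p4_vs_athlon = {
    label: percent_faster(p4, athlon)
    for label, p4, athlon in zip(labels, scores["P4 1.5GHz"], scores["Athlon 1.33GHz"])
}

for label, pct in p4_vs_athlon.items():
    print(f"{label}: P4 is {pct:+.1f}% relative to the Athlon")
```

With these inputs the P4's lead is under 10% on integer base, slightly negative on integer peak, and roughly 25 to 33% on floating point.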
-
The P4 is no slouch
I know benchmarks don't tell the whole story, but the SPEC benchmarks would likely show any really serious performance problems. In fact, the 1.5GHz P4 seems to be a little faster than the 1GHz Athlon on integer and significantly faster on floating point.
I don't like the P4 design. It's complex, has a messy instruction set, and consumes too much power for the performance it delivers. But the same is true of the whole Pentium series, and we have learned to live with it.
Overall, I think the Athlon may be a somewhat better deal than the P4, but the P4 isn't a slouch either. Now, I am looking forward to Sledgehammer: the 64bit AMD chip might end up being a much better compromise than Intel's 64bit offerings.
-
Re:Time for a rematch? :)
Have a look at the fastest SPECweb99 results yet submitted for Intel hardware.
On an 8-way Dell PowerEdge 8450/700,
- TUX 2.0 scores 7500 simultaneous connections.
- IIS 5.0 with SWC 3.0 (a cache front-end) scores 7300.
The rematch has already been quietly won, by Linux.
-
how fast??? can't find any benchmark results
While benchmarks aren't perfect, the SPEC results at least give a rough indication of how fast these things might be. Unfortunately, I can't find any information on the Sun Blade 100. Has anybody seen SPECmarks on these things?
The SunBlade 1000 (an UltraSPARC III running at 500MHz) seems in the same ballpark as a high-end Intel or AMD processor, so I wouldn't get my hopes up too high for the SunBlade 100 (an UltraSPARC IIe running at 500MHz); it's probably a reasonable deal compared to PCs, but not great.
The really interesting thing is that the SunBlade 100 is a 64bit machine for less than $1000.
-
The Myths......
Hmmm, so the magical debate over CISC vs. RISC rolls on. As much as I agree that this would be a nice machine to have, I don't think that the $1k price tag is really worth it. The reason is that high-end Athlon and Pentium systems can be obtained at a much more attractive price/performance ratio.
Hmmm, now how about raw performance? Granted, the Sparc II is a nice CPU, but can it actually compete with an Athlon at 2x the MHz? If so, can it compete by such a wide margin that it would justify the price? Check the SPEC benchmarks...
2000 Integer Results
2000 Floating Point Results
Granted, these shouldn't be taken as the ultimate in performance, but I don't see a staggering lead.
As for those in the RISC vs. CISC camps: I hate to inform you that the RISC vs. CISC distinction is all but dead. Current RISC designs no longer embody the RISC philosophies of days past. CISC cores blend in to the point that one couldn't distinguish them from their so-called RISC counterparts. Modern CPUs are at the cutting edge of new design concepts. If you feel the need to follow up on this, I suggest reading the following...
RISC vs. CISC: the Post-RISC Era
and to track the history of your favorite CPU...
Here is a good place to start..
Hmmm, if you have a few bucks to spare, pick one up. But I don't see it as a vastly superior platform. -
SpecWeb results?
About 80% of all new kernel features in the 2.4 kernel are directed towards the "Enterprise Market (tm)". 64 GB RAM, 2000 GB files, tens of thousands of concurrent processes, highly scalable networking, IO, cache subsystems, Reiser-FS, RAID, LVM, to name just a few. If you do not believe the feature list, then high-end scalability can be witnessed in last year's SPECweb99 results as well (search for "TUX").
Linus does not accept technically weak kernel code. This might not sit well with some of those would-be "technology leaders" who now claim that Linus is an "obstacle". If they know better, they should start their own kernel project; nobody is preventing them from doing so.
-
Then where are the SPECweb benchmarks for BSD?
If BSD/OS is sooo much faster than Linux 2.4, then why does NO ONE use any BSD for SPECweb benchmarks ? I see Linux 2.4, Windows 2000, AIX, Tru64, and HPUX.
-
You already have a good benchmark: SPEC
Of course, it is JUST A BENCHMARK, so it is not perfect, but I think that it is quite good:
- vendor independent
- separate integer and floating-point results (which are quite different beasts really)
There are many available results, look here.
Anyway, I think that it is much better than your "power rating" number.
-
Re:I hate Sun computers.
That 400 megahertz processor operates on about 4 times more CPU instructions per clock cycle than your X86 chip. You're comparing apples and oranges. And I have bad software support problems on my IBM Aptiva running Windows that crashes every 5-7 days. What problems do SPARC chips have that x86 chips don't?
You are wrong. An Intel P3 or an AMD Athlon absolutely smokes the USII in terms of instructions per clock cycle. Fortunately most people don't use these machines for their raw CPU power, they use 'em for the I/O throughput. Current Sun architecture is quite a bit faster than current Intel architecture in addition to the fact that Sun uses huge amounts of L2 cache. The USIII and the Intel P4 will be on par with each other in terms of I/O throughput (they'll both be at 3.2GBps).
What exactly is standard about needing a massive image editing package with your server? Dumb statement.
How will you ever get a job in the real world when you equate Microsoft Paint with Oracle in the same sentence. I'm a sys admin and haven't touched a graphics program for work in over 5 years.
You seem to imply here that graphics programs are not valid applications for a server. A lot of very strenuous supercomputing that is done is directly related to graphics. I know quite a few people that are using SGIs on the same scale as the E10k to do rendering. Incidentally, SGIs work quite well for this as they have slightly better fp performance as well as a more scalable MP architecture requiring simpler programming (multi-threaded vs. clustered). The fact remains though that neither MIPS nor SPARC have the FP performance of an Alpha, Intel, or AMD which is what some people use to build render clusters (think Titanic).
Calling the original poster dumb was an argument ad hominem. It also makes you an asshole (another argument ad hominem).
Most Sun hardware is pretty reliable though it is overpriced in comparison to Intel hardware of equivalent quality. I can buy a dual-processor E220 for ~$20,000. I could buy an equivalent Compaq, HP, or Dell rack-mounted server for half that. It would have slightly better CPU performance and slightly less I/O throughput.
Suns specifically aren't good at hosting dynamic web pages because those can require quite a bit of CPU and relatively small amounts of data. Sun machines do better with huge amounts of data and relatively smaller CPU loads. They make great Oracle servers. Which is what I gather you are using them for.
Additionally, MS SQL Server 2000 on Win2k on a quad-processor Compaq (Xeon 700MHz) can be faster than a quad-processor E450.
Yes, you are a Sun Bigot. You are also an asshole (once again, there goes that ad hominem thing again). You're almost as bad as a mainframe guy. People that think that they have a concept of system architecture but get their judgement clouded by their own zealous behaviour annoy the shit out of me.
Additionally, I'll provide something that you didn't provide while debunking the earlier poster's comments.
Check out SPECint and SPECfp marks on Intel P3s versus UltraSPARC IIs. Go to http://www.spec.org for the info. For database results that prove my point check out TPC-C benchmarks at http://www.tpc.org. Granted, they don't have results for Oracle 8i on a quad-processor E450, but they do have results from other RDBMS vendors. The E450 scores about a third slower than the quad-processor Compaq.
Have a nice day.
-
The P4 is the world's fastest microprocessor.
It merely needs recompiled code to perform well.
On what am I basing this apparently heretical statement? On SPEC_CPU2000, the most demanding, well balanced, most respected cross-platform CPU benchmark in the world. As you can see if you peruse these lists, the P4/1500 has the highest scores of any shipped CPU in the world, both in SPECint (base and peak) and in SPECfp (base only).
Before any of you reply and think you've caught a mistake, the Alpha EV67/833 is *not* publicly available, and won't be until January, at which point it will take back leadership in SPECfp_base and SPECint_peak. Of course, the P4/1700 will probably take back the lead when it's released in March or so. Indeed, the P4 and Alpha will likely trade the top SPEC spot back and forth at least until the EV68 (EV67 moved to .18 um process and with on-die L2 cache) makes an appearance (Q2?), if not all the way until the EV7 (EV68 with integrated on-chip *8-channel* RDRAM controller) is released (Q4?).
This is why all this banal talk about the P4 being a crappy chip or (in the wake of this article) a "crippled" chip is ignorant drivel. SPEC_CPU is an exceptionally well designed, balanced, and comprehensive benchmark stressing a CPU to its limits in all sorts of ways. Why then the P4's disappointing performance on all those other benchmarks? They are all on "legacy" code--code compiled with the P6 core in mind. Because the P4 represents the first chip with a new core architecture (the horribly misnamed "NetBurst" core) from Intel in 5 years, it has a lot of pretty radical design features which don't take well to code compiled for the P6 core. While this means the P4 is a pretty useless (or at least very overpriced) solution to running today's code--and indeed, most code released for at least the next year or so--it has nothing to do with how good a *design* it has, which is ostensibly the point of this discussion. Indeed, the PPro--the first P6 core chip--posted very "disappointing" benchmarks on legacy code when *it* was released 5 years ago; many observers wrote it and the P6 core off as underperforming overdesigned wackiness from Intel. It was arguably the most successful and innovative CPU core ever. Not so incidentally, this was strongly foreshadowed by its brief theft of the SPECint95 performance crown from the top Alpha of the time...
Now to dispense with the most repeated "points" we've seen thus far.
1) "This just goes to show that x86 is a dead ISA with no headroom to grow." Not the most unexpected statement to be found on /., but let's just say that the other 99.99% of the world that enjoys backwards compatibility will make sure x86 stays alive for quite a long time to come, thank you. On a technical (rather than marketing) level, though, this is ridiculous bunk as well, as the fact that the P4 beats every released 64-bit 10-times-as-expensive RISC chip with 30-times-as-expensive platforms, on SPEC_CPU--a benchmark specifically designed to stress exactly those high-performance situations demanded of professional level workstation and server machines--demonstrates quite nicely.
Yes, x86 is a bad ISA, and yes it presents a problem to be overcome by chip engineers. But it has been overcome and will continue to be overcome--today by tacking on a decoding stage to x86 processors that turns x86 instructions into RISC-like instructions for internal operations (taken out of the critical path by the P4's trace cache), and tomorrow perhaps by dynamic recompilation software a la Transmeta, IBM's DAISY, and HP's Dynamo, techniques which are still in their infancy and *may* end up providing better-than-compiled performance even without the benefit of converting to a more optimal ISA. The other negative of the x86 ISA, namely the paucity of compiler-visible registers, is indeed a problem, although one partially alleviated by rename registers and partially by evolutionary extensions to the x86 ISA, such as SSE2, which will eventually replace much of the god-awful stack-based x87 FPU ISA.
The real question is, does the performance hit generated by sticking with x86 exceed the performance gain generated by having a much larger target market, and thus more money to spend keeping up with the latest process technology and thus getting faster clocked CPUs? The answer thus far has been a rather resounding "no"--that is, the economies of scale granted by staying x86 have meant processors which are outright faster and cost much much less.
After all, there is no doubt that were the Alpha not around 18 months behind Intel in terms of process technology, the EV67 would be much faster than the P4. On the other hand, the EV67 gets to take advantage of resources that Intel could never dream of in a mainstream chip--like a 300+mm^2 die size, extra wide memory buses, and 4-8MB L2 caches--because of the tremendous added cost. And even with all that plus what is widely acknowledged as the best CPU design team on the planet, the Alpha only manages to keep up with the P4.
Moreover, the rest of the 64-bit world--despite the same advantages as the Alpha (well, except their design team)--can barely keep up with the P3, and that's a 5 year old design. They may be available in multi-chip boxes scaling to kingdom come, but on the level of individual chips, the best that Sun, IBM, HP or MIPS has to offer is pretty lame, despite all the advantages of a RISC ISA. Of course, the same old folks will be claiming that x86 is an inherent dead end when the P4 (or whatever Intel is calling its current NetBurst core by then) scales past 4 GHz two years from now, well ahead of anyone in the RISC world. And we'll hear it again in 4 or 5 years, when Intel releases another all-new x86 core.
2) "The P4 should have left in all those features this article talks about." Uhhuh. Sure. Um...now, who would know more about this? Would that be you, having read some article on the Internet? Or would that be Intel's engineers who maybe understand the P4 core and the issues involved with these features a bit better than you, and who had the benefit of cycle-perfect simulations on dozens if not hundreds of possible P4 variants running every conceivable type of code??
If there's a feature which doesn't make it into a finished CPU, it's because of one of two reasons:
1) The designers didn't think of it;
2) The designers couldn't figure a way to implement it and make it work with the rest of their design in such a way that it raised performance/cost.
Needless to say, "The designers thought of it, implemented it (which they did in this case), and it was a good feature (i.e. improved performance/cost on a majority of code), but then made a boneheaded decision not to use it," is *not* on the list.
IMO, the features listed here are all better off gone from the current P4. The only really intriguing one--another FPU--was *not* left off for die size considerations (i.e. cost): FPU's are not very big. It was left off for performance issues. You see, while "more is better" sounds like a nice philosophy, adding an extra FPU would have meant extra decoding and routing logic in the FP section of the chip. Considering Intel actually went to the considerable trouble of implementing this feature and then decided against it, it is very likely that this extra logic was in the P4's critical path. Thus while including the extra FPU would have meant extra performance/clock, it would have meant lower overall clock speeds. Obviously Intel felt the tradeoff worked better without the extra FPU than with it.
If you "disagree" with their decision, please refer to the cycle-perfect simulators which Intel has and you don't, and the P4/1500's SPECfp2000 score which is a mere, oh, 68% better than the fastest P3. Also you might note that the P4 is scaling quite well with clock speed on SPECfp, that it will spend most of its life at speeds well above 2 GHz, and that it will likely sell most (at least for the next 2 years) in combination with a memory subsystem providing *less* bandwidth than the current dual-RDRAM i850 chipset--all of which point to this being a very smart decision on Intel's part. (The reasoning is this: if the P4's FPU can already keep up quite nicely with a larger memory bandwidth, then why increase FPU power/clock when most P4's will have higher clocks and lower bandwidth to keep them fed?)
As for the features I'd like to see added to the P4 when it moves to its .13 um Northwood variant next summer: one of them was on the list, i.e. a 16kb L1 data cache. The reason it was left off was clearly not die size but clock scalability--Intel decided having a 2-cycle latency L1 was more important than having a bigger one, and I totally agree. After the move to .13, though, perhaps a 16kb 2-cycle L1 will no longer limit clock scalability, just as the PPro's 8kb L1's were expanded to 16kb each with the PII. The other, a 512kb L2, would take up much too much die space at .18um to be feasible; it too, may make it to Northwood, depending on Intel's target die size. Needless to say, whatever they decide, it will be a much better informed decision than I or anyone here could presume to make. -
The P4 is the world's fastest microprocessor.
It merely needs recompiled code to perform well.
On what am I basing this apparently heretical statement? On SPEC_CPU2000, the most demanding, well balanced, most respected cross-platform CPU benchmark in the world. As you can see if you peruse these lists, the P4/1500 has the highest scores of any shipped CPU in the world, both in SPECint (base and peak) and in SPECfp (base only).
Before any of you reply and think you've caught a mistake, the Alpha EV67/833 is *not* publicly available, and won't be until January, at which point it will take back leadership in SPECfp_base and SPECint_peak. Of course, the P4/1700 will probably take back the lead when it's released in March or so. Indeed, the P4 and Alpha will likely trade the top SPEC spot back and forth at least until the EV68 (EV67 moved to .18 um process and with on-die L2 cache) makes an appearance (Q2?), if not all the way until the EV7 (EV68 with integrated on-chip *8-channel* RDRAM controller) is released (Q4?).
This is why all this banal talk about the P4 being a crappy chip or (in the wake of this article) a "crippled" chip is ignorant drivel. SPEC_CPU is an exceptionally well designed, balanced, and comprehensive benchmark stressing a CPU to its limits in all sorts of ways. Why then the P4's disappointing performance on all those other benchmarks? They are all on "legacy" code--code compiled with the P6 core in mind. Because the P4 represents the first chip with a new core architecture (the horribly misnamed "NetBurst" core) from Intel in 5 years, it has a lot of pretty radical design features which don't take well to code compiled for the P6 core. While this means the P4 is a pretty useless (or at least very overpriced) solution to running today's code--and indeed, most code released for at least the next year or so--it has nothing to do with how good a *design* it has, which is ostensibly the point of this discussion. Indeed, the PPro--the first P6 core chip--posted very "disappointing" benchmarks on legacy code when *it* was released 5 years ago; many observers wrote it and the P6 core off as underperforming overdesigned wackiness from Intel. It was arguably the most successful and innovative CPU core ever. Not so incidentally, this was strongly foreshadowed by its brief theft of the SPECint95 performance crown from the top Alpha of the time...
Now to dispense with the most repeated "points" we've seen thus far.
1) "This just goes to show that x86 is a dead ISA with no headroom to grow." Not the most unexpected statement to be found on /., but let's just say that the other 99.99% of the world that enjoys backwards compatibility will make sure x86 stays alive for quite a long time to come, thank you. On a technical (rather than marketing) level, though, this is ridiculous bunk as well, as the fact that the P4 beats every released 64-bit 10-times-as-expensive RISC chip with 30-times-as-expensive platforms, on SPEC_CPU--a benchmark specifically designed to stress exactly those high-performance situations demanded of professional level workstation and server machines--demonstrates quite nicely.
Yes, x86 is a bad ISA, and yes it presents a problem to be overcome by chip engineers. But it has been overcome and will continue to be overcome--today by tacking on a decoding stage to x86 processors that turns x86 instructions into RISC-like instructions for internal operations (taken out of the critical path by the P4's trace cache), and tomorrow perhaps by dynamic recompilation software a la Transmeta, IBM's DAISY, and HP's Dynamo, techniques which are still in their infancy and *may* end up providing better-than-compiled performance even without the benefit of converting to a more optimal ISA. The other negative of the x86 ISA, namely the paucity of compiler-visible registers, is indeed a problem, although one partially alleviated by rename registers and partially by evolutionary extensions to the x86 ISA, such as SSE2, which will eventually replace much of the god-awful stack-based x87 FPU ISA.
The real question is, does the performance hit generated by sticking with x86 exceed the performance gain generated by having a much larger target market, and thus more money to spend keeping up with the latest process technology and thus getting faster clocked CPUs? The answer thus far has been a rather resounding "no"--that is, the economies of scale granted by staying x86 have meant processors which are outright faster and cost much much less.
After all, there is no doubt that were the Alpha not around 18 months behind Intel in terms of process technology, the EV67 would be much faster than the P4. On the other hand, the EV67 gets to take advantage of resources that Intel could never dream of in a mainstream chip--like a 300+mm^2 die size, extra wide memory buses, and 4-8MB L2 caches--because of the tremendous added cost. And even with all that plus what is widely acknowledged as the best CPU design team on the planet, the Alpha only manages to keep up with the P4.
Moreover, the rest of the 64-bit world--despite the same advantages as the Alpha (well, except their design team)--can barely keep up with the P3, and that's a 5 year old design. They may be available in multi-chip boxes scaling to kingdom come, but on the level of individual chips, the best that Sun, IBM, HP or MIPS has to offer is pretty lame, despite all the advantages of a RISC ISA. Of course, the same old folks will be claiming that x86 is an inherent dead end when the P4 (or whatever Intel is calling its current NetBurst core by then) scales past 4 GHz two years from now, well ahead of anyone in the RISC world. And we'll hear it again in 4 or 5 years, when Intel releases another all-new x86 core.
2) "The P4 should have left in all those features this article talks about." Uhhuh. Sure. Um...now, who would know more about this? Would that be you, having read some article on the Internet? Or would that be Intel's engineers who maybe understand the P4 core and the issues involved with these features a bit better than you, and who had the benefit of cycle-perfect simulations on dozens if not hundreds of possible P4 variants running every conceivable type of code??
If there's a feature which doesn't make it into a finished CPU, it's because of one of two reasons:
1) The designers didn't think of it;
2) The designers couldn't figure a way to implement it and make it work with the rest of their design in such a way that it raised performance/cost.
Needless to say, "The designers thought of it, implemented it (which they did in this case), and it was a good feature (i.e. improved performance/cost on a majority of code), but then made a boneheaded decision not to use it," is *not* on the list.
IMO, the features listed here are all better off gone from the current P4. The only really intriguing one--another FPU--was *not* left off for die size considerations (i.e. cost): FPU's are not very big. It was left off for performance issues. You see, while "more is better" sounds like a nice philosophy, adding an extra FPU would have meant extra decoding and routing logic in the FP section of the chip. Considering Intel actually went to the considerable trouble of implementing this feature and then decided against it, it is very likely that this extra logic was in the P4's critical path. Thus while including the extra FPU would have meant extra performance/clock, it would have meant lower overall clock speeds. Obviously Intel felt the tradeoff worked better without the extra FPU than with it.
If you "disagree" with their decision, please refer to the cycle-perfect simulators which Intel has and you don't, and the P4/1500's SPECfp2000 score which is a mere, oh, 68% better than the fastest P3. Also you might note that the P4 is scaling quite well with clock speed on SPECfp, that it will spend most of its life at speeds well above 2 GHz, and that it will likely sell most (at least for the next 2 years) in combination with a memory subsystem providing *less* bandwidth than the current dual-RDRAM i850 chipset--all of which point to this being a very smart decision on Intel's part. (The reasoning is this: if the P4's FPU can already keep up quite nicely with a larger memory bandwidth, then why increase FPU power/clock when most P4's will have higher clocks and lower bandwidth to keep them fed?)
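The tradeoff described above (more work per clock, but a lower attainable clock) comes down to delivered throughput being roughly IPC times clock speed. Here is a minimal sketch of that arithmetic; the IPC and clock figures are invented for illustration and are not measured P4 data, and the function names are hypothetical.

```python
# Illustrative only: every IPC and clock figure here is a made-up assumption,
# not a measurement of any real P4 variant.

def throughput(ipc: float, clock_mhz: float) -> float:
    """Instructions retired per microsecond, modeled as IPC * clock (MHz)."""
    return ipc * clock_mhz


def breakeven_clock(ipc_base: float, clock_base_mhz: float, ipc_alt: float) -> float:
    """Clock (MHz) the alternative design must reach to match the baseline."""
    return ipc_base * clock_base_mhz / ipc_alt


# Baseline: one FPU, shorter critical path, higher clock.
base = throughput(ipc=1.00, clock_mhz=1500)

# Hypothetical variant: second FPU buys 5% more work per clock,
# but the extra logic costs 100 MHz of clock speed.
alt = throughput(ipc=1.05, clock_mhz=1400)

print(f"baseline: {base:.0f}  two-FPU variant: {alt:.0f}")
print(f"variant needs {breakeven_clock(1.00, 1500, 1.05):.0f} MHz to break even")
```

Under these assumed numbers the two-FPU variant loses despite its better per-clock performance, because it falls short of the break-even clock. With different assumptions the result flips, which is exactly why the decision needs cycle-accurate simulation rather than guesswork.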
As for the features I'd like to see added to the P4 when it moves to its .13 um Northwood variant next summer: one of them was on the list, i.e. a 16kb L1 data cache. The reason it was left off was clearly not die size but clock scalability--Intel decided having a 2-cycle latency L1 was more important than having a bigger one, and I totally agree. After the move to .13, though, perhaps a 16kb 2-cycle L1 will no longer limit clock scalability, just as the PPro's 8kb L1's were expanded to 16kb each with the PII. The other, a 512kb L2, would take up much too much die space at .18um to be feasible; it too, may make it to Northwood, depending on Intel's target die size. Needless to say, whatever they decide, it will be a much better informed decision than I or anyone here could presume to make. -
SPEC?
Measuring CPU performance with just one MPEG program is not a good measure of performance, unless you plan to dedicate most of the CPU clock cycles to this program for the rest of the CPU's life!
I would be more interested to see how well the new Intel CPU fares in modern-day benchmarks, such as SPEC. SPEC benchmarks attempt to test CPU performance using a mix of many different *real-world* programs, and also take compiler optimisations into account.
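To make the "mix of many programs" point concrete: SPEC's CPU suites report a composite that is the geometric mean of per-benchmark ratios (reference run time divided by measured run time), so no single program can dominate the score the way one MPEG encoder does. A rough sketch follows; the benchmark names and timings are hypothetical, not real SPEC results.

```python
import math

def spec_ratio(reference_seconds: float, measured_seconds: float) -> float:
    """Speedup of the tested machine over the reference machine on one benchmark."""
    return reference_seconds / measured_seconds


def composite(ratios: list[float]) -> float:
    """Geometric mean of the per-benchmark ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical (reference time, measured time) pairs in seconds.
runs = {
    "bench_a": (1000.0, 250.0),   # 4x faster than the reference machine
    "bench_b": (2000.0, 1000.0),  # 2x faster
    "bench_c": (1500.0, 1500.0),  # no faster at all
}

ratios = [spec_ratio(ref, measured) for ref, measured in runs.values()]
score = composite(ratios)
arithmetic = sum(ratios) / len(ratios)

print(f"geometric mean: {score:.3f}, arithmetic mean: {arithmetic:.3f}")
```

The geometric mean comes out lower than the arithmetic mean here because the one big 4x win is not allowed to paper over the benchmark that saw no speedup.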
There is a SPEC benchmark for almost anything these days, and I think this one will do Tom nicely:
http://www.spec.org/osg/cpu2000/ -
Re:..about time too!
-
Re:..about time too!
-
Re:Price-Performance of "iCubes" and other Macs
even though PPC clock speeds are slower, programs run faster because the processor can do more per clock cycle.
That is true. (It's also true of sparc, alpha, MIPS and pretty much any non-x86 architecture.)
I've been told to expect twice the performance from a G3 than a similarly clocked PIII.
That is unadorned horseshit. But don't take my word for it: go to www.spec.org and check out the numbers yourself. 20-30% is more the average gain, and that's cold comfort when you can buy 1.2GHz Athlon chips for less than $500 a pop.
The "twice as fast as Wintel" claim is based on a small number of Adobe Photoshop operation benchmarks; usually filters that have been painstakingly optimized for the G4's "Altivec" vector processing unit. This isn't necessarily "cheating", since Photoshop is still one of the primary reasons to buy a Mac, but if you are not a graphics professional, you are simply never ever going to see that kind of speed benefit using a Mac.
In "regular use" applications, the scenario at the moment is even worse than you might guess based on the SPEC numbers: MacOS 9 is such a turgid, inefficient piece of crap, and the device drivers for 3rd-party Mac hardware so shoddily implemented, that MacOS applications will often run significantly slower than their Windows counterparts on similar hardware: just ask anybody if they're getting the same kind of Quake III framerates out of a G4/500 with a Radeon card as they would from a PIII/800 with the same graphics card.
You just don't buy Macs for world-beating performance (Photoshop being the exception). You buy them for nice industrial design, an OS that for all of its architectural ugliness still offers a more compelling user experience than Windows, and more often than not just to maintain an existing investment in MacOS software.
-
Re:You are the idiot
When the P6 was released, it was the fastest processor available in industry standard benchmarks (SPEC, including Alpha)
Woo boy... hold on there. First off, the "P6" is not a processor, but a processor core. It was also the code name for the CPU later to be marketed as the "Pentium Pro". The Pentium Pro was an impressive chip, and for some types of operations, it was incredibly fast. However, it did not beat everything else out there with SPEC benchmarks.
For starters, SPEC is not a single benchmark, but rather a consortium that comes up with benchmarks, the most well recognized being their CPU benchmarks (colloquially referred to as SPEC benchmarks). These benchmarks, however, do not exclusively test a CPU, but rather a system as a whole, although they are designed to make the CPU the limiting factor (nonetheless, using bucket loads of RAM, fast disk controllers, and a huge external memory cache can have wonderful impacts on SPEC benchmarks). Typically these benchmarks have been divided into those that stress the integer unit (SpecInt) and those that stress the floating point unit (SpecFP). The Pentium Pro was the first x86 CPU to post respectable SpecFP benchmarks, but it still got its butt kicked all over the place compared to its RISC competition.
Even on the SpecInt benchmark, the earliest PPro benchmarks I could find on Spec's website show that while the PPro put in some respectable numbers, it was far from being the king of the SpecInt benchmark.
The PPro was a breakthrough in terms of its price/performance. This was largely due to economies of scale rather than design genius.
Despite all that, I think the PPro design was very impressive. It was probably the strongest evidence at the time that CISC could hold its own against RISC competition, something the pundits had been suggesting wasn't going to happen.
-
A simple answer...
Remember that "better" is a subjective term.
It's easy to read a few benchmarks, and fall under the impression that one thing is superior to another.
Remember that people will generally believe something because they want to, without questioning their own biases - this is true for everyone, PPC and x86 users alike.
Now, to answer your question about x86 processors: exactly why should we move away from x86?
Performance? Check this out: The 1.0GHz Thunderbirds are faster than Sun's Ultra 10 Sparc IIi 440 based systems in SPEC FP95.
Certainly the Athlon isn't as reliable, and doesn't have such an elegant memory subsystem, but for home and work?
Here's the deal: the x86 architecture does what we need it to do - that's what counts. We can play games, run office, and surf the net. Furthermore we already have (literally) millions of working applications that would need to be ported... All for what? A more elegant architecture?
Take a look at the Itanium - VLIW, and a gigantic waste of time and money. How many people here will be running one when it is finally released? It'll be 5 years before the thing makes it down to the workstation market...
When it does, will Intel be asking themselves if it was worth it? Will Slashdot be asking if it's a good idea to move to Intel's monopolized platform?
Even worse are the arguments being used here... Have you ever heard people tell you why RISC is better? Usually the argument is a higher IPC (Instructions Per Clock IIRC.)
RISC, or Reduced Instruction Set Computing - the ideal being to simplify the processor to improve clock speeds, and reduce power consumption... Transmeta or ARM anyone?
The PowerPC is very un-RISC like in that regard. More instructions, more power. Throw more stuff on the core to improve the IPC, at the cost of reduced clock speeds.
Regardless, ask yourself whether it is really necessary to move away from x86, or whether someone is simply biased against your architecture for some un-vocalized reason.
If you really want to find out, talk to the person who codes in assembler, or programs emulators rather than taking the word of a guy who purchased a fruit.
http://www.mackido.com is a good start. The information comes from a somewhat biased source IMO, however, he also does some serious development.