Intel Reacts to AMD
NoWhere Man writes "Raging Bull has an article which states that Intel is having to shrink its die size earlier than expected to keep up with AMD's Athlon. "Intel couldn't afford to wait on developing a mainstream desktop Willamette chip," McComas said. "They've returned to the old tried-and-true Pentium III core as a quick fix." The new Pentium III speed grade will be among the first to use Intel's new 0.13-micron wafer processing with copper interconnect. At the same time, Intel is said to be readying a 200-MHz frontside bus to support the faster Pentium III."
The only benchmarks where the P3 outperforms the Athlon are in software (mostly games) that heavily uses SSE (Q3, for example).
But the difference is not that big.
"...point out the hidden costs of running the chips as hot as AMD runs them,..."
I thought your idea was interesting, so I decided to compute the cost of the extra power the Athlon chip dissipates as heat when compared to a PIII system of the same clock. I made calculations based on power consumption data from
http://users.erols.com/chare/elec.htm.
A K75 core Athlon at 750MHz consumes 36.3W typically, 40.4W max.
An FC-PGA PIII at 750MHz consumes 19.5W max.
Since no typical dissipation is given for the PIII, I will use maximum dissipation rates for both processors.
This Athlon consumes about 20W more than this comparable PIII. If I leave my computer on for 8 hours a day 365 days a year I will use it for
8*365=2920 hours.
Therefore the additional energy needed for this Athlon system is 20watts*2920hours=58.4 kilowatt-hours. (This is, of course, neglecting the different power consumption of each motherboard.)
Assuming a cost of $.10/kilowatt hour, the Athlon 750 would cost you an additional
.10*58.4=$5.84 per year.
Assuming a cost of $.15/kilowatt hour, the Athlon 750 would cost you an additional
.15*58.4=$8.76 per year.
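For anyone who wants to plug in their own numbers, the arithmetic in this comment can be sketched in a few lines of Python. This is only a rough sketch: the 20W difference, the 8 hours a day, and the two electricity rates are the figures quoted in the comment, while the function name and its parameters are made up for illustration.

```python
def extra_cost_per_year(extra_watts, hours_per_day, dollars_per_kwh, days=365):
    """Yearly cost of drawing `extra_watts` more for `hours_per_day` hours a day."""
    hours = hours_per_day * days               # e.g. 8 * 365 = 2920 hours
    extra_kwh = extra_watts * hours / 1000.0   # watt-hours -> kilowatt-hours
    return extra_kwh * dollars_per_kwh


# Figures from the comment: ~20W extra, 8 hours a day, $.10 and $.15 per kWh.
for rate in (0.10, 0.15):
    cost = extra_cost_per_year(20, 8, rate)
    print(f"${rate:.2f}/kWh -> ${cost:.2f} per year")
```

Like the original estimate, this ignores the motherboard and anything else in the box.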
Linux is currently estimated at 4% of the desktop.
That is substantially better than the 1% figure using open source that you pulled out of your rear end. By the end of the year it could be better still.
At current exponential growth rates Linux will be an economically significant niche for chip makers very soon. (It is already significant in their planning.)
Regards,
Ben
My usual seat in the cluetrain is at http://pub4.ezboard.com/biwethey.ht
I have seen two sets of leaked P4 benchmarks, and both were dismal, other than predictably
impressive memory bandwidth (with dual channel Rambus memory).
>>>>>
I have a feeling the benchmarks aren't valid. They are probably testing mainstream software, which P4 at its current 800MHz clock won't do too well on. If they tested 3D applications, I suspect they would have gotten very different results. P4 is meant to blow away Athlon in matrix crunching (read on, more below about that), and beat it in clockspeed (so that its performance doesn't suck in regular apps). (No, I'm not an Intel insider (ohh, I HAD to do that ;) that's what I've inferred by looking at the situation.)
Never mind the benchmarks,
there is at least one MAJOR reason to question P4's performance - the pipeline length. P4 has a
pipeline around 20 steps to allow it to reach very high clock speeds, but this brings with it two
major penalties:
- branch mispredict pipe flush (lose 20 instructions!!!)
- instruction latency; waiting for instruction arguments takes a lot of clock cycles
>>>>>>>>
Ah but look at the upside. When doing matrix crunching (which is what I suspect P4 is built to do) the pipelines can easily be kept full. Matrix multiplies are extremely regular pieces of code, and thus
A) Doesn't have branches,
B) Is easy to parallelize,
C) Doesn't introduce bubbles into the pipeline.
If the hyper pipelined design allows P4 to run at much higher clocks than Athlon, then the performance on normal apps would be about the same, while performance on 3D would kill. (Especially since SSE2 has twice the throughput of 3DNow!)
Comparing P4's double pumped ALU to Athlon, you should remember that Athlon has 3 Integer
units, and 3 Floating point units (PIII has 2 of each). On certain benchmarks, Athlon just beats PIII
silly because of this:
>>>>>>>
Athlon has 3 of each; PPro and up has 3 integer, 2 FPU. However, the Athlon's 3-FPU design is a little crippled, since every clock you can only do 2 FPU operations, while the 3rd unit can only do a store instruction.
When Athlon is paired with DDR memory and is able to keep its multiple instruction decoders (it
can issue 3 instructions per clock cycle) fed, then it is going to scream, never mind the advantages
that Mustang will bring such as its 400MHz FSB.
A final point for you to ponder: if P4 is clock-for-clock faster than PIII (I don't believe it), then why
doesn't Intel launch it at 1.2 or 1.3GHz (vs fastest PIII 1.1GHz). A process doesn't yield such a
narrow (1.4-5 or 1.3-5) speed band, so Intel must be throwing away the lower binsplits! Why?.....
>>>>>>>>
Read the above comment about how the PIII's FPU is lower power in non-regular code.
Every indication that I can see points to P4 having been designed for very high clock speeds at the
expense of instructions-per-clock performance. AMD Mustang will launch at 1.3GHz and will be in
volume before P4. Mustang achieves its speed more through better process technology (AMD
licensed Motorola's HiP6L copper interconnect process, and is well on their way to dual damascene
and 0.13 micron), than through Intel pipeline-achieved GHz marketing stunts.
>>>>
Maybe, or maybe Intel sees that the future is 3D, and has designed a processor to accommodate it.
A deep unwavering belief is a sure sign you're missing something...
Yes, on the other hand, the pipeline is even deeper, so bad branch predictions are going to hurt you even worse than before. Is the Level 1 cache also running at 3GHz? If I remember my comp.arch (it's been years since I read it), by breaking the pipeline into more/smaller stages, it becomes easier to ramp up the clock rate, but you get hit harder by pipeline stalls and bad branch prediction. Compilers are getting better (as is silicon) at re-arranging operations to decrease the occurrence of those pitfalls (stalls and incorrect branch prediction), but there's a finite limit to how much the linear algorithms used in most software are parallelizable. That ALU needs to run at twice the clock rate because its pipeline is twice as deep. It probably takes about the same amount of time for an instruction to make its way through the pipeline as it does on an Athlon with the same base clock rate. You get to issue instructions twice as often, but that won't buy you much if your pipeline is empty half the time.
Again, you start off assuming that doubling the clock rate of the ALU will translate into a doubling of the operations issued per second. That's like saying that doubling the processor clock will double a processor's SPEC or Quake scores. As we can see by comparing a 500MHz Athlon and a 1GHz Athlon, that is not the case, especially if your Level 1 cache doesn't keep pace with the core.
In the end, I think it will depend on what you are doing. If you are running an application with a small tight loop that fits in Willamette's Level 1 cache, you will see a big performance boost. If you try to simultaneously run multiple applications like Microsoft Office, you will quickly blow the Level 1 and 2 caches and the increase in performance will not be much higher than what can be expected due to the base CPU clock rate increase (1GHz->1.5GHz). Can AMD ramp Athlon's clock as fast as Intel can ramp Willamette's? Maybe not. It will be an interesting race.
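The trade-off argued above (a deeper pipeline buys clock rate but pays more per mispredicted branch) can be put into a toy model. A minimal sketch in Python, assuming a flush costs roughly one cycle per pipeline stage; apart from the 20-stage figure and the 1GHz vs 1.5GHz clocks mentioned in this thread, every number here (base CPI, branch fraction, mispredict rate, the 10-stage comparison chip) is a made-up illustration, not a measurement of any real CPU.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty):
    """Average cycles per instruction once mispredict flushes are added in."""
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty


def instructions_per_second(clock_hz, cpi):
    """Throughput is clock rate divided by cycles per instruction."""
    return clock_hz / cpi


# Hypothetical workload: 20% branches, 10% of them mispredicted.
shallow = instructions_per_second(1.0e9, effective_cpi(1.0, 0.2, 0.1, 10))
deep = instructions_per_second(1.5e9, effective_cpi(1.0, 0.2, 0.1, 20))

print(f"clock ratio:      {1.5e9 / 1.0e9:.2f}x")
print(f"throughput ratio: {deep / shallow:.2f}x")
```

With these invented inputs the deeper design still wins, but by roughly 1.29x and not the 1.5x the clock alone suggests. Raise the mispredict rate and the gap closes further, which is the point being made about branchy desktop code versus regular matrix code.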
Laissez lire, et laissez danser; ces deux amusements ne feront jamais de mal au monde. - Voltaire
A) Intel never has problems with yield.
B) Essentially (given the fact that they have 14 fabs or so) they ARE omnipotent.
C) RDRAM could be a problem, but Intel seems to have gotten that figured out, and besides, they could always use SDRAM.
Intel HAS introduced products at all levels before. The Pentium MMX was an example, and Pentium II was another example. (Well, high end to lower midrange anyway.) Like the PPro and its cache problems, the RDRAM thing was simply a snafu. I mean, AMD almost did with the K6. If the performance of the design had been higher (and AMD had Intel's manufacturing capacity), then AMD would have made the introduction at all levels.
A deep unwavering belief is a sure sign you're missing something...
You're kidding, right? Actually, I've never thought of it that way; I always thought that it was a kindergarten-type word along the lines of "eensy weensy." Either way, I don't care. Respect for other races is something you do, not what words you use. Regardless of its associations with the denigration of blacks, "niggling" (as well as its cousin "niggardly") are full-fledged English words, and it is silly to discourage their usage because they sound a certain way or are related to a certain slur. I am probably one of the most obsessive equal-rights persons you'll meet, but to me, niggling over word usage only exacerbates the problem; it doesn't make it better.
A deep unwavering belief is a sure sign you're missing something...
Jeez, I think I was smoking crack when I posted the parent. A watt is a watt for sure, but the computer power supply does not necessarily convert 120V AC power to 5 and 12 volts DC 100% efficiently! If, for example, it is only doing this job at 50% efficiency, the $ per year estimates in my original post could double. Does anyone know what a typical power supply efficiency is? (Also, if someone sees some crazy errors in the calculations, let me know.)
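The correction above amounts to dividing the CPU-side wattage by the power supply efficiency before pricing it. A small sketch in Python, under the same assumptions as the parent post (20W extra at the CPU, 8 hours a day, $.10/kWh); the function name is invented here, and the 70% row is a placeholder value, not a claim about what real power supplies achieve.

```python
def wall_cost_per_year(extra_cpu_watts, psu_efficiency, hours_per_day, dollars_per_kwh):
    """Yearly cost measured at the wall socket rather than at the CPU."""
    wall_watts = extra_cpu_watts / psu_efficiency   # losses inflate the draw
    kwh = wall_watts * hours_per_day * 365 / 1000.0
    return kwh * dollars_per_kwh


# 1.0 reproduces the parent post; 0.5 is the "could double" case.
# 0.7 is a hypothetical in-between value.
for efficiency in (1.0, 0.7, 0.5):
    cost = wall_cost_per_year(20, efficiency, 8, 0.10)
    print(f"efficiency {efficiency:.0%}: ${cost:.2f} per year")
```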
You're right about power, but there is a bug in the AMD751 northbridge that will cause the GeForce to lock up the system at AGP2x. This has been fixed on the recent motherboards that use the Via chipset.
Could it be that Intel is finally starting to realise how much impact AMD has had on its once monopolistic market? (not to mention their continuing monopolistic practices)
I've been a huge fan of AMD ever since their 386 DX/40's came out. AMD has done quite an impressive job trying to keep up with Intel, even though they're always doomed to be one step behind while Intel creates and pushes new standards all over the place just to bully their competitors.
I haven't bought a single Intel chip since '94, and I can't really say I've noticed any 'issues' at all with my AMD's.
Now, my question is this... why does Intel think it can continue to get away with such devious practices (remember the FP bugs, the P3 serials?) and extremely high markups on their chips... when you might as well be paying for a chip that may just be a step below (remember how they marked Pentiums based on highest stable clock speeds rather than manufacturing them all independently for a set speed?)... Quite frankly, I don't like what Intel does. At this point, I don't think Intel really has had much going for it other than name recognition.
Click here to accomplish absolutely nothing...
Intel has hardly ever had problems with bad microprocessor cores
Then whose fault was the FDIV bug in the early original Pentium?
<O
( \
X Adopt a bird today!
Will I retire or break 10K?
A) Hahahaha, yeah, that's why those 1+ GHz P3s are just flooding the market. They are moving to .13u for a reason: they are not yielding enough product in the high-frequency bins.
B) No they are not; they cannot arbitrarily throw capacity at badly yielding parts if the income doesn't justify it... for one, their shareholders don't like it, and it's probably a form of dumping.
And how the hell is all that capacity supposed to magically make stable products appear? Processors and chipsets (just because they had something stable enough to run a demo on doesn't mean diddly squat; on top of that they need different chipsets for different market segments) don't magically appear out of those fabs, and just throwing more engineers at development to get them done sooner has diminishing returns too.
C) Intel has limited room to manoeuvre due to their commitments to Rambus...
You are mixing up your processors. Itanium is the chip with the VLIW (Intel EPIC) instruction set architecture. Willamette just has a deeper pipeline. Otherwise, I agree with you.
Research by Hennessy, Patterson, and others in the early 90s indicated that procedural and naive O-O software design produces finite levels of parallelism that can be exploited by VLIW compilers. Parallelism needs to be more explicitly expressed in the software design (with threading and OO message passing) to split the task over multiple processors. This is why 1990s research into programming languages developed languages that facilitated this approach (e.g. Self, Java). The problem is that the average developer commonly found doing applications programming in many corporations is a self-taught VB programmer who has a bit of a problem switching over to effectively using those paradigms. So if you think the IT staff shortage is bad now, think how bad it will be when over half of the development staff can't adjust to produce the kind of software that is needed. With Itanium, I think Intel is postponing when we have to bite that bullet, but I doubt it can be postponed forever.
Laissez lire, et laissez danser; ces deux amusements ne feront jamais de mal au monde. - Voltaire
Jesus.
- A.P.
--
"One World, one Web, one Program" - Microsoft promotional ad
"Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
-legolas
i've looked at love from both sides now. from win and lose, and still somehow...
Something fishy going on here :)
The x87 FPU in the P4 has a single instruction per clock throughput (not counting load/stores, which like the K7 are handled separately) for MUL/ADDs, not unlike PPro/P2/P3 but WITHOUT free FXCH. That's going to hurt a lot of legacy code... It certainly is going to make it lag compared to K7 for anything which doesn't use SSE/2 for its floating point calculations. It probably is even going to make it lag compared to the P3 at the same clock for code which uses the x87 FPU a lot.
SSE/2 doesn't exactly give it an advantage over K7 either; for single precision it's still got the same throughput on operations as 3DNow!, only with better load/store bandwidth compared to the P3. And for double precision it potentially has the same number of operations per clock as K7, but its advantage in its larger number of registers is IMO negated by the limitation of needing to do everything with SIMD, which for double precision code might not always be as easy as for 3D.
With the waiting that has been done for the Willamette, I think that this quick fix will not be the answer. If AMD continues to release faster processors ahead of Intel, the true winner in the chip race will become apparent.
Kate
_________________________ Visit me at http://pornforcomputers.com
You're confusing the PIII and P4 shrinks.
PIII is first being shrunk 5% while still at 0.18 micron. This will be the 1.15GHz part, and is slated for low-volume "release" on July 31st.
PIII will also be shrunk by a move to 0.13 micron. This is the "Tualatin" PIII core that is slated for introduction mid next year, after P4.
P4 is being introduced at 0.18 micron in a 432 pin package, will then switch to a mysterious 0.18 micron 479 pin package early next year, and is then slated for a shrink to 0.13 micron also around the middle of next year.
Given that Intel are only just starting on the move to 0.13 micron (and the fact that historically they've always been well behind state of the art in process technology), I'd take those mid-2001 0.13 micron shrinks with a grain of salt! Mid 2001 is about the timeframe for process verification/samples; I'd be VERY surprised if Intel are ready for 0.13 PRODUCTION before the end of next year.
I'm not sure where you get that Intel say P4 will be a high end part. They intend P4 to replace PIII, and PIII to replace Celeron. It's not apparent that P4 adoption will be fast though because:
1) It will initially be Rambus only until Via produce a DDR chipset and fight Intel in court for the right to market it (they have stated this is what they intend to do). Intel themselves have a legal agreement with Rambus that prevents them from making DDR chipsets for P4.
2) None of the Taiwanese motherboard makers have expressed interest in making P4 motherboards, because of the short life of the initial 432 pin package. They will at least wait until the 479 pin package, and maybe until Via's DDR chipset is available.
The Willamette is a LOT faster than an Athlon.
Can we agree on the fact that the Willamette doesn't exist yet?
First, it takes full advantage of SSE (which is 128bit) to be able to multiply 4 floats in one operation.
Unfortunately, the software it's running doesn't. Look how much ISSE/3DNow! software is out there right now. And as you point out later...
Willamette is faster per clock for SIMD, slightly slower for regular FP, but the clock ticks twice as often.
The "clock ticks twice as often" part refers to the core "integer unit" only, not the fp unit. The "ALU" literally is the integer processing unit. The fp unit is what will get the most press, though, and may well be slower than in the Athlon. The "integer" and SIMD performance will both be excellent, I agree. But the fp performance will sell any new chip, since we already have more than enough integer performance and SIMD only has limited support. Unfortunately, Willamette seems to fall down in that particular area. But who knows for sure, right? What I really want to know is how the Willamette deals with its 20-stage pipeline. Willamette seems to be designed for high clock speeds, which is the #1 selling feature for CPUs. But there's got to be a major penalty for mispredicted branches with a pipeline that long!
The benchmark results were AWFUL (although interestingly both agreed: one set run on an Intel demo at WinHEC 2000, and one set on an engineering sample that got into the wrong hands). 800MHz P4 performed like a 600MHz PIII!!!
I don't doubt that P4 will excel at some particular tasks. If used with dual-channel PC800 RDRAM and an SSE2 optimized application, I believe it will be able to turn in some impressive FP numbers. However, that doesn't mean it's going to offer good general purpose performance, or perform well on legacy code.
P4 won't beat AMD on clock speed. 1.3/4GHz P4 will be up against 1.3/4GHz Mustang this year, and both will be at 1.5GHz in Q1 of next year. Maybe technically Intel will launch the 1.5GHz P4 this year (just like, technically, they launched the 1GHz PIII back in March), and Dell will sell 3 of them.
The P3 is already faster than the Athlon at the same clockspeed in many areas. Try checking out a review at Anandtech or Tom's Hardware. The Athlon does take advantage of some technologies that the P3 has yet to use, but once it does, I predict it will be faster than the Athlon in *most* areas of performance.
The biggest problem will be the price. That is why I think the Athlon has done so well all this time. For 50% less money, you get a chip that performs only 5% worse than the Intel equivalent. I think that is why most people, myself included, like Athlons.
Actually, Java's approach to parallelism is very hard to map to a VLIW architecture. Generally speaking you use VLIW to execute a ton of instructions from one thread at once. Threads are too high level and typically execute completely asynchronously from each other (as opposed to VLIW where instructions typically execute lock-step together).
What you really want are languages like High Performance Fortran that have the programmer declare segments of code that can be executed in parallel (not quite the same as threads).
sigs are a waste of space
AMD's doing a lot right, but there are a couple of additional things they need to do. One is to get the damn SMP chipsets out. They're not going to break into the lucrative server markets without a strong SMP offering.
Another thing they really should do is devote an engineer or two to hacking AMD specific optimizations into GCC. Intel realizes the value of this and is quietly working toward getting heavily optimized compilers in place for Itanium. AMD is sadly lacking here.
AMD is going to have to scramble if they're going to stay ahead of Intel in this race.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
Like the space race, but at least these will be processors that will fail not people that will die.
An Education is the Font of All Liberty
"SSE2 can crunch 4 32 bit numbers in each of 2 FPU pipes"
This is an old rumour, proved false for quite a while now. It has a separate 128 bit LOAD/STORE unit which can kinda count as a separate pipeline, but its max FLOPs per cycle is still 4, just like P3/K7.
See http://developer.intel.com/design/processor/future/manuals/245470.htm and http://developer.intel.com/software/idap/media/ppt/program/Optimizing_for_WMT_E31.ppt
Hmmm... from /proc/cpuinfo:
fdiv_bug
hlt_bug
sep_bug
f00f_bug
coma_bug
"long, strong track record", huh?
I think the Athlon will still outsell the Intel due to the fact that it can be bought. I can go to www.pricewatch.com and look up the price of a 1G Athlon and even systems with the chip in use. The nearest Intel I can find there is a P3-850. I am sure that there are 1G Intels out there, but they cannot be bought by the common user.
No replies made to AC posts. Please log in.
The cost of the old ones will plummet. :-)
Seriously, if you want a computer that ages gracefully it is far more important to buy extra RAM and hard drive than the fastest chip. A 30% difference in chip speed begins to feel very unimportant when you hit swap and you cannot install anything more on your computer...
Cheers,
Ben
My usual seat in the cluetrain is at http://pub4.ezboard.com/biwethey.ht
Anyone care to speculate on what kind of processor AMD will develop by the time Intel actually releases Willamette?
In the next few days (from now) ALi is supposed to release its DDR chipset for K7. Later AMD will release its AMD760 chipset (with DDR support) and then the AMD760M, aka SMP.
Sometime in Q4 AMD is supposed to present Mustang, which will have up to 2MB of L2 cache (while theoretically I believe it can have up to 8MB) and will start ramping from 1.5GHz.
Then there will also be a bunch of mobile CPUs (Corvette aka mobile Tbird and Camaro aka mobile Duron) with power-saving technology.
And then sometime in 2001H1 AMD should present Sledgehammer, their dual-core, 32- and 64-bit capable CPU with an on-die LDT controller (that possibly will allow creation of N-way SMP systems (where N > 1 :-))
I said 4% of the desktop market. Not server market. Desktops. Up from an estimated 0.6% the year before IIRC.
It has a nice big chunk of the server market. That is not counted in this figure.
Given as little as 50% growth/year (observed growth has been exceeding 100%), it will be hitting 10% in a couple of years from now.
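The "couple of years" figure follows from simple compound growth. A quick sketch in Python, assuming the share itself compounds at a fixed yearly rate (which is how the comment frames it); the 4%, 10%, 50% and 100% numbers come from the comment, and the function name is made up for illustration.

```python
import math


def years_to_reach(start_share, target_share, annual_growth):
    """Years for a share to grow from start to target at a fixed compound rate."""
    return math.log(target_share / start_share) / math.log(1.0 + annual_growth)


# 4% today, 10% target, at 50%/year and at 100%/year growth.
for growth in (0.5, 1.0):
    years = years_to_reach(4.0, 10.0, growth)
    print(f"{growth:.0%} growth per year: {years:.1f} years to reach 10%")
```

At 50% a year this comes out a bit over two years, and at 100% a year well under two.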
Given current trends, in 2 years we could see an absolute majority of CPUs that are being sold running Linux in the embedded space. Linux looks to again be a significant portion of the small server space. Clustering technology has already made it viable for a lot of high performance computing tasks, and even in its weakest area, the desktop, it looks like it will have noticeable penetration.
Of all of these initiatives the weakest is the desktop. OTOH the gaming market is quite aware that Microsoft's long-term game plan is to kill the Win9x line whenever it can. Given the choice between Linux and NT for a gaming platform, there is a real chance that Linux will win.
And yes, the chip companies are quite aware of these trends and make plans accordingly. Why do you think that Intel has been so supportive of Linux on the Itanium?
Cheers,
Ben
My usual seat in the cluetrain is at http://pub4.ezboard.com/biwethey.ht
DDR doubles potential memory bandwidth.
AMD's Athlon has a 200/266MHz FSB and can use DDR bandwidth. PIII has a 100/133MHz FSB and will get very little (maybe zero) gain from DDR.
We'll see. Typically, 80% or more of memory accesses are in cache. That's maybe a low number, but we'll work with it. Let's assume the DDR really doubles the bandwidth. So now 20% of your application is going to be twice as fast. That's a whopping 10% speedup, which is hardly the same as doubling.
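The estimate above is an application of Amdahl's law: only the part that gets faster contributes to the overall gain. A minimal sketch in Python using the comment's own figures (20% of the work made twice as fast); the function name is invented for illustration.

```python
def overall_speedup(fraction_improved, local_speedup):
    """Amdahl's law: whole-program speedup when only a fraction gets faster."""
    remaining = (1.0 - fraction_improved) + fraction_improved / local_speedup
    return 1.0 / remaining


speedup = overall_speedup(0.2, 2.0)
print(f"run time falls to {1.0 / speedup:.0%} of the original")
print(f"overall speedup: {speedup:.2f}x")
```

Run time drops by 10%, which works out to an overall speedup of about 1.11x. Either way you count it, it is nowhere near doubling.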
...honestly, I could give a damn about Intel and AMD ramping their core speeds up (which is really what is driving the move to a smaller die size).
In a modern PC, the processor isn't the bottleneck for 99% of the work you're doing. Unless what you're doing can fit in the L2 cache, the processor is going to sit and wait for it. Now, modern L2 cache architectures are getting pretty good, and the hit rate is now approaching 90-95% for most stuff, but think of it this way: an L1 cache miss (which is quite common) incurs a 4-6 cycle wait, while an L2 cache miss can incur a 50-70 (to as much as 100-200) cycle wait. Ouch!
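Those miss penalties can be rolled into one average cost per memory access. A rough sketch in Python: the 5-cycle and 60-cycle penalties are midpoints of the ranges quoted above and the 95% L2 hit rate is from the same paragraph, but the 1-cycle L1 hit time and the 90% L1 hit rate are assumptions added here for illustration, as is the function name.

```python
def average_access_cycles(l1_hit_cycles, l1_hit_rate, l1_miss_penalty,
                          l2_hit_rate, l2_miss_penalty):
    """Average cycles per memory access for a simple two-level cache model."""
    l1_miss_rate = 1.0 - l1_hit_rate
    l2_miss_rate = 1.0 - l2_hit_rate
    # Every access pays the L1 hit time; L1 misses add the L2 wait, and
    # the share of those that also miss L2 add the main-memory wait on top.
    return l1_hit_cycles + l1_miss_rate * (l1_miss_penalty
                                           + l2_miss_rate * l2_miss_penalty)


# Assumed: 1-cycle L1 hit, 90% L1 hit rate. Quoted: ~5-cycle L1 miss,
# 95% L2 hit rate, 60-cycle or (worst case) 200-cycle L2 miss.
for l2_penalty in (60, 200):
    cycles = average_access_cycles(1, 0.90, 5, 0.95, l2_penalty)
    print(f"L2 miss = {l2_penalty} cycles -> {cycles:.2f} cycles per access")
```

Even with hit rates this high, the rare trips to main memory add a noticeable amount to the average, and the effect grows as the core clock pulls further ahead of memory.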
Realistically, virtually all "CPU-bound" applications these days are also memory-bound, too. Quake, Photoshop, file-compression, and others typically thought of as relying on the CPU exclusively actually suffer greatly from the constricted pipe into the CPU core. Only stuff like crypto-cracking and maybe batch rendering are truly CPU-only-bound.
Also, especially with graphics (which is one of the most CPU-intensive operations left, and for which higher-speed CPUs are generally targeted first), the trend is to specialized co-processors (the GPU), since the I/O and memory limitations of the current PC architecture make improvements in the CPU of limited help to the graphics subsystem. Take a good look today: it's a far better thing for your Quake performance to replace that 2-year-old video card than it is to replace the 2-year-old CPU. And, given which is cheaper, which do you think people will do?
I'd love to see Intel and AMD quit spending all their money (or the vast majority of it) on speed improvements, and dump a huge amount of it into redesigning the PC architecture. Honestly, I think there is a huge amount of $$ to be made in the chipset integration market. Intel has done this to a certain extent, and now owns probably 60% of the primary MB chipset market. However, nobody is doing anything interesting.
AMD has made a step in the right direction with the adoption of the EV6 bus, but Intel is still running the GTL+ bus, which sucks. I'm talking about it as a bus architecture, not comparing MHz of the bus.
I'd love it if I could buy a 600MHz PC with an advanced memory and I/O subsystem (maybe like that on the SGI O2 or Visual Workstation series). The key here is that the push has to come from Intel or AMD. They're the only ones who can design the CPUs to take advantage of such a design. SGI's problems (besides ones internal to the company, like having no clue about how to sell into the NT market) were that the CPU was still a "typical" CPU, and the custom chipset had to work around its limitations.
If AMD or Intel really wants a leg up on the other (and I'm not talking about "My latest CPU is 3% faster than yours, for the 1 month until you introduce your next one" crap), they should:
If AMD and Intel ever want to do something besides make desktops and small servers, they have to change the PC architecture. Maybe Merced and Sledgehammer will do that, but I'm not optimistic.
Hell, even the high-end UNIX vendors could take a page from the Mainframes. Ever wonder why the 'frames haven't died? Batch-processing my good chap - I can get a 'frame with equivalent CPU power to a vanilla Pentium that will go through 100,000 jobs/hour, where a 8-way Sun E3000 might do 10,000/hour, and a quad Xeon maybe 4,000/hour. It's all about the I/O throughput.
I'm looking forward to the day when my PC isn't of an identical design to the one I bought in 1990.
-Erik
There are always four sides to every story: your side, their side, the truth, and what really happened.
We'll see. Typically, 80% or more of memory accesses are in cache. That's maybe a low number, but we'll work with it. Let's assume the DDR really doubles the bandwidth. So now 20% of your application is going to be twice as fast. That's a whopping 10% speedup, which is hardly the same as doubling.
Yeah, but the 80% of your memory accesses from cache don't take anywhere near 80% of your processing time. Cache is so much faster that your 20% or whatever from main memory is probably taking more time than the 80% from cache. So now most of your application is going to be twice as fast. That is an impressive speedup.
Of course, like you, I'll wait and see. Theory and practice don't often agree. :-)
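The disagreement here is between counting accesses and counting time. A toy calculation in Python makes the difference concrete. The 80/20 split is from the thread; the relative costs (1 unit for a cache hit, 50 for a trip to main memory) are hypothetical, and treating doubled bandwidth as halved memory access time is a generous simplification, since bandwidth and latency are not the same thing.

```python
def memory_time_share(hit_rate, cache_cost, memory_cost):
    """Fraction of total access time spent on main-memory accesses."""
    cache_time = hit_rate * cache_cost
    memory_time = (1.0 - hit_rate) * memory_cost
    return memory_time / (cache_time + memory_time)


def access_speedup(hit_rate, cache_cost, memory_cost, memory_speedup):
    """Speedup of total access time if only main memory gets faster."""
    before = hit_rate * cache_cost + (1.0 - hit_rate) * memory_cost
    after = hit_rate * cache_cost + (1.0 - hit_rate) * memory_cost / memory_speedup
    return before / after


# 80% hits at a cost of 1 unit, 20% misses at a hypothetical cost of 50 units.
print(f"time spent in main memory: {memory_time_share(0.8, 1, 50):.0%}")
print(f"speedup with 2x memory:    {access_speedup(0.8, 1, 50, 2.0):.2f}x")
```

Under these made-up costs the 20% of accesses that miss account for over 90% of the access time, so doubling memory speed helps far more than the 10% estimate. It still counts memory access time only, so real applications that also spend time computing will land somewhere in between.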
I completely agree with you; I guess I wasn't really clear on that in my post. Java is meant to support a hardware approach different from VLIW: many processors on a chip. This is the approach that Sun is taking with MAGIC and IBM with the upcoming generations of Power Architecture. Hmm, and those two companies are major proponents of Java. What a coincidence!
High Performance Fortran has VERY limited use outside of the scientific community (and probably within it these days as well). Microsoft Word is unlikely to run faster if you rewrite it in High Performance Fortran :-) Fortran code is often poorly structured and difficult to maintain. This is part of the driving force behind IBM's efforts to support scientific computing with Java. Unfortunately, from what I last read, they still have a ways to go.
Laissez lire, et laissez danser; ces deux amusements ne feront jamais de mal au monde. - Voltaire
The Williamette is a LOT faster than an Athlon
.13 micron first to get out any manufacturing process problems before moving to the new core). But AMD may very well jump ahead again with the Mustang, which is their next 32-bit CPU.
:-)
:-)
It's *way* too early to say this for sure. Willamette isn't even in production yet, and won't be out for a few months. This means it's not terribly meaningful to compare it to the Athlon, since the Athlon is out *now* and will be out in even faster versions by time Willamette ships. Now, I happen to think that Willamette will *probably* be a decent amount faster than the Athlons once the part is actually shipping. But it's a little early to say whether it will be a "lot", or just a bit.
Also consider that AMD and Intel's product schedules are offset by several months, such that they are playing leapfrog. AMD's jumped ahead with the Athlon, and Intel is likely to hop ahead with Willamette (once it actually comes out, which isn't for a while since they are moving the P3 down to
Sledgehammer IS AMD's next chip. It will HAVE to save it in the 32 bit market. It is simply a 64 bit x86 chip, not a new architecture like Merced. As such, it has a similar place to the Athlon as the Athlon did to the K6
AMD has a couple of new chips in the works. One is aimed at the desktop, and the other at high-end workstation/server environments. The desktop chip is code-named Mustang, and is a 32-bit x86 chip. The server chip is Sledgehammer, which is a 64-bit extension of the x86 instruction set architecture. Sledgehammer is NOT their next desktop CPU! Sledgehammer is designed to compete against Intel's Itanium (formerly Merced), a 64-bit VLIW-like CPU. The Mustang will be AMD's competition to the Pentium4 (aka Willamette).
First, it take full advantage of SSE
Well, OK, but this requires special processor-specific coding in the software to reap any benefits, and so you only get SSE in certain apps that choose to support it (like Photoshop). Whereas the Athlon has a significantly faster FPU without need to recompile the software. Plus AMD has 3DNow, which offers many of the benefits of SSE. Granted, 3DNow isn't as powerful as SSE, but since the regular FPU on the Athlon is faster the advantage may swing in either side's favor.
Additionally, it runs the ALUs at 3 GHz
The double-speed ALUs may not provide a really large benefit, since that's only one pipeline stage of about 20. If the other stages can't keep up, you may not see the benefit at all, or maybe only partially. You certainly won't be getting the equivalent of a 3GHz CPU. Besides, the ALU often isn't the bottleneck; memory accesses and waiting for the results of previous instructions often are. I suggest we all wait and see benchmarks once Willamette/P4 is actually on the market before we speculate much on this one.
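To put a rough number on the point above, here is a minimal Python sketch using Amdahl's law. The share of run time that is actually ALU-bound is a made-up input (nobody outside Intel has real figures yet), and the function name is just for illustration.

```python
def overall_speedup(alu_fraction: float, alu_speedup: float = 2.0) -> float:
    """Amdahl's law: only the ALU-bound share of run time gets faster.

    alu_fraction is a hypothetical input, not a measured Willamette number.
    """
    if not 0.0 <= alu_fraction <= 1.0:
        raise ValueError("alu_fraction must be between 0 and 1")
    # Time left over = the untouched part plus the sped-up ALU part.
    return 1.0 / ((1.0 - alu_fraction) + alu_fraction / alu_speedup)


if __name__ == "__main__":
    for fraction in (0.1, 0.3, 0.5, 1.0):
        print(f"{fraction:.0%} ALU-bound: {overall_speedup(fraction):.2f}x overall")
```

Only code that spends all of its time in the ALUs would see the full 2x; at a 30% ALU-bound share the sketch gives well under 1.2x.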
And keep in mind the leapfrog effect I mentioned above. AMD is by no means out of the race once Willamette comes out, even if Willamette does do the 30% better than the P3 like Intel claims it will (which it may, it's got some really new and cool features, such as the "trace cache"). Should be a fun battle to watch (almost as much fun as watching those prices drop
Just this week, I bought an ASUS K7V with the VIA chipset supporting 200 MHz Front Side Bus and an Athlon 800 (Best price to performance for the nonce). No Intel at all (Maybe some of the smaller chips, but none of the main features.)
If Intel wants to go to sleep, so be it. It's nice that there is competition in both the hardware and OS arenas (aka, this box is running Linux, not any sucky overpriced Win2K stuff...)
The road map for AMD will be two things: The "Mustang" processor will be released before the end of the year. This processor should do pretty good against Willamette. I have a feeling that Intel delayed Willamette because they found that AMD was a bit behind with the Mustang and could afford to go to 0.13 micron and still not be behind. If they can actually make 'em =)
What else? Obviously SMP support will be nice... Anand (not to mention myself) is expecting them to be out before the end of the year.
Hmmm... one last thing of course (save the best to last). DDR support. If DDR can manage to be less pricy than RDRAM (which it should) then Athlons will blow the pants off Intel. According to this conference call summary AMD's "next gen" supports DDR... whether that means Mustang or the current crop I really don't know. The Anandtech article suggests that it will be for Mustang.
So what's next for AMD? They are about to show Intel just how wrong they have been. We will see a DDR SMP Mustang compete with Intel. Let's see how they like that???
-rt-
** Evil Canadians are taking over the world. Learn about the conspiracy
I think that Intel's biggest problem right now is that they took their arbitrary "Moore's Law" as gospel and have organized the company around it.
According to this article, Moore's law suggests that processors should be shipping at 800 MHz now... the fastest speed that Intel is currently shipping in volume. Intel seems to be stuck in their monopolistic mode of thought, which dictated that they could determine that clockspeeds would be set by Moore's Law.
AMD has no such hallucinations... and benefits from a new point of view.
-rt-
** Evil Canadians are taking over the world. Learn about the conspiracy
Congrats AMD, you guys are kicking some Intel ass!
Depends on your definition of kicking ass, I suppose. Athlons are cheaper, yes, and just a hair faster, but we're not seeing any big breakthroughs here, like x86 compatibles running in the power range of the PowerPC chips, or a major, major price drop.
In some ways, we're just seeing a pissing contest at the high end and not much to get excited about. (People who buy new video cards every three months and overclock CPUs like crazy will of course be offended.)
A) AMD is in serious trouble. It still has the sheer clock speed advantage, but doesn't have a next-gen architecture to compete with Willamette.
Arguably, they do, and it's called the Athlon. Willamette isn't out yet, and its specs are not necessarily impressive (e.g. only a dual-pipelined FPU, compared to the three FPU pipelines of the Athlon). Of course, it could turn out to be an excellent performer, but unless it's substantially better than the Athlon, AMD need not worry so much.
If Willamette is slower than or equal to the Athlon clock for clock, then AMD has no problem (obviously). If it's a bit faster, there's still no problem, since AMD is used to selling its chips at a lower price and still making a profit. That's what it's doing right now. If Willamette is far better than the Athlon, then AMD has a problem.
Even then, AMD's Sledgehammer is a wildcard. Although it's a 64-bit chip, it is supposed to have good 32-bit compatibility (unlike the Itanium, for example). I wouldn't count on it to save AMD in the 32-bit market, but even if Willamette kicks ass, AMD may have an ace up its sleeve.
Again, though, AMD's problems start when Willamette is released, and only if it's a good architecture.
In the last year AMD has forced Intel's hand on both price and performance, FAR sooner than Intel had planned. This is obvious from the i820 chipset fiasco and from the fact that Intel was forced into releasing the Coppermine and the 133 MHz FSB FAR sooner than planned. I don't think that yet another Pentium Pro (686) core facelift is going to be able to keep up with the newer Athlons, especially with the 64-bit "Sledgehammer", which won't suffer from Itanium's speed and 32-bit app problems. Intel is desperate: they lost the "bleeding edge" lead in performance a year ago and they still don't seem to be able to recover.
Why did this happen? Complacency. After they went to Slot 1 (more to keep AMD/Cyrix from making clone chips than for any technical advantage, hence their latest move BACK to the socket), Intel killed off their competition. In 1998-9 Intel basically became a monopoly. But thankfully, due to superior engineering, AMD has been able, through innovation, to level the playing field against a vastly larger competitor.
The fact that HARDWARE standards are mostly OPEN (at least to the extent that is important) is why hardware continues to outpace operating systems in development. It took Microsoft 5 years from the development of the first mainstream 32-bit CPU to produce a (sorta) 32-bit mainstream OS. Now that we are on the cusp of 64-bit CPUs, anyone care to guess how long it will take MS to produce a 64-bit `Doze? Which is another reason why the AMD Sledgehammer will likely beat the current IA-64 "Itanium": it will likely be years before there are enough 64-bit apps and OSes (except Linux, which is already available) to make its 32-bit performance handicap a non-issue.
In other words:
CPU/hardware market: NON-monopoly, development speed VERY fast.
OS market: virtual monopoly, development speed FAR slower than hardware.
Which is harder to design, hardware or software?
Clearly, the lack of the ability of software to keep up with the hardware is the fault of the complacent, M$ monopoly.
In 2000 America, is a non-lawyer truly free?
A good compiler developer would run you about $125,000/yr + the usual overhead (say another 75K). X 2 would be $400,000/year. There is a reason AMD does not maintain an in-house compiler team.
A) Intel never has problems with yields
.18 micron process. Now they want to go to .13 ahead of schedule with copper. They should be taking smaller steps.
That's why AMD yields 12x as many GHz Athlons as Intel does for its GHz P3s.
B) Essentially (given the fact that they have 14 fabs or so) they ARE omnipotant.
Only 5 of which have been upgraded to
I would also like to point out that AMD has MEGAfabs.
C) RDRAM could be a problem, but Intel seems to have gotten that figured out, and besides, they could always use SDRAM.
They can't. They have a contract with RAMBUS saying they can't switch back. This is why VIA is more important to Intel than they like to admit.
I for one feel that if it took Intel this long, and caused so much trouble, to go to .18 micron with the Pentium III, how can they drop to .13 so quickly to fight AMD?
It also seems to me that if Intel does go to a 200 MHz FSB, AMD will be able to go higher very easily, such as at the end of this year, when they go to a 266 MHz FSB.
That made absolutely no since, but thanks for the effort AC.
Sense that is. LOL!
- 128-bit memory bus @ 640Mhz SDR or 320Mhz DDR. As far as I know, nobody is getting good yields on 200Mhz DDR parts at this point.
- 256-bit memory bus @ 320Mhz SDR or 160Mhz DDR. IIRC, most designers aren't using 256-bit data paths because of signal noise.
- 2-channels of RAMBUS would have to run at 1250Mhz -- that's PC2500 RDRAM.
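As a rough check of the figures in the list above, here is a minimal Python sketch of the peak-bandwidth arithmetic. The function name and the decimal-GB convention are my own choices for illustration, and the RDRAM line assumes a 16-bit channel that transfers on both clock edges (the way PC800 gets 800 MT/s out of a 400 MHz clock).

```python
def peak_bandwidth_gb_s(bus_bits: int, clock_mhz: float,
                        transfers_per_clock: int = 1,
                        channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_bits / 8
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second * channels / 1e9


if __name__ == "__main__":
    options = {
        "128-bit @ 640 MHz SDR": peak_bandwidth_gb_s(128, 640),
        "128-bit @ 320 MHz DDR": peak_bandwidth_gb_s(128, 320, transfers_per_clock=2),
        "256-bit @ 320 MHz SDR": peak_bandwidth_gb_s(256, 320),
        "256-bit @ 160 MHz DDR": peak_bandwidth_gb_s(256, 160, transfers_per_clock=2),
        # Assumption: 16-bit RDRAM channel, double-pumped, two channels.
        "2x RDRAM @ 1250 MHz": peak_bandwidth_gb_s(16, 1250, transfers_per_clock=2, channels=2),
    }
    for name, gb_s in options.items():
        print(f"{name}: {gb_s:.2f} GB/s")
```

Each option works out to about 10 GB/s of theoretical peak; sustained rates would be lower.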
So, I'm afraid that 10GB/s *anywhere* is a bit too optimistic. And one of the problems with updating the PC design is cost: while the chip manufacturers might be willing to fork over the dough for a higher-speed chipset, only people who are building new systems with no legacy peripherals will buy an entirely new chipset. And even small increases in cost seem to cause a lot of technologies to languish - how many 66MHz or 64-bit PCI cards or slots do you own?
A)
> Most importantly, the ALU's are
> clocked at twice the core speed, so you have your integer and FPU units running
> at 3+ GHz.
1) ALU == Arithmetic and Logical Unit
It does integer math, so no, the FPU isn't clock-doubled.
2) Yes, the ALUs are clock-doubled, but they are also "pipeline-doubled": in effect they use twice as many pipeline stages, so an operation takes about the same time.
It's great for some particular kinds of code, but it's not the killer feature you made it out to be.
The trace cache is interesting, though.
B) Do not forget that Intel is pushing the expensive Rambus, whereas AMD will surely use DDR SDRAM. Will Intel allow VIA to make a DDR chipset for their P4?
It's not certain; they would risk being sued by RAMBUS shareholders, I think.
Yeah, well, the consumer dollar is nothing compared to the IT dollar. Your friend might buy a single machine. IT guys will buy 20 at a pop in most places I've worked. Since IT guys are generally more informed than your friend, I will guess they will stick with the Intels. Besides, I've never heard of a tech company going with pure AMD or an insurance company buying Athlons. PIIIs and Celerons are going to dominate for some time yet in terms of overall sales because of all this.
Remember, Intel did this with Pentium as well. They slashed prices and kept Pentium alive for a long time to allow the PII to trickle down. In fact, this move seems to suggest that Intel is QUITE satisfied with P4's performance. If they were worried about competing with Athlon, they would have launched P4 at all levels immediately. However, they're comfortable in the idea that P4 is really next-gen (kind of like the PII or the original Pentium was) and that even with AMD's competition, they can afford to let it trickle down.
A deep unwavering belief is a sure sign you're missing something...
Intel's having enough problems fabbing at .18 micron. Notice we don't see most of their highest end chips out in quantity.
Take this rumor with a grain of salt.
Chas - The one, the only.
THANK GOD!!!
And the exact pairing restrictions and latencies aren't known yet, are they?
:)
As for games and 3D, they rely a lot on floating point, and in Willamette's case they rely on SSE, because its standard FPU is rather lame AFAICS. How well it will do in gaming integer code, dunno. The extremely low latency (0.5 clock) on some integer instructions could help on mainly serial (pointer chasing?) code; would that, for instance, help much on BSP code? But you are not going to get 4 ALU instructions per clock throughput anyway, the trace cache can only supply 3...
Why Intel would give away its bus design, maybe because they are locked into a very restrictive deal with Rambus?
B) Preemptive strike against those saying that new CPUs are useless. Go run 3D Studio on your Pentium-60 and then come back begging for forgiveness
That argument still seems a bit weak.
Let's see.. I run Photoshop on the lower power 1Gig Athlon. Then I go run it on a 1.3Gig Willamette... Am I going to SEE that much of a difference? I mean I'm all for more power and more bandwidth, but we're not even coming CLOSE to doing 10 times the speeds now or anything. A move from a P60 to a K7-1000, that's an event. A move from a K7-1000 to a Willamette, is that such a big deal? Really? Gee, Quake runs at 150fps, not 130fps, now.....? Point? I paid for the Willamette with my firstborn and now my Photoshop filters run 10% faster! Well, pat yourself on the back there.
Hmmn.. I just don't see how that argument is relevant. Granted, I STILL want speeds to increase, I just don't think that it's going to be such a ground-shaking change... The numbers, the big ones, are no longer so exciting. Oh yay, I got a 2GHz chip.. It's just lost some of its excitement and/or use, because now you run into people with systems that do everything they want and have enough computing power to do so in the near future (given software doesn't get too much slower.....)
Jeremy
If you think education is expensive, try ignornace
Actually, LDT is a universal bus and not a bus like EV6 or GTL+ is. Think of it as an addition to current bus technologies that works with everything from ISA & USB to the actual system bus.
Oh, I should mention there have been some early showings of the engineering P4 with benchmarks that have appeared online (though I'd have to go search for links). They show how bad the current situation for the P4 really is (even considering its engineering-sample nature). The P4 was benching as if it was a 900 MHz chip when it was clocked at 800. I think their fallback to older, simpler technologies that allow them to clock so high has led to substandard performance clock for clock by a large margin, & Intel had to double-pump the ALUs simply to keep almost even when they realized the penalty for getting the chip to clock higher.
Now that's an assumption, but so is the claim that a 1.5 GHz (3 GHz internal) P4 is going to be such a great chip and will have just so much muscle behind it as you are claiming. Now, not to be rude, but my assumption is based on an engineering sample I've seen benches of. What is yours based on other than pro-Intel bias?
we are all invisible unless we choose otherwise
One niggly little complaint. The PPro wasn't expensive so much because of bad yields, but for two different reasons.
1) It illustrated Intel's trickle down business model, where the absolute high end is absurdly priced.
2) They had to produce their own SRAM, and had bad yields on the SRAM. However because of the design, bad SRAM meant that the whole CPU had to be discarded. (They weren't tested until they were fully assembled.)
A deep unwavering belief is a sure sign you're missing something...
- Celerons run @ 66MHz FSB today
- P3's run @ 133MHz FSB, soon-to-be 200 MHz
- Duron runs @ 200MHz FSB today
- Thunderbird runs @ 200MHz FSB today, soon-to-be 266MHz
Given these numbers, it looks as though Intel is placing the P3 to compete with the Duron.
Soon to be a Duron owner,
DragonWyatt
Don't sweat the petty things. But do pet the sweaty things.
They are up for hire for exactly these kind of jobs are they not? Where there is a will there is a way... but apparently there's no will.
I believe for the next couple of years Intel is going to take a beating with the Willamette. They took too large a jump. VLIW processors have been tried in the past and failed, mostly because of the complexity of writing good compilers for them. It's a trade-off between simple hardware with complex software vs. complex hardware with simple software. Unfortunately for Intel, they are in the hardware business and not the software business.
I can see how this strategy can help them handle their chronic supply problems.
"Where's my P4?" "It's right there." "Where? I don't see anything." "Of course not, we've shrunk the die to microscopic size." "So how come my computer doesn't work?" "Uhh.... that's a known bug. We'll have a patch out in a few weeks."
Any sufficiently advanced civilization is indistinguishable from Gods.
_/_
/ v \
(IIGS( Scott Alfter (remove Voyager's hull # to send mail)
\_^_/
20 January 2017: the End of an Error.
Following the shock of logging onto HardOCP this morning in search of more details of Micron's new dual-Thunderbird board, I was most distressed to find out they had been wrong and it was actually a new Socket 370 board. :(
:)
However I did notice a little sidenote on ZDNet today which reads
Right now, "the priority is for a two-way chip set solution," an AMD spokesman said.
Meanwhile, Alpha Processor Inc. may step up to the plate. API is developing 4-way and greater chip sets for Athlon.
I think I'll hold off replacing my Celery 300 (@464 of course) for a little while longer, for a dual Tbird. Certainly I've read that the Athlon architecture is closer to the Alpha architecture than to that of Intel and should lend itself to far superior SMP.
I suppose this is just bait for AMD to release some new processor to beat this Intel release. Maybe Intel is hoping that sooner or later AMD is going to screw up. I won't be changing from AMD any time soon.
In fact they did test games on this engineering sample, not just 'industry standard benchmarks'. I saw marks for everything from SiSoft Sandra to 3DMark, and while there was some divergence, it basically seemed to perform as if it were clocked about 100 MHz higher than it really was. The only things that were far enough off to question were the memory benchmarks, which were in the 3200 MB/s range (which, btw, is not possible as a real number), and a couple of other benchmarks that showed strange, super-low results. There was one other oddity: FPU-intensive tasks didn't get the extra kick you seem to think they would.
The best way to explain why is to take another look at the actual integer & FPU units. The first thing you come across is that the integer pipelines are doubled but the FPU isn't. This has been reported a few times if you visit such places as JC's PC News or Ace's Hardware. IF that's changed, the numbers didn't reflect it in the engineering sample, as the numbers were lower than or equal to a P3's in all FPU tests. The one upside for the P4/Willy is that most apps don't use SSE2 yet, so it may see more improvement later when more things start using SSE2 instead of plain SSE.
Just to make sure I wasn't full of it, I went and looked up again a few things that suggested exactly what I just said. The P4 flew in integer apps (though only a bit higher than rated compared to a P3), but bombed or stayed equal to a P3 in FPU-intensive tasks. Now OK, this is based on an engineering sample and things can always improve and/or change, but right now it's showing a serious lack of that 3D performance you want, well, except for showing that AGP makes more sense on it than on any previous platform (he tested using a GeForce 256 for the most part and got numbers higher than normal that didn't seem to have any correlation to the FPU performance, whereas he didn't see the same gains if he tossed in a PCI graphics card). I'll wait and see, but I don't think Intel is answering your prayers this time.
we are all invisible unless we choose otherwise
Oh well... Digital equals Compaq nowadays.
And furthermore, there is the serendipity of Dynamo to consider. Dynamo would pay off _very_ handsomely on regular pieces of code, as it could use execution traces to make accurate branch predictions.
You could imagine Intel licensing it from HP to be used in their compilers. An advanced compiler could just enable Dynamo for inner loops. Thus, the irregular code wouldn't get the speed penalty implied by the interpretation, and the compiler could provide valuable hints to the Dynamo system.
Compiler writers, what info would you like to see given to dynamo? Hrm. I've half a mind to ask comp.compilers this.
Actually, the Willy uses some older technologies more than you think. While it does have innovations like the trace cache, double-pumped ALUs, & some more minor ones, it also falls back in other areas, such as having less complex integer & FPU pipelines (the FPU is almost all SIMD instructions from what I've seen).
On the other hand, AMD has some nice features coming out as well in future chips that should be released at about the same time as Willy, such as a new FPU design and the LDT bus design (which, btw, they are giving away, something Intel never learned how to do).
By the way we aren't seeing the best the current athlon cores can do. It's been shown that when you code in Athlon optimizations (both 3dnow! & such things as prefetch) you can gain as much as 33% performance.
we are all invisible unless we choose otherwise
There have been plenty of problems with Intel CPUs and chipsets over the last 1-2 years. Fewer with AMD. But nonetheless, having that "good ole' Intel(TM) feeling", many don't believe this. People are "remembering like elephants" in an industry changing so rapidly that only immediate evaluation is valid. One can be good one generation and poor with the next.
Like I said, AMD hasn't had problems since the K6 days. (At least I THINK I said that.) Also, the K6 problems are mainly due to the trouble AMD had switching from .35 micron to .25 micron. However, in the current situation, (AMD pushing up to 1.5 GHz) they aren't changing the manufacturing process, so there's no risk of reduced quality. Also, AMD did a great job switching over (the very complex I might add) Athlon to the .18 micron process, and I think that they'll have a fairly easy time with .13 micron. In fact, they'll have an easier time than Intel, because Intel not only has to change the gate size, but has to introduce the copper technology. Since AMD is already cranking out copper chips from its Dresden fab, they won't have that problem. And I suggest you don't hold AMDs problems with the K6 against them. Even many of those problems were not with the K6, but the crappy supporting chipsets. Right now, VIA and AMD seem to have gotten a handle on these things, and for the last year or so, the Athlons have been 100% stable. (Aside from some power issues with the GeForce.)
A deep unwavering belief is a sure sign you're missing something...
And how exactly do you know they have the yields, stability, motherboard chipsets, etc. needed to introduce the P4 at all levels immediately? Did I miss some press release announcing Intel's omnipotence? (It must have come recently, after all the RDRAM chipset fiascos.)
Development of an optimized GCC backend should not be a huge task... nothing compared to the scale of undertaking which the development of a processor is.
Even still, the P4 is going to kick ass where it counts. What needs 3 GHz of processor? Games and 3D. Games and 3D are what is driving the performance market, and since 3D ops are so easy to SIMD (I know that's not an appropriate usage of the word) the double-pumped ALUs will probably result in 90% more performance. As for Athlon-optimized code, why bother? Are companies actually going to release Athlon-optimized binaries? It hasn't happened yet, and probably won't happen, especially with such a small performance increase. Additionally, why should Intel give away their bus design? They produce all their own chipsets, and thus have no desire to help out the competition. AMD doesn't really want to produce their own chipsets, so they have to release the design to allow others to make support chips.
A deep unwavering belief is a sure sign you're missing something...
I think VIA bought Cyrix. Who knows what they are doing now, other than manufacturing chipsets with the shittiest IDE I have ever seen.
Lars -
This might be a little off-topic, but it seems to me CPU speeds have gone up much faster, and prices gone down, ever since AMD started going after Intel. Which means that before AMD turned up, Intel was really behaving like some kind of monopolist, hurting consumers and all. So why didn't the DoJ go after them?
bhaloo
I want to die like my grandfather. Peacefully, in my sleep. Not screaming and terrified, like his passengers.
Looking back at Intel's history, their new product families tend to be introduced around the same speed or slower than older chips, but then they ramp the new chip's speed way up.
Consider the Pentium 60, vs 486DX4/100. The P60 wasn't all that much faster. The (non-MMX) Pentium core made it to 200Mhz, which smoked all 486s. P2 was introduced at 233 Mhz, similar to PMMX, but they ramped it to 400-500Mhz. The P3 was introduced at 450Mhz, and they're now at 1Ghz.
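For what it's worth, the ramp in those numbers is easy to tabulate. A minimal Python sketch, using only the clock speeds quoted above (the P2 peak takes the upper end of the "400-500Mhz" range given there; the names are just for illustration):

```python
# Launch and peak clock speeds in MHz, as quoted in the post above.
# Assumption: the P2 peak uses the top of the quoted 400-500 MHz range.
CLOCKS_MHZ = {
    "Pentium": (60, 200),
    "Pentium II": (233, 500),
    "Pentium III": (450, 1000),
}


def ramp_factor(launch_mhz: float, peak_mhz: float) -> float:
    """How many times faster the family's peak clock is than its launch clock."""
    return peak_mhz / launch_mhz


def ramp_table(clocks=CLOCKS_MHZ) -> dict:
    """Ramp factor per family, rounded to two decimals."""
    return {name: round(ramp_factor(*pair), 2) for name, pair in clocks.items()}


if __name__ == "__main__":
    for family, factor in ramp_table().items():
        print(f"{family}: {factor}x from launch to peak")
```

On these figures each family ended up at somewhere between roughly 2x and 3.3x its launch clock.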
Why is this done? I believe Intel knows they can get people to buy things at the speeds and prices they set. [Last I checked, they're still making a profit on CPUs, so this strategy is still successful.] Once they get the cash of the early adopters, they ramp speeds, and get more customers once things have a real benefit.
Nathan Mates
CodePlay offers VECTORC which they claim does a good job of optimizing for Athlons and K6 cpus. But it's windows only, no C++ support yet and it is quite expensive.
The fortran compiler you mentioned is from the folks at Digital AFAIR.
Correct me if I'm wrong, but won't using .13 micron process and copper technology make it easier to overclock this new P3? .13 is sure to run cooler too, right?
Thanks, I was trying to remember who it was... As soon as I saw your post I knew that Digital was indeed who it was.
As for CodePlay's VectorC (& later VectorC++), though, they seem to suggest it's all 3DNow! optimization. 3DNow! isn't the only thing that can be optimized for on an Athlon, though. Prefetching is a prime example of what else needs to be added. It's already supported by Intel-made compilers for Intel CPUs.
Eventually we can hope to see a compiler for the Athlon that is just as good as those for Intel CPUs. I just hope it doesn't take until the K8 Sledgehammer chips come out.
we are all invisible unless we choose otherwise
It doesn't seem like Intel is having a good time right now, or will be any time in the future. Anyone care to speculate on what kind of processor AMD will develop by the time Intel actually releases Willamette? I was thinking that Intel would break away with the new Willamette chips because they'd be undeniably faster, but now that they're delaying them again... AMD must be seeing Sugar Plum Faeries.
-----------
"He's more machine now than man, twisted and evil."
This could be a great example of how competition actually gets things improved. Of course it could turn ugly if either one of them starts pushing products to market before they are ready, still if one does that we'll always have the other, right? ;-)
Even if you're an Intel fan (which I am not), you have to praise AMD for creating the much needed competition in the x86 CPU area.
One of the pitfalls, though, is that although AMD is releasing excellent products, Intel seems to be rushing theirs to the market. Rushed products == unstable products == not good for the consumer.
Congrats AMD, you guys are kicking some Intel ass!
Quality doesn't seem to be suffering here. Both the Athlons and PIIIs have been 100% stable, especially since most manufacturers cool their high power systems very well. Of course, the Athlon has been plagued by power constraints, but in general, those are the fault of cheap components. For example, the problems with the GeForce had nothing to do with the Athlon or its infrastructure, but with crappy motherboards that didn't meet power specifications. You spend the extra ducats and buy a good motherboard, or suffer with poor quality just like with every other cheap product in the world. As for speeding up the manufacturing process, Intel has hardly ever had problems with bad microprocessor cores, and its recent problems have everything to do with RDRAM rather than anything else. As for AMD, it too hasn't had manufacturing troubles since the K6 days, and its new 1.5 GHz chips will be based on the same .18 micron copper process that has been flowing smoothly for some months now.
A deep unwavering belief is a sure sign you're missing something...
They cannot just shoehorn too many new architectural features into a core they are taking out of commission soon. You will not see a second floating point unit, for instance, IMO.
The big point about why the P3 is playing catch-up (in vain, I might add) is the price difference you mention... and that's not going to change; the Athlon is better optimized for speed and nothing short of a complete redesign (which is the Willamette) can change that.
Per-clock performance doesn't mean dick... for one, consumers have been conned into merely looking at the MHz, and secondly, even if you profess to be above that and look at the performance purely objectively... even then, in performance/$ Intel is still hugely lagging, and that's ultimately the most important ratio IMO. If you just care about top performance, go get an Alpha.
So yeah, they are playing catch up.
IF the reliability issue is only based on FUD, (and that's my perception), then Intel will basically be doing what it's doing now - and still lose. The point is, AMD is cutting low enough on price that the reliability FUD that's coming out of Intel is being perceived as worth the price differential.
Perhaps Intel could lower their prices significantly? But that undoes this whole "market segmentation" strategy that they've built since the Pentium first came out.
Incidentally, "market segmentation" is the opposite philosophy from what got them where they are in the first place. The original philosophy was: produce a decent chip at low enough prices so people can afford a computer, period. That is as opposed to the Sun philosophy, which was: charge a buttload, only the rich can afford computers, so we might as well get rich off of them. So of course Intel sold a buttload of chips and gained market dominance, which really didn't bother Sun much because Intel wasn't after their market. Once Intel decided to go after Sun's market, they decided not to sacrifice the low end, and to segment the market, "scamming" the high-end market into paying premium prices for essentially the same chip (hence, the Celeron overclocking craze).
I believe that if Intel had fully locked-down overclockers, they would probably be significantly lower in the marketplace than they are today.
if it ain't broke, then fix it 'till it is!
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
The only company sacrificing quality in production is Intel. They have released speed bins faster than their chips should be able to handle (their first 600 MHz chips). However, I have not heard about such problems since.
I don't know about you, but I sure as hell don't see quality going down the tubes. Just prices.
I dunno, I had been using a P233 for most of my apps for the last 3 years, and suddenly, I just got a 600mhz PIII, and there's a 900-something MHz PIII that showed up in the lab, and there's really no difference in the two. The 600 is a HUGE difference from the 233 (I can launch netscape, and by the time it's up and running, I can remember why I launched it - that wasn't always the case with the 233).
if it ain't broke, then fix it 'till it is!
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
Intel's number of errata has been accelerating recently, but then again AMD's number of errata was quite low the last time I looked at Athlon "Classic" - and if that still holds we are not suffering lower quality (at least from AMD), but only lower prices. As for those who say we don't need faster processors, I can only say: *Speak for yourself*. I sure can use them, and I am not even a gamer. Faster RAM (etc) is also good, of course.
Well, Intel can take a stale dog turd, slap a 20-pound heatsink and fan on it, and people will still beg for the chance to pay $2000 for it. because "intel inside" people are stupid.
if it ain't broke, then fix it 'till it is!
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
Let's see.. I run Photoshop on the lower power 1Gig Athlon. Then I go run it on a 1.3Gig
Willamette... Am I going to SEE that much of a difference?
>>>>>>>
Depends on what you do. Certain filters nearly double in speed with SSE. (Photoshop doesn't support 3DNow!) Additionally, the ALUs in Willamette run at 3GHz. Yeah, it's safe to say you'll see a pretty big increase in speed.
I mean, I'm all for more power and more bandwidth, but we're not even coming CLOSE to 10 times the speed now. A move from a P60 to a K7-1000, that's an event. A move from a K7-1000 to a Willamette, is that such a big deal? Really? Gee, Quake runs at 150fps, not 130fps now...? Point? I paid for the Willamette with my firstborn and now my Photoshop filters run 10% faster! Well, pat yourself on the back there.
>>>>>>>>
How about this. Moving from a 1GHz Athlon to a 1.5GHz Willamette will grant you 300% more performance (theoretically). Even if you get a third of that, you've still got double the speed. Instead of saying Quake will run faster, you can say that upcoming, graphically intensive games (Black and White, Halo) will go from 20-30fps (barely playable) to 40-50fps (comfortably playable). Or instead of taking 13 hours for your 3D animation to render, it'll take 7 hours.
PS> Have you SEEN Halo? It's un-friggin-believable.
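To put rough numbers on the speedup argument above: "300% more" means four times the speed, and a third of that gain (100% more) still doubles it. A minimal Python sketch of that arithmetic follows. The function names are made up for illustration, and the 13-hour render and 20-30fps figures are simply the ones quoted above.

```python
def speed_multiplier(percent_more):
    """Turn 'N% more performance' into a speed multiplier (300% more -> 4x)."""
    return 1.0 + percent_more / 100.0


def scaled_runtime(hours, multiplier):
    """Time for the same job on a machine `multiplier` times as fast."""
    return hours / multiplier


theoretical = speed_multiplier(300)      # the full theoretical gain
realistic = speed_multiplier(300 / 3)    # only a third of that gain

print("theoretical multiplier:", theoretical)
print("realistic multiplier:", realistic)
print("13-hour render becomes", scaled_runtime(13, realistic), "hours")
print("20-30fps becomes", 20 * realistic, "to", 30 * realistic, "fps")
```

At exactly 2x the 13-hour render comes out at 6.5 hours, so the "7 hours" above is a round-up, and 20-30fps scales to 40-60fps if the game is purely CPU-bound.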
Hmm... I just don't see how that argument is relevant. Granted, I STILL want speeds to increase; I just don't think it's going to be such a ground-shaking change. The big numbers are no longer so exciting. Oh yay, I got a 2GHz chip. It's just lost some of its excitement and/or use, because now you run into people with systems that do everything they want and have enough computing power to keep doing so in the near future (given software doesn't get too much slower...).
>>>>>>
I don't have enough computing power! Where are the 1-million-polygons-per-frame games? Why do I have to wait hours to render an animation rather than a few minutes? Why does it take several minutes to preview a Premiere movie? Unless of course you're running Word, in which case it doesn't matter. Additionally, people have been using your arguments forever. The Pentium was only 60% or so faster than a 486, but was that worth it? Most people say yes.
A deep unwavering belief is a sure sign you're missing something...
I would just like to point out a few things myself...
Point A) A senior VP at Intel Corporation, Albert Yu, indicated that Willamette would only be 30% faster than the Coppermine. The Willamette will not be that much faster than the Pentium III at the same clock speed. The most important thing to remember about the Willamette is that Intel has decided the way to increase performance is to raise the clock speed and sacrifice IPC (instructions per clock) efficiency. So you can't compare this transition from the PIII to the Willamette to the transition from the 486 to the Pentium. The Pentium had a much better IPC than the 486; the Willamette will most likely have a lower IPC than the PIII. But the Willamette is designed to scale to very high clock speeds. You only have to look at the Willamette's pipeline to see this design philosophy. The pipeline has 20 stages; this will make it easy for Intel to raise the clock speed. For more info on how pipeline length affects CPU performance, check out this web page: http://www.aceshardware.com/Spades/read.php?article_id=50. I would also like to point out that the double-clocked ALU is only one stage of the 20-stage pipeline. You do have a good point about the SSE2 instructions: if Intel can get a lot of developers to use them, it will make it much harder for AMD to claim they have the fastest FPU. But AMD shouldn't be afraid; they have just rolled out a new architecture that has a lot of room for growth, and I can't see Intel getting that far ahead of AMD. I would expect the title of fastest x86 CPU to change hands many times in the next year, but Intel is not about to totally destroy AMD.
Point B) I am not going to say I disagree with you; it's just that I think 3D rendering is not a good example. The average computer user will never do 3D rendering outside of a game. I think that speech recognition is a much better example: something that many people would use if it worked better...
Point C) The PPro was expensive because it had bad yields. This is the same reason that RDRAM is much more expensive than SDRAM today.
True, it should, but I don't think AMD has the talent in house for it. You may notice the only time they have ever dealt with coding themselves was when they reworked the Quake 2 OpenGL engine for 3DNow!. They have never had a 'compiler team' as Intel does (these guys have been overworked of late, proving they can make up for the lack of a 7th-gen core).
The good news is I heard someone (can't remember who right now) was going to make a Fortran compiler with AMD optimizations. Hopefully this is the start of better work in this area. But I still don't think AMD plans on having their own coding teams... Though they will really need to have some coders in house when K8 Sledgehammer rolls out.
we are all invisible unless we choose otherwise
AMD's Mustang processor will be out in a few months and will match P4's 400MHz FSB (except it will use affordable DDR memory). It is rumoured to possibly also have Sledgehammer's "Technical floating point" unit which will blow P4 out of the water.
FWIW Sledgehammer will sample in Q1 2001, to ship late in the year, and therefore will not be far behind Mustang.
From the P4 benchmarks that have leaked, it appears that Intel may well be the ones in serious trouble. Certainly Intel's plans for PIII don't exactly jibe with it being obsoleted by P4...
Arguably, they do, and it's called the Athlon. The Willamette isn't out yet, and its specs are not necessarily impressive (e.g. only a dual-pipelined FPU, compared to the three FPU pipelines of the Athlon). Of course, it could turn out to be an excellent performer, but unless it's substantially better than the Athlon, AMD need not worry so much.
>>>>>>>>>
The Willamette is a LOT faster than an Athlon. First, it takes full advantage of SSE (which is 128-bit) to be able to multiply 4 floats in one operation. That's twice what the Athlon can do. Second, the Athlon's 3 pipelines aren't what they seem. In practice, one of the pipelines is always limited to simply doing a store operation every cycle (i.e. you can do two floating-point multiplies and 1 store every cycle, compared to two floating-point multiplies and no store on the PIII). Additionally, the Willamette runs its ALUs at 3GHz. It will definitely be an excellent performer, because even if it meets half the projected performance, it will rock.
If Willamette is slower than or equal to the Athlon clock for clock, then AMD has no problem (obviously). If it's a bit faster, there's still no problem, since AMD is used to selling its chips at a lower price and still making a profit. That's what it's doing right now. If Willamette is far better than the Athlon, then AMD has a problem.
>>>>>>
Willamette is faster per clock for SIMD, slightly slower for regular FP, but the clock ticks twice as often.
Even then, AMD's Sledgehammer is a wildcard. Although it's a 64-bit chip, it is supposed to have good 32-bit compatibility (unlike the Itanium, for example). I wouldn't count on it to save AMD in the 32-bit market, but even if Willamette kicks ass, AMD may have an ace up its sleeve.
>>>>>>>>>
Sledgehammer IS AMD's next chip. It will HAVE to save it in the 32 bit market. It is simply a 64 bit x86 chip, not a new architecture like Merced. As such, it has a similar place to the Athlon as the Athlon did to the K6. Also, it will have a MUCH faster FPU because it will go RISC style and ditch the stack-based FPU present in x86 chips.
Hmm, been running a webserver on a Linux box since 1996 on a K6/233. Been on pretty much 24/7 since then. Have had a drive fail, but the MB and CPU are still alive and doing their job today.
Today the secret to good performance in games and 3D is to move the calculations to video cards, which are specially designed for it. What is the reason not to do this?
As for AMD-optimized code, well, the Windows people can take care of themselves. Alan Cox has been given a couple of nice Athlons to try to optimize Linux on. They would be idiots not to try to do similar things with gcc. Which means that with open source you will be able to take advantage of it without having to depend on someone else for binaries.
:-)
Cheers,
Ben
My usual seat in the cluetrain is at http://pub4.ezboard.com/biwethey.ht
Intel shot itself in the foot with this rambus deal.
When someone yells "Stop" or goes limp, or taps out, the fight is over.
DDR doubles potential memory bandwidth.
AMD's Athlon has a 200/266MHz FSB and can use DDR bandwidth. PIII has a 100/133MHz FSB and will get very little (maybe zero) gain from DDR.
With CPU speeds around 1GHz and memory around 133MHz, memory is THE bottleneck for many applications. Increasing processor speed from 900MHz to 1GHz may be nothing to be excited about, particularly since the performance increase is much less than the clock speed increase. DOUBLING memory bandwidth IS something to be excited about, since it should have a HUGE impact on performance!
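For a back-of-the-envelope feel for the bandwidth numbers: peak bandwidth is bus width times clock times transfers per clock. The Python sketch below assumes a 64-bit memory bus, and the fraction of run time spent waiting on memory is a made-up illustrative value, not a measurement.

```python
def peak_bandwidth_mb_s(bus_bits, clock_mhz, transfers_per_clock):
    """Theoretical peak in MB/s: bytes per transfer * millions of transfers per second."""
    return bus_bits / 8 * clock_mhz * transfers_per_clock


def overall_speedup(memory_bound_fraction, bandwidth_gain):
    """Amdahl's law: only the memory-bound part of the run time shrinks."""
    return 1.0 / ((1.0 - memory_bound_fraction)
                  + memory_bound_fraction / bandwidth_gain)


sdr = peak_bandwidth_mb_s(64, 133, 1)   # PC133 SDRAM, one transfer per clock
ddr = peak_bandwidth_mb_s(64, 133, 2)   # DDR at the same clock, two transfers per clock
print("SDR peak:", sdr, "MB/s; DDR peak:", ddr, "MB/s")

# Hypothetical application that spends half its time waiting on memory.
print("speedup from doubled bandwidth: %.2f" % overall_speedup(0.5, ddr / sdr))
```

How much of the doubling shows up in practice depends on how memory-bound the application is; with the assumed 50%, the overall gain is about 1.33x. Latency is not modeled here at all.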
Yes, that Cyrix 686L PR200+ was a top performer, especially as it was really running at only 150MHz. As far as its integer unit was concerned it was up there with the P233MMX, even though its FPU was not much better than a standard P133. It was those bloody PR ratings that killed it: they were there just to make a little more money on each speed grade, even though they made it look bad performance-wise. When they tested the 'Joshua'-cored Cyrix III at Tom's Hardware, the PR533 was competitive with the Celeron 533 as far as integer was concerned, but its FPU again blew it. However, compared with the Celeron 433 (the Cyrix III 'Joshua' PR533 really ran at 433MHz) it creamed it on integer scores, and its FPU was competitive too. Plus, if they had used optimised benchmarks it would have done even better, as it had twin MMX units and twin 3DNow! units on top of its twin FPU and its 686 integer unit (the undisputed x86 integer king), and it had 256K of on-chip L2 cache. However, the Joshua core just wasn't ramping MHz-wise, so VIA again 'released' the Cyrix III, this time with the IDT WinChip 'Samuel' core, which doesn't even perform as well as the 'Joshua' at equal speed grades (it has no on-chip L2 cache at all), but it's ramping marvellously at speeds up to the GHz level, and even better in the lab.
By all accounts P4's x87 performance may be BELOW PIII's, so when you're talking about good P4 FP performance you're presumably talking about SSE2, not x87...
Note that P4 has SSE2, not SSE. Photoshop is only going to be faster if/when it gets P4-specific support, which isn't likely to occur before it becomes a volume part. Intel themselves have stated that P4 will take "3-8 quarters" to replace PIII (up to 2 years!!!), and certainly it's hard to see the initial 432 pin P4 being a volume part since Intel are planning to replace it with the 479 pin part in about 6 months (gee, wonder why..). I expect P4, with its huge 170 sq. mm (vs 120 for Athlon) die, huge power consumption and heat output, will not be a viable mass volume part until a 0.13 die shrink sometime late next year.
In case you think I'm unduly pessimistic about P4, look at Itanium. Now many YEARS late, and currently scheduled to launch at 733MHz (struggling in the lab at 600MHz), vs an original 1GHz+ goal... Intel aren't exactly the CPU geniuses many seem to think they are! With Intel's huge R&D budget, they still produce fewer CPU patents per year than tiny AMD!!!!
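The die sizes quoted above translate fairly directly into cost. Here is a rough Python sketch using a common gross-dies-per-wafer approximation. The 200mm wafer diameter is an assumption (the post doesn't say), and yield and defect density are ignored entirely.

```python
import math


def gross_dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    """Approximate whole dies per wafer: wafer area / die area minus an edge-loss term."""
    radius = wafer_diameter_mm / 2.0
    by_area = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(by_area - edge_loss)


WAFER_MM = 200  # assumed wafer size, not stated in the post

p4_dies = gross_dies_per_wafer(WAFER_MM, 170)      # 170 sq. mm die
athlon_dies = gross_dies_per_wafer(WAFER_MM, 120)  # 120 sq. mm die
print("P4 dies per wafer:", p4_dies)
print("Athlon dies per wafer:", athlon_dies)
print("Athlon advantage: %.2fx" % (athlon_dies / p4_dies))
```

Under these assumptions the smaller die gets nearly half again as many candidates per wafer before yield is even considered, and larger dies usually yield worse on top of that.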
I recently heard the official figure for the FDIV return rate. It was VERY low, much lower than I expected, around 20% (I forget the exact figure). However, that bug was only present in the first stepping of the Pentium, which themselves are rare (since they were very high-end at the time!)
It is clear that right now AMD could bump performance significantly. As long as that remains true, every time Intel pushes the same old a little farther, AMD just stays ahead. The result? Intel is working well beyond what it can reliably produce, they have low chip yields, and AMD eats up more of their market.
Oh right, since AMD remains ahead on what is widely available, they keep on growing market share.
Intel needs to stop playing this losing game. Take the body blow, throttle back on the speed, play up reliability, point out the hidden costs of running the chips as hot as AMD runs them, then put the energy into a better architecture...
Cheers,
Ben
Actually, competition inherently produces better quality products at a lower price. We, as the consumers, are reaping the benefits of this rivalry. These companies have huge incentives to produce good quality products, and they pour huge amounts of money into research & development to ensure it can be done at such a rapid pace. I guarantee you that Intel and AMD understand that people are going to buy the rival's product if theirs is not up to par in quality. I mean, can you name ways that quality from either of these companies has suffered because they are "making new products too fast"? Chips have to be of high quality for the technology to progress as far as it has. This is also why the government quickly dropped their anti-trust actions against Intel: they realized that there is healthy competition and the consumer is indeed benefiting by getting better products at lower prices.
I have no bias either way, I go where the performance is. Yea, I know, a disloyal little bastard ain't I? Secondly, I don't really think those benchmarks are valid in this case. These industry standard benchmarks test the CPU in a wide range of settings. In these cases, the 20-stage pipeline hurts performance because it is much harder to keep a pipeline like that full. Thus, a 1.5GHz P4 probably won't benchmark that much faster on diverse stuff like SPEC or Winstone. However, I'm not running Winstone, am I? I run games and 3D renderers. The math for that kind of thing is
A) Very easy to parallelize. Thus, the P4's 128 bit dual FPU design will be kept quite busy, and
B) Has little data dependency. When crunching a stream of a few million matrices, there are no branches in sight. You can multiply out the whole matrix before you actually get a few dependencies during the adding stage. However, even those dependencies are very regular and easy to optimize. 70+% of the processing power used in a game is used for graphics, the bulk of that for geometry processing. Just the kind of thing P4 will excel at. When doing 3D rendering (especially preview) it is ALL graphics. Again, very good for 3D rendering. So what happens is that, in the end, the P4 is about equal to Athlon in regular apps, and extremely fast in 3D and other types of regular data. To me, that's a BIG advantage. To sum it up, these are the advantages and disadvantages of the P4.
Advantages:
-Its ALUs run at 3GHz
-SSE2 can crunch 4 32 bit numbers in each of 2 FPU pipes
Disadvantages:
-20 stage pipelines impact performance on many apps.
Net Result: A proc built for 3D. High core speed keeps it competitive with Athlon, and special architecture makes it a 3D screamer. What's not to love?
Now if those benchmarks were a test of Quake 3 framerate, then, I'm wrong, you're right. However I suspect it was a general benchmark like SPEC or Winstone, in which case those results are irrelevant.
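The data-independence point can be seen in a plain 4x4 matrix times vector transform, the basic step of geometry processing. This is a minimal Python sketch, not SSE2 code: the "lanes" are ordinary lists standing in for a 4-wide SIMD register, and all names are made up for illustration.

```python
def packed_mul(a, b):
    """Multiply four lanes at once, like one 4-wide SIMD multiply."""
    return [x * y for x, y in zip(a, b)]


def transform(matrix, vertex):
    """Transform one 4-component vertex by a 4x4 row-major matrix."""
    # Stage 1: four packed multiplies. None depends on another's result,
    # so they can be issued back to back with no branches in between.
    products = [packed_mul(row, vertex) for row in matrix]
    # Stage 2: the adds depend on stage 1, but always in the same regular pattern.
    return [sum(lanes) for lanes in products]


# Translation by (5, -2, 3) applied to the point (1, 2, 3).
translate = [
    [1, 0, 0, 5],
    [0, 1, 0, -2],
    [0, 0, 1, 3],
    [0, 0, 0, 1],
]
print(transform(translate, [1, 2, 3, 1]))
```

Real SIMD code usually works on matrix columns with a broadcast vertex component so the adds are packed too, but the dependency pattern is the same.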
Just some points that come to mind due to this story.
A) AMD is in serious trouble. It still has the sheer clock speed advantage, but doesn't have a next-gen architecture to compete with Willamette. Now if Willamette is delayed long enough, maybe they will be able to get their 64-bit Sledgehammer chip out in time, but that seems doubtful. The reason they should be so scared of Willamette is that it is truly a new architecture. Remember the transition from 486 to Pentium, and how the 60MHz Pentiums beat even 100MHz 486s? Well, this promises to be just as big. Not only does this signal the arrival of ultra-high bandwidth memory in the mainstream (RDRAM on the high end, DDR-SDRAM on the lower end), but the Willamette architecture boasts a number of improvements. Most importantly, the ALUs are clocked at twice the core speed, so you have your integer and FPU units running at 3+ GHz. This promises to be an even bigger jump than the switch from 1 integer unit in the 486 to 2 in the Pentium. Additionally it introduces new instructions, and it seems that the new instruction set idea is genuinely working, since SSE actually DOES help out a lot in apps like Photoshop and 3D renderers. AMD should be afraid, very afraid. (Buyer Tip: Don't buy a new computer before the Willamette comes out. I have a friend who purchased a Pentium 233 just before the PII came out, and I remember laughing at him for quite a while.)
B) Preemptive strike against those saying that new CPUs are useless. Go run 3D Studio on your Pentium-60 and then come back begging for forgiveness. The truth is that the bandwidth problem is just not that important for many applications. Additionally, bandwidth is getting a major shot in the arm with the coming of dual-channel RDRAM and DDR-SDRAM. (Buyer Tip: Get DDR-SDRAM if you're going anywhere near 3D. Latency is KING!)
C) Buyers beware. If AMD can't match the Willamette in sheer performance, we may again revert to a situation like the PPro era, when Intel's power proc was really expensive and, with no competition on the high end, they had no incentive to lower prices.
To be honest, in my own personal experience, I have had problems with AMD K6 processors. Granted, I realize that these are older and the K7 is a whole different story, but I cannot afford to take the risk of something not working right. Speaking strictly from my work and home experience, I have never had any problem with any Intel Pentium chips, starting at P133 and going up to PIII 667 and SMP PIII 550 setups. However, I accept that this is my experience and is not based in science at all.
AMD should get a big clap on the back for at least challenging Intel and doing a damn good job at it, but I will not become an AMD believer until I see how they do over the long run. Intel may not be pretty, but it has a long and strong track record. I've got nothing against AMD, but I'd like to see them mature a bit before I start putting my servers on the line in the name of AMD.
Cheers,
Matt
Don't take life so seriously; it isn't permanent.