Bug in Pentium III Xeon Processors
Doug Muth writes "There is an article in Wired that talks about a bug in Intel's Pentium III Xeon chips that causes crashes "when a system is pushed to its highest performance limit", whatever that is supposed to mean. Fortunately, the bug is only present in two specific variations of the chip, the 550 MHz versions that have either 512 K or 1 Meg of secondary cache. Intel is also working on a bugfix for the problem." Furthermore, the bug seems to be present only on Intel-brand motherboards (Saber). Intel has stopped shipping the board, but not the chip.
I hear it's an SMP bug and only shows up in 8-way SMP configurations. Something to do with the bus voltages exceeding usual limits, which suggests to me a termination problem, perhaps ringing on the bus lines? Of course, the general ineptitude of computer journalism means that decent technical info is hard to come by... Interesting that this was the same problem Cyrix had years ago with the 686.
I have a server system on a K6-2 400 (w/ Asus P5A) that currently has an uptime in excess of 130 days. And my workstation has had similar uptimes with its K6/233 (w/ FIC PA-2011). Even my AMD 486/100 could do that before the power supply in its case died. ;) Granted, the Pentium Pro and Alpha have similar histories, but I haven't been able to witness any of these problems you speak of with my AMD-based systems.
;)
Btw, I was going over the AMD K7 system building guide (the pdf) the other day, and noticed they had 2 things your friend may be interested in - a recommendation of going with no less than a 300 watt power supply and a video card compatibility list. Since all the reviews I've seen have remarked how stable all of the boards / chips are, I have a feeling it could be one of those causing the problem. If not, it's time to take advantage of a warranty.
Since it only shows up under high load in an 8-way system there is a large chance that there are almost no non-NT systems configured that way.
It may end up causing a BSOD on NT, and a panic on Unixish systems. It may cause just a plain lockup and the reporter assumed anything that crashes is a BSOD. It is easy to imagine the "bug" ends up loading the wrong thing into a cache line, which would upset any OS, or maybe it signals a non-correctable ECC failure, which a good OS will panic on and a bad one will ignore (a great one will log the error and, if the page it is on is clean, page it in from the backing store again; if dirty, kill that process or restart from the last checkpoint...)
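The "great OS" policy for a non-correctable ECC error described above could be sketched roughly like this. It is only an illustration of the decision logic: `Page`, `handle_uncorrectable_ecc`, and the action strings are hypothetical names, not any real kernel's API.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical description of the memory page hit by the ECC error."""
    dirty: bool                   # modified since it was last written to backing store
    has_checkpoint: bool = False  # owning process can be rolled back to a checkpoint


def handle_uncorrectable_ecc(page: Page, log: list) -> str:
    """Return the recovery action for a non-correctable ECC error on `page`."""
    # Always log the error first.
    log.append("uncorrectable ECC error")
    if not page.dirty:
        # A clean page still has an intact copy on disk, so page it in again.
        return "reload_from_backing_store"
    if page.has_checkpoint:
        # The dirty data is lost, but the process can restart from its checkpoint.
        return "restart_from_checkpoint"
    # Dirty and no checkpoint: kill the process that owns the page.
    return "kill_process"
```

Only the dirty-page case costs anybody any work; a clean page is recovered without the owning process ever noticing.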
> The microcode is stored (in encrypted form) in the BIOS flash ROM
And won't life be exciting when someone cracks the code and turns things over to the kipt scriddies.
Sheesh, evil *and* a jerk. -- Jade
I *think* this is related: According to C't in Germany, VIA are also having problems with their newest chipset. This is a case for the Babelfish - the article is in German. http://www.heise.de/newsticker/data/ciw-29.09.99-001/
Mielipiteet omiani - Opinions personal, facts suspect.
"There is no surer way to ruin a good discussion than to contaminate it with the facts."
1. It's faster.
/. readers know enough to make that informed choice properly. For a lot of us the choice is possible, because we don't go buying pre-assembled systems from the big names. When will Dell/Compaq/HP begin offering Athlon to the masses? And will Intel FUD triumph, or is this really a turning point for AMD? Unless something new hits the market very soon I see my Celeron 300A being replaced by an Athlon system very soon.
Actually lots of things are faster than a PIII... from the humble overclocked Celeron to the screaming AMD Athlon, the PIII isn't even 2nd best any more.
2. It's bang-for-buck.
Athlon again! PIII must be the most money you can spend on an x86-compatible CPU right now.
3. You get to upgrade your motherboard if you buy one.
Same "advantage" if you go Athlon.
4. Nobody ever got fired for buying Intel.
Sadly, this is one of the big reasons this also-ran might turn into a leader.
and the most compelling reason to buy a PIII is...
5. It has a bigger number on it than the PII.
I would guess that most
Actually, it is not an instruction, but an MSR write.
I have a stepping 2 Pentium Pro. I think I could software upgrade it to rev 3 or, if it exists, 4.
Actually, the steppings represent actual hardware steppings and not microcode versions AFAIK.
The microcode is stored (in encrypted form) in the BIOS flash ROM which is one of the reasons regular BIOS upgrades are a good idea.
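To see the stepping and the loaded microcode revision as two separate values, something like the sketch below works on a Linux x86 box. Assumptions: the input is in the /proc/cpuinfo layout, and a `microcode` line is present (as far as I know only newer kernels print it). The sample values are made up.

```python
def parse_cpuinfo(text: str) -> dict:
    """Return the stepping and microcode fields of the first CPU entry."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            if fields:
                break  # blank line after we found data: first CPU entry is done
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("stepping", "microcode"):
            fields[key.strip()] = value.strip()
    return fields


# Made-up sample in the /proc/cpuinfo layout; the values are illustrative only.
SAMPLE = "model name\t: Pentium Pro\nstepping\t: 2\nmicrocode\t: 0x1a\n\nstepping\t: 9\n"
print(parse_cpuinfo(SAMPLE))
```

On a real machine you would feed it `open("/proc/cpuinfo").read()` instead of `SAMPLE`. The stepping stays the same across BIOS upgrades; only the microcode revision can change.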
Intel has egg on its face yet again because one of its products has a bug in it. This is the best indication that AMD is doing well. Intel is unwilling to lose market share, so they put out parts as quickly as they can make them and don't test properly. This reminds me of a certain company in Redmond. Let's hope that processor microcode does not become field upgradable, or Intel will start releasing processors that have bugs (not show-stoppers, but minor flaws) to the public and we will have to wait for the first processor service pack to play the newest version of Quake.
Or Intel could just start competing with AMD honestly and consumers could benefit greatly. Of course, that doesn't help Intel's stockholders, does it?
From www.news.com:
The flaw crops up when 550-MHz Xeons, with either 512KB or 1MB of secondary cache memory, are used in an eight-processor server with a Saber motherboard, which was designed by Intel. The voltage from the processors in this scenario can exceed the recommended voltage limits and cause a server to go to "blue screen," or crash, according to Pijkper.
I note that the article says that a complete system crash is also called a "blue screen of death".
Does this bug appear under any OS other than NT? Does anyone else think this sounds more like a bug in NT than in the chip?
Geezus. I must have been up WAY too late and drank way too much last night.. My head is foggy.. I sat staring at the 'Bug in Pentium' headline and froze.. For a few minutes, I thought y'all had done a flashback to '94.
Think there's a correlation between the MS release schedule and Intel's bug schedule? There was the buggy 386-40 back in 88-89 when 3.1 came out, there was the P54D divide bug about when Win 95 was due to be released, and now the Xeon has gone screwy just in time for Win2K. There wasn't a chip failure for Windows 98 because it was nothing more than a relabelled copy of Win95.
Wintel conspiracy? Or is Intel attempting to undermine MS?
.sig: Now legally binding!