Slashdot Mirror


Major Linux/Athlon CPU bug discovered

GeorgeFrancisco writes "I recently installed the nVidia drivers so I could play TuxRacer on my Athlon. Problem is it kept inexplicably hanging Linux. Now I know why. The CPU bug affects Athlon/Duron/Athlon MP AGP users. Fortunately there's a way around it, and: "Alan [Cox] is going to try to add some kind of Athlon/AGP CPU bug detection code to the kernel so that it will be able to auto-downgrade to 4K pages when necessary." Read more on the Gentoo Linux site."

402 comments

  1. Could this be.... by dadragon · · Score: 1

    Could this be AMD's version of the bug in the original Pentium?

    It was bound to happen, everybody makes mistakes.

    --
    God save our Queen, and Heaven bless The Maple Leaf Forever!
    1. Re:Could this be.... by Anonymous Coward · · Score: 1, Interesting

      Bullshit. The same *precise* bug hit me running 2.4.* on a PIII 450.

      I reported it half a dozen times.

      Someone, somewhere doesn't give a shit.

    2. Re:Could this be.... by zeno_2 · · Score: 2, Interesting

      From just the story, it looks to me like the linux kernel could send the cpu pages that were bigger/smaller than 4k, and it would have a problem with that. His fix would automatically detect the bug and resize the info that is sent to the cpu to 4k.

      The original pentium bug had to do with the floating point processor on the chip, not with the size of page that was sent to the chip..

      Of course I could be wrong about all this =)

    3. Re:Could this be.... by Score+Whore · · Score: 1

      Bullshit yourself. Give us an in depth explanation of your experiences and the tools, methods and procedures you followed to determine that this precise bug bit you running 2.4.* on a PIII 450.

      The fact that you ran into problems while you think your system is using AGP transfers, doesn't mean it's the same problem. The fact is that a lot of OS software is substandard and built with only a partial understanding of the hardware that is being targetted. It's orders of magnitude more likely that you ran into a completely different problem in what you perceived to be a similar situation.

  2. how lame by Jeff+Probst · · Score: 0, Insightful

    just so he could play tux racer. why not just play the windows version?

  3. I noticed too by Fembot · · Score: 3, Interesting

    I noticed this too. It seems to only affect 3D games, mainly SDL-based ones such as Armagetron, but strangely it hasn't affected Quake 3 at all. Unreal Tournament was affected, but I SWEAR it didn't used to do that.

    1. Re:I noticed too by iriki · · Score: 1

      used to do what? playing Unreal Tournament? U f**ing quaker lamer =)

    2. Re:I noticed too by pangur · · Score: 1

      Maybe if you renamed all your game executables to 'quake.exe', that would fix it.

      Oh wait, that's the Pentium chip...

    3. Re:I noticed too by Einsdot · · Score: 1

      Exactly. I thought I was really dumb because linux doesn't usually "crash", though every time I tried to run tuxracer it would hang on me.

    4. Re:I noticed too by Anonymous Coward · · Score: 0

      No, that's ATI's drivers.

  4. I am pleased by bonzoesc · · Score: 1
    With this, UT in Linux will finally be a viable option for me! Three cheers for the kernel hackers!

    Aww, now I have to figure out how to install UMODs in Linux.

    1. Re:I am pleased by Anonymous Coward · · Score: 0

      http://umodpack.sourceforge.net

    2. Re:I am pleased by bonzoesc · · Score: 1

      Thank you both, linzeal and AC. I will now participate in Strangelove v2.

  5. Is this the same as the Win2k bug? by sprayNwipe · · Score: 4, Interesting

    There was a Win2k bug a while back that did the exact same thing, and you had to install a "LargePageMinimum" patch for it to not crash. Is this the Linux equivalent of that? And if so, how come it has taken so long to surface and fix?

    1. Re:Is this the same as the Win2k bug? by npietraniec · · Score: 1

      This isn't a linux equiv, it's a hardware problem. It affects Microslut too. Someone didn't read the story... But it's slashdotted now anyway, so you're excused ;)

    2. Re:Is this the same as the Win2k bug? by kilrogg · · Score: 5, Funny
      RTFA, AMD released a patch for w2k but never mentioned anything to the kernel developers.

      Instead of saying "oops, there's a hardware bug", they said, "oops, here's a patch for w2k". Looks like none of the kernel developers knew they had to look at w2k bug fixes to find out about hardware bugs.

    3. Re:Is this the same as the Win2k bug? by Anonymous Coward · · Score: 4, Redundant
      It's slashdotted. Here's the article:

      The bad news is that a major Athlon CPU bug has been discovered, and it affects Linux 2.4. Note that this is a bug in the actual CPU itself, and is not a Linux bug. However, it becomes our problem because there are very many semi-broken Athlon/Duron/Athlon MP CPUs out there.

      Here are the details. As you may know, x86 systems have traditionally managed memory using 4K pages. However, with the introduction of the Pentium processor, Intel added a new feature called extended paging, which allows 4MB pages to be used instead. Here's the problem -- many Athlon and Duron CPUs experience memory corruption when extended paging is used in conjunction with AGP. And, this problem hits us because Linux 2.4 kernels compiled with a Pentium-Classic or higher Processor family kernel configuration setting will automatically take advantage of extended paging (for kernel hackers out there, this is the X86_FEATURE_PSE constant defined in include/asm-i386/cpufeature.h.)

      Fortunately, there is a quick and easy fix for this problem. If you have been experiencing lockups on your Athlon, Duron or Athlon MP system when using AGP video, try passing the mem=nopentium option to your kernel (using GRUB or LILO) at boot-time. This tells Linux to go back to using 4K pages, avoiding this CPU bug. In addition, it should also be possible to avoid this problem by not using AGP on affected systems.

      As soon as I discovered that this CPU bug existed (which happened, unfortunately, because my CPU has the bug), I informed kernel hacker Andrew Morton of the issue; he put me in touch with Alan Cox. Alan is going to try to add some kind of Athlon/AGP CPU bug detection code to the kernel so that it will be able to auto-downgrade to 4K pages when necessary.
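For anyone who wants to see whether the workaround described in the article is in effect on their own box, here is a minimal sketch in Python. It assumes the usual Linux text formats of /proc/cpuinfo (a "flags" line listing CPU features, with "pse" meaning extended paging) and /proc/cmdline (the boot options). The function names are made up for illustration and are not part of any kernel tool, and the sketch says nothing about whether a given CPU actually has the bug.

```python
def has_cpu_flag(cpuinfo_text, flag):
    """Return True if `flag` is listed on a 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Compare whole tokens so that "pse36" is not mistaken for "pse".
        if sep and key.strip() == "flags" and flag in value.split():
            return True
    return False


def workaround_active(cmdline_text):
    """Return True if mem=nopentium appears as a boot option."""
    return "mem=nopentium" in cmdline_text.split()


def describe(cpuinfo_text, cmdline_text):
    """Summarise the paging-related state in one line (hypothetical helper)."""
    if workaround_active(cmdline_text):
        return "mem=nopentium is set: kernel was asked to use 4K pages"
    if has_cpu_flag(cpuinfo_text, "pse"):
        return "CPU advertises PSE and mem=nopentium is not set"
    return "CPU does not advertise PSE"
```

To try it on a live system, read the two files and pass their contents in, for example `describe(open("/proc/cpuinfo").read(), open("/proc/cmdline").read())`.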

      The unfortunate thing about this situation is that AMD and others have known of this bug since September 2000. In fact, AMD's CPG technical marketing division announced this bug on September 21, 2000 in a technical note entitled Microsoft Windows 2000 Patch for AGP Applications on AMD Athlon and AMD Duron Processors (Technical Note TN17 revision 1). And, the kind folks at AMD even created a simple patch for Windows 2000 that disables extended paging by tweaking the registry. However, apparently AMD didn't realize that Linux 2.4 also uses extended paging when the kernel is compiled with a Pentium-Classic or higher Processor family kernel configuration setting. And, it looks like no one in the Linux community noticed that this "Microsoft Windows 2000/AGP Athlon/Duron bug" also applied to Linux 2.4 systems, probably because it was presented by AMD technical marketing as just that -- a Windows 2000-related AGP bug. An unfortunate miscommunication, which has resulted in lots of problems for Athlon, Duron and Athlon MP users.

      Here's something that's even more unsettling -- consider what kind of Linux users actually use AGP. That's right -- desktop users. And in what area has Linux been struggling? Yes, the desktop. One wonders how many negative desktop Linux experiences have resulted from this unfortunate problem.

      I don't know if any particular party is to blame for this issue. After all, AMD did prominently announce this bug when it was discovered. But due to an apparently unfortunate series of events, us Linux people never benefitted from this knowledge. But Microsoft Windows 2000 and XP users did. Let's hope that all parties involved can keep things like this from happening in the future.

    4. Re:Is this the same as the Win2k bug? by Anonymous Coward · · Score: 0

      Uhhh, how the FUCK is the parent post flamebait? It's a god damned valid point. Of course, Slashdot dweebs are pro-AMD so OBVIOUSLY they are going to mod down opinions going counter to their own.

      I'll see you cocksmoking fucking whores in meta-moderation!

    5. Re:Is this the same as the Win2k bug? by Anonymous Coward · · Score: 0
      It affects Microslut too. Someone didn't read the story...

      The parent is talking about Microsoft's own solution to this Athlon/AGP bug. Someone didn't read the comment...

    6. Re:Is this the same as the Win2k bug? by Afrosheen · · Score: 0, Offtopic

      Hey moderators, don't waste your points modding this AC up to +5, that's a total of 6 points!! AC's don't get karma remember? I think a +1 or +2 would've sufficed...

    7. Re:Is this the same as the Win2k bug? by Anonymous Coward · · Score: 0

      Hey moderators, don't waste your points modding this AC up to +5, that's a total of 6 points!! AC's don't get karma remember? I think a +1 or +2 would've sufficed...

      Way to count, dumbass.

    8. Re:Is this the same as the Win2k bug? by Anonymous Coward · · Score: 0

      Sigh. Are you a troll or don't you understand it? Good postings deserve a high score, no matter who they're from. FYI, I only view the Score:5 posts ATM - I haven't got as much time as you do - why am I replying anyway?

    9. Re:Is this the same as the Win2k bug? by Anonymous Coward · · Score: 0

      These problems have kept me from being able to upgrade my kernel for almost a year now. Hallelujah!

    10. Re:Is this the same as the Win2k bug? by GigsVT · · Score: 2, Flamebait

      If this was discovered almost 2 years ago, then aren't any chips bought in the last couple years bug free?

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
    11. Re:Is this the same as the Win2k bug? by slycer · · Score: 1

      Yes,
      Or, they appear to be.

      I had this problem consistently when I was running an original Athlon 700.

      It has been rock solid since I upgraded to my 1ghz TBird

    12. Re:Is this the same as the Win2k bug? by fader · · Score: 1

      If this was discovered almost 2 years ago, then aren't any chips bought in the last couple years bug free?

      Nope. Unfortunately, I bought my machine about 6 months ago and I still got hit with this. I didn't have any problems at all until I upgraded to RH7.2, when my machine started locking up hard daily. I guess I owe the guys that built the RH kernel RPMs an apology :)

      --
      - fader
    13. Re:Is this the same as the Win2k bug? by jelle · · Score: 1

      I've had 1G TB systems sometimes lockup/reboot XFree86 overnight, while it ran the screensaver. You think it may be related?

      --
      --- Hindsight is 20/20, but walking backwards is not the answer.
    14. Re:Is this the same as the Win2k bug? by DeeKayWon · · Score: 5, Informative
      The only revision without the bug is the A5 stepping (CPUID 662) Athlon XP/MP/Mobile Athlon 4. See the Athlon model 4 revision guide and the Athlon model 6 revision guide, erratum 16.

      Basically, if you run "cat /proc/cpuinfo" and see these:

      cpu family: 6
      model : 6
      stepping : 2

      Then you should be safe.
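The check described above can be sketched in Python. This is only an illustration of the family/model/stepping comparison quoted in this comment (6/6/2), not an official AMD detection routine. The function names are hypothetical, and the extra check that the vendor is AuthenticAMD is an added assumption, since the same numbers on another vendor's CPU would mean something else.

```python
def parse_cpuinfo(text):
    """Parse the first processor block of /proc/cpuinfo text into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            if fields:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def matches_reported_safe_stepping(fields):
    """True if the CPU is AMD family 6, model 6, stepping 2 as quoted above."""
    # Assumption: restrict the comparison to AMD parts.
    if fields.get("vendor_id") != "AuthenticAMD":
        return False
    try:
        ident = (int(fields["cpu family"]),
                 int(fields["model"]),
                 int(fields["stepping"]))
    except (KeyError, ValueError):
        return False
    return ident == (6, 6, 2)
```

On a live system the input would be `parse_cpuinfo(open("/proc/cpuinfo").read())`.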

    15. Re:Is this the same as the Win2k bug? by MrResistor · · Score: 3, Interesting
      So, it's just the ones with the morgan/palomino core that are safe? Or am I reading this wrong.

      I have to say that this news is somewhat of a relief to me. My Athlon 700 has the bug and I've been going nuts recompiling kernels and nvidia drivers since I first tried to play tuxracer with my little brother christmas eve.

      On the upside, it finally motivated me to explore the guts of Linux a little more... :)

      --
      Under capitalism man exploits man. Under communism it's the other way around.
    16. Re:Is this the same as the Win2k bug? by evilpaul13 · · Score: 2, Informative

      Cool, that includes my Athlon XP which I picked up this week!

    17. Re:Is this the same as the Win2k bug? by Freija+Crescent · · Score: 1

      Aye... I noticed that my random lockups in Quake3 were cut to an absolute minimum after ditching my Athlon 700 for an AthlonXP 1900. (well, I've had one lockup in the last 2 months, and that was most likely caused by faulty power as my roommate dropped a box of books on the floor the instant the lockup occurred)

      I was just about to comment an argument to this whole post as I don't have these lockups anymore.

      My reasoning is that something else is bringing the kernel down while AGP use is high (at least on nvidia systems, which is all I have). I think I am going to continue with my plans to set up an online database on my webserver to try to sort this all out, should people STILL have lockup issues after using the workaround.

      I do know that using SBA (side band addressing) will re-introduce these problems. Also i switched Processors around the time i upgraded to kernel 2.4.16 from 2.4.12. Maybe something changed in the kernel, I'm currently testing 2.4.12 again to see if the lockups come back.

      Anyhoo.. here is my output of cat /proc/nv/card0

      velarious:/usr/work # cat /proc/nv/card0
      ----- Driver Info -----
      NVRM Version: NVIDIA NVdriver Kernel Module 1.0.2313 Tue Nov 27 12:01:24 PST 2001
      Compiled with: gcc version 2.95.3 20010315 (SuSE)
      ------ Card Info ------
      Model: GeForce2 MX/MX 400
      IRQ: 5
      Video BIOS: 03.11.01.26
      ------ AGP Info -------
      AGP status: Enabled
      AGP Driver: AGPGART
      Bridge: Generic Via
      SBA: Supported [disabled]
      FW: Unsupported [disabled]
      Rates: 4x 2x 1x [4x]
      Registers: 0x1f000207:0x00000104

      -fc
      .

      --
      . echo -e \\04 > /dev/hand1
    18. Re:Is this the same as the Win2k bug? by Skuld-Chan · · Score: 1

      Thats funny - I have an older athlon (socket A, ceramic package) that says that.

    19. Re:Is this the same as the Win2k bug? by DeeKayWon · · Score: 2

      Are you sure? The list of AMD identifications is in the AMD Processor recognition application note. See pages 20-21 of the PDF.

    20. Re:Is this the same as the Win2k bug? by Skuld-Chan · · Score: 1

      Yeah - it really says family 6, model 4 stepping 2 - best of all I must have bought this like 6~7 months ago - I even verified it in windows using wcpuid. It's a 1200 mhz athlon running in a A7V266-E. (used to be in an older rev A7V)

    21. Re:Is this the same as the Win2k bug? by swillden · · Score: 2

      Yeah - it really says family 6, model 4 stepping 2

      Then you have the problem, as do I. The first poster said that model *6* doesn't have the problem. You have model 4.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    22. Re:Is this the same as the Win2k bug? by Anonymous Coward · · Score: 0

      The primary purpose of moderation is not to increase someone's karma but to make good comments more visible and bad comments less visible. The karma is just a side effect to encourage people to write good comments. Many excellent comments are written by ACs and they stay at 0 score too often.

    23. Re:Is this the same as the Win2k bug? by Anonymous Coward · · Score: 0

      The parent post didn't read the article. Please moderate accordingly.

    24. Re:Is this the same as the Win2k bug? by Afrosheen · · Score: 1, Offtopic

      The primary purpose of moderation is not to increase someone's karma but to make good comments more visible and bad comments less visible. The karma is just a side effect to encourage people to write good comments. Many excellent comments are written by ACs and they stay at 0 score too often.

      I agree, and lately AC's have had a lot more to contribute than just links to goatse.cx, but modding higher than 2 or 3 is completely pointless.

    25. Re:Is this the same as the Win2k bug? by dtjohnson · · Score: 0

      I have an Athlon XP 1700+ purchased in November, 2001 that shows the following with the AMD Dos CPUINFO utility:

      AMD Athlon(TM) XP 1700+
      AuthenticAMD
      Model: 6
      Step: 2
      Family: 6
      L1 Data Cache: 64 KB
      L1 Inst Cache: 64 KB
      L2 Cache: 256 KB
      MMX Yes
      AMD Extended MMX Yes
      3DNow!(tm) Yes
      Extended 3DNow!(tm) Yes

      Looks like the Athlon XP series of processors basically do not have the bug.

    26. Re:Is this the same as the Win2k bug? by RiffRafff · · Score: 1

      Mine says:

      ----- Driver Info -----
      NVRM Version: 1.0-1512
      ------ Card Info ------
      Model: GeForce2 MX
      IRQ: 10
      Video BIOS: 03.11.00.18
      ------ AGP Info -------
      AGP status: Enabled
      AGP Driver: NVIDIA
      Bridge: Via Apollo Pro KT133
      SBA: Supported [disabled]
      FW: Unsupported [disabled]
      Rates: 2x 1x [2x]
      Registers: 0x1f000203:0x00000102

      Does this mean, since my AGP Driver is NVIDIA, not AGPGART, that I shouldn't have the problem?

      --
      "I might have made a tactical error in not going to a physician for 20 years." -- Warren Zevon
  6. For once Microsoft managed to fix it first by bob1000 · · Score: 2, Informative
    1. Re:For once Microsoft managed to fix it first by kilrogg · · Score: 4, Redundant

      Rather, AMD fixed it for microsoft, they made the w2k patch but didn't release a linux patch.

    2. Re:For once Microsoft managed to fix it first by Anonymous Coward · · Score: 0
      For once Microsoft managed to fix it first

      Read the article. AMD provided the fix, not Microsoft.

    3. Re:For once Microsoft managed to fix it first by bob1000 · · Score: 0, Troll

      The patch is a one line registry change to tell win2k not to use PSE. Not only was this problem made public over a year ago but the patch is open source. The linux developers could have easily incorporated it into the kernel but I think they just don't take linux on the desktop too seriously.

    4. Re:For once Microsoft managed to fix it first by Anonymous Coward · · Score: 2, Funny
      The patch is a one line registry change ... open source. The linux developers could have easily incorporated it into the kernel

      I don't know why those damn Linux developers just didn't fire up good ol' /sbin/regedit and fix the Linux registry.

    5. Re:For once Microsoft managed to fix it first by kilrogg · · Score: 2, Troll
      You expect the kernel developers to follow every windows bug and try to figure out if it's in fact a software or hardware bug? Fact is, AMD made this look like a windows bug, read it for yourself (it's over to the top right).

      To me, this looks like AMD doesn't give a rat's ass about its customers who use linux.

    6. Re:For once Microsoft managed to fix it first by nomadic · · Score: 1

      Why place all the blame on AMD? If you write pentium-optimized code, what's so surprising if it won't work exactly right on an AMD? Maybe the kernel coders should have caught this?

      OH NO QUICK MOD HIM DOWN HE CAN'T CRITICIZE LINUX HE JUST CAN'T

    7. Re:For once Microsoft managed to fix it first by bob1000 · · Score: 1
      To me, this looks like AMD doesn't give a rat's ass about its customers who use linux.

      There are several workarounds for buggy cpus and chipsets in the kernel now which could only have come from extensive debugging. The same holds true for identifying the bug under win2k: AMD and/or Microsoft noticed a problem and went about trying to identify and fix it. The fact that no effort (not even to look at the amd errata sheets) was spent on the Linux team even trying to identify the crashes makes them out to be the ones who don't give a rats ass.

    8. Re:For once Microsoft managed to fix it first by Anonymous Coward · · Score: 0

      Intel at least produces errata which list all of the minor ways that they know their processors don't work properly and which bugs are fixed at which microcode release.

      Alan Cox and other kernel hackers do read these documents. The question is if AMD documented this bug in their errata, or just fixed for Windows 2000 and figured that was good enough. If so, I have to commend them on their Microsoftish solution. Maybe the "XP" thing isn't just skin deep.

      As for AMD being pentium-compatible, they sure claimed 100% compatiblity.

    9. Re:For once Microsoft managed to fix it first by Skuto · · Score: 2, Informative

      >Why place all the blame on AMD? If you write
      >pentium-optimized code, what's so surprising if it
      >won't work exactly right on an AMD?

      It's got _nothing_ _whatsoever_ to do with Pentium optimized code. It's a new feature that both Intel and AMD cpu's support. Or in AMD's case, are supposed to support.

      --
      GCP

    10. Re:For once Microsoft managed to fix it first by Anonymous Coward · · Score: 1, Informative
      The question is if AMD documented this bug in their errata, or just fixed for Windows 2000 and figured that was good enough.

      AFAICT from AMD's Technical Resources, the patch is all there is. So AMD is in fact concealing the bug, trying to make it look like a tiny "registry problem".

      Sorry, but that was a bad move, guys. Much worse than the Pentium bug thingy, which was rather theoretical, anyway.

    11. Re:For once Microsoft managed to fix it first by soulsteal · · Score: 1
      So what you're saying is....


      They moved the headstones but left the bodies, didn't ya??

      You bastards moved the headstones but left the bodies, didn't ya? DIDN'T YA?


      ahem *cough*

    12. Re:For once Microsoft managed to fix it first by Reziac · · Score: 2
      Or for that matter, a rat's ass about ANY of their customers (this is copied from a rant I made elsewhere):

      This sort of bullshit is exactly why I've become adamantly a pure Intel user. I went round and round with AMD about a buggy CPU 2 years ago (Win95 setup would not run on that particular CPU), and they not only wouldn't admit to the problem (I had inside info, so I already KNEW, *positively*, about the bug in that production batch) but also refused to warranty the affected CPU.

      Conversely about that same time, Intel told me they'd be happy to replace that old P24T (83MHz 486-overdrive CPU) I've got with the nasty floating point bugs, if only they had anything in its class available to replace it with. (Who keeps 6 year old CPUs in stock??)

      Intel may not have owned up to their bugs right up front (who does??) but at least they didn't try to pass 'em off as an OS bug. And they eventually make good to affected customers, even those whose CPU is out of warranty. AMD, in my experience, couldn't care less. And that's why I don't buy AMD anymore.

      --
      ~REZ~ #43301. Who'd fake being me anyway?
    13. Re:For once Microsoft managed to fix it first by IronChef · · Score: 2


      But, in the end, you didn't get anything concrete from Intel either. It sounds like they blew sunshine up your skirt instead.

  7. Nice write-up. by Wakko+Warner · · Score: 1

    Now, since gentoo's well and truly dead (thanks to slashdot), can someone explain the bug and the workaround for us Athlon users?

    - A.P

    --
    "Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
    1. Re:Nice write-up. by bob1000 · · Score: 2, Informative

      Add 'mem=nopentium' to your lilo/grub/whatever bootup or compile the kernel for i386 to avoid extended cpu operations. The fault is something in the page size extension and agp, which is strange because I thought agp would be more of a chipset issue than a processor one.

    2. Re:Nice write-up. by Anonymous Coward · · Score: 0

      Turns out I won't even need to reboot:

      "I was able to work around this problem by typing 'export __GL_SINGLE_THREADED="yes"' before starting the program from the shell."

  8. And we were blaming the NVIDIA drivers... by npietraniec · · Score: 2, Informative

    It really shows up if you use the pre-empt kernel patch. Ever since I added the workaround, things have been pretty solid. (not that it's been that long)

    1. Re:And we were blaming the NVIDIA drivers... by Anonymous Coward · · Score: 0

      Moderate this up, it's a valid complaint, guys...

    2. Re:And we were blaming the NVIDIA drivers... by Dimensio · · Score: 2

      Er, added what workaround? I've been curious about a few locks I've had in Tribes 2 -- though I don't know for certain if this bug was the culprit.

  9. Re:fp by Anonymous Coward · · Score: 0, Funny

    The repetition of the elegant, almost minimalistic mantra, "fp," brings a warm, subtle shade of meaning to this first post. Truly a masterful work by a brilliant though little known artist. It seems a hot new contender for best original Slashdot content, a category previously dominated by the broad, mellow flavor of the goatse.cx link.

  10. old news again by Afrosheen · · Score: 2, Interesting

    I guess it takes awhile to pile through the submissions. This was posted on pclinuxonline.com recently.

    1. Re:old news again by cymen · · Score: 2, Offtopic

      I guess it takes awhile to pile through the submissions. This was posted on pclinuxonline.com recently.

      Wow... That takes the cake. It's bad enough to bitch about deja vu reposts from /. itself, no need to bitch about reposts of stories at other sites. If you can't see the reasons why please bite yourself. I can't wait until it is unhip to be aloof.

    2. Re:old news again by Anonymous Coward · · Score: 0
      I guess it takes awhile to pile through the submissions. This was posted on pclinuxonline.com recently.
      I'm sorry, is pclinuxonline.com the same site as slashdot.org? Do you think maybe it's possible that some Slashdot readers don't read pclinuxonline regularly? It probably is possible, since I've never even heard of the site. Considering only 78 people have read the article on pclinuxonline at this point, I would conjecture that that's quite possibly the case for an overwhelming majority of the Slashdot readership.

      But maybe your complaint wasn't that it was posted on another site somewhere on the Internet. Maybe your complaint was, as it states in your subject line, that it was "old news." Well, the article on pclinuxonline was posted Sunday afternoon, and the Slashdot article was posted Sunday night. Doesn't really sound like "old news" to me.

      So, pardon my rudeness, but shut the fuck up!

    3. Re:old news again by Anonymous Coward · · Score: 0

      The thing that got me was that I submitted it and it got rejected, then it turns up again hours later. I can't wait until it is commonplace for slashdot to accept submissions.

  11. Hmmm... by TheQuantumShift · · Score: 0, Troll

    Windows=Security Bugs, huge pricetag.

    Linux=Performance Bugs, no pricetag.

    Gee, wonder which I'll choose...

    --

    Shift happens. Fire it up.
    1. Re:Hmmm... by grytpype · · Score: 1, Flamebait

      It's not a Linux bug, troll, it's a bug in the Athlon chip itself. Read the damned article.

      --

      - Have a picture

    2. Re:Hmmm... by Anonymous Coward · · Score: 0

      Geeeeeee, I wonder if you can read?

      Windows=Security Bugs, huge pricetags
      AMD=Performance Bugs, not so small pricetag
      Linux=Solution and patch, no pricetag

      Geeeeeee, I wonder....

    3. Re:Hmmm... by TheQuantumShift · · Score: 1

      yeah, that was a direct attack on linux. what the hell are you idiots smoking....

      --

      Shift happens. Fire it up.
  12. NO AMD BASHING by Perdo · · Score: 0, Flamebait

    "AMD did prominently announce this bug when it was discovered."

    Keep your tuxedos on. It's not like they are recalling processors, as Intel had to do with the 1.13 because it would not compile the linux kernel. Or recalling a million motherboards for memory translation hub issues. Or forcing us to live with a floating point math error and then doing their best to cover it up. Or running around as the market leader yelling that the sky is falling, causing a slump in the tech sector, because they were faced with a little competition for the first time. And their 64 bit processor should run 32 bit apps faster than a 200MHz Pentium Pro.

    --

    If voting were effective, it would be illegal by now.

    1. Re:NO AMD BASHING by Afrosheen · · Score: 2

      From what I've seen with amd motherboards (granted, this isn't amd's fault), half of those damn via chipset discount boards should be bonfired. The worst agp implementation ever seems to rear its ugly head only in linux.

    2. Re:NO AMD BASHING by Ryan+Amos · · Score: 3, Interesting

      VIA does make some complete crap, but they also make some nice chipsets. The KT266A is very nice, it's the fastest DDR implementation out there by far. But still, VIA chipsets are a good bit cheaper than the Intel equivalent, and while the Intel chipset may be more stable, the VIA one is almost always faster. And even Intel has issues with chipset stability, it's just that they ignore them and only quietly replace the faulty boards when they're returned under warranty. You know how it goes in the computer industry... Faster, cheaper, or more stable- pick any two.

    3. Re:NO AMD BASHING by NanoGator · · Score: 4, Insightful

      AMD didn't turn interesting until the Athlon came out. The previous versions of its processors were decidedly inferior. This is *worse* than recalling for a bad, rarely used function call. I can't take a processor back 6 months after I bought it because it sucks, but I can get it replaced if it has a bona-fide bug.

      If this is a bug in the processor, AMD really should fix it and offer replacement processors to those who need it. If they don't, and they expect you to patch your OS instead, then that definitely shakes my faith in that company. When you're an artist dependent on OpenGL, you can't have problems like this.

      And finally...

      Why are you worried about running 32-bit code on a 64-bit processor? 64-bit processors are supposed to run 64-bit code. Intel's not marketing 64-bit processors to replace desktop computers (today), they're for servers and high-end graphics with custom code. They don't NEED to run 32-bit code. I hardly think that's a point against Intel, especially considering they don't make it a big secret that 32-bit code runs slower on it.

      --
      "Derp de derp."
    4. Re:NO AMD BASHING by Perdo · · Score: 3, Interesting

      AMD doesn't keep tabs on VIA and VIA doesn't keep tabs on motherboard manufacturers. The only decent AMD motherboards are from manufacturers trying to compete in the enthusiast market where crap boards just don't sell. Combined with VIA actually being in competition with AMD in the budget processor market (The Cyrix) delaying a decent integrated chipset for the duron and VIA bullying motherboard manufacturers into not producing The SIS 735 chipset, VIA is not AMD's best friend.

      AMD chipsets:

      Nforce 220,420
      AMD-760MPX,760MP,760
      ALi MAGiK 1,MAGiK 2
      SIS 735,745,746,755
      VIA KT266A,KT133A,KM133,KLE133,KT333,K8HTB

      STABLE (100+Days,Linux) Chipsets:

      760,KT133A,735,760mp

      Good Motherboard Manufacturers:

      Asus,Abit,Iwill,ECS,Epox,Soyo

      Personal Best Uptime 135 days, Iwill KK266 (KT133A), Power supply failure

      --

      If voting were effective, it would be illegal by now.

    5. Re:NO AMD BASHING by spauldo · · Score: 5, Informative
      Why are you worried about running 32-bit code on a 64-bit processor?

      Just as an aside, if you ever deal with ultrasparcs, you'll quickly find that the majority of the code used is 32 bit.

      The reason for it is simple; most applications will run slower at 64 bit than at 32 bit. The ultrasparc chips were designed to take this into account. Hell, due to a firmware bug, solaris on my ultra 1 installs as a 32 bit kernel by default - and runs no slower because of it (although it can't run 64 bit apps that way). After a firmware patch, it is easy to change to running the 64 bit kernel though.

      In all reality, why would most apps need 64 bit integers and whatnot? Most don't, and doing so is a waste of memory. If the processor is designed right, it can handle 32 bit code with no problems whatsoever.

      --
      Those who can't do, teach. Those who can't teach either, do tech support.
    6. Re:NO AMD BASHING by Anonymous Coward · · Score: 3, Interesting

      Not like they are recalling processors like Intel
      -----

      Oh great, so they make defective processors, but don't worry because they won't recall them! How in the hell does that make them better than Intel?

      Think about it -- If you own an affected part a recall is GOOD!

    7. Re:NO AMD BASHING by Anonymous Coward · · Score: 0

      If this is a bug in the processor, AMD really should fix it and offer replacement processors to those who need it.

      This would be a possibility if

      a) it were really a major problem - as others have mentioned it is easily turned off in software

      b) it affected enough people to matter. Sorry, but "Linux users" do not, and never will, number enough to matter.

      Please remember that for the most part companies exist to make money, not to help you live in a dreamworld where software costs nothing, Russians are allowed to hack anything they want, and Kevin roams free.

    8. Re:NO AMD BASHING by BJH · · Score: 1

      Hmm... your definition of "good motherboard manufacturers" seems to be quite different from mine. There's only two manufacturers I would recommend every time, and that's Supermicro and Tyan.

    9. Re:NO AMD BASHING by Anonymous Coward · · Score: 0

      *Now* it's not a major problem, but it sure as hell *was* a major problem if you encountered this (like I did) and couldn't figure out what was wrong. Everyone I went to for help told me it was most likely a bug in NVdriver. Ha!

    10. Re:NO AMD BASHING by Carrot007 · · Score: 1

      If that is the case, and I see no reason why it is not, then surely it only applies to backward compatible 32-bit code and not the 64-bit chip specific 32-bit code.

      Am I the only person here who sees x86 compatibility in 64-bit chips as a MAJOR BAD POINT? Hell, the chip should never have gone 32-bit, it's a complete pile of poo.

      Carrot007.

      --
      +----------------- | What is the question!
    11. Re:NO AMD BASHING by mikera · · Score: 4, Interesting

      I've lost count of the number of times I wanted 64-bit integers, in pretty general purpose apps.

      Not because I do big databases or suchlike, but they let you do loads of optimisations that wouldn't otherwise be possible. For example, you can pass around 8-byte structures in a single register, which is damn useful given the lack of available registers in the x86 architecture.

      Example: I've recently been coding a large hexagonal grid component. Each point in the grid is indexed by 2 32-bit (x,y) integers. With a 64-bit register, you could put a full co-ordinate into a single register.

      Why is this useful? Well, one of my requirements was to be able to manage large sets of co-ordinates (think reachable spaces for an AI). You want to be able to combine sets of co-ordinates, which basically requires merging two lists. In order to merge lists efficiently, you need to sort them. And with the 64-bit representation, you can do this with just one subtraction and one branch rather than a combination of two subtracts and two branches. This is a definite speedup if you are hand-coding, and possibly an even bigger one if your compiler doesn't inline all the 32-bit code properly.
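
      To make the packed-coordinate trick concrete, here is a minimal sketch in Python. The post itself contains no code, so the language and the names `pack_xy`, `unpack_xy` and `merge_sorted` are assumptions for illustration, and coordinates are assumed to be unsigned 32-bit values. Python integers are arbitrary precision, so this only models what a single 64-bit register would hold.

```python
MASK32 = 0xFFFFFFFF

def pack_xy(x, y):
    """Pack two unsigned 32-bit coordinates into one 64-bit key (x in the high half)."""
    if not (0 <= x <= MASK32 and 0 <= y <= MASK32):
        raise ValueError("coordinates must fit in 32 unsigned bits")
    return (x << 32) | y

def unpack_xy(key):
    """Recover the (x, y) pair from a packed 64-bit key."""
    return key >> 32, key & MASK32

def merge_sorted(a, b):
    """Merge two sorted, duplicate-free lists of packed keys into their sorted union."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        # One comparison of packed keys stands in for separate x and y comparisons.
        if a[i] < b[j]:
            out.append(a[i])
            i += 1
        elif a[i] > b[j]:
            out.append(b[j])
            j += 1
        else:
            out.append(a[i])
            i += 1
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out
```

      With x in the high half, comparing two packed keys orders points by x and then by y, which is why the merge loop needs only one comparison per step.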

      Other example: 32-bits are large enough for most integer applications (you couldn't enumerate all the people on the planet though....) but they tend to fall down when you multiply, e.g. 100,000 * 100,000 has already blown the 32-bit limit, and neither of those are particularly big numbers. Whenever you start doing a reasonable amount of multiplication, 64-bit becomes useful.
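
      The multiplication example can be checked directly. The sketch below is an illustration only: `wrap_int32` is a hypothetical helper that mimics what a signed 32-bit two's-complement register would keep of a result, since Python integers never overflow on their own.

```python
def wrap_int32(value):
    """Reduce an integer to what a signed 32-bit two's-complement register would hold."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value

exact = 100_000 * 100_000        # 10_000_000_000, the mathematically correct product
truncated = wrap_int32(exact)    # what survives in 32 bits
fits_in_64 = exact < 2**63       # a signed 64-bit register holds it comfortably
```

      The exact product needs 34 bits, so a 32-bit register silently keeps only the low part, while a 64-bit one has room to spare.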

      Also, 64-bits is big enough to encode the positions of pieces on a chess board. You can use bitwise logic to analyse and store positions. GNU chess certainly does it this way. I expect a *considerable* speedup in the top chess-playing algorithms when 64-bit becomes widespread.
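
      For readers who haven't seen the chess trick, here is a rough sketch of the idea: one bit per square, so a whole set of pieces is one 64-bit word. This is not taken from GNU chess or any other engine; the square numbering (a1 = bit 0, h8 = bit 63) and the function names are assumptions made for the example.

```python
FULL = 0xFFFFFFFFFFFFFFFF          # keep results inside 64 bits
NOT_FILE_A = 0xFEFEFEFEFEFEFEFE    # every square except the a-file
NOT_FILE_H = 0x7F7F7F7F7F7F7F7F    # every square except the h-file

def square(file, rank):
    """Bit index of a square; file and rank run 0-7 (a1 = 0, h8 = 63)."""
    return rank * 8 + file

def white_pawn_attacks(pawns):
    """Squares attacked by all white pawns in `pawns`, computed in two shifts."""
    left = (pawns & NOT_FILE_A) << 7    # capture towards the a-file
    right = (pawns & NOT_FILE_H) << 9   # capture towards the h-file
    return (left | right) & FULL
```

      The file masks stop edge pawns from wrapping around to the other side of the board. On a 64-bit CPU each shift and mask is a single register operation; on 32-bit hardware every one of them has to be done in two halves.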

      I'm really keen to see 256-bit arrive to be honest, 2^(2^3) has more elegance than 2^(2*3) and it would allow you to store a set of bytes in one register. Would allow some very cool text-processing tricks.

      Course, it might never happen - I predict a move towards massively parallel 64-bit computers rather than stonking 256-bit ones as the next major evolution in processor power.

    12. Re:NO AMD BASHING by billcopc · · Score: 2, Insightful

      I'll second that : Iwill boards are consistently better than average in terms of both performance and stability. Abit sucks ass though, they try to push things too far and forget that a super-overclocked machine that hangs every hour isn't worth shit.

      ECS are off to a very impressive start with the K7S5A board. Using the SIS 735 chipset, it is unsurpassed in reliability and offers very decent performance as well. Overclocking isn't its strong point, but at a mere 65$ price tag you can invest the money saved on a faster CPU.

      (no I'm not sponsored by ECS, I just hate my Abit KT7-Raid and am jealous of all my friends who have the ECS board)

      --
      -Billco, Fnarg.com
    13. Re:NO AMD BASHING by Anonymous Coward · · Score: 0

      The AMD 386-40MHz was quite interesting, and I would certainly not call that one inferior compared to Intel's own 386 (which was max. 33MHz at that time). Same goes for the 486DX2-80.

    14. Re:NO AMD BASHING by Tower · · Score: 1

      Well, after a bunch of problems with Supermicro P5Es, my list ended up more like: ASUS, Abit, Tyan.

      I personally haven't tried Epox, avoid IWill and ECS, and will stay away from Soyo boards until the day I leave this Earth... not that I'm bitter about any experience with Soyo, but one can only take so much jerking around...

      --
      "It's tough to be bilingual when you get hit in the head."
    15. Re:NO AMD BASHING by UnknownSoldier · · Score: 1

      > In all reality, why would most apps need 64 bit integers and whatnot?

      If you're arguing "apps don't need more than 64 bit registers", then I guess you're not a 3D (graphics or geometry) programmer to see the uses ;-(

      3D apps can make use of 128-bit (16 bytes) integer registers. You can pack 4 floating point numbers (4 bytes each) into one register (one of the things the PS2 does right).
      e.g. Reading (or storing) quad words in one shot, doing a dot product, or cross product, parallel add, etc, are nice and neat with large 128 bit integers / registers.
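
      As a rough illustration of the sizes involved, the sketch below packs four single-precision floats into 16 bytes and takes a dot product of two such vectors. It is plain Python using the standard `struct` module, not real SIMD code; the names `pack_vec4` and `dot4` are made up for the example.

```python
import struct

def pack_vec4(x, y, z, w):
    """Pack four 32-bit floats into 16 bytes, the size of one 128-bit register."""
    return struct.pack("<4f", x, y, z, w)

def dot4(a, b):
    """Dot product of two packed vectors, done lane by lane in software."""
    lanes_a = struct.unpack("<4f", a)
    lanes_b = struct.unpack("<4f", b)
    return sum(p * q for p, q in zip(lanes_a, lanes_b))
```

      Hardware with 128-bit registers can load all four lanes in one access and multiply them in parallel, which is the benefit being described; the loop here only shows the data layout.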

      Cheers

    16. Re:NO AMD BASHING by jaavaaguru · · Score: 1

      but I can get it replaced if it has a bona-fide bug

      ...but you can't take your motherboard back to shop x and say "Um... my processor didn't fit" when you bought it specially for the certain processor that had to be returned. Having just bought two Athlon MP chips and an Athlon MP specific motherboard, I would hate to have to return any of it.

      I think it is more important that we help with the fix rather than spend time arguing about whose fault it is and why nothing was done sooner.

      Would we be complaining if it caused problems with QNX? (hehe well some people might). It's not the hardware manufacturer's place to test their product with all software that's on the market. It is their place to release accurate specs for their product so that software producers can work from these.

      This particular problem just goes to demonstrate the problems caused by trying to make everything backwards compatible. It's my guess that if AMD had made a new 64-bit chip from scratch instead of making a faster 32-bit one, then this wouldn't have happened (but sure enough some other bugs would creep up).

    17. Re:NO AMD BASHING by spauldo · · Score: 1
      Ah, no, I didn't say that no apps need 64 bit integers, just that most don't. Certainly there are cases where 64 bit or higher processing would be much better, but for most tasks they're not necessary.

      All I was saying is that some systems, solaris being my example, only use 64 bit applications when it's best to - otherwise the majority of the system is 32 bit. I can understand pushing 64 bit performance, but 32 bit performance is important too. Then again, I don't design chips :)

      --
      Those who can't do, teach. Those who can't teach either, do tech support.
    18. Re:NO AMD BASHING by spitzcor · · Score: 1

      You wrote: "AMD really should fix it and offer replacement processors to those who need it"

      Surely, you must be joking. If a simple software workaround is all you need to get going, then there is absolutely no reason to spend millions of dollars replacing chips they have already shipped.

      I'm sure that AMD will fix this bug in their next Rev (if they haven't already).

      There are tons of bugs in everything. No one is perfect. But think about it this way. This is a MINOR bug. You will never get the "wrong answer" when a simple workaround is in place. You will never have to make a critical decision based upon bad data. You just need more TLB misses to get there. Minor, minor, minor.

      Maybe AMD should have communicated this problem a little better. I'm sure they have tried to learn from the huge Intel debacle from the first Pentium. But there, you could get the "wrong answer".

      -spitzcor

    19. Re:NO AMD BASHING by jelle · · Score: 2, Insightful

      "when you multiply, e.g. 100,000 * 100,000"

      When you multiply 2 32-bit numbers and really need the full precision of the 64-bit result, yes, then you need some 64-bit registers. However, that does not mean you need to have a multiply instruction that accepts 64-bit inputs. Also, often you don't need more than 32 bits of the result. In that case a barrel shifter in the chip right after the multiplier would already give you what you want without needing the large and slow 64x64 multiplier in the chip.

      On DSPs, you can often choose between 'integer mode' and 'fixed point mode'. In the former case they mean integer input values just like the CPU has, and in the latter case they mean values in the range [-1,1), which places the decimal point 31 bits more towards the LSB. In 'fixed point mode', it's intuitively easier to stick with 32 bit precision if more precision is not needed.

      Additionally, DSPs have 'MAC' instructions: "accum_out = accum_in + (in1*in2)". Often, the number of bits in the 'accum' registers is larger than the number of bits in the 'in1' and 'in2' multiply inputs. A 16-bit DSP often (always?) has at least 32 bit wide 'accum' registers, often more than that, with up to 4 or 8 overflow bits in some cases. You need the overflow bits when you use the MAC instruction repeatedly (which is done often in typical DSP algorithms). With 4 overflow bits, you can use the MAC instruction 1<<4=16 times and be guaranteed you'll never overflow 'accum'.
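
      The overflow-bit guarantee is easy to check numerically. The sketch below is a model, not code for any particular DSP: it assumes signed 32-bit products accumulated into a signed register of 32 + 4 = 36 bits, and `mac_fits` is a hypothetical helper name.

```python
def mac_fits(products, acc_bits):
    """True if summing `products` never leaves a signed accumulator of `acc_bits` bits."""
    lo = -(1 << (acc_bits - 1))
    hi = (1 << (acc_bits - 1)) - 1
    acc = 0
    for p in products:
        acc += p                     # one MAC step: accum = accum + (in1 * in2)
        if not lo <= acc <= hi:
            return False
    return True

PRODUCT_BITS = 32
GUARD_BITS = 4
ACC_BITS = PRODUCT_BITS + GUARD_BITS
WORST_POS = (1 << (PRODUCT_BITS - 1)) - 1   # largest positive 32-bit product
WORST_NEG = -(1 << (PRODUCT_BITS - 1))      # most negative 32-bit product
```

      Under these assumptions sixteen worst-case products still fit in the 36-bit accumulator, a seventeenth can overflow it, and without the extra bits even two products can overflow a plain 32-bit accumulator.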

      Personally, I'd more prefer the CPUs to get more DSP features than a simple increase of 'bits'.

      --
      --- Hindsight is 20/20, but walking backwards is not the answer.
    20. Re:NO AMD BASHING by mz001b · · Score: 3, Insightful

      As someone pointed out elsewhere, this would make the processors too expensive, if the vendor had to ship replacement processors each time a bug was found. Lots of bugs exist in processors, and typically they are fixed with each new stepping. Look at /proc/cpuinfo and see how many bugs it checks for (fdiv_bug, hlt_bug, f00f_bug, coma_bug on my system). This bug will probably be just another line. There is a simple workaround for it too, so it is not that bad. The real problem (as many people state) is that AMD did not inform the kernel developers about this problem long ago, so a fix could already be implemented.
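
      If you want to pull those bug lines out programmatically, something like the following sketch would do it. It assumes the "name_bug : yes/no" layout implied by the field names above; the sample text is made up for illustration, and other kernel versions may lay the file out differently.

```python
def parse_bug_fields(cpuinfo_text):
    """Collect 'xxx_bug : yes/no' lines from /proc/cpuinfo-style text into a dict."""
    bugs = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key.endswith("_bug"):
            bugs[key] = value.strip().lower() == "yes"
    return bugs

# Illustrative sample only, not copied from a real machine.
SAMPLE = """\
processor\t: 0
vendor_id\t: AuthenticAMD
fdiv_bug\t: no
hlt_bug\t\t: no
f00f_bug\t: no
coma_bug\t: no
"""
```

      On a Linux box you would feed it `open("/proc/cpuinfo").read()` instead of the sample string.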

    21. Re:NO AMD BASHING by Ryan+Amos · · Score: 2

      I agree with most of this. A lot of having a stable system comes from paying $30 more for a decent motherboard. Also, the AMD market tends to be oversaturated with commodity memory. While the Intel side of things tends to use rambus, which is all pretty decent quality, most non-DDR RAM people buy for AMD machines is just crap. The thing memory affects the most is-- you guessed it-- system stability.

    22. Re:NO AMD BASHING by Perdo · · Score: 2

      It would be so nice if ServerWorks made an AMD chipset. Imagine what they could do with hypertransport bus and if they implemented their quad channel SDRAM in a DDR solution. Finally there would be a truly stable enterprise class chipset available for AMD. They could probably even properly implement USB (MPX satire).

      --

      If voting were effective, it would be illegal by now.

  13. Don't think so by Metrollica · · Score: 2, Informative

    I don't think so. AMD reverse engineered the x86 and made their own implementation without Intel's crap in it.

    AMD's version of the x86 that is in the Athlon and the Duron runs faster than Intel's chips because of this reverse engineering.

    This bug could be a problem of reverse engineering the x86. It doesn't say Intel's chips have the problem.

    --



    --Metrollica
    1. Re:Don't think so by Anonymous Coward · · Score: 0

      Sorry Mr. mis-informative. AMD pays for loads of licences from Intel, and AMD CPUs are by no means "reverse-engineered". Of course they add their own tech too, which is why they get different results.

    2. Re:Don't think so by athlon02 · · Score: 1

      Without replying at the bottom of the thread... Do you really need to see links to believe AMD licenses x86? Intel could get AMD legally for reverse engineering x86 and selling their own. AMD has licensed x86 for a while and quite obviously except for new things like SSE and all, AMD can go on previous Intel documents and their own previous processors to build new ones without having to "reverse engineer" it.

  14. More info? by ChrisJones · · Score: 1

    Is there more information about this bug anywhere? I'd like to know if it affects the Athlon XP I upgraded to in the last couple of months.
    I had a Duron before which was very unstable with the nVidia driver AGP support enabled, but I've had a couple of crashes with the AGP support enabled on the Athlon XP - if I disable the AGP support it runs rock solid (current uptime is 8 days with GL screensavers and various GL apps having run for hours and hours of that 8 days).
    I hope it's bugs in the drivers/agpgart and not the CPU - if AMD knew about this in 2000 the Duron I bought shouldn't even have had it, let alone my new Athlon XP.
    More info (specifically, which CPUs it affects) would be really good. Any takers?

    --
    Chris "Ng" Jones
    cmsj@tenshu.net
    www.tenshu.net
    1. Re:More info? by larien · · Score: 2

      I'd also like to know if I'm affected or not; I've been getting some hangs on starting X (the system locks up with the NVidia logo on screen) and I'd like to know if this is related...

    2. Re:More info? by Sadfsdaf · · Score: 2, Informative

      Disable Fast AGP write (AGP Turbo?) in your BIOS.

      Read the manual. http://205.158.109.140/XFree86_40/1.0-2313/README.txt

    3. Re:More info? by larien · · Score: 2

      That seems to be Ali specific; my mboard is an Asus (and so is the GF3, FWIW). However, thanks for the pointer, I'll give it a try.

    4. Re:More info? by zudo · · Score: 1

      My old athlon slotA 500 in an asus K7M motherboard with a geforce256 has the same problem. Random hangs on or just after the nvidia flash screen. I also get weird green artefacts at the top of the screen for a few seconds when starting x, even when it doesn't hang.

      My first thought on seeing this article was that this must be related, but there shouldn't be any agp going on (afaik) at startup, so maybe this is a different problem. Only happens with the nvidia drivers though.

    5. Re:More info? by Anonymous Coward · · Score: 0

      Nvidia's drivers have just been utter garbage for the past few months.
      In order to squeeze out more performance, they've been removing sanity checking that they believed wasn't needed. They were wrong.

    6. Re:More info? by Anonymous Coward · · Score: 0

      I get weird green artifacts too, but only for a fraction of a second. Everything but OpenGL works fine for me.

    7. Re:More info? by zudo · · Score: 1

      Ermmm, yeah, mine is probably more like a fraction of a second than a few seconds actually. OpenGL works on my box though.

    8. Re:More info? by Jucius+Maximus · · Score: 1
      They are saying that it affects the Duron, Athlon and Athlon MP ... no word yet on whether or not the Athlon XP is affected as far as I can see.

      But let me tell you this ... I have win2k with a Geforce2 and occasionally have crashes in unreal tournament but based on the error messages, they are attributed to directx (which may in turn be having memory page problems.) Perhaps I will apply this patch and see if anything works better. I could never get tuxracer to run in the first place though since my linux skills are still relatively lame.

    9. Re:More info? by ChrisJones · · Score: 1

      I went to AMD's site and into the tech section for the Athlon XP and it contained a link to the Win2k patch, so I guess the bug is still there :(

      What I also want to know is if adding the "nopentium" kernel option will disable anything other than 4mb pages - I'd rather not lose a whole gamut of optimisations just because of one bug.

      --
      Chris "Ng" Jones
      cmsj@tenshu.net
      www.tenshu.net
    10. Re:More info? by Enigma2175 · · Score: 2

      Dammit, close your AOL tag, you just AOLed the rest of the page for me! I will close it now for future reading.

      --

      Enigma

    11. Re:More info? by Anonymous Coward · · Score: 0

      I too had problems with lockups in 3d openGL applications. After much web searching (and I'm sorry, I don't have the original URL) I found a fix some user posted on an obscure gaming site.
      Basically it instructs Xfree to use the nvidia gart rather than the XFree gart, from what was said on the site. I applied the "patch" (see below) and haven't had a lockup since.
      I don't know how this may be related to this problem, neither do I know whether my Duron 800 has the bug spoken about here.

      I'm using Mandrake 8.1. I opened the XF86Config-4 file in /etc/X11 and added the following line at the end of the "Device" section for the Nvidia Geforce I'm using:

      Option "NvAgp" "2"

      then saved and started X. Since then I have experienced not one lockup; beforehand it was at least once per session and forced me to hit the reset button.

      I hope this helps. It may not be a catch-all solution for everyone but sure made things work on my system. I'm willing to discuss it and have included my email.

      Andy G

      agroz@2zNOSPAM.net

    12. Re:More info? by MrResistor · · Score: 2
      I went to AMD's site and into the tech section for the Athlon XP and it contained a link to the Win2k patch, so I guess the bug is still there :(

      It doesn't mean anything of the sort. All that means is that AMD recognizes that there are people still using Athlons that have the bug. Every hardware company I've ever had experience with supports everything they put out for a number of years. For example, the last company I worked for supported their stuff for 7 years, then sold all the remaining parts etc. to an interested "independent contractor" (usually a company tech who was "retiring") and referred all further support requests to them. That included all drivers and patches specific to every hardware revision the product had undergone during its lifecycle.

      Anyway, the fact that they still make the patch available for those cores that do have the bug doesn't mean the current core has the bug, but I'd still add mem=nopentium to my lilo boot option string at the first sign of trouble. Always try the cheap/easy fix first, and this one is definitely easy.

      That said, I'll be extremely disappointed if they didn't fix the bug in the recent core revision.

      --
      Under capitalism man exploits man. Under communism it's the other way around.
    13. Re:More info? by ChrisJones · · Score: 1

      Read this: http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_739_3748,00.html

      My point is, that page is specifically about the AthlonXP and it prominently links to the patch. Of course they will still be providing support for the older Athlons, but patches relating to them wouldn't be on the AthlonXP page, they'd be on the Athlon page, which is here: http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_739_2983,00.html

      Of course it could just be AMD being overly cautious or someone not telling the content people that the patch isn't relevant, but the fact that the patch is linked to from the AthlonXP page suggests to me that it is still required, suggesting that the bug isn't fixed.

      It would be a bit silly to provide a patch that downgrades performance when it's not needed for the product in question, hence my original assumption that the AthlonXP suffers from the bug too. Understand now? ;)

      --
      Chris "Ng" Jones
      cmsj@tenshu.net
      www.tenshu.net
    14. Re:More info? by cl0secall · · Score: 1

      http://www.geforcefaq.com
      http://go.to/geforcefaq

      (both are the same site)

      This site has a lot of information on GeForce troubleshooting. Much of it is not platform specific. I think they even mention this bug in there somewhere.

      But AMD and its people should have been better about getting the word out. It took me a while to find out about the CPU bug. (Though I later found out that my mobo was bad. Still haven't replaced it yet.)

      --
      Model 551, Chambered in 6mm
  15. Mirror/cache from Google! by Metrollica · · Score: 3, Redundant

    Here is the cached article.

    Thank Google again for this one!

    --



    --Metrollica
  16. Re:haha by ChrisJones · · Score: 1, Flamebait

    well, except that they released a patch back in 2000.
    It would help if you actually read the story before posting uninformed opinions. Dumbass.

    --
    Chris "Ng" Jones
    cmsj@tenshu.net
    www.tenshu.net
  17. Slashdotted already? by lcorc79 · · Score: 1
    Damn, there were less than 10 replies (two on topic) .. and it was slashdotted already. Ah well.

    Soooo... is Linux going to now start having kernel patches to detect people who overclock to 3+ gigahertz? lol -- I've heard that has some 'stability issues' as well *grin*

    --
    Groove Salad -- a nicely chilled plate of ambient grooves and beats.
  18. Another mirror/summary here by Afrosheen · · Score: 3, Informative

    Karma whoring, here I come. Hopefully this server can withstand a mild slashdotting. Link

  19. The quick answer: by Doctor+K · · Score: 5, Informative

    The site seems to be down. However, last week, I contacted nVidia about this problem on my two dual Athlon MP workstations (random hangs when OpenGL is invoked). So the quick answer is you can

    Boot your system with following option on your kernel command line: "mem=nopentium"

    or

    Disable AGP in XFree86 config (i.e. Option "NvAGP" "0" in the "Device" section).

    nVidia clued me into the first approach about a week and a half ago. It made my system completely stable. However, there was still some texture flakiness in some OpenGL applications. Since my workstations are number crunchers (and thus Quake FPS don't matter to me), the latter option eliminated both the stability problems and the texture flakiness (at the expense of some graphics speed).
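
    If you want to confirm that the first workaround actually made it onto the running kernel's command line, a check along these lines would work. It is only a sketch: `has_boot_option` is a made-up helper, and the example command lines in the comments are hypothetical.

```python
def has_boot_option(cmdline, option="mem=nopentium"):
    """True if `option` appears as a whole word in a kernel command line string."""
    return option in cmdline.split()

# Hypothetical examples:
#   has_boot_option("ro root=/dev/hda1 mem=nopentium")  -> option present
#   has_boot_option("ro root=/dev/hda1 mem=512M")       -> option absent
```

    On a running Linux system the string to test can be read from /proc/cmdline.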

    By the way, nVidia mentioned the same issue exists on Win2K / Athlon boxes.

    Enjoy,
    Kevin

    1. Re:The quick answer: by carpe_noctem · · Score: 1

      I had all sorts of weird problems when using the AGP driver in the linux kernel (like the big, green rectangle in the corner of my monitor. Cute!). I recompiled my kernel, and told X to use NVidia's builtin AGP driver, and since then, I've had virtually no problems. Q3A has frozen on me only *once*, which is nothing compared to the hundreds of hours that I have addictively played it. =)

      --
      "Quoting famous computer scientists out of context is the root of all evil (or at least most of it) in programming." - K
    2. Re:The quick answer: by carpe_noctem · · Score: 0, Flamebait

      By the way, nVidia mentioned the same issue exists on Win2K / Athlon boxes.

      ....but nobody really noticed, because those boxes crashed all the time anyways.

      --
      "Quoting famous computer scientists out of context is the root of all evil (or at least most of it) in programming." - K
    3. Re:The quick answer: by Damned · · Score: 1

      I've noticed hang activity on both linux and win2k with my geforce2 while playing multiple games (max payne..there are no words), but have only had it completely hang the system once or twice on the linux side. the win2k side hangs more often for durations of ~10-15 seconds and has only completely hung the system once.

      this is annoying and i hope they have a fix in soon, but at least it doesn't happen all that often.

      --
      "I swear I won't break you if you let me take you where the willows never weep" -- Switchblade Symphony
    4. Re:The quick answer: by Anonymous Coward · · Score: 0

      Wow, I am impressed at nVidia's level of service. Thanks for passing this along.

    5. Re:The quick answer: by constantnormal · · Score: 1

      For Redhat 7.2 on Athlon cpus, in some systems similar symptoms can result from different causes (borderline hardware compatibility). The RH 7.2 release notes say to use a boot parm of "noathlon" to disable Athlon-specific optimizations. If "mem=nopentium" does not do the trick, try this one.

  20. Simple Workaround by Laven · · Score: 3, Redundant

    The Gentoo site describes a simple workaround: add "mem=nopentium" to your kernel options at bootup and it will avoid the bug condition. Alan Cox is currently working on adding auto-detection of this bug to the kernel, so we won't have to worry about it soon.

    And yes, this is the same Athlon Windows 2000 AGP bug that was discovered and patched last year with that registry key. They just didn't realize that it also affected Linux until now. I now realize that was the cause of my TuxRacer crashes with my nVidia card on my Athlon computer.

    1. Re:Simple Workaround by Anonymous Coward · · Score: 0

      No, blame NVIDIA. hahaha.

  21. Well that is it! by Metrollica · · Score: 2, Funny
    --



    --Metrollica
    1. Re:Well that is it! by Anonymous Coward · · Score: 0

      It was a joke moderators. Don't make me refer to this

      -Metrollica

  22. Big surprise. by Anonymous Coward · · Score: 0

    Wow, video problems with Athlons? That's unheard of! I've always had rock-solid stability with Athlon and all the video adapters I've owned - except the exact opposite.

    You get what you pay for, baby.

  23. Performance hit? by mojo-raisin · · Score: 4, Interesting

    So does anyone know how performance is affected from this 4MB->4KB page thing?

    1. Re:Performance hit? by Taco+Cowboy · · Score: 1



      Simple arithmetic -

      4MB = 1024 X 4KB

      Therefore, the worst case scenario is a 1024 times slower performance hit.

      But in reality, unless there is a lot of paging activity, 4KB is not too bad.

      Of course, 4MB is much nicer ;p

      --
      Muchas Gracias, Señor Edward Snowden !
    2. Re:Performance hit? by larien · · Score: 4, Interesting
      That's a rather naive assumption; it assumes that a 4KB page takes the same amount of time to move as a 4MB page. Admittedly, there will be 1024 times as much loop activity in order to move 4MB, but that probably isn't the real bottleneck, which would be memory/disk bandwidth. Also, you may gain some efficiency if you only want to move say 512KB.

      In short, you're better off with 4MB pages if it's stable, but I don't know by how much. I guess some benchmarks would be easy enough to do; e.g. run Q3A with and without the mem= options.

    3. Re:Performance hit? by andrewgaul · · Score: 5, Interesting

      The performance hit for using the smaller pages is mostly unrelated to paging. When a CPU loads a virtual address (all addressing in "protected mode" is virtual), there is a translation to a physical address before data can be accessed. This table is stored in memory and the CPU breaks into kernel mode to do the translation. To avoid this cost, there is a cache of translations (managed by the kernel) in the Translation Look-aside Buffer (TLB). Most of the entries in this cache are for 4kb pages, but there are a few 4mb pages which are generally used for kernel memory (I am unsure if any OSes use the big pages for user programs).

      That said, there should be a modest performance hit. Bigger pages can store more data, which results in fewer TLB misses. Hopefully someone will post benchmarks.
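
      A back-of-the-envelope way to see why page size matters for the TLB is to count how many entries a region needs. The sketch below is plain arithmetic; the 64MB region and the 32-entry TLB used in the comments are hypothetical figures chosen for illustration, not the Athlon's actual numbers.

```python
KB = 1024
MB = 1024 * KB

def tlb_reach(entries, page_size):
    """Bytes of address space covered by a fully populated TLB."""
    return entries * page_size

def pages_needed(region_bytes, page_size):
    """Number of pages (and so TLB entries) needed to map a region, rounding up."""
    return -(-region_bytes // page_size)

# Hypothetical example: mapping 64MB of kernel memory.
small = pages_needed(64 * MB, 4 * KB)   # entries needed with 4KB pages
large = pages_needed(64 * MB, 4 * MB)   # entries needed with 4MB pages
```

      The gap between the two counts is what turns into extra TLB misses when the kernel is forced down to 4KB pages; how much that costs in practice depends on the workload, hence the call for benchmarks.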

    4. Re:Performance hit? by themassiah · · Score: 2, Troll

      I, personally, think it's sad when a video game's measure of frames per second becomes a benchmark. At least re-index a database or something ;)

      --
      - Sometimes you're the pidgeon, sometimes you're the statue.
    5. Re:Performance hit? by Sits · · Score: 3, Informative

      You may want to take a look at the benchmarks posted later.

    6. Re:Performance hit? by larien · · Score: 2
      It's fairly standard, like it or not! It also throws a fair bit of data around, which should give an indication of performance. In any case, it's probably what most desktop users are concerned with!

      The DB reindex is a good test of paging as well, however.

    7. Re:Performance hit? by Anonymous Coward · · Score: 0

      And I! tend to! overuse exclamation marks too! much!!!

    8. Re:Performance hit? by frleong · · Score: 2
      See this link to read how folks at MSDN describe LargePageMinimum, the fix for the Athlon/AGP bug:

      Kernel improvements of Windows XP

    9. Re:Performance hit? by hearingaid · · Score: 2

      Video games are really processor-dependent. Most other applications are hard drive-dependent to some extent or another. Indexing a database is really a way to test the speed of your hard drive for any DB of significant size (nobody keeps a 500GB DB in RAM :)

      The only other application I can think of that's comparatively CPU-dependent is raytracers and the like, and the problem with using them as benchmarks is that the length of time they take to produce a picture will obviously depend on the complexity of the picture. Q3/UT/etc. generate pictures of roughly fixed complexity, saving you the trouble, and also do so in a time-optimized kind of way (while raytracers tend to be more optimized towards producing beautiful results).

      --

      my old sig used to be funny, but then slashcode ate it and now it's not funny anymore

    10. Re:Performance hit? by Taco+Cowboy · · Score: 1



      You said :

      "That's a rather naive assumption; it
      assumes that a 4KB page takes the same
      amount of time to move as a 4MB page"

      But I _did_ say that it's a WORST CASE SCENARIO, didn't I?

      Of course, unless someone posted an actual benchmark to show how much time it takes for a 4MB page to load, versus that of a 4KB, et cetera, et cetera, simple arithmetic WILL DO.

      Remember K.I.S.S. ??

      --
      Muchas Gracias, Señor Edward Snowden !
    11. Re:Performance hit? by Tassach · · Score: 2

      I, personally, think it's sad when a video game's measure of frames per second becomes a benchmark.


      What's so sad about using actual application performance as a benchmark? It gives a much better indication of real-world performance than theoretical benchmarks.


      Benchmarks are useful to compare how different systems will perform while running a particular application. Re-indexing a database is an appropriate benchmark to look at if you are evaluating a database server, it's not so useful if you are evaluating a graphics workstation. Conversely, if you want to get the best gaming machine for your money, Frames / Second gives you a much more accurate picture of the system's performance than the time it takes to re-index a million-row table.

      --
      Why is it that the proponents of "one nation under God" are so eager to get rid of "liberty and justice for all"?
    12. Re:Performance hit? by Anonymous Coward · · Score: 0

      Actually the TLBs are implemented via hardware, not software. The important thing is that the CPUs have separate TLB caches for 4kb and 4mb paging and less memory is needed to do paging via 4mb instead of 4kb (and hence, less memory needs to be cached; so the frequency of cache hits is greater for 4mb paging than 4kb paging, when it is used).

      In particular, the CPU has a register called "CR3" (also known as the Page Directory Base Register, or PDBR) which contains the address of a 4kb page called the page directory. That directory has 1024 32-bit entries that map between physical and linear memory, where "linear" is more or less "virtual".

      Also note that paging hardware is used only when the PG bit of the CR0 register is enabled, so 'virtual' (the proper word is linear) addresses aren't always used in protected mode: I switch to protected mode all the time with paging disabled.

      When physical addresses match linear addresses, and paging is enabled, it is said that the system is identity-mapped.

      Now a page directory contains 1024 32-bit entries, each of which is called a PDE (Page Directory Entry), e.g.

      unsigned PDE[1024];

      Now if I look at a linear address from 0 to 1 less than 4mb, the CPU accesses a 4mb block of physical RAM at the same offset I used within that block; which 4mb block it is gets selected by bits 22 to 31 of PDE[0] (the entry also has to be flagged as a 4mb page). Thus if bits 22..31 of PDE[0] contain the number 5, then linear offset 1234 corresponds with physical offset 1234 + 5 * 4mb (where mb = 1024*1024).

      That's how 4mb paging is done.
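
      A minimal sketch of that 4mb lookup, assuming the 32-bit x86 entry layout where the top ten bits of the entry hold the 4mb frame number, bit 7 marks the entry as a 4mb page and bit 0 marks it present. The names make_pde_4mb and translate_4mb are made up for illustration; the directory here is an ordinary array, not something the CPU would actually walk.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_4MB (4u * 1024u * 1024u)   /* bytes covered by one directory entry */
#define PDE_PS   0x80u                  /* bit 7: entry maps a 4 MB page */
#define PDE_P    0x01u                  /* bit 0: entry is present */

/* Build a hypothetical directory entry that maps 4 MB frame number `frame`. */
static uint32_t make_pde_4mb(uint32_t frame)
{
    assert(frame < 1024u);              /* only ten bits are available */
    return (frame << 22) | PDE_PS | PDE_P;
}

/* Translate a linear address through a directory made of 4 MB pages. */
static uint32_t translate_4mb(const uint32_t pde[1024], uint32_t linear)
{
    uint32_t entry = pde[linear / PAGE_4MB];   /* top 10 bits pick the entry */
    assert(entry & PDE_PS);                    /* sketch handles 4 MB entries only */
    return (entry & ~(PAGE_4MB - 1u))          /* frame base from bits 22..31 */
         | (linear & (PAGE_4MB - 1u));         /* low 22 bits pass straight through */
}
```

      With pde[0] = make_pde_4mb(5), translate_4mb(pde, 1234) gives 1234 + 5 * 4mb, the same numbers as in the example above.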

      Note that for 4mb paging, the CPU needs only to fetch a 4kb page once when I access data anywhere in linear memory. When I do a task switch, the CPU will forget the old page directory that it cached, due to the CR3 register being reloaded, with one exception: certain pages can be marked "global".

      Note that the biggest performance hit for nopentium being used on Linux is going to be the lack of global pages being used, if indeed that is the case (global pages were introduced with the pentium pro architecture, if memory serves).

      That's because nonglobal pages are wiped from the TLB cache at each task switch, causing the CPU to forget every translation it has cached. That's what the original pentium did, and so did earlier processors.

      (It is possible to mix 4mb paging with 4kb paging, and in practice, that's quite necessary if one wishes to share memory between two processes without being totally wasteful in most cases).

      Now for 4KB paging, here's how it works:

      suppose I access offset 5 in linear memory. The CPU does this:

      it looks at PDE[0], and uses that to find a 4kb page called the page table. Note that there can be up to 1024 page tables; each page table corresponds with a 4mb region of memory, and maps 4kb pages in that region to physical memory, sort of like this:

      typedef unsigned PTE_t[1024];
      PTE_t *PDE[1024];

      Now to access offset 5 we have to do this:

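      unsigned linear = 5;
      unsigned pde_index = linear / (4*1024*1024);      /* top 10 bits pick the page table */
      unsigned pte_index = (linear / (4*1024)) % 1024;  /* middle 10 bits pick the entry */
      unsigned physical =
      ((*PDE[pde_index])[pte_index] & ~(4*1024-1))      /* 4KB frame base from the entry */
      | (linear & (4*1024-1));                          /* plus the offset inside the page */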

      Kind of like hashing, where 4MB paging uses 1 hasher thingy and 4KB paging uses 2 levels of hashing thingies.
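
      For anyone who wants to poke at the two-level lookup, here is a small simulation, with loud caveats: struct paging, map_4kb and translate_4kb are invented for this sketch, directory entries hold an index into a tiny array of tables rather than a real physical address, and only four page tables exist. It shows the index arithmetic, not how a kernel actually manages its tables.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES  1024u   /* entries per directory or table */
#define PAGE_4KB 4096u   /* bytes per small page */
#define NTABLES  4u      /* the sketch only backs four page tables */

/* Simulated paging structures. */
struct paging {
    uint32_t directory[ENTRIES];        /* level 1: which table to use */
    uint32_t tables[NTABLES][ENTRIES];  /* level 2: frame base | flags */
};

/* Record that the 4 KB page holding `linear` lives at frame `physical`. */
static void map_4kb(struct paging *p, uint32_t linear, uint32_t physical)
{
    uint32_t dir = linear >> 22;                     /* top 10 bits */
    uint32_t tbl = (linear >> 12) & (ENTRIES - 1u);  /* middle 10 bits */
    assert(p->directory[dir] < NTABLES);
    p->tables[p->directory[dir]][tbl] = (physical & ~(PAGE_4KB - 1u)) | 0x1u;
}

/* Walk both levels and rebuild the physical address. */
static uint32_t translate_4kb(const struct paging *p, uint32_t linear)
{
    uint32_t dir = linear >> 22;                     /* level 1 lookup */
    uint32_t tbl = (linear >> 12) & (ENTRIES - 1u);  /* level 2 lookup */
    uint32_t pte = p->tables[p->directory[dir]][tbl];
    return (pte & ~(PAGE_4KB - 1u))                  /* frame base */
         | (linear & (PAGE_4KB - 1u));               /* offset inside the page */
}
```

      After map_4kb(&p, 0, 0x00123000) on a zeroed struct, translate_4kb(&p, 5) comes back as 0x00123005: two table reads for every address instead of the single read the 4mb case needs.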

      So the biggest performance hit is really this: the lack of global pages being used, because "nopentium" disables global pages AND 4mb paging. To avoid that performance hit, you'll need to see if there's a way to avoid 4mb paging besides "nopentium"; or else to enable the global bit being used.

      Now the second biggest hit, is due to 4mb paging itself not being used. That means that more pages will be fetched from memory, due to more cache misses; which in turn is ultimately due to Intel not making the 4MB TLB caches 1/1024 the size of the 4KB TLB cache (which would be a totally useless idea, since that'd be a 32-bit cache, which is stupid.)

      It's the difference between doing this:
      unsigned physical = (PDE[linear / (4*1024*1024)] & ~(4*1024*1024 - 1)) | (linear & (4*1024*1024 - 1));
      and this:
      unsigned physical =
      ((*PDE[linear / (4*1024*1024)])[(linear / (4*1024)) % 1024] & ~(4*1024-1))
      | (linear & (4*1024-1));

      Note that the amount of data that you need to cache is 4KB for 4MB paging and 4MB for 4KB paging.
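
      Where those two figures come from, as a quick sanity check. This assumes 1024 entries of 4 bytes each per directory or table, as described above; the function names are made up, and the 4kb figure leaves out the 4kb directory itself to match the round number.

```c
#include <stdint.h>

#define PT_ENTRIES  1024u   /* entries per page directory or page table */
#define PT_ENTRY_SZ 4u      /* bytes per 32-bit entry */

/* Paging data needed to cover all 4 GB with 4 MB pages: the directory only. */
static uint32_t bytes_for_4mb_paging(void)
{
    return PT_ENTRIES * PT_ENTRY_SZ;
}

/* Paging data needed to cover all 4 GB with 4 KB pages: 1024 page tables,
 * each as big as the directory (the directory itself is not counted). */
static uint32_t bytes_for_4kb_paging(void)
{
    return PT_ENTRIES * (PT_ENTRIES * PT_ENTRY_SZ);
}
```

      The first works out to 4096 bytes and the second to 4194304 bytes, a factor of 1024 between them.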

      The former discussion, except for the global bit, does not take into account the fact that even when 4MB paging is enabled, 4KB pages are still used; 4MB pages are just used more.

      Worst case scenario due to no 4mb paging is that every time you do a task switch the CPU needs to fetch 4MB from your RAM, which brings your system to a halt until that memory is fetched; so how many GBs can you fetch in a second? But in practice, that won't happen. Ultimately, for that to happen, you'd need to have 4GB of memory and EVERY process in your system would need to access all 4GB of memory immediately after a task switch was made to that process... so don't worry about that worst-case.

      Worst case due to the lack of global pages is hard to determine; basically, if you had global pages, you wouldn't even notice nopentium at all.

  24. Is this present in Athlon optimized kernels? by victwenty · · Score: 2, Interesting
    from the article: And, this problem hits us because Linux 2.4 kernels compiled with a Pentium-Classic or higher Processor family kernel configuration setting will automatically take advantage of extended paging

    so the question is, if I configure my kernel for the K7 family, do I need to pass the kernel "mem=nopentium" or is this the default?

    1. Re:Is this present in Athlon optimized kernels? by Sits · · Score: 2, Informative

      Almost definitely not. It sounds like the existence of this bug was not known until recently and K7 options almost definitely enable all memory enhancements.

  25. Incredible as it may... by Taco+Cowboy · · Score: 1, Troll



    But the bug was there since Sep. 2000 !

    You'd think someone at AMD would have corrected the bug, but nooooooo.

    How many versions of Athlon / Thunderbird / XP / MP have there been since Sep. 2000 ?

    I thought that with every new iteration of a chip, they have "de-bugging sessions" - just like software - before the "tape-out" stage.

    Have to wonder why AMD doesn't do debugging before tape-out ?

    Is it money?

    Or is the bug a "feature" instead ?

    --
    Muchas Gracias, Señor Edward Snowden !
    1. Re:Incredible as it may... by Anonymous Coward · · Score: 1, Informative

      If their flow is the same as most other semis, they do functional verification both before (in simulation) and after tapeout. A chip is almost always rev'ed a few times before it gets 'production' status and ships to customers in large quantities. Looks like they had a hole in their functional test plan and missed this one.

    2. Re:Incredible as it may... by Hoser+McMoose · · Score: 2, Insightful

      Of course it's due to money! Coming up with a fix to a bug like this doesn't just happen overnight, and since the errata in processors barely ever affect anyone and can usually be easily worked around in software (and the software fix for this bug is trivial), most companies have better things to do with their time. As it turns out, AMD did eventually get around to fixing this issue with Stepping A5 of the AthlonXP/MP core.

      In the same vein, out of the 83 bugs that Intel currently has listed for their Pentium III processor, quite a bit more than 50% of them are listed as "NoFix", i.e. Intel has no plans on ever fixing these bugs.

      The real question I have to ask is why no one caught this earlier? This bug is well documented in AMD's errata list, complete with a workaround. AMD's Athlon chips only have something like 10-15 known bugs listed, which is quite a few fewer than the 59 known bugs for Intel's P4 or the 83 known bugs for Intel's PIII processors, so going through the list of AMD bugs should be a fairly easy thing to do (aside: one could argue either that AMD chips have fewer bugs than Intel or simply that Intel documents their chips better.. I don't want to take either side on that flame war though).

      If anyone is really interested in this sort of thing, both AMD and Intel have their list of known bugs up on their website under "specification updates" for each of their processors.

    3. Re:Incredible as it may... by Anonymous Coward · · Score: 0

      it also depends on the magnitude of the bug. you cannot just compare bug numbers without taking into account the different impact the bugs may have.

  26. Re:Now aren't you glad you use Free/Net/Open BSD(n by Yarn · · Score: 2

    Does anyone actually *know* if this is worked around in the *bsds?

    Or do they use the 4k method by default anyway?

    --
    -Yarn - Rio Karma: Excellent
  27. Nvidia + AGP + Irongate + Athlon by hack0rama · · Score: 4, Interesting

    Nvidia drivers force AGP to 1x due to corruption caused by AMD Irongate chipset signal integrity problems [ mentioned in the README for Nvidia 1.0-2313 drivers ]

    This newly discovered memory corruption with Athlon + AGP, is it contributing to the signal integrity of the Irongate ? Or is it a separate bug ?

    Anyway this makes AMD look very bad in my view. There is a bug in the CPU and their chipset screws up my AGP to 1x. Sigh.

    1. Re:Nvidia + AGP + Irongate + Athlon by Grandpa+Jive · · Score: 1

      There are tweaks out there that let you crank it to 2x. I believe there are registry patches that let you do it, and I believe you can do something in linux to up it to 2x. I suffer from having an irongate also; I'm hoping that this will finally stop the random lockups.

    2. Re:Nvidia + AGP + Irongate + Athlon by Arimus · · Score: 1

      Why does this make AMD look any worse than, oh let me see... Intel?

      Remember the bug with their Pentiums that caused them to give the wrong FP results if you used certain values (due to the small(?) mistake of omitting a column of lookup values)... we seem to be forgetting that CPUs are getting more and more complex but the people designing them and checking them etc still have to use the old Mk1.brain.

      AMD still make better sense than the Intel cpu's on a $ per performance basis.

      --
      --- Users are like bacteria -> Each one causing a thousand tiny crises until the host finally gives up and dies.
    3. Re:Nvidia + AGP + Irongate + Athlon by dunkelfalke · · Score: 1

      well, the irongate problems are there only with nvidia cards.

      never had the same problem with a radeon or a kyro 2.
      so, who is to blame now?

      --
      "It's such a fine line between stupid and clever" -- David St. Hubbins, Spinal Tap
    4. Re:Nvidia + AGP + Irongate + Athlon by chips · · Score: 1

      Have you ever used accelerated X (DRI) with the radeon? There used to be a huge crashing bug with radeon + irongate. Basically, whenever you tried to start the X server with DRI enabled, it would hang. And believe me it wasn't fun, I had to live with unaccelerated X for the six months that it took to fix it (even then I had to compile X myself until it got into debian). It had something to do with the HDP_SOFT_RESET register, IIRC. Thankfully it's fixed now and I'm experiencing no more problems.

      --
      -- Guns don't kill people, bullets kill people. Guns just make bullets go really, really fast.
    5. Re:Nvidia + AGP + Irongate + Athlon by Anonymous Coward · · Score: 1, Interesting

      What was the mean-time-to-patch for all major operating systems for the P5 FP bug? A month or two?

      The issue isn't the bug, bugs happen and it doesn't sound serious. It's the fact that it was unknown to Linux users for more than a year. And performance only matters if your system is up.

    6. Re:Nvidia + AGP + Irongate + Athlon by psamuels · · Score: 2
      What was the mean-time-to-patch for all major operating systems for the P5 FP bug? A month or two?

      Bad example - you can't work around that class of bug in the OS. Best thing you can do is get Intel to ship you a replacement - which I guess they did, on request.

      You should have picked the Pentium F00F bug from a few years back. That's one you can work around in software. Ingo Molnar produced a working Linux patch over the span of a weekend, and it was shipped with a new kernel rev (2.0.35 or so) a couple days later. Microsoft didn't fix it at all in NT4 - presumably they did in Win2k, if they felt the Pentium-MMX 233 and below was still worth supporting by then.

      None of which had much to do with Intel, except that they were very cooperative distributing intel about the bug and suggested workaround to the Linux people.

      --
      "How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
  28. Should AMD do the right thing? by NanoGator · · Score: 2, Insightful

    I should start by saying I haven't read the article yet, can't get to it. *hopes the /. traffic dies down soon...*

    If it is a defect in the processor, I wonder if AMD will replace my existing processor. It may not seem like all that big of deal to most people here at Slashdot, but as a 3D artist I am *dependent* on OpenGL.

    Don't get me wrong, I'm not having this problem now. (I'm not a Linux user.) But when I built my Athlon I had to install a patch for a similar type of problem in order to get the machine to work. At what point do we say "it's no longer ok to work around a CPU bug"?

    If Intel has one set of bugs in their processors, and AMD has another, that divides the market. Software companies shouldn't have to put the effort into scrutinizing their code based on which CPU they are on, it's bad enough they are trying to optimize for one or the other. What happens when they get used to the workaround, but then it gets fixed? Worse yet, what happens when a company says "I'm sick of this, I'm only supporting one processor."

    So it's not so much that I think AMD should replace the processors with this specific bug, but I think we should be vigilant in not allowing them to let errors like that run rampant.

    --
    "Derp de derp."
    1. Re:Should AMD do the right thing? by Linux+Freak · · Score: 3, Informative
      Heh, microcode bugs go back, WAYYYY back as far as microprocessors do themselves.



      Shit happens. Work around it. ;-)
    2. Re:Should AMD do the right thing? by Anonymous Coward · · Score: 0

      Even if you're not a linux user, the bug affects win2k/XP - AMD released a patch for Win2k that disables large pages, but presented it as a fix for a bug in Win2k, not as a workaround for their shoddy processor - thus, the Linux folk never realised the bug was in the processor, not Win2k, so never (until now) incorporated the workaround into linux. There's probably lots of Win2k/xp workstations without the patch, too.

    3. Re:Should AMD do the right thing? by Anonymous Coward · · Score: 1, Interesting

      Yep, in addition to the 4004 bug mentioned in one of your links above, I seem to recall a bug in the Z80 (in fact, a Russian company developed some kind of device (calculator??) which used a custom CPU compatible with the Z80 chip -- only, it also had the microcode bug so it's likely the reverse engineering was not as "clean room" as it probably should have been). There were bugs in the Amiga's SCSI chipset, and even a few microcode bugs in the old VAXen.

    4. Re:Should AMD do the right thing? by Una · · Score: 1

      Honestly, I don't believe AMD will be issuing replacement CPUs for such a small (and easily software patchable) bug.

      Think about it this way:
      Currently, a large portion of AMD's income is from microprocessors.
      Now, if they were to issue a recall on the Athlon, they would have to supply enough fixed CPUs to replace every Athlon they have sold in the last few years.

      If you think about that for a second, that would mean AMD would take a huge financial loss, potentially leading to a complete bankruptcy.

      Not to be a troll or anything, but do you think it's really in AMD's best interest to issue a recall on their flagship product, because of an easily software correctable bug?
      No. I think not.

      --Una

    5. Re:Should AMD do the right thing? by Eric+Smith · · Score: 4, Informative
      That third article about the supposed "HCF" instruction on the 4004 is complete and utter BS. None of the instructions on the 4004 will cause it to burn up, even on the earliest production parts.

      Several processors had self-test instructions known as "HCF". The 6800 family and the 6502 had such instructions. They caused the processor to start fetching consecutive locations, thus continuously incrementing the address bus. Didn't damage the processor, even if you left it running that way. The "Catch Fire" was a figurative description of what was happening on the address bus, nothing more.

      On the original NMOS 6502, about 13 of the undefined opcodes had this effect. This was the most common cause of computer lockups if the code went into the weeds.

      On some of the later 6800 family members, the test instructions were actually documented, but Motorola's published description did not include any mnemonic for them.

    6. Re:Should AMD do the right thing? by Anonymous Coward · · Score: 0

      Mod this thread up!
      (Any slashdot-like sites for microprocessor geeks? :^) )

    7. Re:Should AMD do the right thing? by Anonymous Coward · · Score: 0

      Microsoft shipped their own compatibility patch for Win2k.

      XP does not require any patch.

    8. Re:Should AMD do the right thing? by red_dragon · · Score: 1

      So it could never really catch fire? Guess ESR needs to update this entry.

      The MC6800 microprocessor was the first for which an HCF opcode became widely known. This instruction caused the processor to toggle a subset of the bus lines as rapidly as it could; in some configurations this could actually cause lines to burn up.
      --
      In Soviet Russia, Jesus asks: "What Would You Do?"
    9. Re:Should AMD do the right thing? by flatrock · · Score: 5, Interesting

      First of all, this bug is not that significant performance wise. Very little software is going to use 4 MB pages. I don't think you even have an option of allocating memory with 4 MB pages in user space. This appears to be an issue with being able to optimise drivers, however, if AMD's processors can't do this, and Intel's can, why don't we see Intel's processors greatly outperforming AMD's in Win2k? This is a minor bug, and it's easily worked around without patching the kernel in both Win2k and Linux.

      The processors are basically all their Athlon and Duron processors. For AMD or any chip maker to replace chips with bugs in them is VERY expensive. They already have a low profit margin. Replacing all "defective" Athlon and Duron processors would simply bankrupt AMD. Realistically, all complex software or hardware has bugs. Bugs in hardware are much more difficult and expensive to fix. The truly significant hardware bugs are usually found early in testing. Other bugs are fixed in software, usually in the system BIOS, but sometimes in the OS code. This isn't something new. It's pretty much always been this way. Why has it been this way? Because no one wants to pay the outlandish prices that would result from trying to make hardware perfect. It costs a tremendous amount of money to reroll a processor. It's not as simple as making a quick code change and recompiling software. THERE WILL ALWAYS BE BUGS IN PROCESSORS! A truly significant bug like the Pentium floating point bug needs to be fixed in the hardware, and that one was even significant enough to deserve a recall of the processor. This bug is simple to work around, and isn't truly a significant problem.

      The question you asked in the subject is "Should AMD do the right thing?" The answer is yes, they should correct their Technology Bulletin to actually say what the processor bug is, rather than just say here's a workaround to a bug that affects Win2k.

      I'm really surprised that someone at NVidia didn't pass this on to Linux kernel developers much sooner, since people at that company seem to have been aware of this for some time.

    10. Re:Should AMD do the right thing? by Batou · · Score: 1

      Does anyone here read the article before posting? This is NOT A NEW BUG. If a patch for Win2K was released in Sept 2000, then this is pretty old news. As for a recall: If they were planning on recalling any processors for this, I somehow doubt they would decide to wait nearly a year and a half to do it. Especially when it is EASILY patched. (Hell, even my bios gives me the option to restrict the processor to using 4KB paging, but that's far from typical.) What surprises me is it took this long for the kernel folks to catch wind of this. From what I remember, the Win2K patch for this was pretty widely publicized, in so far as ntbugtraq and M$'s notification services were involved. I guess no one even remotely involved with the kernel has to use Windows from time to time? Must be nice. What I think the real question we should be asking is this: If this was known about and addressed (for Windows, at least) in Sept of 2000, HOW ON EARTH WAS THIS BUG NOT ADDRESSED IN THE ATHLON XP/MP CORES? Anyone?

      --
      "Oh my God! The dead have risen! And they're voting Republican!" - Bart Simpson
    11. Re:Should AMD do the right thing? by rabidcow · · Score: 2

      Except you can't have microcode bugs without microcode, now can you?

      I seriously doubt the 4004 had microcode. I know for a fact that the z80 didn't (originally anyway, it was the most complicated cpu to be hardwired), that's why it had so many undocumented instructions.

      (Also, the middle link knows nothing about the z80, for example, it also "could join pairs of 8-bit registers to use for 16-bit operations". Dunno if it inherited that from the 8080 or what.)

    12. Re:Should AMD do the right thing? by Anne+Thwacks · · Score: 1
      The SAA4000 (or was it SAA6000) actually DID catch fire:

      It had separate control over pull-up and pull-down transistors on one of the IO ports, so it was possible to enable pull-ups and pull-downs together. This would allow the entire output of the power supply to be dissipated in the output port.

      This was a processor targeted at portable equipment (we used it in pagers). The CPU could easily set fire to the PWA and plastic case.

      I forget who made the chip, but it was appalling rubbish in many other ways too: The same opcode did different things according to which memory page it was loaded in (so forget relocating linkers). I heard rumours that programmers' brains would catch fire trying to debug the thing, as the program counter did not increment, but used a grey code sequence to avoid having to have a carry look-ahead.

      HCF was certainly not new technology when the 6800 came out. Early LSI logic "Naked Minis", and some models of DG Nova could burn out core memory by looping on a "JMP $" instruction, as sustaining the drive current through a single location would, over a period of seconds, burn out the drivers, wires, or both.

      After working on sh*te like this, the 6800 was a programmers dream come true (until the 6501 came out).

      Incidentally, Motorola was known for a sense of humour: in the early days of the 6800, probably 1975, they issued a data sheet for a compatible 68xx Write Only Memory "Ideal for implementing /dev/null". It featured a "Not Chip Destruct" pin, which, if not grounded, would cause the chip to self-destruct. I guess it found volume sales with Mission Impossible. They also used to sell a 1-bit CPU. Does anyone know whether this was a serious product, or another 1 April release?

      --
      Sent from my ASR33 using ASCII
  29. why we blame NVIDIA by Anonymous Coward · · Score: 0, Informative

    It is impossible to debug closed-source
    drivers like the NVIDIA one. So any NVIDIA
    bugs can't be found.

    But you say "this is an AMD bug"...

    How could we know that? The presence of
    closed-source drivers in the kernel made
    us unable to determine what was at fault.
    Video drivers can cause non-video problems,
    so in all cases only NVIDIA can help you.

    1. Re:why we blame NVIDIA by bzzzt · · Score: 1

      Maybe because there are other video cards than nvidia using AGP in Linux which have the same problems?

    2. Re:why we blame NVIDIA by innocent_white_lamb · · Score: 1

      But you say "this is an AMD bug"...

      AMD released a patch for this bug for Windows 2000 way-back-when, but the way that they went about publicizing it made it appear that the bug affected only Windows 2000 and nobody on Linux kernel development realized that it also affected Linux.

      However, AMD themselves say that it's a bug in their CPU, so I think it's fairly safe to say that it's a bug in their CPU.

      --
      If you're a zombie and you know it, bite your friend!
  30. Same as Intel's F00F problem by bob1000 · · Score: 1

    It is easily fixed through software and there doesn't seem to be any noticeable performance hit when turning off PSE.

    1. Re:Same as Intel's F00F problem by bzzzt · · Score: 1

      Then why did they implement this feature in the first place?

    2. Re:Same as Intel's F00F problem by RKloti · · Score: 1, Insightful

      Probably for marketing reasons.
      (another buzzword)

  32. Not really. by Ch_Omega · · Score: 1

    The site with the article is slashdotted to oblivion, so I'm not sure what the problem really consists of, but according to what I have read in comments below, it's not really comparable.

    As far as I remember, the bug in the original pentium was a floating point flaw that led to wrong calculations under certain circumstances.

  34. Ahh now I know what it is.... by gergnz · · Score: 1

    I've had this problem before, so intermittent I didn't think it was worth worrying about.
    When I noticed it the most (figured this out the other day) was when I had several compiles going at the same time, ie kde3 libs and say a new patched kernel.

    I could never get it to reliably repeat, so never looked into it.

    Great to see someone actually had the patience to figure it out :-) !!!!

    --
    404 Not Found The requested signature was not found on this server.
  35. Why don't I see this? by Sits · · Score: 1

    I have an Athlon 850 with a Geforce 1. I thought I had finally gotten rid of the last of the system workarounds when I upgraded my BIOS and I stopped seeing "Stomping on Athlon bug" (the classic VIA chipset problem). Looks like this isn't going to be the case after all.

    I have always compiled my kernel for Athlon optimisations and I use the NVidia linux drivers with agpgart. How come I haven't hit this bug before?

    How much performance is knocked off the system as a whole because of this? Is it a few percent? I presume this will hit all applications not just AGP ones...

    1. Re:Why don't I see this? by Skuto · · Score: 1

      >I stopped seeing "Stomping on Athlon bug" (the
      >classic VIA chipset problem)

      IIRC the 'Athlon bug stomper' is not the classic VIA chipset problem (which are the KT133A/686B's from hell)

      It's related to a BIOS setting some MB manufacturers use to increase performance on the Athlon. AMD had specifically indicated the setting as 'reserved - do not change' but some did anyway, causing Athlon optimized Linux kernels to crash because they are overflowing an internal buffer (memory write queue). It can happen in Windows too with optimized video drivers. Newer Linux kernels detect and reset the setting to a stable value.

      --
      GCP

    2. Re:Why don't I see this? by hearingaid · · Score: 2

      The announcement did indicate most Athlons were affected by the bug. Perhaps you're one of the lucky few who isn't. If you have no bug, do not worry: do not attempt to fix your computer as it is not broken. :)

      --

      my old sig used to be funny, but then slashcode ate it and now it's not funny anymore

    3. Re:Why don't I see this? by Anonymous Coward · · Score: 0

      Are you running FreeBSD? FreeBSD users are not affected.

  36. Bug Problem with SETI by polyp2000 · · Score: 1, Interesting

    can anyone tell me if this problem may occur when running SETI? only I used to run it on my dual MP Athlon under MDK 8.1 , but it would invariably kill my machine. so I stopped running it.. ideas anyone ?

    nick(nospam)@(nospam)polyprecords.com

    --
    Electronic Music Made Using Linux http://soundcloud.com/polyp
    1. Re:Bug Problem with SETI by larien · · Score: 2

      Could well be possible; AFAIK, SETI throws around a fair bit of data, so it might do some paging. If it 'invariably' killed your machine, it should be easy to test using the boot options.

    2. Re:Bug Problem with SETI by Anonymous Coward · · Score: 0

      I have exactly this problem too. If I run seti the chances of a cold-freeze increase dramatically.

      It's almost certain that the computer will completely hang (under Linux) when I run seti. It'll definitely hang if I have music/opengl stuff running.

      Hmmm...

  37. Athlon bug, and NVIDIA drivers by Rohan427 · · Score: 3, Interesting

    I have 2 Athlon systems, a dual Thunderbird 1.4GHz (Tyan Thunder K7) and a single Thunderbird 1.4GHz (Asus A7V133). The former runs a GeForce 3 and kernel 2.4.17, the latter TNT2 and RH 7.2 (kernel 2.4.9 I believe). Both systems run semi-custom NVidia drivers (release 2313). By semi-custom, I mean I tweaked them to use SBA, the NVIDIA AGP driver (NOT agpgart) and to run in 4x mode. The latter has never had a problem, the former (the dual) had some problems until kernel 2.4.14.

    The problems I had were frequent lockups with everything X, especially Q3A and Tribes 2. Some experimenting proved what worked and what didn't, and here's what I found:

    agpgart never worked worth a damn even with kernel 2.4.17, despite several attempts by me to make it work (I don't maintain it, so I gave up on messing with it). Earlier NVIDIA drivers were less stable, but the latest is great (although it does not support FW, which blows). Tweaking the NVIDIA driver to use SBA and its own AGP driver instead of agpgart, along with kernel 2.4.14 - 2.4.17 makes for a very stable and fast system. Older kernels just did not work worth a damn whenever I enabled DMA on my IDE drive - they locked every time. These newer kernels don't exhibit this problem, and the NVIDIA driver works nicely with all 3D games as well as 3D development tools like Blender.

    My kernels have always been compiled as Athlon kernels as well. The bottom line is: don't blame this bug and/or the NVIDIA driver if your system is unstable and/or slow. There are other things at work, and in my case I seem to have found them all.

    - Rohan

    1. Re:Athlon bug, and NVIDIA drivers by ZaMoose · · Score: 2

      Have you made this tweak available? How difficult is it to perform?

      I've got a dual Athlon MP 1900+ machine from Alienware coming in for work and I'd like to get it running like a dream, if at all possible.

      --
      I wish I had a kryptonite cross, because then you could keep Dracula and Superman away.
    2. Re:Athlon bug, and NVIDIA drivers by JBv · · Score: 1

      My setup is a Duron 900 with a TNT2U and an Asus A7V133. My experience is almost the mirror image of yours.

      With the default kernel in Mandrake (2.4.8, I believe) I always had freezes with the nvidia agp drivers. With agpgart, it only froze on occasion, but frequently enough to keep me away from quake3. All this with the latest nvidia drivers.

      I have been using 2.4.17 with agpgart for a while now and it is a definite improvement in terms of stability. In all the (too many) hours of playing quake3 (sometimes over the whole weekend) it only froze once. I compiled 2.4.17 myself with all the IDE performance goodies on.

      I don't know if this 'stability' I get is due to the cryptic message "Stomping Athlon Bug" I get during boot of 2.4.17... I haven't bothered with it until now.

  38. Re:Now aren't you glad you use Free/Net/Open BSD(n by Anonymous Coward · · Score: 0

    If one were using *BSD, AGP wouldn't be that well supported in the first place. :-)

  39. How-To: lilo workaround by Anonymous Coward · · Score: 4, Redundant

    If you're using lilo, and just want to apply the workaround quickly, edit /etc/lilo.conf.

    Before the first image= line, insert the line:

    append="mem=nopentium"
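A minimal sketch of the edit described above, for anyone who would rather script it than open an editor. It only works on the text of the file; the function name `add_lilo_append` and the sample config are made up for illustration, and it does not try to merge with an existing global append= line that carries other options.

```python
def add_lilo_append(conf_text, option="mem=nopentium"):
    """Return lilo.conf text with an append= line before the first image= line."""
    lines = conf_text.splitlines()
    # Leave the text alone if an append line already carries the option.
    if any(option in line for line in lines if line.strip().startswith("append")):
        return conf_text
    for i, line in enumerate(lines):
        if line.strip().startswith("image"):
            # Global options must come before the first image= section.
            lines.insert(i, f'append="{option}"')
            break
    return "\n".join(lines) + "\n"


# Hypothetical config, for illustration only.
sample_conf = "boot=/dev/hda\nprompt\nimage=/boot/vmlinuz\n\tlabel=linux\n"
print(add_lilo_append(sample_conf))
```

Keep in mind that lilo only picks up changes to /etc/lilo.conf after /sbin/lilo has been run again, so rerun it before rebooting.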

    1. Re:How-To: lilo workaround by Anonymous Coward · · Score: 0

      and this gets to +4? I thought people here know what they are dealing with, someone talks about how to modify lilo.conf and it gets +4? Sheesh!

  40. Does this happen if kernel compiled for K7? by Nicolas+MONNET · · Score: 4, Interesting

    The article says it happens when the kernel is compiled for Pentium processors; but does this happen if the kernel is compiled for a K7?

    By the way, I had to shelve my nVidia card a couple months ago because of this ... I have an Athlon and it kept hard freezing. The bug doesn't happen with a Voodoo card.

    1. Re:Does this happen if kernel compiled for K7? by Anonymous Coward · · Score: 0

      Nvidia's drivers have been utter garbage lately.
      In an effort to increase performance, they removed sanity checking that they believed was not necessary. They were wrong.

    2. Re:Does this happen if kernel compiled for K7? by Anonymous Coward · · Score: 0


      Funny, after telling XFree to use the nvidia gart instead of the XFree gart, my system works fine and runs UT and QIII about 5-10% faster (and more stably) than Windows 98 or XP.

      AG

    3. Re:Does this happen if kernel compiled for K7? by DeeKayWon · · Score: 2

      I assume so. Since PSE is supported in Athlons I would think the kernel people would enable it for a K7 compile.

      I would think that only people who compile their own kernels and those who use Mandrake would be affected by this since pretty much everyone else compiles for 386, which would turn off the use of the PSE capability.

    4. Re:Does this happen if kernel compiled for K7? by zsazsa · · Score: 2

      The bug doesn't happen with a Voodoo card.

      That's because Voodoo cards never used AGP. Sure, they may have fit into an AGP slot but they functioned more or less like PCI cards. (I believe this is true even for the last Voodoo 4/5 generation.)

      Ian

  41. F00F by srichman · · Score: 2
    As far as I remember, the bug in the original pentium was a floating point flaw that led to wrong calculations under certain circumstances.
    No, I think the analogous bug the parent was referring to was the F00F bug, which would hang Pentiums, regardless of OS, even for unprivileged users.
  42. Lawyers by Anonymous Coward · · Score: 1, Funny
    (its over to the top right)

    Oh my God, AMD makes you read a 7000-character licensing agreement in order to download a 334 byte patch. And people think the GPL is bad ...

    1. Re:Lawyers by Anonymous Coward · · Score: 1, Interesting
      Oh my God, AMD makes you read a 7000-character licensing agreement

      In fact, this legally prevents Linux developers from using this patch, because it explicitly states that you may not "reverse engineer" it.

      Imho AMD have a serious problem in the relationship towards linux. Intel has helped gcc development, they provide their own optimised compiler free of charge and they work together with kernel developers. Seems like AMD on the other hand couldn't care less. Don't get me wrong, I like AMD CPUs. But what use is a good CPU if it is not supported properly?

    2. Re:Lawyers by Anonymous Coward · · Score: 0

      It is supported properly... on all the platforms that matter.

  43. The equivalent Win2k bug fix by LadyLucky · · Score: 3, Informative
    can be found here

    Funny, I knew something was wrong...

    --
    dominionrd.blogspot.com - Restaurants on
  44. Buggy Features by Perdo · · Score: 5, Funny

    MShaft: "Not-a-bug-it's-a-feature"

    Intel: "Not a bug it's erratum."

    VIA: "We slowed it down to keep it cool."

    Nvidia: "That was a leak! We are not doing public driver beta testing!"

    ATI "Who the hell plays Quack3?"

    AMD "the patch is here"

    --

    If voting were effective, it would be illegal by now.

    1. Re:Buggy Features by Anonymous Coward · · Score: 0, Insightful

      Linux Nut: "It's not a bug.. you're just too stupid... god you disgust me... fix it yourself moron.. wait, you're too stupid."

  45. The guys who found the bug... by GdoL · · Score: 2, Funny

    ...seems they work for Intel. Their description was:
    "It's a major bug. We don't know how it happened. We will ask marketing. We don't remember ever selling that chip.".

    :-))

    --

    ------I can please only one person per day. Today is not your day. Tomorrow isn't looking good either.------
    1. Re:The guys who found the bug... by Anonymous Coward · · Score: 0

      Well, I guess it's neither a Linux nor an AMD bug.

      I tested it on an Intel with an Nvidia GeForce; it had the same lockup. I think that the GeForce drivers suck, and they do. Using a Matrox card works for me. It never locked up.

      Grtz and good luck.

  46. Ah, my mistake. :) by Ch_Omega · · Score: 1

    I thought he meant the fdiv bug in the first generation Pentiums.. :)

    My mistake, I guess. It's still a little early in the morning here. ;)

  47. Using Test Suites to Validate the Linux Kernel by goingware · · Score: 5, Informative
    Let me take this opportunity to plug Using Test Suites to Validate the Linux Kernel.

    Thank you for your attention.

    --
    -- Could you use my software consulting serv
    1. Re:Using Test Suites to Validate the Linux Kernel by goingware · · Score: 2
      I realized I hadn't checked the links in the article for nearly a year, and knew that some of them were out of date. All the links are now fixed, and I added memtester, a user-mode RAM test for Unix-like systems.

      Again, the test suite article is here.

      --
      -- Could you use my software consulting serv
    2. Re:Using Test Suites to Validate the Linux Kernel by Anonymous Coward · · Score: 0

      And let me thank you for your plugging yourself for the 1000th time!

  48. Quake 3 benchmarks by Sits · · Score: 5, Informative

    Quake 3 demo was run with \timedemo 1 and \demo DEMO001 . Each test was run three times. The system load average was < 0.5 before Quake 3 was run.

    Without mem=nopentium
    FPS = 79.4 (79.4, 79.4, 79.4)

    With mem=nopentium
    FPS = 79.2 (79.1, 79.3, 79.2)

    System tested:
    Athlon 850, 384MB RAM, Geforce 1 DDR, VIA KT133 Chipset
    Athlon/Duron/K7 optimised 2.4.17 kernel (optimising the kernel above pentium makes very little difference though)
    NVidia 1.0-2313 video drivers using agpgart
    Mandrake 8.0

    Quake 3 settings
    Texture depth = 16 bits
    Colour depth = 16 bits
    Geometric detail = High
    Texture detail = High
    Dynamic lights = On
    Video mode = 1024x768

    Looks like there is a difference, but it's very slight (0.003%), and my benchmarks aren't very scientific. Either way, if there is an improvement in stability this tradeoff is easily worth it. Here's hoping that you don't run linux just for its Quake 3 scores though...

    1. Re:Quake 3 benchmarks by andrewgaul · · Score: 1

      According to your numbers, it's a ~0.3% performance difference, which is still insignificant.
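For anyone who wants to check the arithmetic, here is a small sketch using only the six FPS figures posted above. The helper name `percent_drop` is made up for illustration. The raw fraction comes out at about 0.0025, which is where a figure like 0.003 comes from if the multiplication by 100 is skipped.

```python
from statistics import mean

without_fix = [79.4, 79.4, 79.4]  # FPS without mem=nopentium
with_fix = [79.1, 79.3, 79.2]     # FPS with mem=nopentium


def percent_drop(baseline, changed):
    """Relative slowdown of `changed` against `baseline`, in percent."""
    base = mean(baseline)
    return (base - mean(changed)) / base * 100


drop = percent_drop(without_fix, with_fix)
print(f"{drop:.2f}% slower with mem=nopentium")
```

With only three runs per configuration and a spread of 0.2 FPS in one of them, a difference this small is hard to tell apart from noise.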

    2. Re:Quake 3 benchmarks by Anonymous Coward · · Score: 0

      Thanks for posting these.


      The difference you get is probably lost in the statistical noise, but anyway it seems the performance hit can't be that big, and after Mr. Cox gets a real fix out it'll be even smaller since it's supposed to turn the 4MB paging off only when it's necessary (mem=nopentium turns it always off)

  49. Is it really a bug? by nusuth · · Score: 1

    Well I had the same problem with the same setup, and after days of frustration it turned out to be overheating due to bad thermal contact and air flow. My problem never occurred if I ran only one instance of seti, because cpu affinity of linux kernel sucks, and as a good side effect it can't overheat a single cpu. If single seti does not kill the machine, you should consider thermal problems.

    --

    Gentlemen, you can't fight in here, this is the War Room!

  50. in response to mr troll by Metrollica · · Score: 2, Flamebait

    you are only right on this:

    they add their own tech too, which is why they get different results.

    quote

    Now, the Athlon processor is made by a rival company, AMD. They have
    basically reverse engineered the Intel processors and tried to make a
    processor that operates just like Intel's processors, and then sell them
    cheaper than Intel does.

    This makes it a little more difficult to compare them to the Pentium
    processors. Some things the AMD Athlon actually does faster than a Pentium
    III, some things it does a little slower, and some things it can't do at
    all, while other things the Intel can't do, the Athlon does do.

    quote

    Had AMD had a design ready when Intel released their Pentium, their market share
    wouldn't have dropped to 10%. In the days of the 286, 386, and 486, AMD, Cyrix, and other "clones"
    reverse-engineered the Intel chips. In a sense, it was Intel's design (with maybe a few improvements),
    but it was reverse-engineered so it did not violate patents.

    quote

    But nothing lasts forever. The companies that had built Intel chips under license eventually reverse-engineered the chips and built them license-free. Intel copycats including Advanced Micro Devices (AMD) and Cyrix (a division of National Semiconductor) used the courts to validate their right to copy Intel's chip architectures. And PC manufacturers like Compaq and IBM used these clone chips as a weapon to force Intel prices down. Now the best way for Intel to stay ahead is to simply run faster. Running faster means shrinking product cycles from three years to 18 months by running parallel product development teams and spending more money faster than the other guys. Since Intel has more money to spend, this keeps them in command, but shorter product cycles mean less time to recoup R&D expenses. Hence, those lower margins.

    someone better mod me up for all my work

    --



    --Metrollica
    1. Re:in response to mr troll by Anonymous Coward · · Score: 0

      someone better mod me up for all my work

      Why? Because you googled a bunch of dubious sources out of your expansive asshole? J Random Geocities, a "Scientist" (in IP law?), and CRINGLEY of all people. Ack.

      I, on the other hand, have it from the horse's mouth -- AMD Annual Report:

      In January 1995, we reached an agreement with Intel to settle all previously outstanding legal disputes between the two companies. As part of the settlement, in December 1995, we signed a five-year, comprehensive cross-license agreement with Intel which expired on December 31, 2000. We are currently negotiating a new agreement with Intel but there can be no assurance that a new agreement will be successfully negotiated. The lack of a patent cross-license with Intel could lead to expensive and time-consuming litigation, the outcomes of which could have a material adverse effect on our business.

      Now give me the karma.

    2. Re:in response to mr troll by Anonymous Coward · · Score: 0

      Link does not work

      -Metrollica

    3. Re:in response to mr troll by Anonymous Coward · · Score: 0

      Ok, go here: http://www.edgar-online.com/brand/amd/search/ and find it yourself.

    4. Re:in response to mr troll by Verteiron · · Score: 1

      Intel gets a cut of every AMD cpu that's sold, because AMD is licensing the i386 instruction set. They are not "license-free".

      Why do you think Intel hasn't made more of an effort to really squash AMD? AMD is a revenue source for Intel, plus it keeps Intel from having to worry about Microsoft-ish anti-trust issues.

      --
      End of lesson. You may press the button.
    5. Re:in response to mr troll by Anonymous Coward · · Score: 1, Interesting

      Why do you think Intel hasn't made more of an effort to really squash AMD?

      Plus, some big customers have second-source agreements from Intel.

      Also, until recently, both Intel and AMD have been running at full fab capacity, and Intel hasn't had an incentive to move into the lower profit markets that AMD inhabits. Now that they are going to .13 micron (more capacity in theory), and the market has constricted, they may change course and start competing on price with AMD.

  51. Similar K6 bugs too? by Anonymous Coward · · Score: 0, Interesting

    I had XMMS crashing and completely locking the box several times. I tried accessing my computer remotely too, but it didn't even respond to pings! This may of course be XMMS' fault, but even so it's pretty darn dangerous.

  52. "It does not affect FreeBSD" by Anonymous Coward · · Score: 1, Informative
  53. You're absolutely right by Sits · · Score: 1

    Sheesh - that was quite a mistake (didn't multiply by 100). Man I hope nobody hires me for my mathematical ability...

  54. I'm not sure what to think by hyehye · · Score: 2, Interesting

    I just got a new box, Athlon 1.2GHz... Asus a7a266 mainboard... nice little box for general usage. Soon as I finish moving, I'll get cable modem back and stop using mom's AOL, and I'll go back to Linux. But now I see this, and I'm eyeing my AGP card, and wondering. AMD has earned a lot of respect from me in the last couple years, as I've found the Athlons to be simply the finest x86 CPU's I've ever got my hands on, at great prices with very reasonable motherboards/chipsets as well. Now this. I'm not sure. Yeah, it's an engineering mistake, but I'm not clear on how AMD is handling it, and I hope they don't disappoint me. Sure, you can do a workaround - but as others have asked, what's the story on the performance hit? What about AMD working with the kernel folks to find another, better solution? Or maybe AMD could consider offering serious discounts on new, un-flawed CPU's, for those who are already eyeing upgrades?

    --
    think for yourself, you won't like the results if others do it for you.
    1. Re:I'm not sure what to think by hyehye · · Score: 2

      Also...

      AMD should seriously consider its response to this. The Linux community is well-informed, in general, and has been much quicker in moving to AMD than Windows users (mostly because Windows users are mainly Dell/Gateway/Compaq/Etc customers..), and AMD would do well to make attempts to avoid disappointing us.

      --
      think for yourself, you won't like the results if others do it for you.
    2. Re:I'm not sure what to think by KevCo · · Score: 1

      What better solution? Win2K/XP systems have been running with this workaround in place since September. This means that any benchmarks you've seen done recently have likely been done with 4K paging. Guess what? Athlons still outperform similarly clocked P3/4's. As far as expecting discounts... they are already significantly cheaper than Pentiums, what kinda discount do you expect?

    3. Re:I'm not sure what to think by hyehye · · Score: 2

      Significantly cheaper, they are. Good point. I would assume AMD has razor-thin profit margins from their chips, as well, so discounts would be impossible. *shrug*

      --
      think for yourself, you won't like the results if others do it for you.
  55. Benchmarks by Sits · · Score: 1

    I've posted some quake 3 benchmarks and it looks like the difference may not be significant (less than a half percent). However this is by no means a good test of a heavily loaded system (i.e. a high load average), nor does it test the effect when memory is tight (which is when I guess more paging would take place and the change would be more noticeable).

  56. I had a stroll through AMD erratas by Anonymous Coward · · Score: 3, Interesting
    If I read the various PDFs correctly on AMDs site, all Athlons
    except model 1 (the very first K7 since it didn't have PSE) are affected,
    except the latest revision A5 (cpuid 662) of the Athlon XP, i.e. A0/660 and
    A2/661 are affected as well (similarly all 64x Thunderbirds etc.).
    (there was a model 1, 2, 4 and 6 Athlon, with 6 being XP)

    Some or all Durons might be affected too, but I didn't look at that closely.

    The above hinges on whether this is the correct bug description, feel free
    to flame the anonymous coward if this has got nothing to do with it :)


    "16 INVLPG Instruction Does Not Flush Entire Four-Megabyte Page Properly with Certain Linear Addresses

    Normal Specified Operation. After executing an INVLPG instruction the TLB should not contain any
    translations for any part of the page frame associated with the designated logical address.

    Non-conformance. When the logical address designated by the INVLPG instruction is mapped by a 4-MB
    page mapping and LA[21] is equal to one it is possible that the TLB will still retain translations after
    the instruction has finished executing.

    Potential Effect on System. The residual data in the TLB can result in unexpected data access to stale or
    invalid pages of memory.

    Suggested Workaround. When using the INVLPG instruction in association with a page that is mapped via
    a 4-MB page translation, always clear bit 21."

    (page 7 from Athlon Model 6 revision sheet)

  57. Alternate, faster? workaround by jquirke · · Score: 5, Interesting

    The current workaround gets around this problem by disabling 4M (2M?) pages (PSE). Hence we go back to 4K pages, and mapping large slabs of VM is a little slower and wastes memory (we need another Page table for each slab of 4M) and obviously more TLB misses/space wasted, because to touch the whole 4M region, the CPU needs to do up to 1024 page table lookups instead of 1.
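The 1024 figure and the extra page table can be checked with a few lines of arithmetic. This is only a sketch, and it assumes plain 32-bit paging with 4-byte page-table entries (non-PAE), where large pages are 4M rather than 2M.

```python
PAGE_SIZE_4K = 4 * 1024
PAGE_SIZE_4M = 4 * 1024 * 1024
PTE_BYTES = 4  # assumption: 32-bit page-table entries (non-PAE paging)

# Number of 4K mappings needed to cover what one 4M mapping covers.
small_pages_per_large = PAGE_SIZE_4M // PAGE_SIZE_4K

# Memory taken by the page table that holds those mappings.
page_table_bytes = small_pages_per_large * PTE_BYTES

print(small_pages_per_large, page_table_bytes)
```

So each 4M slab mapped with 4K pages costs one extra 4K page table, and up to 1024 TLB entries instead of one.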

    As discussed this may have performance implications.

    According to the AMD docs, the problem is only when flushing TLB entries with INVLPG and the page is a 4M page, _and_ the virtual address's bit 21 is set (which does not affect the 4M block of memory the address is in - eg: 0x400000 (2^22) vs 0x600000 (2^22|2^21) are both in the second 4M block).

    Hence, when invlpg'ing a VA we just need to INVLPG(address & ~(1 << 21)). This only requires a single ANDL instruction. But we need to distinguish a 4M page first though, so I don't know?
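The masking step can be tried out as plain address arithmetic. The sketch below is not kernel code (INVLPG is a privileged instruction), and the function names are made up for illustration; it only shows that clearing bit 21 keeps the address inside the same 4M page.

```python
LARGE_PAGE_SHIFT = 22       # a 4M page covers 2**22 bytes
ERRATUM_BIT = 1 << 21       # the bit the erratum says to clear


def safe_invlpg_address(addr):
    """Address to pass to INVLPG for a 4M-mapped page: bit 21 cleared."""
    return addr & ~ERRATUM_BIT & 0xFFFFFFFF


def large_page_index(addr):
    """Which 4M page a 32-bit linear address falls in."""
    return addr >> LARGE_PAGE_SHIFT


addr = 0x600000
fixed = safe_invlpg_address(addr)
print(hex(fixed), large_page_index(addr) == large_page_index(fixed))
```

Because bit 21 sits below the 22-bit page offset boundary, the masked address always names the same 4M page, which is why the mask is harmless for large pages. Applied blindly to a 4K mapping it would point at a different page, hence the need to know the page size first.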

    Heck maybe we should just do it the FreeBSD way and recursively map the Pagedir :-)

    Any ideas? Will this work?

    --JQuirke

  58. would ECC ram correct the error? by Anonymous Coward · · Score: 0

    Don't know how ECC works exactly but would it possibly correct such an error on the fly?

    Anyone out there know?

  59. What bloody bug? by DABANSHEE · · Score: 5, Funny

    None of the Athlons or Durons I've built have had any problems with Tux Racer (Mostly on Man8.1 default install).

    My nephew spends hours sliding that little penguin around with that bloody elevator music going, & not once has there been a freeze or lockup, much to my disappointment.

    1. Re:What bloody bug? by El+Prebso · · Score: 1

      Works great for me too. TuxRacer and Chromium run nicely. However Quake3 sometimes freezes, but I think that's a bug in Quake or my Matrox drivers.

      --
      I didn't say it was your fault. I said I was going to blame it on you.
  60. Other Hackers did it better . . . by Jeff+Kelly · · Score: 5, Informative
    Here is a Posting from Terry Lambert on the FreeBSD -stable Mailing List regarding this "Bug".
    Maybe it sheds some light on this issue.


    > Recently I found Linux 2.4 kernel is affected by the
    > bug of extended paging in AMD Athlon through the
    > following link. I don't know if FreeBSD is also
    > affected.
    >
    > http://linuxtoday.com/news_story.php3?ltsn=2002-01-21-001-20-NW-KN

    I am well aware of this bug.

    It does not affect FreeBSD, which only uses 4M pages for
    the first 4M of the kernel itself.

    I've worked on code that enables 4M pages on other memory
    used in FreeBSD, that had this problem, but only if you
    were really stupid in your allocation mechanism.

    There's a workaround for this problem which is fairly
    trivial to implement in software, and should probably be
    done when 4M pages are enabled, if you are using an Athlon,
    and are adding 4M pages.
    [...]
    In any case, this will not be a problem for FreeBSD, and is
    only a problem for Linux because of the strange way they
    initialize things.
    1. Re:Other Hackers did it better . . . by jelle · · Score: 2, Informative

      When an OS doesn't use a CPU feature (4M pages, using it just for the kernel doesn't count), that doesn't make the hacker better, it makes the OS not taking advantage of all CPU features (and therefore not running into the related CPU bugs...).

      So this guy tried to do 4M pages, it didn't work well (he encountered the bug), and decided not to implement 4M pages at all. And for Linux, the guys just happened to implement 4M pages long before AMD created the processors with the bug.

      Different history, all good hackers.

      --
      --- Hindsight is 20/20, but walking backwards is not the answer.
    2. Re:Other Hackers did it better . . . by Jeff+Kelly · · Score: 2, Informative

      When an OS doesn't use a CPU feature (4M pages, using it just for the kernel doesn't count), that doesn't make the hacker better, it makes the OS not taking advantage of all CPU features (and therefore not running into the related CPU bugs...).


      Read again. The Posting states that "I've worked on code that enables 4M pages on other memory
      used in FreeBSD, that had this problem, but only if you
      were really stupid in your allocation mechanism."

      He encountered the Problem in his _own_ code and fixed it there. He also states: "There's a workaround for this problem which is fairly
      trivial to implement in software, and should probably be
      done when 4M pages are enabled, if you are using an Athlon,
      and are adding 4M pages." He very clearly states that 4M pages are not currently supported in FreeBSD (should be in 4.5) but that a workaround exists. (And it is _not_ deactivating the 4M paging as in linux).

      So although they are not affected by the Bug because they do not use that particular feature at least they know that it exists and they do have a workaround ready _now_ so that by the time this feature is implemented this bug will not cause any troubles. Which is more than I can say about the Linux hackers, which don't even bother to read the docs provided by AMD.

    3. Re:Other Hackers did it better . . . by rew · · Score: 2

      Better? Better???

      They decided not to use the (albeit small) performance benefit that the processor offers, and then claim to be better.

      Yeah. Right.

      Roger.

    4. Re:Other Hackers did it better . . . by jelle · · Score: 1

      I read it, and "I've worked on" doesn't translate into "It's in there, we're using it", it translates into "somebody took a look at it".

      --
      --- Hindsight is 20/20, but walking backwards is not the answer.
    5. Re:Other Hackers did it better . . . by Anonymous Coward · · Score: 0

      Because everyone is an idiot.

    6. Re:Other Hackers did it better . . . by Anonymous Coward · · Score: 0

      Actually, it has nothing to do with BSD snobbishness. I know Terry Lambert from school and whatever he happened to be doing was always the best way and the only one that made any sense -- in his mind, at least.

    7. Re:Other Hackers did it better . . . by Anonymous Coward · · Score: 0

      >I know Terry Lambert from school and whatever he
      >happened to be doing was always the best way and
      >the only one that made any sense -- in his mind,
      >at least.

      That's exactly what's meant by BSD snobbishness.

      The type of person who becomes a BSD snob is usually a fairly smart but egocentric person.

    8. Re:Other Hackers did it better . . . by Anonymous Coward · · Score: 0

      Like RMS? Or Linus? Or AC? Riiiight...chode.

  61. could you be anymore vague by Metrollica · · Score: 0, Flamebait

    this discussion is over. you lose. that quote says nothing about cpu's. quit giving fake links. stop trolling

    --



    --Metrollica
    1. Re:could you be anymore vague by Anonymous Coward · · Score: 0

      You're right, it was actually patents relating to secret Intel Nazi gas chamber tech. Thanks to the advanced USian intellectual property system, our world leading genocidal technology was kept out of the hands of a third world company such as AMD. Sorry for trolling.

  62. grub workaround by chongo · · Score: 3, Redundant
    If you're using grub and want a quick but effective workaround, then edit your grub.conf file, which is usually /boot/grub/grub.conf or under /etc. At the end of any line that begins with the word kernel add:

    mem=nopentium

    For good measure, re-install your grub config by running:

    /sbin/grub-install /dev/hda

    Where /dev/hda is your boot disk. For most PC users with IDE drives, it will be /dev/hda .

    Last, just reboot.
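The edit above can also be scripted. This is a minimal sketch that works on the text of the file only; the function name `add_kernel_option` and the sample entry are made up for illustration, and it assumes the old-style grub.conf layout where each boot entry has a line starting with the word kernel.

```python
def add_kernel_option(conf_text, option="mem=nopentium"):
    """Append `option` to every kernel line of a grub.conf that lacks it."""
    out = []
    for line in conf_text.splitlines():
        words = line.split()
        # Only touch lines whose first word is exactly "kernel".
        if words and words[0] == "kernel" and option not in words:
            line = line.rstrip() + " " + option
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical boot entry, for illustration only.
sample_conf = (
    "title Linux\n"
    "\troot (hd0,0)\n"
    "\tkernel /vmlinuz ro root=/dev/hda2\n"
    "\tinitrd /initrd.img\n"
)
print(add_kernel_option(sample_conf))
```

Running it twice leaves the text unchanged the second time, so it is safe to reapply after a kernel upgrade adds new entries.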

    --
    chongo (was here) /\oo/\
    1. Re:grub workaround by xZAQx · · Score: 1

      Thank you for posting this; it's hard to find Grub support/users.

      Silly redhat default.

      --

      We dance to all the wrong songs.
      --Refused.
  63. AthlonXP not affected by toofast · · Score: 2, Funny

    From AMD's website:

    Note: This patch is not needed for Windows XP

    1. Re:AthlonXP not affected by Anonymous Coward · · Score: 0

      No, that just means Windows XP already has a Microsoft-implemented compatibility mode that was introduced in a Microsoft-provided patch for Win2K (the registry entry is not needed if you have already upgraded Win2K to SP2)

    2. Re:AthlonXP not affected by dougmc · · Score: 2
      Note: This patch is not needed for Windows XP
      As much as AMD would like you to think otherwise, Athlon XP != Windows XP.
    3. Re:AthlonXP not affected by Anonymous Coward · · Score: 0

      You dumbass! Athlon XP is not the same as Windows XP, only dumbasses like you could confuse the two. Windows XP is not affected because it has the patch built in.

  64. Re:Awwwww by Anonymous Coward · · Score: 0

    1. Who said Linux *never* crashes? It crashes, but not nearly as often as the OS you're obviously running..

    2. Which OS can withstand hardware failures?

    3. Go LEARN SOMETHING, you nut.

  65. Not only Athlons by Morgahastu · · Score: 1

    I had the same problem but with my Pentium 4. When I installed the nvidia drivers the system became very unstable and crashed often. But hey, I could start tux racer and that's the reason I installed it in the first place. But that problem was only in the latest mandrake release (8.1), it worked fine with RedHat (7.2).

  66. ha ha, where's the problem? by Erris · · Score: 2
    When you're an artist dependent on OpenGL, you can't have problems like this.

    Of course if you were in that situation, you must not have noticed.

    Bug or no bug, my machines have been running just fine. I bought them based on reviews that showed them running circles around Intel and they did. At the speeds the newer machines run, I'd hardly notice if they were hanging.

    --
    DMCA, Hollings, Palladium. What might have sounded like paranoia is now common sense.
  67. bad form AMD, realy bad form by budgenator · · Score: 3, Troll

    A lot of your market share is there only because we who use Linux® have stuck by you. We have been ridiculed because we are using an "off-brand" processor, we've rationalized away thermal problems and fragile cores to get the benefit of more bang for the buck. We have suffered through inadequate compiler support, until your market share has grown to the point where an honest push onto the main-stream desktop is possible.

    And what do we get for it? No real support? Write your own fix? No; that we can, and often do. What we got was forgotten; you didn't even tell us. We are used to and demand full disclosure, and in real time. Linux people hang their dirty laundry out in public to give everyone a fair and equal chance at a fix.

    We're often treated as a minority because we are, but treat us as a second class minority at your own peril. In short don't ever let the marketing weenies convince you to hide something from us; if we wanted to be treated that way we would use Win/Intel products.

    --
    Apocalypse Cancelled, Sorry, No Ticket Refunds
    1. Re:bad form AMD, realy bad form by Anonymous Coward · · Score: 0

      At first, I snickered at your comment that we're used to full disclosure.

      Maybe I'm too used to the old days, when I'd have a video card sitting in a box and I'd be hitting the XF86 website each day, trying to see if they had drivers yet (Yeah, yeah, write your own drivers. You should see some of my code from when I was just learning C. Hideous. ;))...

      But you know what?

      IBM supports Linux. Freakin' *IBM*. I don't know of a single person I know who hasn't at least *heard* of Linux, even if they don't use it.

      There's really no excuse for AMD not talking about this bug. They can't even use the excuse that they didn't know who to talk to - they could've always mailed that crazy guy from Finland.

      Well, my past three processors were from AMD. I had intended for the next to be AMD as well, but I may have to reconsider that. Then again, Intel has probably been less than forthcoming in regards to Linux developers as well..

      Still, this sort of thing isn't good(tm). I still know people who won't use Intel processors because "4/2=1.9999999999999999999999". People see hardware bugs, and they stick in the mind, even if they are fixed.

    2. Re:bad form AMD, realy bad form by Anonymous Coward · · Score: 0

      IBM supports Linux

      And how many CPUs does AMD sell to IBM every year? None that I know of.

      I think the root of the problem is that companies get into Big Customer thinking. None of AMD's major OEMs (primarily in home/consumer space) list "Linux" as a requirement (or "NetWare" or "SCO UNIX", etc), so AMD doesn't take notice. However, from the noise on the Internet, they seem to be selling tons of CPUs to the hobbyist/dabbler and whitebox markets, and you'll find a lot of Linux users there.

    3. Re:bad form AMD, realy bad form by Anonymous Coward · · Score: 0

      might I remind you of the pentium3 1.13ghz processors that were unstable? they could not compile a linux kernel.

  68. Are all of them affected? by theEdgeSMAK · · Score: 1

    I ran a T-bird 900 with a GeForce3 for a while on 2.4.17 using NVagp and never had one bit of a problem. Now I have an XP 1700+ and still have no problem, though I've switched my board from a kt133a to a kt266a and have to use agpgart because the NVidia driver doesn't support my chipset. Still perfectly stable (I've noticed one or two graphical glitches in glx apps lately but never a lockup or a crash).

  69. (Meta)"Wasting" points on AC comments by yerricde · · Score: 1

    Hey moderators, don't waste your points modding this AC up to +5,

    It's not a waste if the content of the comment deserves it.

    that's a total of 6 points

    No, ACs start at 0. It takes 5 points to get them up to Score:5.

    AC's don't get karma remember?

    And neither do experienced users thanks to the karma kap. Karma determines only who gets to moderate and who gets to post at 2. It does not equal penis size. In fact, since Slashdot upgraded to Slashcode 2.2 with a messaging system to tell me when I've been modded or replied to, I haven't even looked at my karma.

    --
    Will I retire or break 10K?
  70. been fixed? by csbruce · · Score: 2

    So, if it was discovered over a year ago, was this hardware bug ever fixed? We bought a dual-athlon 1.53-GHz (1900+?) machine recently; do these processors still have the bug?

  71. Re:YES AMD BASHING by Anonymous Coward · · Score: 1, Interesting

    I certainly wouldn't mind bashing AMD, and not just for broken processors.. Their whole socket A platform is just plain unreliable.

    They just don't have critical people taking a stance against them, like Intel, so AMD never has to recall anything.

  72. Wow... by Greyfox · · Score: 2
    I'd been noticing this for ages -- just about anything that does GL will almost always hang my system (Oddly enough, I have never noticed this with Tuxracer.)

    I'd always assumed that it was just a crappy AGP implementation on my no-name motherboard, as I'd been following the Mesa/GLX groups for a while and hadn't seen the problem mentioned all that much. It's nice that there's a relatively easy fix for the problem. Maybe now I can get back into Tribes2 again :-)

    --

    I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

    1. Re:Wow... by Peyna · · Score: 1
      I had the described problem with Tux Racer using FreeBSD 4.4 and Redhat 7.2 (I have no clue what version of what else was installed, but I would guess a very recent kernel.)

      My hardware setup is as follows:

      ASUS A7A266, Athlon 1.3 GHz, Geforce 2 GTS Pro 64.

      --
      What?
  73. You want vectors not huge integers by yerricde · · Score: 2, Insightful

    For example, you can pass around 8-byte structures in a single register, which is damn useful given the lack of available registers in the x86 architecture.

    And when you want to use or change one byte in the structure, what do you do? Shift it out and put it in another register. You can beat the "lack of registers" argument by switching to any current architecture but x86; you'll get at least 16, most likely 32, or even 64 registers.

    And with the 64-bit representation, you can do this with just one subtraction and one branch rather than a combination of two subtracts and two branches.

    One problem with your algorithm: one subtraction will "carry" over into the next because the processor assumes you're subtracting whole integers. What you want isn't really 64-bit integers but rather vector SIMD as found in MMX, SSE, and 3DNow!. In fact, AltiVec on the G4 processor is 128-bit.

    --
    Will I retire or break 10K?
    1. Re:You want vectors not huge integers by mikera · · Score: 1

      There's no problem with the algorithm. The carrying doesn't make a difference - you don't care about the result because you're using it to sort and merge (i.e. you only care whether the result is less than, equal to or greater than zero).

      Don't remember all the SIMD instructions, but I think that would actually be slower as you would need to check the subtraction results for each component. Correct me if I'm wrong....
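
      A minimal sketch of the idea being argued here, under assumptions that are not in the thread: the 8-byte record is taken to be two 32-bit fields, and the names `pack_key` and `compare_keys` are hypothetical. It uses plain comparisons instead of the subtraction mentioned above, because a signed 64-bit subtraction can overflow; the point is only that the whole record is ordered in one step.

```c
#include <stdint.h>

/* Hypothetical 8-byte record: two 32-bit fields packed into one 64-bit
   value so that the more significant field decides the order first. */
static uint64_t pack_key(uint32_t major, uint32_t minor)
{
    return ((uint64_t)major << 32) | minor;
}

/* Three-way compare for sorting and merging: returns a negative value,
   zero, or a positive value. Only the sign of the result matters. */
static int compare_keys(uint64_t a, uint64_t b)
{
    return (a > b) - (a < b);
}
```

      On a 32-bit machine the same comparison has to be done field by field, which is where the second compare and branch come from.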

  74. Re:Who will be president in 2004? by Anonymous Coward · · Score: 0

    I bet it will be Tom Daschle.

    You lose, fuckwit. Even if Bush fails to get reelected, the new President would take over Jan 2005.

    You eurotrash are so stupid.

  75. Even simpler workaround by Anonymous Coward · · Score: 0

    Just wipe Linux & run a *BSD.

  76. Please explain by RelliK · · Score: 2

    Could somebody with more knowledge explain why you need 4MB pages in the first place? Pages are supposed to be small for a reason. With 4MB pages, internal fragmentation would go through the roof. It's almost like not having paging at all. I don't understand why this option is even available and used.

    --
    ___
    If you think big enough, you'll never have to do it.
    1. Re:Please explain by rew · · Score: 2

      ... why you need 4MB pages in the first place?

      ... Internal fragmentation would go through the roof.


      Think about it. What ISN'T fragmented?

      Well two things: The linearly mapped image of main memory for direct access by the kernel and the video memory.

      That's exactly what it's used for. For every 4MB of kernel memory that you touch, the CPU would have to load 1024 page table entries using 4k pages, and just one for 4M pages. Each of them goes into the TLB and would put pressure on the TLB to evict other entries.

      Still, you'd have a hard time measuring a difference.

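      {
          /* arr is walked column by column; with 4-byte ints each row is 4096 bytes */
          int tot[1024], arr[1024][1024], x, y;

          for (x = 0; x < 1024; x++) {
              tot[x] = 0;
              for (y = 0; y < 1024; y++)
                  tot[x] += arr[y][x];   /* each access lands in a different 4k page */
          }
      }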

      will show something close to the maximum possible difference: You're accessing 1024 different pages, requiring a page table lookup for every single access... Swap the y and the x in the arr[y][x] and the difference has shrunk by a factor of 1000.

      Roger.
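
      The arithmetic in the comment above can be sketched as follows. This is only an illustration, assuming 4-byte ints and the classic x86 page sizes; the function names are made up for the example.

```c
#include <stdint.h>

/* Number of page-table entries needed to map `bytes` of memory with
   pages of `page_size` bytes (rounded up). */
static uint64_t entries_needed(uint64_t bytes, uint64_t page_size)
{
    return (bytes + page_size - 1) / page_size;
}

/* Distance in bytes between arr[y][x] and arr[y+1][x] for a row-major
   array with `cols` elements per row of `elem_size` bytes each. */
static uint64_t row_stride_bytes(uint64_t cols, uint64_t elem_size)
{
    return cols * elem_size;
}
```

      With 1024 ints per row the stride equals one 4k page, so walking down a column touches a new page on every access, while a 4M page covers the whole array.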

    2. Re:Please explain by RelliK · · Score: 2

      I suppose you mean that 4KB and 4MB pages can be used interchangeably and the OS can specify if it wants to allocate a 4KB or a 4MB page, correct? It is inconceivable that 4MB pages would be used exclusively.

      --
      ___
      If you think big enough, you'll never have to do it.
    3. Re:Please explain by rew · · Score: 2

      4M pages is not a "mode" that you would put the CPU in. It is a per-table-entry thingy.

      For specialized applications (say a machine with 64G of RAM, running HUMONGOUS applications) you might choose to run with 4M pages exclusively. Linux doesn't.

      In practise you would still use 4k pages for userspace, and 4M pages where it is immediately obvious that it doesn't make sense to use 4k pages.

      Roger.
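
      To illustrate "per-table-entry": in 32-bit x86 paging, bit 0 of a page-directory entry is the present flag and bit 7 is the page-size (PS) flag, which selects a 4M mapping when the CPU has PSE enabled. The sketch below only decodes those two bits; the macro and function names are chosen for the example and are not taken from any kernel source.

```c
#include <stdint.h>

#define PDE_PRESENT (1u << 0)  /* entry is valid */
#define PDE_PS      (1u << 7)  /* entry maps a 4M page directly */

/* Returns 1 if this page-directory entry maps a 4M page, 0 if it is
   not present or points to a table of 4k pages. */
static int pde_maps_4m_page(uint32_t pde)
{
    return (pde & PDE_PRESENT) && (pde & PDE_PS);
}
```

      Because the flag lives in each entry, neighbouring entries in the same page directory can mix 4k and 4M mappings.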

  77. Just my luck.. by attackiko · · Score: 1

    I finally throw away my old P-100 with the division bug and buy an Athlon, this happens.

  78. Hmm, Win2k needs patched, Linux needs boot option by gosand · · Score: 2

    I find it rather interesting that for Win2k, you needed to install a patch. For Linux, you can just edit your bootloader with an option, and it does the same thing. Which seems more robust?
    Granted, the Win2k patch was probably just a registry tweak, but which could the average user do more easily? Which operating system gives more information to its users?

    --

    My beliefs do not require that you agree with them.

  79. That presumes that many happen to HAVE other cards by Svartalf · · Score: 2

    I'm going to have a G400, but that's because I'm moving a card over from my main machine that's a P3-600 until I can afford another card. Most people getting an Athlon are looking for maximal speed (Who isn't?) so they're going with the NVidia cards because they're "fully" supported with all functionalities including T&L supported (The Radeon doesn't have T&L right at the moment and the top of the line one is a different card w/no support right at the moment...). Most of the Athlon crowd is going to have NVidia cards unless they're insistent about having everything Open Sourced. There's nothing wrong with that position, but since the profile indicates that there's not going to be as many people with other cards, how would they see other AGP cards having this problem?

    --
    I am not merely a "consumer" or a "taxpayer". I am a Citizen of the State of Texas
  80. AMD compatibility by mrm677 · · Score: 1

    I've always been an Intel-supporter. At least their chips don't burn up if your heat sink falls off. And their power consumption has generally been half of AMD's chips.

    I avoided the early AMD chips because of compatibility issues. However I was planning on trying an AMD Duron chip because I thought that AMD had a decent chip out now. Guess I'm wrong...

    Buying AMD will eventually bite you in the ass somehow someway.

    Go ahead...flame me.

    1. Re:AMD compatibility by Anonymous Coward · · Score: 0

      go back to your cave, you ignorant troll. at least AMD doesn't release processors at a faster speed than they can stably run at (might I point out the p3 1.13 ghz recall?)

    2. Re:AMD compatibility by Anonymous Coward · · Score: 0

      I'd like to see a heat sink that just "falls off." Either you bought a cheap heat sink/fan or are too dumb to install it. I'd also like to see you remove a heat sink from a running p3. I'm 99% sure that it will crash at some point during the day.

      Intel = bites you in the ass everyday. Stupid floating point bugs, various incompatible RAM architectures. Not to mention the biggest flaw of all: that GODDAMN STUPID x86 ARCHITECTURE! Horror of horrors. Finally someone like AMD comes around to make sense of it all. I'd say they are doing a great job of polishing Intel's hack jobs.

    3. Re:AMD compatibility by kweerboi · · Score: 1

      That's a troll post if I've ever seen one.

      Let me ask you how often your heatsink has just fallen off? Must be quite often for you to even consider that a serious risk and go Intel instead. I feel sorry for your video card, must suck to be a landing pad for your HSF. =P

      This is a minor bug, and if you want to talk about getting bitten in the ass, lets remember Intel's nice little i820 MTH "erratum", not to mention the 1.13GHz Pentium !!! "erratum".

      But hey, it's your cash. =) Feel free to spend on overpriced components from a company that sold its soul to Rambus.

      --
      Most people would rather die than think; in fact, they do so. - Bertrand Russell
    4. Re:AMD compatibility by Ziviyr · · Score: 1

      In Monkey Island I had to feed a troll a Red Herring.

      I guess it doesn't work well in reality, the opposite is happening.

      --

      Someone set us up the bomb, so shine we are!
    5. Re:AMD compatibility by mrm677 · · Score: 1

      I just purchased a 900 MHz T-Bird Athlon for another Linux box today. Hopefully it can turn my opinion of AMD...

  81. Not just the Athlon? by Anonymous Coward · · Score: 0

    I also noticed this problem just this weekend. My fiance started to enjoy playing the game (tuxracer), but it would hang the entire system once in a while. However, I've got a 933MHz pentium III, with a geforce2 mx (and using the nvidia accelerated GL drivers for that board, not the ones that come with XFree86)

    Obviously, this can't be the same problem, but perhaps related in some way?? I've never had my system just completely hang (can't change virtual terminals, or ctrl-alt-del) before.

  82. To avoid further problems of this sort, in future by jd · · Score: 1
    AMD must start handing out large quantities of Athlon-XP 2 GHz chips to all pro-Linux geeks on Slashdot, for a thorough screening of any other bugs that might be in the chip.


    All in favour, say "Aye!"

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  83. AMD Rev A5/CPUID 662 by lanalyst · · Score: 2, Informative

    Recently purchased 2 XP 1600+s (1 in Dec and 1 in Jan) - both indicate they are Rev A5 (CPUID 662) and do not have the INVLPG bug according to AMD's errata sheet.
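
    For anyone wanting to read such a number, here is a small sketch, assuming "CPUID 662" is the hexadecimal processor signature (0x662) returned in EAX by CPUID function 1. The helper names are hypothetical, and extended family/model fields are ignored because they are not needed for a value this small.

```c
#include <stdint.h>

/* Base fields of the x86 processor signature (CPUID function 1, EAX). */
static unsigned cpuid_family(uint32_t sig)   { return (sig >> 8) & 0xf; }
static unsigned cpuid_model(uint32_t sig)    { return (sig >> 4) & 0xf; }
static unsigned cpuid_stepping(uint32_t sig) { return sig & 0xf; }
```

    Under that assumption 0x662 reads as family 6, model 6, stepping 2; which revisions carry the fix is something only AMD's errata sheet can answer.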

    1. Re:AMD Rev A5/CPUID 662 by Anonymous Coward · · Score: 0

      Except you can no longer load the microcode patch reliably[1]. :(

      [1] http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24332.pdf

  84. 64 bit Performance by digitalEric · · Score: 2, Informative

    Yes, UltraSPARCs run significantly slower in 64 bit mode. IIRC, this is because it takes more instructions to load 64 bit constants and access 64 bit pointers. This is not true of all 64 bit processors -- and it is not true of x86-64.

    The x86-64 architecture allows 64 bit programs to take advantage of the extra precision (and doubles the number of general-purpose registers, which x86 desperately needs), without forcing them to take the performance hit of using the full 64 bit addressing. It also adds a new, IP-relative addressing mode, which makes position-independent code (i.e., shared libraries) much more efficient. There will be an increase in code size (and possibly a performance drop, but this depends on how AMD implements the 'movabs' instruction) when you start using more than 4GB of data. And, when you start using >4GB of code, things get yucky (requiring indirect jumps).

    But, the point is, x86-64 will run all your 32 bit x86 code at full speed, and if you're able to re-compile your programs for 64 bit mode, you should get a performance boost, if only from getting 9 more registers (8 + no longer need to keep a pointer to the GOT).

    1. Re:64 bit Performance by spauldo · · Score: 1
      Ah, sounds good then. I'm afraid I don't know anything about AMD's 64 bit plans, but if it's like you're describing then I'll definitely look into it more.

      --
      Those who can't do, teach. Those who can't teach either, do tech support.
  85. Wow! by Anonymous Coward · · Score: 0

    RedHat 7.2 is a flaming piece of shit! If it's not busy corrupting the ext2 filesystem, it's locking up on __pollwait calls! The entire machine locks up! What a flaming piece of dogshit! Absolutely perfect for AOL to buy! What a stinking, flaming piece of rotting dogshit!

  86. FreeBSD NVIDIA driver project (a bit off-topic) by Anonymous Coward · · Score: 0

    Ok, I just found out about this today, this is probably old news, but I'll post it anyway in case someone missed it.
    http://nvidia.netexplorer.org

  87. Re:Hmm, Win2k needs patched, Linux needs boot opti by Anonymous Coward · · Score: 0

    The average user could most easily install a patch... Duh.

  88. Re:First Menstrual Cycle post by Anonymous Coward · · Score: 0
    goodmorning, spacefem.

    --sdem

  89. SO that's why! by Chanc_Gorkon · · Score: 2

    The other day I left my Dual Boot system (with a Nvidia GeForce 2 MX 400 and NVIDIA drivers) booted into Mandrake Linux for most of the day and it was fine. Of course I was at the system for most of the time. I decided to go to the store and when I came back the system was locked tighter than a drum. No big deal since I run ext3 for the file system. Rebooted and it was fine. How would one add this option to a GRUB bootloader?? I bet if I add it, the screensaver won't lock (Open GL screensaver.....). I don't play a whole lot of games so the texture flakiness would not bother me.

    --

    Gorkman

  90. This is flaw in how Linux is (not) managed by Anonymous Coward · · Score: 2, Insightful

    First off, yes, this is a rather major bug.

    But is it enough to warrant not buying the processor or flaming AMD???? Hardly!

    EVERY piece of hardware out there has some bug in it! Have any of you ever sat down and read the list of errata on Intel parts and the list of how many flaws are fixed in each stepping? The list of bugs fixed over the life of the P3/Celeron core is a rather lengthy document to say the least.

    And I can't really fault AMD at all on this one other than that they HAD a bug...for Win2K etc, they released a fix/patch in very short order and notified everyone rather quickly.

    And don't forget this was back in 2000! What version was the norm for deployed kernels back then (over a year ago!)???

    From what I gather, the 4MB AGP paging didn't show up until kernel 2.4 builds -- which I do not think were final at that time. Regardless, I feel the Linux kernel community should have been a bit more proactive in noting a DOCUMENTED bug and correcting for it.

    Regardless, this bug in no way affects whether or not I would buy an Athlon/Duron. It is basically trivial to workaround and results in almost no performance loss. In essence, my Athlon XP 1900+ with the fix will still beat the crap of most P4 2Ghz machines in 90% of all applications (for half the price).

    This is basically a failing of the entire Linux concept more than a failing of AMD.
    Is there any central authority who regularly checks AMD, Intel, Via, Transmeta, etc. erratum sheets for bugs that might potentially affect the kernel? Based on this, I strongly doubt it.

    Don't get me wrong, Linux is a great OS, but the lack of centralized control and build management is starting to cause problems. There are so many changes to different modules that version dependencies crop up all the time and no one is managing them.

    I am not a big fan of Microshaft myself, but I would put money that they have at least one or two people whose job is to do nothing but monitor the processor manufacturers erratums to make sure no major problems submarine the sales of Windows XP! Bill Gates may be many things, but stupid is not one of them. H*ll, if I was Microshaft, I could have a marketing field day on this one -- it could be a very persuasive argument to lots of upper management types as to why Windows is better than Linux.

    Is this bug a problem? Yes.

    Was the original problem AMD's? Yes.

    Did they address it and notify people? Yes.

    Did anyone in the Linux community actually notice? NO

    Regardless, any bug that can be worked around this easily is not THAT big a deal people...but it does point out some serious flaws in how Linux kernel development is managed. If Linux is to survive, some order had better start arising out of the spiraling chaos!

    So to sum up the appropriate response to this bug: LEARN FROM YOUR MISTAKES AND GET OVER IT!!!!!!!!

    "The sky is falling! The sky is falling!"

    sheesh...

    1. Re:This is flaw in how Linux is (not) managed by ASM · · Score: 1

      um... Actually no. You see the fault lies with AMD solely. I'm not saying that because they had a bug, that happens. I'm saying that because they never announced it.

      Yeah yeah, I know they put it in the errata, and even released a WIN2k patch, but that's not the same as announcing that there's a problem. You see they merely announced that there was a problem with Win2k on athlons (and here's the fix! gee aren't we great?).

      They never said "There's a bug in the athlon". Had they said that, Linux developers would've known there was a problem. As is, all they knew was that there was a problem with win2k, which doesn't mean squat where linux or any other OS is concerned; the problem could've been anything.

      No, it's entirely AMD's fault for not admitting to the whole truth, and saying they made a mistake. (Honestly AMD, no one would've gotten mad, it happens; we understand that. But next time, tell the truth.).

      --
      Fish
  91. Windows was the problem for me by Anonymous Coward · · Score: 0

    Odd, but Windows 2000 was so unstable on an Athlon/Geforce2 system I built last year, that I could only use Linux on the box. Linux has never crashed, but Windows 2000 was blue screening constantly. Applying the patch some time ago, made Windows stable on it, so I could dual boot.

    Odd that RedHat 7.2 didn't seem to stumble over this bug.

    1. Re:Windows was the problem for me by VB · · Score: 1


      Same problems here on an AMDK6-3 on ActiveServer. PIII fixed that. My Athlon has never crashed without my help:
      wtmp begins Mon Feb 14 17:48:19 2000

      01:05.0 VGA compatible controller: 3Dfx Interactive, Inc. Voodoo 3 (rev 01) (prog-if 00 [VGA])
      Subsystem: 3Dfx Interactive, Inc. Voodoo3 AGP

      --
      www.dedserius.com
      VB != VisualBasic
  92. Re:First Menstrual Cycle post by Anonymous Coward · · Score: 0

    Yes, everyone knows CmdrTaco drove away all the women long ago. Slashdot would hardly be a good place for him to find new sex partners if there were women here.

  93. Amazing, all the effort/money to SuSE... by Anonymous Coward · · Score: 0

    As I remember, SuSE was the people being paid to make the 64 bit AMD patch effort.

    They have the 'cozy' relationship with AMD.

    And yet, little gentoo found the 'feature'.

  94. Can registered and ECC RAM help? by starnerd · · Score: 1

    Does anyone know if registered and/or ECC ram help? It seems that if this is a problem of memory corruption, something like ECC ram could at least reduce the chances of corruption.

    1. Re:Can registered and ECC RAM help? by Tazzy531 · · Score: 2, Informative

      It's not a matter of the type or quality of the memory but how the chip addresses the memory. There is a flaw in the chip itself. A layman's analogy might be: suppose a telephone book only lists the first 5 digits of each phone number. What you are suggesting is to replace all the telephones in the world. Even if you do, the phone book still won't work because the phone numbers are incorrect. What has to be fixed is the phone book [or the way of finding phone numbers]. Go here for more technical information.

      --


      _______________________________
      "I'm not Conceited...I'm just a realist..."
    2. Re:Can registered and ECC RAM help? by dakoda · · Score: 1

      it doesn't deal with corrupt ram in the sense that the data the cpu gets is different from what's actually in memory. it has to do with how the cpu manages data. with 4k pages, it properly clears out its table when told to. with 4m pages, it doesn't always clear it (depending on some address bit 21). no kind of ram could fix this (except for maybe using 4M of ram, but i imagine that would be quite useless =)

    3. Re:Can registered and ECC RAM help? by Anonymous Coward · · Score: 0

      Nope. ECC in RAM can only correct memory corruption that happens to the memory array itself.

      CPU forget to flush in this case not a whole lot the 10,000+ flushes toilet bowl cleaner can do.

  95. Re:YES AMD BASHING by Anonymous Coward · · Score: 0

    you're obviously an ignorant moron who has never used an AMD system that was set up on a decent chipset

  96. I just want to know by Vicegrip · · Score: 2

    - what processor rev it's fixed in. I'm wanting to buy a new machine, it's still gonna be AMD, but I don't want a processor with that bug, as I am a big gamer.
    - how to tell if my processor is affected. (I'd rather not have to wait for my system to crash to find out)

    --
    Do not spread "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" over the internet, thank you.
    1. Re:I just want to know by (H)elix1 · · Score: 2

      (fireproofing=true)

      I'll guess if you are a big gamer, you are probably running windows, in which case you should be current on your patches - security or otherwise...

      As a development workstation in Linux - no issues - I've used everything from Slot A, duron, athlon, XP, and as of this week MP (still have that silly grin). I doubt you would notice... I missed it (shrug) and the only down time I've had was for security updates and kernel upgrades. Never done any gaming on the box other than Q3, and that never seemed to be an issue. UT never did work for me, AMD or Intel...

      I expect this will be patched very shortly on the Linux front. BSD and Windows are already fixed.

    2. Re:I just want to know by rew · · Score: 2, Informative

      ...and that never seemed to be an issue.

      The AMD erratum says that it is an issue if bit 21 of an address is actually 1. Thus you may have been lucky in where your video card got mapped.

      Roger.
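
      A tiny sketch of what "bit 21 of an address is 1" means in practice. It only tests the bit; it is not a detector for the erratum, and the function name and sample addresses are made up for the example.

```c
#include <stdint.h>

/* Bit 21 has the value 0x00200000, i.e. the 2M boundary inside a
   4M page. Returns 1 if that bit is set in the address, else 0. */
static int addr_bit21_set(uint32_t addr)
{
    return (addr >> 21) & 1u;
}
```

      An aperture mapped at a hypothetical 0xE0000000 has the bit clear at its base, while one at 0xE0200000 has it set.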

  97. sorry, gotta be a grammar nazi by Anonymous Coward · · Score: 0

    Errata is already plural ;-)
    Erratum is singular

  98. Must be a nvidia driver bug... by Anonymous Coward · · Score: 0

    After hearing Alan harp on nvidia driver bugs and them actually being the catalyst for his "polluted kernel" flag, I'm sure it took quite a bit of work to get AC to not dismiss this as yet another nvidia problem...

  99. Problems with AMD's technical docs! by Anonymous Coward · · Score: 0

    Below is a letter I sent to AMD (and posted on various AMD/hardware sites) regarding the differences and similarities between MP and XP.
    The problem is propagated by AMD's conflicting ignorance, and they are doing nothing about it.

    AMD better learn that if they want to play in the big leagues, technical documents should be treated like the bible, because some of the consumers are also developers.

    "
    The current, accepted belief by some of the more influential hardware internet sites, and magazines (pcmaximmum) is best represented by a quote on anandtech.com

    "The current Athlon MPs that are officially validated for dual processor operation are actually no different than their Athlon XP desktop counterparts.

    There is one small exception to that statement and that is in regards to the L1 bridges on the most recent CPUs that implement the new organic packaging.

    As we pointed out in our Athlon XP 1900+ article, the L1 bridges on desktop CPUs are actually cut from the factory while MP CPUs aren't cut, the bridges simply aren't connected."

    Most AMD users are both performance and money conscious (you guys have a great product), and as a direct result, there is a growing number of consumers influenced by the internet media and also their wallets into using XP in SMP configuration. But some of the consumers seduced into using XP in SMP might start questioning (unfairly) AMD's CPUs' integrity/stability when they start experiencing problems with XP in SMP. Who is to blame? I don't know!

    I believe that this is contrary to AMD's intentions, and for the last few weeks I've been attempting in different forums to show that MP and XP are technically different, therefore XP should not be used as a replacement for MP in an SMP configuration.

    Besides the L1 locked/unlocked difference, I documented 3 other differences between MPceramic, MPorganic, and XP.

    1. MPceramic has an FID_Change state, MPorganic does not have an FID_Change state.

    2. MPceramic has an FID_Change state, XP does not have an FID_Change state.

    3. MPceramic/MPorganic has 8 SCHECK pins, XP has the corresponding pins as "Not Connected".

    Questions/Observations:

    1. Is there any documentation on the FID_Change?

    2. Two different sources (first source a "second-hand" source inside AMD, second source using a multimeter) confirm that the 8 SCHECK pins on the XP are indeed connected to something, therefore they are not "Not Connected". Are the 8 SCHECK corresponding pins on the XP in fact functional, therefore connected?

    In a further attempt to show a difference between MP and XP, I compared the registers on the "PCI Bus: 0 Device : 0" between a system with 2x1600XP and a different system with 2x1600MP. There were 16 registers with different values, 13 due to different hardware configuration, but 3 registers might be due to different processors:

    offset 0x04 bit 8:
    System Error Enable

    offset 0x18 bit 7:
    Reserved bits, not documented - different in XP and MP

    offset 0x50 bit 20:
    Fixed bit (Read Only), defined in the docs as "P transitor strength Value - ... The P value are active low"

    Questions:

    1. Could NMI errors be the result of XP's disabled SERR bit?

    2. What sets and/or determines the value in the 0x50 bit 20 (Read-Only)? (cpu, chipset, bios, or something else)

    3. How can the chipset differentiate between MP and XP, if AMD's own cpuid routine, which I assumed is based on the AMD Processor Recognition Application Note, cannot differentiate between MP and XP?

    In light of the previously mentioned reasons, an almost unanimous internet voice advocating (contrary to AMD) that MP and XP are identical, an ever growing group of consumers using XP in SMP, and an ambiguous AMD statement found in http://athlonxp.amd.com/overview/keyFeatures.jsp

    "266MHz AMD Athlon(TM) XP processor system bus enables excellent system bandwidth for data movement-intensive applications

    Source synchronous clocking (clock forwarding) technology
    Support for 8-bit ECC for data bus integrity
    Peak data rate of 2.1GB/s
    Multiprocessing support: point-to-point topology, with number of processors in SMP systems determined by chipset implementation
    Support for 24 outstanding transactions per processor"

    I think it would be in the interest of all parties, both current and potential SMP/AMD customers, and AMD's own reputation, for AMD to clarify once and for all what are the technical differences between MP and XP.
    "

  100. do not forget to run lilo after that. by ShortSpecialBus · · Score: 1

    Of course, you need to run lilo before you reboot.

    --
    //FIXME: Bad .sig
  101. Annoyed at something else. by Lemmy+Caution · · Score: 4, Insightful
    The article notes that AMD has been proclaiming the bug in public for a while.

    What irks me is this: I got hit with this bug. I posted bug reports to Debian, with NVidia, on different forums, reporting lock-ups in certain OpenGL situations. I got generally hand-waving "read the fucking manual" responses.

    As the article notes, this isn't just a problem with AMD. It suggests that there's an ongoing problem with troubleshooting and resolving the sorts of issues that desktop users are going to have in Linux. (And "paying for support" would not have resolved much, would it have? The problem is the lack of coordination, not the lack of money.)

  102. Re:Hmm, Win2k needs patched, Linux needs boot opti by Anonymous Coward · · Score: 0

    I'm sure the average user is more likely to type in a boot loader option than double click a patch. Yes, you're absolutely right.

  103. Optimised kernels still buggy by Sits · · Score: 2, Informative

    I've posted this elsewhere but to clarify - it looks like this will still happen regardless of which processor you have selected (even i386!). This is because the test for whether your processor does pse seems to be run on startup (I think it's done by arch/i386/mm/init.c __init pagetable_init).

    As an aside, as far as I can tell the only (extra) things that optimising a kernel for a K7 seems to set are gcc options (someone please correct me if I'm wrong).
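
    One way to see what the boot-time test would find is the "flags" line of /proc/cpuinfo, which lists `pse` on processors that report the feature. The helper below is a hedged sketch with a made-up name: it only checks whether a whole-word flag appears in a line of text, and reading the file is left to the caller.

```c
#include <string.h>

/* Returns 1 if `flag` appears as a whole word in `flags` (a line such
   as "flags : fpu vme de pse tsc"), so "pse" does not match "pse36". */
static int flags_contain(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t' || p[-1] == ':';
        int ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';
        if (starts && ends)
            return 1;
        p += len;  /* keep searching after this partial match */
    }
    return 0;
}
```

    This says nothing about whether a given CPU has the erratum; it only shows whether the feature that enables 4M pages is advertised.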

  104. curious by chompz · · Score: 2

    I think the k62 had this problem as well. Anyone know about that?

    --
    Spring is here. Don't believe me, look outside!
    1. Re:curious by Reziac · · Score: 2
      I don't know if it's the =same= bug, but the little rant I posted to an earlier thread (about how AMD blew it off and wouldn't warranty the affected CPU) was indeed about the K6-2 300MHz, in particular a batch mfg'd approx. Sept.1998. I am not the only person I know who had problems with it -- friend couldn't get one of the same batch to run linux (Slackware), either. He whined at AMD tech support until someone there told him under the table that this was indeed a bad batch with a KNOWN bug, even tho AMD to this day has not admitted to it. (Sorry, I don't remember exactly what the bug was.)

      This is NOT the "fast processor" bug that affected some software. We had the same CPU but from a Dec.1998 production batch in an otherwise-identical system, and it had no problems with the same software and setup.

      --
      ~REZ~ #43301. Who'd fake being me anyway?
  105. Not a documented errata by himi · · Score: 2

    It's rather hard to read non-existent documentation. This bug isn't listed in the AMD K7 errata, which is why it wasn't found - the only 'documentation' for this is the Win2k patch that AMD provided.

    Linux and *BSD just do things differently: it's not a matter of one set of hackers being better than the other.

    himi

    --

    My very own DeCSS mirror.
  106. Does this problem occur in the 2.2 kernel series? by jonabbey · · Score: 2

    I've seen a number of mysterious X freezes in XFree86 4.1.0 and earlier on my Athlon/GeForce2MX system with NVidia kernel/X drivers. Most often the X server just seems to lock up when I'm doing nothing in particular. Occasionally I've had the whole system freeze during 3d gaming.

    This is all with Linux 2.2.18. Has anyone commented for sure on this bug in the 2.2 series?

  107. This would explain it... by -ryan · · Score: 1

    Maybe this would explain the extreme flakiness of my box. I've searched and troubleshooted high and low for a solution to the constant crashing and hanging of my system:

    ASUS A7A266
    512MB Crucial DDR
    Athlon 1.4GHz (266FSB)
    Nvidia GeForce2 GTS 32MB
    Red Hat 7.2

    1. Re:This would explain it... by Brackney · · Score: 1

      I'm in exactly the same boat as you. I've really been irritated by near-daily crashes of my (nearly identical) system. I've been frustrated by my inability to diagnose the problem in order to create some sort of useful bug report, and was about ready to start picking up replacement hardware.

      Anxious for the patch and stability once more. :)

    2. Re:This would explain it... by Anonymous Coward · · Score: 0

      The extreme "flakiness" on your box would stem from your use of Redhat.

      Perhaps if you Redhat lamers would compile your own kernel to support the AMD instruction set instead of leaving it as the default Pentium instruction set you wouldn't be having any problems.

      AMD processors run faster, therefore they run hotter. That is basic logic. If you have the "AMD approved" heatsink and fan then you will have no problem with overheating.

    3. Re:This would explain it... by -ryan · · Score: 1

      "Perhaps if you Redhat lamers would compile your own kernel to support the AMD"

      sorry bub, your little generalization there doesn't stick. I configure and compile my own kernel, as well as the majority of my software.

    4. Re:This would explain it... by slardy · · Score: 1

      I was troubled by the same problem. Nice to finally find a solution! Although my system usually only hung when the system was doing certain things: playing games, watching movies, sometimes even playing mp3s.

      --
      http://www.nu-vision.org
  108. I think people noticed it all right... by Sits · · Score: 1

    However, it was difficult to pin down the culprit and now the problem *has* been fixed, proving that the decentralisation of Linux does work, albeit slowly when there isn't outside help. I suppose you can point to FreeBSD as an example that management does help but even over there not everyone knows about it and mistakes are made.

    The problem was nobody knew exactly where this was happening. Things like the proprietary NVidia drivers are frowned upon for not being open (fair enough - it makes it more difficult to tell where bugs are) because it's difficult to prove they are not buggy. Ironically, it was these drivers that were helping to show up this bug. In the end it took people from within NVidia who could check their own code is correct to file this bug simply because they were in the best position to verify it.

  109. Waht about XP? by Anonymous Coward · · Score: 0

    Not to sound stupid but does it affect XP?

    The list of affected CPUs is: Athlon/Duron/Athlon MP

    No XP in the list so am I to assume I am safe?

    1. Re:Waht about XP? by lanalyst · · Score: 1

      cat /proc/cpuinfo

      stepping 2 is rev A5 which according to the AMD Athlon Processor Model 6 Revision Guide does not have the bug.
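
      As a rough illustration of reading those fields, here is a minimal Python sketch that pulls vendor, family, model and stepping out of /proc/cpuinfo-style text. The sample text and the function name parse_cpu_signature are made up for the example; which steppings are actually affected should be checked against AMD's revision guide, not against this snippet.

```python
def parse_cpu_signature(cpuinfo_text):
    """Return (vendor, family, model, stepping) for the first CPU listed."""
    fields = {}
    for line in cpuinfo_text.splitlines():
        if not line.strip():
            if fields:
                break  # a blank line ends the first processor's block
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return (
        fields.get("vendor_id"),
        int(fields["cpu family"]),
        int(fields["model"]),
        int(fields["stepping"]),
    )


# Invented sample in the style of `cat /proc/cpuinfo` output (illustration only).
SAMPLE = """processor\t: 0
vendor_id\t: AuthenticAMD
cpu family\t: 6
model\t\t: 6
model name\t: AMD Athlon(tm) XP
stepping\t: 2
"""

vendor, family, model, stepping = parse_cpu_signature(SAMPLE)
```

      On a real Linux box the same function could be fed the contents of /proc/cpuinfo, e.g. parse_cpu_signature(open("/proc/cpuinfo").read()).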

  110. Near enough by Anonymous Coward · · Score: 0

    Looking at this post it seems like all but the very first and the very latest K7s are affected.

    I dunno if I've had crashes because of this bug but I have definitely seen some weird polygon corruption (in Mesa gears sometimes a poly would come loose). After implementing the workarounds I haven't seen it again...

  111. Yes it is documented by Anonymous Coward · · Score: 0

    Take a look at page 7 of the AMD Athlon™ Processor Model 6 Revision Guide.

    Guess the hackers missed this one...

  112. Re:Hmm, Win2k needs patched, Linux needs boot opti by gosand · · Score: 2

    I think you are missing the point here. Does a Win2k user have to be connected to the internet in order to fix their system? Yes. Does someone on the Linux system? No. Imagine you manage 100 machines. Which would be easier to fix? Push out lilo.conf files to the Linux machines, or install 100 patches?

    --

    My beliefs do not require that you agree with them.

  113. what's the point of 4MB pages? by RelliK · · Score: 3, Insightful

    First, as has already been pointed out, there is no performance hit.

    But I still did not get the answer to my question. What is the purpose of having 4MB pages in the first place? It is inconceivable that an OS would use 4MB pages exclusively. The internal fragmentation would be enormous.

    To give you an analogy, think of what would happen if your file system used 4MB blocks. When you create a file, space would be allocated 4MB at a time so a 1 byte file would waste (4MB - 1byte) of disk space; (4MB + 1byte) file would take up two blocks, also wasting (4MB - 1byte) of disk space. On average, each file wastes 1/2 of the last block. Similarly, each process wastes on average 1/2 of the last page. That's not a problem if the pages are 4KB in size, but with 4MB pages there's lots of space wasted. That's like throwing away paging altogether.

    So, I ask again, what is the point of having 4MB pages?
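
    The arithmetic in the analogy above can be checked with a short Python sketch. The helper name wasted_bytes is hypothetical, and it only models the unused tail of the last block or page, nothing else about a real file system or memory manager.

```python
KIB = 1024
MIB = 1024 * KIB


def wasted_bytes(size, block_size):
    """Bytes left unused in the last block when `size` bytes are stored."""
    if size == 0:
        return 0
    remainder = size % block_size
    return 0 if remainder == 0 else block_size - remainder


# The two cases from the analogy, with 4MB blocks:
waste_one_byte = wasted_bytes(1, 4 * MIB)            # 1-byte file
waste_just_over = wasted_bytes(4 * MIB + 1, 4 * MIB)  # (4MB + 1 byte) file

# The same 1-byte file with 4KB blocks wastes far less:
waste_small_block = wasted_bytes(1, 4 * KIB)
```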

    --
    ___
    If you think big enough, you'll never have to do it.
    1. Re:what's the point of 4MB pages? by Anonymous Coward · · Score: 2, Informative

      It saves page table entries, which saves an irrelevantly small amount of memory.

      Much more importantly, it saves TLB entries, which leaves more of the TLB free for user memory and speeds up virtual->physical translations.
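
      To put rough numbers on the TLB point: the memory a TLB can cover without a miss is the entry count times the page size. The Python sketch below assumes a 32-entry TLB purely as a round figure for illustration (it is not the real size of any Athlon TLB), and the function names are invented.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach(entries, page_size):
    """Bytes of address space covered when every TLB entry maps one page."""
    return entries * page_size


def entries_needed(region_size, page_size):
    """TLB entries needed to map a region, rounding up to whole pages."""
    return -(-region_size // page_size)  # ceiling division


ASSUMED_ENTRIES = 32  # hypothetical TLB size, for illustration only

reach_4k = tlb_reach(ASSUMED_ENTRIES, 4 * KIB)
reach_4m = tlb_reach(ASSUMED_ENTRIES, 4 * MIB)

# Mapping 8MB of kernel memory: 2 entries with 4MB pages vs. 2048 with 4KB pages.
kernel_entries_4m = entries_needed(8 * MIB, 4 * MIB)
kernel_entries_4k = entries_needed(8 * MIB, 4 * KIB)
```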

    2. Re:what's the point of 4MB pages? by PurpleBob · · Score: 2

      Please enlighten me - what does a processor feature have to do with disk space?

      --
      Win dain a lotica, en vai tu ri silota
    3. Re:what's the point of 4MB pages? by screwtheNSA · · Score: 0

      What does a Sherman tank have to do with CPU'S?
      As far as a Sherman tank goes, that was a JUNKPILE that should have been scrapped as it was being built! It was a poor "armored" vehicle, no real firepower and very little USEFUL armor to protect the passengers inside. I wonder how far Hitler would have gotten in a tank battle against the M1 Abrams we have today?

      That "tank" was just as much of a deathtrap as the Bradley "fighting vehicle" is/was...aluminum for armor?? WHAT were the designers thinking?

      Okay, flame this for offtopic, I can care less about points anyhow, so mod away!

      --
      206.39.38.2, DDN-BLK-36, DOD NET INFO CENTER. 800.365.3642 206.36.0.0-206.39.255.255 NET RANGE.
    4. Re:what's the point of 4MB pages? by psamuels · · Score: 2
      So, I ask again, what is the point of having 4MB pages?

      They are only used for kernel memory, which is never paged out. Memory for user processes and kernel modules always uses 4k pages. (I say always, but of course I mean always on the i386 architecture. The Alpha for example uses 8k pages natively.)

      --
      "How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
  114. Re:Hmm, Win2k needs patched, Linux needs boot opti by Score+Whore · · Score: 1

    It's a registry patch. That's all. If you have multiple machines on a network and the users log into a domain, you can trivially write a batch file that will apply the patch. The difficulty of placing this patch is going to be entirely dependent upon each specific situation.

    The only way linux/freebsd/etc. will see this as being an easier situation is if in the future they make detection and application of the problem an automatic feature of the OS. MS could do the same, but the question is will they?

  115. Who's Responsibility? by jxqvg · · Score: 3, Funny
    Where is AMD in any way obligated to call the Kernel Developer Gods whenever they make a mistake? "Oh, Mr. Torvalds, I'm so sorry we made a mistake with our processor. Oh, Mr. Cox, please forgive us. Please don't tell RMS or ESR; we'll fix it, honest!"

    Here's the stark truth for you: 1)Money, 2)Userbase.

    1. Re:Who's Responsibility? by Anonymous Coward · · Score: 0

      How bout they just post a notice about the hardware bug? Is that too much to ask?

  116. Oops by jxqvg · · Score: 2, Funny

    Remember when Corporate Enemy #1 was singing something to the tune of, "Whenever there's a problem, you don't know who to blame?", and how The Community laughed it all off as FUD? Now you can see that whenever there's a problem, you don't necessarily know who to notify, either. Don't call it a feature one day and then curse it the next. That sounds all too much like somebody else...

    1. Re:Oops by Anonymous Coward · · Score: 0

      What the hell are you talking about?

      linux-kernel@vger.kernel.org

  117. don't hurt them so bad by twitter · · Score: 2
    We have been ridiculed because we are using an "off-brand" processor, we've rationalized away thermal problems and fragile cores to get the benefit of more bang for the buck. We have suffered through inadequate compiler support...

    Wow, what a bunch of FUD. I've never had a thermal problem, never had a "fragile core" and never suffered from "inadequate compiler support", whatever that is. My new XP system seems to get stuck every now and then, but it's a new system and I've probably done something stupid like make my swap file much too large. My k6/2s and my Athlon perform better than their more expensive Intel counterparts. According to reviews the XP system should work just fine when I finish ironing it out. Other people have made it work and so it will work for me too. AMD does math cheap and better than Intel. Gaming? That's not my bag, but others are reporting good results.

    When you look at the support AMD gives the Windoze world, you have to remember that those who suffer under M$ OS NEED the help. Just check out the utilities offered at their site. AMD CPUID? cpuinfo gets that for me. They rushed out with the goofy win2k utility because Windoze users can't pass information like, "no_pentium" to their kernel or recompile it. I'm no kernel hacker, but the AMD documentation site looks informative.

    Yeah, they could have been nicer about it, but I'm not about to give up on AMD over a video bug. The short answer is that this looks like an error of omission that could occur in any large organization. The folks in the Windoze software branch, aside from having good feelings toward M$ by the nature of their jobs, might not know who to contact to get the word out to the Linux world. Does AMD even have a Linux division?

    We're often treated as a minority because we are.

    Another beautiful flame disguised as advocacy. It's neither true nor right. There are now more Linux developers than there are M$. I refuse to accept immoral or offensive behavior for myself, a minority of one. Fortunately, there seems to be none of that here, at least not intentionally.

    --

    Friends don't help friends install M$ junk.

  118. the solution to the problem by grahagre · · Score: 0

    1. delete your partition table
    2. install netbsd 1.5.2
    3. install more things
    4. repeat

  119. Major Linux Bug Discovered... 16 Months Later by snellac · · Score: 1, Funny
    Yes, that's right, yet another Linux bug was discovered the other day. So, right about now, if you're a clear headed Capitalist, you're probably thinking "Who cares? They find a new bug in Linux daily." Well, you're right. But there's more to the story. Apparently Alan Cocks (a Red Menace Commie who censors documents under the cloak of the DMCA) is trying to pass the blame on another co-conspirator of Communism.

    Apparently, if you'd believe the Linux community, you'd be hard-pressed upon where to place the blame. You see, the Linuxist Manifesto's number one rule is to lie to protect the best interests of Linux. No self-respectable Linux zealot would insult or place blame upon AMD, because AMD's philosophy centers around tackling American Corporations with their Asian sweatshops, selling their chips at bargain-basement prices like the Red Menace Commies do with their Wal-Mart shit.

    So, right about now, you're probably thinking that the zealots are clearly in a dilemma. Who are they going to blame? If you have a prediction before I tell you, the poll is on the right. Or maybe the left. Either way, take your pick.

    You'd think that the parasitic community would place blame upon Microsoft, right? Alas, Microsoft has had the bug patched since September 2000. Not only that, Windows XP , the latest in the suite of high-powered, stable operating systems from Microsoft Corp., has this patch built in. That's right, built in. Keep in mind that Windows XP was released in October 2001, over three months ago. Meanwhile, no one knows what the hell Alan Cocks has been doing since then, since he hides under the cloak of secrecy. nVidia has been informing users via tech support, even to the Linux community, how to fix the problem for months now. Clearly the blame is upon Alan Cocks's shoulder, but to place the blame where it is rightfully justified is inexcusable in the Linux community. The drones are in disarray.

    The actual bug occurs when Linux users contract the Tux Racer virus via KEmail. When first run, Tux Racer enables a feature in your third-world sweatshop AMD processor called "extended paging." Now, I know you're probably thinking that this sounds like some sort of Nokia feature. Well, you're wrong. It's yet another feature that AMD illegally hacked from Intel. It allows your browser to seamlessly view pages up to 4Mb in size. Before its introduction in the early days of the Intel Pentium processor, web pages were broken up into 4K segments, because any pages larger would freeze the computer. That's why Microsoft didn't invent Javascript until after the Pentium, every time they went to use it, their pages exceeded 4K, and henceforth froze the computer. Intel came to the rescue with the Pentium line of chips, and, as usual, AMD got out their super high tech Asian hacking tools and "reverse-engineered" (code-name for 'illegally hacked') Intel's technology. Thus, users of the inferior AMD Cyrix Kx86-2 Now! processor could also view large web pages without crashing. So why did no one notice that pages larger than 4K would crash AMD processors? Well, Microsoft has had a fix for 16 months, like we mentioned earlier. But why did no one from the Linux community notice? Well, apparently, there does not exist a page devoted to Linux that is more than 4K in size. Since most of the Linux installations out there denounce color as 'feature bloat,' all Linux pages follow an unwritten oath to suck. Believe me, they all do.

    So, for the good of Linux, you may now disperse. Head off to various tech sites and continue blaming Microsoft for not telling you sooner. Your community will thank you.

  120. typo s/LSB/MSB/ by jelle · · Score: 1

    s/LSB/MSB/ obviously

    --
    --- Hindsight is 20/20, but walking backwards is not the answer.
  121. Re:To avoid further problems of this sort, in futu by Anonymous Coward · · Score: 0

    NO!!

    ...and in the chair's opinion the nos have it.

  122. This AMD bug exists on the AMD K6-3 by narfbot · · Score: 2, Informative

    I have an AGP Nvidia GeForce 2 MX, and an AMD K6-3 333 MHz. I have experienced this memory corruption, graphical anomalies, and lockups in linux and windows 95.

    I noticed that AMD K6-3 was not mentioned, but it has to exist on it. The K6-3 was made with the same instruction set as a pre-Athlon. Thus the bug definitely exists.

    Not sure about K6/K6-2, but it is possible.

    1. Re:This AMD bug exists on the AMD K6-3 by Anonymous Coward · · Score: 0

      The K6-3 is a "Pre-Athlon" processor but has a totally different architecture.

      If the bug was shared between the two designs I would be GREATLY surprised. Did the K6-3 have other bugs, probably. But this one? I doubt it strongly.

      It is NOT an instruction set bug per se... it looks to be a matter of how cache is handled and invalidated on AGP transfers with large page sizes.

      Quit spreading the FUD....
      Furthermore, as Windows 95 has very little clue as to what AGP is in the first place (other than another device behind what appears to be a 66Mhz PCI bridge), it could not be susceptible to this bug anyway!

  123. Don't be by Anonymous Coward · · Score: 0

    NVidia has made it clear, on numerous occasions, that they don't care about Linux. The only reason they have any sort of Linux drivers is because a couple of internal techies do care. You'll get a couple in every company. However, NVidia as a whole has repeatedly stated that Linux users should go f- themselves.

    1. Re:Don't be by fferreres · · Score: 1

      NVidia is partly and secretly funded by Microsoft. I have the information because I used to talk with people working in Nvidia and 3Dfx after they released the NV1. They sold their souls after the NV1 fiasco. By that time 3Dfx won a contract to power Sega's next console.

      After some mysterious "...something", everything changed:

      1) Sega dropped the 3Dfx contract for NO reason.
      2) Nvidia got lots of funds to automagically recover from disaster. What did they do? Target their new chip (RIVA) for Direct3D (before Direct3D was really useful).
      3) 3Dfx lost a lot of the initial momentum and the masterminds quit the company (when they still were #1!)

      And now we have the X-Box which houses a Nvidia chipset and 3Dfx is owned by NVidia.

      Do you expect to get Linux support from NVidia?

      --
      unfinished: (adj.)
  124. yes i have this on my xp and my dual mp... by Anonymous Coward · · Score: 0

    this is so much a problem on my AMD XP that i can not install the nvidia drivers on it at all without the system crashing every 10 minutes at the drop of a hat. i have this problem on my dual AMD 1800MP but it is far less an issue, as it only seems to crash occasionally during screen savers and very rarely during a game of wolfenstein. on both systems i was using the 2.4.9_13 kernel and smp kernel respectively with nvidia drivers 2314 built from src.

  125. Re:Hmm, Win2k needs patched, Linux needs boot opti by Anonymous Coward · · Score: 0

    You seemed to have missed that it's a registry setting -- aka a configuration file entry. Fundamentally the same.

  126. Re:Hmm, Win2k needs patched, Linux needs boot opti by Hoser+McMoose · · Score: 1

    The Windows "patch" just changes a registry key, so the correct answer to your initial question is No, a Win2K user does NOT need to be connected to the internet (other than to figure out just which key to change in the first place, in which case a Linux user would also need an internet connection to figure out which line to add to the bootloader).



    The only really valid point I see against Windows in this situation is that the registry has essentially no documentation! There are TONS of settings and customizations available within the registry for Windows, but essentially nobody knows even 1/10th of them because of lack of documentation!

  127. Quote from AMD's Errata sheets by Hoser+McMoose · · Score: 1

    Alan Cox and other kernel hackers do read these documents. The question is if AMD documented this bug in their errata, or just fixed for Windows 2000 and figured that was good enough.

    Bug #16 in AMD's Errata list for the AMD Athlon Model 6 processor (ie the AthlonXP/MP) lists the following:

    (Begin quoting)

    16 INVLPG Instruction Does Not Flush Entire Four-Megabyte Page Properly with Certain Linear Addresses

    Products Affected. A0, A2

    Normal Specified Operation. After executing an INVLPG instruction the TLB should not contain any translations for any part of the page frame associated with the designated logical address.

    Non-conformance. When the logical address designated by the INVLPG instruction is mapped by a 4-MB page mapping and LA[21] is equal to one it is possible that the TLB will still retain translations after the instruction has finished executing.

    Potential Effect on System. The residual data in the TLB can result in unexpected data access to stale or invalid pages of memory.

    Suggested Workaround. When using the INVLPG instruction in association with a page that is mapped via a 4-MB page translation, always clear bit 21.

    (end quoting)
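
    The address arithmetic behind that suggested workaround can be sketched in a few lines of Python. This only illustrates clearing bit 21 of a linear address; INVLPG itself is a privileged instruction that kernel code would issue, and the function name here is invented for the example.

```python
BIT_21 = 1 << 21          # 0x200000: selects the upper 2MB half of a 4MB page
FOUR_MB_SHIFT = 22        # a 4MB page covers 2**22 bytes


def invlpg_workaround_address(linear_address):
    """Return the same address with bit 21 cleared, as the errata suggests."""
    return linear_address & ~BIT_21


# Hypothetical kernel-space address with LA[21] set.
addr = 0xC0345678
safe = invlpg_workaround_address(addr)

# Clearing bit 21 never leaves the 4MB page, so the same mapping is targeted.
same_page = (addr >> FOUR_MB_SHIFT) == (safe >> FOUR_MB_SHIFT)
```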

    It's there. It's been listed there for quite some time. No one read the errata and/or no one bothered to check to see if this one affected Linux systems. I hate to break it to the Linux boys, but they kinda missed the boat on this one. Normally Linux kernel hackers seem pretty good at staying on top of processor bugs, but it looks like this one slipped through the cracks.

    FWIW, anyone looking for this errata list can find it here. As a bit of an aside, none of AMD's PDF documents will load for me in Mozilla/Netscape 6 with Acrobat 5, but they all work fine with IE/Acrobat 5. Weird.

  128. why don't i have this problem? by cockroach2 · · Score: 0

    i've been using an athlon (slot a) cpu for more than a year now, with all 2.4 kernels (except for the REALLY broken ones) and i haven't had any such problem at all. i used matrox agp cards (g400 and g550), so shouldn't it crash here too? quake3, castle wolfenstein and a lot of other opengl apps work perfectly well.

    which part am i missing? is this problem somehow nvidia-related after all?

  129. Re:Does this problem occur in the 2.2 kernel serie by Anonymous Coward · · Score: 0

    I would say yes, since the 2.2 kernel also uses these Pentium big pages:
    http://lxr.linux.no/source/arch/i386/mm/init.c?v=2.2.19#L302

    We also meet systematic freezing with XFree86 on our Athlon+Ati/AGP.

  130. Mod me down Blind: -1 by Dimensio · · Score: 2

    Nevermind, just saw the "workaround" listed in the article.

  131. What bug? Could Pentium IV have a bug too? by Tangofran · · Score: 1

    I can't understand what "CPU" bug you are talking about. Could somebody tell me where's that bug? As far as I'm concerned, the trouble is our dear kernel trying to get those 4Mb pages found in Pentium and not in Athlons. Now, why everybody call this a BUG????? I think this is just a "NOT supported feature". Maybe I should say I've found a bug on the new Pentium IV. They can't work with "3D NOW!" optimizations!

    1. Re:What bug? Could Pentium IV have a bug too? by psamuels · · Score: 1
      As far as I'm concerned, the trouble is our dear kernel trying to get those 4Mb pages found in Pentium and not in Athlons. Now, why everybody call this a BUG????? I think this is just a "NOT supported feature".

      No, it's a bug. 4Mb pages are a supported feature on Athlons - but we now know that the CPU has a small problem invalidating such pages when AGP activity is going on, resulting in memory corruption, which in turn can cause crashes or lockups.

      --
      "How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
  132. Haven't noticed it... by Anonymous Coward · · Score: 0

    But seems I should have. I have played Tuxracer many times, and while I can't say I'm good at it, I haven't had my system lock up even once. (It did this on a nearly daily basis with several different clean installs of MicroYouKnowWhat Windows 98! Hence the reason I switched.)

    I have a custom built (made it myself) with the following:

    Red Hat 7.2 (kernel 2.4.7) running a nearly standard install on an AMD Athlon 900MHz single CPU in a Biostar MB with VIA chipset.

    I use an ATI All In Wonder (Rage 128 based) AGP card with 32 MB of memory, and 4X AGP enabled in the BIOS. I have 768MB of PC133 memory installed.

    The only thing I've noticed so far is my fresh install of Netscape (anyone notice how NS resembles M$? Maybe with AOL considering a penguin dinner we should just call them N$... when I read that story I thought "I JUST BOUGHT THE THING!") anyway, N$ ver 6.2 likes to crash, which is too bad. Mozilla does not, that I've noticed, and I like that Mozilla allows me to right-click images from advertisers and block their transmissions, which as far as I know NS does not offer... I'm uninstalling all N$ on my box momentarily...)

    But to get back to the subject at hand, I did "cat /proc/cpuinfo" and I have 6/4/2, model 4 of the CPU so I should be having the same problem and haven't noticed it. Hmmm...

    ~ a newborn Penguinista Mozillite StarOfficer.

  133. Re:Hmm, Win2k needs patched, Linux needs boot opti by gosand · · Score: 2
    That is kind of what I was getting at. Windows more-or-less requires you to rely on someone else to "patch" your system. Yeah, the patch for this is probably just changing a registry setting. But it is automatic. "Here, just run this program, it will take care of everything. You are too stupid to know what you are doing, let us do it for you." Saying "here, patch your system" is not the same as saying "here is HOW you patch your system". Downloading and double-clicking on an executable is not a sufficient explanation for me personally.

    And I understand that most computer users WANT this. If there is some problem with their magical computer thingy, they want something to just fix it. That is part of the real problem here, which goes way beyond this particular issue, that people are patching their systems with blind faith that the patch will "fix" whatever is wrong. Was the name of the registry setting provided? Can you go in and change it manually if you wanted to? I am guessing that it wasn't.

    I am no zealot. I understand that things need to work a certain way in the computer world, just because not everyone is comfortable with computers. But it is my machine, I want to know what is going on with it.

    --

    My beliefs do not require that you agree with them.

  134. Then the REAL problem begins .... by Taco+Cowboy · · Score: 1



    If what you are saying is true ...

    "As it turns out, AMD did eventually get
    around to fixing this issue with Stepping
    A5 of the AthlonXP/MP core."

    then we have a real problem here.

    You see, Alan Cox and friends are planning to make Linux "recognizes" Athlons, and do a special case - the "nomem" thingy - on it.

    But if the Stepping A5 of AthlonXP/MP core has the bug licked, then, the "nomem" thingy by Alan Cox et al may be a step backward.

    Unless of course, the kernel hackers want to have a double-checked thingy on their "Athlon finder". Such as ...

    If Athlon=step5, then do nothing.

    --
    Muchas Gracias, Señor Edward Snowden !
    1. Re:Then the REAL problem begins .... by psamuels · · Score: 1

      You see, Alan Cox and friends are planning to make Linux "recognizes" Athlons, and do a special case - the "nomem" thingy - on it.

      But if the Stepping A5 of AthlonXP/MP core has the bug licked, then, the "nomem" thingy by Alan Cox et al may be a step backward.

      You underestimate kernel hackers. They'll definitely check steppings. If Linux ends up downgrading all Athlons to 4k pages, just email the kernel list and demonstrate / assert / prove that certain Athlons are OK. Alan or Dave Jones will patch the kernel correctly in no time, I'd put money on it.

      --
      "How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
  135. Excuse my ignorance... by p3d0 · · Score: 1

    Slightly off-topic, but...

    How often are 4M pages useful? I guess the whole kernel could be in one page, but where else is this useful? I bet most applications would never see any benefit from 4M pages anyway.

    --
    Patrick Doyle
    I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  136. Re:Does this problem occur in the 2.2 kernel serie by Anonymous Coward · · Score: 0

    Update to the previous message:

    I can confirm that the option mem=nopentium
    given to the 2.2 kernel
    allows us to use Xfree on our Athlon without
    freezing.
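
    For anyone who wants to confirm the option actually reached the running kernel, here is a small Python sketch that looks for it in a kernel command line string. The function name and the sample command lines are invented for illustration; on a real system the string would come from /proc/cmdline.

```python
def has_boot_option(cmdline, option="mem=nopentium"):
    """True if `option` appears as a whole word on the kernel command line."""
    return option in cmdline.split()


# Invented sample command lines (illustration only).
with_option = "auto BOOT_IMAGE=linux ro root=/dev/hda1 mem=nopentium"
without_option = "auto BOOT_IMAGE=linux ro root=/dev/hda1"

found = has_boot_option(with_option)
missing = has_boot_option(without_option)
```

    On a live system: has_boot_option(open("/proc/cmdline").read()).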

    In fact, I do not understand why everyone is talking about a *2.4* bug.
    This is an Athlon/Linux bug more generally.

    Since Alan Cox is working on automatic workaround,
    and since he is the 2.2 maintainer, I think we 2.2 users should not worry too much.

  137. Re:Hmm, Win2k needs patched, Linux needs boot opti by Thatman311 · · Score: 1

    Then read the KB article that is associated with the patch. There is your documentation as to what the patch is doing. Here is the link for you....

    http://support.microsoft.com/default.aspx?scid=%2fsearch%2fviewDoc.aspx%3fdocID%3dKC.Q270715%26dialogID%3d1928924%26iterationID%3d1%26sessionID%3danonymous%7c1700360

    Microsoft makes it a policy to write one of these for every patch it makes. So RTFM ok? (sorry couldn't help myself..at least I told you where in the manual to read)

    --
    Silly Rabbit...Sig's are for kids.
  138. Re:Hmm, Win2k needs patched, Linux needs boot opti by Lemmy+Caution · · Score: 2
    You don't need to download anything to install a registry patch. Most registry patches are two or three lines of text, saved with a ".reg" extension, that you import into your registry. The fact that you can download that textfile and then double click on it is, generally, an increased convenience. The registry is essentially a single file that contains /etc and everything underneath it.

    And using regedit, you can change it manually, too. You can add keys and values and screw things up like a 5 year-old with root, if you like.

    The criticism of the registry model that is valid is two-fold: 1, it can be corrupted like any file, but since it is one file and not a directory like /etc, that can muck up your whole system (the registry can still be backed up and reinstalled) and 2, it is somewhat easier for malicious code to muck with the registry, since most Windows users work in some privileged mode.

  139. K7S5A not so nice either. by netsharc · · Score: 1

    I built a K7S5A system for a friend. At the first go it couldn't install Windows; it turned out we had faulty RAM, so we replaced it, but problems persisted. We went to the shop to have the thing looked at; the motherboard was faulty. Got a new motherboard but even that didn't go smoothly. After spending 3 hours getting Windows installed, the system rebooted by itself and couldn't get back into Windows because of a corrupted registry file. I looked on the internet, and found this forum and it turns out the problem is widespread. Read especially this FAQ. K7S5A, blah.

    --
    What time is it/will be over there? Check with my iPhone app!
  140. What a moron.... by Anonymous Coward · · Score: 0

    Not surprised they are junk? They are most definitely NOT junk.

    You, however, are an idiot who knows not of what he speaks.

    Care to read the list of errata on the Pentium 3 or 4? It's quite a volume, and most bugs are of equal severity to this or worse.

    This bug is only a problem in certain rarely seen circumstances that happen with AGP transfers using large page sizes -- which apparently only Nvidia seems to do in their drivers.

    And disabling the feature in the kernel is trivial and affects performance almost unmeasurably. And the bug was known, documented and fixed under Win2K over a year ago. But as the large paging feature wasn't used under 2.2 Linux kernels, it didn't apply then, and it slipped through the cracks because the Linux kernel teams apparently didn't read AMD's errata sheets well enough. This is a failing on the part of LINUX, not AMD.

    Of course, if you look at the old Intel P3 1.13GHz processor, it wouldn't even COMPILE the Linux kernel without crashing. They had to actually recall the part... how's that for stability???

  141. Re:Hmm, Win2k needs patched, Linux needs boot opti by rtaylor · · Score: 2

    The average user is probably more capable of going to the Windows Updates website, clicking on the tick box and hitting 'Download' which then runs the install.

    The typical computer geek is probably equally capable of editing the bootloader or a registry, but prefers the first.

    Your question was kinda like asking "Which could the typical person do more easily: build a rocket to go to the moon, or build one to go to Pluto?"

    For the typical person, neither is possible.

    --
    Rod Taylor
  142. OK...now what? by xPhoenix · · Score: 1

    Alright, but mem=nopentium still doesn't work, at least in my case. Not only do I have mem=nopentium, but I also have BIOS AGP set to 2x, Option "NoRenderAccel" "true" in XF86Config, and Option "NvAGP" "0" in XF86Config (basically disabling AGP totally anyway). Despite all these safeguards, sure enough, graphically intensive programs (including an idle X desktop) hard-crash the kernel. Anyone else still experiencing this? I didn't have to wait long either: the system crashed within seconds of opening Quake 3, while I was still navigating the game menus.

  143. locking up by jasonbrown · · Score: 1

    I have experienced that myself. I put Mandrake 8.1 on a Duron 900 with DDR RAM and have experienced inexplicable lockups. Just my 2 cents.

    --

    "Congress shall make no law... abridging the freedom of speech, or of the press"
  144. moderation [OT] by kilrogg · · Score: 1
    I've never had so much moderation done to one of my posts:
    Moderation Totals: Flamebait=1, Insightful=2, Interesting=1, Informative=1, Funny=3, Overrated=2, Underrated=1, Total=11.

    I think I'm only missing 'troll' and 'offtopic', and it would have been a full house.

  145. 64bit bug is on UltraSPARC 1 cpus less than 200mhz by Animixer · · Score: 1

    If I remember correctly, to enable 64 bit operation on an UltraSPARC 1 cpu (such as that in an Ultra 1 workstation), you need to upgrade to OpenBoot 3.11.1 at a minimum, and then uncomment the line in /platform/sun4u/boot.conf that says this: ALLOW_64BIT_KERNEL_ON_UltraSPARC_1_CPU=true.

    I don't know exactly what the bug is with the old UltraSPARC 1's, except that given specific hand-written assembly code, it is possible to lock up the machine.

    I have been running mine on the 64 bit kernel for some time now, and haven't noticed any problems, so it's probably safe in most circumstances.

    mpb

    --
    man tunefs | grep fish
  146. Abit doesn't suck by CrabCakeJimmy2k · · Score: 0
    "Abit sucks ass though, they try to push things too far and forget that a super-overclocked machine that hangs every hour isn't worth shit."

    How does Abit try to overclock? How does Abit push too far? Just because they provide you with the means to push your CPU past its ability doesn't mean you have to do it. It's your fault if your CPU is overclocked to a point that it becomes unstable.

    Abit has never let me down. 4 out of 5 machines I am currently running at home use Abit Mobos. None of them have given me a problem that couldn't be solved with minimal effort, and a majority of those problems were my fault for trying to push too far, and even then it wasn't Abit that failed me, it was the hardware I had attached to it that couldn't handle what I was trying to force it to do. I like Abit because they give me the ability to push my systems to the edge of the envelope and beyond, but it is stupid to say that they suck if I push past the edge. Besides that, Abit provides better documentation on their products than any other manufacturer I have ever seen.

    1. Re:Abit doesn't suck by billcopc · · Score: 1

      Abit tunes their boards in order to allow such overclocks. They make many design decisions that sacrifice a bit of system stability in exchange for 2% more memory bandwidth, or to allow a wider range of bus speeds just so you can brag that your Athlon 1333 at 147 x9 can saturate the GeForce2's bus a few hairs quicker than mine at 133 x10. For tweak-freaks, that's fine, but for everyone else we just want something that can run Win98/2k for a couple days without a reboot, just like my old P2 system used to.

      --
      -Billco, Fnarg.com
  147. HCF is a reference to an old IBM joke. by Ungrounded+Lightning · · Score: 3, Interesting

    That third article about the supposed "HCF" instruction on the 4004 is complete and utter BS. None of the instructions on the 4004 will cause it to burn up, even on the earliest production parts.

    When the IBM System 360 series came out it had a large number of new opcodes (as compared with the 70x/70xx series). These were the days of CISC (Complex Instruction Set Computers), and the 360 really lived up to the name. It gave over a large amount of its word space to opcodes and opcode extensions, so it had a VERY large potential opcode space. Much of it was unpopulated, but some was populated with undocumented instructions. Further, the machine was microcoded, and the microcode was loaded when the machine powered up. (That's what floppy disks were invented for.) So the company could write new opcodes and add them later.

    Of course the new machine with the ENORMOUS list of opcodes and (true) rumors of hidden undocumented opcodes quickly led to the circulation of a humorous list of perhaps 20ish additional "new undocumented opcodes". Things like XOE (Execute Operator Immediately), EK (Electrify Keyboard), SSJ (Select Stacker and Jam), BLNK (Blink Lights), WHR (Whirr), etc. The crown jewel of this list was HCF (Halt and Catch Fire).

    While this list was still funny, Motorola released the 6800 single-chip microprocessor, predecessor to the 650x knockoff that formed the core of the first Apple computers. To ease chip testing, the all-ones opcode threw the chip into a test mode, where it continuously incremented the program counter and performed memory reads. This wiggled all the address lines and most of the control lines, letting you know if the chip was alive and bonded.

    Of course they didn't tell you about it. And of course the only way out was hard reset. And of course a jump to an unpopulated region of the address space (i.e. most of it) would leave the bus floating and generate 0xFF. And of course jumping into random data or uninitialized memory would also quickly get you an 0xFF or jump you off into unpopulated address space. So the typical behavior for a program bug was to lock up the processor beyond the ability of a debugger to function.
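
    The failure mode described above can be shown with a toy Python model. It is a rough sketch and not a 6800 emulator: the memory map, the function names and the "every other opcode is a one-byte no-op" simplification are all assumptions made for illustration. The only behavior taken from the description is that unpopulated addresses read back as 0xFF and that fetching 0xFF as an opcode drops the chip into a mode where the program counter just keeps counting.

```python
# Toy model only: names and memory layout are hypothetical.
FLOATING_BUS = 0xFF   # value read from unpopulated address space
TEST_OPCODE = 0xFF    # the all-ones opcode that enters the test mode


def read_byte(memory, address):
    """Return the byte mapped at address, or 0xFF if nothing is there."""
    return memory.get(address & 0xFFFF, FLOATING_BUS)


def find_lockup(memory, pc, max_steps=64):
    """Return the address at which 0xFF is fetched as an opcode, or None.

    Every opcode other than 0xFF is treated as a one-byte no-op, which is
    a simplification to keep the sketch short.
    """
    for _ in range(max_steps):
        if read_byte(memory, pc) == TEST_OPCODE:
            return pc
        pc = (pc + 1) & 0xFFFF
    return None


def stuck_read_addresses(start, count):
    """Addresses read in the test mode: the counter increments and wraps at 16 bits."""
    return [(start + i) & 0xFFFF for i in range(count)]


if __name__ == "__main__":
    program = {0x0000: 0x01, 0x0001: 0x01}   # two bytes of "code", then nothing
    where = find_lockup(program, 0x0000)
    print("locked up at", hex(where))
    print([hex(a) for a in stuck_read_addresses(where, 4)])
```

    Running off the end of a two-byte program is enough to hit the floating bus, which is why a stray jump so often ended in a lockup.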

    (Hell: I had one of the first round of solder-it-yourself evaluation kits, bent a pin on the debugger ROM putting it into the socket, and ended up with a board that booted into the test state. I was a starving student, and it took a couple of days to get access to test equipment to find out what was wrong.)

    So of course programmers, once they found out about the instruction that hung the chip in a mode where it "twiddled its thumbs at maximum speed" and got a bit warmer than usual, and couldn't get out of the mode except by hard reset, quickly christened the opcode "Halt and Catch Fire". And this became the generic term for get-stuck-in-a-test-mode instructions on microprocessors, until the chip manufacturers finally came to their senses and stopped putting such instructions into instruction sets.

    --
    Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
  148. Re:64bit bug is on UltraSPARC 1 cpus less than 200 by spauldo · · Score: 1

    Yeah, the firmware patch came with the Solaris CD set. I patched it during my first install after I got the machine.

    Kernel's runnin' in 64 bit mode. Unfortunately, I don't know enough C to really take advantage of it like I want to, but I'm learnin'...

    --
    Those who can't do, teach. Those who can't teach either, do tech support.
  149. Money + Userbase by Anonymous Coward · · Score: 0

    Linux increasingly has both. IBM. Google. etc.

  150. you don't get it by Anonymous Coward · · Score: 0

    We NOW know that this is an AMD bug.
    It was proper to assume the NVIDIA
    driver was at fault though, because
    this one closed-source driver made
    debugging near-impossible.
    Closed-source drivers are guilty
    until proven innocent.

  151. With GRUB... by cduffy · · Score: 1

    ...just append the option to the end of your 'kernel' line. For me, it looks like this:

    kernel /vmlinuz root=/dev/hdb1 vga=5 mem=nopentium

    Hope this helps!
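
    After rebooting, one way to confirm that the option actually reached the kernel is to look at /proc/cmdline. The small Python sketch below does that; the helper names are made up for illustration, and it only checks that the option was passed, not that large pages are really off.

```python
def has_boot_option(cmdline, option):
    """Return True if option appears as a whole word on the kernel command line."""
    return option in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as handle:
        return handle.read().strip()


if __name__ == "__main__":
    if has_boot_option(read_cmdline(), "mem=nopentium"):
        print("mem=nopentium was passed to the kernel")
    else:
        print("mem=nopentium is missing; check your bootloader config")
```

    Matching whole words avoids a false positive on a longer option that merely starts with the same text.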

  152. Updated Info about the supposed bug! by ttfkam · · Score: 2
    The guy who originally broke the "AMD bug" story in Linux has since updated his site with new and more accurate information.

    And, for convenience, a rundown by the players involved (both for the Linux kernel and AMD) is here.

    In short, for the reading-impaired, it's not an Athlon bug.

    --

    - I don't need to go outside, my CRT tan'll do me just fine.
  153. Duh! Been around for ages! by FatBoy+Titties · · Score: 1

    It was obviously well known in late 2000 when I bought my system, because there were registry patch fixes for windoze available from AMD's site, and they are now available on mobo manufacturers' sites too.

    --
    F4+80y +1++135
    FatBoy Titties - (aren't I l33+ ;-) )