Slashdot Mirror


Major Linux/Athlon CPU bug discovered

GeorgeFrancisco writes "I recently installed the nVidia drivers so I could play TuxRacer on my Athlon. Problem is it kept inexplicably hanging Linux. Now I know why. The CPU bug affects Athlon/Duron/Athlon MP AGP users. Fortunately there's a way around it, and: "Alan [Cox] is going to try to add some kind of Athlon/AGP CPU bug detection code to the kernel so that it will be able to auto-downgrade to 4K pages when necessary." Read more on the Gentoo Linux site."

7 of 402 comments

  1. The quick answer: by Doctor K · · Score: 5, Informative

    The site seems to be down. However, last week I contacted nVidia about this problem on my two dual Athlon MP workstations (random hangs when OpenGL is invoked). The quick answer is that you can do one of two things:

    Boot your system with the following option on your kernel command line: "mem=nopentium"

    or

    Disable AGP in your XFree86 config (i.e. Option "NvAGP" "0" in the "Device" section).

    nVidia clued me into the first approach about a week and a half ago. It made my system completely stable. However, there was still some texture flakiness in some OpenGL applications. Since my workstations are number crunchers (and thus Quake FPS don't matter to me), the latter option eliminated both the stability problems and the texture flakiness (at the expense of some graphics speed).
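To confirm that the first workaround is actually in effect after a reboot, one option is to look at the running kernel's command line. The following is a minimal sketch, assuming a Linux system that exposes /proc/cmdline; the helper names has_nopentium and running_with_nopentium are made up for illustration and are not part of any nVidia or kernel tool.

```python
from pathlib import Path


def has_nopentium(cmdline: str) -> bool:
    """Return True if the boot parameters include mem=nopentium."""
    # Kernel parameters are whitespace-separated, so compare whole tokens.
    return "mem=nopentium" in cmdline.split()


def running_with_nopentium(path: str = "/proc/cmdline") -> bool:
    """Check the running kernel's command line (Linux only)."""
    try:
        return has_nopentium(Path(path).read_text())
    except OSError:
        # The file is absent or unreadable, e.g. on a non-Linux system.
        return False


if __name__ == "__main__":
    print("mem=nopentium active:", running_with_nopentium())
```

The string check is kept separate from the file read so it can be tried against any boot line without touching /proc.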

    By the way, nVidia mentioned the same issue exists on Win2K / Athlon boxes.

    Enjoy,
    Kevin

  2. Re:NO AMD BASHING by spauldo · · Score: 5, Informative
    Why are you worried about running 32-bit code on a 64-bit processor?

    Just as an aside, if you ever deal with ultrasparcs, you'll quickly find that the majority of the code used is 32 bit.

    The reason for it is simple: most applications will run slower at 64 bit than at 32 bit. The ultrasparc chips were designed to take this into account. Hell, due to a firmware bug, Solaris on my Ultra 1 installs as a 32 bit kernel by default - and runs no slower because of it (although it can't run 64 bit apps that way). After a firmware patch, it is easy to change to running the 64 bit kernel though.

    In all reality, why would most apps need 64 bit integers and whatnot? Most don't, and doing so is a waste of memory. If the processor is designed right, it can handle 32 bit code with no problems whatsoever.
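The memory argument can be put in numbers with a rough sketch. It uses Python's struct module only to report the standard sizes of 32-bit and 64-bit signed integers; the count of one million values is an arbitrary assumption, not a figure from the post.

```python
import struct

COUNT = 1_000_000  # hypothetical number of integers an application stores

# "<i" and "<q" are the standard-size 32-bit and 64-bit signed formats.
bytes_32 = struct.calcsize("<i") * COUNT
bytes_64 = struct.calcsize("<q") * COUNT

print(f"32-bit: {bytes_32} bytes")
print(f"64-bit: {bytes_64} bytes")
print(f"ratio:  {bytes_64 / bytes_32:.1f}x")
```

If the values all fit in 32 bits, the wider type doubles the storage for no gain, which is the waste described above.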

    --
    Those who can't do, teach. Those who can't teach either, do tech support.
  3. Using Test Suites to Validate the Linux Kernel by goingware · · Score: 5, Informative
    Let me take this opportunity to plug Using Test Suites to Validate the Linux Kernel.

    Thank you for your attention.

    --
    -- Could you use my software consulting serv
  4. Quake 3 benchmarks by Sits · · Score: 5, Informative

    The Quake 3 demo was run with \timedemo 1 and \demo DEMO001. Each test was run three times. The system load average was < 0.5 before Quake 3 was run.

    Without mem=nopentium
    FPS = 79.4 (79.4, 79.4, 79.4)

    With mem=nopentium
    FPS = 79.2 (79.1, 79.3, 79.2)

    System tested:
    Athlon 850, 384MB RAM, Geforce 1 DDR, VIA KT133 Chipset
    Athlon/Duron/K7 optimised 2.4.17 kernel (optimising the kernel above pentium makes very little difference though)
    NVidia 1.0-2313 video drivers using agpgart
    Mandrake 8.0

    Quake 3 settings
    Texture depth = 16 bits
    Colour depth = 16 bits
    Geometric detail = High
    Texture detail = High
    Dynamic lights = On
    Video mode = 1024x768

    Looks like there is a difference, but it's very slight (about 0.25%), and my benchmarks aren't very scientific. Either way, if there is an improvement in stability this tradeoff is easily worth it. Here's hoping that you don't run Linux just for its Quake 3 scores though...
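The size of the slowdown can be recomputed from the per-run numbers listed above. This is only a sketch of the arithmetic; the function name percent_drop is an illustrative choice, and three runs per configuration are too few to say anything about statistical significance.

```python
from statistics import mean

# Per-run frame rates copied from the benchmark above.
without_flag = [79.4, 79.4, 79.4]
with_flag = [79.1, 79.3, 79.2]


def percent_drop(baseline: float, measured: float) -> float:
    """Relative slowdown of `measured` against `baseline`, in percent."""
    return (baseline - measured) / baseline * 100.0


drop = percent_drop(mean(without_flag), mean(with_flag))
print(f"mean without mem=nopentium: {mean(without_flag):.1f} FPS")
print(f"mean with mem=nopentium:    {mean(with_flag):.1f} FPS")
print(f"slowdown:                   {drop:.2f}%")
```

The result is a drop of roughly a quarter of one percent, about 0.2 FPS out of 79.4.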

  5. Re:Should AMD do the right thing? by Eric Smith · · Score: 4, Informative
    That third article about the supposed "HCF" instruction on the 4004 is complete and utter BS. None of the instructions on the 4004 will cause it to burn up, even on the earliest production parts.

    Several processors had self-test instructions known as "HCF". The 6800 family and the 6502 had such instructions. They caused the processor to start fetching consecutive locations, thus continuously incrementing the address bus. Didn't damage the processor, even if you left it running that way. The "Catch Fire" was a figurative description of what was happening on the address bus, nothing more.

    On the original NMOS 6502, about 13 of the undefined opcodes had this effect. This was the most common cause of computer lockups if the code went into the weeds.

    On some of the later 6800 family members, the test instructions were actually documented, but Motorola's published description did not include any mnemonic for them.

  6. Other Hackers did it better . . . by Jeff Kelly · · Score: 5, Informative
    Here is a posting from Terry Lambert on the FreeBSD -stable mailing list regarding this "bug".
    Maybe it sheds some light on this issue.


    > Recently I found Linux 2.4 kernel is affected by the
    > bug of extended paging in AMD Athlon through the
    > following link. I don't know if FreeBSD is also
    > affected.
    >
    > http://linuxtoday.com/news_story.php3?ltsn=2002-01-21-001-20-NW-KN

    I am well aware of this bug.

    It does not affect FreeBSD, which only uses 4M pages for
    the first 4M of the kernel itself.

    I've worked on code that enables 4M pages on other memory
    used in FreeBSD, that had this problem, but only if you
    were really stupid in your allocation mechanism.

    There's a workaround for this problem which is fairly
    trivial to implement in software, and should probably be
    done when 4M pages are enabled, if you are using an Athlon,
    and are adding 4M pages.
    [...]
    In any case, this will not be a problem for FreeBSD, and is
    only a problem for Linux because of the strange way they
    initialize things.
  7. Re:Is this the same as the Win2k bug? by DeeKayWon · · Score: 5, Informative
    The only revision without the bug is the A5 stepping (CPUID 662) Athlon XP/MP/Mobile Athlon 4. See the Athlon model 4 revision guide and the Athlon model 6 revision guide, erratum 16.

    Basically, if you run "cat /proc/cpuinfo" and see these:

    cpu family: 6
    model : 6
    stepping : 2

    Then you should be safe.
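The same check can be scripted. The sketch below parses /proc/cpuinfo-style text and compares it against the family 6, model 6, stepping 2 triple quoted above. The function names are hypothetical, and the extra vendor_id test for "AuthenticAMD" is an addition not mentioned in the post, included because other vendors' CPUs can report the same three numbers.

```python
def parse_cpuinfo(text: str) -> dict:
    """Parse the first processor block of /proc/cpuinfo text into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            if fields:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def has_fixed_stepping(text: str) -> bool:
    """True if the CPU is an AMD part reporting family 6, model 6, stepping 2."""
    info = parse_cpuinfo(text)
    return (
        info.get("vendor_id") == "AuthenticAMD"  # assumption: AMD parts only
        and info.get("cpu family") == "6"
        and info.get("model") == "6"
        and info.get("stepping") == "2"
    )


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            print("fixed stepping:", has_fixed_stepping(handle.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

Only the first processor block is examined, so on a dual-CPU board each processor would need to be checked separately.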