Slashdot Mirror


Tracking Down The AMD "Processor Bug"

tercero writes: "Over at the Gentoo Linux website there is an update on the AMD processor bug mentioned here. In short, AMD claims it's not a bug with the Athlon processor, but with the motherboard. More detailed information can be found on this LKML post." An Anonymous Coward points to a similar explanation at Linux Weekly News. Update: 01/25 01:25 GMT by T : Daniel Robbins from Gentoo clarifies: "AMD is not calling this a 'motherboard' issue, it is an interaction between a feature of the Athlon called 'speculative writes' and the design of the GART, which is not cache-coherent. It's an 'Athlon/cache coherency/GART' problem, not a 'motherboard' problem."

237 comments

  1. What is the exact parameter to pass to LILO? by Greg151 · · Score: 0, Offtopic

    I have seen both mem=nopentium and
    append mem=nopentium.

    Do I need the append or not?

    Thanks,

    Greg

    1. Re:What is the exact parameter to pass to LILO? by AntiNorm · · Score: 2, Redundant

      LILO passes kernel parameters via an 'append' line, so the syntax would be

      append=" mem=nopentium"

      Make sure you aren't appending anything else. If you are, just add the mem=nopentium at the end of your existing append line.

      --

      I pledge allegiance to the flag...
      of the Corporate States of America...
    2. Re:What is the exact parameter to pass to LILO? by leviramsey · · Score: 1
      Do I need the append or not?

      Reason #20996 to use GRUB: No need to use append=

    3. Re:What is the exact parameter to pass to LILO? by me0 · · Score: 0

      The "append" is specific to lilo.conf.
      On the command line you'd just put "mem=nopentium", and that goes for GRUB's menu.lst as well.

    4. Re:What is the exact parameter to pass to LILO? by nixadmin · · Score: 0

      Plus don't forget to set devfs=nomount to free up any resources needed by the processor's L2 cache. This can cause timing hiccups which are made worse by the AGP caching bug.

    5. Re:What is the exact parameter to pass to LILO? by Anonymous Coward · · Score: 0

      Reason #233902 not to use GRUB. LILO works fine. Yet another example (a la xinetd) of Red Hat taking something that works fine, has a proven track record, and a large installed base (read: lots of people who understand it), and for apparently no good reason, changing it. The MS of the unix world.

  2. Bug? by smack_attack · · Score: 4, Funny

    2+2=3.9999999999999999999999999999983774

    Oh wait, that was Intel.

    1. Re:Bug? by Anonymous Coward · · Score: 0
  3. Think "Matrix" by SpookComix · · Score: 5, Funny
    AMD claims it's not a bug with the Athlon processor, but with the motherboard

    According to young bald children everywhere, "There is no bug".

    In related news, the motherboard manufacturers are quoted as saying, "It's not a bug with the motherboard, but with the Athlon processor."

    --SC

    --
    You read fiction? I write it! Lemme know what you th
    1. Re:Think "Matrix" by nixadmin · · Score: 0

      This is true; I'm running FreeBSD on an AMD processor and haven't noticed any issues. But chipsets are flaky things.

    2. Re:Think "Matrix" by Anonymous Coward · · Score: 0

      You're also almost certainly treating your AGP video card as a simple PCI card. Or does FreeBSD have a driver for the GART nowadays?

    3. Re:Think "Matrix" by warpeightbot · · Score: 2
      AMD claims it's not a bug with the Athlon processor, but with the motherboard
      I got a question.

      WHICH MOTHERBOARD?? I've got a bunch of customers with AMD's and NVIDIA-custom-driver setups out there in the field, KT-133A and KT266 and AMD-760 chipsets, that are seeing zero problems and wondering WTF is going on.

      So am I.

      Is this specific to the NVIDIA chipset or something? I've never seen this thing manifest.... what does it look like? I saw the guy say Windows went blooey... what would it do in Windows? Signal 11 the X server? Hang the box? Oops?

      The world wonders here, and I'm not getting very many details.

    4. Re:Think "Matrix" by nixadmin · · Score: 0

      LOL No, I'm running it headless!

    5. Re:Think "Matrix" by Mario B · · Score: 1

      It's not a bug, it's a "feature"! :)

  4. I work in software by Anonymous Coward · · Score: 4, Funny

    And it's never our program that has your bug.

    Meanwhile, we're feverishly fixing your bug in our software.

    "Yes sir, we've patched around the OS problem and this should get rid of that nasty bug you were seeing."

    1. Re:I work in software by Anonymous Coward · · Score: 0

      If the bug is with AGP, why the hell would it be the CPU that's the problem? The motherboard's northbridge is the one that handles AGP.

    2. Re:I work in software by _typo · · Score: 3, Funny
      "Yes sir, we've patched around the OS problem and this should get rid of that nasty bug you were seeing."

      So you code Windows apps then...

      --

      Pedro Côrte-Real.

    3. Re:I work in software by Anonymous Coward · · Score: 0

      Yes, thanks for noticing... You wouldn't believe how often we can get away with the "it's the OS" excuse.

  5. Don't blame AMD entirely by ekrout · · Score: 5, Insightful

    Don't blame AMD entirely. They acknowledged the bug back in September of 2000 and immediately released patches for Windows 2000. Consequently, it doesn't affect users of Windows XP either. It's been around for over a year and now it's "news"? This should've been fixed in the Linux kernel months ago. Sorry for sounding so harsh.

    --

    If you celebrate Xmas, befriend me (538
    1. Re:Don't blame AMD entirely by Arker · · Score: 2

      It would have been fixed months ago if AMD had labeled it a hardware bug. It was billed as a "Win2k Bug" and quite naturally Linux hackers don't tend to pay much attention to that class of problems.

      --
      =-=-=-=-=-=-=-=-=-=-=-=-=-=-
      Friends don't let friends enable ecmascript.
    2. Re:Don't blame AMD entirely by Anonymous Coward · · Score: 0

      No, you're just spouting ignorant bullshit.

    3. Re:Don't blame AMD entirely by Anonymous Coward · · Score: 0

      Actually, if you had read the Gentoo link you would have found that this motherboard bug has nothing to do with that old W2K bug which, apparently, WAS an Athlon bug.

    4. Re:Don't blame AMD entirely by Hoser McMoose · · Score: 1

      It WAS, however, documented as a hardware bug in AMD's April 2001 update of their Errata sheets that are available on AMD's website for any and all to download. It's bug #16 in AMD's Athlon Revision Guide.

      As an aside, I don't quite get the "It's a motherboard bug" thing, assuming that they're talking about the same bug as the Win2K thing. The documentation for the Win2K bug is clearly the same as the hardware bug mentioned in the above errata sheet. If you ask me, it's a rather well documented hardware bug that really should have been noticed and worked around in software long ago.

    5. Re:Don't blame AMD entirely by Hermanetta · · Score: 0

      Now this is funny! IT'S FUNNY, DAMMIT! He is showing the humor in his own wise choice of putting "sorry for sounding so harsh" at the end of a good but not party-line comment, as all good posters should.

      I was about to write something similar but then noticed the comment, and that it was by the same guy. That is very funny, and I wonder if the moderators realized it was the same guy they gave the 5. And I'm sorry too for sounding so harsh.

    6. Re:Don't blame AMD entirely by ekrout · · Score: 1

      At least someone appreciates me around here...

      ;-)

      --

      If you celebrate Xmas, befriend me (538
  6. AMD & Bugs by techangel · · Score: 0

    Bugs in chips now? Not just software? Hmm...

    1. Re:AMD & Bugs by SpaceLifeForm · · Score: 1

      It's really nothing new. Generally, the 'harder' the logic medium, the fewer bugs.
      Bugs in firmware are more plentiful than in hardware, but less so than pure software.
      As Intel and AMD race to produce faster processors, expect more hardware bugs.

      --
      You are being MICROattacked, from various angles, in a SOFT manner.
  7. It's not a bug!! by ender-iii · · Score: 4, Funny

    It's an optimization for Windows XP!!

    --
    ender-iii
    1. Re:It's not a bug!! by Anonymous Coward · · Score: 0

      That's right! Athlon's new features speed up reboots. When you run your Windows XP system with an Athlon processor, it will crash faster than with an equivalent Pentium processor. Independent tests show that with the new Athlon technology, Windows XP can crash faster than any prior Windows release.

    2. Re:It's not a bug!! by Anonymous Coward · · Score: 0

      1996 called. He wants his joke back.

    3. Re:It's not a bug!! by Anonymous Coward · · Score: 0

      "It's an optimization for Windows XP!!"

      No, but after about a year of it being discovered and fixed via a trivial registry change under Windows 2000, it was fixed in the default install under Windows XP. So in effect the many eyes, bodies and sizeable dollars involved with Windows produced a solution before the problem under Linux was even discovered. Of course not relying on a monolithic kernel and extensive testing might have helped too.

      I like Linux, but stupid XP jokes when your own platform is deficient are just that. Let's hope this workaround under Linux works well and that people stop blaming NVIDIA when their own hardware and software problems are to blame.

    4. Re:It's not a bug!! by smack_attack · · Score: 1

      bahahaha

      ha

      *cough*

      ha ha

      *cough*

    5. Re:It's not a bug!! by Anonymous Coward · · Score: 0

      "That's right! Athlon's new features speed up reboots. When you run your Windows XP system with an Athlon processor, it will crash faster than with an equivalent Pentium processor. Independent tests show that with the new Athlon technology, Windows XP can crash faster than any prior Windows release."

      Actually XP has this fixed by default. Windows 2K had it fixed by a minor registry addition. So you are wrong and as others have cited you are using a tired joke.

      You know, with all the recent problems in the development of the Linux kernel, AND now "new bugs" coming out that most would consider old news, pointing to the greater emphasis that needs to be placed on memory management in the kernel, Linux zealots have just a tad less credibility. Let's face facts: this parameter fix is at best a cruddy temporary kludge that will impede performance in exchange for some stability. It's similar to an MS hotfix where they just want to get the users off their back while they figure out how to fix the problem properly.

      Windows just keeps getting more stable (despite what you imply in your jokes). Linux seems to be moving (at least temporarily) in the opposite direction. Maybe a rethink is in order so that Linux users have a system like the BSDs use, so users know that STABLE is stable and DEVELOPMENT is development.

  8. A kernel bug -- not a motherboard bug by heretic · · Score: 0, Redundant

    "The GART and the CPU see two different views of memory, and it's the kernel's responsibility to map memory in such a way as to prevent bad interactions. Currently, that isn't happening."

    1. Re:A kernel bug -- not a motherboard bug by heretic · · Score: 2, Insightful

      Sheesh! Read the above article where it states "...AMD claims it's not a bug with the Athlon processor, but with the motherboard". AMD is claiming no such thing! They are claiming it's a Linux kernel bug.

    2. Re:A kernel bug -- not a motherboard bug by larien · · Score: 2
      You could argue that since it doesn't happen on intel systems it is an Athlon bug...

      Basically, it seems that someone figured that the GART shouldn't worry about the CPU potentially caching 4MB pages and simplified their circuits accordingly. Unfortunately, they forgot to tell OS developers (NB: I wonder if this affects other OSs like the now doomed Solaris/x86 or *BSD?) causing these problems.

    3. Re:A kernel bug -- not a motherboard bug by Anonymous Coward · · Score: 0

      No, you couldn't argue that. Seeing as how it is merely the way the kernel handles the processor and not how the processor handles things, I must deem you incorrect and impolite for posting this.

    4. Re:A kernel bug -- not a motherboard bug by heretic · · Score: 1

      Well, my point, before being hit with some braindead moderation, was about AMD's claims. However, I'm not sure that there's confirmation that this bug doesn't affect Intel systems as well.

    5. Re:A kernel bug -- not a motherboard bug by crawling_chaos · · Score: 2

      You could also argue that the published spec for the GART states that it shouldn't worry and the OS developers didn't read the spec and assumed that everything worked just like a Pentium *.

      Thus by conforming to a specific implementation, rather than the published spec, it is an OS bug. My architecture knowledge is rusty enough to be unsure which answer is correct.

      --
      You can only drink 30 or 40 glasses of beer a day, no matter how rich you are.
      -- Colonel Adolphus Busch
  9. percentage of affected chips? by Narcocide · · Score: 1

    what exactly is the percentage of chips affected? i've been personally in the presence of 3 different athalon-based systems for extended periods of time that did not seem to have this problem.

    1. Re:percentage of affected chips? by eris_crow · · Score: 1

      Good question. The day this news was posted to Slashdot was the very day I went home from work and installed a new video card to create the nightmare scenario:

      Red Hat 7.1
      PNY Verto AGP vidcard (nVidia GeForce 2 MX400)
      AMD Athlon 1200
      Asus A7A-266 motherboard with ALi 1647 chipset (bad AGP problems)

      The result? It's working like a dream so far.

      To be fair I should point out that I've not tested the video card with anything more stressful than playing a few DVDs, and the longest I've had the computer turned on was about 4 hours.

      One thing that confused me was that the documentation for the nVidia driver said it would automatically disable AGP if it detected the chipset I have on my motherboard, but the driver's output from X startup says it's running in 4x AGP mode. Curious, but not unpleasing as long as things keep running well.

    2. Re:percentage of affected chips? by Emil Brink · · Score: 1

      The next time you're in the presence of one, take the time to read the label. It's Athlon. You might want to count the 'a's in there. Mumble.

      --
      main(O){10<putchar(4^--O?77-(15&5128 >>4*O):10)&&main(2+O);}
  10. Kernel parameter vs LILO config file by DragonHawk · · Score: 5, Informative

    The kernel will look for the parameter

    mem=nopentium

    and turn off 4MB pages (which may or may not prevent the problem from manifesting -- the situation is unclear at this time). You can do this at the boot prompt like this

    LILO boot: linux mem=nopentium

    or by placing the configuration directive

    append="mem=nopentium"

    in your /etc/lilo.conf configuration file.

    See the manual page for lilo.conf for the details.
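
    For anyone scripting the lilo.conf change described above, here is a minimal Python sketch. It assumes the file uses a single, double-quoted append="..." line; the function name add_kernel_param is made up for this example and is not part of LILO. It only edits text in memory and does not write the file or rerun /sbin/lilo, which is still needed before a changed lilo.conf takes effect.

```python
import re


def add_kernel_param(lilo_conf: str, param: str = "mem=nopentium") -> str:
    """Return lilo.conf text with `param` present in the first append= line.

    Hypothetical helper: assumes the append value is wrapped in double quotes.
    """
    lines = lilo_conf.splitlines()
    for i, line in enumerate(lines):
        match = re.match(r'^(\s*append\s*=\s*)"(.*)"\s*$', line)
        if match:
            params = match.group(2).split()
            # Keep existing parameters and add ours only if it is missing.
            if param not in params:
                params.append(param)
            lines[i] = f'{match.group(1)}"{" ".join(params)}"'
            return "\n".join(lines) + "\n"
    # No append line found: put a new one at the top of the file.
    return f'append="{param}"\n' + "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = 'boot=/dev/hda\nappend="hdc=ide-scsi"\nimage=/boot/vmlinuz\n  label=linux\n'
    print(add_kernel_param(sample))
```

    The sketch appends to an existing line instead of adding a second append= directive, which matches the advice earlier in the thread to add mem=nopentium to the end of the line you already have.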

    --

    dragonhawk@iname.microsoft.com
    I do not like Microsoft. Remove them from my email address.
    1. Re:Kernel parameter vs LILO config file by Anonymous Coward · · Score: 1, Interesting


      > mem=nopentium

      This does NOT help.

      Only Option "NvAGP" "0" solves
      TuxRacer problems. Go read the
      linked docs and you will understand
      why.

  11. Re:this is something.. by SpookComix · · Score: 1, Offtopic
    That's because, unfortunately, the Mac is dying. Don't get me wrong, Steve Jobs has done an outstanding job in the last 12 years pulling Apple out of a dark crevasse, but it's too little, too late.

    Mac OSX isn't making the kind of surge they had hoped for. Pre-sales of the new iMac are low (although the machines are really cool!), and with so little market share to work with, Apple's fate is sealed.

    Even Time's review was less than glorious, and had a very ominous feel to it.

    Too bad, too. I kind of like the fruity little buggers.

    --SC

    --
    You read fiction? I write it! Lemme know what you th
  12. the day after : by raindog151 · · Score: 1

    Today, software and hardware vendors across the world have finally released full specs on all their previously 'closed-source' products, taking blame for inherent 'bugs' or flaws.

    In other news, Nintendo has signed a deal with Microsoft and Sony to port their world-known games to the competitors' console systems, and slashdot users deem Windows XP 'pretty okay'.

    Film at 11.

    --
    your jesus is another mans xebu. chew on that hypocrites.
  13. Same damn thing by multiOSfreak · · Score: 0

    Windows *is* a bug...with huge security holes.

  14. Re:this is something.. by NanoGator · · Score: 5, Funny

    Mac users don't have to worry about using the term 'Gigahertz' either.

    --
    "Derp de derp."
  15. although i SHOULD mention... by Narcocide · · Score: 0, Troll

    although i SHOULD mention that until VERY recently one of those boxes that happened also to be using an Aureal Vortex 2 soundcard (good soundcard... damn shame about the company) was experiencing lockups on sound play until someone discovered that you have to run this line:

    /sbin/setpci -d '12eb:*' 40.B=ff

    before using the soundcard in order to keep athalon systems with aureal vortex-based soundcards from locking up.

    frankly, i'm not even sure what the hell that line
    does, it's magic. i don't even know how you'd go
    about figuring something like that out, but at least now i can use my soundcard in linux... damn shame about windows. heh.

    1. Re:although i SHOULD mention... by Anonymous Coward · · Score: 0

      [posting anonymously to sidestep the fact that interpreting 'man setpci' for this dude and karma-whoring are an awful lot alike]

      You use 'setpci' to configure PCI devices (like your soundcard). '-d [vendorID]:[deviceID]' specifies which device to talk to; '12eb' is Aureal, and you're using '*', which basically means you're saying 'talk to all Aureal PCI devices'. '40.B=ff' means you're pushing a 'B'yte with value 'ff' (that's 255 decimal, the biggest available value) into register 40. What register 40 is in charge of on that card, of course, is anybody's guess. (Any Aureal engineers out there?)

      Hope that helps. Have a nice day.
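
      The breakdown of the setpci arguments above can be sketched in Python. This is only an illustration of how the two strings split apart; the function names are hypothetical, nothing here touches real hardware, and it assumes setpci's convention that register addresses and values are written in hexadecimal (so "40" is offset 0x40, i.e. 64 decimal).

```python
def parse_device_selector(selector: str):
    """Split a '-d' selector such as '12eb:*' into (vendor_id, device_id).

    A '*' means "any", returned here as None.
    """
    vendor, device = selector.split(":", 1)
    to_id = lambda text: None if text == "*" else int(text, 16)
    return to_id(vendor), to_id(device)


def parse_register_write(expr: str):
    """Split 'REG.WIDTH=VALUE' (e.g. '40.B=ff') into (offset, size_bytes, value)."""
    target, value_hex = expr.split("=", 1)
    reg_hex, width = target.split(".", 1)
    sizes = {"B": 1, "W": 2, "L": 4}  # byte, word, longword
    size = sizes[width.upper()]
    offset = int(reg_hex, 16)
    value = int(value_hex, 16)
    if value >= 1 << (8 * size):
        raise ValueError(f"value {value_hex} does not fit in {size} byte(s)")
    return offset, size, value


if __name__ == "__main__":
    print(parse_device_selector("12eb:*"))
    print(parse_register_write("40.B=ff"))
```

      What the byte at that offset actually controls on an Aureal card is still not something this sketch can tell you.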

  16. This is embarassing by jidar · · Score: 3, Insightful

    This is embarrassing to the Linux community as a whole, and it also explains why I've had problems with crashes on two different systems running Linux and Athlons.

    What I don't understand is how this could have made it so far? This is exactly the sort of problem I have been telling people we don't have in the Linux world, and now it looks like I was wrong. Is this pointing out an underlying problem we have with QA in the Linux kernel? With Open Source in general? What can we do to make sure that bugs of this magnitude are detected more quickly?

    --
    Sigs are awesome huh?
    1. Re:This is embarassing by Grelli · · Score: 2, Insightful
      This is embarrassing to the Linux community as a whole

      Actually, it isn't embarrassing at all. It wasn't the "Linux Community"'s fault. This is the fault of AMD who announced/classified the bug as a Windows 2000 issue instead of a hardware issue. Many posters have pointed out that kernel hackers probably don't follow hardware bug reports for OTHER operating systems.

      The failing was on AMD's part, and nobody else. But don't get me wrong, I love AMD, and this won't change my overall opinion of them. If things like this continually happen, then I may have to reconsider. But if this is a one time thing, I'm not going to get overly mad, and I hope no-one else does either.

    2. Re:This is embarassing by Anonymous Coward · · Score: 0

      Well, if the Athlon and accompanying chipsets were indeed x86 compatible, as they claim, then this problem wouldn't exist.

      The fact is that the Windows community is being bitten by a bug that is the fault of Microsoft, Nvidia, VIA, or AMD. Could this be the same bug Linux is seeing? Nobody knows because everybody is pointing their fingers at everyone else. I have no links but if you've been following The Inquirer for the past couple of months, you've seen it discussed.

    3. Re:This is embarassing by Dahan · · Score: 3, Informative
      Actually, it isn't embarrassing at all. It wasn't the "Linux Community"'s fault. This is the fault of AMD who announced/classified the bug as a Windows 2000 issue instead of a hardware issue.

      If you read the technical writeup on LKML, you'll see that it's not a hardware issue, but a software bug. Which is why AMD announced the bug as a Windows 2000 issue--it is one. Linux also happens to have the same bug (it's a subtle issue and an easy mistake to make, IMO), but how was AMD supposed to know that Linux was doing the same bad thing--mapping the AGP GART area cacheable, when the GART is non-cacheable?

    4. Re:This is embarassing by LordNimon · · Score: 2, Informative
      but how was AMD supposed to know that Linux was doing the same bad thing

      Oh, that's easy. The engineer who discovered the problem should have realized that it's not necessarily a Windows-specific issue, but a problem that any OS could have. He should have then tried to contact all the OS vendors, not just Microsoft.

      Considering how Linux is used by a higher percentage of AMD customers than Intel customers, AMD should have paid more attention to an important segment of its customer base.

      --
      And the men who hold high places must be the ones who start
      To mold a new reality... closer to the heart
    5. Re:This is embarassing by Anonymous Coward · · Score: 0

      Erm because the source is there for them to look at?

    6. Re:This is embarassing by whovian · · Score: 2, Interesting

      how was AMD supposed to know that Linux was doing the same bad thing-

      How did AMD know that Windows-* was doing the bad things? I guess it didn't occur to AMD to download and inspect the kernel source code or talk to the linux kernel mailing list(s) and developers? It seems to me that that is effectively what they would have had to do with Microsoft.

      OFF-TOPIC: This sort of touches on the point another poster made 1-2 weeks ago or so. I probably will recall the specific accusations incorrectly (and hence flamed), but the gist of that post was the hypothesis that AMD has a loyal following of users, in particular linux users, and it would be nice if AMD reciprocated a little in recognition of that. I am largely ignorant of AMD's contributions to the community per se, so put the flame on a low setting, ok?, as I am an AMD newlywed myself :)

      --
      To-do List: Receive telemarketing call during a tornado warning. Check.
    7. Re:This is embarassing by leviramsey · · Score: 1
      Considering how Linux is used by a higher percentage of AMD customers than Intel customers, AMD should have paid more attention to an important segment of its customer base.

      On the desktop, that's probably true. But the vast majority of Linux installs are on servers. What's Intel's server market share advantage over AMD? It's highly unlikely that any Linux user advantage that AMD has over Intel in desktops/workstations obviates Intel's huge lead in the server domain.

    8. Re:This is embarassing by Dahan · · Score: 3, Insightful
      How did AMD know that Windows-* was doing the bad things?

      Maybe because Microsoft reported the problem to them and asked for help?

      Perhaps I worded my question poorly--why would AMD even think that Linux had the same bug as Windows 2000? Whenever you see a Windows bug, do you usually wonder if Linux has the same bug? They're completely different codebases, and there's no reason to think that a bug in one OS would be present in the other.

    9. Re:This is embarassing by mrm677 · · Score: 1

      The Linux people should have realized that if AMD truly is "Pentium"-compatible, which is what AMD advertises, then there is no such thing as a Windows 2000 bug. Think about it.

      To be classified as Pentium-compatible, the only difference between the behavior of an Athlon and a PIII should be the return value of the instruction that gives you the CPU ID.

    10. Re:This is embarassing by sasami · · Score: 1
      Oh, that's easy. The engineer who discovered the problem should have realized that it's not necessarily a Windows-specific issue, but a problem that any OS could have. He should have then tried to contact all the OS vendors, not just Microsoft.
      Really? Are you volunteering? When I see software doing something stupid like mapping an uncacheable device as cacheable, I don't stop and say, "Gee, I wonder how many other people are also being stupid!" and then spend my time examining everyone else's code looking for the same goof.

      And yes, I do this kind of stuff for a living.

      --
      I like canned peaches.
      --
      Freedom is not the license to do what we like, it is the power to do what we ought.
    11. Re:This is embarassing by whovian · · Score: 1

      I am pretty sure you meant that rhetorically, but, no, I would generally expect Windows bugs and linux bugs to be different in nature. I think the examples speak for themselves. The OSes are of different code base and were written with different goals in mind.

      IMO, AMD deserves some credit for discussing the problem and not turning away now that it has come up.

      --
      To-do List: Receive telemarketing call during a tornado warning. Check.
    12. Re:This is embarassing by LordNimon · · Score: 1

      An email to the linux kernel mailing list would have taken 2 minutes. Hardly a big investment in time.

      --
      And the men who hold high places must be the ones who start
      To mold a new reality... closer to the heart
    13. Re:This is embarassing by Hoser McMoose · · Score: 1

      In that case the PentiumPro, Pentium II, Pentium III, Celeron and Pentium 4 are also not "Pentium-compatible" by your definition, because they have at least as many differences from the original Pentium chips as AMD does. What's more, these chips are not 100% compatible with the 486 or 386 either. And no, I'm not talking about new features being added.

      Have a look at Intel's "Specification Updates" some time. Intel has over 80 errata listed for their Pentium III processor in ways that it differs from the "expected behavior".

      Processor bugs happen.

  17. More information by DragonHawk · · Score: 5, Informative

    Yesterday, information became widely available that described possible stability issues (system crashes, hangs, etc.) when using an AGP video card under Linux in conjunction with an AMD Athlon processor. It was generally called a "bug" in the Athlon CPU.

    More information is now available at http://www.gentoo.org, including an analysis of AMD's response. AMD's official response was posted to LKML, and is available at http://www.geocrawler.com/lists/3/Linux/35/175/7626960/.

    There is apparently some kind of bad interaction between the AGP GART ("Graphics Address Remapping Table", I think?), speculative memory operations performed by the Athlon processor, the memory mappings used by the kernel, and cache coherency. The details are beyond me, but the practical upshot appears to be that the wrong data ends up being written back to main memory at some point.

    I recommend reading the above LKML thread if you suspect you are affected by this issue. Information is still being uncovered, and it is not immediately clear how this occurs, what causes it, who is affected by it, and how to work around it.

    In particular, there is some uncertainty as to whether the "mem=nopentium" option actually prevents the problem, or merely makes it less likely to occur.

    --

    dragonhawk@iname.microsoft.com
    I do not like Microsoft. Remove them from my email address.
    1. Re:More information by austad · · Score: 2

      I have a Soyo Dragon Plus with an XP 1700+. When I don't run X, my system works fine. However, under X, I get random lockups of the entire system. I've tried the mem=nopentium, and while it doesn't crash as often, it still crashes. I've also tried the kernel AGPGART code, and tried using the AGP stuff included with Nvidia's kernel driver, but I still get the same crashes.

      This is actually getting quite annoying. When I was running kernel 2.4.8, it only locked up once every couple of days, but using 2.4.13 or 2.4.17, I get lockups almost every time I'm using X. My system is basically unusable right now, which really sucks. I hope they find a solution soon because I don't really like having to drag a windows laptop home with me every night.

      --
      Need Free Juniper/NetScreen Support? JuniperForum
    2. Re:More information by damiam · · Score: 1

      What's the point of a sig about your email address when you don't even display one?

      --
      It's hard to be religious when certain people are never incinerated by bolts of lightning.
    3. Re:More information by iamabot · · Score: 1

      same issues under winXP, although it only occurs currently while playing quake 3.

      i've been told my specific problem is with the via chipset and its interaction with the geforce, but i'm starting to wonder if this issue isn't in some manner tied to the problems i'm experiencing.

    4. Re:More information by Anonymous Coward · · Score: 0

      That's impossible! Linux is 100% rock solid stable and never has problems.

      You are clearly doing something wrong, you idiot. Go back to Windows you freak!

    5. Re:More information by xwred1 · · Score: 1

      I've got a dual AthlonMP system running on a Tyan Thunder K7, which uses the AMD 760MP chipset. I had been having what seemed to be video related issues with it, and was told by an Nvidia employee that my particular board seems to have stability issues when the AGP bus is under a heavy load.

      Since then, I have disabled AGP altogether with the Nvidia drivers (`Option "NvAGP" "0"` in my XF86Config) and I am stable as a rock, although I have lost a bit of 3D performance.

      Then this bug surfaces. I hope they resolve it, so I can get my AGP performance back. Until then, I suggest you disable AGP like I did to maintain your stability.

  18. Re:this is something.. by Anonymous Coward · · Score: 0

    LOL! Amen to that, brother.

  19. athlon xp dissection by Anonymous Coward · · Score: 3, Funny

    Recently one of my friends, a computer wizard, paid me a visit. As we were talking I mentioned that I had recently installed Windows XP on my PC. I told him how happy I was with this operating system and showed him the Windows XP CD. To my surprise he threw it into my microwave oven and turned it on. Instantly I got very upset, because the CD had become precious to me, but he said, "Do not worry, it is unharmed."

    After a few minutes he took the CD out, gave it to me and said, "Take a close look at it."

    To my surprise the CD was quite cold to hold and it seemed to be heavier than before. At first I could not see anything, but on the inner edge of the central hole I saw an inscription, an inscription finer than anything I had ever seen before. The inscription shone piercingly bright, and yet remote, as if out of a great depth:

    12413AEB2ED4FA5E6F7D78E78BEDE820945092OF923A40EE lO E5IOCC98D444AA08EI324

    'I cannot understand the fiery letters,' I said in a timid voice.

    "No, but I can," he said. "The letters are Hex, of an ancient mode, but the language is that of Microsoft, which I shall not utter here. But in common English this is what it says:

    'One OS to rule them all, One OS to find them,
    One OS to bring them all and in the darkness bind them.'

    It is only two lines from a verse long known in System-lore:

    'Three OS's from corporate-kings in their towers of glass,
    Seven from valley-lords where orchards used to grow,
    Nine from dotcoms doomed to die,
    One from the Dark Lord Gates on his dark throne
    In the Land of Redmond where the Shadows lie.
    One OS to rule them all, One OS to find them,
    One OS to bring them all and in the darkness bind them,
    In the Land of Redmond where the Shadows lie.'"

    1. Re:athlon xp dissection by Anonymous Coward · · Score: 0

      JOKE:

      I once left two WinXP CDs on my dashboard. When I finished work I found out that someone broke into my car... and left two more CDs.

    2. Re:athlon xp dissection by Anonymous Coward · · Score: 0

      Damn, almost wet myself laughing

  20. All of the above. by Christopher+Thomas · · Score: 5, Informative

    AMD claims it's not a bug with the Athlon processor, but with the motherboard

    According to young bald children everywhere, "There is no bug".

    In related news, the motherboard manufacturers are quoted as saying, "It's not a bug with the motherboard, but with the Athlon processor."


    Funny, I didn't think I was bald...

    It's an Athlon bug if you think doing speculative writes is a bug.

    It's a motherboard chipset bug if you think that the AGP controller should play nicely with cache-coherence protocols (right now it doesn't, presumably to gain a speed boost).

    It's an OS bug if you think that the OS should be bright enough not to make AGP-touched memory cacheable (it wasn't intended to be).

    I'm voting for option 3), myself.
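
    The three views above can be illustrated with a deliberately simplified toy model. This is only a sketch under stated assumptions, not a description of the real hardware: the class and function names are hypothetical, the "cache" tracks one value per line, and a "speculative write" is modelled as pulling a line into a write-back cache and marking it dirty without changing it. The point is only that a dirty line written back later can clobber data that reached RAM through a path the cache never sees.

```python
# Toy model (all names hypothetical): a write-back cache that does not
# snoop writes arriving through a non-coherent path such as the GART.

class ToyWriteBackCache:
    """Write-back cache with no coherence against device-side writes."""

    def __init__(self, memory: dict):
        self.memory = memory
        self.lines = {}  # line number -> (value, dirty flag)

    def speculative_write(self, line: int) -> None:
        # Assumption: a speculative store that is later cancelled still
        # leaves the line in the cache, marked dirty, with its old value.
        self.lines[line] = (self.memory[line], True)

    def evict(self, line: int) -> None:
        value, dirty = self.lines.pop(line)
        if dirty:
            # The write-back overwrites whatever is in RAM right now.
            self.memory[line] = value


def uncached_write(memory: dict, line: int, value: str) -> None:
    """A write that reaches RAM without passing through the CPU cache."""
    memory[line] = value


def run(cacheable: bool) -> str:
    """Return what ends up in RAM for line 0."""
    memory = {0: "old"}
    cache = ToyWriteBackCache(memory)
    if cacheable:
        cache.speculative_write(0)      # line is now cached and dirty
    uncached_write(memory, 0, "new")    # fresh data lands in RAM
    if cacheable:
        cache.evict(0)                  # stale line is written back
    return memory[0]


if __name__ == "__main__":
    print("mapped cacheable:  ", run(cacheable=True))
    print("mapped uncacheable:", run(cacheable=False))
```

    In this toy, marking the region uncacheable (option 3) is the only change needed for the fresh data to survive, which is why the OS looks like the cheapest place to fix it.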

    1. Re:All of the above. by denzo · · Score: 1, Offtopic
      It's an OS bug if you think that the OS should be bright enough not to make AGP-touched memory cacheable (it wasn't intended to be).

      I'm voting for option 3), myself.

      I think this revelation makes this parent worthy of being modded up.
    2. Re:All of the above. by doorbot.com · · Score: 2

      It's an OS bug if you think that the OS should be bright enough not to make AGP-touched memory cacheable (it wasn't intended to be).

      I'm voting for option 3), myself.


      I don't know about that... if you'll recall there have already been pre-emptive blamings from the Linux gurus, so since they're always right, well, it must be something else.

      So I quote from Gentoo.org:
      The bad news is that a major Athlon CPU bug has been discovered, and it affects Linux 2.4. Note that this is a bug in the actual CPU itself, and is not a Linux bug.

      And:
      And, the kind folks at AMD even created a simple patch for Windows 2000 that disables extended paging by tweaking the registry.

      Cache coherency aside, what about logic coherency? These were posts in the same article. To me, it seems as if the Linux guys and gals are ready to blame everyone else but themselves (sounds like the phone company!).

      So quick to blame others... but why not? I mean, if everyone believes you, then you're good to go!

    3. Re:All of the above. by Anonymous Coward · · Score: 0

      I think this revelation was spelled out in great detail in the article which was linked to 3 different times in the summary.

    4. Re:All of the above. by Stiletto · · Score: 3, Insightful

      It's an OS bug if you think that the OS should be bright enough not to make AGP-touched memory cacheable (it wasn't intended to be).

      I'm voting for option 3), myself.


      I thought one of the main benefits of AGP was the ability to remap a bunch of non-contiguous physical blocks into one address space, so the entire bunch could be marked as cacheable (for instance when DMA'ing a bunch of vertices across the bus).

    5. Re:All of the above. by Anonymous Coward · · Score: 0

      I'm surprised you're not posting at -1 with that sig. Yes I'm posting anonymously to protect my precious Karma... mmmmmmmm Kaaaarma.....

    6. Re:All of the above. by binner1 · · Score: 1

      sounds like the phone company!

      Do you live in Thunder Bay too??? <grin>

      -Ben

    7. Re:All of the above. by BlueUnderwear · · Score: 2
      And, the kind folks at AMD even created a simple patch for Windows 2000 that disables extended paging by tweaking the registry.
      Cache coherency aside, what about logic coherency? These were posts in the same article. To me, it seems as if the Linux guys and gals are ready to blame everyone else but themselves (sounds like the phone company!).
      Ermh, the bit about that Windows registry tweak actually speaks in favor of Linux (i.e. 'tis indeed a hardware bug). Indeed, the following questions come to mind:
      • Wow, Windows and Linux stricken by the same bug. What's the probability of that?
      • Why didn't they simply tweak the registry to make that 4MB AGP section non-cacheable, rather than switching off 4MB paging altogether?
      --
      Say no to software patents.
    8. Re:All of the above. by WNight · · Score: 4, Funny

      > Wow, Windows and Linux stricken by the same bug. What's the probability of that?

      Probably quite good. I imagine if you examine both systems carefully you'll see a BSD license agreement in the system binaries that deal with AGP. :)

    9. Re:All of the above. by Anonymous Coward · · Score: 0

      If so, why doesn't any *BSD support AGP GART?

    10. Re:All of the above. by fredrik70 · · Score: 1

      I doubt there's anywhere in the BSD licence that states that the code in question has to run on a *BSD system as well.... ;-)

      --
      if (!signature) { throw std::runtime_error("No sig!"); }
    11. Re:All of the above. by rabidcow · · Score: 2

      Why would you cache memory for DMA? It doesn't go through the processor (that's the Direct part) so you certainly wouldn't store it in the processor's cache. In fact, you'll only be accessing each location once, to copy it across the bus. Unless the memory can be read from ram faster than it can cross the bus, but I don't think that's the case...

    12. Re:All of the above. by Anonymous Coward · · Score: 0

      Not code from *BSD, but BSD-licensed reference code from the inventors of AGP.

    13. Re:All of the above. by Hermanetta · · Score: 0

      It's not really that you are out to cache memory for DMA. But if the processor does anything with data in that same memory, and caching is not explicitly turned off for that range, then data in those memory locations will float up and down the processor cache hierarchy.

      For example, if you create a bunch of triangles or textures in memory, you could then pass them to the card, but the memory used has been cached by virtue of being used by the processor. In the other direction, you read some block from a HD, which uses DMA, and then a program processes it, like playing an mp3 or whatever. In this case YOU WANT the mp3 data to be cacheable for sure or else the processing might take forever. A full round trip might be that you read from the HD through DMA, then process, then send sound to the sound card through DMA and send other stuff to the 3D card through DMA too. If you are a real bastard you might be able to read from the HD and send to the sound card without processing, using DMA both ways.

      As for AGP being the only thing that takes disparate physical memory chunks and makes them whole for hardware, I don't think this is true. For starters this was a classic problem for any DMA type controller for virtual memory systems.

      The problem is that a program would allocate and/or read/write to/from what looked like a contiguous virtual memory block for it. But real physical memory which backed the virtual memory would more than likely be very fragmented. So how do you do DMA at all if you have a 2/4K page size and you want a 64K block read or written?

    14. Re:All of the above. by Ben+Hutchings · · Score: 2

      The main motivation for the development of AGP was to allow the graphics controller to use main memory rather than its own separate memory, at a time when memory was getting more expensive.

  21. Don't be so sure by DragonHawk · · Score: 1

    "this is something....that mac users dont have to worry about. :)"

    While I suspect you are correct, there is/was discussion on LKML about whether or not other architectures would be vulnerable to this issue. The PPC was specifically mentioned.

    --

    dragonhawk@iname.microsoft.com
    I do not like Microsoft. Remove them from my email address.
  22. Looks like a "simple" problem by Anonymous Coward · · Score: 0

    From what I've read (you did read the links provided) it looks like the problem occurs when the operating system tells the processor that memory mapped onto the AGP bus is cacheable.

    Doesn't look like a big show stopper to me.

  23. Similar problems on an intel P4! by DavidJA · · Score: 2

    it's not a bug with the Athlon processor, but with the motherboard

    I somehow wonder if this is related! I had a P3 system with a GeForce 2head card and everything was working fine. I replaced the motherboard with an ASUS P4B and an Intel P4 chip. Ever since, I intermittently get a BSOD (bad pool caller).

    Point is, isn't this very similar to the problems that were reported on AMD Win2k systems without the patch?

    1. Re:Similar problems on an intel P4! by Hallow · · Score: 1

      You will not get a BSOD with this problem. The machine will just hang (be it Win2k or Linux). I've been having this problem for quite a while. I thought it was the video card overheating.

  24. Re:this is something.. by drobbins · · Score: 1

    Actually, Linux PPC kernel developers are taking this issue very seriously since it *could* potentially affect PPC systems too. No confirmation, and probably *not*, but a long conversation ensued over the PPC's "BAT" on LKML....

    Best Regards,

    Daniel Robbins

    --
    Daniel Robbins
  25. Well, I'll be by Anonymous Coward · · Score: 1, Informative

    Well, I'll be darned. Vendors pointing the finger at each other. Who'd have thought?

  26. Not a bug, a design issue by DragonHawk · · Score: 1

    Information is still sketchy, but this appears to be a design issue (i.e., different components interacting badly due to assumptions) rather than a bug. It could even happen on non-Athlon systems, if the processor in use happens to tickle the same conditions.

    --

    dragonhawk@iname.microsoft.com
    I do not like Microsoft. Remove them from my email address.
  27. Re:this is something.. by dhamsaic · · Score: 0, Flamebait

    Pardon me for sounding harsh but...

    WHAT THE FUCK ARE YOU TALKING ABOUT?

    Firstly, Jobs has been back at Apple since 1996/1997. Uh... one, two, TWELVE! No.

    MacOS X wasn't intended to make a surge - rather, it was meant to gradually move into the market place. It is doing this - more people are moving over to MacOS X every day. I know because I'm a hard-core UNIX geek, and I bought a new iBook and a PowerMac G4 800DP. 40% of Apple's computer sales are from people who have never owned a Macintosh before.

    iMac pre-sales are doing great. Demand has been, according to all sources, very high. Hence, they are not able to meet production demands right away, and some units will be shipped a bit later. All this because no one's buying, right?

    Apple posted a $66 Million profit in the quarter ending September, and some $38 Million profit for the quarter ending December.

    Between November 10 and December 30, more than 125,000 iPods were sold (one to me, too - listening to it right now - it is awesome) - at $400 each, that's some $50 Million in revenues. On a fucking MP3 player.

    The Titanium G4 PowerBook has been selling like crazy. I know because I live near an Apple store, and every time I'm in there, someone else is buying one. No joke.

    The new iBook has got to be one of the most successful consumer portables ever. And it hasn't even been out for a year yet. It is selling faster than you can imagine - and to big customers, too. Maine ordered 36,000 for their teachers and students. Some district in Virginia ordered 24,000.

    All this because Apple & the Macintosh are dying?

    Apple has been "dying" for the past 20 years. Shit, I used to spout the same BS. I hated Apple and the Macintosh. However, one can only lie to oneself for so long - with MacOS X and the new hardware lineup, Apple is churning out the best personal computers in the industry today. One pays a little more (on the high-end machines), but one is rewarded with a premium machine - much like one pays more for a BMW than for a Geo Metro.

    Only two computer manufacturers are making money currently. Apple & Dell. To quote Steve Jobs, "Dell does it by being Wal Mart. We do it by innovating."

    Well, however they do it, they're the only computer company doing anything exciting in the field today. One must applaud them for that.

    --
    Every once in a while I like to masturbate a new word into my vocabulary, even if I don't know what it means.
  28. I'm wondering if this issue is related by cecil36 · · Score: 2, Troll
    I'm running M$ Windoze on an Athlon 750. I noticed that the system crashes on a few occasions. Specific incidents where a crash can be predicted are when I
    • Play Diablo II on Battle.net. The crash occurs when I try to join an Open game after leaving another Open game.
    • Run M$ Word for an extended period of time. The crash comes when I scroll through a large document or paste a lot of text or images from another source.

    I also noticed that as I run programs, not all the memory used by the program is freed when the program terminates. I ran the System Monitor and it revealed to me this information. I'm not sure if this is Athlon or Windoze related. Anyways, I'm suspecting that the problem may not be limited to Linux boxes.
    1. Re:I'm wondering if this issue is related by MrResistor · · Score: 1, Offtopic
      Not sure about the MS Word thing, but I'd say your Diablo II crashes are a Blizzard issue. I've had plenty of problems with that game; lots of crashes, mostly, although crashes are a lot less severe and a lot less frequent with the current patch. The most irritating problem, though, is that my Play disc is unreadable by my DVD drive. This forces me to use my 4x8x CD-R (that's right, no -RW) which is painfully slow. I suspect it's the totally ineffective copy-protection that's causing that particular problem.

      --
      Under capitalism man exploits man. Under communism it's the other way around.
    2. Re:I'm wondering if this issue is related by Anonymous Coward · · Score: 0

      Well, offhand, I'd say it's probably because you're using a crappy closed-source OS that's infamous for poor quality. Or were you trying to be funny?

    3. Re:I'm wondering if this issue is related by Magila · · Score: 1

      The Word bug is well known. I have experienced it myself and it has nothing to do with the processor.

    4. Re:I'm wondering if this issue is related by Anonymous Coward · · Score: 0

      All the memory is rarely freed in Windows. There are managers to help with that, but they take up memory.

    5. Re:I'm wondering if this issue is related by Anonymous Coward · · Score: 0

      This particular bug would cause a hard system crash or lockup, not an application fault. (admittedly, it's not easy to tell the difference running 9x - you will usually get a lockup, but sometimes a VxD error.)

      The Word crash is 100% due to 9x's crappiness. Upgrade to NT/2K/XP and it will go away.

    6. Re:I'm wondering if this issue is related by cecil36 · · Score: 1

      I know M$ stuff is crap, but I also heard from a consultant at a trade show that the Athlon does not do a good job with multitasking office applications. This is why I posted my $0.02 above in hopes of getting feedback from other /.ers who spend more time researching this stuff than I do.

  29. Does it affect other versions of Windows? by dcr · · Score: 1

    I have been to the AMD site, the Microsoft site, and all of the others mentioned, but I have yet to see any mention of any version of Windows being affected other than Windows 2000. Windows XP is not affected - I found that out here and on the AMD site (Microsoft's site, oddly enough, does not mention this). Does anyone know if the bug affects Windows 98?

  30. so, how do I tell if I have the problem.... by Radnimax · · Score: 1

    if my computer hasn't had any lockups in 3d? I use a voodoo card and they don't take advantage of agp so that doesn't help me. Is there another way to find out?

    --
    "You can kill a man, but you can't kill what he stands for. Not unless you first break his spirit."-Smoking man,X-Files
  31. this is not a motherboard bug either... by PianoMan8 · · Score: 1

    This is a problem with the way the AGP GART memory mapping is handled in the Linux kernel. So once again, Slashdot got it wrong. Please, skim the articles in question before posting. It is a minor, elusive buglet in the Linux kernel memory management.

    This is all caused by AGP. Once again, the race for more frames per second in quake3 has caused a stability damaging technology to become mainstream.

    ugh..

    john.c

    --
    - --
    "I Hate Quotes" -- Samuel L. Clemens
    1. Re:this is not a motherboard bug either... by barawn · · Score: 4, Interesting

      Interestingly enough, this feature of AGP is not really critical to increasing performance in games - in fact, it could be counterproductive to it.

      The AGP GART (Graphics Address Remapping Table, I believe) maps "video card memory addresses" to "main memory addresses", i.e., it's to allow the graphics card to grab textures, etc. directly from main memory without going through the CPU.

      Many motherboard manufacturers use this feature to provide on-board video without any dedicated memory so they don't have to include any additional memory for the graphics card.

      Of course, since this blows so massively performance-wise, it's mostly abandoned now.

      Is the GART actually useful for anything except extending the video card's onboard memory? I'm not really sure...
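
      The remapping described above can be sketched as a simple lookup table. This is a minimal illustration, assuming 4 KiB pages and a made-up aperture base address; `ToyGart` and its fields are hypothetical names, not the layout of any real chipset's table.

```python
# Toy GART (hypothetical): the graphics card sees one contiguous aperture,
# while the backing pages are scattered around physical memory.

PAGE_SIZE = 4096  # assumed 4 KiB pages


class ToyGart:
    """Maps contiguous aperture pages to scattered physical page frames."""

    def __init__(self, aperture_base: int, physical_frames):
        self.aperture_base = aperture_base
        # Entry i is the physical page frame number backing aperture page i.
        self.table = list(physical_frames)

    def translate(self, aperture_addr: int) -> int:
        offset = aperture_addr - self.aperture_base
        if offset < 0:
            raise ValueError("address is below the aperture")
        index, within_page = divmod(offset, PAGE_SIZE)
        if index >= len(self.table):
            raise ValueError("address is beyond the aperture")
        return self.table[index] * PAGE_SIZE + within_page


if __name__ == "__main__":
    # Three scattered frames presented as 12 KiB of contiguous aperture.
    gart = ToyGart(0xE0000000, [0x1234, 0x0042, 0x9000])
    addr = 0xE0000000 + PAGE_SIZE + 0x10  # second aperture page, offset 0x10
    print(hex(gart.translate(addr)))
```

      The card only ever issues aperture addresses; the table decides which physical page each one lands on, which is what lets fragmented system memory look like one block of texture memory.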

    2. Re:this is not a motherboard bug either... by addaon · · Score: 3, Informative

      While the use of the GART you mention (video chipsets with no onboard memory) really does suck, performance-wise, the GART itself is not useless. Most games today limit themselves to 16MB or so of textures, so that they run properly, without swapping to main memory, with a 32MB video card. However, if you want a game with 256MB of textures, say, you have three options.

      1) Get a video card with 270+MB of memory. (Yeah, right.)

      2) Snatch from main memory the portions of the texture you need. (This gets slow AND ugly if you use more than ~16MB in a single frame.)

      3) Use the GART, take (less of) a performance hit, and just keep the textures in system memory.

      This was the original purpose of the GART, and is still important.

      --

      I've had this sig for three days.
    3. Re:this is not a motherboard bug either... by Anna+Merikin · · Score: 1

      I first saw mention of this `bug' in the alt.os.linux.mandrake ng some months ago. The linux.redhat ng I also frequent was void of complaints. The Nvidia GeForce Xxx chipsets running on VIA-equipped mobos seem to interact with the GART and the spec fetches or whatever. I suspect Mandrake users are more likely to use the Nvidia product (or perhaps any product that uses AGP-4x...) than Redhat users, of which I am one. A possible fix: turn off AGP-4x in BIOS. Games won't run as fast, but a box that crashes is useless no matter how clear the action was when it froze.

  32. Don't cache it then! by Papineau · · Score: 4, Insightful

    From the LKML post linked in the story, it seems it's because some 4MiB pages (I couldn't understand why 4KiB pages aren't affected, if they effectively are not) are allocated for the AGP (GART more specifically) with some bits set telling it is cacheable.

    Why would somebody want to cache the AGP memory? I'm pretty sure it's used 99.99% of the time as write-only memory, because it's the main output method of most computers. What's the point of caching that? It can only prevent the use of the CPU cache by some more important things, no?

    Feel free to correct me if I'm wrong, I'm not very familiar with the usage of AGP memory (or GARTs).

    1. Re:Don't cache it then! by tommck · · Score: 2
      Excuse my ignorance, but what the heck are "MiB" and "KiB" ??

      T

      --
      ---- It puts the lotion on its skin or else it gets the hose again. It does this whenever it's told.
    2. Re:Don't cache it then! by slamb · · Score: 1
      Excuse my ignorance, but what the heck are "MiB" and "KiB" ??

      Mebibytes and kibibytes. They refer to 2^20 and 2^10 bytes, respectively. (I.e., what many other people call megabytes and kilobytes.)

      The scientific community had decided on SI unit prefixes like 100 years ago. "Mega" means 10^6 and "kilo" means 10^3. The computer science people came along and said "no, those will be powers of two for us." These units are a (probably futile) attempt to correct that particular stupidity.

      I think it won't work out, because there's so much legacy stuff that there will always be confusion at this point about what "mega" and "kilo" mean with computers. Besides, "mebi" and "kibi" sound stupid enough that they'll probably never catch on.
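
      To put numbers on the difference between the two sets of prefixes, here is a short sketch. The helper name `shortfall_percent` and the 40 GB drive are made up for illustration.

```python
# Decimal (SI) versus binary (IEC) prefixes for byte counts.
KB, MB, GB = 10**3, 10**6, 10**9      # kilo, mega, giga
KiB, MiB, GiB = 2**10, 2**20, 2**30   # kibi, mebi, gibi


def shortfall_percent(si_unit: int, iec_unit: int) -> float:
    """How much smaller the SI unit is than the IEC unit, in percent."""
    return (iec_unit - si_unit) / iec_unit * 100


if __name__ == "__main__":
    pairs = (("kilo/kibi", KB, KiB), ("mega/mebi", MB, MiB), ("giga/gibi", GB, GiB))
    for name, si, iec in pairs:
        print(f"{name}: {si} vs {iec} bytes, {shortfall_percent(si, iec):.2f}% apart")
    # A hypothetical drive advertised as "40 GB" (decimal), expressed in GiB:
    print(f"40 GB = {40 * GB / GiB:.2f} GiB")
```

      The gap grows with each prefix, which is why the ambiguity matters more for hard drive sizes than it ever did for kilobytes.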

    3. Re:Don't cache it then! by Papineau · · Score: 1

      They won't catch on if nobody uses them, as Divx showed us (not the one you're using, the other, older one). But I still think that these units are better, if only resulting in better advertisement from the HD makers.

    4. Re:Don't cache it then! by sketerpot · · Score: 1
      They stand for Mebibytes and Kibibytes, respectively. They are more proper ways of saying what most people just say MB and KB for.

      There was an article on slashdot a while back about these...

    5. Re:Don't cache it then! by tommck · · Score: 2
      Well, thanks for the info... that jogged some sort of neurons... lost in a sea of beer and television. :-)

      T

      --
      ---- It puts the lotion on its skin or else it gets the hose again. It does this whenever it's told.
    6. Re:Don't cache it then! by Anonymous Coward · · Score: 0

      Does that mean the next step up is "GiB"? And doesn't "Gibibytes" sound like a twisted brand of dog food?

      I play too much quake

    7. Re:Don't cache it then! by Mr.+Fred+Smoothie · · Score: 2
      And doesn't "Gibibytes" sound like a twisted brand of dog food?
      No, it sounds like what you might get if you teased the lead singer of the Butthole Surfers with a ham sandwich.
      --

    8. Re:Don't cache it then! by SurfsUp · · Score: 2

      Excuse my ignorance, but what the heck are "MiB" and "KiB" ??

      They are "MibbleBytes" and "KibbleBytes" respectively.

      ;-)

      --
      Life's a bitch but somebody's gotta do it.
  33. Re:this is something.. by Cheeko · · Score: 1

    Compaq is currently making money as well, though it's tough to consider them in the same class since less than a third of their business is PCs anymore.

  34. Not necessarily the motherboard by jayteedee · · Score: 1

    It could be either the motherboard or the OS which is collecting and writing stale data. About the only conclusion that can truly be drawn so far is that it is unlikely to be an AMD defect. I'm sure AMD would like this to be the case too!

    --
    Religion and science are both 90% crap..but that doesn't negate the other 10%.
  35. is this why... by BenTheDewpendent · · Score: 1

    battle realms keeps crashing on my machine?

  36. Troll - Mod this wanker down by Anonymous Coward · · Score: 0

    What a crock of shit. Maybe I should make some software using AMD quirks and then claim that it proves the problems with the problematic Intel chips.

    As if "lower cost" has shit all to do with it either.

  37. It is (not?) a CPU bug. by crandall · · Score: 2, Interesting

    If the bug doesn't appear on Intel chips, then how are we supposed to believe that it's not an AMD bug? Sure, we could blame the motherboard... but wouldn't that mean VIA/Intel solutions would carry the same issue?

    Anyone have any knowledge as to how Intel treats these 4MB pages differently?

    I mean, if the bug is caused by AMD's precaching of AGP GART mapped memory, and Intel just doesn't precache that memory, then how is it NOT an AMD processor bug?

    When two processors aren't equal, there has to be a reason for the difference in running software.

    (Note that I prefer AMD, so I'm just looking for answers, not trolling).

    1. Re:It is (not?) a CPU bug. by Anonymous Coward · · Score: 0

      fuck you

    2. Re:It is (not?) a CPU bug. by tommck · · Score: 5, Insightful
      If the bug doesn't appear on intel chips, then how are we supposed to believe that it's not an AMD bug?

      Well, based on my reading of other posts, it is a simple case of AMD taking advantage of some features of AGP that are within spec that Intel is not. When the OS assumes that things are done Intel's way instead of adhering to the spec, things will show up on an AMD processor and not on an Intel.

      AMD is doing things correctly, albeit differently from Intel. This is exactly how we are supposed to believe that it's not an AMD bug.

      T

      --
      ---- It puts the lotion on its skin or else it gets the hose again. It does this whenever it's told.
    3. Re:It is (not?) a CPU bug. by Anonymous Coward · · Score: 0

      My understanding is this. It's not a CPU bug because the CPU works as it was designed to. That happens to be a little different than some AGP implementations expected, including ones found on AMD's own chipsets. So it's not so much a bug in either as it is an incompatibility.

    4. Re:It is (not?) a CPU bug. by LordNimon · · Score: 1
      If the bug doesn't appear on intel chips, then how are we supposed to believe that it's not an AMD bug?

      AMD's CPUs support something (speculative writing) that Intel processors don't, and the Linux kernel has a bug that is only noticeable when this feature is used.

      Setting device memory as cacheable is a kernel bug, no matter what the processor does. It's a kernel bug even if you're using Intel chips.

      This sort of thing is commonplace in computers. A certain piece of software or hardware doesn't follow some specification to the letter, but because the components involved don't support any features that require strict compliance, the bug isn't noticed for years. Then, one component is updated to support a new feature that relies on strict compliance. That's when all the bugs appear.

      --
      And the men who hold high places must be the ones who start
      To mold a new reality... closer to the heart
    5. Re:It is (not?) a CPU bug. by crandall · · Score: 1

      Right. And I fully understand this, and figured that's probably what the case was. After all, this same issue is the main reason why you ever found hardware that was 'incompatible' with Athlon systems. Intel never adhered to any kind of specs, and hardware developers could be as sloppy as they wanted, assuming it worked. Then AMD/VIA came along, and made hardware that conformed to spec, and all of a sudden all this out of spec hardware wouldn't work.

      It's unfortunate really, because it's all a matter of numbers. Since more people use intel stuff (for now anyway), whenever those people hear about a bug like this, they assume (again, and wrongly), that AMD produces inferior processors, when it is actually a case of people being out of spec to begin with.

    6. Re:It is (not?) a CPU bug. by leviramsey · · Score: 1
      when it is actually a case of people being out of spec to begin with

      Another case of MS Kerberos-itis?

  38. Re:Troll - Mod this wanker down - You serious?? by Anonymous Coward · · Score: 0

    Gimme a break. None of the software described, including the Everquest upgrade, has ANYTHING to do with 'quirks' people programmed on purpose. It's just a fact of reality that AMD isn't 100% compatible. Or do you think this whole Slashdot article you responded to is all a conspiracy against AMD by Intel to make them put bugs in it on purpose? If anyone is a troll, it's you.

  39. Boycott $lashdot by Anonymous Coward · · Score: 0

    They are evil! They contribute to global warming, fund animal experiments and promote terrorism.

  40. Re:Easy - Buy Intel. The cost of using 2nd party.. by Anonymous Coward · · Score: 0

    That's interesting. The EQ patch never affected my Athlon system. You must be using an inferior motherboard.

  41. new thinkgeek item specifically designed for AMD?? by edrugtrader · · Score: 2

    .... "don't blame me, it's a motherboard problem"

    --
    MARIJUANA, SHROOMS, X: ONLINE?! - E
  42. The Nature of the Bug by 4of12 · · Score: 3, Insightful

    Hmmmm.

    Is the Bug...

    • (A) In the Athlon cache?
    • (B) In the chipset?
    • (C) In the AGP-using devices misusing memory?
    • (D) In the Linux kernel?
    Well, AFAICT, the real bug is in the communication of relevant knowledge.

    These kinds of bugs would have significantly shorter duration if the specifications for all four possible culprits in (A)-(D) were openly published, completely, for all to see.

    --
    "Provided by the management for your protection."
    1. Re:The Nature of the Bug by scott1853 · · Score: 3

      I was going to mod you up till I read your last sentence. :)

      Open specification? The OS is open source. That doesn't mean that anybody in their right mind would want to read through it, but it's available.

      Chipset specs are all agreed upon by a standards body, as are the bus specs. It's not like any one manufacturer was keeping a protocol secret so everybody just guessed at it. If you want to see them, go ahead, I'm sure you can find information. It's not going to be dumbed down and summarized for you though. It's going to be white papers with lots of weird electrical diagrams and acronyms.

      What we're seeing here is when several layers don't correctly define or handle an error condition. You can compare it to U.S. laws. They define the general outline but don't actually define every possible parameter, they let the courts handle the specific details. Consider the OS the court.

      So most likely, something was left out of the spec and assumptions were made by several different manufacturers who each have their own take on how it should be done.

      But, since the software is easier to change than the hardware, that's the logical place to fix it. It doesn't take all the complaining and finger pointing that's going on this site. Just fix the damn thing, release an update, and move on. Although it's very interesting to see how so many bruised egos defend themselves in the ways that so many of us complain about when our enemies use those tactics.

    2. Re:The Nature of the Bug by (outer-limits) · · Score: 1
      This bug reminds me of some of the bugs I used to read about in the IBM mainframe OS, MVS/XA, (OS/390, OS/Z, etc.) It appears to me that there comes a level of complexity in all hardware/software, using current technology, that becomes unsustainable. The thing that surprises me is that it hangs together as well as it does so far.

      The claim used to be made that MVS was going to be overrun by the smaller operating systems because they were smaller, faster, better, (just read some of the sales blurbs the Unix developers used to write for themselves, they were pretty smug and self satisfied, if you read them now.) The fact is, if you want an operating system that does all the wonderful things that technology makes possible, controlling it and hooking it all up becomes in the end very difficult. MVS is still used by banks etc, because it has been adapted to do just that, over time.

      Perhaps it is time for Unix and Windows to take a step back and just ask how the complexity is going to be dealt with. The recent Linux paging debacle is not a once off, it is something that is not due to bad programming, it is a systematic problem that has to be addressed.

      GNU BSD may have taken a long time to not apparently get very far with its microkernel approach (but this may be because this system was unintentionally or intentionally addressing this very problem), and it will now move on to overtake Linux as a stable path forward.

      --

      Microsoft - Where would you like to go today, Maybe Jail?

  43. Is this only a Linux problem? by sdo1 · · Score: 2

    I recently put together an HTPC (Home Theater PC) based on an ASUS A7V133 (Via) motherboard with AMD Duron processor. It runs Windows 98. I had been experiencing an unbelievable number of random lockups (no blue-screen, no error... just locks). For the most part, I couldn't keep the system running for more than an hour or so.

    In doing extensive research on the problem, I found very large numbers of people with the same problem and very little explanation. I tried MANY different solutions and eventually found one that worked. It involved wiping everything out and installing hardware and software in a VERY specific order. It seems that if you don't install the VIA 4-in-1 drivers (which include GART) at just the right time in the system building, the drivers don't work properly and thus the random lockups.

    I wonder if this is in any way related to the problem here.

    -S

    --
    --- What parts of "shall make no law", "shall not be infringed", and "shall not be violated" don't you understand?
    1. Re:Is this only a Linux problem? by L0rdJedi · · Score: 1

      I had this problem recently as well, except that I could run my system just fine and it would only lock up when using the network heavily. Turns out I needed the PCI Latency patch. I've been running for 3 days straight now (my uptime before was a day and a half) with no problems.

    2. Re:Is this only a Linux problem? by Skuld-Chan · · Score: 1

      I don't get this - I've been using via/amd based boards for years and they have always seemed rock solid - even in windows. I'm using one right now that rarely crashes on me (those latest nvidia drivers really screwed me over though - but since I rolled back its been perfectly fine). I don't recall doing anything special to make it work really.

    3. Re:Is this only a Linux problem? by Tsu-na-mi · · Score: 1

      I had similar problems. I was using a Matrox g450 AGP card and an Intel 10/100 NIC. Turns out both cards were in contention for bus mastering. Disabled bus mastering on the video card and poof, no more random lock-ups.

      --
      Dave

      --
      I've built up so much character I have an alter-ego
    4. Re:Is this only a Linux problem? by Brian+Stretch · · Score: 2

      If you have a SB Live! series soundcard in your system, remove it and see if that helps. If it does, buy something else. I went with a Turtle Beach Santa Cruz (which is also well supported by Linux).

      Very, VERY common problem.

    5. Re:Is this only a Linux problem? by Anonymous Coward · · Score: 0

      In VIA's installation guide, they mention the SBLive! cards, whose drivers mess up the VIA drivers.
      VIA's solution:
      Install the 4-in-1 drivers.
      Reboot.
      Install SBLive! Drivers.
      Reboot.
      Install the 4-in-1 drivers...again.
      Reboot.

      They don't use those words obviously.

  44. Re:Easy - Buy Intel. The cost of using 2nd party.. by pivo · · Score: 2, Informative
    Your argument is incoherent. The idea that lower cost equals lower performance can't be backed up by the presence of a bug, and it ignores real (as in "this is reality") market factors. There have been bugs in all sorts of hardware and software, even the highest performing hardware in the world. There is no correlation.

    If you paid attention to benchmarks you'd see that in almost every case AMD has a higher cost effectiveness than Intel. If you have some specific examples of why AMD is not a good choice (as opposed to vague, illogical ramblings) then why don't you share them? Prove that your mumblings are, "not made up of bugus stuff"

  45. Re:this is something.. by SmittyTheBold · · Score: 1

    Awww....cheap shot ;)

    Besides, if the rumors are true, Mac dual-processor single-GHz boxen will be available within a month...

    --
    ± 29 dB
  46. Re:How do I block these page lengthening posts? by Anonymous Coward · · Score: 0, Offtopic

    How do I block these page lengthening posts?

    The really sad thing is we have this wankerish little friends/foes system, which only puts nicely coloured dots next to someone's name, but we still cannot choose to ignore a user!

  47. Re:this is something.. by Anonymous Coward · · Score: 0

    FreeBSD users don't have to worry about it either.

  48. This is actually good... by OneFix · · Score: 1

    And I hope AMD feels a pinch from this. Don't get me wrong, I don't want to see it break them. But, if hardware manufacturers realize that turning their backs on Linux can hurt the bottom line, then it will probably be better for all of us in the long run.

  49. Archived posting at MARC by Craig+Davison · · Score: 1

    Because geocrawler seems to be /.ed: Athlon/AGP issue update.

  50. Re:Easy - Buy Intel. The cost of using 2nd party.. by Anonymous Coward · · Score: 0

    How about this entire slashdot topic as proof why it's not a good choice? Hello?

  51. VM Implications? by mjh · · Score: 4, Insightful
    From the gentoo article, I found the following very interesting:
    Yesterday, Rik van Riel, William Lee Irwin and myself were able to discuss this issue of Athlon/AGP instability with AMD....

    ...But now that the problem is out in the open, the solution is clear. The Linux kernel's approach to memory management must become more sophisticated in order to address potential conflicts between the highly-speculative nature of Athlon processors and the non-cache-coherent AGP GART.

    When Linus switched to the AA VM, I got the impression that one of the key differences between the AA VM and the RvR VM is that Rik's VM is much more flexible, but with that flexibility comes complexity, which is why Linus switched to AA's VM. AA's was much simpler to understand and helped to stabilize the VM problems. Does the above quote mean that the AA VM isn't going to be able to handle the requirements to fix this bug? Is this a plug to put back RvR's VM?

    I'm not trying to start a flame war here, just want to understand if I understood what the final paragraph was saying. Please mod me down if I'm way off base, but help me understand too!

    --
    Key to financial independence: Spend less than you earn. Save and invest the difference. Do it for a long time.
    1. Re:VM Implications? by Anonymous Coward · · Score: 0

      How about a new rule for you? If the extent of your knowledge of something is "I heard once that A is more flexible than B" then I really think we could do without your insight.

      Thanks.

    2. Re:VM Implications? by WNight · · Score: 4, Interesting

      I think most people see the VM as eventually becoming quite complex. Profiling memory and disk usage (well, having hooks to allow the disk cache to cache based on memory use) allows you to guess when something will be needed and not page it out if it's needed immediately, or to page out something because you know it's not going to be needed for a while.

      And eventually, all memory management systems will run into an out-of-memory situation (even with a reserved cache, the OS can still grow beyond safety margins) and either stall or kill processes. While some people feel that Rik is focusing a little heavily on the killing-processes side, it is something you have to be prepared to do, so you want to kill a less useful task (a forked apache server, not the main process, for example) instead of killing something critical to operation.

      You can usually come up with a simple solution that covers 95% of the cases very well, but it'll fall apart on that last 5% in a bad way. The complex solutions often offer lower performance in everyday situations but guarantee performance will never get as bad as the easy solutions would allow.

      So, I think anyone with design experience expects Rik's VM (or one like it) to go back into the kernel eventually.

      Personally, I think Rik should look at the issue of having "Emergency" swap that you don't go into except for OS processes. Once main swap is filled all non-OS processes fail to allocate any new RAM. This lets the system function well enough for non-kernel code (ideally more customizable) to make a system-specific determination on how to proceed. For instance, kill any processes from /usr/bin/games and see if that helps the issue... But, I'll admit to not being an expert and that this is only an educated guess.
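
      A minimal sketch of that kind of victim selection, in Python. Everything here is hypothetical (the Process fields, the weights, the badness() and pick_victim() names); it only illustrates the idea of sparing OS processes and preferring expendable tasks, and is not how Rik's VM or any real kernel scores processes.

```python
from dataclasses import dataclass


@dataclass
class Process:
    pid: int
    path: str
    rss_mb: int
    is_os: bool = False      # assumption: OS processes are never killed
    is_worker: bool = False  # assumption: a forked child its parent can respawn


def badness(proc):
    """Higher score means a better candidate to kill; -1 means never kill."""
    if proc.is_os:
        return -1
    score = proc.rss_mb
    if proc.path.startswith("/usr/bin/games"):
        score += 1000  # arbitrary weight: games go first
    if proc.is_worker:
        score += 500   # arbitrary weight: workers before main processes
    return score


def pick_victim(procs):
    """Return the killable process with the highest badness, or None."""
    candidates = [p for p in procs if badness(p) >= 0]
    return max(candidates, key=badness, default=None)


procs = [
    Process(1, "/sbin/init", 2, is_os=True),
    Process(100, "/usr/sbin/apache", 40),
    Process(101, "/usr/sbin/apache", 40, is_worker=True),
    Process(200, "/usr/bin/games/nethack", 5),
]
victim = pick_victim(procs)
print(victim.pid, victim.path)
```

      With these made-up weights the game is chosen before any apache process, and a forked apache worker would be chosen before the main apache process.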

    3. Re:VM Implications? by fishebulb · · Score: 1

      Or how about to you: quit trying to stunt discussions. The person brought up some points in a friendly way.

    4. Re:VM Implications? by Anonymous Coward · · Score: 0

      Too bad he's a fucking moron, huh?

  52. Re:Easy - Buy Intel. The cost of using 2nd party.. by SirSlud · · Score: 3, Interesting

    >Lower costs typically means lower perfomance

    What planet are you from? Lower costs (in the case of demonstrated similarity in performance) typically means lower demand and lower consumer valuation of the brand name, which means smaller user base, which means that it generally takes longer to run into compatibility flaws.

    For instance, Nike is more expensive than Puma. Does that mean Nike shoes are better? Of course not, it means people are more willing to buy Nike, because they perceive that the brand gives them additional values. In the world of shoes, that value is the value of conformity and fashion .. in CPUs, it's the value of a larger consumer base, which essentially translates into a higher possibility of latent design flaws (ie, they exist in the costlier platform as well, but are found earlier because of the larger user base), and the value of being in the same boat as everyone else should a product fail in some fashion.

    The funniest thing is you're talking about performance. Performance is how well something works when it works. When it /doesn't/ work, that's not performance; it's either compatibility with the outside world or a design flaw. Anyhow, I feel sorry for your view, because I guess you're paying a lot of money for brand security .. but every in-the-know computer geek I know (I'm a C++ developer, so I'm not talking tech fanboys here) knows that you'd have to enjoy wasting money to justify buying Intel CPUs at this point in time.

    Lest you cite this situation as a reason why I might be wrong .. it has already been fixed in Windows, and there is a known Linux workaround. So really, there's not much of an issue, and my AMD chip still cost me half the price of an Intel CPU, and benchmarks faster than the Intel, to boot! Keep buying your Nikes! I just want the shoe. :)

    --
    "Old man yells at systemd"
  53. Re:Easy - Buy Intel. The cost of using 2nd party.. by SirSlud · · Score: 2

    >essentially translates into a higher possibility

    er, I meant lower possibility of latent design flaws with a large user base. A smaller user base increases the likelihood of problems existing unnoticed for an unspecified amount of time.

    --
    "Old man yells at systemd"
  54. OS Bug by kenneth_martens · · Score: 3, Informative

    According to the article, it is not a problem with the motherboard at all. The problem is "the operating system is creating coherency problems within the system by creating cacheable translation to AGP GART-mapped physical memory." That means it's a problem with the OS, not with the motherboard or processor.

    In truth, we should probably say it is a combination of a problem with the OS and a problem with the processor. After all, Intel processors don't have the same problem, simply because they work differently. So while it may not technically be the CPU's fault, the CPU does play a part.

    1. Re:OS Bug by geekoid · · Score: 2

      If the OS is doing something to spec that causes the coherency problem, then it's not the OS.
      Quite frankly, it's too early to know the truth of this problem.

      --
      The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  55. score -1 (inaccurate) by Anonymous Coward · · Score: 0

    Pentiums had problems with floating point numbers. FP numbers are inherently lossy due to their violation of standard mathematical rules. Just ask a Cray programmer about the nasty workarounds they learned like the back of their hands.
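
    One concrete instance of floats breaking a "standard mathematical rule" is associativity of addition. A small Python sketch, assuming ordinary IEEE 754 double precision (which is what Python's float uses on common platforms):

```python
# Grouping changes the result because each intermediate sum is rounded.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right, left, right)

# Near 1e16 adjacent doubles are 2 apart, so adding 1.0 can be lost entirely.
big = 1e16
lost = (big + 1.0) - big
kept = (big - big) + 1.0
print(lost, kept)
```

    This rounding behavior is a property of the number format itself, so it shows up on any conforming processor regardless of vendor.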

  56. Sounds like a trivial thing to fix by cculianu · · Score: 1

    I am not familiar with the kernel mechanisms for allocating GARTs, but it sounds like all you need to do is have the agpgart.o module, when it's creating a GART, somehow tell the kernel memory allocator and the page table handling code not to set up this region as cacheable.

    The problem is that probably none of the page table code cares to distinguish between cacheable and non-cacheable pages. But anyway it shouldn't be too bad to set up such a distinction.
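
    As a rough illustration of what "cacheable vs non-cacheable" means at the page table level, here is a Python sketch of the flag bits in a 32-bit x86 page table entry. The bit positions (PWT is bit 3, PCD is bit 4) come from the x86 architecture manuals; the helper names and the GART address are made up for the example, and none of this is actual kernel code.

```python
# Flag bits in a 32-bit (non-PAE) x86 page table entry.
PTE_PRESENT = 1 << 0
PTE_WRITABLE = 1 << 1
PTE_PWT = 1 << 3  # page-level write-through
PTE_PCD = 1 << 4  # page-level cache disable

PAGE_SHIFT = 12  # 4 KB pages


def make_pte(phys_addr, flags):
    """Combine a page-aligned physical address with flag bits (hypothetical helper)."""
    page_mask = ~((1 << PAGE_SHIFT) - 1)
    return (phys_addr & page_mask) | flags


def mark_uncacheable(pte):
    """Set the cache-disable bits so the CPU does not cache this page."""
    return pte | PTE_PCD | PTE_PWT


def is_cacheable(pte):
    return not (pte & PTE_PCD)


gart_page = 0xD0000000  # made-up physical address of a GART-mapped page
pte = make_pte(gart_page, PTE_PRESENT | PTE_WRITABLE)
print(hex(pte), is_cacheable(pte))
pte = mark_uncacheable(pte)
print(hex(pte), is_cacheable(pte))
```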

    Anyway I haven't looked at the kernel code that relates to this yet so I am not sure if I am over-simplifying things... but I trust that someone will have a hack to fix this soon (a hack that doesn't cost anything in performance, unlike that mem=nopentium option), and a proper patch that is more beautiful would probably come out a few days after that....

    However, since NVIDIA's stupid bloated drivers contain their *own* agp GART code, we would also have to coordinate with that vendor to get them to change their GART code to behave properly. Either that or you can try using the linux kernel's agpgart.o with NVdriver, but in my experience Very Bad Things happen when you do that! :)

    -Calin

  57. This makes much more sense...... by tekniklr · · Score: 1

    Last November, I upgraded my motherboard to a new DFI one with the AMD 761 chipset, but kept using the same processor (a Duron 700).

    After the upgrade the bug hit me (the computer would lock HARD every time X started)- at first I just thought it was bad RAM, but replacing that didn't help. Eventually I figured out by my own troubleshooting that unchecking the AMD 761 AGP chipset fixed the bug....

    Since the bug only appeared after the upgrade to a motherboard with an AMD 761 chipset, this makes a lot more sense. (Using the kernel option mem=nopentium fixes my problem, so it must be the same bug....)

  58. You are assuming... by Arker · · Score: 5, Insightful

    You are assuming that AMD's current explanation is 100% true, correct, and complete. There are good reasons to doubt this.


    The "explanation" so far has just raised more questions. Why does the same code that causes the athlon to crash work fine on pentiums? Apparently the GART is cacheable on pentium systems? And the Athlon is billed as pentium-compatible...


    Why does disabling large pages fix the problem? If their explanation is correct, that fix should not work, because it doesn't address the issue they claim to be the problem.


    I'm sure this will get worked around in software (and the linux fix will actually work around the underlying problem, rather than just making it less likely as the windows world seems to be satisfied with) once the real details of this are known. But to claim it's not a hardware bug is ludicrous. It's a bug with the Athlon CPU, or with certain GARTs found in Athlon chipsets, or both. If AMD were less worried about spin-controlling it and claiming it's the software at fault maybe they would be more forthcoming about what is really going on here.

    --
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-
    Friends don't let friends enable ecmascript.
    1. Re:You are assuming... by Salamander · · Score: 5, Interesting
      Why does the same code that causes the athlon to crash work fine on pentiums? Apparently the GART is cacheable on pentium systems? And the Athlon is billed as pentium-compatible...

      There are different types and levels of compatibility. The Athlon claims base-instruction-set and register compatibility with the Pentium, but it's not pin-compatible and may also differ in any number of behavioral/timing characteristics. This is one such case. The behavior in question is perfectly acceptable within the bounds of the compatibility and standards compliance that AMD claims.

      Why does disabling large pages fix the problem?

      Because it's the large pages that are (incorrectly) marked as cacheable. No large pages, no incorrect mappings, no problem.

      But to claim it's not a hardware bug is ludicrous. It's a bug with the Athlon CPU, or with certain GARTS found in Athlon chipsets, or both.

      Nope. It's a bug in the OS. Anyone who works with memory systems should know the dangers inherent in mixing cache-coherent and non-coherent accesses to the same memory, and should mark pages accordingly.

      It's very tempting to criticize AMD for their handling of speculative writes, but that handling is really irrelevant. It seems to me that the cache line's contents should not be marked dirty before the processor has actually written to it (which in this case it never does). Under normal conditions, though, this would only be a performance issue. If a coherent access were made from elsewhere, invalidation and writeback would ensue; the writeback would be unnecessary but not harmful, because it would be writing the same data that were already in main memory. However, the cache wouldn't be involved in the first place if the pages were mapped correctly. There would be no write-allocate, no invalidation, no writeback, and no problem. The invalid mapping turns a slightly silly but legal and normally-harmless processor behavior into a serious coherency problem.

      --
      Slashdot - News for Herds. Stuff that Splatters.
    2. Re:You are assuming... by Dahan · · Score: 5, Informative
      Apparently the GART is cacheable on pentium systems?

      There are Pentium systems with an AGP port? If you mean the Pentium II and up, I don't see why the GART would be cacheable there either; I don't know if the P4 chipsets have changed things, but with the PII and PIII, here's what Intel had to say about the subject:

      For current hardware implementations, the OS will make AGP memory (like other video memory) non-cacheable, so that there is no coherency problem between the CPU caches and the data that the graphics controller uses. Otherwise, graphics controller accesses to AGP memory would require "snooping" the CPU caches, which would cause delays in execution in some cases.

      -- AGP and Graphics Optimization Techniques

      (Emphasis added). As for why the bug doesn't happen on Intel CPUs, it sounds like the Athlon has more aggressive speculative writes and can change memory that wasn't explicitly written to, dirtying the cache. But in any case, even on Intel CPUs, the AGP area is supposed to be mapped non-cacheable.

      Why does disabling large pages fix the problem?

      Don't know about that one; I haven't read the various tech docs for the Athlon. Perhaps the cache works slightly differently with 4MB pages vs 4KB pages?

    3. Re:You are assuming... by VAXman · · Score: 2

      It's very tempting to criticize AMD for their handling of speculative writes, but that handling is really irrelevant. It seems to me that the cache line's contents should not be marked dirty before the processor has actually written to it (which in this case it never does).

      That's only half of the issue -- and certainly not illegal behavior. If the line first went into 'E' state, you would still have a coherency problem if you later did a write. Although, true, the issue only becomes visible if you go straight into 'M' state.

      I think the real root of the problem is that they are doing speculative writes (obviously, they mean 'speculative reads-for-ownership' since speculative writes are highly illegal in IA32) into a page which the processor is not storing to.

      If the OS knows that it is not going to do any loads or stores to a page, it should have the right to give the page any memory type it wants, and it shouldn't have to worry about coherency issues, because, as far as the software is concerned, the processor is not a participating agent on the bus.

      IMHO, it's a processor bug, but obviously one that's very easy to work around (make the memory UC, even though you're not using it).

    4. Re:You are assuming... by Ozx · · Score: 0

      It's also fixed in recent releases of the processor...

  59. Remember by Anonymous Coward · · Score: 0

    It's not a bug, it's an "undocumented feature."

  60. Ha ha ha by Anonymous Coward · · Score: 0

    If the rumors are true, they'll announce a 1GHz box within a month, and it might be available by Christmas.

    Speaking of rumors, didn't the iWalk have a 1GHz processor?

  61. Re:How do I block these page lengthening posts? by Anonymous Coward · · Score: 0

    Mark him as a foe, then change foes to -5 or something like that.

  62. percentage affected? 100% for me!!!! by osjedi · · Score: 1


    I don't care what the percentage affected is. All I know is that mine is one of them!

    I've worked for months trying to get DRI hardware acceleration working right. I can start any hardware-accelerated 3-d app and it will lock my system solid in about 4 seconds. All this time I thought it was a software problem. I can't even express how much wasted time and frustration has resulted from this.

    One line added to lilo.conf and everything works now.

    --
    -=-=-=-=- osjedi uses Debian GNU/Linux. -=-=-=-=-
  63. It's Linux, NOT the motherboard! by Anonymous Coward · · Score: 2, Informative

    Get the story straight:

    "Our conclusion is that the operating system is creating coherency problems within the system by creating cacheable translation to AGP GART-mapped physical memory."

  64. If this was microsoft or Intel by ryusen · · Score: 1

    Of course the bug exists, but the chances of it actually affecting anyone or a malicious entity taking advantage of it are next to nil in the real world...
    Seems everyone has a form of plausible deniability. The rule I seem to notice when dealing with bugs from software/hardware vendors is "It's cooler to add new features than fix the problems with the ones we've already got"

    --

    I believe sex is highly over rated... unless it involves me
  65. Re:Easy - Buy Intel. The cost of using 2nd party.. by pivo · · Score: 1

    Hello? How about my point that this isn't the first hardware bug to hit the world, and that AMD has no monopoly on hardware bugs. See the Linux source for workarounds for all sorts of hardware bugs, almost all having nothing to do with AMD. Are you just stupid or do you have stock in Intel?

  66. Roger that by dnoyeb · · Score: 1

    Albeit less lossy than integers...

  67. Word2k by Anonymous Coward · · Score: 0

    I'm not sure about the battle.net thing, It's always worked fine for me.

    However, the Word problem appears in Word2k on every computer I write on for more than an hour.

    It seems to crash whenever I paste things (even small things) after about an hours work. This has been tested on:

    AMD Duron 600
    Intel P3 500
    AMD K6-2 300

    So, my guess is that this is a microsoft issue, and it is highly annoying.

    Maybe a service pack fixes the problem.

  68. Flame bait? by NanoGator · · Score: 2

    Hmm... my 'Mac users don't have to worry about using the term gigahertz' post got modded down as flamebait.

    The original post of 'Mac users don't have to worry about this [the Athlon bug]' is flame bait; my response was a humorous way of saying 'this is why buying a Mac won't solve my problem.'

    An Off-topic moderation wouldn't have bothered me since I didn't spell out my reasoning, but I do feel the flamebait call was bad.

    --
    "Derp de derp."
  69. off the wall thought by ryusen · · Score: 1

    apple should dump motorola and buy into ibm's power4 processors for their really high end workstations...

    --

    I believe sex is highly over rated... unless it involves me
  70. No AMD bug by Anonymous Coward · · Score: 0

    We have looked and looked and looked and can find no bug in the AMD processor. We suspect it is another Intel media scam to discredit AMD. AMD is the Best! Intel cannot compete!

  71. In the spirit of... by Refrag · · Score: 5, Funny

    ...Slashdotters that always point out their favorite OS isn't vulnerable to a particular bug.

    My Macintosh isn't affected by this bug due to its PowerPC processor.

    --
    I have a website. It's about Macs.
    1. Re:In the spirit of... by Anonymous Coward · · Score: 0

      Good for you, you translucent little f'aggit.

  72. erm . . . . What if you're not using NVidia? by himi · · Score: 1

    That option might make a difference with NVidia cards, but it's not going to make any difference to anyone who's using something else (except for making their Xserver fail to start, because of an unknown option in the config file).

    The mem=nopentium option to the kernel works everywhere for this particular problem.

    himi

    --

    My very own DeCSS mirror.
  73. Re:I'm also wondering if this issue is related by bobKali · · Score: 1

    I've been having this same thing happen on my K6-2 machine. And curiously enough, lately when I've tried to use SUSE's online-update, as soon as the downloaded updates start to install, the screen gets garbage all over it, and on one monitor I get logged out with a dirty login screen, and on the other I've still got a dirty version of my desktop there.

  74. Bang on. by himi · · Score: 2

    I believe that's the currently proposed fix - it may change as people understand the details more, but I think that's the basic idea.

    himi

    --

    My very own DeCSS mirror.
  75. Re:How do I block these page lengthening posts? by shepd · · Score: 0, Offtopic

    Doesn't work if you surf below 0 because CmdrTaco has told us that if you surf that low you don't deserve to be able to filter out posts.

    I think he's wrong (and it isn't the first time -- see my sig), but hey, I don't run the site.

    --
    If you could be told what you can see or read, then it follows that you could be told what to say or think - BoC
  76. Re:this is something.. by geekoid · · Score: 3, Informative

    Boy, where have I heard that before... oh yeah, every 2 years since there have been Macs... sheesh.
    FYI I don't own a mac, but I will purchase one next time I want a computer.

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  77. Re:Easy - Buy Intel. The cost of using 2nd party.. by WNight · · Score: 2

    Why doesn't your ISP use AMD for servers?

    They probably do. Well, depending on size. If they're AOL-sized, I doubt they use PCs at all.

    If they're smaller, they probably use AMD chips. Certainly they used celerons and other cheap technology.

    There are two types of setups.

    1) Very expensive server with the best (and best "name") hardware money can buy.

    2) Cheap crap in a fail-over cluster.

    For many things like email servers, news servers, etc, the cheap cluster is most cost efficient, easier to maintain (want to fix one? Unplug it and the others take over automatically), and easier to build.

    While your ISP may not use AMD (the saving for a cheap duron + mobo vs cheap celeron + mobo aren't great when you get into motherboards with integrated video and lan) they would if it saved them any money.

    There are some tasks that are hard to "fail over" and those require a sturdy server, but even then, as long as it's not rack mounted, AMD has a good reputation (with an AMD chipset).

  78. binary data units (KiB and MiB) by Michael+Wardle · · Score: 1

    KB = kilobyte = 10^3 bytes = 1000 bytes
    MB = megabyte = 10^6 bytes = 1000000 bytes
    KiB = kibibyte = 2^10 bytes = 1024 bytes
    MiB = mebibyte = 2^20 bytes = 1048576 bytes

    This is a new standard designed to eliminate confusion, particularly as the discrepancy between powers of 2 and powers of 10 becomes very large when dealing with today's storage sizes (such as terabytes), and ensure that kilo and mega mean what they have traditionally done in science and every day life rather than the unofficial twisted computing adaptations.

    MINOR CORRECTION: Others have claimed that MiB is a symbol/abbreviation of mibibyte. It's actually mebibyte. The binary names always use the first two letters (first syllable) of the SI prefix.
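
    To put numbers on how the discrepancy grows with size, here is a small Python sketch using the definitions above. The GB/GiB and TB/TiB rows extend the same pattern (10^9 vs 2^30, 10^12 vs 2^40); the function name is just for illustration.

```python
# Decimal (SI) and binary (IEC) unit sizes in bytes.
SI = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}
IEC = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}


def discrepancy_percent(binary_bytes, decimal_bytes):
    """How much larger the binary unit is than the decimal one, in percent."""
    return (binary_bytes - decimal_bytes) / decimal_bytes * 100


for (si_name, si_size), (iec_name, iec_size) in zip(SI.items(), IEC.items()):
    pct = discrepancy_percent(iec_size, si_size)
    print(f"{iec_name} is {pct:.2f}% larger than {si_name}")
```

    The gap is a couple of percent at the kilo level and close to ten percent at the tera level.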

  79. Boards Affected by Anonymous Coward · · Score: 0

    Okay, so if this is a mobo bug, which boards are affected? Anyone have any problems with the Athlon MP 1800+ and the Tyan Boards (more specifically with S2460)? What other boards are affected?

  80. Nobody's fault? by Lars+T. · · Score: 1, Offtopic

    What do you mean, "nobody's fault" - that's unamerican!

    --

    Lars T.

    To the guy who modded me down from perfect to terrible Karma - Apple haters still suck

  81. The Answer by Ironix · · Score: 0, Offtopic

    x=42

    --
    Still #1 -- Lonely Gay Geek
  82. My email address by DragonHawk · · Score: 1

    "What's the point of a sig about your email address when you don't even display one?"

    When I wrote that signature, Slashdot had the concept of a "display email address", which was always shown. I had a spam-guard in that field.

    Then I did not visit slashdot for about a year. Things have changed.

    Anyway, since you pointed that out (thank you, BTW), I have updated my signature to include the spam-guarded address that was previously displayed automatically. (No, I do not trust the spam-guards built-in to Slashdot.)

    --

    dragonhawk@iname.microsoft.com
    I do not like Microsoft. Remove them from my email address.
  83. Re:How do I block these page lengthening posts? by Anonymous Coward · · Score: 0

    So what. You can give an extra +1 bonus to every moderation and browse at 0, while assigning -6 to your foes. It's equivalent to browsing at -1. Works for me.

  84. It's a feature. by ImaLamer · · Score: 1

    Yeah, XP was too stable so they brought back that good old 9x feel!

  85. details (if not bug, design flaw) by Anonymous Coward · · Score: 0

    Linux marked the memory uncacheable. Linux also
    has a cached mapping of up to 894 MB of RAM,
    so the AGP memory is mapped twice. Linux does
    not access via the cached mapping, but accesses
    to nearby regions cause speculative writes into
    the AGP memory. These speculative writes don't
    cross page boundaries... but Linux uses giant
    4 MB pages for the kernel for extra speed.
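
    A quick Python sketch of why the page size matters here. The two
    addresses are made up; the point is only that two nearby addresses
    can sit in different 4 KB pages yet share one 4 MB page.

```python
PAGE_4K = 4 * 1024
PAGE_4M = 4 * 1024 * 1024


def same_page(addr_a, addr_b, page_size):
    """True if both addresses fall inside the same page of the given size."""
    return addr_a // page_size == addr_b // page_size


# Hypothetical physical addresses, 64 bytes apart across a 4 KB boundary.
kernel_data = 0x01233FC0
agp_buffer = 0x01234000

print(same_page(kernel_data, agp_buffer, PAGE_4K))
print(same_page(kernel_data, agp_buffer, PAGE_4M))
```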

    OK so far, but get this: the processor reads in
    a cache line for a memory write, MARKS IT DIRTY,
    and then cancels the write. So there is a dirty
    cacheline by accident, and the stupid processor
    will write it out later. Before that happens,
    the AGP video card writes to the memory and
    doesn't bother to let the CPU know about it.
    So then the cache line gets flushed, plopping
    old data on top of new and even wasting a
    memory bus cycle to do it.
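
    The sequence above can be played out in a toy Python model. This is
    only a sketch of the ordering of events as described in this post;
    the Cache class, agp_write() and the address are invented, and real
    caches work on fixed-size lines with far more state.

```python
class Cache:
    """A toy write-back cache holding whole lines keyed by address."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}  # addr -> (data, dirty)

    def cancelled_speculative_write(self, addr):
        # The line is read in for a write and marked dirty,
        # but the write itself never happens.
        self.lines[addr] = (self.memory[addr], True)

    def flush(self):
        # Write every dirty line back to memory, then empty the cache.
        for addr, (data, dirty) in self.lines.items():
            if dirty:
                self.memory[addr] = data
        self.lines.clear()


def agp_write(memory, addr, data):
    # The card writes straight to memory; the cache is never told.
    memory[addr] = data


def run_scenario():
    memory = {0x1000: "old frame"}
    cache = Cache(memory)
    cache.cancelled_speculative_write(0x1000)  # dirty line by accident
    agp_write(memory, 0x1000, "new frame")     # device updates memory
    cache.flush()                              # stale line lands on top
    return memory[0x1000]


print(run_scenario())
```

    In this model the final memory contents are the old data, because
    the accidentally dirty line is written back after the device's
    update.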

  86. Re:this is something.. by Glock27 · · Score: 2
    How did this get modded to "flamebait"?!? Sometimes the truth hurts...

    I vote for a +5 Informative mod. :-)

    299,792,458 m/s...not just a good idea, it's the law!

    --
    Galileo: "The Earth revolves around the Sun!"
    Score: -1 100% Flamebait
  87. What if the puma 'freezes' while you run? by wirefarm · · Score: 2

    You wouldn't wear them, would you?
    I had the same opinion about wasting money on Intel, so I bought AMD. Though I'm glad you're having a good experience with your AMD, I simply can't agree:
    I really don't agree with your feeling that performance is how well something works when it works. If that were true, I could just stay home from work most of the time and kick butt when I show up. (OK, I sort of do that now, but that's another issue...)

    Performance, in my book, is a judgement of how well something is doing its job. My AMD 'sometimes-kickass' workstation is not performing well in my opinion, even though when it does run, it runs great.
    If I look at the system as the tool that it's supposed to be, it simply isn't giving good performance.

    Let me explain:
    I have a few boxes on a small network at home. My main workstation is an Asus/AMD 1200MHz setup running Red Hat 7.2. Before that, it was an Asus/AMD 600MHz setup. Both systems have had the same problems, even though *Every Component* has been replaced in an effort to track down the problem. This morning, it froze a few minutes after the screen saver kicked in.
    Each time this happens, I have to do a hard reboot. The other day, I added that mem=nopentium option and it still has the problem.
    I used to have some big drives in the PC, but they were getting thrashed by the powerdowns, so I replaced them with a single 10GB and moved the 2 60GB drives to the server in my laundry room.
    The server, by the way, is an old 300MHz IBM with an Intel chip and it happily chugs along, serving files by Samba, database stuff, CGI/Apache stuff, SSH logins, VNC logins, whatever I happen to throw at it.
    This is a machine that I literally snagged from the trash, but you couldn't *pry* it from me at this point. I just ran uptime on it, just for kicks:

    11:39am up 155 days, 13:45, 3 users, load average: 0.00, 0.00, 0.00


    So, do I regret spending so much time and money on AMD? Yes.
    Would I buy them again?
    No.

    To me it *is* a performance issue. The AMD system has not done its job.

    YMMV,
    Jim in Tokyo

    --
    -- My Weblog.
  88. This doesn't just affect Linux by 00Monkey · · Score: 1

    I am running Windows XP and have been trying to get this problem solved for XP too. I have to put my GeForce 3 in "PCI Mode" in order to use any 3d apps or games and it crashes constantly in 2d, getting stuck in strange driver loops.

    I'm not even close to the only person out there who's stuck with crappy performance because of this problem. I know there's a problem somewhere between Microsoft, VIA, and AMD but that's as far as I personally can get. Hopefully this will be taken care of soon.

    Abit KT7A-RAID
    Asus V8200 Deluxe GeForce 3
    Athlon Thunderbird 1200/200

  89. What systems EXACTLY are affected? by Confessed+Geek · · Score: 1

    I'm spec'ing a Dual Athlon 1800+ MP workstation for scientific work and calculation. I'm planning on using the Tyan Thunder SMP board - the one with optional SCSI, 2 onboard NICs and an onboard ATI card. This machine is going to be used only for calculation, desktop work and coding/compiling. Due to some standardization issues it's going to be running 2.2.x not the 2.4.x kernel. No 3d gaming on this box.

    Is this machine going to run afoul of this bug? I have heard that it's a problem in the 2.4 kernel but nothing about 2.2. Since it's using ATI instead of Nvidia, will that make a difference?

    I know, it's gauche to not act suave and opinionated on Slashdot, but I actually need to know something ;)

    Thanks!

  90. Not the same bug! by niftyzero · · Score: 1

    This seems to be a completely different bug than the previous article. That bug was about the virtual memory translation table. It's in the AMD processor revision guide as errata #16. It's a processor bug. That bug seems fixable with the mem=nopentium flag. The bug mentioned in the latest article (Tracking Down, etc...) is about cache coherency semantics, which is a completely different beast.

  91. Re:How do I block these page lengthening posts? by shepd · · Score: 1

    >Works for me.

    How do I upgrade my account to include options for +1ing "overrated" moderated posts too? I'm missing that option. :-(

    I must still be in the old database. Oh well... Perhaps you wouldn't mind asking CmdrTaco to add that feature to my account?

    --
    If you could be told what you can see or read, then it follows that you could be told what to say or think - BoC
  92. Re:this is something.. by Hermanetta · · Score: 0

    No doubt!

    I say mod this poor bastard up too. It's a good comment. I wish more posts were of this quality. ;)

    (as I get modded down for: agreeing with a mac apologist (flamebait), asking for a good mod (offtopic), seconding an idea (redundant) and adding this comment at the bottom (flamebait) ouch...)

  93. Re:this is something.. by Hermanetta · · Score: 0

    Sorry, should have read:

    as I get modded down for: agreeing with a mac apologist (flamebait), asking for a good mod (offtopic), seconding an idea (redundant) and adding this comment at the bottom (troll) ouch...

    hate it when I screw up a crappy joke

  94. Windows fix by chrisvdp74656 · · Score: 1

    Save this as a .reg file, then double click it. It *should* work. (Was exported tree, got cropped in Notepad)

    ---Cut here---
    Windows Registry Editor Version 5.00

    [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Con tr ol\Session Manager\Memory Management]
    "LargePageMinimum"=dword:ffffffff
    ---Cut here---

    --
    09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
    1. Re:Windows fix by chrisvdp74656 · · Score: 1

      ***IMPORTANT***
      Remove the space from '...\Contr ol\...' and damn Slashcode. :)

      Chris

      --
      09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
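
      For anyone wary of hand-editing the pasted text, here is a minimal
      sketch that builds the same .reg contents from the key path with the
      stray space removed. The function and constant names are made up for
      illustration; the key and value are the ones posted above.

```python
# Builds the text of the .reg file from the post above, with the key
# path assembled from its parts so no stray spaces can creep in.
# KEY_PARTS and build_reg_text are hypothetical names.

KEY_PARTS = [
    "HKEY_LOCAL_MACHINE",
    "SYSTEM",
    "CurrentControlSet",
    "Control",
    "Session Manager",
    "Memory Management",
]


def build_reg_text():
    key = "\\".join(KEY_PARTS)
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        '"LargePageMinimum"=dword:ffffffff',
        "",
    ]
    # .reg files use Windows line endings.
    return "\r\n".join(lines)
```

      Write the returned text to a file ending in .reg and import it as
      described above. Which text encoding your version of regedit expects
      is an assumption to check before relying on this.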
  95. Re:I'm also wondering if this issue is related by Peter+Harris · · Score: 1
    I've been having this same thing happen on my K6-2 machine. And curiously enough, lately when I've tried to use SUSE's online-update, as soon as the downloaded updates start to install, the screen gets garbage all over it, and on one monitor I get logged out with a dirty login screen, and on the other I've still got a dirty version of my desktop there
    A dirty version of your desktop? What, like a pornographic backdrop? Very European.. must try SuSE some time.
    --

    -- What do you need?
    -- Gnus. Lots of Gnus.
  96. New system with random lockups by 4Runner · · Score: 1

    I just built a new AMD Athlon system with 512MB RAM and an NVidia GeForce2 MX 400 GFX card, and the system will freeze a lot (RH 7.2). Red Hat 7.2 on an Intel system we have is rock solid. When the AMD system freezes, the only way to get out of it is to hit the reset button. This is with or without the mem=nopentium option added to the kernel during booting. So, I hope there is another solution besides mem=nopentium because it ain't working for me.

  97. Wait a minute.. by gooberguy · · Score: 1

    Wait a minute, that means that not only did Windows and Linux have the same problem, but Windows got patched first.

    Oh Noooooo!

    We programmers must be getting soft, not fixing a kernel bug that has been around for 2 years. (As you can see, I think the OS is responsible for stopping memory used by AGP devices from being cacheable, not the CPU or mobo.)

    D/\ Gooberguy

    --


    Karma: Meh (Mostly from meh.)
  98. Re:this is something.. by dhamsaic · · Score: 2

    I've got karma to burn. With 3 down-mods, I'm at 47 now. So it's no big deal. It's just sad that people don't recognize the truth. Of course, the idiot above me got one downmod and that's it, even though he was spouting off complete bullshit. Oh well...

    --
    Every once in a while I like to masturbate a new word into my vocabulary, even if I don't know what it means.
  99. Powers of two by DragonHawk · · Score: 2

    "I think it won't work out, because there's too much legacy stuff that there will always be confusion at this point about what "mega" and "kilo" mean with computers."

    Not to mention the fact that computers are incapable of "thinking" in anything but a power of two. You will not find a discrete quantity of 10 (or a power thereof) bytes anywhere in a computer system. This makes the SI units useless for computers. While re-defining them for use in computers was and still is an abuse, the lack of applicability of the conventional SI units makes it largely a non-issue. The only people who care are HDD manufacturers who rate drives in "millions of bytes" so they can swindle stupid customers.
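
    To put a number on the gap, here is a small sketch of the arithmetic,
    assuming the label means 10^9 bytes per "GB" and the OS reports 2^30
    bytes per "GB". The function name is made up for illustration.

```python
# Converts a drive size quoted in decimal gigabytes (10**9 bytes)
# into binary gigabytes (2**30 bytes).
# decimal_gb_to_binary_gb is a hypothetical helper name.

def decimal_gb_to_binary_gb(gb):
    return gb * 10**9 / 2**30


if __name__ == "__main__":
    # A drive sold as 60 GB, as seen by software counting in powers of two.
    print(round(decimal_gb_to_binary_gb(60), 2))
```

    For a drive sold as 60 GB that works out to a little under 56 when
    counted in powers of two.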

    --

    dragonhawk@iname.microsoft.com
    I do not like Microsoft. Remove them from my email address.
  100. Re:this is something.. by Glock27 · · Score: 2
    I wasn't really addressing your karma level...I was just commenting on the often ridiculous moderation system on Slashdot.

    My secret wish would be for Apple to support AMD x86-64 processors with MacOS X...not that it'll happen. That would be a great combination.

    On the other hand, if they can really hit 1.5+ GHz. with the G5, that'll be OK too. Just a lot more expensive.

    299,792,458 m/s...not just a good idea, it's the law!

    --
    Galileo: "The Earth revolves around the Sun!"
    Score: -1 100% Flamebait
  101. Re:this is something.. by dhamsaic · · Score: 2

    Well. To address 2 points...

    Yes, Slashdot moderation is ridiculous. Period. It's the biggest problem with Slashdot, and that's why I hardly ever read it anymore.

    As far as Apple and processors... the G4 is a pretty good processor, and I have a Dual 800MHz G4 and am loving every second I get to spend on it... but I also have a Dual Athlon MP 1600+ (1.4GHz) and the Dual Athlon spanks the shit out of it at all the things that really matter (Quake 3 :P). Now, I am using Linux on the Athlon and OS X on the G4, so you can draw any comparisons you wish.

    That having been said, the G4 is a fast machine. But... It was $4,000 with monitor ($500 Sony CPD-G400), whereas with a $1,000 monitor, my Dual Athlon was only $3,000. The only differing factors are that the Athlon has a GF3 Ti500 (the PowerMac has a GF3 regular), the G4 has 128 more megs of RAM (whoopty doo), and the G4 has an extra 60 gig drive. Now I'm not complaining - I think the machine was worth what I paid for it. But for $3,500 (computer itself), you'd think it should spank the shit out of a $2,000 computer (okay, $2,100 with shipping and all).

    The point is, the G4 isn't that fast. Apple really would do better to put in some AMD processors, knock the price down a LEETLE and be able to claim that it really *did* burn Pentiums (instead of just with Adobe products).

    One can only hope...

    --
    Every once in a while I like to masturbate a new word into my vocabulary, even if I don't know what it means.