Slashdot Mirror


Tracking Down The AMD "Processor Bug"

tercero writes: "over at the Gentoo Linux website there is an update on the AMD processor bug mentioned here. The sum up is that AMD claims it's not a bug with the Athlon processor, but with the motherboard. More detailed information can be found on this LKML post." An Anonymous Coward points to a similar explanation at Linux Weekly News. Update: 01/25 01:25 GMT by T : Daniel Robbins from Gentoo clarifies: "AMD is not calling this a 'motherboard' issue, it is an interaction between a feature of the Athlon called 'speculative writes' and the design of the GART, which is not cache-coherent. It's a 'Athlon/cache coherency/GART' problem, not a 'motherboard' problem."

1 of 237 comments

  1. Re:You are assuming... by Salamander · Score: 5, Interesting
    Why does the same code that causes the athlon to crash work fine on pentiums? Apparently the GART is cacheable on pentium systems? And the Athlon is billed as pentium-compatible...

    There are different types and levels of compatibility. The Athlon claims base-instruction-set and register compatibility with the Pentium, but it's not pin-compatible and may also differ in any number of behavioral/timing characteristics. This is one such case. The behavior in question is perfectly acceptable within the bounds of the compatibility and standards compliance that AMD claims.

    Why does disabling large pages fix the problem?

    Because it's the large pages that are (incorrectly) marked as cacheable. No large pages, no incorrect mappings, no problem.

    But to claim it's not a hardware bug is ludicrous. It's a bug with the Athlon CPU, or with certain GARTs found in Athlon chipsets, or both.

    Nope. It's a bug in the OS. Anyone who works with memory systems should know the dangers inherent in mixing cache-coherent and non-coherent accesses to the same memory, and should mark pages accordingly.

    It's very tempting to criticize AMD for their handling of speculative writes, but that handling is really irrelevant. It seems to me that the cache line's contents should not be marked dirty before the processor has actually written to it (which in this case it never does). Under normal conditions, though, this would only be a performance issue. If a coherent access were made from elsewhere, invalidation and writeback would ensue; the writeback would be unnecessary but not harmful, because it would be writing the same data that were already in main memory. However, the cache wouldn't be involved in the first place if the pages were mapped correctly. There would be no write-allocate, no invalidation, no writeback, and no problem. The invalid mapping turns a slightly silly but legal and normally-harmless processor behavior into a serious coherency problem.
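
To make the sequence above concrete, here is a rough toy model in Python. It is only a sketch of the argument, not of real Athlon or GART behavior: the `ToyCache` class, the `run` helper, and the address are all made up for illustration, and "speculative write" is reduced to "allocate the line and mark it dirty without changing the data".

```python
class ToyCache:
    """Toy write-back cache: one entry per address, [data, dirty]."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def speculative_write_allocate(self, addr, cacheable):
        # Assumption for illustration: the store never actually happens,
        # but the line is still fetched and marked dirty.
        if not cacheable:
            return  # correctly mapped page: the cache is never involved
        self.lines[addr] = [self.memory[addr], True]

    def evict(self, addr):
        # A dirty line is written back to main memory on eviction.
        line = self.lines.pop(addr, None)
        if line is not None and line[1]:
            self.memory[addr] = line[0]


def run(cacheable):
    """Return what main memory holds after the whole sequence."""
    addr = 0x1000  # hypothetical address inside the GART-mapped region
    memory = {addr: "old"}
    cache = ToyCache(memory)

    cache.speculative_write_allocate(addr, cacheable)
    memory[addr] = "new"   # non-coherent write that bypasses the cache
    cache.evict(addr)      # later writeback of the "dirty" line
    return memory[addr]


if __name__ == "__main__":
    print("mapped cacheable:  ", run(cacheable=True))
    print("mapped uncacheable:", run(cacheable=False))
```

In this model the cacheable mapping ends with the stale value written back over the newer one, while the uncacheable mapping leaves the newer data alone, which is the point about the mapping being the real culprit.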

    --
    Slashdot - News for Herds. Stuff that Splatters.