Slashdot Mirror


Closure On the Linux Lockup Bug

jones_supa writes: Dave Jones from Red Hat has written a wrap-up of the strange bug that has caused some machines running Linux to freeze. (Previous discussion.) In Dave's final week at Red Hat, just before he gave all his hardware back, Linus Torvalds managed to reproduce similar symptoms by scribbling directly to the HPET timer. He came up with a hack that at least made the kernel survive for him. When Dave tried the same patch, the machine ran for three days before he interrupted it, which was a promising result. The question remains: what was scribbling over the HPET in his case? The only two plausible scenarios Dave could think of were that Trinity generated 0xFED000F0 as a random address and passed it to a syscall that wrote to it, or a hardware bug. That's where the story ends for now. Linus' hacky workaround didn't get committed, but he and John Stultz continue to go back and forth on hardening the clock management code in the face of screwed-up hardware, so maybe soon we'll see something real get committed in that area.
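
For readers wondering why one stray write to 0xFED000F0 matters: on typical x86 machines the HPET registers are mapped at 0xFED00000, and offset 0xF0 is the main counter, so the write changes the value the kernel reads as "now". The sketch below is only an illustration of the general idea of hardening a counter-delta calculation. It is not Linus' patch or the code under discussion; the function names, the 32-bit mask, and the `max_delta` limit are all assumptions made up for this example.

```python
# Illustrative sketch only: names, mask width and limits are hypothetical.
HPET_BASE = 0xFED00000     # conventional x86 HPET MMIO base (assumed here)
HPET_MAIN_COUNTER = 0x0F0  # main counter register offset in the HPET spec
COUNTER_MASK = 0xFFFFFFFF  # treat the counter as 32 bits wide in this sketch


def is_hpet_main_counter(addr):
    """True if addr is the main counter register of an HPET at HPET_BASE."""
    return addr - HPET_BASE == HPET_MAIN_COUNTER


def naive_delta(last, now):
    """Elapsed cycles with plain wraparound arithmetic and no sanity checks."""
    return (now - last) & COUNTER_MASK


def hardened_delta(last, now, max_delta):
    """Elapsed cycles, distrusting counter values that look implausible."""
    delta = (now - last) & COUNTER_MASK
    if delta > COUNTER_MASK >> 1:
        # More than half the counter range: assume the counter went
        # backwards (someone scribbled on it) and report no progress.
        return 0
    # A forward jump larger than we ever expect between reads gets capped.
    return min(delta, max_delta)
```

With a counter that has been knocked backwards from 1000 to 10, `naive_delta` reports roughly 4.29 billion elapsed cycles, while `hardened_delta` reports 0 and lets time resume on the next sane read.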

5 of 115 comments

  1. In other words.. by Anonymous Coward · · Score: 2, Funny

    Closed NOTABUG?

  2. Re:"friend" and "foe", but no "neckbeard" by Anonymous Coward · · Score: 2, Funny

    AC here, no longer posting as myself since I've long lost my SO account, can't be bothered to find the password for the ancient Yahoo email address, and after working on the inside in finance will probably never post an opinion (as my own) again. (Yes, that was a run-on sentence.)

    If 1986 qualifies as a "neckbeard" you missed the mark, unless he's a Berkeley neckbeard. The '80s were a magical time when power ties, very bad print shirts, and driving your overpriced car with women and blow were available to any person who could reasonably crank out C or Basic.

    Just saying...

  3. Re:hardening is NOT blaming the hardware by kad77 · · Score: 3, Funny

    What you posted about his being the 4th post struck me as wrong, given how far down the page it was. I'm bored, so I took a moment to look at how many posts have an earlier timestamp than the one you are slamming (at least 8), and 2 of them make dismissive statements about hardware, including the first comment on the article at 8:12 and another at 8:19 seemingly dismissing hardware as a possibility.

    So your snide comment is not based in fact. It's like you are reading a different page. Maybe you need glasses. An attitude adjustment, for sure.

  4. Re:plus don't crash on bad hardware. Hotplugged CP by cerberusss · · Score: 4, Funny

    Sometimes it

    Sometimes it -- what? Did someone attempt to hot-swap your CPU again? (-:

    --
    8 of 13 people found this answer helpful. Did you?

  5. Re:does not sound like closure to me by tippen · · Score: 3, Funny

    One of the more memorable quotes I heard while developing embedded systems: "If you can fix it in software, it isn't a hardware bug."

    Annoying as hell to the software team when it is clearly a bug in the hardware, but very true at a practical level for the engineering team trying to get product out the door.