Slashdot Mirror


Software To Diagnose Faulty PC Hardware?

Etylowy writes "Over the years I have repaired my own PC and those belonging to family and friends many, many times. While in most cases it turned out to be restoring a system after malware/the user/Windows made a mess, or simple cases of 'follow the smell of smoke and molten plastic,' there were some nasty ones where the computer mostly works. By 'mostly,' I mean: you can boot it up, it might even work for a while, but will crash way too often to blame it all on Microsoft — what do you do then? Once you strip it of any extra hardware (which, with today's motherboards that have pretty much everything integrated, might not be an option) you are left with the CPU, motherboard, graphics card, RAM and HDD. You can test the HDD, you can run memtest86+ to check the RAM, but how do you go about testing the CPU, motherboard and graphics card trio to find which is to blame? Replacing them one by one isn't really an option. Do you know of any software that would help the way memtest helps with RAM?"

11 of 274 comments

  1. OCCT by PFAK · · Score: 4, Interesting

    It will stress your RAM, CPU, and GPU, individually or all at once, with pretty temperature and utilization graphs (Windows only): http://www.ocbase.com/perestroika_en/

    --

    Free means no restrictions, ironic the FSF's GPL forces restrictions, isn't it? What's your definition of free?
  2. Eurosoft PC Check by jdb2 · · Score: 4, Informative

    This is probably one of the best and most comprehensive OS-agnostic, general-purpose boot-CD/floppy PC hardware testing and burn-in tools I've come across, IMHO.

    Here's its web page: http://www.eurosoft-uk.com/pc_check.htm

    In any case, I recommend plugging the ATX cable into a power supply tester that presents a non-trivial load as a first step in diagnosing any PC. You'd be surprised at the ways problems caused by out-of-spec voltages can manifest.

    jdb2

    1. Re:Eurosoft PC Check by Artifakt · · Score: 5, Informative

      Every failed power supply I've found was visibly broken once you opened it up, and it was always the capacitors. No exceptions: capacitors had sprayed gunk all over, their aluminium cans had popped off the bases, etc. Typical electrolytic fluid is whitish, but once it bakes dry it will scorch, and so gradually turn reddish brown. Many capacitors have grooves scored into the tops which form a sort of impromptu blow-out panel, and often you will see them bulging, with traces of fluid escaping from these grooves where they are actually splitting open, or scorched fluid forming a red-brown powdery residue outlining them. The grooves are usually in either an X (or plus) or a sort of K shape. The PSUs are often still working (somewhat) at that point, and often the PSU may be putting out nominally correct voltages when cool but deviating when it heats up. I had one client's PC that made a loud bang twice over a period of about a week, but the PC didn't really start acting funny until the third bang. Opening the PSU revealed three small caps that had blown completely off the board. It had probably kept running with no obvious symptoms through the first two.
      Of course, only a trained pro with good tools should ever examine the inside of a power supply while live. But if you are willing to unplug one, take it out of the PC, and let it sit overnight, just to make sure the larger capacitors have fully drained, I recommend examining them. Yes, that voids the warranty if you aren't a pro, but if you were going to junk it and buy a new one anyway, so what? But before you open one, read this:

      DON'T EVER OPEN A PLUGGED IN POWER SUPPLY. IF THIS DOESN'T APPLY TO YOU, YOU ALREADY HAVE AN ELECTRICIAN'S LICENCE, AN EE DEGREE, OR SIMILAR. DON'T OPEN A POWER SUPPLY UNLESS YOU KNOW THE LARGE CAPACITORS INSIDE ARE DISCHARGED - THEY CAN MAKE YOUR ARM MUSCLES CONTRACT HARD ENOUGH TO BREAK YOUR BONES. GIVE THEM AT LEAST AN HOUR TO RUN DOWN, THEN USE AN INSULATED TOOL TO CROSS THE PLUG PRONGS BEFORE YOU OPEN THE CASE.

      Split caps or scorched ones will confirm you are right in your guess that it's the PSU. While you're at it, if you think the problem is the motherboard, check for capacitor damage there too, as it's not all that uncommon for that to be why a mainboard fails. Cheap electrolytics are probably responsible for more than half of all consumer electronics failures. They are by far the most likely source of intermittent failures (ones that come and go with temperature, or glitches that only partly disable something), and they are detectable.

      --
      Who is John Cabal?
  3. Preventative Medicine - get a UPS by jackchance · · Score: 4, Informative

    Most home computer hardware failures come from "brownouts".

    If you notice that your lights dim a little bit when your fridge compressor or AirCon comes on, that is a recipe for a computer failure. Spend $50 and get a UPS.

    Btw, I noticed that my Linksys wifi router was also extremely sensitive to brownouts. It would get funked up and need to be power cycled. Plug it into a UPS: no more wifi problems either.

    I learned this the hard way when I moved to an old building in the East Village of NYC and had 3 motherboards/CPUs fail within a 3-month period.

    --
    1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765
  4. SMART for dying hard drives by Wrath0fb0b · · Score: 4, Informative

    http://sourceforge.net/apps/trac/smartmontools/wiki is great for finding out what the drives think about their own health. Things to look out for are spin-retry counts (which lead to that annoying 2-5 second freeze) and high reallocated sector counts.

    Never, never, never use chkdsk to attempt to fix a broken hard drive. With the robustness of modern journaling file systems (HFS, extN, NTFS), storage errors are almost always hardware errors. Running chkdsk stresses the drive just as it's failing and usually pushes it over the edge, and then users complain that you can't recover their data.
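
    A minimal sketch of picking those two attributes out of a SMART attribute table, for anyone who wants to script the check. The sample text is invented and only assumed to match the usual layout printed by `smartctl -A`; the helper names and the "any nonzero raw value is suspicious" rule are hypothetical choices for illustration, not smartmontools recommendations.

```python
# Hypothetical sample shaped like `smartctl -A` output; all values are invented.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   036   045   000    Old_age   Always       -       36
"""

# Attributes called out in the comment above.
WATCHED = {"Reallocated_Sector_Ct", "Spin_Retry_Count"}


def parse_attributes(text):
    """Return {attribute_name: raw_value} for rows that look like attribute lines."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                attrs[fields[1]] = int(fields[9])
            except ValueError:
                continue  # raw values are not always plain integers; skip those
    return attrs


def suspicious(attrs):
    """Names of watched attributes whose raw value is above zero (assumed rule)."""
    return sorted(name for name in WATCHED if attrs.get(name, 0) > 0)


if __name__ == "__main__":
    print(suspicious(parse_attributes(SAMPLE)))
```

    On a real machine the text would come from running `smartctl -A` against the drive (usually as root) instead of the built-in sample.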

  5. Overheat by gd2shoe · · Score: 4, Informative

    That's a marginal idea at best, but a common one.

    While the technique of blasting a processing unit to see how it behaves at maximum temperature will sometimes find a faulty unit, many faults are not temperature related and will not show up on this test. It's fine that you brought it up here, but something that both heats the CPU/GPU and tries to test as many pathways / as much of the instruction set as possible would be far more useful. (cf. memtest86+ for RAM)
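
    To picture the difference between "heat it up" and "heat it up and check the answers": the sketch below repeats a deterministic workload and compares every result against the first run, so a wrong answer is caught even if nothing crashes. It is only an illustration in Python; it touches a tiny slice of the CPU, and the workload and function names are made up here, not taken from any existing tool.

```python
import hashlib
import math


def workload(seed, n=20000):
    """Deterministic mix of integer and floating-point work; returns a hex digest."""
    h = hashlib.sha256()
    x = seed
    acc = 0.0
    for i in range(1, n + 1):
        # 64-bit linear congruential step (integer multiply/add/modulo).
        x = (x * 6364136223846793005 + 1442695040888963407) % (1 << 64)
        # A little floating-point work (square root and division).
        acc += math.sqrt(x % 1000003) / i
        h.update(x.to_bytes(8, "little"))
    h.update(repr(acc).encode())
    return h.hexdigest()


def consistency_check(rounds=5, seed=12345):
    """Return the indices of rounds whose result differs from the first run."""
    reference = workload(seed)
    return [r for r in range(1, rounds) if workload(seed) != reference]


if __name__ == "__main__":
    print("mismatched rounds:", consistency_check())
```

    On healthy hardware the list of mismatched rounds should stay empty; any entry means the same computation produced two different answers.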

    --
    I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
  6. PSU by gd2shoe · · Score: 5, Informative

    Oh, and don't forget to check the PSU. When it acts up, it will often appear to be a hardware fault somewhere else in the machine. (often RAM, but can be MB, CPU, GPU...)

    This certainly doesn't answer the poster's question, but it is related and important.

    --
    I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    1. Re:PSU by Daneurysm · · Score: 4, Informative

      I was just about to mention this. I used to work in a mom-n-pop shop, the only one in the area, for a long time.

      I have seen some of the most ridiculous problems that were PSU related. Serial mouse not working, VGA card outputting in B&W, slow and/or intermittent performance, HDs that constantly reset (and sound like click of death in the process), new memory being blown, known good memory acting like bad memory, CD-Rs that can't burn (or finish burning successfully), software modems that couldn't go off hook, AGP cards crashing, PCI cards crashing, VLB SCSI cards not working at all.

      The list really just goes on and on and on. Software to diagnose faulty PC hardware? Sorry, no thanks. I had tried all manner of diagnostic and test software over the years. Some worked some of the time (mem tests and HD scanners); the rest were borderline useless pieces of crap. Not only that, but because of faulty PSUs (usually overloaded, or just old, or overheating, etc etc etc) I have seen those same programs misdiagnose just about everything.

      Aside from simple sensor reading and verification (of code, built in HW diagnostics, etc) I do not trust 'software based' hardware diagnosis, especially on a PC.

      YMMV.

    2. Re:PSU by mysidia · · Score: 5, Insightful

      Check supply voltages first.

      There's a really fancy test program to do this... it's called a digital multimeter, and it's a piece of hardware with two probes.

      You touch one probe to ground, and then use the other to check all the leads going into the MB for supply voltage.

      For desktops that is.

      For servers, the power supplies are generally smart modular units, and you check their voltage outputs in the BIOS screens, or using remote management via BMC: IPMI, iLO, DRAC, or ALOM.
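
      Once you have the multimeter readings, deciding whether each rail is in spec is simple arithmetic. A rough sketch, assuming the commonly cited ATX tolerances of ±5% on the +3.3 V, +5 V and +12 V rails and ±10% on -12 V; the readings below are invented, and the function names are just for illustration.

```python
# Assumed tolerances per rail (fraction of nominal), as commonly cited for ATX.
TOLERANCE = {3.3: 0.05, 5.0: 0.05, 12.0: 0.05, -12.0: 0.10}


def limits(nominal):
    """Return the (low, high) acceptable voltages for a nominal rail."""
    margin = abs(nominal) * TOLERANCE[nominal]
    return nominal - margin, nominal + margin


def out_of_spec(readings):
    """Return the nominal rails whose measured voltage falls outside its limits."""
    bad = []
    for nominal, measured in readings.items():
        low, high = limits(nominal)
        if not (low <= measured <= high):
            bad.append(nominal)
    return sorted(bad)


if __name__ == "__main__":
    # Invented example readings: the +12 V rail is sagging.
    readings = {3.3: 3.31, 5.0: 5.08, 12.0: 11.2, -12.0: -11.5}
    for rail in out_of_spec(readings):
        low, high = limits(rail)
        print(f"{rail:+} V rail reads {readings[rail]} V, expected {low:.2f} to {high:.2f} V")
```

      A steady reading inside the limits only rules out a constant fault; it says nothing about short dips under load.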

    3. Re:PSU by robbak · · Score: 4, Insightful

      While that is a good "Bad or Maybe" test, most PSU problems are transient over- or under-voltage conditions, which a DMM is not going to reveal.
      And there are testers that will measure all (or most) of the voltages produced at once - you just plug the ATX cable into the device, and many of them have a pass-through, so you can test the PSU under load. I'd look for one that could flag a transient problem, if it exists.

      Mind you, since writing the above I have looked around for one, and have failed! They all are pretty simple devices that do not detect transients, I could find no pass-through devices, and they all test under very anemic loads. All told, I am not impressed by any of them.

      --
      Prediction for end of Universe #42: Fencepost error in Quantum_bogosort.cpp
  7. Re:Mod up - Everyone buy one of these by cbiltcliffe · · Score: 4, Funny

    With hardware it's usually bad PSU, then bad memory, then bad caps.

    Then bad karma, then bad mojo.

    --
    "City hall" in German is "Rathaus" Kinda explains a few things......