Slashdot Mirror


Software To Diagnose Faulty PC Hardware?

Etylowy writes "Over the years I have repaired my own PC and those belonging to family and friends many, many times. While in most cases it turned out to be restoring a system after malware/the user/Windows made a mess, or simple cases of 'follow the smell of smoke and molten plastic,' there were some nasty ones where the computer mostly works. By 'mostly,' I mean: you can boot it up, it might even work for a while, but will crash way too often to blame it all on Microsoft — what do you do then? Once you strip it of any extra hardware (which, with today's motherboards that have pretty much everything integrated, might not be an option) you are left with the CPU, motherboard, graphics card, RAM and HDD. You can test the HDD, you can run memtest86+ to check the RAM, but how do you go about testing the CPU, motherboard and graphics card trio to find which is to blame? Replacing them one by one isn't really an option. Do you know of any software that would help the way memtest helps with RAM?"

274 comments

  1. OCCT by PFAK · · Score: 4, Interesting

    It will stress your RAM, CPU, and GPU or all at once with pretty temperature and utilization graphs (for Windows only): http://www.ocbase.com/perestroika_en/

    --

    Free means no restrictions, ironic the FSF's GPL forces restrictions, isn't it? What's your definition of free?
    1. Re:OCCT by Etylowy · · Score: 1

      As far as I can see there is no GPU testing option, but it seems like a good solution for Mobo + CPU.

    2. Re:OCCT by PFAK · · Score: 3, Informative

      Did you actually install it? (or are you a typical /. reader?) It has a "GPU" option for stress testing your graphics card if you have the latest DirectX updates installed.

      --

      Free means no restrictions, ironic the FSF's GPL forces restrictions, isn't it? What's your definition of free?
    3. Re:OCCT by Etylowy · · Score: 1

      It's the first time I have heard about this software, so I had a look at http://www.ocbase.com/perestroika_en/index.php?2008/03/14/23-what-is-occt - and all it says is "OCCT (stands for "OverClock Checking Tool) is a CPU stability testing program, developped by myself." - not a single word about GPU on the whole "what is occt" page. Maybe I am not paranoid enough to assume that the developer is a liar and check if he hadn't hidden like 1/3 of his software's functions ;-)

    4. Re:OCCT by piero.grimo · · Score: 1

      It will stress your RAM, CPU, and GPU or all at once

      So how does it help identify which one is to blame?

    5. Re:OCCT by adamstew · · Score: 2, Informative

      Many people overclock their GPUs too, so it would make sense that an overclocking stability tool would stress that as well.

    6. Re:OCCT by Narpak · · Score: 2, Informative

      More than once I have found that the on-board sound chip from Realtek causes the computer to crash or suffer significant slowdowns. Disabling it and putting in a budget sound card fixed it. So I would suggest that disabling the various on-board components in turn could uncover the culprit. That being said, identifying hardware problems has always, for me, been a bit hit and/or miss.

    7. Re:OCCT by astar · · Score: 1

      On your sig, English dictionaries have a lot of definitions of free, and as I understand it, none that exactly match free as in free software. That is why people who need to be precise say gratis and libre. You are playing nominalist.

      Not all your fault. The current English dictionaries are probably the result of the ongoing long-term cultural deterioration. I would expect that really old English dictionaries have the meaning.

    8. Re:OCCT by JMandingo · · Score: 2, Informative

      Use a can of compressed air to purge out any accumulated dust. Less dust means a cooler box, which may just bring the unit back within whatever temperature (or, by extension, power) tolerance it is pushing the envelope on. Another technique is to wiggle every cable and connector and slotted card, just to make sure nothing has come loose. Check to make sure all the fans are running whilst powered on.

      --
      Vonnegut was right: Of all the words of mice and men, the saddest are, "It might have been."
    9. Re:OCCT by Anonymous Coward · · Score: 0

      Fer christsakes! Just Google "pc diagnostic card"
      There's lots of them out there.

    10. Re:OCCT by PopeRatzo · · Score: 0, Offtopic

      ...the ongoing long-term cultural deterioration.

      There is no such thing. Every culture has a small number of people for whom nothing is as good as it once was. Music "deteriorated" when The Beatles arrived, because their music wasn't as "good" as Glenn Miller or Beethoven. Painting "deteriorated" when those sloppy Impressionists stopped coloring inside the lines. Movies "deteriorated" when Godard didn't tell a story as neatly as Howard Hawks. Human communications "deteriorated" because people send email instead of writing letters. Everything was better "back in my day".

      I don't believe "ongoing long-term cultural deterioration" means what you think it means, if it exists at all.

      --
      You are welcome on my lawn.
    11. Re:OCCT by astar · · Score: 0, Offtopic

      It is pretty easy to cite examples you would probably find credible.

      For instance, there has not been a new fundamental science discovery in sixty years. The current dominant TOE is string theory, which after 30 years of work has not produced one testable conjecture. In fact it is so politically dominant in the physics departments that you cannot get a job unless you sign on to it.

      As far as your examples are concerned, I am not much into music, but consider Ode to Joy to be a culturally universal anthem of freedom and thus superior to the Beatles. Still I have enjoyed the Beatles on occasion. I especially like "Here comes the sun".

      Modern painting is a complex subject, but I respect (darn, blank on his name) statement that modern art was the hope of the world. I think he is wrong. One reason is the art world is well-documented to be more about who you know than how "good" you are. Actually I used to own an original piece of Japanese abstract art.

      I do not go to movies, watch TV or do DVD's. I figure I have better things to do. So I am not sure what the current offerings are like. But I understand sequels are often produced, which does not much reflect creativity, except perhaps originally by marketing. I suspect Hollywood is hard to defend, but most attacks on it seem to be by right-wing crazies. They may have the right idea, but are too ignorant to say anything fundamental.

      I probably do most of my communication in emails. I like it because it is asynchronous. It is my preferred method of communications. I think I plead non-guilty there. My preferred workstation is OpenBSD so I am pretty much a computer nerd.

      Anyway, these are your examples, not mine. I would tend to cite more substantive things. Ultimately, it is a philosophy issue, a two thousand year old social disease. But for current events I would trace it to 1945.

      So, I do not think I am saying what you think I am saying.

    12. Re:OCCT by Anonymous Coward · · Score: 0

      You're a douche bag.

    13. Re:OCCT by Blakey+Rat · · Score: 1

      Nostalgia has ruined your brain.

      I do not go to movies, watch TV or do DVD's. I figure I have better things to do. So I am not sure what the current offerings are like.

      You admit you don't know what the hell you're talking about, so why do you keep talking?

    14. Re:OCCT by xmundt · · Score: 1

      Greetings and Salutations...
                Thanks to this new fangled thing called "Advertising" it is hard to avoid getting smacked in the face with information about upcoming entertainment offerings. It is also very easy to tell from these ads that the movie being pushed is probably mindless pap that has nothing to offer but tired retreads of ancient plots, and, characters being forced into making the worst possible decisions in order to keep the weak plot moving forwards to the inevitable conclusion.
                    Not that I have strong opinions about them...of course.
                    pleasant dreams
                    dave mundt

      --
      YAB - http://blog.beemandave.com/
    15. Re:OCCT by robbak · · Score: 1

      A good point! A tester can tell you what it tested when an error occurred, But that is not always the cause. An error while testing the memory could be caused by the cpu's cache that the data passed through, or the motherboard circuitry. Same thing for the GPU - was it an error in the GPU, or was it an error getting the data to or from the GPU, or a cpu error while analyzing it?

      All too often, you just can't tell. Welcome to the world of PC repair. Add a fault that doesn't show up when the system is warm, and we start to become bald!
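As a rough illustration of what "a tester can tell you what it tested" means, here is a minimal user-space sketch in Python of a pattern test in the spirit of a memory tester. It is not how memtest86+ or any real tool works internally; the function names, patterns and buffer size are all made up for the example. It writes a known byte pattern into a buffer, reads it back, and reports where the two disagree.

```python
def find_mismatches(buf, pattern):
    """Return (offset, value_read) for every byte that differs from pattern."""
    return [(i, b) for i, b in enumerate(buf) if b != pattern]


def pattern_test(size_bytes=1 << 20, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Fill a buffer with each pattern in turn and collect any mismatches.

    Returns a list of (pattern, offset, value_read) tuples. An empty list
    only means this buffer read back correctly during this run.
    """
    buf = bytearray(size_bytes)
    report = []
    for pattern in patterns:
        # Write the pattern across the whole buffer, then read it back.
        buf[:] = bytes([pattern]) * size_bytes
        for offset, got in find_mismatches(buf, pattern):
            report.append((pattern, offset, got))
    return report


if __name__ == "__main__":
    failures = pattern_test()
    print("mismatches found:", len(failures))
```

Note what the report can and cannot say: a mismatch tells you that a read disagreed with a write while memory was being exercised. It does not tell you whether the RAM, the CPU cache, or the motherboard in between flipped the bit, which is exactly the point made above.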

      --
      Prediction for end of Universe #42: Fencepost error in Quantum_bogosort.cpp
    16. Re:OCCT by Khyber · · Score: 1

      "For instance, there has not been a new fundamental science discovery in sixty years."

      Memristors and nanotechnology don't count? What about quantum entanglement having the potential to transmit information 10,000X the speed of light?

      --
      Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
    17. Re:OCCT by astar · · Score: 1

      I did go on to comment on the prevalence of sequels. This is hard to miss even if you do not go to movies. Anyway, I was just commenting on the irrelevance of the straw men that were set up.

      I notice you did not comment on my example, which was in physics. I think you have no future, know it in your heart, and cannot face it. So you attack someone who gives you a partial explanation of why. Or do you believe the media's reassurance on the economy? If you do not, how bad do you think it will get? What happened? What are you doing about it?

    18. Re:OCCT by Blakey+Rat · · Score: 1

      I did go on to comment on the prevalence of sequels. This is hard to miss even if you do not go to movies.

      And?

      Without any proof that sequels are worse movies than average, that means nothing. (Now, granted, sequels very well might be nearly-universally worse than average movies, but you sure as hell didn't show it.)

      I notice you did not comment on my example, which was in physics.

      That's because I'm not a physicist. Are you?

      In my field, web development and usability, there has been AMAZING change in the last 10 years, heck, the last 5 years. And it didn't even exist 15 years ago.

      I think you have no future, know it in your heart, and cannot face it. So you attack someone who gives you a partial explanation of why.

      You're the hopeless pessimist. Why don't you just go put a pistol in your mouth and end it all, if the world is so horrible? Plus it would spare us your whining.

      Or do you believe the media's reassurance on the economy?

      I don't believe anything the media tells me.

      But here's what I know about the economy: my job hasn't, for even a split second, ever been in doubt. Our company is doing quite well, actually. I got a raise last week, and not just a wimpy cost of living raise. Things are pretty damned good, from my perspective.

      If you do not, how bad do you think it will get?

      Look, our "crash" is nothing, compared to the 1929 crash.

      And you know what? Even during the darkest, poorest days of the Great Depression, the average man was orders of magnitude better-off than he would have been a century before. Orders of magnitude.

      But see, here's one difference between me and you: I know history. You quickly learn that you live in the best society that has ever existed, and everything in the past sucked hard.

      You tell me, our modern world is so horrible, where do you want to live? The Roman Empire? The Renaissance? Look it up: what was the life expectancy in your ideal place? (Odds are, if you lived there, you'd already be dead.) What was quality of life like? What are your odds of being a free man vs. slave? Then come back and we'll talk.

      What are you doing about it?

      Nothing.

    19. Re:OCCT by Cylix · · Score: 1

      I also praise the authors of "stress" a handy application for linux which can perform non-destructive stress to hard disks as well as provide load on the processor and memory.

      For memory I really do like memtester, but I wish it was a bit less verbose. (For a user space app it's not bad)
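For anyone curious what a CPU load test with result checking looks like in principle, here is a minimal Python sketch. It is not how "stress" or memtester are implemented; the function names, the hash-chain workload and the default durations are assumptions picked for illustration. The idea is to repeat a deterministic computation for a fixed time and count how often the result disagrees with a reference value.

```python
import hashlib
import time


def hash_chain(rounds=2000, seed=b"stress"):
    """Deterministic CPU-bound work: hash the previous digest repeatedly."""
    digest = seed
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest


def burn(seconds=10.0, rounds=2000):
    """Load one core for roughly `seconds` and verify every result.

    Returns (iterations, mismatches). The reference is computed on the
    same machine, so this only catches results that are inconsistent
    from one run to the next.
    """
    reference = hash_chain(rounds)
    deadline = time.monotonic() + seconds
    iterations = 0
    mismatches = 0
    while time.monotonic() < deadline:
        if hash_chain(rounds) != reference:
            mismatches += 1
        iterations += 1
    return iterations, mismatches


if __name__ == "__main__":
    done, bad = burn(seconds=5.0)
    print(f"{done} iterations, {bad} mismatches")
```

This loads a single core only; to load every core you would start one copy per core. Any non-zero mismatch count on a deterministic workload points at hardware (or cooling, or power) rather than software.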

      --
      "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
    20. Re:OCCT by astar · · Score: 1

      Good question. The photoelectric effect was explained about 1905 by Einstein as involving the quantization of energy. This is where he made his reputation in the field. Skipping a lot of important people, in 1926 we had Schrodinger's equation. Quantum electrodynamics (QED) came along in the 1940's. Quantum entanglement comes out of these theories.

      Memristors and nanotechnologies do not count because they are not fundamental. They often reveal new phenomena, but for an explanation people go back to QED.

      Now Einstein objected to quantum mechanics. He observed "God does not play dice". I share his inclination. On the other hand, he objected to spooky action at a distance. Yet that has been reduced to a commercial product. And I regard reduction to engineering as tending to show that we actually know something. So since we have had transistors, I think it fair to say we have known something about microphysics. Indeed, microphysics and some kinds of astronomy are really the places to look to for new fundamental discoveries. But I advise against confusing the existence of new and better tools with fundamental discoveries. The tools tend to make possible the fundamental discoveries, but they are not the same thing.

    21. Re:OCCT by wisty · · Score: 1

      Bioinformatics is where the fundamental science is being done these days, not physics.

    22. Re:OCCT by mrmeval · · Score: 1

      It's made by a shit eating cretin who locks up all rights, even the right to give it to your friends. Why the fuck would you encourage such a malevolent asshole is beyond reason.

      --
      I'd go on a Vegan diet but the delivery time from Vega is too long. --brownkitty
    23. Re:OCCT by astar · · Score: 1

      Thank you for your reply.

      Regarding sequels, I never actually said they were worse. I said it showed a lack of creativity, which I admit usually means the same thing. But I do not think a "near-universal" phenomenon needs to be argued about.

      I admit to taking a lot of physics in school. But I know some history of physics and I have been exposed to some decent philosophy. So I admit some advantage, but it seems to me many people know about fundamental physics discoveries. I would assume you know about general relativity and quantum mechanics existing, though you might not know the dates. It seems to me you do not have to be a physics person to know about fundamental theories.

      Regarding web development, I am a bit up on that, though I assume you know more about it than I. You are confusing new tools with new fundamental theories. Or perhaps you are confusing engineering with science. Anyway, "change" is not what I am talking about.

      Pessimism. I admit to baiting you. I think this is the best of all possible worlds and always has been. This sort of identifies my philosophical preferences, and its meaning is rather technical. Take it as meaning it is always possible to meet necessity, and so you might have a future, though it is not guaranteed.

      I got a raise last week. Congratulations on the raise. You must be really good at your job. You are dealing with microeconomics. Your future will be determined by larger phenomena, which do not even seem to be on your radar screen. I will go out on a limb and predict another major economic downturn this year. There is some possibility of it happening this month.

      Our crash is nothing compared to 1929 crash. Hmm, our crash is still happening and the fundamentals are worse than in the great depression. You cite 1929, which was the stock market crash. This has some significance, but was not the start and not the end. It was most immediately the result of the earlier pound sterling crash, which was the reserve currency of the day.

      Orders of magnitude. I would not agree with orders of magnitude. In the 1820's we had a fairly sane economic system. For instance, we had a national credit bank, eventually destroyed by the New York financial interests. We were doing a lot of infrastructure development, mainly canals. I suspect people were pretty much happy. If you are less precise on time, I might be inclined to agree.

      I know a little history. I disagree that everything in the past sucked hard. Our successes are rooted in the past. Often in the profound ideas generated by a few of the elite. But for the average person, life often tended to suck hard.

      History is an interesting topic. There are a lot of theories of history and they give different results. A common one these days is sort of the present as a thin bubble and the next instance is determined by what is in the bubble. I reject that approach. Since you repeatedly cite your historical knowledge, you might have a position on theory.

      I never said the modern world was horrible. Some extensive parts have been and are. I think you would have to agree with that. An easy example is a lot of Africa. Also, I suspect I would use a more extensive time frame to define modern. For instance, I consider the post world war II period as that of current events. Again, it is a theoretical consideration, but I think quite useful.

      Well, I agree with you about the media.

    24. Re:OCCT by astar · · Score: 1

      I do not know much biology. I confess I googled. I will allow that characterizing DNA in 1953 is a fundamental theory, though perhaps we should go back to Mendel. This would put my 60 year figure for physics a little off when it comes to biology. But perhaps because of my ignorance I do not know of comparable new discoveries more recently. As far as I can tell, something like sequencing the human genome is new tools elaborating on an old theory. I have noted a tendency to confuse new tools with new fundamental theory. However, you said fundamental science, not new fundamental theory, and you might well be right.

    25. Re:OCCT by Blakey+Rat · · Score: 1

      Regarding sequels, I never actually said they were worse. I said it showed a lack of creativity, which I admit usually means the same thing. But I do not think a "near-universal" phenomenon needs to be argued about.

      Yes, well, the main point here is that you're decrying movies, TV and (presumably) video games that you've never even watched. You're no different from the fundamentalist Christians holding up signs outside Harry Potter screenings, in my eyes.

      I admit to taking a lot of physics in school.

      In other words, you're not a physicist.

      I think this is the best of all possible worlds and always has been. ... Take it as meaning it is always possible to meet necessity, and so you might have a future, though it is not guaranteed.

      So this is the best of all possible worlds, but I'm still going to die horribly, alone and afraid?

      Stop with this vague nonsense and just come out and TELL me why I have no future. What's your theory? What the fuck am I supposed to be so concerned about, exactly? "Lack of new physics breakthroughs" doesn't exactly equate to "horrible nuclear holocaust" or whatever the hell you have in-mind. Be specific.

      You are dealing with microeconomics. Your future will be determined by larger phenomena, which do not even seem to be on your radar screen. I will go out on a limb and predict another major economic downturn this year. There is some possibility of it happening this month.

      Oh, so you're an economist, too?

      Ok, so let's pretend you're not just grasping at straws. Let's assume, further, that the economy does downslide... so what? I'm still better off in the year 2009 with a god-awful economy than I would be if I was born in 1809. The worst that'll happen is that we all have to work a little harder... I'm not even so sure that's a *bad* thing, and it's definitely no harbinger of vague destruction.

      In the 1820's we had a fairly sane economic system. For instance, we had a national credit bank, eventually destroyed by the New York financial interests. We were doing a lot of infrastructure development, mainly canals. I suspect people were pretty much happy.

      Prove they were happy. Give me data, or shut the hell up. (By the way, it's going to be a pretty hard proof, considering the number of slaves at the time!)

      I disagree that everything in the past sucked hard.

      Then you haven't studied enough history.

      Often in the profound ideas generated by a few of the elite. But for the average person, life often tended to suck hard.

      I hate to break this to you, but I'm an average person. And, this is what sets us apart, my life is pretty great! I didn't even have to be born to some asshole whose ancestors tricked or killed people into declaring him king to get enough food on the table.

      History is an interesting topic. There are a lot of theories of history and they give different results. A common one these days is sort of the present as a thin bubble and the next instance is determined by what is in the bubble. I reject that approach. Since you repeatedly cite your historical knowledge, you might have a position on theory.

      What the fuck are you even talking about at this point? A bubble? Like that episode of Star Trek: The Next Generation with Dr. Crusher?

      I never said the modern world was horrible.

      You don't? Then how come you keep saying we're all going to suffer some kind of calamity? Like you did 5 paragraphs ago? (And you still haven't even gotten close to explaining what a lack of new physics discoveries has to do with it.)

      Jesus Christ. Do you talk like this at parties? You must be a real hit.

    26. Re:OCCT by kav2k · · Score: 2, Informative

      You can add Furmark to your "toolbox" to stress-test your GPU, free, built almost for that specific purpose and effective.

    27. Re:OCCT by vlm · · Score: 1

      It's made by a shit eating cretin who locks up all rights, even the right to give it to your friends. Why the fuck would you encourage such a malevolent asshole is beyond reason.

      Like the article says, "will crash way too often to blame it all on Microsoft".

      --
      "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
    28. Re:OCCT by Lord_Dweomer · · Score: 1

      Will this catch fried graphics cards? I have a Dell (yeah, I know) m1730 xps with dual nvidia 8800m gtx cards that no longer seem to work. They show exclamation marks in the device manager, and Windows is using its backup drivers.

      I played around with different driver versions and other fixes, but Dell still feels it is a software issue, despite there being many forum threads from people with the same setup experiencing the same symptoms. It seems to be caused by the cards overheating due to poor engineering on nvidia's part and an issue with the Dell BIOS that doesn't turn the GPU fans on until it's already way too hot.

      Trying to find some way to confirm and isolate the hardware issue to prove the situation to Dell.

      --
      Buy Steampunk Clothing Online!
    29. Re:OCCT by Khyber · · Score: 1

      Memristors are the newest fourth fundamental piece of electronics. first proposed in 1971 and still no commercial version available yet.

      --
      Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
    30. Re:OCCT by Anonymous Coward · · Score: 0

      Free means no restrictions, ironic the FSF's GPL forces restrictions, isn't it? What's your definition of free?

      To quote Anarchy Burger by the Vandals:

      America stands for freedom

      but if you think you're free

      try walking into a deli

      and urinating on the cheese

      - In order to ensure everyone is free as possible, some restrictions are necessary.

    31. Re:OCCT by Anonymous Coward · · Score: 0

      Hiren's CD 10.0 is your friend, it has all that software built in :)

      PC Check 6.0 is pretty good on its own for testing all that stuff too, and you can set it up to automate the tests.

    32. Re:OCCT by FreakWent · · Score: 1

      Well we can see a steady decline in the sophistication and importance of the ideas presented in popular media, and a decline in the complexity of their presentation.

    33. Re:OCCT by PopeRatzo · · Score: 1

      If you were to do a little research, you'd learn that sequels are no more prevalent today than they were in 1940, 1950, 1960, 1970... etc.

      Sorry, astar, but it's starting to look like you're an idiot.

      --
      You are welcome on my lawn.
    34. Re:OCCT by PopeRatzo · · Score: 1

      Sequels are NOT more prevalent now than in earlier eras of film.

      --
      You are welcome on my lawn.
    35. Re:OCCT by PopeRatzo · · Score: 1

      Astar, I'm starting to think you're having us all on. I'm sorry I criticized you, now that I realize you're just spoofing us.

      --
      You are welcome on my lawn.
    36. Re:OCCT by PopeRatzo · · Score: 1

      Ever looked at "popular media" of the 1950's?

      --
      You are welcome on my lawn.
    37. Re:OCCT by astar · · Score: 1

      Again, note that movies were someone else's example. My example was from physics. I just commented on a "near-universally recognized" problem and used it to cite an absence of creativity, which I happen to think is a general and fundamental problem. The physics issue also I think reflects a fundamental problem.

      I find it interesting that people like to focus on a small criticism of movies. Think how much I would be abused if I attacked games, :)

      Anyway, in 1940 I think the Hollywood star system was in full swing.

         

  2. what about PTS? by midol · · Score: 2, Insightful

    The phoronix test suite is a good profiler; at least it would narrow the search. But, as you observed, once you are down to the RAM and integrated devices, what options do you really have except to toss the mobo?

  3. Eurosoft PC Check by jdb2 · · Score: 4, Informative

    This is probably one of the best and most comprehensive OS agnostic boot-CD/floppy general purpose PC hardware testing and burn-in tools I've come across IMHO.

    Here's its web page : http://www.eurosoft-uk.com/pc_check.htm

    In any case, I recommend plugging the ATX cable into a power supply tester that presents a non-trivial load as a first step in diagnosing any PC. You'd be surprised in what ways the problems caused by out-of-spec voltages can be manifested.
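If your tester or a multimeter gives you actual numbers, checking them against tolerance is simple arithmetic. Below is a minimal Python sketch. The tolerance table is an assumption based on the figures commonly quoted from the ATX specification (plus or minus 5% on the positive rails, plus or minus 10% on -12V), so verify it against your own supply's documentation; the function and variable names are invented for the example.

```python
# Nominal voltage and allowed fractional deviation per rail.
# Assumed values, as commonly quoted for ATX supplies.
ATX_RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
    "+5VSB": (5.0, 0.05),
}


def check_rails(readings):
    """Return {rail: (measured, low, high)} for every out-of-range reading.

    `readings` maps a rail name from ATX_RAILS to the measured voltage.
    """
    out_of_spec = {}
    for rail, measured in readings.items():
        nominal, tolerance = ATX_RAILS[rail]
        margin = abs(nominal) * tolerance
        low, high = nominal - margin, nominal + margin
        if not (low <= measured <= high):
            out_of_spec[rail] = (measured, low, high)
    return out_of_spec


if __name__ == "__main__":
    # Made-up readings taken under load.
    measured = {"+3.3V": 3.31, "+5V": 5.10, "+12V": 11.20, "-12V": -11.0}
    for rail, (value, low, high) in check_rails(measured).items():
        print(f"{rail}: {value} V is outside {low:.2f} V to {high:.2f} V")
```

Readings should be taken under a non-trivial load, as noted above, since a supply that looks fine idle can sag once the system draws real current.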

    jdb2

    1. Re:Eurosoft PC Check by Omnifarious · · Score: 2, Informative

      I second this. I've had 2 or 3 PCs now that have begun acting very strangely only to discover that the real problem was the power supply. Replace it and the PC acts fine again.

    2. Re:Eurosoft PC Check by piero.grimo · · Score: 2, Informative

      Same here. I've consistently had problems with a PC to discover years later that the PSU was defective (it actually blew up). I got a 450W PSU and all the bizarre symptoms have vanished.

    3. Re:Eurosoft PC Check by Anonymous Coward · · Score: 0

      Not sure if Eurosoft's PC-Check is better than the PC-Doctor Service Center product that you can buy on Amazon. It looks like the PC-Doctor one has a lot more hardware testing tools.

      http://www.amazon.com/PC-Doctor-Service-Center-Computer-Diagnostics/dp/B000Z88VXK

    4. Re:Eurosoft PC Check by Monolith1 · · Score: 1

      +1 for pc check

    5. Re:Eurosoft PC Check by Artifakt · · Score: 5, Informative

      Every power supply which I've found failed was visibly broken once you opened it up, and it was always the capacitors. No Exceptions - capacitors had sprayed gunk all over, their Aluminium cans had popped off the bases, etc. Typical electrolytic fluid is white-ish, but once it bakes dry will scorch, and so gradually turn reddish brown. Many capacitors have grooves scored into the tops which form sort of impromptu blow out panels, and often you will see them bulging, with traces of fluid escaping from these grooves where they are actually splitting open, or scorched fluid forming a red-brown powdery residue outlining them. The grooves are usually in either an X (or Plus) or a sort of K shape. The PSUs are often still working (somewhat) at that point, and often, the PSU may be putting out nominally correct voltages when cool but deviating when it heats up. I had one client's PC that made a loud bang twice over a period of about a week, but the PC didn't really start acting funny until the third bang. Opening the PSU revealed three small caps that had blown completely off the board. It had probably kept running with no obvious symptoms through the first two.
              Of course, only a trained pro with good tools should ever examine the inside of a power supply while live. But, if you are willing to unplug one and take it out of the PC and let it sit overnight, just to make sure the larger capacitors have fully drained, I recommend examining them. Yes, that voids the warranty if you aren't a pro, but if you were going to junk it and buy a new one anyway, so what? But before you open one, read this:

              DON'T EVER OPEN A PLUGGED IN POWER SUPPLY. IF THIS DOESN'T APPLY TO YOU, YOU ALREADY HAVE AN ELECTRICIAN'S LICENCE, AN EE DEGREE, OR SIMILAR. DON'T OPEN A POWER SUPPLY UNLESS YOU KNOW THE LARGE CAPACITORS INSIDE ARE DISCHARGED - THEY CAN MAKE YOUR ARM MUSCLES CONTRACT HARD ENOUGH TO BREAK YOUR BONES. GIVE THEM AT LEAST AN HOUR TO RUN DOWN, THEN USE AN INSULATED TOOL TO CROSS THE PLUG PRONGS BEFORE YOU OPEN THE CASE.

              Split caps or scorched ones will confirm you are right in your guess that it's the PSU. While you're at it, if you think the problem is the motherboard, check for capacitor damage there too, as it's not all that uncommon for that to be why a mainboard fails. Cheap electrolytics are probably responsible for more than half of all consumer electronics failures, they are by far the most likely source of intermittent failures, ones that come and go with temperature, or glitches that only partly disable something, and they are detectable.
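To put rough numbers on why the large capacitors deserve that much caution, here is a minimal Python sketch of the textbook formulas: stored energy E = C * V^2 / 2 and exponential decay through a resistor, V(t) = V0 * exp(-t / (R * C)). Every component value in it is a made-up illustration, not a measurement from any real power supply, and the function names are invented for the example.

```python
import math


def stored_energy_joules(capacitance_farads, volts):
    """Energy held in a charged capacitor: E = C * V^2 / 2."""
    return 0.5 * capacitance_farads * volts ** 2


def discharge_time_seconds(capacitance_farads, resistance_ohms, v_start, v_target):
    """Time for an RC circuit to decay from v_start to v_target.

    Solves V(t) = V0 * exp(-t / (R * C)) for t. Returns 0.0 if the
    capacitor is already at or below the target voltage.
    """
    if v_start <= v_target:
        return 0.0
    tau = resistance_ohms * capacitance_farads
    return tau * math.log(v_start / v_target)


if __name__ == "__main__":
    # Hypothetical values: a 470 uF capacitor charged to 325 V,
    # bleeding down through a 220 kOhm resistor.
    c, r, v0 = 470e-6, 220e3, 325.0
    print(f"stored energy: {stored_energy_joules(c, v0):.1f} J")
    print(f"time to fall below 50 V: {discharge_time_seconds(c, r, v0, 50.0):.0f} s")
```

The decay formula assumes a working bleeder resistor. A faulty supply may not have one, or it may have failed open, which is a good reason to wait much longer than the arithmetic suggests and to treat the calculation as an estimate rather than proof that the capacitors are safe.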

      --
      Who is John Cabal?
    6. Re:Eurosoft PC Check by matrix28au · · Score: 1

      I have friends who are in a charity organisation and they only use PC Check. It's a tried and proven method & works :)

    7. Re:Eurosoft PC Check by rickb928 · · Score: 2, Informative

      Don't trust the caps with the 'X' pattern. The 'K' pattern is more reliable.

      Ask any of the many who had Dell machines from about 2000-2004. And HP/Compaq. And Acer. Not so much IBM/Lenovo. I have no reports for Gateway.

      Also affected ASUS, MSI, AOpen, Gigabyte motherboards, pretty much all brands.

      For a period of time, there were substandard caps being used, but the maker either faked the testing or used different component parts in production runs than in certification. If you got stung by these, you and I were the QA.

      It was not pretty.

      --
      deleting the extra space after periods so i can stay relevant, yeah.
    8. Re:Eurosoft PC Check by Cylix · · Score: 2, Informative

      YOU SHOULD NEVER USE CAPS LIKE THIS AND NEVER SUGGEST SOMEONE BRIDGE COMPONENTS WITH A SCREW DRIVER.

      I'm getting a bit tired of replying to all of the bad advice I see flying around. However, never discharge caps by bridging the connectors (even if the tool is insulated). A large enough power source can cause some serious problems.

      The proper way to handle this is to terminate the load into a ground source capable of dissipating the load. Earth ground will suffice, but don't dump a crap ton of current into the ground of your house.

      --
      "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
    9. Re:Eurosoft PC Check by Mishotaki · · Score: 1
      this reminds me of an old friend that contacted me the other day.... he told me he fried his PC...

      i ask how do you know you fried it?

      he answers that he played around inside of it with a screwdriver while the power was on... it made some nice sparks and won't work anymore...

      yes, i told him he was an idiot for doing that...

    10. Re:Eurosoft PC Check by highways · · Score: 1

      DON'T OPEN A POWER SUPPLY UNLESS YOU KNOW THE LARGE CAPACITORS INSIDE ARE DISCHARGED - THEY CAN MAKE YOUR ARM MUSCLES CONTRACT HARD ENOUGH TO BREAK YOUR BONES.

      Only if you're in a 110V country like the USA.

      In Australia/UK/EU where it's 230-250V, a stuff-up can mean DEATH.

      No, really - 240V will kill you. If you're lucky, you might only get scorched nerves.

    11. Re:Eurosoft PC Check by iq+in+binary · · Score: 1

      DON'T EVER OPEN A PLUGGED IN POWER SUPPLY. IF THIS DOESN'T APPLY TO YOU, YOU ALREADY HAVE AN ELECTRICIAN'S LICENCE, AN EE DEGREE, OR SIMILAR. DON'T OPEN A POWER SUPPLY UNLESS YOU KNOW THE LARGE CAPACITORS INSIDE ARE DISCHARGED - THEY CAN MAKE YOUR ARM MUSCLES CONTRACT HARD ENOUGH TO BREAK YOUR BONES. GIVE THEM AT LEAST AN HOUR TO RUN DOWN, THEN USE AN INSULATED TOOL TO CROSS THE PLUG PRONGS BEFORE YOU OPEN THE CASE.

      I'd modify that to say "If you don't know what P=IE means, don't ever open a computer case, PERIOD."

      That aside, as a computer tech and salesman, the first thing I do when a customer brings a machine into my shop complaining of blue screens or anything that even remotely smells of hardware is check the caps on their motherboard. It's actually a 50/50 shot that there's a popped or bulged cap or 2, usually accompanied by enough dust to smoke out a 10x10 room if a compressor is taken to it.

      90% of the hardware failures I see are caused by overheating, with 3/4 of those caused by dust. A good rule of thumb is if you see dust bunnies, the machine is probably cooked.

      --
      Of all the Universal Constants, here's one I know: Nice guys finish last ;)
    12. Re:Eurosoft PC Check by ajlisows · · Score: 1

      I'm probably going to sound like a moron here, but what the heck. Not the first time, not the last. I mentioned that I have been seeing a lot of bulging capacitors on motherboards somewhere else in this thread. I do not have a very strong knowledge of components but it seems to me that those computers with bulging/slightly leaking caps may work fine, but their performance is far worse than would be expected from the machine. Can motherboards "Wear down" in this fashion, making the computer slower as it ages? Or will issues with capacitors be more or less an all or none thing?

    13. Re:Eurosoft PC Check by mrmeval · · Score: 1

      They will cause various errors which seem to be covered up by the intense error checking of the hardware and operating system. I have had three Gigabyte motherboards. The first one had the leaking capacitor problem due to the corporate espionage that garnered only half of the capacitor formula. The next two Gigabyte motherboards did not have that excuse yet they still had leaking capacitors. I will not ever buy Gigabyte's products again.

      With all three boards glitches would start randomly and they would progress as the capacitors continued to fail until I could not use the system. The third board bought in 2006 failed badly enough that it caused excess current to be pulled from the power supply which did not fail but generated a lot of heat.

      I now have an ASUS 'solid electrolyte capacitor' board and a high end power supply and will see how it goes. I bought them at the beginning of 2009 and have not had an issue other than no support for the ATI Radeon card other than xorg's drivers.

      --
      I'd go on a Vegan diet but the delivery time from Vega is too long. --brownkitty
    14. Re:Eurosoft PC Check by Anonymous Coward · · Score: 0

      Uhm.

      Acer owns Gateway. Packard Bell and eMachines, too. For quite some time. Basically, it's all the same stuff.

      Bad caps have been quite the issue, though. A few motherboard brands folded over it.

    15. Re:Eurosoft PC Check by ozbird · · Score: 1

      Don't trust the caps with the 'X' pattern. The 'K' pattern is more reliable.

      'Y'?

    16. Re:Eurosoft PC Check by Quatermass · · Score: 1

      The fact that this web site doesn't tell you the price of the product makes me wonder how affordable it is.

      --
      Stuart http://stuarthalliday.com/
    17. Re:Eurosoft PC Check by Quatermass · · Score: 2, Informative

      Eh?

      The main capacitors (usually around 600-1000uF) that smooth the output of the rectified mains are only charged to about 300-400V and, if the supply is designed correctly, will have discharge resistors across them to render them safe in milliseconds.

      See
      http://pavouk.org/hw/en_atxps.html
      or
      http://www.smpspowersupply.com/ATX_power_supply_schematic.pdf

      for examples.

      Please do not alarm people needlessly.

      --
      Stuart http://stuarthalliday.com/
    18. Re:Eurosoft PC Check by mea_culpa · · Score: 2, Informative

      You can use a resistor to drain the caps safely.
      This is the preferred method as shorting them with bare metal can cause damage to the cap especially if it is highly charged.
      This is ELE-100 stuff here.
      Take a 25K 10W resistor, hold it with a pair of insulated pliers and short the leads of the capacitor with the resistor for about 30 seconds. Verify that it is actually drained by measuring it with your DMM. Repeat if necessary.
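      As a rough check on those numbers, here is a back-of-envelope RC discharge sketch. The 25K resistor is from the comment above; the 1000 uF capacitance and 340 V starting voltage are assumptions chosen for illustration (a large primary cap on rectified mains), and the 5 V "safe" target is arbitrary. Real supplies may differ, so the DMM reading is what counts.

```python
import math

def discharge_voltage(v0, r_ohms, c_farads, t_seconds):
    """Voltage left on a capacitor after discharging through a resistor for t seconds."""
    return v0 * math.exp(-t_seconds / (r_ohms * c_farads))

def time_to_reach(v0, v_target, r_ohms, c_farads):
    """Seconds needed for the capacitor voltage to fall from v0 to v_target."""
    return r_ohms * c_farads * math.log(v0 / v_target)

def peak_power(v0, r_ohms):
    """Power dissipated in the resistor at the instant it is connected (P = V^2 / R)."""
    return v0 ** 2 / r_ohms

R = 25_000    # 25K resistor, as suggested in the comment
C = 1000e-6   # ASSUMED capacitance: 1000 uF
V0 = 340.0    # ASSUMED starting voltage

print(f"time constant: {R * C:.0f} s")
print(f"left after 30 s: {discharge_voltage(V0, R, C, 30):.0f} V")
print(f"time to fall to 5 V: {time_to_reach(V0, 5.0, R, C):.0f} s")
print(f"peak resistor power: {peak_power(V0, R):.1f} W")
```

      Under these assumptions 30 seconds is only a little over one time constant, which is why the "verify with your DMM and repeat if necessary" step matters. The peak power stays below the resistor's 10W rating.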

    19. Re:Eurosoft PC Check by cbiltcliffe · · Score: 1

      This is bullshit.

      The large caps in a computer PSU are not charged up to mains voltage.

      They're on the secondary side of the transformer, so at most, they'll be at 24 volt, but usually 12 or 5.

      You won't even feel that if you grab both pins.

      It'd be like grabbing both terminals of a 12 volt car battery. It's capable of supplying probably 700 amps, but you won't feel it, because 12 volt across the resistance of the human body is on the order of microamps. On the other hand, if you do short these out with a screwdriver, you'll see an impressive fat spark, and could damage other components in the area, so it's still not a good idea.

      --
      "City hall" in German is "Rathaus" Kinda explains a few things......
  4. random thoughts by Anonymous Coward · · Score: 3, Insightful

    self-checking programs like Prime95 can be useful to test the computer more generally (if you've verified with memtest a failure here basically means cpu/chipset at fault).

    Other things I've tried before have been (if the motherboard allows) things like significantly underclocking sections of the motherboard/processor. If a specific underclock fixes the problem, you have just significantly narrowed down the list of possible failures.

    there are similar programs to memtest that will check that a GPU's output conforms to what it should be, but if you just have random-crashy-badness that can be a pain to diagnose. Sometimes things like just running without graphics drivers for a while can help spot those problems: if the computer no longer crashes you can look a bit further away from the graphics card, as most of its capabilities won't be used.
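    To show what "self-checking" means here, a minimal sketch: run a computation whose answer is already known, many times over, and flag any mismatch. It uses the Lucas-Lehmer test on small Mersenne exponents. This is not how Prime95 is implemented (that relies on far heavier FFT-based arithmetic and its own error checks); the function names and the `KNOWN` table are made up for illustration.

```python
def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime. Valid for odd prime exponents p."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

# Known results for small odd prime exponents (True = Mersenne prime).
KNOWN = {3: True, 5: True, 7: True, 11: False, 13: True,
         17: True, 19: True, 23: False, 29: False, 31: True}

def self_check(rounds=100):
    """Repeat the known computations; return (round, exponent) for every wrong answer."""
    failures = []
    for i in range(rounds):
        for p, expected in KNOWN.items():
            if lucas_lehmer(p) != expected:
                failures.append((i, p))
    return failures

if __name__ == "__main__":
    bad = self_check()
    print("no mismatches" if not bad else f"mismatches: {bad}")
```

    On healthy hardware the list stays empty. A toy like this puts almost no load on the CPU, so it only demonstrates the principle, not a usable stress test.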

    1. Re:random thoughts by lukas84 · · Score: 1

      For stability tests, i prefer IntelBurnTest over Prime95. Basically it just automates running LinPack.

      Will test memory as well, and has a 64bit version available.

      http://downloads.guru3d.com/IntelBurnTest-v2.3-download-2047.html

    2. Re:random thoughts by godrik · · Score: 1

      I used prime95 to test some hardware last year and successfully found some processors on the network which divided wrong.

  5. Just replace it. by lukas84 · · Score: 2, Informative

    Repairing hardware makes no sense anymore. Just swap in a new machine from the pool, so the user will be happy again, call the manufacturer to send someone onsite to replace the system board, redeploy the image, and put the machine back into the pool.

    At home, i usually replace the machine before it has a chance to get old and flaky.

    1. Re:Just replace it. by Trahloc · · Score: 1
      That works awesome for the corporate world. But last I checked, friends and family don't have a pool to draw from, which you'd know if you had at least read the first couple of words of the summary:

      "Over the years I have repaired my own, family, and friends' PCs many, many times".

      I know RTFA is too much to ask on other articles but RTFS's first sentence on askslashdot can't be THAT much ... can it?

      --
      The Goal: A long simple life filled with many complex toys.
    2. Re:Just replace it. by lukas84 · · Score: 1

      So they won't get a replacement machine, but it's the same thing. Call up the manufacturer, have him replace everything, and then restore from their last stable backup.

    3. Re:Just replace it. by vxvxvxvx · · Score: 1

      What if he is the manufacturer? If he's building the machines for friends & family he can't simply call someone else up to replace everything. Or perhaps someone else built the machine from components. The fix is the same, replace whatever is broken, but the question is how do you determine which thing is broken. (This is a good reason not to build machines for friends and family.)

    4. Re:Just replace it. by gd2shoe · · Score: 1

      You still don't seem to get it. Friends and family only rarely ask you to fix a machine that's still under warranty. More often you wind up diagnosing / replacing the broken part yourself, and sending them on their way.

      (often the 1-year warrantied hard drive that gives out at 13 months; people aren't going to replace their computer every year because of that.)

      --
      I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    5. Re:Just replace it. by Trahloc · · Score: 2, Interesting

      So your solution is to pawn off the problem to someone else? Either you're an ancient tech near retirement who is bored to tears with what he does or you never really loved this stuff to begin with. Unless I take glee in the idea that a particular individual's machine is broken because I despise them, I'll help utter strangers fix their stuff just because it's *fun* to figure out a problem. Helping the other guy and gaining their gratitude is a bonus. And yes, I've been doing this for long enough that it isn't something I'll "grow out of".

      --
      The Goal: A long simple life filled with many complex toys.
    6. Re:Just replace it. by lukas84 · · Score: 0, Flamebait

      No, it's just that i've given up on trying to solve issues that are utterly impossible to figure out, because you're basically just guessing what the issue could be, based on your experience.

      In my job, i've learned that this does not pay - fixing an out-of-warranty machine for 185 CHF per hour is _not_ something a customer will pay for - replacing the machine is cheaper and gets you a new one with 3 years of warranty.

      Of course there are still friends and family, but i've stopped building machines for them from parts since i've got out of my apprenticeship. They'll expect instant and free support for every issue they have, so my recommendation is usually to get a machine from a local shop where they can annoy someone else.

      The same goes for many software issues - sure, if i have a strange issue on one of my machines, i'll usually spend a few hours on trying to resolve, just to satisfy my curiosity. The same also goes for servers at work.

      But if i have a non-reproducible problem on a client machine, replacing it with a swap machine and a fresh OS image immediately fixes the users issue and costs less money.

      Add to that that a lot of hardware has been replaced by laptops, where you can do very little in case of issues, since replacement parts are fuck expensive and maintenance manuals sometimes hard to come by, depending on the manufacturer.

      Also, most of the client machines at work consist of very few components, and fixing out of warranty machines makes little sense - a new ThinkCentre M58 costs around 1000$ - getting a replacement mainboard for an old A51 or such costs around 200$, plus labour, and if you're unlucky the problem wasn't the board but the psu, cpu or memory, and you'll need to order more parts and invest more work.

      But hey, maybe i'm just too negative about this. Maybe you can enlighten me how you can sort out these issues.

    7. Re:Just replace it. by jp10558 · · Score: 1

      Yea, and basic Dell Precision Workstation T3400s are ~$650... Even less reason to deal with them out of warranty.

      --
      Opera, Proxomitron-Grypen,GPG 0x0A1C6EE3
    8. Re:Just replace it. by bahstid · · Score: 0, Troll

      Maybe if you got your head out of the clouds and realised that 185 CHF (~180USD) is what the vast majority of people on this planet would regard as a pretty reasonable WEEKLY wage, you wouldn't think that advice like "well it's only $1000 to replace, so just throw it away" adds anything to this conversation. For a significant percentage of the Slashdot demographic maybe we could call it daily, but that doesn't really change the issue. I wonder if you have ever considered what happens to the thrown away item, which is likely to be 98% functional, even though it's an antique 3 year old piece of hardware. You might consider that there are billions of people out there (not only in some third world dustbowl) for whom this would be a treasured item and who might be interested to find an easier (and more economical) way to regain that 2%. Believe it or not, some of us are even interested in salvaging that last 5% from really broken boxes, before we add them to the massive stinking junk-pile of this disposable culture.

      "But hey, maybe i'm just too negative about this. Maybe you can enlighten me how you can sort out these issues."

      For you I would suggest taking a year of your life to sort out these issues and go somewhere. Go wander the earth for a bit. Go see undeveloped, developing and developed countries. Meet normal people. Rich ones, poor ones and ones doing ok. Do some work that you aren't trained in. Escape your bubble. When you get back home I can pretty much guarantee that the only negativity you will feel is toward your old attitude and those that still share it. You will suddenly have a great appreciation for the immense privilege you live in and be in awe of the planet around you. And maybe even helping out a less knowledgeable friend won't be just an annoyance anymore.

    9. Re:Just replace it. by lukas84 · · Score: 0, Troll

      I wonder if you have ever considered what happens to the thrown away item, which is likely to be 98% functional

      It gets recycled by SWICO http://www.swicorecycling.ch/.

      The cost for that is included into the price of buying a new device.

      For you I would suggest taking a year of your life to sort out these issues and go somewhere.

      Ah well, unfortunately i'm not old money and can't afford a year without a job.

    10. Re:Just replace it. by Anonymous Coward · · Score: 0

      Wow, another idiot...

  6. How to test? by girlintraining · · Score: 3, Insightful

    Well... typically you find the fault by using an application which stresses one of those components far more than any other and then seeing if the failure condition you're observing occurs more often. This is just basic troubleshooting, it's not even specific to computers.

    --
    #fuckbeta #iamslashdot #dicemustdie
    1. Re:How to test? by Etylowy · · Score: 1

      For RAM it's fairly easy - 2-3 different data manipulation methods used by memtest and you know if there is an issue. With CPU it's WAY more tricky - example: I once had a CPU (it was some kind of old AMD, Duron I think) which crashed every single time I used LAME to encode wav to mp3. I have failed to find any other software that would crash every single time (or even often), but the system was generally less stable. As you can see sometimes it is not enough to simply stress the CPU with a generic app - that's why I was asking if there was software designed to test CPUs, just like memtest is designed to test RAM and nothing else.

    2. Re:How to test? by gd2shoe · · Score: 1

      I hear you. I too have often wished for something like this. (something that would test individual pathways and as much of the instruction set as feasible, preferably all of it.) Stress testing is still a good idea, but it should be done in tandem with real testing.

      --
      I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    3. Re:How to test? by Khyber · · Score: 1

      "For RAM it's fairly easy - 2-3 different data manipulation methods used by memtest and you know if there is an issue."

      This isn't entirely accurate. Memtest won't tell you the operating speed of the memory module. It's quite common to have a memory module pass every Memtest86+ test and then go to a hardware-based RAM checker only to find out the speed starts at the proper MHz range then drops by half or more - bad RAM chip.

      Memtest will not catch that and sometimes that is the ONLY way to diagnose a faulty RAM module.

      --
      Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
  7. Preventative Medicine - get a UPS by jackchance · · Score: 4, Informative

    Most home computer hardware failures come from "brownouts".

    If you notice that your lights dim a little bit when your fridge compressor or AirCon comes on, that is a recipe for a computer failure. Spend $50 and get a UPS.
    Btw, i noticed that my linksys wifi router was also extremely sensitive to brownouts. It would get funked up and need to be power cycled. Plug it into a UPS, no more wifi problems either.

    I learned this the hard way when i moved to an old building in the east village of NYC and had 3 motherboards/cpu fail within a 3 month period.

    --
    1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765
    1. Re:Preventative Medicine - get a UPS by lukas84 · · Score: 2, Insightful

      That's not an Online UPS, so it won't protect against all grid issues. And Online UPS are expensive and noisy.

    2. Re:Preventative Medicine - get a UPS by a09bdb811a · · Score: 2, Interesting

      If you notice that your lights dim a little bit when your fridge compressor or AirCon comes on, that is a recipe for a computer failure.

      Why? Doesn't the computer's PSU have enough juice in it to survive a quick dip in voltage? Besides, almost all PSUs are rated ~90-260V, so I always assumed if it dips from 230V, it won't matter.

      Occasionally my lights dim but I don't seem to have had problems. I'm still waiting for my decade-old P3 to die so it can be replaced by an Atom board, but the darn thing keeps on running.

    3. Re:Preventative Medicine - get a UPS by Etylowy · · Score: 1

      Good power supply should handle it - maybe not as well as UPS but will greatly contribute to general system stability.

    4. Re:Preventative Medicine - get a UPS by The+Grim+Reefer2 · · Score: 3, Informative

      Most home computer hardware failures come from "brownouts".

      If you notice that your lights dim a little bit when your fridge compressor or AirCon comes on, that is a recipe for a computer failure. Spend $50 and get a UPS.
      Btw, i noticed that my linksys wifi router was also extremely sensitive to brownouts. It would get funked up and need to be power cycled. Plug it into a UPS, no more wifi problems either.

      I learned this the hard way when i moved to an old building in the east village of NYC and had 3 motherboards/cpu fail within a 3 month period.

      What you really need in the case you describe is a good line conditioner. I didn't look at the 'UPS' you mentioned, but many in that price range are not a true UPS and will still allow for under voltage to occur, albeit for a shorter period if you're lucky.

    5. Re:Preventative Medicine - get a UPS by dbs11 · · Score: 1

      It sounds like you may have a bad neutral. In the US, we get two hots and a neutral from the electric company into a building. From each hot to neutral is 120 volts for receptacles and the like; across both hots you get 240 volts for heavy loads like stoves, dryers, and central air units. If the neutral opens up the voltage doesn't divide evenly. It will sag on the more heavily loaded hot leg and soar on the other. You notice this when you switch on heavier loads like a refrigerator or toaster. Open, or half-open neutrals are not rare. The connections outside can corrode due to aging. Old boxes, especially in a cellar or other damp location, are another culprit. We had this happen due to an incompetent electrician. He replaced the circuit panel and forgot to tighten the neutral screw. The kitchen lights got super bright when we turned on the one of the stovetop elements, and dimmed when we turned on a second. Among other things we lost the doorbell transformer, the garage door opener, and a few digital clocks. Settlement was a lot of fun. The real estate agent had brought this guy in to replace the box for a finicky buyer. The damages came out of their commission. They didn't argue much - it was their electrician, and he could've burnt down my house.
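      The uneven split described above is just a voltage divider: with the neutral open, the loads on the two legs end up in series across the full 240 volts. A small sketch, assuming purely resistive loads with fixed resistance (real lamps and motors are not) and made-up load ratings:

```python
def resistance_from_rating(watts, volts=120.0):
    """Resistance of a resistive load from its nameplate rating (R = V^2 / P)."""
    return volts ** 2 / watts

def leg_voltages(r_leg_a, r_leg_b, v_total=240.0):
    """Voltage across each leg's load when the neutral is open (series divider)."""
    current = v_total / (r_leg_a + r_leg_b)
    return current * r_leg_a, current * r_leg_b

# HYPOTHETICAL loads: 100 W of lighting on leg A, a 1200 W toaster on leg B.
lights = resistance_from_rating(100)
toaster = resistance_from_rating(1200)

v_lights, v_toaster = leg_voltages(lights, toaster)
print(f"lightly loaded leg: {v_lights:.0f} V")
print(f"heavily loaded leg: {v_toaster:.0f} V")
```

      With a fully open neutral and these example loads, the lightly loaded leg soars well above 120 volts while the heavy one sags, which matches the super-bright kitchen lights. A half-open neutral would give a milder version of the same effect.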

    6. Re:Preventative Medicine - get a UPS by gd2shoe · · Score: 1

      I doubt it was the dips that killed your equipment. More likely, it was the spikes on the line that shortened their life. (same crummy electric grid that caused your brown outs) Of course, each dip can be accompanied by a spike as the power recovers. As the other poster here mentioned, it's not a matter of keeping power to your devices, it's a matter of conditioning the power that's coming in.

      --
      I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    7. Re:Preventative Medicine - get a UPS by Anonymous Coward · · Score: 1, Informative

      Why? Doesn't the computer's PSU have enough juice in it to survive a quick dip in voltage?

      No. Off-the-shelf computers from the big vendors tend to select the cheapest, lowest-rating power supply they can find. And since it's the cheapest the power supply vendor may additionally cut corners. A *good* power supply? A little brownout is no problem. Most PCs do not have a good power supply.

      I'm still waiting for my decade-old P3 to die so it can be replaced by an Atom board, but the darn thing keeps on running.

      From my experience at a computer surplus, P3s and below have been very reliable. VERY reliable. Higher-end (2.8GHz+ or so) P4s have exhibited increasing rates of blown motherboard caps and power supply failures, and the Pentium Ds we are starting to get have had VERY high failure rates.

                Anyway, my burn-in method won't help you. But at the University surplus I work at, we have an automated netboot Ubuntu installer. We *could* basically ghost it on, but the net installer works out the ethernet, hard disk, CPU and RAM pretty hard -- it has actually found many machines (out of the ~10,000 a year we get through) that have no apparent blown caps (*cough* *GX270* *cough*) but are nevertheless unstable. This does not help narrow down the fault, but it does narrow down whether it's real or whether it IS Windows and/or drivers.

                  Power supply -- check the BIOS hardware monitor, and if it doesn't have one, you'll have to have a voltmeter with you. I've seen power supplies where the voltage sags; it'll run but crash randomly. In reality, I have not checked the power supply very frequently, as the below detects most faults.
                  CPUburn -- this'll exercise the CPU.
                  Memtest86 -- memory.
                  If it doesn't crash with these going, then run something video-card-intensive. If it then crashes it could be the card unstable, mobo unstable (either not supplying the card enough voltage, or other problem...), or faulty power supply (sagging under load of the video card perhaps? You did test this right?) Or it could be drivers of course.
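      For the power supply step, the voltmeter (or BIOS) readings can be compared against the nominal rails. A minimal sketch follows; the 5% tolerance is an assumption based on commonly quoted ATX guidance, and the readings are invented, so check the figures for your own supply.

```python
NOMINAL_RAILS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}
TOLERANCE = 0.05  # ASSUMED allowed deviation: 5% either way

def check_rails(readings, tolerance=TOLERANCE):
    """Classify each measured rail as 'ok', 'out of range', or 'not measured'."""
    report = {}
    for rail, nominal in NOMINAL_RAILS.items():
        measured = readings.get(rail)
        if measured is None:
            report[rail] = "not measured"
            continue
        deviation = abs(measured - nominal) / nominal
        report[rail] = "ok" if deviation <= tolerance else "out of range"
    return report

# HYPOTHETICAL readings taken while the machine is under load.
readings = {"+5V": 4.9, "+12V": 11.2}
for rail, status in check_rails(readings).items():
    print(rail, status)
```

      Take the readings while the CPU and video card are being stressed, since a supply that looks fine at idle may only sag under load.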

    8. Re:Preventative Medicine - get a UPS by Anonymous Coward · · Score: 1, Informative

      Your observation is correct: Modern switching power supplies don't care much about voltage, as long as it's in a certain range: They simply draw more current when the voltage is lower. It's not the brown-out as such which kills the computer but the transients which go with it. Power supply quality varies. Good power supplies can bridge longer drop-outs and withstand stronger voltage spikes than others. It speaks for a PSU when the computer keeps running while a short drop out turns the TV off, for example. A PSU like that most likely doesn't need a UPS to protect it from bad mains.

    9. Re:Preventative Medicine - get a UPS by xaxa · · Score: 1

      Interesting...

      In my parents' house, turning on the 7kW electric shower briefly dims the lights. Do they have a problem with the neutral? The house is in the UK, so the electricity supply to the building is 230V single phase (IANAE).

    10. Re:Preventative Medicine - get a UPS by Anonymous Coward · · Score: 0

      Wait a minute; hold up. A 7 thousand watt shower?? What are you washing with that thing? An elephant?

      Imagine a beowulf cluster of those... amiright?

    11. Re:Preventative Medicine - get a UPS by Anonymous Coward · · Score: 0

      How long does it take to heat a glass of water in a 1kW microwave oven? A 7kW flow heater is not out of the ordinary.

    12. Re:Preventative Medicine - get a UPS by adolf · · Score: 1

      Maybe. The loose neutral issue is a real, serious problem, as you've seen. It is something that I troubleshot and fixed myself in my current house, blessedly before anything expensive happened.

      Brownouts can also be caused by one or both hot wires being bad (resistive) somewhere along the way. The symptoms are different from a loose neutral connection in that the lights on other circuits don't brighten at the same time as the brownout occurs.

    13. Re:Preventative Medicine - get a UPS by Teun · · Score: 1
      Although the UK allows rather unusual wiring for high Amperage circuits (the reason you have fused plugs) I doubt this heater runs on a 32 Amp fuse.

      Here in The Netherlands these heaters have never been popular, natural gas is half the price, but those that I've seen were always 3-phase.

      A heating element doesn't really have an increased start current like an electric motor, which would cause a flicker, unless you mean a semi-permanent dimming of the lights while you're running your shower.

      In the latter case there might be a bad connection; those get hot and can cause a fire, especially at these currents. So stick a Volt meter in a socket and check! (IAAE)
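      For reference, the current such a heater draws is simple arithmetic (I = P / V). A quick sketch using the 7kW figure from the parent comment, treating the shower as a purely resistive load:

```python
def load_current(power_watts, supply_volts):
    """Current drawn by a resistive load: I = P / V."""
    return power_watts / supply_volts

for watts, volts in [(7000, 230), (7000, 240)]:
    print(f"{watts} W at {volts} V -> {load_current(watts, volts):.1f} A")
```

      Either way it works out to roughly 30 amps on a single phase.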

      --
      "The likes of Facebook and WhatsApp are free to those whose privacy is of zero value."
    14. Re:Preventative Medicine - get a UPS by Anonymous Coward · · Score: 0

      First of all, please remember that in the United States, computers are almost always run on a 110 v source. Also, many older PSUs are not rated for 90-230, but are rated for more like 100-130 and 210-250, depending upon the position of a switch.

    15. Re:Preventative Medicine - get a UPS by Thoughts+from+Englan · · Score: 1

      I would suggest you get this checked out by a professional. 7kW is a big load (3 bar electric fire = 3kW in case you weren't aware) but I wouldn't expect the supply into the house to dip when it is switched on, which suggests there may be a problem elsewhere - at the very least contact a qualified sparkie for an opinion. (best of course if they are a friend as the advice is more likely to be straight)

      --
      That was supposed to be "Thoughts from England" ... Oh well.
    16. Re:Preventative Medicine - get a UPS by xaxa · · Score: 1

      Although the UK allows rather unusual wiring for high Amperage circuits (the reason you have fused plugs) I doubt this heater runs on a 32 Amp fuse.

      It is on a separate circuit with a 30A fuse (perhaps it is 6.5kW, I don't remember). I don't think I've ever seen a UK house with a 3-phase supply.

      Here in The Netherlands these heaters have never been popular, natural gas is half the price, but those that I've seen were always 3-phase.

      Unfortunately, when it comes to British housing there's far too much stuff that's cheap in the short term (like only installing a cold water pipe to the new electric shower), or inefficient because the person paying for the equipment doesn't pay the operating costs (like far too many rented properties without double glazing, high-efficiency appliances, or decent insulation).

      So stick a Volt meter in a socket and check! (IAAE)

      I shall investigate next time I'm there (it's been this way for ~12 years, I'll risk another couple of months).

    17. Re:Preventative Medicine - get a UPS by Anonymous Coward · · Score: 0

      That's why I like my PSU. It can actually run for about a second without power, no problem. It comes in quite handy in a country frequented by powerdips/brownouts.

    18. Re:Preventative Medicine - get a UPS by Teun · · Score: 1
      The 30 Amp is probably correct, this was built in the days of 240 Volt supply.

      You are right with your observation of short term solutions, I'm still amazed by the water heaters without insulation, someone made good money selling after market jackets :).

      The worst I've seen in a UK installation was two wires in parallel, theoretically enough copper but in reality one of them will carry more current than designed for and thus overheat.

      --
      "The likes of Facebook and WhatsApp are free to those whose privacy is of zero value."
    19. Re:Preventative Medicine - get a UPS by Raedwald · · Score: 1

      Or move to Europe, where it seems we do not have home wiring and power companies as poor as they are in the US. I've been told that not all US homes have a ring-main.

      --
      Ne mæg werig mod wyrde wiðstondan, ne se hreo hyge helpe gefremman.
  8. Tools by Anonymous Coward · · Score: 1, Informative

    CPU:
    Prime95 (Step 2): http://www.mersenne.org/freesoft/#newusers ... Blend test for memory+CPU stability, Small FFT for CPU
    LinX: http://www.softpedia.com/get/System/Benchmarks/LinX-benchmark.shtml

    Video Card:
    3dmark: http://www.futuremark.com/benchmarks/

    When testing the video card, listen for high-pitched squealing (power issue), and watch for overheating and symptoms like white dots appearing at random. This is not a test tool but will put some stress on the card.

    1. Re:Tools by lukas84 · · Score: 2, Informative

      FurMark (http://www.ozone3d.net/benchmarks/fur/) is better suited for stressing your GPU, and it's also free.

  9. Built in Self Test by camgirlshide · · Score: 1

    Many machines, like my Dell notebook, have a built-in self test. On the machine I'm typing on now (Dell Inspiron 1501), if you go into the self-test mode (instructions to enable it are on the splash screen) you can pick one of two tests: a quick 15-minute test or one that runs for several hours. I assume this self test is the same one Dell runs before they let a machine leave their factory. I have also seen their techs use it to determine which piece of hardware they should send you if they believe there is a hardware problem. Unfortunately, I don't think it is foolproof. I'm pretty sure the hard drive is going on this laptop even though I don't see anything to indicate that on the SMART status or the self-test screens. It's a good start though.

    1. Re:Built in Self Test by gd2shoe · · Score: 2, Interesting

      It is a good start, but no more than that. Those tests are certainly not comprehensive (and should be). On the plus side, they often have your specific hardware in mind, and might possibly catch something that other tools won't. (doesn't happen often, but sometimes...)

      SMART is also not the end-all of hard drive indicators. A drive can pass SMART, and still be on the way out. I've found (for those familiar with Linux) that a dd from the hd to /dev/null will often spit out errors on a drive that's getting ready to fail. A linear read is far faster than a read/write surface scan, albeit not as thorough. (can be run from knoppix live CD)
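A rough Python sketch of the same linear-read idea, for anyone who would rather script it than run `dd if=/dev/sdX of=/dev/null` by hand. The function name, the block size, and the skip-one-block-on-error behaviour are my own assumptions, not anything dd guarantees; the device path is a placeholder and reading a raw device normally needs root.

```python
import os

def linear_read(path, block_size=1024 * 1024):
    """Read `path` front to back, discarding the data.

    Returns (final_offset, errors), where errors is a list of
    (offset, message) tuples for every block that raised an I/O error.
    """
    offset = 0
    errors = []
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            try:
                chunk = os.read(fd, block_size)
            except OSError as exc:
                # Record the failing offset, then skip one block and carry on
                # (assumption: a whole block is skipped per error).
                errors.append((offset, str(exc)))
                offset += block_size
                os.lseek(fd, offset, os.SEEK_SET)
                continue
            if not chunk:
                break  # end of file or device
            offset += len(chunk)
    finally:
        os.close(fd)
    return offset, errors
```

Something like `linear_read("/dev/sdX")` (placeholder path) would then give you a list of offsets that failed to read. A clean pass does not prove the drive is healthy; it only means every block was readable this time.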

      --
      I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
  10. Microscope by grapeape · · Score: 3, Informative

    I like the Microscope products... their newest version, Microscope Duo, boots off of a USB stick. For machines that don't boot at all they also have a diagnostic card; it's basically a PCI card with an LED readout that gives a series of POST codes that can help diagnose whether it's the board, a card, memory, etc. They can be found at http://www.micro2000.com/

    The handiest piece of diagnostic gear I use is actually a simple power supply tester. You would be amazed how many systems that appear to power up are actually suffering from a dead -5 or +5 rail on the power supply. Many tend to think that if the fan's spinning the power supply is OK, but that's often not the case. The best part is they are cheap... around $10 for a basic one.

    1. Re:Microscope by Anonymous Coward · · Score: 0

      Man, these guys still around ? I remember them from way back in the dark days of MS-DOS and Norton Utilities to diagnose hardware...

    2. Re:Microscope by fjin · · Score: 1

      I do consumer computer repairs as my work. One observation is, that powersupply lasts about 5 years - when it is in general-purpose non-gamer-box.

    3. Re:Microscope by Immrama · · Score: 1

      I have used this for years. It is great to test a machine so you don't have to spend time swapping parts. It gives an answer fast.

    4. Re:Microscope by Cylix · · Score: 1

      A power supply tester is mostly useless. The basic features of any modern motherboard include sensors which display the voltage readings.

      A power supply tester simply identifies whether or not an unloaded voltage source is within the 5% variance. It would have to be in extremely poor condition to not pass this test (sic, obviously failed and identified from the same common tools everyone has access to).

      In many circumstances I find it necessary to apply load to a power supply in order to quickly identify the fault.

      Again, testers are nice, but the reality is you have access to a tester which can offer you information on both the performance of the power supply and the motherboards own voltage regulators and step down components.

      --
      "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
    5. Re:Microscope by Anonymous Coward · · Score: 0

      This is my first post on Slashdot though I've been reading for years. What this guy says is absolutely the case on both counts. I have a 15 dollar POWMAX PSU tester ( http://www.dansdata.com/quickshot018.htm ) that has been worth its weight in gold.... X10. All kinds of twitchy wonky behavior can be traced to an undervolted or dead 5 volt rail which BTW 99% of the time is what's wrong. A red 5v rail light and a red "bad" light tells me it's the PSU and it's never been wrong in the several dozen machines I've used it on.

  11. Testing video card? by Anonymous Coward · · Score: 1, Interesting

    ATITOOL

    It's not just for ATI. Has a card stressing feature.

    Another program to note is something like: speedfan

    I can't count how many times a problem was directly caused by high temperatures on the cpu, gpu, etc.

    And one more tool which I keep in my toolbox:

    Spacemonger

    A quick run of it gives you a visual representation of the hard drive. I've fixed several problems by seeing that crap needs to get deleted.

    Good luck!

  12. I wish you had asked this question 2 weeks ago... by Anonymous Coward · · Score: 1, Interesting

    I've slowly replaced every component in my system due to random crashes. Memory, hard drives, motherboard, power supply, video card and finally this morning the CPU. Each with a fresh OS install.

    I'm left with either the case, or the DVD drive being the culprit - if it is the DVD drive, I'm gonna kill someone - most likely me...

  13. Hiren's... by Zakabog · · Score: 3, Insightful

    Hiren's BootCD contains a bunch of different utilities for doing just this. Plus it's bootable, so if you can't get into the OS you can still use the CD. It can do just about anything you'd need to in order to diagnose and repair a machine. You just gotta find it (usually the pirate bay or other torrent sites are a good place to look.)

    1. Re:Hiren's... by dogfolife69 · · Score: 1

      The only problem with Hiren's is that the ability to use certain tests on certain hardware configurations can be hit or miss. I like to use Memtest for RAM, and SeaTools for HD-related errors... I use Knoppix (Linux) for getting data off a working drive where Windows has been messed up to the point of not working, and not repairable.

    2. Re:Hiren's... by DigiShaman · · Score: 2, Insightful

      1. Hiren's BootCD is the only thing you need to diagnose hardware, repair, and transfer data. You can even make a bootable USB thumb drive instead of using it as a CD. It's the gift of the Gods!

      2. Be aware that just about every program in this collection is pirated. If you are making a profit through using this boot CD, purchase the F-ing programs by themselves!!! It's one thing to pirate software, it's quite another to ride off the backs of others' work.

      --
      Life is not for the lazy.
    3. Re:Hiren's... by Nimey · · Score: 1

      Perhaps I'm blind, but I can't find a download link on that page, just a bunch of wank detailing exactly what's on the semi-mythical ISO.

      Where should I be looking?

      --
      Hail Eris, full of mischief...

      E pluribus sanguinem
    4. Re:Hiren's... by eprimetime · · Score: 1

      Bittorrent is your friend there - he cannot host it because as others have said, a lot (not all) of the tools are pirated - a decent tracker or even a google search will help you.

    5. Re:Hiren's... by DigitalCrackPipe · · Score: 1

      That looks almost exactly like The Ultimate Boot CD with the addition of a few pirated programs. I suggest having a copy of UBCD around - it's handy, free, and easy to obtain.

    6. Re:Hiren's... by Anonymous Coward · · Score: 0

      It's one thing to pirate software, it's quite another to ride off the backs of others' work.

      How are these two things different?

  14. Hardware tester by iammani · · Score: 2, Informative

    When you no longer trust your CPU/motherboard, I am afraid the only option to test them would be a hardware circuit (which can make decisions using its own CPU) specifically designed for your motherboard/processor, which I believe only the manufacturer will have access to. If you are looking for a more practical solution, the only way is to eliminate the possibility of all other hardware failing (by simply removing it or using it on a good machine) and assume it must be a CPU/motherboard issue (which means you may have to junk them both and buy new ones). And don't forget to test your power supply unit (not checking it on my old PC cost me a hell of a lot of hours).

    1. Re:Hardware tester by snadrus · · Score: 1

      Lately buying the latest special combo beats web purchases of old hardware for testing in all but the most special cases. In those cases either have a warranty, seller test, or expect trouble if it's bleeding edge and use forums.

      --
      Science & open-source build trust from peer review. Learn systems you can trust.
    2. Re:Hardware tester by Cylix · · Score: 1

      Most hardware is 'dumb' and does not have fault latches.

      This is a cost that was avoided in order to make cheap motherboards and system components.

      Hardware troubleshooting is in no form about trust. It is applying a series of logical steps designed to isolate and repair failures.

      --
      "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
  15. Best Memtest I've Used Is... by Anonymous Coward · · Score: 0

    Age of Conan

    Believe it or not. I had some crap memory (OCZ Reaper) which other than a few random crashes, *mostly* worked. However, it would consistently corrupt AoC's patches. If AoC decides to re-download a GB of patches on you over and over, check your memory. I've since replaced the memory on this machine and had no problems since. Sadly I've stopped playing AoC. Oh well.

    1. Re:Best Memtest I've Used Is... by lukas84 · · Score: 1

      I just hope at some point people will decide that ECC should be mandatory for everything.

      GPUs like ATI's HD5800 series already employ memory with ECC.

      2GB sticks are now the norm, systems with 4GB or 6GB of RAM pretty standard. So ECC would make a lot of sense. But we still don't see it anywhere, even though by now all modern hardware is capable of it (though Intel disables it on the consumer badged versions).

    2. Re:Best Memtest I've Used Is... by uglyduckling · · Score: 1

      Mac Pro machines use ECC RAM. Another reason why they're so expensive, but worth it if your creative / scientific business depends on the computer working properly.

    3. Re:Best Memtest I've Used Is... by lukas84 · · Score: 1

      They have to, all Xeon based machines require it.

      Just like all other Xeon based workstations.

  16. SMART for dying hard drives by Wrath0fb0b · · Score: 4, Informative

    http://sourceforge.net/apps/trac/smartmontools/wiki is great for finding out what the drives think about their own health. Things to look out for are spin-retry counts (which lead to that annoying 2-5 second freeze) and high reallocated sector counts. Never, never, never use chkdsk to attempt to fix a broken hard drive. With the robustness of modern journaling file systems (HFS, extN, NTFS), storage errors are almost always hardware errors. Running chkdsk stresses the drive just as it's failing and usually pushes it over the edge -- and then users complain that you can't recover their data.
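If you want to watch those two attributes automatically, here is a minimal Python sketch that scans the attribute table printed by `smartctl -A`. It assumes the usual ten-column layout with the raw value in the last column; the function name and the "anything above zero is worth a look" threshold are my own assumptions, and some drives report raw values in other formats that this will simply skip.

```python
WATCHED = ("Reallocated_Sector_Ct", "Spin_Retry_Count")

def flag_smart_attributes(report, watched=WATCHED):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0.

    `report` is the text output of `smartctl -A <device>`.
    """
    flagged = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in watched and raw.isdigit() and int(raw) > 0:
            flagged[name] = int(raw)
    return flagged


SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   098   098   036    Pre-fail  Always       -       12
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       9123
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
"""

print(flag_smart_attributes(SAMPLE))
```

The sample table is made up for illustration, not taken from a real drive.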

    1. Re:SMART for dying hard drives by Anonymous Coward · · Score: 1, Informative

      This research paper says that modern journaling file systems are not as robust as you might think: http://pages.cs.wisc.edu/~crubio/includes/pldi09.pdf

    2. Re:SMART for dying hard drives by Anonymous Coward · · Score: 0

      SMART is poorly implemented in Linux, at least on the hardware most ISPs use for dedicated servers.

  17. prime 95 by LordKronos · · Score: 2, Interesting

    Prime 95 is a good test of CPU/RAM, as well as to see if the system remains stable under peak temperature. It's often used to burn in overclocked machines.

    1. Re:prime 95 by poly_pusher · · Score: 1

      With Prime 95 you can test your CPU, Memory or Both. It's a great tool to find out which piece of hardware is throwing errors. I use it for overclocking but have also used it to identify faulty pieces of memory.

    2. Re:prime 95 by VoltageX · · Score: 1

      IntelBurnTest (using Linpack) will stress the CPU far more than Prime95 ever could. Useful for heatsink testing too!

      --
      "Anonymous could not immediately be reached for further comment." - International Business Times
  18. Overheat by gd2shoe · · Score: 4, Informative

    That's a marginal idea at best, but a common one.

    While the technique of blasting a processing unit to see how it behaves at maximum temperature will sometimes find a faulty unit, many faults are not temperature related, and will not show up on this test. It's fine that you brought it up here, but something that both heats the CPU/GPU and tries to test as many pathways / as much of the instruction set as possible would be far more useful. (cf memtest86+ for RAM)
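To make the "check the answers, not just the temperature" point concrete, here is a toy Python sketch of the repeat-and-compare idea: run a deterministic computation many times and count any result that differs from the first. This is only an illustration under my own assumptions (SHA-256 as the workload, the function name, the buffer size); it exercises a tiny slice of the CPU and is in no way a substitute for a real tool.

```python
import hashlib

def consistency_check(passes=50, size=1 << 20):
    """Hash the same buffer repeatedly and count results that disagree.

    On healthy hardware every pass must produce the same digest, so any
    non-zero return value points at a computation or memory fault.
    """
    data = bytes(i % 251 for i in range(size))  # deterministic test pattern
    reference = hashlib.sha256(data).hexdigest()
    mismatches = 0
    for _ in range(passes):
        if hashlib.sha256(data).hexdigest() != reference:
            mismatches += 1
    return mismatches
```

A return value of 0 tells you very little; a non-zero value tells you a lot.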

    --
    I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    1. Re:Overheat by eggy78 · · Score: 1

      many faults are not temperature related, and will not show up on this test.

      I've actually had a lot of well-performing and well-cooled hardware that just doesn't work right. I recently gave up on a video card because it was randomly bringing the entire system to its knees (I suspect the drivers were waiting on a lock or something), and would ultimately lead to a hard crash. If the system was never allowed to sleep, it would be fine with whatever load it was subjected to. Once it had come out of standby, all bets were off as to whether the system would slow, hang, or crash and how long it would take to get there.

      Overall it just seems that I hit upon a specific configuration/code path that is probably a driver issue, but definitely not temperature related and just as problematic as a hardware issue if all drivers behave similarly (and hard to use an overclocking tool to test). I replaced it with a (...ok, *the*) different brand of GPU and everything has been working fine since.

    2. Re:Overheat by gencha · · Score: 1

      Actually, OCCT includes a feature to run tests similar to memtest86 on the GPU.

    3. Re:Overheat by ThurstonMoore · · Score: 1

      Can I have the old GPU?

  19. Buy Dell or HP by dcheest · · Score: 1

    That's exactly why I buy Dell or HP PC's. They both offer complete hardware diagnostic programs on a bootable CD or a utility partition and they replace the parts that fail the tests if it's under warranty.

  20. PSU by gd2shoe · · Score: 5, Informative

    Oh, and don't forget to check the PSU. When it acts up, it will often appear to be a hardware fault somewhere else in the machine. (often RAM, but can be MB, CPU, GPU...)

    This certainly doesn't answer the poster's question, but it is related and important.

    --
    I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    1. Re:PSU by Daneurysm · · Score: 4, Informative

      I was just about to mention this. I used to work in a mom-n-pop shop, the only one in the area, for a long time.

      I have seen some of the most ridiculous problems that were PSU related. Serial mouse not working, VGA card outputting in B&W, slow and or intermittent performance, HD's that constantly reset (and sound like click of death in the process), new memory being blown, known good memory acting like bad memory, CD-R's that can't burn (or finish burning successfully), software modems that couldn't go off hook, AGP cards crashing, PCI cards crashing, VLB SCSI cards not working at all.

      The list really just goes on and on and on. Software to diagnose faulty PC hardware? Sorry, no thanks. I had tried all manner of diagnostic and test software over the years. Some worked some of the time (mem tests and HD scanners); the rest were borderline useless pieces of crap. Not only that, but because of faulty PSUs (usually overloaded, or just old, or overheating, etc etc etc) I have seen those same programs misdiagnose just about everything.

      Aside from simple sensor reading and verification (of code, built in HW diagnostics, etc) I do not trust 'software based' hardware diagnosis, especially on a PC.

      YMMV.

    2. Re:PSU by mysidia · · Score: 5, Insightful

      Check supply voltages first.

      There's a really fancy test program to do this... it's called a digital multimeter, and it's a piece of hardware with two probes.

      You touch one probe to ground, and then use the other to check all the leads going into MB for supply voltage.

      For desktops that is.

      For servers, the power supplies are generally smart modular units, and you check their voltage outputs in the BIOS screens, or using remote management via BMC: IPMI, iLO, DRAC, or ALOM.
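Once you have the multimeter readings, the pass/fail arithmetic is simple enough to script. This Python sketch assumes the commonly quoted ATX tolerances (5% on the positive rails, 10% on -12V); the table and function name are my own, so check the numbers against your PSU's documentation before trusting them.

```python
# Nominal voltage and allowed fractional deviation per rail.
# Assumption: commonly quoted ATX figures, not taken from this thread.
ATX_RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
    "+5VSB": (5.0, 0.05),
}

def check_rails(readings, rails=ATX_RAILS):
    """Map each measured rail to True (in tolerance) or False (out)."""
    results = {}
    for rail, measured in readings.items():
        nominal, tol = rails[rail]
        margin = abs(nominal) * tol
        results[rail] = (nominal - margin) <= measured <= (nominal + margin)
    return results


# Illustrative readings: the +12V rail has sagged below its lower limit.
print(check_rails({"+12V": 11.3, "+5V": 5.1, "-12V": -11.0}))
```

A steady reading inside the band only rules out a grossly bad supply; it says nothing about brief dips under load.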

    3. Re:PSU by gobbo · · Score: 1

      Mod parent up. A proper multimeter and a power supply voltage chart is the skeptic's answer to all kinds of hardware voodoo.

    4. Re:PSU by robbak · · Score: 4, Insightful

      While that is a good "Bad or Maybe" test, most PSU problems are transient over- or under-voltage conditions, which a DMM is not going to reveal.
      And there are testers that will measure all (or most) of the voltages produced at once - you just plug the ATX cable into the device, and many of them have a pass-through, so you can test the PSU under load. I'd look for one that could flag a transient problem, if it exists.

      Mind you, since writing the above I have looked around for one, and have failed! They all are pretty simple devices that do not detect transients, I could find no pass-through devices, and they all test under very anemic loads. All told, I am not impressed by any of them.

      --
      Prediction for end of Universe #42: Fencepost error in Quantum_bogosort.cpp
    5. Re:PSU by hairyfeet · · Score: 1

      Preach fellow repair man, preach. That is why I always try to keep at least a couple of decent 400 watt PSUs around, as all kinds of hardware "problems" could be traced back to those suckers. At the last shop I worked we had one of those PSU multimeters to test, but frankly I found that to be crap. You'd be surprised how many would test fine when first fired up only to get crappy once it had time to heat up. So with PSUs I'm of the "when in doubt, toss it out" mindset. 400 watt PSUs are cheap, and it is better to give the customer a new one than waste the time dealing with one that may be iffy.

      The other one that is a royal PITA? Bad drive channel, usually caused by problems with the southbridge. I've found that particular problem in PCs where the ID10T...err customer, left their machine plugged in during an electrical storm (usually straight into the outlet, like it was a toaster) and the PC getting surged. The closest I've found to software tools to check for this is using a combo of Diskcheck along with Spinrite. After I've changed out the PSU (naturally) I make a couple of copies of the customers folders that have plenty of files and use Diskcheck to see if the CRC matches. If it don't I have Spinrite look for bad sectors and if it finds the drive is okay you can pretty much figure that you have a bad channel and the board needs to go. This is of course after already doing Memtest on the RAM and stress testing the CPU.
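The copy-then-compare step can also be done with a few lines of Python if the tools named above are not to hand. This is a sketch of the general idea, not of how Diskcheck works: it walks the original folder, computes a CRC32 of each file and of its copy, and lists the files that differ. Function names and chunk size are my own assumptions.

```python
import os
import zlib

def file_crc32(path, chunk_size=1024 * 1024):
    """CRC32 of a file, read in chunks so large files do not fill RAM."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc

def compare_trees(original, copy):
    """Return sorted relative paths that are missing or differ in `copy`."""
    mismatches = []
    for root, _dirs, files in os.walk(original):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, original)
            dst = os.path.join(copy, rel)
            if not os.path.isfile(dst) or file_crc32(src) != file_crc32(dst):
                mismatches.append(rel)
    return sorted(mismatches)
```

Use folders big enough that the copies are not served straight from the OS file cache, otherwise you are testing RAM rather than the drive channel.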

      Another good tool (if you can find a copy) is the Computer Repair Utility Toolkit. This little baby will turn 150Mb of your flash stick into a pretty damned helpful "all in one" kit for removing bugs, finding hardware info, etc. Unfortunately a couple of the FOSS guys didn't like having their tools included in this toolkit (why, since it is free and provided links to their sites I have no clue) so they had it yanked. The only link I was able to find for it is here. This is the V2 or last version, which had even more useful tools than the V1 reviewed above.

      While I can understand the desire that the guy that wrote TFA has in finding an "all in one" that will test all the hardware to hunt down those PITA problems, sadly in my 15 years in PC repair I haven't ever found one. If somebody does know of one please post for all us PC repairmen out here in the sticks.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    6. Re:PSU by hannson · · Score: 1

      This can also be done very quickly by using a ATX power supply tester like this one. It has a LCD screen which shows the voltage for IIRC every connector on your power supply. In use image here: http://www.ocia.net/fullsize.php?filename=32_9.jpg

    7. Re:PSU by Cloud+K · · Score: 1

      Agreed, speaking as a sufferer of now 3 Enermax PSUs with "right, I'm going to die now kthxbai" syndrome. And these are supposedly *good* PSUs. In my case 2 identical systems did this on the same day. Put PSU2 into PC1 and it worked for an extra month - albeit risky - but some motherboards seem to have a greater tolerance of dying PSUs than others.

      But the really crap PSUs that you get bundled with a case etc most likely output all kinds of crap and cause random weirdness and crashing.

      The unfortunate reality of PCs, unless anyone informs me otherwise, is that you sometimes have to gamble (a gambling amount, i.e. uncomfortable) on what experience tells you will *probably* be the problem. Only too often it's the PSU, and to minimise more odd errors the replacement will cost £50-100 (ermm... $50-100 these days). The best you can do is get something that, if it doesn't help, can be reallocated and help you somewhere else.

      FWIW based on experience and after the usual cooling and elimination checks I usually suspect the memory first (and just to complicate matters, memtest86 doesn't always pick up the problem - you have to try another stick). Then the PSU, even if it looks like a motherboard issue (unless the mobo has visibly leaky/bulging caps). Then the motherboard. Then the CPU, which almost never fails.

    8. Re:PSU by fuzzyfuzzyfungus · · Score: 1

      The one problem with PSU testers like that is that they don't do passthrough to allow you to look at the system under load.

      They should be able to detect grossly off-spec voltage conditions, which is certainly good, and some of them might put a touch of load on the PSU; but they won't be of much use if your problem occurs only under a few hundred watts load.

    9. Re:PSU by ajlisows · · Score: 1

      I'm curious, is there any good piece of software that can give you information about how the power supply is performing? At least graph what voltage it is supplying to the motherboard or something like that? Obviously an actual physical multimeter is best, but I will help friends with computer problems remotely (If a buddy is having problems on a Tuesday night and I don't want to drive 30 minutes to his house) and being able to do some troubleshooting without touching the hardware would be nice.

    10. Re:PSU by mysidia · · Score: 3, Insightful

      Yes... but unless you are doing this professionally, or going way out of the way to build a full blown test rig and load bank yourself, the gear required to fully test a PSU anywhere near max load is not worth it to the average person, a spot check with a DMM on the bench or in the PC (if the PC is working) is a good tradeoff, and if there is any question, try replacing the PSU.

      Versus buying a $100,000 Sunmoon or Chroma tester. Or bench Oscilloscope + DC Load generator + Variable AC output gear (for varying input voltages to the PSU under testing). To be honest, all this sort of gear is pretty cool, and would let you even get an idea about how clean the output signal is from the PSU, it's just overkill to do that much testing as end-user for PCs.

      On the other hand... no geeky technology enthusiast should really be without a Fluke DMM or similar piece of gear in their bag of tricks with at least ability to measure emf, current, resistance, and (maybe) LCR. Over the years I've found it indispensable. Measuring electrical characteristics is useful for many things, not just for PSU testing.

      I won't knock the little ATX test products, but they're really no better than a DMM and a big resistor.

    11. Re:PSU by Anonymous Coward · · Score: 0

      Not necessarily true - I have had at least 2 PSUs that didn't kick up a fuss until they were under load. One was even an honest-to-goodness System Killer that appeared to work perfectly (within tolerances) when I tested it with my multimeter. The truth is, until someone invents a testing board (board with voltage displays, all the necessary plugs and resistance that artificially loads the PSU), we'll never be able to test PSUs with 100% accuracy. Or you could make your own with hobby resistors, but that's a lot of effort and it's usually easier to test every other component for fault and then decide it's the PSU by process of elimination.

    12. Re:PSU by gd2shoe · · Score: 1

      I doubt it. I know that some motherboards monitor voltages (and you can check in BIOS), but I highly doubt most of them do. Software can only chart what hardware can check.

      --
      I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    13. Re:PSU by CAIMLAS · · Score: 1

      This.

      You can not diagnose a problem by looking at (several) layers above the cause of the problem. It's like tracing a wire to find where a rat chewed through it, or trying to find the other end of your ethernet cable in a huge snarled tangle of many wires. No. It won't work as an effective means of finding the problem, other than to tell you that there is a problem.

      The above mentioned problems are indeed fairly indepth; I've been doing this whole "computer geek" thing for 15 years now, and I've repaired and diagnosed my fair share of other's PCs (for money or otherwise), not to mention work-related stuff. I've not come across nearly that many -different- symptoms, but I have seen a number of them.

      That said, there tends to be a diagnosis procedure I've picked up (and roughly follow). It goes like:

      1) Verify hardware visually and audibly for proper function. Eg. fans are spinning, heatsink is not dangling, no burst/leaking/bulged capacitors, excessive dust build-up (blow it out to make sure) or scorch marks anywhere within the chassis. This takes, tops, 5 minutes.
      2) If the above checks out Okay, run memtest86 for no fewer than 2 full passes or three hours.
      3) If the above passes Okay, chances are the base hardware is (mostly) fine (eg. board/cpu/ram). It's usually enough to rule them out as the primary fault if there are frequent failures/faults.
      4) Run an intensive/load-based test to see if the problem might be due to power/load and a crap PSU. Of course, doing a little math and looking at the PSU's rating and make should be enough to tell you if it's junk/not doing its job. A multimeter comes in handy here, too. (This would be when you load test something like the graphics card.)
      5) Failing that (rarely) swap out the PSU. It's probably the problem.
      6) If the swap doesn't fix the problem, start yanking cards (or disabling onboard stuff in BIOS) and trying the machine with known-good hardware. If that still fails, well... it's probably the board or some really, really bad mains power.
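The "little math" in step 4 can be as crude as adding up estimated component draw and comparing it with a derated PSU rating. A Python sketch, with loud caveats: the 80% derating factor and every wattage below are made-up illustrative numbers, not measurements or vendor figures.

```python
def psu_headroom(rated_watts, loads, derating=0.8):
    """Watts left over after summing `loads` against a derated rating.

    `derating` is an assumed safety factor: cheap supplies often cannot
    sustain their label rating. A negative result means the PSU is
    probably undersized for the load.
    """
    usable = rated_watts * derating
    return usable - sum(loads.values())


# Hypothetical system: figures are placeholders for illustration only.
loads = {"cpu": 95, "gpu": 150, "board_drives_fans": 60}
print(psu_headroom(400, loads))
print(psu_headroom(300, loads))
```

With these placeholder numbers the 400 watt unit has a little margin left and the 300 watt unit comes out negative, which is the cue to move on to step 5.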

      In my experience, the most likely parts to fail are:
      1) Fans. I hate the damn things, and they've caused many a good-but-warm processor/board/etc. to completely go *poof*, often with the token magic smoke.
      2) PSU. Surge protectors are your friend, but there are a lot of crap PSUs out there, and there's not much that can be done for brown outs and just crap power.
      3) RAM. Not nearly as common as it used to be, but still fairly common.
      4) PCBs. Anything from PCI cards to the mainboard can do it, and if these are the cause, chances are even it's due to foreign matter (dust/dirt in the riser, intermittent grounding, etc.)
      5) CPUs. Provided the failure isn't secondary to the PSU or board going, CPUs don't seem to fail all that often anymore, it would seem. I chalk it up to internal board/CPU sensors for temperature monitoring, plus cleaner power coming into the board from the PSU than used to be the case.

      The most amazing "failure" I've seen, is a system which had an Antec 450 watt PSU, circa 2002 or so - IE, right around when all those bad-cap PSUs and mainboards were being made. For whatever consequence, it was an AMD Athlon XP+ 2000 and had been running in a dusty house, on 24/7, for about 7 years, where the power was prone to frequent brownouts (also, see: bad wiring). Again, the system was rock solid, but when I took the PSU apart to replace the fan when the system was de/recommissioned to me, half the capacitors in the thing were popped/leaked (the electrolyte had long since dried up). Yet, still stable, and it seemed to have very little output variance.

      --
      ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
    14. Re:PSU by ppanon · · Score: 1

      Or you could swap in a known-good PSU.

      --
      Laissez lire, et laissez danser; ces deux amusements ne feront jamais de mal au monde. - Voltaire
    15. Re:PSU by misnohmer · · Score: 1

      You obviously never had to deal with many (or any) power supply issues. If you can diagnose the issue with the multimeter (missing power rails, or too high) you'll likely have other indicators (such as the computer won't start or smoke coming out of something). Most power supply issues show up under certain conditions only (such as specific load pattern) and only occur briefly (sufficiently to cause memory errors, put hardware in half-reset undefined states, etc). You could put all of the voltage rails on some kind of continuous monitor (maybe oscilloscope with triggers set above and below acceptable values) and then run the machine until the undesired behavior occurs. Repeat for each voltage rail. Not a quick thing to do either.

    16. Re:PSU by dov_0 · · Score: 1

      I keep a few 'good' PSUs hanging around and just swap them over with the dubious one. Takes less time than testing with a multi-meter or installing software.

      --
      sudo mount --milk --sugar /cup/tea /mouth /etc/init.d/relax start
    17. Re:PSU by drinkypoo · · Score: 1

      Most modern motherboards have a voltage sensor expressed to the user, and you can read the voltages and graph them over time to find transient failures like that. But as you might imagine, giving a really good test to a power supply is non-trivial. The best thing is still to have one on hand to swap in, if you suspect transient power supply problems.

      Other than that, the only best thing to have is a PCI POST card. But that won't help you with transient problems either, as you surely know :)
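If you do log the sensor voltages over time, picking the transients out of the log is easy to automate. A minimal Python sketch, assuming you already have (timestamp, volts) samples for one positive rail from whatever monitoring tool your board supports; the function name, the 5% default tolerance, and the sample data are all my own assumptions.

```python
def find_excursions(samples, nominal, tolerance=0.05):
    """Group consecutive out-of-tolerance samples into events.

    `samples` is an iterable of (timestamp, volts) for a positive rail.
    Returns a list of (start_time, end_time, worst_voltage) tuples.
    """
    low = nominal * (1 - tolerance)
    high = nominal * (1 + tolerance)
    events = []
    start = worst = last = None
    for t, v in samples:
        if not (low <= v <= high):
            if start is None:
                start, worst = t, v  # a new excursion begins
            elif abs(v - nominal) > abs(worst - nominal):
                worst = v  # track the furthest deviation seen
            last = t
        elif start is not None:
            events.append((start, last, worst))  # excursion ended
            start = None
    if start is not None:
        events.append((start, last, worst))  # log ended mid-excursion
    return events


# Made-up log of the +12V rail: one sag and one spike.
log = [(0, 12.0), (1, 11.2), (2, 11.0), (3, 12.1), (4, 12.9)]
print(find_excursions(log, nominal=12.0))
```

Keep in mind that onboard sensors sample slowly, so short spikes can fall between readings and never show up in the log at all.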

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    18. Re:PSU by Anonymous Coward · · Score: 0

      Yes, but what a multimeter does not capture is faulty caps inside your PSU, or on your mother board, which cause numerous spikes on the supply. The result is a very noisy supply, that can break almost everything. You'll need an oscilloscope to see those.

      Would be an interesting add on for multimeters, a kind of AC coupled AM detector, that measures the envelope of a DC signal.

    19. Re:PSU by fast+turtle · · Score: 1

      Funny that everyone is talking about PSUs and them being too small, since I just helped a friend with a PSU that was Too Big. Yep, that can happen. I'd suggested a nice 550 watt PSU that would have worked quite well (single rail design) and he goes and spends the money on a 1KW unit. Of course his system refuses to even POST as the PSU isn't being loaded hard enough to continue (part of the ATX specs) so it immediately shuts down.

      To make a long story short, I took over a testing PSU (300 watt by FSP) to make sure his board wasn't DOA and lo and behold, it posts with a video card installed.

      Lesson Learned - keep a small PSU on hand for testing DOA board status along with a decent low power Vid Card like a 7300GT and yes it's good nuff to post a board that doesn't have IGP

      --
      Mod me up/Mod me down: I wont frown as I've no crown
    20. Re:PSU by mysidia · · Score: 1

      That's why you leave the PC attached during DMM testing, and you apply full load to it while measuring the voltage, watching carefully for any anomalies at the moment the load is being applied.

      If you're concerned about not seeing the issue, use voltage and current dataloggers.

      Oscilloscopes are really expensive.

    21. Re:PSU by Anonymous Coward · · Score: 1, Interesting

      Places like Maplin or Radio Shack sell PSU testers. You plug one into the PSU and it tells you if it's out of spec.

      But it will not tell you if you've bought a crappy low cost PSU which says it's a 500W PSU when it can really only hold 400W.

      I use a floppy disc test program called TuffTEST-Pro

      http://www.tufftest.com/

      It tests quite a few aspects of a PC.

      * Configuration Function
        o Current Configuration
        o CPU Speed Test
        o MMX Extensions Test
        o BIOS Information
        o BIOS Checksum Test
        o Switch Positions ****
        o System Memory Size
        o Extended Memory Size
        o Expanded Memory Size
        o CMOS Configuration
        o Edit CMOS settings ****

      * Certification Tests (Quick System Check)
        o Abbreviated System Test ****
        o Extensive System Test ****
        o System Board Test
        o Math Coprocessor Test
        o Main Memory Test
        o Extended Memory Test
        o Expanded Memory Test ****
        o Diskette Drives Test
        o Fixed Disk Drives Test
        o Serial Port Input/Output Test ****
        o Parallel Port Input/Output Test ****

      * Parallel Port Tests (LPT1, LPT2 & LPT3)
        o Interface (Signal Loopback) Test (uses TEST Plug) ****
        o Send the ASCII Character Set to a Printer Test ****
        o Echo the Keys Pressed to a Printer Test ****
        o Monitor Handshake and Data Signals While Printing Test
        o Monitor Status Signals from a Printer Test ****
        o Data Test (uses TEST Plug) ****
        o Control Test (uses TEST Plug) ****
        o Interface Status Test (uses TEST Plug) ****

      * Serial Port Tests (COM1, COM2, COM3, COM4 & up to 64 user-defined port addresses)
        o Internal Operations Test
        o External Operations Test (uses TEST Plug) ****

    22. Re:PSU by JakFrost · · Score: 1

      Suspect the Power Supply Unit

      Seeing as people have already mentioned the Power Supply Unit as a likely cause of flaky computer problems and random reboots, I want to expand on why this is the case and how to diagnose these problems on the cheap. While some PSU problems are quick and easy to diagnose by checking for dead, low, or high +3.3, +5, or +12 Volt rails, others are more problematic, such as transient voltage drops that occur randomly, under load, or due to thermal overload. These PSU problems usually happen due to old and aged capacitors that have weakened, leaked, blown, or just plain failed.

      If you have a power supply that is easy to open you can do a quick cursory check by opening it and looking for any bulged, blown, or discolored tops on capacitors (those tall cylinders). Be careful not to touch any of the power leads inside the power supply since the capacitors hold charges even when disconnected from power and some of the discharges might be dangerous or deadly.

      Digital MultiMeter Voltage Readings and Load Generation Programs

      First, you'll need a Digital MultiMeter (DMM) that gives you a simple DC Voltage reading. Connect the black lead to ground and the red lead to the "hot" red or yellow wire on a Molex connector to test for voltage. Then get yourself a load generating application, such as Prime95 for Processor and Memory loading, and a Graphics Card loading application like the 3DMark benchmark. Leave your DMM leads inside the Molex power connector for one of the voltage rails (+5 or +12) and start the application. Watch the DMM for any voltage drops or droops and let the application run for a few minutes, 5-10 should be enough, to generate enough load over time to build up a thermal load on the PSU and the computer subsystems you are testing. If your voltage drops more than 5% from the original reading you should suspect a problem with the voltage regulation for that rail under load. Make sure to test the other rail also, so if you tested +5 then do the +12 rail as well.

      My Own Case of Failing Power Supply

      I had to use these procedures to diagnose failing and faulty power supplies in many computers, including my very own system, which had a 4-year-old power supply that suddenly started drooping in voltage on the +5 rail, going down from +5.05 to +4.78 under load and causing my hard drives to drop out of my RAID arrays. You can read about my own PSU problems in the thread below, including the detailed diagnostic steps I took and pictures of my issues.
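
      To make the 5% rule of thumb concrete, here is a small Python sketch applied to the +5 rail readings above. The function name is made up for illustration; the threshold is the 5%-of-original figure from this post, not an official limit.

```python
def percent_drop(idle_volts, load_volts):
    """Voltage drop under load as a percentage of the idle reading."""
    return (idle_volts - load_volts) / idle_volts * 100


# Readings from the +5V rail described above.
drop = percent_drop(5.05, 4.78)
print(f"{drop:.2f}% drop")
print("suspect" if drop > 5 else "within the 5% rule of thumb")
```

      Note that 4.78 V is still just above the 4.75 V floor you get from ±5% of the nominal 5 V, so a drop measured against the idle reading can flag a rail before it is strictly out of tolerance.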

      HardForum.com - 0.30 V drop only on +5V rail during Prime95 - Is this normal?

    23. Re:PSU by RMH101 · · Score: 1

      It's a lot easier to have a decent, known-good PSU in your testing kit. Seriously, swapping out components with known-good ones is the fastest way of proving hardware failure. This plus experience means you'll be able to narrow down the fault accurately.
      For what it's worth, the biggest causes of temperamental machines are bad RAM and a bad PSU. All sorts of weird behaviour can be attributed to these.
      Swap 'em out, and run through an hour of Prime95 and if it passes the torture test, you're probably good.

    24. Re:PSU by Anonymous Coward · · Score: 0

      Use an oscilloscope, preferably one with a decent amount of storage capacity. This will enable you to monitor your supply voltage under load and pick up transients AND noise (in the form of sine or triangle waves where there should be a flat constant voltage). Typically, a failing or overloaded PC power supply beats components to death by adding an AC component to the DC voltages. If you can't afford a "real" scope, there are boxes and kits that provide the minimal required components, and plug into a PC to provide a user interface, controls, memory, etc.
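
      As a rough illustration of what "an AC component on the DC voltage" means in numbers, the Python sketch below splits a captured trace into its DC level and its peak-to-peak ripple. The sample values are invented, and the 120 mV limit for the +12V rail is quoted from memory of the ATX design guide, so treat it as an assumption and check the spec.

```python
def dc_and_ripple(samples):
    """Return (mean DC level, peak-to-peak ripple) for a list of volts."""
    dc = sum(samples) / len(samples)
    ripple = max(samples) - min(samples)
    return dc, ripple


# Made-up capture of a noisy +12V rail.
capture = [12.00, 12.09, 11.91, 12.08, 11.92, 12.00]
dc, ripple = dc_and_ripple(capture)
RIPPLE_LIMIT_12V = 0.120  # volts peak-to-peak; assumed limit, check the spec
print(f"DC {dc:.2f} V, ripple {ripple * 1000:.0f} mV p-p")
print("excessive ripple" if ripple > RIPPLE_LIMIT_12V else "ripple within limit")
```

      The DC level here looks perfect, which is exactly why a plain multimeter reading would not reveal the problem.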

    25. Re:PSU by Mephistophlese · · Score: 1

      While that is good "Bad or Maybe" test, most PSU problems are transient over- or under-voltage conditions, which a DMM is not going to reveal.

      Even cheap DMMs have a min/max button which will freeze the display with the last minimum or maximum value which can be used to view over and under voltages. Naturally this can be time consuming if you are testing each different voltage output on each rail of a PSU while under load.
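
      The min/max hold is simple enough to sketch in a few lines of Python. The class name and the readings are made up; this only shows the bookkeeping, not how a real meter samples.

```python
class MinMaxHold:
    """Remember the lowest and highest reading seen so far."""

    def __init__(self):
        self.minimum = None
        self.maximum = None

    def update(self, volts):
        if self.minimum is None or volts < self.minimum:
            self.minimum = volts
        if self.maximum is None or volts > self.maximum:
            self.maximum = volts


hold = MinMaxHold()
for reading in [5.04, 5.05, 4.71, 5.03, 5.21]:  # made-up +5V readings
    hold.update(reading)
print(hold.minimum, hold.maximum)
```

      The catch is the sampling rate: a hold like this only sees the readings it is given, so a meter that samples slowly can still miss a very brief spike or dip.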

      --
      I don't mean to sound cold and cynical - but I am, so that's the way it comes out.
  21. Just tell them by Anonymous Coward · · Score: 0

    Just tell them that they'll need to buy a new computer. Also tell them you'll be nice and take their old one off them for "proper disposal" for a fee of only $50. That plus the $50 "diagnostic fee" means you come out $100 + 1 computer richer.

  22. stresslinux by zal · · Score: 1

    http://www.stresslinux.org/

    nice single purpose linux distro to stress test a system

    --
    -- never underestimate someone who overestimates himself
  23. Replace the integrated part by gd2shoe · · Score: 2, Informative

    Integrated devices can typically be replaced with PCI/PCIe devices. If an integrated network or sound card gives out, it can often be easier and less expensive to shove a new device into the case and disable the old one in the Device Manager. Still, integrated devices don't go out that often. It's more common for the MB itself to go (my experience, anecdotal).

    --
    I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    1. Re:Replace the integrated part by sortius_nod · · Score: 3, Insightful

      Even when they do, it's usually a sign the rest of the board is on its way out too. A device on the board not functioning can mean a number of things (MB controllers acting up, visible/non-visible corrosion in the board, blown capacitors, etc), so you can be up for a lot of weird behaviour from the board that you can't pin down.

      To be honest, relying purely on a test suite to tell you what's broken will lead to disaster. Only through experience do you get the pointers toward what is actually faulty. Add to this that true diagnosis only comes from swapping out parts, and, well, test suites don't look at all like a viable option.

      When I am repairing hardware about the only suite I use is memtest86+ and a decent live linux distro. You can usually pick devices that have failed with lspci, however this is not always correct. It all goes back to having test hardware & the knowledge of what certain behaviours in systems are caused by certain faults. After 15 years of working in IT with both hardware & software faults, there's only so much you can do with limited or no test hardware. Most of the time when you're diagnosing hardware faults on the phone it's an educated guess at best, the only time you truly get a decent diagnosis is when you have the machine with you and can swap parts out. Hell, we don't even use the Dell diagnostics at work due to their inability to give decent results on anything other than RAM.

    2. Re:Replace the integrated part by gd2shoe · · Score: 1

      There is a lot of truth in your post. I think you're mostly right. I also think you might be holding a one-sided argument through much of your post.

      Even when they do, it's usually a sign the rest of the board is on its way out too.

      It can be. You have to wonder, why did it fail? Was there a surge? Is the PSU dying and stressing things? Was that particular integrated chip part of a bad batch? Did it get an ESD on installation? Has a controller failed? In the last case, you will usually see additional symptoms. Most integrated devices are hooked into the PCI bus as if they were plugged in. If you plug in a new card, and it works just fine, it probably wasn't the bus controller.

      There are plenty of reasons why a board may continue to work for years after an integrated part has failed. I don't see it often, but it has happened.

      visible... corrosion in the board, blown capacitors, etc [scorch marks]

      Well, no duh (for most of us). And yes, no software can substitute for actually looking at the board.

      To be honest, relying purely on a test suite to tell you what's broken will lead to disaster. Only through experience do you get the pointers toward what is actually faulty.

      True, true. (experience and knowledge)

      Add to this that true diagnosis only comes from swapping out parts

      Not true, strictly speaking. Often, swapping out parts is a vital part of diagnosis. It isn't always. For example: if the problem appears to be the hardware, swapping it out might mislead you into thinking that the hardware really had failed, or that there's a deeper problem (CPU/MB), while the issue is really software. (Not likely from Knoppix as well as Windows, but it can happen. Besides, there's still a lot of hardware you can't test from Knoppix.)

      ...and, well, test suites don't look at all like a viable option.

      Not true either. Granted, the last step in diagnosis is fixing the problem and observing it disappear. In that sense, installing fresh hardware is often vital. The real reason most test suites aren't viable is because they make no attempt to be thorough. They'll often give a pass to hardware that's clearly failing. A "stress test" may be a good idea, but it's not a real test.

      And no, software tests are no panacea in any case.

      It all goes back to having test hardware & the knowledge of what certain behaviours in systems are caused by certain faults. After 15 years of working in IT with both hardware & software faults, there's only so much you can do with limited or no test hardware.

      True.

      Most of the time when you're diagnosing hardware faults on the phone it's an educated guess at best, the only time you truly get a decent diagnosis is when you have the machine with you and can swap parts out.

      Very true, but only in part because you can swap out parts. Phone diagnosis is no diagnosis at all. (unless you're diagnosing a PEBKAC problem) You can't do a visual of the machine if it's not in front of you, and it becomes difficult or impossible to run test suites remotely. But this thread really isn't about phone diagnosis.

      In short: It is possible to diagnose a computer entirely from software. It is also possible to have a problem which must be diagnosed by swapping out hardware. It depends on the problem, and the quality of the test.

      --
      I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    3. Re:Replace the integrated part by Khyber · · Score: 1

      "In short: It is possible to diagnose a computer entirely from software."

      Tell me a piece of software that'll expose a dying capacitor, please?

      --
      Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
    4. Re:Replace the integrated part by ajlisows · · Score: 1

      Is it just me or has anyone else seen a lot of bulging/leaking capacitors on motherboards lately? Perhaps I was less observant in the past but it is pretty common for me to break open the case of a 3-4 year old machine and see a few capacitors that look like they are about to explode.

    5. Re:Replace the integrated part by tepples · · Score: 1

      Integrated devices can typically be replaced with PCI/PCIe devices. If an integrated network or sound card gives out, it can often be easier and less expensive to shove a new device into the case and disable the old one in the Device Manager.

      That is, if you have enough slots. A slim case might have room for only one or two PCIe cards on a riser. If you've replaced the integrated graphics with ATI and the integrated wired networking with a WLAN card, what do you do once the onboard audio gets sick, other than build a new PC out of the remaining working parts?

    6. Re:Replace the integrated part by gd2shoe · · Score: 1

      Tell me a piece of software that'll expose a dying capacitor, please?

      ???

      visible... corrosion in the board, blown capacitors, etc [scorch marks]

      Well, no duh (for most of us). And yes, no software can substitute for actually looking at the board.

      ...

      In short: It is possible to diagnose a computer entirely from software. It is also possible to have a problem which must be diagnosed by swapping out hardware.

      And they say there's no such thing as a stupid question. Are you a troll, or did you just have a moment of temporary insanity? I get the impression you didn't read what I wrote.

      --
      I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    7. Re:Replace the integrated part by gd2shoe · · Score: 1

      That happens. It has been rare in my experience. The chances that all expansion slots on an average modern computer are used has become quite slim. I expect it to become more common as smaller form factors gain popularity.

      My opinion, of course.

      --
      I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    8. Re:Replace the integrated part by Khyber · · Score: 1

      I read all that you wrote, you still fail to mention this magical piece of software that makes it possible to diagnose things like failed capacitors. I know some BIOS features will let you check for voltage spikes or dips, which are pretty much signs of a screwed capacitor, but is there software that can be run outside of BIOS that will do this?

      Does critical thinking take that much brain power now days?

      --
      Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
    9. Re:Replace the integrated part by toddestan · · Score: 1

      It's called the capacitor plague, and it goes back a while - I've seen it in computers as old as the early slotted P3's. It seems to have gotten better with newer hardware, but it's extremely common in any PC from about 2001-2005 or so. Interesting story behind it, as it turns out to be a case of industrial espionage where a tainted electrolyte recipe was stolen, which led to some Chinese company making capacitors that initially work okay, but with a significantly shortened lifespan. Too bad it affected lots of people around the world and led to countless electronics which would have otherwise been useful landfilled instead.

      See http://en.wikipedia.org/wiki/Capacitor_plague.

    10. Re:Replace the integrated part by Tynin · · Score: 1

      That is, if you have enough slots. A slim case might have room for only one or two PCIe cards on a riser. If you've replaced the integrated graphics with ATI and the integrated wired networking with a WLAN card, what do you do once the onboard audio gets sick, other than build a new PC out of the remaining working parts?

      I'm late to the show on answering you, but if all that's broken is the audio, they have USB audio devices these days. The mini-jack plug went out on my wife's computer, and she had no spare PCI slots to replace the onboard audio with. The solution was a USB stereo audio adapter (here is one: http://www.newegg.com/Product/Product.aspx?Item=N82E16812186035). They even go up to 7.1 surround, but I'd imagine it isn't a great solution for a gaming computer.

    11. Re:Replace the integrated part by gd2shoe · · Score: 1

      Does critical thinking take that much brain power now days?

      Apparently it does, as you don't seem to be able to muster any.

      I read all that you wrote, you still fail to mention this magical piece of software that makes it possible to diagnose things like failed capacitors...

      I've said plainly now (twice) that no software will pick that up*. I've also said plainly (twice) that certain problems cannot be diagnosed with software.

      You sir, fail at reading.

      *(Voltage test in BIOS isn't a bad idea, and might be accessible from software, but... now wait for it... "no software can substitute for actually looking at the board.")

      --
      I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
    12. Re:Replace the integrated part by cbiltcliffe · · Score: 1

      Tell me a piece of software that'll expose a dying capacitor, please?

      Windows. :)

      --
      "City hall" in German is "Rathaus" Kinda explains a few things......
    13. Re:Replace the integrated part by Khyber · · Score: 1

      "no software can substitute for actually looking at the board."

      Some capacitors fail without exploding or exhibiting external signs of failure. For example, my old Peavey Heritage amp. One of the main capacitors had gone bad, but you'd NEVER know until you'd opened all the caps up to see one that had imploded without the outside shell imploding with it like it normally would.

      --
      Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
  24. PSU by sulliwan · · Score: 1

    Just replace the damn power supply already and stop wasting your time testing the cpu and the mobo.

  25. Other cause by Anonymous Coward · · Score: 0

    Well, some years ago I had a customer's PC that had many problems at their office but ran fine whenever I tried to catch the defect.
    After some time, I found that the PC wasn't the problem. It was the UPS, which was putting out too strong a magnetic field when it sat near the PC.
    I put one meter of distance between the two and the problems disappeared ...

    Sometimes, matters are not really obvious ;p

  26. PC-DOCTOR is your solution by Anonymous Coward · · Score: 0

    http://www.pc-doctor.com/

  27. Swap the damn hardware by evilviper · · Score: 3, Informative

    but how do you go about testing CPU, motherboard and graphics card trio to find which is to blame? Replacing them one by one isn't really an option. Do you know any software that would help the way memtest helps with RAM?

    There is no way to tell, with software, whether your PSU, CPU, or motherboard is to blame, in the overwhelming majority of cases.

    It's just idiotic to say "Replacing them one by one isn't really an option". In fact, that's by far the best option. I don't run memtest for a week to find out I have bad RAM, I take 30 seconds to swap it, and find out, for certain, in no time. PSUs are equally easy to swap, AND are the more likely component to fail, so that's the best place to start.

    If you don't know whether it's CPU or the MoBo, buy a new motherboard... Vastly more likely to be the cause, and pretty damn cheap just as soon as they're no longer brand new. Of course CPUs fail, but it's likely to be obvious from a visual inspection if they've been installed wrong, or otherwise abused.

    --
    Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    1. Re:Swap the damn hardware by Etylowy · · Score: 1

      To replace hardware piece by piece you need to have replacements. When called by a relative on a Friday evening to fix the computer you don't usually cannibalize your own hardware just to have parts, do you?

    2. Re:Swap the damn hardware by Doppler00 · · Score: 1

      I agree. When I build a new system I first run:
      1. memtest86+
      2. a CPU test with something like prime95
      3. a CPU+GPU test with prime95 and a 3D game running in the background.

      If it survives that last test, then it's good. I've found overheating of my system to be the main cause of crashes. I've actually had to underclock my RAM to get it stable. If something does fail, I swap that component or add more fans and try again.

    3. Re:Swap the damn hardware by Anonymous Coward · · Score: 0

      Ultimately, "swap components" is the best method to test. My guess is most on /. have access to multiple machines and can easily swap a working power supply in and rule that out. Most probably have access to more than one computer using the same CPU socket type and can swap everything to determine what is bad. I don't bother buying new components to test with; if I'm going to buy a new mobo to test, then I'm just going to decide it's time to upgrade the computer and get a current one instead of an old socket type that's obsolete. Sucks for single computer families - they should probably buy the extended warranty on computers and go with companies that are known for customer service. And of course there's always Geek Squad.

    4. Re:Swap the damn hardware by Cylix · · Score: 1

      Unfortunately you left out one major component in this troubleshooting scenario.

      Before applying any troubleshooting steps you must first create a verifiable test condition to reproduce the problem.

      If the problem cannot be reliably reproduced it will be difficult to isolate the fault with physical isolation, reduction or replacement of specific components.

      Before beginning on such an endeavor strive to create a scenario in which the problem can be reproduced quickly.

      Waiting a week for the fault to reproduce could be a lengthy period of troubleshooting.
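
      One way to put a number on "reliably reproduced" is to run the reproduction step many times and record how often it fails, then compare that rate before and after each swap. The Python sketch below is only an illustration: the function name and the fake workload are made up, and a fault that hard-locks the whole machine would of course take this script down with it.

```python
def failure_rate(run_once, attempts):
    """Call run_once() `attempts` times; return the fraction that failed.

    run_once is any callable that returns True on success and False
    (or raises) when the fault shows up.
    """
    failures = 0
    for _ in range(attempts):
        try:
            ok = run_once()
        except Exception:
            ok = False
        if not ok:
            failures += 1
    return failures / attempts


# Stand-in for a real reproduction step: fails on every fourth run.
counter = {"n": 0}


def fake_workload():
    counter["n"] += 1
    return counter["n"] % 4 != 0


print(failure_rate(fake_workload, 20))
```

      If the rate is near zero before you change anything, the test condition is not good enough yet, and swapping parts will tell you very little.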

      --
      "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
    5. Re:Swap the damn hardware by myxiplx · · Score: 1

      No, but if the computer is that badly broken, you're better off telling them that you'll need to take it away to find what's faulty.

      After all, you can either sit there for hours and hours running various test programs on dodgy hardware in an attempt to guess which part is bad, or you can take it away, spend a few minutes plugging in good components, and have a much better shot at telling them what's faulty.

      It takes you less time, and it's a far more professional approach. Instead of saying "I think it's the graphics card", but having nothing more than an educated guess to back that up, after swapping the hardware you're now in a position to say "I'm pretty sure it's the graphics card, and it's run fine here overnight with a new one fitted".

      Swapping the hardware saves you time and them money. If you don't have hardware to swap, make damn sure you tell people that at best you can give them an educated guess as to what's gone wrong.

  28. prime95 by MoFoQ · · Score: 1

    never heard of prime95?

    it's been used for years to check stability in rigs by overclocking and gaming enthusiasts.
    It even has different "levels" of FFT tests: small ones keep the torture test within the CPU cache, which mostly tests the CPU, while larger ones spill out of the cache and so also test the RAM, PSU, etc.
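
    A back-of-the-envelope Python sketch of why FFT size matters. It assumes the data set is roughly one 8-byte double per FFT element and uses a made-up 2 MiB cache size; the real program's memory use will differ, so this only illustrates the cache-versus-RAM split, not Prime95's actual behaviour.

```python
BYTES_PER_DOUBLE = 8


def fft_working_set_bytes(fft_length_k):
    """Rough data size of an FFT of `fft_length_k` * 1024 doubles."""
    return fft_length_k * 1024 * BYTES_PER_DOUBLE


def mostly_stresses(fft_length_k, cache_bytes):
    """Guess whether a run stays in cache or spills to RAM."""
    if fft_working_set_bytes(fft_length_k) <= cache_bytes:
        return "CPU/cache"
    return "CPU + RAM"


L2_CACHE = 2 * 1024 * 1024  # assumed 2 MiB cache; substitute your CPU's
for size_k in (8, 64, 1024):
    print(size_k, "K ->", mostly_stresses(size_k, L2_CACHE))
```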

    Prime95

  29. Serious answer: don't bother: upgrade. by lkcl · · Score: 1

    I've done a significant amount of PC construction and reconstruction: approximately 60 from-scratch builds in 20 years. One thing that has taught me is: do not bother to try to diagnose motherboard or CPU faults: just replace them, end of story.

    Even Integrated Motherboards can be had for £40, and CPUs for £25. You can get dual-core 1.6GHz Atom Integrated-everything-including-CPU motherboards for £90.

    For the amount of time and effort spent unscrewing components and testing combinations that may, if there is some I.C. damage, result in EXTRA damage to other components, the risk and the time is *just* not worth it.

    There is, however, short-circuit protection in Hard Drive channels (there is now: there used not to be!), USB devices, PCI cards etc. so the risk associated with these components of them causing further damage, if they themselves are damaged, is much lower (but still possible).

    Additionally, short-circuit protection in the PSU is also present and helps mitigate the risk of further damage.

    Basically: if you find that a machine is acting up, do an Internet Search for that model: there may be a firmware upgrade that fixes the problem. I once bought eight identical machines (£125 each) and they all had EXACTLY the same memory / unreliability fault. Eighteen months later I found the firmware upgrade that changed the timings to work around the problem.

    Other than that: if you cannot find any evidence of firmware upgrades to potentially fix an unreliable machine - throw out the power supply, the motherboard and the CPU, without hesitation (or get them replaced under warranty). Simple as that.

    *possibly* keep the memory, but bear in mind that when you upgrade the CPU and the motherboard, you will likely need a different kind of memory, and that memory is likely to be incredibly cheap, anyway.

    Peripherals and cards: you should be okay (but test them one at a time).

    Ultimately it's about risk management, and the level of integration is simply too high to take any risks. Throw the components out, and get new ones.

    1. Re:Serious answer: don't bother: upgrade. by vxvxvxvx · · Score: 1

      Other than that: if you cannot find any evidence of firmware upgrades to potentially fix an unreliable machine - throw out the power supply, the motherboard and the CPU, without hesitation (or get them replaced under warranty).

      If you've only narrowed it down to one of those 3, how will you get the companies to replace those parts under warranty?

  30. Practical System Stressing... by Linker3000 · · Score: 3, Funny

    I stress my Linux boxes by telling them that if they develop a fault I'll re-image them with Vista.

    Not a single one has dared to fail on me yet.

    --
    AT&ROFLMAO
    1. Re:Practical System Stressing... by Artifakt · · Score: 1

      Take all your consumer electronics to the movies once a year. Set them on the couch, give them a bowl of popcorn buttered with WD40, and let them watch "The Brave Little Toaster". (Popcorn is optional).

      --
      Who is John Cabal?
  31. PC Doctor Service Center by Anonymous Coward · · Score: 0

    The PC-Doctor software runs on a PCI boot ROM, DOS, Linux, and Windows. It's pretty good at identifying problem areas and problematic components. They sell a retail product called Service Center which comes with a PCI tester card, USB device tester, MiniPCI tester card, power supply tester, and some other neat little toys. It looks really cool:

    http://www.amazon.com/PC-Doctor-Service-Center-Computer-Diagnostics/dp/B000Z88VXK

  32. The source of your woes by brassmaster · · Score: 1

    crash way too often to blame it all on Microsoft —

    Not possible.

    1. Re:The source of your woes by Anonymous Coward · · Score: 0

      Indeed.

  33. and pay $200 for a $100 video card by Anonymous Coward · · Score: 0

    And pay $200 for a $100 video card. Yes, at one time Dell wanted $200 more for a video card upgrade when the difference in stores / online was only $100.
    This was with a BTO system.

  34. empirical testing: Compile the Linux kernel by lkcl · · Score: 1

    gcc is an incredibly good test application. it's horrendously cpu-intensive, and it is designed to eat whatever physical memory is available. compiling c++ applications is particularly memory-intensive, but the best test of both disk and memory has to be simply to compile the linux kernel.

    if you have multiple cores, you can use "make -j {number of cores + 1}" and this will test all of the CPUs, as well. if you particularly want to stress things, make that "make -j {number of cores * 2}" instead.
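
    For what it's worth, the job count can be worked out with a few lines of Python. This is just a sketch of the rule above: the function name is made up, and it only builds the command rather than running it.

```python
import os


def make_command(stress=False):
    """Build the `make -j N` argument list described above."""
    cores = os.cpu_count() or 1  # cpu_count() can return None
    jobs = cores * 2 if stress else cores + 1
    return ["make", f"-j{jobs}"]


print(make_command())             # cores + 1 jobs
print(make_command(stress=True))  # cores * 2 jobs
```

    You could hand the resulting list to subprocess.run() with cwd set to the kernel source tree. A build that dies at a different place on each run is a stronger hint of flaky hardware than one that always fails at the same file.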

    1. Re:empirical testing: Compile the Linux kernel by cschepers · · Score: 2, Funny

      Or you could install Gentoo. That'll eat up the CPU, RAM, and hard disk for darn near eternity.

  35. on which machine do i log calls, and which to test by Anonymous Coward · · Score: 0

    bring the crash cart.

    oddly enough, a new power supply has helped more than once.

  36. Ultimate Boot CD by googlesmith123 · · Score: 1

    http://www.ultimatebootcd.com/

    This is pretty much the best free tool there is to test and diagnose a system. It also has a bunch of tools for partitioning and the like as well as password resetting.

    I've had this in my arsenal for many years now, it's a great tool.

    --
    Say NO to unpaid Internships!
  37. Video memory stress test. by voodoowizard · · Score: 1

    Works great for testing your video card's RAM: http://mikelab.kiev.ua/index.php?page=PROGRAMS/vmt_en . I have found the program on other sites, but that is where it came from.

  38. Not a perfect solution by Whuffo · · Score: 1

    If your hardware is suspect, then the output of any program running on that hardware would also be suspect. Keep that in mind when you run diagnostic software - if it says the system is good then it probably is but if the software reports errors then the reported error isn't necessarily accurate. I've also seen these programs detect failures in perfectly working systems. I've tested many of these "technician on a disk" programs over the years and Microscope is the best of a bad bunch.
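
    The one thing software on suspect hardware can do fairly well is check itself for consistency: repeat a deterministic calculation and see whether the answer ever changes. A minimal Python sketch of that idea follows; the function names and round counts are made up, and this is not how any particular diagnostic product works.

```python
import hashlib


def workload_digest(rounds=20000):
    """Deterministic CPU-bound work; the result should never vary."""
    digest = b"seed"
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


def consistent(runs=5):
    """True if every run produced the identical digest."""
    results = {workload_digest() for _ in range(runs)}
    return len(results) == 1


print("consistent" if consistent() else "MISMATCH: suspect hardware")
```

    A mismatch says something is wrong but not which part (CPU, RAM, power), and a pass on a short run proves very little, which is the point made above.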

    A more productive diagnostic method is to "divide and conquer" - consider the various replaceable sub-assemblies and diagnose only to that level. Tips: most failures are memory related - bad RAM or it's not making good contact in its socket. If the system locks up immediately at boot or after running a short time, it's almost always memory. Bad power supplies are also a fairly common source of general flakiness and no diagnostic software will be able to diagnose those problems. Bad motherboards are rare and bad CPU chips are almost unheard of.

  39. A few thought ( elimination process ) by hebertrich · · Score: 1

    First .. if you repair PCs on any kind of regular basis, I suggest you make a test jig.
    Get a flat surface and fix a working mobo/CPU and 2 power supplies to it. Make a second
    space for the mobo to be tested. Have a screen, keyboard and mouse at the station.
    That will help you tremendously, as you can just move the parts about;
    for almost everything but CPU/mobo fault isolation it makes it a breeze.

    But that's where we stop.
    When you are hit with a fault of the mobo or cpu the only valid suggestion nowadays
    is to replace both.

    You may like to know what's broken , but that's pointless as you need to change both the cpu
    and the motherboard , and i explain myself.

    You have no way of knowing which caused which to fail.
    The only valid fix is to replace them both. If you plug a new CPU into a bad mobo and blow the new CPU,
    you're no better off and you've lost a CPU. If the CPU is the cause and made the mobo fail, well, that's no better
    a fix either, because you've damaged the mobo.

    No, I strongly believe there's no point in trying to find out, other than to satisfy your natural and, believe me, mutual
    curiosity.

    Happy trails :)

    1. Re:A few thought ( elimination process ) by vxvxvxvx · · Score: 1

      You may like to know what's broken , but that's pointless as you need to change both the cpu and the motherboard , and i explain myself.

      Only pointless if you don't plan to get a replacement under warranty.

  40. What separates a PC from a real computer... by cdrguru · · Score: 1

    Your average PC hardware has utterly no way to "test" it. You can sort of test RAM - to the point of identifying there is a failure somewhere in the memory. OK, if you have four DIMMs what does that mean? Well, it means you have a RAM problem somewhere.

    Motherboard? Not really any sort of testing possible. There are some "pretend" diagnostic tools that will try to tell you if something fails, but what exactly does that mean? Nothing. If you have a ATAPI DVD drive and a SATA hard drive I assure you that a failing drive can easily appear as creating a failure to some "motherboard" test.

    There is no clear isolation of the hardware whatsoever, and no ability for the hardware to meaningfully participate in any sort of testing. So you are left with changing parts - more or less what I like to call "throw parts at the problem". Today this isn't terribly practical as most everything is on the motherboard. If you are a skilled screwdriver user you could replace the motherboard, but for most people it is just getting a new computer.

    Even if you take a computer to a "computer shop" you are likely to see very little in the way of diagnostics or fault isolation. They will pull out something and replace it with something they have lying around to see if that "fixes" the problem. Often they will do this blindly without much real thought in the process. The end result for the customer is that their computer works again but nobody really knows what the problem was. And, by the way, here is the bill for the parts that we replaced.

    There are some external hardware parts that are pretty simple to diagnose and replace. The power supply is probably the most prone to failure and is pretty obvious - the machine is dead with no lights. A CD or DVD drive is pretty simple to sort out as well with most common failures because it either works or it does not. In either case it is a few connectors and a few screws and you have the part in your hand. Both are going to be less than $100 to replace and well worth doing it.

    The lack of any real diagnostic ability - or even ability to verify proper operation - is a serious limitation in the PC world. If you move up to real server hardware you see all sorts of diagnostic and fault isolation capabilities. Things like the memory test telling you what DIMM is bad or that a hard drive is failing. But the real gem of hardware diagnostics seems to be reserved for mainframe systems. It tells you a part is going to fail, tells you where the part is and you can confirm that it fails specific tests and a new part passes the same tests.

  41. Don't overlook the hardware basics... by jpdbest · · Score: 1

    Sometimes a quick visual inspection of the interior of the computer can lead to the cause of the problem. Double-check the cabling, cards, memory, etc. to make sure that everything is secured in place. Even if the cards appear to be fine, I've seen it where they sometimes need to be removed and reseated. Don't forget about cooling as well. Make sure that the system has adequate cooling, that the existing fans/heatsinks are not clogged with dust and have good mobility with the flick of a finger. Double-check the fans are operational with case open and system is powered, and most motherboards have basic temperature monitoring for the CPUs and speed monitoring for the fans. On the motherboard, make sure to check the capacitors. Over the years (as recently as a couple weeks ago), I've had to replace motherboards because the capacitors had gone bad:

    see -- http://en.wikipedia.org/wiki/Capacitor_plague

    Some people have already mentioned it, but it needs to be stressed, a *good* power supply is mandatory and if necessary a UPS. The power supply can be perfectly operational and even pass with a power supply tester (also a good investment), but if the power being supplied to it is not consistent (brown-outs) or simply not adequate to drive all the components (e.g. video cards, # of drives, etc.) that can cause problems. In one case by simply swapping the cheap power supply out for a good quality one that I had as a spare from an older system resolved the problem.

    Improper BIOS settings can also cause problems. Memory/CPU voltages or speed may be incorrect? Conflicting on-board video/audio still enabled when add-in video cards and audio cards have been added?

    I still haven't even gotten to the software debugging side of things...

  42. I'm not so sure about that by Sycraft-fu · · Score: 1

    You have a source to back that up? Because if not, I'm calling shenanigans. That seems real unlikely for a number of reasons:

    1) This would be a recipe for lawsuits. After all, this situation of momentary power drops happens ALL the time on all kinds of circuits. If computers weren't able to handle it, that'd be a great way to get sued. With consumer devices you don't get to say "Oh this thing is super sensitive you have to take all kinds of measures to protect it." Your device is expected to deal with common conditions and there are tests out there for that sort of thing. FCC Part 15 Class B is an example, which deals with unintentional radiators of EMI/RFI.

    2) Modern PSUs are almost universally active PFC, meaning they smooth their load on the electrical grid. The other side effect of that is they are voltage and frequency agnostic. You'll notice they are generally spec'd like "AC input 90-264V, 47Hz-63Hz." They will work anywhere in that range, they don't require a specific voltage or frequency. Well, as such a momentary line sag is probably nothing to worry about. The voltage is still within the operational range. The PSU doesn't care, it just draws more current. Unless you are near its operational limits, this isn't a problem.

    3) PSUs have bigass capacitors in them. Google around for some pictures of the inside of a PSU. You'll notice some extremely heavy duty caps. Those provide a whole bunch of instantaneous power reserves. These can deal with both quick increases in demand for power from the system, and drops in supply from the line. That is one of the major reasons to stick a cap in a system, they smooth out a power rail.

    4) The consumer devices you call UPSes, aren't. What I mean is they are not truly uninterruptible. For that, you need a fully online UPS system. What that means is something where the incoming power is converted to DC, sent to a battery, then the output of that is inverted and sent to the computer. That will have no interruptions, no sags. A normal UPS is a line interactive one. It is fast, but not instant. A momentary drop it can't catch. It takes a few tens of milliseconds to switch to battery. For that matter, line interactive UPSes don't tend to make up low voltage conditions with battery, they instead switch taps on a transformer and act as a voltage regulator, again not instant. So while they'll help with a chronic sag, a true grid brownout, they aren't fast enough to catch a fast drop.

    I'm not saying that power conditioning and backup is a bad idea, quite the contrary. I am saying that I think you are incorrect that the momentary sag caused by turning on a high drain device is a problem. If all you've got is anecdotal evidence, well I think you should rethink your position. As a counter to your anecdote, here's my own: I live in the desert and thus have an AC unit that kicks in all the time. It causes a line sag when it kicks on. I also have a fridge, freezer, and some other devices that cause power on sags, like a receiver (it has a huge set of caps that have to charge). I do have a UPS on my computer, but there have been times when I didn't, and I have devices that aren't on UPS power. None have died.

    1. Re:I'm not so sure about that by jackchance · · Score: 1

      I concede that I was incorrect to place the blame on the brownouts specifically. I should have said home PC hardware failures are caused mostly by electrical problems. I mention the brownouts because that is something visible (as opposed to the spikes).

      And getting a cheap UPS solved the problem. Specifically I got an , which was around $50.

      If you spend $500 to $5000 on a computer (or other electronics), it is a good investment to protect it with a $50 UPS.

      --
      1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765
  43. It's not the CPU by Anonymous Coward · · Score: 0

    I have been a pc-building hobbyist, done it for personal gain, and worked in corporate support environments for years. It's never your CPU. Modern CPUs do not "go bad" unless subjected to abuse/lack of cooling, in which case they will fry and not work at all. A CPU can't really work halfway.

  44. Rule #1 of Diagnosing Hardware by frank_adrian314159 · · Score: 1

    1. Check the software
    2. It's probably the software
    3. Really, it's going to be the software

    ...

    87. OK, now you should run some diagnostics

    Really. The bottom line is that computers and their parts (especially non-moving ones like processors and RAM), once they're burned in and assuming you don't try to run them overclocked for twenty years without rotating them out, are pretty reliable. I can't count more than a couple instances of hardware failure post burn-in across about fifteen different home machines over twenty years. And both times those were disk failures, which are usually obvious to diagnose (as are broken CPU fans, which happened to a friend). Contrast this to my experience with Windows machines, where bad drivers, creeping registry cruft, and general unpleasantness of management force you to re-install the OS every couple years (and why I'm switching as machines rotate out to either Linux or Macs).

    So as to my advice... see above.

    --
    That is all.
    1. Re:Rule #1 of Diagnosing Hardware by symbolset · · Score: 1

      Some of us maintain these things by the many thousands. They do fail. I've seen processors fail, in service. Likewise RAM chips, voltage regulator modules, Northbridge chips, network ports and every other component, at least a few times. You're right that it's often the software but when it's isolated in hardware one thumb rule has held steady for me for nearly 30 years: 90% of the time it's the cables.

      --
      Help stamp out iliturcy.
  45. The test suite I use by cracauer · · Score: 1

    Memtest86 tests much less of the memory than you think. It is 100% no-load. It does find outright broken memory cells but it does nothing if the memory interface runs unreliably.

    To test your memory interface under stress you use a program named "Superpi", you run the "32M" test. It is available for Linux and runs on FreeBSD. I find a lot more problems with SuperPi than with memtest, a lot of memtest-stable machines don't actually work right once you stress-test.

    %%

    To test the CPUs/cores, you use "MPrime" or "Prime95" (same thing). It is the hardest load test that the overclocking record chasers have found, and they try very hard to find more and more nasty tests to prove that their competitors' overclock is not valid. They do this all day long, you should profit from their research.

    You run MPrime with one instance per core. Available for Linux, IIRC also works on FreeBSD.

    Be warned that the CPU temperature during MPrime will rise to levels that no other program I am aware of reaches. That's the point. MPrime also has a very high amount of plausibility checking on its intermediate results. The combination of those two factors is why it is such an effective hardware test.

    %%

    So, in summary:

    Run:
    1) MPrime for 36 hours (all cores simultaneously, one MPrime each)
    2) 24 hours of memtest86
    3) a whole bunch of SuperPi 32M.

    If there is any 3D graphics ever used you also run Futuremark's 3DMark (Windoze only).

    Oh, and you will have to note the CPU temperature that you get during that mprime run and never exceed that temperature during everyday work from then on. This usually isn't a problem since mprime will heat your CPU like nothing else.

    Good luck. Notebooks in particular, and cheap ready-made desktops not distributed by Dell, tend to fail this outright. If any of these steps fail, you can't pass any important data through this computer; it can and sooner or later will scramble your hard drive contents, silently, so that your backup USB drive already has the corrupted version by the time you notice.
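
    The "plausibility checking" idea can be illustrated with a very small sketch. This is not MPrime and puts nowhere near the same load on the CPU; it only assumes that running the same deterministic work on every core and comparing the answers will expose a core that computes wrongly. The function names, the seed and the round count are made up for the example.

```python
import hashlib
import multiprocessing as mp


def burn(rounds: int) -> str:
    """Deterministic CPU-bound work: repeated SHA-256 over a fixed seed."""
    digest = b"stress-seed"
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


def run_stress(workers: int, rounds: int) -> bool:
    """Run the same workload in `workers` processes and compare results.

    Returns True if every worker produced the same digest. A mismatch means
    some core (or the memory path feeding it) computed a wrong answer.
    """
    with mp.Pool(processes=workers) as pool:
        results = pool.map(burn, [rounds] * workers)
    return len(set(results)) == 1


if __name__ == "__main__":
    cores = mp.cpu_count()
    ok = run_stress(cores, 200_000)
    print("all cores agree" if ok else "MISMATCH: suspect hardware")
```

    A real burn-in would loop this for hours and watch the temperature at the same time; a single pass proves very little.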

  46. Well by ShooterNeo · · Score: 1

    Toast and Pi and various other CPU stability test programs will let you test the CPU.

    Go into system configuration with windows and turn off auto-reboot, so that if the machine blue screens, you can see what the error code is. Sometimes that will let you isolate it to graphics or the motherboard.

    Ultimately, the way to find out IS to replace the components one by one. If you have several machines, or spares from an older machine, you should swap each component and run the machine until either you get a crash or it's been long enough that you must have found the problem.

  47. Re:I wish you had asked this question 2 weeks ago. by Panaflex · · Score: 1

    It's your power mains... get a good UPS with a line conditioner.

    --
    I said no... but I missed and it came out yes.
  48. Power supply by Alioth · · Score: 2, Interesting

    You didn't mention the power supply.

    In my experience, a "crashy machine" is almost always down to the PSU. Out of the dozens of "crashy machines" I've had to fix, only one was due to bad memory. The rest were *all* down to faulty power supplies, and all of those were due to capacitors that had failed.

    I have an oscilloscope so I can easily test for ripple without needing to open up the power supply and look for the obvious signs (bulging capacitors, maybe ones that have leaked). We've had dozens of machines at work with supplies that have gone bad this way. Bad capacitors have been a real problem in recent years. Four years ago, it wasn't just in power supplies either - we had to return 70 machines to Hewlett-Packard under warranty after the capacitors on the motherboard began failing after 3 months of use. We've not seen anything on that scale on motherboards since, but we still have frequent problems with power supplies failing from "capacitor plague".

    A machine of mine was actually killed by a sudden power supply failure - the PSU let the magic smoke out with a loud "bang", and there was the sound of stuff ricocheting around the computer's case. That sound turned out to be bits of exploding chips on the motherboard. The only thing that survived that incident was the CD-ROM drive - all other components were destroyed.
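
    If the scope can export samples, the ripple check itself is simple arithmetic. A rough sketch follows; the limits are the commonly cited ATX design-guide peak-to-peak figures (120 mV on +12V, 50 mV on +5V and +3.3V), so verify them against the spec for the supply in question. The sample readings are invented for illustration.

```python
# Peak-to-peak ripple limits in volts. These are the commonly cited ATX
# design-guide figures; check the spec for your supply before relying on them.
RIPPLE_LIMITS = {"+12V": 0.120, "+5V": 0.050, "+3.3V": 0.050}


def ripple_pp(samples):
    """Peak-to-peak ripple of a list of voltage samples from one rail."""
    return max(samples) - min(samples)


def rail_ok(rail, samples):
    """True if the measured ripple is within the limit for that rail."""
    return ripple_pp(samples) <= RIPPLE_LIMITS[rail]


if __name__ == "__main__":
    # Hypothetical scope captures, in volts.
    healthy = [12.01, 12.03, 11.99, 12.02, 12.00]
    failing = [12.10, 11.85, 12.15, 11.80, 12.05]
    print("healthy rail ok:", rail_ok("+12V", healthy))
    print("failing rail ok:", rail_ok("+12V", failing))
```

    Measure under load; a supply with tired capacitors can look fine at idle.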

    1. Re:Power supply by Etylowy · · Score: 1

      True. Checking PSU, especially noname/never-heard-the-name is the third thing you should do about the hardware.
      First is checking all the connections
      Second is checking if getting rid of the half inch of dust +using compressed air solves the problem

      The first 3 solve 19 in 20 hardware problems.

    2. Re:Power supply by MarsLander · · Score: 1

      Four years ago, it wasn't just in power supplies either - we had to return 70 machines to Hewlett-Packard under warranty after the capacitors on the motherboard began failing after 3 months of use.

      Sounds like you got some motherboards fitted with capacitors filled with faulty electrolyte manufactured by a company that stole the recipe (but got it wrong) in a case of bungled corporate espionage in 2003.

      "According to the source, a scientist stole the formula for an electrolyte from his employer in Japan and began using it himself at the Chinese branch of a Taiwanese electrolyte manufacturer. He or his colleagues then sold the formula to an electrolyte maker in Taiwan, which began producing it for Taiwanese and possibly other capacitor firms. Unfortunately, the formula as sold was incomplete."

      http://www.boingboing.net/2003/05/27/bungled-espionage-bl.html

  49. It's a loaded question by camperslo · · Score: 1

    What's the best software to change a tire on your car and find the leak?

    Software can check quite a few things, but for the most part during a short time interval, digital hardware is either working or it isn't. So software performance tests may not be very good at revealing something marginal.

    Beyond a few software tests and ruling some things out by substitution, it generally takes someone with some hardware troubleshooting skills, and some test equipment.

    Of course test equipment starts with your senses. Software isn't very good at spotting things like a failing fan, or dust buildup in heatsinks, cooling vents, or the power supply. Software won't find that little solder blob or loose screw shorting something. It probably won't tell you about something poorly seated or dirty in a socket. It won't tell you about the marginal power supply or high-resistance connector that makes the voltage dip when a drive spins up... It won't tell you if the CPU doesn't have thermal compound properly applied (although software monitoring of temperature sensors does help).

    Of course it goes without saying that you've made sure that bios settings are such that nothing is stressed. Don't be afraid to let a memory test run overnight or longer.

    A multimeter, oscilloscope, and dummy-load resistors are a good starting point. Adjustable power supplies to allow board testing at the upper and lower ends of the specified operating voltages can also be very revealing. A hair dryer and freeze spray may help localize thermal intermittents. A temperature probe and IR videocam can be handy. For example, being able to see a pin of a connector heating could reveal a problem even when voltages are within normal limits.

    If qualified to do so, use an oscilloscope and voltmeter to see that any switching regulators on the motherboard are functioning properly. Failing capacitors sometimes have obvious physical signs, but don't count on finding bad parts by appearance only. Seeing excessive ripple/noise with a scope can make filtering problems immediately obvious. Many modern boards take a 12 Volt input and convert it to what the CPU requires. In some cases the related components are heavily stressed.

    Beyond simple things like regulator problems, it is unlikely that most people outside of a specialized service facility could actually fix much on a motherboard. Even when possible, it is not likely to be cost-effective.

    Use every clue presented. What's going on when the malfunction occurs? (what's running, is the environment hotter or cooler, is equipment subjected to vibration or static discharge, note time of day when other equipment kicks in etc.)

  50. Good comment )))..) ) ) ) ) ))))))) by ClioCJS · · Score: 1

    But leaving a parenthesis open like that is offensive. )))))

    --
    -Clio
    Karma: Bad (mostly from not giving a fuck)
    Blog: http://clintjcl.wordpress.com
  51. Re:F$F Shill by Anonymous Coward · · Score: 0

    No one wants to read your inane babbling, which is apropos of nothing. You neckbeard "GNU/Linux" zealots find a way to mention "free software" in every damn forum online, regardless of the topic.

  52. Strange crashes Win 7 by tenco · · Score: 1

    I had this when I tried to run Windows 7 on my old machine (32bit). Random crashes, 7 or so in the first 48h alone. This machine never gave me issues with Linux, Windows XP or Vista. So I ran Prime95 under Win 7 on it - guaranteed error in about 2h of processing. And now comes the odd part: Prime95 runs straight 9h on both XP and Vista without any problems. Anyone else got such problems?

  53. burnintest by DMoylan · · Score: 1

    http://www.passmark.com/products/bit.htm

    burnintest. have used it for years. works fine. some systems which would run fine for days and then crash were driving us crazy. this software found memory, video and cpu problems. free version of version i bought only ran for 15 minutes. might be enough to find your problem. windows only though so that might be a problem.

  54. You don't, you swap out hardware by GuyFawkes · · Score: 2, Informative

    Of course with time you get experience, dry joints tend to follow power tracks on a PCB, and by gently flexing you can hear them tick.

    Swapping out is the ONLY way.

    I have systems with intermittent (heat activated) dry joints on a mobo, partly duff RAM, and partly duff (rebranded at higher clock) CPU. ONLY swapping out will find it.

    HTH etc

    --
    http://slashdot.org/~GuyFawkes/journal
  55. Pop in a live cd of some other OS and give it a go by Tynin · · Score: 1

    Lots of good posts so far, but one thing I also do and would suggest trying as well (depending on what problem you are dealing with) is to drop in a live CD of Ubuntu or Knoppix, install whatever app would put a strain on whatever part of the system appears to be failing, and see if the problem occurs in another OS as well. I've seen Windows fail in some pretty interesting ways that seem like hardware is failing. But when testing with another OS and the problem doesn't reoccur, I often then suggest reinstalling Windows. While it isn't overly common these days, Windows can pretty silently get hosed up and crash for no apparent reason and make it seem like hardware when it is just a borked up driver causing the issue.

  56. OT, I know by Anonymous Coward · · Score: 0

    "Free means no restrictions, ironic the FSF's GPL forces restrictions, isn't it? What's your definition of free?"

    No, free does not mean "no restrictions" literally.

    A "free" man is not free to murder, or steal or a lot of other things but that doesn't make him any less free nor would doing any of those actions make someone more free.

    That said, I gotta ask, are you one of those simpletons who thinks that you don't have to work for your freedom in order to maintain it? I mean it's freedom so it should be free and clear, right?

  57. Did you check the logs? by w0mprat · · Score: 1

    Check the event viewer for any logged errors or crashing drivers. It boggles my mind how many people don't know to check this, and how many nerds trying to help don't tell you to check this. Frankly many people don't know Windows even has such logs. It is essential when trying to troubleshoot unexplained crashes on any platform that you RTFL (read the fricken LOGS).
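
    The same logs can be pulled from a command prompt with `wevtutil`, which ships with Windows Vista and later. A minimal sketch that builds a query for the newest Critical and Error entries in the System log; the helper names and the default count are made up for the example, and the command only runs on Windows.

```python
import subprocess


def build_query(log: str = "System", count: int = 20) -> list:
    """Build a wevtutil command for the newest Critical/Error events."""
    xpath = "*[System[(Level=1 or Level=2)]]"  # 1 = Critical, 2 = Error
    return [
        "wevtutil", "qe", log,
        f"/q:{xpath}",
        f"/c:{count}",
        "/rd:true",   # newest events first
        "/f:text",    # plain-text output instead of XML
    ]


def recent_errors(log: str = "System", count: int = 20) -> str:
    """Run the query (Windows only) and return the text output."""
    result = subprocess.run(build_query(log, count),
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Print the command rather than run it, so this is safe on any OS.
    print(" ".join(build_query()))
```

    Look for the entries just before each crash; a driver name that keeps showing up is a good lead.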

    Most crashes in Windows are either hardware related or shitty drivers. Windows these days is resilient to crashing applications, but crappy drivers will lock your system right up, and faulty hardware will make it all go pear shaped. Stress test that system and see if it locks up. Software to use:

    CPU: OCCT
    Orthos
    Intelburntest

    GPU: Furmark (app will seriously heat your GPU better than any game. In some rare cases too much, if it was running fine, it may not afterwards) 3DMark

    Hard disks can also cause system crashes, even without a event being logged in event viewer. Run a surface scan of your HDD, use several different applications. Your hard drive may be coming back with healthy S.M.A.R.T data, but still be causing your system to crash and your data to be corrupted.

    HDTach will nicely stress your HDD. Replace or at least re-seat your IDE and SATA cables. Unplug all USB devices, I can't believe how many systems have issues booting or running stable with dodgy USB devices.

    Finally, use CoreTemp and SpeedFan and run the PC with the side panels off. Temperature is a huge cause of many system crashes, especially in hotter climates.
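
    CoreTemp and SpeedFan are Windows tools. On Linux (a live CD will do) the kernel usually exposes similar readings under /sys/class/thermal, in thousandths of a degree Celsius. A small sketch, assuming that layout is present; which zone maps to the CPU varies by machine, and the function names are invented for the example.

```python
from pathlib import Path


def millideg_to_c(raw: str) -> float:
    """Convert a sysfs reading such as '54000' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_temps(base: str = "/sys/class/thermal") -> dict:
    """Return {zone_name: degrees_C} for every readable thermal zone."""
    root = Path(base)
    if not root.is_dir():
        return {}
    temps = {}
    for zone in sorted(root.glob("thermal_zone*")):
        try:
            temps[zone.name] = millideg_to_c((zone / "temp").read_text())
        except (OSError, ValueError):
            continue  # zone unreadable or reporting garbage
    return temps


def too_hot(temps: dict, limit_c: float) -> list:
    """Names of zones at or above the given limit."""
    return sorted(name for name, t in temps.items() if t >= limit_c)


if __name__ == "__main__":
    for name, t in read_temps().items():
        print(f"{name}: {t:.1f} C")
```

    Log it once a second while a stress test runs and compare side panel on versus off.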

    --
    After logging in slashdot still does not take you back to the page you were on. It's been that way for 20 years.
  58. Use your eyes. by adolf · · Score: 1

    RAM is easy to test using basic troubleshooting techniques: Remove some of it, see if the problem recurs. Replace some of it with good spares, see if the problem recurs. Etc, so on. memtest86 also does a decent job of finding bad modules if left to run long enough, but since it runs in isolation from the rest of the computer it will not detect certain corner cases of bad RAM.
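
    With several modules, the remove-and-retest routine is a bisection. A sketch under two loud assumptions: exactly one module is bad, and the fault shows up reliably whenever that module is installed. `system_is_stable` is a hypothetical stand-in for "fit only these modules, run your stress test, report whether it passed".

```python
def find_bad_module(modules, system_is_stable):
    """Bisect a list of RAM modules, assuming exactly one is bad.

    `system_is_stable(subset)` is a hypothetical callback: install only
    `subset`, run a stress test, and return True if it passed.
    """
    candidates = list(modules)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if system_is_stable(half):
            candidates = candidates[len(half):]  # fault is in the other half
        else:
            candidates = half                    # fault is in this half
    return candidates[0]


if __name__ == "__main__":
    # Simulated machine in which DIMM "C" is the bad one.
    dimms = ["A", "B", "C", "D"]
    print(find_bad_module(dimms, lambda fitted: "C" not in fitted))
```

    Intermittent faults break the second assumption, so repeat each test long enough to trust a pass.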

    Power supplies are similarly easy: Swap it out for a known good supply, and see if the problem recurs.

    I've never had a CPU fail, so I'm afraid that I've never had to develop any particular troubleshooting techniques for them. Even when the heatsink has fallen completely off, in my experience, the CPU is just fine.

    As for the rest of a highly integrated system: It doesn't do any good at all to figure out exactly which components in particular on an integrated motherboard are being problematic, as there's no practical way to replace them without tossing the whole board.

    And in fact, every single motherboard problem I've experienced in the past 5 or 10 years has been easy to identify: Bad electrolytic capacitors. And they're easy to spot, since by the time they've drifted far enough out of spec to cause frequent-enough problems that folks start looking for a fix, the caps are all swollen and/or leaking goop.

    So have a good look around the motherboard. If any caps are swollen or leaking, replace the whole board[1]. And consider replacing the power supply at the same time, as well, since it might be a contributing factor in the failure of the motherboard's caps (and is stuffed full of its own set of aging and possibly failing capacitors).

    [1]: Yes, I know. It's easy and cheap to replace some or all of the electrolytic capacitors on a board if you're good at working on multi-layer PCBs. But most people aren't, and if they need to pay someone else to do it, it's going to be costly to the point where it becomes far more practical to simply buy a new board with a warranty.

  59. Could be the drivers... by Anonymous Coward · · Score: 0

    In which case, you're kinda screwed :/

  60. prtdiag -v, fmadm faulty, and system messages by uassholes · · Score: 1

    "prtdiag -v", "fmadm faulty", and dmesg are helpful on Solaris. If that's not what you're running, you could boot an OpenSolaris live CD.

  61. reduce variables by Anonymous Coward · · Score: 0

    Boot a different operating system, eliminate Microsoft from the list.

  62. Quicktech by Sensible+Clod · · Score: 1

    Quicktech has a nice (non-free) test kit that includes software and hardware. I have seen the software used on my machine, and it has tests for just about every hardware component you can think of, including the video card.

    --

    The difference between spam and poop is that you don't have to dig through septic tanks looking for real food. -- Me
  63. UBCD by skogs · · Score: 1

    I've found the UBCD -- Ultimate Boot CD to be quite useful.

    http://www.ultimatebootcd.com/

    It does come in handy, includes many of the necessary tools to determine HDD end of life etc.

    It certainly isn't perfect, but I am amazed nobody has mentioned it yet in the discussion. Obviously real tools are on my bench, but when the poster specifically asked for software....this is the easiest and most broad spectrum solution.

    --
    Who is this that even the wind and the waves obey Him? Surely this computer must submit also!
  64. Testing software by StarHeart · · Score: 1

    I recently diagnosed two desktop machines. One ended up having a bad stick of memory, with the original symptoms being a corrupted copy of Windows XP that wouldn't boot. The other a bad hard drive, the symptoms being it would hang during use randomly and even during boot.

    I used Prime95 and Memtest86+ to detect the bad stick of memory. Prime95 quickly came up with an error during the stress test, and Memtest86+ also immediately came up with errors. In the past I have seen subtle errors with Memtest86+ that only show up in later tests or with multiple passes. Instant answers aren't how it always goes.

    For the bad hard drive I ended up doing a variety of tests. I tried Prime95, and since it was a Seagate drive, Seagate's SeaTools. I didn't get any clear answers from them. At a later point I booted into a Fedora 11 Live CD, which popped up with a SMART error, which ended up being a bad sector that needed to be remapped. I then tried using SpinRite to fix it, but it ended up seeming to just hang on this one spot. So I replaced the hard drive. Afterward I reran SpinRite against the new drive, and it came up with nothing. I also played with Sandra Benchmarks at the end to stress the machine.
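
    The SMART counters behind that kind of warning can be read with `smartctl -A` from smartmontools. Below is a rough parser for its attribute table that flags nonzero sector-trouble counters. The sample text is invented for illustration, the column layout is assumed to be the usual ten-column ATA table, and the watched attribute names are a common but not exhaustive choice.

```python
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector",
           "Offline_Uncorrectable"}

# Illustrative output only; real values come from `smartctl -A /dev/sdX`.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       9123
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
"""


def parse_attributes(text: str) -> dict:
    """Map attribute name -> raw value from `smartctl -A` style output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                attrs[fields[1]] = int(fields[9])
            except ValueError:
                continue  # raw value is not a plain integer
    return attrs


def suspicious(attrs: dict) -> dict:
    """Watched attributes whose raw count is above zero."""
    return {k: v for k, v in attrs.items() if k in WATCHED and v > 0}


if __name__ == "__main__":
    print(suspicious(parse_attributes(SAMPLE)))
```

    A clean table does not prove the drive is healthy, but pending or reallocated sectors that keep growing are a strong hint to replace it.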

    --
    Havoc Penington, the bane of my Linux desktop.
  65. [OT] Re:Overheat by eggy78 · · Score: 1

    Ha. Sorry dude. I doubt it's worth your time, and I'm way too lazy to ship it. It was only a $40 card to begin.

    I'll figure something out. Use it in another machine or see how it does with Windows 7.

  66. Just get hardware testers by Khyber · · Score: 1

    Half of your RAM issues won't be able to be diagnosed with any piece of software. No RAM checking software will keep tabs on the operating speed of the RAM. Ditto with a CPU tester, there's hardware and socket adapters to help you plug in CPUs and test them with hardware.

    My time spent in the hardware repair/replacement service has taught me that most software diagnostics just fall short. One place I worked for used a combo of Prime95 and some custom stress-testing software - almost every machine would pass those diagnostics but then we'd go to do a full hardware check or send the unit to burn-in and it would fail. If you don't have the dedicated hardware for checking other potentially faulty hardware, you're just going to play the shotgun game until you find the issue - that's a waste of time and money.

    --
    Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
  67. Not so simple. by Cylix · · Score: 1

    There are some fairly straight forward applications that several readers have mentioned.

    However, relying on software to determine a fault when no fault indicators are built into your motherboard is an inherently flawed logic.

    The vast majority of systems today are quite dumb and have no reporting. Even on more expensive systems this reporting is still not the most reliable method of troubleshooting hardware.

    That is not to say that software cannot be helpful in the troubleshooting process. It can be immensely useful if applied correctly with the right approach.

    Software used as a tool for isolation purposes can help verify and ferret out problems. Memtest is useful (not perfect) for finding faults within the memory subsystem (controller, memory and physical pathways). Stress tests and SMART data can be useful to isolate problems with hard disks or other faults within the disk subsystem.

    The approach is nearly always the same. Use a common set of tools to attempt to identify obvious flaws with the system (praying you have a board and combination of hardware which does more than just fault).

    In the end, identifying an unknown error is a combination of agitating specific areas within the system and attempting to elicit a fault under a controlled set of circumstances.

    When this fails... chuck parts at it.

    --
    "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
  68. Anonymous Coward by Anonymous Coward · · Score: 0

    Acquire a Best Buy Geek Squad's MRI disc, but for your sake do not work there. They contain many tools already mentioned with an automatic run procedure which will stress test and do much of the troubleshooting for you. It is designed to identify faulty hardware, so that's the answer to your question. Now just how to get a MRI disc...

  69. Mod up - Everyone buy one of these by bogie · · Score: 1

    If you work on PCs even infrequently this is a must-have tool. Yeah, a multimeter is great, but a) you need to know how to use it and b) you can push the probe into the wrong place and make a mess of things.

    With hardware it's usually a bad PSU, then bad memory, then bad caps.

    --
    If you wanna get rich, you know that payback is a bitch
    1. Re:Mod up - Everyone buy one of these by cbiltcliffe · · Score: 4, Funny

      With hardware it's usually a bad PSU, then bad memory, then bad caps.

      Then bad karma, then bad mojo.

      --
      "City hall" in German is "Rathaus" Kinda explains a few things......
    2. Re:Mod up - Everyone buy one of these by Ngarrang · · Score: 1

      If you work on PCs even infrequently this is a must-have tool. Yeah, a multimeter is great, but a) you need to know how to use it and b) you can push the probe into the wrong place and make a mess of things.

      With hardware it's usually a bad PSU, then bad memory, then bad caps.

      Add bad thermals to that mix. A suspect fan and too much heat can cause a lot of problems, from transient to Really Bad (tm).

      --
      Bearded Dragon
  70. ROM BIOS's POST by shentino · · Score: 1

    'nuff said.

  71. short answer: no by Eil · · Score: 1

    Once you strip it off any extra hardware (which with today's motherboards, with pretty much everything integrated, might not be an option) you are left with CPU, motherboard, graphics card, RAM, and HDD

    Think about what you're asking here. A basic desktop computer needs four things in order to run (not be useful, but just run):

    1. Power supply
    2. Motherboard
    3. CPU
    4. RAM

    Of these, the only one that can be cordoned off for a special test is the RAM. That's because the program testing the RAM can move itself out of the way to test a particular area. And even then, a failed RAM test does not always indicate bad RAM. All of these components are required in order to execute a single software instruction. To use a car analogy, you can't test an engine for proper performance without some minimum subset of its parts. If you suspect flaky hardware, you have to do what millions of others have done before you: start swapping parts or take the machine to someone who can.
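
    To make the "cordon off the RAM" idea concrete, here is a minimal sketch of the write-a-pattern, read-it-back loop that memory testers are built around. It only exercises an ordinary Python buffer, not physical RAM, and the names find_bad_offsets and StuckBitMemory are made up for illustration; the simulated stuck bit is an assumption so that there is something to detect.

```python
# Sketch only: exercises a Python buffer, not physical RAM.
# find_bad_offsets and StuckBitMemory are hypothetical names.

def find_bad_offsets(mem):
    """Write known patterns into mem, read them back, return bad offsets."""
    bad = set()
    # Fixed patterns: all zeros, all ones, alternating bits, then a walking 1.
    patterns = [0x00, 0xFF, 0x55, 0xAA] + [1 << bit for bit in range(8)]
    for pattern in patterns:
        for i in range(len(mem)):
            mem[i] = pattern
        for i in range(len(mem)):
            if mem[i] != pattern:
                bad.add(i)
    # Address-dependent pattern: catches cells that alias one another.
    for i in range(len(mem)):
        mem[i] = i & 0xFF
    for i in range(len(mem)):
        if mem[i] != (i & 0xFF):
            bad.add(i)
    return sorted(bad)


class StuckBitMemory:
    """Simulated faulty memory: some bits at one offset always read as 1."""

    def __init__(self, size, bad_offset, stuck_mask):
        self._cells = bytearray(size)
        self._bad_offset = bad_offset
        self._stuck_mask = stuck_mask

    def __len__(self):
        return len(self._cells)

    def __setitem__(self, index, value):
        self._cells[index] = value

    def __getitem__(self, index):
        value = self._cells[index]
        if index == self._bad_offset:
            value |= self._stuck_mask  # the fault: these bits never read 0
        return value


print(find_bad_offsets(bytearray(1024)))                 # healthy buffer
print(find_bad_offsets(StuckBitMemory(1024, 300, 0x04)))  # simulated fault
```

    The same trick does not transfer to the CPU or motherboard, which is the point above: the tester itself needs them working in order to run at all.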

    And the bit about Windows crashing "too often to blame Microsoft" is just laughable. A buggy device driver can easily make any software problem look like a hardware one. I hope you've at least fired up an Ubuntu or Knoppix live CD before firmly blaming the hardware in this case.

  72. QuickTech or QT by iq+in+binary · · Score: 2, Interesting

    My shop uses it, works pretty well. A full scan can take up to 6 or 8 hours (we set up hardware diags before leaving for the night, and in the morning on a 24-channel KVM), but it is THOROUGH. VRAM, RAM, HDD, CPU, everything is tested and thoroughly. First step should be testing the PSU, then running QT.

    --
    Of all the Universal Constants, here's one I know: Nice guys finish last ;)
    1. Re:QuickTech or QT by Anonymous Coward · · Score: 0

      quicktech++

  73. PC-Check by Eurosoft UK by trparky · · Score: 1

    The only one that I know of is PC-Check by Eurosoft UK. Loads as its own bootable program (OS-independent) and tests everything you can think of when it comes to a computer.

  74. UltimateBootCD by Anonymous Coward · · Score: 0

    http://www.ultimatebootcd.com/

  75. Prime95 by Anonymous Coward · · Score: 0

    Prime95 was pretty awesome for testing CPU.

    I had a quad cpu with 3 failed cores, BSOD terrorizing my system for a month.. till i ran Prime95.

    3 core failures were ID'd in less than 5 mins.

  76. this is a stupid question by ILuvRamen · · Score: 1

    How do you test if the CPU is bad? Umm...maybe the fact that it won't boot and you can listen to the POST beep codes maybe? Duh. And if it doesn't beep, use a probe card. That's what I do. It displays a 2-3 digit hex value on a little display after you put it in an empty PCI slot. It tells you via the code what part of the POST it's stalled at if it isn't booting. Like if it's FF, it's usually the CPU or power and if it's 41 that's usually the BIOS or a motherboard problem if I recall (for Award BIOS at least)
    As for testing other parts that will at least let it boot when they're broken, I boot into a BartPE CD or an Ubuntu Live CD and see what hardware is listed as not working. Between those and the probe card I can diagnose a broken anything.

    --
    Google's Super Secret Search Algorithm: SELECT @search_results FROM internet WHERE @search_results = 'good'
  77. Only a couple tools needed. by bwave · · Score: 3, Informative

    We have repaired in excess of 50,000 machines, and I'll tell you the tools needed are very simple. The process we follow is: open the machine, dust it with an air compressor (with a humidity drier; you can pick up a 4-gallon one with a drier at Sears for about $99, which saves a lot of money on $3-6 cans of air) and a central vacuum system (a shop vac will work), then inspect the motherboard and video card for blown caps. Take off the CPU fan and inspect the compound; if it is home built, lord only knows what you'll find. Test the power supply with a digital power supply tester (one of the $12 LCD ones). If it tests good, still open the power supply and look for blown caps. (Many will have blown caps and be causing sporadic problems the simplistic tester will not catch.) See if the machine will power on and boot. If it doesn't power on, or hangs on POST, remove the modem, and the NIC if it's a separate card; when these are blown by lightning they will cause no POST. Ensure the hard drive is mounted properly with 4 screws installed; with fewer than that, the vibrations will cause the drive to go bad. (I don't care what operating specifications you show me, or what G-rating the drive has, this is the case.)

    Then test memory with Memtest86+ 1.70, and the hard drive with one of the 3 versions of SeaTools by Seagate. (Some versions will lock up on some video cards/chipsets. If you get a long string of bad sectors on an HDD bigger than 320GB, beginning about 2/3 of the way through the drive, test with a different version to be sure, as there is a sector count issue with some large HDs.) The 3 versions are an older GUI one, the newest GUI one, and the text version. If you have even 1 bad sector, replace the drive.

    We do the above process on EVERY machine before we attempt to do anything else; it is well worth the couple of hours it takes.

    If you make it this far, then 99% of the time your problem is malware/viruses. Run Combofix, look for files not removed by it, boot with Ultimate Boot CD (the WinPE-based one) or something like Knoppix, and manually remove them. Search the Windows, Windows/System32, and Windows/System32/Drivers directories for files created in the past month; anything suspicious is probably malware. Rename those files. Look under Program Files, Program Files/Common, ProgramData, and Users/UserName/ApplicationData for suspicious directories and rename/delete them; these are where your AlphaAntivirus, Windows Police Pro, UltimateAV, etc. like to hide. Boot back into Windows, run HijackThis, remove any suspicious entries, and reboot. Anything left? If so, remove it manually with the boot CD. In Add/Remove Programs, remove all unnecessary programs. Then run CWShredder, Malwarebytes Antimalware, Spybot, and AVG Antivirus. (Feel free to substitute other legitimate antimalware/antivirus tools in place of these, but we find these work best for us.)

    Install all Windows updates, update all system drivers, and try browsing the internet for 2 or 3 minutes. If all seems OK, reboot one last time and be sure you can still browse the internet.

    All done! This fixes pretty much everything, other than any specific issue your customer may have complained about. Also, be sure to check the amount of RAM. Here is what we recommend; otherwise, with the latest service packs etc., the machine will seem sluggish:

    Windows 95 - 96MB+, Windows 98/ME - 196MB+, Windows 2000 - 384MB+, Windows XP - 640MB+, Windows Vista Home Basic - 1GB+, Windows Home Premium - 2GB+, Windows Vista Ultimate/Windows 7 - 4GB+

    If you don't give the machine back with this amount of RAM, your customer will swear the machine is slower than when they brought it to you. It doesn't matter how untrue that is, how much malware you removed, or that the machine didn't even get into Windows before!

    CPUs and video cards rarely go bad unless abused. Normally, you'll find an under-rated or defective power supply to blame.

    Also, if you're working with a notebook, be sure to dust the exhaust/intake vents. If it still powers down or locks up, you need to disassemble it and recompound the CPU/video chipset with Arctic Silver 5. The other thing is that power problems, mouse lockups, etc. are many times caused by bad batteries; try running without a battery installed, just the AC adapter. Any battery older than 2 1/2 years is suspect. And of course, look for broken DC power jacks.
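
    The "files created in the past month" search is easy to script. The following is a rough sketch rather than the shop's actual procedure: recent_files is a made-up helper name, the 30-day window is just the "past month" from the text, and which timestamp counts as "creation" is an assumption that depends on the platform (see the comment in the code).

```python
import os
import time


def recent_files(root, days=30, now=None):
    """Yield (path, age_in_days) for files under root created within `days`."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
            # Assumption: use the birth time where the platform exposes it.
            # On Windows st_ctime is the creation time; on Linux it is the
            # last metadata change, which is only an approximation.
            created = getattr(st, "st_birthtime", st.st_ctime)
            if created >= cutoff:
                yield path, (now - created) / 86400


if __name__ == "__main__":
    # Point this at the directory you want to inspect, e.g. a mounted
    # Windows/System32/Drivers folder when booted from a rescue CD.
    for path, age in sorted(recent_files("."), key=lambda item: item[1]):
        print(f"{age:6.1f} days  {path}")
```

    It only lists candidates; deciding what is suspicious, and renaming rather than deleting, is still a human job.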

    1. Re:Only a couple tools needed. by bwave · · Score: 1

      In case you're curious, we have 1 full-time person who works in a separate room doing the dusting, inspection, and memory and hard drive testing. We have a bench spot with an air hose/central vacuum, along with a power cord for power supply testing. We then have 8 other spots for him to run memory/hard drive tests. Normally all of these spots are full. Machines then move to the next step.

      We have 1 full-time tech who works exclusively on notebook/LCD/TV hardware issues. He has a main desoldering/soldering station for DC jacks, cap replacements, and notebook motherboard replacements, plus 4 additional bench spots for notebook testing. He also does most of the notebook bad-to-new hard drive cloning, and some of the Windows rebuilds on those. He will also test customer-returned parts such as hard drives or memory in one of his 3 desktop testing machines.

      We then have 1 full-time tech who builds/tests new desktop computers and handles data backup/recovery procedures and desktop virus cleanup. We have another full-time tech who does desktop/notebook virus cleanup, along with placing orders with vendors, receiving shipments, and RMA processing. Both of these techs work on about 5 machines simultaneously. We also have our full-time bookkeeper work on about 2 to 3 notebook virus cleanups at a time while filing, answering the phones, etc.

      We have 1 full-time sales manager and one part-time repair order taker. They usually test customer-returned LCDs, video cards, external/flash hard drives, printers, etc. They also test every new notebook and complete the Out of Box Experience on all new notebooks we sell, along with AV software installation and data transfer from old notebooks to new. We will not sell a new notebook without testing the RAM and hard drive first, and doing a cleanup/OBE finalization.

      Most machines have a bench time of about 6 to 8 hours. We bill only for "interactive time": time in which we are physically working on the machine or operating the mouse/keyboard. Typically we bill for about 45 to 90 minutes. Labor is billed at $1.90 per minute. Usually we take in about 20-30 machines a day. Typical turnaround time is 3 to 10 business days, depending on volume.

    3. Re:Only a couple tools needed. by cmdean · · Score: 1

      No points or I'd mod you up to +5. What a detailed, usable response. Well done!

  78. Depends on the failure. by Anonymous Coward · · Score: 0

    PC Hardware diagnostics are one of those dark arts that take time to get good at. I just moved from one flooring facility to another and people at the office thought "gee, he fixes one or two $1000 laptops a day for restock". Nope; I run a dozen-two dozen units at once. Takes 15-25 minutes of human-time to do a machine if you're set up right using software diagnostics.

    The big thing is to have a hardware qualification process that catches 90-95% of failures. For HP laptops:

    Full run of PC-Doc 6 minus loopbacks (loopbacks are meaningless) (bootable disk/usb key)
    Memtest86 (bootable disk)
    24HR loadtest (bootable disk)
    Check each feature manually (via windows)
    *Check manufacturer's website for BIOS updates and install if critical
    Restore Software from ODD/HDD

    If it passes that, the hardware is good and won't fail any time soon.

    For testing advanced video functions for artifacting/bad registers, I've found that any 64K demo makes a great load tester and is sufficient for testing a video card over a period of several hours. Some are more intense than others, so I have a few different ones for different units; I loop them, go home, come back, and if the unit passes that, it passes the load test.

    It takes more time if you've got to do a part replacement or a tear-down diagnostic, meaning the unit won't boot software or the issue requires you to pull parts one by one to isolate it (such as figuring out whether your SATA disk is dismounting due to a faulty USB cable, because the unit has a crappy implementation of eSATA and the port doubles for USB).

    You aren't going to catch the weird failures, but then again, those are rare and customers aren't always going to complain.

    FYI: it's handy to have a camera and a *nice* color printer for warranty claims : - ) . Mark problems with stickers; techs appreciate it.

  79. Why do they have to be specialist utilities? by schizz69 · · Score: 1

    I find the best stress tests for PCs are the actual programs that are run. For instance, WinRAR is great for stressing memory and CPU; there's nothing like trying to compress a 4.7GB+ ISO to stress that area. And for video, why not crank Crysis up to max specs and run the built-in benchmark? These of course don't take the place of proper hardware diagnostics, and if you lack the equipment, take your box to a mate's (ENSURE THEY HAVE A COMPATIBLE HW CONFIG). If you lack the knowledge, take it to a computer repair guy.

  80. Get down, get down for BootZilla by Anonymous Coward · · Score: 0

    BootZilla offers a boot cd option with the ability to test memory (memtest86 & memtest86+), hard drive (hdat2, mhdd32, DFSee), Video memory (Video Memory Tester), and I'm certain I'm forgetting something - http://www.bootzilla.org

  81. Hiren Boot CD by psyph3r · · Score: 1

    Hiren Boot cd 10.0. Has lots of hardware testing programs that you can boot to and run. http://www.hiren.info/pages/bootcd. I use it often.

  82. Progress in reverse by Archtech · · Score: 1

    Between 1979 and 1985, I was directly involved in remotely diagnosing hardware and software faults on DEC's VAX computers. To start with, they were quite big machines - definitely server class in today's terms - the size of a row of refrigerators with the old hard drives the size of washing machines. And of course they cost a lot more than PCs. Nevertheless, their operating system, VMS (now OpenVMS) was specifically designed to be portable across a very wide range of computers, from PC equivalents to near-mainframes and clusters. By 1985 there were MicroVAXes slightly larger than modern PC system boxes, soon to be followed - with the addition of bitmapped graphics monitors - by VAXstations that were very similar to the kind of big desktop PCs many of us still use today.

    So what's the point of this little excursion down memory lane? Well, right from its inception in the mid-1970s, VMS had built-in error checking and logging, which was soon exploited to provide very sophisticated and accurate diagnostic software. A competent sysadmin would check the summary error log from time to time, and zoom in on any developing patterns. The fully detailed level of the error log was mind-boggling, with complete dumps of all the registers and data for every single error - of which there might of course be thousands. The VMS engineers soon produced software to automate error log and crash dump analysis, so it was often fairly easy to see which hardware component was at fault. (One of the reasons we disliked Unix, which was becoming popular as a cheaper alternative to VMS, was its complete lack - at that time - of any comparable diagnosis features).

    Thus, when I sat in front of my VAXstation 20 years ago, I could at any time call up the error log at any of a number of levels of detail, and analyze it or a crash dump file in order to see if any hardware units were generating errors that might be significant. There were also standalone and online diagnostic programs, although system engineers working to diagnose a flaky computer often preferred to load up the machine with a lot of specially designed "exerciser" programs that stressed the hardware, as they could then avail themselves of VMS' built-in error logging features.

    Whenever one of my PCs acts up nowadays, I find myself missing VMS' many troubleshooting features. The Windows event viewer seems carefully designed to collect as much useless information as possible, without ever catching hardware errors and the like. How often has a PC crashed due to a hardware fault, rebooted, and then - when I examined the event log - shown nothing at all except the restart?

    It seems strange that Microsoft missed the opportunity to copy this useful facility from VMS, when so much else was incorporated in Windows NT and its successors. One salient difference is that anyone writing a VMS device driver was encouraged to include error-logging, whereas the huge majority of Windows drivers seem to lack anything of the sort. The assumption seems to be that PCs work most of the time, and if they crash or work unreliably then the solution is to reboot - or, if the problem is persistent, buy a new PC.

    --
    I am sure that there are many other solipsists out there.
    1. Re:Progress in reverse by Mr+Z · · Score: 1

      One salient difference is that anyone writing a VMS device driver was encouraged to include error-logging, whereas the huge majority of Windows drivers seem to lack anything of the sort. The assumption seems to be that PCs work most of the time, and if they crash or work unreliably then the solution is to reboot - or, if the problem is persistent, buy a new PC.

      Unless they're selling into a corporate environment where the corporation has standardized on specific hardware, there's no incentive for manufacturers to provide such error logging. In fact, there's incentive not to. In the consumer market, if a peripheral starts logging errors, the consumer is likely to replace the component with a competitor's offering, not a replacement from the same vendor. It's a great way to throw business to your competitors. There's far more incentive to just silently fail and hope you never get found out.

      For this to show up in Windows, Microsoft would have to make it a driver certification requirement. Manufacturers would certainly push back on that. I think it'd be a good thing, though, because it'd make it much easier to determine which manufacturers ship solid products vs. which ones are crap. Even though I don't use Windows (except at work), even I'd benefit, because all the Windows users' experience would get more data out there on what's good and what's bad.

  83. Three basic failure mechanisms.. by elteck · · Score: 1

    Unfortunately most home-use PSUs have no self-test built in, so you need to verify them yourself. Are the voltages in range? Also under heavy load? Almost all other units in our PCs do have a POST (power-on self-test). So if a PC boots fine, what remains are mainly intermittent failures, which are a lot harder to uncover.

    Apart from hard drive failures, the three most common types of failures you'll find on a motherboard are:

    1) Faulty supply capacitors. These cause supply noise or, even worse, make the supply on the board itself unstable. The small ceramic caps on a board rarely fail, but the bigger elcos (electrolytic capacitors) often do. The result can be that digital communication on the buses fails, or parts become unstable. Not so easy to diagnose by software. A clear sign is when more than one part on your board seems to fail. With a scope you can check for supply noise. Some bad elco series were easy to spot: their packages were ballooning and they were leaking some brown stuff.

    2) Intermittent connections. This happens more often than you would guess; certainly in modern PCs, where big (memory) chips are soldered down on ball grid arrays, failing connections do occur. To diagnose it, you could try tapping on the board with the handle of a screwdriver while your PC is running some heavy program. But that does not reveal all bad connections in the ball grid arrays. More effective is "cold spray": cooling the suspected parts down will widen the gaps of bad connections.

    3) Faulty chips. I think this is actually rare; chips die pretty hard (even memory), and most parts suffer from failing connections instead. But if a chip is on the edge of failing, that is revealed by heating it up. Give it a stress test: place your PC in a warm place or in direct sunlight and run a heavy load. This is probably best captured by software tools.
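
    For what "captured by software tools" can look like, here is a small sketch of the consistency-check idea behind stress testers: make every worker repeat the same deterministic, heavy computation and treat any disagreement as a red flag. This is not how Prime95 or any other named tool is implemented; digest_chain and consistency_check are made-up names, and the sketch assumes CPython, where hashlib should release the GIL while hashing large buffers so the threads can keep several cores busy.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def digest_chain(seed: bytes, rounds: int, block_size: int = 1 << 20) -> str:
    """Deterministic CPU- and memory-heavy work: hash a large block repeatedly."""
    block = seed * (block_size // len(seed))
    digest = b""
    for _ in range(rounds):
        # Each round depends on the previous digest, so one flipped bit
        # anywhere changes the final answer.
        digest = hashlib.sha256(block + digest).digest()
    return digest.hex()


def consistency_check(workers=None, rounds=50, seed=b"slashdot"):
    """Run the same computation on several threads and compare the answers."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda _: digest_chain(seed, rounds),
                                range(workers)))
    # On healthy hardware every worker must produce the identical digest.
    return len(set(results)) == 1, results


if __name__ == "__main__":
    ok, results = consistency_check()
    print("consistent" if ok else "MISMATCH: suspect CPU, RAM or power", results)
```

    Looping this for hours in a warm room is the software half of the heat test; a mismatch says something is marginal, not which part it is.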

  84. I can help by stinkbomb · · Score: 1

    I'm typing this because Slashdot said so. Passerby were amazed by the unusually large amounts of blood. Try this: http://lmgtfy.com/?q=pc+diagnostic+hardware

  85. Same PSU? by bigtrike · · Score: 1

    The advice is for power supplies which are not plugged into the wall, which are still dangerous. Every PC power supply I've run across will run on 100-250V, suggesting it's the same power supply for all regions, with the similarly sized capacitors.

    1. Re:Same PSU? by Mr+Z · · Score: 1

      Every PC power supply I've run across will run on 100-250V, suggesting it's the same power supply for all regions,

      Most modern PSUs work that way. Older ones were 110V or 220V only, or had a switch.

  86. Re:I wish you had asked this question 2 weeks ago. by mdwh2 · · Score: 1

    Don't forget the lead!

    For months I was struggling to explain why my computer was randomly rebooting or switching off. Then eventually it didn't turn on at all - thankfully I had an older computer next to it, and I suddenly thought to try the same lead in it, and it didn't work either.

  87. Faults like power supply faults aren't heat relate by Calyth · · Score: 1

    One of the most overlooked computer problems is a faulty power supply that cannot deliver anything near what the specs say.

    The CPU may be the brains of the computer, but the power supply is the heart, supplying vital electricity to all the components. Too often, I've worked on machines where as soon as I plug in a cheapo tester, nothing lights up as the proper voltage, yet the machine still manages to "run".

    That could possibly be monitored with software, if the BIOS supports voltage monitoring.
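
    Once you have the readings, from the BIOS hardware monitor screen or a multimeter, the check itself is trivial. A minimal sketch, assuming the usual ATX tolerances of +/-5% on the +3.3V, +5V and +12V rails and +/-10% on -12V; the function name and the example readings are made up, and nothing here talks to a real sensor.

```python
# Nominal rail voltage -> allowed relative tolerance.
# Assumption: ATX tolerances of 5% on positive rails, 10% on the -12V rail.
ATX_TOLERANCE = {3.3: 0.05, 5.0: 0.05, 12.0: 0.05, -12.0: 0.10}


def out_of_range(readings):
    """readings maps nominal voltage -> measured voltage.

    Returns {nominal: (measured, low, high)} for every rail outside tolerance.
    """
    failures = {}
    for nominal, measured in readings.items():
        tolerance = ATX_TOLERANCE[nominal]
        # sorted() keeps low <= high for the negative rail as well.
        low, high = sorted((nominal * (1 - tolerance), nominal * (1 + tolerance)))
        if not (low <= measured <= high):
            failures[nominal] = (measured, low, high)
    return failures


# Hypothetical readings typed in by hand: the +12V rail is sagging.
readings = {3.3: 3.31, 5.0: 5.08, 12.0: 11.2, -12.0: -11.5}
for nominal, (measured, low, high) in out_of_range(readings).items():
    print(f"{nominal:+.1f}V rail reads {measured:.2f}V, "
          f"expected {low:.2f}V to {high:.2f}V")
```

    Readings taken at idle can look fine while the supply sags under load, so it is worth repeating the check while the machine is being stressed.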

  88. Check out SunVTS by inimeg156 · · Score: 1

    The tool has support for the majority of system components. It does not run on Windows, but you can download a boot CD/USB image and do your testing. http://www.sun.com/oem/products/vts/

  89. From TFA; by Old+Sparky · · Score: 1

    "crash way too often to blame it all on Microsoft"?!?

    WTF? Over.

  90. Why not try this by Anonymous Coward · · Score: 0

    I don't know why you'd want to use software to diagnose a hardware problem. If the hardware won't work it won't be able to run your software.

    You should try a pci diagnostic card like this:
    http://www.uxd.com/phdpci.shtml

  91. Vendor Specific by Anonymous Coward · · Score: 0

    Stressing a computer may not have the intended results. Most vendors provide a toolkit for testing. It is often tightly coupled to the motherboard and BIOS. Every vendor has their own utilities, but they typically involve the ability to utilize the internal sensors of most devices. This means that you can see if the CPU is running hot, a fan is not spinning at prescribed RPM, voltage regulation to the CPU is varying unexpectedly, or the like. They often are also able to look into BIOS level logs to see if there are any failures there.

    Modern hard drives typically have their own internal diagnostics, also (SMART and derivatives). A media failure is typically indicated by slowing access times (as the drive relocates more data from failing sectors). However, a much more common problem is mechanical failure, which is often sudden and catastrophic. Still, there are tools for newer or high-end disk diagnostics that do not require a stress test.
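
    Reading those drive diagnostics can be scripted too. Below is a sketch that picks the sector-related counters out of text shaped like the attribute table printed by smartctl -A (from smartmontools). The sample text is invented for illustration, and the choice of attributes to watch (5, 197 and 198) is an assumption about which raw values matter for failing media, not an official rule.

```python
# Assumption: these attribute IDs are the ones worth flagging.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}

# Invented sample in the shape of a `smartctl -A` attribute table.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
  9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       10234
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""


def suspicious_attributes(smartctl_output: str) -> dict:
    """Return {attribute_name: raw_value} for watched attributes above zero."""
    flagged = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        if int(fields[0]) not in WATCHED:
            continue
        try:
            raw = int(fields[9])
        except ValueError:
            continue  # raw value in a format this sketch does not handle
        if raw > 0:
            flagged[fields[1]] = raw
    return flagged


print(suspicious_attributes(SAMPLE))
```

    None of this helps with the sudden mechanical failures mentioned above; it only surfaces media that is already degrading.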

  92. Ultimate boot CD by Sheik+Yerbouti · · Score: 1

    www.ultimatebootcd.com

  94. The bin by Kittenman · · Score: 1

    I tried to fire up an old Windows 98SE box at the weekend. It wasn't talking to my screen, and plugging in another box assured me that the screen was fine.

    I'm tossing that baby. Life's too short. And I can get Pharaoh, Caesar 3 and Starcraft working on XP with a little work...

    --
    "The greatest lesson in life is to know that even fools are right sometimes" - Winston Churchill
  95. 3d stress testing in Linux by Anonymous Coward · · Score: 0

    I'm interested to know if anyone has found a good, lightweight (300MB) 3d stress test for Linux, akin to Furmark (glxgears does not count). I've been referred to running some modern video games under Wine. 5GB of game files is definitely overkill for batch-testing dozens of machines. They will be disk imaged over the network and unattended. Ideally it should be scriptable to launch from command line with arguments or a parameters file and optionally output some data periodically to STDOUT/file.

  96. PSU by obarthelemy · · Score: 1

    The components most likely to fail totally or partially are
    1- cables and connections, either badly connected or broken inside the plastic sleeve
    2- PSU

    Over the years, I've never had a MB or Vidcard fail me. A CPU, once. But I had the funniest things with cables:
    - power button turning flaky, causing resets
    - short somewhere in the keyboard or its cable, causing PC reboots
    - network failing every day during lunch. Maintenance staff strongly suspected (gaming?)... turns out the cable was broken inside its sleeve, and the midday sun, shining directly on it, cut the connection.

    --
    The Cloud - because you don't care if your apps and data are up in the air.
  98. Re:F$F Shill by Anonymous Coward · · Score: 0

    Sup Sheryl Crow?!
