Slashdot Mirror


Lessons In Hardware / OS Troubleshooting

Esther Schindler writes "We like to imagine that every Microsoft OS installation will work just as well as the company promises. When things don't work out, identifying and remedying the cause of failure can be time-consuming and frustrating. This lesson in how to determine why Windows 7 didn't install may help you troubleshoot a problem of your own, and save you from a Lost Weekend. Maybe you'll find this account useful all on its own. But the real key here is that the author is Ed Tittel, who's written over 100 books. If this hardware geek spends days solving a CPU-meets-Windows 7 problem, what chance do mere mortals have?"

16 of 236 comments

  1. What I love here is the part where he by jra · · Score: 2, Interesting

    just rolls right on past the fact that, if what he was installing was -- oh, say -- a Linux distribution, he wouldn't have an opaque "I'm uncompressing files" thermometer, he'd have real progress status messages, with, y'know, *parameters* and stuff, and -- unlike me this morning with my boss's iPhone -- a hope of actually figuring out what's broken.

    But he's apparently completely blind to the fact that that's the *real* problem here.

    "We'll just make fault-tolerant users", indeed

    1. Re:What I love here is the part where he by Jezza · · Score: 4, Interesting

      I temper this approach with the "this is easy as hell and very quick". So even if I think it is something, if there is something else it could be that's really quick to try I'll ignore my "brilliance" and try that. What is amazing is how often there isn't actually one problem, but two. Also helps if you have a similar working system that you can take the components from (so you know that this or that doodah actually works).

  2. 100 books? by cranesan · · Score: 4, Interesting

    I'm a little suspicious; how much of an expert can you be, writing 100 books on a variety of subjects?

    Reminds me of a tech instructor I had who proudly informed the class he teaches oracle classes, mysql classes, sql server classes, cisco classes, juniper classes, .net development classes, php, etc..... Yeah he couldn't answer any basic questions that strayed from the text book in front of us.

  3. Hardware issue? Welcome to Linux by ben_kelley · · Score: 4, Interesting

    If you have never had a hardware issue when installing Linux on a machine you must be very lucky.

    "Most things work fine" people tell me, which is true. The trouble is that the chances of you owning something that doesn't work are relatively high. (There's probably something from my statistics course that explains why that is, but I have so far managed to suppress that memory.)

    After having rebuilt a Mac with OS X, and rebuilt a laptop with Ubuntu 9.04, I was surprised at how smooth the Ubuntu install was. Of course that was until I wanted to use my webcam with Ubuntu. These kinds of problems get very difficult very fast in Linux. When 9.04 first came out there was a dependency problem that meant that you couldn't easily get some webcams working.

    To be fair, that problem is most likely sorted out now, and a non-Apple webcam would have needed a (very easy to install) driver on OS X as well. The point is, Windows and hardware generally work very well.
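
    The statistics aside above can be made concrete with a rough sketch. It assumes (purely for illustration) that each device works independently with the same probability; the 95% figure and the function name are hypothetical, not measurements of any OS.

```python
def chance_of_trouble(per_device_ok: float, device_count: int) -> float:
    """Probability that at least one device fails to work.

    Assumes every device works independently with probability
    per_device_ok (an illustrative simplification).
    """
    all_devices_ok = per_device_ok ** device_count
    return 1.0 - all_devices_ok


if __name__ == "__main__":
    # Hypothetical figure: each device has a 95% chance of working.
    for count in (1, 5, 10, 20):
        print(count, round(chance_of_trouble(0.95, count), 3))
```

    Under those assumptions, even when "most things work fine" per device, the odds that a whole machine has at least one problem part grow quickly with the number of parts.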

    1. Re:Hardware issue? Welcome to Linux by jedidiah · · Score: 4, Interesting

      There was another fellow that mentioned the idea of staying away from the top and the bottom.

      Avoid the dregs and the bleeding edge.

      That middle will probably be much more reliable under Windows and more likely to be supported on Linux (or even MacOS).

      No one cares enough about the dregs to support them under Linux or MacOS and the bleeding edge stuff is just too new.

      That approach does pretty well regardless of OS today and did pretty well 16 years ago too.

      The problem with "statistics" is that any given PC isn't really random. It's a reflection of its owner. It may be a dreg, a poster boy for bleeding edge gamer conspicuous consumption or something that's more moderate.

      "When 9.04 first came out" is covered by this rule actually.

      --
      A Pirate and a Puritan look the same on a balance sheet.
  4. Re:You're Kidding by Anonymous Coward · · Score: 1, Interesting

    More evidence he's a "script-kiddy": He uses Microsoft's "excellent" Windows 7 USB DVD Download Tool, instead of simply using diskpart to create a partition on the stick and copying the files over from the ISO. (In case there are more people reading this who don't know how to make a bootable USB flash drive with stuff that's already in Windows: diskpart is the command line disk partitioning tool in more recent versions of Windows. When it creates a partition on a clean drive/stick, it automatically writes a boot sector that will load "bootmgr", which will then proceed to boot from the device. That's all there is to it.)

  5. This happened to me in a production CPU. by UnifiedTechs · · Score: 3, Interesting

    Actually I had the exact same problem a few months ago upgrading a Dell server from Win2003 x86 to Win2008 x64. I suspected the CPU from the beginning, but I spent a few hours before the Dell tech agreed with me. They sent a replacement and it worked like a champ.

    This proves it has happened to a production Intel Core2Duo CPU at least once; I can't believe I was the only one.

  6. My lesson. by w0mprat · · Score: 4, Interesting

    Just last night I fixed my parents' computer in one of those long fixes that turns out to be the most fundamentally trivial thing. This is why this is not my main occupation.

    Basically they had a recently built custom Windows 7 + Ubuntu PC that had begun randomly shutting down, often minutes after it had been powered up.

    OK, first thing: any obvious errors or circumstances? No, it would just randomly power off. Windows event logs showed kernel power events, no specific driver, service or app crashing anywhere. Linux was the same. Not a thermal issue: CPU + GPU temps were nominal and a stress test didn't immediately cause a crash.

    Suspecting a power or a motherboard issue, I first checked and re-seated things internally. It still occurred.

    Removed extraneous cards, connectors and drives. No result. It would even happen sitting in BIOS setup. Have ruled out a number of problems.

    Checked for electrical shorts, poor voltage etc.

    Dying power supply? Overloading or shorting? Nope, all voltages nominal, and it was brand new.

    I was about to try a spare power supply when a thought occurred to me...

    It's almost as if the reset switch was being hit, but it wasn't even close to being knocked at any point and the switch otherwise worked fine. Then I knocked the case and the system reset. Yep, the reset switch was faulty, jolting it even slightly would reset. Who needs a reset switch since Vista anyway? Unplugged it from mainboard. Solved.

    I decided not to even joke about charging my Dad for two hours of my time.

    Chances are if he paid someone to do it they wouldn't necessarily have found the fault that quickly, and he'd be hundreds of dollars out of pocket.

    The lesson in troubleshooting? Um... I'm not sure.

    --
    After logging in slashdot still does not take you back to the page you were on. It's been that way for 20 years.
    1. Re:My lesson. by Anonymous Coward · · Score: 1, Interesting

      Er, how exactly did you get "kernel power events" when the reset switch was hit? Reset switches are dumb enough not to do any cute ACPI tricks; they just reset the machine, hard and directly.

    2. Re:My lesson. by girlintraining · · Score: 3, Interesting

      The lesson in troubleshooting? Um... I'm not sure.

      You did exactly what any computer tech should: Check the most common reasons for failure, then move to the edge cases. A faulty switch is rare. Swapping individual components out would have eventually narrowed it down to the case itself. Two hours sounds about right for a competent technician to run down the list to get to the point where that would be a likely cause of failure.

      --
      #fuckbeta #iamslashdot #dicemustdie
  7. Re:Top 7 problems with Windows 7 by caspper69 · · Score: 2, Interesting

    He was my professor a couple of semesters ago. I can vouch for that!

    Hell of a nice guy, and pretty talented to boot though.

  8. Re:You're Kidding by Gadget_Guy · · Score: 2, Interesting

    Sure, he replaced all the other parts of the system before he replaced the CPU, but he already had those other parts on hand

    Correction to myself. He did buy another motherboard because he had already had problems with it previously. If it had been problematic before then it seems reasonable to think that it may have been causing this issue on the Win7 install.

  9. Re:Sooooo by jimicus · · Score: 2, Interesting

    It would never have helped him - he was using an engineering sample CPU, for heaven's sake!

    Having said that, I'm a Linux admin and it causes me no end of frustration when I need to troubleshoot something on Windows and I am painfully reminded that:

    - The event log is a PITA to browse through, because you have to double-click on specific events to see the detail. Search doesn't work very well when you're not entirely sure what you should be searching for.
    - Application software frequently doesn't write to the event log. If you're lucky it keeps its own separate log, if you're unlucky it was written by someone who thinks a log is what you get when you chop down a tree. (How the Hell any bugger ever troubleshoots during the development process I have no idea. Unless the dev build does create logs but some arse-head middle manager decided to turn them off in the production version).

  10. Re:actually by Sycraft-fu · · Score: 2, Interesting

    I love that idea and tried to do it at work to help myself learn more Linux, but I just couldn't. Part of the problem was that I'd drop back to Windows when Linux was being a pain and just not go back to Linux since there was nothing Linux did that Windows didn't. The other problem is that a large part of my job involves running various VMs on my system and as anyone can tell you, running multiple VMs in parallel that hit the disk hard on consumer hardware is a world of pain.

    You are correct though that on modern hardware, it works great. Part of it is just that modern CPUs are so fast, another part is that VM software has improved a lot. However an even bigger part is that modern hardware has special VM support. If you get a processor with VT-x or AMD-V, it helps. Get one that supports VT-d or AMD-Vi and nested page tables and damn, you are talking near native speeds to within a couple percent in most cases.

    A single VM can run in such a way as to fool you into thinking you are running on native hardware in most situations.

  11. The lesson here is that people still don't have... by sarkeizen · · Score: 3, Interesting

    ..a clue about how computers work. Even experienced Windows professionals.

    I mean, this guy has a 32-bit OS working and moves to a 64-bit OS... am I following this OK? The 32-bit install presumably went well on the hardware and the 64-bit install fails.

    So I grok his first attempts which are replacing the install media once. Seems like a reasonable assumption (some bit out of the billions on the DVD image just happened to be flipped the wrong way). From there though he starts to lose me. The motherboard is perhaps plausible but you would have to be assuming some rather significant difference in hardware support between the 64bit and 32bit systems. From there? RAM is 64bit how? Or even my HD?

    I think the most significant thing to learn here is twofold.

    i) People - even experienced computer professionals - treat computers like they are magic. Like there is no real science behind how they work. Clearly this guy was replacing parts based on some "experiential weighted average" with regard to how likely they are to cause a "weird" problem.

    ii) When A. C. Doyle said "When you have excluded the impossible" he neglected to state that the *order* in which one does so is significant. Eliminating things in order of their apparent relation to the problem (i.e. all the things for which 64 bits makes a difference) and (in a business environment) with respect to cost (i.e. Replacing a CPU is often a cheaper test than replacing a motherboard wrt labour) will likely fix your problem sooner than just going for the "usual suspects".
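
    As a rough sketch of point ii), the ordering can be written down as a sort: suspects that the 32-bit to 64-bit change could plausibly affect come first, and cheaper tests break ties. The `Suspect` type, the `triage_order` helper and every relevance flag and cost number below are hypothetical illustrations, not figures from the article.

```python
from dataclasses import dataclass


@dataclass
class Suspect:
    name: str
    related_to_change: bool  # could 32-bit vs 64-bit plausibly matter here?
    test_cost: int           # rough effort to swap or test (arbitrary units)


def triage_order(suspects):
    """Related suspects first, then cheapest test first within each group."""
    # `not related_to_change` is False for related parts, and False sorts first.
    return sorted(suspects, key=lambda s: (not s.related_to_change, s.test_cost))


# Hypothetical flags and costs, chosen only to illustrate the ordering.
suspects = [
    Suspect("motherboard", True, 4),
    Suspect("RAM", False, 1),
    Suspect("hard disk", False, 3),
    Suspect("install media", True, 1),
    Suspect("CPU", True, 2),
]

if __name__ == "__main__":
    for suspect in triage_order(suspects):
        print(suspect.name)
```

    With these made-up numbers the CPU gets tested before the motherboard, and RAM and the hard disk fall to the end despite being cheap to swap.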

    Aside: I've had two cases where I found a CPU issue. One was very similar to this - crashing during a Windows 2000 install - often at the same place. The problem I had was actually thermal - the heatsink was reversed leaving the thermal patch making minimal contact with the heat spreader. Somehow I figured that out without replacing everything else first.

  12. Re:In all fairness by tpstigers · · Score: 2, Interesting

    Actually, in the end I blame HP (the manufacturer of the box in question). Very few of the devices in the box were supported out-of-the-box. When I first tried the install, I ended up with bad video, no sound and no peripherals. The later install (with the wireless card) installed beautifully, I assume because it had access to Windows Update.