Lessons In Hardware / OS Troubleshooting
Esther Schindler writes "We like to imagine that every Microsoft OS installation will work just as well as the company promises. When things don't work out, identifying and remedying the cause of failure can be time-consuming and frustrating. This lesson in how to determine why Windows 7 didn't install may help you troubleshoot a problem of your own, and save you from a Lost Weekend. Maybe you'll find this account useful all on its own. But the real key here is that the author is Ed Tittel, who's written over 100 books. If this hardware geek spends days solving a CPU-meets-Windows 7 problem, what chance do mere mortals have?"
He has issues with an "unsupported and unwarranted engineering sample CPU from Intel" with Windows 7... and Windows 7 is of course to blame according to the OP.... *roll eyes*
We like to imagine that every Microsoft OS installation will work just as well as the company promises.
Actually around here people like to imagine that every MS OS installation will miserably crash, because then they strut around feeling good about using Linux.
This is front-page news for Slashdot now? Here's the sum total of TFA:
Wow, color me impressed!
How are "mortals" supposed to figure it out? I guess they buy a PC from Dell because everything in that article qualifies as "no duh" for system builders.
"What do you despise? By this are you truly known." --Princess Irulan, Manual of Muad'Dib
just rolls right on past the fact that, if what he was installing was -- oh, say -- a Linux distribution, he wouldn't have an opaque "I'm uncompressing files" thermometer, he'd have real progress status messages, with, y'know, *parameters* and stuff, and -- unlike me this morning with my boss's iPhone -- a hope of actually figuring out what's broken.
But he's apparently completely blind to the fact that that's the *real* problem here.
"We'll just make fault-tolerant users", indeed
I'm a little suspicious; how much of an expert can you be, writing 100 books on a variety of subjects?
Reminds me of a tech instructor I had who proudly informed the class he teaches Oracle classes, MySQL classes, SQL Server classes, Cisco classes, Juniper classes, .NET development classes, PHP, etc. Yeah, he couldn't answer any basic questions that strayed from the textbook in front of us.
If this hardware geek spends days solving a CPU-meets-Windows 7 problem, what chance do mere mortals have?"
You need first to show me a "mere mortal" who has, and uses, an engineering sample CPU. There is a very good reason why -ES parts are marked as such - because they have bugs. And those bugs will be a problem sooner or later.
So the whole sob story can be reduced to this. The guy runs software on a prototype hardware, and the software crashes. In other breaking news, dog bites man.
[X] Psst ... it's not your imagination, honey :-)
... oh, wait a sec ...
... but at least it blends ...
[X] I use BSD, you insensitive clod!
[X] In Soviet Russia, Windows crashes YOU!
[X] CowboyNeal is my runtime environment (Ewww!)
[X]
[X] If someone says "There's an app for that" one more time I'll throw a chair at them!
[X] Steve Ballmer posts on slashdot!!!
If you have never had a hardware issue when installing Linux on a machine, you must be very lucky.
"Most things work fine" people tell me, which is true. The trouble is that the chances of you owning something that doesn't work are relatively high. (There's probably something from my statistics course that explains why that is, but I have so far managed to suppress that memory.)
After having rebuilt a Mac with OS X, and rebuilt a laptop with Ubuntu 9.04, I was surprised at how smooth the Ubuntu install was. Of course, that was until I wanted to use my webcam with Ubuntu. These kinds of problems get very difficult very fast in Linux. When 9.04 first came out there was a dependency problem that meant that you couldn't easily get some webcams working.
To be fair, that problem is most likely sorted out now, and a non-Apple webcam would have needed a (very easy to install) driver on OS X as well. The point is, Windows and hardware generally work very well.
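On the suppressed statistics above: it is probably just the complement rule. Here is a minimal sketch, assuming every device in the box works independently and with the same probability. The 95% figure and the device counts are made-up illustration values, not measurements of any OS.

```python
def chance_something_breaks(p_works: float, n_devices: int) -> float:
    """Probability that at least one of n independent devices fails to work."""
    # All n devices work with probability p_works ** n_devices;
    # "at least one problem" is the complement of that.
    return 1.0 - p_works ** n_devices


# Hypothetical numbers: each device has a 95% chance of working out of the box.
for n in (1, 5, 10, 20):
    risk = chance_something_breaks(0.95, n)
    print(f"{n:2d} devices: {risk:.0%} chance of at least one problem")
```

So "most things work fine" and "something you own probably doesn't" can both be true once a machine has enough parts.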
I think it was the 404th time actually.
Actually I had the exact same problem a few months ago upgrading a Dell server from Win2003 x86 to Win2008 x64, I suspected the CPU from the beginning, but I spent a few hours before the Dell Tech agreed with me. They sent a replacement and it worked like a champ.
This proves it has happened to a production Intel Core2Duo CPU at least once, I can't believe I was the only one.
I didn't read the article or the summary. The title was plenty.
You are welcome on my lawn.
Dear self important guy who isn't near as good at computers as he thinks he is:
This may surprise you to learn, but all those defaults out there, all those specified values, all that kind of stuff, that isn't just arbitrary. See, many smart engineers and other folks worked on designing and creating all the hardware for your computer. A lot of extremely complex stuff went into it; modern computers are quite a marvel of engineering. As such, they discovered that certain tolerances, certain ranges work well. Outside of that, there can be problems. Thus the defaults because, well, default. They set them so that things are very likely to work in all cases.
As with most things, they aren't absolutes. They aren't things you can never exceed. In various circumstances you can go outside those normal ranges, sometimes by a little, sometimes by a lot. However, problems can potentially result. What problems those are and when they happen is not predictable. A system can appear stable but only crash on one app, or it can be stable for awhile then develop an instability.
Regardless, the first step to troubleshooting should be to USE THE FUCKING DEFAULTS, you idiot!
Seriously, I'm supposed to take someone seriously who is running overclocked settings of some sort or another (RAM timings, FSB, etc) and an engineering sample CPU and has problems? Ummm, duh. That right there is asking for problems. When you OC, you go into it knowing you may have some difficulties. You understand this is the tradeoff for something that runs faster than spec. If you start having problems, the first step is to back off the OCing and see if that fixes it.
This is true even of OC'd systems that were fine but aren't now. I had a Celeron 300 that I OC'd to 450 back in the day and it worked well for about a year, then started to burn out. System started crashing randomly, and so on.
To me, it sounds like he's being whiny because he didn't bother to troubleshoot his setup properly. Come talk to me when you've got a retail CPU running at stock spec and FSB, RAM running per its JEDEC spec at standard voltage and so on. Oh, what's that? You did that and it stopped having problems? Well there you go then. Don't bitch that your i7 920 "should" run at 3.8GHz. I don't care if others have done it, doesn't mean it'll work in your case. If it does, wonderful. If it doesn't, well, tough shit. Don't get mad at the software. It has pretty much no way to know if the CPU is going crazy, as it runs on the CPU. About the only way software can indicate a CPU problem is by inducing a problem and thus a crash.
The guy does 400+ successful installs, then runs into a decidedly obscure hardware problem, and people flame him? And Windows 7?
Ye gods. Get a life, folks. I read this as a success story, both for the author and for Microsoft.
Three Squirrels
Nowhere in the original article did I get the sense that the author was blaming Windows for his issues. In fact, he starts out by stating that he's installed Windows 7 hundreds of times without a single incident, but this was a "problem PC". So, how did this turn into an anti-Windows rant? Oh, right, it's Slashdot...
who's written over 100 books
Michael Behe's written dozens of books trying to debunk evolution. It does not make him an expert in evolution. He installs Windows, copies down what he sees on the screen and writes it down. That does NOT translate into "he knows what he's doing". I'm not saying he's not an expert, just that it's not a valid qualification.
If this hardware geek spends days solving a CPU-meets-Windows 7 problem, what chance do mere mortals have?"
They wouldn't be installing an OS. Very few non-geeks do so. They buy a computer from a vendor like Dell, it comes with an OS. When it's time to upgrade, they buy a new PC and give the old one to their kids or grandparents. They also, as has been stated numerous times in the comments, wouldn't be installing on machines that had an engineering sample for a CPU. Actually, this debunks the claim that because he's written books, he's an expert. He knew he had a machine with an unsupported processor in it and still replaced everything in the machine first. Um....duh!
I'm all for FOSS, Linux etc.
But this approach of yours won't convince any Windows user to switch. Instead, it's likely more people will get convinced that FOSS users are assholes.
Just last night I fixed my parents' computer in one of those long fixes that turn out to be the most fundamentally trivial things. This is why this is not my main occupation.
Basically they had a recently built custom Windows 7 + Ubuntu PC that had begun randomly shutting down, often minutes after it had been powered up.
OK, first thing: any obvious errors or circumstances? No, it would just randomly power off. Windows event logs showed kernel power events, no specific driver, service or app crashing anywhere. Linux was the same. Not a thermal issue: CPU + GPU temps were nominal, and a stress test didn't immediately cause a crash.
Suspecting a power or a motherboard issue, I first checked and re-seated things internally. It still occurred.
Removed extraneous cards, connectors and drives. No result. It would even happen sitting in BIOS setup. Have ruled out a number of problems.
Checked for electrical shorts, poor voltage etc.
Dying power supply? Overloading or shorting? Nope, all voltages nominal, and it was brand new.
I was about to try a spare power supply and a thought occurred to me..
It's almost as if the reset switch was being hit, but it wasn't even close to being knocked at any point and the switch otherwise worked fine. Then I knocked the case and the system reset. Yep, the reset switch was faulty, jolting it even slightly would reset. Who needs a reset switch since Vista anyway? Unplugged it from mainboard. Solved.
I decided not to even joke about charging my Dad for two hours of my time.
Chances are if he paid someone to do it they wouldn't necessarily have found the fault that quickly, and he'd be hundreds of dollars out of pocket.
The lesson in troubleshooting? Um... I'm not sure.
After logging in slashdot still does not take you back to the page you were on. It's been that way for 20 years.
I haven't read the title, summary, article or any posts, and I think it's time to get new glasses... I can't read a damn thing!
How many people had the same impression I had: "Why, this sounds exactly like one of the 'Chaos Manor' columns Jerry Pournelle used to write in BYTE!"
All it needs is a few of Jerry Pournelle's favorite stock phrases. "The disk trundled for a while..." "I tried swapping out the hard disk, but no joy..." "I called up Bill Godbout..."
"How to Do Nothing," kids activities, back in print!
"I can't read a damn thing!"
Coming soon, to a galaxy near you, will be a COMMUNITY COLLEGE, complete with a REMEDIAL READING class! Enroll early, and avoid the rush!
"Windows is like the faint smell of piss in a subway: it's there, and there's nothing you can do about it." - Charlie Br
Windows 7 will install perfectly well without a network connection. (Just to make sure, I postponed sending this until my test VM, sans network interface, had completed installing.)
My guess would be either you were using badly OEM'd install media (a practice I do wish MS would prohibit) or you don't know how to manually install device drivers.
What were those 100 books on? Think about that - how many years has he been writing books? Say it's been 10 years. 10 books a year? A book every 5.2 weeks? WTF
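The arithmetic holds up. A quick sketch of it, where the 10-year career is the guess from the comment and the longer careers are extra hypothetical values added only for comparison:

```python
WEEKS_PER_YEAR = 52


def weeks_per_book(books: int, years: float) -> float:
    """Average number of weeks available per book over a writing career."""
    return years * WEEKS_PER_YEAR / books


# 10 years is the guess from the comment; 20 and 30 are hypothetical.
for years in (10, 20, 30):
    print(f"100 books in {years} years: one every {weeks_per_book(100, years):.1f} weeks")
```

Even tripling the assumed career length still leaves only a few months per book.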
Here's what I posted as a reply to this "expert's" article. It's now awaiting admin approval to appear as a comment. We'll see if it makes it.
==================
While reading, I was thinking this was a well-written detective story. Then I got to the end and found out it's a story about a massive waste of time because you didn't follow standard procedures.
Here's how to save a few days next time: go to the motherboard manufacturer's website, get the list of supported CPUs for the motherboard you're trying to install. Then download and install the BIOS that supports that CPU. It really is that simple.
Asus is particularly good at providing a CPU support list for their motherboards. It took me entire minutes to find the lists for the P5Q3 and P5E3 Deluxe (not P5E3 Pro, as you wrote). The QX9650 is listed for both motherboards -- and in both cases, it is supported only as of a recent BIOS revision.
So all you had to do was download and install BIOS version 0204 or later for the first motherboard, the P5Q3, and I bet Win 7 would have installed correctly the first time.
As for the motherboard automatically making BIOS changes to match the fast DIMMs you installed, Asus motherboards do NOT do this by default. You must have left the BIOS in some sort of overclocker's mode.
Next time, look up and download the BIOS that supports the CPU you're trying to use. After installing it, use the BIOS setting that restores all other BIOS settings to their defaults. Then install the OS. THEN and only then, can you start tweaking BIOS settings.
Once again, the article was well written. But it's also an inadvertent confession.
==================
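The check described in that reply can be written down as a tiny lookup. This is only a sketch under stated assumptions: the table is a hand-copied excerpt rather than anything fetched from Asus, the function name is made up, and it assumes BIOS versions are numeric strings that sort in release order. The only entry taken from the reply is the P5Q3 with a QX9650 needing BIOS 0204 or later.

```python
# Hand-copied excerpt of a CPU support list (hypothetical data structure).
# Only this one entry comes from the reply above.
MIN_BIOS = {
    ("P5Q3", "QX9650"): "0204",
}


def bios_supports_cpu(board: str, cpu: str, installed_bios: str) -> bool:
    """Return True if the installed BIOS is new enough for the given CPU."""
    required = MIN_BIOS.get((board, cpu))
    if required is None:
        # CPU is not in our excerpt: treat it as unsupported until looked up.
        return False
    # Assumption: versions are numeric strings that increase with each release.
    return int(installed_bios) >= int(required)


# "0107" is a made-up older version used only to show a failing check.
print(bios_supports_cpu("P5Q3", "QX9650", "0204"))
print(bios_supports_cpu("P5Q3", "QX9650", "0107"))
```

The point is the order of operations: look up the requirement before swapping any parts.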
One of the reasons I use Linux is that, currently, it is much more secure than Windows, given my personal use scenario.
Yes, if I were a specialist in securing Windows that might not be the case, but I'm not. Yes, if equivalent amount of effort was invested to break the security of casual users of Linux compared to that invested in breaking Windows, again, Linux might not be any more secure than Windows (well, with Linux, there are distros where I can always boot off of USB and then not save any changes, so until Microsoft offers me the same functionality there's little chance that I could use it in as secure a fashion as I can use Linux).
Running Linux in a VM under Windows just wouldn't "cut it" for me. Sorry.
When you crank out a lot of stuff, it is extremely hard to make all of that stuff be high quality. Quality usually takes time, it takes research, it takes refinement. It is possible, in some rare cases, to have someone that produces a vast quantity of work, all of which is top quality. However it is far more common to see someone produce a vast amount of mediocre to bad quality work.
As an example: Dr. Mark Russinovich has written a grand total of three technical books to date. So, clearly a man who doesn't know what he's talking about right? Wrong. Those three books are "Inside Microsoft Windows 2000," "Windows Internals Fourth Edition," and "Windows Internals Fifth Edition." He has, literally, written the book (along with David Solomon) on the recent versions of Windows, published by MS themselves. These are extremely accurate, comprehensive, technical documents of Windows down to its very fundamental levels. He also has written a suite of tools, the Sysinternals tools, so good that MS bought them, and hired him on as a technical fellow.
So while he's produced only three books, they are all of the highest quality of technical information. There haven't been more because he hasn't had the time to write hundreds of books, nor the need to issue revisions to correct problems with the ones he has (each new edition covers a new version of Windows).
Thus when I hear someone talk about how good they are because of the quantity of their works, I am skeptical. The only way you get a vast quantity of high quality work is either laboring an entire lifetime (and even then often not), being a prodigy, or both.
..a clue about how computers work. Even experienced windows professionals.
I mean this guy has a 32 bit OS working and moves to a 64 bit OS... am I following this OK? The 32 bit install presumably went well on the hardware and the 64 bit install fails.
So I grok his first attempts which are replacing the install media once. Seems like a reasonable assumption (some bit out of the billions on the DVD image just happened to be flipped the wrong way). From there though he starts to lose me. The motherboard is perhaps plausible but you would have to be assuming some rather significant difference in hardware support between the 64bit and 32bit systems. From there? RAM is 64bit how? Or even my HD?
I think the most significant thing to learn here is twofold.
i) People - even experienced computer professionals - treat computers like they are magic. Like there is no real science behind how they work. Clearly this guy was replacing parts based on some "experiential weighted average" with regard to how likely they are to cause a "weird" problem.
ii) When A. C. Doyle said "When you have excluded the impossible" he neglected to state that the *order* in which one does so is significant. Eliminating things in order of their apparent relation to the problem (i.e. all the things for which 64 bits makes a difference) and (in a business environment) with respect to cost (i.e. Replacing a CPU is often a cheaper test than replacing a motherboard wrt labour) will likely fix your problem sooner than just going for the "usual suspects".
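Point ii) amounts to sorting the suspects before touching anything. A minimal sketch, assuming you can put rough numbers on "relation to the problem" and "cost to test". Every score below is invented for illustration, and the names `Suspect` and `triage_order` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Suspect:
    name: str
    relevance: int  # 0-3: how directly the part relates to the symptom (made-up scale)
    cost: int       # rough effort to test or swap it (made-up scale)


def triage_order(suspects):
    """Most relevant first; among equally relevant parts, cheapest test first."""
    return sorted(suspects, key=lambda s: (-s.relevance, s.cost))


# Invented scores for a "32 bit installs, 64 bit fails" symptom.
suspects = [
    Suspect("install media", relevance=1, cost=1),
    Suspect("RAM", relevance=0, cost=2),
    Suspect("hard disk", relevance=0, cost=2),
    Suspect("motherboard", relevance=1, cost=4),
    Suspect("CPU", relevance=3, cost=2),
]

for s in triage_order(suspects):
    print(s.name)
```

The scores are guesses by design; the useful part is being forced to write them down before reaching for the usual suspects.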
Aside: I've had two cases where I found a CPU issue. One was very similar to this - crashing during a Windows 2000 install - often at the same place. The problem I had was actually thermal - the heatsink was reversed leaving the thermal patch making minimal contact with the heat spreader. Somehow I figured that out without replacing everything else first.
Install Linux and run Windows in a VM. If Windows is the host and it gets infected/hosed with a virus/malware/whatever, it could well mess up your Linux VM and make it unrecoverable, but if you install Windows in a VM and run it on top of Linux, the worst that can happen is the VM gets hosed.
Actually, in the end I blame HP (the manufacturer of the box in question). Very few of the devices in the box were supported out-of-the-box. When I first tried the install, I ended up with bad video, no sound and no peripherals. The later install (with the wireless card) installed beautifully, I assume because it had access to Windows Update.