Logging Unexpected Shutdowns/Crashes w/ Linux?
sweede asks: "I have a dedicated server that seems to reboot more often than it should. In Windows 2000/XP (maybe NT4.0?), if your computer or server crashes it will leave an event message in the Event Viewer for you to review what went wrong. Is it possible to do something similar in Linux, where a power outage or an unexpected kernel panic will leave a message in /var/log/event (or whatever)? Searching Google for 'kernel trapping' doesn't give me a whole lot of info on the subject."
The reason Linux doesn't write anything to the HD after a panic is so that it doesn't mangle/destroy the FS.
Why not reserve a set place on the hard drive and write out error trap information there? There's no reason the filesystem needs to be involved at all. I'm going to guess that's what Windows does.
NO CARRIER
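The "reserved place on the disk" idea above can be modelled in user space. What follows is only a rough sketch of the on-disk layout, not of how any real kernel does it: a panicking kernel could not rely on open() or fsync(), and the record size, the "CRSH" magic value, and both function names are made up for illustration. A regular preallocated file stands in for the reserved region.

```python
import os
import struct
import zlib

# Hypothetical layout: everything here is an assumption for illustration.
MAGIC = b"CRSH"                    # marks a slot that holds a crash record
RECORD_SIZE = 512                  # one fixed-size slot, e.g. one disk sector
HEADER = struct.Struct("<4sII")    # magic, payload length, CRC32 of payload


def write_crash_record(path, offset, message):
    """Overwrite the fixed slot at `offset` without growing the file."""
    payload = message.encode("utf-8")[: RECORD_SIZE - HEADER.size]
    header = HEADER.pack(MAGIC, len(payload), zlib.crc32(payload))
    record = (header + payload).ljust(RECORD_SIZE, b"\0")
    with open(path, "r+b") as f:   # r+b: write in place, never truncate
        f.seek(offset)
        f.write(record)
        f.flush()
        os.fsync(f.fileno())       # push the bytes to stable storage


def read_crash_record(path, offset):
    """Return the stored message, or None if the slot holds no valid record."""
    with open(path, "rb") as f:
        f.seek(offset)
        record = f.read(RECORD_SIZE)
    if len(record) < HEADER.size:
        return None
    magic, length, crc = HEADER.unpack_from(record)
    if magic != MAGIC or length > RECORD_SIZE - HEADER.size:
        return None
    payload = record[HEADER.size:HEADER.size + length]
    if zlib.crc32(payload) != crc:  # half-written or corrupted record
        return None
    return payload.decode("utf-8", errors="replace")
```

The checksum is there so that whatever reads the slot on the next boot can tell a complete record from one that was cut off mid-write, which is exactly the situation a crashing machine is likely to produce.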
You fail to understand what happens to create the "Dirty Bit".
1: System starts up (say clean).
2: It marks a bit on the partition that system has been started up.
3: Usage Usage Usage
4: Send shutdown
5: System umounts cleanly. Undoes "dirty bit"
6: Power == 0
On a dirty FS, stage #5 is never reached, so when the system comes back on, it checks the bit and detects the unclean shutdown. Nothing is written during the unclean shutdown itself; the bit simply never gets cleared.
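The steps above can be imitated in user space to at least record that an unclean shutdown happened, which is part of what the original question asks for. This is a toy model only: a marker file stands in for the on-disk bit, and the class name, method names, and file name are hypothetical.

```python
import os
import tempfile


class DirtyFlag:
    """Toy model of the 'dirty bit', using a marker file instead of a bit."""

    def __init__(self, marker_path):
        self.marker_path = marker_path

    def startup(self):
        """Report whether the last run ended uncleanly, then mark this run dirty."""
        was_unclean = os.path.exists(self.marker_path)
        with open(self.marker_path, "w") as f:
            f.write("dirty\n")
            f.flush()
            os.fsync(f.fileno())   # make sure the marker survives a power cut
        return was_unclean

    def clean_shutdown(self):
        """Stage 5: clear the marker. A crash or power loss never gets here."""
        if os.path.exists(self.marker_path):
            os.remove(self.marker_path)


def demo():
    with tempfile.TemporaryDirectory() as d:
        flag = DirtyFlag(os.path.join(d, "dirty.flag"))
        results = [flag.startup()]       # first boot, nothing to report
        flag.clean_shutdown()
        results.append(flag.startup())   # boot after a clean shutdown
        # Simulated crash: clean_shutdown() is never called.
        results.append(flag.startup())   # boot after the "crash"
        return results
```

Like the real bit, this tells you that the machine went down uncleanly but nothing about why; no code runs at the moment of failure.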
Similarly, I see problems when the NT kernel crashes. How exactly does it manage to:
1: Read the partition
2: Read the program on the partition
3: Run the insert log program to add log entry
4: Still have the "blue screen"
I smell nasty data corruption waiting to happen. After all, if you can't guarantee the state of the kernel, is it really justified to read, write, and execute on a crashed kernel?
OK. Then how do you guarantee the state of the kernel? If you use BIOS calls, it screws up the memmap even more. That's assuming you can even pass something like that.
$100 question: How do you break out of inserted code that might have had a bug? How do you determine what code had that bug?
Answer those, and then I'll trust a Write_after_system_crash API.
I expected the Indy to kernel panic or turn off. Instead, below the complaints about the missing ethernet cable ("en0: link carrier not detected" or similar), there was a lone status message: "Power failure detected."
No UPS, no power saving devices of any kind, only the filter caps in the power supply between the logic board and the unreliable, crufty power system of a 70-year-old house at the mercy of a power strip first used on my (brand new at the time) Atari 800. The other computer on the power strip (350 P2 running RH 7.1) rebooted hard, right in the middle of heavy FS activity. I had to hit the reset button before it would come back up again, too - the brownout hung the POST.
Every cloud has a silver lining (except for the mushroom shaped ones, which have a lining of Iridium & Strontium 90)
I wonder if /dev/nvram (the small amount of NVRAM available on the RTC) is large enough to store such a dump.
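One way to answer that on a given machine is simply to measure the device. The sketch below assumes the nvram driver is loaded and that the caller is allowed to read /dev/nvram; both helper names are hypothetical. On x86 PCs the driver commonly exposes around 114 bytes, but treat that figure as an assumption to verify on the actual hardware.

```python
def nvram_capacity(path="/dev/nvram"):
    """Return how many bytes the NVRAM device exposes, by reading it to EOF."""
    with open(path, "rb") as f:
        return len(f.read())


def fits_in_nvram(message, capacity):
    """True if the UTF-8 encoded message fits in `capacity` bytes."""
    return len(message.encode("utf-8")) <= capacity
```

At roughly a hundred bytes there would be room for a short reason code or a line of text, not for anything resembling a full dump.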
As others have said, the "Linux crash" is probably hardware failure.
The most common cause of serious failure, if the software has been installed correctly and tested, is bad contacts. To fix the problem, just loosen the screws that hold the adapter cards, pull the cards out about 1 millimeter or 1/32 of an inch, push the cards back in fully, and re-tighten the screws. Also, pull all connectors off a similar amount, and push them back on. Do the same with the memory modules. That's all.
The scraping caused by moving the contact points a tiny amount is actually very violent on a micro scale. The scraping removes oxide that causes a contact to lose electrical conduction.
This is reliable information. I've been selling and occasionally repairing PCs since before IBM sold PCs, back in the days when personal computers cost $2300, had two diskette drives and no hard drive, and ran the CP/M operating system.
My guess is that, if you had a penny for every real crash of a stable distribution of Linux, after a few years you might still have to borrow money from your little brother to buy a piece of bubble gum.