Ask Slashdot: Faster Reboots?
RobertPearse
submits this question: Here's an article about
ILM engineers using a small
Linux box to reboot systems quickly. They mention a "TTL
data-acquisition card". Housed in a linux box, it allows
them to reboot a system very quickly. Anybody know
anything about this? My company is a Microsoft shop -- and
given the fragility of NT, this sounds like a great idea.
(God forbid we need to reboot an NT box ;-)"
That sounds nice, but there is a big problem with that. Assume you save the state of the machine (say at time point x) and a few seconds later (time point y) the machine BSODs for some software reason (say NT is flaky). Now you reboot the machine and load it back to state x, what is going to keep it from crashing at state y all over again?
These sort of protection schemes are really only good at protecting you from hardware failures, which is not what you are looking for. Also, if your system has a lot of memory it can take quite a while to dump the memory to disk and read it off again on boot. I work on machines that routinely have 4GB of memory, and when they crash (and core dump) it can easily take 15-30 minutes to finish dumping the core. These machines can boot regularly in less than 10 minutes (easily). Of course with the checkpointing scheme you don't lose as much work, which is a big thing when you are using the machine to do an extremely computationally intensive task or have a huge database in memory.
I read the internet for the articles.
Many laptops save the entire state of the system when you put them in 'sleep' mode. The contents of RAM, register values, program counter location, etc., are saved to disk. When you power up the machine again, everything's quickly reloaded and continues where you left off.
For faster reboots, all you have to do is reboot, let the system come up, and then save the state of the system at this point. Later when NT goes BSOD, you just reload your freshly rebooted image which is much faster than a true reboot, since the time it takes to restore a system state is essentially the time it takes to reload RAM (128MB tops) plus state info (insignificantly sized chunk of data).
Most importantly, the runlevel stuff (SysV initialization) is just abysmally slow. If you want fast reboots, put just what you need into /etc/rc and get rid of all the other stuff.
It also helps to make a custom kernel that includes only the modules and drivers you need.
If you have certain kinds of hardware or controllers in your system (Adaptec comes to mind), trying to achieve fast reboots is hopeless because they need lengthy initialization routines and microcote downloads.
This kind of setup isn't good for many day-to-day uses, and it will probably cause problems with distributions like RedHat or Caldera. But if you want fast reboots, you can get them. Last I did this was on an older distribution of Slackware.