Patch the Linux Kernel Without Reboots
evanbro writes "ZDNet is reporting on ksplice, a system for applying patches to the Linux kernel without rebooting. ksplice requires no kernel modifications, just the source, the config files, and a patch. Author Jeff Arnold discusses the system in a technical overview paper (PDF). Ted Ts'o comments, 'Users in the carrier grade linux space have been clamoring for this for a while. If you are a carrier in telephony and don't want downtime, this stuff is pure gold.'"
Update: 04/24 10:04 GMT by KD : Tomasz Chmielewski writes on LKML that the idea seems to be patented by Microsoft.
If you are a carrier in telephony, you should have many load-balanced servers that can be taken offline one at a time and restored after patching. They probably would be taken out of the loop for the in-place patching anyway. So who is "clamoring"?
"Here Lies Philip J. Fry, named for his uncle, to carry on his spirit"
Honestly, how much downtime are we talking here? 30 seconds?
Sometimes, life itself is sarcasm...
That is truly amazing tech, right there. It would be interesting to know the security implications of being able to hot-patch the kernel, however.
Trying to keep one server up 24/7/365 is usually a mistake. You'll never achieve 100% uptime. A much better idea is to use clustering and distributed computing so your overall system can survive the loss of individual servers.
The key sequence to access my Slashdot bookmark in Firefox is Alt-B-S. I don't believe this is a coincidence.
There was a kernel exploit recently where someone submitted a patch that modified the running kernel using this technology. It didn't work for me, so I had to resort to patching the .c that was affected - but a lot of people reported that it worked.
Get your own free personal location tracker
I thought their working slogan was:
Windows 7, it's not awful like Vista!
Rather than a source-code-level system, I'd prefer a way of replacing loadable kernel modules without a reboot. Then push more code into modules, e.g. the file system. (Hey, sounds like a microkernel.)
Can ksplice be installed without rebooting?
He basically compiles a patched and unpatched kernel with the same compiler, compares the ELF output, and uses that to generate a binary file that corresponds to the change. That gets wrapped in a generic module for use, another module installs it along with JMPs to bypass the old code and use the new, and he performs the checks needed to make sure he can safely install the redirects.
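To make the "JMPs to bypass the old code" part concrete, here is a minimal sketch of what such a redirect could look like. It assumes an x86 target and the ordinary 5-byte near jump (opcode 0xE9 followed by a signed 32-bit displacement measured from the end of the instruction). The function name and the addresses are made up for illustration; none of this is taken from ksplice itself.

```python
import struct

JMP_REL32_LEN = 5  # one opcode byte plus a 4-byte displacement


def encode_jmp_rel32(patch_site: int, new_code: int) -> bytes:
    """Return the bytes of an x86 `jmp rel32` placed at patch_site.

    The displacement is relative to the address of the *next*
    instruction, i.e. patch_site + 5.
    """
    disp = new_code - (patch_site + JMP_REL32_LEN)
    if not -2**31 <= disp < 2**31:
        raise ValueError("target is out of range for a rel32 jump")
    # 0xE9 is the near-jump opcode; "<i" packs a little-endian signed int32.
    return b"\xe9" + struct.pack("<i", disp)


# Hypothetical addresses: old function at 0x1000, replacement at 0x2000.
print(encode_jmp_rel32(0x1000, 0x2000).hex())
```

Overwriting the first five bytes of the old function with that sequence sends every caller to the replacement, which is why the safety checks matter: nothing may be executing inside the old code while it is being overwritten.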
He also has to differentiate real changes from incidental ones (the example given is changing the address of a function - all references to it will change, but they don't really need to be included in the binary diff).
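A toy model of that "real vs. incidental" distinction, for anyone who wants to play with the idea. Everything here is an assumption for illustration: the `Insn` type, the function names, and the addresses are invented, and the real tool works on ELF object code rather than on tidy Python tuples. The point is only that comparing symbolic references instead of resolved addresses makes the moved-function noise disappear.

```python
from typing import Dict, List, NamedTuple, Optional


class Insn(NamedTuple):
    opcode: str
    symbol: Optional[str] = None   # name of the referenced symbol, if any
    address: Optional[int] = None  # where that symbol ended up in this build


Build = Dict[str, List[Insn]]


def normalize(body: List[Insn]) -> list:
    """Keep what the code refers to, drop where the linker put it."""
    return [(insn.opcode, insn.symbol) for insn in body]


def changed_functions(old: Build, new: Build) -> List[str]:
    """Names of functions that are new or differ after normalization."""
    return sorted(
        name for name in new
        if name not in old or normalize(old[name]) != normalize(new[name])
    )


old_build: Build = {
    "check_perm": [Insn("cmp"), Insn("ret")],
    "helper": [Insn("mov"), Insn("ret")],
    "caller": [Insn("call", "helper", 0x1000), Insn("ret")],
}
# The patch adds one instruction to check_perm, which pushes helper to a
# new address; caller's bytes change even though its source did not.
new_build: Build = {
    "check_perm": [Insn("cmp"), Insn("test"), Insn("ret")],
    "helper": [Insn("mov"), Insn("ret")],
    "caller": [Insn("call", "helper", 0x1040), Insn("ret")],
}

print(changed_functions(old_build, new_build))
```

A byte-for-byte comparison would flag both `check_perm` and `caller`; the normalized comparison reports only `check_perm`.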
The only human work required is to check whether a patch makes semantic changes to a data structure... whether eg. an unsigned integer variable that was being used as a number is now a packed set of flags - the data declaration is the same, but it's being used differently.
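A small illustration of why that case needs a human. The names and flag values below are hypothetical; the sketch only shows that a value written under the old meaning can be misread by code using the new meaning, even though the declared type never changed.

```python
FLAG_BUSY = 0x1    # hypothetical bit assignments in the patched code
FLAG_LOCKED = 0x4


def is_busy_old(state: int) -> bool:
    """Old code: the field is a count of pending requests."""
    return state > 0


def is_busy_new(state: int) -> bool:
    """Patched code: the same field is now a packed set of flags."""
    return bool(state & FLAG_BUSY)


# Value left in memory by the old code: four pending requests.
live_state = 4

print(is_busy_old(live_state))            # how the old code reads it
print(is_busy_new(live_state))            # how the patched code reads it
print(bool(live_state & FLAG_LOCKED))     # a flag nobody ever set
```

A binary diff sees nothing wrong here because the data declaration is identical. After a reboot the field would be re-initialized under the new meaning; with a hot patch the old value is still sitting in memory.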
Interesting paper. Also a useful new set of capabilities for any Linux user who can't handle downtime for quarterly patching... worth its weight in gold in some businesses.
Erik
I'd rather have at least two of anything important and have stateful failover between them.
If you've got this system that's so critical you can't reboot it for a kernel upgrade, what do you do when the building catches fire or a tanker truck full of toxic waste hops the curb and plows through the wall of your datacenter?
I'd rather have a full second set of anything that critical. It should be in a different state (or country) and have a well designed and frequently used method of seamlessly transferring the load between the two (or more) sites without dropping anything.
If you can't transfer the workload to a location at least a couple hundred miles away without users noticing then you're not in the big league.
And as long as the workload is in another datacenter, what's the big deal about rebooting for a kernel upgrade?
Once again, we have an over-engineered solution to a non-existent problem.
Any enterprise-level customer is going to have a VERY lengthy QA process before deploying anything into production. This includes testing kernels, hardware, networks, interaction, application, data and so on. One pharmaceutical company I know of is federally mandated to do this twice a year, every year, for every single machine that reads, writes or generates data. Period.
So you hot-patch a running Linux kernel. How do you QA that? How do you roll back if the patch fails? Where is your 'control'?
The answer? A duplicate machine. But wait, if you have two identical machines... isn't that... a cluster?
Exactly. And THIS is how you perform upgrades. You split the cluster, upgrade one half, verify that the upgrade worked, then roll the cluster over to that node, and upgrade the second portion of the cluster. If you have more machines in the cluster, you do 'round-robin' upgrades. You NEVER EVER touch a running, production system like that.
Well, not if you want any sort of data integrity or control and want to pass any level of quality validation on that physical environment.
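The split-and-verify procedure described above can be sketched roughly as follows. This is only an outline under stated assumptions: `upgrade` and `verify` are hypothetical callables standing in for whatever your site actually does, and taking nodes out of and back into service is left out entirely.

```python
from typing import Callable, List, Sequence


def rolling_upgrade(
    nodes: Sequence[str],
    upgrade: Callable[[str], None],
    verify: Callable[[str], bool],
    batch_size: int = 1,
) -> List[str]:
    """Upgrade nodes one batch at a time, stopping at the first bad batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    done: List[str] = []
    for start in range(0, len(nodes), batch_size):
        batch = list(nodes[start:start + batch_size])
        for node in batch:
            upgrade(node)
        failed = [node for node in batch if not verify(node)]
        if failed:
            # Later batches are never touched, so they remain the 'control'.
            raise RuntimeError(f"verification failed on {failed}")
        done.extend(batch)
    return done


# Stand-in cluster state for demonstration purposes only.
versions = {name: "old" for name in ("a", "b", "c", "d")}


def fake_upgrade(node: str) -> None:
    versions[node] = "new"


def fake_verify(node: str) -> bool:
    return versions[node] == "new"


# batch_size=2 on four nodes is the "split the cluster in half" case.
print(rolling_upgrade(list(versions), fake_upgrade, fake_verify, batch_size=2))
```

With `batch_size=1` this becomes the 'round-robin' variant for larger clusters.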
Tomasz Chmielewski wrote on LKML: the idea seem to be patented by Microsoft, i.e. this patent from December 2002: http://www.google.com/patents?id=cVyWAAAAEBAJ&dq=hotpatching In essence, they patented kexec ;)
Andi Kleen promptly provided prior art: The basic patching idea is old and has been used many times, long predating kexec. e.g. it's a common way to implement incremental linkers too.
davecb@spamcop.net
Let's get the rest of the usual jokes out of the way while we're at it.
If there were no kernel, it would necessary to create our non-rebooting robot overlords are belong to Chuck Norris.
Tic-Tac-Toe, Global Thermonuclear War, and relationships all have the same winning move.
Not only the CEO. I lived to see even a hardline IT guy (admittedly, one whose goal in life seems to be to be against whatever you want, and to avoid doing any extra work... actually, make that just: any work) argue along the lines of "nooo, you can't have the servers only 60% loaded! It's a waste of valuable hardware! Why, back in my day (of batch jobs on punched cards, presumably) we had the mainframe used at least an average of 95% before asking for an extra server!"
It always irks me to see people just not understand concepts like "peak" vs "average", or "failing over".
- Take a cluster of, say, 4 machines (small application, really) which are loaded to 90% of capacity. If one dies, the other 3 are now at 120% of capacity each. If you're lucky, it just crawls; if you're unlucky, Java clutches its chest and keels over with an "OutOfMemoryError" or such.
- If you're at 90% most of the time, then fear Monday 9:00 AM, when every single business partner on that B2B application comes to work and opens his browser. Or fear the massive year-end batch jobs, when that machine/cluster sized barely enough to be ready with the normal midnight jobs by 9 AM, so those users can see their new offers and orders in their browsers, now has to do 20 times as much in a burst.
Basically it amazes me how many people just don't seem to get that simple rule of thumb of clusters: you're either getting nearly 100% uptime and nearly guaranteed response times, _or_ you're getting that extra hardware fully used to support a bigger load. Not both. Or not until that cluster is so large that 1-2 servers failing add negligible load to the remaining machines.
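The arithmetic behind that rule of thumb, as a quick sketch. It assumes identical nodes and a load that redistributes evenly across the survivors, which real clusters only approximate; the function names are mine.

```python
def load_after_failures(nodes: int, load_pct: float, failed: int = 1) -> float:
    """Per-node load (in % of capacity) once `failed` nodes are gone."""
    survivors = nodes - failed
    if survivors <= 0:
        raise ValueError("no surviving nodes")
    return nodes * load_pct / survivors


def max_safe_load(nodes: int, failed: int = 1) -> float:
    """Highest per-node load (%) that still fits after `failed` failures."""
    if failed >= nodes:
        raise ValueError("no surviving nodes")
    return 100 * (nodes - failed) / nodes


print(load_after_failures(4, 90))   # the 4-machine example: 120.0
print(max_safe_load(4))             # 75.0
print(max_safe_load(50, failed=2))  # 96.0
```

So a 4-node cluster that must survive one failure cannot run above 75% per node, while a 50-node cluster can lose two machines and still run at 96%. That is the "until the cluster is so large" part.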
A polar bear is a cartesian bear after a coordinate transform.
As an admin for some -very- high availability systems, I can tell you load balancers are not a silver bullet. This solution would mostly apply to people running one-node clusters that use a single machine as a perimeter network device (e.g. a firewall). I see lots of these in the racks at our NOC provider.
1. We connect to several load balanced systems and the complexity introduced by load balancers translates to inexplicable down time. No load balancers means a pretty steady diet of the latest and greatest server hardware, but no down time. A few minutes of down time costs more than the server hardware.
2. High availability translates more roughly into nodes that can fail (e.g. power off) and not take the cluster down. This boils down to active-passive application architecture more than just using heartbeat.
As an FYI, PostgreSQL clustering is a killer application for me. Erlang is also great in many ways, but it requires an application architecture with active-passive node awareness, which isn't present in things like Yaws, or even my other favorite non-Erlang app, nginx. Heartbeat is the solution there, but I'd like to see Yaws be cluster-aware on its own. http://yaws.hyber.org/
This is old news down in the South.
They don't bother splicing. Them good ol' boys been big on Kernel Sanders for years now.
Lots of people are saying, "100% uptime of a particular machine is neither necessary nor desirable, full failover is better. Full failover is the only way to handle catastrophic hardware failures." Or something to that extent.
But this isn't about 100% uptime. It's about not having to reboot for a kernel upgrade. You should still have hot failover if you want HA, this just removes one more thing that requires a reboot.
It's like people saying, "I don't mind rebooting after installing Office, I don't expect 100% uptime from my workstation." Of course you don't need to be able to do software installs without rebooting. But isn't it nice to have that option available?
Same with this. When (and if) it gets stabilized and standardized, you'll use it. Not for 100% uptime, just because it's nice to not be required to reboot to enable a particular software install.
Stop-Prism.org: Opt Out of Surveillance
And THIS is how you perform upgrades. You split the cluster, upgrade one half, verify that the upgrade worked, then roll the cluster over to that node, and upgrade the second portion of the cluster. If you have more machines in the cluster, you do 'round-robin' upgrades
Hmmm. I happen to live by your words in an environment where this is theoretically possible, but practically impossible. Why? Because when the cluster rolls to a passive node, the application times out on the existing connections. The timeouts have business ($$$$) implications. I wish it were okay to have infinite retries, but it's viewed as a violation of the service agreement. Telephony is like this too.
An academic ideal for sure, but please speak more humbly because it is no silver bullet.
http://www.maxineudall.com/2010/02/should-economists-be-sued-for-malpractice.html
Tic-Tac-Toe, Global Thermonuclear War, and relationships all have the same winning move.
I love anything that makes a billionaire whine.
They're using their grammar skills there.
This is GPL'd software. Bill Gates told me nobody could improve it. These Linux developers are truly renegades!
include $sig;
1;
You'd roll back much the same way, or even perhaps by rebooting into the previous kernel image from disk.
Every production environment I've ever administered had a smaller version set aside for testing. We'd configure the machines identically and just make the cluster smaller. Then we'd test on the test machines any action that was to be made part of the admin process of the production machines. If it passes on the test machine and fails in production, then you didn't make the machines sufficiently similar.
Round robin upgrades take ( ( (time_to_idle + time_to_upgrade + time_to_reboot) * machines ) / 2) on average to get a machine upgraded. If you have a "Critical" upgrade, that might be longer than you want.
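To put numbers on that, here is the formula above written out as a function. The timings in the example are invented, and the formula is used exactly as stated; it is a rough average, not a precise model of any scheduler.

```python
def average_time_to_upgrade(
    time_to_idle: float,
    time_to_upgrade: float,
    time_to_reboot: float,
    machines: int,
) -> float:
    """Average wait until a given machine is upgraded, one machine at a time."""
    per_machine = time_to_idle + time_to_upgrade + time_to_reboot
    return per_machine * machines / 2


# Hypothetical numbers, in minutes: 5 to drain, 10 to upgrade, 5 to reboot,
# across a 12-machine cluster.
print(average_time_to_upgrade(5, 10, 5, 12))  # 120.0
```

Two hours on average, and roughly four for the last machine in line, before a "Critical" fix is live everywhere.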
Not everyone has the exact same QA requirements you do, either. Some of us are happy with proving that it works, then proving that it worked on the production machine, then resuming our normally scheduled maintenance.
I would think that on top of the benefits of patching running high-uptime servers this would in the long run also result in yet another benefit to running Linux on your desktop instead of Windows. I don't see any reason RedHat, Ubuntu and everyone else wouldn't implement this type of kernel upgrade for convenience' sake.
I keep forgetting my place. Jesus is for losers. Why do I still play to the crowd?
AmigaOS had its kernel in ROM, and could be patched on the fly. That was back in 1985, so even if it was patented, it isn't now.
The patching function was not an accident either; there was an OS function for this purpose. Originally it was intended to allow bug fixes to be installed without having to change the ROM, but it was quickly co-opted into a mechanism for enhancing the OS in various other ways as well.
Funny thing... this was the smallest part of my oh, hour and twenty minute interview with the reporter. The reason for the call was to hear about what was up with the 2.6.25 release; she probably spent more time talking with me about KVM and Xen; and I mentioned ksplice just as an aside, as an example of lots of really interesting and exciting work that doesn't necessarily happen as part of a mainline kernel release. I spent maybe 2-3 minutes tops talking to her about ksplice --- and that's what she ends up writing about and getting slashdotted!
Their only resort is to appeal to court.
There are no applications in other countries.
A