New Linux Kernel Flaw Allows Null Pointer Exploits
Trailrunner7 writes "A new flaw in the latest release of the Linux kernel gives attackers the ability to exploit NULL pointer dereferences and bypass the protections of SELinux, AppArmor and the Linux Security Module. Brad Spengler discovered the vulnerability and found a reliable way to exploit it, giving him complete control of the remote machine. This is somewhat similar to the magic that Mark Dowd performed last year to exploit Adobe Flash. Threatpost.com reports: 'The vulnerability is in the 2.6.30 release of the Linux kernel, and in a message to the Daily Dave mailing list Spengler said that he was able to exploit the flaw, which at first glance seemed unexploitable. He said that he was able to defeat the protection against exploiting NULL pointer dereferences on systems running SELinux and those running typical Linux implementations.'"
Fast! Leave the sinking ship before it's too late!
It's okay, I've got auto-update disabled.
What's the value of information that you don't know?
If this had been Windows, the article would have been tagged defectivebydesign.
After all the crap Brad had to put up with from the SELinux faction, it's good to see technical ability once more scoring points over politics.
(I understand the irony that this comment is, in itself, purely political.)
It's important to note that there is almost never any "preferred" or "special" release of Linux to use. And obviously this flaw doesn't affect people that don't use any security modules.
This is not good news, but it's important news. The kernel's not likely to have a "fixed" re-release for this version, although there probably will be patches for it as well. And when in doubt, just don't upgrade. Not very many machines can take advantage of all of the cool bleeding-edge features that come with each release, anyways. Lots of older versions get "adopted" by someone who will continue to maintain that single kernel release.
I think that tag is mostly reserved for DRM related news...
And I have seen news about linux DRM modules also tagged that.
I always disable those security modules, as they always end up causing incompatibilities and other erratic behavior in software.
Exactly what do they do anyway?
More and more like Windows every day
The explanation given by the SANS Internet Storm Center seems strange. Firstly, the code they provide as an example shows one variable being assigned "sk" and then another variable being checked "tun". I think the "if" statement should be checking "sk" and not "tun". Secondly, the assignment could be a perfectly valid assignment of null; "tun->sk" could indeed be null. The compiler should certainly not optimize out the if statement just because the variable has already been assigned. This would be a major fault in the compiler.
Perhaps they have just oversimplified to the point of writing nonsense...
So, he's dereferencing tun, and then checking if tun was NULL? Looks like the compiler is performing an incorrect optimisation if it's removing the test, but it's still horribly bad style. This ought to be crashing at the sk = tun->sk line, because the structure is smaller than a page, and page 0 is mapped no-access (I assume Linux does this; it's been standard practice in most operating systems for a couple of decades to protect against NULL-pointer dereferencing). Technically, however, the C standard allows tun->sk to be a valid address, so removing the test is a semantically-invalid optimisation. In practice, it's safe for any structure smaller than a page, because the code should crash before reaching the test.
So, we have bad code in Linux and bad code in GCC, combining to make this a true GNU/Linux vulnerability.
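To make the pattern under discussion concrete, here is a minimal sketch. The struct layouts, the function names and the POLL_ERROR value are hypothetical stand-ins, not the actual tun.c source. The first function has the shape described above (dereference, then test); the second shows what a compiler is allowed to treat it as once it assumes that a pointer which has been dereferenced cannot be null.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real tun.c types. */
struct sock_sketch { int ready; };
struct tun_sketch  { struct sock_sketch *sk; };

#define POLL_ERROR (-1)  /* assumed error value, for illustration only */

/* Shape of the code as described: the load from tun happens before the test.
 * If tun is NULL, the load is undefined behaviour in standard C. */
int poll_as_written(struct tun_sketch *tun)
{
    struct sock_sketch *sk = tun->sk;  /* dereference first */
    if (!tun)                          /* test afterwards: too late */
        return POLL_ERROR;
    return sk ? sk->ready : 0;
}

/* What an optimizer may reduce it to: having seen tun dereferenced,
 * it can conclude tun != NULL and drop the test altogether. */
int poll_as_optimized(struct tun_sketch *tun)
{
    struct sock_sketch *sk = tun->sk;
    return sk ? sk->ready : 0;
}
```

For every non-null tun the two functions behave identically, which is why the transformation is permitted; they differ only in the case the standard leaves undefined.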
I am TheRaven on Soylent News
CFLAGS+= -fno-delete-null-pointer-checks
Job done (should work with Gentoo, buggered if I know how to do this in other distros, DYOR), even with -O2/-O3. This is an optimisation/code conflict. The code itself is perfectly valid, so if your CFLAGS are -O -pipe you have nothing to worry about. GCC's info pages show what is enabled at various optimisation levels. -fdelete-null-pointer-checks is enabled at -O2. Of course, this only applies when you compile your own kernel. If vendors are supplying kernels compiled with -O2 without checking what it does to the code then it is obvious who is to blame.
Resistance is futile. Reactance buggers it up.
Actually, it's already been fixed as of 2.6.31-rc3. Interestingly enough, the code by itself was fine until gcc tries to re-assign the pointer value upon compiling. Steven J. Vaughan-Nichols had a decent write-up about it in Computerworld.
C|N>K
Why the crap doesn't "tun->sk" panic the kernel? They had it coming if the userland can tamper with the kernel's mapping of page 0.
This will cause the kernel to try to read/write data from 0x00000000, which the attacker can map to userland
This was somewhat surprising to me. Digging around a bit, it looks like it has something to do with an seLinux handler.
Can anyone elaborate on this?
To me, the "if (!tun)" check should/must be before the de-reference; otherwise, it is meaningless! However, the compiler should print a warning in this case, not just optimize it away.
Over-the-top Response Guy! Giving "Over-the-Top Responses" since 1970.
Actually, it's already been fixed as of 2.6.31-rc3. Interestingly enough, the code by itself was fine until gcc tries to re-assign the pointer value upon compiling. Steven J. Vaughan-Nichols had a decent write-up about it in Computerworld.
And another mountain out of a molehill allowing microsoft astroturfers to troll a flame-war out of a non-event.
VLC FOR MAC IS DYING! IF YOU DEVELOP, PLEASE SAVE IT!!
Parent is absolutely right and basically everything in grand parent's post is wrong.
It's sad to see that moderators know nothing about the C language and mod the grandparent up, but it confirms what most people know anyway: Slashdot is not the place for geeks anymore, it's the place for lamers.
Ok, I know I shouldn't be feeding the troll, but read the article: the kernel source itself is perfectly fine, it is the compiler that optimizes the check away.
I tried to google code search for "tun->sk" and Linux doesn't contain that snippet of code. Since SANS claimed that drivers/net/tun.c is at fault, I looked at that source file and didn't find any instances where "if (!...) return ...;" is performed after NULL dereference.
I think the only fascinating bit of the story is that the SElinux extension allows you to map a page at memory address 0 (the NULL page), making NULL dereferencing valid. I also found out about that a while ago, but I didn't know it has anything to do with SElinux. By the way, mapping the NULL page also works on Mac OS X.
However, mapping the NULL page is typically NOT exploitable. A correct program will simply reject access through a NULL pointer, giving it special semantics regardless of whether the memory page itself is valid or not.
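Whether a given system lets a process map the NULL page can be probed directly. The following is a rough sketch, not code from the exploit; the function name can_map_page_zero is made up for illustration. It asks for one anonymous page at address 0 with MAP_FIXED and reports whether the kernel allowed it.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1  /* exposes MAP_ANONYMOUS under strict -std modes */
#endif
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if the process was allowed to map address 0, 0 if the kernel
 * refused. Any mapping created is removed again before returning. */
int can_map_page_zero(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;

    if (page <= 0)
        return 0;

    p = mmap((void *)0, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED) {
        fprintf(stderr, "mapping page zero refused: %s\n", strerror(errno));
        return 0;
    }
    munmap(p, (size_t)page);
    return 1;
}
```

On kernels where the vm.mmap_min_addr setting is nonzero, an unprivileged caller would normally be refused; a return value of 1 means a NULL dereference in that process would not fault.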
I once had a signature.
In my days (70's) of supporting a family by getting paid to squeeze code into a 32K "mainframe", everybody called it "Assembler" or "Assembler language".
Slashdot entertains. Windows pays the mortgage.
Guys, I'm trying to decide what to post:
[ ] Downplay how serious flaw is ...or we could RFA
[ ] Compare to Windows' track record
[x] Make a meta-reference to Slashdot psychology
[ ] Post work-around that doesn't fix problem
[ ] Say that flaw is a feature
[ ] bash Windows
[ ] Claim that not all Windows software is bad
[ ] Claim that the more popular Linux gets, the more it will be targeted
[ ] Pretend I understand the problem
Slashdot needs Geekcode | Can anyone recommend any good SCIFI? My tastes: Foundation, Startide Rising, CITY, Ringworld,
Oh, found the code on lxr. It looks like Linux kernels up to 2.6.29.6 are NOT affected, and this is a vulnerability introduced in 2.6.30 due to a fairly significant rewrite of tun.c. Linux 2.6.30 was released on June 9, 2009, just a month ago. Funny that the tun.c rewrite was not mentioned in the set of changes for 2.6.30.
I think this example actually shows a forte of Linux as open source. New vulnerability is found very quickly after "new" code is released.
For some reason I didn't link this correctly. The set of changes for 2.6.30 is found at http://kernelnewbies.org/Linux_2_6_30.
How very optimal indeed!
Ok, I know I shouldn't be feeding the troll, but read the article: the kernel source itself is perfectly fine, it is the compiler that optimizes the check away.
Absolutely not. The code itself has a severe bug: If tun is a null pointer then it invokes undefined behaviour. Undefined behaviour means anything can happen. Anything can happen means a severe bug, especially in kernel code. The optimizing compiler just turned C source code that was buggy, but not obviously enough for the programmer, into assembler code that would have been obviously buggy to anyone. Most definitely not the fault of the compiler.
Isn't someone running a static checker on the Linux kernel? There are commercial tools which will find code that can dereference NULL. However, there aren't free tools that will do this.
One guy blames the compiler, another blames the code, then another blames the compiler, another blames the code ... repeat x 10....
All moderated informative. WHO'S RIGHT?
Every time I try to use Linux I end up needing to compile the kernel because I want some special feature or software package that requires it. Software that doesn't ship with any distro as a prepackaged ... package. Anyways, I hate the kernel compile procedures, what a pain. I really wish there was a better tool than menuconfig/xconfig and the ensuing make and install commands.
For some unexplained reason I always get an obscure, weird error. I do know C/C++ but I am not a programmer (especially not a kernel programmer on Linux) so I really don't want to spend days figuring it out. I just want to use it. I am truly amazed it's been 13 years (since the first time I compiled a Linux kernel successfully) and there is *STILL* no nice program to manage kernel compiles and translate common errors into plain English or even Spanish (which I don't know lol).
Then an exploit comes out or a necessary kernel upgrade happens and I have to go through it all again. Ugh. Come on guys, someone make a utility that makes compiling and installing a kernel impossible to screw up. Easy even when it's giving bizarre errors. Because the damn thing never does what it's supposed to do! Argh!
For that matter this wonderful tool should be able to handle compiling any source for nearly any software. You shouldn't need a stinkin RPM or APT package which I can never seem to find the right flavor of even when they are available. Every distro needs to have the same easy to use tool that compiles directly from source and can deal with common errors itself.
13 years, and I'm still waiting.
Here's a simple test program. Compare the output when compiled with "gcc -o test test.c" to the output with "gcc -O2 -o test test.c". I used gcc version 4.3.2 (Gentoo 4.3.2-r3 p1.6, pie-10.1.5)
(Wow, the "preview" is ruining the formatting of this code.)
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <math.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
int main () {
    int fd;
    char *string = " This text was read from a null pointer!\n";
    int length = strlen(string);
    char *pointer;

    /* Create a simple file... */
    fd = open("random_nonexistant_file", O_CREAT | O_RDWR | O_TRUNC, 0666);
    write(fd, string, length);

    /* mmap the file to address zero */
    /* note: gcc optimizer doesn't know that the return value was null */
    pointer = mmap(0, 4096, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);

    /* Let's dereference that pointer and write to it! */
    pointer[0] = 42;

    /* Then we'll display it, but not using printf
       since it refuses to print from null pointers. */
    write(1, pointer, length);

    /* Also print the pointer value, just to make sure it is null. */
    printf("Pointer address: %lu\n", (unsigned long) pointer);

    /* Now let's find out if GCC thinks the pointer could possibly be null... */
    if (pointer == 0) {
        printf("The pointer is null!\n");
    } else {
        printf("The pointer is not null!\n");
    }
    return 0;
}
Either way, the pointer is displayed as zero. The unoptimized version correctly states that the pointer is null, but the optimized version will tell you that it isn't null. When optimization changes the behavior of a functional program, it is clearly doing the wrong thing.
You can spin it any way you want but it is still not true. There are jokes about "undocumented features" but this isn't even that. It's a clearly documented feature if you look up the optimisation. It's not even incorrect according to the C standard. When you dereference the null pointer then you get "undefined behaviour" up to and including initiation of global nuclear war on computers with the appropriate peripherals (unfortunately I can't find the original Fortran quote I am abusing here). The fact that your compiler does different things during different compiles is perfectly correct and the least of your problems.
This code is only reasonably legitimate because Linux user processes normally do guarantee that null pointer de-references cause aborts.
=~ s,(.*),<sarcasm>$1</sarcasm>,g if any_point_you_wish();
I don't know that I would go so far as to call it FINE, since it certainly violates best practices, but it's not technically wrong.
The code makes a potentially undefined assignment, but before doing anything significant with it, it checks for the undefined condition. It's not technically wrong but it is against best practices. Without the invalid optimization it wouldn't be a problem. In turn, the optimization is in the opposite condition. It is technically wrong, but where best practices are followed, it does no harm.
Reply to myself; sorry
This code is only reasonably legitimate because Linux user processes normally do guarantee that null pointer de-references cause aborts.
Bullshit; What you mean is:
The reason this could be legitimate in the Linux Kernel is because they have a special environment where null pointer dereferences are guaranteed not to cause a crash. However, that means that they have to rely on the compiler behaving in a defined way which means that they are fundamentally responsible for checking all the optimisations etc. etc. which their code relies on.
In a Linux user process, the compiler behaviour would be fine because Linux user processes normally do guarantee that null pointer de-references cause aborts. The kernel code would be wrong because
Thinking about this more, it seems to be a needlessly hairy and careless thing to allow in kernel code. There's a perfectly legitimate way to do this in C (get the buffer, check it's not null, use it). That legitimate way could be optimised by the compiler to be just as fast as the code used. Why not just do it right in the first place? Then, when someone cuts and pastes into user code at some later point, there won't be any problems.
And if I had to venture a guess, this mistake probably came about from one of the commercial contributors. Namely, the one that skyrocketed to the 4th spot in contributions last month (seeing as how well, Red Hat's been at this for a long time now, and while it's not impossible that they committed this flaw to the core, it's far more likely that Intel's newfound interest and throwing muscle @ the linux kernel is probably what caused this).
But that is one issue with some of the FOSS mentalities out there. Source is only half of a software product. A quality build is the second half. If you don't spend as much effort on your build/test workflow as your dev workflow, then what is the point? There is no such thing as "correct source". The product you ship is either working or not, and software runs in binary form.
The compiler cannot know tun == 0, which is what the "if (!tun)" is testing (BTW as a pun), and the code is clearly crackpot.
..." be optimised, since NULL is not guaranteed to be 0.
Any idiot, who tries to defend GCC. or "undefined" to justify optimisation removing this 0-test needs to spend the rest of their lives writing absolute machine code, in HEX, as they clearly do not understand high-level languages.
If, and only if "tun = 0; if (!tun)
If anyone wrote this code, as opposed to it getting patched in, they need their coding finger cut off, since it is not only dangerous, manifestly daft, stupid, BUT also a race unless LOCKED+INTERRUPTS off.
Clearly this code did not get enough competent eyes.
This is all a HUGE nonsense, I sincerely hope that none of you, who think this "undefined" ever write real used code, Goedel's theorem tells you that the compilers cannot know, except in a few tiny, well defined circumstances, what the value of a variable will be, at run time. That is why compiler optimization is HARD, eg
As part of the thread has already pointed out, THIS IS REALLY BAD CODE, full of puns and races; it should never have got past the sub-system maintainer, Morton & Torvalds.
The moral here is __think__, __write__what__you__mean__, and get sensitive code REVIEWED, by somebody competent.
Obviously, it is technically wrong since we're now reading a story about a null pointer exploit in the kernel...
Thank you for explaining this both correctly and in detail; I simply did not have the patience or the tolerance of rubbish!
One reason why C++ needs the Scott Meyers books is because Bjarne took your attitude,
so the standard MUST specify, in detail, what is to happen but the name of the game is to write the standard so that behavior is intuitive and obvious,
Complexity is the enemy of correctness.
Absolute, CORRECT, CLEAR.
I once had a PhD student who could not understand this. He was very clever, but always too aggressive, and constantly broke a good optimizing BCPL compiler for the PDP10. I didn't seem to be able to teach him about regression tests either. Even though he did do a lot of good he was, correctly, downgraded to a D.Phil by the external, since he couldn't explain himself in the viva.
Code Review is not useless.
Umm - no - the *code* does the undefined behaviour and *then* checks if the undefined behaviour could happen. But, heck, mistakes happen - it was identified and fixed. Not much of a story really.
A comment on Reddit pointed out something interesting. He speculated that the reason the test was after the assignment is that the programmer was trying to follow the often recommended style of initializing your variables when you declare them. The kernel uses C89, which requires all declarations to be at the top of the function, before any code (other than initializers). Thus, he couldn't test for null before initializing.
C99 removes the restriction that declarations have to be first in the body. If the kernel were using C99, the programmer could have done his argument sanity checks first, such as the check for the null pointer, and then declared and initialized the variable, AFTER the test.
The code *IS* technically wrong: It dereferences a NULL pointer. The fact that the pointer is checked against NULL *after* dereferencing it does not help one bit. Once you invoke undefined behavior, the code could do ANYTHING you can imagine and it wouldn't be the compiler's fault.
The very last thing you need is an OS, especially embedded (think mobile) or desktop, where one error, often in a daemon or driver, brings the whole thing crashing to a halt, and one that you can't debug. Think M$ Windows BSOD.
The OS is there to manage those problems, and stop the buggy process, and only that. Not your last hour's editing or last week's work or your corporate web server. So this is, once again, confused nonsense; in the kernel NULL ==== (void *)0, but the kernel TRAP handler treats KERNEL ILLEGAL 0 DEREFERENCES specially [(NB NULL ==== 0) is not true in all architectures], indeed all KERNEL errors are specially handled or disabled. In effect, the kernel will trap, log, and terminate processes which "ill mem ref" unless they have a registered or default handler [think boot time probes for mem-mapped devices], but the hardware gives you enough context to determine what happened or is broken
You, sir, have no clue what you're talking about. This has nothing to do with best practices, the code is just wrong. That is, unless you consider "you won't dereference a NULL pointer" a "best practice". The rest of us consider it a fundamental law of the universe.
A comment on Reddit pointed out something interesting. He speculated that the reason the test was after the assignment is that the programmer was trying to follow the often recommended style of initializing your variables when you declare them. The kernel uses C89, which requires all declarations to be at the top of the function, before any code (other than initializers). Thus, he couldn't test for null before initializing.
You mean there was some orbital mind control laser preventing him from writing this?
Or this?
This was a simple oversight.
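As a sketch of the point about declaration placement: both orderings below keep the null test ahead of the dereference. The types, function names and DEMO_POLL_ERR are hypothetical, chosen only for illustration, and are not taken from the kernel patch.

```c
#include <stddef.h>

/* Hypothetical types and error value, for illustration only. */
struct sk_demo  { int ready; };
struct tun_demo { struct sk_demo *sk; };
#define DEMO_POLL_ERR (-1)

/* C89 style: declare at the top without an initializer, test, then assign. */
int poll_c89(struct tun_demo *tun)
{
    struct sk_demo *sk;

    if (!tun)
        return DEMO_POLL_ERR;
    sk = tun->sk;
    return sk ? sk->ready : 0;
}

/* C99 style: test first, then declare and initialize in one statement. */
int poll_c99(struct tun_demo *tun)
{
    if (!tun)
        return DEMO_POLL_ERR;

    struct sk_demo *sk = tun->sk;
    return sk ? sk->ready : 0;
}
```

Either form leaves nothing for a null-check-deleting optimization to remove, because no dereference of tun precedes the test.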
The compiler assumed that if tun had been dereferenced, it couldn't be NULL. That's a false assumption, and there's probably dozens of other time-bombs sitting in code compiled with that compiler that just haven't been discovered yet.
The rest of us consider it a fundamental law of the universe.
Clearly, your universe is too small! Technically, the code dereferenced NULL+offset where offset (and so NULL+offset) is non zero (which I presume you are hard wired to consider to be the NULL value).
In an environment where a segv (or equivalent) won't be triggered, the code's not wrong until it makes use of an invalid dereference. The if would have prevented it. I don't think that makes it GOOD since in most environments it will fail.
In some languages or with some C optimizers, the assignment would never be evaluated at all until after the if.
Let's face it, the bug is a corner case in the complex interaction between the compiler and the kernel's vision of the environment.
I have found errors in assemblers before during intense debugging. When the C doesn't do what you want, and the results in the debugger don't make sense, you have to toss all your assumptions and will yourself to see the non-obvious. I found bugs in the Intel assembler and the PharLap assembler in certain addressing modes that were not commonly used by assembly coders. These language tools are just that, Tools. When you are a serious software engineer, you are responsible down to the bit level for efficiency and reliability. It doesn't happen often, but when you are scratching your head and going around in circles, you have to check it down to the bits.
How does this get past SELinux?
If you can limit everything that people can do after exploiting a vulnerability and being on the system, and reducing root to nothing, how does this vulnerability/exploit differ?
If you ignore ACs because they are anonymous - you're an idiot.
Where exactly does it say in TFA that this has been used to do a remote attack? It doesn't, and the YouTube demo shows quite clearly that Brad Spengler is already logged into the box he's cracking using his own account. Trailrunner7 is trying to sex this up. Of course, if there were a sizable population of consumer Linux desktops out there, and an active virus hacker community targeting said consumers with all the social tricks they use in the Windows world to get them to execute attachments, then we would be in a world of trouble at this point. But of course we're not, are we?
This was a very clear explanation, thanks.
Why do so many people in this article think that tun->sk dereferences an offset? It offsets a dereference. There's a huge difference here. The compiler would be strictly wrong if it were the former, but the compiler is not violating the C standard (arguably the optimization is over-aggressive, but it's not WRONG).
There are two bugs here:
1. Null-pointer dereference. This is not an offset-dereference.
2. Compiling with -fdelete-null-pointer-checks in a C-incompatible environment where NULL pointers can actually be validly dereferenced. Getting rid of that optimization would essentially leave you with a slightly-less performant offshoot language of C where you can dereference NULL.
The submitted patch fixes only the former (thus far, anyway).
When you look at the code in context, tun is set by a function call, then you get to the initialization of the local sk variable, and after that it attempts to check the value of tun.
So, yeah, the test is too late.
Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
I'm pretty sure the guy who found it was not just throwing NULL pointers at all the APIs. (He used the source to find it.)
In the source, tun was set by a function call before the lines quoted.
I'd say that's a bug in the friendly article, myself.
In the real world, you are not using this kernel.
Maybe you have intricate dependencies. They do exist. Maybe you are running thousands of Linux machines. That's great.
But it seems odd that you don't seem to realize that this particular bug is most likely not in any of the kernels running on any of those thousands of machines.
I think it's safe to assume that there will be an option to warn about this, soon ;)
That being said, you can already switch off the particular optimization, which is already planned for all (default) future kernel builds.
Why do so many people in this article think that tun->sk dereferences an offset?
Because that's what it does? Gcc sure thinks so based on the code it generates. Yes, I did actually check, given c=p->b, p is a pointer to struct a, and struct a is struct a { int a[0x400]; int b; };, it loads the register for c with the data at the address p+offset. That generates exactly 1 fetch at the address of p+offset. In x86, that's movl 4096(%eax), %eax
Note that just by having int a[0x400], I make sure that even when p=0, the actual memory access happens in page 1, not page zero, so even if the kernel makes sure page zero will always fault, the compiler generated code will happily let me do p->b when p=0. If you don't think it should, blame the compiler. Neither the kernel nor the hardware can know what source code led to the memory access at address 0x1000; only the compiler can know that.
If you then demand that there be two unmapped pages at the start of the address space, I'll just define int a[0x800] and bring the problem right back.
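The offset arithmetic in this argument can be checked without dereferencing anything. This is a small sketch using the struct from the comment; the helper names offset_of_b and address_of_b are made up for illustration.

```c
#include <stddef.h>

/* The struct from the comment above: 0x400 ints, then the member of interest. */
struct a {
    int a[0x400];
    int b;
};

/* Byte offset of b within struct a; with a 4-byte int this is 4096. */
size_t offset_of_b(void)
{
    return offsetof(struct a, b);
}

/* Address that p->b would touch for a given base address, computed with
 * integer arithmetic so that no pointer is actually dereferenced. */
unsigned long address_of_b(unsigned long base)
{
    return base + (unsigned long)offsetof(struct a, b);
}
```

Assuming a 4-byte int and 4096-byte pages, address_of_b(0) lands on the first byte of the second page, which matches the movl 4096(%eax), %eax instruction quoted above.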
That all means that the compiler's implicit assumption that p->b will necessarily fault when p=0 is dead wrong (and no, gcc does not confine that assumption to cases where the size of the struct is smaller than a page). Further, since C defines the NULL pointer and gcc (and most others) use 0 as a sentinel value for the NULL pointer, it's the compiler's responsibility to enforce that. The hardware doesn't consider accessing address 0 to be at all special.
The convention of the kernel disallowing page zero being mapped and the compiler assuming that will cause a fault if a NULL pointer is used is a dirty hack. It does give a substantial performance benefit, so we can mostly forgive it (though it really should only assume that when sizeof(struct) < page size).
Eliding code (that is, generating object code that doesn't do what the source code is defined to do) based on an assumption from a dirty hack that has provable failure cases is not something we can excuse. For the potential benefit of 8 bytes and 0.3 nanoseconds it introduced a nasty security flaw.
It's worth noting that if anyone at all had followed best practices, none of the ugliness would have had any actual effect, but it's the compiler that was dead wrong.
Given that that's a rather big assumption and can cause so much badness, it probably shouldn't be enabled by the innocuous looking -O3. In all other cases I've seen, when optimizations might affect correctness, the manual warns about that and states under what conditions the problems can occur. All of the cases I've seen were shortcuts in floating point operations that might lose precision, not something that would rip out elements of program flow.
I didn't use 'NULL' anywhere in the program. I used address '0', which 'NULL' is usually defined as, but strictly speaking, isn't the same thing. Basically what happened was that GCC, upon seeing me use the value in that pointer to access memory, up and decided it couldn't possibly be zero, and used that information for optimizations.
Just because there is a NULL value doesn't mean the compiler should assume that that value is always NULL.
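On the distinction between '0' and 'NULL': at the source level, standard C treats the integer constant 0 in a pointer context as a null pointer constant, so a pointer initialized from a literal 0 compares equal to NULL. A tiny sketch (the function name is made up for illustration):

```c
#include <stddef.h>

/* Returns 1 if a pointer initialized from the literal 0 compares equal to NULL. */
int literal_zero_is_null(void)
{
    char *p = 0;   /* the integer constant 0 converts to a null pointer */
    return p == NULL;
}
```

This says nothing about whether address zero is mapped at run time; it only shows that, to the compiler, a pointer holding that value is a null pointer.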
In particular, if you want to use vm86 mode in Linux, for example to use BIOS calls or any other real-mode software which will want to look at the first page of memory, you have to map that memory to that address, which necessarily involves using pointers to address zero.