New Linux Kernel Flaw Allows Null Pointer Exploits

← Back to Stories (view on slashdot.org)

New Linux Kernel Flaw Allows Null Pointer Exploits

Posted by Soulskill on Saturday July 18, 2009 @12:17AM from the recipe-for-fun dept.

Trailrunner7 writes "A new flaw in the latest release of the Linux kernel gives attackers the ability to exploit NULL pointer dereferences and bypass the protections of SELinux, AppArmor and the Linux Security Module. Brad Spengler discovered the vulnerability and found a reliable way to exploit it, giving him complete control of the remote machine. This is somewhat similar to the magic that Mark Dowd performed last year to exploit Adobe Flash. Threatpost.com reports: 'The vulnerability is in the 2.6.30 release of the Linux kernel, and in a message to the Daily Dave mailing list Spengler said that he was able to exploit the flaw, which at first glance seemed unexploitable. He said that he was able to defeat the protection against exploiting NULL pointer dereferences on systems running SELinux and those running typical Linux implementations.'"

17 of 391 comments (clear)

Min score:

Reason:

Sort:

I always disable those by brunes69 · 2009-07-18 00:40 · Score: 2, Interesting

I always disable those security modules as they always end up to incompatibilities and other erratic behavior in software.
Exactly what do they do anyway?
1. Re:I always disable those by Antique+Geekmeister · 2009-07-18 01:40 · Score: 2, Interesting
  
  They ruin otherwise working code that was written in slightly different environments, and for which the very arcane behavior of SELinux has not been tuned. They're also often difficult to write test suites for, especially the unpredictability of SELinux changes, since they affect behavior due to factors entirely outside the control of the particular program author: they affect behavior based on where the package installs the code and what SELinux policies are in place.
  It's gotten better: Linux operating system authors have gotten more capable with it, and the software authors have learned not to stuff everything wherevery they feel like (with the exception of Dan Bernstein, but he's loosened his licenses, so we can fix that now).
2. Re:I always disable those by Antique+Geekmeister · 2009-07-18 06:53 · Score: 2, Interesting
  
  SELinux does help prevent tools from executing in locations or in places that are inappropriate: it helps reduce the destructive capabilities of components that are mis-installed, installed without proper permissions, or that have certain classes of errors. It also helps force you to think before doing something foolish, such as running CGI tools that are not in /var/www/cgi-bin/: it's too easy for foolish people to use .htaccess or poorly handled HTTPD include directives to include some very foolish CGI tools, and I've had SELinux detect this and prevent a fool from running his very dangerous CGI utilities out of his NFS mounted home directory. So it has uses.
  Now, that "Jack will never learn to lock it down, so let's not help him so that he has to do it all from scratch himself" message you seem to have is, I'm afraid to say, both popular and insane. Jack _does not want_ to have to learn this material. Jack just wants to have his web site work. Jack doesn't have the time or the money to invest in a long-term career in server security programming: This is a modest tool that can be helpful against some classic errors. (Running CGI utilities out of home directories is my favorite that is blocked by SELinux by default.)
Wait, what? by TheRaven64 · 2009-07-18 00:58 · Score: 4, Interesting

This code looks right?

struct sock *sk = tun->sk; // initialize sk with tun->sk ... if (!tun) return POLLERR; // if tun is NULL return error

So, he's dereferencing tun, and then checking if tun was NULL? Looks like the compiler is performing an incorrect optimisation if it's removing the test, but it's still horribly bad style. This ought to be crashing at the sk = tun->sk line, because the structure is smaller than a page, and page 0 is mapped no-access (I assume Linux does this; it's been standard practice in most operating systems for a couple of decades to protect against NULL-pointer dereferencing). Technically, however, the C standard allows tun->sk to be a valid address, so removing the test is a semantically-invalid optimisation. In practice, it's safe for any structure smaller than a page, because the code should crash before reaching the test.
So, we have bad code in Linux and bad code in GCC, combining to make this a true GNU/Linux vulnerability.

--
I am TheRaven on Soylent News
1. Re:Wait, what? by Anonymous Coward · 2009-07-18 01:10 · Score: 2, Interesting
  
  The patch.
2. Re:Wait, what? by pdh11 · 2009-07-18 01:16 · Score: 5, Interesting
  
  Technically, however, the C standard allows tun->sk to be a valid address, so removing the test is a semantically-invalid optimisation.
  No. Technically, if tun is null, dereferencing it in the expression tun->sk invokes undefined behaviour -- not implementation-defined behaviour. It is perfectly valid to remove the test, because no strictly conforming code could tell the difference -- the game is already over once you've dereferenced a null pointer. This is a kernel bug (and not even, as Brad Spengler appears to be claiming, a new class of kernel bug); it's not a GCC bug.
  But as other posters have said, it would indeed be a good security feature for GCC to warn when it does this.
  Peter
3. Re:Wait, what? by Athanasius · 2009-07-18 03:59 · Score: 3, Interesting
  
  This ought to be crashing at the sk = tun->sk line, because the structure is smaller than a page, and page 0 is mapped no-access (I assume Linux does this; it's been standard practice in most operating systems for a couple of decades to protect against NULL-pointer dereferencing).
  If you actually read the exploit code (see: http://grsecurity.net/~spender/cheddar_bay.tgz) the thing that really enables this exploit is one of two ways to map page zero. One of these seems to be a flaw with SELinux (either with the default settings and/or how the default config commonly ships) or using personality(2) to select a personality that explicitly allows this.
  From the exploit for the personality case:
  
  int main(void) { int ret; struct stat fstat; ret = personality(PER_SVR4); if (ret == -1) { fprintf(stderr, "Unable to set personality!\n"); return 0; }
  
  Note you do need some setuid root program even with this (from my reading of the exploit code).
  In the SELinux case "it just works" without needing the setuid program it seems.
Re:Serious bug in gcc? by TheRaven64 · 2009-07-18 01:58 · Score: 5, Interesting

Except that his explanation is wrong. Dereferencing NULL is illegal, but pointer arithmetic on NULL is legal (and, even if it were illegal, would be practically impossible for a compiler to check). This statement is not dereferencing NULL, it is (potentially) performing pointer arithmetic on NULL (adding the offset of the sk field to it) and then dereferencing the result. This may or may not be valid, depending on how NULL is defined (it doesn't have to be 0 in C, although it usually is), what the offset of sk is, and what the memory layout of the target platform is.
On most modern platforms, NULL is defined as (void*)0 and the entire bottom page of memory is mapped as no-access. On some embedded systems, however, the bottom few hundred bytes are used for I/O and you get the addresses of these by adding a value to 0. On these systems it is perfectly valid (and correct) C to define a structure which has the layout of the attached devices and then cast 0 to a pointer to this structure and use that for I/O.

--
I am TheRaven on Soylent News
Re:Double standards by Shinobi · 2009-07-18 02:40 · Score: 4, Interesting

Sure. My last entire project has been specifically about that. Been working on a piece of software to go onto an embedded device with deterministic behaviour, with the hardware specs being 32kiBiByte RAM, no cache, 8MHz processor.
Most people I am forced to work with who have a comp sci degree are unable to work under such conditions. On the other hand, EE's and comp.eng graduates tend to be very nice to work with on such projects.
Running a static checker on the Linux kernel? by Animats · 2009-07-18 04:21 · Score: 2, Interesting

Isn't someone running a static checker on the Linux kernel? There are commercial tools which will find code that can dereference NULL. However, there aren't free tools that will do this.
Re:Double standards by TheSunborn · 2009-07-18 04:21 · Score: 2, Interesting

The compiler is allowed to assume that no other code(Including no other thread, running the same code) change the value of a variable behind its back(As long as the variable it not volatile(Volatile got it's own can of worms)), so the optimization is safe.
so in the code
int *data=myFunc();
val=*data;
printf("%d\n",val);
val=*data; // The compiler is allowed to optimize this call out.
printf("%d\n",val);
And the compiler is allowed to turn this into nothing:
int *val=myFunc(); // Returns a valid pointer to an int.
*val=10;
if(*val==11) {
*val=42; // The compiler may remove anything in here.
}
If multi-threaded code were allowed to do what you describe, then it would be impossible to do most of the optimizations that good c compilers do.
Re:Serious bug in gcc? by Rockoon · 2009-07-18 04:46 · Score: 3, Interesting

In effect, the code is a form of broken defensive programming (you check after the fact whether you've screwed up). It's wrong, but we still wouldn't want the compiler to silently remove the check. So I think the ideal solution (besides fixing the code) is to add a warning to the compiler. NULL pointer dereferences are a bug in the vast majority of cases, and checking for a NULL pointer after dereferencing it (in such a way that the compiler recognizes it and is about to remove the check) is at best redundant and more likely a bug.
My problem with this sort of thinking is when you throw in macros and templates and whatnot, there can end up being hundreds, thousands, even millions of "redundant" tests againt NULL specified by the expanded source. Now, I suspect that simply adding this warning to GCC and then compiling some large project would generate so many such warnings that the only reasonable choice would be to then disable that warning. The warning would then have no value, and if so then that certainly doesnt address the "problem."

As far as the other stuff.. my point was that the arguement that the compiler should never optimize away such if() statements is flawed. I was responding to someone who did in fact make such a claim. There are certainly cases where the pointer absolutely cannot be NULL (or absolutely must be) .. ex, the pointer was just assigned, or its nested within another test for null.

--
"His name was James Damore."
Re:Serious bug in gcc? by LateArthurDent · 2009-07-18 04:50 · Score: 2, Interesting

Given the code:
a = foo->bar;
if(foo) something()
gcc is doing precisely the wrong thing - optimising out the if on the theory that the app would have crashed if it was null.
No, that's not the theory. They weren't optimizing the if out because the app would have crashed, they were optimizing the if out because if the programmer is dereferencing foo beforehand without testing, one can assume that the programmer is sure that foo is not null by that point. I agree with you that a warning should be thrown (and I'm not sure if it is or isn't), but that if really should be optimized out.
Re:Just like Linux by Jimithing+DMB · 2009-07-18 05:02 · Score: 3, Interesting

Funny enough a few months back I made a very similar error if not the exact same error while coding on the bootloader for Darwin/x86. Except in my case it wasn't exactly a true error because in the bootloader I know that a page zero dereference isn't going to fault the machine but will instead just read something out of the IVT.
So as I recall it seemed perfectly reasonable to go ahead and initialize a local variable with the contents of something in the zero page and then check for null and end the function. But GCC had other ideas. It assumed that because I had dereferenced the pointer a few lines above that the pointer must not be NULL so it just stripped my NULL check out completely. Had it warned about this like "warning: pointer is dereferenced before checking for NULL, removing NULL check" then that would have been great. But there was no warning so I wound up sitting in GDB (via VMware debug stub) stepping through the code then looking at the disassembly until I realized that.. oops.. the compiler assumed that this code would never be reached because in user-land it would have segfaulted 4 lines ago if the pointer was indeed NULL.
Obviously the fix is simple. Declare the variable but don't initialize it at that time. Do the null check and return if null. Then initialize the variable. If using C99 or C++ then you can actually defer the local variable declaration until after you've done the NULL check which IMO is preferable. It may be that the guy wrote it as C99 (where you can do this) then went oops, the compiler won't accept that in older C and simply moved the declaration and initialization statement up to the top of the function instead of splitting the declaration from the initialization. My recollection of how I managed to introduce this bug myself is shady but as I recall it was something like that.
Re:Double standards by locofungus · 2009-07-18 06:07 · Score: 2, Interesting

int *data=myFunc();
val=*data;
printf("%d\n",val);
val=*data; // The compiler is allowed to optimize this call out.
printf("%d\n",val);
Actually, the compiler isn't allowed to optimize that second assignment to val out unless it can see the source for printf and can prove that there are not other aliases to the memory that data points to that might be changing it.
Even if you assume the default printf(), myFunc might be returning a pointer to one of the buffers used for IO.
This is one of the reasons that C99 introduced the __restrict keyword; to allow the compiler the make the sort of optimization you are suggesting here.
Tim.

--
God said, "div D = rho, div B = 0, curl E = -@B/@t, curl H = J + @D/@t," and there was light.
Re:Actually, it's already been fixed by harlows_monkeys · 2009-07-18 09:01 · Score: 2, Interesting

Actually, it's already been fixed as of 2.6.31-rc3. Interestingly enough, the code by itself was fine until gcc tries to re-assign the pointer value upon compiling
The code by itself is not fine. The underlying bug is that the kernel is allowing memory at virtual address 0 to be valid. The compiler was designed for an environment where there can never be a valid object at 0, and has chosen the bit pattern 000...000 to be the null pointer. If you want to use C in an environment where 0 can be a valid address, then you need to use a C compiler that uses some bit pattern other then 000...000 for the null pointer.
And yes, compilers can do that. According to the C standard, assigning the constant 0 to a pointer type does NOT mean assigning the numerical representation of 0. It means assigning a machine dependent value that compares unequal when compared to any pointer to valid memory.
Furthermore, the Linux version of gcc is designed for an environment where accessing invalid memory causes a fault. Thus, the compiler is doing nothing wrong optimizing out that test for null, since in the environment for which it was designed, null pointers follow the C standard and do not point to a legitimate object, and accesses of illegitimate objects fault, and in such an environment that test can never be executed.
The kernel developers either need to make sure the kernel environment conforms to gcc's assumptions about its environment, or they should use a special version of gcc designed for the kernel environment.
Re:Linus, you Rookie !! by sjames · 2009-07-18 12:58 · Score: 2, Interesting

The rest of us consider it a fundamental law of the universe.
Clearly, your universe is too small! Technically, the code dereferenced NULL+offset where offset (and so NULL+offset) is non zero (which I presume you are hard wired to consider to be the NULL value).
In an environment where a segv (or equivalent) won't be triggered, the code's not wrong until it makes use of an invalid dereference. The if would have prevented it. I don't think that makes it GOOD since in most environments it will fail.
In some languages or with some C optimizers, the assignment would never be evaluated at all until after the if.
Let's face it, the bug is a corner case in the complex interaction between the compiler and the kernel's vision of the environment.