The Linux Kernel Is Now VLA-Free: A Win For Security, Less Overhead and Better For Clang (phoronix.com)

Posted by msmash on Monday October 29, 2018 @08:45AM from the for-the-record dept.

With the in-development Linux 4.20 release, the kernel is now effectively VLA-free. From a report: Variable-length arrays (VLAs) are part of the C99 standard and can be convenient, but they can have unintended consequences. VLAs allow array lengths to be determined at run time rather than at compile time. The Linux kernel has long relied on VLAs in different parts of the kernel, including within structures, but for months now (and years, if counting the kernel Clang'ing efforts) there has been an effort to remove variable-length arrays from the kernel. The problems with them are:
1. Using variable-length arrays can add some minor run-time overhead, because the size of the array has to be determined at run time.
2. VLAs within structures are not supported by the LLVM Clang compiler, which is an issue for those wanting to build the kernel with something other than GCC; Clang only supports the C99-style VLAs.
3. Arguably most importantly, VLAs can have security implications around the kernel's stack usage.
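
To make the stack-usage point concrete, here is a minimal sketch of the kind of rewrite involved, not code from the kernel itself. The function names and the `MAX_KEY_LEN` bound are hypothetical and chosen only for illustration: the first function sizes its stack buffer from a run-time value, the second uses a compile-time bound and rejects anything larger.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound, chosen only for this sketch. */
#define MAX_KEY_LEN 64

/* VLA version: stack usage depends on the caller-supplied n. */
size_t checksum_vla(const unsigned char *src, size_t n)
{
    assert(src != NULL);
    if (n == 0)
        return 0;               /* a zero-length VLA is undefined behaviour */
    unsigned char tmp[n];       /* variable-length array, sized at run time */
    memcpy(tmp, src, n);
    size_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += tmp[i];
    return sum;
}

/* Fixed-size version: stack usage is a compile-time constant. */
int checksum_fixed(const unsigned char *src, size_t n, size_t *out)
{
    assert(src != NULL && out != NULL);
    unsigned char tmp[MAX_KEY_LEN];
    if (n > MAX_KEY_LEN)
        return -1;              /* reject oversized input instead of growing the stack */
    memcpy(tmp, src, n);
    size_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += tmp[i];
    *out = sum;
    return 0;
}
```

With the fixed-size form, the worst-case stack frame can be read off at compile time, and an oversized request becomes an explicit error path rather than a larger stack allocation.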

Re:"VLAs within structures" not part of C by aardvarkjoe · 2018-10-29 09:08 · Score: 4, Informative

This is what they are referring to. Code like (from that link):

void foo (int n) { struct S { int x[n]; }; }
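
For comparison, the standard C99 way to give a struct a trailing array whose length is only known at run time is a flexible array member, with the storage allocated on the heap. The following is a rough sketch under that assumption; `s_alloc` and `s_get` are hypothetical helper names, not anything from the kernel or from GCC.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct S {
    size_t n;   /* number of elements stored in x */
    int x[];    /* flexible array member: must be the last member */
};

/* Allocate a struct S with room for n zero-initialised elements. */
struct S *s_alloc(size_t n)
{
    /* Guard against overflow in the size computation. */
    if (n > (SIZE_MAX - sizeof(struct S)) / sizeof(int))
        return NULL;
    struct S *s = calloc(1, sizeof *s + n * sizeof s->x[0]);
    if (s == NULL)
        return NULL;
    s->n = n;
    return s;
}

/* Bounds-checked read of element i. */
int s_get(const struct S *s, size_t i)
{
    assert(s != NULL && i < s->n);
    return s->x[i];
}
```

Unlike the VLA-in-struct form, the type `struct S` here has a size fixed at compile time, and the run-time part lives entirely in the allocation.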

--

How can we continue to believe in a just universe and freedom to eat crackers if we have no ale?
Re:"VLAs within structures" not part of C by Anonymous Coward · 2018-10-29 09:30 · Score: 2, Informative

Looks like you are lost, buddy. This is C.
And how does your vector BS solve the problem? Is its storage allocated on stack entirely?
GCCisms by jd · 2018-10-29 09:40 · Score: 4, Informative

The first problem is that they can be dropped from future versions of GCC. They're not part of any standard, after all.
The second problem is that there are situations in which GCC isn't the most suitable compiler. You want to minimize hacks for each different compiler supported.
Security is a big thing, too. It's hard to audit fundamentally unpredictable code.
A major step forward.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
1. Re:GCCisms by The+Evil+Atheist · 2018-10-29 10:41 · Score: 3, Informative
  
  VLAs are part of the C99 standard. It says so right in the summary, and you can look up the standard itself.
  
  --
  Those who do not learn from commit history are doomed to regress it.
2. Re:GCCisms by jd · 2018-10-29 22:41 · Score: 2
  
  So you're aware that GNU introduced features often way in advance of any standard and that the GNU syntax/semantics don't always match the ISO version.
  Let's see what ISO says about VLAs:
  C99 adds a new array type called a variable length array type. The inability to declare arrays whose size is known only at execution time was often cited as a primary deterrent to using C as a numerical computing language. Adoption of some standard notion of execution time arrays was considered crucial for C’s acceptance in the numerical computing world.
  Does this match your experience?
  Would discontiguous pools of contiguous memory, giving you the ability to make anything flexible size, be that much worse as that's what the compiler will be using anyway?
  
  --
  It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
3. Re:GCCisms by Uecker · 2018-10-30 03:14 · Score: 2
  
  So you're aware that GNU introduced features often way in advance of any standard and that the GNU syntax/semantics don't always match the ISO version.
  Yes, of course. In fact, I added a GNU extension myself. I am also participating in WG14.
  
  Let's see what ISO says about VLAs:
  C99 adds a new array type called a variable length array type. The inability to declare arrays whose size is known only at execution time was often cited as a primary deterrent to using C as a numerical computing language. Adoption of some standard notion of execution time arrays was considered crucial for C’s acceptance in the numerical computing world.
  Does this match your experience?
  Absolutely. I use C for numerical computing and VLAs are very important.
  
  Would discontiguous pools of contiguous memory, giving you the ability to make anything flexible size, be that much worse as that's what the compiler will be using anyway?
  I don't understand what you are trying to say. The VLA will live on the stack or the heap depending on where one allocates it. In both cases, there is no way to resize it. Making it resizable is much harder and no compiler does this as it would require a level of indirection which reduces performance and would require some kind of automatic memory management (automatically running destructors). Of course, you can always add your own abstraction for resizable arrays.
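
The "stack or heap depending on where one allocates it" point can be sketched as follows. This is only an illustration, with hypothetical function names: the array type is variably modified (its dimensions come from run-time values), but the storage comes from `malloc`, so nothing large is placed on the stack.

```c
#include <assert.h>
#include <stdlib.h>

/* Fill an m-by-n matrix whose dimensions are only known at run time. */
void fill(size_t m, size_t n, double a[m][n], double v)
{
    assert(a != NULL);
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            a[i][j] = v;
}

/* Build an n-by-n matrix of ones on the heap and return its trace. */
double trace_of_heap_matrix(size_t n)
{
    if (n == 0)
        return 0.0;
    /* Pointer to a VLA type; the storage itself is on the heap.
     * A real program would also guard n * n against overflow. */
    double (*a)[n] = malloc(sizeof(double[n][n]));
    if (a == NULL)
        return -1.0;
    fill(n, n, a, 1.0);
    double t = 0.0;
    for (size_t i = 0; i < n; i++)
        t += a[i][i];
    free(a);
    return t;
}
```

The two-dimensional indexing `a[i][j]` works without manual offset arithmetic, which is the convenience numerical code tends to want; the size is still fixed once the allocation is made.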
High vs low languages. by HeckRuler · 2018-10-29 09:44 · Score: 3, Interesting

VLAs are an example of C becoming ever so slightly higher level. When the language does things under the hood without telling you it's just an invitation to bite you in the ass. Good purge.
1. Re:High vs low languages. by HeckRuler · 2018-10-29 10:47 · Score: 5, Interesting
  
  Yes. Exactly that. It's allocating space for you. It figures out at run-time the length of your array rather than you having to do it by hand at compile-time. I didn't actually know of any security flaws this would lead to, but it stops debuggers from knowing details about calls so it obscured some information from me and pissed me off once.
Re:Finally! by slickwillie · 2018-10-29 10:07 · Score: 3, Funny

But it is not TLA (Three Letter Acronym) free.

I think the Linux community should join CAT - the Campaign to Abolish TLAs.
Good for debugging too by drnb · 2018-10-29 10:32 · Score: 3, Interesting

It helps with debugging too. Build with two unrelated compiler systems and bugs that don't stand out in one may stand out in the other. And I am talking run-time errors not compiler warnings.
GNU'isms by drnb · 2018-10-29 10:35 · Score: 2

Once we get rid of all the GNU'isms, can we go back to simply calling it "Linux"? ;-)
1. Re:GNU'isms by jd · 2018-10-29 21:56 · Score: 2
  
  The GNU over Linux refers to GNU userspace over Linux kernelspace. So a GNU userspace over OpenBSD would be GNU/OpenBSD. BSD/BSD is 1, since you're dividing by itself.
  
  --
  It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Re:Finally! by Spacelem · 2018-10-29 10:36 · Score: 2, Informative

Acronyms are words that you pronounce, like laser (Light Amplification by Stimulated Emission of Radiation), scuba, radar, or PIN (Personal Identification Number number).
Initialisms are words you spell out, like FBI, CIA, DNR, ECG, MRI, DVLA etc.
A TLA is an initialism, not an acronym, so really it's not a TLA, it's a TLI. Not sure which one CAT is supposed to be though!
Re:Non-standard language extensions by arth1 · 2018-10-29 11:35 · Score: 2

When the Linux kernel depends on non-standard language extensions that only GCC implements, that's OK.
Except that VLAs are part of the C99 standard, and there's nothing in the standard that says they can't be used in a struct - it's just difficult for the compilers. gcc has chosen to technically implement it as an extension, while Clang/LLVM doesn't support it (nor the floating point pragmas of C99, which have also been an issue for some kernel code).
C does not really have arrays by aberglas · 2018-10-29 12:00 · Score: 4, Interesting

Just memory addresses. *Foo could be one or a few or many. Pointer arithmetic.
So variable arrays feel odd.
If you did not like chasing down weird memory corruption problems then you would not be using C (or C++) in the first place.
It would have been trivial to add a little bit of sanity with syntax like
void foo(char buf[blen], int blen)
so a compiler could, in debug mode, check. But no, that would not be a hero's C. Nor are variable-length arrays.
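
For what it's worth, C99 does accept something close to this when the length parameter is declared before the array parameter that uses it. The standard does not require the compiler to check the bound, so treat the sketch below (with a hypothetical `count_zeros` function) as documentation of intent plus whatever diagnostics a given compiler chooses to offer.

```c
#include <assert.h>
#include <stddef.h>

/* The bound must be declared before the array parameter that uses it.
 * The parameter still decays to a pointer; blen is not enforced. */
size_t count_zeros(size_t blen, const char buf[blen])
{
    assert(buf != NULL || blen == 0);
    size_t zeros = 0;
    for (size_t i = 0; i < blen; i++)
        if (buf[i] == 0)
            zeros++;
    return zeros;
}
```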
Incidentally, C's lack of arrays is not efficient. E.g. it is the reason we need 64 bit pointers, namely that C can only address 4 gig in 32 bit pointers. Java can access 32 gig of memory with 32 bit pointers because mallocs are aligned, and 32 gig is more than enough for the vast majority of current applications, and likely to remain so for a long time to come. Doubling your pointer size with lots of zeros is expensive, it clogs caches etc.
Re:"VLAs within structures" not part of C by Darinbob · 2018-10-29 12:41 · Score: 3, Informative

Generally, you can get the tricky parts of the kernel done in C, then layer C++ on top of it. That's what a lot of embedded RTOS systems do. The biggest snag is the tendency of getting bloated code from developers not aware of what C++ does behind the scenes.
Re:"VLAs within structures" not part of C by Joce640k · 2018-10-29 23:20 · Score: 2

It's almost as if you don't know that std::vector can use a custom memory allocator, eg alloca().

--
No sig today...