The Linux Kernel Is Now VLA-Free: A Win For Security, Less Overhead and Better For Clang (phoronix.com)
The in-development Linux 4.20 kernel is now effectively VLA-free. From a report: Variable-length arrays (VLAs) can be convenient and are part of the C99 standard, but they can have unintended consequences. VLAs allow array lengths to be determined at run-time rather than compile time. The Linux kernel has long relied upon VLAs in different parts of the kernel -- including within structures -- but for months now (and years if counting the kernel Clang'ing efforts) there has been an effort to remove the usage of variable-length arrays within the kernel. The problems with them are:
1. Using variable-length arrays can add some minor run-time overhead to the code due to needing to determine the size of the array at run-time.
2. VLAs within structures are not supported by the LLVM Clang compiler, which is an issue for those wanting to build the kernel outside of GCC; Clang only supports the C99-style VLAs.
3. Arguably most important, VLAs can have security implications around the kernel's stack usage.
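To make the stack-usage point concrete, here is a minimal sketch (not kernel code) of the usual way a VLA gets replaced: a fixed-size buffer plus an explicit length check. The names `checksum_fixed` and `MAX_KEY_LEN`, and the bound of 64, are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

#define MAX_KEY_LEN 64  /* hypothetical upper bound, for illustration only */

/*
 * A VLA version would declare the scratch buffer as
 *
 *     unsigned char tmp[len];
 *
 * so the amount of stack consumed depends on a run-time value.
 * The version below keeps the stack frame a fixed, auditable size
 * and rejects oversized input instead.
 */
int checksum_fixed(const unsigned char *key, size_t len, unsigned *out)
{
    unsigned char tmp[MAX_KEY_LEN];  /* size known at compile time */
    unsigned sum = 0;
    size_t i;

    if (len > MAX_KEY_LEN)
        return -1;                   /* caller asked for too much */

    memcpy(tmp, key, len);
    for (i = 0; i < len; i++)
        sum += tmp[i];

    *out = sum;
    return 0;
}
```

The trade-off is that the buffer always occupies its maximum size, in exchange for a stack frame whose size does not depend on caller-supplied data.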
This is what they are referring to. Code like (from that link):
How can we continue to believe in a just universe and freedom to eat crackers if we have no ale?
Looks like you are lost, buddy. This is C.
And how does your vector BS solve the problem? Is its storage allocated on stack entirely?
The first problem is that they can be dropped from future versions of GCC. They're not part of any standard, after all.
The second problem is that there are situations in which GCC isn't the most suitable compiler. You want to minimize hacks for each different compiler supported.
Security is a big thing, too. It's hard to audit fundamentally unpredictable code.
A major step forward.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
VLAs are an example of C becoming ever so slightly higher level. When the language does things under the hood without telling you it's just an invitation to bite you in the ass. Good purge.
But it is not TLA (Three Letter Acronym) free.
I think the Linux community should join CAT - the Campaign to Abolish TLAs.
It helps with debugging too. Build with two unrelated compiler systems and bugs that don't stand out in one may stand out in the other. And I am talking run-time errors not compiler warnings.
Once we get rid of all the GNU'isms, can we go back to simply calling it "Linux"? ;-)
Acronyms are words that you pronounce, like laser (Light Amplification by Stimulated Emission of Radiation), scuba, radar, or PIN (Personal Identification Number number).
Initialisms are words you spell out, like FBI, CIA, DNR, ECG, MRI, DVLA etc.
A TLA is an initialism, not an acronym, so really it's not a TLA, it's a TLI. Not sure which one CAT is supposed to be though!
When the Linux kernel depends on non-standard language extensions that only GCC implements, that's OK.
Except that VLAs are part of the C99 standard, and there's nothing in the standard that says they can't be used in a struct - it's just difficult for the compilers. gcc has chosen to technically implement it as an extension, while Clang/LLVM doesn't support it (nor the floating point pragmas of C99, which have also been an issue for some kernel code).
Just memory addresses. *Foo could be one or a few or many. Pointer arithmetic.
So variable arrays feel odd.
If you did not like chasing down weird memory corruption problems then you would not be using C (or C++) in the first place.
It would have been trivial to add a little bit of sanity with syntax like
void foo(char buf[blen], int blen)
so a compiler could, in debug mode, check. But no, that would not be a hero's C. Nor are variable-length arrays.
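For what it's worth, C99 does accept something close to that syntax, provided the length parameter is declared before the array that uses it. A minimal sketch follows (the function name `count_zeros` is made up for illustration). The standard does not require the compiler to check anything here: `buf` is still adjusted to a plain `char *`, so the size is documentation that a compiler or analyzer may or may not use.

```c
#include <stddef.h>

/*
 * C99 variably modified parameter: blen must come first so that it is
 * in scope when buf's declarator is parsed. Inside the function, buf is
 * an ordinary pointer; no bounds check is implied by the standard.
 */
size_t count_zeros(int blen, char buf[blen])
{
    size_t zeros = 0;

    for (int i = 0; i < blen; i++)
        if (buf[i] == 0)
            zeros++;

    return zeros;
}
```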
Incidentally, C's lack of arrays is not efficient. E.g. it is the reason we need 64 bit pointers, namely that C can only address 4 gig in 32 bit pointers. Java can access 32 gig of memory with 32 bit pointers because mallocs are aligned, and 32 gig is more than enough for the vast majority of current applications, and likely to remain so for a long time to come. Doubling your pointer size with lots of zeros is expensive, it clogs caches etc.
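The 32 gig figure is just arithmetic: if every object is aligned to 8 bytes, the low three bits of its address are always zero, so a 32-bit reference can store the address shifted right by 3 and still reach 2^32 x 8 bytes = 32 GiB. A rough sketch of the encoding, assuming 8-byte alignment and offsets measured from a heap base (the helper names are hypothetical):

```c
#include <stdint.h>

#define OBJ_ALIGN_SHIFT 3  /* assumes 8-byte object alignment */

/* Expand a 32-bit compressed reference into a 64-bit byte offset. */
uint64_t decode_ref(uint32_t ref)
{
    return (uint64_t)ref << OBJ_ALIGN_SHIFT;
}

/* Compress an 8-byte-aligned offset below 32 GiB into 32 bits. */
uint32_t encode_ref(uint64_t offset)
{
    return (uint32_t)(offset >> OBJ_ALIGN_SHIFT);
}
```

Nothing stops a C program from doing the same by hand; the cost is a shift (and usually an add of the base) on every dereference.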
Generally, you can get the tricky parts of the kernel done in C, then layer C++ on top of it. That's what a lot of embedded RTOS systems do. The biggest snag is the tendency to get bloated code from developers who are not aware of what C++ does behind the scenes.
It's almost as if you don't know that std::vector can use a custom memory allocator, eg alloca().
No sig today...