How Linux's Kernel Developers 'Make C Less Dangerous' (hpe.com)
Hewlett-Packard's Enterprise blog summarizes a talk by Linux kernel developer Kees Cook at the North America edition of the 2018 Linux Security Summit. Its title? "Making C Less Dangerous."
"C is a fancy assembler. It's almost machine code," said Cook, speaking to an audience of several hundred peers, who understood and appreciated the application speed resulting from C... Over time, Cook and the people he worked with discovered numerous native C problems. To deal with these weaknesses, the Kernel Self Protection Project has worked slowly and steadily on protecting the Linux kernel from attack. In the process, it has worked to remove troublesome code from Linux....
With its operational baggage and weak standard libraries, C contains a great deal of undefined behavior. Cook cited -- and agreed with -- Raph Levien's blog post "With Undefined Behavior, Anything Is Possible." Cook gave concrete examples. "What are the contents of 'uninitialized' variables? Whatever was in memory from before! Void pointers have no type, yet we can call typed functions through them? Sure! Assembly doesn't care: Everything can be an address to call! Why does memcpy() have no 'max destination length' argument? Just do what I say; memory areas are all the same!" Some of these idiosyncrasies are relatively easy to deal with. Cook commented, "Linus [Torvalds] likes the idea of always initializing local variables. So, you should 'just do it....'"
The long-term solution? More security-savvy open source developers... While at times, the idea of coming up with a Linux C dialect has been attractive, that's not going to happen. The real issue behind the problem of dangerous code is "people don't want to do the work to clean up code -- not just bad code, but C itself," he said. As with all open source projects, "we need more dedicated developers, reviewers, testers, and backporters."
LWN.net has its own run-down of Cook's talk, as well as a link to a PDF file of his slides.
"Sound good," posted one of their commenters, "though ultimately I'd like kernel devs to adopt Rust as their main Linux kernel development language. Beats the crap out of C and C++ combined."
I hear that's a very lean and robust language.
C/C++ is not an 'amateur night' programming language, it's not 'child proofed', it doesn't hold your hand like you're a child, you can write entire operating systems in it, and as such it's supposed to have access to anything and everything, and that just so happens to include mucking up the OS of the machine you happen to be testing your code on. 'Sanitizing' it, 'child proofing' it would take away that power and make it useless. At that point you may as well just be writing things in BASIC or some other interpreted language that doesn't allow you access to anything terribly powerful or important. I've never heard anyone refer to C/C++ (or languages of similar power) as 'dangerous' before. I think it more likely that programmers have become lazy, or just aren't educated enough to be responsible with a powerful programming language, and as a result we end up seeing code that's sloppy, ill-behaved, and 'dangerous' because of it. Just like people complaining about how bad drivers are (and that we should ban humans from driving and make them use automation instead, which is stupid), someone wants to take the power away, when the real, rational solution is better education/training/testing. Have schools become lazy in how prospective programmers are educated and how their knowledge is tested? Then let's fix that problem rather than making decent programmers (and drivers) live in a world where the ability to really be behind the wheel and in control of the machine is taken away from them, because some people can't cut it.
Some 20 years ago I was consulting for a company that needed $BigCompany approval to release software. While at $BigCompany I ran across an old boss, who flat out said "interview with us, you're in". I did.
One guy asked the standard string reversal question, in C. I put a pointer at each end of the string and walked them together. The guy was completely flummoxed; it was like he'd never encountered an answer he hadn't thought of. This was only about the second question he asked me, and we spent the rest of the interview with me explaining my solution. He never understood it.
Guy turned out to be my sorta boss (matrix management means the guy you report to has no say in your performance review). He was a good guy, one of the early employees of the company, but as time went on I realized he did not understand C pointer arithmetic.
Mind you, this guy was smarter than me, and more driven. But he had never done assembly programming, hence he never really understood C pointers.
Me? Started with Z-80 assembly, moved to 8080, 8086, then 80386, which is when I learned C. Took to C pointers like a duck to water.
FWIW, the company I was consulting for never paid me for my last month (2 bi-weekly paychecks). Lots of phone calls, meetings, and fights. Huge reason I quit consulting and went back to working for companies.
No one refuses to use Rust because of NIHS because they would not have invented C either.
The real reason is entrenchment. People are just used to old ways. Rust's borrow checker has a learning curve. C programmers are used to old imperative style programming, not things like pattern matching.
That is basically the thing that applies to C code and has applied to it from the beginning. In the hands of somebody that knows what they are doing, C is not more dangerous than any other language.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
The exact same thing could be said about them wanting to switch to something new instead of learning C! In fact, if you're infected with NIHS you'd never be able to use an older generation of tools, you'd always need to adopt a new tool even when the old tools are better.
Like people who need to use an app from some company so they can communicate, because email is "old."
Because the statement is hollow: "Sound good," posted one of their commenters, "though ultimately I'd like kernel devs to adopt Rust as their main Linux kernel development language. Beats the crap out of C and C++ combined."
You have no scientific proof. It is your opinion.
People have been attempting to replace C since it was conceived.
Sort of like the idiocy in configuration management systems I see today.
Let's throw out RPM and the well-known configuration management tools we have used for the past 20-plus years, like bash shell scripting, and invent a way to configure boxes by adding additional daemons that aren't required. Oh, let's also throw in different languages and screwy config formats like YAML. That way we can boost the requirements on our resumes for higher salaries and make things incredibly overcomplicated just to update a single .conf file in /etc.
I call BULLSHIT on you CF Engine, Puppet and Chef admins.
As with management systems such as Puppet, you have no proof that things like rpm and C should be replaced for build, configuration or programming.
Other than the fact it wasn't invented by a very naive generation of idiots out of college who like to pretend the last 20 years of the computing industry was a mistake and really doesn't exist.
Just my opinion, but every day I manage thousands of boxes with bash and rpms for config management. I also patch using the C language.
Works great, and I don't need to learn Ruby, YAML, or create daemons for clients, or secure their firewalls...or...
Well, you get the idea.
Got Geometrodynamics? Awe, too hard to figure out? Too bad.
Nonsense. It is "if it is not broken, do not fix it." C is not broken and the majority of the kernel is already written in it. It is just a tool that requires you to know what you are doing. But that applies to kernel development anyways. Rust is for morons with delusions that do not have what it takes but think they can write advanced code anyways. They cannot. The language is not the main difficulty here.
Incidentally, the believers in "just use this great new tool and all your problems will be solved" have been around forever, and they have been just as stupid way back as they are today. Brooks already had a chapter on it: "There is no silver bullet." Yet time and again, some people with little understanding of what matters claim that their pet-tool is that silver bullet.
Before I read TFA, I was expecting to learn something about C coding that I didn't before.
I learned years before to NEVER, NEVER, NEVER ignore warnings. Very smart people put in those warnings and you should take heed of them. The example given (initializing a local variable in a switch) is something that you should never do - initialization must always be done outside of conditional code. If the initialization value changes according to conditions, then initialize the variable at the define with an invalid value and then test for it when you use the value outside the switch statement.
Demanding that there must be a clean build with no warnings before code goes to QA is a great way of minimizing unexpected problems down the road (I have found that before QA tests any code, they build it and send it back if any warnings come back). It doesn't take a lot of work to fix warnings and if the coder doesn't understand the reason for the warning, then they should be educated as to the reasons why it is a problem.
There are a number of APIs and constructs (like strncpy, memcpy and VLAs) mentioned in the article that should never be used as a matter of course. The rules for them should be laid out clearly in the coding rules/guidelines, and it only takes a few seconds to add a grep statement to a makefile to look for specific APIs and terminate the build if they're found. I've done this for years for teams that I've led, and there's usually a bit of grumbling, but when you explain the reasons why, you should always get compliance.
From my experience, inadvertent coding (security) issues come from not having a strong set of build (acceptance) and coding rules right from the beginning of a project.
Mimetics Inc. Twitter
For one, Linux is written in C already. For another, Rust comes with an ideology. C doesn't care.
Looks to me like these people just want their "safe space" in the kernel, where nobody tells them off for having coded something stupid.
It's because of the Not invented here syndrome (NIHS) - A mindset or corporate culture that favors internally-developed products over externally-developed products, even when the external solution is superior.
WTF? Unless you're working at Bell Labs, choosing C doesn't seem to fit into the "NIHS" you describe. Not even a little bit.
People who say "sheeple" have about as much sophistication as an AOL user, and in fact are probably actually AOL users.
For another, Rust comes with an ideology. C doesn't care.
Oh, C definitely has an ideology: Do anything you want, but beware the consequences.
Mind you, I'm quite comfortable with that. It's the kids raised on Java that shit their pants when they realize that C won't protect them from themselves. Since you can't (or at least shouldn't) write an operating system in Java, they keep casting about for a suitable language that isn't C. But since nearly every operating system of note is written in C or a C derivative, I think we can safely assume for now that they're just chasing their own tails.
We have MISRA and the Barr Group Embedded C Coding Standard
Start producing code that passes those checks and I bet a lot of the 'issues' go away.
At the level of kernel programming, you want a direct mapping of source code to assembly language, with no surprises or unexpected compiler optimisations - some early day compilers would pad out variables in C structures or even rearrange the order so it didn't match the source code.
With Rust, something like macro overloading would be a code obfuscatory dream.
Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
....kernel developers actually know how to use C.
Seriously...C is an extremely efficient and powerful language, but it must be wielded by those who know how to use its power.
Every time I thought I knew a little C, one of my programmers - who really knew his shit - would just blow my mind with some of his routines. Function calls that returned pointers to the next function (state-machine type stuff) just blew my mind first time I saw it used. Blindingly fast, but damned difficult to debug.
Even with my background in assembler, some of their stuff was just amazing.
OTOH, I would never consider writing, say, an ERP app in C, but for kernel work, interface routines, etc. it just cannot be beat.
There's nothing new here.
Different languages have their strengths and weaknesses. I've coded professionally in assembly, Plus, Pl/1, C, C++, C#, Perl, Python, Basic, Visual Basic, Java, Javascript, Bash, TCL, TK, and probably a few more I've long forgotten. Right now Java is a great fit for the problem domain I'm working in. Fortunately I can express myself well in it and the Java developer tools and ecosystem are great.
When C is the appropriate tool I don't hesitate to use it. But I don't hesitate to use my chainsaw when I'm cutting up trees either.
You just have to know what you are doing before you start the engine.
Regardless of which language you use you'd end up with C or Assembly in the bottom.
I'm not sure if Rust is the way to go or if some different language is better. VMS/OpenVMS is using a large chunk of BLISS.
Another alternative I'd think of is Erlang. Or Prolog.
For the future - think outside the box. And that may not mean C, C++ or any of the traditional procedural languages.
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
C is broken. Here is my analogy to prove to you why: C is a very powerful low level language, that has few guard rails to get in the way of whatever it is you are trying to do. It is like a giant wood chipper, in that it will easily eat up anything you feed into it: Oak trees, 2 x 4's, old couches, anything. It also has no safety mechanisms, like guard rails, kill switches, occasional mechanical inspections, etc. Would you consider ever going near such a device in the real world? Of course not, it would be far too easy to make one mistake and die horribly.
Why would you choose a language that is the virtual equivalent of a huge dangerous tool, when there are better options available, like Rust or Go? Oh, sure, you are a skilled and expert level coder and you would never make rookie mistakes with void pointers or buffer copies, but are you writing all the code at your org yourself? What about all the other bozos that you work with, you trust them not to screw up?
I grew up with C/C++, and it has a special place in my heart, but I also know that much better things have come along in recent years. Just because you grew up driving a '57 Chevy doesn't mean it is better than a modern car with ABS, air bags, cruise control and 42 mpg.
HA! I just wasted some of your bandwidth with a frivolous sig!
RTFA: "Even if you're a C expert, as are most of the Linux kernel developers, you can still make killer blunders."
If you think you are immune to C dangers, you aren't an experienced programmer at all.
Oh, I love C. In my mind, you cannot call yourself a programmer unless you have delivered at least one non-trivial piece of software in C.
Why?
Because C is the no-training-wheels programming language. It is the "I'm not saving you from yourself" language. And more importantly, it is the "I will do what you say, not what my compiler writers think that you maybe meant" language. C will do what you tell it to do, it is the original embodiment of the Unix philosophy. It doesn't second-guess the programmer. If I do that, the computer's job is to execute, not to think I'm an idiot and can't write code. I probably meant to dereference that pointer, I probably somehow made sure that it's safe and if the compiler can't see it then it assumes that it is wrong, not me.
Such beauty.
Of course, like professional tools in the physical world, in the hands of amateurs they instantly become dangerous. Don't give a chainsaw to a five year old, ok? Not a good idea. And don't give dynamite to a teenager, or something will get blown up and you don't know what.
So is it dangerous? You bet it is. Does it produce insecure code? Almost certainly because very, very few people can actually handle that stuff safely. And no, I don't count myself among them, it's been way too long since I actually wrote code in C.
But there is something to the beauty and the immediacy of having a computer not trying to think for you.
Assorted stuff I do sometimes: Lemuria.org
The only way you get buffer overflows these days is if you turn OFF the standard warnings, and use deprecated functions, while writing C.
Hooboy is that ever wrong. You are scary.
When all you have is a hammer, every problem starts to look like a thumb.
If you think you are immune to C dangers
Never said that. Would be more like "If you pretend to be immune to C dangers...". Of course C has its dangers, but even more so when people think they can program in C because they're used to Java or another high level language.
Slashdot, fix the reply notifications... You won't get away with it...
Yep. The source of numerous buffer overruns.
Many, many years ago, I wrote a function that solved that problem. Internally, it called 'memcpy' but with a max number of items that was supplied in the call OR if it was missing, it used an application wide declaration which was usually '80'.
Worked well for me for 20+ years.
Other solutions are available and acceptable to me.
He was therefore on solid ground in implying that your question was meaningless, or at best, baiting him.
Let's recap.
Oh please... Rust is for people that want to be more productive with their time at the expense of some additional system resources. For the majority of software projects it is a good deal.
The comment he was replying to. That is rhetoric. You know what rhetoric is, quite obviously, because you accused this poster of exactly that.
Nonsense. That is just the propaganda the Rust fanatics put out.
This was his reply. It was factually correct. He called out rhetoric for what it was- propaganda.
You followed that up with your loaded question, giving no one reading any reason to think that you're doing anything but trying to fool a high-school aged kid into walking down that illogical chain of discussion.
"Linus [Torvalds] likes the idea of always initializing local variables." That's new to me. I've seen, and often requested myself, many cases of redundant local automatic variable initialisation, and I don't remember seeing any backlash against them.
At the end it boils down to "you must know what you're doing".
Kernel development deals with very low level concepts and most people don't understand them. If you don't know what you're doing you will undoubtedly make mistakes, and since the OS is the layer on which everything runs, mistakes in kernel code have rippling effects throughout the system and all the software you run on it.
You could write parts of Linux (the kernel) in something other than C, but that won't save you from having to know the gory details.
The only way you get buffer overflows these days is if you turn OFF the standard warnings, and use deprecated functions, while writing C.
That's bullshit. All you have to do is access an array with an incorrect index. You won't get a warning for that. And you won't even get it flagged with a run-time check until you feed it the wrong data. You know like heartbleed.
IOW your claim is flat out false and implies you either don't program C or you're so misinformed about it that you're a menace if you do program it.
Like almost all languages, Rust won't normally have buffer overflows. They've hyped that so much it would make PT Barnum blush, acting like that's something special.
It's really not the most important thing about Rust. Though in fairness it seems that so many C hackers are so misinformed about their own language and still struggling that it apparently matters.
The borrow checker actually prevents a wide class of memory errors and data races. The former are interesting enough, though not common in C++ (I occasionally have some). The latter is much more interesting though. If I was writing high performance (multithreaded) software with irregular structure then I'd be very seriously considering Rust since it seems to be the only language up to the task.
I'm not though, so I'll stick to C++. I'm not naive enough though to think my language is perfect just because I learned it 20 years ago.
every modern language is safe from buffer overruns assuming a competent programmer.
That's a foolish claim because it's tautologically true while being misleading. Assembly language is free from buffer overruns with a sufficiently competent programmer. In practice however everyone makes mistakes.
SJW n. One who posts facts.
With Rust, something like macro overloading would be a code obfuscatory dream.
And C macros aren't? So here's the cognitive dissonance that seems to exist with C advocates. It was applied to C++ and appears to be applied unchanged to Rust too.
On the one hand the attitude is that good programmers don't need the hand-holding of Crust++. On the other hand Crust++ lets you write other kinds of bad code.
Make up your mind! Do you have good programmers or bad ones? And why on earth don't you have some form of code review?
It has nothing to do with "entrenchment" in a particular language. The kernel is written in C. Nobody is rewriting it in another language because that wouldn't just be stupid, it would be impossible.
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
What is wrong with a simple "key value" config?
To say "C is not broken" is to close your eyes to its real problems. But it's only broken in the same sense that assembler is broken. It has features that tend to lead to code that is unsafe, difficult to debug, and difficult to understand.
C is an excellent design for a computer language, just not an excellent design for a programming language. And arguments about what makes an excellent programming language are never-ending, because different use cases yield different answers. If you need resizable arrays you have a different answer than if you need fixed length strings, and that's different than if you need immutable values to access between different processes.
But for the basic C use case I generally think certain features should be forbidden. Forbidding writes beyond the bounds of allocated storage sounds good, but it is expensive to check for...and sometimes it needs to be allowed in special cases (e.g., an array within a struct that is the final element), but in other cases (e.g. when the array isn't the final element) it should be normally forbidden....except that when it's a pointer based overlay on other storage...
C is full of misfeatures that are inherently dangerous. They allow you to do useful things, but those things should be done through some alternate mechanism, that isn't so liable to errors or abuse. I'm really dubious about all the features that allow you to address an indefinite amount of memory. I'm not sure that there IS a safe way to do that that is also efficient. (One way to handle that would be to allow the program to be loaded with a specified amount of reserved RAM, and if you need to exceed that, to do a save state, and then reload it with a larger specified amount...but allocation on the heap is probably less expensive.)
I think we've pushed this "anyone can grow up to be president" thing too far.