Microsoft Research Touts Its 'Checked C' Extension For 'Making C Safe' (microsoft.com)
Microsoft Research has pre-published a new paper to be presented at the IEEE Cybersecurity Development Conference 2018 describing their progress on Checked C, "an extension to C designed to support spatial safety, implemented in Clang and LLVM."
From "Checked C: Making C Safe By Extension": Checked C's design is distinguished by its focus on backward-compatibility, incremental conversion, developer control, and enabling highly performant code... Any part of a program may contain, and benefit from, checked pointers. Such pointers are binary-compatible with legacy, unchecked pointers but have explicitly annotated and enforced bounds. Code units annotated as checked regions provide guaranteed safety: The code within may not use unchecked pointers or unsafe casts that could result in spatial safety violations.
Checked C's bounds-safe interfaces provide checked types to unchecked code, which is useful for retrofitting third party and standard libraries. Together, these features permit incrementally adding safety to a legacy program, rather than making it an all-or-nothing proposition. Our implementation of Checked C as an LLVM extension enjoys good performance, with relatively low run-time and compilation overheads. It is freely available at https://github.com/Microsoft/checkedc and continues to be actively developed.
The extension is enabled as a flag passed to Clang -- the average run-time overhead introduced by adding dynamic checks was 8.6%, though in more than half of the benchmarks the overhead was less than 1%. They also note that from 2012 to 2018, buffer overruns were the leading single cause of CVEs.
Microsoft Research says they're now evaluating Checked C, formalizing a proof of its safety guarantee -- and developing a tool to semi-automatically rewrite legacy C programs.
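To make the "dynamic checks" in the summary concrete, here is a minimal sketch in plain C (not Checked C syntax) of what a bounds check on an array access amounts to. The function name `checked_read` and the choice to report failure through a return value are assumptions made for illustration; as the summary describes it, Checked C attaches bounds to the pointer through annotations and has the compiler enforce them, rather than relying on a hand-written helper like this one.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: reads buf[index] only if index lies within
   the declared element count. Returns false instead of touching
   memory outside the buffer. */
bool checked_read(const int *buf, size_t count, size_t index, int *out)
{
    if (buf == NULL || out == NULL || index >= count)
        return false;          /* would have been an out-of-bounds read */
    *out = buf[index];
    return true;
}
```

The cost of such a check is one comparison per access, which is the kind of run-time overhead the benchmark figures above are measuring.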
Clang/LLVM had been developed in tandem with, and practically for, a project meant to make C code safer in the first place: SAFECode.
"We mustn't be caught by surprise by our own advancing technology" -- Aldous Huxley
There's a difference. CompCert/Verified C is concerned with formally verifiable source code and provably correct compilation, which means pointers are bad.
Checked C doesn't do any of the above; it is only a secure pointer system. Microsoft's Z3 handles formal verification.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
How many errors are due to C syntax, e.g. "=" vs "=="?
I haven't seen that error in many, many years. The compiler gives you a warning in most cases. When you look at code with that mistake, it really jumps out at you, and if it somehow does get through the compile phase, rudimentary testing will catch it. You are testing both branches of your if statements, aren't you?
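For anyone who hasn't run into it, here is a small sketch of the mistake being discussed. The function names are made up for illustration. The doubled parentheses in the buggy version are there only so the example builds without diagnostics; written as a plain `if (x = 0)`, gcc (with `-Wall`) and clang would normally flag it.

```c
#include <stdbool.h>

/* Buggy: "=" assigns 0 to x, so the condition is always false
   and the function never reports a zero. */
bool is_zero_buggy(int x)
{
    if ((x = 0))
        return true;
    return false;
}

/* Fixed: "==" compares x against 0. */
bool is_zero_fixed(int x)
{
    if (x == 0)
        return true;
    return false;
}
```

A test that exercises only the false branch passes for both versions. Calling the function with 0 and expecting true is what exposes the buggy one, which is the point about testing both branches.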
"First they came for the slanderers and i said nothing."
Why not simply use a language that avoids the problem in the first place? In Pascal, for example, you can't do an assignment inside an if test. It's a wiser design choice.
Pretty major error right in the introduction:
> Legacy programs would need to be ported wholesale to take advantage of these languages,
Not true for Rust. C libraries and applications can be ported to Rust incrementally and, in fact, some examples have already been done and shipped! See Federico's work on librsvg for example: https://people.gnome.org/~fede...
C is already safe
Is it? Let's have a look at a security analysis of applications written in C on FreeRTOS. It seems like they're riddled with flaws. Saying "just write better code" lacks real world perspective.
Slashdot ate the because. See:
https://developers.slashdot.or...
There are many vulnerabilities in software in every language.
As it happens, I maintain a database of every CVE ever issued, and part of my job each day is to look at any significant new vulnerabilities published that day. I've learned a couple things about languages and vulnerabilities. Obviously languages that nobody ever uses aren't used in vulnerable software very much - the number of vulnerabilities tracks fairly closely with how much use a language gets. Aside from that obvious fact, there is one more:
Languages designed to be easy for beginners tend to be used by beginners. Beginners make beginner mistakes.
There is very little stupid assembly code out there. There's a lot more stupid Python. This is simply because assembly is generally used by people who know WTF they are doing; Python encourages people to make software without knowing what they are doing, which means they make really bad software.
Probably the worst language I've seen in terms of security was version 4 of PHP. It was really, really dumbed down and frequently used by people who had no clue - on public web sites. The creator of PHP openly and emphatically says he had no idea how to create a good programming language, and he's right. He was trying to create a simple blog system, but included loops, variables, and conditions, so people started using it as a general purpose programming language for the web.
You DO have to be careful with C - and C programmers generally know that, and are careful. C is designed to be fast and to be flexible, and *simple* in terms of its built-ins, not to be a safe playground for newbies.
I fear the language which may be even worse for security than PHP 4 may be Rust. It may really surprise people for me to say that, but programs written in Rust may very well have more serious vulnerabilities than any other language. Why? Because Rust hypes some very basic features to a ridiculous degree, pretending that avoiding oob access magically makes your code secure, and many Rust programmers actually believe that. By far the vast majority of vulnerabilities are logic errors like "goto fail", not buffer overruns. No language can protect you against goto fail and similar oversights.
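A stripped-down sketch of a "goto fail"-style logic error, for reference. This is not the original code from that incident; `check_a`, `check_b`, and the thresholds are hypothetical stand-ins for real verification steps. Every memory access here is in bounds, so no bounds check would object.

```c
/* Hypothetical checks standing in for real verification steps. */
static int check_a(int v) { return v >= 0 ? 0 : -1; }
static int check_b(int v) { return v < 100 ? 0 : -1; }

/* Buggy: the duplicated goto is unconditional, so check_b never
   runs and err is still 0 (success) whenever check_a passes. */
int verify_buggy(int v)
{
    int err;
    if ((err = check_a(v)) != 0)
        goto fail;
    goto fail;                 /* stray duplicate line */
    if ((err = check_b(v)) != 0)
        goto fail;
fail:
    return err;
}

/* Fixed: both checks run before success is reported. */
int verify_fixed(int v)
{
    int err;
    if ((err = check_a(v)) != 0)
        goto fail;
    if ((err = check_b(v)) != 0)
        goto fail;
fail:
    return err;
}
```

With an input of 500, the fixed version rejects it through `check_b`, while the buggy version returns 0 and accepts it.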
By making Rust programmers believe that just using Rust makes the software secure, or even meaningfully more likely to be secure, they are lulled into a false sense of security which encourages stupid mistakes. Have you ever seen a Rust program which even tests the negative conditions in its unit tests? That's one of the most basic and important things you can do in terms of security. Many Rust fanbois truly believe that using Rust is magic, so they don't even test what happens when someone enters an invalid password, or an empty password, or how about SQL injection in the password? Rust doesn't normally buffer overflow, so no need to think about security, right?
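Testing the negative conditions is not tied to any one language. A minimal sketch in C of what that might look like follows; `password_ok` is a hypothetical function invented for this example, and real code would compare salted hashes in constant time rather than call `strcmp` on plaintext.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical check, for illustration only. */
bool password_ok(const char *supplied, const char *expected)
{
    if (supplied == NULL || expected == NULL)
        return false;
    if (supplied[0] == '\0')
        return false;                      /* never accept an empty password */
    return strcmp(supplied, expected) == 0;
}

void test_password_ok(void)
{
    /* The one positive case. */
    assert(password_ok("s3cret", "s3cret"));

    /* Negative cases: wrong, empty, missing, and injection-style input. */
    assert(!password_ok("wrong", "s3cret"));
    assert(!password_ok("", "s3cret"));
    assert(!password_ok(NULL, "s3cret"));
    assert(!password_ok("' OR '1'='1", "s3cret"));
}
```

Four of the five assertions cover inputs that must be rejected, which is roughly the ratio the comment is asking for.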