Slashdot Mirror


Microsoft Research Touts Its 'Checked C' Extension For 'Making C Safe' (microsoft.com)

Microsoft Research has pre-published a new paper to be presented at the IEEE Cybersecurity Development Conference 2018 describing their progress on Checked C, "an extension to C designed to support spatial safety, implemented in Clang and LLVM."

From "Checked C: Making C Safe By Extension": Checked C's design is distinguished by its focus on backward-compatibility, incremental conversion, developer control, and enabling highly performant code... Any part of a program may contain, and benefit from, checked pointers. Such pointers are binary-compatible with legacy, unchecked pointers but have explicitly annotated and enforced bounds. Code units annotated as checked regions provide guaranteed safety: The code within may not use unchecked pointers or unsafe casts that could result in spatial safety violations.

Checked C's bounds-safe interfaces provide checked types to unchecked code, which is useful for retrofitting third party and standard libraries. Together, these features permit incrementally adding safety to a legacy program, rather than making it an all-or-nothing proposition. Our implementation of Checked C as an LLVM extension enjoys good performance, with relatively low run-time and compilation overheads. It is freely available at https://github.com/Microsoft/checkedc and continues to be actively developed.

The extension is enabled as a flag passed to Clang -- the average run-time overhead introduced by adding dynamic checks was 8.6%, though in more than half of the benchmarks the overhead was less than 1%. They also note that from 2012 to 2018, buffer overruns were the leading single cause of CVEs.
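
To make "explicitly annotated and enforced bounds" concrete, here is a rough sketch in plain standard C of what a dynamic bounds check amounts to. It is not Checked C syntax: the `bounded_ints` type and `bounded_get` function are hypothetical names invented for this illustration. Unlike the checked pointers described in the paper, this struct carries its bounds at run time and so is not binary-compatible with a plain pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical illustration only: a pointer bundled with its bounds.
   This is plain C, not Checked C, and the names are made up. */
typedef struct {
    int   *base;   /* first element of the array */
    size_t count;  /* number of valid elements */
} bounded_ints;

/* Stores element i in *out and returns true, or returns false
   without touching memory when i is out of bounds. */
static bool bounded_get(bounded_ints a, size_t i, int *out)
{
    if (a.base == NULL || i >= a.count) {
        return false;   /* the dynamic check that prevents the overrun */
    }
    *out = a.base[i];
    return true;
}
```

The single comparison per access is where the run-time cost of such checks comes from.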

Microsoft Research says they're now evaluating Checked C, formalizing a proof of its safety guarantee -- and developing a tool to semi-automatically rewrite legacy C programs.

25 of 181 comments

  1. Funny ... by Misagon · · Score: 3, Informative

    clang/LLVM had been developed in tandem with, practically for a project for making C code safer in the first place: SAFECode.

    --
    "We mustn't be caught by surprise by our own advancing technology" -- Aldous Huxley
    1. Re: Funny ... by jd · · Score: 2

      How is vi the least bit related to the compiler?

      I assume people understand clang is a compiler, not a text editor. Or maybe I'm assuming too much.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    2. Re:Funny ... by haruchai · · Score: 2

      > clang/LLVM had been developed in tandem with, practically for a project for making C code safer in the first place: SAFECode.

      AT&T had a safe C variant called Cyclone, but I haven't heard anything about it in over a decade.

      --
      Pain is merely failure leaving the body
    3. Re:Funny ... by HiThere · · Score: 2

      IIUC efence and valgrind don't check for references beyond array bounds, but only for references beyond allocated memory. So this is different (and less expensive) than what they're proposing.
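
A small sketch of the distinction drawn above, assuming the description of those tools is right: an index can run past the end of an array and still land inside the same allocation, which a checker that only tracks allocation boundaries would not flag. The `struct account` layout and the function name are hypothetical, and the code only computes offsets; it never performs the out-of-bounds access.

```c
#include <stddef.h>

#define NAME_LEN 8

/* Hypothetical record: an array followed by another field
   in the same allocation. */
struct account {
    char name[NAME_LEN];
    int  is_admin;
};

/* Returns 1 when name[index] would be past the end of the array
   yet still inside the enclosing struct, 0 otherwise. */
static int escapes_array_but_not_allocation(size_t index)
{
    size_t byte_offset = offsetof(struct account, name) + index;
    return index >= NAME_LEN && byte_offset < sizeof(struct account);
}
```

Here index 8 is outside `name` but still inside the struct, so a write there would corrupt `is_admin` or padding without ever leaving the allocated block.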

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
  2. Re: Microsoft lags behind better research by jd · · Score: 4, Informative

    There's a difference. CompCert/Verified C is concerned with formally verifiable source code and provably correct compilation, which means pointers are bad.

    CheckedC doesn't do any of the above; it is only a secure pointer system. Microsoft's Z3 handles formal verification.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  3. What about C syntax? by david.emery · · Score: 3, Insightful

    How many errors are due to C syntax, e.g. "=" vs "=="?

    At what point do we finally decide that C just wasn't the best choice for large scale long lived systems?

    (And don't tell me about "experts don't make those mistakes". See, for instance https://www.researchgate.net/p... )

    1. Re:What about C syntax? by phantomfive · · Score: 4, Informative

      > How many errors are due to C syntax, e.g. "=" vs "=="?

      I haven't seen that error in many, many years. The compiler gives you a warning in most cases; when you look at code with that mistake, it really jumps out at you; and if it somehow does get through the compile phase, rudimentary testing will catch it. You are testing both branches of your if statements, aren't you?
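
For anyone who hasn't run into it, a minimal sketch of the mistake under discussion. The function names are hypothetical. GCC and Clang with -Wall normally warn about the assignment in the second function, suggesting parentheses if it is intentional.

```c
#include <stdbool.h>

/* Intended test: is x equal to 4? */
static bool is_four(int x)
{
    if (x == 4) {
        return true;
    }
    return false;
}

/* Same function with the typo. The condition assigns 4 to x and then
   tests the assigned value, which is nonzero, so the branch is always taken. */
static bool is_four_typo(int x)
{
    if (x = 4) {   /* bug: assignment, not comparison */
        return true;
    }
    return false;
}
```

A test that only calls `is_four_typo(4)` passes. Calling it with any other value, which exercises the false branch, exposes the bug immediately.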

      --
      "First they came for the slanderers and i said nothing."
    2. Re:What about C syntax? by theweatherelectric · · Score: 3, Informative

      Why not simply use a language that avoids the problem in the first place? In Pascal, for example, you can't do an assignment inside an if test. It's a wiser design choice.

  4. Re:Hmmm. by HiThere · · Score: 2

    What's needed is not independent comparison (well, that's needed, but that's not the problem). What's needed is a license that guarantees that there's no copyright or patented code in the result. I.e., a guarantee that the generated code can be used under any license of your choice without legal danger from either Microsoft or from any company with which they have or have had a business relationship unless the source code compiled by a standard C compiler would have the same problem.

    --

    I think we've pushed this "anyone can grow up to be president" thing too far.
  5. MISRA Comparison? by 0100010001010011 · · Score: 2

    Can anyone compare this to what Embedded has been doing for a while in functional safety?

    https://en.wikipedia.org/wiki/...

    It's why Mathworks makes stupid money off of Polyspace Static Analyzer.

    https://www.mathworks.com/prod...

    https://www.mathworks.com/prod...

    On top of that there's also the Barr Group's Embedded C Coding Standard.

    https://barrgroup.com/Embedded...

    1. Re:MISRA Comparison? by theweatherelectric · · Score: 2

      The fact that the Joint Strike Fighter's coding standards are based on MISRA C is not a good advertisement for it. The JSF's software still doesn't work right and they've been working on it for 20 years. Following MISRA C didn't avoid those problems.

    2. Re:MISRA Comparison? by phantomfive · · Score: 3, Insightful

      tbh, I don't think MISRA is that great. It's a grab-bag of miscellaneous error prevention ideas, but without a clear conception of how to avoid bugs, it prohibits some things that aren't a problem, and allows things that are.

      --
      "First they came for the slanderers and i said nothing."
    3. Re:MISRA Comparison? by jd · · Score: 2

      Agreed. And quite a few studies of MISRA say likewise.

      There's probably a subset that is genuinely useful, simply because it does seem to work when selectively applied.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  6. Re:Hmmm. by gweihir · · Score: 4, Interesting

    It is MS Research. MS proper ignores them routinely.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  7. Misleading comparison to other languages by roca · · Score: 4, Informative

    Pretty major error right in the introduction:

    > Legacy programs would need to be ported wholesale to take advantage of these languages,

    Not true for Rust. C libraries and applications can be ported to Rust incrementally and, in fact, some examples have already been done and shipped! See Federico's work on librsvg for example: https://people.gnome.org/~fede...

    1. Re:Misleading comparison to other languages by Anonymous Coward · · Score: 3, Informative

      Rust is a horrible clusterfuck of a language.

      Apart from a cumbersome syntax, traits hide implementation details, so you have to look up the implementation or (worse!) take a look at the implementation of a type in order to know what it will actually do memory-wise, information you need to know in order to use the type in any realistic way. Rust also doesn't have proper classes, which together with the overly complex borrow-checker makes it essentially impossible to write glue libraries to object-oriented foreign code or write object-oriented style in Rust that is not extremely verbose. That's why there is not a single cross-platform GUI library worth using in Rust, because these are naturally designed in an object-oriented way. The Rust people are still musing how to re-invent the wheel in a more Rust-friendly way to create a few widgets on the screen.

      The borrow-checker and lifetimes feel like they have been invented to only make a programmer's life harder. Implicit shortcuts like lifetime elision rules are glued onto the language in an attempt to make programming in Rust less of a pain in the ass, whereas in reality they only increase the complexity of programming in Rust even further.

      Last but not least, Rust claims to enhance safety - which for Rust people almost solely means memory safety of references - but doesn't even have integer range types. Ada had these in the 80s, yet the Rust guys believe they are unnecessary and openly advocate wrapping integers into a struct to achieve a similar effect.

      Add to this glued-on constructs like the "ref" keyword that means what "&" should mean in certain contexts, but "&" cannot be used because it is already used for something different, and you get the mess that Rust is now at version 1. I don't want to imagine what kind of insanely complex mess Rust is going to be in 10-20 years from now. C++ will be nothing in comparison.

  8. Re:Sigh by theweatherelectric · · Score: 3, Informative

    > C is already safe

    Is it? Let's have a look at a security analysis of applications written in C on FreeRTOS. It seems like they're riddled with flaws. Saying "just write better code" lacks real world perspective.

  9. Re:Loops, for one. by raymorris · · Score: 4, Informative

    Slashdot ate the because. See:

    https://developers.slashdot.or...

    There are many vulnerabilities in software in every language.
    As it happens, I maintain a database of every CVE ever issued, and part of my job each day is to look at any significant new vulnerabilities published that day. I've learned a couple things about languages and vulnerabilities. Obviously languages that nobody ever uses aren't used in vulnerable software very much - the number of vulnerabilities tracks fairly closely with how much use a language gets. Aside from that obvious fact, there is one more:

    Languages designed to be easy for beginners tend to be used by beginners. Beginners make beginner mistakes.

    There is very little stupid assembly code out there. There's a lot more stupid Python. This is simply because assembly is generally used by people who know WTF they are doing; Python encourages people to make software without knowing what they are doing, which means they make really bad software.

    Probably the worst language I've seen in terms of security was version 4 of PHP. It was really, really dumbed down and frequently used by people who had no clue - on public web sites. The creator of PHP openly and emphatically says he had no idea how to create a good programming language, and he's right. He was trying to create a simple blog system, but inflated loops, variables, and conditions, so people started using it as a general purpose programming language for the web.

    You DO have to be careful with C - and C programmers generally know that, and are careful. C is designed to be fast and to be flexible, and *simple* in terms of its built-ins, not to be a safe playground for newbies.

    I fear the language which may be even worse for security than PHP 4 may be Rust. It may really surprise people for me to say that, but programs written in Rust may very well have more serious vulnerabilities than any other language. Why? Because Rust hypes some very basic features to a ridiculous degree, pretending that avoiding oob access magically makes your code secure, and many Rust programmers actually believe that. By far the vast majority of vulnerabilities are logic errors like "goto fail", not buffer overruns. No language can protect you against goto fail and similar oversights.

    By making Rust programmers believe that just using Rust makes the software secure, or even meaningfully more likely to be secure, they are lulled into a false sense of security which encourages stupid mistakes. Have you ever seen a Rust program which even tests the negative conditions in its unit tests? That's one of the most basic and important things you can do in terms of security. Many Rust fanbois truly believe that using Rust is magic, so they don't even test what happens when someone enters an invalid password, or an empty password, or how about SQL injection in the password? Rust doesn't normally buffer overflow, so no need to think about security, right?
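
To illustrate the kind of logic error meant by "goto fail", here is a simplified sketch modeled loosely on the duplicated-line pattern that phrase usually refers to. The `verify` function and its parameters are hypothetical stand-ins, not code from any real library. No memory is misused anywhere in it, so bounds checking has nothing to catch.

```c
/* Hypothetical check: returns 0 when verification succeeds, -1 when it
   fails. hash_ok and signature_ok stand in for two verification steps. */
static int verify(int hash_ok, int signature_ok)
{
    int err = 0;

    if ((err = hash_ok ? 0 : -1) != 0)
        goto fail;
        goto fail;   /* bug: duplicated line, always executed */

    /* Never reached: the signature check is silently skipped. */
    if ((err = signature_ok ? 0 : -1) != 0)
        goto fail;

fail:
    return err;
}
```

With a good hash and a bad signature, `verify(1, 0)` returns 0, i.e. success. A unit test for that negative case is what catches it.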

  10. Re:Slashdot ate my post! by raymorris · · Score: 3, Interesting

    You certainly CAN write it in two lines instead of one, sure.
    You asked for an example of where it is convenient.

    As I mentioned, here's the implementation of the string copy library function in C, using some conveniences including assignment returning the value. How would you write this "copy each character" in Pascal?:

    while (*dest++ = *src++);

    I'm going to guess that rather than one line, it'll be about five lines. Some people prefer not to write five times as much code as needed.
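
For readers who want to see the one-liner in context, here is a sketch of a strcpy-style function built around it. The name `copy_string` is hypothetical, and the explicit comparison against '\0' is added only to make the intent visible; the loop does the same work as the line above. As with strcpy, nothing checks that the destination is large enough.

```c
#include <assert.h>
#include <stddef.h>

/* Copies src, including its terminating '\0', into dest and returns dest.
   The caller must guarantee that dest has room for strlen(src) + 1 bytes. */
static char *copy_string(char *dest, const char *src)
{
    char *start = dest;

    assert(dest != NULL && src != NULL);

    /* Each pass copies one character. The value of the assignment is the
       character just copied, so the loop stops right after the '\0'. */
    while ((*dest++ = *src++) != '\0') {
        /* empty body: all the work happens in the condition */
    }
    return start;
}
```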

    Personally, I kinda like this habit to not only avoid the error but make it extremely obvious that I haven't done an assignment rather than a comparison:

    if (4 == x)

    By habitually putting the constant on the left side, I'd get a compile error if I accidentally typed = instead of ==.

  11. Re:Slashdot ate my post! by theweatherelectric · · Score: 2

    > How would you write this "copy each character" in Pascal?

    Like this: myString := myOtherString;

    C's string handling is nothing to take pride in.

    > By habitually putting the constant on the left side, I'd get a compile error if I accidentally typed = instead of ==.

    Pascal doesn't have the problem in the first place.

  12. Re:Sigh by religionofpeas · · Score: 3, Insightful

    I've been writing C programs for 3 decades, and I have made plenty of mistakes along the way. Occasionally because of using the wrong pointer, but most of them were simply because I got the algorithm wrong. None of these "safe" languages would have prevented the 2nd kind of error.

  13. Bugs are not just code, some are in design by perpenso · · Score: 2

    Bugs are not always coding errors. Bugs may also be in the design, and correct code in any language can manifest such design bugs.

  14. Re:C is not the problem by Anonymous Coward · · Score: 2, Insightful

    > People are so obsessed with wrangling the last ounce of performance out of application programs

    You made coffee come out my nose!

    Have you actually *used* a major application lately? I'd say performance is far down the list.

  15. Re:Slashdot ate my post! by gnasher719 · · Score: 2

    > Please for the love of god use strncpy.

    Please for the love of god NEVER use strncpy. If your buffer doesn't have enough space, it copies the bytes from source and doesn't write a trailing zero byte, so now you have a trap just waiting to spring on you. It's the worst design possible.

    In addition, calling strncpy() to copy into a buffer of n bytes takes O(n) time. Copying 5 bytes into a megabyte buffer sets about a million bytes to 0.

    Write two helper functions. One that creates a shortened, valid C string if it doesn't fit. One that is guaranteed to crash if it doesn't fit.
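
One possible shape for those two helpers, as a sketch rather than a vetted library. The names `copy_truncate` and `copy_or_die` are hypothetical, and "crash" is interpreted here as calling abort().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copies src into dest (capacity cap bytes), truncating if necessary.
   dest is always a valid C string afterwards, provided cap > 0.
   Returns 1 if the whole string fit, 0 if it was truncated. */
static int copy_truncate(char *dest, size_t cap, const char *src)
{
    size_t len = strlen(src);
    size_t n;

    if (cap == 0) {
        return 0;                     /* no room even for the terminator */
    }
    n = (len < cap) ? len : cap - 1;  /* leave one byte for '\0' */
    memcpy(dest, src, n);
    dest[n] = '\0';
    return len < cap;
}

/* Copies src into dest, or aborts the program if it does not fit. */
static void copy_or_die(char *dest, size_t cap, const char *src)
{
    size_t len = strlen(src);

    if (len >= cap) {
        fprintf(stderr, "copy_or_die: need %zu bytes, have %zu\n",
                len + 1, cap);
        abort();
    }
    memcpy(dest, src, len + 1);       /* includes the terminator */
}
```

Neither helper zero-fills the rest of the buffer, so the cost depends on the length of the source string rather than the size of the destination.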

  16. Learned it 15 years ago by raymorris · · Score: 2

    I learned Pascal 15 years ago. It's an okay language.
    At the time, Pascal was competing with Visual Basic. VB won.

    The world could have chosen Pascal over VB, but they chose VB. In the 1970s, Pascal competed with C. The world chose C.

    Now the industry is going through a phase in which people aren't distinguishing between beginner languages that are designed to be easy vs professional, enterprise-grade tools. Legos are easy, and a good way to learn some basics. You shouldn't build your house out of Legos. The same is true when building information systems. The simplest tools may not be the best things to build your enterprise with.