Facebook Awards Researchers $100k For Detecting Emerging Class of C++ Bugs
An anonymous reader writes: Facebook has awarded $100,000 to a team of researchers from Georgia Tech University for their discovery of a new method for identifying "bad-casting" vulnerabilities that affect programs written in C++. "Type casting, which converts one type of an object to another, plays an essential role in enabling polymorphism in C++ because it allows a program to utilize certain general or specific implementations in the class hierarchies. However, if not correctly used, it may return unsafe and incorrectly casted values, leading to so-called bad-casting or type-confusion vulnerabilities," the researchers explained in their paper.
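To make the quoted definition concrete, here is a minimal sketch of a bad cast. The Shape, Circle, and Square classes and both helper functions are hypothetical names invented for this illustration; they do not come from the paper.

```cpp
#include <cassert>

// Hypothetical class hierarchy, used only for illustration.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape { double radius = 1.0; };
struct Square : Shape { double side = 2.0; };

// static_cast trusts the programmer: this compiles for any Shape*, but the
// behavior is undefined if the object is not really a Square (a bad cast).
inline Square* unchecked_as_square(Shape* s) { return static_cast<Square*>(s); }

// dynamic_cast consults RTTI at run time and yields nullptr on a mismatch.
inline Square* checked_as_square(Shape* s) { return dynamic_cast<Square*>(s); }
```

Passing a Circle to checked_as_square returns nullptr, while passing it to unchecked_as_square compiles without complaint and silently produces a type-confused pointer.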
Thankfully, I only use FOSS software which is not vulnerable to this problem. Many eyes are sure to catch anything like this in the rigorous peer reviews that happen on every commit.
They haven't awarded anything to "Georgia Tech University", because there is no such thing. Georgia Tech is an institute; the Georgia Institute Of Technology.
dynamic_cast requires RTTI, which means you're a bit optimistic to say "Most caught at compile time, for other casts use dynamic_cast".
Of course, templates mean that the compiler can substitute actual types. That gives you compile-time polymorphism instead of runtime polymorphism, and that in turn means you're increasingly right that most cast errors are caught at compile time. The price is unfortunately even longer compile times. Guess why I'm posting right now....
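A small sketch of what compile-time polymorphism looks like in practice. Dog, Cat, Rock, and describe are made-up names for this example.

```cpp
#include <cassert>
#include <string>

struct Dog { std::string speak() const { return "woof"; } };
struct Cat { std::string speak() const { return "meow"; } };
struct Rock {};  // deliberately has no speak() member

// The compiler substitutes the concrete type for T, so there is no common
// base class, no virtual call, and no cast anywhere.
template <typename T>
std::string describe(const T& animal) {
    return animal.speak();
}

// describe(Rock{}) would be rejected at compile time, not at run time.
```

The type error that a cast would have hidden until run time becomes a compiler diagnostic here, at the cost of instantiating describe once per type.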
I actually read the paper (okay, mod me down). Java and .Net have very strong runtime typing systems. C/C++ does not. Adding one is a bit tricky because there are certain things that are legal in C/C++ and not Java. Specifically, it's okay to cast between two classes that are non-polymorphic (unrelated from a type system perspective). Also, C/C++ applications often have some additional performance requirements. They've created a runtime typing system and then a mechanism (probably a pre-processor) that can cause static_cast and dynamic_cast to instead use their casting mechanism. You turn it on for debug and off for release. We already have things like debug heaps to look for memory corruption at a small performance cost, so why not also have a debug type-checking system? And, of course, since it gets switched off in production builds, it doesn't have the runtime performance costs there. It's one of those things that is obvious as soon as somebody does it. Those are often some of the best advances, as they can have a lot of impact quickly.
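The debug-on, release-off idea can be approximated in plain C++. The sketch below is not the paper's mechanism: checked_cast is a hypothetical helper that leans on dynamic_cast, so unlike the system described above it only works for polymorphic classes. Node, Element, and Text are invented for the example.

```cpp
#include <cassert>

// Hypothetical hierarchy for illustration.
struct Node { virtual ~Node() = default; };
struct Element : Node {};
struct Text : Node {};

// Hypothetical helper: in debug builds the assert verifies the downcast with
// dynamic_cast and aborts on a bad cast; with NDEBUG defined the assert
// disappears and only the free static_cast remains.
template <typename Derived, typename Base>
Derived* checked_cast(Base* p) {
    assert(p == nullptr || dynamic_cast<Derived*>(p) != nullptr);
    return static_cast<Derived*>(p);
}
```

Calling checked_cast<Text> on a pointer to an Element aborts in a debug build instead of quietly returning a type-confused pointer.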
Why would anyone want to? Cast is an explicit type conversion (see the standard, 5.4 [expr.cast]).
If your interface "classes" don't define all the methods you need to access an object, your architecture is screwed up. If you have to do typecasting, the interface should provide a method which is used to identify the correct class/interface for casting.
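One way to read that suggestion is an interface that reports its own concrete kind, which also works when RTTI is disabled. This is only a sketch; Kind, Widget, Button, Label, and as_button are all hypothetical names.

```cpp
#include <cassert>

enum class Kind { Button, Label };

// The interface itself says which concrete class is behind it.
struct Widget {
    virtual ~Widget() = default;
    virtual Kind kind() const = 0;
};

struct Button : Widget {
    Kind kind() const override { return Kind::Button; }
    int clicks = 0;
};

struct Label : Widget {
    Kind kind() const override { return Kind::Label; }
};

// The downcast happens only after the object has identified itself,
// so the static_cast is known to be valid.
inline Button* as_button(Widget* w) {
    return (w != nullptr && w->kind() == Kind::Button)
               ? static_cast<Button*>(w)
               : nullptr;
}
```

The weak point is that every new subclass has to report its kind honestly; nothing in the language checks that for you.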
Casting without knowing what kind of object you're dealing with isn't a "bug" -- it's a shitty developer writing crap code who should be fired.
I do not fail; I succeed at finding out what does not work.
I think that was reported back in ... oh 1973 with the original C compiler.
Just another reason to avoid C++.
I don't think it's really a reason to avoid C++; you can do lots of perverse things in C also. It's a feature of the family.
The biggest problem I personally have with C++ is operator overloads, which I think are just a bad idea.
The problem isn't so much operator overloads as it is C++-style overloading in general. Operators are just another kind of function. The problem with overloading them is that there is no common type signature or interface definition binding the various overloads together. That, and the limited set of available operators, which drives developers to reuse operator names for unrelated tasks. Even the standard library sets a poor example by overloading bit-shift operators for I/O.
Contrast that with user-defined operators in Haskell, where overloading is only allowed in the context of a typeclass instance; the Monoid typeclass, with its mempty value and (<>) operator, is one example.
There can be any number of instances of the Monoid typeclass, but for every implementation of the (<>) operator the arguments and result must be the same type, and (per the Monoid laws, which admittedly are only convention and not enforced by the type system) the implementation must be associative and have mempty as its left and right identity. The same overloading rules apply to the named function mempty and the operator (<>). Since Haskell permits arbitrary sequences of symbols as operator names, there is little pressure to abuse existing operators for new purposes, and while Haskell libraries tend to make extensive use of custom operators, one rarely encounters the same issues that C++ projects face with operator overloading.
The nearest C++ equivalent would be to define the built-in operators as members of various abstract base classes, and only allow a named function or operator to be redefined in classes which inherit their interface from the relevant base class. This unfortunately runs into some issues regarding polymorphism due to limitations of C++'s type system; for example, the implementation of mempty needs to be selected based on its return type, while C++ only supports selecting a class based on the type of the implicit parameter "this".
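For comparison, here is a sketch of a different C++ technique from the abstract-base-class approach described above: a traits template with one explicit specialization per "instance". It sidesteps the return-type problem by making the caller name the type explicitly. Monoid, mappend, and mconcat are names chosen for this sketch (with a named mappend standing in for the operator), not an existing library API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical "typeclass": no generic definition, so using an unsupported
// type is a compile-time error.
template <typename T> struct Monoid;

template <> struct Monoid<int> {
    static int mempty() { return 0; }
    static int mappend(int a, int b) { return a + b; }
};

template <> struct Monoid<std::string> {
    static std::string mempty() { return {}; }
    static std::string mappend(const std::string& a, const std::string& b) {
        return a + b;
    }
};

// C++ does not pick an overload from the return type, so the result type T
// must be spelled out by the caller: mconcat<int>(first, last).
template <typename T, typename It>
T mconcat(It first, It last) {
    T acc = Monoid<T>::mempty();
    for (; first != last; ++first) acc = Monoid<T>::mappend(acc, *first);
    return acc;
}
```

As in Haskell, the associativity and identity laws remain a convention here; the compiler only checks the signatures.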
"The state is that great fiction by which everyone tries to live at the expense of everyone else." - Bastiat
Casting is much more common in C++ code. I don't know if that's because of the proliferation of unique types, or because there are more newbie programmers working in C++, but I cringe whenever I look at a large C++ code base.
Good C code rarely needs casting, if at all. I presume the same is true of C++.
When I need complex runtime polymorphism, I'll switch to a language that better handles that, like Lua. The nice thing about C is that it interoperates easily with almost all other languages. This is less true with C++ (because of the stricter typing and abuse of overly specialized types; because of ABI issues; because of the way C++ programmers, like Java programmers, rely on mountains of third party libraries, often creating conflicts).
If you feel the need to cast, you've probably coded yourself into a corner and should think about refactoring.
Operator overloads are there for the same reason void* is: when they actually make sense, they're a vast improvement. Complex numbers, for example, and really "+" is fine for string concatenation. Not being limited to built-in classes for that sort of thing is a feature, IMO.
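A short sketch of the "when they actually make sense" case. Vec2 is a hypothetical user-defined type; std::complex and std::string are the standard-library examples mentioned above.

```cpp
#include <cassert>
#include <complex>
#include <string>

// Hypothetical user-defined type where "+" means what a reader expects.
struct Vec2 {
    double x;
    double y;
};

constexpr Vec2 operator+(Vec2 a, Vec2 b) { return {a.x + b.x, a.y + b.y}; }
constexpr bool operator==(Vec2 a, Vec2 b) { return a.x == b.x && a.y == b.y; }

// The standard library applies the same idea to complex numbers and strings.
inline std::complex<double> add(std::complex<double> a, std::complex<double> b) {
    return a + b;
}

inline std::string join(const std::string& a, const std::string& b) {
    return a + b;
}
```

Without overloading, each of these would need a differently named function such as add_vec2 or concat, which is what C code ends up doing.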
C++ has few guardrails, and lets you write very unmaintainable code, much more so than C. But that's what lets you write performance-equivalent code that's much more maintainable than C.
My biggest gripe is coders who don't bother to learn the details of the 3 key library container classes: string, vector, and map. Poor coder choices causing significant (unnecessary!) performance hits was a bad enough problem that the standard had to add a fixed-sized array class, as even the simplest stuff like pre-allocating a vector when you know its size was beyond most coders. Sad, really.
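The two habits mentioned there look roughly like this. squares and first_four_squares are made-up function names for the sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Size known only at run time: reserve() allocates once up front, so the
// push_back calls in the loop never trigger a reallocation.
inline std::vector<int> squares(std::size_t n) {
    std::vector<int> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(i * i));
    }
    return out;
}

// Size known at compile time: std::array needs no heap allocation at all.
inline std::array<int, 4> first_four_squares() {
    return {0, 1, 4, 9};
}
```

Dropping the reserve call still gives the right answer; it just pays for several reallocations and copies as the vector grows.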
And for goodness' sake, people, don't re-invent anything in <algorithm>! Stuff like inplace_merge or nth_element is really error prone to write yourself, as much fun as it might be to finally use that algorithms textbook from college.
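Both of the named algorithms are one-liners to call. In this sketch, median and merge_halves are hypothetical wrapper names; std::nth_element and std::inplace_merge are the standard functions from <algorithm>.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Median by partial sorting. Takes a copy because nth_element reorders its
// input. Precondition: v is not empty. For even sizes this returns the
// upper of the two middle values.
inline int median(std::vector<int> v) {
    auto mid = v.begin() + static_cast<std::ptrdiff_t>(v.size() / 2);
    std::nth_element(v.begin(), mid, v.end());
    return *mid;
}

// Merges the sorted ranges [0, split) and [split, size) in place.
inline void merge_halves(std::vector<int>& v, std::size_t split) {
    std::inplace_merge(v.begin(),
                       v.begin() + static_cast<std::ptrdiff_t>(split),
                       v.end());
}
```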
Socialism: a lie told by totalitarians and believed by fools.
Fuzzing and grepping are entirely different things. If your original post hadn't gotten modded up, I probably wouldn't even respond. Fuzzing is a mechanism where cleverly malformed data is sent to an application or even a piece of hardware to see how it responds. Things like an invalid message with a proper authentication code. It's a pretty effective form of testing. In this context your comment might as well be: "Testing your software is just a poor man's method of finding errors (the real problem) in some code. Glorified greps." Ideally we aren't writing defects and are bug-free before a testing cycle, but that rarely (if ever) happens. Even if there are no verification defects, there may be validation concerns. Both this and fuzzing are *dynamic* tools. Grep is a static tool, although I don't know how it could possibly be employed in finding any but the most trivial defects. There are sophisticated static tools out there as well (see FindBugs for an open source example of one), but these have nothing in common with grep.
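For anyone who has not seen one, a fuzzer in its simplest form is just a loop that throws generated input at the code and checks that nothing breaks. The sketch below uses purely random bytes, which is far cruder than the cleverly malformed data described above. parse_message is a toy function invented as the target, and fuzz is a hypothetical driver.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Toy target: the first byte is a length, the rest is the payload.
// The message is rejected unless the length matches exactly.
inline bool parse_message(const std::string& in, std::string& payload) {
    if (in.empty()) return false;
    std::size_t len = static_cast<unsigned char>(in[0]);
    if (len != in.size() - 1) return false;
    payload = in.substr(1, len);
    return true;
}

// Feeds random inputs to the parser and checks an invariant on every input
// it accepts. Returns how many inputs were accepted.
inline int fuzz(unsigned seed, int iterations) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> byte(0, 255);
    std::uniform_int_distribution<int> size(0, 8);
    int accepted = 0;
    for (int i = 0; i < iterations; ++i) {
        std::string input(static_cast<std::size_t>(size(rng)), '\0');
        for (char& c : input) c = static_cast<char>(byte(rng));
        std::string payload;
        if (parse_message(input, payload)) {
            assert(payload.size() + 1 == input.size());
            ++accepted;
        }
    }
    return accepted;
}
```

The program has to actually run for this to find anything, which is the sense in which fuzzing is a dynamic tool and grep is not.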