MIT Debuts Integer Overflow Debugger
msm1267 writes: Students from M.I.T. have devised a new and more efficient way to scour raw code for integer overflows, the troublesome programming bugs that serve as a popular exploit vector for attackers and often lead to the crashing of systems. Researchers from the school's Computer Science and Artificial Intelligence Laboratory (CSAIL) last week debuted the platform dubbed DIODE, short for Directed Integer Overflow Detection. As part of an experiment, the researchers tested DIODE on code from five different open source applications. While the system was able to generate inputs that triggered three integer overflows that were previously known, the system also found 11 new errors. Four of the 11 overflows the team found are apparently still lingering in the wild, but the developers of those apps have been informed and CSAIL is awaiting confirmation of fixes.
Sad to say, my own code probably has a huge number of them. I'd approximate it as:
(int)0xFFFF... with a bunch more F's
Umm...an overflow flag?
HBI's Law: Frequency of calling others Nazis is directly correlated with the likelihood of the accuser being Communist.
What flag is that then?
On an x86, "OF" (the same flag is called "V" on many other architectures).
Not that checking it after every add instruction is really that practical. It would be better to have trapping and non-trapping versions of integer arithmetic, and to have languages with semantics which expose that choice to the programmer.
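To make the trapping versus non-trapping distinction concrete, here is a rough sketch of what the three flavors could look like in C. The function names are made up for illustration, and `__builtin_add_overflow` is a GCC/Clang extension rather than standard C, so treat this as one possible approach and not as how any particular language does it.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Non-trapping add: unsigned arithmetic wraps modulo 2^N by definition. */
unsigned add_wrapping(unsigned a, unsigned b) {
    return a + b;
}

/* Checked add (hypothetical name): returns false on overflow instead of
 * wrapping. __builtin_add_overflow is a GCC/Clang extension. */
bool add_checked(int a, int b, int *out) {
    return !__builtin_add_overflow(a, b, out);
}

/* Trapping add (hypothetical name): aborts the program on signed overflow. */
int add_trapping(int a, int b) {
    int sum;
    if (__builtin_add_overflow(a, b, &sum)) {
        abort();
    }
    return sum;
}
```

With something like this, the programmer picks the behavior per call site instead of paying for a check on every add.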
The one problem with their method is that it can only detect overflows in one direction.
What flag is that then?
It's the Evil Bit
You're thinking too low level.
Integer overflows are only problematic when you use externally-controlled values to manipulate your internal data structures. If an attacker overflows a simple counter that I'm only going to echo back to him, it's not that big of a deal. Garbage in is garbage out. It's only when I use that counter in an expression which, e.g., will malloc memory. I don't want _his_ garbage to taint the state of my program.
If you want safer behavior, then in the vast majority of cases you don't need language semantics. At most you just need a better API. E.g. OpenBSD's new reallocarray(3) routine, which checks for multiplicative overflow when allocating an array of objects.
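For anyone who hasn't seen the idea, a minimal sketch of that kind of check follows. This is not OpenBSD's actual code; `alloc_array_checked` is a hypothetical helper that only illustrates refusing a request whose element count times element size would not fit in a `size_t`.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper in the spirit of reallocarray(3): fail cleanly if
 * nmemb * size would overflow size_t, instead of allocating a short buffer. */
void *alloc_array_checked(size_t nmemb, size_t size) {
    if (size != 0 && nmemb > SIZE_MAX / size) {
        errno = ENOMEM;   /* the product would wrap around */
        return NULL;
    }
    return malloc(nmemb * size);
}
```

The division happens before the multiplication, so the check itself can never overflow.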
In fact, adding language semantics might be worse. Arithmetic overflow has become a popular vector for attack, but someday exploiting bugs in programs related to exception handling will become popular. And adding low-level language semantics which throws on overflow will only exacerbate those problems by providing more opportunities to tickle program bugs.
You don't want a program that can detect every attempt at exploitation. That just creates noise. You want robust algorithms and applications that isolate garbage values. If an attacker sends garbage, you return garbage. If his garbage causes you to take a different flow of execution, however, that provides him a way to reach bugs in the little-used parts of your code.
That's why I used unsigned values as much as possible in C. Unsigned overflow is well defined. Furthermore, it's second nature to check whether a value is beyond the _upper_ bounds of some object. But having to check for negative values (and underflow) requires twice the amount of effort and code. Plus, even better than checking for overflow is simply rolling with the punches. If you're indexing an array, sometimes it's safer to clamp the value to the range of the array, rather than adding a condition which takes an exceptional code path. You may end up with garbage output, but that's fine as long as it's because you received garbage input. Just don't let the garbage taint any other state, and don't allow it to change the flow of execution in your program unnecessarily.
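A small sketch of the clamping idea, assuming an unsigned index and a non-empty array. Both function names are invented for the example.

```c
#include <stddef.h>

/* Clamp an untrusted index into [0, len - 1]. The caller must guarantee
 * len > 0. Because idx is unsigned, only the upper bound needs checking. */
size_t clamp_index(size_t idx, size_t len) {
    return idx < len ? idx : len - 1;
}

/* Hypothetical lookup: a garbage index yields a garbage but in-bounds
 * element, without branching into a separate error path. */
int table_get(const int *table, size_t len, size_t idx) {
    return table[clamp_index(idx, len)];
}
```

A "negative" index cast to `size_t` becomes a huge value and is clamped to the last element like any other out-of-range input.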
That was the way it used to work, and in C at least the way it's exposed is the unsigned type.
C assumes you will choose an integer size that won't overflow in your application. If you don't, that's a bug, even if C provided the run-time ability to detect overflows. So making a run-time error the default behavior when it causes a measurable performance hit on most platforms (excluding MIPS and Alpha) doesn't really make sense in a low-level language like C.
Any sufficiently unpopular but cohesive argument is indistinguishable from trolling.
MIT already had an "integer overflow debugger" decades ago. It was called Lisp.
Ezekiel 23:20
Not that checking it after every add instruction is really that practical. It would be better to have trapping and non-trapping versions of integer arithmetic, and to have languages with semantics which expose that choice to the programmer.
Swift does exactly that. Every instruction is checked for overflow. Not sure how clever the compiler is in proving that certain instructions cannot overflow.
This is what happens when you don't watch out for overflows.
The only language I know of that does checking for integer overflows is actually Swift, surprisingly enough.
"First they came for the slanderers and i said nothing."
> Not that checking it after every add instruction is really that practical.
It's entirely practical, it's just that few compilers offer the option. And C compilers can't offer the option for unsigned arithmetic, which is defined to wrap silently.
Beyond the increase in code size (the x86 INTO instruction was one byte, but unfortunately x86-64 doesn't have it), it shouldn't even affect execution speed, as the check doesn't access memory or modify a register.
Integer overflow handling is one of the first things I look at in new languages. I rejected Rust as soon as I discovered that they'd punted on this (you'd have to explicitly use the checked_* functions which return an Option type, resulting in ridiculously verbose code; infix arithmetic operators perform unchecked arithmetic). Swift got this right, but adds too much other overhead.
If his garbage causes you to take a different flow of execution, however, that provides him a way to reach bugs in the little-used parts of your code.
The different flow of execution triggered by an overflow trap should almost always be a simple call to "abort()". At this point, your program has already failed and should be stopped.
I disagree with your premise. Garbage input values should be checked and rejected in software before the overflow ever occurs. The hardware overflow check should be a last resort to enforce this at every instruction step, and in the worst case it converts privilege exploits into less serious DoS attacks.
Allowing "garbage output" as you propose just creates more opportunities for attacks when that output gets consumed somewhere.
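As a sketch of what "checked and rejected in software before the overflow ever occurs" could mean for signed addition in portable C, here is one common way to write the test. The function name is hypothetical. The point is that the comparison is done against the limits first, so the overflowing addition is never evaluated.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if a + b would overflow int. The sum is never computed,
 * so there is no undefined behavior even for extreme inputs. */
bool add_would_overflow(int a, int b) {
    if (b > 0) {
        return a > INT_MAX - b;   /* would exceed the upper bound */
    }
    if (b < 0) {
        return a < INT_MIN - b;   /* would fall below the lower bound */
    }
    return false;
}
```

A caller would reject the input when this returns true and only then perform the add, leaving any trap as the last line of defense.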
gcc -ftrapv
gcc -fsanitize=undefined
Visual Basic.NET has it. It's on by default, though you can turn it off for a project.
C# also has it, but it's off by default. It also has keywords to selectively turn it on and off in specific code blocks.
-ftrapv hasn't worked since at least 2008.
clang -fsanitize=undefined, since signed integer overflow is formally undefined.
"Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager
Okay, research paid for with my tax dollars. Where can I download it?
You can't. The title should have read "MIT Publishes Paper Discussing Alleged Integer Overflow Debugger That You'll Never Be Able to Get Your Hands On".
(Incidentally, this isn't the first paper on a tool like this. None of the tools have ever been released for general use, although you can occasionally find buggy, research-prototype-level code somewhere. I played with one a year or two ago; after several hours of rewriting their code to try to get it working on something other than the one specific configuration of some old Linux distro they tried it with, I gave up.)
The first big problem with integers is that they are really badly defined in C, so just like you I try to use unsigned as much as possible:
Any underflow turns into a big overflow, so it can be tested for at the same time as the overflow test, and the semantics of power-of-two sized wraparound is pretty solid on all platforms and implementations.
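A tiny illustration of that single-comparison trick, with a made-up function name. It assumes the range is described as a start plus a length.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if idx lies in [lo, lo + len). If idx < lo, the unsigned
 * subtraction wraps to a huge value, so one comparison rejects both
 * the "too small" and the "too large" cases. */
bool in_range(size_t idx, size_t lo, size_t len) {
    return idx - lo < len;
}
```

With signed types the same test would need two comparisons, and the subtraction itself could overflow.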
OTOH I don't agree that having proper overflow handling would mostly be a new source of bugs, i.e. on the new Mill CPU architecture we have a full orthogonal set of all basic operations:
When adding two numbers (belt values) you can specify signed or unsigned, and over/underflow to be handled as saturating, wraparound or trapping, as well as automatically widening.
http://millcomputing.com/wiki/...
Look at ADDSW as an example of a Signed ADD that will widen if needed.
Since the Mill carries metadata alongside each belt slot it does not need separate byte/short/word/dword ADD instructions: The size of the operations is defined by the belt slot specified and not in the instruction encoding, so the machine code is polymorphic in data item size.
I.e. you can start with 8-bit values and an 8-bit accumulator, when the sum becomes too large then it is automatically widened to 16 bits or more. This works all the way to 128 bits for all scalar operations.
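For readers without a Mill handy, the wraparound, saturating and widening behaviors can be roughly emulated in plain C for 8-bit unsigned values. This is only an illustration of the semantics; the function names are invented and it says nothing about how the Mill encodes or implements these operations.

```c
#include <stdint.h>

/* Wraparound: the sum is reduced modulo 256. */
uint8_t add8_wrap(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

/* Saturating: the sum sticks at UINT8_MAX instead of wrapping. */
uint8_t add8_sat(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + b;
    return sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
}

/* Widening: the result is kept in the next larger type, so two 8-bit
 * operands can never overflow it. */
uint16_t add8_widen(uint8_t a, uint8_t b) {
    return (uint16_t)((unsigned)a + b);
}
```

For inputs 200 and 100 the three functions return 44, 255 and 300 respectively.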
Terje
"almost all programming can be viewed as an exercise in caching"
What's Mel Kaye's opinion about all this?
I ran this on my microcontroller code, and it found all sorts of these errors in all my timer and counter code. Now I have to go patch it all before any of those overflows happen. Thanks MIT!
India gives up nuclear weapons.
Too bad it's not generally checked for. That's where the tool comes in.
Overflow isn't always wrong. Two's complement arithmetic DEPENDS on ignoring an overflow AND underflow to 'just work', for example.
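One everyday case where wrapping is the desired behavior is a free-running timer. The sketch below assumes a 32-bit tick counter and uses a hypothetical function name.

```c
#include <stdint.h>

/* Ticks elapsed between two readings of a free-running 32-bit counter.
 * The unsigned subtraction wraps modulo 2^32, so the result is correct
 * even if the counter rolled over between the two readings, as long as
 * fewer than 2^32 ticks have passed. */
uint32_t ticks_elapsed(uint32_t start, uint32_t now) {
    return now - start;
}
```

A tool that flagged this subtraction as an overflow bug would be reporting a false positive.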
Python dodges the question by using long when necessary. A Python long is what C would call a bignum.
Umm, that's why I originally said that languages should explicitly support both trapping and non-trapping versions.
I'm pointing out that a sufficiently large number of ops will be using the non-trapping versions that it probably makes more sense for the compiler to add a check of the flags in the cases where it would be helpful.