Maybe he's doing it on purpose. Exposing his kinky sexual habits to the whole world might just be a huge turn-on for him, and he's even getting money for it? Win-win!
Actually, the more I think of it, the more this seems to be the only viable explanation.
You sued there? Are you Max Mosley, Titus Groan?
Yep, I never would have even thought of looking for it until I read about the lawsuit. Now of course I just had to see it. Not hard to find, by the way: here
Except for the occasional mishap where they open fire on their own troops.
But if buf is already unsigned, and the problem is that buf+len wraps around to some positive value less than buf, your fix doesn't help.
Then again, I suppose the optimizer won't (or at least shouldn't) optimize in that case. So I imagine the bug only occurred if pointers are converted to int and not unsigned int.
As far as "if (buf + len > INT_MAX)" (or whatever you're using for the value of the largest possible value for the appropriate signed integral data type), which has the advantage of expressing what you're actually testing for.
No, that will always return false, because if buf + len is bigger than INT_MAX, the result of the addition will be a smaller number due to that very overflow we were testing against. Except if you cast it to a long long or something like that.
You could use "if (INT_MAX - int(buf) < len)" though. (Assuming buf is a char*, otherwise add some multiplications with sizeof)
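Another way to avoid the wraparound question altogether is to never form buf + len at all, and compare len against the space that is left instead. A minimal sketch, assuming the code knows where the buffer ends; the names fits_in_buffer and buf_end are made up for illustration and do not come from the code being discussed:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when len bytes starting at buf fit before buf_end.
   buf_end (one past the last valid byte) is an assumption of this sketch. */
static bool fits_in_buffer(const char *buf, const char *buf_end, size_t len)
{
    /* Precondition: both pointers refer to the same buffer. */
    assert(buf <= buf_end);

    /* Pointer subtraction within one buffer is well defined. */
    size_t remaining = (size_t)(buf_end - buf);

    /* No pointer addition happens, so there is nothing that can wrap. */
    return len <= remaining;
}
```

Because the comparison is between two sizes, there is no undefined behaviour for the optimizer to reason about, and a huge len simply fails the check.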
nothing wrong with checking "if (buf + len < buf)". That should just work as intended, or at least give a compiler warning. Instead, the compiler just takes it out without warning because it thinks it knows better than us.
Quite the reverse. The compiler is assuming that you know what you mean and have written what you mean. If what you mean doesn't make sense, then it tries to interpret it as best as it can (in this example, if you've done something that means that the only well-defined values of i are 0-4, then i is always less than 10).
That was not the example I was talking about. "if (buf + len < buf)" does make perfect sense, because it catches a bug that is caused precisely by the fact that buf + len can become less than buf due to overflow. You write a program, it crashes because some value becomes negative, you add code to fix it ("if the value is negative, take corrective action") and then the compiler just says "you don't know what you're doing, this value will never be negative". And then your app crashes because, guess what, the value became negative. And you're wondering why the hell the code didn't catch it. Set a breakpoint, the code passes right over it, enter an expression to check the sign, sure thing, the sign IS negative, what the hell is going on?!
As for the reading from beyond the bounds of an array, really, I still think it's going a bit too far there. When you read from a location that is outside the bounds of an array, you should either get garbage numbers or a seg fault, but the flow of the program should never be altered like that. I do understand what's happening, I'm just saying they went too far. Yes, OK, I get it, it's undefined and therefore they are allowed to let demons come out of my nose, but really?!
It was probably a condition for their deal with China Mobile.
No, they are not buggy. Not in the slightest. OK, they might be considered "undefined" if you follow the C standard to the letter, because the C standard was written for some kind of non-existent generic calculating device without making any assumptions about the hardware, but if you know what actually happens inside the processor, it makes perfect sense to check the sign of a calculation to detect overflows. People have been doing that for ages, it's extremely efficient. It's what you would write in machine code, too.
The only thing that makes it buggy, is the way the compiler tries to "optimize" things by making invalid assumptions.
Signed arithmetic in C is not defined as modular (unsigned arithmetic is). Signed integer overflows are "undefined", leading to unintended compiler optimizations where it assumes overflows just can't happen.
If your code relies on undefined behaviour, then it's broken. A compiler is entirely free to do whatever it wants in the cases where the behaviour is undefined.
I think this is taking things a bit too far. "Undefined behaviour" was normally supposed to mean "this might give unexpected results on certain systems due to reasons beyond our control", not "we may decide to let pigs fly across the screen just for fun".
It has now become impossible to use certain features of the processor, very basic things like two's complement arithmetic to check for overflows etcetera, just because the compilers have decided that we shouldn't rely on our knowledge of the fact that a processor uses binary numbers.
You gave some examples of actually broken code, but in my opinion there's nothing wrong with checking "if (buf + len < buf)". That should just work as intended, or at least give a compiler warning. Instead, the compiler just takes it out without warning because it thinks it knows better than us.
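For plain signed integers, the check can be written so that it runs before the addition instead of inspecting the sign afterwards. A minimal sketch of that idea; the helper name add_would_overflow is hypothetical:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Returns true if a + b would not fit in an int.
   Only subtractions that cannot overflow are performed. */
static bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* INT_MAX - b is in range because b > 0 */
    if (b < 0)
        return a < INT_MIN - b;   /* INT_MIN - b is in range because b < 0 */
    return false;                 /* adding zero never overflows */
}
```

This is more typing than checking the sign of the result, but the compiler has no undefined behaviour to exploit, so the test stays in the generated code. GCC and Clang also accept -fwrapv, which makes signed overflow wrap, and provide __builtin_add_overflow for the same purpose; both are compiler extensions rather than standard C.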
I really recommend reading the actual paper, it was an eye-opener for me. I now have to go check all my source code because the paper gave examples of things I use on a regular basis and never thought twice about. Apparently I know too much for my own good about how binary numbers work. What's worse, I can't believe the compiler doesn't even warn about these things! Very often, there's nothing really wrong with the code except for the fact that it is based on a certain knowledge about how computers work, basic things like two's complement, knowledge which programmers are apparently not supposed to have.
For a very basic simplified example:
unsigned int a = ...
int b = a;
if (b < 0) // a was too large for a signed int
{
    // the "if" and the whole code block behind it is "optimized" away because b supposedly can never be negative. Huh?!!
}
Of course in this case you could have just checked a instead of b, but in more complicated cases this can really become a problem. All kinds of overflow checking (for example "buf + len < buf" with positive len, intended to catch overflows with very large values of len in a buffer overflow exploit) are suddenly optimized away while they were actually important parts of the security of the application!
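Checking a instead of b could look roughly like this. It is only a sketch, and the helper name to_int_checked is invented for the example:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Stores a in *out and returns true only when a is representable as an int.
   The range test happens on the unsigned value, before any conversion. */
static bool to_int_checked(unsigned int a, int *out)
{
    if (a > (unsigned int)INT_MAX)
        return false;             /* too large for a signed int */
    *out = (int)a;                /* value is in range, conversion is exact */
    return true;
}
```

The comparison is done entirely in unsigned arithmetic, so it does not depend on what the out-of-range conversion would have produced.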
Other examples include:
if (abs(x) < 0)
abs(x) CAN actually be negative if x equals -2^(n-1) due to integer overflow, but the compiler assumes it is always positive and therefore discards the check that was INTENDED to catch this case. And without any warning whatsoever. Unbelievable.
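A version of that check that does not depend on the overflow happening is to test for the one problematic input before calling abs. A minimal sketch, with abs_checked as a made-up helper name and assuming a two's complement int:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stores |x| in *out and returns true, or returns false for INT_MIN,
   whose absolute value does not fit in an int on two's complement. */
static bool abs_checked(int x, int *out)
{
    if (x == INT_MIN)
        return false;             /* abs(INT_MIN) would overflow */
    *out = abs(x);
    return true;
}
```

The test is on x itself, so there is nothing for the compiler to assume away.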
Quite amazing that this doesn't trigger a compiler warning! I can understand that the compiler would "optimize away" code that it considers to be unreachable, undefined, or extraneous, but how hard would it be to let it give a warning so that the programmer can say "hey, wait a minute, I wrote that code for a reason!".
Just a quick Google search, but there's plenty of others:
http://www.scmagazine.com.au/News/276780,security-researcher-threatened-with-vulnerability-repair-bill.aspx
http://nakedsecurity.sophos.com/2012/02/20/jail-facebook-ethical-hacker/
Most of the time, when somebody discloses a vulnerability like that in a responsible way, the result is a bunch of angry letters from lawyers accusing the reporter of hacking into the system, demanding damages to be paid, etcetera.
Apparently that didn't happen in this case, so this really is a news story!
Fortunately I don't use FaceBook, so I can still like whatever I want.
She probably doesn't like being called his girlfriend either :-)
Comedy capers would be quite suitable as well :-)
That's really weird. I would think they'd use their own power first, and use the grid as a backup? And then generators after that? That would make a lot more sense imho. The way it is now, if the grid fails, they have to resort to generators?!
Another thing I never understood is those "spent" fuel pools. They put these rods into a big swimming pool and then have to cool it constantly to keep the water from boiling. Errr, and why exactly aren't you using those to create power somehow? These things can boil a swimming pool, and instead of using that to create energy, you're spending energy to cool them? At the very least, it should be able to power its own cooling? Hot water drives turbines, turbines feed pumps, put in plenty of redundant ones so half of them can fail, perpetuum mobile until the rods are "really" spent.
Maybe it's better that way?
Actually, I was not talking about expert programmers (which you can still find at a handful of companies like Apple and Google), but about "programmers" who just throw a bunch of libraries together. A lot of software and hardware companies would indeed hire Jeremy Clarkson to design a kitchen blender. Unfortunately.
(four books right there - one for each of the founding members)
Be careful you don't start a new religion that way, it's happened before...
Buoyancy apparently has several contradictory meanings. "Buoyant" can mean "able to float" but also "able to cause things to float". Therefore, denser fluids are both more AND less buoyant. Does that help? ;-)
And that's exactly why my DVD player takes 30 seconds to start up while a 30 year old VCR would start playing pretty much right away. Just throw some libraries together that do what you want to do, including the libraries required by those libraries, and finally including an entire Linux or other OS because, well, otherwise you can't use those libraries. I wouldn't be surprised if the thing had a hypervisor and multiple OSes too.
Programmers nowadays just bandaid lots of existing stuff together without knowing how it all works. The result is bloated software that requires gigabytes of memory while the same stuff used to work just fine on a 64k 8-bit computer.
Can you remember how much stuff used to get done by applications using 64k of memory? And how much memory those same applications would gobble up if they were written today? That's the price of throwing libraries together instead of actually programming.
That doesn't mean you actually have to write everything from scratch, you certainly should try to reuse existing code and libraries wherever possible, but you should also know how they work, what they do, and whether or not they are overkill for what you're trying to do.
If you ask today's "programmers" to provide an engine for a kitchen blender, they'll use a V8 truck (not the engine of the truck, but the entire truck) and connect the blender to the wheels of the truck somehow. Being able to take the engine out of the truck would be a big improvement already.
Very true. At the very least, programmers should know how a processor works, how memory (including virtual memory) works, etcetera. I find it disturbing that this is not even mentioned. They go straight to the high level stuff, which is important too, but intimate knowledge of the internals of a (generic) computer may keep you from producing bloated, slow code. Look at disassembled output from a compiler, step through the assembly code using a debugger and see what's the cost of high level constructs. Learn to choose where abstraction makes sense, and where it just slows things down without a real benefit.
Code some simple algorithms from scratch and try to do everything you can to make them faster. I'm not saying you should actually write your own code for everything when you're building actual applications, by all means use the Standard Library, but for training purposes there's no better way to learn than by actually doing some of this low level stuff yourself.
Architecture is important, sure, and you don't actually have to write much low level code anymore in most situations, but you should know that it's there, how it works, and how efficient or inefficient it is.