Buffer Overflow in MySQL
maedls.at writes "Here is a short description of the vulnerability: Passwords of MySQL users are stored in the "User" table, part of the "mysql" database, specifically in the "Password" field. In MySQL 4.0.x and 3.23.x, these passwords are hashed and stored as a 16-character hexadecimal value. Unfortunately, a function involved in password checking lacks correct bounds checking. Filling a "Password" field with a value wider than 16 characters causes a buffer overflow. For details and proof of concept see: http://lists.netsys.com/pipermail/full-disclosure/2003-September/009819.html"
The question shows that you haven't done a lot of real world programming, or if you have, you don't understand a lot of the issues. MySQL has been very carefully searched for overflows, but some overflows are very subtle and much harder to find than you might imagine.
It isn't that programmers are lazy (which we are, but that isn't the problem). It is that programmers can't keep perfect track of everything at once, and have to make assumptions. What am I supposed to tell my boss, something like "I can't start on that bug fix until I have read and perfectly comprehend all 1,500,000 lines of code in the product"? No, I have to try to get an overall idea of how things work, and dig into the details as I think necessary. This sometimes means I will miss a detail that is indirectly connected to the work at hand, and therefore make a mistake. The most important (in my opinion) and difficult work being done in computer science is in ways to organize things so that all of the details needed for a single problem are obvious and connected. Thus comes OOP and other programming methodologies that try to keep programs organized and well-structured.
If you read the article, it showed the code at fault. It wasn't just one function, but two. One function "validated" the password. Later, another function worked with the password, assuming (correctly) that it had already been "validated". The problem was that the two functions had different ideas about what it meant to be "validated". If the error had all been within one single function, then this would be almost inexcusable. But since the problem was a coincidence of two less significant flaws, it was much harder to detect. And if some automatic overrun detection tool were to flag the code, the programmer examining the warning would very likely have determined that the tool's warning was incorrect -- "the parameter was validated already before this function call, so the buffer overrun cannot happen."
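To make the "two ideas of validated" point concrete, here is a rough sketch. It is not the MySQL code from the advisory; every name in it is made up for illustration, and I am assuming a validator that only checks for hex digits and a consumer that only cares about length. Each function looks reasonable on its own. The hole is the set of inputs that satisfy one and not the other.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical names throughout; this is not MySQL's actual code.
constexpr std::size_t kHashLen = 16;

// The validator's idea of "valid": non-empty and every character is a hex digit.
bool looks_like_hash(const char* s) {
    if (*s == '\0') return false;
    for (; *s != '\0'; ++s) {
        if (!std::isxdigit(static_cast<unsigned char>(*s))) return false;
    }
    return true;
}

// The consumer's idea of "valid": the value fits in a char[kHashLen + 1] buffer.
bool fits_consumer_buffer(const char* s) {
    return std::strlen(s) <= kHashLen;
}

// An input is dangerous when it passes the validator but breaks the
// consumer's unstated assumption about length.
bool slips_through(const char* s) {
    return looks_like_hash(s) && !fits_consumer_buffer(s);
}
```

A 20-digit hex string makes slips_through() return true, and that is exactly the kind of value a reviewer would wave off with "it was validated already".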
Next, you can't just enter larger values to detect everything. In this case, the database ships with a 16 char limit on the password field. So sending a large value for password wouldn't work -- the value would be truncated when it went into the database. The bug is triggered by three unrelated operations in sequence: you alter the database to allow for larger values, THEN set a large value for password, and THEN flush the table. Automatic tools can't try every possible sequence of input, just a subset.
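Here is a toy model of why the order of operations matters. It is only a sketch: the struct and its methods are invented, and the real server obviously does far more than this. The point is that a long password alone gets truncated and is harmless, and only the alter, set, flush sequence gets an oversized value to the code that assumes 16 characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy model, not MySQL: all names here are hypothetical.
struct ToyUserTable {
    std::size_t column_width = 16;   // the shipped limit on the Password column
    std::string stored_password;

    // Step 1: widen the column (stands in for altering the table).
    void widen_column(std::size_t new_width) { column_width = new_width; }

    // Step 2: store a value; anything past the column width is truncated.
    void set_password(const std::string& value) {
        stored_password = value.substr(0, column_width);
    }

    // Step 3: on flush, the server would copy the stored value into a
    // 16-character buffer. Report whether that copy would overrun.
    bool flush_would_overrun() const { return stored_password.size() > 16; }
};
```

A tool that only calls set_password() with huge strings will never see flush_would_overrun() return true. It has to call widen_column() first, and nothing about the password code hints that it should.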
Aside from simply "getting it right in the first place" (i.e. never making invalid assumptions, which is pretty much impossible), this kind of problem can be avoided by using one of two programming approaches.
The first is "Defense in Depth", which means that a function isn't allowed to assume that a parameter has been validated -- every function must validate every parameter every time it is called. This works, but it has performance penalties (a parameter can be passed around hundreds of times, so now we validate it hundreds of times instead of just once). It is also boring to program the validation code, and therefore likely to be forgotten in some crucial function. Finally, validation is hard to get exactly right, and if the concept of a valid parameter changes, you have to go change it in every place it is validated.
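For what it's worth, the defensive version of a copy routine might look something like this. Again a sketch with made-up names, assuming a 16-character hash and a 17-byte destination buffer. The function checks the length itself instead of trusting whoever called it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical names; assumes a 16-character hash plus a terminating NUL.
constexpr std::size_t kHashLen = 16;

// Copies src into dst only if it fits. On failure, returns false and
// leaves dst holding an empty string.
bool copy_hash_checked(char (&dst)[kHashLen + 1], const char* src) {
    dst[0] = '\0';
    if (src == nullptr) return false;
    const std::size_t len = std::strlen(src);
    if (len > kHashLen) return false;   // re-validate, even if a caller already did
    std::memcpy(dst, src, len + 1);     // len + 1 also copies the terminating NUL
    return true;
}
```

Note the cost: one extra strlen() per call, and one more place to update if the hash length ever changes.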
The second is automatic handling of the situation. Use a string class of some sort, like STL's string, or use a "safe" language like Java or C#. This is better, but again it has costs in performance, and it ignores the problem of interoperating with existing code.
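A small illustration of the string-class route, using std::string from the C++ standard library. The helper names are invented. The string owns its storage and grows as needed, so an oversized value has no fixed buffer to overrun, and at() checks the index and throws std::out_of_range instead of reading past the end.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: keep the hash in a std::string instead of a char array.
// The string allocates whatever it needs, so no length assumption is baked in.
std::string store_hash(const std::string& input) {
    return input;
}

// Bounds-checked read: at() throws std::out_of_range on a bad index,
// which is turned into a '\0' result here.
char char_at_or_zero(const std::string& hash, std::size_t i) {
    try {
        return hash.at(i);
    } catch (const std::out_of_range&) {
        return '\0';
    }
}
```

The price is the allocation and the per-access check, plus the conversion work at every boundary with code that still expects a raw char buffer.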
So the "deeper problem" is that we can never get everything perfect by hand, and the automatic solutions come at a price we often aren't willing to pay. Solution? None at the moment. Perhaps in the future, less code will be written in "unsafe" languages (languages with potential for overflows), so buffer overflows will only be a problem for those who write the compilers and runtimes for those safe languages. But I wouldn't hold my breath -- it will be a while. And when that day finally comes, there will still be plenty of other ways to "root" a machine -- buffer overflows aren't the only way to overcome security measures.