OpenSSL Cleanup: Hundreds of Commits In a Week
New submitter CrAlt (3208) writes with this news snipped from BSD news stalwart undeadly.org: "After the news of heartbleed broke early last week, the OpenBSD team dove in and started axing it up into shape. Leading this effort are Ted Unangst (tedu@) and Miod Vallat (miod@), who are head-to-head on a pure commit count basis with both having around 50 commits in this part of the tree in the week since Ted's first commit in this area. They are followed closely by Joel Sing (jsing@) who is systematically going through every nook and cranny and applying some basic KNF. Next in line are Theo de Raadt (deraadt@) and Bob Beck (beck@) who've been both doing a lot of cleanup, ripping out weird layers of abstraction for standard system or library calls. ... All combined, there've been over 250 commits cleaning up OpenSSL. In one week.'"
You can check out the stats, in progress.
Well, I would think that this is mostly to do with publicity. Once someone calls your software into question in a very public light, you will be more willing to go through your project with a fine toothed comb and clean up all that old cruft you've been meaning to clear out.
This is not a sign of inherent insecurity, but one of obvious house cleaning.
"Freedom in the USA is not the ability to do what you want. It is the ability to stop others from doing what THEY want"
Because "over 250 commits" in a week by a third party to a mature security product suggests either a superhuman level of understanding or that they're going to miss a lot of the implications of their changes.
From what I understood earlier, this will be a fork of the official OpenSSL release, or will all these patches be incorporated in "generic" OpenSSL and not just the OpenBSD implementation?
I was promised a flying car. Where is my flying car?
Right now, I think the team is mostly focused on having "something usable" in OpenBSD and I doubt they care too much about anything else outside their scope.
Having said that - forking OpenSSL to something usable and burning the remains with fire is a great idea, however there is considerable risk that the rush will cause new bugs - even though right now those commits have been mostly pulling out old crap.
Fixing the beast is going to take a long while and several things will need to happen:
- Upstream hurry to put more crap into the RFC needs to cease for a while. We don't need more features at the moment, we need stability and security.
- Funding. The project needs to be funded somehow. I think a model similar to Linux Foundation might work - as long as they find a suitable project leads. But major players need to agree on this - and that's easier said than done (who will even pull them to the table?)
- Project team. Together with funding, we need a stable project team. Writing good crypto code in C, is bloody hard, so the team needs to be on the ball - all the time. And the modus operandi should be "refuse features, increase quality". Requires a strong Project Lead.
- Patience.. fixing it is a long process, so you can't go into it hastily. You need to start somewhere (and here I applaud the OpenBSD team), but to get it done, assuming that above is in place - expect 1-3 years of effort.
The article doesn't make it completely clear that this doesn't have much to do with the fixing problems in OpenSSL.
Commits to the true OpenSSL source can be seen through the web interface at https://github.com/openssl/ope.... What the article is talking about is tidying up the version that is built in to OpenBSD. Not that that isn't worthwhile work, but it's unlikely to fix many hidden problems in OpenSSL itself, unless the OpenBSD devs find something and hand it back to the upstream.
OpenSSL is on the list of projects scanned by Coverity.
I wonder why exactly Coverity did not catch the heartbleed bug. Most likely, the scan wasn't set up to deal with OpenSSL's use of it's own internal heap management routines. That's something that I would have thought should be fixed right away.
I cant talk for C, but in Java the tools which warn you about potentially dangerous constructs are great (e.g. Sonar). You can easily identify many *suspicous* contructs and change them to something more safe. 250 commits per week with 4 devs on a moderatly sized project do not see much to me, much at the "quality" and not the "quantity" side.
What annoys me is that - with all due respect - the companies which embed openssl in their products could have done a review of the code for quality. To me it seems that it's a fundamental library.
And if there are that many, would a new start not be better?
How about no?
Also I don't see why lots of fixes would necessarily mean poor fixes. They likely do what they feel is obvious fixes / stuff they consider wrong. Or something such. What do I know really.
Possibly they know what they are doing.
It seems you're not familiar with the process of software development. You just don't issue one single commit containing a billion changes. It's a step by step process, for several reasons, most importantly the mechanic and iterative process of searching for bugs.
A Tumblr site popped up a few days ago called OpenSSL Valhalla Rampage. The blogger there is going through all the commits and posting the juicy funny comments there. This includes killing... and rekilling... VMS support (which reminds me of Maxim 37: there is no such thing as overkill...), stripping out now-stupid abstractions and optimizations of the unoptimizables, and more.
--
# Canmephians for a better Linux Kernel
$Stalag99{"URL"}="http://stalag99.net";
Agreed. However fixing obviously brain damaged code is a start.
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
A new start would introduce far more issues, so a major cleanup can be preferable.
Some of this code is 18 years old. There are whole swathes of code that is depreciated. Cleanup and standardisation of layout helps. Removal of things like the VAX, Windows, obsolete compilers.... Already with the KNF and cruft being removed there are things being seen and some commits being made to remove scary things.
Seriously, read some of those commits and you will see that this /will/ help wrangle it into a far more secure state.
The NSA, obviously.
Next question please.
Therefore, by the (faulty) logic you're using, you're just a cow with a keyboard - osu-neko (2604)
just a simple thank you to all the coders out there who donate of their skills and time to produce this and other very important software, for free folks! Thank You for making the world a better place
Most? You've linked one big commit fixing a lot of style problems. Other than that, I see missing checks being added and strange code being fixed all over the place in a lot of commits. Aren't you twisting the reality just a little bit?
Ezekiel 23:20
Quantity doesn't equal quality
See also:
So far as all the "won't this introduce more bugs than it fixes" comments go, this is a recurring argument I have at work. I am of the "clean as you go", "refactor now" school.
Everyone else says "If it works don't fix it"(IIWDFI), "don't rock the boat" etc.
Heartbleed is what happens when the IIWDFI attitude wins. Bugs lurk under layers of cruft, simple changes become nightmares of wading through a lava flow of wrappers around hacks around bodges.
Whenever anyone says IIWDFI, remind them that testing can only find a small proportion of possible bugs, so if you can't see whether it has bugs or not by reading the code, then no matter how many test cases it passes, it DOESN'T WORK.
With all the other tripe on this thread, I thought it necessary to say this loud and clear:
Hey OpenSSL Contributors - thanks for your hard work on OpenSSL, and thanks for the hard work under this spotlight cleaning this up.
Any serious software engineer with a career behind them has worked on projects with great source code, bad source code, and everything in between. It sounds like OpenSSL is a typical project with tons of legacy code where dealing with legacy is lower priority than dealing with the future. Subtracting out all the ideological debate and conspiracy theories, please realize there are plenty of 'less noisy' people out there who appreciate everything you're doing. And even more who would appreciate it if they understood the situation.
Its now time for companies who depend on OpenSSL (and other projects) to realize that Open Source software can lower their development costs, but some of that savings needs to be put back into the process or we will all suffer from "the tragedy of the Commons".
Our modern malady is to look at methods, not histories.
Great software comes from great leadership and good-to-great talent. But mostly, it involves someone having a good idea and following it through.
Sometimes, that's a single programmer (Bill Atkinson). Most commonly, it's a group that needs a leader.
The quality of that leader then determines the quality of the product. But both industry and open source find this idea terrifying.
Industry would prefer to avoid this and promote exchangeable, replaceable cogs to the position of program/project manager. These people tend to be aggressive and thoughtless and produce gunk software.
Open source would prefer to avoid it because the big secret in open source is that people do what they want to do, not what needs doing. This is why products usually have the "fun, interesting parts" done but lag behind in the stuff no one finds thrilling, including finishing the boring parts of the code, debugging, documentation, etc.
Leadership is essential. The difference is that in open source, you can't fire people, so you can't tell them what to do.
Futurist Traditionalism
In a clean-up operation, you don't vet each change, especially when the change is reformatting instead of a real code change. It's clear from the commits I've looked at that the people doing this are working to eliminate the cruft that inevitability builds up in any project as it matures. See http://en.wikipedia.org/wiki/C... -- you take baby steps, and check your work as you go.
In the process of clean-up, of re-factoring, one may find and fix subtle bugs, such as the missing bounds check that is at the heart of, um, heartbleed. Elimination of the custom memory management in favor of the native OS code (particularly when the OS takes pains to clear out free memory -- which would have stopped heartbleed cold on some platforms) decreases the complexity of the code, and -- arguably -- makes it easier to read. Replacing clumsy code with better-crafted code that does the same thing but far more clearly makes it easier to read. Removing out-dated portability hacks removes a lot of chaff so the wheat is easier to see. But you can't do this all in one commit and expect not to stub your toe.
Repeat as necessary
When the clean-up tighten-up passes are complete, and the cruft is mostly gone, the developers then need to comb through the code looking for issues missed in the first pass, and brokenness caused by unfortunate re-factoring. It happens. Also, error patterns will pop up when the code is reformatted to a single standards. Further, because people are looking not just at syntactic content at this point, but semantic content, comments can be added to document any awkward or un-obvious constructs.
An integral part of examining the code is the development of test cases for the scaffolded version of the code. This needs to check for conformance to the RFC, and also explore what happens with intentionally malformed packets. This is the crucial step which appears to have been missing or incomplete with the pre-heartbleed OpenSSL code. Code inspection will catch stuff; test cases will expose those problems that were missed during the inspection. This process assumes that the program is structured in such a way that scaffolding is possible from a library of the code base, so each function, or small set of functions, can be tested in isolation. I find, in my own work, that I typically write several thousand lines of code during development to ensure that each major function works as expected -- and that code lives on as scaffolded test code so that when I make a change I can test it against a known set of test cases. (Sometimes, you have to change the test , or add tests, because the function changes or a bug sneaks through, but then you have a check-and-balance to minimize issues.)
You ask "Who is watching the watchers?" The answer is an easy one: as usually happens, management will work to solve yesterday's problems. I fully expect, when the work is finished, that the security researchers will pounce on the "new and improved OpenSSL" to find the bugs and holes that didn't get fixed during the current round. They struck gold once -- and like most gold bugs, they will mine the same hill until they go broke, just like the 49ers did.
One interesting effect is that Coverity [coverity.com] made improvements in their code-scanning product. (See comments above this one.) The net effect is that all the products scanned by Coverity products will benefit. Which further answers your question: we now have improved computer watchers, too.
I have used another major static analysis tool at work, one of Coverity's competitors. And more than once have had the "if you had paid attention to the static analysis reports this problem could have been solved much more quickly and cheaply" discussion. In one case several weeks were spent chasing a particularly subtle and nasty memory tramper - which was found to be showing up in the analysis results.
False positives are certainly a concern. There is a tradeoff here in terms of dealing with the time spent (re)structuring the program so that they do not occur - a matter for the project lead. The same is true of compiler warnings. Best invest the time to clean them up and configure your build so that it breaks if they occur. You'll kick yourself later if you hit a bug that was revealed by a warning which was ignored.
Mr. Anonymous Coward:
What changes are you referring to? The changes I see are good re-factoring: clean up formatting, remove dead code, add missing bounds checks
Are you volunteering to do the code audit?
Massive rush? Evidence, please.
Security testing clearly hasn't been done before? Evidence, please. The counter-evidence is that the security testing tools were found not to work in this one particular case, and that problem has been patched. Security testing costs money; how much have you donated to the project?
Heartbleed exposes a problem, it doesn't invalidate the concept of Open Source. For one example that made world-wide news, there are hundreds of examples of open-source "wins" that never rose about the journalistic "noice" because it worked properly, and didn't make any waves. The mainstream says "who do we blame, who can we sue?"
And how do you manage projects, particularly open-source projects? Does early disclosure bother me? Yes, as the admin of a group of servers. Would I be comfortable if this were done behind a wall? No. We need all the eyes we can get looking at the problem. And you need to be more explicit: are the bugs that you are complaining about being exposed exploitable in some bad way? Examples, please.
SUMMARY: It's bad, but not as bad as you make it out to be.
Some day, you should learn about automated regression testing, which allows an entire regression suite of hundreds of tests to run after every commit.
Maybe then you'll have something sensible to say.
Not just that, we're discussing changes in the tip of their source repo. Not "released" code by any strech of the imagination. We're basically spectatiors, watching one of those Bob Ross "joy of painting" shows. The picture has yet to emerge. (hmmm.... accidental Gentoo pun...)
I cant talk for C, but in Java
Haha. Oh man. Java is a VM. Do you check for "dangerous constructs" like Just In Time compiling of data into machine code at Runtime and marking data executable then running it by the Java VM? Because that's how it operates. Even just having that capability makes Java less secure, don't even have to get exploit data marked code and executed, just have to get it in memory then jump to the location of the VM code that does it for me with my registers set right. Do any of your Java code checking tools run against the entire API kitchen sink of that massive attack surface you bring with every Java program called the JRE? Do they prevent users from having tens of old deprecated Java runtimes installed at 100MB a pop, since the upgrade feature just left them there and thus are still able to be targeted explicitly by malicious code? No? Didn't think so.
Don't get me wrong, I get what you're saying, Java code can be secure, but you have to run tests against the VM and API code you'll be using too. Java based checking tools produce programs that are just as vulnerable as C code, and actually demonstrably more so when you factor in their exploit area of their operating environment. Put it this way: The C tools (valgrind) already told us that the memory manager was doing weird shit -- It was expected weird shit. No dangerous construct warning would have caught heartbleed, it's a range check error coupled with the fact that they were using custom memory management. The mem-check warnings are there, but they have been explicitly ignored. It would be like the check engine light coming on, but you know the oil pressure is fine, just the sensor is bad... so no matter how bright of a big red warning light you install it can't help you anymore, it's meaningless. Actually, it's a bit worse than that, it would be like someone knew your check engine lights were on because of some kludge they added for SPEED, and so they knew they could get away with pouring gasoline in your radiator because you wouldn't notice anything wrong until it overheated and blew up -- AND you asked them about the check engine light a few times over the past two years, but they just shrugged and said, "Don't worry about it, I haven't looked under the hood lately, but here's a bit of electrical tape if the light annoys you."
I write provably secure code all the time in C, ASM (drivers mostly), even my own languages. CPUs are a finite state machines, and program functions have finite state as well. It's fully possible to write and test code for security that performs as it should for every possible input. For bigger wordsize CPUs, Instead of testing every input, one just needs to test a sufficiently large number of them to exercise all the bit ranges and edge cases. As you've noted, automation is key. If you want to write secure code you have to think like a cracker. My build scripts automatically generate missing unit test and fuzz testing stubs based on the function signatures. Input fuzzing tests are what a security researching hacker or bug exploiting cracker will use first off on any piece of code to test for potential weakness, so if you're not using these tests your code shouldn't touch crypto or security products, it's simply not been tested. Using a bit of doc-comments to add a additional semantics I can auto generate the tests for ranges, and I don't commit code to the shared repos that doesn't have 100% test coverage in my profiler. If OpenSSL was using even just a basic code coverage tool to ensure each branch of #ifdef was compilable they'd have caught this heartbleed bug. I recompiled OpenSSL without the heartbeat option as soon as my news crawler AI caught wind of it.
Code review, chode review. These dumbasses aren't using basic ESSENTIAL testing methodology you'd use for ANY product even mildly secure: Code Coverage + memory checking is the bare minimum for anything that has to do with "credentials". They apparently also have no fuck
hearbeats just exposed their faulty malloc replacement.
Heartbleed had nothing to do with their malloc replacement (at least not directly [*] ).
Heartbleed is just a very basic case of missing nested bound checking. (They check bounds for the heartbeat request it self, but fail to check is the super structure containing the hearthbeet - i.e.: the packet - passes the bound checks too. XKCD's explanation is actually spot-on: it's more or less equivalent to forgetting to check if the requested number of caracter doesn't exceed the size of the speachbubble).
This is not caused, by memory allocation system. This is caused by several factor, among which the fact that heartbeat are a very stupid design to begin with.
(Reasons:
- there are already other better way to keep a connection alive
- totally free payload and payload-size are a bad idea (why not simply use a fixed size 32 or 64 bits ID) ?
- specifying the payload's size is stupid, because there's already a size limit: the packet itself. Now instead there are 2 sizes to check and such redundant work often leads to errors as it happened with heartbleed)
But TLS/SSL are very convoluted, to the point that someone might ask if these standards aren't designed on purpose so someone could fuck them up. It's almost a long series of "exploit-bait" engineered into standards.
---
[*] : instead of concentrating on replacing malloc, they could concentrate on replacing another part, namely designing buffer-types that contain buffer-size and are automatically bound-checked.
So heartbleed has something to do with their in-house memory management, in that they lost the opportunity to bake automatic bound checking into their custom memory manager.
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]