Eric Raymond Shares 'Code Archaeology' Tips, Urges Bug-Hunts in Ancient Code (itprotoday.com)
Open source guru Eric Raymond warned about the possibility of security bugs in critical code which can now date back more than two decades -- in a talk titled "Rescuing Ancient Code" at last week's SouthEast Linux Fest in North Carolina. In a new interview with ITPro Today, Raymond offered this advice on the increasingly important art of "code archaeology".
"Apply code validators as much as you can," he said. "Static analysis, dynamic analysis, if you're working in Python use Pylint, because every bug you find with those tools is a bug that you're not going to have to bleed through your own eyeballs to find... It's a good thing when you have a legacy code base to occasionally unleash somebody on it with a decent sense of architecture and say, 'Here's some money and some time; refactor it until it's clean.' Looks like a waste of money until you run into major systemic problems later because the code base got too crufty. You want to head that off...."
"Documentation is important," he added, "applying all the validators you can is important, paying attention to architecture, paying attention to what's clean is important, because dirty code attracts defects. Code that's difficult to read, difficult to understand, that's where the bugs are going to come out of apparent nowhere and mug you."
For a final word of advice, Raymond suggested that it might be time to consider moving away from some legacy programming languages as well. "I've been a C programmer for 35 years and have written C++, though I don't like it very much," he said. "One of the things I think is happening right now is the dominance of that pair of languages is coming to an end. It's time to start looking beyond those languages for systems programming. The reason is we've reached a project scale, we've reached a typical volume of code, at which the defect rates from the kind of manual memory management that you have to do in those languages are simply unacceptable anymore... think it's time for working programmers and project managers to start thinking about, how about if we not do this in C and not incur those crazy downstream error rates."
Raymond says he prefers Go for his alternative to C, complaining that Rust has a high entry barrier, partly because "the Rust people have not gotten their act together about a standard library."
Eric suggests it's time to move on from C, and there are indeed better languages today that can help eliminate many classes of error.
Crystal is a rising programming language with the slogan "Fast as C, Slick as Ruby". It has some compelling features that make it more attractive than other modern language attempts like Go. You really can program in a Ruby-like language and achieve software that performs with the speed of a compiled language. And you can do systems programming in Crystal, too, because while it doesn't encourage you to use them for anything but systems programming and inter-language interfaces, it has pointers, and it can format structs as required to work on hardware registers.
But the greatest advantage of Crystal, that I have experienced so far, is that it provides type-safety without excessive declarations as you would see in Java. It does this through program-wide type inference. So, if you write a function like this:
def add(a, b)
  a + b
end
add(1, 2) # => 3, and the returned type is Int32
add(1.0, 2) # => 3.0, and the returned type is Float64
You get type-safe duck-typing at compile-time. If a method isn't available in a type, you'll find out at compile-time. Similarly, the type of a variable can be inferred from what you assign to it, and does not have to be declared.
Now, let's say you never want to see nil as a variable value. If you declare the type of a variable, the compiler will complain at compile-time if anything tries to assign another type to it. So, this catches all of those problems you might have in Ruby or JavaScript with nil popping up unexpectedly as a value and your code breaking in production because nil doesn't have the methods you expect.
There are union types. So, if you want to see nil, you can declare your variable this way:
a : String | Nil
a : String? # Shorthand for the above.
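As a rough sketch of how this plays out in practice (the method name shout is invented here for illustration, it is not from any library): when a parameter is declared String?, the compiler should not let you call String methods on it until you have ruled out nil.

```crystal
# Hypothetical helper, made up for this example.
# name may be a String or nil.
def shout(name : String?) : String
  if name
    # Inside this branch the compiler has narrowed name to String.
    name.upcase
  else
    "(nobody)"
  end
end

puts shout("crystal") # prints CRYSTAL
puts shout(nil)       # prints (nobody)
```

If you delete the if and call name.upcase directly, the program should be rejected at compile time, because Nil has no upcase method.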
Crystal handles metaprogramming in several ways. Type inference and duck typing give functions and class methods parameterized types for free, without any declaration overhead. Then there are generics, which allow you to declare a class with parameterized types. And there is an extremely powerful macro system. The macro system gives access to AST nodes in the compiler, type inference, and a very rich set of operators. You can call shell commands at compile-time and incorporate their output into macros. Most of the methods of String are duplicated for macros, so you can do arbitrary textual transformations.
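To make two of those mechanisms concrete, here is a small sketch of a generic class and a macro. The names Box, define_getter and answer are invented for this example; only the class Name(T) and macro syntax belong to the language.

```crystal
# A generic class: T is filled in from the argument passed to new.
class Box(T)
  def initialize(@value : T)
  end

  def value : T
    @value
  end
end

# A macro that writes a method for us at compile time.
# name and value arrive as AST nodes, not as run-time values.
macro define_getter(name, value)
  def {{name.id}}
    {{value}}
  end
end

# Expands to: def answer; 42; end
define_getter answer, 42

puts Box.new(1).value    # prints 1  (a Box(Int32))
puts Box.new("hi").value # prints hi (a Box(String))
puts answer              # prints 42
```

The macro runs entirely in the compiler, so the generated answer method is type-checked like any hand-written one.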
There is an excellent interface to cross-language calls, so you can incorporate C code, etc. There are pointers and structs, so systems programming (like device drivers) is possible. Pointers and cross-language calls are "unsafe" (can cause segmentation faults, buffer overflows, etc.) but most programmers would never go there.
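For a flavor of what a cross-language call looks like, here is a minimal sketch that binds the C math library's cos function. The lib name C is arbitrary, and the Link annotation assumes a Unix-like system where the math library is available as libm.

```crystal
# Declare the C function we want to call.
# Assumption: a Unix-like system with libm available to the linker.
@[Link("m")]
lib C
  # In C: double cos(double x)
  fun cos(value : Float64) : Float64
end

puts C.cos(0.0) # prints 1.0
```

The compiler trusts the declared signature. If it does not match the real C function, you are in "unsafe" territory and nothing will catch the mistake for you.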
What have I missed so far? Run-time debugging is in a very primitive state. The developers complain that LLVM and LLDB have changed their debugging data format several times recently. There's no const and no frozen objects. The developers correctly point out that const is propagated through all of your code and doesn't often result in code optimization. I actually like it from an error-catching perspective, and to store some constant data in a way that's easily shareable across multiple threads. But Crystal already stores strings and some other data this way. And these are small issues compared to the benefits of the language.
Lucky
Paul Smith of Thoughtbot (a company well-known for their Ruby on Rails expertise) is creating the Lucky web framework, written in Crystal and inspired by Rails, which has pervasive type-safety - and without the declaration overhead as in Java.
The point of all of this is that you can create a web application as you might using Ruby on Rails, but you won't have to spend as much time writing tests, because some of the
Bruce Perens.
Personally I think that C++ contains a lot of the bad parts from C and Java while not really offering any major advantage.
In any case - Valgrind and Splint are great for C programs, but for kernel work it's a bit hard to use Valgrind.
When coding Java I have had great experience using Findbugs. For C# I haven't seen any tool as good as that tool.
As a rule, never ignore compiler warnings; they may be the tip of an iceberg. I have found a lot of nasty bugs and bad code that way.
Also beware of re-using variables. I have seen that this is very easy to do in VB: a variable is re-used and suddenly contains a new data type. That's really nasty. Some scripting languages allow that as well.
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
When I coded C for MS-DOS I had to make sure that I called malloc and free in the right order just to avoid memory leaks. If I did one malloc for A and then one for B, I had to free B before A or there would be trouble.
Ian was and continues to be very admired for his achievements, and his death was unnecessary and completely undignified, and is a continuing source of disquiet for me personally. Ian is a victim of mental illness. This is acknowledged by his family and by those who knew him more closely, rather than simply admiring him from afar. Rather than dishonor Ian by discussing this in detail, I would prefer to simply state that he was a victim of mental illness, not the police.
Bruce Perens.
Eric Raymond is not well renowned for his programming/engineering achievements, but for being a public speaker. What is the value of his advice?
Avantgarde Hebrew science fiction
It is heart-breaking that he died without a friend left in the world, but that was a consequence of his illness.
I am 60 and my death isn't all that far away any longer. I am fortunate to have friends and a wonderful family, and hope to die in peace, with them around me.
Bruce Perens.