Eric Raymond Shares 'Code Archaeology' Tips, Urges Bug-Hunts in Ancient Code (itprotoday.com)
Open source guru Eric Raymond warned about the possibility of security bugs in critical code which can now date back more than two decades -- in a talk titled "Rescuing Ancient Code" at last week's SouthEast Linux Fest in North Carolina. In a new interview with ITPro Today, Raymond offered this advice on the increasingly important art of "code archaeology".
"Apply code validators as much as you can," he said. "Static analysis, dynamic analysis, if you're working in Python use Pylint, because every bug you find with those tools is a bug that you're not going to have to bleed through your own eyeballs to find... It's a good thing when you have a legacy code base to occasionally unleash somebody on it with a decent sense of architecture and say, 'Here's some money and some time; refactor it until it's clean.' Looks like a waste of money until you run into major systemic problems later because the code base got too crufty. You want to head that off...."
"Documentation is important," he added, "applying all the validators you can is important, paying attention to architecture, paying attention to what's clean is important, because dirty code attracts defects. Code that's difficult to read, difficult to understand, that's where the bugs are going to come out of apparent nowhere and mug you."
For a final word of advice, Raymond suggested that it might be time to consider moving away from some legacy programming languages as well. "I've been a C programmer for 35 years and have written C++, though I don't like it very much," he said. "One of the things I think is happening right now is the dominance of that pair of languages is coming to an end. It's time to start looking beyond those languages for systems programming. The reason is we've reached a project scale, we've reached a typical volume of code, at which the defect rates from the kind of manual memory management that you have to do in those languages are simply unacceptable anymore... think it's time for working programmers and project managers to start thinking about, how about if we not do this in C and not incur those crazy downstream error rates."
Raymond says he prefers Go for his alternative to C, complaining that Rust has a high entry barrier, partly because "the Rust people have not gotten their act together about a standard library."
Well, that reminds me of this week's Rust bootstrapping experience, which I went through only to build Firefox: https://www.youtube.com/watch?...
It's spelled "Murdock", btw: https://en.wikipedia.org/wiki/...
I agree, I just plow ahead and figure someone will fix everything for me at some point. We have milestones that need to be hit.
Eric suggests it's time to move on from C, and there are indeed better languages today that can help eliminate many classes of error.
Crystal is a rising programming language with the slogan "Fast as C, Slick as Ruby". It has some compelling features that make it more attractive than other modern language attempts like Go. You really can program in a Ruby-like language and achieve software that performs with the speed of a compiled language. And you can do systems programming in Crystal, too, because while it doesn't encourage you to use them for anything but systems programming and inter-language interfaces, it has pointers, and it can format structs as required to work on hardware registers.
But the greatest advantage of Crystal, that I have experienced so far, is that it provides type-safety without excessive declarations as you would see in Java. It does this through program-wide type inference. So, if you write a function like this:
def add(a, b)
  a + b
end
add(1, 2) # => 3, and the returned type is Int32
add(1.0, 2) # => 3.0, and the returned type is Float64
You get type-safe duck-typing at compile-time. If a method isn't available in a type, you'll find out at compile-time. Similarly, the type of a variable can be inferred from what you assign to it, and does not have to be declared.
Now, let's say you never want to see nil as a variable value. If you declare the type of a variable, the compiler will complain at compile-time if anything tries to assign another type to it. So, this catches all of those problems you might have in Ruby or Javascript with nil popping up unexpectedly as a value and your code breaking in production because nil doesn't have the methods you expect.
There are union types. So, if you want to see nil, you can declare your variable this way:
a : String | Nil
a : String? # Shorthand for the above.
Crystal handles metaprogramming in several ways. Type inference and duck typing give functions and class methods parameterized types for free, without any declaration overhead. Then there are generics, which allow you to declare a class with parameterized types. And there is an extremely powerful macro system. The macro system gives access to AST nodes in the compiler, type inference, and a very rich set of operators. You can call shell commands at compile-time and incorporate their output into macros. Most of the methods of String are duplicated for macros, so you can do arbitrary textual transformations.
There is an excellent interface to cross-language calls, so you can incorporate C code, etc. There are pointers and structs, so systems programming (like device drivers) is possible. Pointers and cross-language calls are "unsafe" (can cause segmentation faults, buffer overflows, etc.) but most programmers would never go there.
What have I missed so far? Run-time debugging is at a very primitive state. The developers complain that LLVM and LLDB have changed their debugging data format several times recently. There's no const and no frozen objects. The developers correctly point out that const is propagated through all of your code and doesn't often result in code optimization. I actually like it from an error-catching perspective, and to store some constant data in a way that's easily shareable across multiple threads. But Crystal already stores strings and some other data this way. And these are small issues compared to the benefits of the language.
Lucky
Paul Smith of Thoughtbot (a company well-known for their Ruby on Rails expertise) is creating the Lucky web framework, written in Crystal and inspired by Rails, which has pervasive type-safety - and without the declaration overhead as in Java.
The point of all of this is that you can create a web application as you might using Ruby on Rails, but you won't have to spend as much time writing tests, because some of the
Bruce Perens.
Personally I think that C++ contains a lot of the bad parts from C and Java while not really offering any major advantage.
In any case - Valgrind and Splint are great for C programs, but for kernel work it's a bit hard to use Valgrind.
When coding Java I have had great experience using Findbugs. For C# I haven't seen any tool as good as that tool.
As a rule - never ignore compiler warnings, they may be the tip of an iceberg problem. I have found a lot of naughty bugs and coding that way.
Also beware of re-using variables, something that I have seen is very easy in VB - a variable is re-used and suddenly contains a new data type. That's really nasty. And some script languages allow that as well.
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
When I coded C for MS-DOS I had to make sure that I did malloc/free in the right order just to avoid memory leaks. So if I did one malloc for A then one for B the result was that I had to free B before A or I would have trouble coming.
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
Another language du jour article. I remember in the early days of UNIX one of the sayings about C was something like "it expects the programmer to know what they are doing, it is not a hold-your-hand language".
Well I guess we should go back to COBOL or FORTRAN then
Ian was and continues to be very admired for his achievements, and his death was unnecessary and completely undignified, and is a continuing source of disquiet for me personally. Ian is a victim of mental illness. This is acknowledged by his family and by those who knew him more closely, rather than simply admiring him from afar. Rather than dishonor Ian by discussing this in detail, I would prefer to simply state that he was a victim of mental illness, not the police.
Bruce Perens.
You're confusing memory fragmentation with memory leaks.
Every end has half a stick.
Do you want us to say the same things about you when the police decide it's your turn to pay the piper?
When the police pump you with 30 holes, we will say "it wasn't the police officers fault for shooting Bruce 30 times, it was mental illness."
You are a very smart man, but sometimes you make me sick. It's like you only stand up for things that won't get you in trouble. You tiptoe around, when you should be bringing these issues to light. You use your fame to hide issues, choosing instead to derail conversations.
How come nobody wanted to help Ian before all this happened? How come we never heard about Ian being mentally ill before all of this? Now all of a sudden he's mentally ill and the cops were justified in shooting an unarmed man? I'm not blaming you or anyone, I'm just saying, if he was that mentally ill, then his friends should have tried to get him help, or at least have him committed for his own safety. Yet you guys did nothing. I'm confused.
Christ, the world just keeps getting shittier, and when our heroes are now a part of that lie, we have nowhere to turn.
I'm not even going to get started on the whole open source debacle we've already gone over on slashdot.
I'll leave you with this famous quote:
Torpedo: Little bit of advice, Bob. If you want a role model, choose an old guy. By the time you're grown up they're dead.
Eric Raymond is not well renowned for his programming/engineering achievements, but for being a public speaker. What is the value of his advice?
Avantgarde Hebrew science fiction
C++ also provides easy reference counting and assurance of pointer uniqueness. I agree that one can do the same with very well ordered C, but C++ automates the process for you, and that's one of the things that programming languages are about: automating simple thought processes for the programmer.
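As a rough illustration of the two facilities mentioned here, the sketch below assumes the poster means the standard library's std::shared_ptr (reference counting) and std::unique_ptr (unique ownership); the comment does not name them. Buffer, owners_after_copy, and take are hypothetical names made up for the example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical payload type, used only for illustration.
struct Buffer {
    int size;
    explicit Buffer(int n) : size(n) {}
};

// Reference counting: copying a shared_ptr increments a shared count,
// and the Buffer is destroyed when the last owner goes away.
inline long owners_after_copy() {
    auto first = std::make_shared<Buffer>(16);  // one owner
    auto second = first;                        // copy: two owners
    return second.use_count();
}

// Unique ownership: a unique_ptr cannot be copied, only moved.
// After the move, `source` holds nullptr and the caller owns the Buffer.
inline std::unique_ptr<Buffer> take(std::unique_ptr<Buffer>& source) {
    return std::move(source);
}
```

Trying to copy a std::unique_ptr instead of moving it is a compile-time error, which is the sense in which the language, rather than programmer discipline, guards uniqueness.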
Avantgarde Hebrew science fiction
It is heart-breaking that he died without a friend left in the world, but that was a consequence of his illness.
I am 60 and my death isn't all that far away any longer. I am fortunate to have friends and a wonderful family, and hope to die in peace, with them around me.
Bruce Perens.
If you really can't get this out of your head, I'll discuss it on the phone with you, and expect you not to publish the information. It wouldn't take long, and would only make you more sad.
Bruce Perens.
We should be thankful he didn't suicide anyone else.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Not really garbage collection. It's a "good enough" collector. A real garbage collecting language has zero allocation routines and zero freeing routines, and will reclaim every single object eventually, and do this faster than manual alloc/free. Not a lot of common languages do this these days, because it requires deep hooks into the OS and CPU. Whereas modern scripting languages often find it faster to get going by building the framework on top of device independent languages like C or C++.
I used C++ for a long time, starting with Cfront, though the last decade has been C/assembler. C++ seems to have lost focus and the later standards seem somewhat strange, like they're adding new features that aren't needed except to pad out the new standard. Now it's not so great for a low level systems language unless you use a lot of self discipline to avoid features, and it's completely bloated for big applications if you use the fashionable styles, and scripting languages do so much better at rapid prototyping. So it doesn't feel like it has a niche anymore, except as a place for the gurus to sit around and argue about what should change next.
They serve themselves and protect each other. Was it ever supposed to mean anything else?
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Anyone who considers such fundamental concepts of C++ as useful to "way less than 1% of the code in typical projects" "has not understood the language" either.
Being able to write memory safe code in a memory unsafe language is not enough, you need machine validation that only the memory safe constructs are used.
C and C++ programmers given the freedom to do so will inevitably create exploitable bugs because of memory unsafe programming.
One of the consequences of a bridge falling down can be debris in the river that blocks barge traffic. That does not imply that the bridge voluntarily fell down.
eric raymond is the most over rated "hacker" in history. he has no actual technical achievements to speak of, he's famous for promoting open source, sending a death threat to bruce perens, and for attempting (and massively failing) to write a new build system for the linux kernel.
They could have protected and served them by whacking him across the arse with a truncheon, handcuffing him and dumping him in a cell to sleep it off.
Like they do in civilised countries.
Drunks are shit fighters, even if they think the opposite. If you can't subdue one without artillery you shouldn't be working as a cop.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Who are you talking about? Ian Murdock was not shot!
"we've reached a typical volume of code, at which the defect rates from the kind of manual memory management that you have to do in those languages are simply unacceptable anymore"
Translation: "We kept telling programmers they're too dumb to do their own memory management and rely on high level languages like java which handle memory management automatically and now the programmers coming out of college don't even understand memory management because it's too complex so it's time to move onto something like javascript and lotsa lotsa test frameworks. I don't understand C++ at all."
Memory management was and IS a responsibility of programmers to ensure proper resource usage and optimization and it's, get this, not difficult at all to do right so long as you don't "HACK" your code and free the used memory at the same nexus points as when you malloc them. The same is true of C++ and, get this, you even have special containers which will handle the auto-destruction for you! (gasp!) Language mechanics that have been in use for over 15 years in C++ (and no, I'm not talking about the defunct smart_ptr)
Good God man - getting rid of memory management so you can "concentrate on the problem at hand" and write 2 dozen objects to translate a dataset using factories and facade class patterns so you can add other translations in the future when THERE'S ONLY THE ONE TRANSLATION isn't a BETTER WAY!
Personally I think that C++ contains a lot of the bad parts from C and Java while not really offering any major advantage.
I disagree for a few reasons. Firstly, Java? What? C++ predates Java.
I've also not had a memory leak in C++ in probably 15 years. Either I'm the awesomest programmer ever or C++ offers some pretty big advantages. RAII is fantastic for resource management. Generics make equivalent code faster and simpler than the C equivalent.
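A minimal sketch of what RAII looks like in practice, under the assumption that a logging wrapper is a fair stand-in for a real resource such as memory or a file handle. Tracked and run_scope are hypothetical names, not anything from the post.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource wrapper: records acquire/release events in a log
// so that the cleanup order can be observed.
class Tracked {
public:
    Tracked(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {
        log_.push_back("acquire " + name_);
    }
    // The destructor is the release step; the compiler calls it automatically.
    ~Tracked() { log_.push_back("release " + name_); }

    // Non-copyable, so the resource cannot be released twice by accident.
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;

private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Acquires two resources in an inner scope. Both are released when the
// scope ends, in reverse order of acquisition, with no explicit free call.
inline std::vector<std::string> run_scope() {
    std::vector<std::string> log;
    {
        Tracked a("A", log);
        Tracked b("B", log);
    }
    return log;
}
```

The reverse-order release is the same "free B before A" discipline described earlier in the thread for malloc/free, except that the compiler enforces it, including on early returns and exceptions.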
SJW n. One who posts facts.
C++ seems to have lost focus and the later standards seem somewhat strange like they're adding new features that aren't needed except to pad out the new standard.
Like what? I mean, technically no feature is "needed", in that the language is Turing complete and was pretty functional circa 1998, but I'd say the latest standards have added a fair bit of good. C++14 and 17 have both been fairly minor additions, but have given some nice refinements to the rules and some much needed additions to the library.
A shame concepts never made it into 17 (roll on 2a!). They've been on the cards since the early 90s.
Now it's not so great for a low level systems language
Yes it is. The new features haven't modified anything in that regard.
unless you use a lot of self discipline to avoid features,
Like what? You could argue exceptions, though unless you're writing hard realtime code or some parts of kernel code, I'd disagree.
and it's completely bloated for big applications if you use the fashionable styles,
No it isn't. There's no evidence for this.
SJW n. One who posts facts.
It's odd that he talks about using tools to validate your code on one hand and then recommends moving away from C or C++ on the other.
There's actually some pretty fantastic work on sanitizers being done right now in Clang (and other tooling chains) that can enforce memory and type safety at run time.
You can do all your development with the sanitizers turned on, and then when you want speed when you're ready to release, turn them off.
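A hedged sketch of that workflow: the function below is correct for valid input but reads out of bounds if the caller passes too large a count, which is the kind of mistake AddressSanitizer is designed to report at run time. sum_first and the file name example.cpp are hypothetical; -fsanitize=address and -fsanitize=undefined are existing Clang and GCC options.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums the first `count` elements of `values`.
// Correct only when count <= values.size(). A larger count reads past the
// end of the heap buffer, which is undefined behavior that an ordinary
// build may silently tolerate.
inline int sum_first(const std::vector<int>& values, std::size_t count) {
    int total = 0;
    const int* data = values.data();
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}

// Development build, sanitizers on (Clang or GCC):
//   c++ -std=c++17 -g -fsanitize=address,undefined example.cpp
// Release build, sanitizers off:
//   c++ -std=c++17 -O3 example.cpp
```

In a sanitized build, a call such as sum_first(v, v.size() + 1) would typically abort with a heap-buffer-overflow report pointing at the offending read. The sanitizers only catch what the test runs actually execute, so they complement static checking and do not replace it.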
There's still nothing faster than C or C++ other than assembly, and even then you have to be reasonably skilled to beat -O3 these days.
Also beware of re-using variables, something that I have seen is very easy in VB - a variable is re-used and suddenly contains a new data type
Why in the hell are you using a typeless variable in VB?
"When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
No I don't, but I have seen code where it existed. Facepalm time there.
Unfortunately we can't have Ada everywhere. But for VB, I blame Microsoft for creating a shitty language that didn't force users from the beginning to be strict.
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
Can you please point out some of those languages, those "real garbage collecting language has zero allocation routines and zero freeing routines, and will reclaim every single object eventually, and do this faster than manual alloc/free"?
Smalltalk, Lisp, ML, etc.
I'm not a C++ programmer, but I'm genuinely curious. What idioms would those be?
Was that really necessary?
Il n'y a pas de Planet B.
Documentation is important.....Code that's difficult to read, difficult to understand
"Real programmers don't comment their code. If it was hard to write, it should be hard to understand." - some coder who doesn't work here anymore.