Domain: dwheeler.com
Stories and comments across the archive that link to dwheeler.com.
Stories · 24
-
Debian Working on Reproducible Builds To Make Binaries Trustable
An anonymous reader writes: Debian's Jérémy Bobbio, also known as Lunar, spoke at the Chaos Communication Camp about the distribution's efforts to reassert trustworthiness for open source binaries after it was brought into question by various intelligence agencies. Debian is "working to bring reproducible builds to all of its more than 22,000 software packages," and is pushing the rest of the community to do the same. Lunar said, "The idea is to get reasonable confidence that a given binary was indeed produced by the source. We want anyone to be able to produce identical binaries from a given source (PDF)."
Here is Lunar's overview of how this works: "First you need to get the build to output the same bytes for a given version. But others also must to be able to set up a close enough build environment with similar enough software to perform the build. And for them to set it up, this environment needs to be specified somehow. Finally, you need to think about how rebuilds are performed and how the results are checked." -
How To Prevent the Next Heartbleed
dwheeler (321049) writes "Heartbleed was bad vulnerability in OpenSSL. My article How to Prevent the next Heartbleed explains why so many tools missed it... and what could be done to prevent the next one. Are there other ways to detect these vulnerabilities ahead-of-time? What did I miss?" -
Microsoft Developer Made the Most Changes To Linux 3.0 Code
sfcrazy sends this quote from the H: "The 343 changes made by Microsoft developer K. Y. Srinivasan put him at the top of a list, created by LWN.net, of developers who made the most changes in the current development cycle for Linux 3.0. Along with a number of other 'change sets,' Microsoft provided a total of 361 changes, putting it in seventh place on the list of companies and groups that contributed code to the Linux kernel. By comparison, independent developers provided 1,085 change sets to Linux 3.0, while Red Hat provided 1,000 and Intel 839." -
The Billion Dollar Kernel
jesgar writes "The Linux kernel would cost more than one billion EUR (about 1.4 billion USD) to develop in the European Union. This is the estimate made by researchers from the University of Oviedo (PPT), whereby the value annually added to this product was about 100 million EUR between 2005 and 2007 and 225 million EUR in 2008. The estimated 2008 result is comparable to 4% and 12% of Microsoft's and Google's R&D expenses on whole company products. Cost model 'Intermediate COCOMO81' is used according to parametric estimations by David Wheeler. An average annual base salary for a developer of 31,040 EUR was estimated from the EUROSTAT. Previously, similar works had been done by several authors estimating Red Hat, Debian, and Fedora distributions. The cost estimation is not of itself important, but it is an important means to an end: that commons-based innovation must receive a higher level of official recognition that would set it as an alternative to decision-makers. Ideally, legal and regulatory frameworks must allow companies participating on commons-based R&D to generate intangible assets for their contribution to successful projects. Otherwise, expenses must have an equitable tax treatment as a donation to social welfare." -
The Billion Dollar Kernel
jesgar writes "The Linux kernel would cost more than one billion EUR (about 1.4 billion USD) to develop in the European Union. This is the estimate made by researchers from the University of Oviedo (PPT), whereby the value annually added to this product was about 100 million EUR between 2005 and 2007 and 225 million EUR in 2008. The estimated 2008 result is comparable to 4% and 12% of Microsoft's and Google's R&D expenses on whole company products. Cost model 'Intermediate COCOMO81' is used according to parametric estimations by David Wheeler. An average annual base salary for a developer of 31,040 EUR was estimated from the EUROSTAT. Previously, similar works had been done by several authors estimating Red Hat, Debian, and Fedora distributions. The cost estimation is not of itself important, but it is an important means to an end: that commons-based innovation must receive a higher level of official recognition that would set it as an alternative to decision-makers. Ideally, legal and regulatory frameworks must allow companies participating on commons-based R&D to generate intangible assets for their contribution to successful projects. Otherwise, expenses must have an equitable tax treatment as a donation to social welfare." -
New DoD Memo On Open Source Software
dwheeler writes "The US Department of Defense has just released a new official memo on open source software: 'Clarifying Guidance Regarding Open Source Software (OSS).' (The memo should be up shortly on this DoD site.) This memo is important for anyone who works with the DoD, including contractors, on software and systems that include software; it may influence many other organizations as well. The DoD had released a memo back in 2003, but 'misconceptions and misinterpretations... have hampered effective DoD use and development of OSS.' The new memo tries to counter those misconceptions and misinterpretations, and is very positive about OSS. In particular, it lists a number of potential advantages of OSS, and recommends that in certain cases the DoD release software as OSS." -
Linux Kernel Surpasses 10 Million Lines of Code
javipas writes "A simple analysis of the most updated version (a Git checkout) of the Linux kernel reveals that the number of lines of all its source code surpasses 10 million, but attention: this number includes blank lines, comments, and text files. With a deeper analysis thanks to the SLOCCount tool, you can get the real number of pure code lines: 6.399.191, with 96.4% of them developed in C, and 3.3% using assembler. The number grows clearly with each new version of the kernel, that seems to be launched each 90 days approximately." -
What's The Linux Kernel Worth?
schneelocke writes "What's the value of the Linux kernel? After an offer by one Jeff V. Merkey to pay 50K USD for a BSD-licensed copy of Linux, David Wheeler does some calculations and comes up with an estimate of 612M USD." Wheeler has come up with a number of interesting software-worth estimates and other quantified facts about Free software; since some aspects involve ineffables and hypotheticals, the details can be argued, but he provides a good framework with SLOCCount. -
What's The Linux Kernel Worth?
schneelocke writes "What's the value of the Linux kernel? After an offer by one Jeff V. Merkey to pay 50K USD for a BSD-licensed copy of Linux, David Wheeler does some calculations and comes up with an estimate of 612M USD." Wheeler has come up with a number of interesting software-worth estimates and other quantified facts about Free software; since some aspects involve ineffables and hypotheticals, the details can be argued, but he provides a good framework with SLOCCount. -
Open Source: Facts and Figures
Eloquence writes "Much of the debate about GNU/Linux and open source is dominated by rhetoric rather than facts. David Wheeler has just released a new version of his "paper" (which, at 440,000 characters, is more of an e-book now) 'Why Open Source Software / Free Software (OSS/FS)? Look at the Numbers!'. According to David, this paper 'examines market share, reliability, performance, scalability, security, and total cost of ownership. It also has sections on non-quantitative issues, unnecessary fears, OSS/FS on the desktop, usage reports, other sites providing related information, and ends with some conclusions.' May come in handy when talking to your boss about Linux." -
CPAN: $677 Million of Perl
Adam K writes "It had to happen eventually. CPAN has finally gotten the sloccount treatment, and the results are interesting. At 15.4 million lines of code, CPAN is starting to approach the size of the entire Redhat 6.2 distribution mentioned in David Wheeler's original paper. Could this help explain perl's relatively low position in the SourceForge.net language numbers?" -
CPAN: $677 Million of Perl
Adam K writes "It had to happen eventually. CPAN has finally gotten the sloccount treatment, and the results are interesting. At 15.4 million lines of code, CPAN is starting to approach the size of the entire Redhat 6.2 distribution mentioned in David Wheeler's original paper. Could this help explain perl's relatively low position in the SourceForge.net language numbers?" -
Debugging
dwheeler writes "It's not often you find a classic, but I think I've found a new classic for software and computer hardware developers. It's David J. Agan's Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems." Read on for the rest. Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems author David J. Agans pages 192 publisher Amacom rating 9 reviewer David A. Wheeler ISBN 0814471684 summary A classic book on debugging principlesDebugging explains the fundamentals of finding and fixing bugs (once a bug has been detected), rather than any particular technology. It's best for developers who are novices or who are only moderately experienced, but even old pros will find helpful reminders of things they know they should do but forget in the rush of the moment. This book will help you fix those inevitable bugs, particularly if you're not a pro at debugging. It's hard to bottle experience; this book does a good job. This is a book I expect to find useful many, many, years from now.
The entire book revolves around the "nine rules." After the typical introduction and list of the rules, there's one chapter for each rule. Each of these chapters describes the rule, explains why it's a rule, and includes several "sub-rules" that explain how to apply the rule. Most importantly, there are lots of "war stories" that are both fun to read and good illustrations of how to put the rule into practice.
Since the whole book revolves around the nine rules, it might help to understand the book by skimming the rules and their sub-rules:
- Understand the system: Read the manual, read everything in depth, know the fundamentals, know the road map, understand your tools, and look up the details.
- Make it fail: Do it again, start at the beginning, stimulate the failure, don't simulate the failure, find the uncontrolled condition that makes it intermittent, record everything and find the signature of intermittent bugs, don't trust statistics too much, know that "that" can happen, and never throw away a debugging tool.
- Quit thinking and look (get data first, don't just do complicated repairs based on guessing): See the failure, see the details, build instrumentation in, add instrumentation on, don't be afraid to dive in, watch out for Heisenberg, and guess only to focus the search.
- Divide and conquer: Narrow the search with successive approximation, get the range, determine which side of the bug you're on, use easy-to-spot test patterns, start with the bad, fix the bugs you know about, and fix the noise first.
- Change one thing at a time: Isolate the key factor, grab the brass bar with both hands (understand what's wrong before fixing), change one test at a time, compare it with a good one, and determine what you changed since the last time it worked.
- Keep an audit trail: Write down what you did in what order and what happened as a result, understand that any detail could be the important one, correlate events, understand that audit trails for design are also good for testing, and write it down!
- Check the plug: Question your assumptions, start at the beginning, and test the tool.
- Get a fresh view: Ask for fresh insights, tap expertise, listen to the voice of experience, know that help is all around you, don't be proud, report symptoms (not theories), and realize that you don't have to be sure.
- If you didn't fix it, it ain't fixed: Check that it's really fixed, check that it's really your fix that fixed it, know that it never just goes away by itself, fix the cause, and fix the process.
This list by itself looks dry, but the detailed explanations and war stories make the entire book come alive. Many of the war stories jump deeply into technical details; some might find the details overwhelming, but I found that they were excellent in helping the principles come alive in a practical way. Many war stories were about obsolete technology, but since the principle is the point that isn't a problem. Not all the war stories are about computing; there's a funny story involving house wiring, for example. But if you don't know anything about computer hardware and software, you won't be able to follow many of the examples.
After detailed explanations of the rules, the rest of the book has a single story showing all the rules in action, a set of "easy exercises for the reader," tips for help desks, and closing remarks.
There are lots of good points here. One that particularly stands out is "quit thinking and look." Too many try to "fix" things based on a guess instead of gathering and observing data to prove or disprove a hypothesis. Another principle that stands out is "if you didn't fix it, it ain't fixed;" there are several vendors I'd like to give that advice to. The whole "stimulate the failure, don't simulate the failure" discussion is not as clearly explained as most of the book, but it's a valid point worth understanding.
I particularly appreciated Agans' discussions on intermittent problems (particularly in "Make it Fail"). Intermittent problems are usually the hardest to deal with, and the author gives straightforward advice on how to deal with them. One odd thing is that although he mentions Heisenberg, he never mentions the term "Heisenbug," a common jargon term in software development (a Heisenbug is a bug that disappears or alters its behavior when one attempts to probe or isolate it). At least a note would've been appropriate.
The back cover includes a number of endorsements, including one from somebody named Rob Malda. But don't worry, the book's good anyway :-).
It's important to note that this is a book on fundamentals, and different than most other books related to debugging. There are many other books on debugging, such as Richard Stallman et al's Debugging with GDB: The GNU Source-Level Debugger. But these other texts usually concentrate primarily on a specific technology and/or on explaining tool commands. A few (like Norman Matloff's guide to faster, less-frustrating debugging ) have a few more general suggestions on debugging, but are nothing like Agans' book. There are many books on testing, like Boris Beizer's Software Testing Techniques, but they tend to emphasize how to create tests to detect bugs, and less on how to fix a bug once it's been detected. Agans' book concentrates on the big picture on debugging; these other books are complementary to it.
Debugging has an accompanying website at debuggingrules.com, where you can find various little extras and links to related information. In particular, the website has an amusing poster of the nine rules you can download and print.
No book's perfect, so here are my gripes and wishes:
- The sub-rules are really important for understanding the rules, but there's no "master list" in the book or website that shows all the rules and sub-rules on one page. The end of the chapter about a given rule summarizes the sub-rules for that one rule, but it'd sure be easier to have them all in one place. So, print out the list of sub-rules above after you've read the book.
- The book left me wishing for more detailed suggestions about specific common technology. This is probably unfair, since the author is trying to give timeless advice rather than a "how to use tool X" tutorial. But it'd be very useful to give good general advice, specific suggestions, and examples of what approaches to take for common types of tools (like symbolic debuggers, digital logic probes, etc.), specific widely-used tools (like ddd on gdb), and common problems. Even after the specific tools are gone, such advice can help you use later ones. A little of this is hinted at in the "know your tools" section, but I'd like to have seen much more of it. Vendors often crow about what their tools can do, but rarely explain their weaknesses or how to apply them in a broader context.
- There's probably a need for another book that takes the same rules, but broadens them to solving arbitrary problems. Frankly, the rules apply to many situations beyond computing, but the war stories are far too technical for the non-computer person to understand.
But as you can tell, I think this is a great book. In some sense, what it says is "obvious," but it's only obvious as all fundamentals are obvious. Many sports teams know the fundamentals, but fail to consistently apply them - and fail because of it. Novices need to learn the fundamentals, and pros need occasional reminders of them; this book is a good way to learn or be reminded of them. Get this book.
If you like this review, feel free to see Wheeler's home page, including his book on developing secure programs and his paper on quantitative analysis of open source software / Free Software. You can purchase Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Debugging
dwheeler writes "It's not often you find a classic, but I think I've found a new classic for software and computer hardware developers. It's David J. Agan's Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems." Read on for the rest. Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems author David J. Agans pages 192 publisher Amacom rating 9 reviewer David A. Wheeler ISBN 0814471684 summary A classic book on debugging principlesDebugging explains the fundamentals of finding and fixing bugs (once a bug has been detected), rather than any particular technology. It's best for developers who are novices or who are only moderately experienced, but even old pros will find helpful reminders of things they know they should do but forget in the rush of the moment. This book will help you fix those inevitable bugs, particularly if you're not a pro at debugging. It's hard to bottle experience; this book does a good job. This is a book I expect to find useful many, many, years from now.
The entire book revolves around the "nine rules." After the typical introduction and list of the rules, there's one chapter for each rule. Each of these chapters describes the rule, explains why it's a rule, and includes several "sub-rules" that explain how to apply the rule. Most importantly, there are lots of "war stories" that are both fun to read and good illustrations of how to put the rule into practice.
Since the whole book revolves around the nine rules, it might help to understand the book by skimming the rules and their sub-rules:
- Understand the system: Read the manual, read everything in depth, know the fundamentals, know the road map, understand your tools, and look up the details.
- Make it fail: Do it again, start at the beginning, stimulate the failure, don't simulate the failure, find the uncontrolled condition that makes it intermittent, record everything and find the signature of intermittent bugs, don't trust statistics too much, know that "that" can happen, and never throw away a debugging tool.
- Quit thinking and look (get data first, don't just do complicated repairs based on guessing): See the failure, see the details, build instrumentation in, add instrumentation on, don't be afraid to dive in, watch out for Heisenberg, and guess only to focus the search.
- Divide and conquer: Narrow the search with successive approximation, get the range, determine which side of the bug you're on, use easy-to-spot test patterns, start with the bad, fix the bugs you know about, and fix the noise first.
- Change one thing at a time: Isolate the key factor, grab the brass bar with both hands (understand what's wrong before fixing), change one test at a time, compare it with a good one, and determine what you changed since the last time it worked.
- Keep an audit trail: Write down what you did in what order and what happened as a result, understand that any detail could be the important one, correlate events, understand that audit trails for design are also good for testing, and write it down!
- Check the plug: Question your assumptions, start at the beginning, and test the tool.
- Get a fresh view: Ask for fresh insights, tap expertise, listen to the voice of experience, know that help is all around you, don't be proud, report symptoms (not theories), and realize that you don't have to be sure.
- If you didn't fix it, it ain't fixed: Check that it's really fixed, check that it's really your fix that fixed it, know that it never just goes away by itself, fix the cause, and fix the process.
This list by itself looks dry, but the detailed explanations and war stories make the entire book come alive. Many of the war stories jump deeply into technical details; some might find the details overwhelming, but I found that they were excellent in helping the principles come alive in a practical way. Many war stories were about obsolete technology, but since the principle is the point that isn't a problem. Not all the war stories are about computing; there's a funny story involving house wiring, for example. But if you don't know anything about computer hardware and software, you won't be able to follow many of the examples.
After detailed explanations of the rules, the rest of the book has a single story showing all the rules in action, a set of "easy exercises for the reader," tips for help desks, and closing remarks.
There are lots of good points here. One that particularly stands out is "quit thinking and look." Too many try to "fix" things based on a guess instead of gathering and observing data to prove or disprove a hypothesis. Another principle that stands out is "if you didn't fix it, it ain't fixed;" there are several vendors I'd like to give that advice to. The whole "stimulate the failure, don't simulate the failure" discussion is not as clearly explained as most of the book, but it's a valid point worth understanding.
I particularly appreciated Agans' discussions on intermittent problems (particularly in "Make it Fail"). Intermittent problems are usually the hardest to deal with, and the author gives straightforward advice on how to deal with them. One odd thing is that although he mentions Heisenberg, he never mentions the term "Heisenbug," a common jargon term in software development (a Heisenbug is a bug that disappears or alters its behavior when one attempts to probe or isolate it). At least a note would've been appropriate.
The back cover includes a number of endorsements, including one from somebody named Rob Malda. But don't worry, the book's good anyway :-).
It's important to note that this is a book on fundamentals, and different than most other books related to debugging. There are many other books on debugging, such as Richard Stallman et al's Debugging with GDB: The GNU Source-Level Debugger. But these other texts usually concentrate primarily on a specific technology and/or on explaining tool commands. A few (like Norman Matloff's guide to faster, less-frustrating debugging ) have a few more general suggestions on debugging, but are nothing like Agans' book. There are many books on testing, like Boris Beizer's Software Testing Techniques, but they tend to emphasize how to create tests to detect bugs, and less on how to fix a bug once it's been detected. Agans' book concentrates on the big picture on debugging; these other books are complementary to it.
Debugging has an accompanying website at debuggingrules.com, where you can find various little extras and links to related information. In particular, the website has an amusing poster of the nine rules you can download and print.
No book's perfect, so here are my gripes and wishes:
- The sub-rules are really important for understanding the rules, but there's no "master list" in the book or website that shows all the rules and sub-rules on one page. The end of the chapter about a given rule summarizes the sub-rules for that one rule, but it'd sure be easier to have them all in one place. So, print out the list of sub-rules above after you've read the book.
- The book left me wishing for more detailed suggestions about specific common technology. This is probably unfair, since the author is trying to give timeless advice rather than a "how to use tool X" tutorial. But it'd be very useful to give good general advice, specific suggestions, and examples of what approaches to take for common types of tools (like symbolic debuggers, digital logic probes, etc.), specific widely-used tools (like ddd on gdb), and common problems. Even after the specific tools are gone, such advice can help you use later ones. A little of this is hinted at in the "know your tools" section, but I'd like to have seen much more of it. Vendors often crow about what their tools can do, but rarely explain their weaknesses or how to apply them in a broader context.
- There's probably a need for another book that takes the same rules, but broadens them to solving arbitrary problems. Frankly, the rules apply to many situations beyond computing, but the war stories are far too technical for the non-computer person to understand.
But as you can tell, I think this is a great book. In some sense, what it says is "obvious," but it's only obvious as all fundamentals are obvious. Many sports teams know the fundamentals, but fail to consistently apply them - and fail because of it. Novices need to learn the fundamentals, and pros need occasional reminders of them; this book is a good way to learn or be reminded of them. Get this book.
If you like this review, feel free to see Wheeler's home page, including his book on developing secure programs and his paper on quantitative analysis of open source software / Free Software. You can purchase Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Debugging
dwheeler writes "It's not often you find a classic, but I think I've found a new classic for software and computer hardware developers. It's David J. Agan's Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems." Read on for the rest. Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems author David J. Agans pages 192 publisher Amacom rating 9 reviewer David A. Wheeler ISBN 0814471684 summary A classic book on debugging principlesDebugging explains the fundamentals of finding and fixing bugs (once a bug has been detected), rather than any particular technology. It's best for developers who are novices or who are only moderately experienced, but even old pros will find helpful reminders of things they know they should do but forget in the rush of the moment. This book will help you fix those inevitable bugs, particularly if you're not a pro at debugging. It's hard to bottle experience; this book does a good job. This is a book I expect to find useful many, many, years from now.
The entire book revolves around the "nine rules." After the typical introduction and list of the rules, there's one chapter for each rule. Each of these chapters describes the rule, explains why it's a rule, and includes several "sub-rules" that explain how to apply the rule. Most importantly, there are lots of "war stories" that are both fun to read and good illustrations of how to put the rule into practice.
Since the whole book revolves around the nine rules, it might help to understand the book by skimming the rules and their sub-rules:
- Understand the system: Read the manual, read everything in depth, know the fundamentals, know the road map, understand your tools, and look up the details.
- Make it fail: Do it again, start at the beginning, stimulate the failure, don't simulate the failure, find the uncontrolled condition that makes it intermittent, record everything and find the signature of intermittent bugs, don't trust statistics too much, know that "that" can happen, and never throw away a debugging tool.
- Quit thinking and look (get data first, don't just do complicated repairs based on guessing): See the failure, see the details, build instrumentation in, add instrumentation on, don't be afraid to dive in, watch out for Heisenberg, and guess only to focus the search.
- Divide and conquer: Narrow the search with successive approximation, get the range, determine which side of the bug you're on, use easy-to-spot test patterns, start with the bad, fix the bugs you know about, and fix the noise first.
- Change one thing at a time: Isolate the key factor, grab the brass bar with both hands (understand what's wrong before fixing), change one test at a time, compare it with a good one, and determine what you changed since the last time it worked.
- Keep an audit trail: Write down what you did in what order and what happened as a result, understand that any detail could be the important one, correlate events, understand that audit trails for design are also good for testing, and write it down!
- Check the plug: Question your assumptions, start at the beginning, and test the tool.
- Get a fresh view: Ask for fresh insights, tap expertise, listen to the voice of experience, know that help is all around you, don't be proud, report symptoms (not theories), and realize that you don't have to be sure.
- If you didn't fix it, it ain't fixed: Check that it's really fixed, check that it's really your fix that fixed it, know that it never just goes away by itself, fix the cause, and fix the process.
This list by itself looks dry, but the detailed explanations and war stories make the entire book come alive. Many of the war stories jump deeply into technical details; some might find the details overwhelming, but I found that they were excellent in helping the principles come alive in a practical way. Many war stories were about obsolete technology, but since the principle is the point that isn't a problem. Not all the war stories are about computing; there's a funny story involving house wiring, for example. But if you don't know anything about computer hardware and software, you won't be able to follow many of the examples.
After detailed explanations of the rules, the rest of the book has a single story showing all the rules in action, a set of "easy exercises for the reader," tips for help desks, and closing remarks.
There are lots of good points here. One that particularly stands out is "quit thinking and look." Too many try to "fix" things based on a guess instead of gathering and observing data to prove or disprove a hypothesis. Another principle that stands out is "if you didn't fix it, it ain't fixed;" there are several vendors I'd like to give that advice to. The whole "stimulate the failure, don't simulate the failure" discussion is not as clearly explained as most of the book, but it's a valid point worth understanding.
I particularly appreciated Agans' discussions on intermittent problems (particularly in "Make it Fail"). Intermittent problems are usually the hardest to deal with, and the author gives straightforward advice on how to deal with them. One odd thing is that although he mentions Heisenberg, he never mentions the term "Heisenbug," a common jargon term in software development (a Heisenbug is a bug that disappears or alters its behavior when one attempts to probe or isolate it). At least a note would've been appropriate.
The back cover includes a number of endorsements, including one from somebody named Rob Malda. But don't worry, the book's good anyway :-).
It's important to note that this is a book on fundamentals, and different than most other books related to debugging. There are many other books on debugging, such as Richard Stallman et al's Debugging with GDB: The GNU Source-Level Debugger. But these other texts usually concentrate primarily on a specific technology and/or on explaining tool commands. A few (like Norman Matloff's guide to faster, less-frustrating debugging ) have a few more general suggestions on debugging, but are nothing like Agans' book. There are many books on testing, like Boris Beizer's Software Testing Techniques, but they tend to emphasize how to create tests to detect bugs, and less on how to fix a bug once it's been detected. Agans' book concentrates on the big picture on debugging; these other books are complementary to it.
Debugging has an accompanying website at debuggingrules.com, where you can find various little extras and links to related information. In particular, the website has an amusing poster of the nine rules you can download and print.
No book's perfect, so here are my gripes and wishes:
- The sub-rules are really important for understanding the rules, but there's no "master list" in the book or website that shows all the rules and sub-rules on one page. The end of the chapter about a given rule summarizes the sub-rules for that one rule, but it'd sure be easier to have them all in one place. So, print out the list of sub-rules above after you've read the book.
- The book left me wishing for more detailed suggestions about specific common technology. This is probably unfair, since the author is trying to give timeless advice rather than a "how to use tool X" tutorial. But it'd be very useful to give good general advice, specific suggestions, and examples of what approaches to take for common types of tools (like symbolic debuggers, digital logic probes, etc.), specific widely-used tools (like ddd on gdb), and common problems. Even after the specific tools are gone, such advice can help you use later ones. A little of this is hinted at in the "know your tools" section, but I'd like to have seen much more of it. Vendors often crow about what their tools can do, but rarely explain their weaknesses or how to apply them in a broader context.
- There's probably a need for another book that takes the same rules, but broadens them to solving arbitrary problems. Frankly, the rules apply to many situations beyond computing, but the war stories are far too technical for the non-computer person to understand.
But as you can tell, I think this is a great book. In some sense, what it says is "obvious," but it's only obvious as all fundamentals are obvious. Many sports teams know the fundamentals, but fail to consistently apply them - and fail because of it. Novices need to learn the fundamentals, and pros need occasional reminders of them; this book is a good way to learn or be reminded of them. Get this book.
If you like this review, feel free to see Wheeler's home page, including his book on developing secure programs and his paper on quantitative analysis of open source software / Free Software. You can purchase Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Debugging
dwheeler writes "It's not often you find a classic, but I think I've found a new classic for software and computer hardware developers. It's David J. Agan's Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems." Read on for the rest. Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems author David J. Agans pages 192 publisher Amacom rating 9 reviewer David A. Wheeler ISBN 0814471684 summary A classic book on debugging principlesDebugging explains the fundamentals of finding and fixing bugs (once a bug has been detected), rather than any particular technology. It's best for developers who are novices or who are only moderately experienced, but even old pros will find helpful reminders of things they know they should do but forget in the rush of the moment. This book will help you fix those inevitable bugs, particularly if you're not a pro at debugging. It's hard to bottle experience; this book does a good job. This is a book I expect to find useful many, many, years from now.
The entire book revolves around the "nine rules." After the typical introduction and list of the rules, there's one chapter for each rule. Each of these chapters describes the rule, explains why it's a rule, and includes several "sub-rules" that explain how to apply the rule. Most importantly, there are lots of "war stories" that are both fun to read and good illustrations of how to put the rule into practice.
Since the whole book revolves around the nine rules, it might help to understand the book by skimming the rules and their sub-rules:
- Understand the system: Read the manual, read everything in depth, know the fundamentals, know the road map, understand your tools, and look up the details.
- Make it fail: Do it again, start at the beginning, stimulate the failure, don't simulate the failure, find the uncontrolled condition that makes it intermittent, record everything and find the signature of intermittent bugs, don't trust statistics too much, know that "that" can happen, and never throw away a debugging tool.
- Quit thinking and look (get data first, don't just do complicated repairs based on guessing): See the failure, see the details, build instrumentation in, add instrumentation on, don't be afraid to dive in, watch out for Heisenberg, and guess only to focus the search.
- Divide and conquer: Narrow the search with successive approximation, get the range, determine which side of the bug you're on, use easy-to-spot test patterns, start with the bad, fix the bugs you know about, and fix the noise first.
- Change one thing at a time: Isolate the key factor, grab the brass bar with both hands (understand what's wrong before fixing), change one test at a time, compare it with a good one, and determine what you changed since the last time it worked.
- Keep an audit trail: Write down what you did in what order and what happened as a result, understand that any detail could be the important one, correlate events, understand that audit trails for design are also good for testing, and write it down!
- Check the plug: Question your assumptions, start at the beginning, and test the tool.
- Get a fresh view: Ask for fresh insights, tap expertise, listen to the voice of experience, know that help is all around you, don't be proud, report symptoms (not theories), and realize that you don't have to be sure.
- If you didn't fix it, it ain't fixed: Check that it's really fixed, check that it's really your fix that fixed it, know that it never just goes away by itself, fix the cause, and fix the process.
This list by itself looks dry, but the detailed explanations and war stories make the entire book come alive. Many of the war stories jump deeply into technical details; some might find the details overwhelming, but I found that they were excellent in helping the principles come alive in a practical way. Many war stories were about obsolete technology, but since the principle is the point that isn't a problem. Not all the war stories are about computing; there's a funny story involving house wiring, for example. But if you don't know anything about computer hardware and software, you won't be able to follow many of the examples.
After detailed explanations of the rules, the rest of the book has a single story showing all the rules in action, a set of "easy exercises for the reader," tips for help desks, and closing remarks.
There are lots of good points here. One that particularly stands out is "quit thinking and look." Too many try to "fix" things based on a guess instead of gathering and observing data to prove or disprove a hypothesis. Another principle that stands out is "if you didn't fix it, it ain't fixed;" there are several vendors I'd like to give that advice to. The whole "stimulate the failure, don't simulate the failure" discussion is not as clearly explained as most of the book, but it's a valid point worth understanding.
I particularly appreciated Agans' discussions on intermittent problems (particularly in "Make it Fail"). Intermittent problems are usually the hardest to deal with, and the author gives straightforward advice on how to deal with them. One odd thing is that although he mentions Heisenberg, he never mentions the term "Heisenbug," a common jargon term in software development (a Heisenbug is a bug that disappears or alters its behavior when one attempts to probe or isolate it). At least a note would've been appropriate.
The back cover includes a number of endorsements, including one from somebody named Rob Malda. But don't worry, the book's good anyway :-).
It's important to note that this is a book on fundamentals, and different than most other books related to debugging. There are many other books on debugging, such as Richard Stallman et al's Debugging with GDB: The GNU Source-Level Debugger. But these other texts usually concentrate primarily on a specific technology and/or on explaining tool commands. A few (like Norman Matloff's guide to faster, less-frustrating debugging ) have a few more general suggestions on debugging, but are nothing like Agans' book. There are many books on testing, like Boris Beizer's Software Testing Techniques, but they tend to emphasize how to create tests to detect bugs, and less on how to fix a bug once it's been detected. Agans' book concentrates on the big picture on debugging; these other books are complementary to it.
Debugging has an accompanying website at debuggingrules.com, where you can find various little extras and links to related information. In particular, the website has an amusing poster of the nine rules you can download and print.
No book's perfect, so here are my gripes and wishes:
- The sub-rules are really important for understanding the rules, but there's no "master list" in the book or website that shows all the rules and sub-rules on one page. The end of the chapter about a given rule summarizes the sub-rules for that one rule, but it'd sure be easier to have them all in one place. So, print out the list of sub-rules above after you've read the book.
- The book left me wishing for more detailed suggestions about specific common technology. This is probably unfair, since the author is trying to give timeless advice rather than a "how to use tool X" tutorial. But it'd be very useful to give good general advice, specific suggestions, and examples of what approaches to take for common types of tools (like symbolic debuggers, digital logic probes, etc.), specific widely-used tools (like ddd on gdb), and common problems. Even after the specific tools are gone, such advice can help you use later ones. A little of this is hinted at in the "know your tools" section, but I'd like to have seen much more of it. Vendors often crow about what their tools can do, but rarely explain their weaknesses or how to apply them in a broader context.
- There's probably a need for another book that takes the same rules, but broadens them to solving arbitrary problems. Frankly, the rules apply to many situations beyond computing, but the war stories are far too technical for the non-computer person to understand.
But as you can tell, I think this is a great book. In some sense, what it says is "obvious," but it's only obvious as all fundamentals are obvious. Many sports teams know the fundamentals, but fail to consistently apply them - and fail because of it. Novices need to learn the fundamentals, and pros need occasional reminders of them; this book is a good way to learn or be reminded of them. Get this book.
If you like this review, feel free to see Wheeler's home page, including his book on developing secure programs and his paper on quantitative analysis of open source software / Free Software. You can purchase Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Estimating the Size/Cost of Linux
2bits writes "Wow... A Billion Dollars Worth Of Software On My System For Free! Check This Guy Out, He Came Up With A Counting / Pricing Method For Quite A Few Types of Source Code. Here is the Program. The results on the site are sorta dated, based on RH 7.1, but the app is pretty cool!... Hey, I can finally find out how much all my side projects are worth / costing me..." -
Estimating the Size/Cost of Linux
2bits writes "Wow... A Billion Dollars Worth Of Software On My System For Free! Check This Guy Out, He Came Up With A Counting / Pricing Method For Quite A Few Types of Source Code. Here is the Program. The results on the site are sorta dated, based on RH 7.1, but the app is pretty cool!... Hey, I can finally find out how much all my side projects are worth / costing me..." -
Why Use Free/Open Source Software?
An Anonymous Coward writes "I came across Why Use Open Source / Free Software? at Linux Today. As the author says in his intro: "This paper provides quantitative data that, in many cases, using Open Source / Free Source software is a reasonable or even superior approach to using their proprietary competition according to various measures." Good to see stuff we've known / suspected for some time backed up by real data...." -
The Myth of Open Source Security Revisited v2.0
Dare Obasanjo contributed this followup to an article entitled The Myth of Open Source Security Revisited that appeared on the website kuro5hin. He writes: "The original article tackled the common misconception amongst users of Open Source Software(OSS) that OSS is a panacea when it comes to creating secure software. The article presented anecdotal evidence taken from an article written by John Viega, the original author of GNU Mailman, to illustrate its point. This article follows up the anecdotal evidence presented in the original paper by providing an analysis of similar software applications, their development methodology and the frequency of the discovery of security vulnerabilities." Read on below for his detailed analysis, especially relevant with the currency of security initiatives in the worlds of both open- and closed-source software.
The Myth of Open Source Security Revisited v2.0 The purpose of this article is to expose the fallacy of the belief in the "inherent security" of Open Source software and instead point to a truer means of ensuring the quality of the security of a piece software is high.
Apples, Oranges, Penguins and Daemons
When performing experiments to confirm a hypothesis on the effect of a particular variable on an event or observable occurence, it is common practice to utilize control groups. In an attempt to establish cause and effect in such experiments, one tries to hold all variables that may affect the outcome constant except for the variable that the experiment is interested in. Comparisons of the security of software created by Open Source processes and software produced in a proprietary manner have typically involved several variables besides development methodology.
A number of articles have been written that compare the security of Open Source development to proprietary development by comparing security vulnerabilities in Microsoft products to those in Open Source products. Noted Open Source pundit, Eric Raymond wrote an article on NewsForge where he compares Microsoft Windows and IIS to Linux, BSD and Apache. In the article, Eric Raymond states that Open Source development implies that "security holes will be infrequent, the compromises they cause will be relatively minor, and fixes will be rapidly developed and deployed." However, upon investigation it is disputable that Linux distributions have less frequent or more minor security vulnerabilities when compared to recent versions of Windows. In fact the belief in the inherent security of Open Source software over proprietary software seems to be the product of a single comparison, Apache versus Microsoft IIS.
There are a number of variables involved when one compares the security of software such as Microsoft Windows operating systems to Open Source UNIX-like operating systems including the disparity in their market share, the requirements and dispensations of their user base, and the differences in system design. To better compare the impact of source code licensing on the security of the software, it is wise to reduce the number of variables that will skew the conclusion. To this effect it is best to compare software with similar system design and user base than comparing software applications that are significantly distinct. The following section analyzes the frequency of the discovery of security vulnerabilities in UNIX-like operating systems including HP-UX, FreeBSD, RedHat Linux, OpenBSD, Solaris, Mandrake Linux, AIX and Debian GNU/Linux.
Security Vulnerability Face-Off
Below is a listing of UNIX and UNIX-like operating systems with the number of security vulnerabilities that were discovered in them in 2001 according to the Security Focus Vulnerability Archive. AIX 10 vulnerabilities[6 remote, 3 local, 1 both] Debian GNU/Linux 13 vulnerabilities[1 remote, 12 local] + 1 Linux kernel vulnerability[1 local] FreeBSD 24 vulnerabilities[12 remote, 9 local, 3 both] HP-UX 25 vulnerabilities[12 remote, 12 local, 1 both] Mandrake Linux 17 vulnerabilities[5 remote, 12 local] + 12 Linux kernel vulnerabilities[5 remote, 7 local] OpenBSD 13 vulnerabilities[7 remote, 5 local, 1 both] Red Hat Linux 28 vulnerabilities[5 remote, 22 local, 1 unknown] + 12 Linux kernel vulnerabilities[6 remote, 6 local] Solaris 38 vulnerabilities[14 remote, 22 local, 2 both] From the above listing one can infer that source licensing is not a primary factor in determining how prone to security flaws a software application will be. Specifically proprietary and Open Source UNIX family operating systems are represented on both the high and low ends of the frequency distribution.
Factors that have been known to influence the security and quality of a software application are practices such as code auditing (peer review), security-minded architecture design, strict software development practices that restrict certain dangerous programming constructs (e.g. using the str* or scanf* family of functions in C) and validation & verification of the design and implementation of the software. Also reducing the focus on deadlines and only shipping when the system the system is in a satisfactory state is important.
Both the Debian and OpenBSD projects exhibit many of the aforementioned characteristics which help explain why they are the Open Source UNIX operating systems with the best security record. Debian's track record is particularly impressive when one realizes that the Debian Potato consists of over 55 million lines of code (compared to RedHat's 30,000,000 lines of code).
The Road To Secure Software
Exploitable security vulnerabilities in a software application are typically evidence of bugs in the design or implementation of the application. Thus the process of writing secure software is an extension of the process behind writing robust, high quality software. Over the years a number of methodolgies have been developed to tackle the problem of producing high quality software in a repeatable manner within time and budgetary constraints. The most successful methodologies have typically involved using the following software quality assurance, validation and verification techniques; formal methods, code audits, design reviews, extensive testing and codified best practices.-
Formal Methods: One can use formal proofs based on mathematical
methods and rigor to verify the correctness of software algorithms. Tools
for specifying software using formal techniques exist such as VDM and Z.
Z (pronounced 'zed') is a formal specification notation based on set
theory and first order predicate logic. VDM stands for "The Vienna
Development Method" which consists of a specification language called
VDM-SL, rules for data and operation refinement which allow one to
establish links between abstract requirements specifications and
detailed design specifications down to the level of code, and a proof
theory in which rigorous arguments can be conducted about the properties
of specified systems and the correctness of design decisions.The
previous descriptions were taken from the
Z FAQ and the
VDM FAQ
respectively. A comparison of both specification languages is
available in the paper,
Understanding the differences between VDM and Z
by I.J. Hayes et al.
-
Code Audits: Reviews of source code by developers other than the
author of the code are good ways to catch errors that may have been
overlooked by the original developer. Source code audits can vary from
informal reviews with little structure to formal code inspections or
walkthroughs. Informal reviews typically involve the developer sending
the reviewers source code or descriptions of the software for feedback
on any bugs or design issues. A walkthrough involves the detailed
examination of the source code of the software in question by one or more
reviewers. An inspection is a formal process where a detailed examination
of the source code is directed by reviewers who act in certain roles. A
code inspection is directed by a "moderator", the source code is read by a
"reader" and issues are documented by a "scribe".
-
Testing: The purpose of testing is to find failures. Unfortunately,
no known software testing method can discover all possible failures that
may occur in a faulty application and metrics to establish such details
have not been forthcoming. Thus a correlation between the quality of a
software application and the amount of testing it has endured is
practically non-existent.
There are various categories of tests including unit, component, system, integration, regression, black-box, and white-box tests. There is some overlap in the aforementioned mentioned testing categories.
Unit testing involves testing small pieces of functionality of the application such as methods, functions or subroutines. In unit testing it is usual for other components that the software unit interacts with to be replaced with stubs or dummy methods. Component tests are similar to unit tests with the exception that dummmy and stub methods are replaced with the actual working versions. Integration testing involves testing related components that communicate with each other while system tests involve testing the entire system after it has been built. System testing is necessary even if extensive unit or component testing has occured because it is possible for seperate subroutines to work individually but fail when invoked sequentialy due to side effects or some error in programmer logic. Regression testing involves the process of ensuring that modifications to a software module, component or system have not introduced errors into the software. A lack of sufficient regression testing is one of the reasons why certain software patches break components that worked prior to installation of the patch.
Black-box testing also called functional testing or specification testing test the behavior of the component or system without requiring knowledge of the internal structure of the software. Black-box testing is typically used to test that software meets its functional requirements. White-box testing also called structural or clear-box testing involves tests that utilize knowledge of the internal structure of the software. White-box testing is useful in ensuring that certain statements in the program are excercised and errors discovered. The existence of code coverage tools aid in discovering what percentages of a system are being excercised by the tests.
More information on testing can be found at the comp.software.testing FAQ .
-
Design Reviews: The architecture of a software application can be
reviewed in a formal process called a design review. In design reviews the
developers, domain experts and users examine that the design of the
system meets the requirements and that it contains no significant flaws
of omission or commission before implementation occurs.
-
Codified Best Practices: Some programming languages have libraries
or language features that are prone to abuse and are thus prohibited in
certain disciplined software projects. Functions like
strcpy,gets, andscanfin C are examples of library functions that are poorly designed and allow malicious individuals to use buffer overflows or format string attacks to exploit the security vulnerabilities exposed by using these functions. A number of platforms explicitly disallowgetsespecially since alternatives exist. Programming guidelines for such as those written by Peter Galvin in a Unix Insider article on designing secure software are used by development teams to reduce the likelihood of security vulnerabilities in software applications.
Issues Preventing Development of Secure Open Source Software
One of the assumptions that is typically made about Open Source software is that the availability of source code translates to "peer review" of the software application. However, the anecdotal experience of a number of Open Source developers including John Viega belies this assumption.
The term "peer review" implies an extensive review of the source code of an application by competent parties. Many Open Source projects do not get peer reviewed for a number of reasons including- complexity of code in addition to a lack of documentation makes it
difficult for casual users to understand the code enough to give a
proper review
- developers making improvements to the application typically focus
only on the parts of the application that will affect the feature to be
added instead of the whole system.
- ignorance of developers to security concerns.
- complacency in the belief that since the source is available that
it is being reviewed by others.
Benefits of Open Source to Security-Conscious Users
Despite the fact that source licensing and source code availability are not indicators of the security of a software application, there is still a significant benefit of Open Source to some users concerned about security. Open Source allows experts to audit their software options before making a choice and also in some cases to make improvements without waiting for fixes from the vendor or source code maintainer.
One should note that there are constraints on the feasibility of users auditing the software based on the complexity and size of the code base. For instance, it is unlikely that a user who wants to make a choice of using Linux as a web server for a personal homepage will scrutinize the TCP/IP stack code.
References- Frankl, Phylis et al. Choosing a Testing Method to Deliver
Reliability. Proceedings of the 19th International Conference on
Software Engineering, pp. 68--78, ACM Press, May 1997.
<
http://citeseer.nj.nec.com/frankl97choosing.html
>
- Hamlet, Dick. Software Quality, Software Process, and
Software Testing. 1994. <
http://citeseer.nj.nec.com/hamlet94software.html
>
-
Hayes, I.J., C.B. Jones and J.E. Nicholls. Understanding the
differences between VDM and Z. Technical Report UMCS-93-8-1,
University of Manchester, Computer Science Dept., 1993.
<
http://citeseer.nj.nec.com/hayes93understanding.ht ml >
-
Miller, Todd C. and Theo De Raadt. strlcpy and strlcat - consistent,
safe, string copy and concatenation. Proceedings of the 1999 USENIX
Annual Technical Conference, FREENIX Track, June 1999.
<
http://www.usenix.org/events/usenix99/full_papers/ millert/millert_html/
>
-
Viega, John. The Myth of Open Source Security. Earthweb.com.
<
http://www.earthweb.com/article/0,,10455_626641,00 .html >
- Gonzalez-Barona, Jesus M. et al. Counting Potatoes: The Size of
Debian 2.2. <
http://people.debian.org/~jgb/debian-counting/coun ting-potatoes/
>
-
Wheeler, David A. More Than A Gigabuck: Estimating GNU/Linux's Size.
<
http://www.counterpane.com/crypto-gram-0003.html
>
Acknowledgements
The following people helped in proofreading this article and/or offering suggestions about content: Jon Beckham, Graham Keith Coleman, Chris Bradfield, and David Dagon. © 2002 Dare Obasanjo -
Formal Methods: One can use formal proofs based on mathematical
methods and rigor to verify the correctness of software algorithms. Tools
for specifying software using formal techniques exist such as VDM and Z.
Z (pronounced 'zed') is a formal specification notation based on set
theory and first order predicate logic. VDM stands for "The Vienna
Development Method" which consists of a specification language called
VDM-SL, rules for data and operation refinement which allow one to
establish links between abstract requirements specifications and
detailed design specifications down to the level of code, and a proof
theory in which rigorous arguments can be conducted about the properties
of specified systems and the correctness of design decisions.The
previous descriptions were taken from the
Z FAQ and the
VDM FAQ
respectively. A comparison of both specification languages is
available in the paper,
Understanding the differences between VDM and Z
by I.J. Hayes et al.
-
Why Open Source Software/Free Software?
dwheeler writes: "I've just posted a major update of my paper, ``Why Open Source Software / Free Software (OSS/FS)? Look at the Numbers!'' Many sites give qualitative reasons for using OSS/FS, but this paper emphasizes quantitative measures (such as experiments and market studies) on why using OSS/FS products is, in a number of circumstances, a reasonable or even superior approach. The paper covers market share, reliability, performance, scaleability, security, and total cost of ownership." Bookmark this for the next time your boss asks. -
Why Open Source Software/Free Software?
dwheeler writes: "I've just posted a major update of my paper, ``Why Open Source Software / Free Software (OSS/FS)? Look at the Numbers!'' Many sites give qualitative reasons for using OSS/FS, but this paper emphasizes quantitative measures (such as experiments and market studies) on why using OSS/FS products is, in a number of circumstances, a reasonable or even superior approach. The paper covers market share, reliability, performance, scaleability, security, and total cost of ownership." Bookmark this for the next time your boss asks. -
What Actually Makes Up "Linux"?
David A. Wheeler sent in linkage to his extensive analysis of the true size of Linux. There's an amazing amount of information in here, and although it focuses on Red Hat 7.1, it still has tons of interesting bits of information about the code that makes up the distribution. Break downs include languages, licenses, cost estimates, and stats that in no way clear up the legendary GNU/Linux debate that will undoubtably be engraved on tombstones somewhere. -
What Actually Makes Up "Linux"?
David A. Wheeler sent in linkage to his extensive analysis of the true size of Linux. There's an amazing amount of information in here, and although it focuses on Red Hat 7.1, it still has tons of interesting bits of information about the code that makes up the distribution. Break downs include languages, licenses, cost estimates, and stats that in no way clear up the legendary GNU/Linux debate that will undoubtably be engraved on tombstones somewhere.