Slashdot Mirror


Debugging Expert Wins ACM Dissertation Award

An anonymous reader writes "The Association for Computing Machinery (ACM) is reporting that Ben Liblit has been awarded the 2005 Doctoral Dissertation Award for his study on understanding and fixing software 'bugs' in the real world. From the article: 'Liblit's dissertation proposes a method for leveraging the key strength of user communities - their overwhelming numbers. His approach uses sparse random sampling rather than complete data collection for gathering information from the experiences of large numbers of software end users. It also simultaneously ensures that the observed data is an unbiased, representative subset of the complete program behavior across all runs.' Slashdot broke the story on this research back in 2003. Apparently the project is still going strong."
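The sampling idea in the quoted summary can be illustrated with a small simulation. This is only a sketch, not Liblit's actual instrumentation: the function names, the 1-in-100 sampling rate, and the simulated "predicate" that is true 30% of the time are all assumptions made for illustration. Each run records only a sparse random subset of its observations, yet pooling many runs gives an estimate close to the true rate.

```python
import random

def geometric_countdown(rng, rate):
    """Hypothetical helper: how many observations until the next sampled one.

    Drawing this countdown is equivalent to flipping a biased coin at every
    observation, but the coin only has to be consulted once per sample.
    """
    count = 1
    while rng.random() >= rate:
        count += 1
    return count

def sampled_run(rng, observations, rate):
    """Record each observation with probability `rate`; return (sampled, sampled_true)."""
    sampled = sampled_true = 0
    countdown = geometric_countdown(rng, rate)
    for value in observations:
        countdown -= 1
        if countdown == 0:
            sampled += 1
            sampled_true += value  # True counts as 1, False as 0
            countdown = geometric_countdown(rng, rate)
    return sampled, sampled_true

def estimate_true_rate(num_runs=1000, per_run=200, p_true=0.3, rate=0.01, seed=1):
    """Pool sparse samples from many simulated runs and estimate p_true."""
    rng = random.Random(seed)
    sampled = sampled_true = 0
    for _ in range(num_runs):
        observations = [rng.random() < p_true for _ in range(per_run)]
        s, t = sampled_run(rng, observations, rate)
        sampled += s
        sampled_true += t
    return sampled, sampled_true / sampled

if __name__ == "__main__":
    sampled, estimate = estimate_true_rate()
    print(f"sampled {sampled} of 200000 observations, estimated rate {estimate:.3f}")
```

Any single run contributes only a couple of observations on average, which is why the approach depends on large numbers of users.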

18 of 83 comments

  1. Sounds like Doc Watson by BadAnalogyGuy · · Score: 4, Insightful

    No, not the wild-eyed madcap scientist from Back to the Future. Doctor Watson is an OS service present in Windows that monitors the running process list for terminal assertions. When a program hits an exception that it can't handle, it terminates immediately and Doctor Watson is on the scene to read the last gasps of the process before its bits get blasted. Microsoft even came up with a way to harness this to allow users to send real-time feedback to Microsoft HQ whenever a crash occurred in a program. No one I know ever sends that data back, but I'm sure someone must have once.

    The current idea seems to be tracking the same termination events in the same way as Doctor Watson and sending the relevant data back to UWisc without informing the user. It sounds like a good idea, but I doubt it is in Liblit's power to fix Windows OS bugs.

    1. Re:Sounds like Doc Watson by Benoni · · Score: 5, Informative
      sending the relevant data back to UWisc without informing the user.

      Informed participation is a really big deal for me. No user should ever find themselves participating in the Cooperative Bug Isolation Project without their knowledge. Opt-in is explicit and revokable, and if the opt-in system runs into trouble of any kind, the fallback position is no data reporting at all.

      The whole thing collapses if users don't trust me. So I've taken every measure I can think of to ensure that they can. Please see the relevant project page for more details about privacy matters.

      It sounds like a good idea, but I doubt it is in Liblit's power to fix Windows OS bugs.

      Working on it! Check back in with me in a few years ... maybe less. :-)

    2. Re:Sounds like Doc Watson by Anonymous Coward · · Score: 5, Interesting

      No one I know ever sends that data back, but I'm sure someone must have once.

      Plenty of users do. There's a great blog posting by Raymond Chen called There's an awful lot of overclocking out there where he talks about investigating some of these "Watson" crashes.

      The crashes looked impossible: they were reported on instructions like

      xor eax, eax

      which just zeroes a register and has no way to fault. Turns out unscrupulous vendors were selling overclocked computers without informing buyers. Pretty cool article.

    3. Re:Sounds like Doc Watson by gzearfoss · · Score: 2, Interesting

      My main issue with the good Doctor is that most of the time, when I had a program that crashed and invoked him, it wasn't a Microsoft product. Typically, it's because I was working on a programming assignment and it 'burped.' So unless Microsoft was willing to help me debug my homework, I didn't see much point in sending the data on to Redmond.

      Not that I mind sending back data when it can be useful; if someone is going to look at the error logs, memory, etc., and try and make it so that it won't crash again, I'm all for it. I just pity the poor person who accidentally leaves a major bug in the code, and swamps the system with error reports.

    4. Re:Sounds like Doc Watson by Animats · · Score: 2, Interesting
      It goes back much further than that. See "The ALCOR Illinois 7090/7094 post mortem dump", a famous paper from 1967.

      Automated dump analysis is an old idea in the mainframe world, but almost unknown outside it. The microprocessor world grew up with interactive debuggers and an early user-as-programmer assumption. This hasn't translated well to the modern software world.

      In the mainframe world, there have even been mainframes that recorded the last 64 or so branches using dedicated hardware, so that after a crash, the control path could be recovered.
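      A software analogue of that branch recorder is a fixed-size ring buffer. The sketch below is purely illustrative: the class name, the (source, target) record format, and the fake addresses are assumptions, not a description of any particular mainframe.

```python
from collections import deque

class BranchTrace:
    """Hypothetical software stand-in for a hardware branch recorder."""

    def __init__(self, capacity=64):
        # A deque with maxlen silently drops the oldest entry when full.
        self._ring = deque(maxlen=capacity)

    def record(self, source, target):
        """Remember one taken branch as a (source, target) pair."""
        self._ring.append((source, target))

    def dump(self):
        """Return the retained branches, oldest first, for post-mortem analysis."""
        return list(self._ring)

# Simulate 1000 branches, then "crash" and inspect the tail of the control path.
trace = BranchTrace(capacity=64)
for step in range(1000):
    trace.record(step, step + 1)
recent = trace.dump()
```

      Only the last 64 branches survive, which keeps the cost constant no matter how long the program ran before it crashed.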

      What does the Mozilla project do with the data from their "quality feedback agent", anyway?

  2. Check out FindBugs for finding bugs in Java by licamell · · Score: 5, Informative
    This reminded me of work going on at UMD (University of Maryland, College Park). I know it's not quite the same thing, but I feel as though this is a good place to mention it and the Slashdot community would appreciate this software. FindBugs is a very cool tool for finding bugs in Java code. And no, I am not affiliated with this project, I just saw a talk on it a couple months ago.

    http://findbugs.sourceforge.net/

  3. Thank you, open source community by Benoni · · Score: 5, Informative

    This research has been a wonderful collaborative effort, and many people deserve to share the credit. To quote from part of the Acknowledgements section of my dissertation:

    I am indebted to the many members of the open source community who have supported our work. My thanks go out to the many anonymous users of our public deployment, and to the developers of the open source projects used in our public deployment and case studies.

    So thanks, Slashdot, for helping me find those users (or helping them find me). The exposure was invaluable. And thanks, open source community, for your participation. I've benefitted greatly from standing on your massed shoulders. This could not have happened without you.

    1. Re:Thank you, open source community by SEWilco · · Score: 2, Funny

      You're welcome.
      We will now debug your dissertation.

  4. Now at the University of Wisconsin-Madison by DrDitto · · Score: 2, Informative

    Ben Liblit is now an assistant professor at the University of Wisconsin-Madison. He joins a fantastic Computer Science department. Good luck Ben!

  5. Heh... by the_skywise · · Score: 4, Funny

    So somebody went and formalized the theory of "the users are the beta testers"...

    1. Re:Heh... by Benoni · · Score: 5, Informative

      Yes, exactly. The users are beta testers; we may as well admit it. I want to make them better beta testers. :-)

  6. Re:blargh by Neo-Rio-101 · · Score: 3, Funny

    *tears out own hair and screams*

    Shouldn't that be "leveraging out own hair and screaming"?

    --
    READY.
    PRINT ""+-0

  7. Request for more information by BadAnalogyGuy · · Score: 3, Interesting

    The installation of CBI is implicit consent to such monitoring, of course, and I didn't mean to imply that there was no consent involved at all.

    However, asking us to read 170-odd pages of your dissertation is a little much. Would it be possible to describe the data collection system, how reports are generated, and whether the reports are sent automatically or, as in the case of Dr. Watson, sent with user approval? Also, what types of bugs did you find using your statistical methods, and what types of bugs do you think would be difficult to find using such methods?

    A quick comparison to related mainstream debugging techniques would be useful to give us out here in the trenches a firmer grip on the techniques you describe.

    And finally, if you wouldn't mind, could you describe a real-world scenario where a generalized product (codename: CBIMax) would be marketable? If such a general product is impossible, is it because each product is different and the methods you describe would need to be revised each time? What is the maximum level of abstraction from specific scenarios that these techniques can reach without requiring large-scale retooling for each project?

    Thanks!

    1. Re:Request for more information by Benoni · · Score: 5, Informative
      However, asking us to read 170-odd pages of your dissertation is a little much.

      Hey, it's a real page-turner. Well, it has pages and they turn, at least.

      The other questions you ask are all good ones, but a bit much to address in a Slashdot comment. Please see the project home page for more information. The "Learn More" page may answer some of your questions, and there are additional drill-down pages from there with even more technical material on selected topics.

      Please understand that I don't mean to brush off your insightful questions. They are just questions for which satisfactory answers are hard to give in a sentence or two.

  8. Re:blargh by Anonymous Coward · · Score: 3, Informative

    tr.v. leveraged, leveraging, leverages
    1. a. To provide (a company) with leverage.
       b. To supplement (money, for example) with leverage.
    2. To improve or enhance: "It makes more sense to be able to leverage what we [public radio stations] do in a more effective way to our listeners" Delano Lewis.

    So listen and listen good all you academic paper writers: unless what you really mean by "leverage" is "improve", don't use it.

    "Liblit's dissertation proposes a method for leveraging the key strength of user communities - their overwhelming numbers." WRONG. "improving the key strength of communities - their overwhelming ..." does not make sense. The overwhelming numbers are there; they do not need improvement. What you may want to say is something like: "for using the key strength of user communities as leverage to blah blah blah"

    Just because a word sounds good it does not mean it should be used as a wild card ...

  9. Almost like infinite monkeys writing Shakespeare by gzearfoss · · Score: 2, Interesting

    I know that in one particular game (http://www.kingdomofloathing.com/), they tend to follow this approach. Once a new feature is created, and debugged enough so that it's stable and doesn't break anything, the feature is released to the general populace. After all, once all of the important bugs are found, a thousand users will find the minor bugs through general usage faster than a small dedicated team of testers. Also, the time the testers save by not having to verify every single minor detail can be used to work on new material.
    Add into the equation that without some elaborate software (such as Mercury LoadRunner, or an open-source equivalent), it's hard to simulate the effect the entire population will have when they start hammering on the server. It can also help track down extremely low-occurrence bugs, because with enough people working on it, those one-in-a-million cases will eventually come up.

    Kinda reminds me of infinite monkeys eventually producing the works of Shakespeare.
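    The "one-in-a-million cases will eventually come up" point can be put in numbers. This is a back-of-the-envelope sketch that assumes every run is independent and hits the bug with the same probability; the function name and the 95% target are illustrative choices, not figures from the thread.

```python
import math

def runs_needed(per_run_probability, confidence=0.95):
    """Smallest number of independent runs so that the bug shows up at least
    once with the given confidence, from 1 - (1 - p)**n >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - per_run_probability))

if __name__ == "__main__":
    print(runs_needed(1e-6))  # roughly three million runs
```

    Roughly three million runs is far beyond a small test team, but not unusual for a large user base.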

  10. Re:blargh by Tony-A · · Score: 3, Interesting

    The root word is lever and the basic idea is that you use something under your control to effect control over what would normally be outside your control. Like a very long handle on a pipe wrench.

    The money aspect you refer to has to do with debt financing whereby you manage to use your equity to finance something larger than your equity. I don't think the article is referring to corporate finance.

    In a perfect world you would use a few people who would recognize and fix the bugs. These people would never talk to the users. They would have no need to and neither would gain from the experience.
    In the world that I exist in, users are the ones who spot the bugs, specifically the circumstances under which the bugs exhibit themselves. I use my users' eyes to leverage {users' eyes, my skills}.

    If all you mean is "improve", you would not use a word which essentially demands a discrepancy in the metrics between cause and effect.

    b. To supplement (money, for example) with leverage.
    If you add money to an account because of a margin call, does this increase or decrease your leverage? That is a horrible excuse for a definition.

  11. reminds me of my rules on reporting list outages by SuperBanana · · Score: 2, Funny
    This reminds me of a method for reporting mailing list outages I devised back in 2001 or so.

    I told people we were switching to new software (Mailman) and that if they got an error message or similar, they should flip a quarter X times (I forget how many) and ONLY email me if they got all heads. I didn't want to get a couple dozen reports of the same problem, and I figured that if there were any problems, they'd affect a large set of the 1000+ users of the list.

    It worked brilliantly.
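    The arithmetic behind the coin-flip rule is easy to check. The post does not say how many flips were required, so the 5 flips used below is purely an assumption, as are the function names.

```python
def report_probability(flips):
    """Chance that one affected user gets all heads and therefore reports."""
    return 0.5 ** flips

def expected_reports(affected_users, flips):
    """Average number of emails if every affected user follows the rule."""
    return affected_users * report_probability(flips)

def chance_of_any_report(affected_users, flips):
    """Probability that at least one affected user ends up reporting."""
    return 1.0 - (1.0 - report_probability(flips)) ** affected_users

if __name__ == "__main__":
    for users in (1000, 3):
        print(users, expected_reports(users, 5), round(chance_of_any_report(users, 5), 4))
```

    With 5 flips, a problem hitting all 1000 users still produces about 31 emails and is almost certain to be reported, while one hitting only 3 users is reported about 9% of the time. The rule filters out duplicates at the cost of missing rare problems.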