Distributed Statistical Debugging
Luis Villa writes "The Cooperative Bug Isolation Project at UC Berkeley and Stanford is working on statistical debugging techniques to report, find, and fix the bugs that drive the most users crazy every day. A handful of outside bug volunteers have been running the project's special feedback builds for a few weeks, and that has generated some really interesting data. But for strong results they need more runs. /. has been known to generate those kinds of big numbers ;) Their site has feedback builds of several open source applications, and the entire project is open sourced. Read more about it, then install some applications, and help them make our free software better for everyone. I'm really looking forward to the end results."
cybermace5 asked:
It's a numbers game. We're looking for statistical trends in large numbers of runs. That means we will learn the most, the most quickly, about the bugs that are happening most often to the most users. Bug triage falls out as a natural consequence of the sparse sampling and statistical modeling approach.
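To make the "numbers game" a bit more concrete, here is a minimal sketch of the general idea, not the project's actual algorithm. It assumes three made-up predicates (`ptr_is_null`, `len_gt_100`, `retry_count_gt_0`), a planted bug that crashes whenever `ptr_is_null` is true, and a hypothetical 10% sampling rate. Each run reports only a sparse sample of what happened, and predicates are ranked by the fraction of the runs that reported them which ended in a crash.

```python
import random
from collections import Counter


def simulate_run(rng, sample_rate=0.1):
    """Simulate one hypothetical run: returns (observed predicates, crashed)."""
    # Made-up predicates for illustration; 'ptr_is_null' is the planted bug.
    truth = {
        "ptr_is_null": rng.random() < 0.2,
        "len_gt_100": rng.random() < 0.5,
        "retry_count_gt_0": rng.random() < 0.3,
    }
    crashed = truth["ptr_is_null"]
    # Sparse sampling: a true predicate is only reported some of the time,
    # so any single run carries very little information.
    observed = {
        name
        for name, value in truth.items()
        if value and rng.random() < sample_rate
    }
    return observed, crashed


def rank_predicates(runs):
    """Rank predicates by the fraction of reporting runs that crashed."""
    seen_in_crash = Counter()
    seen_in_success = Counter()
    for observed, crashed in runs:
        (seen_in_crash if crashed else seen_in_success).update(observed)
    scores = {}
    for name in set(seen_in_crash) | set(seen_in_success):
        f, s = seen_in_crash[name], seen_in_success[name]
        scores[name] = f / (f + s)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    rng = random.Random(0)
    runs = [simulate_run(rng) for _ in range(20000)]
    for name, score in rank_predicates(runs):
        print(f"{name}: {score:.2f}")
```

With only a handful of runs the sampled reports are too sparse to separate the predicates; with many thousands, the one tied to the crash stands out. That is why more runs matter.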
That said, your suggestion about measuring head-into-keyboard smacks isn't half bad. There are some groups here at Berkeley that work on haptic (touch-based) interfaces; perhaps I should pass that suggestion along to them. :-)
In the future we may add the ability for users to manually report non-crash application misbehavior. I know that this is something that The GIMP's developers are very interested in, and we are already doing controlled experiments along similar lines. The underlying statistical debugging techniques still apply.
But in the current public deployment you are quite right that we only pick up on crash bugs. Consider that the more general project name gives us something to aspire to.
If you're a raving ignoramus, you're a raving ignoramus who asks good questions. :-)
In our current deployment, we only learn about problems that crash the application. These are important bugs, but they are certainly not the only bugs.
We are considering ways to let users manually report non-crash misbehavior. I know that this is something that The GIMP's developers are very interested in, and we are already doing controlled experiments along similar lines. The underlying statistical debugging techniques still apply.
Or to put things into academic terms, "FORMAT MY *$^&$* TABLE CORRECTLY" is future work.
That isn't how Watson works. The bugs are sorted into what are called "buckets". Once the bucket for a particular stack trace (that is, a particular bug) has collected enough information to debug, you can press Send all you want: the bucket limit has been reached. They don't see 30,000 copies of the same crash, maybe 5 or 10; they don't need more.
Click all you want. It's pointless once they have enough crash info.
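As a rough illustration of the bucketing described above (a sketch only, not Watson's actual implementation), the following groups reports by stack trace and stops keeping them once a bucket is full. The `CrashBuckets` class, its `limit` default of 10, and the report format are all hypothetical.

```python
from collections import defaultdict


class CrashBuckets:
    """Hypothetical crash-report store with a per-bucket cap."""

    def __init__(self, limit=10):
        self.limit = limit                # assumed cap; the real value is unknown
        self.buckets = defaultdict(list)  # stack trace -> kept reports
        self.dropped = 0                  # reports discarded after the cap

    def submit(self, stack_trace, report):
        """Keep the report if its bucket has room; return True if kept."""
        key = tuple(stack_trace)          # same stack trace means same bucket
        bucket = self.buckets[key]
        if len(bucket) >= self.limit:
            self.dropped += 1             # bucket full: extra clicks add nothing
            return False
        bucket.append(report)
        return True


if __name__ == "__main__":
    store = CrashBuckets(limit=3)
    trace = ["main", "parse", "strlen"]
    for i in range(5):
        kept = store.submit(trace, {"report_id": i})
        print(i, "kept" if kept else "dropped")
```

Keying on the whole stack trace is the simplest possible choice here; a real system would presumably need something more forgiving to treat near-identical traces as one bug.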
Now they should open the Watson buckets to developers on MSDN, so we can pull the info from their server and run it in the debugger with our own symbol server. That way third-party vendors could use this information too.