Slashdot Mirror


Using Redundancies to Find Errors

gsbarnes writes "Two Stanford researchers (Dawson Engler and Yichen Xie) have written a paper (pdf) showing that seemingly harmless redundant code is frequently a sign of not so harmless errors. Examples of redundant code: assigning a variable to itself, or dead code (code that is never reached). Some of their examples are obvious errors, some of them subtle. All are taken from a version of the Linux kernel (presumably they have already reported the bugs they found). Two interesting lessons: Apparently harmless mistakes often indicate serious troubles, so run lint and pay attention to its output. Also, in addition to its obvious practical uses, Linux provides a huge open codebase useful for researchers investigating questions about software engineering."

13 of 326 comments (clear)

  1. Here's a text link by Anonymous Coward · · Score: 4, Informative

    PDF usually crashes my computer (crappy adobe software). So here's a convenient text link!

    http://216.239.37.100/search?q=cache:yuZKW8CjTqIC: www.stanford.edu/~engler/p401-xie.ps+&hl=en&ie=UTF -8

  2. More details / PostScript version by Amsterdam+Vallon · · Score: 2, Informative

    More details
    Appeared in FSE 2002. Finds funny bugs by looking for redundant operations (dead code, unused assignments, etc.). From empirical measurements, code with such redundant errors is 50-100% more likely to have hard errors. Also describes how to check for redundancies to find holes in specifications.

    Link to PostScript file for easy viewing/printing
    File

    --

    Reply or e-mail; don't vaguely moderate. Ex-O'Reilly/MIT employee, now a full-time Google employee.
  3. html by farnsworth · · Score: 2, Informative

    html version is here.

    --

    There aint no pancake so thin it doesn't have two sides.

  4. Re:I have no idea what this article means ! by Anonymous Coward · · Score: 3, Informative

    Well let me summerize it for you:

    That paper explored the hypothesis that redundancies, like type erros, flag higher-level correctness mistakes. They evaluated the approach using four checkers which they applied to the Linux operating system. These simple analyses found many superising (to them) error types. Further, those errors correlated well with known hard errors: redundancies seemed to flag confused or poor programmers who were prone to other error types. According to them, these indicators could be used to decide where to audit a system.

  5. Re:Is this not Obvious by gmack · · Score: 5, Informative

    If only the story poster had actually read the paper.

    They used a custom checker that finds these things much more effiectivly than lint.

    I actually remember the flood of bug reports and kernel patches that toy of theirs generated the first few months they put it to use on the kernel.

  6. lclint pointer incorrect... by Samrobb · · Score: 2, Informative

    For the record, it's been moved...

    Larch FTP Site
    January 28, 1999

    Many files formerly on this site were moved elsewhere after a disk
    crash in March, 1998.

    The LCLint distribution can be found at
    ftp://ftp.sds.lcs.mit.edu/pub/lclint
    or http://www.sds.lcs.mit.edu/lclint

    --
    "Great men are not always wise: neither do the aged understand judgement." Job 32:9
  7. Linux for research by Champaign · · Score: 2, Informative

    Using Linux for academic research is hardly a new idea. In my group alone one of the profs has been publishing papers and giving talks about research using Linux since 2000.

    An example of such is http://plg.uwaterloo.ca/~migod/papers/evolution.pd f - about the evolution of Linux

  8. Re:Saw his talk at FSE by Anonymous Coward · · Score: 1, Informative

    There *are* better tools already available, though not for free.

    I've long wished that Gimpel Software would release a Linux version of their lint -- it does everything described in the PDF article and much more.

  9. Related research at Berkeley: CQual by sailesh · · Score: 2, Informative

    CQual

    It's been used to find security holes.

  10. Re:Parallel programming 101 by idletask · · Score: 2, Informative

    Gee, reread yourself... Sorry but they're 100% right. If they could acquire the lock, cam and cam->ops cannot be NULL due to the first check.

    BTW, in 2.4.18, the code now looks like this for this same function:

    struct cam_data *cam = dev->priv; int retval; if (!cam || !cam->ops) return -ENODEV; DBG("cpia_mmap: %ld\n", size); if (size > FRAME_NUM*CPIA_MAX_FRAME_SIZE) return -EINVAL; /* REDUNDANT! */ if (!cam || !cam->ops) return -ENODEV; /* make this _really_ smp-safe */ if (down_interruptible(&cam->busy_lock)) return -EINTR;

  11. Re:lint is horrible by Ed+Avis · · Score: 2, Informative

    Check out Splint (formerly LCLint). Whereas traditional lint is less needed now that compilers have -W switches, splint has a whole bunch of extra stuff which gcc won't warn about. The only trouble with it is that by default, it wants you to add annotations to your program to help it be checked (for example if a function parameter is a pointerm you can annotate whether it is allowed to be null). If you go along with that then splint can give lots of help in finding places where null pointer dereferences could happen, and other bugs. But even if you don't want to annotate and you use the less strict checking it's still a handy tool. (OK, maybe C99 has some of this stuff too, but splint has more.)

    --
    -- Ed Avis ed@membled.com
  12. You can find duplicated Java code... by tcopeland · · Score: 2, Informative

    ...using this thing here:

    http://pmd.sourceforge.net/cpd.html

    CPD uses a variant on Greedy String Tiling to find duplicated code in Java programs. There's also a JavaSpaces version since finding duplicated code is fairly parallelizable....

    Yours,

    Tom

  13. Re:Parallel programming 101 by Skuto · · Score: 2, Informative

    >Uhm, if cam was NULL after the first check,
    >wouldn't cam->busy_lock cause a segfault

    I don't feel bad about being flamed by people that missed my point, but I do feel bad about this. You are 100% right - the way I read the code can't possibly be correct.

    --
    GCP