Slashdot Mirror


PMD Applied

Simon P. Chappell writes "It's a fundamentally agreed fact within our industry that code reviews are good. Really good. Sliced bread good. But have you actually tried organizing one? If you can get everyone together that needs to be there at the same time in the same meeting room, then you still have the challenge of trying to keep a roomful of geeks from getting trapped in minutiae and squabbling over details like formatting conventions. Well, what if I told you that you could get your code reviews done in less than five minutes and that there would be no arguing? Enter PMD, an open-source Java static analyzer. Think of it as a code review in a box. As if that weren't wonderful enough, there's even a book, PMD Applied, written by Tom Copeland, the author of PMD." Read on for the rest of Simon's review. PMD Applied author Tom Copeland pages 221 publisher Centennial Books rating 9 reviewer Simon P. Chappell ISBN 0976221411 summary A good book if you use or are considering using PMD for static analysis of Java code

I hope that it's not too much of stretch to suggest that this book is primarily written for users of PMD. That said, I actually purchased it (yup, this one wasn't a freebie) because I was considering a recommendation to use PMD as part of the build process at a previous employer. This is not a slam on the documentation on the PMD website, but Mr. Copeland was offering signed editions of the book if you pre-ordered and I wanted to add to my growing collection of books signed by the author. (No contest; geeky as charged.)

PMD Applied has ten chapters. The first chapter is the mandatory introduction and the second covers installation. Nothing unusual there. Chapter three covers using PMD from within Ant or Maven. I'm glad to see both build systems covered here. Both have a pretty good sized user base and both deserve to be represented. Chapter four covers using PMD within an IDE. The range of IDEs covered is excellent and includes all of the ones that I would have covered and a couple of others that I hadn't heard of.

Chapter five begins the serious job of using PMD for static analysis by examining the ability of PMD to detect duplicate code using the wonderfully named Copy and Paste Detector (CPD). Chapter six is titled Best Practices. It's a wonderful collection of advice from Mr. Copeland, on how to begin applying PMD to your Java development process. As a Team Leader, I understand many of the points he makes. Teams generally don't enjoy having impartial tools dumped on them in the middle of a project that do nothing but nitpick their code and programming style; such things are a fast-track to being unpopular. Mr. Copeland's guidelines will increase the chance of your team accepting PMD, as you start with a subset of helpful rules with low chances of false positives.

For all of its cleverness, PMD is nothing without well written rules and chapter seven has this covered. PMD ships with a good sized rule set of its own, even though not all of these will be suitable for every site. Creating custom rule sets is an important first step in customizing PMD for your team. Once the team is used to PMD, they're likely to want to start adding new custom rules to the mix and the book has you covered there as well, with instructions on writing rules with both Java and XPath expressions.

Chapter eight addresses the matter of customizing PMD. Naturally, making changes to the code and recompiling is the first customization covered, but then it looks at custom report formats, adding new languages for the CPD and advice on making contributions of code to the PMD project. Chapter nine is a look at the internals of PMD. Now, perhaps this won't be of much interest to some of the readers of this book, but I found it fascinating. There's nothing quite like having the internal workings of a useful tool explained clearly by the person who wrote it. If you don't like it, it's only twenty pages, so skip over it and check out chapter ten where the playing field of similar open-source tools is examined. The book wraps up with a large appendix with a very useful explanation of all of the rules that ship in the standard PMD deliverable.

I'm going to let a good part of my personal biases show here, but I absolutely love slim and detailed technical books. PMD Applied is 221 pages, which is about the size of the pre-ANSI version of Kernighan and Ritchie's The C Programming Language; the greatest technical book in the known universe. This is the perfect size for a technical book, allowing the author to introduce their subject, explain everything that needs saying, wrap up and be done before the reader's eyes roll back into their head and information overload descends upon them. This book is perfectly sized.

The level of technical discussion matches the size of the book and is just right. While the first few chapters deal with what is PMD and how do you install and run it, the rest of the book deals with good solid code analysis using PMD. This is why we bought the book. We're geeks ... show us the code! Mr. Copeland doesn't disappoint us and there are many excellent code examples throughout the book.

The appendix is a snapshot of the rules that shipped with PMD at the time of the book's publishing. While new rules have been added since this time, the list of every rule and a portion of example code that would trigger that rule are useful. Even if you don't use PMD, buy the book so you can use the appendix as a comprehensive list of examples of how not to write Java code!

After having been out for a year, the exact level of the software described in the book has been passed by. This may bother you, but I feel that the basic principles of PMD have remained true and so I wouldn't let the version numbers dissuade you from purchasing.

The book is only available from the publishers, Centennial Books. They're a smaller publisher, but I had no problem with purchasing through their website.

If you use or are considering using PMD for static analysis of Java code, then this book should be by your side."

Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

108 comments

  1. Misses the point by Azarael · · Score: 4, Insightful

    Code review is as much about semantics as syntax. It's great to be able to find duplicated code and the like, but what about more subtle mistakes? The only way you will really be able to benefit from code reviews is if you frequently have experienced eyes going over the code. In addition, there are lots of suggestions around for how to set up a code review to avoid squabbling (reviewers should only be looking for specific problems, code formatting is not one of them).

    1. Re:Misses the point by Anonymous Coward · · Score: 0

      PMD is actually an extremely powerful tool. We wrote a ton of custom PMD rules (in XPath and Java) which checked for and caught a ton of those "subtle mistakes" all without having to have a person looking for them.

      Integrated into the build process, PMD and Checkstyle have both saved our development team countless hours and found numerous problems before they had a chance to become integration issues.

    2. Re:Misses the point by MadMidnightBomber · · Score: 1

      I agree; but it's still worth running static analysis tools - like lint, or PMD to catch admittedly obvious mistakes - before you do the semantic (expensive, human) analysis. I mean we turn all the warnings on when we compile, don't we?

      --
      "It doesn't cost enough, and it makes too much sense."
    3. Re:Misses the point by Azarael · · Score: 1

      I agree, every QA check list needs to have a bunch of steps.

    4. Re:Misses the point by jrumney · · Score: 2, Insightful

      Its generally worth catching as many errors as you can as early as you can. I used to run an on-the-fly spell checker, which caught a lot of typos as soon as I'd made them, saving me a compile and fix cycle. In the last few years though, CodingStandardsThatActivelyPreventSpellChecking have become popular, with the justification that underscores (which my spellchecker could treat as whitespace) are not allowed in variable names in some obscure language that someone used 30 years ago, so they shouldn't be used anywhere despite leading to more readable code. Even more recently, development environments have found ways to use 2Gb of RAM and 100% of a 2GHz CPU to provide the same end effect. Progress.

    5. Re:Misses the point by alan_dershowitz · · Score: 2, Informative

      PMD is not only a format checker, and If you want syntax checking, just try to compile the code and if it doesn't compile then the syntax is wrong. PMD does exactly what you are talking about. For example, it flags negative tests and if-then blocks without braces, which while valid code are the source of many subtle mistakes.

      It is an excellent thing to run before doing code review. It flags piddling sources of code errors and formatting non-compliance so that your peer review process doesn't have to. The author agrees with you. PMD solves the problem of ensuring that code reviewers aren't wasting their time reviewing code formatting, but you still get the benefit of code formatting. PMD is great, everyone in a corporate programming environment should use it (following the author's suggestion of customizing it, I LIKE ternary expressions and will use them whether or not beginners find them "confusing.")

    6. Re:Misses the point by ThosLives · · Score: 2, Insightful

      Heh - you would think people compile with all warnings turned on. I've been working on a project recently that didn't.

      Also, the thing that neither static checkers nor semantic analysis can replace is a really good up-front requirements and architecture. Those things are often the ones that are the most lacking, and really have the most impact on things. A bad architectural decision early on can make life so much worse later, and that's not something that can be "fixed" in the traditional code review process - by the time you're coding, it's too late to fix requirements and architecture errors

      --
      "There are a dozen opinions on a matter until you know the truth. Then there is only one." - CS Lewis (paraprhase)
    7. Re:Misses the point by Azarael · · Score: 1

      It's kind of an endless loop though. The customers of my company aren't usually that technical. We might be able to come up with a great set of requirements with a whole lot of research, but it's likely that we'd never hit the sweet spot (because the customer changed their mind or plain explained something incorrectly). I figure that most developers are in the same boat that they have to do the best with the requirements that they have and be prepared to shift direction down the road. Sadly, we can't redesign our customers..

    8. Re:Misses the point by fletcher_the_dog · · Score: 1

      You are right about semantic versus syntax, and of course static code analysis cannot catch a lot of things. But it is very useful to catch a lot of things so it doesn't miss the point, it just doesn't cover all the points that a manual code review could cover. What it does do though, is free up more time for doing more in depth code reviews because you don't have to waste as much time manually reviewing syntatic junk.

    9. Re:Misses the point by emphatic · · Score: 1, Informative

      Completely agree. PMD, along with FindBugs and optionally CheckStyle are great tools to get your development team in the habit of using *before* code reviews. This allows the review to focus on higher level issues rather than "that's a possible NPE right there". Code Reviews are irreplacable. These tools are *very* helpful to help locate potential issues/bugs in your codebase automatically. (Both PMD and FindBugs have Eclipse plugins too, which helps integrate the tools into the act of code writing.)

    10. Re:Misses the point by OldManAndTheC++ · · Score: 4, Insightful

                 (if the purpose
          of code) {
      is             to be }
      [    understood,       then
        formatting        standards]
      //       can     really     help.

      --
      Soylent Green is peoplicious!
    11. Re:Misses the point by Azarael · · Score: 1

      That's why we have
      astyle dirty-file

    12. Re:Misses the point by Tim+C · · Score: 2, Insightful

      reviewers should only be looking for specific problems, code formatting is not one of them

      On the contrary, if you have a code formatting standard (and imho you should), then formal code reviews most certainly should flag transgressions.

    13. Re:Misses the point by rthille · · Score: 3, Interesting


      I've felt for many years that source should be stored as syntax trees and that editors should show you the source how every _you_ want to see it so it's most clear to you, not to the author.

      --
      Awesome furniture, accessories and cabinetry in Santa Rosa, CA: http://humanity-home.com/
    14. Re:Misses the point by joggle · · Score: 2, Interesting

      Your idea has merit but I disagree that the files should be stored in this way. This would probably either be some binary format that would make it more difficult to track changes using version-control software or an XML file that would also be more difficult to keep track of. This would also prevent common text editors from being able to edit the file.

      However, it would be cool if your text editor automatically cleaned up the code for your display without actually changing the file. The trick here though would be maintaining a reasonably readable file if two people are editing it with different style settings (such as shift-width=4 versus 3, etc.). The reason you would like for it to not change the code is to make version tracking easier (otherwise every time someone with a different style made a commit tons of lines of code would change due to style differences).

      There's already text editors that can clean up code (such as the one in Visual Studio .Net) but I'm not aware of any that can do this in the background without changing the file.

    15. Re:Misses the point by caseydk · · Score: 1

      I agree wholeheartedly... and said the same thing in 2005: http://caseysoftware.com/pmdapplied ;)

    16. Re:Misses the point by digitig · · Score: 1

      There are static analysis tools that do full semantic analysis as well as syntactic analysis, tools such as SPARK and MALPAS. They look at your code and tell you what it does. They'll only work on a subset of programs (in a subset of the relevant programming language), of course, and you have to give them some hints (to overcome the halting problem, for one thing), and the output is a mass of predicate calculus that you are then expected to prove matches your formal specification for the program. Huge fun, if you like pages of math (and we are supposed to be geeks here, aren't we?)

      --
      Quidnam Latine loqui modo coepi?
    17. Re:Misses the point by EvanED · · Score: 1

      An idea if you can find a source code formatter that you can customize to your liking is to have a canonical format for storing in the repository, then on update run it through a formatter to convert it to the canonical format and on checkout/update run it through your formatter.

    18. Re:Misses the point by jgrahn · · Score: 1

      I've felt for many years that source should be stored as syntax trees and that editors should show you the source how every _you_ want to see it so it's most clear to you, not to the author.

      I do not believe that is possible. My tastes in formatting cannot be encoded into an algorithm.

      And besides, how many fellow programmers do you know who have horrible formatting, but at the same time happen to name variables and types well, and document tastefully? I know none.

      This idea was hot back in the early 1990s (I remember doing a syntax tree editor as a lab exercise) but no, I don't think it can lead to anything but more pain.

    19. Re:Misses the point by jgrahn · · Score: 2, Funny

      Heh - you would think people compile with all warnings turned on. I've been working on a project recently that didn't.

      This is, in my experience, more of a rule than an exception. Some moron starts programming, doesn't bother with the compiler, figures someone will clean that up later ... and five years later you cannot even add a CFLAGS=-ansi -Wall without having the whole codebase barf thousands of warnings at you.

    20. Re:Misses the point by Dastardly · · Score: 1

      If the code is hard to write it should be hard to read. :-)

    21. Re:Misses the point by skuenzli · · Score: 1

      You are absolutely correct.

      I've done [Java] code reviews with and without static analysis tools.

      If you don't have static analysis tools in place, the review tends to spend its time on finding and discussing little bugs. You'll still find the big ones, but the room is likely to be so worked up (or tuned out) by the discussion of "basics" to some and "nit-picking" to others that the discussion of important stuff is difficult to keep constructive.

      If you do have static analysis tools in place, you can set the expectation that the reports of the static analysis tools are clean prior to the review. This allows the discussion to go straight to the important stuff. Maven's reporting features (plugins) are the number one reason I use it. My standard reports:

      • Cobertura - Unit Test Coverage Analysis
      • FindBugs - looks for bugs in Java code/li>
      • PMD - scans Java source code and looks for potential problems/li>
      • CPD - Copy Paste Detector tool by PMD/li>
      • Changelog - retrieves list of recent changes from your cm system (CVS, Subversion, etc)/li>
      • JXR (generates an HTML version of the source that other plugins can reference -- this is great if a report highlights a problem, you can browse straight to it)/li>

      Static analysis tools are one of the great resources of the Java language.

      Stephen

    22. Re:Misses the point by mollymoo · · Score: 2, Insightful

      I LIKE ternary expressions and will use them whether or not beginners find them "confusing."

      So using a style which leads to the minimum of errors is less important than your personal preference? It's not surprising that most software is bug-laden shit when so many programmers have attitudes like yours.

      --
      Chernobyl 'not a wildlife haven' - BBC News
    23. Re:Misses the point by Raenex · · Score: 1

      Geeks != math geeks

    24. Re:Misses the point by Raenex · · Score: 1

      My tastes in formatting cannot be encoded into an algorithm.

      I'm curious. Do you have examples?

      And besides, how many fellow programmers do you know who have horrible formatting, but at the same time happen to name variables and types well, and document tastefully? I know none.

      I don't think it's a matter of "horrible" formatting. It's just a matter of personal taste, like where the brace belongs, how many spaces to indent, etc. It certainly would be nicer if everybody could use their personal style without imposing that on others. When I browse the web I get annoyed when people expect that I have a certain resolution screen, font size, etc. What's so special about code that we can't let the IDE lay it out to suit our individual tastes?

    25. Re:Misses the point by Gwyn+Fisher · · Score: 1

      What about large-scale systemic problems that can't easily be gathered for a "round-the-table" walk through. Security vulnerabilities are the obvious candidates there -- some level of review is obviously tractable (the "hey, you're slapping that user-supplied string straight into a SQL statement there, bad lad!" kind of thing), but the more pernicious and less obvious candidates (e.g. malformed tracer packets being propagated through system services), I would say, require the formalism that a static analysis tool brings to bear.

    26. Re:Misses the point by alan_dershowitz · · Score: 1

      Why do you think that PMD allows customization at all? Because labelling any particular thing as "bad" is not universally agreed upon, or not appropriate for different levels of programmers.

      When you're not a first-year graduate anymore, you start seeing why they're useful. They are in fact MORE clear and MORE logical than a bulky if-then-else. Where I work no one has a problem with that, so we disabled that rule and everyone is happy. You're SUPPOSED to do that.

  2. OOG AGREE by President_Camacho · · Score: 0, Offtopic

    Sliced bread good.

    Sliced bread good! FIRE BAD!

  3. PMD defined by martyb · · Score: 3, Informative
    WTF is PMD???

    Look at the sourceforge definition of PMD .

    1. Re:PMD defined by bubbl07 · · Score: 1
      Well, according to the summary:

      Think of it as a code review in a box.

      However, I don't trust boxes. The first box I opened had Justin Timberlake inside I'm still paying child support to the second box.
    2. Re:PMD defined by Anonymous Coward · · Score: 0

      Post menstrual dynsrome. Nuff said.

    3. Re:PMD defined by cant_get_a_good_nick · · Score: 1

      WTF is PMD???

      Easy, Erick and Parish Making Dollars, well, PMD was when Parrish went solo.
    4. Re:PMD defined by Anonymous Coward · · Score: 0

      Or post menstrual discharge (requires no creative spelling)

    5. Re:PMD defined by Jesus_666 · · Score: 1

      It's pretty annoying when people use obscure acronyms in the /. summmaries without resolving them, but acronyms that can't even be resolved are taking the annoyingness to a whole new level.

      Won't somebody think of the /.ers?

      --
      USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
  4. I have by geekoid · · Score: 1

    ran many code reviews, most of them very succesfully. They just need someone to run them that isn't attached to the code.

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
    1. Re:I have by jrumney · · Score: 1

      It also helps if your "room full of geeks" is a small room - the person who wrote the code, the person running the meeting, and two or even just one reviewer works best in my experience. Also helps if the reviewers have reviewed the code prior to the meeting, so the rest of the team are not spending the whole meeting watching them skim the code.

  5. still need human eyes by Chirs · · Score: 3, Insightful

    There are lots of static checkers, and they're valuable to use before a formal review. The real usefulness of human reviewers is to find bugs in the logic, not the coding. Math errors, bad assumptions, missed corner cases, etc.

    1. Re:still need human eyes by slide-rule · · Score: 1

      agreed... and not only those sorts of things, but peers with more time invested into a non-trivial-sized project can also spot where you reinvented the wheel (due to unfamiliarity with the code base) and where, thinking down the road, a refactor now will save tons of time later. These are the sorts of things you do code/peer reviews for.

  6. Static analysis is the sh-t by FormOfActionBanana · · Score: 1

    Sewwiously. I don't know why Microsoft doesn't (appear to) invest in static analysis in their code... Imagine outputting a to-do list with, say, *every* freaking buffer overflow exploit in the Windows code tree.

    --
    Take off every 'sig' !!
    1. Re:Static analysis is the sh-t by Actually,+I+do+RTFA · · Score: 2, Informative

      Imagine outputting a to-do list with, say, *every* freaking buffer overflow exploit in the Windows code tree.

      From what I understand from my friend who works for M$, one of the main reasons it took so bloody long to write Vista was they did precisely that. He may be wrong or the auto-buffer overflow exploit finder may not have been perfect, but at least they're trying.

      --
      Your ad here. Ask me how!
    2. Re:Static analysis is the sh-t by Anonymous Coward · · Score: 0

      If only it were that bloody easy. Come on, how insulting are you?

    3. Re:Static analysis is the sh-t by FormOfActionBanana · · Score: 1

      Cool! Thanks, Microsoft!

      --
      Take off every 'sig' !!
    4. Re:Static analysis is the sh-t by jrumney · · Score: 1

      Static analysis of C and C++ is much more difficult, and prone to miss things and flag things inappropriately, than static analysis of a language like Java.

    5. Re:Static analysis is the sh-t by Procyon101 · · Score: 2, Insightful

      As someone else said, they do. Rigorously. And with some of the best next-gen lint like tools there are.

      As an aside though, I question the sanity of such tools. Why does the language allow you to use constructs that an automated tool could tell you were screwy?!? Shouldn't such practices the tool catches be compiler errors, or at least glaring warnings? It seems to me that the underlying problems lie in the language specifications and the tools are only there as a band-aid.

    6. Re:Static analysis is the sh-t by EvanED · · Score: 1

      WTF? MS has invested HEAVILY in static analysis. MS Research's SLAM project, which has been semi-commercialized in the static driver verifier for driver development, is one of the most important projects in the field.

    7. Re:Static analysis is the sh-t by EvanED · · Score: 1

      As an aside though, I question the sanity of such tools. Why does the language allow you to use constructs that an automated tool could tell you were screwy?!? Shouldn't such practices the tool catches be compiler errors, or at least glaring warnings? It seems to me that the underlying problems lie in the language specifications and the tools are only there as a band-aid.

      If you're writing a device driver, how are you going to have a language that tells you not to call a sleeping function when the IRQ level is too high? How are you going to have a language that will prevent you from writing 'close(fd);' twice in a row?

      Fact is, many of the things these some of tools are looking for at are a MUCH higher level than PL syntax. Making the compiler reject or complain about them would require essentially putting the static analysis into the compiler.

      As long as there are buggy programs, there are going to be bugs that you can detect automatically via static analysis.

    8. Re:Static analysis is the sh-t by jgrahn · · Score: 1

      As an aside though, I question the sanity of such tools. Why does the language allow you to use constructs that an automated tool could tell you were screwy?!?

      Because the linter is only right 90% of the time. A language has to be flexible to be useful, a linter doesn't.

    9. Re:Static analysis is the sh-t by Procyon101 · · Score: 1

      how are you going to have a language that tells you not to call a sleeping function when the IRQ level is too high?


      You aren't. But I doubt a static analyzer is going to catch that either.

      How are you going to have a language that will prevent you from writing 'close(fd);' twice in a row?


      By abstracting away the mutation. You can do this with C++ RAII, or Monadically, or any number of other ways.

      We will always need device drivers. C and ASM are great for that, no question about it. On the other hand, when programming higher level components, why do we insist on using a set of semantics designed for programming directly to hardware? My point being that if your language requires such low level flexibility then your static analyzer won't be doing you much good because it will be screaming about false positives, and if your static analyzer can catch problems, then you are programming in a language that is tied to closely to the hardware, since your semantics allow for constructs that do not make sense in your higher level domain.
    10. Re:Static analysis is the sh-t by Procyon101 · · Score: 1

      A language need only be as flexible as its problem domain requires. Many make systems are not even Turing complete (I think GNU Make is) but that doesn't make them non-useful! Java abstracts away pointers and forces you into an OO-only paradigm, but that lack of flexibility shelters you from a large class of bugs while retaining enough flexibility to solve a great range of problems. (Please don't confuse this statement as a personal endorsement of Java :)

      Without even getting into non-mutable semantics, we know a few techniques that are widely recognized and used to reduce the very errors linters check for such as strong static typing, garbage collection, RAII and stack frame oriented destruction, message passing, etc... It is my premise that if the linter can catch any bugs at all in your semantics, than it is very likely you are programming using constructs that are too low level for your problem domain in the first place. If the linter only picks up false positives, then you are using the right level of abstraction, but you also have no use for the linter.

    11. Re:Static analysis is the sh-t by Anonymous Coward · · Score: 0

      Microsoft invests heavily in static and dynamic analysis of their code base (prefast, fxcop, prefix, etc.). In fact such analyses form critical bars in security signoffs for releases.

    12. Re:Static analysis is the sh-t by EvanED · · Score: 1

      You aren't. But I doubt a static analyzer is going to catch that either.

      There are tools that are powerful enough to find if small programs (such as drivers) have these bugs. They are limited by assumptions they make so they can't catch all of them, but they will usually either prove a program free of these bugs up to their assumptions or find a concrete counterexample.

      (In general such a problem is of course undecidable, but for the tiny subsets of possible programs that represent what we actually write, these conclusions can often be made.)

      By abstracting away the mutation. You can do this with C++ RAII, or Monadically, or any number of other ways.

      The above was just an example. I agree that you can remove most of these bugs with C++ for instance, but I don't think you'll get a language that doesn't have some similar mistakes you can make, at least until processors or compilers get around to implementing DWIM.

      Like I said, as long as there are bugs in programs, there will be bugs in programs that static analysis tools can find and help you prevent. And aside from DWIM, I don't think you'll get a language that will change that.

    13. Re:Static analysis is the sh-t by Procyon101 · · Score: 1

      I agree that there will always be classes of bugs that cannot be solved by language design. The halting problem is the classic example of this. However, if the static analysis tool can determine that the code you wrote is not in a subset of possible code you probably meant to write, then it seems perfectly reasonable that you can construct a syntax in which only programs lying in the subset of programs you mean to write are expressible, or at least a syntax that maps them closely enough that a static analyzer won't be able to determine the bugs. For instance, adding 2 strings together is generally not meaningful, so the compiler in most languages statically checks for this possibility using a type system. Such static analysis could conceivably done by an external tool, but I think we would agree that said tool only masks a deficiency in the compiler.

      In the most trivial realization of what I am saying, you could simply merge the syntactic analyzer with the compiler, in which case you have augmented the compiler. I think it is possible to go much further in constraining the syntax of the language itself though. Extremely strongly typed languages like ML and Haskell do this for instance.

      Basically, in the end, I think that if there is near universal consensus on what is a "bad thing to do" in such and such a language, and such a thing can decidably be picked up by a static analyzer, then it points to a deficiency in the language, and by not patching it up, forces the users of the language to use external tools to catch the issue instead of a better compiler.

  7. everyone's got a different name for it by President_Camacho · · Score: 1

    Think of it as a code review in a box.

    Hey, if you want to call it a "code review", that's your prerogative.

    1. Re:everyone's got a different name for it by Anonymous Coward · · Score: 0

      Fantastic

  8. Business logic? Algorithms? by stoicfaux · · Score: 3, Insightful

    Err... call me old-fashioned, but aren't code reviews supposed to focus on the business logic implementation, potential side effects, and the algorithms used? Don't get me wrong, it's nice that to have an automated tool to check syntax and for duplicated code, but that's not where the savings comes from. Until you get a human level A.I. to do code reviews, automated code review tools are just gravy and not the meat to solving code problems.

  9. Static Analyzer Run != Code Review by neutralstone · · Score: 3, Insightful

    Enter PMD, an open-source Java static analyzer. Think of it as a code review in a box.

    No, no, no!

    Neither of these development tools is a substitute for the other. Static analyzers are very good at catching *certain* *limited* *kinds* of obvious and likely or conceivable bugs in a short span of time, but in general they cannot deduce the intentions of the developer -- let alone the desires of the user. That's where your peers come in: they review your designs before you write the code and review patches prior to merging them into a revision control system. Static analyzers are *debugging tools*: use them while coding and/or when you run regresssion tests.

    Why wasn't that obvious to the submitter?
    1. Re:Static Analyzer Run != Code Review by The+Masked+Rat+Fink · · Score: 1

      Yes! Yes! Yes!

      Having worked in corporate I/S for far too many years now, I understand that the "right thing" is rarely what happens.

      I agree with your point that PMD is not a substitute for a REAL code review, undertaken by a small group of talented programmers. Unfortunately, these rarely happen, so PMD starts to be a very useful second line of defense. For the pragmatic among us, using PMD makes much sense. I wish it wasn't so, but I find that it is. If you work in an environment unlike that which I've described, then more power to you.

      Simon (the guy who wrote the review)

      --
      simonpeter.org | simonpeter.com | techbook.info
    2. Re:Static Analyzer Run != Code Review by EvanED · · Score: 1

      Static analyzers are very good at catching *certain* *limited* *kinds* of obvious and likely or conceivable bugs...

      To be fair, good static analysis tools can catch a LOT of non-obvious bugs. MS Research's SLAM project found a bug in the parallel port driver, a race condition that was triggered if a program was closing the port at the same time the port was being physically removed from the computer. (Hey, it could happen if you have a laptop docking station.) This bug went undetected for like a decade to be found automatically.

      You might debate the value of finding that particular bug, but static analysis can do more than find obvious bugs. (In fact, in some cases, they can prove a program free of a particular class of bugs. Let's see you do that with code reviews.)

      (I'm not denigrating code reviews either in the slightest; you're very right that they are nearly complimentary. But I do think that they can do more than you acknowledge.)

    3. Re:Static Analyzer Run != Code Review by EvanED · · Score: 1

      (I'm not denigrating code reviews either in the slightest; you're very right that they are nearly complimentary. But I do think that they can do more than you acknowledge.)

      they = static analysis tools

      Got my antecedents confused...

    4. Re:Static Analyzer Run != Code Review by neutralstone · · Score: 1

      I agree with your point that PMD is not a substitute for a REAL code review, undertaken by a small group of talented programmers.

      Ok, we agree so far...

      Unfortunately, these rarely happen, so PMD starts to be a very useful second line of defense. For the pragmatic among us, using PMD makes much sense. I wish it wasn't so, but I find that it is.

      Why do you wish that using PMD wouldn't make sense?

      I think perhaps you missed my other point: Code review is not a substitute for running a static analyzer.

      Static analysis is a Good Thing.

      Peer review is Good Thing.

      To use *both* is best: both reveal bugs that the other might not catch.

      One of the great virtues of static analyzers is that they help you to weed out lots of simpler issues *before* you go into peer review of code, and that can save you and your peers a good deal of time.

      For the pragmatic among us, using a layered approach to catching bugs makes the most sense. If a pointy-haired boss takes away any of those layers, that's a separate issue.
    5. Re:Static Analyzer Run != Code Review by neutralstone · · Score: 1

      Static analyzers are very good at catching *certain* *limited* *kinds* of obvious and likely or conceivable bugs...
      To be fair, good static analysis tools can catch a LOT of non-obvious bugs.
      Very true; and I've seen SAs catch some very non-obvious bugs so I shouldn't have said "obvious". (:

      MS Research's SLAM project found a bug in the parallel port driver, a race condition that was triggered if a program was closing the port at the same time the port was being physically removed from the computer. (Hey, it could happen if you have a laptop docking station.) This bug went undetected for like a decade to be found automatically.

      You might debate the value of finding that particular bug, but static analysis can do more than find obvious bugs. (In fact, in some cases, they can prove a program free of a particular class of bugs. Let's see you do that with code reviews.)
      It can be easier to prove correctness if contract programming is employed. But in general I agree.
    6. Re:Static Analyzer Run != Code Review by Gwyn+Fisher · · Score: 1

      I think perhaps you missed my other point: Code review is not a substitute for running a static analyzer.

      Static analysis is a Good Thing.

      Peer review is Good Thing.

      Nice summary. Static analysis tools have a firm place in a well-planned development cycle, and that place is on the developer's desktop, while they're coding. Applying static analysis as a "last thing before ship" stage is a recipe for failure.

      So, apply static analysis while coding and get all the advantages that it brings in terms of defect and vulnerability location. Then apply code review as a "bigger picture" walk through of logic, semantics and overall purpose. Stops the code review turning into a long whine session on "why do I have to tell you people not to do 'x' every time we meet?" and gets it to focus on what it's actually for. (disclaimer: I work for a static analysis company, although am not involved with PMD).

  10. How good is that code? by 140Mandak262Jamuna · · Score: 5, Funny
    My friend wrote a syntax analyzer to report gross statistics about code. Calculate the ratio of number of lines of comments to lines of code, average number of lines per function, etc etc. Finally it would come up with a let grade between A and F for code qualtiy based on some heuristics. That was his masters project.

    Our diabolical prof fed this analysis code itself to it as data. It spat out a D. He got a D.

    --
    sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
    1. Re:How good is that code? by tcopeland · · Score: 2, Interesting

      Nice... yup, metrics are nice, but don't let them rule.

      PMD has a bunch of metrics stuff, including NPathComplexity (thanks to Jason Bennett for writing that one) and CyclomaticComplexity. All the "codesize" PMD rules are here.

    2. Re:How good is that code? by Mr.+Underbridge · · Score: 1

      Our diabolical prof fed this analysis code itself to it as data. It spat out a D. He got a D.

      What an idiot! Come on, if it's his freaking *project*, you'd think he'd at least do the vanity/sanity check of running it on itself. Something like that had better damned well produce an 'A' for itself. Did it never occur to him to at least practice the principles of design he's preaching?

      I'd have done the same thing if I were the prof, it's just too damned obvious. And incredibly funny. If I was feeling charitable I might give him the opportunity to resubmit.

      If I were the student, I'd have had an obfuscated section that ensured it returned an 'A' if it was self-run.

    3. Re:How good is that code? by Anonymous Coward · · Score: 0

      If it was anything like my senior deign project, he was up late the last week just getting the thing to work right, so making lots of comments and doing lots of extra testing was more than likely not the top priority. Don't forget, you only have about 2 months to design, code, test, and make a presentation and working example of a project. And that's with other classes too. They aren't supposed to be works of art. Plus, the professor sounds like a real dick. I would have fucked up his car.

    4. Re:How good is that code? by Mr.+Underbridge · · Score: 1

      If it was anything like my senior deign project, he was up late the last week just getting the thing to work right, so making lots of comments and doing lots of extra testing was more than likely not the top priority. Don't forget, you only have about 2 months to design, code, test, and make a presentation and working example of a project.

      If you're doing anything like this, you make sure it gives itself an A through the entire development cycle even if it does nothing else. Even if you code the entire thing in a day (and anyone could code up a simple version of a program that does that in a day). Too obvious. Certainly don't create a project whose sole purpose is to return a grade, and then fail itself.

      Not to mention which, many students actually manage to do decent work for projects. I'm familiar with the concept, having done many class projects plus graduate theses.

      Plus, the professor sounds like a real dick. I would have fucked up his car.

      Again, too obvious and uncreative. If I'm the prof, I see it coming and now you have a D *and* you're arrested. If I'm teaching a class in revenge, I'd give you a D for *that*. Don't blame the prof. for your crappy work.

  11. Code collaborator is a good option. by _pi-away · · Score: 1

    I've used code collaborator and I find it pretty good. The interface can be a little limiting sometimes, but overall a good product that allows the reviewers to review from their office when they have time to do it. This can occasionally lead to the review dragging on, but the convenience is worth it.

    --

    "The crows seemed to be calling his name, thought Caw."
  12. But: does your foreskin... by FormOfActionBanana · · Score: 0, Offtopic

    I believe just about every book would cover that.

    But: does your foreskin...

    run Linux?

    --
    Take off every 'sig' !!
    1. Re:But: does your foreskin... by Anonymous Coward · · Score: 0

      Imagine a Beowulf cluster of foreskins...

  13. Re:Business logic? Algorithms? by tcopeland · · Score: 1

    > aren't code reviews supposed to focus on the business logic implementation

    I think that's exactly right - use tools like PMD to find nickle and dime things like certain null pointer exceptions, unused code, empty try blocks, etc. Let the code reviews be focused on things like "hey, we don't need all these accessors", "we should be using the business rule package here", "this is really more of a Map than a List" and that sort of thing.

    The tools do the gruntwork and the people do the thinking... good times.

  14. One important thing about linux code by Anonymous Coward · · Score: 0

    SYBASE ASE 15.0 doesnt work with glibc 2.4+ and the "workaround" doesn't help the install package run.

    Ya get what ya pay for. Time to upgrade to Yukon! You had your shot, you clusterfuck of hippy bullshit.

  15. findbugs is better by Nightshade · · Score: 2, Insightful

    I've used both PMD and findbugs and PMD is pretty good, but I find findbugs to be much easier to set up and use. (http://findbugs.sourceforge.net/)

    1. Re:findbugs is better by Anonymous Coward · · Score: 0
      They do two different things. Findbugs is all about looking at the bytecode and finding problems like double check locking. PMD is more about finding unused private methods/variables, bad naming conventions in your source. To say one is better than the other is a bit shortsighted; both should be run in any Java project.


      Also don't forget about CPD (Copy Paste Detector) which comes with PMD. Brilliant little piece of software for showing up when your developers should be writing methods rather than copy pasting code.

  16. FxCop by Anonymous Coward · · Score: 0

    It's called FxCop, and it's been freely available from Microsoft for a _long_ time, having been created internally for use on their own projects. I'm not going to cite, because even a simple Google search would have turned up dozens to hundreds of links about FxCop and teams using it. Since you didn't run said search before posting, I don't see why I should cite.

  17. Re:Business logic? Algorithms? by alan_dershowitz · · Score: 2, Insightful

    This has to do with the human element. If you get six people in a room and give them a pile of badly formatted code, even with specific instructions to the contrary they are likely going to at least mentally pick apart all the obvious problems before they even move on to subtle logic flaws and business logic. I think the savings come in because you can be a better code reviewer precisely because you are not being distracted by formatting or stupid little potential errors. I personally have a much easier time reading code if it's formatted in a consistent way, so I like having a corporate code formatting standard even when I disagree on the specific policies in some places.

  18. PMD - OK, but... by gzunk · · Score: 1

    I've used PMD, and it's quite good, but irritating at times. It complained when I was creating an object within a loop. Now what I had to do was populate a list of objects so the whole point was to create multiple objects, using a loop. But it still complained...

    It's like checkstyle - good for catching some things, but sometimes too irritating to take seriously.

  19. Structural analysis for Java by Anonymous Coward · · Score: 1, Interesting

    I really thought that IBM was headed somewhere when they cooked up Structural analysis for Java
    http://www.alphaworks.ibm.com/tech/sa4j back in the 04's.

    Yes, heuristics can catch probable mistakes in developed code, but the above sort of thing, if sophisticated enough, could warn you about serious architectural mistakes as you are making them.

  20. Finding Time For Code Reviews Is The Problem by Prototerm · · Score: 1

    I used to hate going into code reviews at this one place I worked. Because nobody had time to prepare for it properly, it often collapsed into a series of "You need to indent 4 spaces instead of 3" kind of criticisms, while ignoring issues like code reuse, poorly commented code, and misuse of basic OOP principles.

    I don't believe a machine, no matter how well programmed, can replace a proper review done by real live human beings. But to do that, you need to give people the time to do it right. Unfortunately, the project managers in the companies I've worked for didn't really believe in a structured programming approach. It's a big waste of time, and programmers are being paid to write code,not sit around talking about code.

    Oh well, these are the same managers who feel that functional and technical specs are also a waste of time. I believe this is called the Redmond Programming Methodology. AKA "Quality is Job 1.1"

    --
    "My country, right or wrong; if right, to be kept right; and if wrong, to be set right." --Senator Carl Schurz (1872)
  21. A note to the reviewer by zCyl · · Score: 1

    Indeed. It can be quite helpful in reviews of you actually provide a somewhat detailed definition, at the beginning, for what you're talking about, rather than assuming everyone has heard of it. Calling it a "static analyzer" only makes me think of balloons with hair stuck to them. And, "The first chapter is the mandatory introduction... Nothing unusual there," is simply ironic, given its absence here.

    Freshmeat says, "PMD is a Java source code analyzer. It finds unused variables, empty catch blocks, unnecessary object creation, and more. It includes CPD, a tool to detect chunks of identical code."

    Although I'm not sure if any of that is comparable to what's obtained by peer code review.

  22. Step 1 by SamTheButcher · · Score: 1

    Put your code in the box? :)

  23. What happened to Slashdot? by leamanc · · Score: 1

    Where is all the whining about Slashdot linking off to some evil corporate bookseller, when the book can be acquired for cheaper at Amazon?

    --
    :q!
    1. Re:What happened to Slashdot? by Anonymous Coward · · Score: 0
      Well, if you had read the whole review, you would have seen this part:

      The book is only available from the publishers, Centennial Books. They're a smaller publisher, but I had no problem with purchasing through their website.

      So don't worry, Slashdot is still the same. Well, we don't have as many people complaining about the links, but we still have people who can't read the whole article!
  24. Re:Business logic? Algorithms? by jgrahn · · Score: 1

    Err... call me old-fashioned, but aren't code reviews supposed to focus on the business logic

    (What is "business logic", anyway? Can I assume it's a synonym for "logic", only made more palatable to PHBs?)

    implementation, potential side effects, and the algorithms used?

    I guess that's way of putting it, yes. It's also a tool for people to learn each others parts of the code, and for understanding each others' styles and design choices.

  25. This software cannot ever work properly. by JustNiz · · Score: 1

    This was proved to not be viable in 1936 when Alan Turing proved that no machine given a description of a program and a finite input can decide whether the program finishes running or will run forever, given that input.

    Thus a general algorithm to solve the halting problem for all possible program-input pairs cannot exist, let alone validate the semantic approach or validitiy of the code itself.

    Unfortunately there are enough managers who want to devalue software development from a skilled art into a brain-dead process, so they will buy into this no matter how invalid it really is.

    1. Re:This software cannot ever work properly. by mture · · Score: 1

      I understand the computer theory behind your comment. However, I don't see what this has to do with the task of improving code quality. Do I need to prove whether a program halts to be able to suggest improvements through static analysis?

      Static analysis tools are one part of the complex software development process. I wouldn't be so quick to throw them out completely because of something a very smart computer scientist once said!

    2. Re:This software cannot ever work properly. by JustNiz · · Score: 1

      Sure but my understanding from the article is that this software is intended to replace the need for code reviews by humans. My original comment was really trying to say why that could never happen.

    3. Re:This software cannot ever work properly. by Raenex · · Score: 1

      Sure but my understanding from the article is that this software is intended to replace the need for code reviews by humans.

      There were many posts that pointed out that this software was not meant to take the place of code reviews. The original submitter/reviewer had it wrong. No need to rant about the Halting Problem and managers.

  26. Defence contracting by steveoc · · Score: 2, Insightful

    Used to do this all the time when writing software for defence / weapons systems. (in Oz at least)

    Its easy - the source is printed out and circulated .. peers have 1 week to circulate the code and make markups in coloured pen of their choice, after which a formal code review meeting is held.

    It is the responsibility of each peer to be familiar with the requirements spec (even if its not part of their project), and the design that has been signed off. If they dont agree with the design - too bad - the process allows for plenty of time to look at design options, document those options, and document reasons for choosing one approach above another. Dont waste people's time by trying to flog a dead horse at a code review.

    The meetings are known in advance, so its never a problem getting people in the same room at the same time.

    Code review meeting is informally divided into 4 sections :

    1) Intro - if this is NOT the first code review, quickly list the TODO items that came up from the last code review. Good developers will refer to these recurrent issues quickly, and stick to making use of the time by covering new ground. 10 minutes to discuss any code formatting / spelling / naming convention issues. These ARE important for maintainability and consistency, so egos need to be put on hold for the duration of this 10 minute period. Its not hard to do.

    2) Critical issues section - no time limit. During this phase of the meeting, we look at :Coding errors, buffer overflows, and other WTFs in general. Requires serious, devious rat-like cunning on the part of the reviewers to think up diabolical scenarios in which the reviewed software might fail. Again, egos are put to one side, since everone eventually ends up on the receiving end of this treatment. OK, its hard to watch 'your baby' being ripped apart like this, but its all for a good cause. A lot of really good things come out of this exersize.

    Strangely, when your peers rip the shit out of your code, yes it hurts, but it gives you more respect for those peers. There is nothing better than feeling that you are surrounded by really top-notch people that you CAN bounce ideas off, and get better insights into what you are working on. That is one side-benefit of tight code reviews.

    Good, disciplined peers will stick to the agenda, and not wander off into talking about design options or formatting issues during this period.

    Document everything - even if the conversation winds around and ends up coming to the conclusion that the code is not at fault after all - document it, so you dont end up going back down the same path later on down the track.

    3) Requirements review : quick session to discuss whether the code matches the requirements. Dont bother turning up to a meeting without a thorough understanding of those requirements - you will look like a dick, and nobody will invite you to a pissup on friday arvo. Document any areas where the code falls short of requirements AND ALSO document any areas where the developer has gone over the top and added a pile of 'features' that are outside of the requirements. Keep the code lean and mean.

    4) 5 minute wrap up : write up a TODO list of things that need to be done, estimate a timeframe, make sure it gets plugged into the gannt chart, and set a fixed time for the next review.

    Time and schedule should be on everyone's mind, but thats a management and sales problem - not a technical one. Good developers will ignore schedule constraints and stick to doing things right. Having this attitude MAY turn some people's hair grey, but thats not our problem ... fact is, its the quickest way to finish off a good product.

    And we do build good shit.

    Most importantly, everyone should come out of a code review with that smug feeling that they really are part of an elite team of coders, who look at things from every possible angle. It builds morale and raises standards all round.

    1. Re:Defence contracting by SubliminalVortex · · Score: 1

      Have you ever served time in prison?

    2. Re:Defence contracting by steveoc · · Score: 1

      No, not .. yet. Getting close sometimes though.

      Why ? Do you something I dont ?

  27. easy enough by mshurpik · · Score: 1

    Assuming your manager is competent, what's wrong with 1 developer and 1 senior/manager for a code review?

    The key here is that the developer improves his code, or learns more about how the other developers work. That's all.

  28. Lintian checking != peer reviews by chthon · · Score: 1

    So it seems that PMD is something like lint, splint or QAC, but then for Java.

    Where I work we use QAC on C code (embedded development), to obtain information and statistics on such things. However, there is also an extensive code review practice set up.

  29. Presubmission reviews are the strongest ways by mtippett · · Score: 1

    The classic - sit in a room and review - doesn't work, the engagement factor drops considerably.

    The process that I have implemented within my team is quite simple.

        1) Before review, the developers prepare the submission comments as well (low quality checkin comments are as bad as low quality code)
        2) The developers then run a script to prepare a diff for review
        3) The review is set to *all* developers
        4) Any developer can hold the review based on logical, syntactical or stylistic criteria.
        5) When a developer okays the change, they are attached to the checkin comment as a reviewer.

    Some retorts that I have received from people outside the group (and my response)

        1) Diffs don't provide enough context ; If you can't interpret a context diff, you don't know your code

    Some observations on what happens when the developers get used to it

        1) Developers can do this as part of their day-to-day life
        2) Developers will eventually learn to rely on the code review as a way of understanding
        3) Developers begin to get paranoid about code that has not been reviewed :)
        4) Reviewer developers begin to focus on particular areas of their concern (magic numbers, style, correctness, code familiarity)
        5) It doesn't hold back development,since the half-backed checkin doesn't occur nearly as much, and hence no rework.
        6) The developers begin to strive for a clean review - hence increasing quality
        7) The developers begin to make the process more stringent *on their own*

    Surprisingly, the psychological aspect of the review process plays to increase it's strength.

    Now the next area for me to get the developers fully engaged on is formal SDLC artefacts :).

  30. Review Tooling by mperham · · Score: 1

    PMD is decent but really not all that useful in the grand scheme of things. Two tools are absolutely critical at my workspace:

    checkstyle - enforce a coding style. We integrate this into our SCM system so people can't check in unless their code meets the official style. It's draconian but our codebase would have 12 different styles and be close to unreadable without it.

    SmartBear's CodeCollaborator - a tool which makes code reviews easy. This is a necessity when working with junior developers on another continent. I can't speakly highly enough about this program - it's commercial but worth every penny.

  31. Fortify does this by FormOfActionBanana · · Score: 1

    http://www.fortifysoftware.com/products/sca/scaHow ItWorks.jsp
    I think these guys compile an object that makes it easy to examine every possible logic path in the source.

    It's more than just source code pattern matching.

    --
    Take off every 'sig' !!