The P.G. Wodehouse Method of Refactoring

Posted by kdawson on Saturday March 22, 2008 @08:02PM from the on-the-wall dept.

covertbadger notes a developer's blog entry on a novel way of judging progress in refactoring code. "Software quality tools can never completely replace the gut instinct of a developer — you might have massive test coverage, but that won't help with subjective measures such as code smells. With Wodehouse-style refactoring, we can now easily keep track of which code we are happy with, and which code we remain deeply suspicious of."

The idea of physically printing code... by nullchar · 2008-03-22 20:31 · Score: 5, Interesting

...is a neat idea. Besides the mentioned practice of raising and lowering pieces of code that the developers are happy or dissatisfied with, hanging code encourages peer review.

Perhaps not in-depth code review, but physically hanging code in your office might "scare" developers into adhering to their organization's standards for fear of their coworkers' mockery of poor code.

It might be difficult to hide shitty code when anyone can walk by and look at what *you* think is good.
(At least it might take just as much effort to hide bad code as it does to make it good.)
BTW by symbolset · 2008-03-22 20:57 · Score: 4, Interesting

I'm agreeing with you. 30k lines is 500 pages. That's roughly 8' high by 50' wide. Definitely doable.
Not about the scaring though -- just about it being useful. Anxiety isn't something I'd want to deliberately introduce to a working programmer. Most of the ones I've known had enough performance anxiety issues of their own without adding any.
Hanging the code makes some errors more visible. Not all errors are bugs. Some are structural. Structural fixes sometimes repair "pernicious" bugs.
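
For anyone who wants to redo the wall-size estimate for their own codebase, here is a rough sketch of the arithmetic. The 60 lines per page is implied by "30k lines is 500 pages"; the US Letter sheet size, portrait orientation, and edge-to-edge hanging are my assumptions, and `estimate_wall` is a made-up name.

```cpp
#include <cassert>
#include <cmath>

// Result of the estimate: how many sheets, and how they tile a wall.
struct WallEstimate {
    int pages;
    int rows;
    int columns;
    double width_feet;
    double height_feet;
};

// Assumes US Letter sheets (8.5in x 11in) hung in portrait with no gaps.
WallEstimate estimate_wall(int total_lines, int lines_per_page, double wall_height_feet) {
    const double page_width_in = 8.5;
    const double page_height_in = 11.0;
    assert(total_lines > 0 && lines_per_page > 0);

    // Round up: a partly filled last page still needs a sheet.
    const int pages = (total_lines + lines_per_page - 1) / lines_per_page;
    // Only whole sheets fit between floor and ceiling.
    const int rows = static_cast<int>(std::floor(wall_height_feet * 12.0 / page_height_in));
    assert(rows > 0);
    const int columns = (pages + rows - 1) / rows;

    return WallEstimate{pages, rows, columns,
                        columns * page_width_in / 12.0,
                        rows * page_height_in / 12.0};
}
```

Calling `estimate_wall(30000, 60, 8.0)` gives 500 pages in 8 rows of 63 columns, a bit under 45 feet wide, so "roughly 8' high by 50' wide" holds up under these assumptions.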

--
Help stamp out iliturcy.
Big Visible Charts by EponymousCoder · 2008-03-22 21:20 · Score: 5, Interesting

I really like the concept, and it fits in with a bunch of techniques we've been using at work in line with the "Big Visible Charts" ideas. Things like this and Agile stories written on index cards and pinned to the wall do sound hokey. However, a number of people, such as Johanna Rothman (http://www.pragprog.com/titles/jrpm), point out that these techniques are a lot more inclusive, and (as I've found) you get much more animated discussions than when the PM/architect/team lead writes a document "for discussion."
If nothing else it's fun to watch management trying to cope with your walls being covered with sheets of paper, cards and string when they've paid all this money for MS Project and the Rational Suite.
Re:Grok it. by blahplusplus · 2008-03-23 00:04 · Score: 3, Interesting

"Believe it or not flowcharts and Venn diagrams are not obsolete."

Believe it or not, I use mindmapping software to help plan out the structure of a program and draw relationship lines arbitrarily. I wish someone would make these mindmapping programs more accessible for programs and programming.

http://www.thebrain.com/

Also great flowchart drawing tools:

http://www.smartdraw.com/
Form follows function? by martyb · 2008-03-23 01:02 · Score: 4, Interesting
FTFA:
... A better solution would be to print a class per page. At the start of the project, the application had about 150 classes, and the refactoring effort is focussed on about 80 of those. Initially, gigantic classes would be an incomprehensible smudge of grey, but as the refactoring process starts tidying the code and factoring out into other classes, the weekly printout would start to literally come into focus, hopefully ending up with many pages actually containing readable code (which happens roughly when the class is small enough to fit on no more than 3 pages at normal size).

Brilliant! Absolutely brilliant! "Smell test?" Yah, right. But then I got to thinking, "Why are code formatting standards such a hot topic?" The computer doesn't care if indentation is expressed with 2 spaces, 3 spaces, or a tab. But, I do! Over time, I've learned how to see coding errors just from the slight aberrations in the LOOK of code. Couldn't tell you WHAT it was, at first, it just felt (or smelled) wrong. So call it what you will, but I could now see how "smell test" has some basis behind it. Then, I got to thinking of an age-old question:
How do you find a needle in a haystack?
1. Make the haystack smaller, and/or
2. Make the needle(s) bigger
The technique in the article accomplishes BOTH of these. I'd suggest running the code through a pretty printer to get consistent layout throughout the whole project. The more the semantics of the project can be represented by syntax, the more visible the troublesome code becomes.
Re:Only 30K lines anyway... by Enleth · 2008-03-23 01:35 · Score: 4, Interesting

I'd disagree on pointers and references. If you pass something in by reference, you need to know it goes in there by reference, because that is not visible in the calling code. If something's not visible, well, that's a bug just waiting to crawl in there. If you pass something by pointer, the calling code shows it clearly and you know that whatever was passed is likely to be changed by the called function. That's the rationale used by Trolltech and it is quite convincing to me.
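
A minimal sketch of the difference at the call site. The function names are hypothetical, made up purely for illustration, and this is not taken from any Trolltech code.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper taking a non-const reference: it mutates its argument.
void append_suffix_ref(std::string& s) {
    s += "_v2";
}

// Hypothetical helper taking a pointer: same effect, different call syntax.
void append_suffix_ptr(std::string* s) {
    assert(s != nullptr);  // passing null is a caller bug in this sketch
    *s += "_v2";
}

std::string demo() {
    std::string name = "widget";
    append_suffix_ref(name);   // reads like pass-by-value; the mutation is invisible here
    append_suffix_ptr(&name);  // the '&' tells the reader that name may change
    return name;
}
```

Both calls modify `name`, but only the second one says so in the calling code, which is the whole point of the argument above.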

Besides, using char * is a must sometimes, when using C libraries that accept, modify and return strings or just some chunks of arbitrary data as char *.

--
This is Slashdot. Common sense is futile. You will be modded down.
There are no corner cases in really good code by Terje+Mathisen · 2008-03-23 02:40 · Score: 4, Interesting

There are two ideas of thought about corner cases (and the GP pointed out one).
Thought #1) (GP) There's no such thing as a corner case. It is a requirement - it may be that fewer people/fewer processes use it; but, it is still a section of the total solution that must be designed to overcome some problematic section. Otherwise, why is the code being written?

Thought #2) Corner cases only affect a small number of your user-base; therefore, code to satisfy 95%-99% of your customers. The underlying principle here is that the manager will wait for another release. This approach is usually taken when the project manager failed to account for something and says (and I quote), "We'll just re-design it after the first release."

I have taken part in a few optimization competitions, and each time #1 has been a crucial part of the solution:

The usual approach is to optimize the 90-95% case, then bail on the remainder, but this will almost always be beaten by code which manages to turn everything into the "normal" case, with no if/else handling, no testing, no branching.

When I was beaten by David Stafford in the Dr. Dobb's Game of Life challenge, I had lots of special-case code to handle all the border cases, while David had managed to embed that information into his lookup tables and data structures. (He had also managed to make the working set so much smaller that it would mostly fit in the L1 cache. :-)

When my Pentomino solver won another challenge, being twice as fast as #2, the crucial idea was to make the solver core as tiny as possible, with very little data movement and the minimum possible number of tests.
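
To make the "turn everything into the normal case" idea concrete, here is a small sketch. It is not David Stafford's entry or my own contest code, just one common way to do it, and `Grid`, `kNext`, and `step` are names invented for the example. The grid carries a permanently dead one-cell border, so edge cells are processed exactly like interior cells, and the birth/survival rule is a table lookup instead of an if/else chain.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Next state indexed by [currently alive][live neighbour count].
constexpr unsigned char kNext[2][9] = {
    {0, 0, 0, 1, 0, 0, 0, 0, 0},  // dead cell: born with exactly 3 neighbours
    {0, 0, 1, 1, 0, 0, 0, 0, 0},  // live cell: survives with 2 or 3 neighbours
};

// Interior of width x height cells, stored with a one-cell border that stays dead.
struct Grid {
    int width;
    int height;
    std::vector<unsigned char> cells;

    Grid(int w, int h)
        : width(w), height(h),
          cells(static_cast<std::size_t>((w + 2) * (h + 2)), 0) {
        assert(w > 0 && h > 0);
    }

    // Interior coordinates; the +1 offsets skip the border.
    unsigned char& at(int x, int y) { return cells[(y + 1) * (width + 2) + (x + 1)]; }
};

Grid step(const Grid& g) {
    Grid out(g.width, g.height);
    const int stride = g.width + 2;
    for (int y = 0; y < g.height; ++y) {
        for (int x = 0; x < g.width; ++x) {
            // All eight neighbours exist in memory thanks to the border,
            // so there is no edge test anywhere in the loop.
            const unsigned char* c = &g.cells[(y + 1) * stride + (x + 1)];
            const int n = c[-stride - 1] + c[-stride] + c[-stride + 1]
                        + c[-1] + c[1]
                        + c[stride - 1] + c[stride] + c[stride + 1];
            out.at(x, y) = kNext[*c][n];
        }
    }
    return out;
}
```

The only conditionals left are the loop bounds. A real contest entry would go much further (packing cells into bits and looking up whole groups at once), but the principle is the same: the special cases live in the data, not in the code.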

Terje

--
"almost all programming can be viewed as an exercise in caching"
Re:Only 30K lines anyway... by dubl-u · 2008-03-23 06:47 · Score: 2, Interesting

I agree with most of your comments, and especially the spirit of keeping everything shipshape and avoiding the endless game of "who can be blamed". Two minor improvements:
"Any error handling through error return codes, probably to be replaced by exceptions, unless it turns the calling code into a wild mass of try/catch blocks."

Sometimes instead of return codes, there are other good options. For example, you can spin a state machine out into an object, which in effect keeps the return codes safely in one place until you want to check them. In some cases, I love the Null Object Pattern. And sometimes it makes sense to have a request object and a response object, with the response carrying possible error-related info.

"Anything that *doesn't* belong together should be split into separate files (but don't make a file for just a single function - instead create a file with 'leftovers')."

For functions, maybe. But a lot of good objects really can have just one method. Ruby does that all the time with anonymous methods, for example, and sometimes that pattern is worth using in a more explicit language.
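
A rough sketch of the request/response idea, in C++ since that is what the thread is discussing. Every name here (`ParseRequest`, `ParseResponse`, `parse_port`, `require_port`) is hypothetical; the point is only that the error information travels inside the response instead of through a bare return code or an exception.

```cpp
#include <cassert>
#include <string>

// Hypothetical request object: everything the operation needs, in one place.
struct ParseRequest {
    std::string text;
};

// Hypothetical response object: the result plus any error-related info.
struct ParseResponse {
    bool ok;
    int port;           // meaningful only when ok is true
    std::string error;  // empty when ok is true
};

ParseResponse parse_port(const ParseRequest& req) {
    if (req.text.empty() || req.text.size() > 5) {
        return ParseResponse{false, 0, "port must be 1 to 5 digits"};
    }
    int port = 0;
    for (char ch : req.text) {
        if (ch < '0' || ch > '9') {
            return ParseResponse{false, 0, "port must contain only digits"};
        }
        port = port * 10 + (ch - '0');
    }
    if (port > 65535) {
        return ParseResponse{false, 0, "port must be at most 65535"};
    }
    return ParseResponse{true, port, ""};
}

// For callers that have already checked ok; anything else is a programming error.
int require_port(const ParseResponse& r) {
    assert(r.ok);
    return r.port;
}
```

The caller decides when to look at `ok` and `error`, so the happy path stays free of try/catch blocks and the error details are not lost along the way.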

Also, more generally, I feel like unit tests are a much better place to store knowledge than comments.

Other than that, I agree completely!
Perl 6 as a Cautionary Tale ... by joe_n_bloe · 2008-03-23 09:30 · Score: 4, Interesting

I'd like to focus on the author's comments about rewriting vs. refactoring. From July 25, 2000:
Last Monday, nobody knew that anything unusual was about to happen. On Tuesday, the Perl 6 project started. On Wednesday, Larry announced it at his "State of the Onion" address at the Perl conference.

It's one thing to decide to rewrite rather than refactor a product that is losing market share because it is not performing as well as its competitors. (E.g. Netscape.) It's another thing to decide to rewrite (and redesign) rather than refactor a wildly successful and popular product because its continued development has become difficult. Just shy of eight years later, Perl 5 is still creaking along nicely, and Perl 6 (White Elephant Service Pack) is still under design as much as development.

Is Perl 5 so hard to refactor that a determined effort couldn't have made progress, or been completed twice over, in 8 years? Along the way, a lot of the cruft and inelegance in the language could have been removed, and more elegant features inserted.

It happens over and over again - developers, even experienced ones, can't see the impracticality of what they're getting into, and can't see that they're doing work that isn't needed.
Another take at 50,000 ft. code visualization... by mAx7 · 2008-03-23 21:09 · Score: 3, Interesting

In the Complexity Map, a slightly similar approach, a treemap is used to visualize the code's namespace hierarchy in a 2D landscape. Results from code metric tools are laid out in the treemap, either for individual metrics (e.g. cyclomatic complexity) or for aggregated metrics (anything that influences team productivity; e.g. errors that are not logged). Due to the Prefuse-based seamless zooming, combined with drill-down functionality, it's really easy to visualize and investigate hotspots in extremely large codebases.

The website contains some more background and a nice interactive demo. If you have the patience to wait for the applet to load, I guarantee you'll like it.

Disclaimer: I am the author of this tool. The website mentions commercial interest, but to be honest: there's hardly any. I've found that the concept is just too difficult to sell over the web, so I'll probably open source it soon.