Refactoring: Improving the Design of Existing Code

← Back to Stories (view on slashdot.org)

Refactoring: Improving the Design of Existing Code

Posted by Hemos on Thursday September 23, 1999 @01:55AM from the solid-advice-for-programmers dept.

SEGV has returned and is continuing his excellent set of reviews. This time around, we're looking at Martin Fowler's (with Kent Beck, John Brant, William Opdyke, and Don Roberts) Refactoring: Improving the Design of Existing Code. Click below for more details. Refactoring: Improving the Design of Existing Code author Martin Fowler with Kent Beck, John Brant, William Opdyke, pages 431 publisher Addison-Wesley rating 9/10 reviewer SEGV ISBN summary Just what the working programmer ordered: a catalogue of practical refactorings with solid advice on when and how to apply them.

Overview

This book could very well do for refactoring what the "Gang of Four" book did for design patterns. In fact, with the number of contributing authors, this might well become known as the "Gang of Five" book. (They contributed content to chapters 3 and 12 through 15.)

Organization

Refactoring leaps in feet first with an extended example. I found this to be a surprisingly effective opener: it didn't overwhelm me, and left me hungry for more. The first chapter follows a sample program through several incremental refactorings, and the reader gets the idea via osmosis.

To illustrate the technique of refactoring, the first chapter presents the original code on the left page, and the resulting code on the right, with changes in bold. This presentation, coupled with explanatory text, makes it easy to see what's going on and focus on what's happening. It's as if you're looking over the author's shoulder as he edits, compiles, and tests code in his development environment.

What is Refactoring?

Now that you've done a refactoring, you might be curious to know more about what refactoring is. The next few chapters provide the relevant background.

Refactoring is what the book's subtitle suggests: changing code in in ways that preserve behaviour, but improve the way that behaviour is generated. This could be as trivial as renaming a method, or as tricky as separating domain and presentation classes.

Why go through this trouble? In the end, the code is different but it acts the same; there has been no new functionality added. Why? You do this to place yourself in a better position to add new functionality to the software. If you don't, you eventually end up with spaghetti code that is unmaintainable and will not support new functionality at all.

I think anyone who has worked on real code can appreciate the need for refactoring. In fact, most good programmers already do it, although perhaps only on a subconscious level. What this book aims to do is to raise that ad-hoc activity to a higher level of applied technique. Just as there are principles and practices in GUI design (as opposed to merely throwing widgets together randomly), there are principles and practices in refactoring activity: this book catalogues them.

Catalogue

Sandwiched between introductory and summary chapters is the meat of the book: a catalogue of over seventy refactorings. This catalogue follows in the footsteps of the highly successful Design Patterns format: Pattern Name and Classification, Intent, Also Known As, Motivation, Applicability, Structure, Participants, Collaborations, Implementation, Sample Code, Known Uses, and Related Patterns. Since the individual refactorings are less complex than patterns, this catalogue uses the format: Name, Summary, Motivation, Mechanics, and Examples.

The idea is the same. The name and summary provide a definitive vocabulary and a reference-card example. The motivation explains the relevance of the refactoring. The mechanics cover the step-by-step details of how the refactoring is executed. Then a series of examples demonstrate the variations.

Applicability

I like the catalogue. Although some refactorings seem deceptively trivial, it is useful to have them laid out in step-by-step detail. You never know when you will make a mistake, and when you absolutely positively must fix a bug or add a feature by the next day, and need to refactor to do it, slow and steady wins the race.

Further, other refactorings are not so trivial and familiar, and it is certainly useful to have their traps and pitfalls exposed. Frequently, they rely on the smaller refactorings themselves.

I can see this book becoming well-used in a shop with plenty of production code.

Supplementary Material

The non-catalogue chapters are informative as well. I especially appreciate the metaphor of bad smells in the code: the "if it stinks, change it" philosophy is the perfect counter-point to the oft-cited "if it ain't broke, don't fix it" mentality.

The chapter on refactoring tools discusses the possibility of automating much of the mechanical work of refactoring. Although there is a Refactoring Browser for Smalltalk, I suspect that Java and C++ versions are a little ways off. I'd wager that, as with the UML, tool support will lag industry practice for some time.

Style

As always, the author's writing style is down-to-earth and easy to read. Martin tells you straight up what he's found useful and what he hasn't. He tells you where he's made mistakes, and where the risk is less pronounced.

I like the way he goes through an example, then goes through it again under different conditions, thereby revealing the many-splendoured variations. Frequently he continues examples that were left off from other refactorings.

Plenty of further reading is suggested; I always like that.

Flaws

The book has a Java focus, and that is the language used for the examples. There is some mention of Smalltalk and C++, but not much; far less than Design Patterns, for example. Still, the book is quite understandable to anyone with object-oriented development experience.

The book references design patterns; some refactorings even apply and manipulate patterns. However, I wish there were more direct references to the Design Patterns book. That would especially help those new to both refactorings and design patterns.

There are a few minor typos (nothing major), so check the author's web site for errata and try to get a recent printing if you can.

Recommendation

It's no secret that I think this is a book whose time has come. I'm hoping it will codify my approach to refactoring, to help me be more efficient in my development.

I recommend this book as both a practical catalogue, and as a general work on the theory and practice of refactoring. I think that the refactoring community will grow much as the patterns community before it, and that we will see more published on the subject.

Until then, this book is a good start.

Purchase this at Amazon.

TABLE OF CONTENTS

Foreword
Preface
1. Refactoring, a First Example
2. Principles in Refactoring
3. Bad Smells in Code
4. Building Tests
5. Toward a Catalog of Refactorings
6. Composing Methods
7. Moving Features Between Objects
8. Organizing Data
9. Simplifying Conditional Expressions
10. Making Method Calls Simpler
11. Dealing with Generalization
12. Big Refactorings
13. Refactoring, Reuse, and Reality
14. Refactoring Tools
15. Putting It All Together
References
List of Soundbites
Index

18 of 99 comments (clear)

Min score:

Reason:

Sort:

Refactoring and Extreme Programming by Anonymous Coward · 1999-09-22 21:13 · Score: 2

I think a lot of people underestimate the importance of refactoring code. It's put to good use (as I can attest from experience) in the Extreme Programming software development methodology. (If you haven't heard of this, check it out. It seems kind of radical, but it works very well in practice if applied appropriately.)
Good book by Desdicardo · 1999-09-22 21:14 · Score: 3

I've owned this book for a couple months now and I feel it was definitely worth buying. The section on self-testing code was quite useful, even if it was short. My one complaint is that the book does not address how to apply refactoring in an environment that closely tracks SPR's. On a project where there is formal witness testing you usually try to keep SPR's small with limited impact. This is exactly the opposite of how refactoring works... i.e. redesign the whole thing if it is the Right Thing to Do. While self-testing code helps, having to pay for a complete formal regression test for each SPR would get expensive. Other than that, however, this is an excellant book. I would be very happy if it attracted a following as large as the Design Patterns book.
Finally, Software Editing Arrives... by Anonymous Coward · 1999-09-22 21:30 · Score: 3

The more I write code, the more I realize that it is like any other kind of writing. And after years of looking at garbage spagheti code, I have come to the conclusion that the best way to raise the level of coding is for experienced and talented programmers to review the code of others, making revisions if necessary. Many programmers would no doubt scream in protest, and this too would be a good thing: in my experience the worst programmers are also the ones with the most vanity (especially in regards to their code).
1. Re:Finally, Software Editing Arrives... by Anonymous Coward · 1999-09-22 23:06 · Score: 2
  
  You need the right environment for reviews to work:
  1) social: if programmers feel the need to protect their code from their coworkers, you do not have a team development. Your whole application, being put together by modules from each programmer, will only be as good as your weakest team member.
  2) testability: if you cannot easily test (parts of) your application code, refactoring will not work. No one, other than the original designer, will dare to do any changes. Especially if it is spaghetti.
  3) concurrent version control: using something like cvs, clearcase or such is vital. For doing refactoring you need your own sandbox to be able to play in.
  If you have all three items working in your project, you will get, in my experience, the best results. Everybody can learn from everyone else.
  The worst experience was in a department where everyone worked isolated. Being new there, I got a piece of code to extend with some minor function. Peeking into the main header file I found:
  #define TRUE (-10)
  #define FALSE (-11)
  The guy who had written this had left the company. Showing this piece of marvel around and asking, if this was common practise, I was told not to do such thing. Someone could feel embarrassed by it. "Well, someone should", I thought, went back to my desk and refactored heavily...
2. Re:Finally, Software Editing Arrives... by flanker · 1999-09-23 00:56 · Score: 2
  
  No way. Without some sort of accountability, whether it be pair programming or code reviews or whatever, dirty little secrets will become pervasive in the code base and eventually pull the works down. In addition, the best way for new members to get up to speed as quick as possible is to have design and code reviews. It takes time from senior developers, but pretty soon the people getting reviewed are doing reviews on others. Development done in isolation is doomed to fail.
  
  --
  Left shift 1 for e-mail...
3. Re:Finally, Software Editing Arrives... by gid-foo · 1999-09-23 01:24 · Score: 3
  
  Hear hear, this is the big problem with code reviews, in my experience. They tend to either be a rubber stamp or a tedious examination of religious principles (i like 2 tab indents, not 4, or your open bracket should be on the next line not following your function declaration, etc). In a best case scenario you actually have engineers reading over your code and looking for problems or bogus assumptions. Too often most "engineers" are mediocre and uninterested in anything but widening their cubicle-monkey asses. Egoless development doesn't just apply to the individual writing the code but to all coders in the shop. On another note, in an Intel style environment where the whole goal is to make yourself appear superior to your colleagues by putting down their ideas and abilities (you're manager won't hire anyone potentially better than them because they are in direct competition with their own engineers) code reviews are merely political tools. In other words, an excellent chance to sink the hatchet into your cubicle mate's back.
No amount of programming methology... by GoofyBoy · 1999-09-22 21:36 · Score: 2

...can replace good, solid comments. Comments lines are soooo understated in schools, but are sooo conforting in the real world.

At University of Toronto, examples in first year had 2 line of comments for each line of code. I try to stick with that ratio at work. (I said TRY :( )

--
The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
1. Re:No amount of programming methology... by Simes · 1999-09-22 21:44 · Score: 2
  
  There is a school of thought which contends that if your code requires that many comments, it hasn't been very well-written. Good code should be almost self-documenting - an excess of comments can swamp the code and make it just as difficult to understand, if not more so, than the same code with no comments at all.
  
  Additionally, if you're spending twice as much time writing comments as you are writing code, then only one third of your time is being immediately productive. I appreciate the long-term value of well-documented code, but also tend to believe that brevity is the soul of clarity. And with that in mind, I'll shut up now.
  
  Simes.
  
  --
  
  --
  Don't imitate. Enervate.
2. Re:No amount of programming methology... by Mr.+Slippery · 1999-09-22 23:06 · Score: 3
  
  Good structure and good variable names help you understand what is going on, but don't help explain why.
  Comments the just restate what the code says, like
  foocnt++; /* increment foocnt */
  are absolutely useless, and those who put them in code need to be taken out and beaten. But comments like
  foocnt++; /* yes, this situation counts as a foo */
  (ok, a bit of a contrived example) are more useful. Function and module headers that explain design and interface are very useful, and should be required on all sizeable projects, IMHO.
  Inside the code itself, perhaps the best way to comment is to write down what a section of code is going to do before you write it. Thus, you write the comment
  /* now we have to mung the frobnitz for each element of the barbaz array */
  before you write the code that loops over the array . (Assuming that the code is more complex than for (i=0;i) Then, when you're going back over the code, and you see something that makes you pause for a second - or even a quarter of a second - and say "why are we doing this?", comment it. Trust me, the guy who inherits your code will love you for it.
  (I have a calligraphic button that says "Code as if whoever maintains your code is a violent psychopath who knows where you live." Great advice. Anyone know the original source?)
  Anyway, I'm glad that I know have a term for refactoring. I've been doing it for years, but it was sometimes difficult to explain to management what had to be done and why. I shall add this tome to my purchase queue.
  
  --
  Tom Swiss | the infamous tms | my blog
  You cannot wash away blood with blood
3. Re:No amount of programming methology... by A+Big+Gnu+Thrush · 1999-09-23 02:21 · Score: 2
  
  "A Fortran programmer can write Fortran in any language."
  
  Not a complement.
comments should be about *WHY*..not *WHAT* by tuffy · 1999-09-22 21:59 · Score: 3

I don't spend a lot of time on comments the first time around writing things, but invariably I comment the second time around as an eternal reminder why I did things the way I did. This holds especially true for the low-level bits of logic I try to abstract away first and hardly look at until the boss calls for some tweaks.
Too many comments can be as bad as too few, and trying to get the right mix is somewhat of an art that I still haven't quite mastered. But I think using them as reminders has come in very handy.

--
Ita erat quando hic adveni.
Re:SPR? by Desdicardo · 1999-09-22 22:14 · Score: 2

An SPR is a Software Problem Report. When a bug is found or an enhancement is requested an SPR is created to track the changes to the code base. The term is often used interchangebly with Software Change Report (SCR) and bug report. Check out GNATS. Its the GNU SPR-tracking tool.
But refactoring applies to small programs too by jflynn · 1999-09-22 23:03 · Score: 2

I often write small assembly programs, where correctness, then time optimization are the critical design goals. I have to agree with your statement on over architecting small one-off systems, though careful design is always rewarded.

I'd also agree architecture is of critical importance in large multi-programmer projects, especially any that must be expanded and grow over time. Many open source projects qualify nicely of course.

But refactoring really rings a bell, even with small assembly programs. Typically I develop a simulation in C, test it, translate to assembly, verify equivalence of output, then refactor it until it's time optimal. Don't know if this book would suggest methods useful to me, but rewriting code while preserving its function is something I do a lot of.
Analysis Paralysis by kuro5hin · 1999-09-23 00:02 · Score: 3

In my experience, the times when 'well' architectured systems suck is when some manager with experience in, say, Visible Analyst decides it's up to him to engineer the whole system from the get-go, unencumbered by any knowlege of the language it'll be written in, or the strengths and limitations of that language.
The most important thing about all of this is that software development goes in cycles. First you make it work, then you make it right, then you make it fast. Leaving out any of these steps is very bad.
Another very bad thing is when you have the whole system planned out in excruciating detail before you write line one of code. Inevitably, one of your assumptions will turn out to be totally unworkable, and if it's already set in stone, that will probably break everything else. Generally you have to sketch the broad strokes, fill in the major code, find out what works and what doesn't, throw away what you've done so far, and start for real. That's just the way it is, and if you don't plan to throw away your first try, you'll just end up being overbudget and late when you have to throw it away anyway.

----
We all take pink lemonade for granted.

--
There is no K5 cabal.
I am not the real rusty.
Re:C++ and elegance by cjeris · 1999-09-23 01:31 · Score: 2

Semantics??? Forgive me, but I think you missed my point totally.
C++ doesn't HAVE a semantics. C++ has a tangled, elephantine heap of ambiguities. ML has a semantics (Milner et al, The definition of Standard ML, second edition).
As for syntax ... I read a story once about someone (I forget who) who was trying to write a yacc grammar for C++, and every time he ran across an ambiguous case and ran it through cfront (the defining implementation at the time) to check its parsing, cfront dumped core. But that doesn't even matter; any language could have its syntax reduced to Lisp's and it would be fine with me.
Type checking? After working in a language with a really powerful static type system (such as ML or Haskell), trying to express concepts as types in C++ feels like moving a sand dune with tweezers. Parametric polymorphism and algebraic sum types are just the beginning of what I miss.
In short, individual language features (casts, overloading, whatever -- see Haskell to understand what overloading should be about) aren't what bother me. What bothers me is the whole philosophy of clumsy thinking at too low a level of abstraction. It astonishes me that people can and do write large programs in C++ and Java. I can respect that after a fashion, but just because the wall is bloody from everyone else banging their head against it doesn't mean I should do the same.
I apologize for the strong language in this post, but I feel very strongly that the use of insufficiently abstract languages (principally C++ and Java these days) in production software is a major reason why said software is so frequently so bad; and that there is no excuse for the failure to use more advanced language technology to improve our design of and reasoning about programs.

--
Constructive logic destructs my brain.
Good idea, but what about politics... by randomthought · 1999-09-23 01:40 · Score: 2

I think refactoring is a Good Thing. The problem, though, is with the politics of implementing it in an organization. Since the benefits are long term, it can be difficult to convince people to devote resources to it. Most places I've worked at, programmers who quickly churn out massive amounts of barely functional, but glitzy, prototype code are regarded by management types as geniuses, while those who argue for reengineering code are considered obstacles.
Perhaps the book addresses this (I haven't read it). Anyone actually work anywhere where management signed on to refactoring?
D'Oh! by A+Big+Gnu+Thrush · 1999-09-23 02:26 · Score: 2

Or a compliment for that matter.
Re: C++ and elegance by Chalst · 1999-09-23 05:53 · Score: 2

If you are interested in looking into Haskell, my main advice is to be prepared to spend time learning it: if you have never used a lazy functional programming language it takes time to learn how to use these features, and the way Haskell uses state (indirectly via `monads') is tricky to grasp.

the time is worth it in my opinion; laziness provides a powerful heuristic for attacking difficult optimisation problems efficiently: an application of haskell is in providing effiecnt dynamic prgramming solutions to NP hard optimisation problems, and then step-by-step transforming these into industrial strength `C' code. This strategy has produced some of the best solutions to the problem, because laziness captures a fruitful intuition about how to minimise resources in such problems.

If you are not enthusiastic about putting the time in, it is worth having a look at Objective CAML (`ocaml'), a language that combines an excellent marriage of objective and functional programming with one of the best development suites in functional programming. References to both can be found at the FAQ for comp.lang.functional.

Of course both languages lack `prevalence', though that may change, since Simon Peyton-Jones, one of the chief architects of Haskell, has taken a post at Microsoft...