Refactoring: Improving the Design of Existing Code
Overview
This book could very well do for refactoring what the "Gang of Four" book did for design patterns. In fact, with the number of contributing authors, this might well become known as the "Gang of Five" book. (They contributed content to chapters 3 and 12 through 15.)
Organization
Refactoring leaps in feet first with an extended example. I found this to be a surprisingly effective opener: it didn't overwhelm me, and left me hungry for more. The first chapter follows a sample program through several incremental refactorings, and the reader gets the idea via osmosis.
To illustrate the technique of refactoring, the first chapter presents the original code on the left page, and the resulting code on the right, with changes in bold. This presentation, coupled with explanatory text, makes it easy to see what's going on and focus on what's happening. It's as if you're looking over the author's shoulder as he edits, compiles, and tests code in his development environment.
What is Refactoring?
Now that you've done a refactoring, you might be curious to know more about what refactoring is. The next few chapters provide the relevant background.
Refactoring is what the book's subtitle suggests: changing code in ways that preserve behaviour, but improve the way that behaviour is generated. This could be as trivial as renaming a method, or as tricky as separating domain and presentation classes.
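To make the definition concrete, here is a minimal sketch of one small refactoring, pulling a calculation out into its own function. It is written in Python rather than the book's Java, and the names (statement_before, statement_after, amount_for) and the pricing rule are invented for illustration; none of it is taken from the book. The only point is that the code changes shape while its output stays the same.

```python
def statement_before(rentals):
    """Original version: the pricing rule is tangled into the loop."""
    total = 0.0
    lines = []
    for title, days in rentals:
        # Hypothetical pricing rule: 2.0 covers two days, 1.5 per extra day.
        amount = 2.0
        if days > 2:
            amount += (days - 2) * 1.5
        total += amount
        lines.append(f"{title}: {amount:.2f}")
    lines.append(f"Total: {total:.2f}")
    return "\n".join(lines)


def amount_for(days):
    """Extracted function: the pricing rule now has a name and one home."""
    amount = 2.0
    if days > 2:
        amount += (days - 2) * 1.5
    return amount


def statement_after(rentals):
    """Refactored version: same output, clearer structure."""
    lines = [f"{title}: {amount_for(days):.2f}" for title, days in rentals]
    total = sum(amount_for(days) for _, days in rentals)
    lines.append(f"Total: {total:.2f}")
    return "\n".join(lines)


# The refactoring is only valid if behaviour is unchanged.
rentals = [("Alpha", 1), ("Beta", 4)]
assert statement_before(rentals) == statement_after(rentals)
```

Once the rule lives in amount_for, adding a new pricing category means touching one small function instead of hunting through the loop, which is the "better position to add new functionality" argument in miniature.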
Why go through this trouble? In the end, the code is different but it acts the same; there has been no new functionality added. Why? You do this to place yourself in a better position to add new functionality to the software. If you don't, you eventually end up with spaghetti code that is unmaintainable and will not support new functionality at all.
I think anyone who has worked on real code can appreciate the need for refactoring. In fact, most good programmers already do it, although perhaps only on a subconscious level. What this book aims to do is to raise that ad-hoc activity to a higher level of applied technique. Just as there are principles and practices in GUI design (as opposed to merely throwing widgets together randomly), there are principles and practices in refactoring activity: this book catalogues them.
Catalogue
Sandwiched between introductory and summary chapters is the meat of the book: a catalogue of over seventy refactorings. This catalogue follows in the footsteps of the highly successful Design Patterns format: Pattern Name and Classification, Intent, Also Known As, Motivation, Applicability, Structure, Participants, Collaborations, Implementation, Sample Code, Known Uses, and Related Patterns. Since the individual refactorings are less complex than patterns, this catalogue uses the format: Name, Summary, Motivation, Mechanics, and Examples.
The idea is the same. The name and summary provide a definitive vocabulary and a reference-card example. The motivation explains the relevance of the refactoring. The mechanics cover the step-by-step details of how the refactoring is executed. Then a series of examples demonstrate the variations.
Applicability
I like the catalogue. Although some refactorings seem deceptively trivial, it is useful to have them laid out in step-by-step detail. You never know when you will make a mistake, and when you absolutely positively must fix a bug or add a feature by the next day, and need to refactor to do it, slow and steady wins the race.
Further, other refactorings are not so trivial and familiar, and it is certainly useful to have their traps and pitfalls exposed. Frequently, they rely on the smaller refactorings themselves.
I can see this book becoming well-used in a shop with plenty of production code.
Supplementary Material
The non-catalogue chapters are informative as well. I especially appreciate the metaphor of bad smells in the code: the "if it stinks, change it" philosophy is the perfect counter-point to the oft-cited "if it ain't broke, don't fix it" mentality.
The chapter on refactoring tools discusses the possibility of automating much of the mechanical work of refactoring. Although there is a Refactoring Browser for Smalltalk, I suspect that Java and C++ versions are a little ways off. I'd wager that, as with the UML, tool support will lag industry practice for some time.
Style
As always, the author's writing style is down-to-earth and easy to read. Martin tells you straight up what he's found useful and what he hasn't. He tells you where he's made mistakes, and where the risk is less pronounced.
I like the way he goes through an example, then goes through it again under different conditions, thereby revealing the many-splendoured variations. Frequently he continues examples that were left off from other refactorings.
Plenty of further reading is suggested; I always like that.
Flaws
The book has a Java focus, and that is the language used for the examples. There is some mention of Smalltalk and C++, but not much; far less than Design Patterns, for example. Still, the book is quite understandable to anyone with object-oriented development experience.
The book references design patterns; some refactorings even apply and manipulate patterns. However, I wish there were more direct references to the Design Patterns book. That would especially help those new to both refactorings and design patterns.
There are a few minor typos (nothing major), so check the author's web site for errata and try to get a recent printing if you can.
Recommendation
It's no secret that I think this is a book whose time has come. I'm hoping it will codify my approach to refactoring, to help me be more efficient in my development.
I recommend this book as both a practical catalogue, and as a general work on the theory and practice of refactoring. I think that the refactoring community will grow much as the patterns community before it, and that we will see more published on the subject.
Until then, this book is a good start.
TABLE OF CONTENTS
Foreword
Preface
1. Refactoring, a First Example
2. Principles in Refactoring
3. Bad Smells in Code
4. Building Tests
5. Toward a Catalog of Refactorings
6. Composing Methods
7. Moving Features Between Objects
8. Organizing Data
9. Simplifying Conditional Expressions
10. Making Method Calls Simpler
11. Dealing with Generalization
12. Big Refactorings
13. Refactoring, Reuse, and Reality
14. Refactoring Tools
15. Putting It All Together
References
List of Soundbites
Index
I think a lot of people underestimate the importance of refactoring code. It's put to good use (as I can attest from experience) in the Extreme Programming software development methodology. (If you haven't heard of this, check it out. It seems kind of radical, but it works very well in practice if applied appropriately.)
I've owned this book for a couple of months now and I feel it was definitely worth buying. The section on self-testing code was quite useful, even if it was short. My one complaint is that the book does not address how to apply refactoring in an environment that closely tracks SPRs. On a project where there is formal witness testing you usually try to keep SPRs small with limited impact. This is exactly the opposite of how refactoring works... i.e. redesign the whole thing if it is the Right Thing to Do. While self-testing code helps, having to pay for a complete formal regression test for each SPR would get expensive. Other than that, however, this is an excellent book. I would be very happy if it attracted a following as large as the Design Patterns book.
The more I write code, the more I realize that it is like any other kind of writing. And after years of looking at garbage spaghetti code, I have come to the conclusion that the best way to raise the level of coding is for experienced and talented programmers to review the code of others, making revisions if necessary. Many programmers would no doubt scream in protest, and this too would be a good thing: in my experience the worst programmers are also the ones with the most vanity (especially in regards to their code).
...can replace good, solid comments. Comment lines are soooo understated in schools, but are sooo comforting in the real world.
At University of Toronto, examples in first year had 2 lines of comments for each line of code. I try to stick with that ratio at work. (I said TRY.)
The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
Too many comments can be as bad as too few, and trying to get the right mix is somewhat of an art that I still haven't quite mastered. But I think using them as reminders has come in very handy.
Ita erat quando hic adveni.
An SPR is a Software Problem Report. When a bug is found or an enhancement is requested, an SPR is created to track the changes to the code base. The term is often used interchangeably with Software Change Report (SCR) and bug report. Check out GNATS. It's the GNU SPR-tracking tool.
I often write small assembly programs, where correctness, then time optimization are the critical design goals. I have to agree with your statement on over architecting small one-off systems, though careful design is always rewarded.
I'd also agree architecture is of critical importance in large multi-programmer projects, especially any that must be expanded and grow over time. Many open source projects qualify nicely of course.
But refactoring really rings a bell, even with small assembly programs. Typically I develop a simulation in C, test it, translate to assembly, verify equivalence of output, then refactor it until it's time optimal. Don't know if this book would suggest methods useful to me, but rewriting code while preserving its function is something I do a lot of.
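The "verify equivalence of output" step in that workflow can be sketched roughly as follows. This is Python standing in for the C and assembly pair, and the checksum functions and the outputs_match helper are hypothetical names made up for the illustration, not anything from the book or from my own code. A clear reference version and a rewritten version are run on the same randomly generated inputs and compared.

```python
import random


def checksum_reference(data):
    """Straightforward version, written for clarity (the 'simulation')."""
    total = 0
    for byte in data:
        total = (total + byte) % 256
    return total


def checksum_optimized(data):
    """Rewritten version: a single mask at the end instead of a modulo per byte."""
    return sum(data) & 0xFF


def outputs_match(reference, candidate, trials=200, seed=0):
    """Run both versions on the same random inputs and compare the results."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    for _ in range(trials):
        length = rng.randrange(0, 64)
        data = bytes(rng.randrange(256) for _ in range(length))
        if reference(data) != candidate(data):
            return False
    return True


assert outputs_match(checksum_reference, checksum_optimized)
```

Random comparison does not prove equivalence, it only catches differences on the inputs tried, so it works best alongside a few hand-picked edge cases such as empty input.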
The most important thing about all of this is that software development goes in cycles. First you make it work, then you make it right, then you make it fast. Leaving out any of these steps is very bad.
Another very bad thing is when you have the whole system planned out in excruciating detail before you write line one of code. Inevitably, one of your assumptions will turn out to be totally unworkable, and if it's already set in stone, that will probably break everything else. Generally you have to sketch the broad strokes, fill in the major code, find out what works and what doesn't, throw away what you've done so far, and start for real. That's just the way it is, and if you don't plan to throw away your first try, you'll just end up being overbudget and late when you have to throw it away anyway.
----
We all take pink lemonade for granted.
There is no K5 cabal.
I am not the real rusty.
C++ doesn't HAVE a semantics. C++ has a tangled, elephantine heap of ambiguities. ML has a semantics (Milner et al, The definition of Standard ML, second edition).
As for syntax ... I read a story once about someone (I forget who) who was trying to write a yacc grammar for C++, and every time he ran across an ambiguous case and ran it through cfront (the defining implementation at the time) to check its parsing, cfront dumped core. But that doesn't even matter; any language could have its syntax reduced to Lisp's and it would be fine with me.
Type checking? After working in a language with a really powerful static type system (such as ML or Haskell), trying to express concepts as types in C++ feels like moving a sand dune with tweezers. Parametric polymorphism and algebraic sum types are just the beginning of what I miss.
In short, individual language features (casts, overloading, whatever -- see Haskell to understand what overloading should be about) aren't what bother me. What bothers me is the whole philosophy of clumsy thinking at too low a level of abstraction. It astonishes me that people can and do write large programs in C++ and Java. I can respect that after a fashion, but just because the wall is bloody from everyone else banging their head against it doesn't mean I should do the same.
I apologize for the strong language in this post, but I feel very strongly that the use of insufficiently abstract languages (principally C++ and Java these days) in production software is a major reason why said software is so frequently so bad; and that there is no excuse for the failure to use more advanced language technology to improve our design of and reasoning about programs.
Constructive logic destructs my brain.
Perhaps the book addresses this (I haven't read it). Anyone actually work anywhere where management signed on to refactoring?
Or a compliment for that matter.
If you are interested in looking into Haskell, my main advice is to be prepared to spend time learning it: if you have never used a lazy functional programming language it takes time to learn how to use these features, and the way Haskell uses state (indirectly via `monads') is tricky to grasp.
The time is worth it in my opinion; laziness provides a powerful heuristic for attacking difficult optimisation problems efficiently: one application of Haskell is in providing efficient dynamic programming solutions to NP-hard optimisation problems, and then step-by-step transforming these into industrial-strength `C' code. This strategy has produced some of the best solutions to such problems, because laziness captures a fruitful intuition about how to minimise resources in such problems.
If you are not enthusiastic about putting the time in, it is worth having a look at Objective CAML (`ocaml'), a language that offers an excellent marriage of object-oriented and functional programming with one of the best development suites in functional programming. References to both can be found at the FAQ for comp.lang.functional.
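A rough way to see the intuition, for readers who have not used a lazy language: the sketch below is Python, which is strict, so memoised recursion is only an approximation of Haskell's demand-driven evaluation, and the 0/1 knapsack problem is simply a convenient NP-hard example, not the application referred to above. The function name knapsack and its return shape are assumptions made for the illustration. Only the subproblems the answer actually depends on are ever computed, rather than the whole table.

```python
from functools import lru_cache


def knapsack(weights, values, capacity):
    """Return (best value, number of subproblems actually evaluated)."""

    @lru_cache(maxsize=None)
    def best(i, remaining):
        # Best value using items i.. with `remaining` capacity left.
        if i == len(weights):
            return 0
        skip = best(i + 1, remaining)
        if weights[i] > remaining:
            return skip
        take = values[i] + best(i + 1, remaining - weights[i])
        return max(skip, take)

    result = best(0, capacity)
    # currsize counts the distinct (i, remaining) states that were demanded.
    return result, best.cache_info().currsize


value, evaluated = knapsack((3, 4, 5), (4, 5, 6), 7)
full_table = (3 + 1) * (7 + 1)  # states a bottom-up table would fill
assert value == 9
assert evaluated < full_table
```

In Haskell the same effect falls out of defining the table as a lazy data structure; here the cache plays that role by hand.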
Of course both languages lack `prevalence', though that may change, since Simon Peyton-Jones, one of the chief architects of Haskell, has taken a post at Microsoft...