Domain: refactoring.com
Stories and comments across the archive that link to refactoring.com.
Comments · 57
-
Instead of click-bait, here are some real sources...
- Code Complete 2 was the most influential book I read on how to keep it simple. http://cc2e.com/
- The Pragmatic Programmer taught me how to stick it to my manager and push for more time/testing: https://pragprog.com/the-pragm...
- Refactoring taught me how to clean up the code written by developers who didn't read these books. http://www.refactoring.com/
- Clean Code brings it all together. https://sites.google.com/site/...
These books provide much better information on the topic and deserve a place on your shelf alongside the GoF's Design Patterns.
-
Your project has a debt problem
Right now I am working on a project with a 4,000 line function
Even when coding in assembly language for an 8-bit microprocessor, I'd probably extract methods an order of magnitude before 4000 lines.
related classes scattered across multiple projects so they can't compile easily
Create a new project whose purpose is to provide classes to these projects.
If your boss complains about not having time=money for refactoring, try first seeing whether your boss has heard of Dave Ramsey and his Total Money Makeover. If you're not familiar, Mr. Ramsey is a famous proponent of sacrifice to pay down personal debt. Then explain to your boss that your codebase is likewise deep in debt, and dealing with messy code like that is like having to spend a lot of your revenue on paying interest on that debt. Refactoring to pay the principal on your project's technical debt may delay getting the next feature out, but it might help you get the next six features out in the same time that you otherwise would have produced only four.
-
Martin Fowler's Refactoring
No, in spite of what some jackasses say, it isn't just rewriting for its own sake. It is improving the structure in standardized ways so that you can add your new features much more safely. In interviews, I prefer people can name some standard refactorings before I ask them the typical questions about design patterns.
-
Re:Server performance is important, but...
The review of this book doesn't make it obvious to me. Is this book really about refactoring or is it about query tuning?
IMHO, the former doesn't really need to be db vendor specific. Refactoring should encompass all code and not just the SQL. Looking for ways to refactor from an ORM perspective, such as lazy evaluation and strategic caching, makes sense.
Query tuning is an important topic, and there are already plenty of resources devoted to it.
-
Re:BASIC is TERRIBLE!
You didn't bother to explain how learning basic can cripple someone's mind. That's what I was really interested in.
Anyhow, I address your specific objections below. None of which apply to modern versions of the language.
1. No long variable names.
The C64 allowed variables of any length, but only considered the first two characters to be significant (the rest were ignored). GW-Basic (1983) allowed 40 character variable names (where all the characters were significant).
2. No local variables.
That's not entirely true.
For example, the following program:
DEF FNSQR(X) = X*X
LET X=5
PRINT FNSQR(10), X
Outputs: 100 5
Of course, that's just for unstructured versions of basic. Every structured basic I've seen has proper scope rules. (FYI, structured basic has been around longer than the personal computer.)
3. No Recursion.
A valid objection, though modern versions of basic don't suffer this limitation. It should also be noted that it's trivial to convert a recursive function to one using iteration. (Example) It's often a good idea.
4. GOSUB instead of true procedure calls.
Break out the old assembly book and check out CALL/RET! Seriously, it's obviously not an issue in any structured basic (which necessarily has "true" procedure calls).
5. No structures.
It has all the coolest control structures, if that's what you mean.
Do you mean records? (example in c: struct x {int a; char b[5]; float c;}; )
Explain yourself!
It still doesn't explain why basic is a poor choice for a first language. It was the first language of millions of first-rate programmers (that is to say, their minds were not permanently injured!). What about basic makes it such a poor choice? Is it the name that bugs you?
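As a minimal sketch of the recursion-to-iteration point under objection 3, here is a factorial written both ways. Python stands in for BASIC here, and both function names are made up for illustration; this is one simple case, not a general recipe for every recursive function.

```python
def factorial_recursive(n: int) -> int:
    """Recursive definition: n! = n * (n - 1)!, with 0! = 1! = 1."""
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)


def factorial_iterative(n: int) -> int:
    """Same result computed with a loop and an accumulator, no call stack needed."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result
```

The loop version needs only a counter and an accumulator, which is why it also works in a dialect without recursion.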
-
Re:A few tips
Good points. To add a few in no particular order,
- Try to keep up a good ratio of tests vs code. Tests make your code more robust and make it easier to change it without breaking anything - but most of all they force you to think about what you're trying to accomplish and they can serve as a spec for your code at the same time.
- Refactor often. This is a lot easier if you followed the previous point.
- Run a lint-like tool on your code now and then, e.g. PMD for Java or pyLint for Python. You shouldn't treat its output as gospel but it can be quite enlightening.
-
Re:Big Visible Charts
Also, here is the obligatory link to the Martin Fowler Refactoring catalog.
-
Re:Fundamental Misunderstanding of Refactoring
-
No more fun
What I find interesting is that, except for perhaps startups or trivial projects, nowhere except for the "projects" or "software engineering" class, which everybody hated, did school teach us what it was going to be like out in the Real World(tm). Usually in single-semester CS classes, you have several "labs" or "machine problems" (depending upon where you go to school), and usually they aren't more than, say, a couple thousand lines of code for each one. And then in the projects class, they taught it like a Waterfall lifecycle, which you can think of as somewhat of one iteration through a Tornado lifecycle.
So what you're left with is not much of an idea of what is going on outside academia other than perhaps "really large programs." That is why everybody that I interviewed with coming out of school in the 90s asked if I had taken a projects class.
If I were to design a curriculum to get people ready for how things are after school, I'd make a two semester course requirement:
The first semester I would have the students go through a survey-type class where different types of methodologies were explained, along with the advantages and disadvantages of each, with an example of representative types of applications that used each method. Perhaps a telephone switch used with a Waterfall methodology. And go on from there. This would go up to whatever the latest fad was. This would also include the prerequisites for starting a project, which hopefully are common to all projects -- you would use something like the material from The Software Project Survival Guide. You'd also look at different maturity measurement methods, such as SEI CMMI levels. And then a dose of the real-world with mistakes that people make during software projects, such as excessive "tailoring" of the process, giving up the process during mid-iteration to code like mad, etc., and ways to get out of such software development mistakes.
You would also get taught concepts such as Configuration Management (with a survey of different tools, such as CVS, Subversion, ClearCase), unit- versus integration- versus system-testing and tools to perform each.
The second semester would be the actual project where you would use the appropriate methodology for the size, number of people, and time to work on the project. You would try to make it as realistic as possible, including requirements gathering with inadequate requirements, bad business contracts, interacting with QA for getting a test plan up and running, etc. Then halfway through the project you could have additional requirements added by the customer and see how to successfully manage such changes.
Another course or portion of a projects course would be doing what most of us end up doing anyway: modifying other people's code. This would also go over the different types of code modification: new features addition, optimizing code for better speed, user interface changes, etc. It would also survey different tools, such as debuggers and profilers. It would also look at the hows and whys of refactoring.
All of these are necessary to successful real-world software development, IMHO. Unless you go to an underfunded start-up ("OMG, why aren't you coding!!!"), or you work at Google.
Without this, I think a lot of people are going into software development thinking it's all fun, with this rosy picture of working on original code, and thinking that testing is what you do after it all works.
DT
-
Re:Dear god, no.
I've seen stuff like this before. Poorly architected system gets rewritten in another programming language. This is a no-fault, blameless way of throwing out the old, useless system and replacing it with something that is, hopefully, better.
If you claim that the old system is too crufty for a reasonable maintenance cycle, then someone might get their feelings hurt and strive to defend the old code. A language change is more politically acceptable because C++ can't defend itself from being bashed.
The truth will set you free, however. If the old system is bad because you have a bad architect and that architect is setting the architecture for the new system, then after great expense, you will still have a loser.
If the old system is crufty because of six years of scope drift and the current architect is good, then figure out why the old system is crufty and incrementally refactor it until the maintenance cycles become reasonable again.
-
Explanation...IANAAOP
OK, I think that it's fair to say that AOP is still very much at the cutting edge of development in the Java community.
Essentially the idea with AOP is that you increase the modularity and decrease the coupling of components in your application. An example of this would be Logging.
Now most (if not all) apps need to have logging code of one sort or another. This means that the code necessary to log to whichever system (commons-logging, log4j etc in the java world) you use tends to touch all of the classes/objects in your application.
Consider a situation where you have a large existing code base that you have refactored using well understood techniques such as extract method and extract superclass until you have a wonderfully clean and consistent application. You will find that even after this refactoring you still have dependencies throughout your code on your logging mechanism that cannot easily be refactored away.
In software engineering terminology this means that your application code is tightly coupled to whichever logging implementation it uses. This is a bad thing. It makes the application brittle - caveat, see 1 below.
Aspect Orientated Programming can help solve these sorts of problem by introducing another conceptual level into traditional OO. You can move the common logging code into one or more Aspects. These are separate from the rest of your codebase. When the application is run, the aspects are used to modify the code in your application to, for example, use your preferred logging implementation - oh, and before someone flames me saying I'm wrong, I know that with java this isn't exactly how it works, but it's a useful lie!
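A rough sketch of the idea, with the caveat that this is not real aspect weaving as AspectJ or Spring do it: a Python decorator is used here only as an approximation of an aspect. The names `logged` and `transfer` are hypothetical; `logging` and `functools.wraps` are from the standard library.

```python
import functools
import logging

logger = logging.getLogger("app")


def logged(func):
    """Hypothetical 'aspect': adds entry/exit logging around any function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("enter %s", func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether the call returned normally or raised.
            logger.info("exit %s", func.__name__)
    return wrapper


@logged
def transfer(balance: int, amount: int) -> int:
    """Business logic only: the body contains no logging calls."""
    return balance - amount
```

Swapping the logging implementation then means changing `logged` in one place, while `transfer` and its siblings stay untouched.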
Now the interesting thing about Spring is that it's one of the first widely accepted uses of AOP. I know that even a year ago AOP was mainly of interest to comp-sci academics and bleeding-edge technologists but with Spring incorporating it into the framework it has brought AOP to a lot of people's attention.
Inversion of control, otherwise known as the Hollywood Principle (don't call us, we'll call you), refers to the currently accepted/hyped practice of "telling things what to do" rather than "things inherently knowing what to do". It's obviously a lot more complex than that, and a lot has been written about the technique but that's the basic gist.
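To make "don't call us, we'll call you" concrete, here is a minimal sketch in which a tiny framework owns the control flow and calls back into code registered with it. `EventLoop`, `on`, and `dispatch` are invented names for illustration, not any real framework's API.

```python
from typing import Callable, Dict, List


class EventLoop:
    """Hypothetical mini-framework: it decides when handlers run."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[str], None]]] = {}

    def on(self, event: str, handler: Callable[[str], None]) -> None:
        """Application code registers what should happen for an event."""
        self._handlers.setdefault(event, []).append(handler)

    def dispatch(self, event: str, payload: str) -> None:
        """The framework calls the application, not the other way round."""
        for handler in self._handlers.get(event, []):
            handler(payload)


received: List[str] = []
loop = EventLoop()
loop.on("greeting", received.append)
loop.dispatch("greeting", "hello")   # handler is invoked by the loop
loop.dispatch("unknown", "ignored")  # no handler registered, nothing happens
```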
As for citing major applications that use these practices, well an awful lot of applications are being written using the Spring framework as their heavy lifter of choice. As for Inversion of Control, well-written/architected applications have been using the hollywood principle for years - it really isn't anything new.
Now, forget all of that and let's talk about Intentional Programming
:-P -
Re:Comments
Tell them to use comments in code, and be sure that they make them good comments.
IMHO this is wrong. You should tell them to refactor their code. If they do it correctly, the code will be readable without any comments. In refactoring terms, a comment is a code smell. It may seem strange not to write comments, but I know this works perfectly well since this is how we do it at work... (and our code is very readable). It also avoids out-of-date documentation, since your code is your documentation. -
API doesn't exclude refactoring
I kind of understand what you are trying to convey.
When I write a program for myself or in a team with just a few developers who know how every aspect of the solution works, it's nice to change stuff at a whim. Since the team is small, everyone monitors each cvs-update for changes and talks to each other frequently to keep abreast of what has changed. However, for projects spanning more than just a handful of people, specs quickly become essential as a communication tool.
An API is just that, a communication instrument (or protocol) which is intended to describe the exposed interfaces, their purpose, what they do when used in various ways etc. Essential information for using a component. This is information any user of the component would need to know anyway in order to use it properly.
Now, when you mentioned refactoring being sacrificed if interfaces (APIs) are published (which is what I interpreted your text to say), I completely failed to see the underlying reasoning.
Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior.
(reference). What this seems to say is that the whole point of refactoring is to keep the code from decaying while not changing the interfaces!
If you require new features / behaviors, why not simply create new function signatures and deprecate old ones? What's so difficult or bad with that approach to changes? -
Re:Requirements?
I don't have a problem with most of these development methodologies per se, but most of them seem to lack the entire concept of DATA and INFORMATION.
You should check out things like Agile Modelling and Agile Data for more information. I don't think it's core to agile methods, as not every application uses a database. But if you're big into databases, these sites can help you see how agile approaches could work in your environment.
Do these methodologies include some prep work on gathering business requirements and understanding the underlying information relationships?
It's not just prep work; it is work that should happen all the time. That's why Refactoring and Domain-Driven Design are such a big deal to people doing Extreme Programming. We strive for representational harmony across all levels, from talking with users down to the database schema. And not just in the spec, either; as we learn more about the domain and find better representations, refactoring lets us safely change the structure of the code to match. -
a better question
You actually bring up a better question. How do we deal with big pieces of steaming ****, I mean spaghetti, that get handed to us to maintain?
There are all kinds of processes and theories that if you religiously follow you can be sure to prevent a project from becoming crap. But, always in the life of a project there comes a PHB determined to turn the code bad.
I think we need a lot more attention on how to deal with code that's already in bad shape. We've got refactoring and Code Reading, but little else on the subject of improving existing code bases. As a result, most devs tasked with maintenance simply do as little as possible when modifying the code, for fear of unwanted consequences. Thus, code rot sets in. -
Re:Clear Code
Much better is an explanation over a block of 5-10 lines giving you an idea of what you are trying to achieve. Comment anything that is not clear, like if you're using bitwise shifts to multiply and divide, for example.
I used to write lots of explanations. I still do when I have to, but now it feels like a defeat to me.
Instead, I try to make the code readable enough that it doesn't need explanations. E.g., I'll take each of those 5-10 line blocks and do an Extract Method refactoring. The information that would have ended up in the comment ends up in the method name, in the parameter names, and in the return variable name.
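A small before-and-after sketch of that move, assuming a made-up order-total calculation (all function and parameter names are hypothetical). The comments in the first version become the method names in the second.

```python
def order_total_before(prices, tax_rate):
    # add up the prices of all items
    subtotal = 0
    for price in prices:
        subtotal += price
    # apply sales tax
    total = subtotal + subtotal * tax_rate
    return total


def subtotal_of(prices):
    """Extracted from the first commented block."""
    return sum(prices)


def with_sales_tax(subtotal, tax_rate):
    """Extracted from the second commented block."""
    return subtotal + subtotal * tax_rate


def order_total_after(prices, tax_rate):
    # Reads like the old comments, but cannot drift out of date.
    return with_sales_tax(subtotal_of(prices), tax_rate)
```

Both versions return the same value for the same input, which is the behavior-preserving property a unit test can pin down.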
Also, I put a lot of that energy into writing good unit tests. Comments can get out of date, but unit tests always tell the truth. -
Re:I blame the If statements
If you have an overload of these long chains of "if" statements, it sounds to me like a classic example of when to use subclasses and/or polymorphism. (See Replace Type Code with Subclasses and Replace Conditional with Polymorphism)
Turn this..
if (customerName == "cust1") {
  // customizations for customer 1
} else if (customerName == "cust2") {
  // customizations for customer 2
} // ...
Into this..
Customer *myCustomer = new Customer1(); // Customer1 overrides doCustomizations()
myCustomer->doCustomizations(); -
Re:Just quick and easy
one of the best tips from one of the best programmers I've ever known is to use the form "iii" for all incremented variables [...] Why? Because if you use English Language descriptions, "iii" should never occur when searching except for in the case of your variable
This seems dubious to me. The only time I use a variable like i is in a nice, short loop, so I have no need to search. If a loop is too long to fit conveniently on the screen, I almost always do Extract Method (or I extract a function if it's not OO code). If I can't figure out a way to do that, I give the variable some meaningful name.
Using three-letter loop variables seems like a very clever solution to the wrong problem entirely. -
Refactoring
More complete Refactoring tools for C/C++ would be nice.
-
Re:Possibly not as bad as it looks
The key thing is to figure out where the joints are. Find the interfaces, the ways different pieces talk to each other. Understanding this is usually the key to how the whole code is organized. It tells you how the authors thought about it. And it also tells you what parts can be incrementally replaced without having to throw out the whole shebang.
That's great advice -- right, once you have control over the interfaces you can do lots of things. You can put a good-interface wrapper around a clump of spaghetti code with a poor interface, then replace that whole section at your leisure. Etc. etc. Make sure you have good tests written, and you can just *drop* those 1500 lines of code that you suspect aren't doing anything -- if your tests are good and they pass, you were right. This is where a good IDE really shines as well -- it can tell you what methods are never called, what parameters are never used, and so on; just cleaning out that junk makes any code much more manageable.
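A minimal sketch of the wrapper idea, under the assumption that the "clump of spaghetti" is a single function with a cryptic signature. `legacy_calc` and `Calculator` are invented stand-ins, not code from any real project.

```python
def legacy_calc(mode, a, b, flag, out):
    """Stand-in for the spaghetti: magic numbers in, result via side effect."""
    if mode == 1:
        out.append(a + b if flag else a - b)
    else:
        out.append(a * b)


class Calculator:
    """Good-interface wrapper: callers never see legacy_calc's signature."""

    def add(self, a, b):
        return self._call(1, a, b, True)

    def subtract(self, a, b):
        return self._call(1, a, b, False)

    def multiply(self, a, b):
        return self._call(2, a, b, False)

    def _call(self, mode, a, b, flag):
        # The only place that knows about the legacy calling convention.
        out = []
        legacy_calc(mode, a, b, flag, out)
        return out[0]
```

Once every caller goes through `Calculator`, the body of `_call` can be replaced at leisure, and tests written against the wrapper will say whether the replacement is right.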
Good reading suggestion for those interested in this: Martin Fowler's Refactoring.com. All of these refactoring patterns have names that I don't remember, and there are plenty more strategies discussed there that'll make your eyes light up if you've been in this situation before. -
Re:We should do more of this
I've sat in front of my source code knowing that not only could it be made better, but that there is probably a better way to do it. Unfortunately, the reason old code stays around hobbling around the system with plaster casts around its legs and band aids covering its heads, yes more than one head because at some point I figured that it would be better to stick a brand new head on there rather than refactor the functionality out and create a brand new program.
Old code has much embedded wisdom. Lots of little bug fixes, solutions thought out, methods applied and debugged. Usually it's a really bad idea to scrap it.
If you apply proper refactoring techniques and some underlying method to allow the code to evolve, you'll find that most cases of code rot are really just code neglect.
Of course, there may be licensing, or other reasons (designed for an environment that no longer exists) why it's best to scrap a particular codebase, but as a general rule, only drop software that's actually unsuccessful in the marketplace.
If it sells, update/extend/refactor rather than rewrite. -
Re:Java
Java has nothing about it that makes it any more maintainable than any other language.
I disagree:
- it has a quite consistent API that makes use of OO concepts such as 'interfaces'. This pushes programmers to reuse the same concepts in the design of their code.
- "one source, one class", no header files mess: this simplifies editing compared to C/C++
- JavaDoc: of course most languages now have an equivalent tool. But JavaDoc is THE standard for Java so everybody uses it.
- IDEs for Java are now far more advanced than IDEs for other languages. Do you know about refactoring? This single point makes the difference for code maintenance. -
Coding ain't math, not any more
Being a mathematician won't make the switch go off that allows you to expertly use object oriented programming. Nor will it help you create a good GUI. Nor will it help you validate date formats. You need a firm grasp on the math you learned in middle school, but the need to be a mathematician has diminished in many computer science workplaces to the point that the "need" is now a simple "added bonus".
When coding was entirely procedural and focused almost entirely on crunching numbers, well, yes, math was a big deal, but the paradigm's changed greatly now. Now aptitude in pure logic [rather than a broad math bkgd, much less pure calc] is much more important in my experience. Relational database design and object oriented programming require great understanding of set theory, not calculus. I AP'ed into sophomore calculus and had two semesters (plus an audit of DiffEQ) in college, and haven't used that stuff once since entering the workplace (on my sixth year).
When I look to interview and hire new programmers to my team, for pure intellectual skills I'm looking at good coding style, properly factored (as in refactoring) coding examples, and the ability to explain, say, why an example database schema is or isn't in good third normal form. The math I've seen in my tasks is very basic, whether the product I've helped develop was a simple web-based MIS, county-wide tax system, or financial tracker for the largest non-profits.
In fact the only time it's been useful for me to understand mathematical concepts [beyond set theory] was when I thought our resident Geographic Information Systems (GIS) experts weren't considering all the ins and outs of different map projections. Even then, what I was commenting on was well outside of my job description of a database admin.
It's good to know math, all other things equal, but in today's programming workplace, the emphasis on math in CS programs is unfounded. I'll even daresay that's why so many people who weren't schooled as programmers do so well -- I know about as many programmers that have impressed me with their proverbial skillz that had a degree in the humanities or no degree at all as I do those with a CS background.
Wake up & catch up, CS programs, and teach what's useful in "the real world"! -
Not really
From Refactoring.com
What is Refactoring?
Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior. Its heart is a series of small behavior preserving transformations. Each transformation (called a 'refactoring') does little, but a sequence of transformations can produce a significant restructuring. Since each refactoring is small, it's less likely to go wrong. The system is also kept fully working after each small refactoring, reducing the chances that a system can get seriously broken during the restructuring.
...
So not really... -
Re:Are they reinventing the wheel ?
Martin Fowler says that "in many ways Eclipse is the Emacs for the 21st century."
That's pretty hearty praise. -
XP is the way to go
I find it interesting that not one of the high-score responses (I haven't read the others) to this question has mentioned XP, i.e. Extreme Programming.
XP was built on the knowledge you just mentioned: that the Q&D solution is necessary, and that if you try to follow a set procedure (or the "waterfall model" as it's called) where you have a requirements phase, design phase, implementation phase, and testing phase, you will most likely fail.
Basically, what XP is all about is acknowledging that the specification is useless, because after a couple of years when the project is finished, the end result doesn't look anything like the specification anyway. At least if you want to survive in a dynamic market. We have all seen that in the real world, the requirements change during development, and if they do, you need to go back and change the spec, and then possibly reimplement large parts of the code.
So, how do you go about solving this? Well, first of all you have to understand and believe in a couple of mantras:
- The implementation is the specification
- Refactor mercilessly
- Do the least thing that could possibly work
- Test driven development
Here is the list of development rules.
The implementation is the specification
This means that instead of writing a specification before programming begins, you let the application evolve (very similar to most Open Source projects actually) and if the requirements change during development, you change the code to adapt.
Refactor mercilessly
This is absolutely necessary in order for the previous point to work. What it means is that you should not be afraid to change the layout of working code, to make it easier to add new features. With good refactoring you don't need a complex design in the beginning, which means that you get to market more quickly.
I strongly recommend you read Martin Fowler's book Refactoring, it's a real eye-opener.
This leads us up to the next point:
Do the least thing that could possibly work
If you are thinking of implementing a couple of abstract base classes and interfaces to make your object design super generic, so that it can be used for a lot of different things in the future (for example, implementing a plugin architecture in your file parser or something like that), you need to ask yourself the following questions:
Am I going to need this abstraction now? and If I need it in the future, will it be easy to refactor the code so it gets that functionality? If your answers to these questions are "no" and "yes" respectively (which they usually are) then you should not do it.
Essentially: don't do anything until you need it.
Test driven development
All this refactoring, and no solid specs can be a bit scary, especially in the beginning. A common question that pops up is: "how can we guarantee that the code works if you keep changing it all the time?". The answer is: unit testing. The rule of thumb here is: implement the test first, then implement the code so that the test passes. Whenever you are going to fix a bug, write a unit test that triggers the bug first, and then fix the code so that the test succeeds. You then make sure you run the ever growing test suite several times per day. It helps a lot in catching regression bugs.
For Java, I recommend JUnit.
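The same test-first loop can be sketched with Python's built-in `unittest` module, which follows the JUnit style. The `average` function and the empty-list bug are hypothetical, chosen only to show a regression test that would have been written before the fix.

```python
import unittest


def average(values):
    """Fixed version: an empty list yields 0.0 instead of dividing by zero."""
    if not values:
        return 0.0
    return sum(values) / len(values)


class AverageTest(unittest.TestCase):
    def test_typical_values(self):
        self.assertEqual(average([2, 4, 6]), 4.0)

    def test_empty_list_regression(self):
        # Written first, to reproduce the (hypothetical) bug report;
        # it failed with ZeroDivisionError until the guard above was added.
        self.assertEqual(average([]), 0.0)
```

Saved in a file, the suite can be run with `python -m unittest <file>`, and the regression test stays in it so the bug cannot quietly come back.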
Now, the biggest problem you face is selling XP to your PHBs. They will more than likely feel that they are losing control, and they will be afraid that their nice Microsoft Project documents will become useless (no one seems to remember that (almost) every single waterfall project will overrun both budget and time constraints). However, there is
-
Re:Too much integration is too limiting...
Clearly, you don't understand much about refactoring, then. Refactoring is a formal, verifiable process for making equivalent changes in code. Things like "extract method", or "push up".
These are tricky to get right by hand, but, since they can be done in a provably correct way, automatically, they are precisely the sort of thing your IDE should do for you.
Please read up on refactoring at http://www.refactoring.com/
and http://www.extremeprogramming.org/rules/refactor.html.
-
I Heart Unit Testing
In a corporate environment, isn't this what testers are for? You don't waste the programmers' time on this,
Both sorts of testing are helpful.
Functional tests make sure that the program meets the requirements. Good QA people are invaluable.
But even if I had the world's best team of testers, I'd still do test-first programming. Why?
- Faster feedback. No matter how fast a testing team is, it's not fast enough for me. With automated unit tests, I know within seconds that my code is good.
- Clearer interfaces. When I do test-first programming, I'm always thinking about what my code looks like from an external perspective. This makes the interfaces to it much clearer for others to deal with.
- Automated documentation. When I want to know what a chunk of code is supposed to do or how it's supposed to be used, I just look at the unit tests.
- Rot prevention. Having automated unit tests gives me confidence that the code will continue to work, even when I'm not looking at it. Change in libraries? Junior programmers making changes? I'm not worried; the tests will catch the problems.
- Refactoring support. If you don't have good test coverage, doing serious refactoring is impossible.
As far as I can tell, every programmer does manual unit testing, generally by putting in print statements as they develop and then looking for stuff in the output. Doing automated unit testing just means that you take those manual checks and automate them. And isn't automating tedious manual work what computers are all about?
-
Solid Foundation For Software Refactoring
To get to know refactoring, I can refer to http://www.refactoring.com (by Martin Fowler).
For those interested in the value of refactoring (whether it's merely a buzzword), I can refer to a research project of ours, at http://win-www.uia.ac.be/u/lore/refactoringProject
By the way: since the article does not go into behaviour-preserving restructuring of JUnit, they shouldn't mention 'refactoring' in the title.
-
Not necessarily true
Not necessarily true.
If you start building your prototype with solid testing and you apply effective refactoring your little prototype can grow into a solid and clean system.
Java has great tools for these jobs (JUnit and Eclipse), but you can find similar tools for almost every other language. Give them a try.
Fh
-
Re:Software... Engineering?
It may be costly, but it is still easy and ultimately cheaper proportionally than changing a bridge.
And getting easier and cheaper. With techniques like refactoring, you can significantly flatten the cost-of-change curve.
Software also wins big because our tools and "materials" are advancing much more rapidly than physical tools and materials are. Component-based software also helps; bridge-builders can borrow ideas from one another, but they can't just copy 95% of somebody else's bridge and add on a couple new entry ramps.
-
Re:Developers love him; Managers hate him
You seem to have a somewhat different definition of refactoring than the one Fowler uses in his book on refactoring, in his other writings, and in the interview referenced above.
First of all, adding OLE or CORBA would not be refactoring. Fowler described it like this: "Refactoring is making changes to a body of code in order to improve its internal structure, without changing its external behavior."
Secondly, Fowler's book doesn't recommend refactoring for no reason. He lists specific design problems that a developer might see in a body of code they are working on (a method that is too long, two classes too tightly intertwined, etc.). In his book, he describes refactoring as being the flip side of design patterns. Design patterns can be used during the design phase to create a good design. Refactoring can be used during the construction phase to arrive at a good design.
Thirdly, the developer who didn't create a good design initially can use refactoring and come up with something better, because there are catalogs of effective refactorings. The recipes that define these refactorings describe how to make these changes efficiently and safely without disturbing any more of the code than necessary.
These aspects work together like this. A developer, while coding, finds that some problem is impeding their progress. For example, he discovers that every time he makes a change to one class, he needs to make a corollary change to another class. He then decides that it fits the description of "Feature Envy", and performs the Move Method refactoring.
Basically, I see refactoring as a software developer's equivalent of building codes. Building contractors don't need to know, or at least calculate out every time, the physics involved to make a structure solid enough to support itself and its contents. The building codes are distilled instructions of what the physics calculations would indicate as appropriate action (with a bit of a margin of error). Performing refactorings drawn from a well-known, tested catalog is using the design tips of people who are much better software designers than you are.
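To make the Feature Envy example concrete, here is a small sketch. The class names and the charge rule are hypothetical, chosen only for illustration. The "before" method sits on Account but reads nothing except AccountType's data; after Move Method, the rule lives on AccountType and Account simply delegates:

```python
class AccountType:
    def __init__(self, premium):
        self.premium = premium

    def overdraft_charge(self, days_overdrawn):
        # After Move Method: the rule sits next to the data it depends on.
        if self.premium:
            charge = 10.0
            if days_overdrawn > 7:
                charge += (days_overdrawn - 7) * 0.85
            return charge
        return days_overdrawn * 1.75


class Account:
    def __init__(self, kind, days_overdrawn):
        self.kind = kind  # an AccountType
        self.days_overdrawn = days_overdrawn

    def overdraft_charge_before(self):
        # Before: Feature Envy. Every branch depends on self.kind, so any
        # change to AccountType forces a matching change here.
        if self.kind.premium:
            charge = 10.0
            if self.days_overdrawn > 7:
                charge += (self.days_overdrawn - 7) * 0.85
            return charge
        return self.days_overdrawn * 1.75

    def overdraft_charge(self):
        # After: a one-line delegation; behavior is unchanged.
        return self.kind.overdraft_charge(self.days_overdrawn)
```

Both versions are kept side by side here only so they can be compared; in a real refactoring the old body would be deleted once the tests pass.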
-
Re:CVS / RCS - the next step?
Named parameters are nice as well; they are an alternative with its own advantages and disadvantages. If I had to choose I would go for the refactoring(s). They are more versatile, encourage improving your code and save some find/replace (well, a lot actually).
I consider refactoring one of the major new OO coding techniques of this decade (some things might work with procedural programming, but a lot of refactorings are OO specific). If you want to learn more, you should read Ward Cunningham's famous wiki and check out refactoring.com (although it's a bit hard to navigate).
-
When good interfaces go crufty
In Vernor Vinge's sci-fi novel A Fire Upon the Deep, he presents the idea of "software archeology". Vinge's future has software engineers spending large amounts of time digging through layers of decades-old code in a computer system (like layers of dirt and rubbish in real-world archeology) to find out how, or why, something works.
So far, in 2002, this problem isn't so bad. We call such electronic garbage "cruft", and promise to get rid of it someday. But it's not really important right now, we tell ourselves, because computers keep getting faster, and we haven't quite got to the point where single programs are too large for highly coordinated teams to understand.
But what if cruft makes its way into the human-computer interface? Then you have problems, because human brains aren't getting noticeably faster. (At least, not in the time period we're concerned with here.) So the more cruft there is in an interface, the more difficult it will be to use.
Unfortunately, over the past 20 years, I've noticed that cruft has been appearing in computer interfaces. And few people are trying to fix it. I see two main reasons for this.
- Microsoft and Apple don't want to make their users go through any retraining, at all, for fear of losing market share. So rather than make their interfaces less crufty, they concentrate on making everything look pretty.
- Free Software developers have the ability to start from a relatively cruft-free base, but (as a gratuitously broad generalization) they have no imagination whatsoever. So rather than making their interfaces more usable, they concentrate on copying whatever Microsoft and Apple are doing, cruft and all.
Here are a few examples of interface cruft.
-
In the 1970s and early '80s, transferring documents from a computer's memory to permanent storage (such as a floppy disk) was slow. It took many seconds, and you had to wait for the transfer to finish before you could continue your work. So, to avoid disrupting typists, software designers made this transfer a manual task. Every few minutes, you would save your work to permanent storage by entering a particular command.
Trouble is, since the earliest days of personal computers, people have been forgetting to do this, because it's not natural. They don't have to save when using a pencil, or a pen, or a paintbrush, or a typewriter, so they forget to save when they're using a computer. So, when something bad happens, they've often gone too long without saving, and they lose their work.
Fortunately, technology has improved since the 1970s. We have the power, in today's computers, to pick a sensible name for a document, and to save it to a person's desktop as soon as she begins typing, just like a piece of paper in real life. We also have the ability to save changes to that document every couple of minutes (or, perhaps, every paragraph) without any user intervention.
We have the technology. So why do we still make people save each of their documents, at least once, manually? Cruft.
-
The original Macintosh, which introduced graphical interfaces to the general public, could only run one program at a time. If you wanted to use a second program, or even return to the file manager, the first program needed to be unloaded first. To make things worse, launching programs was slow, often taking tens of seconds.
This presented a problem. What if you had one document open in a program, and you closed that document before opening another one? If the program unloaded itself as soon as the first document was closed, the program would need to be loaded again to open the second document, and that would take too long. But if the program didn't unload itself, you couldn't launch any other program.
So, the Mac's designers made unloading a program a manual operation. If you wanted to load a second program, or go back to the file manager, you first chose a menu item called Quit to unload the first program. And if you closed all the windows in a program, it didn't unload by itself; it stayed running, usually displaying nothing more than a menu bar, just in case you wanted to open another document in the same program.
Trouble is, the Quit command has always been annoying and confusing people, because it's exposing an implementation detail: the lack of multitasking in the operating system. It annoys people, because occasionally they choose Quit by accident, losing their careful arrangement of windows, documents, toolboxes, and the like with an instantaneity which is totally disproportionate to how difficult it was to open and arrange them all in the first place. And it confuses people, because a program can be running without any windows being open, so while all open windows may belong to the file manager (which is now always running in the background), menus and keyboard shortcuts get sent to the invisible program instead, producing unexpected behavior.
Fortunately, technology has improved since 1984. We have the power, in today's computers, to run more than one program at once, and to load programs in less than five seconds.
We have the technology. So why do we still punish people by including Quit or Exit menu items in programs? Cruft.
-
As I said, the original Macintosh could only run one program at a time. If you wanted to use a second program, or even return to the file manager, the first program needed to be unloaded first.
This presented a problem when opening or saving files. The obvious way to open a document is to launch it (or drag it) from the file manager. And the obvious way to save a document in a particular folder is to drag it to that folder in the file manager. But on the Mac, if another program was already running, you couldn't get to the file manager. What to do? What to do?
So, the Mac's designers invented something called a file selection dialog, or filepicker: a lobotomized file manager, for opening and saving documents when the main file manager wasn't running. If you wanted to open a document, you chose an Open menu item, and navigated your way through the filepicker to the document you wanted. Similarly, if you wanted to save a document, you chose a Save menu item, entered a name for the document, and navigated your way through the filepicker to the folder you wanted.
Trouble is, this interface has always been awkward to use, because it's not consistent with the file manager. If you're in the file manager and you want to make a new folder, you do it one way; if you're in a filepicker and you want to make a new folder, you do it another way. In the file manager, opening two folders in separate windows is easy; in a filepicker, it can't be done.
Fortunately, technology has improved since 1984. We have the power, in today's computers, to run more than one program at once, and to run the file manager all the time. We can open documents from the file manager without quitting all other programs first, and we can save copies of documents (if necessary) by dragging them into folders in the file manager.
We have the technology. So why do we still make people use filepickers at all? Cruft.
-
This last example is particularly nasty, because it shows how interface cruft can be piled up, layer upon layer.
-
In Microsoft's MS-DOS operating system, the canonical way of identifying a file was by its pathname: the concatenation of the drive name, the hierarchy of directories, and the filename, something like C:\WINDOWS\SYSTEM\CTL3DV2.DLL. If a program wanted to keep track of a file (in a menu of recently-opened documents, for example), it used the file's pathname. For backward compatibility with MS-DOS, all Microsoft's later operating systems, right up to Windows XP, do the same thing.
Trouble is, this system causes a plethora of usability problems in Windows, because filenames are used by humans.
- What if a human renames a document in the file manager, and later on tries to open it from that menu of recently-opened documents? He gets an error message complaining that the file could not be found.
- What if he makes a shortcut to a file, moves the original file, and then tries to open the shortcut? He gets an error message, as Windows scurries to find a file which looks vaguely similar to the one the shortcut was supposed to be pointing at.
- What happens if he opens a file in a word processor, then renames it to a more sensible name in the file manager, and then saves it (automatically or otherwise) in the word processor? He gets another copy of the file with the old name, which he didn't want.
- What happens if a program installs itself in the wrong place, and our fearless human moves it to the right place? If he's lucky, the program will still work, but he'll get a steady trickle of error messages, the next time he launches each of the shortcuts to that program, and the next time he opens any document associated with the program.
Fortunately, technology has improved since 1981. We have the power, in today's computers, to use filesystems which store a unique identifier for every file, separate from the pathname, such as the file ID in the HFS and HFS+ filesystems, or the inode in most filesystems used with Linux and Unix. In these filesystems, shortcuts and other references to particular files can keep track of these unchanging identifiers, rather than the pathname, so none of those errors will ever happen.
We have the technology. So why does Windows still suffer from all these problems? Cruft.
Lest it seem like I'm picking on Microsoft, Windows is not the worst offender here. GNU/Linux applications are arguably worse, because they could be avoiding all these problems (by using inodes), but their programmers so far have been too lazy. At least Windows programmers have an excuse.
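As a rough sketch of the inode idea, assuming a filesystem where the (device, inode) pair survives a rename within the same volume: a program could remember that pair instead of the pathname, and look the file up again after the user renames it. The helper names below are hypothetical. Note that portable code has no "open by inode" call, so this sketch simply scans one directory, which a real application would need to do far more cleverly:

```python
import os
import tempfile


def file_identity(path):
    """Return the (device, inode) pair that identifies a file on its volume."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def find_by_identity(directory, identity):
    """Search one directory for the file with the remembered identity."""
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False) and file_identity(entry.path) == identity:
            return entry.path
    return None


def demo():
    """Remember a file by identity, rename it, then find it again."""
    with tempfile.TemporaryDirectory() as folder:
        old_path = os.path.join(folder, "draft.txt")
        with open(old_path, "w") as handle:
            handle.write("hello")
        remembered = file_identity(old_path)  # what a "recent documents" menu would store

        # The user renames the document in the file manager.
        os.rename(old_path, os.path.join(folder, "final report.txt"))

        found = find_by_identity(folder, remembered)
        return os.path.basename(found), os.path.exists(old_path)
```

The stored pathname is dead after the rename, but the remembered identity still leads to the renamed file.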
-
To see how the next bit of cruft follows from the previous one, we need to look at the mechanics of dragging and dropping. On the Macintosh, when you drag a file from one folder to another, what happens is fairly predictable.
- If the source and the destination are on different storage devices, the item will be copied.
- If the source and destination are on the same storage device, the item will be moved.
- If you want the item to be copied rather than moved in the latter case, you hold down the Option key.
Windows has a similar scheme, for most kinds of files. But as I've just explained, if you move a program in Windows, every shortcut to that program (and perhaps the program itself) will stop working. So as a workaround for that problem, when you drag a program from one place to another in Windows, Windows makes a shortcut to it instead of moving it, and lands in the Interface Hall of Shame as a result.
Naturally, this inconsistency makes people rather confused about exactly what will happen when they drag an item from one place to another. So, rather than fixing the root problem which led to the workaround, Microsoft invented a workaround to the workaround. If you drag an item with the right mouse button, when you drop it you'll get a menu of possible actions: move, copy, make a shortcut, or cancel. That way, by spending a couple of extra seconds choosing a menu item, you can be sure of what is going to happen. Unfortunately this earns Microsoft another citation in the Interface Hall of Shame for inventing the right-click-drag, perhaps the least intuitive operation ever conceived in interface design. Say it with me: Cruft.
- It gets worse. Dragging a file with the right mouse button does that fancy what-do-you-want-to-do-now-menu thing. But normally, when you click the right mouse button on something, you want a shortcut menu: a menu of common actions to perform on that item. But if pressing the right mouse button might mean the user is dragging a file, it might not mean you want a shortcut menu. What to do, what to do?
So, Windows' designers made a slight tweak to the way shortcut menus work. Instead of making them open when the right mouse button goes down, they made them open when the right mouse button comes up. That way, they can tell the difference between a right-click-drag (where the mouse moves) and a right-click-I-want-a-shortcut-menu (where it doesn't).
Trouble is, that makes the behavior of shortcut menus so much worse that they end up being pretty useless as an alternative to the main menus.
- They take nearly twice as long to use, since you need to release the mouse button before you can see the menu, and click and release a second time to select an item.
- They're inconsistent with every other kind of menu in Windows, which opens as soon as you push down on the mouse button.
- Once you've pushed the right mouse button down on something which has a menu, there is no way you can get rid of the menu without releasing, clicking the other mouse button, and releasing again. This breaks the basic GUI rule that you can cancel out of something you've pushed down on by dragging away from it, and it slows you down still further.
In short, Windows' native shortcut menus are so horrible to use that application developers would be best advised to implement their own shortcut menus which can be used with a single click, and avoid the native shortcut menus completely. Once more, with feeling: Cruft.
-
Meanwhile, we still have the problem that programs on Windows can't be moved around after installation, otherwise things are likely to break. Trouble is, this makes it rather difficult for people to find the programs they want. In theory you can find programs by drilling down into the Program Files folder, but they're arranged rather uselessly (by vendor, rather than by subject), and if you try to rearrange them for quick access, stuff will break.
So, Windows' designers invented something called the Start menu, which contained a Programs submenu for providing access to programs. Instead of containing a few frequently-used programs (like Mac OS's Apple menu did, before OS X), this Programs submenu has the weighty responsibility of providing access to all the useful programs present on the computer.
Naturally, the only practical way of doing this is by using multiple levels of submenus, thereby breaking Microsoft's own guidelines about how deep submenus should be.
And naturally, rearranging items in this menu is a little bit less obvious than moving around the programs themselves. So, in Windows 98 and later, Microsoft lets you drag and drop items in the menu itself, thereby again breaking the general guideline about being able to cancel a click action by dragging away from it.
This Programs menu is the ultimate in cruft. It is an entire system for categorizing programs, on top of a Windows filesystem hierarchy which theoretically exists for exactly the same purpose. Gnome and KDE, on top of a Unix filesystem hierarchy which is even more obtuse than that of Windows, naturally copy this cruft with great enthusiasm.
-
Following those examples, it's necessary to make two disclaimers.
Firstly, if you've used computers for more than six months, and become dulled to the pain, you may well be objecting to one or another of the examples. "Hey!", you're saying. "That's not cruft, it's useful!" And, no doubt, for you that is true. In human-computer interfaces, as in real life, horrible things often have minor benefits to some people. These people manage to avoid, work around, or blame on user stupidity, the large inconvenience which the cruft imposes on the majority of people.
Secondly, there are some software designers who have waged war against cruft. Word Place's Yeah Write word processor abolished the need for saving documents. Microsoft's Internet Explorer for Windows, while having many interface flaws, sensibly abolished the Exit menu item. Acorn's RISC OS abolished filepickers. The Mac OS uses file IDs to refer to files, avoiding all the problems I described with moving or renaming. And the ROX Desktop eschews the idea of a Start menu, in favor of using the filesystem itself to categorize programs.
However, for the most part, this effort has been piecemeal and on the fringe. So far, there has not been a mainstream computing platform which has seriously attacked the cruft that graphical interfaces have been dragging around since the early 1980s.
So far.
-
-
_Refactoring_
Hi,
If Martin Fowler's Refactoring is not on your list, it should be added.
This book is changing the way people write code, and is up there with Knuth's books, Kernighan and Ritchie, and Design Patterns in terms of influence over software development.
-
Re:Unit tests pass != good code
In all the focus on passing unit tests (which are written first) and constantly refactoring, they have deliberately lessened the focus on a clean, maintainable design, and left it essentially to chance.
Eh?
Refactoring is, according to the subtitle of Martin Fowler's book, "improving the design of existing code." If you are refactoring all the time and still ending up with a sucky design, then you aren't refactoring very well.
You can look at XP as a series of tiny steps repeated over and over:
- write a little test
- think a little about the design of the code that the test implies
- write code until the test passes
- look again at the design and notice what is wrong with it
- refactor until your design is good
Now, suppose six months down the line, I have a codebase that passes all the tests, but in making a simple change to meet a new requirement I can cause it to fail 500 tests and need six man-months of rewrite time before it passes them all again. Do I really have a good codebase?
Of course not.
I've never heard of a case nearly this bad with an XP team, although XP newbies will often blow an iteration (i.e., 1-2 weeks) on a big refactoring. This is always a sign that they haven't been doing enough refactoring as they go.
I generally diagnose this as "excessively high tolerance for ugliness and pain". On non-agile projects, one locks down a design and then just codes within that framework for months or years. The first design is never perfect, so one gets used to just hacking around a bad design, bending it to your needs.
In making the transition to XP, developers need to unlearn that behavior. About 80% of this can be solved by never, ever copying and pasting more than a couple of words. The urge to copy and paste code is almost always a sign that your design is bad. And by copying and pasting, one multiplies the ugliness. Instead you must figure out the commonality between the segments of code and abstract something beautiful.
-
Re:Comments are evil.
The best comment is the code.
For a really good resource on why this is, and how to make your code actually live up to this ambitious declaration, check out Refactoring by Martin Fowler. Most of the comments I write are about things that have some external significance... hacks, basically. :|
While the idea of this book is to improve the design of existing code, it's somewhat nontrivial to apply in practice on some nasty, tangled, obtuse code that tries to do too much. Rather more trivially, it makes new code that I write, and then the maintenance of that code, much better.
-If
PS: I've found both Refactoring and Analysis Patterns by Fowler to be well-written and insightful. Substantially less dry than works by his counterparts (Kent Beck, Gang of Four, etc).
-
Re: XML folks believing their own hype
Granted, defining a language or technology to be syntactically based on xml makes it very verbose, but there are other consequences of this too.
For one, this allows for facilitated tool integration and automated manipulation and handling. This could be for a graphical or more concise textual representation of the XSLT "program" or for automated generation. And tool integration should not be underestimated.
Think of the refactoring support that editors/IDEs are starting to provide. AFAIK, right now this is most prevalent in Java environments (have a look at IntelliJ IDEA), having its origins in Smalltalk. This will probably be extended to other languages and technologies over the next one to three years.
I highly recommend the book Refactoring: Improving the Design of Existing Code by Martin Fowler. Also check out the site refactoring.com.
-
Re:Refactoring--No offense, but it doesn't sound like you know what refactoring is. Refactoring, as defined in the book by Fowler of the same name, has little to do with redesigning. Refactoring involves restructuring code without breaking it. Usually this involves incremental changes to the code, and running unit tests in between each change to ensure that the code doesn't break. I would argue that refactoring is a much lower-level activity than redesigning.
Visit the refactoring web site if you want to learn more about what refactoring is (and isn't).
-
Re:More justification of OO being a phony.
Sounds like a bad design. I would pass the deck of cards to an instance of a shuffler interface. If this is the dealer object or not is unimportant.
This breaks the paradigm of data and its operations going together, one of the basic tenets of OO design. Then you would have an unrelated class, the shuffler class, picking at the data of the deck class. This is clearly covered by Martin Fowler in Refactoring as a bad smell.
There's no reason to choose functional vs. OO programming, they work fine together. Closures and OO works perfectly together in Smalltalk. The power of OO is data encapsulation and correct use of interfaces. These are things that are not easily modelled in a pure functional language.
You say it yourself. Lexical closures are the perfect medium for encapsulation and interface exposure. The lexical bindings allow for data to be completely hidden from the user's eyes. The function that creates the closure and returns it is a factory in design pattern lingo. It creates a function with a clearly defined interface. I think both of these OO paradigms are more than adequately covered in functional programming.
-
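The closure argument from the comment above can be sketched with the deck of cards from this thread. Everything here is illustrative: make_deck and its three operations are hypothetical names, and Python stands in for whichever functional language you prefer. The factory function returns only the operations; the card list itself is a local binding that callers have no name for:

```python
import random


def make_deck(cards, seed=None):
    """Factory: build a deck whose cards are reachable only via closures."""
    hidden_cards = list(cards)      # captured by the closures below
    rng = random.Random(seed)       # private random source, seedable for tests

    def shuffle():
        rng.shuffle(hidden_cards)

    def deal():
        return hidden_cards.pop()

    def remaining():
        return len(hidden_cards)

    # The returned mapping is the whole public interface.
    return {"shuffle": shuffle, "deal": deal, "remaining": remaining}
```

Usage would look like deck = make_deck(range(52)); deck["shuffle"](); card = deck["deal"](). Here the data and its operations travel together, just without a class declaration.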
Re:More justification of OO being a phony.
Not to beat a dead horse, but this is my last post on the topic.
This is wrong. Any objects you define do correlate to the "real-world" (your model). If they don't, your program will not make sense. You have taken a wrong turn somewhere.
I am not the only one to advocate this position. Jeff Alger is a great proponent of objects not as real-world entities but as programming idioms, as is Martin Fowler. The reason is that, to distribute work, many times it makes more sense to have objects be active: it fits the "data with their operations" paradigm. The point of the cards example is that in real life you don't really have self-shuffling decks of cards, but it is a natural division of work in programming (it is an operation of the deck dealing with its data).
-
Re:Refactoring your project
Refactoring is actually at http://www.refactoring.com/
-
Re:Code rewrite
Isn't there sometimes a happy medium between completely rewriting the whole codebase and continuing to hack it up?
This happy medium is described well by Bruce Eckel in Thinking in C++. He says in the chapter on design (paraphrased): "don't worry that getting some aspects of a design wrong will mean you have to rewrite everything. You won't - properly-written classes shield you from your mistakes." This is from the section that talks about the problems that occur early on in implementation, but applies equally to rewrites.
For example, maybe you can identify certain modules that can be isolated and rewritten, then tested rigorously against the old code to make sure they're functionally identical.
This is called refactoring and is now a widely-accepted industry standard practice for improving a codebase without rewriting it from scratch. The official web site is here.
-
Re:what a load of crap
Fine. You do have regression tests for all the bugs you've fixed and all of the features you've added, along with notes on why you've added things? If so, it's called refactoring. Otherwise, it's a recipe for a different kind of buggy mess.
-
Re:Code rewrite
Yes, there is often a happy medium between completely rewriting the whole codebase and continuing to hack it up. There have always been disciplined ways to steadily improve existing code over time.
The latest buzzword for this is Refactoring. There are some excellent published materials on this topic. We've finally reached a stage where discussion of good software engineering techniques has matured to the point that we can write intelligently about the topic using common terminology. See refactoring.com and The First Wiki for some good online starting-points.