When Making a Comprehensive Retrofit of your Code...

← Back to Stories (view on slashdot.org)

When Making a Comprehensive Retrofit of your Code...

Posted by Cliff on Friday December 21, 2001 @10:19AM from the things-you-do-when-you-rewrite-everything dept.

chizor asks: "My programming team is considering making some sweeping changes to our code base (150+ perl CGIs, over a meg of code) in the interest of consistency and reducing redundancy. We're going to have to make some hard decisions about code style. What suggestions might readers have about tackling a large-scale retrofit?" Once the decision has been made for a sweeping rewrite of a project, what can you do to make sure things go smoothly and you don't run into any development snags...especially as things progress in the development cycle?

14 of 385 comments (clear)

Min score:

Reason:

Sort:

Extremem Programming - Refactor by Strudleman · 2001-12-21 10:26 · Score: 3, Interesting

You should read up on Extreme Programming, in particular Code Refactoring. It's a method of cleaning up old code. A very well written book, as well as an excellent code-housekeeping method.

--
Do it doug.
Use framework, seperate code form display.. and.. by cybrthng · 2001-12-21 10:42 · Score: 3, Interesting

Well, i can't say it enough. Use a web framework, i don't know of any for Perl off the top of my mind, but i use Resin.

Resin is a JSP, Servlet, XML, XSLT application server that support all the latest and greatest EJB components and Managed persistance on the database and makes a great framework to build from.

I have Postgresl for the Database, and use beans to run queries and output via xml and then i have XSLT draft xml data into the wonderfull HTML code.

The beauty is, my code is in beans, servlets and jsp's. My HTML is in .xtp files (in the case of resin) and simple XSLT sheets parse the XML to render to the user.

Just means i can produce output easily for WAP, Palm, CE, Normal Web Browsing, EMail, and what not without modifying the backend. Just create xslt using a session identifier to bring up the corresponding stylesheets for whatever device is acesssing the page.

Enough about java, but something similar, be it in house developed would be your best bet.

I also get away with using a Swing applicatoin to manage the database, users and run reports provide an easy to navigate gui which just interprests the same xml data that would be retrieved by the html client. Not a single change to the backend since i'm using the beauty of soap, jsp's, servlets and java.

Virtualize your interfaces, standardize on your backend and use re-useable components. Perl is similar in many ways that you can load libaries and abstract your code from the display which will save you tons of hours of hassle in any future upgrades compared to the bit of changes you would do now
Re:object orientation by TheRain · 2001-12-21 10:43 · Score: 3, Interesting

why is this flamebait? all this person is saying is that object orientation could make the code easier to manage and, thus, help to reduce redundancy. it's a good statement, modularize your code and it becomes more reusable... and therefore less redundant. who cares if he likes java over perl and CGI.

--
Please help! I'm stuck inside my virtual reality headset!
Little by little by Mike+Schiraldi · 2001-12-21 11:05 · Score: 3, Interesting

Don't stop development on the old tree and shift all work to the restructuring project.

Instead, leave most of the manpower working on the old tree as always. Take a small team of your best people and have them gather input from the others, while the others keep working on the old tree. Then have the small team outline the changes that need to be made.

Work on the changes while simultaneously working on the usual stuff. Say, 90% of your manpower should do what they do now, and 10% should work on restructuring things. One mouthful at a time.

When you have a mouthful that you think is ready, branch the old tree. Merge your diffs into the branch, TEST IT, and if things seem to work, land the change onto the old tree.

--

--
Mod up a post Rob doesn't like and you'll never mod again
DON'T DO IT! by mcrbids · 2001-12-21 11:26 · Score: 3, Interesting

As posted elsewhere, re-writing is painful, and generally NOT A GOOD IDEA!

I currently maintain a code base of around 120,000 lines of php and html (written by myself in a long, hard year) and have had to "retrofit" it a few times.

I find that when it's time to do an "over-haul" it's generally best to:

1) Pretend I know nothing - redesign from scratch. Write out a spec with flow charts, DB table definitions, etc. - make it VERY DETAILED. Spend lots of time at it. More time spent here saves even more time later.

2) Ignore your spec. (See step three)

3) When a bug comes up, or new functionality needs to be added to the codebase, refer to the spec built in 1, and build to it, and then put in compatability wrappers to work with the existing codebase.

Make these compatability wrappers log their calling in some way, based on a global variable. This allows you to see when they're no longer needed simply by defining a variable in a config file and waiting a while.

4) You'll be slowly bringing the application up to the new spec - eventually you'll reach a point where it's easier just to bring the remaining pieces up to snuff than to build more abstraction wrappers. When you get to that point, you'll find most of the work is already done, just finish it to the spec and remove the compatability wrappers.

This can still be a painful process, but at least it isn't a "gun to your head"! This allows you to regression test your work as it's done, resulting in a more stable deliverable, and you can still meet clients' needs in the meantime without making them wait 6 months while you re-write all your stuff.

Hope this helps...

--
I have no problem with your religion until you decide it's reason to deprive others of the truth.
IBM's (re-) write of OS/400 by Degrees · 2001-12-21 12:38 · Score: 3, Interesting

IBM was faced with a simlar problem when they came out with the RISC version of the AS/400 hardware. They needed to re-write, but they had to maintain absolute compatibility. I think the articles describing this were in Dr. Dobb's Journal (although I might be wrong on this). Unfortunately, I do not have URL's for this.
One of the success factors they found was documenting the interfaces for each and every call between modules. The documentation turned out to be excruciatingly precise - but this led to zero ambiguity (and thus 100% interoperability). It also required meetings (sometimes arguments) between programmers to hash out what was actually going to happen. Another factor was that they decided to allow zero 'overloading' of functions by different modules. A programmer was not allowed to duplicate someone else's work, nor create a second, incompatible version of a function provided in a different module. If the function was provided by someone else's module, the programmer had to call it (properly). The result was that they reaped the benefit of object oriented programming - reuse and refinement of modular libraries.
It would be better if you could get the real scoop from the real programmers - but this might give you something to think about.

--
"The most sensible request of government we make is not, "Do something!" But "Quit it!"
Re:Don't Listen by augustz · 2001-12-21 12:41 · Score: 3, Interesting

Yes, I agree with this. I actually dislike some of the easy but (what I consider heavy) choices like Java for this reason. You can get by with so much less so easily for many projects, and support literally 5x as many users per server. For a small business this is the difference between profit and no profit. For a large business this can be the difference between .com and .bomb.

All too many times I see sites begging for money for hosting/bandwidth. Take a look at their HTML/CSS and see it is HUGLY bloated (no linked css, prevent all caching including image with default cache buster installs) and not gzipped, and I wonder, if what I can see is so bad, behind the scenes it is probably even worse. (ie dynamic page gens where none are needed). Which I had included this in my list and left of the hardware point, which I agree is the wrong message.

But damn, if you do code right, hardware is so cheap I can't beleive it. I'm convinced some 10 machine projects with bad coding can be supported on a single machine now.
Learn from those who have succeeded at this... by William+Tanksley · 2001-12-21 12:48 · Score: 4, Interesting

Some here are warning you that major changes always require a total rewrite; yet in real life, total rewrites result in inability to compete (look how long the Netscape rewrite paralysed Netscape, unable to meet Microsoft's challenge!). There's some good discussion of the danger of rewriting at a former MS software engineer's site, and some limited advice about how to get away without doing it.

But you've decided to rework rather than rewrite, you say, so I have no doubt you'll ignore the naysayers here. So what CAN you do? After all, as you recognise, reworking is dangerous!

The following rules have worked for me; I've refined my own experience with advice from Fowler's Refactoring, a book as useful as Design Patterns, and with study of Extreme Programming, a design methodology forged in the traditions of Smalltalk, and in the knowledge that maintainance, the most important and expensive part of software engineering, is also the least studied.

First, do the simplest thing that could possibly work. Don't EVER take your program out of commission for more than a day; make sure it runs at the end of each day. If you're doing something and at the end of the day your code base is broken, STRONGLY consider throwing away your changes and going back to the design stages.

Second, rely on unit tests extensively. Start every change by writing as extensive of a unit test as possible. Unit test every function you touch, BEFORE you touch it, and after. Unit test every change you make, and run the unit test BEFORE you make the change to ensure that it fails (i.e. it detects the change). Write your unit tests BEFORE you write code, whenever possible; you'll objectively know your code is done when your unit tests pass.

Third, don't design too far ahead; you don't know what tricks the old code is going to throw at you. Implement one feature at a time, bringing the code into compliance. Once everything has a unit test (thanks to your following the above principles), THEN you can safely embark on larger design changes -- and in the meantime, you have working code with new features, a win even if your customer/boss/manager decides not to continue.

Fourth, don't be afraid to redesign your own code. The stuff you wrote has more tests, so it's safer to change, but it's more likely than the old code to lack some critical understanding only age can give.

Fifth, use the principles of refactoring. Whenever possible split each code change into two parts: first, a part which changes the structure of the code without changing its function (and which therefore allows you to run the same unit tests); and second, a part which uses the new structure to perform a new function (thereby requiring new unit tests).

Good luck. If you want more advice, read up on Extreme Programming.

-Billy
Minimise Untested Documentation! by William+Tanksley · 2001-12-21 12:57 · Score: 3, Interesting

Depend on tests, not documents. Documents lie. Tests don't.

Use as little documentation as possible, BUT NO LESS. (In other words, for heaven's sake don't ever try to get away without any documentation.) Documentation should state fundamental premises -- things like "The customer wants X." and "This code checks that I'm fulfilling the customer's requirement of X." Documentation should not state intrinsic properties -- the statement "this code does X" should be made as a test, not a document.

-Billy
Re:Sleeping dogs by foobar104 · 2001-12-21 15:23 · Score: 3, Interesting

You might want to reconsider perl as the language of choice for a large scale application.

I agree 100%. My company started to bring a commercial application to market a little less than a year ago. I prototyped the code in Perl, and the prototype was sufficiently okay that the decision was made to evolve the prototype into the release code.

This was A Mistake. It was A Dumb Idea. It was also My Decision. I have taken Much Shit for this from my coworkers. But you live and you learn.

I have since (over the past four months or so) rewritten the entire application-- every line, every file-- in C++. The source tree is 3.8 MB, and it compiles to about 100 MB of object code. (The actual executables, of course, are much smaller than that.) It was a pretty big job.

Not only is my code tighter and cleaner than the original Perl stuff (which was actually pretty okay code) but it's between 2 and 10 times faster.

I love Perl, absolutely adore programming in it, but there are some things that are easier to do with C, or C++, or (presumably) Java. When you split a project up among a number of people, for example, using the Bridge design pattern and distributing read-only interface header files makes modular integration so very much easier. That's just one example.

We would not have been able to get our app to market without the Perl prototype. And I don't think it would have been worth a damn if we hadn't rewritten it in C++.
It's called depreciation by JohnsonWax · 2001-12-21 16:17 · Score: 4, Interesting

There's a concept in the world of stuff you can touch called 'depreciation'. Why the software world hasn't caught onto this is beyond me, but from my experience it seems to apply reasonably well.

The value of your code goes down in value over time. Now, don't confuse that with the value of your design - your design could be grand, but the code is part of your equipment, like your hardware. Depreciate it over time to reflect the increasing cost of maintenance and integration costs in a migrating business.

Reworking your code allows you to make adjustments to your design to reflect a new environment, or to move away from languages/APIs/toolkits that might be hard to maintain.

I depreciate my code over about 3 years. It's all modular, and I replace code about as frequently as I add. A number of years ago, I had tons of bandwidth but not much CPU power, so I tended to push data rather than compute. The reverse is now true, so I made some design changes as part of a standard rewrite - no need to wait until it broke. Overall, most of my code hasn't radically changed in it's design, but it has been rewritten several times. I've had code cut back to 10% of it's original size by adopting a new toolkit, etc. I've made it more robust, faster, cleaner, better documented. I can't think of a case where it's gotten worse, and I can't think of a case where the rewrite took much longer to write than the original code - and more often than not took much less time. It's worked well enough that I've added a considerable amount of functionality, but spend no more time reworking code because it becomes increasingly efficient and is never too far removed from future additions.

Many people suggest that code rewrites are a waste of time, but it's a maintenance function. People that only budget time to write new code often find those extra work hours devoted to maintenance. Budget it in - and the best way is through rewrites.
Re:see joel on software by bay43270 · 2001-12-21 17:18 · Score: 3, Interesting

Although I enjoyed his book, and respect his opinion on matters of interface design, Joel Spolsky is no champion of well-designed code. In his book, he even suggests that a programming language shouldn't be used if another can create the same user interface in less time... as if the time it takes to create the first version of a product should take precedence of other factors such as maintainability of code, reuse, and flexibility. He does a good job of regurgitating the works of other interface specialists, but he is not an expert in code maintenance.
Re:object orientation by anacron · 2001-12-21 18:02 · Score: 3, Interesting

anacron, interesting experience and solid advice. I didn't see Extreme Programming mentioned. Did you use it at all?

Actually we didn't use it formally. But we used some of the best ideas. Iterative development, frequent code reviews, programming in pairs etc. We had a pretty small team so XP might have helped a little bit more, but it's a fundamental paradigm shift that most people I work with aren't comfortable making. My team would have been less effective had I said, "You *must* use XP."

I have been on projects in the past that have embraced more of the XP model, but something like Rational's Unified Process (RUP) kind of fits in more with what we do.

For traditional software shops (e.g. those that make products) a methodology can be set in place by the management, because the company is in essense it's own client. For other types of software shops -- such as consulting firms, or web design shops -- where one-off software is commonplace, the type of methodology that's used is usually determined by the client.

For example, if I'm working with a client that isn't very savvy, I probably don't want to do all of my requirements in formal use-cases, followed up by UML design documents. They won't add any value to the client at all, yet we still have to make sure we understand what it is we're building.

It's an interesting challenge and it's usually best overcome by having solid people on the team that can speak to both sides. A solid requirements team which can talk to the client (be it internal or external) and truly understand the business problems, all the whole clearly articulating those to the technical team. Sometimees double documentation is necessary -- one set for the client and one set for the developers or development lead.

There is just no easy way to do software development correctly. It's amazing that in nearly 60 years, no one has come up with a "right" way. There's only so much a methodology can bring to the table -- after that it's about finding and keeping a competant staff who understand the value of what they're doing.

The methodology is just a roadmap. How the car is driven is completely up to the development team. Even a good technical leader can become a backseat driver at times, as there is just not enough hours in the day to provide oversight to every single member of a team.

I think the more traditional software shops should look at the console gaming industry for tips. They have a difficult challenge -- it's not possible for them to release a fix or update for their game once it's completed. So how do companies like Activision, Square, Nintendo and Sega do their software development, and is there anything that the development teams at large can learn? Rigerous testing is one of the most obvious things that comes to mind for them, but how do they know what it is they're testing when a game's vision usually only exists in the mind of the creator?

.anacron
My analogous situation by Chris+Johnson · 2001-12-22 02:15 · Score: 3, Interesting

I'm currently doing something similar- the program I maintain is Mastering Tools, which has long been an over-featured, densely packed program operating on a text input basis with a certain amount of visual feedback on the text input.

However, I've long known that there are two things my real users (ideal users? the serious mastering engineers) want: familiar interface and realtime processing. I can't deliver realtime processing without literally doing the whole thing over again in a language I don't know (it's done in REALbasic, which is little better than a scripting language for speed, though it's got really, really nice prototyping abilities and GUI support.

However, the time's come to completely overhaul the interface, partly because I have some ideas for mid/side processing and don't have any room in the current layout to fit them! The ideas have to do with rectifying the side channel and using it to either enhance or remove signals that aren't in both channels equally- where regular mono is a 'node' that totally eliminates out-of phase content, it's also possible to completely eliminate R and L-only content and keep only R+L and also the out-of-phase content.

That part's the easy part- it's simply signal processing (and will be duly released under the GPL as soon as it's done, as always). However, the interface is asking for a total, complete overhaul in several ways, and that's what's taking all my effort currently. Here's the situation and how I'm handling it...

Layering. It's no longer possible to fit the whole app in a 640x480 area, even with small print. There are several possible answers to this: one would be having separate windows. This would be more adaptable to larger screens, but it's untidy and there are issues with closing windows and still referring to controls they contain- so that's out. What's looking more reasonable is tabbed panels- RB implements a nice little drag and droppable tabbed panel control that appears quite easy (though I had some trouble attempting to do nested tabbed panels- after one experience of having all the nested panels (at identical coordinates) switch to the top panel of their parent panel, I quickly gave up on that concept. Instead I'm using more panel real estate and trying to divide the controls into logical categories. That is, of course, a real headache- doing interface properly is hard! (says 'interface is hard Barbie') It's only somewhat easier with the additional space. Complicating matters further, is the expectation of the intended audience here. It has to both be organized and look and feel like a mixing board, amplifier or rackmount box of some sort.

Solution: implementing controls like knobs and meters. It's actually quite fun to code a knob appearance out of graphics primitives- and surprisingly hard to get mouse gestures to work on the damn thing- using a two-arg arctangent routine in RB that I don't fully understand. I also have a single meter control already implemented for azimuth display- think that it might be best to tear that apart and re-implement it in a more general sense.

That's because the Knob class turned out to be the right thing to do- it's implemented almost totally separate from the main body of the program, as if it were a RB control or something. Knowing I was going to be using it in different sizes, possibly different colors etc, I wrote the knob code as completely scalable- from maybe 20 pixels to over 100. It reads its size from the control width and height, runs itself as far as handling mouse input and storing a control value, and does not inherit comparable interfaces to the controls it will be replacing- so when it's time to plug 'em in I can run the program and see what routines crash and burn (and need to be rewritten for the new control interface). I'm thinking meters need to also be handled in the same way- somehow- not clear on the form yet.

So: dunno what else to tell you, but it seems like the things I'm doing that are helpful are: compartmentalize new interfaces, make them adaptable and get them working independently of the existing code, while leaving the existing code in working condition. Then when the new stuff is brought online do it in such a way that you could do it one control at a time or in small batches.

Dunno if that's relevant to where you're coming from- but it's what I've found necessary when facing a major re-implementation.