Rewrites Considered Harmful?
ngunton writes "When is "good enough" enough? I wrote this article to take a philosophical look at the tendency for software developers to rewrite new versions of popular tools and standards from scratch rather than work on the existing codebase. This introduces new bugs and abandons all the small fixes and tweaks that made the original version work so well. It also often introduces incompatibilities that break a sometimes huge existing userbase. Examples include IPv4 vs IPv6, Apache, Perl, Embperl, Netscape/Mozilla, HTML and Windows. "
Was it a "good idea" for Microsoft to rewrite Windows as XP and Server 2003? I don't know, it's their code, they can do whatever they like with it. But I do know that they had a fairly solid, reasonable system with Windows 2000 - quite reliable, combining the better aspects of Windows NT with the multimedia capabilities of Windows 98. Maybe it wasn't perfect, and there were a lot of bugs and vulnerabilities - but was it really a good idea to start from scratch? They billed this as if it was a good thing. It wasn't. It simply introduced a whole slew of new bugs and vulnerabilities, not to mention the instability. It's just another example of where a total rewrite didn't really do anyone any good. I don't think anyone is using Windows for anything so different now than they were when Windows 2000 was around, and yet we're looking at a 100% different codebase. Windows Server 2003 won't even run some older software, which must be fun for those users...
.. as they are rewriting the security layer!
As a coder I can assure you that working on somebody else's code is frustrating because you always say: "I would have done this differently". Most rewrites, I think, come from there: having the idea of a better implementation.
This introduces new bugs and abandons all the small fixes and tweaks that made the original version work so well. It also often introduces incompatibilities that break a sometimes huge existing userbase.
Microsoft has created an entire, successful, multibillion-dollar-a-year-profiting business model off of this!!
Sheesh.
do() || do_not();
In light of the preceding article, I propose that we completely rewrite slashdot! In BASIC! This will provide unsurpassed slowness and crashing, making the world better for all!
- - - - - - -
Orppf urp mf y.ppcxn. yflcbi otcnnov C am yflcbi yr n.apb Ekrpatv (Dvorak -> Qwerty)
The trick is to include all the tweaks and fixes that were implemented in the old code. Obviously, if you rewrite and then leave open all the gaps and problems from the earlier version, there's no point in rewriting. However, you could rewrite with those fixes in mind, and come out with a completely new (and problem-free) edition.
I think it may have something to do with programmer ego and something to do with the challenge. I'm guilty of it myself. You find something you're interested in and you want to build it. It doesn't matter if someone else has done it or even done it well before you. The challenge is to do it yourself.
~ "When I'm of that age I'm just going to live up a tree."
Microsoft: Ok, Windows XP and 2003 have a full rewrite of the TCP/IP stack and security system.
Slashdotter: Why did Microsoft rewrite the core OS? They just introduced more bugs and lost the stability and security fixes from older versions of the OS.
"Have you ever thought about just turning off the TV, sitting down with your kids, and hitting them?"
Ok, this dude uses Netscape 4.x and thinks it's fast. Next article please.
The minor tweaks, fixes, and changes that made the old version work so well can only go so far. Such is often the nature of code. Tiny fixes and patches are (sometimes haphazardly) hacked on to the code.
Perhaps if true extensive software engineering and documentation techniques were followed, a full rewrite may not be necessary. However, as long as quick fixes continue to pollute the code and make it more and more difficult to work with, an eventual total rewrite will always be necessary.
I'm sympathetic to the idea behind this article, but does it deserve a place on /.? There's absolutely no empirical data, or even a reasonable example given in the document. The author is talking about IPv6 and Perl6 both of which are unknown quantities at this point.
He's right that just throwing away old code means you lose a lot of valuable bug fixes. On the other hand, if you look at some code and realize there is a better way, then the solution is to rewrite it.
Of course you can have it both ways. What you do is write an automated test case for every bug that you fix in your code. When you write the new version it has to pass the old test suite, then you've got new code and all the experience from the old code.
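A minimal sketch of that idea in Python, assuming a made-up port parser with two historical bug fixes. Every name here (parse_port_old, parse_port_new, REGRESSION_CASES, failures) is hypothetical and only illustrates the approach: one regression case per fixed bug, and the rewrite has to pass the same cases as the old code.

```python
# Hypothetical example: all names and "bugs" below are invented for illustration.

def parse_port_old(text):
    """Old implementation, patched over time."""
    text = text.strip()      # fix for bug #1: trailing newline from config files
    if text == "":           # fix for bug #2: empty value means "use the default"
        return 80
    return int(text)


def parse_port_new(text):
    """From-scratch rewrite; it must reproduce the old fixes to pass."""
    cleaned = text.strip()
    return int(cleaned) if cleaned else 80


# One regression case per bug that was ever fixed: (input, expected output).
REGRESSION_CASES = [
    ("8080", 8080),    # baseline behaviour
    ("8080\n", 8080),  # bug #1
    ("", 80),          # bug #2
    ("   ", 80),       # bug #2, whitespace-only variant
]


def failures(implementation):
    """Return the regression cases that the given implementation gets wrong."""
    failed = []
    for text, expected in REGRESSION_CASES:
        try:
            actual = implementation(text)
        except Exception as exc:  # a crash counts as a failed case
            actual = exc
        if actual != expected:
            failed.append((text, expected, actual))
    return failed


if __name__ == "__main__":
    print("old:", failures(parse_port_old))
    print("new:", failures(parse_port_new))
```

The rewrite is only accepted once failures() comes back empty for it, so the experience encoded in the old fixes carries over even though none of the old code does.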
John.
Ouch! There goes my karma!
Although I have an unhealthy habit of wanting to start things from scratch, I believe it can be a good thing more often than not.
When you've developed a piece of software, fixed its bugs, and tweaked it, more times than not, those fixes and tweaks are nothing more than workarounds for your currently flawed structure. Usually, you don't realize these flaws until AFTER you've created it.
By starting it from scratch, you can keep your mistakes in mind, and make better and more efficient software.
Sure, there are chances of running into new bugs, but isn't that what the whole learning process is about? The more you learn, the better the software you will keep making. You can only go so far, if you need to turn a paper bag full of feces into an operating system. But if you start from scratch, you can create your own digitized significant other. You know, relatively speaking.
- shazow
This oughtta be good. (puts on asbestos-lined pants)
Your point is well taken about ego often driving rewrites but in my experience the driving force for rewrites is often maintainability.
As a program ages and drifts from the original intent, ugly hacks are often placed on top of the original code to add unforeseen functionality. There is also the opposite effect, where old code is sitting around that no longer has any function. I remember one drastic case of this when rewriting a program where only about 1/2 the code was even being utilized.
By rewriting the code you clean things up and make it easier for future programmers to understand what the code is doing.
One should not theorize before one has data. -Sherlock Holmes-
For Windows users, Winamp is probably the best example I can think of. Take a stable, usable, simple and elegant audio player (Winamp2) and fuck it up by writing it from scratch (Winamp3), then ultimately abandon that clusterfuck rewrite in favor of yet another rewrite (Winamp5) that fixes what they fucked up with Winamp3.
I'm mighty happy sticking with Winamp2, thank you very much.
Don't rewrite. Refactoring code is the way to go. Refactoring in small pieces allows the app to maintain compatibility as the process progresses.
The other side of the rewrite issue is, how long can you continue to maintain code from a legacy system? I worked on a project a couple years ago that had been migrated from assembler to COBOL and is now being rewritten (as opposed to being redesigned) for Oracle. Never mind for a moment the fact that the customers wanted to turn the Oracle RDBMS into just another flat-file system--which included designing a database that had no enabled foreign key constraints and that was completely emptied each day so that the next day's data could be loaded. . .
Some of the fields that are now in the Oracle database are bitmapped fields. This is done because there's no documentation for what those fields originally represented in the assembler code and because the designers are afraid of what they might break if they try to drop the fields or attempt to map the fields out into what they might represent. I had the good fortune to get out of the project last August. . . last I checked, they had settled for implementing a Java UI over the COBOL mainframe UI.
Anyway, my point is this: at some point, you have to decide whether the system you're updating is worth further updates. Can you fix everything that's wrong with the code, or are there some things you'll have to jerry-rig or just shrug your shoulders and give up on? Under circumstances like what I mentioned above, I truly think you're better off taking your licks and designing from scratch, because at least that way you can take advantage of the new features that more recent database software and programming languages have to offer.
!#@%*)anks for hanging up the phone, dear.
As I recall, Torvalds made mention that some of his original code in the Linux base was not very good and he would have written it much differently today. Indeed, most anyone that habitually programs naturally becomes more skilled, and if the underlying premise/framework/model of an application or tool is not as good as it could be - or is lacking a certain methodology that time has proven to be beneficial, and only rewriting it will solve this - what is wrong with rewriting the code from the ground up?
This guy is full of shit and has no idea of what he is talking about.
Some of the better parts:
- He claims that The mozilla project and everything Netscape >4 is pointless and that Netscape 4 "just works". We all know that Netscape 4 is an awful, crashy, buggy, standards-breaking piece of crap that set the Internet back years.
- He claims that Windows XP was a complete rewrite. Windows XP is NT 5.1 -- (check with ver if you want) Windows 2000 with the PlaySkool OS look.
Okay, so most of the article consists of, "Here's software X. They re-wrote it, and now it's not as good or as accepted. Why'd they do that? They suck."
Software is re-written for many reasons. Sometimes it's ego, sometimes it's for fun, but usually it's because you take a look at the existing codebase and what you want to do with it in the future, and you decide that it's going to cost a lot less to implement the future features by re-writing and fixing the new bugs than to work around the existing architecture.
I've had to make the re-write or extend decision more than once, and it's rarely a simple decision.
What I would have preferred from this article is some interviews with the people responsible for the decision to re-write, and what their thinking was, as well as whether they still agree with that decision or would have done something differently now.
=Brian
There is nothing so good that someone, somewhere, will not hate it.
Sometimes, rewriting from scratch is necessary to remove bugs. Not all bugs are just failure to check a buffer overflow, which can be fixed without a complete rewrite. Sometimes your basic communications architecture in the program is fundamentally flawed and insecure. At that point, by the time you've fixed that bug in the existing codebase it would have been easier to start from scratch and make everything else work "natively" on the new internal standard.
;-) There are fundamental security issues with all GUI windows operating in the same user space. If one is compromised, they're all 0wnz3d. That's a reasonably major flaw, but to fix it would require essentially rewriting the entire GUI portion of Windows, because it's so integral to the system. To try and fix that without a rewrite would be harder and more complicated than chucking it and starting from scratch, and probably introduce a dozen other bugs in the process.
:-)
Take for example, Windows.
Sometimes you really do need to throw out the baby with the bath water, if the baby is that dirty. Besides, making a new one can be fun!
--GrouchoMarx
Card-carrying member of the EFF, FSF, and ACLU. Are you?
Joel on Software has covered this point in a good article: http://www.joelonsoftware.com/articles/fog0000000069.html.
It was too messy and unmaintainable. I'll wait until the rewrite comes out to fix all the grammer and spelling bugs.
As a video game developer, I've been involved in many "code upgrades", as well as rewrites. As long as the rewrite is being done by people who wrote the original code, and they invest time in some preproduction carefully thinking through what they did right and wrong, the rewritten product will always be faster, more stable, easier to maintain, etc. etc. In the end it's always been a clear winner.
Things You Should Never Do, Part I
Rewrites are 'bad' from a management point of view (at least, a manager that isn't familiar with software development), which looks at return on investment (ROI).
However, from a developer's point of view, a partial or complete rewrite is sometimes the only way to FIX certain bugs. While it may introduce new, small ones, usually developers are smart enough to read the old code and learn from its mistakes before they do a rewrite.
A partial or complete rewrite is ALSO sometimes the only way to fix 'spaghetti code' -- code that's become so tangled from patch upon patch being applied to it that it's now impossible to trace and fix a bug. If spaghetti code isn't pursued and rewritten on a regular basis (this is 'constant improvement' -- a management buzzword from the past few years that actually works), new bugs can be inadvertently introduced -- and it can sometimes take weeks to hunt down an intermittent bug by tracing spaghetti code. Ladies and gents, WEEKS of programmer time is expensive compared to one programmer spending 8-10 hours per week tracking down bad code in the codebase and rewriting it.
Really, there's a case for doing rewrites on a constant basis. The author should have instead addressed adequate testing in software development environments...
--
Vote for your hopes, not for your fears - Vote Third Party
As a software designer, developer, programmer and user, I have to say that rewrites done right are A Good Thing(TM). When I do a rewrite, it is with the intention that it is to be better than the old one. I only do rewrites when a limitation of the old code base has been reached or can be foreseen to be reached.
When a rewrite is to be made, it goes without saying that anything learned from previous development should also be applied to the newer project. If you can't learn from the mistakes of the past, don't do a rewrite.
It is not rewriting, per se, that is the problem. It is choosing WHEN to do a rewrite. Unless there is sufficient reason to do one (i.e. old code hard to maintain, scalability problems, old code reaching its maximum potential, etc.), of course one should stick to improving the existing one. If, however, the reason is so that "we could have something new", or so that "we could say we did a rewrite", or "I'm the new architect around here. Scrap the old code and write my design", then of course rewrites might be more trouble than they're worth.
All common sense.
From the Perl 6 development webpage:
"The internals of the version 5 interpreter are so tangled that they hinder maintenance, thwart some new feature efforts, and scare off potential internals hackers. The language as of version 5 has some misfeatures that are a hassle to ongoing maintenance of the interpreter and of programs written in Perl."
For me, this is a necessary and sufficient condition for rewriting something.
Another one is: When changing the original will take longer than rewriting from scratch.
HCG 50a = 2MASX J11170638+5455016
11h17m06.4s +54d55m02s
Enlightenment DR17.
Stating on Slashdot that I like cheese since 1997.
Every successful piece of software I've ever worked on was rewritten at least once, by the same team (or by myself on private projects) in the process of development, fully or at least partially.
The fact of the matter is, even if you hire an expensive architect and have him do a good job, he's not a God. When you develop software some parts of it tend to become ugly as heck and you can't help but think on how to do the same thing better and/or with less effort, so that it won't become a PITA to run, maintain, improve and extend. When you reach critical mass, you become "enlightened", throw some shit away and rewrite it to save time later on. In all cases where I've seen it done I think it was worth the extra effort. I also think re-engineering code as you go saves money long-term if it's done reasonably.
All of this, of course, doesn't apply to those who start their separate standalone projects even though there are dozens of other reasonably good projects to contribute to (and maybe rewrite some parts of). Freshmeat.net is full of examples.
While the article is a good rant, it's just wrong sometimes. For instance:
* He says that IPv6 uses 64 bit addresses. It uses 128 bit in reality. You would think that, if you were saying why something was bad, you'd do some basic research?
* Also in the IPv6 stuff, "TCP/IP works pretty well". So? TCP/IPv4 and TCP/IPv6 are the same damn thing. That's not an argument against IPv6, it's an argument for knowing what you're talking about.
* Perl. Sorry, the reasons for moving to the model in Perl 6 are well documented and sane. There are some problems with Perl 5 that we can't get around without losing backwards compatibility (syntax braindamage, for instance).
* Mozilla. Ok, it's slow. The Mozilla team even admits it at this point. MozFirebird is better. The reason for starting fresh wasn't speed, it was because the old codebase sucked.
* HTML. Having a language for both layout and data sucks. Splitting it into 2 parts is much better. There are developer perks, too (no rewriting the website to make it look different, no playing with layout to add data).
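The address-size point in the IPv6 bullet is easy to check. A small sketch using Python's standard ipaddress module, which reports the number of bits in each address family; the address_count helper is my own addition for the arithmetic, not part of the module.

```python
import ipaddress

# max_prefixlen is the total number of bits in an address of that family.
v4_bits = ipaddress.IPv4Address("0.0.0.0").max_prefixlen
v6_bits = ipaddress.IPv6Address("::").max_prefixlen


def address_count(bits):
    """Number of distinct addresses representable in the given number of bits."""
    return 2 ** bits


if __name__ == "__main__":
    print("IPv4 bits:", v4_bits, "addresses:", address_count(v4_bits))
    print("IPv6 bits:", v6_bits, "addresses:", address_count(v6_bits))
    # How many times larger the 128-bit space is than a 64-bit one.
    print("128-bit vs 64-bit factor:", address_count(128) // address_count(64))
```

So a 64-bit figure is off by a factor of 2**64, not by a rounding error.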
The basic point he seems to be missing is: a major version change (1 to 2) is supposed to be a radical update. The version system used by the kernel (and a lot of OSS projects) is based on that. Major.minor.revision. Bump revision when making bug fixes, bump minor when adding features (without breaking too much API), bump major when it's something new altogether.
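The Major.minor.revision scheme described above can be sketched in a few lines of Python. The bump function and its change labels are hypothetical, and resetting the lower components to zero on a bump is a common convention that I am assuming here, not something any particular project is guaranteed to follow.

```python
def bump(version, change):
    """Return the next version string for a 'major', 'minor' or 'revision' change.

    Assumption: lower-order components reset to zero when a higher one is bumped.
    """
    major, minor, revision = (int(part) for part in version.split("."))
    if change == "major":      # something new altogether
        return f"{major + 1}.0.0"
    if change == "minor":      # new features, API mostly intact
        return f"{major}.{minor + 1}.0"
    if change == "revision":   # bug fixes only
        return f"{major}.{minor}.{revision + 1}"
    raise ValueError(f"unknown change type: {change!r}")


if __name__ == "__main__":
    print(bump("1.3.7", "revision"))
    print(bump("1.3.7", "minor"))
    print(bump("1.3.7", "major"))
```

Under this reading, a rewrite that breaks compatibility is exactly what a major bump is meant to signal.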
-- Bill "Houdini" Weiss
Netscape 4 is horrible. Its usage is actually slowing down adoption of Mozilla and other far superior browsers. Once we start creating web sites with standards rather than with code that looks like HTML we'll have smaller browsers that can do things much faster than what Mozilla can do today. Indeed Mozilla isn't just one browser but multiple browsers for all the F'ing crappy implementations of HTML there have been. Just look at the page this article is on. It's laden with mistakes, and isn't even standard HTML 4.0!
This guy would prefer to see the net stop growing than see some change so he doesn't have to rewrite some stuff. Lazy ass.
Three years ago I wrote a web app based on Perl that is currently the heart of one of the tasks my company does. I am in the process of completely rewriting it in PHP, using no code or concepts from the first iteration in the new release.
Why? I have better ways of doing things now, I need to be scalable to handle a worldwide company instead of simply a regional tool, and I need to increase speed, usability and stability.
A rewrite is the only way to achieve these things. Anyone who has been with a project for an extended period of time and had to expand/modify it beyond its original capabilities knows this.
Do not look at laser with remaining good eye.
Don't like Mozilla? Use Mozilla Firebird. Honestly, I can't think of any browser I've used that is better than Firebird (especially with the addition of extensions). Firebird should be enough proof to this guy that Mozilla was a step in the right direction.
When I coach CS students I sometimes say to them:
There comes a time in every software development project where the best thing to do is to delete all code and start again, perhaps you are there now?
But that is usually when a student has started coding without knowing squat about how to solve the assignment and has realized it after a couple of hundred lines of code.
This is my sig, show me yours
Ad-hoc fixes to particular problems lead to code bloat. Excessive code bloat leads to a desire to rewrite.
A full rewrite may have a cleaner architecture, but often, those fixes for particular, tiny problems are lost in the rewrite ("what is this two-line if supposed to do?").
The solution? Redesign, but get to the new architecture from the old architecture by refactoring, one small step at a time. To do it quickly and with confidence, do unit tests. Lots of unit tests. Ideally, those tiny problems that prompted the fixes should have unit tests specifically built to trigger them.
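A rough sketch of what such a test could look like, using Python's unittest. The display_name function, the bug it guards against, and the run_regressions helper are all invented for illustration. The point is that the test name and docstring answer "what is this two-line if supposed to do?" and keep answering it through every refactoring step.

```python
import unittest


def display_name(first, last):
    """Hypothetical function under refactoring."""
    # The "two-line if": looks pointless until you know about single-name records.
    if not last:
        return first
    return f"{last}, {first}"


class DisplayNameRegression(unittest.TestCase):
    def test_normal_name(self):
        self.assertEqual(display_name("Ada", "Lovelace"), "Lovelace, Ada")

    def test_single_name_has_no_stray_comma(self):
        """Invented bug report: records with no last name rendered as ', Cher'."""
        self.assertEqual(display_name("Cher", ""), "Cher")


def run_regressions():
    """Run the regression tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DisplayNameRegression)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_regressions()
```

Run it after each small step; if someone "cleans up" the if, the second test fails and says why the if was there.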
My website
The Problem: Rewrite Mania
...
Waaaaaaa!!
Case 1: IPv4 vs IPv6
Waaaaaaa!
Case 2: Apache 1.x vs Apache 2.x
Waaaaaaaaaa!
Case 3: Perl 5.x vs Perl 6
Waaaaaaaaa! Waaaaaaaaaaa!
Case 4: Embperl 1.x vs Embperl 2
Waaaaa!
Case 5: Netscape 4.x vs Mozilla
Waaaaaaaaa!
Case 6: HTML 4 vs XHTML + CSS + XML + XSL + XQuery + XPath + XLink +
XML is hard! My HTML for Dummies book weighs too much! Waaaaaaa!
Case 7: Windows 2000 vs Windows XP vs Server 2003
Waaaaaaaa!
Conclusion: In Defense of "good enough" and simplicity
Waaaaa waaaaaaaaa!
"Considered Harmful" is Considered Harmful.
One point that the author seems to miss is that there are better and worse ways of doing a rewrite. Several of the examples he mentions (notably Apache 1 vs 2 and Perl 5 vs 6) are being handled very well. Development on the old versions is continuing while the new versions are being improved essentially in the background. That means that nobody is forced to upgrade until the new version actually provides them with enough tangible benefits that the switch is justified.
Perl is an especially good example because the new version is actually separating the language specification (Perl6) from the Virtual Machine (Parrot). Parrot will be flexible enough to run both the old and new language specifications, so even people who don't want to rewrite their scripts will benefit from the performance enhancements. Combined with continued development of the existing codebase, this makes Perl very future safe, all while offering the potential benefits of a complete code rewrite.
There's no point in questioning authority if you aren't going to listen to the answers.
I'm not sure that the author of the story really discusses the give and take of patching an old codebase, vs a complete rewrite. Instead, he focuses on a negative that isn't really there.
As soon as I read the headline, the first apps that sprang to mind were Sendmail, and WuFTPD. Both have been historically full of holes, and a complete mess. I haven't really looked at Sendmail code, but having to configure each option with regular expressions, while powerful, is just lame (IMO). The WuFTPD code is a mess. It's been passed on and passed on, and patched and patched. It eventually became a total whore that nobody really wanted to touch on any level.
Now, both of these (AFAIK) were not rewritten from scratch, and suitable replacements have been produced all over the place. However, would it have been so bad to rewrite those from scratch, while still maintaining the older versions? How would it be any different from, say, the Linux kernel? I run 2.4.x on my production machines. 2.6 is out, but I'm not going to run it until it's proven itself elsewhere (and is integrated into a mainstream distribution). 2.4 will be maintained for a long, long time -- and it's not even a complete rewrite (AFAIK). Usually code rewrites are adopted by the public...not right away, but eventually.
Finally, his gripes about Mozilla/Netscape are interesting, but not really warranted (and he does acknowledge this). The applications became more bloated as system resources became more plentiful. Software tends to do this -- it has to do with greater layers of abstraction as hardware gets better. But furthermore, it's because Mozilla had to be able to "compete" with the latest greatest from Microsoft...which MSFT will always be updating as new standards are added.
The point is, it doesn't really matter. It doesn't do a disservice one way or the other, and since much of the software we're talking about is Free Software, it matters even less, since the code is out there -- if there are enough people using the older versions, there will always be someone to maintain it.
-Turkey
I disagree.
Case in point, Windows 2000, AKA NT 5. When in its original development, it was being built on Windows NT 4.0 technology. When they realized that this "upgrade" added more problems than it solved, and subsequently was unable to keep up with newly emerging technologies and standards, they decided to scrap it and start from scratch.
Similarly, this is why Windows XP is 5.1.2600 (if I recall correctly), because it was built on NT 5.
As opposed to Win 9x which was just a modded kernel and added dll clutter.
Just because you can mod me down, doesn't mean you're right. Shoes for industry!
Exactly when did Netscape ever work well on Linux?
All I remember is consistent crashing from Netscape Gold through the finally-put-down Netscape 4.x. It was the biggest piece of shit browser ever written precisely because its codebase was old (forked from NCSA Mosaic in 1994, which itself was much older) and non-extensible, yet more and more shit was thrust into it. It had to be rewritten, and all the Gecko-based browsers have been much more feature-complete and reliable for the past 2-3 years than Netscape ever was.
I use Galeon, and the thing basically never crashes. Back in 1999, I considered myself lucky if a particular version of Netscape 4.x only crashed once every half-hour.
[ home ]
Maybe times have changed, but when I started my career as a _maintenance_ coder, there were two ways to do it:
1) The usual way: fix small sections of code in the same style and technique that it was originally written,
2) rewrite large sections of code that were _truly_ hard to maintain, taking great care to leave something much more maintainable behind. This route requires much more thorough testing than (1).
I remember another of us "programmers" who said he didn't do maintenance, he was a "development animal." Wrote abysmal code. When he rewrote a major module of our system he tried to make FORTRAN look like ALGOL, using GOTO statements in the righthand margin of the code.
I bet that module got a rewrite not long after that. Something that was maintainable and written for the language that was being compiled, not the language that didn't even exist on our system.
I do not concur with the author's seemingly blanket assumption that a complete re-write of codebase is wasteful. There are times when it is necessary for both practical and philosophical reasons.
From the practical standpoint, and as suggested by other astute readers, oftentimes the initial specs did not sufficiently anticipate future growth. Needless to say, it is a poor programmer who does not anticipate this from a programmatic perspective and do his/her/its best to provide a sufficiently robust framework that allows at least one order of magnitude growth in a primary spec. On top of this, standards change, new ones emerge, "paradigms" shift, needs change and so on -- at times it just makes sense to start from scratch. You are not going to build a business building on top of your house's foundation...it just is not scalable to the new needs.
Philosophically, I think it is worth tearing down the structures and building anew at times. Too much incremental growth can lead to long-term stagnation as the original skills to build the foundation are lost through inactivity. As an aerospace engineer I can see it now, where too much information and too many processes have become institutionalized -- I fear what would happen if ever we needed to do it from scratch.
The Gnome desktop environment is a prime example of disasters through re-writes.
As we all know, Gnome's original purpose was to provide a free rival to KDE, which was the first easy-to-use Desktop Environment for Linux. This was back before Qt was GPL.
Unfortunately for Gnome, its problems started as it kept replacing and rewriting core components. For example, it started out with the Enlightenment window manager, then it switched to Sawfish, then it switched to the buggy and slow Metacity. Metacity has had many problems, and most people want the old Sawfish back, but Havoc Pennington refused to do it and insists that people use Metacity.
The file manager keeps changing too. First it was GMC, then it was the slow and buggy Nautilus from the now defunct Eazel corporation, and now they are writing a new Windows 95-like file manager for Gnome called Spiral Nautilus.
It also rewrote the graphics layer GTK and broke compatibility with GTK 1.x. There are many legacy GTK apps still in wide use and they look ugly on newer desktops.
There are also the many problems with the file dialog, a fix for which is only now emerging in GTK 2.4. This is also incompatible with older GTK versions. This means that if you want to use a new program, YOU HAVE to upgrade to Gnome 2.6, and can't keep your legacy Gnome 2.0, 2.4 desktops.
They keep switching default apps; for example, Galeon was dropped in favour of the buggy and far less featureful Epiphany in 2.0. They also dumped several other applications that were useful.
To make matters worse, it is moving away from the old philosophy of simple text files and is using an XML-based registry clone to configure stuff. KDE keeps the text file format underneath and has had a standardized API for it.
It also has a lack of true integration; Miguel de Icaza has PUBLICLY ADMITTED that Bonobo was a failure. KDE has had this BUILT in from day one using KParts technology, which is now being used in Apple's Mac OS X Panther Edition.
Gnome developers, realising they kan't kompete with KDE technology, have spread various FUD about KDE, but the message is getting through. Red Hat has abandoned their Gnome desktops, and Fedora developers are working hard to make KDE 3.2 the default desktop for Core 2. Debian, who have traditionally been pro-Gnome, have announced their full support for KDE and they are working hard to make KDE the default desktop for
KDE on the other hand has kept consistent technology and internally has changed very little since 2.0. Distros like Lycoris are still using 2.x because it is very stable and mature. KDE 3.2 will be a good example of why maturity, and not wheel inventing, is a better idea overall. They have taken their technology and have optimized it for usability.
Gnome 2.6 will need more than just propaganda about the HIG if it is going to get the attention it needs, but instead it looks like they are reinventing wheels again.
I'm not sure of the relative validity of these points... though I agree with the sentiment.
1. This is a Cisco problem, a general problem is in the points below:
2. I didn't know there were "16.7 million addresses per square metre of the earth's surface, including the oceans", interesting. The problem with the present system is the 'chunking' of IP addresses... large groups of IP numbers lie dormant, reserved for academic/government institutions, and the portion of private addresses is being squeezed in places. The IPv4 problem is an allocation problem, but allocations are determined by committees, and as soon as disparate and opposed committees get going, progress and action stop for years. Yes, it is annoying bureaucracy, but a technical work-around may well be easier than fighting this pettiness.
3. A few years ago the internet was here and it did its job while the processing power of out-of-the-factory internet infrastructure was so much lower than it is today. Volumes are higher now, the dotcom boom of endless financing has gone (maybe?!) but I believe the internet with IPv6 and its higher volumes could be cheaper now than it was (I am happy to have my mind changed with hard data but not supposition and not opinion based on loose facts and assumed weightings). Legacy equipment and software has to be replaced sometime; all that COBOL was re-written pre-2K pretty easily.
4. See 3.
There may be "16.7 million addresses per square metre of the earth's surface" but if this can provide a solution for the next few decades, and is easier to overcome than the present bureaucracy in IPv4 allocation (which is huge and incredibly difficult to overcome IMHO) then it is worth it.
--
FreeNET user? Comfortable with the adverse selection?
Here's a much better article with a similar thesis: Joel on Software - Things You Should Never Do, Part I
There are parts of it that I've never agreed with:
This should never happen! If you have all these bugfixes in your code and no way to know why they were put in, you've screwed up badly. You should have each one documented in:
So the idea that you'd have all these important bugfixes without any way of knowing what they are should be laughable! Given a codebase like that, you probably would be better off throwing it out, because it was clearly developed without any kind of discipline.
Also, he's embellishing a lot. If it's just "a simple routine to display a window", it doesn't need to load a library, require Internet Explorer, etc., and thus can't possibly have bugs related to those things. He makes the situation sound a lot more extreme than it really is.
But in general, I think he's right. Refactor, not rewrite. That's the same thing the XP people say to do. They also have extensive unit tests to make it easier to refactor with the confidence that you haven't screwed anything up. Which can help in situations like this:
Ugh. I bet it would have been a lot less tuning if there were a decent way to test that the change to support #60 hasn't broken any of the previous 59 server types. Or that just a refactoring hasn't broken any.
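Something like the following is what I have in mind. It is only a minimal sketch: the parser, the two "server types" and the sample lines are all invented for illustration, not taken from any real product. The idea is to keep one recorded sample per supported server type and rerun all of them after every change.

```python
# Hypothetical example: every name and sample line here is invented.

def parse_listing_line(server_type, line):
    """Return the file name from one directory-listing line."""
    if server_type == "unix":
        # e.g. "-rw-r--r-- 1 user group 1024 Jan 01 12:00 notes.txt"
        return line.split(None, 8)[8]
    if server_type == "dos":
        # e.g. "01-01-04  12:00PM  1024 notes.txt"
        return line.split(None, 3)[3]
    raise ValueError(f"unsupported server type: {server_type}")


# One recorded sample per supported server type, with the expected result.
# Supporting server type #60 means adding a row here, not deleting old ones.
KNOWN_CASES = [
    ("unix", "-rw-r--r-- 1 user group 1024 Jan 01 12:00 notes.txt", "notes.txt"),
    ("dos", "01-01-04  12:00PM  1024 notes.txt", "notes.txt"),
]


def run_regression_suite():
    """Return a description of every recorded case that no longer passes."""
    failures = []
    for server_type, line, expected in KNOWN_CASES:
        actual = parse_listing_line(server_type, line)
        if actual != expected:
            failures.append(f"{server_type}: expected {expected!r}, got {actual!r}")
    return failures


if __name__ == "__main__":
    problems = run_regression_suite()
    print("all server types OK" if not problems else "\n".join(problems))
```

With a table like that, a refactoring or a new server type either leaves the failure list empty or tells you exactly which of the older types you just broke.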
I don't think this advice always applies, though. I rewrote one major project from scratch at work: our personnel system. Our database schema was hopelessly denormalized and broken. That's not something you can refactor easily. With a widely-used database schema, it's easier to make one big change than many smaller ones, because a lot of the work is just hunting down all the places that use it. That's easier to do once. So I believe there are situations where this advice does not apply, but I also believe they are rare.
It's hard to write code that is robust enough to not need rewrites. The ability to do that is what separates the really good programmers from amateurs like myself. It's the difference between being a piker (like myself) and an engineer.
I'm not a great programmer, and don't do it regularly, but when I have written fairly big projects, I find that the need for rewrites came out of poor design choices that I had made.
I typically start out with something small, that can handle the core functionality expected from the project. Then I try to add features and fix bugs.
Eventually, the code becomes very difficult to maintain, and ultimately, you get to the point where the ad-hoc architecture simply won't support a new feature.
To the user, everything looks fine, everything runs reliably, but under the hood, there are real problems.
My worst experience was with a web app. I started out with script based pages in ASP (not my call), and kept writing new pages to do different things. It got to the point where I had about three hundred script pages and lots of redundant code.
When it would become necessary to change the db table structures for another app hitting the same data, I'd have a lot of trouble keeping up, fixing my code quickly in a reliable way.
The problem was that it just wasn't possible to stand still. I couldn't go to my boss and say, "I need a three month feature freeze, to rewrite this stuff."
Writing a new version in parallel was hard because maintaining the crummy but functional code was taking more and more time. It was a real problem, and caused me a fair amount of pain, and suffering.
After digging myself into that hole, I stepped back and tried to figure out how other people did it. I would have been a lot better off building on top of something like Struts.
The lesson I took from this is that it's important to study design patterns, and to use tested frameworks whenever possible. You have to think like an engineer, and not someone who codes by the seat of his pants. I'm not an engineer, so it's not easy for me to do that.
I'm not saying that the people who run the projects mentioned are in the same boat that I was. As programmers, they're in a different league.
But they're often working on problems that aren't well understood. Patterns and frameworks are ways to leverage other people's experiences. But if that experience doesn't exist, you have to guess on certain design decisions, and see how it comes out.
Top notch programmers are obviously going to guess a lot better than someone like me will. But they're still going to make mistakes. When enough of those mistakes pile up, you're going to need to do a rewrite.
You could make a point that's opposite of the one that the article makes by looking at the java libraries.
They made choices with their original AWT GUI tools that were just wrong. They weren't dumb people; they just didn't know. The experience necessary to make the right choice simply didn't exist. Once they tried it, they realized it wasn't working, and they came back with Swing.
Rewrites are always going to be necessary for new sorts of projects, because you can't just sit in your armchair and predict how complex systems will work in the real world. You have to build them and see what happens.
Once all of the old code has been either pasted back in, revised or deleted, I've usually got a program that does everything the old one does and more, but it is smaller, simpler and cleaner.
Most of the subtle features and knowledge embedded in the old code is not lost by using this approach; it gets pulled back in.
There comes a time in any software's life when a rewrite IS the correct decision.
To put it in real world terms...
If you take a single floor home and start adding floors to it, you won't ever turn it into a skyscraper. At least not one I'd ever want to be near.
If you want a skyscraper, you bulldoze the house, design the skyscraper and build it.
A lot of early design decisions can really haunt you later. Like the Apache threading example in the article.
-Jerry
The story says, in part...
"Examples include IPv4 vs IPv6, Apache, Perl, Embperl, Netscape/Mozilla, HTML and Windows"
All props to IPv4 and all, but I don't think it stands a chance against all of those put together (even with Windows on their team).
RP
In the company that I work for, we have been running our CGI "legacy" financial application for the past 8 years. After the first 4 years, it worked fine with no more problems! Scalability, memory leaks, etc., etc. were all found. The application just hums along doing its job and the users are happy.
But 4 years ago management had to jump on the J2EE bandwagon and introduce the "Java" version of our financial product. Here it is FOUR years later and our "new" Java app is still not in production because of spec changes, clustering issues, etc., etc., etc.
I kept telling them K.I.S.S., but sales said that we need the new buzzwords to get clients and everyone knows about "Java". Hell, the way I see it, every time management changes their mind, it just adds to my job security since we need to make more changes.
Rewrites can be good or bad depending on the goal and the understanding of the people doing them.
For a good rewrite to occur the following need to be true:
- emphasis on smaller/simpler code, NOT adding new features (new features may come about "for free" as part of the new design, but must not be added in as extras at this stage)
- the person/team doing the work should have a full understanding of the whole architecture of what is being rewritten
- full access to all the previous bugs and bugfixes to make sure the new version addresses all these problems
- fairly complete regression testing should be available to compare the old and new versions
- reduce/simplify/refactor as much as possible; identify common patterns and eliminate the redundancy, and in so doing hopefully eliminate bugs as well
It's a pretty major chore that requires a lot of understanding and very competent people to head up the effort. But if done right, it's well worth the effort, because if it can maintain compatibility (or at least mostly) and greatly simplify the source base, it will dramatically decrease the maintenance time needed in the future, and can naturally wipe out many potential bugs at the same time, reducing future debugging time.
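The regression-testing point above can be sketched as a side-by-side comparison: feed the same inputs to the old and new versions and report every disagreement. The two routines below are deliberately trivial stand-ins, and all the names are hypothetical; in practice the inputs would come from old bug reports and recorded production data.

```python
import random


def old_total_cents(amounts):
    """Stand-in for the legacy routine: accumulates by hand."""
    total = 0
    for amount in amounts:
        total = total + amount
    return total


def new_total_cents(amounts):
    """Stand-in for the rewritten routine: shorter, same intended behaviour."""
    return sum(amounts)


def find_mismatches(old_fn, new_fn, inputs):
    """Run both versions on every input and collect any disagreements."""
    mismatches = []
    for value in inputs:
        old_result = old_fn(value)
        new_result = new_fn(value)
        if old_result != new_result:
            mismatches.append((value, old_result, new_result))
    return mismatches


def sample_inputs(count=200, seed=0):
    """Hand-picked edge cases plus fixed-seed random inputs (repeatable runs)."""
    rng = random.Random(seed)
    cases = [[], [0], [-1, 1]]
    for _ in range(count):
        length = rng.randint(0, 20)
        cases.append([rng.randint(-10_000, 10_000) for _ in range(length)])
    return cases


if __name__ == "__main__":
    diffs = find_mismatches(old_total_cents, new_total_cents, sample_inputs())
    print(f"{len(diffs)} mismatches between old and new versions")
```

The fixed seed matters: a mismatch found on Monday should still be reproducible on Tuesday.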
Most of the examples given needed rewrites to remain viable. It's easy to look at a package from afar and declare it "perfectly sufficient". Things look different when you have to work with a system daily. In particular, rewrites often address shortcomings in a system's capacity for extension. Just compare the number of third-party extensions available for Netscape 4.* vs. the number now available at mozdev.org for Mozilla and Firebird.
A bigger problem, to my mind, is when a half-dozen projects with the noble intention of replacing an aging kludged-up tool are started, all of which suck in different ways, and none of which learn from each other. And then they lose momentum and stagnate.
Examples? Most programmers agree that "make" is overdue for replacement, but despite many attempts (cmake, jam, cons, ant) no one has managed to come up with one that is compelling enough to catch on. CVS is a crufty mess, but none of its potential replacements are mature enough or have the kind of widespread tool support to make much of a dent in CVS installations. And there are dozens of written-from-scratch applications which differ primarily in the GUI toolkit they are based on, which would be better apps if they incorporated the best features from all into a joint effort. My idea of the perfect browser combines features of Konqueror, Galeon, Epiphany, Firebird, and Safari.
--
CPAN rules. - Guido van Rossum
There are always ways to improve code. Much of the time you'll end up with a much smaller, much more efficient, much more extensible application.
Rewriting is almost always a good thing. The rules of writing English papers apply here.
At a certain point, certain portions will mature to the point that they can't or needn't be improved with each successive version. If you're content with the architecture, you'll reach this stage. But not many applications can evolve to new heights with the same diagram/layout it started with 5 years ago.
This is just a poor attempt to get noticed.
clifgriffin > blog
It's really common to build something, step back, examine its warts, and start over again with a new perspective and understanding. It's called prototyping. Some people actually build the first one with the intent of throwing it away. Others release it as v1.0, and introduce issues of the kind this author is referring to.
There are many reasons you might prefer a rewrite. The main one, to me, is that complicated applications contain layers and dependencies, not all of which are obvious to a new programmer. If, after some analysis, your assumptions about these dependencies are wrong, you'll break the original code faster than you can say "global variable". In the end, you could easily spend more time and effort patching and praying than you would rebuilding from the ground up.
Of course, if some of the original architects are still involved in the project, architectural knowledge and assumptions can be transferred to new programmers in a fairly fluid way, and I suspect it is in these cases where you can confidently add on to an existing code base.
And it's always helpful if the previous programmers were actually good programmers who wrote code and comments that were mindful of those who might follow them later. But that's not within your control.
I've had the fortune to be affiliated with the Icarus Verilog compiler/simulator effort over the last 3-4 years. The first version of the code had some specific design decisions that made scaling simulations beyond a thousand or so gates impossible.
The author chose to throw out his simulation engine and much of his code generation and adopt a completely new model. It took him the better part of a year to get roughly where he was with the original code base as far as functionality is concerned. He also has a regression environment with several hundred tests he uses regularly to let him know how he is doing with respect to functionality. About 2 1/2 years into the rewrite period, Icarus is now handling behavioral code of 1 million gates at about 80% of the performance of commercial tools!
Was the rewrite needed? YES! Did it take a while? YES! Was it worth the wait? YES!
Have you compiled your kernel today??
How is it possible to so completely miss the point of Perl 6? The intent is not necessarily to replace Perl 5. Perl 5 is fantastic and the Perl 6 developers above all people know this. Perl 6 is perhaps best thought of as a DIFFERENT LANGUAGE which will 'just happen' to be, in many places, very similar/identical to Perl 5.
Once you start thinking of Perl 6 in that manner, you realise what it's for. It's not to replace all of the Perl already out there. It's to provide a new tool, a new language for doing new things in, drawing on the experience gained in years of working with Perl 5 and other languages.
Ponie, of course, is part of the effort to make sure that at least some of the vast amount of Perl 5 code is usable with Perl 6, should programmers wish it. And even that's not a total rewrite of the existing Perl codebase.
So ultimately, that article has nothing of use in it. Yes, programmers should be careful what they rewrite and when they rewrite it, but many times such things are actually worth it. GTK+ 2, anybody?
Miri it is whil Linux ilast...
Seems common in other areas of engineering also. Bridges could just be retrofitted, buildings added on to, but sometimes there are too many unknowns in engineering old structures. Are the building materials made from asbestos? How has the structure held up after so many years? Have other modifications extended or complicated further modifications beyond that which the original plans called for? Sometimes the unknowns themselves justify building from scratch. Sure we could just keep tacking on new technologies to old, but the result will seldom be better. More often the real advancement comes from taking the knowledge gained from past experiences and applying it to new work, rather than actually taking old work and trying to make it work in a new situation.
Would you really want horses running on a treadmill attached to the front of your car, just because humanity wouldn't want to throw away its previous investment in transportation technology?
Well, it was professionally written, except for the use of phrases and words like "pain in the ass", "jackass", and "asshole".
There are 0x40000000 types of people: those who understand 32-bit IEEE 754 floating point, and those who don't.
Having done this for years, I think I'm starting to figure out why I do it, and perhaps someday I'll be able to stop myself from doing it, so that I can actually release something :-)
I think the need to rewrite is more emotional than intellectual. As I work on an existing codebase, I notice the little bumps and warts on it, the little "tweaks and fixes" which make it work, and I find them ugly. For some reason, I place the highest aesthetic value on code that was written in one big, flowing session, where the entire structure was understood from the beginning, and the entire thing looks like it was born fully-formed from some supernatural source.
In an ever futile attempt to realize this goal, I constantly chuck out perfectly good code and redo it from scratch. I do this because I seek the emotional experience of those few times when I really do sit down and blast out something that's beautiful, elegant, and functional. Even if, practically, it's no better than before.
Open source programming is often described as scratching an itch. It should be immediately apparent why this correlates to extensive rewriting of code. Some problems are simply enjoyable to solve. The necessary thinking feels good. Just as we watch a good movie again and again even though we've got the plot memorized, some programmers want to rewrite the same functionality repeatedly because it just feels good.
To hell with practical considerations, like whether or not that's "bad" for the codebase. I program for pleasure.
I have a 400 MHz PC at home. Netscape 4.7 runs acceptably fast. Mozilla is a hog. So I'm sticking with Netscape 4.7.
It's also useful to have that browser around when doing web development, to ensure that my sites look OK in the older browsers. There are still a lot of Netscape 4.7 browsers floating around out there.
That said, I use Mozilla on both my 1.8 GHz laptop and my 2.0 GHz work PC.
Like woodworking? Build your own picture frames.
> But you have to consider what Netscape would be like if it had had the amount of work put into it that mozilla has now.
If you knew the history of the Mozilla project, you would know that modifying the old Netscape code would have gone nowhere.
Before the Mozilla developers decided that a rewrite was necessary, they spent the better part of a year trying to improve the original Netscape code.
But the code was so bad that they couldn't attract any developers -- no one was willing to work on it.
Trying to work on the old code was boring, and difficult, and required huge amounts of effort for very little gain. Also, the code was not modularized properly, which meant that very few developers could work on it at the same time without constantly tripping over each other.
If they had simply tried to upgrade the old code, you would have a slightly better Netscape browser today (assuming they didn't give up entirely).
But the rewrite attracted large numbers of developers, and produced some real innovations.
Today, Mozilla is a much better browser -- head and shoulders above IE -- with better stability, better standards support, and features such as tabbed browsing, and pop-up blocking.
But more than that, the Mozilla project has given us a powerful cross-platform development toolkit, with the XUL user-interface facility. This has not only created a new field in developing Mozilla plug-ins, but is being used for the construction of many other products.
The original poster was right. The author of the article is talking nonsense.
I seem to remember the person in charge of developing Internet Explorer for MS saying this *exact* same thing. In fact, he claimed this was the reason MS won the browser war: Netscape lost all their years of work because they decided to re-write their browser, while IE tweaked theirs. I'm sure I read this on /. somewhere, but I can't find the article.
:(
I know a lot of it had to do with MS's business tactics, but Netscape/AOL took like 5 years to put out a new browser after 4.7. And do you guys even remember Netscape 6 Preview? What a god-awful browser that was. My friends started calling it Nutscrape cuz it was so painful to use.
Those instances of Slashdotter are not necessarily the same person. Furthermore, Microsoft would be totally braindead to make a business plan from Slashdot comments. BillG probably isn't that dumb, despite what we wish for ;-)
__CmdrTHAC0__
In Soviet Russia, Spanish Inquisition doesn't expect YOU!!
If all you're browsing are pages served from a single domain, consisting primarily of flowed elements (headers, lists, images, and that's about it) with pages that are fairly short.
Start adding tables and forms, trying to reflow the page when resizing (especially if it's a long one), and prepare for the wait of your lifetime.
At least mozilla can display part of a page while the rest renders, and resolve more than one domain name at a time when connecting to resources in parallel.
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
I usually find Jamie Zawinski to be an arrogant rude asshole, but occasionally our opinions overlap. In this brief rant he describes the Cascade of Attention-Deficit Teenagers software development model, which often leads to rewriting code from the ground up. Over and over and over.
Stay out of that trap, and actually fix stuff during your rewrite, and there's nothing at all wrong with doing it over from scratch. Rewrite it just because you don't feel that modifying other people's code is sexy enough, or that your version will surely be bug-free -- because, hey, it's you -- or because "you would have done things differently," and you'll have failed.
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
Another great example not mentioned by anyone yet is the excellent Opera Internet browser. It isn't always rewritten from scratch, but overall there are enough changes in each new major version to make it almost unusable, at least to me. Every time a new version (3.0, 5.0, 6.0, 7.0) is rolled out, many little things no longer work as they did, and sometimes they are clearly and unequivocally broken.
:)
Before I knew better, I used to download the release versions (not betas or RCs), but each and every time I ended up uninstalling the new version and switching back. It usually took more than a month and about 10 updates for a new version to reach relative maturity. Witness 3.21, 6.05, 7.20: only these versions could be considered better than their predecessors in all respects. With version 7 I succumbed at about 7.1, but next time I will really know better and not even consider Opera 8 until there has been a month without updates.
On a more serious note, I think there is a moment of maturity in almost every product's lifetime, a moment when new features can no longer justify an upgrade (other things, such as compatibility, being equal).
Future Wiki -- If you don't think about the future, you cannot have one.
While, for example, BSD's 'ls' program can be tracked all the way to the seventies, GNU people of course rewrote it just for the license sake.
A nice example is the 'ping' tool. The story of ping tells how the program was conceived and made, and FreeBSD's current ping.c is based on it:
That's a codebase 21 years old and still viable!
-- Sig down
I sent this reply to the author through the site, but it would probably get some use here too.
"The Web was based on the idea that a simple markup language could allow us to divorce document presentation from document structure"
Which HTML 1.0 through 3.2 didn't really achieve, admittedly...
"Some of the changes to HTML were done in a way that shouldn't break old browsers, but as I said before, I am increasingly seeing websites that don't render properly in Netscape 4.x"
There's a shock. I thought it was 2004, and you're still testing on a browser which is at least three major revisions old, never mind that Mozilla itself seems to be more useful than Netscape's rebadged browser.
"So apparently the FONT tag is deprecated - now we have to use style sheets and whatnot to do something that was originally very simple"
This is because "the web was based on the idea that a simple markup language could allow us to divorce document presentation from document structure", and the FONT tag is presentation appearing in the document structure. That's sort of like a divorce where the couple still sleep with each other.
"but at the expense of being able to do simple things quickly."
I beg to differ. Even if I really want to break style guidelines and make a chunk of text red for no particular purpose, it still takes the same amount of time to type <span class="red"> as it did to type <font color="red">. Never mind that this really is a bad thing to do. Why is it red? Is there a meaning to the red? Perhaps it should be <span class="important">, in which case why not just use <strong>?
"As a Web developer I have long wondered why they didn't add more types to the INPUT form tags to express different types - for example, a DATE attribute, or INTEGER, DOUBLE, or whatever."
Of course XHTML 2.0 will be partnered with XForms, which will attain this functionality insofar as any field which can store a value can be of an XML Schema type. This includes (wait for it) dates, integers, doubles, and arbitrary regular expressions.
"These "rich" (but simple! not XML!) attributes could then be seen by the browser and presented to the user in whatever way is supported by the system"
Hopefully they do this. I would love to see browsers implement a calendar popup. I can't count the number of times we had to use a JavaScript for this.
"But the direction we're going in, the HTML books have just become thicker and thicker over the last few years."
This I don't get. There are fewer tags now, right? It's the CSS and XSL books which should be getting thicker. By the way, never buy a book on XSL:FO. I accidentally dropped one on my foot, and Christ, it hurt.
I think the progression from HTML 4.0 through XHTML 1.0 to XHTML 1.1 was smooth. They're encouraging people to go back to the roots of the web: to mark up content depending on what it means, not depending on how it's supposed to look. Sites like www.csszengarden.com are living proof of how the separation of HTML and CSS can achieve excellent separation of concerns between the graphic designer and the web developer, and I'd personally love to see more sites such as this (only with real content!) pop up all over the place. If for no other reason than the pages loading faster due to many, many fewer tags in the HTML! :-)
Karma: It's all a bunch of tree-huggin' hippy crap!
However, if you're talking about a larger project such as a commercial software application with 50 man-years of development, a complete rewrite will usually be undertaken with a large degree of ignorance of the true problem domain. Also, you can rewrite your codebase to fit more nicely with the *current* state of your specs and requirements, but the new "elegant" design may be even less suitable to tomorrow's new feature request. Even rewriting a major subsystem from scratch can be a costly mistake.
The real trick is how to maintain the code in such a way that it continuously improves instead of just getting more and more riddled with spaghetti, dead code paths, and other clutter. It's especially hard, because it's easy to forget to treat a mature code base with respect, and just hack in "one more" thing because you are hoping to rewrite it at some point.
There is the "broken window" idea: one broken window will lead to an increasing spiral of vandalism, and one line of crud in a source file will give future programmers the feeling that they can add ten more lines of crud because the code is already "dirty". While the usual adage is to fix the window immediately, the reality is that most source files have one hundred broken windows already and fixing them all right now is not an option. What takes discipline is to make sure to leave each file in a better condition than you found it: remove some dead code, do a little refactoring to clean it up, rename identifiers or reformat code to conform with project-wide standards (NOT your personal pet style!!! This is very annoying when people check out a file, reformat it to their own preference, add one small feature, break some other part of the code, and check it back in to source control...). Another common problem is when people come up with some "new religion", convert about half the code over to the new way, but leave it in a "worst of both worlds" state because it turns out the "new way" was just "different" and added as many problems as it solved. It is easy to add code that goes against the grain of the existing code because you didn't bother to really understand the system and structure of the code, or because you don't "like" a certain design decision.
In many ways, the real achievement is that higher mental state where you stare at the huge, messy codebase that you've been working with for ages, have the "aha" moment, come up with a simple refactoring, make changes to fifty files, replace hundreds of lines of code with tens of lines, and it just works the first time and only takes a few hours, because for one fleeting instant, you had the whole thing in your head at once. I wish I could have more of those moments.
I'm as guilty as any for violating these ideas. I often keep my nice clean "new" subsystems tidy because they are already tidy and conform to my current design philosophies/religion, but let my big, mature, and often more important subsystems grow increasingly crudified because I have the continuing fantasy that I will be rewriting them from scratch "next project"...