
Rewrites Considered Harmful?

When is "good enough" enough?

Author: Neil Gunton
Categories: Software development


Copyright © 2004-2014 By Neil Gunton

Last update: Sunday July 20, 2008 08:30 (US/Pacific)

Table of Contents

The Problem: Rewrite Mania
Case 1: IPv4 vs IPv6
Case 2: Apache 1 vs Apache 2
Case 3: Perl 5 vs Perl 6
Case 4: Embperl 1.x vs Embperl 2
Case 5: Netscape 4.x vs Mozilla
Case 6: HTML 4 vs XHTML + CSS + XML + XSL + XQuery + XPath + XLink + ...
Case 7: Windows 2000 vs Windows XP vs Server 2003
Conclusion: In Defense of "good enough" and simplicity


This document collects some thoughts on the tendency to totally rewrite Version 2.0 of successful software (and standards) just because the first version is perceived to be "messy" and "unmaintainable". Does this really help anybody, does it result in better systems, and when is "good enough", well, enough? Are we doomed to a constant cycle of churning and rewriting? At the end of the article I suggest that there is a "cost" involved with totally rewriting any popular package or standard, which depends on three factors - the amount of accumulated wisdom that will be lost, the degree of incompatibility with the older version, and lastly the number of users of the old system.

Note: Since the original 2004 Slashdot story linking to this article, I've had quite a few comments relating to my total ignorance, complete idiocy, paucity of wisdom and general lack of higher brain function. Thanks to all who have emailed, I appreciate the corrections (my mistake on the Windows rewrite, apparently it didn't happen that way). Also, I have been informed that there is another article that is in the same vein as this, but rather better written - see "Things You Should Never Do, Part 1" by Joel Spolsky. My article was basically an early morning rant that got somewhat out of control and ended up being written down as an article on my home page, then posted to slashdot as a lark, and then torn to pieces by the slashdot crowd. Please try to see it for what it is - thoughts, observations and opinions about the way the world is. I hope you find it interesting, thought provoking or at the very least entertaining - if nothing else then you can have some fun telling me what an idiot I am in email. Thanks for visiting!

The Problem: Rewrite Mania

I have been noticing a certain trend in software toward rewriting successful tools and standards. It seems that programmers always have the urge to make things better, which is perfectly understandable - after all, this is the primary trait of the engineer's mind (although I think artistic creativity also enters into the mix). Why should things stay static? Surely progress is good, and if we just stayed in the same place, using the same versions of tools without improvement, then things would deteriorate and generally get pretty boring.

That's all very true, but what I am seeing is that in many cases we have tools which truly are "good enough" for what they are designed to do - TCP/IP allows us to build giant, interconnected networks, Apache lets us build flexible web servers, Perl lets us write incomprehensibly obfuscated code(!)... well, point being, these things work. Really, outstandingly well. They are "good enough", and moreover they are used everywhere. So all's well and good, right? Well, not exactly. The programmers add little bits and pieces here and there, fix lots of bugs, and over time the code starts to look distinctly messy - and with the insights gained from this "first version" of the application (I don't mean V1.0, but rather the overall codebase) the developers start to think about how it could be "done right". You know, now they know how they should have done it.

Fired with new zeal and enthusiasm, the developers embark on a grand rewrite project, which will throw out all the old, stale, horrible, nasty untidy code, and construct reams of brand new, clean, designed, and, uh, buggy, incompatible, untested code. Oh well, it'll be worth it ... right? So the new version will break some things that worked with the old version - but, the thinking goes, the benefits from the changes far outweigh the loss of backward compatibility. In their minds, the developers are more focused on the cool aspects of the new version than they are on the fact that in the real world, millions of people are still using the old version.

Eventually, then, the new version comes out, to grand fanfare. And a few people download it, try it... and it doesn't quite work. This is perfectly normal, these things need time. So all the people who are running large production systems with the old version just back off for a while until the new version has been tested properly by, uh, someone else. Thing is, nobody wants to use their own production system to test the new version, particularly when the new version is incompatible with the old version and requires config changes which would be a pain to change back to the old version when the new version breaks!

So what do we end up with? Two versions. The old one (which everyone uses) and the new one (which some people use, but is acknowledged to be not as reliable as the old one). Perhaps over time the new version becomes dominant, perhaps not... the more different it is from the old one (and the more things that it breaks), the longer it will take. People have existing systems that work just fine with the old version, and unless they are forced to, there is simply no reason to upgrade. This is particularly true for Open Source software, because since there are so many people using the older version, it will inevitably be maintained by someone. There will be bugfixes and security updates, even when all the "new" development work is officially happening on the new version. With closed proprietary software, it's different - companies like Microsoft have much more power to effectively force people to upgrade, no matter how much pain it causes.

But regardless of all that, I am left wondering what this all gets us. Is it really the best way? If software works and is used by lots and lots of people quite successfully, then why abandon all the hard work that went into the old codebase, which inevitably includes many, many code fixes, optimizations, and other small bells and whistles that make life better for the user? The older version may not be perfect, but it does the job - and, in some cases, there just doesn't seem to be a good reason to upgrade. New versions mean new bugs and new problems - it's a never ending cycle. And, in truth, aren't we reaching a kind of plateau in software development? Are we really doing anything all that different today than what we were doing ten years ago? Or even 20 years ago? When I learned C++ back in 1989, people were writing the same sorts of programs they are writing now, just with different APIs. Are the users really doing anything all that different? Word processing is word processing, after all, as is email and web browsing.

Anyway, here are a few examples of this trend that sprang to mind once I started pondering the subject...

Case 1: IPv4 vs IPv6

We are, supposedly, running out of IP addresses. IPv4 addresses use only 32 bits, and large chunks of the address space were handed out in the early days of the internet to institutions such as MIT, who have never used all of them and probably never will. Meanwhile, with the proliferation of mobile devices and wireless technology, there is pressure for every device that could be connected to the internet to have its own IP address. Apparently IPv6 will solve all these problems, with a brand new standard that uses 128 bits.

Or will it? TCP/IP works pretty well - it routes packets to their destination. The entire world is based on it. With the advent of NAT and the use of non-routable IP addresses to handle LANs, the need for every device to have a routable address is lessened. The issue of large blocks of addresses being tied up by entities who can't use them all in a million years becomes one of politics and diplomacy, rather than one demanding a new standard.

Many, many applications work with IPv4 and assume that addresses will be 32 bit, not 128 bit (as IPv6 specifies). So we have the classic situation where the old version, IPv4, works "well enough", not perfectly, but reasonably. The internet has not melted down, and shows no sign of doing so. Sure, the address shortage is an issue, but we as humans have a tendency to look for the path of least resistance - and it sure seems easier to re-arrange all those tied-up blocks of existing addresses than it does to rewrite every application in the world that assumes IPv4. The current system works, and there is going to be a lot of resistance to making the switch.
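The 32-bit vs 128-bit difference is easy to put numbers on. Here is a quick illustrative sketch in Python, using the standard ipaddress module (the code is mine, not part of the original argument):

```python
import ipaddress

# IPv4 addresses are 32 bits, giving about 4.3 billion possible addresses.
ipv4_space = 2 ** 32
# IPv6 addresses are 128 bits - about 3.4 x 10^38 possible addresses.
ipv6_space = 2 ** 128

print(f"IPv4 addresses: {ipv4_space:,}")  # 4,294,967,296
print(f"IPv6 space is {ipv6_space // ipv4_space:,} times larger")

# NAT sidesteps the shortage using the non-routable private ranges
# reserved by RFC 1918 - one reason the pressure to switch is low:
for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"):
    net = ipaddress.ip_network(block)
    print(f"{block}: {net.num_addresses:,} private addresses")
```

Each of those private blocks can be reused behind every NAT gateway in the world, which goes a long way toward explaining why the IPv4 "shortage" has not forced the issue.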

So what do we have? A lot of switches and routers that "understand" IPv6, but no impetus to make the switch, because it would quite simply break too many things - and the benefits aren't sufficiently obvious to make it overwhelmingly compelling.

Case 2: Apache 1 vs Apache 2

Apache is the world's most popular web server - it runs on almost two thirds of the web servers on the planet, according to Netcraft. Many, many sites use Apache 1.x, and it works very, very well. But the codebase was seen as being, well, messy - the name "Apache" originally came from "A Patchy Web Server", after all. So they decided to rewrite the thing and make it work better, and put quite a lot of new features in there. Threading, for one thing, which makes it run much better on Windows. Ok, but then things like mod_perl had to be rewritten too, because modules like this hook deep into Apache and the whole API had changed. So now we have a confusing situation where people coming to Apache have a stressful decision to make - use the old version, which works well and is robust and reliable, but is, well, old, and has a different API, or try the newer version, which works well for simple sites but still has issues as soon as you use more complex stuff like PHP or mod_perl? Hmmm, quite a conundrum.

Let me be clear here - I'm certainly not advocating that we stand still with this stuff. Progress is good, and making a threaded version to run better on Windows is certainly a good idea. But to totally rewrite the codebase and change the API - this just generates a lot of hassle for two thirds of the websites on the planet! Couldn't we have instead built on the existing code, which by now is extremely robust (if a little messy), or at least provided an API compatibility layer? I'm not blaming anybody, because I know all this is done by volunteers who do a fantastic job. I'm just questioning the wisdom of starting from scratch with an all-new codebase that breaks old versions.

Anyway, it appears that we'll be stuck for some time with the old-version vs new-version mess.

Case 3: Perl 5 vs Perl 6

Perl is another tool that has been outstandingly useful, indeed earning a reputation as being "the duct tape of the internet". It is both simple and complex, both messy and elegant, and there's always more than one way to do it. Perl is loved and hated by many, and there are actual contests to see who can write the most obfuscated code. But it works! Really, really well. Sure, it has warts, but it works, good enough. So why rewrite it all and change stuff around for Perl 6? I know that a lot of extremely smart people are working on the project, and I also fully realize that all sorts of arguments can be made for why Perl 6 will be so much cleaner and better. But people will still write code, probably the same kind of code, and probably nothing much that Perl 5 couldn't do just as well, in its own way. Thing is, there are different ways of doing things in Perl 6 - it'll break many Perl 5 scripts. Since many of those scripts are legacy and "just work", the original developer having left the company long ago (more common than you realize!), it really isn't practical to rewrite everything to use the new version. Also, the old version was so well understood and people knew its little idiosyncrasies - why abandon all that? We'll just have to start learning a whole bunch of new bugs and oddities...

What am I arguing for here? Well, I am just saying that Perl works "well enough" at the moment, and there's a lot of code out there that is "good enough" and does the job it was written to do, and all that will break when Perl 6 becomes the "standard". Ok, so Perl 5 will still be supported, but was it really so necessary to do the total rewrite and break the old code?

Even if Perl 6 has a compatibility mode that allows Perl 5 scripts to run, there is still a big headache for programmers: Do they develop using the new features in Perl 6, thus ensuring that their script will not run on machines with Perl 5 installed? Or do they stay with the Perl 5 features, and thus eschew the benefits of the new version? This would, of course, be true of any language that is being actively developed. But the point is that there is a real cost here, because Perl 5 has become effectively a "standard" that many, many people write to. In the past, the changes to Perl have been mostly small additions here and there, but not fundamental changes to the language and underlying interpreter. Many scripts rely on the small idiosyncrasies of Perl 5 (for better or worse). Will it be possible to reproduce all of those idiosyncrasies in Perl 6 "compatibility" mode, given that it's rewritten code? Probably not. Perl 6 will be a fantastic language, no doubt - but the point here is simply that there is a cost involved with making such radical changes to a well-established standard.

Case 4: Embperl 1.x vs Embperl 2

Embperl (Embedded Perl) is another example of a tool that I use all the time to write dynamic websites. Gerald Richter has done a wonderful job of making a truly useful package that allows you to embed Perl code into HTML. Version 1.x worked very, very well - but Gerald realized that he could write a better version, in C (rather than Perl, which is what 1.x is written in) and it would have many new features, more flexibility and ... it would be all new code. So now we have the situation where 2.x has been in beta for over two years - Gerald has had other projects to work on as well, I'm not blaming him for anything here, but simply pointing to Embperl as another example of a code rewrite that appears to gain little. Sure, 2.x is faster than 1.x, but I still can't use it because there are bugs which break my existing, complex applications that are based on Embperl 1.x. These bugs appear to be obscure and esoteric enough that they are quite hard to track down. I would love to do more testing of 2.x, but I can't - the new version has configuration changes that break the old version, so it's a lot of hassle to change between the two for testing.

I want to make absolutely clear once more that I have utmost respect for Gerald and for Embperl - I am simply using this as another example where the total rewrite caused unanticipated problems that are in fact quite hard to get around. People can't use 2.x on complex production systems until the new version is properly tested and debugged - but the new version can't be properly debugged until everyone is using it in production. And people simply won't make the switch because it breaks existing systems! Not Gerald's fault, it's just symptomatic of new code.

Case 5: Netscape 4.x vs Mozilla

I have an "old" (circa 1999) Penguin Computing 450 MHz AMD K6-II workstation with 512 MB RAM (running RedHat 7.3 until I can manage the switch to Debian), which works just fine for just about everything I do - software development, compiling, browsing, email, word processing, you name it - but not Mozilla. This application is the slowest, most bloated thing that I have ever seen. It was, you guessed it, a total rewrite of the Netscape browser. I use Netscape 4.80 on a daily basis - not because I love it, but simply because it is fast and it responds to mouse clicks in a way that doesn't drive me up the wall like Mozilla does. Mozilla obviously does things a lot more "correctly" than the old version, and I am increasingly encountering websites that don't render properly with 4.x. But this still leaves me wondering exactly what we've gained here - the old version was, by all accounts, very messy in its code, and had so many kludges that it was necessary to just junk it and start again. So what did we end up with? A new browser based on a cross platform GUI toolkit such as wxWindows? No, we ended up with XUL, which is almost certainly responsible for the slow speed of Mozilla. Everything in the GUI is now effectively interpreted, instead of being native code as in the old version. Of course Mozilla works just fine for people using newer computers - but sorry, I simply don't want to be told that my workstation has to be upgraded just because of my ... web browser! What does a web browser do today that is so different to what was being done by Netscape 4.x? Nothing much, as far as I can see. Rendering HTML is something that Gecko can do quite well, but not all that much faster (as far as I can tell) than 4.x. In most cases, 4.x is perfectly "good enough" for me, and it doesn't take 10 seconds to open a new mail message window, either.

All of which is to say, once again, I don't blame the developers in particular. They are no doubt doing a fantastic job at what they set out to do - that is, build a new browser from scratch that can run on many different platforms with minimal code changes. And, it's fair enough to design for faster CPUs (well, not for me, but I can at least understand where they are coming from). But my point is the old version worked quite well, and was getting pretty reliable - sure, people had crashes and many actually hate Netscape 4.x with a passion because of its bugginess and lack of standards compliance. However, I do believe that it might have been possible to keep Netscape 4.x going, fixing the bugs and keeping a browser that, well, just works! Was a total rewrite really so necessary? Is XUL really so much better? Not in my view. Browsing the Web is browsing the web, and it's pretty much the same for me now as it was back in 1998. Only difference seems to be, if I want to do it now then apparently I need a new computer to use the new browser... even though the actual usage of the browser is the same. Bizarre.

Case 6: HTML 4 vs XHTML + CSS + XML + XSL + XQuery + XPath + XLink + ...

The Web was based on the idea that a simple markup language could allow us to divorce document presentation from document structure, and concentrate on how information was related rather than how it should be displayed. HTML was a huge hit, because it was simple and open and anyone could use it, and you could embed images that allowed people to make very pretty web pages. But over the last few years this concept has become quite lost in the maze of new, complex and (in my mind) dubious standards that threaten the simplicity that made the original version so beguiling. So I include "Standards" in my list of things that are rewritten, just like code. People see the "first" version, and they think that it looks a bit untidy and inconsistent and could be done better, so they write a whole new version.

Some of the changes to HTML were done in a way that shouldn't break old browsers, but as I said before, I am increasingly seeing websites that don't render properly in Netscape 4.x - and believe me, when I see them in Mozilla, they are really not doing anything that couldn't be achieved very readily with "old" HTML. So apparently the FONT tag is deprecated - now we have to use style sheets and whatnot to do something that was originally very simple - e.g. making some text red. Why? We sacrifice simplicity in the name of an intellectual goal that promises greater consistency, but at the expense of being able to do simple things quickly.

We missed such a big chance to simply fix HTML and make it a bit more useful. Instead of getting rid of useful tags and redesigning HTML as XML, we could have simply added some useful stuff to existing tags. As a Web developer I have long wondered why they didn't add more types to the INPUT form tag - for example, a DATE type, or INTEGER, DOUBLE, or whatever. These "rich" (but simple! not XML!) types could then be seen by the browser and presented to the user in whatever way is supported by the system - in a GUI, using a little calendar window, perhaps (for dates). This would get rid of so much of the headache involved with parsing out this stuff on the server side. Focusing on simple additions like that would have been so much more useful than rewriting what the Web is supposed to be in terms of XHTML, XSL and so on. We now have a confusing forest of standards, where there used to be simplicity. The Web exploded because it was open and simple - accessible to schoolkids, anybody could write their own web page. But the direction we're going in, the HTML books have just become thicker and thicker over the last few years. And still, all most people really want to do is browse the web...
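To illustrate the server-side headache: because a plain text INPUT field carries no type information, every server has to guess at whatever date format the user happened to type. The sketch below (in Python; the parse_date_field helper and its list of formats are hypothetical, my own illustration) shows the kind of guesswork a DATE input type would eliminate:

```python
from datetime import datetime

def parse_date_field(raw):
    """Parse a date submitted through a plain text INPUT field.

    With no DATE input type in HTML 4, the server must try each
    format the user might plausibly have typed.
    """
    formats = ("%Y-%m-%d", "%m/%d/%Y", "%d %b %Y")
    for fmt in formats:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date: {raw!r}")

print(parse_date_field("2004-01-15"))   # 2004-01-15
print(parse_date_field("01/15/2004"))   # 2004-01-15
print(parse_date_field("15 Jan 2004"))  # 2004-01-15
```

And this still fails for any format the developer didn't anticipate - exactly the sort of drudgery a richer INPUT tag could have pushed into the browser.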

Case 7: Windows 2000 vs Windows XP vs Server 2003

Was it a "good idea" for Microsoft to rewrite Windows as XP and Server 2003? I don't know, it's their code, they can do whatever they like with it. But I do know that they had a fairly solid, reasonable system with Windows 2000 - quite reliable, combining the better aspects of Windows NT with the multimedia capabilities of Windows 98. Maybe it wasn't perfect, and there were a lot of bugs and vulnerabilities - but was it really a good idea to start from scratch? They billed this as if it was a good thing. It wasn't. It simply introduced a whole slew of new bugs and vulnerabilities, not to mention the instability. It's just another example of where a total rewrite didn't really do anyone any good. I don't think anyone is using Windows for anything so different now than they were when Windows 2000 was around, and yet we're looking at a 100% different codebase. Windows Server 2003 won't even run some older software, which must be fun for those users...

Addendum: I've been informed by reliable sources that Windows XP was not, in fact, a total rewrite! My apologies for this apparent gap in my memory. The reason I used this example was that I thought I remembered, around the time that XP came out, reading in the computer press that XP had a large portion of itself rewritten as brand new code - this was in something like Computerworld or Infoworld or one of those. I remember it because Microsoft was claiming that it would be so much more reliable, and the columnist was questioning this based on the fact of all this rewritten code - it was bound to have new bugs, new vulnerabilities, etc. Of course, now I can't find any reference to this, but you'll just have to take my word for it that I did read it... I'll leave this bit in here for now for posterity - if only to prove once again to everybody just how clued-in I am! D'oh.

Conclusion: In Defense of "good enough" and simplicity

You might read all this and think what an idiot I am for suggesting that older, crappier, buggier, dirtier, messier, more complex software might be better than newer, cleaner, faster rewrites. Well, the point is a subtle one - in a nutshell, when you rewrite, you lose all those little fixes and improvements that made the older version good to use and reliable. New software always introduces new bugs. Often, the rewrite process seems to be driven by a desire to make the product somehow more theoretically consistent and complete - which in turn often ends up losing the simplicity and elegance that made the original so compelling and useful. Rewriting, especially when it breaks existing systems, results in multiple versions of software, which makes things confusing for new users and perplexing for old ones.

And, let's face it - programmers just like to write new code. It's natural. We all do it - it's easier to start from scratch than it is to make the old version better. Also, it's more glamorous - everybody wants to be credited with creating something themselves, rather than maintaining and developing an existing thing. So, I can quite understand why things are the way they are. This whole document is simply a philosophical pondering of the benefits of such behavior. What happens when the popular, successful software or standard is rewritten from scratch? Experience tells us that it's not always a good situation.

Mind you, I am not saying that we should never rewrite code - sometimes it's just a necessary thing, because of new platforms or changes to underlying APIs. It's all a question of degree - do you totally rewrite, or do you evolve existing, working code? Rewrites are so often done without any regard to the old code at all. In my experience, new programmers often come on board, and it's just too much trouble to look through and really understand all the little nooks and crannies. We have seen it plenty of times in business - there is an old version of the application, but you're brought in to put together a new version. Usually the new spec has so many functional/UI differences from the old one that the old is simply discarded as being irrelevant. And yet, many times, the underlying functional differences are not actually all that great. So, unfortunately, years of patches, special cases and wisdom are just abandoned.

There is a "cost" involved with totally rewriting any application, in terms of "lost wisdom". If you have a package that is very popular, used by many people and has had a lot of bugfixes and patches applied over time, then it is more likely that a total rewrite will have a higher cost. Also if you change the way it works in the process, you create a chasm between the new and old versions that has to be crossed by users, and this causes stress. Which version to use - the old, reliable, well known but out-of-date version, or the newer, sleeker, incompatible, more buggy version? Hmmm. If your software (or standard) is not used by many people and doesn't have any significant history behind it (in terms of "accumulated wisdom") then clearly there are no real issues involved in rewriting - the cost is low. So I am not making a blanket statement that rewriting is bad; the whole point of this article was to focus on tools and standards that have attained great success and are used by many people. Such software/standards will inevitably have had a large amount of wisdom invested over time, because nothing is perfect first time out. Thus it is the most popular tools and packages that are most likely to be casualties of total rewrites.

So in summary, I would say that the "cost" of a total rewrite depends on three factors:

  1. Amount of "accumulated wisdom" (bug fixes, tweaks and useful patches) in the old version that will be discarded
  2. How incompatible the new version is with the old version (API, data formats, protocols etc)
  3. How many people used the old version and will be affected by the changes
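These three factors could be sketched as a toy formula. The function below and its inputs are purely illustrative - my own back-of-envelope model, not a real metric from anywhere:

```python
def rewrite_cost(accumulated_wisdom, incompatibility, user_count):
    """Toy model of the 'cost' of a total rewrite.

    The first two inputs are rough, unitless scores; multiplying
    them together just captures the idea that the factors compound:
      accumulated_wisdom: bug fixes, tweaks and patches discarded (0-10)
      incompatibility:    how badly the new version breaks the old (0-10)
      user_count:         number of people affected by the change
    """
    return accumulated_wisdom * incompatibility * user_count

# A young, little-used tool: rewriting is cheap, go ahead.
print(rewrite_cost(1, 5, 100))          # 500

# A mature, widely deployed, incompatible rewrite: very expensive.
print(rewrite_cost(9, 8, 1_000_000))    # 72000000
```

The multiplication is the interesting part: a perfectly compatible rewrite (incompatibility near zero) or one with no users stays cheap no matter how much code is thrown away, which matches the argument above.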

A suggestion: If you have a very successful application, don't look at all that old, messy code as being "stale". Look at it as a living organism that can perhaps be healed, and can evolve. You can refactor, you can rewrite portions of the internals to work better, many things can be accomplished without abandoning all the experience and error correction that went into that codebase. When you rewrite you are abandoning history and condemning yourself to relive it.

-Neil Gunton
January 15th 2004

"Rewrites Considered Harmful?" Copyright © 2004-2014 By Neil Gunton. All rights reserved.