Table of Contents
- Introduction
- The Problem: Rewrite Mania
- Case 1: IPv4 vs IPv6
- Case 2: Apache 1 vs Apache 2
- Case 3: Perl 5 vs Perl 6
- Case 4: Embperl 1.x vs Embperl 2
- Case 5: Netscape 4.x vs Mozilla
- Case 6: HTML 4 vs XHTML + CSS + XML + XSL + XQuery + XPath + XLink + ...
- Case 7: Windows 2000 vs Windows XP vs Server 2003
- Conclusion: In Defense of "good enough" and simplicity
Introduction
This document collects some thoughts on the tendency to totally
rewrite Version 2.0 of successful software (and standards) just
because the first version is perceived to be "messy" and
"unmaintainable". Does this really help anybody, does it result in
better systems, and when is "good enough", well, enough? Are we doomed
to a constant cycle of churning and rewriting? At the end of the
article I suggest that there is a "cost" involved with totally
rewriting any popular package or standard, which depends on three
factors - the amount of accumulated wisdom that will be lost, the
degree of incompatibility with the older version, and lastly the
number of users of the old system.
Note: Since the original 2004 Slashdot story
linking to this article, I've had quite a few comments relating to my
total ignorance, complete idiocy, paucity of wisdom and general lack
of higher brain function. Thanks to all who have emailed, I appreciate
the corrections (my mistake on the Windows rewrite, apparently it
didn't happen that way). Also, I have been informed that there is
another article that is in the same vein as this, but rather better
written - see "Things
You Should Never Do, Part 1" by Joel Spolsky. My article was
basically an early morning rant that got somewhat out of control and
ended up being written down as an article on my home page, then posted
to Slashdot as a lark, and then torn to pieces by the Slashdot
crowd. Please try to see it for what it is - thoughts, observations
and opinions about the way the world is. I hope you find it
interesting, thought provoking or at the very least entertaining - if
nothing else then you can have some fun telling me what an idiot I am
in email. Thanks for visiting!
The Problem: Rewrite Mania
I have been noticing a certain trend in software toward rewriting
successful tools and standards. It seems that programmers always have
the urge to make things better, which is perfectly understandable -
after all, this is the primary trait of the engineer's mind (although
I think artistic creativity also enters into the mix). Why
should things stay static? Surely progress is good, and if we just
stayed in the same place, using the same versions of tools without
improvement, then things would deteriorate and generally get pretty
boring.
That's all very true, but what I am seeing is that in many cases
we have tools which truly are "good enough" for what they are designed
to do - TCP/IP allows us to build giant, interconnected networks,
Apache lets us build flexible web servers, Perl lets us write
incomprehensibly obfuscated code(!)... well, point being, these things
work. Really, outstandingly well. They are "good enough", and
moreover they are used everywhere. So all's well and good, right?
Well, not exactly. The programmers add little bits and pieces here and
there, fix lots of bugs, and over time the code starts to look
distinctly messy - and with the insights gained from this "first
version" of the application (I don't mean V1.0, but rather the overall
codebase) the developers start to think about how it could be "done
right". You know, now they know how they should have done it.
Fired with new zeal and enthusiasm, the developers embark on a
grand rewrite project, which will throw out all the old, stale,
horrible, nasty, untidy code, and construct reams of brand new, clean,
well-designed, and, uh, buggy, incompatible, untested code. Oh well,
it'll be worth it ... right? So what if the new version breaks some
things that worked with the old version - surely the benefits from the
changes far outweigh a loss of backward compatibility? In their minds, the
developers are more focused on the cool aspects of the new version
than they are on the fact that in the real world, millions of people
are still using the old version.
Eventually, then, the new version comes out, to grand fanfare. And
a few people download it, try it... and it doesn't quite work. This is
perfectly normal, these things need time. So all the people who are
running large production systems with the old version just back off
for a while until the new version has been tested properly by, uh,
someone else. Thing is, nobody wants to use their own production
system to test the new version, particularly when the new version is
incompatible with the old version and requires config changes that
would be a pain to undo if the new version breaks!
So what do we end up with? Two versions. The old one (which
everyone uses) and the new one (which some people use, but is
acknowledged to be not as reliable as the old one). Perhaps over time
the new version becomes dominant, perhaps not... the more different it
is from the old one (and the more things that it breaks), the longer
it will take. People have existing systems that work just fine with
the old version, and unless they are forced to, there is simply no
reason to upgrade. This is particularly true for Open Source software,
because, with so many people using the older version, it
will inevitably be maintained by someone. There will be bugfixes and
security updates, even when all the "new" development work is
officially happening on the new version. With closed proprietary
software, it's different - companies like Microsoft have much more
power to effectively force people to upgrade, no matter how much pain
it causes.
But regardless of all that, I am left wondering what this all gets
us. Is it really the best way? If software works and is used by lots
and lots of people quite successfully, then why abandon all the hard
work that went into the old codebase, which inevitably includes many,
many code fixes, optimizations, and other small bells and whistles
that make life better for the user? The older version may not be
perfect, but it does the job - and, in some cases, there just doesn't
seem to be a good reason to upgrade. New versions mean new bugs and
new problems - it's a never ending cycle. And, in truth, aren't we
reaching a kind of plateau in software development? Are we really
doing anything all that different today than what we were doing ten
years ago? Or even 20 years ago? When I learned C++ back in 1989,
people were writing the same sorts of programs they are writing now,
just with different APIs. Are the users really doing anything all
that different? Word processing is word processing, after all, as is
email and web browsing.
Anyway, here are a few examples of this trend that sprang to mind
once I started pondering the subject...
Case 1: IPv4 vs IPv6
We are, supposedly, running out of IP addresses. This is due to
the fact that IPv4 addresses are only 32 bits, and large chunks of the
address space were handed out in the early days of the internet to
institutions such as MIT, which have never used all of them and
probably never will. Meanwhile, with the proliferation of mobile
devices and wireless technology, there is pressure for every device
that could be connected to the internet to have its own IP
address. Apparently IPv6 will solve all these problems, with a brand
new standard that uses 128-bit addresses.
Or will it? TCP/IP works pretty well - it routes packets to their
destination. The entire world is based on it. With the advent of NAT
and the use of non-routable IP addresses on LANs, the need for
every device to have a routable address is lessened. The issue of
large blocks of addresses being tied up by entities who couldn't use
them all in a million years becomes one of politics and diplomacy,
rather than one requiring a new standard.
Many, many applications work with IPv4 and assume that addresses
will be 32 bit, not 128 bit (as IPv6 specifies). So we have the
classic situation where the old version, IPv4, works "well enough",
not perfectly, but reasonably. The internet has not melted down, and
shows no sign of doing so. Sure, the address shortage is an issue, but
we as humans have a tendency to look for the path of least resistance
- and it sure seems easier to re-arrange all those tied-up blocks of
existing addresses than it does to rewrite every application in the
world that assumes IPv4. The current system works, and there is going
to be a lot of resistance to making the switch.
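To make this concrete, here is a minimal sketch (my own illustration,
not taken from any particular application) of the kind of 32-bit
assumption that gets baked into IPv4-era Perl code. The whole address
fits into a single 32-bit integer, so it ends up in database INT
columns, netmask arithmetic and hash keys - none of which survives a
move to 16-byte IPv6 addresses without a rewrite:

    use Socket qw(inet_aton inet_ntoa);

    my $packed = inet_aton('192.168.1.10');   # exactly 4 bytes - IPv4 only
    my $as_int = unpack('N', $packed);        # the whole address as one 32-bit integer
    printf "address as integer: %u\n", $as_int;

    # Typical 32-bit habits: netmask arithmetic, INT columns, hash keys...
    my $network = $as_int & 0xFFFFFF00;       # apply a /24 netmask
    printf "network: %s\n", inet_ntoa(pack('N', $network));

    # An IPv6 address is 16 bytes, so code like this cannot simply be
    # recompiled - it has to be rethought and rewritten.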
So what do we have? A lot of switches and routers that
"understand" IPv6, but no impetus to make the switch, because it would
quite simply break too many things - and the benefits aren't
sufficiently obvious to make it overwhelmingly compelling.
Case 2: Apache 1 vs Apache 2
Apache is the world's most
popular web server - it serves almost two thirds of the websites on
the planet, according to
Netcraft. Many,
many sites use Apache 1.x, and it works very, very well. But the
codebase was seen as being, well, messy - the name "Apache" originally
came from "A Patchy Web Server", after all. So they decided to rewrite
the thing and make it work better, and put quite a lot of new features
in there. Threading, for one thing, which makes it run much better on
Windows. Ok, but then things like mod_perl had to be rewritten too,
because modules like this hook deep into Apache and the whole API had
changed. So now we have a confusing situation where people coming to
Apache have a stressful decision to make - use the old version, which
works well and is robust and reliable, but is, well,
old, and
has a different API, or try the newer version, which works well for
simple sites but still has issues as soon as you use more complex
stuff like PHP or mod_perl? Hmmm, quite a conundrum.
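To give a flavor of what "the whole API had changed" means in
practice, here is roughly what a trivial content handler looks like
under each version of mod_perl. This is a from-memory sketch rather
than anything copied from the documentation, and details vary between
releases, but it shows why module authors could not simply carry their
code across:

    # mod_perl 1.x (Apache 1.3)
    package My::Handler;
    use strict;
    use Apache::Constants qw(OK);

    sub handler {
        my $r = shift;                        # the Apache request object
        $r->send_http_header('text/plain');   # send the headers explicitly
        $r->print("Hello from Apache 1.3\n");
        return OK;
    }
    1;

    # mod_perl 2.x (Apache 2) - different module names, different
    # constants, and send_http_header() is gone entirely.
    package My::Handler2;
    use strict;
    use Apache2::RequestRec ();
    use Apache2::RequestIO ();
    use Apache2::Const -compile => qw(OK);

    sub handler {
        my $r = shift;
        $r->content_type('text/plain');       # headers go out automatically
        $r->print("Hello from Apache 2\n");
        return Apache2::Const::OK;
    }
    1;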
Let me be clear here - I'm certainly not advocating that we stand still with this stuff. Progress is good, and making a threaded version to run better on Windows is certainly a good idea. But to totally rewrite the codebase and change the API - this just generates a lot of hassle for two thirds of the websites on the planet! Couldn't we have instead built on the existing code, which by now is extremely robust (if a little messy), or at least provided an API compatibility layer? I'm not blaming anybody, because I know all this is done by volunteers who do a fantastic job. I'm just questioning the wisdom of starting from scratch with an all-new codebase that breaks old versions.
Anyway, it appears that we'll be stuck for some time with the
old-version vs new-version mess.
Case 3: Perl 5 vs Perl 6
Perl is another tool that has
been outstandingly useful, indeed earning a reputation as being "the
duct tape of the internet". It is both simple and complex, both messy
and elegant, and there's always more than one way to do it. Perl is
loved and hated by many, and there are actual contests to see who can
write the most obfuscated code. But it works! Really, really
well. Sure, it has warts, but it works, good enough. So why rewrite it
all and change stuff around for
Perl 6? I know that a lot of
extremely smart people are working on the project, and I also fully
realize that all sorts of arguments can be made for why Perl 6 will be
so much cleaner and better. But people will still write code, probably
the same kind of code, and probably nothing much that Perl 5 couldn't
do just as well, in its own way. Thing is, there are different ways of
doing things in Perl 6 - it'll break many Perl 5 scripts. Since many
of those scripts are legacy and "just work", the original developer
having left the company long ago (more common than you realize!), it
really isn't practical to rewrite everything to use the new
version. Also, the old version was so well understood and people knew
its little idiosyncrasies - why abandon all that? We'll just have to
start learning a whole bunch of new bugs and oddities...
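To give a rough idea of what "different ways of doing things" means,
here are a few of the announced changes, written as Perl 5 code with
the Perl 6 equivalent noted in comments. This is only a sketch based
on my reading of the design documents so far, so the final details may
well differ:

    use strict;
    use warnings;

    my @colors = ('red', 'green', 'blue');
    print $colors[0], "\n";         # Perl 6: @colors[0]  (sigils become invariant)

    my %age = (alice => 30);
    print $age{alice}, "\n";        # Perl 6: %age<alice> (or %age{'alice'})

    my $name = 'world';
    print 'Hello, ' . $name, "\n";  # Perl 6: 'Hello, ' ~ $name  ('.' becomes '~')

    # Method calls change too:  $object->method()  becomes  $object.method()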
What am I arguing for here? Well, I am just saying that Perl works
"well enough" at the moment, and there's a lot of code out there that
is "good enough" and does the job it was written to do, and all that
will break when Perl 6 becomes the "standard". Ok, so Perl 5 will
still be supported, but was it really so necessary to do the total
rewrite and break the old code?
Even if Perl 6 has a compatibility mode that allows Perl 5 scripts
to run, there is still a big headache for programmers: Do they develop
using the new features in Perl 6, thus ensuring that their script will
not run on machines with Perl 5 installed? Or do they stay with the
Perl 5 features, and thus eschew the benefits of the new version? This
would, of course, be true of any language that is being actively
developed. But the point is that there is a real cost here, because
Perl 5 has become effectively a "standard" that many, many people
write to. In the past, the changes to Perl have been mostly small
additions here and there, but not fundamental changes to the language
and underlying interpreter. Many scripts rely on the small
idiosyncrasies of Perl 5 (for better or worse). Will it be possible
to reproduce all of those idiosyncrasies in Perl 6 "compatibility"
mode, given that it's rewritten code? Probably not. Perl 6 will be a
fantastic language, no doubt - but the point here is simply that there
is a cost involved with making such radical changes to a
well-established standard.
Case 4: Embperl 1.x vs Embperl 2
Embperl (Embedded
Perl) is another example of a tool that I use all the time to write
dynamic websites. Gerald Richter has done a wonderful job of making a
truly useful package that allows you to embed Perl code into
HTML. Version 1.x worked very, very well - but Gerald realized that he
could write a better version, in C (rather than Perl, which is what
1.x is written in) and it would have many new features, more
flexibility and ... it would be all new code. So now we have the
situation where 2.x has been in beta for over two years. Gerald has
had other projects to work on as well, and I'm not blaming him for
anything here - I'm simply pointing to Embperl as another example of a
code rewrite that appears to gain little. Sure, 2.x is faster than
1.x, but I still can't use it because there are bugs which break my
existing, complex applications that are based on Embperl 1.x. These
bugs appear to be obscure and esoteric enough that they are quite hard
to track down. I would love to do more testing of 2.x, but I can't -
the new version requires configuration changes that are incompatible
with the old version, so it's a lot of hassle to switch between the
two for testing.
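For readers who haven't seen it, this is roughly what an Embperl 1.x
page looks like - a minimal sketch from memory rather than an excerpt
from the documentation. Blocks in [- ... -] execute Perl code,
[+ ... +] blocks output a value, and [$ ... $] metacommands wrap
control structures around the surrounding HTML:

    <html>
    <body>
    [- @items = ('apples', 'oranges', 'pears') -]
    <ul>
    [$ foreach $item (@items) $]
      <li>[+ $item +]</li>
    [$ endforeach $]
    </ul>
    </body>
    </html>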
I want to make absolutely clear once more that I have the utmost
respect for Gerald and for Embperl - I am simply using this as another
example where the total rewrite caused unanticipated problems that are
in fact quite hard to get around. People can't use 2.x on complex
production systems until the new version is properly tested and
debugged - but the new version can't be properly debugged until
everyone is using it in production. And people simply won't make the
switch because it breaks existing systems! Not Gerald's fault, it's
just symptomatic of new code.
Case 5: Netscape 4.x vs Mozilla
I have an "old" (circa 1999)
Penguin Computing 450 MHz
AMD K6-II workstation with 512 MB RAM (running RedHat 7.3 until I can
manage the switch to Debian), which works just fine for just about
everything I do - software development, compiling, browsing, email,
word processing, you name it - but not
Mozilla. This application is the
slowest, most bloated thing that I have ever seen. It was, you guessed
it, a total rewrite of the Netscape browser. I use Netscape 4.80 on a
daily basis - not because I love it, but simply because it is fast and
it responds to mouse clicks in a way that doesn't drive me up the wall
like Mozilla does. Mozilla obviously does things a lot more
"correctly" than the old version, and I am increasingly encountering
websites that don't render properly with 4.x. But this still leaves me
wondering exactly what we've gained here - the old version was, by all
accounts, very messy in its code, and had so many kludges that it was
necessary to just junk it and start again. So what did we end up with?
A new browser based on a cross-platform GUI toolkit such as
wxWindows? No, we ended up with
XUL, which is almost certainly responsible for the slow speed of
Mozilla. Everything in the GUI is now effectively interpreted, instead
of being native code as in the old version. Of course Mozilla works
just fine for people using newer computers - but sorry, I simply don't
want to be told that my workstation has to be upgraded just because of
my ...
web browser! What does a web browser do today that is so
different to what was being done by Netscape 4.x? Nothing much, as far
as I can see. Rendering HTML is something that Gecko can do quite
well, but not all that much faster (as far as I can tell) than 4.x. In
most cases, 4.x is perfectly "good enough" for me, and it doesn't take
10 seconds to open a new mail message window, either.
All of which is to say, once again, I don't blame the developers in
particular. They are no doubt doing a fantastic job at what they set
out to do - that is, build a new browser from scratch that can run on
many different platforms with minimal code changes. And, it's fair
enough to design for faster CPUs (well, not for me, but I can at
least understand where they are coming from). But my point is that the old
version worked quite well, and was getting pretty reliable - sure,
people had crashes and many actually hate Netscape 4.x with a passion
because of its bugginess and lack of standards compliance. However, I
do believe that it might have been possible to keep Netscape 4.x
going, fixing the bugs and keeping a browser that, well, just works!
Was a total rewrite really so necessary? Is XUL really so much better?
Not in my view. Browsing the web is browsing the web, and it's pretty
much the same for me now as it was back in 1998. The only difference
seems to be that if I want to do it now, then apparently I need a new
computer to run the new browser... even though the actual usage of the
browser is the same. Bizarre.
Case 6: HTML 4 vs XHTML + CSS + XML + XSL + XQuery + XPath + XLink + ...
The Web was based on the idea that a simple
markup language could allow us to
divorce document presentation from document structure, and concentrate
on how information was related rather than how it should be
displayed. HTML was a huge hit, because it was simple and open and
anyone could use it, and you could embed images that allowed people to
make very pretty web pages. But over the last few years this concept
has become quite lost in the maze of new, complex and (in my mind)
dubious standards that threaten the simplicity that made the original
version so beguiling. So I include "Standards" in my list of things
that are rewritten, just like code. People see the "first" version,
and they think that it looks a bit untidy and inconsistent and could
be done better, so they write a whole new version.
Some of the changes to HTML were done in a way that shouldn't
break old browsers, but as I said before, I am increasingly seeing
websites that don't render properly in Netscape 4.x - and believe me,
when I see them in Mozilla, they are really not doing anything that
couldn't be achieved very readily with "old" HTML. So apparently the
FONT tag is deprecated - now we have to use style sheets and whatnot
to do something that was originally very simple - e.g. making some
text red. Why? We sacrifice simplicity in the name of an intellectual
goal that promises greater consistency, but at the expense of being
able to do simple things quickly.
We missed such a big chance to simply fix HTML and make it a bit
more useful. Instead of getting rid of useful tags and redesigning
HTML as XML, we could have simply added some useful stuff to existing
tags. As a Web developer I have long wondered why they didn't add more
types to the INPUT form tag - for example, DATE, INTEGER, DOUBLE, or
whatever. These "rich" (but simple! not XML!) types could then be
recognized by the browser and presented to the user in whatever way
the system supports - in a GUI, perhaps a little calendar window for
dates. This would
get rid of so much of the headache involved with parsing out this
stuff on the server side. Focusing on simple additions like that would
have been so much more useful than rewriting what the Web is supposed
to be in terms of XHTML, XSL and so on. We now have a confusing forest
of standards, where there used to be simplicity. The Web exploded
because it was open and simple - accessible to schoolkids, anybody
could write their own web page. But the way things are going, the
HTML books have just become thicker and thicker over the last few
years. And still, all most people really want to do is browse the
web...
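Coming back to the INPUT example: the reason a DATE type would help
is that today a date field is just free text as far as the browser is
concerned, so every application ends up hand-rolling parsing and
validation along these lines (a simplified sketch of my own - real
code would also have to worry about ranges, leap years and locales):

    use strict;
    use warnings;

    # Guess at the format the user typed into a plain text field.
    sub parse_date_field {
        my ($raw) = @_;
        if ($raw =~ m{^(\d{4})-(\d{1,2})-(\d{1,2})$}) {   # 2004-01-15
            return ($1, $2, $3);
        }
        if ($raw =~ m{^(\d{1,2})/(\d{1,2})/(\d{4})$}) {   # 1/15/2004 (or is it 15/1/2004?)
            return ($3, $1, $2);
        }
        return;   # anything else: reject and re-display the form
    }

    my ($y, $m, $d) = parse_date_field('2004-01-15')
        or die "Sorry, I couldn't understand that date.\n";
    print "year=$y month=$m day=$d\n";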
Case 7: Windows 2000 vs Windows XP vs Server 2003
Was it a "good idea" for Microsoft to rewrite Windows as XP and
Server 2003? I don't know, it's their code, they can do whatever they
like with it. But I do know that they had a fairly solid, reasonable
system with Windows 2000 - quite reliable, combining the better
aspects of Windows NT with the multimedia capabilities of Windows
98. Maybe it wasn't perfect, and there were a lot of bugs and
vulnerabilities - but was it really a good idea to start from scratch?
They billed this as if it was a good thing. It wasn't. It simply
introduced a whole slew of new bugs and vulnerabilities, not to
mention the instability. It's just another example of a total
rewrite that didn't really do anyone any good. I don't think anyone is
using Windows for anything very different now than they were when
Windows 2000 was around, and yet we're looking at a 100% different
codebase. Windows Server 2003 won't even run some older software,
which must be fun for those users...
Addendum: I've been informed by reliable sources that
Windows XP was not, in fact, a total rewrite! My apologies for this
apparent gap in my memory. The reason I used this example was that I
thought I remembered, around the time that XP came out, reading in the
computer press that XP had a large portion of itself rewritten as
brand new code - this was in something like Computerworld or Infoworld
or one of those. I remember it because Microsoft was claiming that it
would be so much more reliable, and the columnist was questioning this
based on the fact of all this rewritten code - it was bound to have
new bugs, new vulnerabilities, etc. Of course, now I can't find any
reference to this, but you'll just have to take my word for it that I
did read it... I'll leave this bit in here for now for posterity - if
only to prove once again to everybody just how clued-in I am! D'oh.
Conclusion: In Defense of "good enough" and simplicity
You might read all this and think what an idiot I am for
suggesting that older, crappier, buggier, dirtier, messier, more
complex software might be better than newer, cleaner, faster
rewrites. Well, the point is a subtle one - in a nutshell, when you
rewrite, you lose all those little fixes and improvements that made
the older version good to use and reliable. New software
always
introduces new bugs. Often, the rewrite process seems to be driven by
a desire to make the product somehow more theoretically consistent and
complete - which in turn often ends up losing the simplicity and
elegance that made the original so compelling and useful. Rewriting,
especially when it breaks existing systems, results in multiple
versions of the software, which is confusing for new users and
perplexing for old ones.
And, let's face it - programmers just like to write new code. It's
natural. We all do it - it's easier to start from scratch than it is
to make the old version better. Also, it's more glamorous - everybody
wants to be credited with creating something themselves, rather than
maintaining and developing an existing thing. So, I can quite
understand why things are the way they are. This whole document is
simply a philosophical pondering of the benefits of such
behavior. What happens when the popular, successful software or
standard is rewritten from scratch? Experience tells us that it's not
always a good situation.
Mind you, I am not saying that we should never rewrite code
- sometimes it's just a necessary thing, because of new platforms or
changes to underlying APIs. It's all a question of degree - do you
totally rewrite, or do you evolve existing, working code? Rewrites are
so often done without any regard to the old code at all. In my
experience, new programmers often come on board, and it's just too
much trouble to look through and really understand all the
little nooks and crannies of the old code. We have seen it plenty of
times in business - there is an old version of the application, but
you're brought in to
put together a new version. Usually the new spec has so many
functional/UI differences from the old one that the old is simply
discarded as being irrelevant. And yet, many times, the underlying
functional differences are not actually all that great. So,
unfortunately, years of patches, special cases and wisdom are just
abandoned.
There is a "cost" involved with totally rewriting any application,
in terms of "lost wisdom". If you have a package that is very popular,
used by many people and has had a lot of bugfixes and patches applied
over time, then it is more likely that a total rewrite will have a
higher cost. Also if you change the way it works in the process, you
create a chasm between the new and old versions that has to be crossed
by users, and this causes stress. Which version to use - the old,
reliable, well-known but out-of-date version, or the newer, sleeker,
incompatible, buggier version? Hmmm. If your software (or standard)
is not used by many people and doesn't have any significant history
behind it (in terms of "accumulated wisdom") then clearly there are no
real issues involved in rewriting - the cost is low. So I am not
making a blanket statement that rewriting is bad; the whole point of
this article was to focus on tools and standards that have attained
great success and are used by many people. Such software/standards
will inevitably have had a large amount of wisdom invested over time,
because nothing is perfect first time out. Thus it is the most popular
tools and packages that are most likely to be casualties of total
rewrites.
So in summary, I would say that the "cost" of a total rewrite
depends on three factors:
- Amount of "accumulated wisdom" (bug fixes, tweaks and useful patches) in the old version that will be discarded
- How incompatible the new version is with the old version (API, data formats, protocols etc)
- How many people used the old version and will be affected by the changes
A suggestion: If you have a very successful application, don't
look at all that old, messy code as being "stale". Look at it as a
living organism that can perhaps be healed, and can evolve. You can
refactor, you can rewrite portions of the internals to work better,
many things can be accomplished without abandoning all the experience
and error correction that went into that codebase. When you rewrite
you are abandoning history and condemning yourself to relive it.
-Neil Gunton
January 15th 2004