I’m one of those programmers who emphasizes heavily on the aesthetic appeal of code. I describe certain pieces of well-written code as beautiful and elegant, and poorly written code as ugly or dirty. I like Eric Raymond’s explanation of beautiful code:
Writing these [large Perl programs of significant complexity] left me progressively less satisfied with Perl. Larger project size seemed to magnify some of Perl’s annoyances into serious, continuing problems. The syntax that had seemed merely eccentric at a hundred lines began to seem like a nigh-impenetrable hedge of thorns at a thousand. “More than one way to do it” lent flavor and expressiveness at a small scale, but made it significantly harder to maintain consistent style across a wider code base. And many of the features that were later patched into Perl to address the complexity-control needs of bigger programs (objects, lexical scoping, “use strict”, etc.) had a fragile, jerry-rigged feel about them.
These problems combined to make large volumes of Perl code seem unreasonably difficult to read and grasp as a whole after only a few days’ absence. Also, I found I was spending more and more time wrestling with artifacts of the language rather than my application problems. And, most damning of all, the resulting code was ugly—this matters. Ugly programs are like ugly suspension bridges: they’re much more liable to collapse than pretty ones, because the way humans (especially engineer-humans) perceive beauty is intimately related to our ability to process and understand complexity. A language that makes it hard to write elegant code makes it hard to write good code.
The elegance of code seems to coincide with its readability, brevity, and intuitively obvious implementation.
The Ultimate Measure of Software is User Experience
From a realistic point of view, the ultimate goal of writing software should be to satisfy users. Users only care about their user experience, they couldn’t care less what your beautiful code looks like.
Its tempting to think that there’s no point to writing elegant code if users are never going to appreciate it. I used to agree with this sentiment before I started developing software professionally, before I really saw the effects of bad code in the real world.
Thank you, come again
Every year, my company attempts to start selling a new piece of banking software. Two years ago, the company wrote up specs on a piece of teller software, eTeller [actual name of software changed to project job], and decided it would be easiest to outsource this software to contracters in India. Our relationship with India was simple: my company would make the business decisions and request features, and India would deliver the code.
After a year of unsupervised development, we started selling version 1.0 of eTeller. As we expect in the normal lifecycle of any software, we got bug reports, enhancement and new feature requests, etc. India was happy to make the changes...
... until around 6 months into the project when India began to slip; they simply weren't able to keep pace with the change requests. After v1.0 was installed, the list of known bugs was on the rise, development of new features came to a stagnant halt, and user satisfaction was taking a dive. The application had serious stability problems, such as crashing or freezing 40 to 50 times a day (not an exaggeration) and a general inability to keep cash drawers balanced.
When bank presidents started putting the president of my company on speed-dial to demand bug fixes or threaten to find a new software vendor, we realized we had a problem. Since India was unable to make changes to the software, we decided that we simply couldn't rely on India to do their jobs right; we requested access to their sourcecode repository to make the changes ourselves. And that’s when we saw it for the first time:
The Beast
Every programmer will eventually encounter The Beast: an application exceeding 10,000 or 100,000 or 1,000,000 lines of the worst, most twisted, most utterly evil, fantastically wicked, atrociously written code on the planet.
These multi-megabyte monsters are born when large companies attempt to save a few pennies by hiring unskilled amateurs to do professional work. Beasts often rear their heads most often in mission-critical systems such as banks, hospitals, insurance companies, air-traffic controllers, grocery stores, routers.
Even for a Beast, eTeller was a fright to behold. Here is a brief list of evil:
- More than 130,000+ lines of tangled, completely unmodularized C#.
- Excluding form files, there were only 2 classes used in the entire application:
1) A static class called “Globals”, which held 1000s of variables to maintain the state of application. There were no other datatypes used in the Globals class apart from bool, string, decimal, and dataset.
2) This SQLHelper class copy/pasted from the internet.
- 20,000 lines of code embedded in a single file called frmMain.cs. All other code in the application accessed frmMain directly in this fashion:
((frmMain)this.MDIParent).UpdateStatusBar(hstValues);
- To look up accounts, the programmers did something like this:
bool blnAccountExists =
new frmAccounts().GetAccountInfo().blnAccountExists
As bad as it already is creating an invisible form to perform business logic, how do you think the form knew which account to look up? That’s easy: the form could access Globals.lngAcctNum and Globals.strAcctType. (Hungarian notation? What is this, 1997 all over again?)
- Parameters were rarely passed between functions, since all state was held in the Globals class. In the rare instance that parameters were passed, they were passed as strings for convenience.
- Nearly all methods and functions had side-effects, no matter the purpose the function, usually modifying 1 or more variables in the Globals namespace. For example, one of the form validation methods had a mysterious side effect of calculating over and short payments on loans for whatever account was stored Globals.lngAcctNum.
- Some forms had invisible labels. The programmers used the .Text and .Tag properties to hold miscellaneous data.
- No objects existed in the application. The only datastructures used were ArrayLists, HashTables, and DataSets — none of which guarantee type safety. The programmers actually invented bizarre pseudo-objects using just ArrayLists, HashTables, and DataSets:
- ArrayLists of hashtables
- HashTables using arraylists as keys and other hashtables as values
- Arraylists of arraylists
- HashTables using DataRows as keys and ArrayLists as values
For a static Accounts class in particular, the programmer created a seperate HashTable for each concievable property of an account: a HashTable called hstAcctExists, hstAcctNeedsOverride, hstAcctFirstName.
The keys for all of those hashtables was a “|” seperated string. Concievable keys included “123456|DDA”, “24100|SVG”, “100|LNS”, etc. (Most of these hashtables were easy to convert to typed Dictionary objects at first, and eventually into a unified Account object.)
-
CTRL-C, CTRL-V was a favorite key combination employed by India. For example, the 500 lines of code required to print receipts was copied across 7 forms. The code required to implement the "customizable grid" feature was copy/pasted verbatim across 12 different forms.
- There was an unusual threading model, which I call the Thread-And-Timer model. Every form that used threading had 1 or more timers on it (1 timer per thread). The Thread-And-Timer model looks like this:
bool finished = false;
Thread backgroundThread;
public void SomePublicMethodThatsEnevitablyInvokedRandomlyByAnotherForm()
{
try
{
backgroundThread.Abort();
}
catch { // swallow }
finished = false;
backgroundThread = new Thread(someMethod);
timer1.Enabled = true;
}
publiv void someMethod()
{
// do stuff in the background
finished = true;
}
private void timer1_Tick(object sender, TimerEventArgs e)
{
try
{
if (finished == true)
backgroundThread.Abort();
}
catch { // swallow }
}
Each time you click a button, it would start a new thread. If you click the button before the thread finishes its job, then the thread is aborted so that another new job is started. After starting a thread, a timer would check some bool at intervals of 100 milliseconds. When the bool is set to true, the timer aborts the thread for some reason and the resulting ThreadAbortException is ignored.
This bizarre pattern was found in a dozen different places throughout the application. Naturally, every thread in the application freely accessed any variable, including the ones in the Globals class, without locking.
-
Spectacular abuse of the ternary operator. Here are some actual snippets of code pulled from the application:
strDrCr = chkCredits.Checked && chkDebits.Checked ? string.Empty : chkDebits.Checked ? "D" : chkCredits.Checked ? "C" : "N";
if (strDefaultVals == strNowVals && (dsTranHist == null ? true : dsTranHist.Tables.Count == 0 ? true : dsTranHist.Tables[0].Rows.Count == 0 ? true : false))
if (Validations(parPostMode == "ADD" ? true : false))
That is only a small tip of the iceburg, barely a fraction of all of the errors in eTeller. The most damning part of the application was the total absence of unit tests. Not a single test, for a single one of the 130,000 lines of code. It was evident why the programmers were unable to make changes.
Crisis-Driven Development
After seeing the state of eTeller’s ugly code, my dev team became heavily involved with the development and maintenance of this application. From Nov 2007 to Jan 2008, we entered a state of crisis-driven development.
Banks were demanding to have bugs fixed on the same day they were reported. Each new bug sent us into crisis-mode: we fixed bugs as fast as possible and send a new release to QA almost immediately. Inevitably, 5 new bugs would be documented within minutes of QA testing, so we’d fix those up as quickly as possible.
Against the recommedations of QA and developers, we were releasing new versions of eTeller once every 2 to 3 days for a couple of months. There were many occasions where we’d be writing bug fixes to a release minutes before updates at a bank.
Over time, we stablized our product just enough to add new features and respond to bug reports at a reasonable pace. eTeller is continuously being refactored, piece by piece, into a better application everyday.
What did I learn from The Beast
While my team was working on eTeller, I discovered how ugly code has a psychological effect on programmers. The amount of frustration working with the code simply sucks the vitality out of programmers, it makes programmers ashamed of their own creations. Unhappy programmers don’t produce quality code or good tests, which holds back the development of application in the long run. Unahppy programmers tend to look for new, less stressful jobs, which also hurts development.
My team has been hacking through the jungle of eTeller code since Nov 2007. We’ve made significant improvements (especially with regard to type-safe objects and seperation of business logic from UI). We’ve re-written about 25% of the code in eTeller (which, fortunately, corresponds to about 90% of the core eTeller functionality) to the point where the code is marginally palatable. The application line-count is down to around 53,000 non-trivial lines of code, and drops at a rate of 20kb per week. We estimate the final 'gold' version of this application will drop to around 25,000 non-trivial lines of code.
Don’t let anyone tell you that the beauty of code does not matter to users. High quality code, more often than not, corresponds to a high quality product and satisfying user experience. If programmers want to satisfy our users, then we’re obliged to write high quality, beautiful code in all of our applications.
(In case you're wondering, we eventually terminated our contract with India.)