Useless Error Messages are a Common Practice

Posted on March 12, 2007

As software engineers, our desire should be to write fault-tolerant and robust software. The ideal is a program that never crashes, that deals with unknown situations, and that has as few defects as possible. At some point we were supposed to be taught to write exception handlers that do something rather than simply squawk error messages at the user, especially if the user has no clue what the error message means. The problem with most error messages is that only the developer who wrote them is likely to know what they mean.

For example, the other night I booted up my machine and my firewall program fired off an error message in a dialog box, failing to start altogether. The dialog told me that some obscure configuration file deep in the depths of the program could not be loaded. My only option was to press the “OK” button, which terminated the program. Subsequent attempts to start the program resulted in the same error message. I cringe when I have to do this, but all I could think of to do was reboot the machine and hope the problem went away: a shotgun tactic and a solution I am certainly not happy with.

Is it any wonder that users frequently throw up their hands and proclaim that they do not understand computers? Their frustration with meaningless error messages and complex interfaces leads them to believe that they cannot understand, that they are excluded from the secret order of computer magicians. They end up feeling like their computer is a little black box that they have no control over (much like how I feel about my Fon router). This is bad.

While the darker side of me revels in excluding people who are too ignorant to learn how computers work in the first place, I have to remind myself that I barely understand how a car works and yet I expect to be able to drive one without being a mechanic. And while the interfaces in computer systems are getting better (think of steering wheels and pedals), error reporting and recovery have not progressed much, it seems. A car is able to continue operating despite a certain level of operational failure, while hopefully leaving the driver feeling that they are still in control of the device. Better still, it can inform the driver of which component has failed in a way they can understand. Cars do this with simple icon indicators: low gas or oil, overheating, low battery power. However, even while these indicators are on, the car will do everything it can to continue operating. Only in the most extreme circumstances will it stop altogether.

With the sophistication we are seeing in computer software and hardware today, why are computer systems any more fragile than mechanical (well, technically mechanical-digital hybrid) systems? Is there a good reason for a program to terminate if it can’t find a configuration file? Should it terminate without meaningful warning if a memory address is unexpectedly occupied or corrupt? The answer should be obvious. Software engineers need to write more meaningful exception handlers that actually handle errors rather than simply report them. The user can deal with icons and simple instructions on how to handle the error if they can help the program recover (just as a driver can understand how to replace a flat tire or change the oil). Doing so may make the difference between a frustrating experience and a positive one.
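
To make the configuration-file case concrete, here is a minimal Python sketch of a loader that keeps the program running instead of terminating. Everything in it is an assumption made for illustration: the load_config function, the DEFAULT_CONFIG values, the JSON file format, and the wording of the notices are all hypothetical, and a real program would pick its own safe defaults and recovery steps.

```python
import json
from pathlib import Path

# Hypothetical safe defaults; a real program would choose its own.
DEFAULT_CONFIG = {"block_incoming": True, "log_level": "warning"}


def load_config(path):
    """Return (config, notice); notice is None when the file loaded cleanly."""
    try:
        loaded = json.loads(Path(path).read_text(encoding="utf-8"))
        if not isinstance(loaded, dict):
            raise ValueError("top-level value is not an object")
    except FileNotFoundError:
        # Missing file: keep running on defaults and say so in plain words.
        return dict(DEFAULT_CONFIG), (
            "Your settings file could not be found, so default settings "
            "are in use. You can change them from the settings screen."
        )
    except (OSError, ValueError):
        # Unreadable or malformed file (json.JSONDecodeError is a ValueError).
        return dict(DEFAULT_CONFIG), (
            "Your settings file could not be read, so default settings "
            "are in use. Saving your settings again will repair the file."
        )
    # Loaded cleanly: fill in any keys the file leaves out.
    return {**DEFAULT_CONFIG, **loaded}, None
```

The caller gets a usable configuration in every case, and the notice, when there is one, tells the user what happened and what they can do about it rather than naming an internal file they have never heard of.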

Let’s smarten up and stop making users feel like morons.