Observing and Reasoning About Errors
NOTE: This essay is the first of a series of blogs adapted from my book Quality Software, Volume 2, Why Software Gets in Trouble: Fault Patterns.
Three of the great discoveries of our time have to do with programming: programming computers, the programming of inheritance through DNA, and programming of the human mind (psychoanalysis). In each, the idea of error is central.
The first of these discoveries was psychoanalysis, with which Sigmund Freud opened the Twentieth Century and set a tone for the other two. In his introductory lectures, Freud opened the human mind to inspection through the use of errors—what we now call "Freudian slips."
The second of these discoveries was DNA. Once again, key clues to the workings of inheritance were offered by the study of errors, such as mutations, which were mistakes in transcribing the genetic code from one generation to the next.
The third of these discoveries was the stored-program computer. From the first, the pioneers considered error a central concern. John von Neumann noted that the largest part of natural organisms was devoted to the problem of survival in the face of error, and that the programmer of a computer need be similarly concerned.
In each of these great discoveries, errors were treated in a new way: not as lapses in intelligence, or moral failures, or insignificant trivialities—all common attitudes in the past. Instead, errors were treated as sources of valuable information, on which great discoveries could be based.
The treatment of error as a source of valuable information is precisely what distinguishes the feedback (error-controlled) system from its less capable predecessors—and thus distinguishes Pattern 3 (Steering) software cultures from Patterns 0 (Oblivious), 1 (Variable), and 2 (Routine). Organizations in those patterns have more traditional—and less productive—attitudes about the role of errors in software development, attitudes that they will have to change if they are to transform themselves into Steering organizations.
So, in the following blog entries, we'll explore what happens to people in Oblivious, Routine, and especially Variable organizations as they battle those "inevitable" errors in their software. After reading these entries, perhaps they'll appreciate that they can never move to a Steering pattern until they learn how to use the information in the errors they make.
One of my editors complained that the first sections of this essay spend "an inordinate amount of time on semantics, relative to the thorny issues of software failures and their detection."
What I wanted to say to her, and what I will say to you, is that such "semantics" form one of the roots of "the thorny issues of software failures and their detection." Therefore, to build on a solid foundation, I need to start this discussion by clearing up some of the most subversive ideas and definitions about failure. If you already have a perfect understanding of software failure, then skim quickly, and please forgive me.
Errors Are Not A Moral Issue
"What do you do with a person who is 900 pounds overweight that approaches the problem without even wondering how a person gets to be 900 pounds overweight?"—Tom DeMarco
This is the question Tom asked when he read an early version of this blog. He was exasperated about clients who were having trouble managing more than 10,000 error reports per product. So was I.
Over fifty years ago, in my first book on computer programming, Herb Leeds and I emphasized what we then considered the first principle of programming:
The best way to deal with errors is not to make them in the first place.
Not all wisdom was born in the Computer Age. Thousands of years before computers, Epictetus said, "Men are not moved by things, but by the views which they take of them."
Like many hotshot programmers half a century ago, I then took "best" as a moral stance:
Those of us who don't make errors are better programmers than those of you who do.
I still consider this a statement of the first principle of programming, but somehow I no longer apply any moral sense to the principle. Instead, I mean "best" only in an economic sense, because,
Most errors cost more to handle than they cost to prevent.
This, I believe, is part of what Crosby means when he says "quality is free." But even if it were a moral question, I don't think that Steering cultures, which do a great deal to prevent errors, can claim any moral superiority over Oblivious, Routine and Variable cultures, which do not. You cannot say that someone is morally inferior because they don't do something they cannot do. And Oblivious, Routine and Variable software cultures cannot: they are culturally incapable of preventing large numbers of errors, even though most programmers these days work in such organizations. Why incapable? Let me put Tom's question another way:
"What do you do with a person who is rich, admired by thousands, overloaded with exciting work, 900 pounds overweight, and has 'no problem' except for occasional work lost because of back problems?"
Tom's question presumes that the thousand-pound person perceives a weight problem, but what if they perceive a back problem instead? My Oblivious, Routine or Variable clients with tens of thousands of errors in their software do not perceive they have a serious problem with errors. Why not? First of all, they are making money. Secondly, they are winning the praise of their customers. Customer complaints are generally at a tolerable level on two products out of every three. With their rate of profit, why should they care if a third of their projects have to be written off as total losses?
If I attempt to discuss these mountains of errors with Oblivious, Routine or Variable clients, they reply, "In programming, errors are inevitable. Even so, we've got our errors more or less under control. So don't worry about errors. We want you to help us get things out on schedule."
Such clients see no more connection between enormous error rates and two-year schedule slippages than the obese person sees between 900 pounds of body fat and pains in the back. Would it do any good to accuse them of having the wrong moral attitude about errors? Not likely. I might just as well accuse a blind person of having the wrong moral attitude about rainbows.
But their errors do create a moral question—for me, their consultant. If my thousand-pound client is happy, it's not my business to tell him how to lose weight. If he comes to me complaining of back problems, I can step him through a diagram of effects showing how weight affects his back. Then it's up to him to decide how much pain is worth how many chocolate cakes.
Similarly, if he comes to me complaining about error problems, I can ... (you finish the sentence)
(to be continued)