Monday, March 12, 2007

Innocent but Dangerous Language

To be successful as a consultant, you need to pay attention to seemingly innocent language. The computer software field is filled with such language booby traps, but let me introduce the subject by citing a field that might be more familiar to more people—dog training.

My wife, Dani, is an anthropologist by profession, and so naturally is a skilled listener. She's now retired from her anthropology career, but brings all her skills and experience as an anthropologist and management consultant to her new career of behavioral consulting with dog owners and dog trainers. The combination produces many interesting ideas, like what she told me about the way attack dogs are trained. As usual, the big problem with attack dogs is not the dogs, but the people.

When someone hears that a dog is attack trained, chances are about one in three that they'll turn to the dog and command: KILL!

As a joke.

Or just to see what the dog will do.

To protect against this idiotic human behavior, this carelessness with words, attack-dog trainers never use words like "kill" as the attack command. Instead, they use innocent words like "health" that would never be given in a command voice.

This kind of protection is needed because a trained dog is an information processing machine, in some ways very much like a computer with big teeth. A single arbitrary command could mean anything to a dog, depending on how it was trained—or programmed.

This arbitrariness doesn't matter much if it's not an attack dog. The handler might be embarrassed when Rover runs out to fetch a ball on the command ROLL OVER, but nothing much is lost. But if the dog were trained to respond to ROLL OVER by going for the throat, it's an entirely different matter.

Maintenance, or Computers with Teeth

It's the same with computers. Because computers are programmed, and because the meanings of many words in programs are arbitrary, a single mistake can turn a helpful computer into one that can attack and kill an entire enterprise. That's why I've never understood why some of my clients take such a casual attitude toward software maintenance. Time and again, I hear managers explain that maintenance can be done by less intelligent (and cheaper) people, operating without all the formal controls and discipline of development—because it's not so critical. And no amount of argument seems able to convince them differently—until they experience a costly maintenance blunder.

Fortunately (or unfortunately), costly maintenance blunders are rather common, so managers have many lessons, even though the tuition is high. I keep a confidential list of expensive programming errors committed by my clients, and all of the most costly ones are maintenance errors. And almost all of those involve the change of a single digit in a previously operating program.

In all these cases, the change was called, innocently, "trivial," so it was instituted casually by a supervisor telling a low-level maintenance programmer to "change that digit"—with no written instructions, no test plan, nobody to review the change, and, indeed, no controls whatsoever between that one programmer and the day-to-day operations of the organization. It was exactly like having an attack dog trained to respond to KILL—or perhaps HELLO.

Just Change One Line

I've done studies, confirmed by others about the chances of a maintenance change being done incorrectly, depending on the size of the change. Here's the first part of the table:

     Lines         Chance of

    Changed        Error

          1                    .50

          2                    .60

          3                    .65

          4                    .70

          5                    .75

Developers are often shocked to see this high rate, for two reasons. In the first place, development changes are simpler than maintenance changes because they are being applied to cleaner, smaller, better-structured code. Usually, the code has not been changed many times in the remote past by unknown hands, so does not contain many unexpected linkages. Such linkages were involved in each of my most costly disasters.

Secondly, the consequences of an erroneous change during development are usually smaller, because the error can be corrected without affecting real operations. Thus, developers don't take that much notice of their errors, and thus tend to underestimate their frequency.

In development, you simply fix errors and go on your merry way. Not so in maintenance, where you must mop up the damage the error caused, then spend countless hours in meetings explaining why it will never happen again—until the next time.

For these two reasons, developers interpret this high rates of maintenance errors as indicative of the ignorance or inexperience of maintenance programmers. But if we continue down the table a few lines, we can see that the cause cannot be either ignorance or inexperience:

     Lines         Chance of

    Changed        Error

          10                   .50

          20                   .35

The decrease in error rate as the size of the change increases shows that maintenance programmers are perfectly capable of doing better work than their record with small changes seems to indicate. That's because these "trivial" changes are not taken seriously, and so are done carelessly and without controls. How many times have you heard a programmer say, "No problem! All I have to do is change one line."

Who Coined These Innocent-Sounding Words?

And how many times have you heard these programmers' managers agree with them? Or even to work "quick and dirty" when "it's only a minor change"?

This carefree attitude would be sensible if "minor" changes were truly minor—if maintenance of a program were actually like maintenance of an apartment building. Not that janitorial maintenance can't be dangerous, but the janitor can assume that changing one washer in the kitchen sink won't incur great risk of causing the building to collapse and bury all the occupants. It's not safe to make the same assumption for a program used every day to run a business, but because we are so free and arbitrary with words, the word "maintenance" has been misappropriated from the one circumstance to the other.

Whoever coined the word "maintenance" for computer programs was as careless and unthinking as the person who trains an attack dog to kill on the command KILL or HELLO. With the wisdom of hindsight, I would suggest that the "maintenance" programmer is more like a brain surgeon than a janitor—because opening up a working system is more like opening up a human brain and replacing an nerve than opening up a sink and replacing a washer. Would maintenance be easier to manage if it were called "software brain surgery"?

Think about it this way. Suppose you had a bad habit—like saying KILL to attack dogs. Would you go to a brain surgeon and say, "Just open up my skull, Doc, and remove that one little habit. And please do a quick and dirty job—it's only a small change! Just a little maintenance job!"?

The Moral

Of course, you as a consultant would never be this careless with language, would you? But when you're called in by a client who's having trouble—like disastrous small maintenance changes—listen to their "innocent" language. It may contain just the clue you need to make one small change and fix the problem.

Oh, wait a minute!


ad hominem said...

Do you mean "a single digit in a previously operating program"?

I apologize if I misunderstood.

Unknown said...

I had, what I consider to be, the PRIVILEGE of doing software maintenance for four years. I was amazed at the expertise of the people with whom I had the PRIVILEGE of doing this maintenance. I have always been astounded to hear people degrade the task of maintaining software as well as degrade the software maintainers.

Thank you for stating your views on this so eloquently. Also thank you for the studies and the numbers.


Gerald M. Weinberg said...

Good eyes, Kirk. And it is I who must apologize for the typo. Kind of illustrates what I was saying, though, about single digit/character errors: "operation" for "operating" could be interpreted as one character--delete the o and add the g (but in a different place). Anyway, carelessness.

And Dwayne, maybe we'll be joined by a few other voices and someday change this prejudice. Thanks.

Jamie said...

Great Blog Gerald,

The way we throw language about is ridiculous. The problem, I think, for many maintenance teams is that they are deliberately set up as the fall guy. The development team, under pressure to deliver will do just about anything to get the software over the wall. Any mistakes they made will be lost in the haze of the maintenance period.

When consultancies do this, especially when they have on going maintenance contracts, I think it borders on the criminal. What I advocate is that the programmers and managers who developed the code do the first bit of maintenance. This assumes they are capable, as you point out maintenance engineering is difficult. But, if you make the team responsible then they’ll soon stop delivering rubbish. It’s like making the soldiers stand under the bridge they just built.

Great analogy of maintenance being like brain surgery. I will endeavor to make that mainstream language in the software industry, if you don’t mind.

Although, it doesn’t have to be brain surgery if we only dropped off decent software, with tests, and dashboards…

Jamie Dobson.

PS – I wrote a blog this morning about reactive responses. The way people talk about maintenance is also pure reaction, born out of ignorance and lack of thought. . J.

Jamie said...

Hi Gerry,

I was wondering if I could quote you?

I am refering back to your blog and would like to quote your conclusion '"Just open up my skull, Doc, and remove that one little habit. And please do a quick and dirty job—it's only a small change! Just a little maintenance job!"'

Gerald M. Weinberg said...

Jamie, you can certainly quote me--and I think I might quote you a bit, too. Thanks.