06.05.07

Review: Vessey's Expertise in debugging computer programs: a process analysis

Posted in programmer productivity, review at 4:07 pm by ducky

I really wanted to like Expertise in Debugging Computer Programs: a Process Analysis by Iris Vessey (IEEE Transactions on Systems, Man and Cybernetics, Vol. 16, No. 5 (1986), pp. 621-637). Von Mayrhauser and Vans, in Program Comprehension During Software Maintenance and Evolution, indicated that Vessey had shown that experts use a breadth-first approach more than novices.

I was a little puzzled, though, because Schenk et al., in Differences between novice and expert systems analysts: what do we know and what do we do?, cited the exact same paper, but said that novices use a breadth-first approach more than experts! Clearly, they couldn’t both be right, but I was putting my money on Schenk just being wrong.

It also looked like the Vessey paper was going to be a good paper just in general, that it would tell me lots of good things about programmer productivity.

Well, it turns out that the Vessey paper is flawed. She studied sixteen programmers, and asked them to debug a relatively simple program in the common language of the time, COBOL. (The paper was written in 1985.) It is not at all clear how they interacted with the code; I have a hunch that they were handed printouts of the code. It also isn’t completely clear what the task termination looked like, but the paper says at one point “when told they were not correct”, so it sounds like they said to Vessey “Okay, the problem is here” and Vessey either stopped the study or told them “no, keep trying”.

Vessey classified the programmers as expert and novice based on the number of times they did certain things while working on the task. (Side note: in the literature that I’ve been reading, “novice” and “expert” have nothing to do with how much experience the subjects have. They are euphemisms for “unskilled” and “skilled” or perhaps “bad” and “good”.)

She didn’t normalize by how long they took to finish the task; she just looked at the count of how many times they did X. There were three different measures for X: how often the subject switched which high-level debugging task they were doing (e.g. “formulate hypothesis” or “amend error”), how often they started over, and how often they changed where in the program they were looking.

She then noted that the experts finished much faster than the novices. Um. She didn’t seem to notice that the count of the number of times that you do X during the course of a task is going to correlate strongly with how much time you spend on the task. So basically, I think she found that people who finish faster, finish faster.
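To make the counts-versus-time problem concrete, here is a minimal Python sketch. The numbers are made up for illustration and do not come from Vessey's paper; the names `minutes`, `switch_counts`, and `pearson` are my own. The hypothetical subjects all switch tasks at roughly the same rate, and differ mainly in how long they take.

```python
from math import sqrt

# HYPOTHETICAL data, not from the paper: minutes each subject spent on
# the task and how many times they switched debugging activity.
minutes = [10, 12, 15, 30, 40, 55]
switch_counts = [22, 20, 33, 60, 92, 105]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Normalizing by time gives switches per minute.
switch_rates = [c / m for c, m in zip(switch_counts, minutes)]

# The raw count tracks time on task very closely...
print("count vs. time:", round(pearson(minutes, switch_counts), 2))
# ...while the per-minute rate is nearly unrelated to it.
print("rate vs. time: ", round(pearson(minutes, switch_rates), 2))
```

With data like this, sorting people by raw count is nearly the same as sorting them by time taken, which is why I would have wanted to see the per-minute rates.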

She also noted that the expert/novice classification was a perfect predictor of whether the subjects made any errors or not. Um, the time they took to finish was strongly correlated with whether they made any errors or not. If they made an error, they had to try again.

Vessey said that 15 of the 16 subjects could be classified by a combination of two factors: whether they used a breadth-first search (BFS) or a depth-first search (DFS) for a solution, and whether they used systems thinking or not. However, you don’t need both of the tests; just the systems-thinking test accurately predicts 15/16. All eight of the experts always used BFS and systems thinking, but half of the novices also used BFS, while only one of the novices used systems thinking.
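The arithmetic behind that claim can be checked in a few lines of Python. The counts are the ones given above (eight experts, eight novices); the function name `accuracy` and the rule "uses the strategy, therefore expert" are just my way of framing each test on its own.

```python
# Counts as described in this review: 16 subjects, 8 experts, 8 novices.
EXPERTS, NOVICES = 8, 8
experts_bfs, novices_bfs = 8, 4          # all experts, half the novices
experts_systems, novices_systems = 8, 1  # all experts, one novice

def accuracy(experts_positive, novices_positive):
    """Correct classifications under the rule 'positive means expert'.

    Experts are correct when positive, novices when negative.
    Returns (number correct, number of subjects).
    """
    correct = experts_positive + (NOVICES - novices_positive)
    return correct, EXPERTS + NOVICES

print("BFS test alone:", accuracy(experts_bfs, novices_bfs))          # (12, 16)
print("Systems test alone:", accuracy(experts_systems, novices_systems))  # (15, 16)
```

The BFS test on its own gets 12 of 16 right, and the systems-thinking test on its own already gets 15 of 16, so adding BFS to it buys nothing.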

Unfortunately, Vessey didn’t do a particularly good job of explaining what she meant by “systems thinking” or how she measured it.

Vessey also cited literature that indicated that the amount of knowledge in a programmer’s long-term memory affected how well they could debug. In particular, she said that chunking ability was important. (Chunking is a way to increase human memory capacity by re-encoding the data to match structures that are already in long-term memory, so that you merely have to store a “pointer” to the representation of the aggregate item in memory, instead of needing to remember a bunch of individual things. For example, if I ask you to remember the letters C, A, T, A, S, T, R, O, P, H, and E, you will probably just make a “pointer” to the word “catastrophe” in your mind. If, on the other hand, I ask you to remember the letters S, I, T, O, W, A, J, C, L, B, and M, that will probably be much more difficult for you.)
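As a toy illustration of the “pointer” idea, here is a small Python sketch. It is only an analogy, not a model of human memory and not anything from Vessey's paper; the vocabulary in `LONG_TERM_MEMORY` and the greedy `chunk` function are hypothetical. It counts how many separate items you would have to hold if every known word collapsed into a single item.

```python
# Hypothetical "long-term memory": words the reader already knows.
LONG_TERM_MEMORY = {"CAT", "CATASTROPHE", "TROPHY"}

def chunk(letters, known=LONG_TERM_MEMORY):
    """Greedily replace the longest known word at each position
    with a single item; unknown letters stay as individual items."""
    items = []
    i = 0
    while i < len(letters):
        # Try the longest candidate starting at position i first.
        for j in range(len(letters), i, -1):
            if letters[i:j] in known:
                items.append(letters[i:j])
                i = j
                break
        else:
            # No known word starts here: remember the bare letter.
            items.append(letters[i])
            i += 1
    return items

print(len(chunk("CATASTROPHE")))  # 1 item: one pointer to a known word
print(len(chunk("SITOWAJCLBM")))  # 11 items: every letter on its own
```

Both strings are eleven letters long, but the first costs one item and the second costs eleven.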

Vessey says that higher chunking ability will manifest itself in smoother debugging, which she then says will be shown by the “count of X” measures as described above, but doesn’t justify that assertion. She frequently conflates “chunking ability” with the count of X, as if she had fully proven it. I don’t think she did, so her conclusions about chunking ability are off-base.

One thing that the paper notes is that in general, the novices tended to be more rigid and inflexible about their hypotheses. If they came up with a hypothesis, they stuck with it for too long. (There were also two novices who didn’t generate hypotheses, and basically just kept trying things somewhat at random.) This is consistent with what I’ve seen in other papers.
