I’ve long favoured self-documenting code over code comments. But I’ve recently reconsidered the issue. After years of dealing with self-documenting code I’m beginning to see problems. With hindsight I’m less sure that self-documenting code is always the best way to go.
Code is written with two audiences in mind: the compiler and the programmer. These audiences have different needs and different competencies. Self-documenting code and code comments are conflicting approaches to catering for both. I write “conflicting” because self-documenting code attempts to create one text for both audiences, whereas commenting tries to keep the two texts separate.
As programming languages have developed, the tendency has been to provide more opportunities for naming and more expressive names. Early programming languages had heavy restrictions on names that now seem bizarre. Today we can mix case and use as many characters as we want; we can even split words with an underscore or dash.
In most cases we can embed comments into the symbols of the instructions. The program then becomes a mixture of plain English and program syntax. Of course that new language is not English. But English is a robust language: you can murder it and it will still (mostly) make sense to an English speaker.
Proponents of self-documenting code often suggest that comments are therefore redundant: anything they express can be mashed into the code. It might not be Shakespeare, but it still gets the point across. Comments, on the other hand, can confuse us if we don’t keep them up to date with changes in the code, and experience tells us that we don’t. The code never lies, but the comments can.
The problem with that view is that self-documenting code can also lie. It may not be immediately obvious, but self-documenting code suffers from most of the problems that comments do (and a few additional ones as well). It removes a layer of indirection that could help us understand the code better. If we find a function named MTT(), we have no choice but to look up the documentation or the source. If it were named MeanTimeToTarget(), we would be tempted to assume its meaning without checking.
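As a minimal sketch of how a descriptive name can drift out of step with the code, consider the Python fragment below. Everything in it is hypothetical: the function body, the sample data, and the assumption that a later change replaced the mean with a median without anyone renaming the function.

```python
from statistics import mean, median


def mean_time_to_target(times_s):
    """Hypothetical function whose name promises an arithmetic mean."""
    # Assumed history: a later change swapped in the median to tame
    # outliers, but the function and its callers were never renamed.
    return median(times_s)


# Made-up sample data (seconds), with one outlier.
samples = [10.0, 12.0, 11.0, 95.0]

print(mean_time_to_target(samples))  # 11.5, the median
print(mean(samples))                 # 32.0, what the name promises
```

A caller who trusts the name has no reason to open the function, so the discrepancy can survive longer than a stale comment sitting next to the code would.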
In the development of HTML the trend runs the other way. Where text and presentation markup were once mashed together, separate style sheets are now common. Wikis and blogs try to minimise the markup further.
I think that getting the best of both will require an integrated approach to the programming language and to the way programs are written. Smalltalk provides a good example with its integrated environment. But many programmers I know find it hard to let go of source files and plain text editors. Doxygen or CWEB might help to bridge the gap.