When code goes bad: What to watch for
Emergent Design: Pathologies uncovered
Book extract, part five
Scott Bain’s book, Emergent Design: The Evolutionary Nature of Professional Software Development, published by Addison-Wesley, looks at the principles involved in building and maintaining robust, reliable, and cost-effective code. In this, our concluding extract, Scott identifies the pathologies that show up in code when coupling, cohesion, and redundancy have not been properly attended to.
It is important to know the qualities you want your code to have, but it is also important to empower yourself with a set of indicators that tell you that you are heading in the wrong direction. Ideally, I would like to know I am doing something I should not do before I have done very much of it. Also, we often are called upon to evaluate other people's code, or to revisit our own code from the recent or distant past.
Pathologies help us here. In truth, code, design, and system pathologies could be the subject of an entire book, but there are a few really high-leverage indicators that we can use as a basic set. Not surprisingly, they tie into the qualities that I listed earlier: coupling, cohesion, and eliminating redundancy.
Indicators of weak cohesion
Here are some indicators of weak cohesion:
- Difficulty naming. I would like the names of classes, methods, and other entities (delegates, packages, and so on) to reveal their intentions. I would also like my names to be relatively brief, and yet still tell the whole story. When a class or method has a long name, or a vague name, the reason is often that a really informative name is very difficult to create, because the class or method does a number of different things. This, of course, is weak cohesion, and it causes lots of other problems as well. When I cannot name things the way I want to, I suspect that my entities do too much.
- Large tests. When a class has multiple responsibilities, it can create large tests, because the test must cover all the possible combinations of these responsibilities.
- Large classes and methods. When a class or a method gets big and requires lots of scrolling in your integrated development environment, or when it is hard to take in at a glance, I usually suspect weak cohesion. This is not an absolute; algorithms themselves can get large and yet still be about one responsibility, but it is, at least, something to investigate when I see it.
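To make the naming indicator concrete, here is a minimal sketch in Java. Every class and method name in it is hypothetical, invented for illustration rather than taken from the book. It assumes a single class that validated, priced, and formatted an order, and shows how splitting it by responsibility lets each piece carry a short name that tells the whole story.

```java
import java.util.List;
import java.util.Locale;

// Before (outline only): the one honest name for this class needs two "And"s,
// which hints that it has three separate jobs.
//
//   class OrderValidatorAndPricerAndFormatter {
//       boolean isValid(List<Double> prices) { ... }
//       double total(List<Double> prices) { ... }
//       String format(double total) { ... }
//   }

// After: one responsibility per class, so a brief name is enough.
class OrderValidator {
    // An order is valid when it has at least one price and none are negative.
    boolean isValid(List<Double> prices) {
        return !prices.isEmpty() && prices.stream().allMatch(p -> p >= 0);
    }
}

class OrderPricer {
    // Sums the line prices of an order.
    double total(List<Double> prices) {
        return prices.stream().mapToDouble(Double::doubleValue).sum();
    }
}

class ReceiptFormatter {
    // Renders a total with two decimal places, independent of the default locale.
    String format(double total) {
        return String.format(Locale.ROOT, "Total: %.2f", total);
    }
}

class CohesionDemo {
    public static void main(String[] args) {
        List<Double> prices = List.of(1.50, 2.25);
        if (new OrderValidator().isValid(prices)) {
            double total = new OrderPricer().total(prices);
            System.out.println(new ReceiptFormatter().format(total));
        }
    }
}
```

Each of the three classes can now be tested on its own, which also speaks to the "large tests" indicator: no test has to cover combinations of validation, pricing, and formatting at once.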
A student once told me a story that is pretty illustrative and also kind of funny. He was given a class to refactor. Someone had written the class in C#, but clearly did not know much about object orientation. The class worked fine (the author had been, obviously, a skilled programmer), but nobody could or would touch it. It was, essentially, one huge method.
He started by doing what I would probably do; he just scanned through the code without really reading it thoroughly, just to determine how tough this job was going to be. He was, of course, being asked how long this was going to take.
As he scanned along, hitting Page Down over and over, suddenly all the code disappeared from the screen. A few more page downs revealed blank screen after blank screen, and then suddenly the code was back again.
This happened several times. As he paged down through the code, sometimes for pages at a time, there would be nothing but a blank page, and then for pages at a time he would see code again.
He had been working all day, so he suspected that he might have corrupted the memory buffer of his IDE (Visual Studio, in this case). He restarted it. The same problem happened again, so he rebooted. Then he cold started. He checked out a fresh copy of the code from source control. Same problem.
Now, he really hit the old "wall of voodoo". He figured it must be something strange, like a problem in the operating system or the installation of the IDE. He was, in fact, very close to taking a pretty drastic step and reinstalling either or both of these things.
Then he noticed how small the horizontal scroll button was at the bottom of his IDE.
The problem was actually very simple: the code was so complex, and had to do so much procedurally, that the nesting from the accumulating tabs in the code often pushed all the code off screen to the right, sometimes for pages at a time, until the conditional branching, looping, try/catches, and so forth all closed and the code drifted back to the left.
It was not good news (lots of work to do), but at least he knew it was not his machine. It was a problem with the code, and it was obviously very weakly cohesive.
Indicators of accidental or illogical coupling
Here are some indicators of accidental or illogical coupling:
- Unexpected side effects. The very thing we hope to avoid by paying attention to coupling is also a key indicator that we have not. When a change in one part of the system changes something in another part of the system, and this is surprising, unexpected, and illogical to you, then most likely there is coupling in the system that was not intended or does not make sense.
- Hesitancy. When you find yourself hesitant or resistant to making a change to the system, sometimes this is simply your subconscious telling you that you know the system has coupling in it that is going to "get you" when you try to change it. Of course, ultimately, we are trying to eliminate this hesitancy because we want to be able to evolve systems as we go, but when we feel it, we should pay attention to our own reactions.
- Comments. I have a love-hate relationship with comments. Too many comments are not a good thing, because they can get in your way (they make the code longer) and because they often do not get updated when the system changes. However, some comments can really help to make a system more readable and understandable. I have come to draw a distinction here. Some comments are about what the code is doing and often refer to other parts of the code in their explanation. This is an indicator of a problem. Why is the code simply not readable in the first place? Often, this is because it cannot be, as there are excessive dependencies with other parts of the system. This, of course, is a coupling problem. However, other comments are about why the code is doing what it's doing, which could reflect business or regulatory rules, and these can be difficult to make clear in the code. I like comments like these: the "why" comments as opposed to the "what" comments.
- Large test fixtures. When we examine unit testing, this will make more sense; in short, a unit test needs to create an instance of the class it is designed to test. Sometimes, it has to create other instances too, because the class it is testing needs them to operate. The collection of instances created in a unit test is called the fixture for the test by some people (yours truly included). A good overall view of the coupling in a system can be obtained by looking at the suite of unit tests that test it, and taking an average of the number of instances in the fixtures. This only works, of course, if the system has tests. Unfortunately, many of the systems that have serious coupling problems do not have tests, because the testing is too difficult.
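The fixture indicator can be sketched in a few lines of Java. All of the types below are hypothetical, made up for illustration: the sketch assumes a checkout class that needs three collaborators, and contrasts it with a narrower class that depends only on what the pricing logic actually uses. Counting the instances a test must build for each gives a rough reading of the coupling.

```java
// Hypothetical collaborators; the names are illustrative only.
interface TaxTable { double rateFor(String region); }
interface Inventory { boolean inStock(String sku); }
interface Mailer { void send(String to, String body); }

// Coupled: even a test that only checks pricing must build three other
// instances before it can build this one (a fixture of four).
class CoupledCheckout {
    private final TaxTable taxes;
    private final Inventory inventory;
    private final Mailer mailer;

    CoupledCheckout(TaxTable taxes, Inventory inventory, Mailer mailer) {
        this.taxes = taxes;
        this.inventory = inventory;
        this.mailer = mailer;
    }

    double priceWithTax(double net, String region) {
        return net * (1 + taxes.rateFor(region));
    }

    void confirm(String sku, String email) {
        if (inventory.inStock(sku)) {
            mailer.send(email, "Order confirmed");
        }
    }
}

// Narrower: pricing depends only on the tax table (a fixture of two).
class TaxCalculator {
    private final TaxTable taxes;

    TaxCalculator(TaxTable taxes) {
        this.taxes = taxes;
    }

    double priceWithTax(double net, String region) {
        return net * (1 + taxes.rateFor(region));
    }
}

class FixtureDemo {
    public static void main(String[] args) {
        TaxTable flat = region -> 0.5; // stand-in tax table for the test

        // Fixture of four: the checkout plus three collaborators.
        CoupledCheckout checkout =
            new CoupledCheckout(flat, sku -> true, (to, body) -> { });
        System.out.println(checkout.priceWithTax(100.0, "any"));

        // Fixture of two: the calculator plus its tax table.
        System.out.println(new TaxCalculator(flat).priceWithTax(100.0, "any"));
    }
}
```

Both calls compute the same price; what differs is how much the test has to assemble first. Averaged over a whole suite, that difference is the measure described above.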
Indicators of redundancy
Here are some indicators of redundancy.
- Redundant changes. When you find yourself making the same change in multiple places, clearly the thing you are changing is redundant. However, because people tend to think of redundancy only in terms of code and state, they can miss this indicator. Here is a very common example: imagine that the code new Widget() appears in several places in your code, because you instantiate that class in many places. Now imagine that the constructor of Widget changes. Or, that Widget becomes an abstract class, with multiple subclasses that represent different forms of the Widget. You will have to make the change(s) in every place where the code new Widget() appears. Does that mean you had a redundancy? Of course. You are probably thinking: "But wait! I do that all the time!" If you are thinking that, you are not alone, and it is a fairly common source of maintenance headaches.
- Inheriting from the Clipboard. That phrase comes from a student of mine named Boon. She always said that if you copy code from one place and paste it into another, you are obviously creating redundant code. She calls it "inheriting from the Clipboard," which I think is pretty clever and memorable. When you copy code, ask yourself why you are doing it - and if the same operation is needed in two places, doesn't this indicate that you need a service in your system?
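The new Widget() example from the list above can be sketched in Java. This is one common way to remove that particular redundancy, not necessarily the only one: the decision about how to build a Widget is moved behind a single static method. The subclass name and the create() method are assumptions made for illustration.

```java
// Hypothetical Widget hierarchy; Widget has already become abstract here.
abstract class Widget {
    abstract String describe();

    // The one place that knows which concrete Widget to build and how.
    // If the constructor changes, or new subclasses appear, only this
    // method changes.
    static Widget create() {
        return new StandardWidget();
    }
}

class StandardWidget extends Widget {
    @Override
    String describe() {
        return "standard widget";
    }
}

// Client code asks for a Widget instead of writing "new Widget()" itself.
class ReportScreen {
    String title() {
        return "Report for " + Widget.create().describe();
    }
}

class OrderScreen {
    String title() {
        return "Order for " + Widget.create().describe();
    }
}

class WidgetDemo {
    public static void main(String[] args) {
        System.out.println(new ReportScreen().title());
        System.out.println(new OrderScreen().title());
    }
}
```

Had both screens called the constructor directly, turning Widget into an abstract class would have meant editing each of them. With the creation logic in one place, that change is made once.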
My boss and mentor Alan Shalloway puts it this way: There are three numbers in software: 0, 1, and infinity. 0 represents the things we do not do in a system (we do those for free). 1 represents the things we do once and only once. But at the moment we do something twice, we should treat it as infinitely many and create cohesive services that allow it to be reused.
It is helpful to be surrounded by smart people like Boon and Alan, and it is also an advantage of being part of a profession. Other professionals are your resources, and you are theirs.
I teach different things: design patterns, systems architecture, unit testing, test-driven development, and so forth. I've noticed that no matter what the topic is, so long as it is part of software development, I have to start by making sure that everyone understands the qualities in this chapter.
Of course, many do. However, it's very common to find that two people are using a word like cohesion very differently; or that one person thinks of coupling very simply (it's coupled or it isn't), whereas another has more depth (the different types of coupling).
I don't expect to have taught you anything new or earthshaking here (but if I have, so much the better!). The purpose here was to make sure you're emphasizing the same things I am, and that we're using these terms in the same way, and with the same depth.
Now, we have a foundation to build on.
This chapter is excerpted from the new book, Emergent Design: The Evolutionary Nature of Professional Software Development by Scott Bain, published by Addison-Wesley Professional, March 2008, ISBN 0-321-50936-6. Copyright (c) 2008 Pearson Education, Inc. For more information, please see informIT.com and Register Books.