
How To Recognize a Bad Codebase

We learn to recognize a bad bit of code quickly as our code-fu grows. Arbitrary side effects smell bad and crazy one-liners frustrate us. It becomes easier to identify which lines of a codebase you might want to clean up to improve the overall quality of the work.

There is a line between codebases with bad code in them and bad codebases. When do we learn to recognize this, and what are the signs that the problem is far-reaching, not localized? A bad codebase is an expensive codebase. It is difficult to work with and difficult to collaborate with others on. Identifying what makes a codebase bad is key to knowing when, where, and why to improve it. Improving the overall code quality reduces the overall code cost. I'm thinking about software in economic terms these days, and I'm hoping we can turn the recession to our favor by pushing the mantra "Bad Code is Expensive Code."

Costs of code come from three actions: adding features, fixing bugs, and understanding the code. Adding features is an obvious source of code cost: every time you want to expand a product's abilities, you're going to pay appropriately. Fixing bugs is both obvious and subtle. While it's obvious that you need to fix the bugs you see, the costs added by bugs you can't actually detect can be very subtle (more on this later). Understanding the code, to most minds, might be entirely subtle and never obvious. New developers, existing developers moving to new areas, and users trying to understand the behavior emerging from the collection of code all need to understand it, and the more expensive the code is to understand, the less likely they will.

I feel no need to expand on the cost of adding to a codebase. What will hit us are the subtle points. The cost of bugs explodes against subtle misunderstandings, leading to the conclusion that a lack of understanding of the code is the single greatest source of increasing cost. This comes partly from the obvious need to understand the code, and partly from the subtler costs that misunderstanding adds to fixing bugs and even to properly expanding the feature set. The problems manifest as actual bugs in the software.

The sign of a bad codebase is a difficult-to-debug codebase.

Now we only need to know the causes of difficult debugging to know the signs of a bad codebase.

Does the codebase lack tests? Without tests, you can't be sure a change doesn't break more than it was intended to fix. Locating the source of a problem is hugely expensive when you're verifying correctness manually instead of via automated testing. There are fantastic techniques of binary debugging, narrowing a changeset range down to the exact change that introduced a bug. This is so expensive with manual testing that it might as well be impossible, while with tests it's one of the greatest debugging tools you could ever have at your disposal: it can automatically tell you exactly what code caused your bug. It can debug for you, but only in a codebase that started out good.
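To make the binary debugging idea concrete, here is a minimal sketch in Python. It assumes an ordered list of changesets where the oldest is known good and the newest is known bad, and that the bug stays once it appears. The names `find_first_bad`, `passes_tests`, and the `r1`..`r100` revision labels are hypothetical, made up for illustration; a real project would more likely lean on its version control system's own bisect command with the test suite as the check.

```python
from typing import Callable, List, Sequence


def find_first_bad(changesets: Sequence[str],
                   is_good: Callable[[str], bool]) -> str:
    """Return the first changeset for which is_good() is False.

    Assumes changesets run oldest to newest, the first one is good,
    the last one is bad, and the bug persists once introduced.
    """
    lo, hi = 0, len(changesets) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(changesets[mid]):
            lo = mid  # bug was introduced after mid
        else:
            hi = mid  # bug was introduced at or before mid
    return changesets[hi]


# Hypothetical history of 100 changesets; pretend r73 introduced the bug.
history: List[str] = [f"r{n}" for n in range(1, 101)]
checked: List[str] = []


def passes_tests(rev: str) -> bool:
    """Stand-in for running the automated test suite against a revision."""
    checked.append(rev)
    return int(rev[1:]) < 73


culprit = find_first_bad(history, passes_tests)
print(culprit, "found after", len(checked), "test runs")
```

With a hundred changesets, the search needs only a handful of test runs. That is cheap when the check is automated, and exactly the part that becomes painful when each check means verifying the software by hand.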

Does the codebase lack documentation? If your understanding of the code comes mostly from trial and error or from asking other developers, then you lack documentation, or code clear enough to document itself. Every time you add a feature or fix a bug, you're debugging not just the code but also your understanding of how it functions. Clear code, concise comments, and good documentation let you focus on the breakage of the code, and not the breakage of your understanding of its design.

Does the codebase grow or shrink? We might think a growing codebase is a universally good sign, but it's not so. A shrinking codebase can be a great sign. It means two things. Firstly, it means an increase in quality when the amount of code shrinks while the value (not to be confused with cost) of the code is maintained or increased. For example, if you can make a function clearer by finding more concise ways of expressing the same ideas, you reduce how much code there is to understand to get the same job done. Secondly, a shrinking codebase tells you that the code is understandable enough to be refactored, which is a little deceptive. The better the quality of your code, the easier it becomes to improve the quality even further.
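As a small illustration of shrinking code without losing value, here is a sketch in Python. Both functions and the `users` data are hypothetical examples, not taken from any real project. They compute the same result, but the second leaves less to read and understand.

```python
def total_active_balance_verbose(users):
    """Sum the balances of active users, written the long way."""
    total = 0
    for i in range(len(users)):
        user = users[i]
        if user["active"] == True:
            total = total + user["balance"]
    return total


def total_active_balance(users):
    """Same behavior, expressed more concisely."""
    return sum(user["balance"] for user in users if user["active"])


# Hypothetical sample data for illustration only.
users = [
    {"name": "ann", "active": True, "balance": 10},
    {"name": "bob", "active": False, "balance": 99},
    {"name": "cyd", "active": True, "balance": 5},
]

verbose_total = total_active_balance_verbose(users)
concise_total = total_active_balance(users)
print(verbose_total, concise_total)
```

A check like `verbose_total == concise_total` is the kind of automated test that makes this sort of refactoring safe to attempt in the first place.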

Take this as a three point test. How do your current projects score?


