Wednesday, January 18, 2012

On cyclomatic complexity

Not all of us are lucky to work with greenfield projects; the majority end up working with codebases that we inherit. These codebases have had their share of modifications that moved them away from the original elegant design: bug fixes, feature additions, and enhancements that were done in a hurry because of delivery pressure, without regard to code hygiene and future maintainability. The code then feels heavy, complex, hard to understand and more painful to maintain.

There are a lot of ways to characterize code complexity, but the one that jumps to mind is "cyclomatic complexity" described by the McCabe paper from 1976 http://www.literateprogramming.com/mccabe.pdf.  The paper describes how to measure the cyclomatic complexity of different program control paths, and recommends a bound of 10 for manageable code complexity. NIST also selects the same number http://hissa.nist.gov/HHRFdata/Artifacts/ITLdoc/235/chapter2.htm. The numbers, despite being arbitrary, provide a guideline to when to get out the scalpel, and start cleaning up the code.

I was curious about how our code fares, and luckily I found a lot of open source tools that help with that. For C/C++ programs "cccc" is good. If you're using the ports systems,

sudo port install cccc

does the trick. For Java code there are a lot of plugins for various IDEs, and for NetBeans, Simple Code Metrics is good. The plugin page shows that it is old and unmaintained, however it works perfectly in NetBeans 7.1. You choose a file or a package, and click the SCM icon, and it spits out metrics about the code including the cyclomatic complexity.

When I ran the tests, the results were an eye-opener. The good news is that most of the code has a complexity below 15, with the occasional function that is high up in the 30s. Not alarming, but definitely would need attention to keep the code hygiene high. The surprise came when I included some of the open source libraries that we use, and the numbers shot up as high as 206. When I talked to colleagues from other companies, the experiences were similar--most of the normal code hovers around 20, with the occasional function around 50, but no high alarming numbers like the 206 one.




No comments :

Post a Comment