Software Code Counter Review and Recommendations

May 5, 2010 · Filed Under Software Estimating, Software Sizing  - 1 Comment(s)

Galorath’s Mike Churchman did a comprehensive evaluation of numerous code counting tools and provided recommendations.

Some people may wonder why anyone would want to count code. It can be useful when estimating the amount of work in reuse, and it provides a benchmark for estimating new code when lines are used as a size measure. Mike’s evaluation is very useful; I am including only the report on the top two tools.

Recommendations:

C/C++: USC CodeCount

Java: USC CodeCount

Perl: USC CodeCount (if you can get around the bug)

PHP: CLOC.EXE, with adjustment factor

Python: CLOC.EXE, with adjustment factor

Test files:

For Java, Perl, PHP, and Python, he used several files which he pulled from various SourceForge projects. For C++, he used our standard sample file from the manual and Dan’s book.

Manual count (Control)

In the test files, he counted partial lines separately from SLOC, simply because he wasn’t sure what to do with them, so his initial SLOC count doesn’t include partials at all.

The code counters which he tested were (for the most part) smart enough to tell the difference between blank lines, comments, and actual code. With very few exceptions, their blank-line and comment counts agreed with each other and matched his hand counts.
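To make the three categories concrete, here is a minimal sketch of the kind of physical-line classification these tools perform. It is not how any of the reviewed counters is implemented: the function name `classify_lines` and its `comment_prefix` parameter are hypothetical, and the sketch assumes a language whose only comment form is a line starting with `#`, ignoring block comments and strings.

```python
from collections import Counter


def classify_lines(lines, comment_prefix="#"):
    """Tally physical lines as blank, comment, or code.

    Simplifying assumption: a comment is any line whose first
    non-whitespace characters are comment_prefix. A line with code
    followed by a trailing comment is counted as code.
    """
    counts = Counter(blank=0, comment=0, code=0)
    for line in lines:
        text = line.strip()
        if not text:
            counts["blank"] += 1
        elif text.startswith(comment_prefix):
            counts["comment"] += 1
        else:
            counts["code"] += 1
    return counts


sample = ["# setup", "", "x = 1", "y = x + 1  # trailing comment", ""]
print(classify_lines(sample))
```

A real counter has to handle much more than this (block comments, comment markers inside strings, continuation lines), which is where the tools start to disagree.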

There was, however, some significant variation in the SLOC numbers which they reported.

C/C++, Java, and Perl:

The USC code counter generally gave the most conservative SLOC counts, ranging from the same as his hand count to over 25% less. In many cases, the USC count was about 4% to 10% lower than his hand count.

Perl bug:

He found one bug in the USC code counter for Perl: it appears that if the number of opening and closing quotes in a comment block doesn’t match, the counter stops counting SLOC, although it will continue to count comments and blank lines. Unfortunately, it also fails to recognize a quote character at the end of a comment line, unless it is followed by two spaces.
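One possible way to work around the bug is to find the comment blocks likely to trigger it before running the counter. The sketch below is only an illustration under stated assumptions: the function name `unbalanced_comment_blocks` is hypothetical, a "comment block" is taken to mean a run of consecutive lines starting with `#`, and "don’t match" is approximated as an odd number of `'` or `"` characters in the block. The counter’s actual matching rule is not documented in the evaluation, and POD sections are ignored.

```python
def unbalanced_comment_blocks(lines):
    """Return (start_line, end_line) for each run of '#' comment lines
    whose count of single or double quote characters is odd.

    Line numbers are 1-based. This approximates, and does not
    reproduce, the condition that appears to trip the counter.
    """
    flagged = []
    start = None
    single = double = 0
    # A trailing empty sentinel flushes a comment block that ends the file.
    for number, line in enumerate(list(lines) + [""], start=1):
        text = line.strip()
        if text.startswith("#"):
            if start is None:
                start, single, double = number, 0, 0
            single += text.count("'")
            double += text.count('"')
        elif start is not None:
            if single % 2 or double % 2:
                flagged.append((start, number - 1))
            start = None
    return flagged


sample = [
    "# don't forget to close the handle",  # one apostrophe: odd
    "my $x = 1;",
    '# prints "done"',                      # two double quotes: even
    'print "done\\n";',
]
print(unbalanced_comment_blocks(sample))
```

Flagged blocks could then be reviewed by hand, for example by rewording the comment so the quotes pair up, before the file is counted.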

PHP and Python:

The USC code counters don’t handle PHP and Python. The best of the ones that are available all give reasonably accurate counts of code (as opposed to comment or blank) lines, but they count partial lines as full lines. One promising-looking application, for example, counts (according to the online documentation) physical, rather than logical lines.

Suggestion:

For PHP and Python (and for Perl, if you can’t find any practical way around the USC counter’s bug), you could use CLOC.EXE, which seems to give accurate total counts for SLOC + partial lines, then apply an adjustment factor to the output to get something reasonably close to an accurate SLOC count.

How do you come up with an adjustment factor? His first impulse would be to take some representative samples of code from various sources (the test files that he has been using might be sufficient, although something larger may be preferable), hand-count SLOC and partials, get an average SLOC-to-total (SLOC + partials) ratio for each language, and use that as a multiplier.
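The arithmetic behind that suggestion can be sketched as follows. The names `HandCount`, `adjustment_factor`, and `adjusted_sloc` are hypothetical, and the sample numbers are made up for illustration; they are not results from the evaluation. The sketch averages the per-file ratios, which is one reading of "average SLOC-to-total ratio"; pooling all files into a single ratio would be another.

```python
from dataclasses import dataclass


@dataclass
class HandCount:
    """Hand-counted figures for one sample file (hypothetical structure)."""
    sloc: int      # full source lines counted by hand
    partials: int  # partial lines counted by hand


def adjustment_factor(samples):
    """Average SLOC-to-total ratio, where total = SLOC + partials."""
    ratios = [s.sloc / (s.sloc + s.partials)
              for s in samples if s.sloc + s.partials > 0]
    if not ratios:
        raise ValueError("no non-empty samples")
    return sum(ratios) / len(ratios)


def adjusted_sloc(tool_total, factor):
    """Scale a tool's SLOC + partials total down to an estimated SLOC."""
    return round(tool_total * factor)


# Made-up hand counts for two PHP sample files (illustration only).
php_samples = [HandCount(sloc=90, partials=10),
               HandCount(sloc=160, partials=40)]
factor = adjustment_factor(php_samples)
print(factor)
print(adjusted_sloc(1000, factor))
```

A separate factor would be computed per language, since the proportion of partial lines is likely to differ between, say, PHP and Python.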

Other Code Counters:

Universal Code Counter gives counts which generally match CLOC.EXE. Sometimes, however, it seems to be rather far off the mark, and it reverses the counts for blanks and comments, which casts some doubt on its accuracy. Most of the other counters were less accurate, or at least less reliable.




Comments

One Response to “Software Code Counter Review and Recommendations”

  1. Maxim Rusakov on May 6th, 2010 6:50 am

    What tool would you require to count C# files?
