Libre Software People's Front

don't confuse it with People's Front of Open Source

Posts Tagged ‘ohcount’

What .. is your favourite SLOC counter?

Is this a good question? I would say it is something that could be asked by the bridgekeeper in the film Monty Python and the Holy Grail. In our research group we started using sloccount a long time ago, and it is currently integrated with some of our data mining tools like cvsanaly, but after testing ohcount a couple of times last summer I started to think we should either improve it or replace it. Yesterday I read a post by Nicolas Bettenburg on a study of PostgreSQL 9 in which he mentioned using cloc to count the SLOC (source lines of code), so I decided to test a couple of tools during this cold morning in Madrid.

The tests I performed used the latest code of fusionforge and nautilus, with the aim of getting the total SLOC and the SLOC for the primary language. I started with cloc 1.09, sloccount 2.26 and ohcount (latest from git), but I got weird results from ohcount (very different from those of the version used in ohloh), so I discarded it. This is what I got:

SLOC in Fusionforge:

  • 452508 in 27 different languages according to cloc
  • 285193 in 10 different languages (javascript is not among them!) according to sloccount

Primary language in Fusionforge (PHP SLOC):

  • 244435 according to cloc
  • 243355 according to sloccount

SLOC in Nautilus:

  • 182967 in 10 different languages according to cloc
  • 146595 in 5 different languages according to sloccount

Primary language in Nautilus (C SLOC):

  • 134675 according to cloc
  • 134680 according to sloccount
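
To put the four comparisons above side by side, here is a small Python sketch that computes how far sloccount is from cloc in each case. The numbers are the ones reported above; the dictionary layout and the helper name relative_gap are made up for this example.

```python
# SLOC figures copied from the lists above: (cloc, sloccount).
counts = {
    "fusionforge total": (452508, 285193),
    "fusionforge PHP": (244435, 243355),
    "nautilus total": (182967, 146595),
    "nautilus C": (134675, 134680),
}


def relative_gap(cloc, sloccount):
    """Return the difference (cloc - sloccount) as a percentage of the cloc figure."""
    return 100.0 * (cloc - sloccount) / cloc


for name, (cloc, sloccount) in counts.items():
    print(f"{name:18s} cloc={cloc:7d} sloccount={sloccount:7d} "
          f"gap={relative_gap(cloc, sloccount):6.2f}%")
```

The totals differ a lot while the primary-language counts are very close, which suggests the gap comes mainly from which languages each tool recognises rather than from how the lines themselves are counted.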

There are other SLOC counters, but they are not available in the Debian repositories: sclc, USC’s CODECOUNT and loc.

At this point I think we should test cloc thoroughly, but I still don’t understand some numbers in the fusionforge report: it says there are 1600 PHP files, while using the “file” command I only found 1355.

So .. the good thing about sloccount is the time and effort estimation (although it is based only on the SLOC), the great feature of ohcount is that it identifies the FLOSS licenses of the code, and as far as I’ve seen the best SLOC counter is cloc. Do you agree?
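
Since that estimate is derived from the SLOC alone, it is easy to reproduce by hand. By default sloccount reports a basic COCOMO estimate (organic mode): effort = 2.4 * KSLOC^1.05 person-months and schedule = 2.5 * effort^0.38 months. The Python sketch below assumes those default coefficients and leaves out the cost part of the report; the function names are made up for this example.

```python
def cocomo_effort(sloc, factor=2.4, exponent=1.05):
    """Basic COCOMO effort in person-months for a given SLOC count."""
    ksloc = sloc / 1000.0
    return factor * ksloc ** exponent


def cocomo_schedule(effort, factor=2.5, exponent=0.38):
    """Basic COCOMO schedule in calendar months for a given effort."""
    return factor * effort ** exponent


# Nautilus total SLOC as reported by sloccount above.
effort = cocomo_effort(146595)
schedule = cocomo_schedule(effort)
print(f"effort:   {effort:.1f} person-months ({effort / 12:.1f} person-years)")
print(f"schedule: {schedule:.1f} months")
```

Because the formula only sees the SLOC, any miscount in the line counter goes straight into the estimate.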

[This post is also available in my blog at libresoft.es]

Written by sanacl

October 20, 2010 at 1:08 pm

Mining software licenses with cvsanaly and ohcount

During the last three weeks I’ve been diving into cvsanaly to refresh my Python skills. My first contributions have been a couple of easy fixes, but now I’m finishing the integration of the ohcount tool, which detects the license used in source code files (see my previous entries about ohcount).

This afternoon with 35ºC outside I’m very close to the air conditioning while testing and cleaning up the code before submitting the patch to my colleague carlosgc. With this new extension we get a table which relates files, revisions and licenses. See the picture below.

Ohcount is a very interesting tool: thanks to it we even realized we had incorrect headers in 31 source files of cvsanaly. The new extension allows us to detect these changes. For instance, the image below shows the different licenses over time of one of the cvsanaly files. As you can see, the file had two licenses (gpl and lgpl) before revision 609. That happened due to an incorrect header which mixed gpl and lgpl text together.
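
The kind of check described above can be sketched in a few lines of Python. The rows below are made-up illustrative data, not the real cvsanaly history, and the (file, revision, license) tuple layout is only an assumption about what such a table might look like, not the actual schema of the extension.

```python
from collections import defaultdict

# Hypothetical rows: (file, revision, license). Illustrative data only.
rows = [
    ("example.py", 608, "gpl"),
    ("example.py", 608, "lgpl"),
    ("example.py", 609, "gpl"),
    ("other.py", 608, "gpl"),
]


def mixed_license_revisions(rows):
    """Return {(file, revision): sorted licenses} for entries with more than one license."""
    licenses = defaultdict(set)
    for path, revision, license_name in rows:
        licenses[(path, revision)].add(license_name)
    return {key: sorted(found) for key, found in licenses.items() if len(found) > 1}


for (path, revision), found in mixed_license_revisions(rows).items():
    print(f"{path} r{revision}: {', '.join(found)}")
```

Any (file, revision) pair that shows up here is a candidate for a header that mixes two license texts.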

So, our plan is to integrate ohcount to study the licenses used in the fresh code and to start studying whether there are significant facts over time. I hope the code will be committed to git://git.libresoft.es/git/cvsanaly by the end of next week; in any case, drop me a mail if you are interested in it and I’ll let you know.

Written by sanacl

August 27, 2010 at 3:53 pm

Ohcount, Ohloh’s line counter

This afternoon I did some simple tests with Ohcount, Ohloh’s source code line counter. I did not manage to compile the 3.0 release, but the latest version downloaded from git worked properly.

With the default parameters it is similar to sloccount: it gives more information about the code but nothing about effort estimation.

For me, the most interesting part is the possibility of getting the license of each source code file with the flag “-l”:

$ ./bin/ohcount -l /tmp/evince/
lgpl evince-document.h
gpl ev-document-model.c
gpl ev-annotation-window.h
gpl ev-stock-icons.c
gpl ev-view-presentation.c
gpl ev-job-scheduler.h
gpl ev-document-model.h
lgpl ev-timeline.c
gpl ev-page-cache.h
gpl ev-jobs.c
lgpl ev-transition-animation.c
...
gpl ephy-zoom-control.h
gpl ev-previewer.c
gpl ev-previewer-window.c
gpl ev-previewer-window.h
gpl evince-thumbnailer.c
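
This output is easy to post-process. As a minimal sketch, assuming every line has the form “license filename” as in the listing above (how ohcount prints files with several licenses, or with none, has not been checked here), the following Python snippet tallies how many files carry each license. The function name count_licenses is made up for this example.

```python
from collections import Counter


def count_licenses(output):
    """Count files per license in text formatted as 'license filename' lines."""
    totals = Counter()
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank lines and anything like the "..." elision
        license_name, _filename = parts
        totals[license_name] += 1
    return totals


# A few lines copied from the listing above.
sample = """\
lgpl evince-document.h
gpl ev-document-model.c
gpl ev-annotation-window.h
lgpl ev-timeline.c
"""

for license_name, total in count_licenses(sample).most_common():
    print(license_name, total)
```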

This tool looks promising. I’m going to test it thoroughly so that I can propose using it in Melquiades (flossmetrics) and in the FusionForge metrics plugin that we are developing these days.

UPDATE: I’ve found a bug in this version of the tool while studying the evince code. It reports cpp code in the libview directory, which is wrong. I’ve reported the bug to the main developer on sourceforge.

Written by sanacl

July 6, 2010 at 8:28 pm