Libre Software People's Front

don't confuse it with People's Front of Open Source

Posts Tagged ‘cloc’

Analysis of reused code using FLOSS tools


Last week we attended LinuxTag in Berlin to give two talks. The first one, which I gave, was about identifying reused code between two FLOSS projects. The second one, given by Daniel Izquierdo, explained the importance of studying FLOSS communities.

The main aim of my presentation was to show that it is possible (and easy!) to get very interesting results about the code shared between two FLOSS projects using FLOSS tools; in this case we used CCFinder, cloc, Ninka and grep. The study identified not only the common code but also possible license issues. This kind of study can be interesting from different points of view, which I’ve summed up in the following questions:

  • how different are two software projects?
  • is it feasible to propose a merge of the code?
  • how is the derived project using the original code?
  • are the licenses being respected? what about the copyright?
  • is the new project using new licenses that could be interesting for the team that created the original work? are they improving the code?
  • what changes did the second team make to the original code?
  • is your source code being adopted by a certain community?

The slides of the presentation are available here.

[This entry is part of the work I do in Bitergia and it is also available here]


Written by sanacl

May 29, 2012 at 11:05 am

What .. is your favourite SLOC counter?


Is this a good question? I would say it is something that could be asked by the bridgekeeper in the film Monty Python and the Holy Grail. In our research group we started using sloccount a long time ago, and it is currently integrated with some of our data mining tools like cvsanaly. After testing ohcount a couple of times last summer, though, I started to think we should either improve it or replace it. Yesterday I read a post by Nicolas Bettenburg on a study of PostgreSQL 9 in which he mentioned that he used cloc to count the SLOC (source lines of code), so I decided to test a couple of tools during this cold morning in Madrid.

The test I performed used the latest code of fusionforge and nautilus, with the aim of getting the total SLOC and the SLOC for the primary language. I started with cloc 1.09, sloccount 2.26 and ohcount (latest from git), but I obtained weird results from ohcount (very different from the version used in ohloh), so I discarded it. This is what I got:

SLOC in Fusionforge:

  • 452508 in 27 different languages according to cloc
  • 285193 in 10 different languages (javascript is not among them!) according to sloccount

Primary language in Fusionforge (PHP SLOC):

  • 244435 according to cloc
  • 243355 according to sloccount

SLOC in Nautilus:

  • 182967 in 10 different languages according to cloc
  • 146595 in 5 different languages according to sloccount

Primary language in Nautilus (C SLOC):

  • 134675 according to cloc
  • 134680 according to sloccount
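
To make the gap between the two tools easier to see, here is a minimal Python sketch that only redoes the arithmetic on the figures listed above. The names `results` and `relative_gap` are mine and are not part of cloc or sloccount.

```python
# SLOC figures copied from the lists above, as (cloc, sloccount) pairs.
results = {
    "Fusionforge total": (452508, 285193),
    "Fusionforge PHP": (244435, 243355),
    "Nautilus total": (182967, 146595),
    "Nautilus C": (134675, 134680),
}


def relative_gap(cloc, sloccount):
    """Return the difference between both counts as a percentage of the cloc count."""
    return 100.0 * (cloc - sloccount) / cloc


for name, (cloc, sloccount) in results.items():
    gap = relative_gap(cloc, sloccount)
    print(f"{name}: {cloc - sloccount:+d} SLOC ({gap:+.2f}% of the cloc count)")
```

The totals differ by roughly 37% (Fusionforge) and 20% (Nautilus), while the primary-language counts differ by less than half a percent. So the two tools mostly agree on the languages they both recognise, and the gap in the totals seems to come from the extra languages that cloc detects.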

There are other SLOC counters, but they are not available in the Debian repos: sclc, USC’s CODECOUNT and loc.

At this point I think we should test cloc more thoroughly, but I still don’t understand some numbers in the fusionforge report: it says there are 1600 PHP files, while using the “file” command I only found 1355.
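
One way to dig into a mismatch like this is to count the files in two different ways and compare. The sketch below is only a guess at what could be going on: it assumes one count is driven by file extensions and the other by file contents, which I have not verified for cloc or for “file”. The extension set and the `<?php` check are hypothetical simplifications, and `count_php_files` is a name I made up.

```python
import os

# Assumption: extensions treated as PHP. The real tools may use a different list.
PHP_EXTENSIONS = {".php", ".php3", ".php4", ".php5"}


def count_php_files(root):
    """Count PHP files under root by extension and, separately, by content."""
    by_extension = 0
    by_content = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.splitext(name)[1].lower() in PHP_EXTENSIONS:
                by_extension += 1
            try:
                with open(path, "rb") as handle:
                    head = handle.read(1024)
            except OSError:
                continue  # unreadable file: skip the content check
            # Crude content check: an opening PHP tag in the first 1024 bytes.
            if b"<?php" in head:
                by_content += 1
    return by_extension, by_content


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(count_php_files(sys.argv[1]))
```

If the two numbers differ on the fusionforge checkout, listing the files that appear in only one of the counts should show where the extra files come from.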

So .. the good thing about sloccount is the time and effort estimation (although it is based on the SLOC count), the great feature of ohcount is that it identifies the FLOSS licenses of the code, and as far as I’ve seen the best SLOC counter is cloc. Do you agree?
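
Since the estimation depends on the SLOC, the choice of counter changes the estimate too. As a rough illustration, this sketch applies the Basic COCOMO formulas that sloccount reports by default (effort in person-months = 2.4 * KSLOC^1.05, schedule in months = 2.5 * effort^0.38) to the two Fusionforge totals above. The function name `basic_cocomo` is mine, and the sketch ignores everything else sloccount does, such as cost figures.

```python
def basic_cocomo(sloc, a=2.4, b=1.05, c=2.5, d=0.38):
    """Return (effort in person-months, schedule in months) for Basic COCOMO, organic mode."""
    ksloc = sloc / 1000.0
    effort_pm = a * ksloc ** b
    schedule_months = c * effort_pm ** d
    return effort_pm, schedule_months


# Fusionforge totals taken from the lists above.
for label, sloc in [("cloc", 452508), ("sloccount", 285193)]:
    effort, schedule = basic_cocomo(sloc)
    print(f"Fusionforge ({label}): {effort / 12:.1f} person-years, {schedule:.1f} months")
```

Because the exponent is above 1, the roughly 37% gap in the SLOC totals becomes a slightly larger gap in the estimated effort, so the estimate is only as good as the counter behind it.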

[This post is also available in my blog at]

Written by sanacl

October 20, 2010 at 1:08 pm