One year ago three of us were freezing cold in Brussels, attending FOSDEM 2012 and trying to get feedback about a couple of ideas we had for creating a start-up. Twelve months later here we are again on our way to FOSDEM, but this time we are part of Bitergia, the company we created.
The creation of the start-up has been a lesson in itself. During the first months of 2012 we lost count of the times we met at a very nice cafeteria in the south of Madrid to discuss the people we wanted involved, business models, potential clients and so on. One of our main concerns during these discussions was to decide on the main idea that would be the base of our business model. During our last months in the research group (late 2011) we started to receive requests from people interested in using the results of some research lines in their products. Offering interesting results as researchers was really nice; the new challenge was how to make money from those results and that expertise. Advance warning: this will be one of our critical problems or opportunities.
During summer 2012 we officially launched the company, mainly focused on offering software statistics from Libre Software communities and mainly aimed at companies involved in those projects. Since then we have been working on improving the tools developed by the research group and starting to perform interesting studies about big projects like Mediawiki, OpenStack or Webkit. Right now we are reaching an interesting point where we want to dig deeper into the real work behind the current data we have about total activity (commits, committers, issues, time-to-fix, blah, blah). I’m pretty sure we have more fun ahead :)
Did I talk about getting focused in the paragraphs above? One of our favourite dilemmas was to define how focused we wanted our business idea to be. As soon as we started the company we saw some opportunities to collaborate with a couple of entities supporting and improving software forges. Since the very beginning we were very interested in Allura, Fusionforge and software forges in general, but we weren’t sure it would be a good idea to follow that path. Time has passed and a few months ago we decided to add forges to the business idea, so currently we are working on supporting software forges and integrating them with our software analytics reports.
So far, the adventure has been really fun and I’m pretty sure this is because I have great partners. If someone asked me what lesson I take from this first stage of the company’s life I would say: once you have the idea, look for people who support you and motivate you to grow.
Last week we attended LinuxTag in Berlin to give two talks. The first one, given by me, was about identifying reused code between two FLOSS projects. The second one, given by Daniel Izquierdo, explained the importance of studying FLOSS software communities.
The main aim of my presentation was to show that it is possible (and easy!) to get very interesting results about the code shared between two FLOSS projects using FLOSS tools; the ones we used in this case were CCFinder, Cloc, Ninka and Grep. The study identified not only the common code but also possible license issues. These kinds of studies can be interesting from different points of view, which I’ve summed up in the following questions:
- how different are two software projects?
- is it feasible to propose a merge of the code?
- how is the derivative project using the original code?
- are the licenses being respected? what about the copyright?
- is the new project using new licenses that could be interesting for the team that created the original work? are they improving the code?
- what changes did the second team make to the original code?
- is your source code being adopted by a certain community?
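To give a rough feeling for the first question (how different are two projects?), here is a minimal sketch in Python. It is not what CCFinder does (CCFinder detects clones at the token level); it only compares normalised lines, and the function names and the 8-character threshold are my own assumptions for illustration.

```python
import re


def normalized_lines(source):
    """Return the set of non-trivial lines with whitespace collapsed."""
    lines = set()
    for raw in source.splitlines():
        line = re.sub(r"\s+", " ", raw).strip()
        # Skip blank and very short lines (a lone brace, "else", ...):
        # they match everywhere and say nothing about reuse.
        # The threshold of 8 characters is an arbitrary assumption.
        if len(line) >= 8:
            lines.add(line)
    return lines


def shared_code(original, derivative):
    """Return the lines common to both sources, plus the share of the
    derivative's distinct lines that also appear in the original."""
    orig_lines = normalized_lines(original)
    deriv_lines = normalized_lines(derivative)
    common = orig_lines & deriv_lines
    ratio = len(common) / len(deriv_lines) if deriv_lines else 0.0
    return common, ratio
```

Feeding it the contents of two files gives a set of common lines and a ratio between 0 and 1. A line-based comparison like this misses renamed variables and reordered code, which is exactly where a real clone detector earns its keep.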
The presentation is available here.
[This entry is part of the work I do in Bitergia and it is also available here]
Something is wrong with the way ohloh.net sorts developers by meritocracy. ohloh.net uses Kudos to measure it, and the definition they offer for a kudo is:
kudo: a statement of praise or approval; accolade; compliment.
Basically, kudos are a public way to show your appreciation or respect for an open source contributor. Remember that meritocracy is a very important part of the motivation to work in libre software projects. According to the explanation by the ohloh staff, contributors who have received the most Kudos will receive the highest KudoRank of 10 and only the 64 top people can receive the highest KudoRank.
Once the concept is clear, let's have a look at the top ten contributors according to ohloh (based on the Kudos they've received). The top 3 committers according to the Kudo Rank are Jari Aalto, zeljic and Stefan Küng (see image below).
Now, let's compare the first three contributors. If we add up the values of the first two committers (Jari Aalto and zeljic) we get 21 commits, 2 years of experience and 6 kudos received. On the other hand Stefan Küng, who is the third contributor, has received around 100 kudos (you'll have to count them manually), has coding experience of 8 years and 9 months and is part of 15 software projects. The obvious question is: what is wrong here? The first two committers have not claimed the ohloh account for these contributions. Could that be the root of the bug?
I think this is a bug, and I'll let the Ohloh staff know. Stay tuned.
I remember the first time I had to deal with a development based on Moodle. It was more than 6 years ago but I still remember that everything was prepared to attract people and boost the creation of a stronger community. Today I had a look at Moodle's numbers and the activity of its leader, Martin Dougiamas, and I have to say I am quite happy that Moodle has become such a big success.
First let’s have a look at the big picture:
- Moodle has a large and diverse user community with over 1,128,626 registered users on the moodle.org site alone, speaking over 78 languages in 217 countries.
- As of October 2010 it had a user base of 49,952 registered and verified sites, serving 37 million users in 3.7 million courses. Currently (December 2011) it has a user base of 72,168 registered and verified sites, serving 57 million users in 5.8 million courses. Wow! “serving 57 million users”! Well done Moodle team :).
- There are around 50 companies that make up the “Moodle partners” network. They offer services around Moodle and help with the development. The services they offer are: hosting, support, consulting, integration, course development, customisation and certification.
But what about its creator and leader? I wanted to find out the role played by Dougiamas during the last years, so I analysed the git repository with cvsanaly (the development team maintains a git mirror of the CVS repository they use for development). This was what I found out:
I believe that Moodle is a very interesting case that deserves deeper study. If you are part of the Moodle community and find any errors, please let me know.
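If you want to try a similar analysis without installing cvsanaly, a rough sketch in Python could count commits per author and year straight from `git log`. This is not how cvsanaly works (it loads the history into a database); the function names are my own, and the `author|date` line format is simply the one I ask git to print here.

```python
import subprocess
from collections import Counter


def commits_per_author_year(log_text):
    """Count commits per (author, year) from lines shaped like
    'Author Name|YYYY-MM-DD'."""
    counts = Counter()
    for line in log_text.splitlines():
        if "|" not in line:
            continue  # ignore blank or unexpected lines
        # Split on the last '|' in case an author name contains one.
        author, date = line.rsplit("|", 1)
        counts[(author.strip(), date.strip()[:4])] += 1
    return counts


def read_git_log(repo_path):
    """Ask git for one 'author|date' line per commit of the repository."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log",
         "--pretty=format:%an|%ad", "--date=short"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `commits_per_author_year(read_git_log("path/to/clone"))` on a local clone returns a counter that can be filtered by author to see how one person's share of the commits changes year by year.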
As part of my master’s homework, I’ve just read an article written by some remarkable colleagues a few years ago about the geographic origin of libre software developers. The article was interesting and had some impact. The key question they tried to answer was how diverse the national origin of developers is, and the approach was also new, as they didn’t want to use surveys but real data. The information they used in the study was the following:
- A dump of the SourceForge database created in 2005, which included more than 1,180,000 registered users.
- Mailing lists archives of the Debian, GNOME and FreeBSD projects. A total of more than one million different e-mail addresses.
The article was focused both on users/contributors (contributors of the forge, mailing lists, source code repositories) and developers (contributors of the source code repos). Obviously, the second group is a subset of the first one. These were some of the results:
- out of 1.1 million registered participants on SourceForge, just under 50,000 committed code to the development repositories. This is not a result of the study, but I found it quite interesting.
- most of the total sf.net users came from Europe and North America, followed by Asia with less than 10% of the developer population. If we take into account that the population is larger in Europe, the penetration of the libre software development per capita was higher in North America than in Europe.
- there were more developers in the US and Canada than in most European countries or regions. On the other hand, the US had fewer libre software developers per million Internet users than most European countries.
- when the total number of developers is adjusted using wealth (GDP), China, India, Russia, Brazil and even South Africa are among the higher contributors.
I wonder if the situation will be the same by 2020.
“Geographic origin of libre software developers”
by Jesus M. Gonzalez-Barahona, Gregorio Robles, Roberto Andradas-Izquierdo and Rishab Aiyer Ghosh
A couple of weeks ago some colleagues and I were discussing whether to follow the Style Guide for Python Code (PEP8) and its limitation of 80 columns. The origin of that limit is much older than I thought. It comes from the IBM punch cards!
At first sight it’s nonsense to oblige people to write code in 80 columns when most screen resolutions are above 1024×768. On the other hand, it is better to have “standards” to follow in your collaborative projects.
According to PEP8 these are the main reasons for using the old limitation:
- there are still many devices around that are limited to 80 character lines
- limiting windows to 80 characters makes it possible to have several windows side-by-side
- it is easier to read
I agree with the second point, but the first and third points are at least debatable. What is sure is that the 80-column limitation will be updated sooner or later. The second argument will lose its force when most screens are above 1280×800 pixels. The only question is when the 80-column limit will be a thing of the past.
Some big projects like webkit also faced the problem of choosing whether to follow the entire PEP8 or throw the 80-column limit out the window. The discussion is available here.
So, finally, we decided to follow PEP8 as far as possible. The new contributions made to bicho and cvsanaly follow it and the legacy code should be updated during the following months.
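To see how much legacy code breaks the limit, something as small as the following sketch is enough. The function name is my own invention; the default of 79 is the maximum line length PEP8 actually states, and it can be changed if your team settles on another number.

```python
def long_lines(source, limit=79):
    """Return (line_number, length) for every line longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]
```

Reading a file and passing its contents, for example `long_lines(open("some_module.py").read())`, gives the list of offending lines. Dedicated style checkers do much more than this, but a few lines like these are handy for a quick count before deciding how strict to be.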
As you may know, OSOR and SEMIC will merge very soon to become Joinup. I have been working in OSOR since the very beginning, and today I helped Roberto Andradas to provide the last dump with the data that will be used by the new staff to feed Joinup.
More than three years ago, Juanjo Amor and Jesús M. González-Barahona offered me the challenge of coordinating a very young and passionate team to deploy, maintain and improve the platform. We made many mistakes but I am quite happy to see how much we have learned. During the last year my contribution to OSOR was very small because Roberto Andradas took over the coordination, but I must confess that today I have a strange feeling (a mixture of happiness and sadness).
I would like to thank all the people who worked on the project during these years. It has been a pleasure. The OSOR community will now migrate to the Joinup platform with a different staff. The game is over for us, new adventures ahead.