Something is wrong with the way ohloh.net ranks developers by merit. ohloh.net uses Kudos to measure it, and the definition they offer for kudo is:
kudo: a statement of praise or approval; accolade; compliment.
Basically, kudos are a public way to show your appreciation or respect for an open source contributor. Remember that meritocracy is a very important part of the motivation to work in libre software projects. According to the explanation by the ohloh staff, the contributors who have received the most Kudos get the highest KudoRank of 10, and only the top 64 people can hold that rank.
Once the concept is clear, let’s have a look at the top ten contributors according to ohloh (based on the Kudos they’ve received). The top 3 committers according to the KudoRank are Jari Aalto, zeljic and Stefan Küng (see image below).
Now, let’s compare the first three contributors. Adding up the values for the first two committers (Jari Aalto and zeljic) gives 21 commits, 2 years of experience and 6 kudos received. Stefan Küng, the third contributor, has received around 100 kudos (you’ll have to count them manually), has 8 years and 9 months of coding experience and takes part in 15 software projects. The obvious question is: what is wrong here? The first two committers have not claimed an ohloh account for these contributions; could that be the root of the bug?
I think this is a bug, and I’ll let the Ohloh staff know. Stay tuned.
(Have a look at the previous post if you haven’t yet.)
I recommend running the downloads inside the screen command; they can take a couple of hours if your connection is slow. Redirect the output to a log file so you can check that everything was downloaded properly, and use the mail command to notify you when the downloads finish.
../get_repos.sh > ../log_git_clone.txt 2>&1; mail -s "git clone fin" email@example.com < ../log_git_clone.txt
After using git clone to get all the git repositories used by Android, we need to analyze the code with cvsanaly. Again, we will use a log file.
# Run cvsanaly on every repository in the current directory
list=`ls`
for i in $list
do
    echo "------ ANALYSING $i" >> ../log-cvsanaly.txt
    # Store the results of all repositories in the same database
    ~/repos/cvsanaly/cvsanaly2 -u **** -p **** -d cvsanaly_android_lcanas $i >> ../log-cvsanaly.txt 2>&1
done
# Send the log by mail once everything has finished
mail -s "cvsanaly finished" firstname.lastname@example.org < ../log-cvsanaly.txt
At this point we have a single MySQL database with all the information from the 167 Android repositories. The next step is to use this information to answer some questions. In this introductory study we are going to examine the activity of the project over time (in terms of commits), split between Google staff and everyone else. We will assume that Google employees commit with an address containing @google.com or @android.com; that is how we divide the committers into two groups.
The first R commands below open the connection to the MySQL database and build the variables comm and googlers, which hold the number of commits per month for each group.
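As a rough illustration of that assumption, the small shell function below labels a single address. The name classify_author is made up for this sketch and is not part of cvsanaly; it simply applies the same substring rule as the SQL LIKE patterns used in the queries.

```shell
# Hypothetical helper (not part of cvsanaly): label one committer address
# with the same substring rule as LIKE '%@google.com%' / '%@android.com%'.
classify_author() {
    case "$1" in
        *@google.com*|*@android.com*) echo "google" ;;
        *) echo "other" ;;
    esac
}
```

Calling classify_author "someone@android.com" prints google. Because this is a substring match, an empty address or an unrelated domain falls into other, which is exactly the weakness of the heuristic.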
> library(RMySQL)
Loading required package: DBI
> con <- dbConnect(MySQL(), user="***", password="***", dbname="cvsanaly_android23_lcanas")
> # Commits per month by everyone else
> comm <- dbGetQuery(con, "select count(scmlog.id) as comm
+   from scmlog join people on (scmlog.author_id=people.id)
+   where date >= '2008-10-21 00:00:00'
+     and people.email not like '%@android.com%'
+     and people.email not like '%@google.com%'
+   group by date_format(scmlog.date, '%Y %m')
+   order by date_format(scmlog.date, '%Y %m') asc;")
> # Commits per month by Google staff. The parentheses are needed: AND binds
> # tighter than OR, so without them the date filter would skip @google.com.
> googlers <- dbGetQuery(con, "select count(scmlog.id) as googlers
+   from scmlog join people on (scmlog.author_id=people.id)
+   where date >= '2008-10-21 00:00:00'
+     and (people.email like '%@android.com%' or people.email like '%@google.com%')
+   group by date_format(scmlog.date, '%Y %m')
+   order by date_format(scmlog.date, '%Y %m') asc;")
We join the information from Google employees and the rest of the contributors. Note that cbind assumes both queries return one row per month, in the same order. We also need the list of months, which will serve as the x axis of the chart we will generate.
> mymatrix2 <- cbind(googlers, comm)
> months <- dbGetQuery(con, "select date_format(scmlog.date, '%m/%y') as month
+   from scmlog join people on (scmlog.author_id=people.id)
+   where date >= '2008-10-21 00:00:00'
+     and people.email not like '%@android.com%'
+     and people.email not like '%@google.com%'
+   group by date_format(scmlog.date, '%Y %m')
+   order by date_format(scmlog.date, '%Y %m') asc;")
The last step is to generate the chart and save it to a file.
> barplot(t(mymatrix2), names.arg=t(months), ylab="commits", legend.text=c("Google employees","Rest"), col=c("dark green","grey"))
> savePlot(filename="android-commits-domains.png", type="png")
Voilà, based on the software history of the Android project we have generated a view of the activity around the code in terms of commits over time.
This basic process should be improved to obtain more accurate results. For instance, some Google employees committed code with an empty mail address, so the contribution from non-Google employees looks bigger than it really is. It will also be necessary to analyze the Linux kernel together with the rest of the Android code to get a wider view of the effort invested by the Android community. Many different questions can shed light on how the different communities work; in the last two posts we have seen one way to start a quantitative study aimed at answering some of them.
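One possible refinement for the empty-address problem is to count those commits in their own bucket instead of adding them to the non-Google group. The sketch below is only an illustration: tally_authors is a made-up name, and it assumes the author addresses have already been exported one per line (for example from the people table), which is not a step shown in this post.

```shell
# Hypothetical helper: read one author address per line on stdin and count
# them per group, keeping empty addresses apart as "unknown".
tally_authors() {
    awk '
        /^[ \t]*$/                    { n["unknown"]++; next }  # empty address
        /@google\.com|@android\.com/  { n["google"]++;  next }  # Google staff
                                      { n["other"]++ }          # everyone else
        END {
            printf "google=%d other=%d unknown=%d\n", n["google"], n["other"], n["unknown"]
        }
    '
}
```

Fed a list of addresses through a pipe, it prints a single summary line such as google=2 other=1 unknown=1, which makes it easy to see how many commits the two-group split cannot attribute.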
At the last Open Forges Summit, Scott Collison told us that he saw Ohloh as a failed project and hinted that there would be news about the platform. Here is the news.
The following message has been sent to the ohloh users:
Dear Ohloh Community Member,
I’m Andi Zink, head of development at Black Duck Software. We’ve got some exciting news for you today, which we hope will get you as fired up as it has us! Black Duck Software has acquired Ohloh.net from Geeknet.
Ohloh has been built for you, the developer community, and our vision for Ohloh carries on this mission! We have big plans for improving and expanding the site, bringing you even more useful capabilities and information about the world of FOSS, and we will keep it free and open.
So, who is Black Duck Software? We’ve been around since 2003, and have been involved with the FOSS world since our inception. Black Duck develops a suite of applications and services that companies use to speed up and manage their use of open source software. We’ve compiled a KnowledgeBase of FOSS project information that we believe is the industry’s most comprehensive. We also have a free developer website for code search called Koders.com where we index billions of lines of open source code from projects in our KnowledgeBase. We will combine Koders.com and our KnowledgeBase with the project and people data on Ohloh. That together with new more powerful search, selection, and usage tools will make it much easier for developers to find, select and use FOSS.
We know the site could use some love and a shot of investment. We’ve seen the enhancements and feature requests posted on the Ohloh forums. We’ll address some of these requests right away. Ohloh has always been built, guided by the use and feedback of its users. This won’t change! We promise to listen, engage with you, and give you plenty of opportunity to help us turn Ohloh into the most useful center of FOSS knowledge with the biggest, most innovative and engaged developer community. Ohloh is YOUR site, and we’re committed to making it into what YOU need to accelerate your use of FOSS, and make your projects even more relevant, dynamic, and successful.
Lots of information about the announcement can be found on the “Ohnouncement” page. Visit the Ohloh forums and join the discussion. Tell us what you want and need from an enhanced Ohloh. We care about what you care about: FOSS adoption, great tools, and collaborative communities. Let us know what you think, ask any questions, participate, and help us make this initiative take off!
Thank you for your support and loyalty toward Ohloh. We’re excited about collaborating with you and building what we hope will continue to be your go-to destination and trusted source of FOSS knowledge.
So .. finally ohloh will be improved. Let’s see what they can do 🙂
The chart I created shows information about Balsa, the GNOME mail client; the data was obtained from Melquiades. The example chart shows the commits and reports created over the last ten years in the Balsa project, together with some of the most notable events (releases). I did not include all the releases: this was done manually and I was a bit tired 😉 . The dynamic version is available here.
Adding events to our charts will improve the graphical information we offer, but so far extracting the release dates is a manual process. Once we find more data to show with the timeplot events, we will need more libraries to create pie charts and maybe bar charts.
Some useful links:
The data and HTML code are available here.
During the last couple of months some interesting things have happened in my research group (libresoft.es) related to software metrics and their application to collaborative environments. One of our dearest data mining projects (FLOSSMetrics) has gained a lot of value in terms of procedures to get data from libre software projects, and some of its small features have been applied to OSOR.eu, the biggest collaborative environment we maintain. With the background we have in this topic (see the links below) we are in a great position to contribute something interesting in this area to the libre software community so .. there we go.
Our first task is to polish up the tools we developed for FLOSSMetrics; our team has some ideas about how to improve the heart of the analysis (a tool called retrieval system .. so far!). While they design the new platform, I’ll start creating a prototype which will be our template for applying it to the first forge: fusionforge, the new libre release of GForge. As always, the design is the most important part because we want/need to obtain a standalone product with a wide variety of plugins. Indeed, we need to emphasize that our aim must be to obtain/offer a “product“, only one product with many many small applications. This point of view would be new for us (it is not very common in research projects) and I’m pretty sure that’s the way we can improve the final quality.
One metrics product to rule them all!
Some interesting links: