Libre Software People's Front

don't confuse it with People's Front of Open Source

Posts Tagged ‘android’

Study of the Android development activity and its authors


Libre software is changing the way companies build applications. While the traditional software development model pays no attention to external contributions, libre software products developed by companies benefit from them. These contributions are encouraged by building communities around the project, and they help the company create a superior product at a lower cost than is possible for traditional competitors. In exchange, the company offers the product free to use under a libre software license.

Android is one of these products. It was created by Google a couple of years ago and it follows a single-vendor strategy. As Dirk Riehle pointed out some time ago, it is something of an economic paradox that a company can earn money by making its product available for free as open source. But companies are not NGOs; they don’t give away money without expecting something in return, so where is the trick?

As a libre software project, Android did not start from scratch: it uses software that would be unavailable to non-libre projects. Besides that, it has a community of external stakeholders who improve and test the latest published version, help to create new features and fix errors. It is true that Android is not driven by a community but by a single vendor, and Google runs it in a very restrictive way. For instance, external developers have to sign a Grant of Copyright License and they do not even have a roadmap. Google publishes the code after every release, so there are long intervals during which external developers do not have access to the latest code. Even with these barriers, a significant part of the code comes from external people, either contributed directly to the project or reused from common dependencies (Git provides ways to reuse changes made in remote repositories).

[Figure: Commits by domain per month (proportional)]

[Figure: Commits by domain per month (total)]

The figures above show the monthly number of commits, split into two groups. Commits from the mail domains google.com or android.com are shown in green; the study assumes that these authors are Google employees. Commits from all other mail domains, which belong to other companies or to volunteers, are shown in grey.

According to the first figure (on the left), which shows the proportion of commits, during the first very active months (March and April 2009) the number of commits from external contributors was similar to the number made by Google staff. The number of external commits is also large in October 2009, when the total number of commits reached its maximum. Since April 2009 the monthly share of commits from external contributors seems to be between 10% and 15%.

The figure on the right provides an interesting view of the total activity per month, with two very interesting facts. The first is that the highest peak of development was reached in late 2009 (more than 8K commits per month for two months). The second is the activity during the last months: as mentioned before, the Google staff work in private repositories, so we won’t see another peak of development until they publish the next version of Android. (Take into account that the Git history changes when the code is published, so the last months in the timeline will be overwritten by the next release.)

[Figure: Commits by domain]

More than 10% of the commits used by Google in Android were committed from mail domains other than google.com or android.com. At this point the question is: who made them?

(Since October 2008)

# Commits  Domain
    69297  google.com
    22786  android.com
     8815  (NULL)
     1000  gmail.com
      762  nokia.com
      576  motorola.com
      485  myriadgroup.com
      470  sekiwake.mtv.corp.google.com
      422  holtmann.org
      335  src.gnome.org
      298  openbossa.org
      243  sonyericsson.com
      152  intel.com
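
A per-domain ranking like the one above can also be approximated directly from the author e-mail addresses, without a database. The following is only a minimal sketch, not the query used for this study: count_domains is a hypothetical helper that reads one address per line and prints the number of commits per domain, counting addresses without a domain as (NULL).

```shell
#!/bin/bash
# Hypothetical helper (not part of cvsanaly): read one author e-mail per line
# on stdin and print "<commits> <domain>", most active domain first.
count_domains() {
    awk -F'@' '
        {
            # Everything after the last "@" is the domain; addresses that are
            # empty or have no domain are grouped under "(NULL)".
            domain = (NF >= 2 && $NF != "") ? tolower($NF) : "(NULL)"
            count[domain]++
        }
        END { for (d in count) print count[d], d }
    ' | LC_ALL=C sort -k1,1nr -k2,2
}
```

Inside a single clone the addresses could come from git log, for example git log --format='%ae' | count_domains | head, which would list the most active domains of that one repository.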

Looking at the domain names, it is very surprising that Nokia is one of the most active contributors. This is a real paradox: the company that says Android is its main competitor is helping it! One of the effects of using libre software licenses for your work is that even your competitors can use your code. Currently there are Nokia commits in the following repositories:

  • git://android.git.kernel.org/platform/external/dbus
  • git://android.git.kernel.org/platform/external/bluetooth/bluez

This study is an ongoing process that should become a scientific paper; if you have feedback, please let me know.

CVSAnalY was used to get data from 171 Git repositories (the Linux kernel was not included). Our tool allows us to store the metadata of all the repositories in one SQL database, which helped a lot. The study assumes that people working for Google use an @google.com or @android.com address.


[This entry is part of the work I do in LibreSoft and it is also available in my blog at libresoft.es]


Written by sanacl

April 16, 2011 at 5:29 pm

Android went up like a rocket during 2010


Have you seen the latest market share of Operating Systems for smartphones?

Wow! It’s amazing how quickly Android is increasing its market share. During 2010 a total of 170,000 applications were published, and it started with fewer than 10,000. It will be very interesting to see the market share by the middle of this year; we’ll see if Android is still going up like a rocket.

More info on the Nielsen blog and on the Android Zoom blog.

Written by sanacl

February 16, 2011 at 10:00 pm

Posted in Uncategorized


How to get quantitative data from the Android source code (II)


(Have a look at the previous post if you haven’t.)

I recommend using the screen command when downloading the repos, since it can take a couple of hours if your connection is slow. Use a log file to check that everything was downloaded properly, and the mail command to notify you when the downloads finish.

../get_repos.sh > ../log_git_clone.txt 2>&1; mail -s "git clone finished" lcanas@libresoft.es < ../log_git_clone.txt

After using git clone to fetch all the git repositories used by Android, we can start analyzing the code with cvsanaly. Again, we will use a log file.

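# Run cvsanaly on every cloned repository, appending all output to one log file
for i in */
do
    repo="${i%/}"   # strip the trailing slash added by the */ glob
    echo "------ ANALYSING $repo" >> ../log-cvsanaly.txt
    ~/repos/cvsanaly/cvsanaly2 -u **** -p **** -d cvsanaly_android_lcanas "$repo" >> ../log-cvsanaly.txt 2>&1
done
# Send the log by mail once every repository has been processed
mail -s "cvsanaly finished" lcanas@libresoft.es < ../log-cvsanaly.txt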

At this point we have a single MySQL database with all the information from the 167 Android repositories. The next step is to use this information to answer some questions. In this introductory study we examine the activity of the project over time (in terms of commits), split between Google staff and others. We assume that Google employees use an @google.com or @android.com address; that is how we divide the authors into two groups.

The first R commands below connect to the MySQL database and build the variables comm and googlers, which contain the number of commits per month for each group.

> library(RMySQL)
Loading required package: DBI
> con <- dbConnect( MySQL(), user="***", password="***", dbname="cvsanaly_android23_lcanas" )
> comm <- dbGetQuery(con, "select count(scmlog.id) as comm from scmlog join people on (scmlog.author_id=people.id) 
where date >= '2008-10-21 00:00:00' and people.email not like '%@android.com%' and people.email not like '%@google.com%' 
group by date_format(scmlog.date, '%Y %m') order by date_format(scmlog.date, '%Y %m') asc;")

> # The OR is parenthesised so that the date filter applies to both domains (AND binds tighter than OR)
> googlers <- dbGetQuery(con, "select count(scmlog.id) as googlers from scmlog join people on (scmlog.author_id=people.id) 
where date >= '2008-10-21 00:00:00' and (people.email like '%@android.com%' or people.email like '%@google.com%') 
group by date_format(scmlog.date, '%Y %m') order by date_format(scmlog.date, '%Y %m') asc;")

We join the data for Google employees and the rest of the contributors. We also need the list of months, which will serve as the x axis of the chart we are going to generate.

> mymatrix2<-cbind(googlers,comm)

> months <- dbGetQuery(con, "select date_format(scmlog.date, '%m/%y') as month from scmlog join people 
on (scmlog.author_id=people.id) where date >= '2008-10-21 00:00:00' and people.email not like '%@android.com%' and 
people.email not like '%@google.com%' group by date_format(scmlog.date, '%Y %m') order by date_format(scmlog.date, '%Y %m') asc;")

The last step is to generate the chart and save it to a file.

> barplot(t(mymatrix2),names.arg=t(months),ylab="commits",legend.text=c("Google employees","Rest"),col=c("dark green","grey"))

> savePlot(filename="android-commits-domains.png", type="png")

Voilà, based on the software history of the Android project we have generated a view of the activity around the code in terms of commits over time.
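
The proportional view of the same data can be computed from the two series before plotting. As a hedged sketch: external_share is a hypothetical helper, and its three-column input (month, Google commits, other commits) is an assumed export format, not something cvsanaly produces by itself.

```shell
#!/bin/bash
# Hypothetical helper: read lines of "<month> <google_commits> <other_commits>"
# on stdin and print "<month> <percentage of commits from other domains>".
# Lines with a different number of fields or with no commits at all are skipped.
external_share() {
    awk 'NF == 3 && ($2 + $3) > 0 {
        printf "%s %.1f\n", $1, 100 * $3 / ($2 + $3)
    }'
}
```

Dividing by the monthly total keeps a very busy month and a quiet one on the same 0 to 100 scale, which is what makes the external share comparable over time.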

This basic process should be improved to obtain more accurate results. For instance, some Google employees committed code using an empty mail address, so the contribution from non-Google employees appears bigger than it is. It will also be necessary to analyze the Linux kernel together with the rest of the Android code in order to get a wider view of the effort invested by the Android community. There are many different questions that can shed some light on how the different communities work; in the last two posts we’ve seen one method for starting a quantitative study that aims to answer some of them.
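
One possible refinement is to stop treating empty addresses as external and to keep them in a separate group. The sketch below is an illustration under my own assumptions, not the classification used in these posts: classify_email is a hypothetical helper, and counting subdomains of google.com and android.com as Google staff is an extra assumption.

```shell
#!/bin/bash
# Hypothetical helper: print "google", "external" or "unknown" for one address.
classify_email() {
    local email
    # Compare case-insensitively
    email=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$email" in
        # Empty address or missing domain: we cannot tell who committed
        ''|*@) echo unknown ;;
        # Assumption: subdomains (for example build hosts) also count as Google
        *@google.com|*@android.com|*@*.google.com|*@*.android.com) echo google ;;
        *@*) echo external ;;
        # No "@" at all
        *) echo unknown ;;
    esac
}
```

With a third group, the commits that cannot be attributed no longer inflate the external share; they can be reported separately or investigated by author name.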

Written by sanacl

December 31, 2010 at 2:07 am

How to get quantitative data from the Android source code (I)


One of my targets for 2011 is to make the process of obtaining quantitative data from open source projects as easy as possible. We have developed several tools for that purpose, but they still need a lot of love to be really user-friendly and stable. In the following two posts I’ll show you how to get basic data from FLOSS projects using the source code repository. In this example we will study the Android code, using cvsanaly to get data from the repositories and R to create a couple of charts.

The Google developers created a tool called repo to deal with the different git repos that they are using in Android. I don’t like to install tools that I won’t use so I’ll bypass it with a couple of bash commands.

The repo command uses git://android.git.kernel.org/platform/manifest.git as its starting point. After cloning this repository you’ll see that it contains an XML file called default.xml with the following content:

  <project path="system/bluetooth" name="platform/system/bluetooth" />
  <project path="system/core" name="platform/system/core" />
  <project path="system/extras" name="platform/system/extras" />
  <project path="system/netd" name="platform/system/netd" />
  <project path="system/vold" name="platform/system/vold" />
  <project path="system/wlan/ti" name="platform/system/wlan/ti" />

The XML code above shows only some of the 159 references to git repositories. Without the repo command created by Google, developers would have to download them one by one or with a script. We will use awk and a simple bash script to extract them from the XML file and download them in one go.

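$ # The 4th quote-delimited field is the name attribute; drop empty lines, the XML header and copyfile entries
$ list=`awk -F '"' '{print $4}' default.xml | grep -v '^$' | grep -v "UTF-8" | grep -v "Makefile$"`
$ for i in $list
do
# Use the project name with "/" replaced by "_" as the local directory
j=`echo $i|sed 's:/:_:g'`
echo "git clone git://android.git.kernel.org/$i $j" >> get_repos.sh
done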
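
The awk one-liner depends on name being the second attribute of each project element. A slightly more defensive variant selects the attribute by name instead. This is only a sketch: list_project_names and print_clone_commands are hypothetical helpers, and they assume that every project element sits on a single line, as in the excerpt above.

```shell
#!/bin/bash
# Hypothetical helper: print the name attribute of every <project> element
# in the manifest file given as $1 (one element per line is assumed).
list_project_names() {
    grep '<project ' "$1" | sed -n 's/.*[[:space:]]name="\([^"]*\)".*/\1/p'
}

# Hypothetical helper: print one "git clone" command per project.
# $1 is the manifest file, $2 the base URL of the git server.
print_clone_commands() {
    list_project_names "$1" | while read -r name; do
        # Same naming scheme as above: "/" becomes "_" in the local directory
        echo "git clone $2/$name $(echo "$name" | tr '/' '_')"
    done
}
```

Running print_clone_commands default.xml git://android.git.kernel.org >> get_repos.sh should produce the same kind of lines as the loop above, without needing the extra grep filters for the XML header.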

Now just edit get_repos.sh and add the following lines at the beginning, and we have a script to download Android’s repositories. Don’t forget to give it execute permission.

#!/bin/bash
echo "getting android repos"

Easy, isn’t it? The next step is to run the script to download the 159 git repositories and, in the meantime, install cvsanaly. It has to be installed from source, but do not panic, it is straightforward:

At this point you are ready to start playing with the raw data extracted from all the git repositories in a single relational database. Stay tuned, the second chapter is coming soon.

UPDATE: the new release, Android 2.3, which was published a couple of days ago, uses 167 git repositories.

Read the second part

Written by sanacl

December 17, 2010 at 8:52 am