Libre Software People's Front

don't confuse it with People's Front of Open Source

Posts Tagged ‘bitergia’

Elasticsearch Snapshots in DigitalOcean Spaces

Elasticsearch snapshots are a really nice feature that you should master if you work with the Elastic stack. At Bitergia we run Elasticsearch clusters on top of DigitalOcean droplets. This cloud provider offers an S3-compatible service named Spaces, which can be set up to store those snapshots. Unfortunately, documentation and examples are scarce, so I thought this blog post might save you some time.

So let’s see an example of how to create Elasticsearch 6.8.6 snapshots in DigitalOcean Spaces.

The first thing you need is to install the ‘repository-s3’ plugin on every node of your cluster; by the way, it is released under the Apache 2 License. The plugin can be installed using the plugin manager:

$ sudo bin/elasticsearch-plugin install repository-s3

If you are behind a proxy, you can also download the zip file and pass it to the plugin manager. More information is available in the documentation of the S3 Repository Plugin.

Now, go to DigitalOcean and get the parameters of your Space. These are the ones we need:

  • ID and Key: the secrets used to access the Space
  • endpoint: something like
  • bucket name: I will use sanacl-testing in the examples below

In order to finish with the setup of Elasticsearch we have to store the secrets and the endpoint. Modify the file `elasticsearch.yml` to add a parameter with the endpoint:

# S3 compatible
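
The setting line itself seems to be missing from the snippet above. A minimal sketch of how it could be added, assuming the standard repository-s3 client setting `s3.client.default.endpoint`; the helper name `set_s3_endpoint`, the config path and the `nyc3` region in the example are hypothetical, so use the endpoint shown in your own Space:

```shell
# Sketch: add the S3-compatible endpoint to an elasticsearch.yml file.
# Usage: set_s3_endpoint <path-to-elasticsearch.yml> <endpoint-host>
set_s3_endpoint() {
  config_file="$1"
  endpoint="$2"
  # Skip if the setting is already present, so the function is safe to re-run.
  if grep -q '^s3\.client\.default\.endpoint:' "$config_file" 2>/dev/null; then
    return 0
  fi
  printf '%s\n' '# S3 compatible' "s3.client.default.endpoint: $endpoint" >> "$config_file"
}

# Example (hypothetical path and region, run on every node):
# set_s3_endpoint /etc/elasticsearch/elasticsearch.yml nyc3.digitaloceanspaces.com
```

The function only appends when the key is absent, so running it twice on the same node should not duplicate the setting.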

Run the commands below (with the correct credentials) on all your nodes:

echo "$AWS_ACCESS_KEY_ID" | /usr/share/elasticsearch/bin/elasticsearch-keystore add --stdin s3.client.default.access_key
echo "$AWS_SECRET_ACCESS_KEY" | /usr/share/elasticsearch/bin/elasticsearch-keystore add --stdin s3.client.default.secret_key

Restart your Elasticsearch cluster and we are ready to start creating snapshots. If your cluster does not start, have a look at the logs, because one of the parameters above is probably incorrect.

Create the repository for the snapshots. The only mandatory setting is the bucket name:

$ cat query1.json
{
  "type": "s3",
  "settings": {
    "bucket": "sanacl-testing"
  }
}
$ curl -X PUT -k -H "Content-Type: application/json" -d @query1.json https://...:9200/_snapshot/es_backups
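
The repository step can also be wrapped in a small script. This is only a sketch: the function names `repo_body` and `register_repo` are hypothetical, and the cluster URL is passed in as an argument because the real host is not shown in this post. After registering, it calls the `_verify` endpoint so that each node checks it can reach the bucket:

```shell
# Sketch: build the repository definition and register/verify it.

# Print the JSON body for an S3 repository backed by the given bucket.
repo_body() {
  printf '{\n  "type": "s3",\n  "settings": {\n    "bucket": "%s"\n  }\n}\n' "$1"
}

# Register the repository, then ask the nodes to verify access to the bucket.
# Usage: register_repo <es-url> <repository-name> <bucket-name>
register_repo() {
  es_url="$1"; repo="$2"; bucket="$3"
  repo_body "$bucket" | curl -sS -k -X PUT -H "Content-Type: application/json" \
    -d @- "$es_url/_snapshot/$repo" &&
  curl -sS -k -X POST "$es_url/_snapshot/$repo/_verify"
}

# Example (ES_URL is a placeholder for your cluster address):
# register_repo "$ES_URL" es_backups sanacl-testing
```

If verification fails, the response usually points at wrong credentials or a wrong endpoint, which is quicker than discovering it on the first snapshot.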

Now, create your first snapshot:

curl -X PUT -k -H "Content-Type: application/json" https://....:9200/_snapshot/es_backups/1
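
By default the snapshot request returns immediately and the snapshot runs in the background. A minimal sketch, assuming you prefer to block until it finishes: the `wait_for_completion=true` parameter makes the call wait, and the response carries a `state` field (for example SUCCESS, PARTIAL or FAILED). The function names are hypothetical and the cluster URL is an argument:

```shell
# Sketch: take a snapshot, wait for it to finish and print its final state.

# Extract the "state" value from a snapshot response read on stdin.
snapshot_state() {
  sed -n 's/.*"state" *: *"\([A-Z_]*\)".*/\1/p' | head -n 1
}

# Usage: take_snapshot <es-url> <repository-name> <snapshot-name>
take_snapshot() {
  es_url="$1"; repo="$2"; name="$3"
  curl -sS -k -X PUT "$es_url/_snapshot/$repo/$name?wait_for_completion=true" |
    snapshot_state
}

# Example (ES_URL is a placeholder for your cluster address):
# take_snapshot "$ES_URL" es_backups 1
```

The `sed` extraction is deliberately crude; if `jq` is installed on your machine, it is a sturdier way to read the response.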

If you are using Search Guard and want to restore some content, you will need both the admin certificate and its key. I’ve tested it by restoring an index named ‘git’ like this:

$ cat query2.json
{
  "indices": "git,-searchguard",
  "include_global_state": false
}
$ curl -X POST -k -H "Content-Type: application/json" --cert admin_cert.pem --key admin_cert_key.pem -d @query2.json https://….:9200/_snapshot/es_backups/1/_restore
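
One thing to keep in mind is that Elasticsearch refuses to restore over an index that already exists and is open. A minimal sketch of closing the index first and then restoring it, reusing the certificate file names from the command above; the function names are hypothetical and the cluster URL is an argument:

```shell
# Sketch: close an existing index, then restore it from a snapshot.

# Print the restore request body for the given index selection.
restore_body() {
  printf '{\n  "indices": "%s",\n  "include_global_state": false\n}\n' "$1"
}

# Usage: restore_index <es-url> <repository-name> <snapshot-name> <index>
restore_index() {
  es_url="$1"; repo="$2"; snapshot="$3"; index="$4"
  # Close the index so the restore can overwrite it.
  curl -sS -k -X POST --cert admin_cert.pem --key admin_cert_key.pem \
    "$es_url/$index/_close" &&
  # Restore the index, leaving the Search Guard index untouched.
  restore_body "$index,-searchguard" | curl -sS -k -X POST \
    -H "Content-Type: application/json" \
    --cert admin_cert.pem --key admin_cert_key.pem \
    -d @- "$es_url/_snapshot/$repo/$snapshot/_restore"
}

# Example (ES_URL is a placeholder for your cluster address):
# restore_index "$ES_URL" es_backups 1 git
```

A successful restore reopens the index, so no extra step should be needed afterwards.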

And that’s all, folks. If you find a typo, don’t hesitate to let me know.

Have a nice day!

Written by sanacl

March 30, 2020 at 2:22 pm

First half year as a Bitergian

One year ago three of us were freezing in Brussels, attending FOSDEM 2012 and trying to get feedback on a couple of ideas we had for creating a start-up. Twelve months later here we are again on our way to FOSDEM, but this time we are part of Bitergia, the company we created.

The creation of the start-up has been a lesson in itself. During the first months of 2012 we lost count of the times we met at a very nice cafeteria in the south of Madrid to discuss the people we wanted involved, business models, potential clients and so on. One of our main concerns during these discussions was deciding on the main idea that would be the base of our business model. During our last months in the research group (late 2011) we started to receive requests from people interested in using the results of some research lines in their products. Offering interesting results as researchers was really nice; the new challenge was how to make money from those results and that expertise. Advance warning: this will be one of our critical problems or opportunities.

During summer 2012 we officially launched the company, focused mainly on offering software statistics about Libre Software communities and aimed mainly at companies involved in those projects. Since then we have been working on improving the tools developed by the research group and have started to carry out interesting studies of big projects like Mediawiki, OpenStack or Webkit. Right now we are reaching an interesting point where we want to dig deeper into the real work behind the data we currently have about total activity (commits, committers, issues, time-to-fix, blah, blah). I’m pretty sure we have more fun ahead 🙂

Did I talk about getting focused in the paragraphs above? One of our favourite dilemmas was defining how focused we wanted our business idea to be. As soon as we started the company we saw some opportunities to collaborate with a couple of entities supporting and improving software forges. From the very beginning we were very interested in Allura, Fusionforge and software forges in general, but we weren’t sure it would be a good idea to follow that path. Time has passed, and a few months ago we decided to add forges to the business idea, so currently we are working on supporting software forges and integrating them with our software analytics reports.

So far, the adventure has been really fun and I’m pretty sure this is because I have great partners. If someone asked me what lesson I take from this first stage of the company’s life I would say: once you have the idea, look for people who support you and motivate you to grow.

Good luck!

Written by sanacl

February 1, 2013 at 4:21 pm

Posted in Uncategorized

Analysis of reused code using FLOSS tools

Last week we attended LinuxTag in Berlin to give two talks. The first one, which I gave, was about identifying reused code between two FLOSS projects. The second one, given by Daniel Izquierdo, explained the importance of studying FLOSS software communities.

The main aim of my presentation was to show that it is possible (and easy!) to get very interesting results about the code shared between two FLOSS projects using FLOSS tools; the ones we used in this case were CCFinder, Cloc, Ninka and Grep. The study identified not only the common code but also the possible license issues. This kind of study can be interesting from different points of view, which I’ve summed up in the following questions:

  • how different are two software projects?
  • is it feasible to propose a merge of the code?
  • how is the derivative project using the original code?
  • are the licenses being respected? what about the copyright?
  • is the new project using new licenses that could be of interest to the team that created the original work? are they improving the code?
  • what changes did the second team make to the original code?
  • is your source code being adopted by a certain community?

The presentation is available here.

[This entry is part of the work I do in Bitergia and it is also available here]

Written by sanacl

May 29, 2012 at 11:05 am