Author Archives: Diederik

Setting up Amazon AWS instance to crunch Wikipedia XML dump files

In this post, I’ll describe the steps to set up an Amazon AWS instance and get access to XML Wikipedia dump files. There is a way to download wikidumps for any project / language; the data is from early 2009. I will … Continue reading

Posted in wikipedia | Leave a comment

Reminiscences of a young gamer and the comeback of the adventure genre in gaming

Back in 1984… I was six years old when my parents bought their first PC; it was 1984. They bought an Olivetti with two 5¼-inch disk drives, a CGA video adapter and a monitor with a phosphor green screen. As soon … Continue reading

Posted in gaming | Leave a comment

Twittering to End Dictatorship: Ensuring the Future of Web-based Social Movements

Update: this post was originally posted in June 2009. With the current events in Syria, it is important that we keep pressuring Western corporations not to sell surveillance technologies to these regimes. I have not updated the original post so … Continue reading

Posted in social movements, Twitter | Leave a comment

Using Hadoop to analyze the full Wikipedia dump files using WikiHadoop

Background Probably the largest free dataset available on the Internet is the full XML dump of the English Wikipedia. This dataset in its uncompressed form is about 5.5 TB and still growing. The sheer size of this dataset poses some serious … Continue reading

Posted in hadoop, wikipedia | 28 Comments

Configuring Cassandra multinode on Ubuntu 10.10

Many, many thanks to David Strauss for guiding me through configuring Cassandra; I thought I should share this knowledge. Installation of Cassandra Let’s start with a clean Ubuntu 10.10 64-bit installation. Before we can install Cassandra, make sure you … Continue reading

Posted in nosql | 2 Comments

Add Cairo support to iGraph 0.5

I have been working with iGraph 0.5.2 and I am really happy with its speed and diversity of algorithms. Of course, networks need to be visualized as well. iGraph does offer visualization capabilities, but you need Cairo installed. Unfortunately, installing … Continue reading

Posted in igraph | Leave a comment