Tuesday, July 24, 2012

Cassandra on a Raspberry Pi, 5 and 6 node insert stress tests:

Here's a quick update to the performance graphs for Cassandra on the Raspberry Pi: the results for 5 and 6 node inserts under a stress test.

I'm getting to the point in this project where I can start to build pseudo data centers and start to test performance there.

Thursday, July 19, 2012

Java performance on Raspbian vs Debian

Over the past few weeks I’ve been blogging about my experience of running Apache Cassandra on the Raspberry Pi.  I plan to use the Pi as an educational resource at the university where I work, hopefully giving students the chance to play with large clusters and experiment with configurations, database models and practices in a NoSQL environment.  Of course performance isn’t great, but for me it’s a cheap way of getting lots of nodes and working on real network configuration problems.

A couple of days ago a new Debian-based distro for the Pi, Raspbian “wheezy”, was released, and it is now the official Raspberry Pi Debian distro (I believe).  This is the first OS release for the Pi to take advantage of the Pi’s floating point hardware, which is going to make the OS a lot faster for general use.  I downloaded it for testing in my rig; sadly, this is a tale of woe.

Apache Cassandra is a Java application and needs a JRE in order to run.  I’ve always used an Oracle-supplied JVM, “Oracle’s Java SE for embedded”.


Sadly, it seems this can’t be used on Raspbian.  Trying to run it gives:

Java: error while loading shared libraries: libjli.so: cannot open shared object file: No such file or directory

It seems that this version of Java uses the soft float ABI (armel), which is incompatible with Raspbian (thanks to mpthompson on the Raspberry Pi forum for the information), so it’s looking like it can’t run.  Back to OpenJDK?
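The mismatch can be sketched in a few lines of Python. This is only a simplified illustration: it assumes the standard Debian architecture names, where armel is the soft float port and armhf (which Raspbian reports) is the hard float one, and the function names are made up for this post.

```python
# Debian architecture name -> floating point ABI used by its libraries.
FLOAT_ABI = {"armel": "soft", "armhf": "hard"}


def float_abi(debian_arch: str) -> str:
    """Return 'soft', 'hard' or 'unknown' for a Debian architecture name."""
    return FLOAT_ABI.get(debian_arch.strip(), "unknown")


def same_float_abi(binary_arch: str, system_arch: str) -> bool:
    """Simplified check: a binary can only link against the system's
    shared libraries if both were built for the same float ABI."""
    abi = float_abi(binary_arch)
    return abi != "unknown" and abi == float_abi(system_arch)
```

On a running system the architecture string can be read with dpkg --print-architecture. A soft float JVM on a hard float OS fails the check, which fits the libjli.so error above.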

But wait!  Why did I not use OpenJDK in the first place?

That’s simple: performance.  In my experience (and perhaps this is a configuration problem I’m not aware of) OpenJDK is a lot slower than the Oracle version.  And I mean a lot slower!  I set up two single node Cassandra server images, one with the old Debian image and Oracle Java, the other with Raspbian and OpenJDK.  I then ran stress tests from an Apple Air (something I’ve done many times!).  Here are the results.  The second column is interval_op_rate, which you want to be as high as possible; the third column is avg_latency, which you want to be as low as possible.

Raspbian and OpenJDK

>Lifeintheairage:bin Administrator$ ./stress -d -o insert -I DeflateCompressor
Unable to create stress keyspace: Keyspace names must be case-insensitively unique ("Keyspace1" conflicts with "Keyspace1")


Debian Squeeze and Java SE for embedded

>lifeintheairage:bin Administrator$ ./stress -d -o insert -I DeflateCompressor
Unable to create stress keyspace: Keyspace names must be case-insensitively unique ("Keyspace1" conflicts with "Keyspace1")

And a graph of interval_op_rate:

(red is Java SE for embedded, blue is OpenJDK)
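To compare two runs by numbers rather than by eye, something like the Python sketch below could average the two columns discussed above. It assumes the stress output is comma-separated with a header line naming interval_op_rate and avg_latency; the function name and the sample figures are invented for illustration and are not measurements from these tests.

```python
from statistics import mean

# Made-up sample in the assumed format; NOT real results from the Pi.
SAMPLE = """Unable to create stress keyspace: example warning line
total,interval_op_rate,avg_latency,elapsed_time
100,10,0.5,10
300,20,0.25,20
"""


def summarize(stress_output: str) -> dict:
    """Average interval_op_rate and avg_latency over all data rows."""
    header = None
    rows = []
    for line in stress_output.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if header is None:
            # Ignore everything until the header line shows up.
            if "interval_op_rate" in fields and "avg_latency" in fields:
                header = fields
            continue
        if len(fields) != len(header):
            continue  # not a data row (warning, blank line, ...)
        try:
            rows.append({k: float(v) for k, v in zip(header, fields)})
        except ValueError:
            continue  # row with non-numeric values
    if not rows:
        raise ValueError("no data rows found in stress output")
    return {
        "mean_op_rate": mean(r["interval_op_rate"] for r in rows),
        "mean_latency": mean(r["avg_latency"] for r in rows),
    }
```

Running summarize on the saved output of each test gives one pair of figures per setup: higher mean_op_rate is better, lower mean_latency is better.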

Java SE for embedded really is a lot faster for Apache Cassandra (and I wouldn’t be surprised if the same holds for other Java apps such as the Arduino IDE).  For now I need to stick with the Debian release; I hope it doesn’t become unsupported.  Hopefully someone can get in touch with Oracle and encourage them to support an official port of Java SE for embedded to the Raspberry Pi that works with the Raspbian libraries.