Rewriting Playdar: C++ to Erlang, massive savings

I’ve heard many anecdotes and claims about how many lines of code are saved when you write in Erlang instead of [C++/other language]. I’m happy to report that I now have first-hand experience and some data to share.

I initially wrote Playdar in C++ (using Boost and Asio libraries), starting back in February this year. I was fortunate to be working with some experienced developers who helped me come to terms with C++. There were three of us hacking on it regularly up until a few months ago, and despite being relatively new to C++, I’ll say that we ended up with a well-designed and robust codebase, all things considered.

On Feeling Smug

I’ll admit I felt rather smug making it all work in C++ with Boost and ASIO. Getting it to build on all three platforms and dynamically load extensions (DLLs etc) at runtime in a cross-platform way was also quite satisfying (I had plenty of help with that side of things). I learned a lot about C++, Boost, ASIO and CMake. But, as the codebase grew, I began to seriously question my decision to use C++.

My initial reasons for choosing C++ were twofold:

  • Distribution – shipping the Erlang VM didn’t sound like fun
  • Taglib – *the* library to read metadata from audio files (mp3, m4a, ogg etc) is C++

It turns out Playdar is naturally a good fit for Erlang – it does lots in parallel, and lots of stuff it does is asynchronous and event based. Even with all the stuff you get with Boost, multithreaded stuff in C++ is inelegant, to put it kindly.

SLOCed and Loaded

Anyway, a couple of weeks ago I sat down to re-implement Playdar from scratch in Erlang. I thrashed out the guts of it in a couple of days, and by the end of the week I almost had it 1:1 features with the C++ codebase. There’s still a bit of C++ left – code to interface with taglib.

Using the SLOCcount tool (SLOC = source lines of code), I counted the lines of code in various modules from both codebases. Here are the results:

Module | Erlang Version | C++ Version | Savings
Core Daemon | 1,100 | 4,491 | 75%
Library + Scanner | 197 + 167 (.cpp) | 1,355 | 73%
LAN Resolver | 105 | 427 | 75%
P2P | 463 | 1,762 | 74%
TOTAL | 2,032 | 8,035 | 75%


75% fewer lines of code using Erlang compared to C++ to implement the same thing – not too shabby :)
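The percentages can be re-derived from the raw counts. Here is a quick sketch in Python (not part of either codebase), assuming the 167 lines of remaining C++ taglib glue are counted on the Erlang side, as the table does. The results agree with the table to within a percentage point.

```python
# SLOC counts copied from the table above: (Erlang version, C++ version).
# Library + Scanner counts 197 lines of Erlang plus 167 lines of C++ glue.
sloc = {
    "Core Daemon": (1100, 4491),
    "Library + Scanner": (197 + 167, 1355),
    "LAN Resolver": (105, 427),
    "P2P": (463, 1762),
}


def savings(new, old):
    """Percentage of lines saved by the rewrite, to one decimal place."""
    return round(100 * (1 - new / old), 1)


# Column totals: (total Erlang-side lines, total C++ lines).
totals = tuple(sum(column) for column in zip(*sloc.values()))

for name, (erlang_lines, cpp_lines) in sloc.items():
    print(f"{name}: {savings(erlang_lines, cpp_lines)}%")
print(f"TOTAL: {totals[0]} vs {totals[1]}, {savings(*totals)}%")
```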

The second time around writing in Erlang I knew exactly what I was building, so it’s unfair to compare development time of the two codebases, but given how fast I can type I reckon I saved a good few hours of just pounding the keyboard to input the code (and countless hours of debugging: Erlang tends to work first time, really). Well I’m not sure if “saved” is the right word, considering it was working in C++ already, but it’s my time to waste :)

If you count the third party code bundled with both codebases (excluding Boost/Asio!) then the Erlang codebase saves a whopping 92%. I’m more interested in the savings in code I had to write, however.

Memory and CPU Usage

I’ve done some preliminary comparisons between the two projects. When it comes to CPU and memory usage, they are pretty similar. The Erlang codebase uses slightly more memory than C++ at the moment, but I’m convinced I can get that down to at least as low as the C++ project was. I picked up a few optimization tricks from my three-part Million-user comet experiment in Erlang earlier this year. I’ll post more about this if I learn any new tricks.

One thing I’ve realised about the Erlang codebase is that I’ve used processes to encapsulate state (active queries, specifically) where I didn’t really need to. It seemed sensible at the time, but it’s probably just a waste of memory. I’m going to change it to spawn processes to get the work done (ie, a process that runs the query) but not necessarily just to maintain state.

Distribution to the desktop

C++

You just have to make sure that you build everything and ship with any DLLs along with checks in the installer for system libraries needed (runtime dlls). Oh, and make sure you don’t change the plugin binary interface in the main app, or new plugins will crash and burn when you load them. Add a check for that. Oh and be careful about compiling taglib and stuff with mingw and the rest with VC++, or things might mysteriously crash. Also I heard a horror story about allocating memory in plugin code but deallocating it in the main app when the plugin was compiled against a different stdlib than the main app. This is all par for the course, and the experienced C++ developers I asked for help had no trouble making it work. Size of installable package: 2.5MB

Erlang

Compiling, and building/loading plugins in the Erlang codebase is straightforward on all platforms, as is often the way with VMs. I was against shipping the Erlang VM originally because I figured it would be a lot of hassle and increase the download size substantially. Packaging an Erlang app for the desktop involves taking the installed VM directory structure and stripping out all the docs, source and parts of the Erlang stdlib we don’t use, then packaging it along with the compiled Playdar code. CouchDB does something like this too, and RabbitMQ ships the Erlang VM without stripping unneeded libs. We’ll work on packaging some more (for all platforms), but to date Max has crafted a package that contains the necessary bits of the Erlang VM, a sexy Prefpane to start/stop the daemon on OS X, and the compiled Playdar code all weighing in under 10MB.

We’ll put together a Windows installer soon that’ll probably be around the same size. A 10MB download isn’t so bad nowadays, and I expect we can optimize the packaging process some more. Linux users will get a package that depends on the Erlang VM in their package manager.
Seems like shipping Erlang apps to the desktop isn’t so hard after all.

tl;dr

Someone rewrote a C++ app in Erlang: 75% fewer lines of code for same functionality.

You should read this blog post about Playdar, by Paul Lamere, and take a look at the Playdar website.

C++ codebase (deprecated)
Erlang codebase

Playdar is the future, and the future is written in Erlang :)

Wednesday, October 21st, 2009 playdar, programming 14 Comments

Erlang talk at London Hackspace

Last night I gave an “Intro to Erlang” talk at a London Hackspace meetup. I did a quick audience survey first: about 75% did “web programming” (Ruby, Python, PHP, etc.). Around 30% admitted to regularly using C/C++/Java or doing desktop/mobile app development. Less than 10% had much experience with functional programming.

I wanted to impress upon the audience that Erlang is a practical language, built by Ericsson with a specific purpose in mind. You use Erlang to build useful, scalable and reliable distributed systems in the real world. This was worth pointing out because when many people hear “functional programming” they immediately think of eccentric bearded academics proving the validity of their Haskell code and comparing Monads.

I skipped through the basics of sequential programming in Erlang pretty quickly and tried to spend most of the time showing how you handle processes and send messages. I built a basic Erlang server process that kept a count of how many operations it had done, explaining how it passes state to itself on every loop. Hopefully this helped some people grok how you can build servers that keep a global state by using recursion. I also showed off hot code reloading. We added another feature to the server and upgraded it without stopping it.

You can download the code I used (see link at the end) if you want to try out the examples from last night yourself. The last code I showed was an example of doing the same thing using gen_server, so hopefully if you followed along you’ll have a good understanding of what gen_server is and why it exists.

Hot code reloading example

I can’t write a post about Erlang without including some code, so here’s the basic example I used showing how hot code reloading works:

-module(ex09).
-export([start/0, loop/2, client/3]).

start() -> spawn(?MODULE, loop, [0,0]).

loop(Ops,Wtfs) ->
 receive
   {Client, double, Num} ->
     Client ! Num * 2,
     loop(Ops+1, Wtfs);

   {Client, square, Num} ->
     Client ! Num * Num,
     loop(Ops+1, Wtfs);

   {Client, _, _Num} ->
     Client ! wtf,
     loop(Ops, Wtfs+1);

   reload ->
     io:format("Reloading~n"),
     ?MODULE:loop(Ops, Wtfs);

   stats ->
     io:format("Ops: ~p, Wtfs: ~p ~n", [Ops, Wtfs]),
     loop(Ops, Wtfs)
 end.

% basic client API:

client(Pid, Cmd, Num) ->
 Pid ! {self(), Cmd, Num},
 receive
   Ans -> Ans
 after 1000 -> timeout
 end.

And if you were following along you saw something like this:

1> c(ex09).
{ok,ex09}
2> Pid = ex09:start().
<0.38.0>
3> Pid ! stats.
Ops: 0, Wtfs: 0
stats
4> ex09:client(Pid, double, 10).
20
5> ex09:client(Pid, triple, 10).
wtf

At this point we added support for “triple” to the example and showed how the fully-qualified call to loop (using the modulename:fun() instead of fun() syntax) causes the newest version of the module to be used:

6> c(ex09).
{ok,ex09}
7> ex09:client(Pid, triple, 10).
wtf
8> Pid ! reload.
Reloading
reload
9> ex09:client(Pid, triple, 10).
30
10> Pid ! stats.
Ops: 2, Wtfs: 2
stats

You can see from the stats at the end that the global state was kept – the server process stayed running during the code upgrade.
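If Erlang is new to you, the same sequence of events can be mimicked in Python. This is only an analogy, not how the Erlang VM works internally: the `latest_code` dict and the `Server` class are made up for illustration. The server keeps running its snapshot of the code until it is told to reload, and its counters survive the switch.

```python
# Analogy only: the Erlang VM keeps old and new versions of a module around;
# here a module-level dict of handlers stands in for "the latest compiled code".
latest_code = {"double": lambda n: n * 2, "square": lambda n: n * n}


class Server:
    def __init__(self):
        self.ops, self.wtfs = 0, 0         # state that must survive upgrades
        self.code = dict(latest_code)      # the version this server is running

    def handle(self, cmd, num=None):
        if cmd == "reload":
            self.code = dict(latest_code)  # like calling ?MODULE:loop(Ops, Wtfs)
            return "reload"
        if cmd == "stats":
            return (self.ops, self.wtfs)
        fn = self.code.get(cmd)
        if fn is None:                     # unknown command
            self.wtfs += 1
            return "wtf"
        self.ops += 1
        return fn(num)


server = Server()
print(server.handle("double", 10))       # 20
print(server.handle("triple", 10))       # wtf
latest_code["triple"] = lambda n: n * 3  # "compile" a new version
print(server.handle("triple", 10))       # wtf (still running the old code)
server.handle("reload")
print(server.handle("triple", 10))       # 30
print(server.handle("stats"))            # (2, 2)
```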

Download

The slides, example code and basic mochiweb comet project we saw last night can be downloaded here. I should warn you that unless you saw my talk and the various explanations and disclaimers that went along with the code, it’s probably not a good place to start or learn from. Have a look at www.learnyousomeerlang.com or get one of the two excellent Erlang books.

London Hackspace

If you live in London you should know about this. Russ and Jonty (who I worked with at Last.fm for years) started London Hackspace: “We run a dedicated space for people to learn and build things in London.” There are workshops at hackspace meetups on topics ranging from Arduino and electronics hacking, to iPhone development, to Erlang and beyond. Their unofficial slogan could be “Beer & Hacking” – it’s a great place to meet people doing interesting things in London, and to learn new things.

http://london.hackspace.org.uk/

Playdar

Playdar is my pet project at the moment. I talked about this last night too. I wrote it in C++ using Boost, mainly as an excuse to do something serious in C++. I’ve since seen the error of my masochistic ways and in the last week I’ve tossed out the 10,000 lines of C++ and rewritten it in Erlang. I’m not quite finished, but once I have feature parity between the two codebases I’ll write an article comparing the two.  As you might expect, the Erlang codebase is far superior in almost every way.

http://www.playdar.org/

Thursday, October 8th, 2009 Uncategorized 1 Comment

Anti-RDBMS: A list of distributed key-value stores

Please Note: this was written January 2009 – see the comments for updates and additional information. A lot has changed since I wrote this.
- RJ

Perhaps you’re considering using a dedicated key-value or document store instead of a traditional relational database. Reasons for this might include:

  1. You’re suffering from Cloud-computing Mania.
  2. You need an excuse to ‘get your Erlang on’
  3. You heard CouchDB was cool.
  4. You hate MySQL, and although PostgreSQL is much better, it still doesn’t have decent replication. There’s no chance you’re buying Oracle licenses.
  5. Your data is stored and retrieved mainly by primary key, without complex joins.
  6. You have a non-trivial amount of data, and the thought of managing lots of RDBMS shards and replication failure scenarios gives you the fear.

Whatever your reasons, there are a lot of options to choose from. At Last.fm we do a lot of batch computation in Hadoop, then dump it out to other machines where it’s indexed and served up over HTTP and Thrift as an internal service (stuff like ‘most popular songs in London, UK this week’ etc). Presently we’re using a home-grown index format which points into large files containing lots of data spanning many keys, similar to the Haystack approach mentioned in this article about Facebook photo storage. It works, but rather than build our own replication and partitioning system on top of this, we are looking to potentially replace it with a distributed, resilient key-value store for reasons 4, 5 and 6 above.

This article represents my notes and research to date on distributed key-value stores (and some other stuff) that might be suitable as RDBMS replacements under the right conditions. I’m expecting to try some of these out and investigate further in the coming months.

Glossary and Background Reading

The Shortlist

Here is a list of projects that could potentially replace a group of relational database shards. Some of these are much more than key-value stores, and aren’t suitable for low-latency data serving, but are interesting none-the-less.

Name | Language | Fault-tolerance | Persistence | Client Protocol | Data model | Docs | Community
Project Voldemort | Java | partitioned, replicated, read-repair | Pluggable: BerkeleyDB, MySQL | Java API | Structured / blob / text | A | LinkedIn, no
Ringo | Erlang | partitioned, replicated, immutable | Custom on-disk (append only log) | HTTP | blob | B | Nokia, no
Scalaris | Erlang | partitioned, replicated, paxos | In-memory only | Erlang, Java, HTTP | blob | B | OnScale, no
Kai | Erlang | partitioned, replicated? | On-disk Dets file | Memcached | blob | C | no
Dynomite | Erlang | partitioned, replicated | Pluggable: couch, dets | Custom ascii, Thrift | blob | D+ | Powerset, no
MemcacheDB | C | replication | BerkeleyDB | Memcached | blob | B | some
ThruDB | C++ | Replication | Pluggable: BerkeleyDB, Custom, MySQL, S3 | Thrift | Document oriented | C+ | Third rail, unsure
CouchDB | Erlang | Replication, partitioning? | Custom on-disk | HTTP, json | Document oriented (json) | A | Apache, yes
Cassandra | Java | Replication, partitioning | Custom on-disk | Thrift | Bigtable meets Dynamo | F | Facebook, no
HBase | Java | Replication, partitioning | Custom on-disk | Custom API, Thrift, Rest | Bigtable | A | Apache, yes
Hypertable | C++ | Replication, partitioning | Custom on-disk | Thrift, other | Bigtable | A | Zvents, Baidu, yes


Why 5 of these aren’t suitable

What I’m really looking for is a low latency, replicated, distributed key-value store. Something that scales well as you feed it more machines, and doesn’t require much setup or maintenance – it should just work. The API should be that of a simple hashtable: set(key, val), get(key), delete(key). This would dispense with the hassle of managing a sharded / replicated database setup, and hopefully be capable of serving up data by primary key efficiently.
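To make that wish-list concrete, here is a toy sketch in Python of the interface I have in mind. It is not how any of the projects on the list work: `TinyKV` is a made-up name, the “nodes” are plain dicts in one process, and partitioning is a naive hash-modulo scheme with each key copied to the next node along.

```python
import hashlib


class TinyKV:
    """Toy partitioned, replicated key-value store; each 'node' is just a dict."""

    def __init__(self, num_nodes=4, replicas=2):
        self.nodes = [dict() for _ in range(num_nodes)]
        self.replicas = replicas

    def _owners(self, key):
        # Hash the key to pick a primary node, then use the following
        # nodes (wrapping around) as replicas.
        h = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
        first = h % len(self.nodes)
        return [self.nodes[(first + i) % len(self.nodes)]
                for i in range(self.replicas)]

    def set(self, key, val):
        for node in self._owners(key):
            node[key] = val

    def get(self, key):
        # Read from the first replica that has the key (None if nobody does).
        for node in self._owners(key):
            if key in node:
                return node[key]
        return None

    def delete(self, key):
        for node in self._owners(key):
            node.pop(key, None)


kv = TinyKV()
kv.set("user:RJ", "London")
kv._owners("user:RJ")[0].clear()   # simulate losing the primary node's data
print(kv.get("user:RJ"))           # still served by the replica
```

The hard parts (adding nodes, failure detection, repairing stale replicas) are exactly what this sketch leaves out, and what the projects below are trying to solve.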

Five of the projects on the list are far from being simple key-value stores, and as such don’t meet the requirements – but they are definitely worth a mention.

1) We’re already heavy users of Hadoop, and have been experimenting with HBase for a while. It’s much more than a KV store, but latency is too great to serve data to the website. We will probably use HBase internally for other stuff though – we already have stacks of data in HDFS.

2) Hypertable provides a similar feature set to HBase (both are inspired by Google’s Bigtable). They recently announced a new sponsor, Baidu – the biggest Chinese search engine. Definitely one to watch, but doesn’t fit the low-latency KV store bill either.

3) Cassandra sounded very promising when the source was released by Facebook last year. They use it for inbox search. It’s Bigtable-esque, but uses a DHT so doesn’t need a central server (one of the Cassandra developers previously worked at Amazon on Dynamo). Unfortunately it’s languished in relative obscurity since release, because Facebook never really seemed interested in it as an open-source project. From what I can tell there isn’t much in the way of documentation or a community around the project at present.

4) CouchDB is an interesting one – it’s a “distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API”. Data is stored in ‘documents’, which are essentially key-value maps themselves, using the data types you see in JSON. Read the CouchDB Technical Overview if you are curious how the web’s trendiest document database works under the hood. This article on the Rules of Database App Aging goes some way to explaining why document-oriented databases make sense. CouchDB can do full text indexing of your documents, and lets you express views over your data in Javascript. I could imagine using CouchDB to store lots of data on users: name, age, sex, address, IM name and lots of other fields, many of which could be null, and each site update adds or changes the available fields. In situations like that it quickly gets unwieldy adding and changing columns in a database, and updating versions of your application code to match. Although many people are using CouchDB in production, their FAQ points out they may still make backwards-incompatible changes to the storage format and API before version 1.0.

5) ThruDB is a document storage and indexing system made up of four components: a document storage service, indexing service, message queue and proxy. It uses Thrift for communication, and has a pluggable storage subsystem, including an Amazon S3 option. It’s designed to scale well horizontally, and might be a better option than CouchDB if you are running on EC2. I’ve heard a lot more about CouchDB than ThruDB recently, but it’s definitely worth a look if you need a document database. It’s not suitable for our needs for the same reasons as CouchDB.

Distributed key-value stores

The rest are much closer to being ‘simple’ key-value stores with low enough latency to be used for serving data used to build dynamic pages. Latency will be dependent on the environment, and whether or not the dataset fits in memory. If it does, I’d expect sub-10ms response time, and if not, it all depends on how much money you spent on spinning rust.

MemcacheDB is essentially just memcached that saves stuff to disk using a Berkeley database. As useful as this may be for some situations, it doesn’t deal with replication and partitioning (sharding), so it would still require a lot of work to make it scale horizontally and be tolerant of machine failure. Other memcached derivatives such as repcached go some way to addressing this by giving you the ability to replicate entire memcache servers (async master-slave setup), but without partitioning it’s still going to be a pain to manage.

Project Voldemort looks awesome. Go and read the rather splendid website, which explains how it works, and includes pretty diagrams and a good description of how consistent hashing is used in the Design section. (If consistent hashing butters your muffin, check out libketama – a consistent hashing library and the Erlang libketama driver). Project-Voldemort handles replication and partitioning of data, and appears to be well written and designed. It’s reassuring to read in the docs how easy it is to swap out and mock different components for testing. It’s non-trivial to add nodes to a running cluster, but according to the mailing-list this is being worked on. It sounds like this would fit the bill if we ran it with a Java load-balancer service (see their Physical Architecture Options diagram) that exposed a Thrift API so all our non-Java clients could use it.
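If consistent hashing is new to you, the idea fits in a few lines. The following is a generic Python sketch, not Voldemort’s or libketama’s actual implementation: the `Ring` class, the MD5-based hash and the choice of 100 points per node are all assumptions made for illustration. Each node is placed at many points on a ring, and a key belongs to the first node point found clockwise from the key’s hash.

```python
import bisect
import hashlib


def _hash(s):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(s.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, points_per_node=100):
        self.points_per_node = points_per_node
        self._ring = []                  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Place the node at many pseudo-random positions to even out the load.
        for i in range(self.points_per_node):
            bisect.insort(self._ring, (_hash(f"{node}-{i}"), node))

    def lookup(self, key):
        # First point clockwise from the key's position, wrapping at the end.
        i = bisect.bisect(self._ring, (_hash(key), ""))
        return self._ring[i % len(self._ring)][1]


ring = Ring(["node1", "node2", "node3"])
keys = [f"key{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.add("node4")
moved = [k for k in keys if ring.lookup(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all of them to node4")
```

The point of the exercise is the last line: adding a node only takes keys away from existing nodes (roughly a quarter of them here, I would expect), whereas with a plain hash-modulo scheme almost every key would change owner.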

Scalaris is probably the most face-meltingly awesome thing you could build in Erlang. CouchDB, Ejabberd and RabbitMQ are cool, but Scalaris packs by far the most impressive collection of sexy technologies. Scalaris is a key-value store – it uses a modified version of the Chord algorithm to form a DHT, and stores the keys in lexicographical order, so range queries are possible. Although I didn’t see this explicitly mentioned, this should open up all sorts of interesting options for batch processing – map-reduce for example. On top of the DHT they use an improved version of Paxos to guarantee ACID properties when dealing with multiple concurrent transactions. So it’s a key-value store, but it can guarantee the ACID properties and do proper distributed transactions over multiple keys.
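Why does key ordering matter? With keys kept in lexicographical order, a range query is just two binary searches. Here is a single-process Python sketch of that one idea; it has nothing to do with how Scalaris distributes keys across nodes, and `SortedKV` is a made-up name.

```python
import bisect


class SortedKV:
    """Key-value store that keeps its keys in lexicographical order."""

    def __init__(self):
        self._keys = []    # sorted list of keys
        self._vals = {}    # key -> value

    def set(self, key, val):
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = val

    def get(self, key):
        return self._vals.get(key)

    def range(self, start, end):
        """All (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._vals[k]) for k in self._keys[lo:hi]]


db = SortedKV()
db.set("user:bob", 2)
db.set("artist:radiohead", 3)
db.set("user:alice", 1)
# ';' sorts directly after ':' in ASCII, so this covers every "user:" key.
print(db.range("user:", "user;"))
```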

Oh, and to demonstrate how you can scale a webservice based on such a system, the Scalaris folk implemented their own version of Wikipedia on Scalaris, loaded in the Wikipedia data, and benchmarked their setup to prove it can do more transactions/sec on equal hardware than the classic PHP/MySQL combo that Wikipedia use. Yikes.

From what I can tell, Scalaris is only memory-resident at the moment and doesn’t persist data to disk. This makes it entirely impractical to actually run a service like Wikipedia on Scalaris for real – but it sounds like they tackled the hard problems first, and persisting to disk should be a walk in the park after you rolled your own version of Chord and made Paxos your bitch. Take a look at this presentation about Scalaris from the Erlang Exchange conference: Scalaris presentation video.

The remaining projects, Dynomite, Ringo and Kai are all, more or less, trying to be Dynamo. Of the three, Ringo looks to be the most specialist – it makes a distinction between small (less than 4KB) and medium-size data items (<100MB). Medium sized items are stored in individual files, whereas small items are all stored in an append-log, the index of which is read into memory at startup. From what I can tell, Ringo can be used in conjunction with the Erlang map-reduce framework Nokia are working on called Disco.
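The append-log-plus-index idea is simple enough to sketch. The Python below is my own illustration of the general technique, not Ringo’s on-disk format: the record layout (two 4-byte lengths, then key, then value) and the `AppendLog` name are invented. Writes only ever append, and the in-memory index is rebuilt by scanning the log once at startup.

```python
import os
import struct
import tempfile


class AppendLog:
    """Append-only log of key/value records with an in-memory offset index."""

    HEADER = struct.Struct(">II")   # key length, value length (big-endian)

    def __init__(self, path):
        self.path = path
        self.index = {}             # key -> (offset of value, value length)
        open(path, "ab").close()    # create the file if it doesn't exist
        self._rebuild_index()

    def _rebuild_index(self):
        # Scan the log once at startup; later records override earlier ones.
        with open(self.path, "rb") as f:
            while True:
                header = f.read(self.HEADER.size)
                if len(header) < self.HEADER.size:
                    break
                klen, vlen = self.HEADER.unpack(header)
                key = f.read(klen)
                self.index[key] = (f.tell(), vlen)
                f.seek(vlen, os.SEEK_CUR)   # skip over the value

    def put(self, key, value):
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            f.write(self.HEADER.pack(len(key), len(value)))
            f.write(key)
            self.index[key] = (f.tell(), len(value))
            f.write(value)

    def get(self, key):
        offset, vlen = self.index[key]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(vlen)


path = os.path.join(tempfile.mkdtemp(), "small-items.log")
log = AppendLog(path)
log.put(b"track:1", b"Karma Police")
log.put(b"track:1", b"Paranoid Android")   # an update is just another append
print(AppendLog(path).get(b"track:1"))     # index rebuilt from the file
```

Only the index has to fit in memory, which is presumably why it suits lots of small items.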

I didn’t find out much about Kai other than it’s rather new, and some mentions in Japanese. You can choose either Erlang ets or dets as the storage system (memory or disk, respectively), and it uses the memcached protocol, so it will already have client libraries in many languages.

Dynomite doesn’t have great documentation, but it seems to be more capable than Kai, and is under active development. It has pluggable backends including the storage mechanism from CouchDB, so the 2GB file limit in dets won’t be an issue. Also I heard that Powerset are using it, so that’s encouraging.

Summary

Scalaris is fascinating, and I hope I can find the time to experiment more with it, but it needs to save stuff to disk before it’d be useful for the kind of things we might use it for at Last.fm.

I’m keeping an eye on Dynomite – hopefully more information will surface about what Powerset are doing with it, and how it performs at a large scale.

Based on my research so far, Project-Voldemort looks like the most suitable for our needs. I’d love to hear more about how it’s used at LinkedIn, and how many nodes they are running it on.

What else is there?

Here are some other related projects:

If you know of anything I’ve missed off the list, or have any feedback/suggestions, please post a comment. I’m especially interested in hearing about people who’ve tested or are using KV-stores in lieu of relational databases.

UPDATE 1: Corrected table: memcachedb does replication, as per BerkeleyDB.

Monday, January 19th, 2009 programming 148 Comments

How we use IRC at Last.fm

Everyone that works at Last.fm is typically connected to our IRC server. We have different channels per team, as well as a company-wide channel, and a few channels dedicated to automated monitoring.

Sometimes it makes much more sense to discuss / ask questions on IRC instead of email, and it’s useful to be able to raise people who are not in the office. That said, the main reason I’m writing this post is to mention the dev-support bot we use: irccat.

IRCCat – Development support bot

The irccat bot joins all your channels, and waits for messages on a specified ip:port on your internal network. Anything you send to that port will be sent to IRC by the bot. IRCCat – as in, `cat` to IRC.

Using netcat, you can easily send events to irc from shell scripts:

$ echo "Something just happened" | nc -q0 somemachine 12345

That will send to the default channel only (first in the config file). You can direct messages to specific combinations of channels (#) or users (@) like so:

$ echo "#syschan Starting backup job" | nc -q0 somemachine 12345

$ echo "#musicteam,#legal,@alice New album uploaded: …" | nc -q0 somemachine 12345
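If you would rather send from a script than from the shell, the same thing is a few lines of Python. This is a sketch based only on the examples above: the hostname and port are the same placeholders, `"#default"` stands in for whatever channel comes first in your config, and `parse_irccat_line` is my reading of the addressing convention, not irccat’s actual parser (which is Java).

```python
import socket


def send_to_irccat(text, host="somemachine", port=12345):
    """Equivalent of `echo text | nc host port`: one line over TCP."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(text.encode("utf-8") + b"\n")


def parse_irccat_line(line, default_channel="#default"):
    """Split a line into (channels, users, message).

    An optional first word such as "#chan1,#chan2,@nick" names the targets;
    without one, the message goes to the default channel.
    """
    first, _, rest = line.partition(" ")
    if first.startswith(("#", "@")):
        targets = first.split(",")
        channels = [t for t in targets if t.startswith("#")]
        users = [t[1:] for t in targets if t.startswith("@")]
        return channels, users, rest
    return [default_channel], [], line


print(parse_irccat_line("#musicteam,#legal,@alice New album uploaded"))
print(parse_irccat_line("Something just happened"))
```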

Some of the things we automatically send to appropriate IRC channels:

  • SVN commits
  • JIRA issue tracker updates
  • Nagios alerts for monitored hosts and services
  • Deployment notices to testing/staging/production
  • Results of automated tests if something bad happens
  • Links to pics from security camfeed when someone opens the office door out of hours

We also post messages from automated backup jobs etc, which helps correlate such events with any unusual load spikes or glitches in usually-smooth graphs.

In addition to providing a cat-to-irc conduit, irccat will also hand off commands to a script you can provide. We use this to expose lookup tools and some admin functions to our support staff and developers. The handler script we use is PHP, and has access to our core website libs. Typing “?pokereleasenode”, “?lookup user RJ” or “?uncache artist Radiohead” is faster than writing a throw-away script, more accessible to non-developers, less hassle than a web interface and creates a public log so people can see what’s going on.

The bot is written in Java and is easy to build and configure; all the deps are included:

http://github.com/RJ/irccat/tree/master

Thursday, January 8th, 2009 programming 52 Comments

Getting to know ejabberd and writing modules

I started poking around in the ejabberd source code to see what I could learn. I couldn’t find much in the way of high level documentation that talks about how the various bits of ejabberd talk to each other, so I’m starting to piece it together myself.

After compiling ejabberd I made a PHP script I could use with the external authentication system. Here’s a version that supports just two hardcoded users:

ejabberd.cfg:
{auth_method, external}.
{extauth_program, "/tmp/auth.php"}.


auth.php:

#!/usr/bin/php
<?php
// ejabberd extauth script: two hardcoded users, requests on stdin, replies on stdout.
$fh  = fopen("php://stdin", 'r');
if(!$fh){
    die("Cannot open STDIN\n");
}
$users = array('user1'=>'password1', 'user2'=>'password2');

do{
    // Each request starts with a 2-byte big-endian length. fread, unlike
    // fgets, never stops early at a newline byte in binary data.
    $lenBytes = fread($fh, 2);
    if($lenBytes === false || strlen($lenBytes) < 2) break; // stdin closed
    $len = unpack('n', $lenBytes);
    $len = $len[1];
    if($len<1) continue;
    $msg = fread($fh, $len);
    // Limit the split so a password containing ':' stays in one piece.
    $toks=explode(':', $msg, 4);
    $method = array_shift($toks);
    switch($method){
        case 'auth':
            list($username, $server, $password) = array_pad($toks, 3, '');
            if(isset($users[$username]) && $users[$username] === $password){
                print pack("nn", 2, 1); // ok
            }else{
                print pack("nn", 2, 0); // fail
            }
            break;

        case 'isuser':
            list($username, $server) = array_pad($toks, 2, '');
            if(isset($users[$username])){
                print pack("nn", 2, 1); // yes
            }else{
                print pack("nn", 2, 0); // nope
            }
            break;

        default:
            print pack("nn", 2, 0);// fail
    }
}while(true);
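To poke at a script like this without starting ejabberd, it helps to build the frames by hand. Here is a small Python sketch of the framing as the script above reads and writes it: a 2-byte big-endian length followed by colon-separated fields, answered by a 2-byte length of 2 and a 2-byte result. The helper names are my own.

```python
import struct


def encode_request(*fields):
    """Frame e.g. ("auth", user, server, password) as the script expects it."""
    payload = ":".join(fields).encode("utf-8")
    return struct.pack(">H", len(payload)) + payload


def decode_response(data):
    """Return True if a 4-byte reply from the script reports success."""
    length, result = struct.unpack(">HH", data)
    if length != 2:
        raise ValueError("unexpected response length: %d" % length)
    return result == 1


request = encode_request("auth", "user1", "localhost", "password1")
print(request)                                # length prefix, then the fields
print(decode_response(b"\x00\x02\x00\x01"))   # the script's "ok" reply
```

Writing `request` to the script’s stdin (for example via `subprocess`) and feeding the four bytes it prints to `decode_response` should be enough for a quick smoke test.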


I stripped down the ejabberd config to just load what I considered the bare essentials. Here is the modules section I’m testing with:

From ejabberd.cfg:
{modules,
[
{mod_caps, []},
{mod_disco, []},
{mod_roster, []},
{mod_pubsub, [ % requires mod_caps
{access_createnode, pubsub_createnode},
{plugins, ["default", "pep"]}
]},
{mod_mnesiaweb, []},
{mod_thriftctl, []}
]}.

mod_disco deals with discovery, so clients can find out what the server supports. mod_roster deals with rosters (buddy lists etc) using mnesia. mod_pubsub is enabled because I want to use User Tune, an extension that lets you broadcast the name of the song you are playing to everyone in your roster. mod_caps provides XEP-0115 – an extension for broadcasting and dynamically discovering client, device, or generic entity capabilities. mod_caps is a requirement of mod_pubsub.

I’ve removed the module that allows users to register, although I made a few accounts first whilst testing. The last two modules, mod_mnesiaweb and mod_thriftctl are modules I wrote.

mod_mnesiaweb

To help figure out what’s going on inside of ejabberd, it’s useful to be able to easily browse the mnesia database. Yaws comes with an appmod that does this, called ymnesia. This ejabberd module will start yaws in embedded mode and run this appmod, enabling you to explore the mnesia database from a web browser.

Yaws observation: yaws didn’t appear to build ymnesia by default, I edited the Makefile in src and added “ymnesia” to the module list. Also, if ./configure fails, the package you are probably missing is libpam0g-dev

mod_mnesiaweb:

% Ejabberd module that runs yaws in embedded mode,
% and loads the ymnesia appmod for browsing mnesia.
-module(mod_mnesiaweb).
-author('rj@last.fm').

-include("/usr/local/lib/yaws/include/yaws.hrl").

-behaviour(gen_mod).
-export([start/2, stop/1]).

start(_Host, Opts) ->
    Port = gen_mod:get_opt(port, Opts, 8001),
    code:add_path("/usr/local/lib/yaws/ebin"),
    application:set_env(yaws, embedded, true),
    application:start(yaws),
    GC = yaws_config:make_default_gconf(false,"yawstest"),
    SC = #sconf{
        port = Port,
        servername = "ejabnesia",
        listen = {0,0,0,0},
        appmods = [{"showdb", ymnesia}],
        docroot = "wwwroot"
        },
    yaws_api:setconf(GC, [[SC]]),
    ok.

stop(_Host) ->
    application:stop(yaws),
    ok.


To compile it:
erlc -pa ${EJAB_SRC} -I ${EJAB_SRC} mod_mnesiaweb.erl
where EJAB_SRC is the ejabberd-2.X.X/src directory, after you’ve compiled from source (so the beams are there too).

Copy the resulting mod_mnesiaweb.beam to /var/lib/ejabberd/ebin so ejabberd finds it, and it should work. Hit up http://localhost:8001/showdb/ in your browser and you can explore the mnesia database.

Use the match syntax to filter tables. For example to find everyone in my roster, I use this in the input box next to roster:
{roster,{"RJ",'_', {'_','_',[]}}, '_','_','_','_','_','_','_','_'}

Not pretty, but it gets the job done. You can just view the entire table, copy a record then replace fields with '_' to build queries.
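If the match syntax looks cryptic, this is roughly what it does: a pattern has the same shape as a record, and '_' matches anything in that position. Here is a toy Python version of that rule. It is an illustration only, with shortened made-up records rather than the real roster record layout, and it ignores everything else mnesia match specifications can do.

```python
WILDCARD = "_"


def matches(pattern, record):
    """True if record fits pattern; '_' matches anything, tuples element-wise."""
    if pattern == WILDCARD:
        return True
    if isinstance(pattern, tuple):
        return (isinstance(record, tuple)
                and len(pattern) == len(record)
                and all(matches(p, r) for p, r in zip(pattern, record)))
    return pattern == record


# Hypothetical, much shorter records than ejabberd's real roster entries.
table = [
    ("roster", ("RJ", "alice"), "both"),
    ("roster", ("RJ", "bob"), "none"),
    ("roster", ("max", "alice"), "both"),
]

pattern = ("roster", ("RJ", "_"), "_")    # everyone in RJ's roster
print([rec for rec in table if matches(pattern, rec)])
```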

mod_thriftctl

Next up I wanted to try the Erlang Thrift bindings (written by the folks at Amie St.), and expose some useful functionality for controlling the server.

If you aren’t familiar with Thrift, I recommend reading about it first. In a nutshell, you write your API using an IDL (a .thrift file) and the thrift compiler creates client libraries, and server code in various different languages. It’s an RPC mechanism, and useful in a mixed environment.

mod_thriftctl.thrift:
#!/usr/local/bin/thrift -php -erl

struct JabberUser {
    1: string name,
    2: string server
}

service Ejabthrift {
    /* add ruser to roster of luser, and vice versa. also routes presence to users if online */
    void add_friend( 1: JabberUser luser,
                     2: JabberUser ruser
                   ),

    /* remove ruser from luser's roster */
    void remove_friend( 1: JabberUser luser, 2: JabberUser ruser ),

    /* make it look like fromuser sent a message to touser */
    void spoof_message( 1: JabberUser fromuser, 2: JabberUser touser, 3: string message, 4: string subject ),
    /* .. or a chat message */
    void spoof_chat( 1: JabberUser fromuser, 2: JabberUser touser, 3: string message, 4: string thread ),

    /* sends PEP usertune message, see http://xmpp.org/extensions/xep-0118.html */
    void publish_np ( 1: JabberUser fromuser, 2: string artist, 3: string album, 4: string track, 5: i32 tracklength, 6: i32 tracknum )
}

Run that .thrift file, and you get gen-php and gen-erl directories, containing the PHP client code and the Erlang files needed to build a server.
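On the Erlang side, the JabberUser struct shows up as a jabberUser record with name and server fields, which is how the module below uses it. A minimal sketch of working with such a record follows; the record definition here is a local stand-in (an assumption, the real one comes from the generated mod_thriftctl_types.hrl), and the module and function names are hypothetical:

```erlang
-module(jabber_user_demo).
-export([to_string/1, demo/0]).

%% Assumption: stand-in for the record that the generated
%% mod_thriftctl_types.hrl provides for the JabberUser struct.
-record(jabberUser, {name, server}).

%% Pull the fields out by pattern matching in the function head,
%% the same way add_friend/2 does in the module.
to_string(#jabberUser{name = Name, server = Server}) ->
    Name ++ "@" ++ Server.

demo() ->
    User = #jabberUser{name = "rj", server = "example.org"},
    to_string(User).
```

Matching on the record in the function head also means a call with anything other than a jabberUser record fails immediately with a function_clause error instead of producing a half-built JID.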

Here’s the ejabberd module, which starts a thrift server:

mod_thriftctl.erl:

%
% A module to control ejabberd with a thrift interface.
%
-module(mod_thriftctl).
-author('rj@last.fm').

% ejabberd headers:
-include("ejabberd.hrl").
-include("mod_roster.hrl").
-include("jlib.hrl").

% thrift server headers:
-include("thrift.hrl").
-include("transport/tSocket.hrl").
-include("protocol/tBinaryProtocol.hrl").
-include("server/tErlServer.hrl").
-include("transport/tErlAcceptor.hrl").

% we are an ejabberd module:
-behaviour(gen_mod).
-export([start/2, stop/1]).

% our thrift service:
-include("ejabthrift_thrift.hrl").
-include("mod_thriftctl_types.hrl").
-export([   add_friend/2, remove_friend/2,
            spoof_message/4, spoof_chat/4,
            publish_np/6
        ]).

% convert thrift Jabberuser into ejabberd jid
ju2jid(Jabberuser) when is_record(Jabberuser, jabberUser) ->
    #jid{ user=Jabberuser#jabberUser.name, server=Jabberuser#jabberUser.server, resource="",
          luser=Jabberuser#jabberUser.name, lserver=Jabberuser#jabberUser.server, lresource=""
        }.

spoof_message( FromU, ToU, Msg, Subject ) ->
    F = ju2jid(FromU),
    T = ju2jid(ToU),
    XmlBody = {xmlelement, "message",
               [
                {"from", jlib:jid_to_string(F)},
                {"to", jlib:jid_to_string(T)}
               ],
               [
               {xmlelement, "subject", [], [{xmlcdata, Subject}]},
               {xmlelement, "body", [], [{xmlcdata, Msg}]}
               ]
              },
    ejabberd_router:route(F, T, XmlBody).

spoof_chat( FromU, ToU, Msg, Thread ) ->
    F = ju2jid(FromU),
    T = ju2jid(ToU),
    XmlBody = {xmlelement, "message",
               [{"type", "chat"},
                {"from", jlib:jid_to_string(F)},
                {"to", jlib:jid_to_string(T)}
               ],
               [
               {xmlelement, "thread", [], [{xmlcdata, Thread}]},
               {xmlelement, "body", [], [{xmlcdata, Msg}]}
               ]
              },
    ejabberd_router:route(F, T, XmlBody).

publish_np( FromU, ArtistS, AlbumS, TrackS, LengthI, TrackNumI ) ->
    From = ju2jid(FromU),
    % The usertune message must contain binaries, not strings or ints
    FromStr     = jlib:jid_to_string(From),
    Artist      = list_to_binary(ArtistS),
    Album       = list_to_binary(AlbumS),
    Track       = list_to_binary(TrackS),
    Length      = list_to_binary(io_lib:format("~w",[LengthI])),
    TrackNum    = list_to_binary(io_lib:format("~w",[TrackNumI])),
    Xml = {xmlelement,"iq",
                [{"from", FromStr},
                 {"type","set"},
                 {"id","pub1"}],
                [{xmlcdata,<<"\n  ">>},
                 {xmlelement,"pubsub",
                  [{"xmlns","http://jabber.org/protocol/pubsub"}],
                  [{xmlcdata,<<"\n    ">>},
                   {xmlelement,"publish",
                    [{"node","http://jabber.org/protocol/tune"}],
                    [{xmlcdata,<<"\n      ">>},
                     {xmlelement,"item",[],
                      [{xmlcdata,<<"\n        ">>},
                       {xmlelement,"tune",
                        [{"xmlns","http://jabber.org/protocol/tune"}],
                        [{xmlcdata,<<"\n          ">>},
                         {xmlelement,"artist",[],
                          [{xmlcdata, Artist}]},
                         {xmlcdata,<<"\n          ">>},
                         {xmlelement,"length",[],[{xmlcdata, Length}]},
                         {xmlcdata,<<"\n          ">>},
                         {xmlelement,"source",[],
                          [{xmlcdata, Album}]},
                         {xmlcdata,<<"\n          ">>},
                         {xmlelement,"title",[],
                          [{xmlcdata, Track}]},
                         {xmlcdata,<<"\n          ">>},
                         {xmlelement,"track",[],[{xmlcdata, TrackNum}]},
                         {xmlcdata,<<"\n        ">>}]},
                       {xmlcdata,<<"\n      ">>}]},
                     {xmlcdata,<<"\n    ">>}]},
                   {xmlcdata,<<"\n  ">>}]},
                 {xmlcdata,<<"\n">>}]},
    % PEP means you act as a pubsub node yourself,
    % so it's addressed to yourself and is broadcast to your friends automatically:
    ejabberd_router:route(From, From, Xml),
    ok.

% adds bi-directional friend relationship immediately for both users.
add_friend(     #jabberUser{name=LU, server=LS},
                #jabberUser{name=RU, server=RS}) ->
    AskMessage = "",
    Group = "",
    Subtype = both,
    subscribe(LU, LS, RU, RS, RU, Group, Subtype, AskMessage),
    subscribe(RU, RS, LU, LS, LU, Group, Subtype, AskMessage),
    % the roster record stores the subscription as an atom,
    % but the XML attribute value has to be a string:
    route_rosteritem(LU, LS, RU, RS, RU, Group, atom_to_list(Subtype)),
    route_rosteritem(RU, RS, LU, LS, LU, Group, atom_to_list(Subtype)),
    ok.

remove_friend( #jabberUser{name=LU, server=LS}, #jabberUser{name=RU, server=RS} ) ->
    unsubscribe(LU, LS, RU, RS),
    unsubscribe(RU, RS, LU, LS),
    route_rosteritem(LU, LS, RU, RS, "", "", "remove"),
    route_rosteritem(RU, RS, LU, LS, "", "", "remove"),
    ok.

unsubscribe(LocalUser, LocalServer, RemoteUser, RemoteServer) ->
    Key = {{LocalUser,LocalServer,{RemoteUser,RemoteServer,[]}},
       {LocalUser,LocalServer}},
    mnesia:transaction(fun() -> mnesia:delete(roster, Key, write) end).

route_rosteritem(LocalUser, LocalServer, RemoteUser, RemoteServer, Nick, Group, Subscription) ->
    LJID = jlib:make_jid(LocalUser, LocalServer, ""),
    RJID = jlib:make_jid(RemoteUser, RemoteServer, ""),
    ToS = jlib:jid_to_string(LJID),
    ItemJIDS = jlib:jid_to_string(RJID),
    GroupXML = {xmlelement, "group", [], [{xmlcdata, Group}]},
    Item = {xmlelement, "item",
        [{"jid", ItemJIDS},
         {"name", Nick},
         {"subscription", Subscription}],
        [GroupXML]},
    Query = {xmlelement, "query", [{"xmlns", ?NS_ROSTER}], [Item]},
    Packet = {xmlelement, "iq", [{"type", "set"}, {"to", ToS}], [Query]},
    ejabberd_router:route(LJID, LJID, Packet).


subscribe(LocalUser, LocalServer, RemoteUser, RemoteServer, Nick, Group, Subscription, Xattrs) ->
    R = #roster{usj = {LocalUser,LocalServer,{RemoteUser,RemoteServer,[]}},
                us = {LocalUser,LocalServer},
                jid = {RemoteUser,RemoteServer,[]},
                name = Nick,
                subscription = Subscription, % none, to=you see him, from=he sees you, both
                ask = none, % out=send request, in=somebody requests you, none
                groups = [Group],
                askmessage = Xattrs, % example: [{"category","conference"}]
                xs = []
               },
    mnesia:transaction(fun() -> mnesia:write(R) end).

start(Host, Opts) ->
    ?INFO("mod_ejabthrift start().",[]),
    %% get options
    Port = gen_mod:get_opt(port, Opts, 9000),

    spawn(fun()-> thrift:start() end),
    ?INFO("mod_ejabthrift thrift:start().",[]),

    Handler   = ?MODULE,
    Processor = ejabthrift_thrift,

    TF = tBufferedTransportFactory:new(),
    PF = tBinaryProtocolFactory:new(),

    ServerTransport = tErlAcceptor,
    ServerFlavor    = tErlServer,

    Server = oop:start_new(ServerFlavor, [Port, Handler, Processor, ServerTransport, TF, PF]),

    case ?R0(Server, effectful_serve) of
    ok    ->
        ?INFO("mod_ejabthrift: Thrift server (~s) listening on port ~w",[Host, Port]),
        % put Server into process dictionary (needed for clean stop)
        put(thrift_server_reference, Server),
        ok;
    Error ->
        ?ERROR_MSG("mod_ejabthrift: Error starting thrift server: ~w", [Error]),
        Error
    end.

stop(_Host) ->
    ?C0(get(thrift_server_reference), stop),
    ok.


To build, first build the gen-erl code:

erlc -pa ${EJAB_SRC} -I ${EJAB_SRC} -I ${ERL_THRIFT}/include -I ./gen-erl -o ./gen-erl ./gen-erl/*.erl

Where ERL_THRIFT is the lib/erl directory from the amiethrift code, git://repo.or.cz/amiethrift.git

Then compile the module:

erlc -pa ${EJAB_SRC} -I ${EJAB_SRC} -I ${ERL_THRIFT}/include -I ./gen-erl *.erl

To install, copy all the beam files to the ejabberd ebin dir:

sudo cp *.beam gen-erl/*.beam /var/lib/ejabberd/ebin/

This is inspired by mod_xmlrpc, which is in ejabberd-modules. As you can see from the start function, that’s all it takes to start a thrift server. It’s now trivial to call into ejabberd from other languages. For example, if you started listening to a song using a flash player on the website, a PHP web service could make a user tune announcement on your behalf, or spoof messages from you boasting how much you love listening to Paris Hilton.
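To make the spoofing concrete, here is a small sketch of the stanza that a spoof_chat call ends up routing. It builds the same {xmlelement, ...} tuple shape as the module and prints it as a string. The renderer is a simplified, hypothetical stand-in (ejabberd has its own serializer), and the module name and addresses are made up:

```erlang
-module(stanza_demo).
-export([chat/4, to_xml/1, demo/0]).

%% Same tuple shape that spoof_chat/4 hands to ejabberd_router:route/3,
%% but taking the JIDs as plain strings.
chat(From, To, Msg, Thread) ->
    {xmlelement, "message",
     [{"type", "chat"}, {"from", From}, {"to", To}],
     [{xmlelement, "thread", [], [{xmlcdata, Thread}]},
      {xmlelement, "body", [], [{xmlcdata, Msg}]}]}.

%% Hypothetical, simplified renderer for illustration only.
to_xml({xmlcdata, Data}) when is_binary(Data) ->
    escape(binary_to_list(Data));
to_xml({xmlcdata, Data}) ->
    escape(Data);
to_xml({xmlelement, Name, Attrs, Children}) ->
    AttrStr = [[" ", K, "=\"", escape(V), "\""] || {K, V} <- Attrs],
    lists:flatten(["<", Name, AttrStr, ">",
                   [to_xml(C) || C <- Children],
                   "</", Name, ">"]).

%% Escape the characters that are special in XML text and attributes.
escape(Str) ->
    lists:flatmap(fun($<)  -> "&lt;";
                     ($>)  -> "&gt;";
                     ($&)  -> "&amp;";
                     ($\") -> "&quot;";
                     (C)   -> [C]
                  end, Str).

demo() ->
    to_xml(chat("rj@example.org", "bob@example.org", "hi & bye", "t1")).
```

stanza_demo:demo() returns a single message element with type, from and to attributes, and thread and body children, which is what the recipient's client would receive as if the sender had typed it.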

If anyone knows where I can read about the ejabberd architecture / design, so I don’t have to piece it all together myself, please let me know.


Sunday, November 23rd, 2008