A Million-user Comet Application with Mochiweb, Part 1

In this series I will detail what I found out empirically about how mochiweb performs with lots of open connections, and show how to build a comet application using mochiweb, where each mochiweb connection is registered with a router which dispatches messages to various users. We end up with a working application that can cope with a million concurrent connections and, crucially, we will know how much RAM we need to make it work.

In part one:

  • Build a basic comet mochiweb app that sends clients a message every 10 seconds.
  • Tune the Linux kernel to handle lots of TCP connections.
  • Build a flood-testing tool to open lots of connections (ye olde C10k test).
  • Examine how much memory this requires per connection.

Future posts in this series will cover how to build a real message routing system, additional tricks to reduce memory usage, and more testing with 100k and 1m concurrent connections.

I assume you know your way around the Linux command line, and know a bit of Erlang.

Building a Mochiweb test application

In brief:

  1. Install and build Mochiweb
  2. Run: /your-mochiweb-path/scripts/new_mochiweb.erl mochiconntest
  3. cd mochiconntest and edit src/mochiconntest_web.erl

This code (mochiconntest_web.erl) just accepts connections and uses chunked transfer to send an initial welcome message, and one message every 10 seconds to every client.

  1. -module(mochiconntest_web).
  2. -export([start/1, stop/0, loop/2]).
  3. %% External API
  4. start(Options) ->
  5.     {DocRoot, Options1} = get_option(docroot, Options),
  6.     Loop = fun (Req) ->
  7.                    ?MODULE:loop(Req, DocRoot)
  8.            end,
  9.     % we'll set our maximum to 1 million connections. (default: 2048)
  10.     mochiweb_http:start([{max, 1000000}, {name, ?MODULE}, {loop, Loop} | Options1]).
  11.  
  12. stop() ->
  13.     mochiweb_http:stop(?MODULE).
  14.  
  15. loop(Req, DocRoot) ->
  16.     "/" ++ Path = Req:get(path),
  17.     case Req:get(method) of
  18.         Method when Method =:= 'GET'; Method =:= 'HEAD' ->
  19.             case Path of
  20.                 "test/" ++ Id ->
  21.                     Response = Req:ok({"text/html; charset=utf-8",
  22.                                       [{"Server","Mochiweb-Test"}],
  23.                                       chunked}),
  24.                     Response:write_chunk("Mochiconntest welcomes you! Your Id: " ++ Id ++ "\n"),
  25.                     %% router:login(list_to_atom(Id), self()),
  26.                     feed(Response, Id, 1);
  27.                 _ ->
  28.                     Req:not_found()
  29.             end;
  30.         'POST' ->
  31.             case Path of
  32.                 _ ->
  33.                     Req:not_found()
  34.             end;
  35.         _ ->
  36.             Req:respond({501, [], []})
  37.     end.
  38.  
  39. feed(Response, Path, N) ->
  40.     receive
  41.         %{router_msg, Msg} ->
  42.         %    Html = io_lib:format("Recvd msg #~w: '~s'<br/>", [N, Msg]),
  43.         %    Response:write_chunk(Html);
  44.     after 10000 ->
  45.         Msg = io_lib:format("Chunk ~w for id ~s\n", [N, Path]),
  46.         Response:write_chunk(Msg)
  47.     end,
  48.     feed(Response, Path, N+1).
  49.  
  50. %% Internal API
  51. get_option(Option, Options) ->
  52.     {proplists:get_value(Option, Options), proplists:delete(Option, Options)}.


Start your mochiweb app

make && ./start-dev.sh

By default mochiweb listens on port 8000, on all interfaces. If you are doing this on the desktop, you can test with any web browser. Just navigate to http://localhost:8000/test/foo.

Here’s the command-line test:

$ lynx --source "http://localhost:8000/test/foo"
Mochiconntest welcomes you! Your Id: foo<br/>
Chunk 1 for id foo<br/>
Chunk 2 for id foo<br/>
Chunk 3 for id foo<br/>
^C

Yep, it works. Now let’s make it suffer.

Tuning the Linux Kernel for many TCP connections

Save yourself some time and tune the kernel TCP settings before testing with lots of connections, or your test will fail and you’ll see lots of "Out of socket memory" messages (and, if you are masquerading, "nf_conntrack: table full, dropping packet").

Here are the sysctl settings I ended up with – YMMV, but these will probably do:

# General gigabit tuning:
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_syncookies = 1
# this gives the kernel more memory for tcp
# which you need with many (100k+) open socket connections
net.ipv4.tcp_mem = 50576   64768   98152
net.core.netdev_max_backlog = 2500
# I was also masquerading the port comet was on, you might not need this
# (on newer kernels this key is net.netfilter.nf_conntrack_max)
net.ipv4.netfilter.ip_conntrack_max = 1048576

Put these in /etc/sysctl.conf then run sysctl -p to apply them. There is no need to reboot; your kernel should now be able to handle a lot more open connections, yay.
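
One thing that is easy to miss in the settings above: tcp_mem is counted in memory pages, not bytes. Here is a small sketch that converts the three thresholds into MiB. The helper name pages_to_mib is made up for this example, and the 4096-byte page size is an assumption (check yours with getconf PAGESIZE).

```shell
#!/bin/sh
# Convert a tcp_mem value (counted in pages) to whole mebibytes.
# Usage: pages_to_mib PAGES [PAGE_SIZE_BYTES]
# The page size defaults to 4096 bytes, which is an assumption.
pages_to_mib() {
    echo $(( $1 * ${2:-4096} / 1048576 ))
}

# The three tcp_mem thresholds from the settings above: low, pressure, max.
for pages in 50576 64768 98152; do
    echo "$pages pages = $(pages_to_mib "$pages") MiB"
done
```

With 4 KB pages the max threshold works out to roughly 383 MiB of memory for all TCP sockets together, which is a useful number to keep in mind when scaling the connection count up.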

Creating a lot of connections

There are many ways to do this. Tsung is quite sexy, and there are plenty of other less-sexy ways to spam an httpd with lots of requests (ab, httperf, httpload etc). None of them are ideally suited for testing a comet application, and I’d been looking for an excuse to try the Erlang http client, so I wrote a basic test to make lots of connections.

Just because you can doesn’t mean you should: one process per connection would definitely be a waste here. I’m using one process to load urls from a file, and another process to establish and receive messages from all http connections (and one process as a timer to print a report every 10 seconds). All data received from the server is discarded, but it does increment a counter so we can keep track of how many HTTP chunks were delivered.


floodtest.erl

-module(floodtest).
-export([start/2, timer/2, recv/1]).

% Note: on newer Erlang/OTP releases the inets HTTP client module
% is named httpc rather than http.
start(Filename, Wait) ->
    inets:start(),
    % timer process: asks us to print a stats report every 10 seconds
    spawn(?MODULE, timer, [10000, self()]),
    This = self(),
    % loader process: sends us one {loadurl, Url} message per line of the file
    spawn(fun()-> loadurls(Filename, fun(U)-> This ! {loadurl, U} end, Wait) end),
    recv({0,0,0}).

% Stats is {Active, Closed, Chunks}
recv(Stats) ->
    {Active, Closed, Chunks} = Stats,
    % print a report if the timer asked for one, without blocking
    receive
        {stats} -> io:format("Stats: ~w\n",[Stats])
        after 0 -> noop
    end,
    receive
        {http,{_Ref,stream_start,_X}} ->  recv({Active+1,Closed,Chunks});
        {http,{_Ref,stream,_X}} ->          recv({Active, Closed, Chunks+1});
        {http,{_Ref,stream_end,_X}} ->  recv({Active-1, Closed+1, Chunks});
        {http,{_Ref,{error,Why}}} ->
            io:format("Closed: ~w\n",[Why]),
            recv({Active-1, Closed+1, Chunks});
        {loadurl, Url} ->
            % asynchronous request; the response is streamed to this process
            http:request(get, {Url, []}, [], [{sync, false}, {stream, self}, {version, 1.1}, {body_format, binary}]),
            recv(Stats)
    end.

timer(T, Who) ->
    receive
    after T ->
        Who ! {stats}
    end,
    timer(T, Who).

% Read lines from a file with a specified delay between lines:
for_each_line_in_file(Name, Proc, Mode, Accum0) ->
    {ok, Device} = file:open(Name, Mode),
    for_each_line(Device, Proc, Accum0).

for_each_line(Device, Proc, Accum) ->
    case io:get_line(Device, "") of
        eof  -> file:close(Device), Accum;
        Line -> NewAccum = Proc(Line, Accum),
                for_each_line(Device, Proc, NewAccum)
    end.

loadurls(Filename, Callback, Wait) ->
    for_each_line_in_file(Filename,
        fun(Line, List) ->
            Callback(string:strip(Line, right, $\n)),
            % sleep for Wait milliseconds before loading the next URL
            receive
            after Wait ->
                noop
            end,
            List
        end,
        [read], []).



Each connection we make requires an ephemeral port, and thus a file descriptor, and by default the number of open file descriptors is limited to 1024. To avoid the "Too many open files" problem you’ll need to modify the ulimit for your shell. This can be changed in /etc/security/limits.conf, but requires a logout/login. For now you can just sudo and modify the current shell (su back to your non-priv’ed user after calling ulimit if you don’t want to run as root):

$ sudo bash
# ulimit -n 999999
# erl
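
Before launching erl it can be worth confirming that the limit really took effect in the shell you are about to use. A minimal sketch: the function name fd_limit_ok and the reserve of 64 descriptors are both assumptions made up for this example, not anything mochiweb or Erlang requires.

```shell
#!/bin/sh
# Succeeds if LIMIT file descriptors leave room for NEEDED connections
# plus a small reserve.
# Usage: fd_limit_ok NEEDED [LIMIT]
# LIMIT defaults to the current shell's `ulimit -n`.
fd_limit_ok() {
    needed=$1
    limit=${2:-$(ulimit -n)}
    reserve=64   # assumed headroom for stdio, log files, listen socket etc.
    [ "$limit" = "unlimited" ] && return 0
    [ "$limit" -ge $(( needed + reserve )) ]
}

if fd_limit_ok 10000; then
    echo "fd limit is high enough for 10000 connections"
else
    echo "fd limit $(ulimit -n) is too low for 10000 connections"
fi
```

Run it in the same shell that will start erl, since the limit is inherited per process and a fresh terminal will still have the old value.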

You might as well increase the ephemeral port range to the maximum too:
# echo "1024 65535" > /proc/sys/net/ipv4/ip_local_port_range
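
The port range matters because every outgoing connection to the same server address and port needs its own local port, so the size of the range caps how many connections a single client IP can hold open against one server. A quick sketch of the arithmetic (port_range_size is a hypothetical helper name):

```shell
#!/bin/sh
# Number of local ports in an inclusive range.
# Usage: port_range_size LOW HIGH
port_range_size() {
    echo $(( $2 - $1 + 1 ))
}

echo "1024-65535 gives $(port_range_size 1024 65535) usable local ports"

# Show the range currently configured, if this system exposes it.
range_file=/proc/sys/net/ipv4/ip_local_port_range
if [ -r "$range_file" ]; then
    read low high < "$range_file"
    echo "current range $low-$high gives $(port_range_size "$low" "$high") ports"
fi
```

That is plenty for a 10,000 connection test, but it is well short of a million, so bigger tests will need more than one client address.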

Generate a file of URLs to feed to the floodtest program:
( for i in `seq 1 10000`; do echo "http://localhost:8000/test/$i" ; done ) > /tmp/mochi-urls.txt

From the erlang prompt you can now compile and launch floodtest.erl:
erl> c(floodtest).
erl> floodtest:start("/tmp/mochi-urls.txt", 100).

This will establish 10 new connections per second (i.e., 1 connection every 100ms).
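
At that rate it takes a while before every connection is open, which is worth knowing before you start watching the stats. A rough sketch of the arithmetic, ignoring the time each request itself takes (ramp_seconds is a made-up helper name):

```shell
#!/bin/sh
# Seconds needed to open URLS connections with WAIT_MS between each one.
# Usage: ramp_seconds URLS WAIT_MS
ramp_seconds() {
    echo $(( $1 * $2 / 1000 ))
}

secs=$(ramp_seconds 10000 100)
echo "10000 urls at 100ms each: $secs seconds (about $(( secs / 60 )) minutes)"
```

So the 10,000 URL file needs roughly 1000 seconds to ramp up fully; pass a smaller Wait to floodtest:start/2 if you are impatient.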

It will output stats in the form {Active, Closed, Chunks} where Active is the number of connections currently established, Closed is the number that were terminated for some reason, and Chunks is the number of chunks served by chunked transfer from mochiweb. Closed should stay on 0, and Chunks should be more than Active, because each active connection will receive multiple chunks (1 every 10 seconds).


The resident size of the mochiweb beam process with 10,000 active connections was 450MB – that’s 45KB per connection. CPU utilization on the machine was practically nothing, as expected.
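
The per-connection figure is just the resident size divided by the connection count. The sketch below repeats that division and then projects it to larger tests. The projection assumes memory grows linearly with connections, which is an assumption and not something measured here, and both helper names are made up for this example.

```shell
#!/bin/sh
# KB per connection, using decimal units (1 MB = 1000 KB) as in the text.
# Usage: per_conn_kb TOTAL_MB CONNECTIONS
per_conn_kb() {
    echo $(( $1 * 1000 / $2 ))
}

# Projected total in MB, assuming memory scales linearly (an assumption).
# Usage: projected_mb KB_PER_CONN CONNECTIONS
projected_mb() {
    echo $(( $1 * $2 / 1000 ))
}

kb=$(per_conn_kb 450 10000)
echo "per connection: ${kb}KB"
echo "100k connections: $(projected_mb "$kb" 100000)MB"
echo "1M connections:   $(projected_mb "$kb" 1000000)MB"
```

At 45KB each, a million connections would need around 45GB of RAM if nothing changed, which is why reducing per-connection memory is the main job for the rest of the series.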

Assessment so far

That was a reasonable first attempt. 45KB per-connection seems a bit high – I could probably cook something up in C using libevent that could do this with closer to 4.5KB per connection (just a guess, if anyone has experience please leave a comment). If you factor in the amount of code and time it took to do this in Erlang compared with C, I think the increased memory usage is more excusable.


In future posts I’ll cover building a message router (so we can uncomment lines 25 and 41-43 in mochiconntest_web.erl) and talk about some ways to reduce the overall memory usage. I’ll also share the results of testing with 100k and 1M connections.

UPDATED: Part 2 and Part 3 are online now.


Wednesday, October 15th, 2008 programming

41 Comments to A Million-user Comet Application with Mochiweb, Part 1

  1. Great post! Thanks for taking the time to write up your experiences with comet and erlang. I look forward to more of them in the future!

  2. scott on October 16th, 2008
  3. Hi.

    I strongly advise you to try libevent’s http support. About a year ago I did a little project with it and within 25 minutes I had a working application able to handle 4000 to 5000 requests per second at 100 parallel clients testing, and with only about 15% of delay for handling variables with POST payload. Memory was O(log n) and I think it was significantly less than that. This on a single core 1.8GHz P4 desktop running GNU/Linux/Gentoo. A current server should be at least twice that.

    To have a more meaningful result, try to test from a separate box as your client will fight for resources with the server. Also I found localhost networking performance scales like O(log n) while external connectivity tends to be O(n) for whatever reasons (ethernet collisions, drivers, i/o out of main bus.)

    At libevent speeds you start looking at reusing memory buffers and trying to use more the stack instead of heap. It’s the killer feature for performance of non-interpreted languages.

    If what you need to do is simple and needs performance, C isn’t that hard to do. I don’t know erlang or mochiweb, but if you mail me some details (and it’s easy to mock) I might give it a shot :)

    In particular my guesstimate is you need less than 2k per connection, but probably taking more time to write this comment than to get some code out.

  4. Ale on October 16th, 2008
  5. Also a libevent based app will sure fit within a CPU’s 32k+32k L1 cache.

  6. Ale on October 16th, 2008
  7. Thanks – I am curious about the C/libevent implementation. Erlang is awesome for all the behind the scenes message routing and pubsub stuff, but a libevent based frontend httpd in C that talks to the erlang system could potentially save a lot of memory. I would like to do a followup post about that if I can find the time (I have a couple more erlang posts in mind first about message routing and pubsub stuff).

    I will put the code from these demos into github or something, perhaps someone else can whip up a libevent/C test :)

  8. RJ on October 16th, 2008
  9. Instead of writing a front-end using C/libevent why wouldn’t you just use Nginx?

    http://nginx.net/

  10. David Cancel on October 17th, 2008
  11. Yes, nginx is a very good option too. It is even slightly faster at the price of size and some other things. Code needs to be done either way. Why not lighttpd? Why not Apache httpd? :)

  12. Ale on October 17th, 2008
  13. http://www.danga.com/djabberd/#perf

    Djabberd says it got 300k connections in 1GB (~ 3.5 KB / connection).

  14. Luke on October 17th, 2008
  15. You should be able to do connections in less than 45kb by shrinking the amount of heap allocated by default so each of the processes using something like erlang:hibernate/3 or hacking on mochiweb a bit to ensure a particular min_heap_size on spawned processes.

  16. Bob Ippolito on October 17th, 2008
  17. Yep, hibernate features in my next post. Wasn’t suitable when each process is just blocking for 10s then sending.

  18. RJ on October 17th, 2008
  21. I noticed that (where you create a list of urls) it does not match with the code;

    I would expect to see the url http://localhost:8000/test/$i instead of http://localhost:8000/$i …. this due to this code:

    case Path of
    "test/" ++ Id ->

  22. Arjen Wiersma on October 17th, 2008
  25. Question 1: What is the total system memory usage difference? (Not just the process. In my tests system resources for 800k sockets took 450MB, program just about 25MB.)

    Question 2: What do you mean by “10000 active connections”? Connected sockets, HTTP sessions?

    I managed to freeze both my OSX and Gentoo looking for real limits :)

  26. Alecco on October 18th, 2008
  31. Great article.

  32. arie murdianto on October 22nd, 2008
  54. Mac OS X:
    (for ((i = 1;i

  55. Dmitriy on February 14th, 2009
  56. (for ((i = 1;i<=10000;i++)); do echo $i; done) >

  57. Dmitriy on February 14th, 2009
  64. Check out http://msgbus.googlecode.com/ for an example of this in C using libevent’s evhttp.

  65. j on March 7th, 2009
  74. Not working for me. Where can I find some help? I’m new to Erlang and mochiweb.

    My facts:
    1. The server is up and running, and the lynx test is working fine.

    2. After I sudo and raise ulimit, running erl takes about 450MB by itself, without running anything yet.

    3. Then running floodtest:start gives me lots of “Closed: econnrefused”.

    Any ideas?

  75. Manuel on October 13th, 2009
  76. I recommend asking for help in #erlang on irc.freenode.org – it’s full of helpful people (I’m RJ2 on freenode).

  77. RJ on October 28th, 2009
  80. Anyone know how to detect client disconnects not using a timeout? In this code, it would not “logout” immediately… I think it involves making the socket active instead of passive, but can’t figure it out. Richard, any thoughts? Thanks!

  81. Shayne on November 5th, 2009
