[erlang-questions] mochiweb https/ssl

sasa <>
Fri May 6 00:02:45 CEST 2011


Hello,

First of all, I apologize if this question appears twice. I tried
submitting it a couple of hours ago via Google Groups, but it didn't
show up, so I'm trying via e-mail now.

I am trying to develop an HTTPS comet (long-polling) server using
mochiweb. I have already written one such HTTP server, which has been in
production for a couple of months and serves no more than 1000
concurrent users, sending a broadcast message to all of them at an
average rate of 0.3 broadcasts/sec.

When I added SSL support, it essentially works when a smaller number of
users (about 100-200) are connected. With a larger number, it stops
responding correctly, i.e. some connection attempts are refused,
requests time out, etc. Only the SSL/HTTPS part seems to be affected by
this behavior.

To test this better, I developed a small test mochiweb app which works
as follows:
1. Spawn and register a singleton broadcast process.
2. For each request, the Erlang process handling it sends its pid to the
broadcast process, then waits for a message from the broadcast process.
3. Every 5 seconds, the broadcast process notifies all registered HTTPS
response processes from step 2, then clears its list of registered
processes.
4. When a process from step 2 receives the notification, it responds and
finishes.

In this test implementation, the response is a hardcoded short string.
For the sake of brevity, I didn't include the test app code, but I can
do so if necessary.
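
The test app code itself is not included here. Purely as an
illustration of the four steps above, a minimal sketch could look like
the following. The module name, the registered name `broadcaster`, the
message shapes, and the response body are all assumptions made for the
sketch, and the mochiweb request handling is left out: in the real app,
wait_for_broadcast/1 would be called from the mochiweb request loop.

```erlang
-module(broadcast_sketch).
-export([start/1, wait_for_broadcast/1]).

%% Step 1: spawn and register the singleton broadcast process.
%% IntervalMs would be 5000 for the setup described above.
start(IntervalMs) ->
    Pid = spawn(fun() ->
                    erlang:send_after(IntervalMs, self(), tick),
                    loop([], IntervalMs)
                end),
    true = register(broadcaster, Pid),
    {ok, Pid}.

%% Broadcast process: collects waiting pids, notifies them on each tick.
loop(Waiting, IntervalMs) ->
    receive
        {register, Pid} ->
            loop([Pid | Waiting], IntervalMs);
        tick ->
            %% Step 3: notify everyone, then clear the list.
            lists:foreach(fun(Pid) -> Pid ! broadcast end, Waiting),
            erlang:send_after(IntervalMs, self(), tick),
            loop([], IntervalMs)
    end.

%% Steps 2 and 4: called in the per-request process. Registers with the
%% broadcaster and blocks until notified. The body is a hypothetical
%% hardcoded string standing in for the real response.
wait_for_broadcast(TimeoutMs) ->
    broadcaster ! {register, self()},
    receive
        broadcast -> {ok, <<"hello">>}
    after TimeoutMs ->
        timeout
    end.
```

Nothing in this sketch touches SSL; it only shows the message flow
between the request processes and the broadcast process.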


I then deployed this test app to the server and set up two client
machines to load test it. The load testing code is also written in
Erlang and uses ibrowse for making HTTPS requests, since I had some
problems with httpc. The test client basically adheres to the protocol
described above. It spawns request processes, each of which makes
requests in an infinite loop. I gather success/failure stats in a
singleton Erlang process and print them out at regular intervals.
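
To make the client structure concrete, here is a rough sketch of it,
not the actual load testing code. The module name, the registered name
`load_stats`, and the message shapes are assumptions. The request
itself is passed in as a fun so that the sketch does not depend on
ibrowse; with ibrowse, that fun could be something like
fun() -> ibrowse:send_req(Url, [], get) end.

```erlang
-module(load_sketch).
-export([start_stats/1, start_workers/2, get_stats/0]).

%% Singleton stats process: counts results and prints them periodically.
start_stats(PrintIntervalMs) ->
    Pid = spawn(fun() ->
                    erlang:send_after(PrintIntervalMs, self(), print),
                    stats_loop(0, 0, PrintIntervalMs)
                end),
    true = register(load_stats, Pid),
    {ok, Pid}.

stats_loop(Ok, Failed, Interval) ->
    receive
        success -> stats_loop(Ok + 1, Failed, Interval);
        failure -> stats_loop(Ok, Failed + 1, Interval);
        print ->
            io:format("ok: ~p, failed: ~p~n", [Ok, Failed]),
            erlang:send_after(Interval, self(), print),
            stats_loop(Ok, Failed, Interval);
        {get, From, Ref} ->
            From ! {Ref, {Ok, Failed}},
            stats_loop(Ok, Failed, Interval)
    end.

%% Returns the current {Ok, Failed} counters.
get_stats() ->
    Ref = make_ref(),
    load_stats ! {get, self(), Ref},
    receive {Ref, Stats} -> Stats after 5000 -> timeout end.

%% Spawns N request processes, each looping forever.
start_workers(N, RequestFun) ->
    [spawn(fun() -> worker_loop(RequestFun) end) || _ <- lists:seq(1, N)].

worker_loop(RequestFun) ->
    %% Only a 200 response counts as success; any other return value
    %% or exception counts as a failure.
    Result = try RequestFun() of
                 {ok, "200", _Headers, _Body} -> success;
                 _Other -> failure
             catch
                 _:_ -> failure
             end,
    load_stats ! Result,
    worker_loop(RequestFun).
```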

As in the production system, with HTTP the test server performs
correctly. With HTTPS, when the number of concurrent requests reaches
some threshold (about 1000), many requests are not served.

On the client machine, I occasionally receive the following errors:
=ERROR REPORT==== 
** State machine <0.17592.0> terminating

** Reason for termination = 
** {badarg,[{erlang,byte_size,[undefined]},
           {ssl_tls1,split_secret,1},
           {ssl_tls1,prf,4},
           {ssl_handshake,master_secret,4},
           {ssl_connection,handle_resumed_session,2},
           {ssl_connection,next_state,3},
           {gen_fsm,handle_msg,7},
           {proc_lib,init_p_do_apply,3}]}


On the server, I occasionally notice the following errors:
=SUPERVISOR REPORT==== 5-May-2011::11:06:33 ===
    Supervisor: {local,ssl_connection_sup}
    Context:    child_terminated
    Reason:     {{badmatch,{resumed,undefined}},
                 [{ssl_handshake,hello,4},
                  {ssl_connection,hello,2},
                  {ssl_connection,next_state,3},
                  {gen_fsm,handle_msg,7},
                  {proc_lib,init_p_do_apply,3}]}
    Offender:   [{pid,<0.7044.2>},
                 {name,undefined},
                 {mfargs,{ssl_connection,start_link,undefined}},
                 {restart_type,temporary},
                 {shutdown,4000},
                 {child_type,worker}]


I have tried tweaking some OS parameters as well as Erlang parameters,
but without success. After three days of experimenting, I am out of
ideas and need some help or pointers.

I am using Erlang R14B02 on 64-bit Ubuntu 10.04; all the machines are on
the EC2 cloud.

The following are the sysctl.conf settings (I tried some more
combinations without success):
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_syncookies = 1

net.ipv4.tcp_mem = 50576   64768   98152
net.core.netdev_max_backlog = 2500
net.ipv4.netfilter.ip_conntrack_max = 1048576

net.ipv4.ip_local_port_range = 1024 65535

net.ipv4.tcp_fin_timeout = 10



The following is the ulimit -a output:
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 20
file size               (blocks, -f) unlimited
pending signals                 (-i) 16382
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 32000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited



These are the mochiweb parameters:
[
  {port, 9999},
  {name, https_test},
  {ssl, true},
  {ssl_opts, [
    {cacertfile, "keys/cacert.pem"},
    {certfile, "keys/cert.pem"},
    {keyfile, "keys/cert.key"},
    {depth, 0}
  ]}
]


To use it on port 443, I have set up iptables port forwarding:
iptables -t nat -I PREROUTING --source 0/0 --destination 0/0 -p tcp \
    --dport 443 -j REDIRECT --to-ports 9999


These are the erl parameters:
erl -P 268435456 -env ERL_MAX_PORTS 100000
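
One way to check from inside the running node which limits are actually
in effect is sketched below. The module name is made up for
illustration. It only reads values through erlang:system_info/1 and
os:getenv/1 and changes nothing.

```erlang
-module(limits_sketch).
-export([report/0]).

%% Returns the limits the running VM actually sees.
report() ->
    [%% Maximum number of simultaneously existing processes.
     {process_limit, erlang:system_info(process_limit)},
     %% Number of processes currently alive.
     {process_count, erlang:system_info(process_count)},
     %% Value of ERL_MAX_PORTS as seen by the VM, or false if unset.
     {erl_max_ports_env, os:getenv("ERL_MAX_PORTS")}].
```

Calling limits_sketch:report() in the shell of the server node returns
the three values as a property list.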



If anybody has any idea, I would be most grateful.


Best regards,
Sasa