[erlang-questions] mochiweb https/ssl
sasa
sasa555@REDACTED
Tue May 17 12:20:07 CEST 2011
Hello,
I spent last week fighting SSL problems without success. Eventually I
switched to stunnel-based SSL, which forwards to the Erlang HTTP server.
Below are the characteristics of the behavior I experienced last week.
They differ a bit from the original ones, but I didn't want to open a
new topic. I am posting this in case someone is stuck with similar
behavior.
1. The mochiweb http server has been running without problems since the
beginning of the year. It also serves a small number of SSL users. Both are
served by the same erlang application running on the same node.
2. Every day last week, in the morning, I would force the users to
use SSL (https) rather than http.
3. The server then works fine for a couple of hours.
4. At some point, the server starts responding very slowly. Restarting the
erlang application, the node, or even the machine doesn't help.
5. At the same time, the http server (which is the same application on the
same node) is working fine.
6. Kernel tuning didn't help.
7. I don't notice anything strange in the erlang log or at the OS level.
8. In my application log, it seems as if most connections are not even
established. The netstat output confirms this: the number of
connections is smaller than expected. It looks as if new requests are queued
somewhere, or some main accept/listen loop is processing them slowly.
9. Both CPU usage and memory usage are low.
After forcing users back to http, the problems disappeared. The server could
then serve the smaller number of SSL users normally.
It seems to me that the problem is somewhere in mochiweb/erlang. After
switching to stunnel, which sits in front of that same mochiweb application,
everything is working nicely. I should note that stunnel uses more CPU.
I am not quite sure why it is not working for me. I had load tests which
performed fine. Maybe the problem is that my real users are constantly
connecting/disconnecting, closing browsers during a request, etc. I didn't
load test this type of user.
In any case, the behavior described above occurred every day, and was
resolved by switching to stunnel SSL.
I am sorry it didn't work out for me. I would be happier using erlang SSL,
so I'll try again when a new version is released.
Regards,
Sasa
On Fri, May 6, 2011 at 12:02 AM, sasa <sasa555@REDACTED> wrote:
> Hello,
>
> First of all, I apologize if this question appears twice. I tried
> submitting it a couple of hours ago via google groups, but it didn't show up,
> so I'm trying via e-mail now.
>
> I am trying to develop an https comet (long polling) server using mochiweb. I
> have already written one such http server, which has been in production for a
> couple of months and serves no more than 1000 concurrent users, sending a
> broadcast message to all concurrent users at an average rate of 0.3
> broadcasts/sec.
>
> When I added ssl support, it essentially works when a smaller number of users
> (about 100-200) are connected. With a larger number, it stops responding
> correctly, i.e. some connection attempts are refused, requests time out, etc.
> Only the ssl/https part seems to be affected by this behavior.
>
> To better test this, I developed a small test mochiweb app which works as
> follows:
> 1. Spawn and register a singleton broadcast process.
> 2. For each request, the erlang process sends its pid to the broadcast
> process. Then it waits for a message from the broadcast process.
> 3. Every 5 seconds the broadcast process notifies all registered https
> response processes from 2, then clears its list of registered processes.
> 4. When a process from 2 receives the notification, it responds and finishes.
>
> In this test implementation, the response is a hardcoded short string. For
> the sake of brevity, I didn't include the test app code, but if necessary I
> can do so.
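
The test app code was not posted, so the following is only a minimal sketch of steps 1-4 in plain Erlang, without mochiweb. The module name `broadcast_sketch`, the registered name `broadcaster`, the message shapes, and the `demo/0` helper are assumptions made for illustration. In the real app, the mochiweb request process would presumably call something like `wait_for_broadcast/1` and then send the hardcoded response.

```erlang
-module(broadcast_sketch).
-export([start/1, wait_for_broadcast/1, demo/0]).

%% Step 1: spawn and register the singleton broadcast process.
%% IntervalMs would be 5000 for the 5 second interval described above.
start(IntervalMs) ->
    Pid = spawn(fun() ->
                    erlang:send_after(IntervalMs, self(), tick),
                    loop([], IntervalMs)
                end),
    true = register(broadcaster, Pid),
    Pid.

%% Waiting is the list of request processes registered since the last tick.
loop(Waiting, IntervalMs) ->
    receive
        {register, Pid} ->
            loop([Pid | Waiting], IntervalMs);
        tick ->
            %% Step 3: notify everyone, then start over with an empty list.
            lists:foreach(fun(P) -> P ! {broadcast, <<"hello">>} end, Waiting),
            erlang:send_after(IntervalMs, self(), tick),
            loop([], IntervalMs);
        stop ->
            ok
    end.

%% Steps 2 and 4: called from the request process. It registers itself,
%% blocks until the next broadcast, and returns the body to respond with.
wait_for_broadcast(TimeoutMs) ->
    broadcaster ! {register, self()},
    receive
        {broadcast, Body} -> {ok, Body}
    after TimeoutMs ->
        {error, timeout}
    end.

%% Hypothetical driver: three fake "request" processes wait for one broadcast.
demo() ->
    start(100),
    Parent = self(),
    [spawn(fun() -> Parent ! {done, wait_for_broadcast(1000)} end)
     || _ <- lists:seq(1, 3)],
    Results = [receive {done, R} -> R end || _ <- lists:seq(1, 3)],
    broadcaster ! stop,
    Results.
```

Because the broadcaster clears its list after every tick, each request is answered at most once and a long-polling client has to issue a new request to receive the next broadcast.
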
>
>
> I then deployed this test app to the server, and set up two client
> machines to load test the server. The load testing code is also written in
> erlang, and uses ibrowse for making https requests, since I had some
> problems with httpc. The test client basically adheres to the protocol
> described above. It spawns request processes, with each process making
> requests in an infinite loop. I gather success/failure stats in a
> singleton erlang process and print them out at regular intervals.
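
The statistics-gathering part of such a test client could look roughly like the sketch below. It is not the original client: `load_stats_sketch`, `run/3`, and the `RequestFun` argument (a stand-in for the real ibrowse https call, which is not reproduced here) are hypothetical, and the counts are returned once at the end rather than printed at regular intervals.

```erlang
-module(load_stats_sketch).
-export([run/3]).

%% Start Workers request processes that call RequestFun in an infinite loop,
%% let them run for DurationMs, then return {Succeeded, Failed}.
%% RequestFun is expected to return ok on success and anything else on failure.
run(Workers, RequestFun, DurationMs) ->
    Stats = spawn(fun() -> stats_loop(0, 0) end),
    Pids = [spawn(fun() -> worker_loop(Stats, RequestFun) end)
            || _ <- lists:seq(1, Workers)],
    timer:sleep(DurationMs),
    lists:foreach(fun(P) -> exit(P, kill) end, Pids),
    Stats ! {report, self()},
    receive
        {stats, Ok, Failed} -> {Ok, Failed}
    end.

%% One request process: make a request, report the outcome, repeat.
worker_loop(Stats, RequestFun) ->
    Result = try RequestFun() catch _:_ -> {error, crashed} end,
    case Result of
        ok -> Stats ! success;
        _  -> Stats ! failure
    end,
    worker_loop(Stats, RequestFun).

%% Singleton stats process: counts outcomes until asked for a report.
stats_loop(Ok, Failed) ->
    receive
        success -> stats_loop(Ok + 1, Failed);
        failure -> stats_loop(Ok, Failed + 1);
        {report, From} -> From ! {stats, Ok, Failed}
    end.
```

For example, `load_stats_sketch:run(4, fun() -> timer:sleep(10), ok end, 200)` returns a tuple whose second element is 0, since the stand-in request never fails.
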
>
> As in the production system, with http, the test server performs
> correctly. With https, when the number of concurrent requests reaches some
> threshold (about 1000), many requests are not served.
>
> On the client machine, I occasionally receive the following errors:
> =ERROR REPORT====
> ** State machine <0.17592.0> terminating
>
> ** Reason for termination =
> ** {badarg,[{erlang,byte_size,[undefined]},
> {ssl_tls1,split_secret,1},
> {ssl_tls1,prf,4},
> {ssl_handshake,master_secret,4},
> {ssl_connection,handle_resumed_session,2},
> {ssl_connection,next_state,3},
> {gen_fsm,handle_msg,7},
> {proc_lib,init_p_do_apply,3}]}
>
>
> While on the server I occasionally notice the following errors:
> =SUPERVISOR REPORT==== 5-May-2011::11:06:33 ===
> Supervisor: {local,ssl_connection_sup}
> Context: child_terminated
> Reason: {{badmatch,{resumed,undefined}},
> [{ssl_handshake,hello,4},
> {ssl_connection,hello,2},
> {ssl_connection,next_state,3},
> {gen_fsm,handle_msg,7},
> {proc_lib,init_p_do_apply,3}]}
> Offender: [{pid,<0.7044.2>},
> {name,undefined},
> {mfargs,{ssl_connection,start_link,undefined}},
> {restart_type,temporary},
> {shutdown,4000},
> {child_type,worker}]
>
>
> I have tried to tweak some OS parameters, as well as erlang parameters, but
> without success. After three days of experimenting, I am out of ideas and
> need some help or pointers.
>
> I am using Erlang R14B02 on 64 bit Ubuntu 10.04, all the machines are on
> the EC2 cloud.
>
> Following are sysctl.conf settings (I tried some more combinations without
> success):
> net.core.rmem_max = 16777216
> net.core.wmem_max = 16777216
> net.ipv4.tcp_rmem = 4096 87380 16777216
> net.ipv4.tcp_wmem = 4096 65536 16777216
> net.ipv4.tcp_syncookies = 1
>
> net.ipv4.tcp_mem = 50576 64768 98152
> net.core.netdev_max_backlog = 2500
> net.ipv4.netfilter.ip_conntrack_max = 1048576
>
> net.ipv4.ip_local_port_range = 1024 65535
>
> net.ipv4.tcp_fin_timeout = 10
>
>
>
> Following is ulimit -a output:
> core file size (blocks, -c) 0
> data seg size (kbytes, -d) unlimited
> scheduling priority (-e) 20
> file size (blocks, -f) unlimited
> pending signals (-i) 16382
> max locked memory (kbytes, -l) 64
> max memory size (kbytes, -m) unlimited
> open files (-n) 32000
> pipe size (512 bytes, -p) 8
> POSIX message queues (bytes, -q) 819200
> real-time priority (-r) 0
> stack size (kbytes, -s) 8192
> cpu time (seconds, -t) unlimited
> max user processes (-u) unlimited
> virtual memory (kbytes, -v) unlimited
> file locks (-x) unlimited
>
>
>
> These are mochiweb parameters:
> [
>   {port, 9999},
>   {name, https_test},
>   {ssl, true},
>   {ssl_opts, [
>     {cacertfile, "keys/cacert.pem"},
>     {certfile, "keys/cert.pem"},
>     {keyfile, "keys/cert.key"},
>     {depth, 0}
>   ]}
> ]
>
>
> To use it on a 443 port, I have set up iptables port forwarding
> iptables -t nat -I PREROUTING --source 0/0 --destination 0/0 -p tcp --dport 443 -j REDIRECT --to-ports 9999
>
>
> These are erl parameters:
> erl -P 268435456 -env ERL_MAX_PORTS 100000
>
>
>
> If anybody has any idea, I would be most grateful.
>
>
> Best regards,
> Sasa
>