[erlang-questions] Diagnosing problems with cowboy

Max Lapshin max.lapshin@REDACTED
Wed Dec 28 12:03:30 CET 2011


Erlyvideo crashes after about 3 hours of running. It happens only on
one client's machine.

This is the new version, which uses cowboy.

This is the beginning of erl_crash.dump:

=erl_crash_dump:0.1
Wed Dec 28 06:05:14 2011
Slogan: eheap_alloc: Cannot allocate 1824525600 bytes of memory (of type "heap").
System version: Erlang R15B (erts-5.9) [source] [64-bit] [smp:4:4] [async-threads:8] [hipe] [kernel-poll:true]
Compiled: Wed Dec 14 16:00:22 2011
Taints: crypto,jiffy,mpeg2_crc32,mpegts_reader
Atoms: 12502
=memory
total: 4067805288
processes: 3758136452
processes_used: 3758105607
system: 309668836
atom: 347633
atom_used: 344446
binary: 86113112
code: 8335129
ets: 730624



I've found only two interesting processes (look at their Reductions,
Stack+heap and Message queue length):

=proc:<0.244.0>
State: Garbing
Spawned as: cowboy_http_protocol:init/4
Spawned by: <0.133.0>
Started: Wed Dec 28 02:15:21 2011
Message queue length: 6577001
Number of heap fragments: 0
Heap fragment data: 0
Link list: [#Port<0.2722>, <0.239.0>, <0.133.0>, {from,<0.132.0>,#Ref<0.0.0.322>}]
Reductions: 25916636
Stack+heap: 24488375
OldHeap: 182452560
Heap unused: 22924313
OldHeap unused: 663469
Program counter: 0x00007f8e80a98998 (prim_inet:recv0/3 + 224)
CP: 0x0000000000000000 (invalid)


=proc:<0.250.0>
State: Waiting
Spawned as: cowboy_http_protocol:init/4
Spawned by: <0.133.0>
Started: Wed Dec 28 02:15:21 2011
Message queue length: 6596904
Number of heap fragments: 0
Heap fragment data: 0
Link list: [#Port<0.2784>, <0.239.0>, <0.133.0>, {to,<0.239.0>,#Ref<0.0.4.18567>}, {from,<0.132.0>,#Ref<0.0.0.355>}]
Reductions: 25538292
Stack+heap: 24488375
OldHeap: 182452560
Heap unused: 22856852
OldHeap unused: 110812
Program counter: 0x00007f8e80abd950 (gen:do_call/4 + 576)
CP: 0x0000000000000000 (invalid)
arity = 0



It looks like each of these processes was receiving about 500 messages
per second for almost 4 hours, without ever reading them.
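
The rate estimate can be checked from the dump itself: both processes
were started at 02:15:21 and the dump was written at 06:05:14. Below is
a small sketch of that arithmetic; the module and function names are
made up for illustration and are not part of erlyvideo or cowboy.

```erlang
-module(queue_rate).
-export([rate/3]).

%% Hypothetical helper: average number of queued messages per second,
%% given the process start time and the crash time (both as
%% calendar:datetime()) and the message queue length from the dump.
rate(Started, Crashed, QueueLen) ->
    Seconds = calendar:datetime_to_gregorian_seconds(Crashed)
            - calendar:datetime_to_gregorian_seconds(Started),
    QueueLen / Seconds.
```

Calling queue_rate:rate({{2011,12,28},{2,15,21}}, {{2011,12,28},{6,5,14}}, 6577001)
gives roughly 477 messages per second for <0.244.0>, and the same call
with 6596904 gives roughly 478 for <0.250.0>. This counts only the
messages that were still in the queue, so it is a lower bound.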

Either they subscribed to the same stream several times, or there was
some other problem.

How can I inspect this?
Currently I am thinking about adding a process that checks all
processes every 5 seconds and kills any whose message queue is longer
than 10 000 messages.
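
A minimal sketch of such a watchdog, assuming a plain spawned process
is good enough (no supervisor, no configuration). The module name
mq_watchdog and its functions are hypothetical; the only calls used are
the standard processes/0, process_info/2, exit/2, timer:sleep/1 and
error_logger:warning_msg/2. Before killing, it logs initial_call and
current_function, which may help to see who is being flooded.

```erlang
-module(mq_watchdog).
-export([start/0, start/2, overloaded/1]).

%% Defaults from the idea above: check every 5 seconds, limit 10 000.
start() ->
    start(5000, 10000).

start(IntervalMs, MaxLen) ->
    spawn(fun() -> loop(IntervalMs, MaxLen) end).

loop(IntervalMs, MaxLen) ->
    timer:sleep(IntervalMs),
    [begin
         error_logger:warning_msg("killing ~p: ~p queued messages, ~p~n",
                                  [Pid, Len, Info]),
         exit(Pid, kill)
     end || {Pid, Len, Info} <- overloaded(MaxLen)],
    loop(IntervalMs, MaxLen).

%% All processes (except the caller) whose queue is longer than MaxLen.
%% process_info/2 returns undefined for a dead process; the generator
%% pattern silently skips those.
overloaded(MaxLen) ->
    [{Pid, Len, process_info(Pid, [initial_call, current_function])}
     || Pid <- processes(),
        Pid =/= self(),
        {message_queue_len, Len} <- [process_info(Pid, message_queue_len)],
        Len > MaxLen].
```

mq_watchdog:overloaded(10000) can also be called by hand from a remote
shell to look at the offenders without killing anything. Note that this
only treats the symptom: it does not explain where the messages come
from.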


