[erlang-questions] socket_closed_unexpectedly with clustered RabbitMQ in EC2

Youngkin, Rich
Wed Dec 10 23:56:46 CET 2014


Looking at this a bit more, it turns out that the default haproxy client
timeout was 50 seconds.  Changing this to other values (larger and smaller)
changed the frequency of the socket_closed_unexpectedly errors, so I think
I've answered my own question. Thanks to anyone who took a look at this in
the meantime.
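
For anyone hitting the same thing, here is a minimal sketch of where the
relevant settings live in haproxy.cfg. The section layout and the specific
values are assumptions for illustration only, not the configuration actually
used here; pick values that suit your own idle patterns.

```
defaults
    mode    tcp
    # Maximum time to wait for a connection to the backend to be established.
    timeout connect 5s
    # Maximum inactivity on the client side (the app's AMQP connection).
    # A value of 50s (or 50000, since bare numbers are milliseconds) matches
    # the 50-second cycle described above. 3h is only an illustrative value.
    timeout client  3h
    # Maximum inactivity on the server side (the RabbitMQ nodes).
    # Usually kept equal to the client timeout for long-lived TCP sessions.
    timeout server  3h
```

Whatever idle timeout the proxy uses, it needs to be longer than the interval
at which the client and broker exchange traffic (AMQP heartbeats, for
example), otherwise idle connections will still be closed by the proxy.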

Thanks,
Rich


On Wed, Dec 10, 2014 at 10:59 AM, Youngkin, Rich wrote:

> Hi,
>
> I'm using the Erlang amqp client and am getting a
> "socket_closed_unexpectedly" CRASH REPORT. This error happens regardless of
> whether the application is actively publishing or consuming messages (i.e.,
> even when the app is idle, but connected to RabbitMQ).  This doesn't happen
> in a non-EC2 environment or in a single-zone EC2 environment.
>
> Here are some environmental details:
>
>    1. My application is using Erlang R15B01
>    2. amqp_client-2.7.1
>    3. RabbitMQ 3.2.2 using Erlang R14B04
>
> Here are the details regarding the EC2 configuration:
>
>    1. 2 EC2 zones in the same EC2 region
>    2. 2 RabbitMQ instances per Zone - 4 total instances
>    3. All RabbitMQ instances are clustered (cluster_partition_handling
>    set to pause_minority).
>    4. 2 Application instances per Zone (my app) - 4 total instances
>    5. There is an haproxy between my app and the RabbitMQ cluster
>    6. My application is using Mnesia and it's clustered across all app
>    instances
>
> I've included the text from the SASL log below.
>
> As I stated above, the error message occurs whether my application is
> publishing or not.  It does happen more frequently while publishing is
> occurring.  It looks to me like AMQP is recovering/restarting the affected
> processes, but I'd like to better understand why it's occurring and fix it
> if possible.  The error is happening regularly, every 50 seconds. This
> seems significant.
>
> Any help is appreciated.
>
> Thanks!
> Rich
>
>
> =CRASH REPORT==== 8-Dec-2014::20:52:34 ===
>  crasher:
>    initial call: amqp_gen_connection:init/1
>    pid: <0.5460.10>
>    registered_name: []
>    exception exit: socket_closed_unexpectedly
>      in function  gen_server:terminate/6 (gen_server.erl, line 747)
>    ancestors: [<0.5459.10>,amqp_sup,<0.4172.0>]
>    messages: []
>    links: [<0.6264.0>,<0.5459.10>,#Port<0.7206767>]
>    dictionary: []
>    trap_exit: false
>    status: running
>    heap_size: 610
>    stack_size: 24
>    reductions: 717
>  neighbours:
>
>
> =SUPERVISOR REPORT==== 9-Dec-2014::20:55:15 ===
>      Supervisor: {<0.4172.0>,amqp_connection_sup}
>      Context:    child_terminated
>      Reason:     socket_closed_unexpectedly
>      Offender:   [{pid,<0.4173.0>},
>                   {name,connection},
>                   {mfa,
>                       {amqp_gen_connection,start_link,
>                           [amqp_network_connection,
>                            {amqp_params_network,<<"guest">>,<<"guest">>,
>                                <<"/">>,"10.199.30.169",5672,0,0,1000,infinity,
>                                none,
>                                [#Fun<amqp_auth_mechanisms.plain.3>,
>                                 #Fun<amqp_auth_mechanisms.amqplain.3>],
>                                [],[]},
>                            #Fun<amqp_connection_sup.0.39273983>,
>                            #Fun<amqp_connection_sup.2.54430129>,[]]}},
>                   {restart_type,intrinsic},
>                   {shutdown,brutal_kill},
>                   {child_type,worker}]
>
> =SUPERVISOR REPORT==== 9-Dec-2014::20:55:15 ===
>      Supervisor: {<0.4172.0>,amqp_connection_sup}
>      Context:    shutdown
>      Reason:     reached_max_restart_intensity
>      Offender:   [{pid,<0.4173.0>},
>                   {name,connection},
>                   {mfa,
>                       {amqp_gen_connection,start_link,
>                           [amqp_network_connection,
>                            {amqp_params_network,<<"guest">>,<<"guest">>,
>                                <<"/">>,"10.199.30.169",5672,0,0,1000,infinity,
>                                none,
>                                [#Fun<amqp_auth_mechanisms.plain.3>,
>                                 #Fun<amqp_auth_mechanisms.amqplain.3>],
>                                [],[]},
>                            #Fun<amqp_connection_sup.0.39273983>,
>                            #Fun<amqp_connection_sup.2.54430129>,[]]}},
>                   {restart_type,intrinsic},
>                   {shutdown,brutal_kill},
>                   {child_type,worker}]
>
> =PROGRESS REPORT==== 9-Dec-2014::20:55:15 ===
>           supervisor: {<0.4320.0>,amqp_connection_sup}
>              started: [{pid,<0.4321.0>},
>                        {name,connection},
>                        {mfa,
>                            {amqp_gen_connection,start_link,
>                                [amqp_network_connection,
>                                 {amqp_params_network,<<"guest">>,<<"guest">>,
>                                     <<"/">>,"10.199.30.169",5672,0,0,1000,infinity,
>                                     none,
>                                     [#Fun<amqp_auth_mechanisms.plain.3>,
>                                      #Fun<amqp_auth_mechanisms.amqplain.3>],
>                                     [],[]},
>                                 #Fun<amqp_connection_sup.0.39273983>,
>                                 #Fun<amqp_connection_sup.2.54430129>,[]]}},
>                        {restart_type,intrinsic},
>                        {shutdown,brutal_kill},
>                        {child_type,worker}]
>
> ...
>
>
>