[erlang-questions] socket_closed_unexpectedly with clustered RabbitMQ in EC2

Youngkin, Rich richard.youngkin@REDACTED
Wed Dec 10 18:59:12 CET 2014


Hi,

I'm using the Erlang AMQP client and am getting a
"socket_closed_unexpectedly" CRASH REPORT. The error occurs whether or not
the application is actively publishing or consuming messages; it happens
even when the app is idle but connected to RabbitMQ.  It doesn't happen in a
non-EC2 environment or in a single-zone EC2 environment.

Here are some environmental details:

   1. My application is using Erlang R15B01
   2. amqp_client-2.7.1
   3. RabbitMQ 3.2.2 using Erlang R14B04

Here are the details regarding the EC2 configuration:

   1. 2 EC2 zones in the same EC2 region
   2. 2 RabbitMQ instances per Zone - 4 total instances
   3. All RabbitMQ instances are clustered (cluster_partition_handling set
   to pause_minority).
   4. 2 Application instances per Zone (my app) - 4 total instances
   5. There is an haproxy between my app and the RabbitMQ cluster
   6. My application is using Mnesia and it's clustered across all app
   instances

I've included the text from the SASL log below.

As I stated above, the error occurs whether or not my application is
publishing, although it happens more frequently while publishing is in
progress.  It looks to me like the AMQP client is recovering/restarting the
affected processes, but I'd like to understand why the error occurs and fix
it if possible.  The error recurs regularly, every 50 seconds, which seems
significant.

Any help is appreciated.

Thanks!
Rich


=CRASH REPORT==== 8-Dec-2014::20:52:34 ===
 crasher:
   initial call: amqp_gen_connection:init/1
   pid: <0.5460.10>
   registered_name: []
   exception exit: socket_closed_unexpectedly
     in function  gen_server:terminate/6 (gen_server.erl, line 747)
   ancestors: [<0.5459.10>,amqp_sup,<0.4172.0>]
   messages: []
   links: [<0.6264.0>,<0.5459.10>,#Port<0.7206767>]
   dictionary: []
   trap_exit: false
   status: running
   heap_size: 610
   stack_size: 24
   reductions: 717
 neighbours:


=SUPERVISOR REPORT==== 9-Dec-2014::20:55:15 ===
     Supervisor: {<0.4172.0>,amqp_connection_sup}
     Context:    child_terminated
     Reason:     socket_closed_unexpectedly
     Offender:   [{pid,<0.4173.0>},
                  {name,connection},
                  {mfa,
                      {amqp_gen_connection,start_link,
                          [amqp_network_connection,
                           {amqp_params_network,<<"guest">>,<<"guest">>,
                               <<"/">>,"10.199.30.169",5672,0,0,1000,infinity,
                               none,
                               [#Fun<amqp_auth_mechanisms.plain.3>,
                                #Fun<amqp_auth_mechanisms.amqplain.3>],
                               [],[]},
                           #Fun<amqp_connection_sup.0.39273983>,
                           #Fun<amqp_connection_sup.2.54430129>,[]]}},
                  {restart_type,intrinsic},
                  {shutdown,brutal_kill},
                  {child_type,worker}]

=SUPERVISOR REPORT==== 9-Dec-2014::20:55:15 ===
     Supervisor: {<0.4172.0>,amqp_connection_sup}
     Context:    shutdown
     Reason:     reached_max_restart_intensity
     Offender:   [{pid,<0.4173.0>},
                  {name,connection},
                  {mfa,
                      {amqp_gen_connection,start_link,
                          [amqp_network_connection,
                           {amqp_params_network,<<"guest">>,<<"guest">>,
                               <<"/">>,"10.199.30.169",5672,0,0,1000,infinity,
                               none,
                               [#Fun<amqp_auth_mechanisms.plain.3>,
                                #Fun<amqp_auth_mechanisms.amqplain.3>],
                               [],[]},
                           #Fun<amqp_connection_sup.0.39273983>,
                           #Fun<amqp_connection_sup.2.54430129>,[]]}},
                  {restart_type,intrinsic},
                  {shutdown,brutal_kill},
                  {child_type,worker}]

=PROGRESS REPORT==== 9-Dec-2014::20:55:15 ===
          supervisor: {<0.4320.0>,amqp_connection_sup}
             started: [{pid,<0.4321.0>},
                       {name,connection},
                       {mfa,
                           {amqp_gen_connection,start_link,
                               [amqp_network_connection,
                                {amqp_params_network,<<"guest">>,<<"guest">>,
                                    <<"/">>,"10.199.30.169",5672,0,0,1000,infinity,
                                    none,
                                    [#Fun<amqp_auth_mechanisms.plain.3>,
                                     #Fun<amqp_auth_mechanisms.amqplain.3>],
                                    [],[]},
                                #Fun<amqp_connection_sup.0.39273983>,
                                #Fun<amqp_connection_sup.2.54430129>,[]]}},
                       {restart_type,intrinsic},
                       {shutdown,brutal_kill},
                       {child_type,worker}]

...
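
For anyone decoding the reports above: the amqp_params_network term is an
Erlang record printed as a plain tuple, so its values appear without field
names. Below is a minimal sketch that pairs each value with a name. It
assumes the field order of the amqp_params_network record in
amqp_client.hrl for amqp_client 2.7.1; that order is an assumption and
should be checked against the header of the installed version. The module
name params_fields and the function describe/1 are hypothetical helpers,
not part of amqp_client.

```erlang
-module(params_fields).
-export([describe/1]).

%% Field names in the order ASSUMED from the amqp_params_network record
%% in amqp_client 2.7.1 (amqp_client.hrl). Verify against your version.
fields() ->
    [username, password, virtual_host, host, port,
     channel_max, frame_max, heartbeat, connection_timeout,
     ssl_options, auth_mechanisms, client_properties, socket_options].

%% Pair each value of a printed amqp_params_network tuple with its
%% field name. lists:zip/2 raises an error if the tuple does not carry
%% exactly as many values as there are names in fields/0.
describe(Params) when is_tuple(Params),
                      element(1, Params) =:= amqp_params_network ->
    [_RecordTag | Values] = tuple_to_list(Params),
    lists:zip(fields(), Values).
```

Under that assumed field order, calling params_fields:describe/1 on the
tuple from the reports (with the two auth-mechanism funs replaced by a
placeholder such as an empty list, since printed funs cannot be typed back
in) pairs heartbeat with 1000 and connection_timeout with infinity. Whether
that heartbeat setting is related to the 50-second pattern, for example
through an idle timeout on the haproxy in between, is something to check
rather than something the log confirms.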