[erlang-questions] socket_closed_unexpectedly with clustered RabbitMQ in EC2

Ilya Shcherbak tthread@REDACTED
Fri Dec 26 00:51:40 CET 2014


Hey,
The RabbitMQ connection has a heartbeat parameter. To keep a proxy from
closing an idle session you need to turn it on (it is turned off by default
if you use the Erlang amqp_client). You also need to coordinate this
parameter with the idle TCP timeout configured in the AWS load balancer, so
that heartbeats are sent before the idle timeout expires.
https://www.rabbitmq.com/configure.html

If you use the Erlang amqp_client, you can establish a connection like this:
 amqp_connection:start(
      #amqp_params_network{
        host      = Host,
        port      = Port,
        ...
        %% heartbeat :: non_neg_integer() - the heartbeat interval in
        %% seconds; defaults to 0 (turned off) (network only)
        heartbeat = 30
      }
   ).
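
To sanity-check a heartbeat value against a proxy's idle timeout, something
like the sketch below could be used. It is only a rule of thumb, not part of
amqp_client: the module and function names are hypothetical, and it assumes
that keeping the heartbeat value strictly below the idle timeout is enough
to keep traffic flowing through the proxy.

```erlang
-module(heartbeat_sketch).
-export([fits_idle_timeout/2, suggest/1]).

%% Hypothetical helper module; nothing here comes from amqp_client.

%% Returns true when the heartbeat value (seconds) is enabled and strictly
%% below the proxy idle timeout (seconds). Assumption: a heartbeat below
%% the idle timeout keeps the proxy from seeing the connection as idle.
fits_idle_timeout(0, _IdleTimeoutSecs) ->
    false;  % 0 means heartbeats are turned off
fits_idle_timeout(HeartbeatSecs, IdleTimeoutSecs)
  when is_integer(HeartbeatSecs), HeartbeatSecs > 0,
       is_integer(IdleTimeoutSecs), IdleTimeoutSecs > 0 ->
    HeartbeatSecs < IdleTimeoutSecs.

%% Suggests a heartbeat of half the idle timeout, but at least 1 second.
%% The factor of one half is an arbitrary safety margin, not a documented rule.
suggest(IdleTimeoutSecs)
  when is_integer(IdleTimeoutSecs), IdleTimeoutSecs > 0 ->
    max(1, IdleTimeoutSecs div 2).
```

With the 50 second haproxy client timeout from the quoted message,
heartbeat_sketch:fits_idle_timeout(30, 50) returns true, while a heartbeat
of 0 (the default) or 60 returns false.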

Ilya Shcherbak


2014-12-10 23:56 GMT+01:00 Youngkin, Rich <richard.youngkin@REDACTED>:

> Looking at this a bit more it turns out the default haproxy client
> timeout was 50 seconds.  Changing this to other values (larger and smaller)
> changed the frequency of the socket_closed_unexpectedly errors. So I think
> I answered my own question. Thanks to anyone who took a look at this in the
> meantime.
>
> Thanks,
> Rich
>
>
> On Wed, Dec 10, 2014 at 10:59 AM, Youngkin, Rich <
> richard.youngkin@REDACTED> wrote:
>
>> Hi,
>>
>> I'm using the Erlang amqp client and am getting a
>> "socket_closed_unexpectedly" CRASH REPORT. This error happens regardless of
>> whether the application is actively publishing or consuming messages (i.e.,
>> the app is idle, but connected to RabbitMQ).  This doesn't happen in a
>> non-EC2 environment or in a single-zone EC2 environment.
>>
>> Here are some environmental details:
>>
>>    1. My application is using Erlang version R15B01
>>    2. amqp_client-2.7.1
>>    3. RabbitMQ 3.2.2 using Erlang R14B04
>>
>> Here are the details regarding the EC2 configuration:
>>
>>    1. 2 EC2 zones in the same EC2 region
>>    2. 2 RabbitMQ instances per Zone - 4 total instances
>>    3. All RabbitMQ instances are clustered (cluster_partition_handling
>>    set to pause_minority).
>>    4. 2 Application instances per Zone (my app) - 4 total instances
>>    5. There is an haproxy between my app and the RabbitMQ cluster
>>    6. My application is using Mnesia and it's clustered across all app
>>    instances
>>
>> I've included the text from the SASL log below.
>>
>> As I stated above, the error message occurs whether my application is
>> publishing or not.  It does happen more frequently while publishing is
>> occurring.  It looks to me like AMQP is recovering/restarting the affected
>> processes, but I'd like to better understand why it's occurring and fix it
>> if possible.  The error is happening regularly, every 50 seconds. This
>> seems significant.
>>
>> Any help is appreciated.
>>
>> Thanks!
>> Rich
>>
>>
>> =CRASH REPORT==== 8-Dec-2014::20:52:34 ===
>>
>>  crasher:
>>
>>    initial call: amqp_gen_connection:init/1
>>
>>    pid: <0.5460.10>
>>
>>    registered_name: []
>>
>>    exception exit: socket_closed_unexpectedly
>>
>>      in function  gen_server:terminate/6 (gen_server.erl, line 747)
>>
>>    ancestors: [<0.5459.10>,amqp_sup,<0.4172.0>]
>>
>>    messages: []
>>
>>    links: [<0.6264.0>,<0.5459.10>,#Port<0.7206767>]
>>
>>    dictionary: []
>>
>>    trap_exit: false
>>
>>    status: running
>>
>>    heap_size: 610
>>
>>    stack_size: 24
>>
>>    reductions: 717
>>
>>  neighbours:
>>
>>
>> =SUPERVISOR REPORT==== 9-Dec-2014::20:55:15 ===
>>
>>      Supervisor: {<0.4172.0>,amqp_connection_sup}
>>
>>      Context:    child_terminated
>>
>>      Reason:     socket_closed_unexpectedly
>>
>>      Offender:   [{pid,<0.4173.0>},
>>
>>                   {name,connection},
>>
>>                   {mfa,
>>
>>                       {amqp_gen_connection,start_link,
>>
>>                           [amqp_network_connection,
>>
>>                            {amqp_params_network,<<"guest">>,<<"guest">>,
>>
>>
>>  <<"/">>,"10.199.30.169",5672,0,0,1000,infinity,
>>
>>                                none,
>>
>>                                [#Fun<amqp_auth_mechanisms.plain.3>,
>>
>>                                 #Fun<amqp_auth_mechanisms.amqplain.3>],
>>
>>                                [],[]},
>>
>>                            #Fun<amqp_connection_sup.0.39273983>,
>>
>>                            #Fun<amqp_connection_sup.2.54430129>,[]]}},
>>
>>                   {restart_type,intrinsic},
>>
>>                   {shutdown,brutal_kill},
>>
>>                   {child_type,worker}]
>>
>> =SUPERVISOR REPORT==== 9-Dec-2014::20:55:15 ===
>>      Supervisor: {<0.4172.0>,amqp_connection_sup}
>>      Context:    shutdown
>>      Reason:     reached_max_restart_intensity
>>      Offender:   [{pid,<0.4173.0>},
>>                   {name,connection},
>>                   {mfa,
>>                       {amqp_gen_connection,start_link,
>>                           [amqp_network_connection,
>>                            {amqp_params_network,<<"guest">>,<<"guest">>,
>>
>>  <<"/">>,"10.199.30.169",5672,0,0,1000,infinity,
>>                                none,
>>                                [#Fun<amqp_auth_mechanisms.plain.3>,
>>                                 #Fun<amqp_auth_mechanisms.amqplain.3>],
>>                                [],[]},
>>                            #Fun<amqp_connection_sup.0.39273983>,
>>                            #Fun<amqp_connection_sup.2.54430129>,[]]}},
>>                   {restart_type,intrinsic},
>>                   {shutdown,brutal_kill},
>>                   {child_type,worker}]
>>
>> =PROGRESS REPORT==== 9-Dec-2014::20:55:15 === supervisor:
>> {<0.4320.0>,amqp_connection_sup} started: [{pid,<0.4321.0>},
>> {name,connection}, {mfa, {amqp_gen_connection,start_link,
>> [amqp_network_connection, {amqp_params_network,<<"guest">>,<<"guest">>,
>> <<"/">>,"10.199.30.169",5672,0,0,1000, infinity,none,
>> [#Fun<amqp_auth_mechanisms.plain.3>,
>> #Fun<amqp_auth_mechanisms.amqplain.3>], [],[]},
>> #Fun<amqp_connection_sup.0.39273983>,
>> #Fun<amqp_connection_sup.2.54430129>,[]]}}, {restart_type,intrinsic},
>> {shutdown,brutal_kill}, {child_type,worker}]
>>
>> ...
>>
>>
>>
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
>
>