[erlang-questions] socket_closed_unexpectedly with clustered RabbitMQ in EC2

Aram Antonyan aram.antonyan@REDACTED
Fri Dec 26 20:53:54 CET 2014


Hi Rich,
You should not change the session timeout value. Instead, set a heartbeat
value when you connect to the server, as was suggested in the previous
email. That will solve the issue.
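
For what it's worth, here is a minimal sketch of that, assuming the Erlang
amqp_client is on the code path and the proxy idle timeout is known in
seconds. The module name rmq_heartbeat_example and the helper
heartbeat_for/1 are made up for illustration, and halving the timeout is
only a conservative assumption, not an amqp_client rule:

```erlang
-module(rmq_heartbeat_example).
-export([heartbeat_for/1, connect/3]).

%% Provides the #amqp_params_network{} record.
-include_lib("amqp_client/include/amqp_client.hrl").

%% Pick a heartbeat interval (in seconds) comfortably below the proxy's
%% idle timeout (in seconds). Halving is an assumption for illustration.
%% The guard keeps the result >= 1, because 0 would turn heartbeats off.
heartbeat_for(ProxyIdleTimeoutSecs)
  when is_integer(ProxyIdleTimeoutSecs), ProxyIdleTimeoutSecs >= 2 ->
    ProxyIdleTimeoutSecs div 2.

%% Open a network connection with heartbeats enabled. All other
%% parameters are left at the record defaults.
connect(Host, Port, ProxyIdleTimeoutSecs) ->
    Params = #amqp_params_network{
                host      = Host,
                port      = Port,
                heartbeat = heartbeat_for(ProxyIdleTimeoutSecs)
             },
    amqp_connection:start(Params).
```

With the 50 second haproxy client timeout Rich mentions below,
connect(Host, Port, 50) would request a 25 second heartbeat.
amqp_connection:start/1 returns {ok, Connection} or {error, Reason}.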

Thanks,
Aram



On Thu, Dec 25, 2014 at 3:51 PM, Ilya Shcherbak <tthread@REDACTED> wrote:

> Hey,
> There is a heartbeat parameter on the RabbitMQ connection. To prevent the
> proxy from closing the session you need to turn it on (it is turned off by
> default if you use the Erlang amqp_client). You also need to coordinate
> this parameter with the idle TCP timeout of the AWS load balancer: the
> heartbeat interval should be shorter than that idle timeout.
> https://www.rabbitmq.com/configure.html
>
> If you use the Erlang amqp_client, you can establish the connection like this:
>  amqp_connection:start(
>       #amqp_params_network{
>         host      = Host,
>         port      = Port,
>         ...
>         %% heartbeat :: non_neg_integer() - the heartbeat interval in
>         %% seconds, defaults to 0 (turned off) (network only)
>         heartbeat = 30
>       }
>    ).
>
> Ilya Shcherbak
>
>
> 2014-12-10 23:56 GMT+01:00 Youngkin, Rich <richard.youngkin@REDACTED>:
>
>> Looking at this a bit more, it turns out that the default haproxy client
>> timeout was 50 seconds.  Changing this to other values (larger and smaller)
>> changed the frequency of the socket_closed_unexpectedly errors. So I think
>> I answered my own question. Thanks to anyone who took a look at this in the
>> meantime.
>>
>> Thanks,
>> Rich
>>
>>
>> On Wed, Dec 10, 2014 at 10:59 AM, Youngkin, Rich <
>> richard.youngkin@REDACTED> wrote:
>>
>>> Hi,
>>>
>>> I'm using the Erlang amqp client and am getting a
>>> "socket_closed_unexpectedly" CRASH REPORT. This error happens regardless of
>>> whether the application is actively publishing or consuming messages (i.e.,
>>> the app is idle, but connected to RabbitMQ).  This doesn't happen in a
>>> non-EC2 environment or in a single-zone EC2 environment.
>>>
>>> Here are some environmental details:
>>>
>>>    1. My application is using Erlang R15B01
>>>    2. amqp_client-2.7.1
>>>    3. RabbitMQ 3.2.2 using Erlang R14B04
>>>
>>> Here are the details regarding the EC2 configuration:
>>>
>>>    1. 2 EC2 zones in the same EC2 region
>>>    2. 2 RabbitMQ instances per Zone - 4 total instances
>>>    3. All RabbitMQ instances are clustered (cluster_partition_handling
>>>    set to pause_minority).
>>>    4. 2 Application instances per Zone (my app) - 4 total instances
>>>    5. There is an haproxy between my app and the RabbitMQ cluster
>>>    6. My application is using Mnesia and it's clustered across all app
>>>    instances
>>>
>>> I've included the text from the SASL log below.
>>>
>>> As I stated above, the error message occurs whether my application is
>>> publishing or not.  It does happen more frequently while publishing is
>>> occurring.  It looks to me like AMQP is recovering/restarting the affected
>>> processes, but I'd like to better understand why it's occurring and fix it
>>> if possible.  The error is happening regularly, every 50 seconds. This
>>> seems significant.
>>>
>>> Any help is appreciated.
>>>
>>> Thanks!
>>> Rich
>>>
>>>
>>> =CRASH REPORT==== 8-Dec-2014::20:52:34 ===
>>>  crasher:
>>>    initial call: amqp_gen_connection:init/1
>>>    pid: <0.5460.10>
>>>    registered_name: []
>>>    exception exit: socket_closed_unexpectedly
>>>      in function  gen_server:terminate/6 (gen_server.erl, line 747)
>>>    ancestors: [<0.5459.10>,amqp_sup,<0.4172.0>]
>>>    messages: []
>>>    links: [<0.6264.0>,<0.5459.10>,#Port<0.7206767>]
>>>    dictionary: []
>>>    trap_exit: false
>>>    status: running
>>>    heap_size: 610
>>>    stack_size: 24
>>>    reductions: 717
>>>  neighbours:
>>>
>>>
>>> =SUPERVISOR REPORT==== 9-Dec-2014::20:55:15 ===
>>>      Supervisor: {<0.4172.0>,amqp_connection_sup}
>>>      Context:    child_terminated
>>>      Reason:     socket_closed_unexpectedly
>>>      Offender:   [{pid,<0.4173.0>},
>>>                   {name,connection},
>>>                   {mfa,
>>>                       {amqp_gen_connection,start_link,
>>>                           [amqp_network_connection,
>>>                            {amqp_params_network,<<"guest">>,<<"guest">>,
>>>                                <<"/">>,"10.199.30.169",5672,0,0,1000,infinity,
>>>                                none,
>>>                                [#Fun<amqp_auth_mechanisms.plain.3>,
>>>                                 #Fun<amqp_auth_mechanisms.amqplain.3>],
>>>                                [],[]},
>>>                            #Fun<amqp_connection_sup.0.39273983>,
>>>                            #Fun<amqp_connection_sup.2.54430129>,[]]}},
>>>                   {restart_type,intrinsic},
>>>                   {shutdown,brutal_kill},
>>>                   {child_type,worker}]
>>>
>>> =SUPERVISOR REPORT==== 9-Dec-2014::20:55:15 ===
>>>      Supervisor: {<0.4172.0>,amqp_connection_sup}
>>>      Context:    shutdown
>>>      Reason:     reached_max_restart_intensity
>>>      Offender:   [{pid,<0.4173.0>},
>>>                   {name,connection},
>>>                   {mfa,
>>>                       {amqp_gen_connection,start_link,
>>>                           [amqp_network_connection,
>>>                            {amqp_params_network,<<"guest">>,<<"guest">>,
>>>                                <<"/">>,"10.199.30.169",5672,0,0,1000,infinity,
>>>                                none,
>>>                                [#Fun<amqp_auth_mechanisms.plain.3>,
>>>                                 #Fun<amqp_auth_mechanisms.amqplain.3>],
>>>                                [],[]},
>>>                            #Fun<amqp_connection_sup.0.39273983>,
>>>                            #Fun<amqp_connection_sup.2.54430129>,[]]}},
>>>                   {restart_type,intrinsic},
>>>                   {shutdown,brutal_kill},
>>>                   {child_type,worker}]
>>>
>>> =PROGRESS REPORT==== 9-Dec-2014::20:55:15 ===
>>>      supervisor: {<0.4320.0>,amqp_connection_sup}
>>>      started:    [{pid,<0.4321.0>},
>>>                   {name,connection},
>>>                   {mfa,
>>>                       {amqp_gen_connection,start_link,
>>>                           [amqp_network_connection,
>>>                            {amqp_params_network,<<"guest">>,<<"guest">>,
>>>                                <<"/">>,"10.199.30.169",5672,0,0,1000,infinity,
>>>                                none,
>>>                                [#Fun<amqp_auth_mechanisms.plain.3>,
>>>                                 #Fun<amqp_auth_mechanisms.amqplain.3>],
>>>                                [],[]},
>>>                            #Fun<amqp_connection_sup.0.39273983>,
>>>                            #Fun<amqp_connection_sup.2.54430129>,[]]}},
>>>                   {restart_type,intrinsic},
>>>                   {shutdown,brutal_kill},
>>>                   {child_type,worker}]
>>>
>>> ...
>>>
>>>
>>>
>>
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
>>
>>
>