[erlang-questions] socket_closed_unexpectedly with clustered RabbitMQ in EC2

Youngkin, Rich richard.youngkin@REDACTED
Fri Dec 26 23:35:26 CET 2014


Thanks Ilya & Aram!  I'll take a look at this and revisit the haproxy
timeout value.

Cheers,
Rich

On Fri, Dec 26, 2014 at 12:53 PM, Aram Antonyan <aram.antonyan@REDACTED>
wrote:

> Hi Rich,
> You should not change the session timeout value. Instead, add a heartbeat
> value when you connect to the server, as was suggested in the previous
> email; that will solve the issue.
>
> Thanks Aram.
>
>
>
> On Thu, Dec 25, 2014 at 3:51 PM, Ilya Shcherbak <tthread@REDACTED> wrote:
>
>> Hey,
>> there is a heartbeat parameter in the RMQ connection. To prevent the proxy
>> from closing the session you need to turn it on (it's turned off by default
>> if you use the Erlang amqp_client). But you need to coordinate this
>> parameter with the idle TCP timeout parameter in the AWS load balancer.
>> https://www.rabbitmq.com/configure.html
>>
>> If you use the Erlang amqp_client you can establish the connection like this:
>>  amqp_connection:start(
>>       #amqp_params_network{
>>         host     = Host,
>>         port     =  Port,
>>         ...
>>         %% heartbeat :: non_neg_integer() - the heartbeat interval in
>>         %% seconds, defaults to 0 (turned off) (network only)
>>         heartbeat = 30
>>       }
>>    ).
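
A minimal sketch of how the heartbeat value might be chosen relative to the proxy's idle timeout. The module name hb_sketch and the function heartbeat_for/1 are hypothetical, and halving the timeout is an assumed safety margin rather than a documented rule; the only firm requirement is that the heartbeat interval be shorter than the proxy's idle timeout.

```erlang
-module(hb_sketch).
-export([heartbeat_for/1]).

%% Pick a heartbeat interval (in seconds) for a connection that passes
%% through a proxy or load balancer with the given idle timeout (also in
%% seconds). Half the idle timeout is an assumption made for this sketch;
%% the result is never allowed to drop to 0, because 0 turns heartbeats off.
-spec heartbeat_for(pos_integer()) -> pos_integer().
heartbeat_for(IdleTimeoutSecs) when is_integer(IdleTimeoutSecs),
                                    IdleTimeoutSecs > 0 ->
    max(1, IdleTimeoutSecs div 2).
```

With the 50 second haproxy client timeout reported further down in this thread, hb_sketch:heartbeat_for(50) returns 25, which could then be used as the heartbeat field of #amqp_params_network{} in the call above.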
>>
>> Ilya Shcherbak
>>
>>
>> 2014-12-10 23:56 GMT+01:00 Youngkin, Rich <richard.youngkin@REDACTED>:
>>
>>> Looking at this a bit more, it turns out the default haproxy client
>>> timeout was 50 seconds.  Changing this to other values (larger and smaller)
>>> changed the frequency of the socket_closed_unexpectedly errors. So I think
>>> I answered my own question. Thanks to anyone who took a look at this in the
>>> meantime.
>>>
>>> Thanks,
>>> Rich
>>>
>>>
>>> On Wed, Dec 10, 2014 at 10:59 AM, Youngkin, Rich <
>>> richard.youngkin@REDACTED> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm using the Erlang amqp client and am getting a
>>>> "socket_closed_unexpectedly" CRASH REPORT. This error happens regardless of
>>>> whether the application is actively publishing or consuming messages (i.e.,
>>>> it also happens when the app is idle but connected to RabbitMQ).  This
>>>> doesn't happen in a non-EC2 environment or in a single-zone EC2 environment.
>>>>
>>>> Here are some environmental details:
>>>>
>>>>    1. My application is using Erlang version R15B01
>>>>    2. amqp_client-2.7.1
>>>>    3. RabbitMQ 3.2.2 using Erlang R14B04
>>>>
>>>> Here are the details regarding the EC2 configuration:
>>>>
>>>>    1. 2 EC2 zones in the same EC2 region
>>>>    2. 2 RabbitMQ instances per Zone - 4 total instances
>>>>    3. All RabbitMQ instances are clustered (cluster_partition_handling
>>>>    set to pause_minority).
>>>>    4. 2 Application instances per Zone (my app) - 4 total instances
>>>>    5. There is an haproxy between my app and the RabbitMQ cluster
>>>>    6. My application is using Mnesia and it's clustered across all app
>>>>    instances
>>>>
>>>> I've included the text from the SASL log below.
>>>>
>>>> As I stated above, the error message occurs whether my application is
>>>> publishing or not.  It does happen more frequently while publishing is
>>>> occurring.  It looks to me like AMQP is recovering/restarting the affected
>>>> processes, but I'd like to better understand why it's occurring and fix it
>>>> if possible.  The error is happening regularly, every 50 seconds. This
>>>> seems significant.
>>>>
>>>> Any help is appreciated.
>>>>
>>>> Thanks!
>>>> Rich
>>>>
>>>>
>>>> =CRASH REPORT==== 8-Dec-2014::20:52:34 ===
>>>>  crasher:
>>>>    initial call: amqp_gen_connection:init/1
>>>>    pid: <0.5460.10>
>>>>    registered_name: []
>>>>    exception exit: socket_closed_unexpectedly
>>>>      in function  gen_server:terminate/6 (gen_server.erl, line 747)
>>>>    ancestors: [<0.5459.10>,amqp_sup,<0.4172.0>]
>>>>    messages: []
>>>>    links: [<0.6264.0>,<0.5459.10>,#Port<0.7206767>]
>>>>    dictionary: []
>>>>    trap_exit: false
>>>>    status: running
>>>>    heap_size: 610
>>>>    stack_size: 24
>>>>    reductions: 717
>>>>  neighbours:
>>>>
>>>>
>>>> =SUPERVISOR REPORT==== 9-Dec-2014::20:55:15 ===
>>>>      Supervisor: {<0.4172.0>,amqp_connection_sup}
>>>>      Context:    child_terminated
>>>>      Reason:     socket_closed_unexpectedly
>>>>      Offender:   [{pid,<0.4173.0>},
>>>>                   {name,connection},
>>>>                   {mfa,
>>>>                       {amqp_gen_connection,start_link,
>>>>                           [amqp_network_connection,
>>>>                            {amqp_params_network,<<"guest">>,<<"guest">>,
>>>>                                <<"/">>,"10.199.30.169",5672,0,0,1000,infinity,
>>>>                                none,
>>>>                                [#Fun<amqp_auth_mechanisms.plain.3>,
>>>>                                 #Fun<amqp_auth_mechanisms.amqplain.3>],
>>>>                                [],[]},
>>>>                            #Fun<amqp_connection_sup.0.39273983>,
>>>>                            #Fun<amqp_connection_sup.2.54430129>,[]]}},
>>>>                   {restart_type,intrinsic},
>>>>                   {shutdown,brutal_kill},
>>>>                   {child_type,worker}]
>>>>
>>>> =SUPERVISOR REPORT==== 9-Dec-2014::20:55:15 ===
>>>>      Supervisor: {<0.4172.0>,amqp_connection_sup}
>>>>      Context:    shutdown
>>>>      Reason:     reached_max_restart_intensity
>>>>      Offender:   [{pid,<0.4173.0>},
>>>>                   {name,connection},
>>>>                   {mfa,
>>>>                       {amqp_gen_connection,start_link,
>>>>                           [amqp_network_connection,
>>>>                            {amqp_params_network,<<"guest">>,<<"guest">>,
>>>>                                <<"/">>,"10.199.30.169",5672,0,0,1000,infinity,
>>>>                                none,
>>>>                                [#Fun<amqp_auth_mechanisms.plain.3>,
>>>>                                 #Fun<amqp_auth_mechanisms.amqplain.3>],
>>>>                                [],[]},
>>>>                            #Fun<amqp_connection_sup.0.39273983>,
>>>>                            #Fun<amqp_connection_sup.2.54430129>,[]]}},
>>>>                   {restart_type,intrinsic},
>>>>                   {shutdown,brutal_kill},
>>>>                   {child_type,worker}]
>>>>
>>>> =PROGRESS REPORT==== 9-Dec-2014::20:55:15 ===
>>>>      supervisor: {<0.4320.0>,amqp_connection_sup}
>>>>      started:    [{pid,<0.4321.0>},
>>>>                   {name,connection},
>>>>                   {mfa,
>>>>                       {amqp_gen_connection,start_link,
>>>>                           [amqp_network_connection,
>>>>                            {amqp_params_network,<<"guest">>,<<"guest">>,
>>>>                                <<"/">>,"10.199.30.169",5672,0,0,1000,infinity,
>>>>                                none,
>>>>                                [#Fun<amqp_auth_mechanisms.plain.3>,
>>>>                                 #Fun<amqp_auth_mechanisms.amqplain.3>],
>>>>                                [],[]},
>>>>                            #Fun<amqp_connection_sup.0.39273983>,
>>>>                            #Fun<amqp_connection_sup.2.54430129>,[]]}},
>>>>                   {restart_type,intrinsic},
>>>>                   {shutdown,brutal_kill},
>>>>                   {child_type,worker}]
>>>>
>>>> ...
>>>>
>>>>
>>>>
>>>
>>> _______________________________________________
>>> erlang-questions mailing list
>>> erlang-questions@REDACTED
>>> http://erlang.org/mailman/listinfo/erlang-questions
>>>
>>>
>>
>