gen_tcp:recv() and client_machine_crash

Tue Jun 27 14:12:19 CEST 2006

Thanks for the pointer...

In my application, the client keeps the socket open for
an undetermined time (while it does something else).
So the TCP keepalive option would have been a good
solution, since it needs no change to the client code.
However, the default idle time before the first keepalive
probe is sent is 2 hours (according to Stevens' book),
which is too long. And changing that 2-hour default
may affect ALL sockets on the host!
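
[ Note: on Linux, at least, the keepalive timers can also be set per
  socket, which leaves the host-wide default alone. The sketch below
  assumes Linux and an OTP release whose inet module accepts the
  {raw, Protocol, OptionNum, ValueBin} option. The numeric constants
  are the Linux values and will differ on other systems; the module
  and function names are made up for illustration. ]

```erlang
-module(keepalive_opts).
-export([linux_opts/3, enable/4]).

%% ASSUMPTION: these are the Linux values from <netinet/in.h> and
%% <netinet/tcp.h>. Other operating systems use different numbers.
-define(IPPROTO_TCP, 6).
-define(TCP_KEEPIDLE, 4).   % idle seconds before the first probe
-define(TCP_KEEPINTVL, 5).  % seconds between probes
-define(TCP_KEEPCNT, 6).    % unanswered probes before giving up

%% Build the socket option list; the values are native-endian 32-bit ints.
linux_opts(IdleSecs, IntervalSecs, Count) ->
    [{keepalive, true},
     {raw, ?IPPROTO_TCP, ?TCP_KEEPIDLE,  <<IdleSecs:32/native>>},
     {raw, ?IPPROTO_TCP, ?TCP_KEEPINTVL, <<IntervalSecs:32/native>>},
     {raw, ?IPPROTO_TCP, ?TCP_KEEPCNT,   <<Count:32/native>>}].

%% Apply the options to an already connected (accepted) socket.
enable(Sock, IdleSecs, IntervalSecs, Count) ->
    inet:setopts(Sock, linux_opts(IdleSecs, IntervalSecs, Count)).
```

For example, keepalive_opts:enable(Sock, 60, 10, 3) would ask the
kernel to start probing after 60 idle seconds and to give up after 3
unanswered probes sent 10 seconds apart.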

Maybe the simplest solution in my case is to implement
a monitoring process on the server side that probes the
client machine periodically :)
[ Fortunately, there is ONLY one machine on the client
  side, even though there may be many clients at a time. ]
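
[ Note: a minimal sketch of such a monitoring process, not a finished
  design. It assumes that a TCP connect attempt is an acceptable probe:
  a live host either accepts the connection or refuses it right away,
  while a crashed host usually gives no answer before the timeout. The
  module name, the {host_down, Host} message and the 2-second timeout
  are all made up for illustration. ]

```erlang
-module(host_monitor).
-export([start/3, probe/3]).

%% Spawn a linked process that probes Host:Port every IntervalMs
%% milliseconds and notifies the caller once the host stops answering.
start(Host, Port, IntervalMs) ->
    Parent = self(),
    spawn_link(fun() -> loop(Parent, Host, Port, IntervalMs) end).

loop(Parent, Host, Port, IntervalMs) ->
    case probe(Host, Port, 2000) of
        up ->
            timer:sleep(IntervalMs),
            loop(Parent, Host, Port, IntervalMs);
        down ->
            %% Hypothetical notification; the monitor then stops.
            Parent ! {host_down, Host}
    end.

%% One probe: try to connect, giving up after TimeoutMs milliseconds.
probe(Host, Port, TimeoutMs) ->
    case gen_tcp:connect(Host, Port, [binary, {active, false}], TimeoutMs) of
        {ok, Sock} ->
            gen_tcp:close(Sock),
            up;
        {error, econnrefused} ->
            up;    % the host is alive, nothing listens on that port
        {error, _Other} ->
            down   % timeout, unreachable host, etc.
    end.
```

On receiving {host_down, Host}, the server could run the same cleanup
as handle_exit() for every socket that belongs to that machine. A
firewall that silently drops packets would look like a crashed host
here, so the probe port has to be chosen with that in mind.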

Thanks
HP

On Tue, 27 Jun 2006, Yani Dzhurov wrote:

> Hi,
>
> Why not try using the keepalive option for the TCP socket:
>
> {keepalive, Boolean} (TCP/IP sockets)
> Enables/disables periodic transmission on a connected socket, when no other
> data is being exchanged. If the other end does not respond, the connection
> is considered broken and an error message will be sent to the controlling
> process. Default disabled.
>
> This is from the inet module docs. I think this should work for you.
>
> Cheerz,
>
> Jani
>
> -----Original Message-----
> From: owner-erlang-questions@REDACTED
> [mailto:owner-erlang-questions@REDACTED] On Behalf Of HP Wei
> Sent: Monday, June 26, 2006 10:15 PM
> To: erlang-questions@REDACTED
> Subject: gen_tcp:recv() and client_machine_crash
>
>
> I have a server (written in Erlang) which handles
> clients' request(s) through tcp_socket by
> the following piece of pseudo-code:
>
> handle_request(Sock) ->
>    case gen_tcp:recv(Sock, 0) of
>      {ok, Bin} ->
>          handle_Bin(...);
>      {error, Reason} ->
>          handle_exit(...)
>    end.
>
> [ Note: The client code is written in python. ]
>
> I want the handle_exit() to handle three abnormal-exit situations
> that may occur on the client side.
> (1) the client's code exits on Control-C;
> (2) the client's code gets killed by 'kill -9 pid'
>     [ under UNIX ];
> (3) the machine that the client is running on
>     crashes !
>
> For (1) and (2), the part 'handle_exit()' in the
> above snippet gets executed as expected.
>
> However, we had a client_machine_crash this morning
> but the handle_exit(..) did not seem to get executed.
>
> Questions:
>    Is it supposed to be executed ?
>    If yes, I will check the other part of the server-code.
>    If no, then what is the proper way to detect a
>    machine crash ??
>
> thanks
> HP
>
>