[erlang-questions] my socket can't handle the load

Pablo Platt pablo.platt@REDACTED
Sat Jun 5 11:13:19 CEST 2010


I was able to identify exactly when the issue happens.
As long as only one request at a time is sent to the driver it works fine:
request1, response1, request2, response2...

If the client sends several requests as fast as it can without waiting for a response, it breaks.
Debug messages show the following:
request1, request2, response1, request3, response2, Error timeout for request3, response3

This failure is reproducible. The timeout fires after 5000 ms, compared to ~10 ms for a normal query.
It can't be caused by too many messages in a process mailbox because we are only talking about 3 requests.
I don't know if this issue is related to my driver or the db.
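
One possibility that would fit the trace above, offered only as a guess: the handle_info clause quoted further down decodes at most one response per {tcp, ...} message and keeps the remainder in the buffer. If two responses arrive in the same TCP chunk, the second one stays there until more data arrives from the socket. The sketch below decodes every complete response in the buffer. The wire format (a 32-bit length, a 32-bit request id, then the payload) and the module name frame_buffer are assumptions made for illustration, not the real database protocol.

```erlang
-module(frame_buffer).
-export([decode_all/1]).

%% ASSUMED wire format (not taken from the original driver): each response
%% is <<Len:32, ReqId:32, Payload:Len/binary>>, integers big-endian.

%% Decode every complete response currently in Buffer.
%% Returns {[{ReqId, Payload}], Rest}, where Rest holds the bytes of a
%% trailing incomplete response (possibly <<>>).
decode_all(Buffer) ->
    decode_all(Buffer, []).

decode_all(<<Len:32, ReqId:32, Payload:Len/binary, Rest/binary>>, Acc) ->
    %% A whole response is available: keep it and look for another one.
    decode_all(Rest, [{ReqId, Payload} | Acc]);
decode_all(Rest, Acc) ->
    %% Not enough bytes for another response: stop and keep the remainder.
    {lists:reverse(Acc), Rest}.
```

In handle_info the gen_server would then reply to the caller of each decoded response and store Rest as the new buffer.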

Is there something about sockets or TCP that breaks when you send and receive data in a non-blocking way?
Is there an Erlang driver that sends messages to a socket in a non-blocking way and matches responses to requests by a unique id tag,
which I could use as a reference to see whether any special configuration is needed?
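
For reference, a minimal sketch of the bookkeeping such a driver needs: a table of pending requests keyed by the unique id tag. The module name pending and its functions are hypothetical, and the sketch assumes the database echoes the request id back in every response.

```erlang
-module(pending).
-export([new/0, add/3, take/2]).

%% Pending requests keyed by request id. The id is ASSUMED to be the tag
%% that the database echoes back in each response.
new() ->
    #{}.

%% Remember which caller is waiting for ReqId.
add(ReqId, From, Pending) ->
    Pending#{ReqId => From}.

%% Look up and remove the caller waiting for ReqId.
%% Returns {ok, From, NewPending}, or error if the id is unknown,
%% so an unexpected response does not crash the gen_server.
take(ReqId, Pending) ->
    case maps:take(ReqId, Pending) of
        {From, NewPending} -> {ok, From, NewPending};
        error -> error
    end.
```

The gen_server would call add/3 in handle_call before sending the packet and take/2 in handle_info for each decoded response, passing the returned From to gen_server:reply/2.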

________________________________
From: Kaiduan Xie <kaiduanx@REDACTED>
To: Pablo Platt <pablo.platt@REDACTED>
Cc: Bernard Duggan <bernie@REDACTED>; Erlang <erlang-questions@REDACTED>
Sent: Thu, June 3, 2010 3:14:33 PM
Subject: Re: [erlang-questions] my socket can't handle the load

Pablo, you can use erlang:process_info(Pid, message_queue_len) to find
the mailbox size of the gen_server.
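
A minimal sketch of that check, wrapped in a helper. The module name queue_probe is hypothetical; erlang:process_info/2 returns undefined when the process is no longer alive, which the helper passes through.

```erlang
-module(queue_probe).
-export([queue_len/1]).

%% Returns the number of messages waiting in Pid's mailbox,
%% or undefined if the process is no longer alive.
queue_len(Pid) when is_pid(Pid) ->
    case erlang:process_info(Pid, message_queue_len) of
        {message_queue_len, Len} -> Len;
        undefined -> undefined
    end.
```

Calling queue_probe:queue_len(Conn) from the requester just before or after a timed-out gen_server:call shows whether the socket gen_server has a backlog.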

On Thu, Jun 3, 2010 at 2:52 AM, Pablo Platt <pablo.platt@REDACTED> wrote:
> p.s.
> when getting a gen_server:call timeout, the debug message of the
> requester (not the socket gen_server) says:
> heap_size: 1597
> stack_size: 24
> reductions: 1996
>
>
> ________________________________
> From: Pablo Platt <pablo.platt@REDACTED>
> To: Kaiduan Xie <kaiduanx@REDACTED>; Bernard Duggan <bernie@REDACTED>
> Cc: Erlang <erlang-questions@REDACTED>
> Sent: Thu, June 3, 2010 9:49:17 AM
> Subject: Re: [erlang-questions] my socket can't handle the load
>
> @Bernard, Kaiduan
> When I'm testing with a pool of 20 connections instead of 1, the driver and
> socket work fine.
> I think that means that both the receiver (database) and the sender can
> handle the load.
> When sending one request at a time from the same socket it also works fine,
> but I don't think this is the correct design.
>
> I don't think {active, once} is relevant here because responses from the db
> will only arrive when I make requests,
> so I don't need to protect myself against DoS from an untrusted third party.
>
> When getting a timeout on gen_server:call to the socket gen_server, how can
> I print useful info
> about the socket gen_server and the process calling it so I can see their
> mailbox queues and maybe other useful
> stuff that would help me find the problem?
>
> Thanks
>
> ________________________________
> From: Kaiduan Xie <kaiduanx@REDACTED>
> To: Bernard Duggan <bernie@REDACTED>
> Cc: Pablo Platt <pablo.platt@REDACTED>; Erlang
> <erlang-questions@REDACTED>
> Sent: Thu, June 3, 2010 3:59:02 AM
> Subject: Re: [erlang-questions] my socket can't handle the load
>
> Pablo,
>
> Have you considered the chance that the call to
> gen_tcp:send(State#state.socket, Packet) is getting blocked? For example,
> by a slow receiver. If gen_tcp:send() blocks, handle_call()
> will block too, and the gen_server will slow down. Please look at the
> example listed in gen_tcp,
>
> http://www.erlang.org/doc/man/gen_tcp.html#examples
>
> Kaiduan
>
> On Wed, Jun 2, 2010 at 5:35 PM, Bernard Duggan <bernie@REDACTED> wrote:
>> Hi Pablo,
>>    One thing you may want to try is, wherever possible, to avoid letting
>> the queue on your gen_server grow beyond a couple of messages.  Allowing
>> the queue to grow unchecked can cause serious performance degradation on
>> selective receives (which, to be fair, I can't see any of in your code,
>> but they can crop up in non-obvious library calls at times).
>> Off the top of my head, you'd do this by:
>> a) changing the handle_cast operation to a handle_call (to keep clients
>> using that operation from flooding you with requests) and
>> b) changing {active, true} to {active, once} in your connect call (and
>> making the corresponding change in handle_info to reactivate the socket).
>>
>> It may or may not be your problem, but it's a fairly easy change and
>> worth a try.
>>
>> Another possibility is that you have the same issue, but in the calling
>> process (I'm guessing now, since you haven't provided that code).  A
>> gen_server call does a selective receive while it waits for a response -
>> if your message queue is very large when you make the call, there's a
>> good chance you'll get a timeout regardless of how quickly the
>> gen_server serves the request.
>>
>> Cheers,
>>
>> Bernard
>>
>> On 3/06/2010 1:27 AM, Pablo Platt wrote:
>>> Hi,
>>>
>>> I'm writing a driver to a database that is using a tcp socket.
>>> There are two types of messages, "send and forget" and "send and
>>> receive".
>>> I have a gen_server that is responsible for opening the socket, receiving a
>>> message from a process and sending it to the socket, and receiving responses
>>> and passing them to the caller.
>>> The gen_server saves a list of {request_id, CallerPid} in the state to
>>> know who to respond to when a packet is received from the db.
>>>
>>> Everything works when the rate of messages is low, but when increasing it
>>> I start getting gen_server timeout errors on the requestor.
>>> Do I need to add/change parameters when opening the socket?
>>> Should I queue requests and wait for a response before sending the next
>>> request or is it ok to send several requests one after the other?
>>>
>>> Thanks
>>>
>>> The client sends a request using either:
>>> Resp = gen_server:call(Conn, {request, Request})
>>> or
>>> gen_server:cast(Conn, {request, Request})
>>>
>>>
>>> The relevant gen_server code:
>>>
>>> init([Host, Port]) ->
>>>     Socket = open_socket(Host, Port),
>>>     {ok, #state{socket=Socket, req_id=1}}.
>>>
>>> open_socket(Host, Port) ->
>>>     case gen_tcp:connect(Host, Port, [binary, {active, true}]) of
>>>         {ok, Sock} ->
>>>             Sock;
>>>         {error, Reason} ->
>>>             exit({open_socket_failed, Reason})
>>>     end.
>>>
>>> handle_call({request, Packet}, From, State) ->
>>>     ReqID = State#state.req_id + 1,
>>>     gen_tcp:send(State#state.socket, Packet),
>>>     Requests = [{ReqID, From} | State#state.requests],
>>>     {noreply, State#state{req_id=ReqID, requests=Requests}};
>>>
>>> handle_cast({request, Packet}, State) ->
>>>     ReqID = State#state.req_id + 1,
>>>     gen_tcp:send(State#state.socket, Packet),
>>>     {noreply, State#state{req_id=ReqID}}.
>>>
>>> handle_info({tcp, _Socket, Data}, State) ->
>>>     RawResp = <<(State#state.resp)/binary, Data/binary>>,
>>>     case check_packet:decode_response(RawResp) of
>>>         undefined ->
>>>             {noreply, State#state{resp = RawResp}};
>>>         {Resp, Tail} ->
>>>             ResponseTo = get_requestor(Resp),
>>>             {value, {ResponseTo, Client}, NewRequests} =
>>>                 lists:keytake(ResponseTo, 1, State#state.requests),
>>>             gen_server:reply(Client, Resp),
>>>             {noreply, State#state{resp = Tail, requests=NewRequests}}
>>>     end;
>>>
>>> % the following clauses have never been called, even when I'm getting the error.
>>> handle_info({tcp_closed, _Socket}, State) ->
>>>     {noreply, State};
>>>
>>> handle_info({tcp_error, _Socket, _Reason}, State) ->
>>>     {noreply, State}.
>>>
>>> terminate(_Reason, _State) ->
>>>     ok.