[erlang-questions] gen_tcp question

Mon Sep 18 18:51:23 CEST 2006

Joe,

See in-line comments.

Joe Armstrong (TN/EAB) wrote:
> Hi Serge,
> 
> It seems to me that the program "does the right thing" (ie recombines 
> fragmented packets if packet>0) but that this fact is not documented. 
> 
> I can understand that {packet,4} adds a 4 byte header on transmission
> so for example, a C program can read this and do whatever it wants.
> 
> But since the packet,4 header is silently removed in the Erlang case
> (with an active socket) then it becomes invisible, so I have to add my
> own private size(BIN) (as 4 bytes) in front of each Binary I send
> and then do a re-entrant packet reassembly in Erlang (not difficult,
> but it seems a bit daft, since in this case I could just have used
> packet,0 and done the reassembly anyway)
> 
> Sean claims that packets are not fragmented, ie that the driver
> reassembles the packets in the case when a packet length is
> given, ie does the right thing

Yes, I concur with Sean's claim.

> Jani said in his last mail
> 
> Jani> When using {packet,PacketType}, PacketType=1,2 or 4 inet_drv
> Jani> receives (sends) 1, 2 or 4 byte header before the message you are
> Jani> receiving (sending). Those bytes correspond to one 8bit, 16bit or
> Jani> 32bit integer numbers that tell how long the message is going to
> Jani> be. The C-code in inet_drv then receives (sends) the header and
> Jani> the message in one or two fragments which should happen under the
> Jani> hood.
> 
> I think this is the opposite of what Sean says
> 
> I'm not really sure what " .. and the message in one or two fragments
> .." means.

I am not clear about this sentence either, but I believe Jani meant the 
same thing as Sean.

[...]
> It seems to me that the logical behaviour if a packet size is given is
> that the Erlang messages are delivered from and to the driver in
> unfragmented binaries and that the messages on the line are preceded
> by a header and then possibly fragmented.

Correct.  The C driver used by the emulator cannot assume that messages 
arrive unfragmented at the transport layer: that is beyond its control 
and depends entirely on the underlying protocol.  TCP is a byte stream 
with no message boundaries, so the driver has to do the reassembly.  
SCTP, by contrast, preserves message boundaries, so a receive call on an 
SCTP socket either delivers a message in full or generates an error.
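
To make the distinction concrete, here is a minimal sketch of the kind of 
reassembly that {packet, 4} spares you from writing when the socket is 
opened with {packet, 0}.  It is not the driver's code: the module name 
frame_buf and the function split/1 are made up for illustration, and it 
assumes the 4-byte header is an unsigned big-endian length, as the 
inet:setopts documentation describes.

```erlang
-module(frame_buf).
-export([split/1]).

%% Split a buffer of length-prefixed frames (4-byte big-endian header)
%% into the complete payloads it contains plus the leftover bytes that
%% do not yet form a whole frame.
split(Buffer) when is_binary(Buffer) ->
    split(Buffer, []).

%% A full header and a full payload are available: take one frame.
split(<<Len:32/big, Payload:Len/binary, Rest/binary>>, Acc) ->
    split(Rest, [Payload | Acc]);
%% Either the header or the payload is still incomplete: keep the bytes
%% until the next chunk arrives from the socket.
split(Incomplete, Acc) ->
    {lists:reverse(Acc), Incomplete}.
```

A receiving process would keep the leftover binary in its state and call 
frame_buf:split(<<Leftover/binary, Chunk/binary>>) for every chunk that 
arrives from the socket.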

> I gave up and started reading inet_drv.c - somewhere in 7071 lines of
> code is the answer ... but it's not obvious to me where this is ...

You can examine the following functions:

tcp_recv()
tcp_remain()
tcp_deliver()

> I didn't really understand this comment
> 
>> Since you are including this example in the book, I'd 
>> recommend additionally showing how to use active sockets and 
>> still preserve flow control without gen_tcp:recv, so that you 
>> can implement the server using the standard OTP gen_server 
>> behavior.  This can be accomplished by using {active, once} option. 
> 
> Can you expand on this with (possibly) an example?

Surely!

Let's say that we wanted to accomplish the following:

1. Implement a TCP server receiving client connection requests, and 
communicating over some custom protocol.
2. Ensure that there's flow control on the server side and that clients 
will not bring down the server by over-flooding its message queue.
3. Use standard OTP behaviors to implement the server process handling 
client transactions.

Now, say we learn how to accomplish #1 by studying your tcp_server tutorial:

http://www.sics.se/~joe/tutorials/web_server/tcp_server.erl

#2 can be accomplished by ensuring that if too much traffic is being 
generated by clients, we have control over how that traffic is 
dispatched to the mailbox of the Erlang server process handling client's 
transactions.  This is done by either:

a. Using blocking gen_tcp:recv/2 socket calls. In this case the Erlang 
process actively reads data from the socket's buffer, and this data 
never passes through the server process mailbox.  The drawback of this 
approach is that the process blocks inside the call, so while it waits 
for socket data it cannot respond to other messages arriving in its 
mailbox.

b. Using the {active, once} server socket option, and not using the 
blocking gen_tcp:recv/2 call, but relying on the gen_tcp's ability to 
read and deliver *one* message to the mailbox of the server process. 
This is great because we no longer have to deal with blocking calls in 
the program, and can use standard gen_server behavior to implement the 
server.  The trick is that after receiving and processing a message from 
the mailbox, the server needs to issue a call to inet:setopts(Sock, 
[{active, once}]) to allow the gen_tcp socket to deliver the next message.
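
The two options can be sketched side by side in a plain process, before 
bringing gen_server into the picture.  This is only an illustration: the 
module name flow_control_sketch, the function names, the Handler fun and 
the stop message are all made up for the example.

```erlang
-module(flow_control_sketch).
-export([passive_loop/2, once_loop/2]).

%% Option (a): the socket is passive ({active, false}).  The process
%% pulls one packet at a time and blocks inside gen_tcp:recv/2, so it
%% cannot react to anything else while it waits.
passive_loop(Socket, Handler) when is_function(Handler, 1) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Packet} ->
            Handler(Packet),
            passive_loop(Socket, Handler);
        {error, closed} ->
            ok;
        {error, Reason} ->
            {error, Reason}
    end.

%% Option (b): arm the socket for exactly one message, then wait in an
%% ordinary receive.  At most one packet is delivered to the mailbox
%% until the socket is re-armed, and other messages can be handled in
%% the same receive.
once_loop(Socket, Handler) when is_function(Handler, 1) ->
    ok = inet:setopts(Socket, [{active, once}]),
    receive
        {tcp, Socket, Packet} ->
            Handler(Packet),
            once_loop(Socket, Handler);
        {tcp_closed, Socket} ->
            ok;
        {tcp_error, Socket, Reason} ->
            {error, Reason};
        stop ->
            % Hypothetical control message from another process.
            gen_tcp:close(Socket)
    end.
```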

#3 is illustrated by the example below.  Note that I am only including a 
few gen_server callbacks that illustrate this point.

-module(custom_server).
-behavior(gen_server).

% Create a new process handling interactions with the client.  If this
% server was a part of an application with a supervisor, we could use
% the simple_one_for_one restart strategy in the supervisor for these
% server processes.  In that case, in a production system, a crash of
% such a server would automatically generate and log an error report by
% the supervisor / SASL.  The only difference in server creation would
% be that we'd call supervisor:start_child/2 instead of gen_server:start
% to create the server process.
new_client(Socket) ->
     {ok, Pid} = gen_server:start(?MODULE, [Socket], []),
     % Here we assume that the calling process had the ownership of the
     % socket.  We need to delegate that ownership to the newly created
     % gen_server in order for the active socket to deliver messages to
     % the proper Pid.
     ok = gen_tcp:controlling_process(Socket, Pid),
     {ok, Pid}.
...

% Let's assume that the socket is created by
init([Socket]) ->
     inet:setopts(Socket, [{active, once}, {packet, 4}, binary]),
     ...
     {ok, State#state{sock = Socket}}.

handle_info({tcp, Socket, Data}, #state{sock=Socket} = State) ->
     % Note that Data is delivered non-fragmented here
     NewState = do_handle_request(Socket, Data, State),
     inet:setopts(Socket, [{active, once}]),
     {noreply, NewState};
...
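
For anyone who wants something that compiles as it stands, here is a 
self-contained variant of the same idea.  It is a sketch under my own 
assumptions, not a reference implementation: the module name 
custom_server_sketch is made up, the request handling is replaced by a 
plain echo, and the activate cast is just one way to make sure the socket 
is armed only after ownership has been transferred.

```erlang
-module(custom_server_sketch).
-behaviour(gen_server).

-export([new_client/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

-record(state, {sock}).

%% Start a server for an accepted socket and hand the socket over to it.
new_client(Socket) ->
    {ok, Pid} = gen_server:start(?MODULE, [Socket], []),
    ok = gen_tcp:controlling_process(Socket, Pid),
    % Assumption of this sketch: arm the socket only once the new
    % process owns it.
    gen_server:cast(Pid, activate),
    {ok, Pid}.

init([Socket]) ->
    {ok, #state{sock = Socket}}.

handle_call(_Request, _From, State) ->
    {reply, {error, unsupported}, State}.

handle_cast(activate, #state{sock = Socket} = State) ->
    ok = inet:setopts(Socket, [{active, once}, {packet, 4}, binary]),
    {noreply, State};
handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info({tcp, Socket, Data}, #state{sock = Socket} = State) ->
    % Placeholder for real request handling: echo the packet back.
    _ = gen_tcp:send(Socket, Data),
    % Re-arm the socket so that the next packet can be delivered.
    _ = inet:setopts(Socket, [{active, once}]),
    {noreply, State};
handle_info({tcp_closed, Socket}, #state{sock = Socket} = State) ->
    {stop, normal, State};
handle_info({tcp_error, Socket, Reason}, #state{sock = Socket} = State) ->
    {stop, {tcp_error, Reason}, State};
handle_info(_Other, State) ->
    {noreply, State}.

terminate(_Reason, #state{sock = Socket}) ->
    gen_tcp:close(Socket),
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```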

Please note the following:

1. It is not clear from documentation if the socket returned from the 
acceptor call inherits any options given to the listen socket.  I.e. if 
the listen socket was opened with:

{ok, ListenSocket} = gen_tcp:listen(Port, [{packet, 4}]).

would the server process accepting the client socket need to explicitly 
call inet:setopts(Socket, [{packet, 4}]) after the following statement?

{ok, Socket} = gen_tcp:accept(ListenSocket).

Or is that option inherited from the listener socket?  I don't think any 
options are inherited, so I call inet:setopts explicitly, but it would be 
nice if the documentation said something about it.
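
Short of a statement in the documentation, the question can be checked 
empirically by asking the accepted socket what it reports.  The sketch 
below does that over the loopback interface; the module name 
inherit_check is made up, and the result only describes the emulator 
version it is run on.

```erlang
-module(inherit_check).
-export([run/0]).

%% Open a listen socket with {packet, 4}, connect to it locally, accept
%% the connection and return the options reported by the accepted socket.
run() ->
    {ok, LSock} = gen_tcp:listen(0, [binary, {packet, 4}, {active, false}]),
    % Port 0 lets the OS choose a free port; ask which one we got.
    {ok, Port} = inet:port(LSock),
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port,
                                   [binary, {active, false}]),
    {ok, Accepted} = gen_tcp:accept(LSock),
    Result = inet:getopts(Accepted, [packet, active]),
    ok = gen_tcp:close(Client),
    ok = gen_tcp:close(Accepted),
    ok = gen_tcp:close(LSock),
    Result.
```

Calling inherit_check:run() from the shell returns whatever 
inet:getopts/2 reports for the packet and active options of the accepted 
socket.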

2. Is it possible to turn the socket acceptor process into a gen_server? 
  The gen_tcp:accept/1 call is blocking by nature, and I don't see how 
this could be done.  However, this server pattern is fairly generic, and 
it would be nice if it could be completely mapped to some OTP behavior.
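
One workaround that stays within the documented API is to call 
gen_tcp:accept/2 with a short timeout from inside a gen_server and to 
re-post a message to self after each attempt, so the server gets back to 
its mailbox between polls.  This is a sketch of that idea only, with its 
own trade-offs (polling latency, and the server is still unresponsive for 
up to one timeout period).  The module name acceptor_sketch, the 
poll_accept message, the OnAccept fun and the 200 ms interval are all 
assumptions of the example.

```erlang
-module(acceptor_sketch).
-behaviour(gen_server).

-export([start_link/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

-record(state, {lsock, on_accept}).

-define(POLL_MS, 200).

%% OnAccept is a fun of arity 1 that takes over the accepted socket,
%% for example a function that starts a per-client server.
start_link(Port, OnAccept) when is_function(OnAccept, 1) ->
    gen_server:start_link(?MODULE, [Port, OnAccept], []).

init([Port, OnAccept]) ->
    {ok, LSock} = gen_tcp:listen(Port, [binary, {packet, 4},
                                        {active, false},
                                        {reuseaddr, true}]),
    self() ! poll_accept,
    {ok, #state{lsock = LSock, on_accept = OnAccept}}.

handle_call(_Request, _From, State) ->
    {reply, {error, unsupported}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(poll_accept, #state{lsock = LSock, on_accept = OnAccept} = State) ->
    case gen_tcp:accept(LSock, ?POLL_MS) of
        {ok, Socket} ->
            % This process owns Socket here, so OnAccept may transfer
            % ownership with gen_tcp:controlling_process/2.
            OnAccept(Socket),
            self() ! poll_accept,
            {noreply, State};
        {error, timeout} ->
            % Nothing to accept yet; go back to the mailbox and retry.
            self() ! poll_accept,
            {noreply, State};
        {error, Reason} ->
            {stop, {accept_failed, Reason}, State}
    end;
handle_info(_Other, State) ->
    {noreply, State}.

terminate(_Reason, #state{lsock = LSock}) ->
    gen_tcp:close(LSock),
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```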

Regards,

Serge

> Thanks a lot
> 
> /Joe
> 
> 
>> -----Original Message-----
>> From: Serge Aleynikov [mailto:serge@REDACTED] 
>> Sent: den 18 september 2006 16:45
>> To: Joe Armstrong (TN/EAB)
>> Cc: erlang-questions@REDACTED
>> Subject: Re: [erlang-questions] gen_tcp question
>>
>> Joe Armstrong (TN/EAB) wrote:
>>> 	{ok, L} = gen_tcp:listen(Port, [{length,4},binary,{active,true}]),
>> I hope it is a typo, because there's no {length, N} socket 
>> option.  Did you, perhaps, mean {packet, 4}?
>>
>>> loop(S, C) ->
>>>      receive
>>>          {tcp, S, Bin} ->              %% <----- Is Bin of length 4 here?????????????
>>>               C ! binary_to_term(Bin),   
>>>               loop(S, C);
>>>          {tcp_closed, S} ->
>>> 	        C ! closed;
>>> 	   {msg, Term} ->
>>>               gen_tcp:send(S, term_to_binary(Term)),
>>> 	        loop(S, C);
>>>          close ->
>>> 	        gen_tcp:close(S)	
>>> 	end        
>> I can't say how this was *supposed* to work, since the 
>> documentation is unclear.  From the details of the inet_drv C 
>> driver, however, I can tell you how {active, true} differs from 
>> gen_tcp:recv/2.  With gen_tcp:recv/2 the driver reads the socket 
>> only when the Erlang process commands it to.  With the active 
>> option the driver initiates the read itself upon detecting data 
>> on the socket, in exactly the same manner as if 
>> gen_tcp:recv(Sock, 0) had been called.  In both cases you are 
>> guaranteed to get the *full non-fragmented* packet of length 
>> M (where M is read from the N-byte message header, N being set 
>> by the {packet, N} option).
>>
>> Note that since gen_tcp:recv(Sock, M), where M > 0 is not a 
>> possible call for sockets with {packet, N} where N =/= 0, it 
>> would also be meaningless to have the {packet, N} option if 
>> it didn't guarantee non-fragmented delivery of packets.
>>
>>> I need the active=true mode for this since loop has to handle
>>> messages from both C and S - so I don't want to block in gen_tcp:recv
>> Since you are including this example in the book, I'd 
>> recommend additionally showing how to use active sockets and 
>> still preserve flow control without gen_tcp:recv, so that you 
>> can implement the server using the standard OTP gen_server 
>> behavior.  This can be accomplished by using {active, once} option.
>>
>>> Now if the answer to this question is in the documentation I can't 
>>> seem to find it; the setopts documentation says that a 4 byte
>>> header is appended to the TCP data - that's all.
>> I also don't believe this is documented.
>>
>> Regards,
>>
>> Serge
>>
>>>> -----Original Message-----
>>>> From: Jani Hakala [mailto:jahakala@REDACTED]
>>>> Sent: den 14 september 2006 15:33
>>>> To: erlang-questions@REDACTED
>>>> Cc: Joe Armstrong (TN/EAB)
>>>> Subject: Re: [erlang-questions] gen_tcp question
>>>>
>>>> "Joe Armstrong (TN/EAB)" <joe.armstrong@REDACTED> writes:
>>>>
>>>>> The behaviour appears not to be documented
>>>>>
>>>> Meaning of {packet,N} is explained in man inet, inet:setopts.
>>>> The behaviour of gen_tcp:recv is explained in man gen_tcp.
>>>>
>>>> Jani Hakala
>>>>