[erlang-questions] tcp data in a gen_fsm

Sun Nov 29 14:54:28 CET 2015

On Sunday, 29 November 2015 14:27:27 Frans Schneider wrote:
> Dear list,
> 
> I have a gen_fsm representing the tcp connection to a remote server with
> 6 states defined. TCP data is handled by handle_info with active =
> once. TCP packets can contain more than one command/data item.
> I would like to feed the data into the different FSM states in the same 
> manner as with send_event() and not handle the different states in 
> separate handle_info clauses for the different states or a massive 
> handle_info clause with a huge case statement.
> Question is, can I call the state clauses like this or do I break OTP
> behavior? This is a client and blocking incoming data is not an issue.
> 
> handle_info({tcp, Socket, Raw_data}, State_name,
>             #state{buffer = Buffer} = State) ->
>      {Datas, Buffer1} = unpack(<<Buffer/binary, Raw_data/binary>>),
>      %% Thread the state name and state data through every unpacked item.
>      {State_name1, State1} =
>          lists:foldl(fun(Data, {StateName_in, State_in}) ->
>                          {next_state, StateName_out, State_out} =
>                              ?MODULE:StateName_in({recv, Data}, State_in),
>                          {StateName_out, State_out}
>                      end, {State_name, State}, Datas),
>      inet:setopts(Socket, [{active, once}]),
>      {next_state, State_name1, State1#state{buffer = Buffer1}};
> 
> I can of course use a separate process for handling the tcp stuff which
> makes the gen_fsm calls, but was just wondering if this could work as well.

Using FSMs to represent protocol state is usually a win. However, using one to deal with the TCP socket directly usually isn't a good fit. TCP is a stream, and you understand how that works, as evidenced by your unpack/1 function. That means your variable Datas should contain a list of received messages in your protocol -- and each message has the potential to change the state of the protocol.
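
For anyone following along, here is a rough sketch of what an unpack/1 like that might look like. The framing is purely an assumption for illustration (a 4-byte big-endian length in front of each message), and the module name is made up; the original post never says what the wire format is.

```erlang
-module(framing).
-export([unpack/1]).

%% ASSUMED framing (not from the original post): every message is a
%% 4-byte big-endian length followed by that many bytes of payload.
%% Returns {CompleteMessages, LeftoverBytes}; the leftover is what you
%% prepend to the next chunk the socket hands you.
unpack(Bin) ->
    unpack(Bin, []).

%% A whole frame is available: peel it off and keep going.
unpack(<<Len:32, Payload:Len/binary, Rest/binary>>, Acc) ->
    unpack(Rest, [Payload | Acc]);
%% Anything else is an incomplete frame (or nothing at all): stop here.
unpack(Incomplete, Acc) ->
    {lists:reverse(Acc), Incomplete}.
```

The point is only that one TCP receive can yield zero, one, or many protocol messages plus a partial tail, which is why Datas is a list.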

You are now going to be looping over passed FSM states in *two* cases: when you receive new network data, and when you have anything left in the message list held in Datas. That's a little awkward, and defeats the purpose of passing NextState to the FSM -- since you have to manually handle the difference in your looping code.

I prefer instead to write a TCP process that deals only with network messages and does something like unpack/1 on them (but I have a sort of generic TCP process I use for this, where unpack/1 is a callback -- and I've always wondered why OTP doesn't have something like this, since gen_servers are just a touch kludgy for this ...?). Any time Datas is not empty, the TCP process sends each element of Datas, one by one, as Erlang messages to the FSM that represents the protocol state.
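
A minimal sketch of that kind of TCP process, written as a plain gen_server. This is not the generic process mentioned above, just an illustration under assumptions: the module name, the record, the Deliver fun, and the 4-byte length framing inside unpack/1 are all hypothetical.

```erlang
-module(tcp_handler).
-behaviour(gen_server).
-export([start_link/3]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-record(s, {socket, buffer = <<>>, deliver}).

%% Deliver is a fun/1 (an assumption of this sketch) that is called
%% once per complete protocol message.
start_link(Host, Port, Deliver) ->
    gen_server:start_link(?MODULE, {Host, Port, Deliver}, []).

init({Host, Port, Deliver}) ->
    {ok, Socket} = gen_tcp:connect(Host, Port, [binary, {active, once}]),
    {ok, #s{socket = Socket, deliver = Deliver}}.

%% Only stream handling lives here: buffer, unpack, forward, re-arm.
handle_info({tcp, Socket, Raw},
            S = #s{socket = Socket, buffer = Buffer, deliver = Deliver}) ->
    {Msgs, Rest} = unpack(<<Buffer/binary, Raw/binary>>),
    lists:foreach(Deliver, Msgs),
    _ = inet:setopts(Socket, [{active, once}]),
    {noreply, S#s{buffer = Rest}};
handle_info({tcp_closed, Socket}, S = #s{socket = Socket}) ->
    {stop, normal, S};
handle_info({tcp_error, Socket, Reason}, S = #s{socket = Socket}) ->
    {stop, {tcp_error, Reason}, S};
handle_info(_Other, S) ->
    {noreply, S}.

handle_call(_Req, _From, S) -> {reply, {error, unsupported}, S}.
handle_cast(_Msg, S) -> {noreply, S}.

%% ASSUMED framing: 4-byte big-endian length, then the payload.
unpack(Bin) -> unpack(Bin, []).

unpack(<<Len:32, Payload:Len/binary, Rest/binary>>, Acc) ->
    unpack(Rest, [Payload | Acc]);
unpack(Incomplete, Acc) ->
    {lists:reverse(Acc), Incomplete}.
```

To feed a gen_fsm, Deliver could be something like fun(Msg) -> gen_fsm:send_event(Fsm, {recv, Msg}) end, so the state functions see {recv, Msg} events in order and the FSM never touches the socket.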

TCP is its own protocol. It needs a handler. Your protocol is its own thing as well, and an FSM suits that well. Writing both protocols into a single process gets more convoluted than I enjoy.

...but...

ZOMG MY PRECIOUS PERFORMANSSESSSS!!!

And, well, whatever. I've never had *this* part of a program be my performance bottleneck. I have had this part of a program cause me to waste a lot of *my* time picking out the difference between unpacking TCP, iterating over a stack of extracted protocol messages, and iterating over the network receives. It certainly *can* be made to work, and it appears that *most* code is written like this -- where the application and transmission stuff is mixed together -- but I don't find it to be particularly readable a few months later, and that sucks really bad when you're trying to figure out what your protocol is up to.

It's also *very* nice to separate your protocol out into Erlang messages (even if it's a binary or text protocol and not based on Erlang terms) because if you separate the network bits from the protocol bits you can use your own application protocol within your own node to do other stuff -- especially testing of your process' compliance with your own protocol. It also gives you a lot of component flexibility in how you might add other local parts to your system. That you can throw a TCP layer in front of it is awesome -- and that also means you can throw an SCTP (or whatever) layer in front of it if you want later.
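
To make the testing point concrete, here is a small sketch of a protocol FSM that can be driven with plain events and no socket anywhere. The two states, the <<"hello">> and <<"bye">> messages, and the state_name/1 helper are all invented for illustration. It uses gen_fsm to match the thread; newer OTP releases deprecate gen_fsm in favour of gen_statem, so expect deprecation warnings there.

```erlang
-module(proto_fsm).
-behaviour(gen_fsm).
-export([start_link/0, state_name/1]).
-export([init/1, idle/2, active/2, handle_event/3, handle_sync_event/4,
         handle_info/3, terminate/3, code_change/4]).

start_link() -> gen_fsm:start_link(?MODULE, [], []).

%% Hypothetical helper for tests: ask the FSM which state it is in.
state_name(Fsm) -> gen_fsm:sync_send_all_state_event(Fsm, state_name).

init([]) -> {ok, idle, []}.

%% HYPOTHETICAL protocol: <<"hello">> moves idle -> active,
%% <<"bye">> moves active -> idle, other messages change nothing.
idle({recv, <<"hello">>}, Data) -> {next_state, active, Data};
idle({recv, _Other}, Data)      -> {next_state, idle, Data}.

active({recv, <<"bye">>}, Data) -> {next_state, idle, Data};
active({recv, _Other}, Data)    -> {next_state, active, Data}.

handle_sync_event(state_name, _From, StateName, Data) ->
    {reply, StateName, StateName, Data}.

handle_event(_Event, StateName, Data) -> {next_state, StateName, Data}.
handle_info(_Info, StateName, Data) -> {next_state, StateName, Data}.
terminate(_Reason, _StateName, _Data) -> ok.
code_change(_OldVsn, StateName, Data, _Extra) -> {ok, StateName, Data}.
```

From a shell or a test you can then do gen_fsm:send_event(Fsm, {recv, <<"hello">>}) and check proto_fsm:state_name(Fsm) without any network in sight; whatever later sits in front of it (TCP, SCTP, another local process) only has to produce the same {recv, Msg} events.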

-Craig