[erlang-questions] Writing platform independent Socket based binary parsing code

Dmitry Kolesnikov <>
Sat May 12 11:50:43 CEST 2012


Hi,

There is no need for ntohs-style functions in Erlang. Binary matching does the job, and keep in mind that by default it assumes "network byte order", i.e. big-endian.
So, as you wrote, you parse the data with a binary match:
<<Version:32, _Rest/binary>> = Data
The "big-unsigned" specifiers can be omitted, since they are the defaults for integer segments.
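
To make the point concrete, here is a minimal sketch. The module name endian_sketch, the functions parse_header/1 and demo/0, and the sample bytes are made up for illustration; the header layout (a 4-byte version followed by a 2-byte length) is only assumed from the sizes in the original question. It shows that the default and the explicit big-endian specifiers read the same value on any host, and that the result changes only when you explicitly ask for "little".

```erlang
-module(endian_sketch).
-export([parse_header/1, demo/0]).

%% Hypothetical header layout, assumed from the sizes in the question:
%% a 4-byte version followed by a 2-byte length, both big-endian.
parse_header(<<Version:32, Length:16, Rest/binary>>) ->
    {ok, Version, Length, Rest};
parse_header(_Other) ->
    {error, short_header}.

demo() ->
    %% Sample bytes as they would arrive from the socket.
    Bin = <<0,0,0,1, 0,16, "payload">>,
    %% Defaults: integer segments are unsigned and big-endian.
    <<V1:32, L1:16, _/binary>> = Bin,
    %% The same match with every specifier spelled out.
    <<V2:4/big-unsigned-integer-unit:8,
      L2:2/big-unsigned-integer-unit:8,
      _/binary>> = Bin,
    %% Both forms must agree, whatever the host CPU is.
    {V1, L1} = {V2, L2},
    %% Only an explicit `little` reads the bytes in the other order.
    <<VLittle:32/little, _/binary>> = Bin,
    {parse_header(Bin), VLittle}.
```

Calling endian_sketch:demo() should give the same result on a big-endian and on a little-endian machine, because the byte order is part of the pattern and not of the platform.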

- Dmitry

On May 12, 2012, at 12:13 PM, Arun Muralidharan wrote:

> Thanks Dmitry for the quick response.
> I am aware that data from the network arrives in our application in big-endian format, and in C/C++ I would use the ntohs/ntohl functions to convert it into the byte order the host uses.
> But the same thing in Erlang is confusing me (no idea why!).
> As per your last statement, "So, the parsing of protocol primitives shall not depend on the CPU architecture. E.g. if your protocol says that the version is serialized as big-endian, then your parsing code shall be the same and it is platform independent.", do you mean that, since I am parsing data from a network socket, I should use "<<Version:4/big-unsigned-integer-unit:8, _Rest/binary>> = Sock_Data." because data on the network is serialized in big-endian format?
> 
> Really sorry if I sound naive, but somehow I am having trouble understanding this in Erlang.
> 
> Thanks,
> Arun
> 
> 
> On Sat, May 12, 2012 at 1:52 PM, Dmitry Kolesnikov <> wrote:
> Hello,
> 
> I'd like to quote one statement here: "Network stacks and communication protocols must also define their endianness. Otherwise, two nodes of different endianness would be unable to communicate."
> As an example, the TCP/IP protocol suite is defined to be big-endian.
> 
> So, the parsing of protocol primitives shall not depend on the CPU architecture. E.g. if your protocol says that the version is serialized as big-endian, then your parsing code shall be the same and it is platform independent.
> 
> - Dmitry
> 
> P.S: ;-)  http://geekandpoke.typepad.com/geekandpoke/2011/09/simply-explained-1.html
> 
> 
> On May 12, 2012, at 10:34 AM, Arun Muralidharan wrote:
> 
>> Hi Folks,
>> 
>> I have an Erlang application that parses binary data from sockets (both TCP and UDP). The binary data follows a specific protocol, and my options for the TCP and UDP sockets are as follows:
>> 
>> 
>> TCP sock :
>> Opts = [binary, {packet, 0}, {reuseaddr, true},
>>             {keepalive, true}, {backlog, 30}, {active, false}],
>> UDP sock :
>> [binary,{active,once},{recbuf,2097152}]
>> 
>> Now, when I parse the data I get from the socket, I do it like this (on UNIX):
>> 
>>     << Version:4/big-unsigned-integer-unit:8,
>>        Length:2/big-unsigned-integer-unit:8,
>>        _Rest/binary >> = Bin_event_From_Socket.
>> 
>> Now, this would give me a problem when run on Linux, as Linux is little-endian based. So the option here for me is to change 'big' to 'little' in the above code.
>> 
>> But Erlang being a VM-based language, there must be some facility for writing platform-independent code. I searched the net but couldn't find much on it.
>> 
>> So, the question here is: how can I write platform-independent socket data parsing code?
>> 
>> Thanks.
>> 
>> -Arun
>> _______________________________________________
>> erlang-questions mailing list
>> 
>> http://erlang.org/mailman/listinfo/erlang-questions
> 
> 
