[erlang-questions] How erlang handle complicated data structure like C?

Sun Sep 18 14:17:01 CEST 2011

On Sun, Sep 18, 2011 at 09:10, Jovi Zhang <bookjovi@REDACTED> wrote:
> Hi,
>    Is there any good way to handle complicated data structure like C in Erlang?
>
>    struct packet_t {
>        int a,
>        double b,
>        struct xx_t,
>        union c {int d, char e},
>        char f:1,
>        char g:3,
>        char h:5
>    }
>
>    Some protocol have complicated data structure like above, how to
> represent those structure in Erlang? and operate on each field freely
> like C?

Your tools are:

Binary Pattern Matches. The above can be matched with something like:

case Packet of
  <<A:32/integer, B:64/float, XX:?XX_SIZE/binary, D:32/integer,
    F:1/integer, G:3/integer, H:5/integer>> ->
    % Assume you have a separate decoding function for this beast
    S_XX = decode(XX),
    {A, B, S_XX, {d, D}, F, G, H};
  <<A:32/integer, B:64/float, XX:?XX_SIZE/binary, E:8/integer,
    F:1/integer, G:3/integer, H:5/integer>> ->
    S_XX = decode(XX),
    {A, B, S_XX, {e, E}, F, G, H}
end

It probably won't work as expected: the bit fields add up to 9 bits,
so the total size is not a multiple of 8 and the pattern can only
match a bitstring, never a whole-byte binary. Also note that some
protocols use a length field early to specify the size of later data,
which you can also match on. Using the same binary syntax as an
expression rather than as a pattern lets you construct the struct.
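
As a rough sketch of those two points (the module name, the 7 bits of
padding and the 16-bit length prefix are my assumptions, not anything
from the original struct), the bit fields can be padded out to two
whole bytes, and a length field bound early in a pattern can be used
as the size of a later segment:

```erlang
-module(packet_bits).
-export([encode_flags/3, decode_flags/1, decode_sized/1]).

%% Hypothetical layout: F (1 bit), G (3 bits), H (5 bits), followed by
%% 7 bits of padding so the fields occupy exactly two bytes.
encode_flags(F, G, H) ->
    <<F:1, G:3, H:5, 0:7>>.

%% The same syntax used as a pattern takes the fields apart again.
decode_flags(<<F:1, G:3, H:5, _Pad:7>>) ->
    {F, G, H}.

%% Hypothetical layout: a 16-bit big-endian length, then that many
%% bytes of payload. Len is bound by the first segment and used as
%% the size of the second segment in the same pattern.
decode_sized(<<Len:16, Payload:Len/binary, Rest/binary>>) ->
    {ok, Payload, Rest};
decode_sized(_) ->
    {error, incomplete}.
```

How a real C compiler packs those three bit fields depends on the
compiler and platform, so check the actual wire format before copying
the padding above.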

Records, possibly nested.

The idea here is that you are taking an external packet representation
apart and converting it to an internal form. The internal form is
usually not a 7-element tuple as above, but a (nested) record, as
George C. Serbanut recommends. In C, you keep, more or less, a 1-to-1
representation of what is on the wire (beware of the packed
attribute!). In Erlang you decode the packet's external representation
to an internal one and then proceed to use the internal representation
in your code. When you want to output the internal representation, you
render it into binary() form and shove that out over the wire.
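
A minimal sketch of that decode/encode pair with nested records might
look as follows. The record names, the fields of the inner struct
(id, flag) and the 7 bits of trailing padding are all made up for
illustration, and the union is left out to keep it short:

```erlang
-module(packet_rec).
-export([decode/1, encode/1]).

%% Hypothetical internal form; field names and sizes are assumptions.
-record(xx, {id, flag}).
-record(packet, {a, b, xx = #xx{}, f, g, h}).

%% External (wire) form -> internal (record) form.
decode(<<A:32, B:64/float, Id:16, Flag:8, F:1, G:3, H:5, _Pad:7>>) ->
    #packet{a = A, b = B, xx = #xx{id = Id, flag = Flag},
            f = F, g = G, h = H}.

%% Internal (record) form -> external (wire) form.
encode(#packet{a = A, b = B, xx = #xx{id = Id, flag = Flag},
               f = F, g = G, h = H}) ->
    <<A:32, B:64/float, Id:16, Flag:8, F:1, G:3, H:5, 0:7>>.
```

Inside the module, a single field is then changed with the record
update syntax, e.g. P#packet{a = 8}, without touching the binary
until it is time to encode again.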

. . .

Structures can be more complicated than this. If the structure size is
statically known and you can break your input up into chunks of the
right size, the above works. If the size is dynamic, you will have to
write code which scrutinizes a partial decode to learn how to decode
the rest. Finally, if there are more packets coming in on, say, the
same TCP stream, you may want to process it recursively by adding a
Rest/binary segment at the end of the match, as in
<<..., Rest/binary>>, and then handling the Rest of the input in
another call. For TCP streams you may also need some buffering to make
sure you have a full packet to decode, but this isn't something new
compared to C.
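
The recursive Rest/binary approach with buffering could be sketched
like this, assuming (my assumption, purely for illustration) that each
packet on the stream is framed as a 16-bit big-endian length followed
by that many payload bytes:

```erlang
-module(packet_stream).
-export([decode_all/1]).

%% Returns {Packets, Leftover}: every complete payload in the buffer,
%% in order, plus the trailing bytes that do not yet form a full
%% packet.
decode_all(Buffer) ->
    decode_all(Buffer, []).

%% A full packet is available: take it and recurse on the Rest.
decode_all(<<Len:16, Payload:Len/binary, Rest/binary>>, Acc) ->
    decode_all(Rest, [Payload | Acc]);
%% Not enough bytes for another packet: hand back what is left.
decode_all(Incomplete, Acc) ->
    {lists:reverse(Acc), Incomplete}.
```

When more data arrives on the socket, the caller would prepend the
leftover before decoding again, i.e.
decode_all(<<Leftover/binary, NewData/binary>>).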

-- 
J.