[erlang-questions] (ArgumentError) argument error :erlang.binary_to_term

Jesper Louis Andersen jesper.louis.andersen@REDACTED
Sun Jan 29 16:05:33 CET 2017


My bet is on a framing problem. Joe indirectly gives you a way to narrow
down the question: test binary_to_term locally, without the socket, or send
something you know over the wire and check it isn't garbled. This rules out
any potential zlib trouble.

The framing problem is subtle: the kernel could deliver one byte at a time
from the stream, which is inefficient, or it could buffer everything
forever and never deliver, which is wrong. In practice, the kernel chooses
a middle ground. It delivers bytes from the stream to you either because an
internal buffer is full, because some time has passed since the last
delivery, or because your application usually asks for more data rather
quickly. The MTU of your IP datagrams quickly comes into play, so you will
often get messages of roughly MTU size or a multiple thereof. The internet
can do evil things to your packets, however: fragment them, reorder them,
drop a packet in the TCP sequence (which requires a SACK round trip), and
so on. All of these events affect the way the kernel hands bytes from the
stream to the application.
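To make the chunking behaviour concrete, here is a small standalone sketch
(in Python rather than Erlang, purely so it runs without a socket; the
3-byte chunk size is an arbitrary stand-in for however the kernel happens
to split deliveries) showing that a length-prefix decoder reassembles whole
messages no matter where the stream is cut:

```python
import struct

def frame(payload: bytes) -> bytes:
    """Length-prefix a payload with a 4-byte big-endian header."""
    return struct.pack(">I", len(payload)) + payload

# Simulate the kernel handing the stream to us in arbitrary chunks.
stream = frame(b"hello") + frame(b"world")
buffer = b""
messages = []
for i in range(0, len(stream), 3):  # 3-byte chunks cut across frame boundaries
    buffer += stream[i:i + 3]
    # Squeeze every complete frame out of the buffer.
    while len(buffer) >= 4:
        (length,) = struct.unpack(">I", buffer[:4])
        if len(buffer) < 4 + length:
            break  # need more data: wait for the next delivery
        messages.append(buffer[4:4 + length])
        buffer = buffer[4 + length:]

print(messages)  # [b'hello', b'world']
```

The decoder never assumes a delivery lines up with a message, which is
exactly the assumption that breaks when you call binary_to_term directly on
whatever arrived.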

Since it is a stream, you need your code to handle this case (untested):

read_packet(Pkt, Buffer) ->
    %% Append newly received bytes to whatever is already buffered.
    read_buffer(<<Buffer/binary, Pkt/binary>>).

read_buffer(<<L:32/integer, Payload:L/binary, RestBuffer/binary>>) ->
    %% A complete length-prefixed packet is available.
    {packet, Payload, RestBuffer};
read_buffer(Buffer) ->
    %% Not enough bytes yet; keep buffering.
    {need_more_data, Buffer}.

write_term(Sock, Term) ->
    Payload = term_to_binary(Term),
    L = byte_size(Payload),
    gen_tcp:send(Sock, [<<L:32/integer>>, Payload]).

Coincidentally, this is roughly what the socket option {packet, 4} does at
the VM layer, and that is quicker. When we lack enough bytes in the buffer
from the kernel, we tell the world we need more data. When we can break a
packet off the buffer, we do so and return what remains in the buffer. The
above API requires the caller to call read_buffer/1 repeatedly until no
more packets can be squeezed out of the buffer. Another variant would
return a list of decoded packets together with a buffer state for the bytes
which could not yet be decoded.
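That second variant might look like the following (again a Python sketch
rather than the list's Erlang, so it is trivially runnable; the name
decode_all is mine, not part of any API):

```python
import struct

def decode_all(buffer: bytes):
    """Split off every complete length-prefixed packet.

    Returns (packets, rest) where rest holds the bytes of any
    partial packet still waiting for more data from the stream.
    """
    packets = []
    while len(buffer) >= 4:
        (length,) = struct.unpack(">I", buffer[:4])
        if len(buffer) < 4 + length:
            break  # incomplete packet; keep it buffered
        packets.append(buffer[4:4 + length])
        buffer = buffer[4 + length:]
    return packets, buffer

# Two full packets plus the first two bytes of a third packet's header:
data = struct.pack(">I", 2) + b"ab" + struct.pack(">I", 1) + b"c" + b"\x00\x00"
packets, rest = decode_all(data)
print(packets, rest)  # [b'ab', b'c'] b'\x00\x00'
```

The caller then simply loops over the returned packets instead of calling
the decoder repeatedly.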

On Sun, Jan 29, 2017 at 12:56 PM Ali Sabil <ali.sabil@REDACTED> wrote:

> You don't seem to be using any framing over TCP. TCP is stream based not
> message based.
>
> Basically, what happens is that when your message gets too large, it is
> split and sent in multiple packets. You need to add framing around your
> data to ensure that the entire message has been received before you
> attempt a call to binary_to_term.
> On Sun, 29 Jan 2017 at 11:50, Joe Armstrong <erlang@REDACTED> wrote:
>
> No idea what's wrong - but you could start checking that
>
> 1) binary_to_term works locally (it should)
> 2) the data is transmitted correctly:
>     print the md5 checksum of the binary before sending and after reception;
>     Erlang has a BIF for this: erlang:md5
>
> /Joe
>
>
> On Sat, Jan 28, 2017 at 1:55 AM, Kevin Johnson
> <johnson786.kevin@REDACTED> wrote:
> > Hi,
> >
> > I am using Elixir to send data using :gen_tcp. The syntax presented here
> > may be elementary Elixir syntax; however, the issue is specifically
> > related to the usage of term_to_binary and binary_to_term conversions
> > over gen_tcp.
> >
> > This is the command I used on the client side:
> >>
> >> :gen_tcp.send(client_socket, :erlang.term_to_binary(data))
> >
> >
> > This is the command I use on the server side:
> >>
> >>   defp handle_data(data) do
> >>     data
> >>     |> :erlang.binary_to_term
> >>   end
> >
> >
> > When my data gets received on the server side, it seems that at a certain
> > size threshold it errors out like this:
> >
> >> pry(6)> [error] GenServer #PID<0.502.0> terminating
> >>
> >> ** (ArgumentError) argument error
> >>
> >>  :erlang.binary_to_term(<<131, 116, 0, 0, 0, 28, 109, 0, 0, 0, 5, 66,
> 49,
> >> 95, 81, 49, 108, 0, 0, 0, 1, 116, 0, 0, 0, 1, 109, 0, 0, 0, 4, 116, 101,
> >> 120, 116, 109, 0, 0, 0, 11, 84, 69, 88, 84, 32, 65, 78, 83, 87, 69,
> ...>>)
> >>
> >> (stdlib) gen_server.erl:601: :gen_server.try_dispatch/4
> >>
> >> (stdlib) gen_server.erl:667: :gen_server.handle_msg/5
> >>
> >> (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
> >>
> >> Last message: {:tcp, #Port<0.10115>, <<131, 116, 0, 0, 0, 28, 109, 0, 0,
> >> 0, 5, 66, 49, 95, 81, 49, 108, 0, 0, 0, 1, 116, 0, 0, 0, 1, 109, 0, 0,
> 0, 4,
> >> 116, 101, 120, 116, 109, 0, 0, 0, 11, 84, 69, 88, 84, 32, 65, 78, 83,
> ...>>}
> >
> >
> > Here is some example data that will just pass fine:
> > %{"B2_Q18" => [%{"choice_id" => "B2_Q18_C1"}], "B3_B9_Q1" =>
> [%{"choice_id"
> > => "B3_B9_Q1_C1"}], "B2_Q7" => [%{"choice_id" => "B2_Q7_C1"}],
> "B5_B1_Q8" =>
> > [%{"choice_id" => "B5_B1_Q8_C1"}], "B3_B6_Q1" => [%{"choice_id" =>
> > "B3_B6_Q1_C1"}], "B5_B2_Q5" => [%{"choice_id" => "B5_B2_Q5_C1"},
> > %{"choice_id" => "B5_B2_Q5_C2"}], "B3_B7_Q1" => [%{"choice_id" =>
> > "B3_B7_Q1_C1"}], "B3_B4_Q1" => [%{"choice_id" => "B3_B4_Q1_C1"}],
> "B2_Q16"
> > => [%{"choice_id" => "B2_Q16_C1"}], "B2_Q20" => [%{"choice_id" =>
> > "B2_Q20_C1"}], "B2_Q5" => [%{"choice_id" => "B2_Q5_C1"}], "B2_Q9" =>
> > [%{"choice_id" => "B2_Q9_C1"}], "B2_Q17" => [%{"choice_id" =>
> "B2_Q17_C1"}],
> > "B3_B2_Q1" => [%{"choice_id" => "B3_B2_Q1_C1"}], "B5_B2_Q3" =>
> > [%{"choice_id" => "B5_B2_Q3_C1"}], "B3_B3_Q1" => [%{"choice_id" =>
> > "B3_B3_Q1_C1"}], "B5_B1_Q1" => [%{"text" => "TEXT ANSWER"}], "B2_Q14" =>
> > [%{"choice_id" => "B2_Q14_C1"}], "B5_B1_Q3" => [%{"choice_id" =>
> > "B5_B1_Q3_C22"}], "B5_B2_Q2" => [%{"choice_id" => "B5_B2_Q2_C1"}],
> "B2_Q4"
> > => [%{"choice_id" => "B2_Q4_C1"}], "B2_Q10" => [%{"choice_id" =>
> > "B2_Q10_C1"}], "B4_Q1" => [%{"choice_id" => "B4_Q1_C1"}, %{"choice_id" =>
> > "B4_Q1_C10"}], "B5_B1_Q6" => [%{"choice_id" => "B5_B1_Q6_C1"}], "B2_Q6"
> =>
> > [%{"choice_id" => "B2_Q6_C1"}], "B2_Q2" => [%{"choice_id" =>
> "B2_Q2_C1"}],
> > "B2_Q1" => [%{"choice_id" => "B2_Q1_C1"}]}
> >
> > The moment I make it slightly longer, e.g. by appending an extra key
> > "a" => "b", it will give me the above error.
> >
> > It seems that the total size of the keys plays a role here, because if I
> > convert all the above "choice_id" keys to a mere "c", then even with the
> > extra entry "a" => "b" included, the following will work just fine:
> > %{"B2_Q18" => [%{"c" => "B2_Q18_C1"}], "B3_B9_Q1" => [%{"c" =>
> > "B3_B9_Q1_C1"}], "B2_Q7" => [%{"c" => "B2_Q7_C1"}], "B5_B1_Q8" => [%{"c"
> =>
> > "B5_B1_Q8_C1"}], "B3_B6_Q1" => [%{"c" => "B3_B6_Q1_C1"}], "B5_B2_Q5" =>
> > [%{"c" => "B5_B2_Q5_C1"}, %{"c" => "B5_B2_Q5_C2"}], "B3_B7_Q1" => [%{"c"
> =>
> > "B3_B7_Q1_C1"}], "B3_B4_Q1" => [%{"c" => "B3_B4_Q1_C1"}], "B2_Q16" =>
> [%{"c"
> > => "B2_Q16_C1"}], "B2_Q20" => [%{"c" => "B2_Q20_C1"}], "B2_Q5" => [%{"c"
> =>
> > "B2_Q5_C1"}], "B2_Q9" => [%{"c" => "B2_Q9_C1"}], "B2_Q17" => [%{"c" =>
> > "B2_Q17_C1"}], "B3_B2_Q1" => [%{"c" => "B3_B2_Q1_C1"}], "B5_B2_Q3" =>
> [%{"c"
> > => "B5_B2_Q3_C1"}], "B3_B3_Q1" => [%{"c" => "B3_B3_Q1_C1"}], "B5_B1_Q1"
> =>
> > [%{"text" => "TEXT ANSWER"}], "B2_Q14" => [%{"c" => "B2_Q14_C1"}],
> > "B5_B1_Q3" => [%{"c" => "B5_B1_Q3_C22"}], "B5_B2_Q2" => [%{"c" =>
> > "B5_B2_Q2_C1"}], "B2_Q4" => [%{"c" => "B2_Q4_C1"}], "B2_Q10" => [%{"c" =>
> > "B2_Q10_C1"}], "B4_Q1" => [%{"c" => "B4_Q1_C1"}, %{"c" => "B4_Q1_C10"}],
> > "B5_B1_Q6" => [%{"c" => "B5_B1_Q6_C1"}], "B2_Q6" => [%{"c" =>
> "B2_Q6_C1"}],
> > "B2_Q2" => [%{"c" => "B2_Q2_C1"}], "B2_Q1" => [%{"c" => "B2_Q1_C1"}],
> "a" =>
> > "b"}
> >
> > The following on the other hand will just fail:
> > data =
> >
> %{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
>
>  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
> > => "q",
> >
> "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
>
>  bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"
> > => "q",
> >
> "ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
>
>  cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc"
> > => "q"}
> > :gen_tcp.send(client_socket, :erlang.term_to_binary(data))
> >
> > The only pattern I was able to track down here is a conjunction of two
> > things:
> > 1) a limit on the total key length inside a map/hash
> > 2) sending that map as a binary over TCP and converting it back to a term
> >
> > In all of the above cases, a direct conversion with :erlang.term_to_binary
> > followed by :erlang.binary_to_term works just fine without any issues. The
> > issue only comes about after the binary has been sent over TCP and
> > :erlang.binary_to_term is then attempted.
> >
> > I would greatly appreciate it if anyone could provide me with guidance in
> > this matter, point me to any relevant documentation that officially states
> > these constraints if they are in fact constraints, and, if they are not
> > supposed to be constraints, instruct me on what additional information may
> > be needed to get to the bottom of this.
> >
> > Thank you.
> >
> > _______________________________________________
> > erlang-questions mailing list
> > erlang-questions@REDACTED
> > http://erlang.org/mailman/listinfo/erlang-questions
> >

