[erlang-bugs] Segfault in erl_interface attempting to decode certain large binaries

Jonathan Gray jlist@REDACTED
Thu Jul 24 19:52:52 CEST 2008


Sverker,

I apologize for not looking directly at the binary to see if there were any
strange issues.

It looks like my C server is not blocking until it receives the entire
sequence.  Fixing that will certainly clear this up.
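For anyone hitting the same thing, here is a minimal sketch of the kind of
read loop that avoids handing a partial term to the decoder.  The thread does
not say how messages are framed on the wire, so the 4-byte big-endian length
prefix below is an assumption (it matches what Erlang's {packet, 4} socket
option sends), and the names read_exact and read_frame are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Read exactly len bytes from fd, looping over short reads.
 * Returns 0 on success, -1 on error or if the peer closes early. */
int read_exact(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            return -1;              /* EOF before the full message arrived */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* ASSUMED framing: a 4-byte big-endian length followed by the encoded term.
 * Returns the payload length, or -1 on error or if it exceeds cap. */
long read_frame(int fd, unsigned char *buf, size_t cap)
{
    unsigned char hdr[4];
    uint32_t len;

    if (read_exact(fd, hdr, sizeof hdr) != 0)
        return -1;
    len = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
          ((uint32_t)hdr[2] << 8)  |  (uint32_t)hdr[3];
    if (len > cap)
        return -1;                  /* refuse frames larger than the buffer */
    if (read_exact(fd, buf, len) != 0)
        return -1;
    return (long)len;
}
```

A single read() or recv() may return fewer bytes than requested, which is
why the buffer should only go to the decoder after read_frame has returned
the full length.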

Thanks for your help.  I am nearly finished with the ei-based parser, so I
will probably go forward with that, since it will be more efficient than
converting to ETERM and then parsing.  So this error on my part may be a
blessing in disguise (or I will keep telling myself that)  :)

Thanks again!

Jonathan

-----Original Message-----
From: erlang-bugs-bounces@REDACTED [mailto:erlang-bugs-bounces@REDACTED]
On Behalf Of Sverker Eriksson
Sent: Thursday, July 24, 2008 9:35 AM
To: jlist@REDACTED
Cc: erlang-bugs@REDACTED
Subject: Re: [erlang-bugs] Segfault in erl_interface attempting to decode certain large binaries

Hi Jonathan

I've looked at the test files you attached.

The file 'badbinary' is clearly broken. It ends in a long run of zeros:

 > hexdump -C badbinary
:
0000ffb0 34 31 68 02 64 00 04 64 61 74 61 6d 00 00 00 08 |41h.d..datam....|
0000ffc0 54 65 72 72 65 6e 63 65 6a 6c 00 00 00 02 68 02 |Terrencejl....h.|
0000ffd0 64 00 03 72 6f 77 6d 00 00 00 24 38 37 32 61 36 |d..rowm...$872a6|
0000ffe0 62 62 61 2d 34 64 31 65 2d 34 34 64 34 2d 61 38 |bba-4d1e-44d4-a8|
0000fff0 61 62 2d 62 66 31 37 65 39 65 38 38 00 00 00 00 |ab-bf17e9e88....|
00010000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
0001e640 00 00 00 00 |....|
0001e644

The zeros start close to offset 0x10000 into the file.
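The same check can be made programmatically before decoding. This is only a
rough sketch: the function name zero_tail_start is hypothetical, and a valid
term can legitimately end in zero bytes, so a long zero tail is a hint that
the buffer was never filled, not proof.

```c
#include <stddef.h>

/* Return the offset at which the trailing run of zero bytes starts,
 * or len if the buffer does not end in a zero byte. */
size_t zero_tail_start(const unsigned char *buf, size_t len)
{
    size_t i = len;

    while (i > 0 && buf[i - 1] == 0)
        i--;
    return i;
}
```

In the dump above the last non-zero byte is at offset 0xfffb, so a scan like
this would be expected to report 0xfffc for 'badbinary'.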

The segfault happens when erl_interface finds the faulty format and 
fails to recover due to some bug. I will look at that.
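Until that is fixed, a caller can guard against this by checking that a
buffer holds one complete term before decoding it. The following is a sketch
only and is not how erl_interface works internally: it walks the external
term format with bounds checks and handles just the tags visible in the dump
above (SMALL_TUPLE_EXT 'h', ATOM_EXT 'd', BINARY_EXT 'm', NIL_EXT 'j',
LIST_EXT 'l'), rejecting anything else. The names skip_term and
term_is_complete are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Skip one term starting at *pos.  Returns 0 on success, -1 if the term
 * is truncated or uses a tag this sketch does not handle. */
int skip_term(const unsigned char *buf, size_t len, size_t *pos)
{
    uint32_t n, i;
    unsigned char tag;

    if (*pos >= len)
        return -1;
    tag = buf[(*pos)++];
    switch (tag) {
    case 106:                               /* NIL_EXT: no payload */
        return 0;
    case 100:                               /* ATOM_EXT: 2-byte length */
        if (len - *pos < 2) return -1;
        n = ((uint32_t)buf[*pos] << 8) | buf[*pos + 1];
        *pos += 2;
        if (len - *pos < n) return -1;
        *pos += n;
        return 0;
    case 109:                               /* BINARY_EXT: 4-byte length */
        if (len - *pos < 4) return -1;
        n = be32(buf + *pos);
        *pos += 4;
        if (len - *pos < n) return -1;
        *pos += n;
        return 0;
    case 104:                               /* SMALL_TUPLE_EXT: 1-byte arity */
        if (len - *pos < 1) return -1;
        n = buf[(*pos)++];
        for (i = 0; i < n; i++)
            if (skip_term(buf, len, pos) != 0) return -1;
        return 0;
    case 108:                               /* LIST_EXT: count, elements, tail */
        if (len - *pos < 4) return -1;
        n = be32(buf + *pos);
        *pos += 4;
        for (i = 0; i < n; i++)
            if (skip_term(buf, len, pos) != 0) return -1;
        return skip_term(buf, len, pos);    /* the list tail, usually NIL_EXT */
    default:
        return -1;                          /* unknown tag, e.g. a stray 0 byte */
    }
}

/* Returns 1 if buf holds exactly one version-tagged term, else 0. */
int term_is_complete(const unsigned char *buf, size_t len)
{
    size_t pos = 1;

    if (len < 1 || buf[0] != 131)           /* 131 is the format version byte */
        return 0;
    return skip_term(buf, len, &pos) == 0 && pos == len;
}
```

A zero-filled tail fails this check as soon as a zero byte is read where a
tag is expected. The recursion depth follows the nesting of the term, so
input from an untrusted peer would also need a depth limit.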

I can't see how you could decode this data from within Erlang.

/Sverker, Erlang/OTP Ericsson


jlist@REDACTED wrote:
> All,
>
> I have a TCP interface between an Erlang system and a C system.  Both
> send/receive marshaled binary Erlang terms and I have not had any problems
> to date.
>
> Today I began doing some more serious testing with larger chunks of binary
> to be decoded in C.
>
> We ran into a bug (it seems) with erl_interface 3.5.5.4 that is causing it
> to segfault during decoding.  The backtrace looks like this:
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 46912496233216 (LWP 4091)]
> 0x00000000004032c4 in _erl_free_term ()
> (gdb) bt
> #0  0x00000000004032c4 in _erl_free_term ()
> #1  0x000000000040496b in erl_decode_it ()
> #2  0x0000000000404937 in erl_decode_it ()
> #3  0x0000000000404c93 in erl_decode_it ()
> #4  0x0000000000404c93 in erl_decode_it ()
> #5  0x0000000000405311 in erl_decode ()
> #6  0x0000000000401b38 in main (argc=1, argv=0x7fff51ec7938) at badtest.c:28
>
> The unfortunate part is that the way this large binary term is generated
> cannot be done in any kind of sample code (it's being pulled off an
> external database).
>
> Testing code:  http://jgray.la/erlang/erl_decode_segfault_test.tar.gz
>
> However, I have created a set of test files in C which recreate the
> segfault.  I stored the binary in a flat file (as 'badbinary') and have a
> testing program which reads it off disk and attempts to decode it.  To
> prove the approach is sane (and that this segfault is related to something
> strange about the decoding of this particular binary, not the size or
> general format of the binary) there is a 'goodbinary' file and testing
> program for that.
>
> To use the test code:
>
> Untar/Ungzip the file.  You may need to edit the Makefile to fix the paths
> to your erl_interface library.
>
> 'make' and then you can:
>
> ./badtest  (this reads 'badbinary' and attempts to decode, causes segfault)
> ./goodtest (this reads 'goodbinary' and successfully decodes it)  [nearly
> identical code to badtest.c but reads different file w/ different size]
>
> Also included is
>
> ./makegoodbin (a simple program that generates a large ETERM in an
> identical format to the badbinary but contains duplicated binary data
> everywhere)
>
>
> Notes:
>
> * The marshaled binary erlang term being sent to C can be successfully
> decoded/unmarshaled from within Erlang without a problem
> * This is reproducible with many different large erlang terms generated
> from our database queries.  'makegoodbin.c' creates a term identical in
> format to those causing problems, however it does not have the random
> distribution of binary sizes and content, and so I'm not able to reproduce
> the problem in this way.
> * The entire system, end-to-end including this decoding step, works
> perfectly in most cases.  However when the data goes into the 100k+ range,
> the segfaults start to happen.  That's why I created the 'makegoodbin'
> which follows the same format.  Unfortunately that works even at sizes
> of >1MB, adding to the confusion of the problem.
>
>
> Any help is appreciated.  Thanks.
>
> I apologize if this is a repost.  I never saw my original post hit the
> list and did not receive any responses.
>
> Jonathan Gray
> Streamy Inc.
>
> _______________________________________________
> erlang-bugs mailing list
> erlang-bugs@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-bugs
>   
