[erlang-bugs] Segfault in erl_interface attempting to decode certain large binaries

Jonathan Gray jlist@REDACTED
Wed Jul 23 17:50:20 CEST 2008


Serge,

Sorry I was not more clear.  This is a port program with two remote systems
communicating via simple TCP packets (size header, buffer).
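
For context, the framing amounts to something like the sketch below. The
4-byte big-endian size header, the function name frame_extract, and the
MAX_FRAME sanity limit are assumptions made for illustration only; the real
header width may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FRAME (16u * 1024u * 1024u)   /* hypothetical upper bound on one packet */

/* Look for one complete (size header, buffer) packet at the start of buf.
 * Returns  1 and sets *payload / *payload_len when a full packet is present,
 *          0 when more bytes must be read from the socket first,
 *         -1 when the declared size exceeds MAX_FRAME. */
int frame_extract(const unsigned char *buf, size_t len,
                  const unsigned char **payload, size_t *payload_len)
{
    assert(buf && payload && payload_len);
    if (len < 4)
        return 0;                          /* header not complete yet */

    /* Assumed header layout: 4 bytes, most significant byte first. */
    uint32_t n = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                 ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];

    if (n > MAX_FRAME)
        return -1;                         /* refuse absurd sizes */
    if (len - 4 < n)
        return 0;                          /* body not complete yet */

    *payload = buf + 4;
    *payload_len = n;
    return 1;
}
```

The caller keeps appending recv() data to its buffer until this returns 1,
then hands the payload to the decoder.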

Thanks for the advice.  The documentation suggests that erl_interface is
actually built on top of the ei code.  I'm not sure if that's correct, because
I see other resources on the web that say they are different (ei is new,
erl_interface is old).  However, the error does occur in a malloc() call, so
perhaps it can be avoided by doing the term decodes myself?

Unfortunately, what I need (and already have working, apart from the segfault
on a large decode) is a generic parser that converts marshaled binary Erlang
terms into a number of different formats (JSON, Python, C struct tree) and
vice versa.  The ETERM representation gave me an easy way to write recursive
converters.  Recreating this generic behavior with ei looks like it would mean
reimplementing parts of erl_interface and losing the ETERM representation, so
I would need to rewrite my converters in a completely different way.

Rewriting it to go directly from binary would be a good thing in the long
run, just not something I was planning on doing at this stage.  Has anyone
ever written a newer generic erl_encode/erl_decode using the latest ei?
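
To make "directly from binary" concrete, here is a minimal sketch of a
recursive, bounds-checked walker over the external term format that emits
JSON-like text.  It uses neither ei nor erl_interface.  The names
(etf_to_json, reader, writer), the depth limit of 256, and the choice to
render a binary as {"binary_bytes":N} are all assumptions for illustration.
It handles only a handful of tags and does not escape atom names, so treat it
as a starting point rather than a replacement for erl_decode().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct { const unsigned char *p, *end; } reader;   /* input cursor  */
typedef struct { char *buf; size_t cap, len; } writer;     /* output buffer */

static int need(reader *r, size_t n) { return (size_t)(r->end - r->p) >= n; }

static uint32_t be(reader *r, int n)    /* big-endian read; caller checked need() */
{
    uint32_t v = 0;
    assert(n >= 1 && n <= 4);
    while (n-- > 0) v = (v << 8) | *r->p++;
    return v;
}

static int put(writer *w, const char *s, size_t n)
{
    if (w->cap - w->len <= n) return -1;        /* keep room for the NUL */
    memcpy(w->buf + w->len, s, n);
    w->len += n;
    w->buf[w->len] = '\0';
    return 0;
}

static int term(reader *r, writer *w, int depth);

static int seq(reader *r, writer *w, uint32_t count, int depth)
{
    if (put(w, "[", 1)) return -1;
    for (uint32_t i = 0; i < count; i++) {
        if (i && put(w, ",", 1)) return -1;
        if (term(r, w, depth + 1)) return -1;
    }
    return put(w, "]", 1);
}

static int term(reader *r, writer *w, int depth)
{
    char tmp[48];
    uint32_t n;
    long long v;
    int tag;

    if (depth > 256 || !need(r, 1)) return -1;  /* assumed recursion limit */
    tag = *r->p++;
    switch (tag) {
    case 97:                                    /* SMALL_INTEGER_EXT */
        if (!need(r, 1)) return -1;
        n = be(r, 1);
        return put(w, tmp, (size_t)snprintf(tmp, sizeof tmp, "%u", (unsigned)n));
    case 98:                                    /* INTEGER_EXT, signed 32-bit */
        if (!need(r, 4)) return -1;
        n = be(r, 4);
        v = n;
        if (n & 0x80000000u) v -= 4294967296LL;
        return put(w, tmp, (size_t)snprintf(tmp, sizeof tmp, "%lld", v));
    case 100:                                   /* ATOM_EXT, 2-byte length */
    case 119:                                   /* SMALL_ATOM_UTF8_EXT, 1-byte length */
        if (!need(r, tag == 100 ? 2 : 1)) return -1;
        n = be(r, tag == 100 ? 2 : 1);
        if (!need(r, n)) return -1;
        /* Atom text is copied unescaped (a stated limitation of this sketch). */
        if (put(w, "\"", 1) || put(w, (const char *)r->p, n) || put(w, "\"", 1))
            return -1;
        r->p += n;
        return 0;
    case 104:                                   /* SMALL_TUPLE_EXT, 1-byte arity */
        if (!need(r, 1)) return -1;
        return seq(r, w, be(r, 1), depth);
    case 106:                                   /* NIL_EXT */
        return put(w, "[]", 2);
    case 108:                                   /* LIST_EXT: count, elements, tail */
        if (!need(r, 4)) return -1;
        n = be(r, 4);
        if (seq(r, w, n, depth)) return -1;
        if (!need(r, 1) || *r->p++ != 106) return -1;   /* proper lists only */
        return 0;
    case 109:                                   /* BINARY_EXT: 4-byte length, bytes */
        if (!need(r, 4)) return -1;
        n = be(r, 4);
        if (!need(r, n)) return -1;
        r->p += n;
        return put(w, tmp, (size_t)snprintf(tmp, sizeof tmp,
                                            "{\"binary_bytes\":%u}", (unsigned)n));
    default:
        return -1;                              /* tag not handled here */
    }
}

/* Returns 0 on success, -1 on malformed, truncated, or unsupported input. */
int etf_to_json(const unsigned char *in, size_t in_len, char *out, size_t out_cap)
{
    reader r = { in, in + in_len };
    writer w = { out, out_cap, 0 };
    if (out_cap == 0 || in_len < 1 || *r.p++ != 131) return -1;  /* version byte */
    out[0] = '\0';
    if (term(&r, &w, 0)) return -1;
    return r.p == r.end ? 0 : -1;
}
```

Fed the bytes of term_to_binary({ok, [1, <<"ab">>]}) as produced with the old
ATOM_EXT tag, this yields ["ok",[1,{"binary_bytes":2}]].  Every length field
is checked against the remaining input before it is used, so a truncated or
corrupt buffer gives -1 instead of a read past the end.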

I will take a closer look at ei and report back.

Thanks for your help.

Jonathan


-----Original Message-----
From: erlang-bugs-bounces@REDACTED [mailto:erlang-bugs-bounces@REDACTED]
On Behalf Of Serge Aleynikov
Sent: Wednesday, July 23, 2008 3:52 AM
To: jlist@REDACTED
Cc: erlang-bugs@REDACTED
Subject: Re: [erlang-bugs] Segfault in erl_interface attempting to decode
certain large binaries

Jonathan,

It's not clear from your description if you're using erl_interface to 
build a driver or a port program.  While I don't have an answer to your 
direct question, perhaps if you are writing a C port program you can use 
ei instead of erl_interface, and if you are writing a driver, you can 
use driver_output_term() / driver_send_term() and corresponding 
ErlDrvTermData* structures to pass data to/from the emulator (*). This 
is the fastest way to communicate with the emulator's port owner process.

(*) See: http://www.erlang.org/doc/man/erl_driver.html
and also the source code of inet_drv.c in the distribution for various 
LOAD_*() macros that simplify working with ErlDrvTermData structures.

Serge




jlist@REDACTED wrote:
> All,
> 
> I have a TCP interface between an Erlang system and a C system.  Both
> send/receive marshaled binary Erlang terms and I have not had any problems
> to date.
> 
> Today I began doing some more serious testing with larger chunks of binary
> to be decoded in C.
> 
> We ran into a bug (it seems) with erl_interface 3.5.5.4 that is causing it
> to segfault during decoding.  The backtrace looks like this:
> 
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 46912496233216 (LWP 4091)]
> 0x00000000004032c4 in _erl_free_term ()
> (gdb) bt
> #0  0x00000000004032c4 in _erl_free_term ()
> #1  0x000000000040496b in erl_decode_it ()
> #2  0x0000000000404937 in erl_decode_it ()
> #3  0x0000000000404c93 in erl_decode_it ()
> #4  0x0000000000404c93 in erl_decode_it ()
> #5  0x0000000000405311 in erl_decode ()
> #6  0x0000000000401b38 in main (argc=1, argv=0x7fff51ec7938) at badtest.c:28
> 
> The unfortunate part is that the way this large binary term is generated
> cannot be done in any kind of sample code (it's being pulled off an
> external database).
> 
> Testing code:  http://jgray.la/erlang/erl_decode_segfault_test.tar.gz
> 
> However, I have created a set of test files in C which recreate the
> segfault.  I stored the binary in a flat file (as 'badbinary') and have a
> testing program which reads it off disk and attempts to decode it.  To
> prove the approach is sane (and that this segfault is related to something
> strange about the decoding of this particular binary, not the size or
> general format of the binary) there is a 'goodbinary' file and testing
> program for that.
> 
> To use the test code:
> 
> Untar/Ungzip the file.  You may need to edit the Makefile to fix the paths
> to your erl_interface library.
> 
> 'make' and then you can:
> 
> ./badtest  (this reads 'badbinary' and attempts to decode, causes segfault)
> ./goodtest (this reads 'goodbinary' and successfully decodes it)  [nearly
> identical code to badtest.c but reads different file w/ different size]
> 
> Also included is
> 
> ./makegoodbin (a simple program that generates a large ETERM in an
> identical format to the badbinary but contains duplicated binary data
> everywhere)
> 
> 
> Notes:
> 
> * The marshaled binary erlang term being sent to C can be successfully
> decoded/unmarshaled from within Erlang without a problem
> * This is reproducible with many different large erlang terms generated
> from our database queries.  'makegoodbin.c' creates a term identical in
> format to those causing problems, however it does not have the random
> distribution of binary sizes and content, and so I'm not able to reproduce
> the problem in this way.
> * The entire system, end-to-end including this decoding step, works
> perfectly in most cases.  However when the data goes into the 100k+ range,
> the segfaults start to happen.  That's why I created the 'makegoodbin'
> which follows the same format.  Unfortunately that works even at sizes of
> >1MB adding to the confusion of the problem.
> 
> 
> Any help is appreciated.  Thanks.
> 
> I apologize if this is a repost.  I never saw my original post hit the
> list and did not receive any responses.
> 
> Jonathan Gray
> Streamy Inc.
> 
> _______________________________________________
> erlang-bugs mailing list
> erlang-bugs@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-bugs
> 
