[erlang-bugs] Segfault in erl_interface attempting to decode certain large binaries
Thu Jul 24 03:33:40 CEST 2008
The tutorial was kept simplistic on purpose.
Actually, using ei for free-form terms is quite simple; it takes less than
a day or so to get used to all the flavors of ei_* functions.
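
To make the "generic decoder" idea from this thread concrete, here is a minimal sketch of the type-dispatch pattern: look at the type tag of the next term, then recurse into its children. It deliberately does not call ei or erl_interface. It walks the external term format by hand so that it compiles on its own, and it only handles a small subset of tags (small integer, integer, atom, small tuple, nil, list, binary). The cursor type, the function names, and the depth limit of 64 are hypothetical choices made for illustration. Newer Erlang releases encode atoms with other tags, which this sketch rejects as unsupported.

```c
#include <stddef.h>
#include <stdint.h>

/* Tag bytes from the Erlang external term format (only a subset is handled). */
enum {
    ETF_VERSION       = 131,
    ETF_SMALL_INTEGER = 97,   /* 1-byte unsigned value */
    ETF_INTEGER       = 98,   /* 4-byte signed big-endian value */
    ETF_ATOM          = 100,  /* 2-byte length, then name */
    ETF_SMALL_TUPLE   = 104,  /* 1-byte arity, then elements */
    ETF_NIL           = 106,  /* empty list */
    ETF_LIST          = 108,  /* 4-byte length, elements, then tail */
    ETF_BINARY        = 109   /* 4-byte length, then bytes */
};

/* Hypothetical cursor type: every read is checked against len. */
typedef struct {
    const unsigned char *buf;
    size_t len;
    size_t pos;               /* invariant: pos <= len */
} cursor;

/* True if at least n more bytes are available. */
static int need(const cursor *c, size_t n) {
    return n <= c->len - c->pos;
}

/* Read a 4-byte big-endian unsigned value and advance the cursor. */
static int read_u32(cursor *c, uint32_t *out) {
    const unsigned char *p;
    if (!need(c, 4)) return -1;
    p = c->buf + c->pos;
    *out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    c->pos += 4;
    return 0;
}

/* Walk one term, counting binaries and their total payload size.
   Returns 0 on success, -1 on truncated, too deep, or unsupported input. */
static int walk_term(cursor *c, int depth, size_t *n_bin, size_t *bin_bytes) {
    uint32_t n, i;
    if (depth > 64 || !need(c, 1)) return -1;   /* 64 is an arbitrary limit */
    switch (c->buf[c->pos++]) {
    case ETF_SMALL_INTEGER:
        if (!need(c, 1)) return -1;
        c->pos += 1;
        return 0;
    case ETF_INTEGER:
        return read_u32(c, &n);                 /* value itself is ignored */
    case ETF_ATOM:
        if (!need(c, 2)) return -1;
        n = ((uint32_t)c->buf[c->pos] << 8) | c->buf[c->pos + 1];
        c->pos += 2;
        if (!need(c, n)) return -1;
        c->pos += n;
        return 0;
    case ETF_SMALL_TUPLE:
        if (!need(c, 1)) return -1;
        n = c->buf[c->pos++];
        for (i = 0; i < n; i++)
            if (walk_term(c, depth + 1, n_bin, bin_bytes) != 0) return -1;
        return 0;
    case ETF_NIL:
        return 0;
    case ETF_LIST:
        if (read_u32(c, &n) != 0) return -1;
        for (i = 0; i < n; i++)
            if (walk_term(c, depth + 1, n_bin, bin_bytes) != 0) return -1;
        return walk_term(c, depth + 1, n_bin, bin_bytes);   /* list tail */
    case ETF_BINARY:
        if (read_u32(c, &n) != 0 || !need(c, n)) return -1;
        c->pos += n;
        (*n_bin)++;
        *bin_bytes += n;
        return 0;
    default:
        return -1;                              /* tag not covered here */
    }
}

/* Entry point: checks the leading version byte, walks exactly one term,
   and requires that the whole buffer was consumed. */
int walk_external(const unsigned char *buf, size_t len,
                  size_t *n_bin, size_t *bin_bytes) {
    cursor c;
    c.buf = buf; c.len = len; c.pos = 1;
    *n_bin = 0;
    *bin_bytes = 0;
    if (len < 1 || buf[0] != ETF_VERSION) return -1;
    if (walk_term(&c, 0, n_bin, bin_bytes) != 0) return -1;
    return c.pos == len ? 0 : -1;
}
```

A converter to JSON or a C struct tree would follow the same shape, with each case emitting output instead of just advancing the cursor. Checking every length field against the remaining buffer before using it is what keeps a malformed or truncated term from being read past its end.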
Jonathan Gray wrote:
> Yeah I saw that tutorial. Unfortunately it's based around a fixed format of
> terms, thus doesn't make use of ei_get_type(), which is central to any kind
> of generic decoder.
> Regardless I did more extensive testing and it looks like I will be
> implementing my own generic Erlang term decoder using ei. Will post results
> when I'm done.
> -----Original Message-----
> From: [mailto:]
> On Behalf Of Serge Aleynikov
> Sent: Wednesday, July 23, 2008 5:51 PM
> To: Jonathan Gray
> Subject: Re: [erlang-bugs] Segfault in erl_interface attempting to decode
> certain large binaries
> You can use the following tutorial to help you get started with ei:
> I haven't seen any newer erl_encode/erl_decode implementations. Frankly
> I've started doing that myself in C++ and got the recursive encoding
> part, but haven't done the decoder and had to put this project on hold
> due to other priorities.
> Jonathan Gray wrote:
>> Sorry I was not more clear. This is a port program with two remote
>> systems communicating via simple TCP packets (size header, buffer).
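
The "(size header, buffer)" framing mentioned above can be sketched as follows. The thread does not say how wide the size header is, so this assumes a 4-byte big-endian length, and the names next_frame and MAX_PAYLOAD are hypothetical. The function only inspects a receive buffer and reports whether a complete packet is present; it does no socket I/O.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: each packet is a 4-byte big-endian length followed by that
   many payload bytes. MAX_PAYLOAD is a hypothetical sanity limit. */
#define MAX_PAYLOAD (64u * 1024u * 1024u)

/* Result codes for next_frame(). */
enum { FRAME_OK = 0, FRAME_INCOMPLETE = 1, FRAME_TOO_LARGE = -1 };

/* Look for one complete frame at the start of buf[0..len).
   On FRAME_OK, *payload points into buf and *consumed is the number of
   bytes (header + payload) the caller should drop from its receive buffer. */
int next_frame(const unsigned char *buf, size_t len,
               const unsigned char **payload, size_t *payload_len,
               size_t *consumed)
{
    uint32_t n;
    if (len < 4) return FRAME_INCOMPLETE;          /* header not complete yet */
    n = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
        ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    if (n > MAX_PAYLOAD) return FRAME_TOO_LARGE;   /* refuse absurd sizes */
    if (len - 4 < n) return FRAME_INCOMPLETE;      /* payload not complete yet */
    *payload = buf + 4;
    *payload_len = n;
    *consumed = 4 + (size_t)n;
    return FRAME_OK;
}
```

A caller would keep appending received bytes to its buffer and call next_frame() until it returns FRAME_OK, and only then hand the payload to the term decoder.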
>> Thanks for the advice. The documentation suggests that erl_interface is
>> actually making use of the ei code. I'm not sure if that's correct, since
>> I see other resources on the web that say they are different (ei is new,
>> erl_interface is old). However the error does exist in a malloc() call so
>> perhaps that can be avoided by doing the term decodes myself?
>> Unfortunately what I need (already have working but segfaulting on a large
>> decode) is a generic parser that converts ErlBinary into a number of
>> different formats (JSON, Python, C struct tree) and vice versa. The ETERM
>> representation gave me an easy way to make recursive converters. Implementing
>> this generic behavior using ei seems like I'll be recreating functions of
>> erl_interface and losing the ETERM representation so I'll need to rewrite
>> converters in a completely different way.
>> Rewriting it to go directly from binary would be a good thing in the long
>> run, just not something I was planning on doing at this stage. Has anyone
>> ever written a newer generic erl_encode/erl_decode using the latest ei?
>> I will take a closer look at ei and report back.
>> Thanks for your help.
>> -----Original Message-----
>> On Behalf Of Serge Aleynikov
>> Sent: Wednesday, July 23, 2008 3:52 AM
>> Subject: Re: [erlang-bugs] Segfault in erl_interface attempting to decode
>> certain large binaries
>> It's not clear from your description if you're using erl_interface to
>> build a driver or a port program. While I don't have an answer to your
>> direct question, perhaps if you are writing a C port program you can use
>> ei instead of erl_interface, and if you are writing a driver, you can
>> use driver_output_term() / driver_send_term() and corresponding
>> ErlDrvTermData* structures to pass data to/from the emulator (*). This
>> is the fastest way to communicate with the emulator's port owner process.
>> (*) See: http://www.erlang.org/doc/man/erl_driver.html
>> and also the source code of inet_drv.c in the distribution for various
>> LOAD_*() macros that simplify working with ErlDrvTermData structures.
>>> I have a TCP interface between an Erlang system and a C system. Both
>>> send/receive marshaled binary Erlang terms and I have not had any
>>> problems to date.
>>> Today I began doing some more serious testing with larger chunks of
>>> data to be decoded in C.
>>> We ran into a bug (it seems) with erl_interface 220.127.116.11 that is causing
>>> it to segfault during decoding. The backtrace looks like this:
>>> Program received signal SIGSEGV, Segmentation fault.
>>> [Switching to Thread 46912496233216 (LWP 4091)]
>>> 0x00000000004032c4 in _erl_free_term ()
>>> (gdb) bt
>>> #0 0x00000000004032c4 in _erl_free_term ()
>>> #1 0x000000000040496b in erl_decode_it ()
>>> #2 0x0000000000404937 in erl_decode_it ()
>>> #3 0x0000000000404c93 in erl_decode_it ()
>>> #4 0x0000000000404c93 in erl_decode_it ()
>>> #5 0x0000000000405311 in erl_decode ()
>>> #6 0x0000000000401b38 in main (argc=1, argv=0x7fff51ec7938) at
>>> The unfortunate part is that the way this large binary term is generated
>>> cannot be done in any kind of sample code (it's being pulled off an
>>> Testing code: http://jgray.la/erlang/erl_decode_segfault_test.tar.gz
>>> However, I have created a set of test files in C which recreate the
>>> segfault. I stored the binary in a flat file (as 'badbinary') and have a
>>> testing program which reads it off disk and attempts to decode it. To
>>> show the approach is sane (and that this segfault is related to something
>>> specific about the decoding of this particular binary, not the size or general
>>> format of the binary) there is a 'goodbinary' file and testing program for that.
>>> To use the test code:
>>> Untar/Ungzip the file. You may need to edit the Makefile to fix the
>>> path to your erl_interface library.
>>> Run 'make' and then you can:
>>> ./badtest (this reads 'badbinary' and attempts to decode, causes a segfault)
>>> ./goodtest (this reads 'goodbinary' and successfully decodes it) [nearly
>>> identical code to badtest.c but reads different file w/ different size]
>>> Also included is
>>> ./makegoodbin (a simple program that generates a large ETERM in an
>>> identical format to the badbinary but contains duplicated binary data everywhere)
>>> * The marshaled binary erlang term being sent to C can be successfully
>>> decoded/unmarshaled from within Erlang without a problem
>>> * This is reproducible with many different large erlang terms generated
>>> by our database queries. 'makegoodbin.c' creates a term identical in format
>>> to those causing problems, however it does not have the random distribution
>>> of binary sizes and content, and so I'm not able to reproduce the problem in
>>> this way.
>>> * The entire system, end-to-end including this decoding step, works
>>> perfectly in most cases. However when the data goes into the 100k+
>>> range the segfaults start to happen. That's why I created the 'makegoodbin'
>>> that follows the same format. Unfortunately that works even at sizes of >1MB,
>>> adding to the confusion of the problem.
>>> Any help is appreciated. Thanks.
>>> I apologize if this is a repost. I never saw my original post hit the
>>> list and did not receive any responses.
>>> Jonathan Gray
>>> Streamy Inc.
>>> erlang-bugs mailing list