[erlang-questions] inet_gethost fails with enomem on AVR32 (R14B03)

Winston Smith smith.winston.101@REDACTED
Mon Oct 31 00:27:42 CET 2011


I solved my issues; the solutions are below for posterity:

On Thu, Aug 18, 2011 at 12:22 AM, Winston Smith
<smith.winston.101@REDACTED> wrote:
> I am trying to update my AVR32 system (an Atmel NGW100 development
> board) to use OTP R14B03 and a reltool (via "rebar generate") based
> release -- previous releases were hand-built.  However, I'm having
> problems getting the node to start up.  I have built R14B03 using
> Atmel's buildroot-v3.0.0 for NGW100 and aside from a couple of issues
> (avr32-linux-gcc internal errors due to -O3, fixed by specifying -O
> and a missing implementation of finite() in uclibc, fixed by passing
> -Dfinite=__finite in CFLAGS), I have been able to fire up the erlang
> prompt via the erl command.  I am using the
> erl-xcomp-avr32-atmel-linux-gnu.conf xcomp file, which I supplied a
> while back and which is included in the OTP releases.

I'll submit a diff to erlang-patches to update
erl-xcomp-avr32-atmel-linux-gnu.conf for buildroot-v3.0.0.

> Using the "reltool via rebar generate" to create an AVR32 based
> release of my app, I first ran into issues with the following error on
> the NGW100 platform:
>
> # bin/mynode console
> Exec: /home/avr32/mynode/erts-5.8.4/bin/erlexec -boot
> /home/avr32/mynode/releases/1/mynode -mode embedded -config
> /home/avr32/mynode/etc/app.config -args_file
> /home/avr32/mynode/etc/vm.args -- console
> Root: /home/avr32/mynode
> (no error logger present) error: "Error in process <0.2.0> with exit
> value: {badarg,[{erl_prim_loader,check_file_result,3},{init,patch_dir,2},{init,'-patch_path/2-lc$^0/1-0-',2},{init,eval_script,8},{init,do_boot,3}]}
> "
> {"init terminating in
> do_boot",{badarg,[{erl_prim_loader,check_file_result,3},{init,patch_dir,2},{init,'-patch_path/2-lc$^0/1-0-',2},{init,eval_script,8},{init,do_boot,3}]}}
>
> Crash dump was written to: erl_crash.dump
> init terminating in do_boot ()
>
>
>
> I couldn't figure out this error (or the "no error logger present"
> since sasl is included in the release), but I was able to work around
> this by commenting out "+K true" and "+A 5" in the vm.args file, as
> follows:
>
> ## Enable kernel poll and a few async threads
> ## +K true
> ## +A 5

It seems that it may be necessary to add the following to
erl_xcomp_configure_flags in the xcomp file:

--disable-threads --disable-smp

The AVR32 is single core, so I don't know whether --disable-smp has
any effect here, or whether --disable-threads alone is what matters.
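
For reference, a minimal sketch of how that setting might look in the
xcomp file, which is read as shell variable assignments.  Only the two
additions are shown; any flags the file already sets would need to be
kept alongside them, and the configure invocation in the comment
assumes the file lives under xcomp/ in the OTP source tree.

```shell
# Fragment of erl-xcomp-avr32-atmel-linux-gnu.conf (sketch only).
# Assumption: no other flags are needed here; if the file already sets
# erl_xcomp_configure_flags, append these two to the existing value.
erl_xcomp_configure_flags="--disable-threads --disable-smp"

# The file is then passed to configure from the top of the OTP source
# tree, for example (path assumed):
#   ./otp_build configure --xcomp-conf=xcomp/erl-xcomp-avr32-atmel-linux-gnu.conf
```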


> So now, I get a bit further, but it fails trying to run inet_gethost
> with an enomem error which I'm now stuck at:
>
>
> # bin/mynode console
> Exec: /home/avr32/mynode/erts-5.8.4/bin/erlexec -boot
> /home/avr32/mynode/releases/1/mynode -mode embedded -config
> /home/avr32/mynode/etc/app.config -args_file
> /home/avr32/mynode/etc/vm.args -- console
> Root: /home/avr32/mynode
> {error_logger,{{2011,8,17},{23,43,58}},"~s~n",["Error in process
> <0.16.0> with exit value:
> {enomem,[{erlang,open_port,[{spawn,\"inet_gethost 4
> \"},[{packet,4},eof,binary]]},{inet_gethost_native,do_open_port,2},{inet_gethost_native,run_once,0}]}\n"]}
> Segmentation fault
>
>
>
> A wild guess would be that some kind of problem with the
> packet sizes being passed into inet_gethost from
> inet_gethost_native.erl is causing a memory allocation error ... I
> suppose it's also possible that my AVR32 system just doesn't have
> enough memory to run everything (the NGW100 seems to have about 30MB
> of memory available when running Linux), but I'm not sure this is the
> case; I think there's an xcomp bug here.

Once I updated to R14B04 and made the above changes to
erl_xcomp_configure_flags, I no longer saw this issue.  I don't know
whether something was fixed in R14B04, whether it was some kind of
thread/SMP issue, or, more likely, whether there was a memory shortage
(see next paragraph) that these changes happened to keep from showing
up in the inet_gethost phase.

At this point, I found that the node would die with a segfault in
beam while trying to reference memory at address 0 (a failed
malloc()?).  I used strace on the board to see what was going on when
it faulted, and found that the default reltool config packages up a
fair few apps in the lib directory, so it seemed that I was simply
running out of memory.  The NGW100 boards (mk1 anyway) only have about
29MB free after booting Linux.
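
The checks described above can be sketched roughly as follows.  The
exact commands used were not recorded here, so the strace options, the
output path and the helper name mem_free_kb are assumptions for
illustration only.

```shell
# Print the MemFree value (in kB) from a /proc/meminfo style file.
# mem_free_kb is a hypothetical helper; the file argument defaults to
# /proc/meminfo so it can be run directly on the board.
mem_free_kb() {
    awk '/^MemFree:/ { print $2 }' "${1:-/proc/meminfo}"
}

# On the board (assumed invocations, shown as comments so nothing runs here):
#   mem_free_kb                          # free memory before starting the node
#   strace -f -o /tmp/mynode.strace bin/mynode console
#   tail -n 40 /tmp/mynode.strace        # system calls just before the fault
```

Failing mmap or brk calls near the end of such a trace would be
consistent with the out-of-memory explanation.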

The following post helped me cut the memory footprint:

http://stackoverflow.com/questions/7419909/reducing-size-of-rebar-generated-upgrade-packages

Using this I was able to significantly reduce the number of apps
included in the reltool generated node.
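
One rough way to see which applications a generated node carries is to
list the release's lib directory by size.  This is only a sketch:
list_apps_by_size is a hypothetical helper, the /home/avr32/mynode path
is taken from the console output quoted earlier, and on-disk size is
just a proxy for the memory the loaded code will need.

```shell
# List each application directory under <release>/lib, smallest first,
# with its on-disk size in kB.  Hypothetical helper, not part of rebar.
list_apps_by_size() {
    du -sk "$1"/lib/* | sort -n
}

# Example (path assumed from the console output above):
#   list_apps_by_size /home/avr32/mynode
```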

With all of this, the node starts and runs, including the port driver!!!


> I had to hack rebar to add a new reltool.config setting "root_dir"
> which is passed into reltool:eval_target_spec() as the 2nd argument,
> this lets me point "rebar generate" at the cross compiled OTP instance
> for AVR32.  I'm still working on adding something to allow rebar to
> cross compile the port driver!

I created a fork of rebar on github and submitted a pull request to
basho (see https://github.com/basho/rebar/pull/145) to get these
changes into rebar itself; hopefully that will happen soon.
Additionally, in order to get the port driver compiled properly for
the AVR32 platform, I had to override the following environment
variables:

CC, CXX, LD, RANLIB, AR, ERL_CFLAGS and ERL_EI_LIBDIR

The last two are so that the port driver compiles and links against
the cross compiled erts and erl_interface rather than the
headers/libs for the host system.
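
As an illustration, the overrides might look like the following before
running rebar.  Everything path-like here is an assumption: the
avr32-linux- tool prefix is taken from the compiler name mentioned
earlier, and XC_OTP and the directories under it are hypothetical
placeholders for wherever the cross compiled OTP lives.

```shell
# Hypothetical location of the cross compiled OTP tree.
XC_OTP="${XC_OTP:-/opt/otp-avr32}"

# Cross toolchain (prefix assumed from avr32-linux-gcc).
export CC=avr32-linux-gcc
export CXX=avr32-linux-g++
export LD=avr32-linux-ld
export RANLIB=avr32-linux-ranlib
export AR=avr32-linux-ar

# Point the port driver build at the target's headers and libs instead
# of the host's.  The directory layout below is an assumption.
export ERL_CFLAGS="-I$XC_OTP/usr/include -I$XC_OTP/lib/erl_interface/include"
export ERL_EI_LIBDIR="$XC_OTP/lib/erl_interface/lib"

# Then build as usual, e.g.:
#   rebar compile
```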

Hope this helps someone!


