We're running a cluster of 11 Erlang nodes on R13B03 under 64-bit Ubuntu 10.04 LTS.<div>The application is installed in a subdirectory of a particular user's home directory. We start it with nohup, with stdout and stderr redirected to a log file.</div>
<div>We're seeing a mysterious crash in some cases (see below).</div><div><br></div><div>If we're logged in as the appropriate user, everything starts fine.</div><div>If we're logged in as root and run "sudo -u username &lt;start-command&gt;", beam.smp crashes with "init terminating in do_boot".</div>
<div>Why is this? I can't figure out why it's crashing like this.</div><div><br></div><div><div>(no error logger present) error: "Error in process &lt;0.2.0&gt; with exit value: {badarg,[{erl_prim_loader,check_file_result,3},{erl_prim_loader,check_file_result,3},{init,get_boot,1},{init,get_boot,2},{init,do_boot,3}]}\n"</div>
<div>init terminating in do_boot ()</div></div><div><br></div><div><br></div><div>Googling this error turns up only two other results: an IRC chat with Basho that got no answer, and a message on this mailing list that also got no answer.</div>
<div>Reading the code, it seems as if the only way this will happen is if there is an error, but the "Func" argument or "Reason" argument to check_file_result is somehow not an atom. I don't, however, see how that could be happening.</div>
<div>The other question is why the boot script would not be visible/available at that point. The particular user has no .bashrc or other such init script.</div><div><br></div><div><br></div><div>The actual start script (with secrets sanitized) is:</div>
<div><br></div><div><div> nohup erl +K true -noshell \</div><div> -env ERL_MAX_PORTS 200500 \</div><div> +W w +P 1001001 \</div><div> -boot start_sasl \</div><div> -cluster leader@domain \</div>
<div> -kernel inet_dist_listen_min 50000 inet_dist_listen_max 50009 \</div><div> -name "$node@$hostname" -setcookie "some secret" \</div><div> -callout svc_url '"<a href="http://service/%%.php">http://service/%%.php</a>"' \</div>
<div> -s launcher -extra $role \</div><div> &lt;/dev/null &gt; "$logdir/$node-logfile.txt" 2&gt;&amp;1 &amp;</div></div><div><br></div><div><br></div><div><br></div><div>Here's the crash-dump analysis (not very useful):</div>
<div><br></div><div><div>Slogan <span class="Apple-tab-span" style="white-space:pre"> </span>init terminating in do_boot ()</div><div>Node name <span class="Apple-tab-span" style="white-space:pre"> </span>'<a href="mailto:mqnode@AF001603.prod.imvu.com">mqnode@AF001603.prod.imvu.com</a>'</div>
<div>Crashdump created on <span class="Apple-tab-span" style="white-space:pre"> </span>Wed Jul 6 10:23:58 2011</div><div>System version <span class="Apple-tab-span" style="white-space:pre"> </span>Erlang R13B03 (erts-5.7.4) [source] [64-bit] [smp:8:8] [rq:8] [async-threads:0] [hipe] [kernel-poll:true]</div>
<div>Compiled <span class="Apple-tab-span" style="white-space:pre"> </span>Fri Apr 9 12:29:55 2010</div><div>Memory allocated <span class="Apple-tab-span" style="white-space:pre"> </span>141146456 bytes</div><div>Atoms <span class="Apple-tab-span" style="white-space:pre"> </span>5690</div>
<div>Processes <span class="Apple-tab-span" style="white-space:pre"> </span>35</div><div>ETS tables <span class="Apple-tab-span" style="white-space:pre"> </span>17</div><div>Funs <span class="Apple-tab-span" style="white-space:pre"> </span>357</div>
</div>
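<div><br></div><div>Since the only difference between the working and failing cases is how the start command is invoked, one way to narrow this down would be to compare what the process can actually see in each case. The following is only a minimal sketch, not a diagnosis: check_start_env is a hypothetical helper that is not part of our start script, and the directory and boot-file path passed to it are assumptions that would have to be filled in for the real installation.</div>

```shell
#!/bin/sh
# Hypothetical helper (not part of the original start script).
# Reports whether the current user can enter and read a directory,
# and whether it can read a given file. Returns non-zero if either fails.
check_start_env() {
    dir="$1"    # directory erl would be started from
    file="$2"   # file erl must read at boot, e.g. a .boot file (path is an assumption)
    rc=0

    # A directory is usable as a working directory only if it can be
    # both searched (x) and listed (r) by the current user.
    if [ -r "$dir" ] && [ -x "$dir" ]; then
        echo "cwd ok: $dir"
    else
        echo "cwd NOT accessible: $dir"
        rc=1
    fi

    # The boot file itself must be readable by the current user.
    if [ -r "$file" ]; then
        echo "boot file ok: $file"
    else
        echo "boot file NOT readable: $file"
        rc=1
    fi

    return $rc
}
```

<div>With the function saved in a hypothetical check_env.sh that ends by calling check_start_env "$PWD" with the real path of start_sasl.boot, it could be run once while logged in as the user and once from root's shell via "sudo -u username sh check_env.sh". Any line that differs between the two runs points at something the sudo invocation changes.</div>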