MontaVista system call error handling patch

Steve Vinoski vinoski@REDACTED
Thu May 13 22:14:49 CEST 2010


We've recently hit a 100% repeatable case of beam core dumps with
R13B02-1 and R13B04 on a MontaVista Linux system running on a Cavium
Octeon processor. Strictly speaking this is not a bug in the Erlang
VM. Certain socket-related calls in this version of MontaVista Linux,
which is based on a 2.6.21 Linux kernel, indicate failure by returning
negative numbers with large absolute values, such as -0x40000, rather
than returning -1 as they should. The Erlang TCP code expects system
calls to return -1 on error and compares directly against that value,
so it treats these negative return values as success and adds them to
internal pointers as bytes written, offsets, etc. This corrupts the
pointers and causes beam to dump core.
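
To make the failure mode concrete, here is a minimal sketch in C. It
is not taken from inet_drv.c: the helper name naive_advance and the
use of a plain long as a stand-in for an internal pointer or offset
are assumptions made purely for illustration.

```c
#include <sys/types.h>  /* ssize_t */

/*
 * Hypothetical helper (not from inet_drv.c): advance a write offset by
 * the result of a send()-style call, treating only -1 as failure.
 * Returns the new offset, or the old one if a failure was detected.
 */
static long naive_advance(long offset, ssize_t rc)
{
    if (rc == -1) {
        return offset;         /* failure detected, offset untouched */
    }
    /* Every other value is assumed to be a byte count, including
     * large negative values such as -0x40000. */
    return offset + (long)rc;
}
```

Called with an offset of 100 and a return value of -0x40000, this
helper yields a negative offset, which stands in for the corrupted
pointers described above.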

This patch introduces a portability macro to test for system call
failure within inet_drv.c:

git fetch git://github.com/vinoski/otp.git socket_error_portability
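
The general idea of such a macro can be sketched as follows. This is
an illustration only, not the code in the branch: the macro name
SYSCALL_FAILED and the helper checked_advance are hypothetical, and
the sketch assumes that on this platform any negative return value
from the affected calls signals failure.

```c
#include <sys/types.h>  /* ssize_t */

/* Hypothetical name: treat any negative return value as a failure
 * instead of comparing against -1 exactly. */
#define SYSCALL_FAILED(rc) ((rc) < 0)

/*
 * Hypothetical helper: advance an offset only when the call succeeded.
 * Returns the new offset, or the old one if the call failed.
 */
static long checked_advance(long offset, ssize_t rc)
{
    if (SYSCALL_FAILED(rc)) {
        return offset;         /* covers -1 and values like -0x40000 */
    }
    return offset + (long)rc;  /* rc is a non-negative byte count here */
}
```

Keeping the test in one macro means each call site states its intent
("did this call fail?") and the platform-specific definition of
failure lives in a single place.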

--steve

