[erlang-questions] segfaults in cowboy rest server
Thu Apr 9 16:22:57 CEST 2015
i've got a problem i'm trying to solve wrt controlling memory
consumption in a cloud environment. i've got a server that
receives log data from security appliances and stores it in a
mariadb database. logs are sent to us via RESTful api calls,
with batches of logs embedded in the json body of a POST
call. they can get rather large, and we get a lot of them,
at a high rate.
when production load increased beyond what was anticipated
(doesn't it always?) we began having failures, with the server
disappearing without a trace. in some cases oom-killer killed
it, in others it would fail trying to allocate memory. we only
saw the latter by running in the erlang shell and waiting until
it died, at which point we saw a terse error message.
to prevent this, i added a check in service_available() that
rejects the request if erlang:memory(total) + content-length exceeds
some threshold. also, having read the recent threads about garbage
collecting binaries, i added a timer that fires every 30 seconds and
forces gc on all processes if memory usage is too high.
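
a minimal sketch of the two safeguards described above, for illustration only. the module name mem_guard, the function names, and the idea of passing the thresholds in as arguments are all assumptions, not the code running on the server. content-length is taken as a plain integer so the sketch does not depend on any particular cowboy version.

```erlang
-module(mem_guard).
-export([admit/2, maybe_gc/1, start_sweeper/2]).

%% Hypothetical helper: true if a request body of ContentLength bytes
%% can be accepted without pushing total VM memory past Threshold bytes.
admit(ContentLength, Threshold)
  when is_integer(ContentLength), ContentLength >= 0 ->
    erlang:memory(total) + ContentLength =< Threshold.

%% Force a garbage collection of every process if total memory is
%% above HighWater bytes. Returns swept if a sweep ran, ok otherwise.
maybe_gc(HighWater) ->
    case erlang:memory(total) > HighWater of
        true ->
            %% garbage_collect/1 returns false for a pid that has
            %% already exited, so a dying process does not crash the sweep.
            lists:foreach(fun erlang:garbage_collect/1, erlang:processes()),
            swept;
        false ->
            ok
    end.

%% Spawn a process that calls maybe_gc/1 every IntervalMs milliseconds
%% until it receives the atom stop.
start_sweeper(HighWater, IntervalMs) ->
    spawn(fun() -> sweep_loop(HighWater, IntervalMs) end).

sweep_loop(HighWater, IntervalMs) ->
    receive
        stop -> ok
    after IntervalMs ->
        maybe_gc(HighWater),
        sweep_loop(HighWater, IntervalMs)
    end.
```

in a cowboy rest handler, service_available/2 would then return {mem_guard:admit(Len, Threshold), Req, State}, where Len is the request's content-length (how it is read depends on the cowboy version), and the sweeper would be started once with something like mem_guard:start_sweeper(HighWater, 30000).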
this seems to work pretty well, except that after a few days
of running, we get hard crashes, with segfaults showing up in the system log:
kernel: beam: segfault at 7f09a004040c ip 000000000049e209 sp 00007fff860d32b0 error 4 in beam[400000+2ce000]
kernel: beam: segfault at 7fce288829bc ip 000000000049e209 sp 00007fffa0d2d7a0 error 4 in beam[400000+2ce000]
i've been using erlang for 15 years, and have never seen a segfault.
we've recently updated from r15b02 to r17.4, and we've also
switched from webmachine to cowboy. i don't know if either of
those things is relevant. i'm kind of at a loss as to how to diagnose
or deal with this.
any advice would be greatly appreciated.
Lead Member of Technical Staff
AT&T Chief Security Office (CSO)