high-volume logging via a gen_server

Dan Kelley djk121@REDACTED
Mon Oct 4 13:57:09 CEST 2010


I'm relatively new to erlang.  I've been working on a simple router which
connects to external servers which speak different protocols.  When the
router gets a packet, it decodes it, figures out the destination channel,
encapsulates it in that channel's required format, and sends it along.  This
all works fine.

I have a pretty standard-looking supervision tree setup for the processes
which make up the router app.  One of the processes is a gen_server which
handles the logging for the overall app.  It provides a simple interface
(log:info(), log:debug(), etc.) to the app.  When clients use that
interface, they end up using gen_server:cast to send the logging process a
log line.  I've moved as much of the CPU load as I can out to the functional
interface so it's paid by the callers and not the logging process.
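
To make that concrete, the client side of the interface looks roughly
like this (a simplified sketch, not the real module - the actual code
passes a few more fields, and log_server is just a stand-in for the
logger's registered name):

-module(log).
-export([info/2, debug/2]).

info(Format, Args)  -> send("INFO", Format, Args).
debug(Format, Args) -> send("DEBUG", Format, Args).

%% The expensive part - formatting the caller's data into an iolist -
%% is done in the caller's process; only the finished message is cast
%% to the logging gen_server.
send(LevelStr, Format, Args) ->
    Msg = io_lib:format(Format, Args),
    gen_server:cast(log_server, {log, self(), LevelStr, Msg}).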

Inside the logging gen_server, I'm not doing anything smarter than using
io:format to write to disk:

logit(Timestamp, FromPid, RequestId, LevelStr, Category, MessageKey,
      LogString, State) ->
    io:format(State#state.log_device,
              "~s,~s,rock,~s,~p,~s,~s,~s,~s,~s~n",
              [Timestamp,
               State#state.unix_pid,
               State#state.hostname,
               FromPid,
               LevelStr,
               RequestId,
               Category,
               MessageKey,
               LogString]).

When I do performance tests of the overall app, it's invariably the logging
process which is the limiting factor of the overall system.  Usually what
happens is that the mailbox for the process accumulates several hundred
thousand messages, which causes the size of the VM to bloat until the host
starts swapping.  (I understand that I could use gen_server:call to make the
logging synchronous, but that'd slow down all of the transactions, which I'd
like to avoid if I can.)
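
For anyone who wants to check that kind of backlog themselves: the
queue length is visible from the shell with erlang:process_info/2,
which returns {message_queue_len, N} (log_server here stands in for
whatever name the logging process is registered under):

1> erlang:process_info(whereis(log_server), message_queue_len).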

I'm pretty sure that the underlying disk is not saturated - I'm just not
getting that many MB of log data.  My guess is that the single Erlang
process that the logging gen_server is using just can't both read from its
mailbox and write to disk fast enough to keep up.

So, what are good strategies to cope with a large incoming volume of
messages that all need to wind up in the same logfile?  Is there a more
efficient way to write to disk than the simple io:format() call that I'm
using above?  What's a good way to parallelize the logging over multiple
processes but keep all of the information in one file?
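
To make the io:format() question more concrete, the kind of alternative
I have in mind (untested sketch - the file name and buffer numbers are
made up) is opening the file in raw/append mode with delayed_write and
handing file:write/2 an iolist that was formatted by the caller:

%% Untested sketch: buffer writes via delayed_write and skip the io
%% protocol by writing to a raw fd with file:write/2.
%% Note: a raw fd can only be used from the process that opened it,
%% which is fine here since one gen_server does all the writing.
open_log() ->
    file:open("router.log", [append, raw, {delayed_write, 64 * 1024, 2000}]).

log_line(Fd, Fields) ->
    %% Fields is the same argument list that currently goes to io:format/3.
    Line = io_lib:format("~s,~s,rock,~s,~p,~s,~s,~s,~s,~s~n", Fields),
    file:write(Fd, Line).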

Thanks,

Dan

