[erlang-questions] high-volume logging via a gen_server

Jachym Holecek freza@REDACTED
Tue Oct 5 01:48:58 CEST 2010


# Chandru 2010-10-04:
> On 4 October 2010 12:57, Dan Kelley <djk121@REDACTED> wrote:
> 
> > So, what are good strategies to cope with a large incoming volume of
> > messages that all need to wind up in the same logfile?  Is there a more
> > efficient way to write to disk than the simple io:format() call that I'm
> > using above?  What's a good way to parallelize the logging over multiple
> > processes but keep all of the information in one file?
> >
> What we do is to have one public ETS table per log file. All log entries are
> written directly to the ETS table by the calling process. Every few seconds,
> a dedicated logging process scans the ETS table, accumulates them, and dumps
> them to disk in one write operation. This works very well.
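
A rough sketch of the ETS scheme described in the quote, for reference only.
The module name ets_log, its function names, and the use of
erlang:unique_integer/1 as an ordering key are assumptions made for
illustration, not details from Chandru's system. Callers insert pre-formatted
entries into a public table; the dedicated logging process calls flush/2 every
few seconds and hands everything to the file in one write.

```erlang
-module(ets_log).
-export([new/0, log/2, flush/2]).

%% Hypothetical module: all names are illustrative.

%% Create the public table that calling processes write into directly.
new() ->
    ets:new(log_buffer, [public, set]).

%% Called in the context of the logging client. The key is a strictly
%% increasing unique integer so entries can be put back in order later.
log(Tab, IoData) ->
    true = ets:insert(Tab, {erlang:unique_integer([monotonic]), IoData}),
    ok.

%% Called periodically by the dedicated logging process.
%% Only the keys actually read are deleted, so entries inserted
%% concurrently stay in the table until the next flush.
flush(Tab, Fd) ->
    Entries = lists:keysort(1, ets:tab2list(Tab)),
    lists:foreach(fun({Key, _}) -> ets:delete(Tab, Key) end, Entries),
    file:write(Fd, [Data || {_, Data} <- Entries]).
```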

Actually, this appears to be a somewhat degenerate use case for ETS, at least
under extreme load. Passing log messages (ASCII text formatted in the context of
the calling process) to a per-log gen_server in a synchronous call is at least as
fast as the ETS approach, and possibly faster. The key point is to write the
gen_server so that it avoids any unnecessary activity (especially memory
allocations and any kind of log data processing). Buffering can be achieved down
at the io device level with the delayed_write option.
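
A minimal sketch of what such a server could look like, not the proprietary
code mentioned in this message. The module name log_writer, the raw and binary
open modes, and the 64 KB / 2000 ms buffer settings are assumptions chosen for
illustration. The caller formats the entry in its own process and passes
iodata in a synchronous call; the server does nothing except file:write/2.

```erlang
-module(log_writer).
-behaviour(gen_server).

-export([start_link/1, write/2, stop/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

%% Hypothetical module: names and buffer sizes are illustrative.

start_link(Path) ->
    gen_server:start_link(?MODULE, Path, []).

%% IoData must already be formatted by the calling process.
write(Server, IoData) ->
    gen_server:call(Server, {write, IoData}).

stop(Server) ->
    gen_server:call(Server, stop).

init(Path) ->
    %% delayed_write buffers writes until 64 KB accumulate or 2000 ms pass.
    %% raw is an extra assumption here; a raw file must be used from the
    %% process that opened it, which is this server.
    Opts = [append, raw, binary, {delayed_write, 65536, 2000}],
    {ok, Fd} = file:open(Path, Opts),
    {ok, Fd}.

%% No parsing, formatting or copying here: just hand the data to the file.
handle_call({write, IoData}, _From, Fd) ->
    {reply, file:write(Fd, IoData), Fd};
handle_call(stop, _From, Fd) ->
    {stop, normal, ok, Fd}.

handle_cast(_Msg, Fd) -> {noreply, Fd}.

handle_info(_Msg, Fd) -> {noreply, Fd}.

%% Closing the file flushes whatever delayed_write still holds.
terminate(_Reason, Fd) ->
    file:close(Fd).
```

A client would then call something like
log_writer:write(Pid, io_lib:format("~p ~s~n", [Id, Text])), so that the
formatting cost is paid by the caller. Because the call is synchronous, a slow
disk pushes back on the callers instead of letting the server's message queue
grow without bound.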

Sorry to be a little vague, I didn't quite get around to analysing this in more
detail (like understanding the ETS locking protocol and running more
measurements).

I am, however, sure that after I changed a (proprietary) networked logging
server from the ETS-based approach to the one indicated above, write throughput
went up from ~10-20MB/s to 100MB/s (ramdisk) or ~60-70MB/s (real disk). There
were other changes along the way, but IIRC I tried most of them with ETS first
and they didn't help.

Regards,
	-- Jachym

