Logging to one process from thousands: How does it work?

Sean Hinde sean.hinde@REDACTED
Thu Jan 5 20:19:59 CET 2006


On 5 Jan 2006, at 18:20, chandru wrote:

> Hi Sean,
>
> On 05/01/06, Sean Hinde <sean.hinde@REDACTED> wrote:
>> Hi Chandru,
>>
>> If the overall system is synchronous, you should not see message
>> queues fill up out of control.
>
> Not necessarily. If I decided to allow 100 msgs/sec into the system,
> and each of those messages generated a number of log entries, the
> message queues in the logger process could quickly build up if the
> hard disk becomes unresponsive at some point (using synchronous
> logging). The problem I have with this is that I can't always
> guarantee that I can safely handle 100 msgs/sec.

Spawn is the async mechanism here then :-)

Assuming that the disk subsystem can keep up on average, and you
don't want to limit the input load by setting a maximum level of
concurrency as well as msgs/sec, then I guess what you want to achieve
is to isolate the worker jobs from short-term pauses in the disk logger.

You could introduce an additional accumulator process which stores
log messages while waiting for a separate disk-log-owning process to
write the current chunk. The protocol to the disk-log-owning process
could be: send one chunk of log messages asynchronously, but don't
send any more until the disk log process confirms with a message back
that the write is done.
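
Something along these lines, as a minimal sketch. The module name
(log_acc), the start/1 and log/1 functions, and the assumption that
LogName refers to a disk_log already opened for the writer process are
all mine, just to illustrate the shape of the protocol:

    -module(log_acc).
    -export([start/1, log/1]).

    start(LogName) ->
        Writer = spawn_link(fun() -> writer_loop(LogName) end),
        Acc = spawn_link(fun() -> acc_loop(Writer, [], ready) end),
        register(log_acc, Acc),
        ok.

    %% Asynchronous API used by the worker processes - never blocks.
    log(Term) ->
        log_acc ! {log, Term},
        ok.

    %% Accumulator: while the writer is busy it buffers incoming log
    %% messages, and hands over the whole buffered chunk as soon as the
    %% writer confirms that the previous write is done.
    acc_loop(Writer, [], ready) ->
        receive
            {log, Term} ->
                Writer ! {write, self(), [Term]},
                acc_loop(Writer, [], busy)
        end;
    acc_loop(Writer, Buf, busy) ->
        receive
            {log, Term} ->
                acc_loop(Writer, [Term | Buf], busy);
            {written, Writer} when Buf =:= [] ->
                acc_loop(Writer, [], ready);
            {written, Writer} ->
                Writer ! {write, self(), lists:reverse(Buf)},
                acc_loop(Writer, [], busy)
        end.

    %% Disk log owner: writes one chunk at a time and confirms back.
    writer_loop(LogName) ->
        receive
            {write, From, Terms} ->
                ok = disk_log:log_terms(LogName, Terms),
                From ! {written, self()},
                writer_loop(LogName)
        end.

The point is that the workers only ever do an asynchronous send, while
the accumulator and writer between them ensure that at most one write
is outstanding against the disk at any time.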

>
>> Or make the message consumer drop events which it cannot handle. Just
>> receiving a message takes very little time, and then you can
>> guarantee that important messages do get handled.
>
> I suppose I could do this - but that'll mean invoking
> process_info(self(), message_queue_len) every time I handle a new
> message. Is the message queue length actually incremented every time
> a message is placed in a process's queue? Or is the length computed
> every time I call process_info/2?

I had in mind some local heuristic where the process could decide it
is overloaded, but this could also work. A quick look at the source
suggests this call is efficient.
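
For reference, a rough sketch of the dropping idea. The threshold and
the is_important/1 and write_event/2 functions are made up here, just
to show where the queue-length check would sit:

    -module(log_drop).
    -export([handle_event/2]).

    -define(MAX_QUEUE, 1000).   %% hypothetical overload threshold

    %% Look at our own queue length before handling each event, and
    %% drop anything non-essential once the queue grows too long.
    handle_event(Event, State) ->
        {message_queue_len, QLen} =
            process_info(self(), message_queue_len),
        case QLen > ?MAX_QUEUE andalso not is_important(Event) of
            true  -> {dropped, State};        %% shed load: ignore it
            false -> {ok, write_event(Event, State)}
        end.

    %% Placeholder classification and write functions for the sketch.
    is_important({alarm, _}) -> true;
    is_important(_)          -> false.

    write_event(Event, State) ->
        error_logger:info_report(Event),      %% stand-in for the disk write
        State.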

Sean



