Logging to one process from thousands: How does it work?

Wed Jan 4 10:01:10 CET 2006

Well, the existing implementation is just a small fix to prevent
the beginner from getting stuck on a simple producer/consumer
problem. Unfortunately the fix works so well that users are
lured to think it is the complete solution, realizes it is
not flawless, and starts making intricate improvement
suggestions.

The problem is not easily solved in the general case. So
I guess it would be futile to try to implement the
final solution in the runtime system. 

The existing implementation might be enough in many cases;
to reduce the cost of pattern matching at the receive end
- do no pattern matching, just always take the
first message. The pattern matching cost at 
the receive end does not have to increase
with queue lengt; if a message has been scanned
and found not to match - the scan continues with
new messages even after a schedule out. It is 
when entering a new receive statement the message
queue is rescanned from the beginning.

Each application will have to solve the problem using
some kind of flow control. For a logger this is 
unpleasent but not impossible. E.g the client might
have a counter in the process dictionary and do
a synchronous log entry every 17 log entries. And
if the logger always takes the first message, the
existing implementation might be enough. Beware that
the existing flow control only works for local sends
so I guess a logger in a node cluster would have to
use explicit flow control.

camster@REDACTED (Richard Cameron) writes:

> On 3 Jan 2006, at 20:02, David Hopwood wrote:
> 
> > Raimo Niskanen wrote:
> >> A process that sends to a
> >> receiver having a large receive queue gets punished
> >> with a large reduction (number of function calls)
> >> count for the send operation
> >
> > This makes the problem less likely to occur, but it isn't
> > necessarily enough
> > to prevent a message backlog in all cases.
> 
> If the amount of punishment a sender received increased with the
> queue length, would the maximum backlog asymptotically tend to a
> finite limit in the worst case?
> 
> That's probably only true if you assume that the per-timeslice
> production and service rates are constant, which might very well be
> complete rubbish. I'm guessing the cost of pattern matching to do the
> "receive" on the message queue is going to increase with the queue
> length too. So, maybe it comes down to making sure that the
> punishments grow faster than the cost of performing the receives?
> 
> Any idea what that cost is? Is the worst case linear, or can the
> virtual machine optimise for all cases?
> 
> > In principle, the right way to
> > handle this is to provide back-pressure to the message sources,
> > i.e. to allow
> > the logging operation to block in each source. The simplest way to
> > do that is
> > to use bounded queues.
> 
> Sounds like the reduction count punishment is a good way of providing
> that back-pressure, but still having "soft realtime" properties which
> more gradually degrade under load.
> 
> I think with bounded queues it might also be possible to dream up
> scenarios where deadlocks can occur. What about a simple client/
> server arrangement where the client sends a message to the server
> then happens to receive some other external (and unrelated) message
> which fills up its queue before the server responds. The server can't
> reply (its ! now blocks?) and the client can't process its message
> queue as it's sitting waiting for a reply from the server.
> 
> Richard.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB