[erlang-questions] gen_server aggregating calls/casts

Robert Raschke rtrlists@REDACTED
Tue Jul 9 01:27:52 CEST 2013


Have you considered using a queuing framework, like RabbitMQ?

By using topic exchanges and appropriately bound queues, it seems a bit
more straightforward to implement what you desire.

Alternatively, you'll probably want to double-layer your server, as was
previously suggested. Have the outer server deal with the problem of
batching and the inner one serialise access to your resource. My first
attempt would be for the outer server's handle_call to return noreply
while keeping track of the callers, combined with immediate calls to the
lower layer and timeout handling, depending on your use cases.
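
A minimal sketch of the outer (batching) layer, under some assumptions: the
module name batcher, the request/2 API and the BatchFun callback are all
hypothetical, and BatchFun stands in for a single call to the inner server
that must return one result per request, in the same order. It relies on a
gen_server timeout of 0 firing only once the mailbox is empty, so requests
that piled up while the previous batch was running are collected first.

```erlang
-module(batcher).
-behaviour(gen_server).

-export([start_link/1, request/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Hypothetical state: the batch callback plus the parked callers.
-record(state, {batch_fun, pending = []}).

%% BatchFun :: fun(([Req]) -> [Result]), results in request order
%% (an assumption of this sketch, e.g. one gen_server:call to the inner server).
start_link(BatchFun) when is_function(BatchFun, 1) ->
    gen_server:start_link(?MODULE, BatchFun, []).

request(Pid, Req) ->
    gen_server:call(Pid, {request, Req}, infinity).

init(BatchFun) ->
    {ok, #state{batch_fun = BatchFun}}.

%% Do not reply yet: remember the caller and ask for a 0 ms timeout.
%% The timeout message is only delivered when no other message is waiting.
handle_call({request, Req}, From, State = #state{pending = Pending}) ->
    {noreply, State#state{pending = [{From, Req} | Pending]}, 0}.

handle_cast(_Msg, State) ->
    {noreply, State, 0}.

%% Mailbox drained: run the whole batch once, then answer every caller.
handle_info(timeout, State = #state{pending = []}) ->
    {noreply, State};
handle_info(timeout, State = #state{batch_fun = BatchFun, pending = Pending}) ->
    {Callers, Reqs} = lists:unzip(lists:reverse(Pending)),
    Results = BatchFun(Reqs),
    lists:foreach(fun({From, Res}) -> gen_server:reply(From, Res) end,
                  lists:zip(Callers, Results)),
    {noreply, State#state{pending = []}};
handle_info(_Other, State) ->
    {noreply, State, 0}.
```

While BatchFun is blocked on the inner server, new requests simply queue up
in the outer server's mailbox and end up in the next batch. A size limit or
a non-zero timeout could be added on top, depending on your use case.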

Robby
 On Jul 8, 2013 11:16 PM, "Jonathan Leivent" <jleivent@REDACTED> wrote:

> On 07/08/2013 04:30 PM, Robert Raschke wrote:
>
>> A gen_server is there to handle a resource such that requests to that
>> resource are handled sequentially. It sounds like that doesn't quite match
>> your use case.
>>
>> What you describe appears a little bit odd. Say you receive a request, and
>> now, before you handle it, you check if more of the same request type has
>> come in, and if they have, you add them in your "send results to" set. Now
>> what? Do you check again? Or are you satisfied you can actually do some
>> work?
>>
>
> The server goes through longish periods when it is doing work (writing and
> syncing requests to disk, communicating with other servers, etc.). After
> these longish periods, it is very likely that many requests have
> accumulated on its queue, along with other messages that are not requests.
>  I'd be satisfied if it could aggregate all requests that are currently on
> the server's queue, but do so without having to first go through the
> processing of the non-request messages that have also accumulated on its
> queue.  Then, batch process those aggregated requests together - including
> writing/syncing them to disk together - with big performance savings due to
> amortizing the expensive write/sync operation (as well as reducing network
> overhead for the other parts of request processing).
>
>
>> Do you maybe want to memoize results? By adding results to your server
>> state, you could reply to identical(?) requests via lookup, leaving the
>> resource in peace.
>>
>
> It needs to service any requests it aggregates before servicing any
> non-request message.  Or, rather, that's true of the first request. Said a
> better way: requests can be processed earlier than non-request messages
> that arrive before those requests, but the opposite is not true.
>
> So, if the server memoizes the requests, it has to process them as soon as
> it encounters the first non-request message, even though there may be many
> more requests later in its queue that it could aggregate with the earlier
> ones, because earlier requests must be processed before later non-requests.
>
> Unless I memoize ALL messages, requests and non-requests - so that the
> server delays handling any until it gets to a queue-empty state or times
> out or something.  That's possible, but it sounds like I'll have to write
> my own gen_server-like infrastructure dispatch loop once I've accumulated
> all those non-requests.  I guess that's possible...
>
> A separate aggregating gen_server might be easier and more modular. Like
> this:
>
>             aggregator              actual server
>             ==========              =============
>                                     doing busy work...
> request --> memoize and forward --> forwarded request enqueued
> request --> memoize
> ...
> request --> memoize
>                                     ... done busy work
>                                     see forwarded request
>             handle call         <-- call aggregator
>             reply all and clear --> handle aggregated requests
>
> I'll try both...
>
>> Or do you require some kind of batching, only bothering the resource if
>> enough interest is present? Maybe augmented with some kind of time limit?
>> This would probably be a bit harder to implement.
>>
>
> Yes - such processing might be nice.  Knowing how many requests can be
> written/synced to disk in about the same time as a single request, for
> instance, and knowing approximately how long such a write/sync takes. Which
> will vary greatly based on available hardware and the underlying OS,
> external demands on disk IO, etc.  I'll leave all that for version 2...
>
> -- Jonathan
>
>