[erlang-questions] Memory-effective queue handling
Richard Carlsson
carlsson.richard@REDACTED
Wed Jul 25 14:38:07 CEST 2012
On 07/25/2012 11:57 AM, Attila Rajmund Nohl wrote:
>>> I have a data structure where I manage "historical data": a list of
>>> timestamps when events happened. I only need to know how many events
>>> happened in the last 24 hours. When an event happens, I put the
>>> timestamp into the queue (I use the queue module in OTP). When I check
>>> how many values are in the queue, I also remove the entries that are
>>> older than 24 hours. My problem is: when I remove the old elements, I
>>> can do it one by one and at each step the (possibly big) queue gets
>>> copied and I run out of memory.
>> I am curious: how do you manage to do this? Can we see the code?
>> (Certainly it shouldn't be the case that much copying should be done at each
>> step.)
>
> I can show an excerpt from the code:
>
> increase_counter(#ch{queue=Q, counter_value=CV}=CH, Value) ->
> Date = ?MODULE:get_date(),
> NewQueue=queue:in({Date, Value}, Q),
> CH#ch{queue=NewQueue, counter_value=CV+Value}.
>
> get_value(#ch{hist_len=HL}=CH) ->
> % get the values from the queue until we reach a first element in the queue
> % which is not older then the hist_len (e.g. not older than 15 mins)
> Date = ?MODULE:get_date(),
> get_younger_than(CH, Date-HL).
>
> get_younger_than(#ch{queue=Q, counter_value=CV}=CH, TargetDate) ->
> case queue:out(Q) of
> {empty, _} ->
> 0 = CV, % this is an assert, the value shall be 0
> {CH, 0};
> {{value, {Date, Value}}, NewQ} when Date < TargetDate ->
> get_younger_than(CH#ch{queue=NewQ, counter_value=CV-Value}, TargetDate);
> {{value, _}, _} ->
> % the currently removed entry is too young, so leave that in the
> % queue (i.e. use the parameter queue).
> {CH, CV}
> end.
>
> The copying is done in the queue:out call, because it returns not just
> the first value, but the rest of the queu (newQ) too.
You'd only run out of space if the queue is so long that you don't have
memory enough to call lists:reverse() on the current in-half of the
queue (and after that, the old queue will immediately become garbage).
Can that really be the case?
But you should be able to improve the behaviour of the above code by
using queue:peek_r(Q) instead of queue:out(Q), and then call
queue:drop_r(Q) in the second case clause only, so no rewriting of the
queue representation is done unless necessary.
/Richard
More information about the erlang-questions
mailing list