[erlang-questions] Memory-effective queue handling

Richard Carlsson carlsson.richard@REDACTED
Wed Jul 25 14:38:07 CEST 2012


On 07/25/2012 11:57 AM, Attila Rajmund Nohl wrote:
>>> I have a data structure where I manage "historical data": a list of
>>> timestamps when events happened. I only need to know how many events
>>> happened in the last 24 hours. When an event happens, I put the
>>> timestamp into the queue (I use the queue module in OTP). When I check
>>> how many values are in the queue, I also remove the entries that are
>>> older than 24 hours. My problem is: when I remove the old elements, I
>>> can do it one by one and at each step the (possibly big) queue gets
>>> copied and I run out of memory.
>> I am curious: how do you manage to do this? Can we see the code?
>> (Certainly it shouldn't be the case that much copying should be done at each
>> step.)
>
> I can show an excerpt from the code:
>
> increase_counter(#ch{queue=Q, counter_value=CV}=CH, Value) ->
>      Date = ?MODULE:get_date(),
>      NewQueue=queue:in({Date, Value}, Q),
>      CH#ch{queue=NewQueue, counter_value=CV+Value}.
>
> get_value(#ch{hist_len=HL}=CH) ->
>      % get the values from the queue until we reach a first element in the queue
>      % which is not older then the hist_len (e.g. not older than 15 mins)
>      Date = ?MODULE:get_date(),
>      get_younger_than(CH, Date-HL).
>
> get_younger_than(#ch{queue=Q, counter_value=CV}=CH, TargetDate) ->
>      case queue:out(Q) of
> 	{empty, _} ->
> 	    0 = CV, % this is an assert, the value shall be 0
> 	    {CH, 0};
> 	{{value, {Date, Value}}, NewQ} when Date < TargetDate ->
> 	    get_younger_than(CH#ch{queue=NewQ, counter_value=CV-Value}, TargetDate);
> 	{{value, _}, _} ->
> 	    % the currently removed entry is too young, so leave that in the
> 	    % queue (i.e. use the parameter queue).
> 	    {CH, CV}
>      end.
>
> The copying is done in the queue:out call, because it returns not just
> the first value, but the rest of the queu (newQ) too.

You'd only run out of space if the queue is so long that you don't have 
memory enough to call lists:reverse() on the current in-half of the 
queue (and after that, the old queue will immediately become garbage). 
Can that really be the case?

But you should be able to improve the behaviour of the above code by 
using queue:peek_r(Q) instead of queue:out(Q), and then call 
queue:drop_r(Q) in the second case clause only, so no rewriting of the 
queue representation is done unless necessary.

     /Richard




More information about the erlang-questions mailing list