[erlang-questions] Memory-effective queue handling

Erik Søe Sørensen eriksoe@REDACTED
Wed Jul 25 15:01:32 CEST 2012


Den 25/07/2012 11.57 skrev "Attila Rajmund Nohl" <attila.r.nohl@REDACTED>:
>
> 2012/7/24 Erik Søe Sørensen <eriksoe@REDACTED>:
> >
> > Den 24/07/2012 17.11 skrev "Attila Rajmund Nohl" <
attila.r.nohl@REDACTED>:
> >
> >
> >>
> >> Hello!
> >>
> >> I have a data structure where I manage "historical data": a list of
> >> timestamps when events happened. I only need to know how many events
> >> happened in the last 24 hours. When an event happens, I put the
> >> timestamp into the queue (I use the queue module in OTP). When I check
> >> how many values are in the queue, I also remove the entries that are
> >> older than 24 hours. My problem is: when I remove the old elements, I
> >> can do it one by one and at each step the (possibly big) queue gets
> >> copied and I run out of memory.
> > I am curious: how do you manage to do this? Can we see the code?
> > (Certainly it shouldn't be the case that much copying should be done at
each
> > step.)
>
> I can show an excerpt from the code:
>
> increase_counter(#ch{queue=Q, counter_value=CV}=CH, Value) ->
>     Date = ?MODULE:get_date(),
>     NewQueue=queue:in({Date, Value}, Q),
>     CH#ch{queue=NewQueue, counter_value=CV+Value}.
>
> get_value(#ch{hist_len=HL}=CH) ->
>     % get the values from the queue until we reach a first element in the
queue
>     % which is not older then the hist_len (e.g. not older than 15 mins)
>     Date = ?MODULE:get_date(),
>     get_younger_than(CH, Date-HL).
>
> get_younger_than(#ch{queue=Q, counter_value=CV}=CH, TargetDate) ->
>     case queue:out(Q) of
>         {empty, _} ->
>             0 = CV, % this is an assert, the value shall be 0
>             {CH, 0};
>         {{value, {Date, Value}}, NewQ} when Date < TargetDate ->
>             get_younger_than(CH#ch{queue=NewQ, counter_value=CV-Value},
TargetDate);
>         {{value, _}, _} ->
>             % the currently removed entry is too young, so leave that in
the
>             % queue (i.e. use the parameter queue).
>             {CH, CV}
>     end.
>
> The copying is done in the queue:out call, because it returns not just
> the first value, but the rest of the queu (newQ) too.
That doesn't copy the entire queue. That is, occasionally it recreates the
*spine* of a new list the length of the entire queue, but normally it does
way less than that.
The problem is that the way your code works, that worst-case is every time,
and the throwing-away of old entries is also done from the start each time:
your get_value() does not return the pruned queue.
A further consequence of that is that Richard's statement doesn't hold:
because you keep a reference (apparently) to the original, unpruned queue,
the memory will have to hold both list spines simultaneously.
So I'd suggest you change the return value of get_value.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20120725/a8c76343/attachment.htm>


More information about the erlang-questions mailing list