[erlang-questions] Feedback for my first non-trivial Erlang program
Fred Hebert
mononcqc@REDACTED
Tue Dec 15 15:28:06 CET 2015
On 12/15, zxq9 wrote:
>The more fundamental problem, though, is that this is a compounding
>function that steps singly as it increments, but it is written in a way
>that it counts down, not up, computing the *entire* value chain below
>it all over again each step, and within each sub-step once the input
>value is over 60.
>
>Its an arithmetic fun-bomb!
>
>If you remove most of the intermediate functions (I think most but
>maybe profit/3 can be removed... ?) and compute the stepping values as
>arguments to an expanded defintion of total_balance this should run
>close to O(n).
>
>Refactoring for that may also expose some bugs...
This is the true problem with this code. It's calling itself in so many
ways the whole thing explodes combinationally and yields exponential
increases in time.
One quick way to work around that is through memoization with a macro
such as:
%% memoization uses a process dictionary. While hackish and not portable
%% through processes, a PD of this kind has the advantage of being
%% garbage collected and being returned in a stack trace. There is also no
%% copying (see: comments over the macros) or locking needed.
-define(memoize(E), lazy_memoize(fun()-> E end)).
lazy_memoize(F) when is_function(F) ->
case erlang:get(F) of
undefined ->
erlang:put(F,F()),
erlang:get(F);
X -> X
end.
Once that's done, wrap up the costly functions in ?memoize(Expression)
and the value will be stored in a process dictionary.
I've done it a bit haphazardly, memoizing stuff at a glance that looked
recursive: https://gist.github.com/ferd/ab5fed3b8ffe4b226755
While the raw implementation stalled at ~80 iterations and became
impossible to run, memoization makes it far more workable:
1> c(pension2).
{ok,pension2}
2> timer:tc(pension2, totalBalance, [8000]).
{354750,-5.145757524814171e19}
3> timer:tc(pension2, totalBalance, [8000]).
{11,-5.145757524814171e19}
Every follow-up call is then cheaper. Of course, the additional cost
comes in memory:
4> erlang:process_info(self(), dictionary).
{dictionary,[{#Fun<pension2.1.58422776>,0},
{#Fun<pension2.10.58422776>,-1325845925.5710535},
{#Fun<pension2.7.58422776>,0.0},
{#Fun<pension2.6.58422776>,16858.175941786},
{#Fun<pension2.3.58422776>,-4116.625402543886},
{...}|...]}
5> length(element(2,erlang:process_info(self(), dictionary))).
103703
6> erlang_term:byte_size(erlang:process_info(self(), dictionary)).
66125376
(the last call here uses https://github.com/okeuday/erlang_term). That's
66mb of storage for these terms in the process dictionary at the very
least (for 8000 iterations) and the host OS reports ~100mb usage for the
whole VM.
Rewriting things to repeat fewer of these operations (building up from 0
to N, rather than N down to 0 with a lot of repetitions) would probably
save memory a whole lot. If refactoring is out of the question and
memory is cheap to you, then memoization is a quick way to get something
out of it.
Be mindful however that when memoizing in the process dictionary, every
process that runs the operation will end up carrying that data. Moving
it to ETS if the calls have to be distributed is an option, after a
while it becomes very cheap. Flushing the cache from time to time (like
a manual GC) is also a thing one may want to do.
Ultimately refactoring, if possible, will prove the best long-term
solution.
Regards,
Fred.
More information about the erlang-questions
mailing list