[erlang-questions] Feedback for my first non-trivial Erlang program

zxq9 zxq9@REDACTED
Tue Dec 15 10:21:47 CET 2015

On 2015年12月15日 火曜日 05:14:54 Technion wrote:
> Hi,
> It's only a very minor suggestion, but you have a number of constants that could be replaced with compiler macros. For example:
> caBalanceStart() -> 10000.
> Becomes
> -define(CABALANCE, 10000).
> Then:
> caBalance(0) -> caBalanceStart();
> Becomes:
> caBalance(0) -> ?CABALANCE

Keep in mind, though, that constants are *almost never actually constant* throughout the service life of a program. Macros are a *bad* habit in this case. A function call can stand in for literally anything you might want to do -- including looking up a particular tax rate or whatever based on the circumstance of the program. (Anything you might want to do outside of guards, of course. The one place where it looks like a constant will actually be constant is "working time units", and this is a good candidate for macroization and elimination of some `case` statements.)

In-module calls are almost certainly *not* a cause of significant slowdown in the program -- though it is possible that the arbitrarily large stack rollup in the non-tail recursive functions that litter the code are.

That said, consider this:

caBalance(0) -> caBalanceStart();
caBalance(T) -> caBalance(T-1) + salary(T-1) - expenses(T-1) - tradingFees(T-1).

This leaves a frame per iteration on the stack. saBalance/1 does the same, as does salary/1 etc. These are not tail recursive and so leave a trail of values that have to be "rolled up" after the fact to give the final result. I'm not sure whether this is part of the performance problem or not, but regardless, you *really* want tail-recursive calls (where no return-to-caller is necessary, and so the stack can be eliminated). This has several advantages, in addition to lowering memory use (dramatically for high values for T), it also permits you to understand the *complete* state of a particular call only by examining a single call instead of the entire stack of calls (so you can start a test call in the middle if you want).

ca_balance(0) -> ca_balance_start();
ca_balance(T) -> ca_balance(T, ca_balance_start()).

ca_balance(0, A) ->
ca_balance(T, A) ->
    NewT = T - 1,
    NewA = A + salary(NewT) - expenses(NewT) - trading_fees(NewT),
    ca_balance(NewT, NewA).

This version doesn't have that problem.
(Note also the lack of camelCaseFunctionNames() here. Capitals mean things in Erlang, don't confuse the issue by pretending its C++ or Java. Also, Erlang does not have methods, only functions.)

The main problem with performance in the program, though, is constant re-computation of values you could already know after they've been computed once. The entire function works that way, actually. Regardless how long this particular version of the function takes to run, you could run it once on a very high value to build a memoized chart of values and retrieve them by key instead of recomputing all the time. Even without that, though, a significant amount of this functions time is spent computing intermediate values that could be memoized or bypassed on successive iterations by building up to T instead of counting down to it. Not to mention "next" values that are doubly or triply computed here and there, some of them rather complex:

    remove(X,T,[{Amount,Time}|Tail]) ->
        case currentValue(Amount,T-Time) > X of
            true -> [{Amount-originalValue(X,T-Time),T}|Tail];
            false -> remove(X-currentValue(Amount,T-Time),T,Tail)

could be

    remove(_, _, []) ->
    remove(X, T, [{Amount, Time} | Tail]) ->
        TLessTime = T - Time,
        CurrentOverTime = current_value(Amount, TLessTime),
        case CurrentOverTime > X of
            true  -> [{Amount - original_value(X, TLessTime), T} | Tail];
            false -> remove(X - CurrentOverTime, T, Tail)

This way you don't compute `current_value(Amount, TLessTime)` twice (once to compare it, and once to pass it through if its =< X) and `T - Time` happens at least twice in all cases -- and you potentially compute remove/3 a *lot* (the insanity really starts with any call to `portfolio/1` that is larger than 61). I added an empty-list clause. I'm not certain that it will ever fit, but if you ever use this function in another context the probability is quite high that it may be passed an empty list. The same goes for negative values for your one exported function:

total_balance(T) when < 0 -> {error, negative_time};
total_balance(T)          -> caBalance(T) + saBalance(T).

This might be a better way to make sure your program doesn't iterate forever because of externally generated bad data.

The other thing about this function is... FLOAT VALUES.

A version of this with tail recursive functions and the current one will probably return subtly (or not so subtly) different results, simply because of float carry value errors. Accumulated values and rolled up values tend to not turn out quite the same way. There is a *lot* of iterative arithmetic indicated here...

This could be prevented a few ways -- by picking a global minimum fractional value and using integer operations (but then you have to watch out for hitting boundaries that always return peculiarly round value, especially 0), by using fixed-point arithmetic, or by creating a real type and reducing at the end. Some form of the first method is used in currency trade calculations, but different industries have different regulations about how to approach that.

Something bothers me about this function:

    portfolio(T) ->
        case working(T) of
            true -> [{investments(X), X} || X <- lists:seq(0, T)];
            false -> remove(investments(T - 1), T, portfolio(T - 1))

Its not just that it isn't tail recursive, though. Passing `portfolio(T - 1)` as the third argument to remove/3 makes this a bit more mysterious at first glance than I it should be. It is a computational complexity explosion that seems totally unnecessary.


More information about the erlang-questions mailing list