[erlang-questions] Why erlang's computing performance is enormously less than c++
Richard A. O'Keefe
ok@REDACTED
Mon Nov 14 01:14:49 CET 2016
On 11/11/16 8:26 PM, 谈广云 wrote:
> I compared Erlang's computing with C++.
>
> Erlang runs test_sum_0 100000000 times:
>
> test_sum_0(N) ->
>     bp_eva_delta([1,2,3,4],[3,4,5,6],[]),
>     test_sum_0(N-1).
>
> bp_eva_delta([],_,L) ->
>     lists:reverse(L);
> bp_eva_delta([O|Output],[S|Sigma],L) ->
>     bp_eva_delta(Output,Sigma,[S * O * (1-O) |L]).
Here are some times I got.
Erlang (native compilation)  :  10.1 seconds.
Erlang (unrolled loop)       :   2.8 seconds.
Standard ML                  :   2.7 seconds.
Clean (default lazy lists)   :   8.3 seconds.
Clean (unrolled strict data) :   3.0 seconds.
A fair comparison in C       : 118.4 seconds.
The thing is that the C and Erlang code may be computing
the same function (technically they aren't), but they are
not doing it the same WAY, so the comparison is not a
comparison of LANGUAGES but a comparison of
*list processing* in one language with
*array processing* in another language.
When you compare Erlang with statically typed languages
doing the same thing (well, not quite) the same *way*,
you find the numbers pleasantly close.
A list is made up of pairs.
A fairer analogue of this in C would be
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

struct Node {
    struct Node *next;
    int item;
};

/* Sentinel node playing the role of the empty list []. */
struct Node dummy = {0, 0};

/* Reverse L onto R, copying each cell, as lists:reverse/2 does. */
struct Node *revloop(
    struct Node *L,
    struct Node *R
) {
    while (L != &dummy) {
        struct Node *N = malloc(sizeof *N);
        N->next = R, N->item = L->item;
        R = N, L = L->next;
    }
    return R;
}

struct Node *reverse(
    struct Node *L
) {
    return revloop(L, &dummy);
}

struct Node *bp_eva_delta(
    struct Node *Output,
    struct Node *Sigma
) {
    struct Node *L = &dummy;
    while (Output != &dummy && Sigma != &dummy) {
        int O = Output->item, S = Sigma->item;
        struct Node *N = malloc(sizeof *N);
        N->next = L, N->item = S * O * (1 - O);
        L = N;
        Output = Output->next, Sigma = Sigma->next;
    }
    return reverse(L);
}

struct Node *cons(
    int item,
    struct Node *next
) {
    struct Node *N = malloc(sizeof *N);
    N->next = next, N->item = item;
    return N;
}

void test_sum_0(
    void
) {
    struct Node *Output =
        cons(1, cons(2, cons(3, cons(4, &dummy))));
    struct Node *Sigma =
        cons(3, cons(4, cons(5, cons(6, &dummy))));
    struct Node *R;
    int N;
    for (N = 100*1000*1000; N > 0; N--) {
        /* Nothing is ever freed: C has no GC to reclaim these nodes. */
        R = bp_eva_delta(Output, Sigma);
    }
    (void)R;
}

int main(void) {
    clock_t t0, t1;
    t0 = clock();
    test_sum_0();
    t1 = clock();
    printf("%g\n", (t1 - t0)/(double)CLOCKS_PER_SEC);
    return 0;
}
> Erlang spends 29 s and C++ spends 2.78 s.
>
> Why is Erlang so much slower than C++?
On the contrary, why is C so staggeringly slow compared
with Erlang, Clean, and SML? (On my desktop machine, that
is. On my laptop, it ran for a LONG time and then other
things started dying. Hint: no GC.)
There are at least five differences between your C++
and Erlang examples:
(1) List processing vs array processing.
(2) Memory allocation costs (malloc() can be S L O W).
(3) Static type system.
(4) Truncating arithmetic.
(5) Loop unrolling.
and there may be an issue of
(6) native code compilation vs emulated code.
The SML, Clean, C, and C++ programs use *truncating*
integer arithmetic. The Erlang program uses unbounded
integer arithmetic, with no prospect of overflow. It
takes extra time to be ready for that.
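A quick shell illustration (mine, not part of the benchmark)
of what that readiness buys:

1> 1 bsl 100.                    %% 2^100, computed exactly
1267650600228229401496703205376
2> 16#7FFFFFFF * 16#7FFFFFFF.    %% would overflow a 32-bit int in C
4611686014132420609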
The fast Erlang code doesn't use a list; it uses an
*unrolled* list:
-type urlist(T) :: {T,T,T,T,urlist(T)}
                 | {T,T,T} | {T,T} | {T} | {}.
For example, {1,2,3,4,{5,6,7,8,{}}}.
The Erlang (unrolled data) code does this, with manual
loop unrolling. I have library code for unrolled
strict lists in Haskell, Clean, and SML, but not (yet)
for Erlang.
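To make the idea concrete, here is a sketch of what
bp_eva_delta could look like over the urlist type above
(my sketch, not the code behind the 2.8-second figure):

%% Each four-element node is handled in one clause, so the loop
%% body does four multiplications per recursive call.  Being
%% body-recursive, it builds the result in order: no final reverse.
bp_eva_delta_u({O1,O2,O3,O4,Os}, {S1,S2,S3,S4,Ss}) ->
    {S1*O1*(1-O1), S2*O2*(1-O2), S3*O3*(1-O3), S4*O4*(1-O4),
     bp_eva_delta_u(Os, Ss)};
bp_eva_delta_u({O1,O2,O3}, {S1,S2,S3}) ->
    {S1*O1*(1-O1), S2*O2*(1-O2), S3*O3*(1-O3)};
bp_eva_delta_u({O1,O2}, {S1,S2}) ->
    {S1*O1*(1-O1), S2*O2*(1-O2)};
bp_eva_delta_u({O1}, {S1}) ->
    {S1*O1*(1-O1)};
bp_eva_delta_u({}, {}) ->
    {}.

For the four-element inputs in the benchmark the whole result
is a single five-tuple instead of four cons cells.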
Thinking about unrolling is fair because this is something
that C and C++ compilers routinely do these days.
>
> Or did I not configure the right parameters?
Assuming we are using similar machines, it is possible that
your Erlang code was running emulated, not native.
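One way to check, assuming an OTP build with HiPE support
(the module name below is just a placeholder):

1> c(test_sum, [native, {hipe, [o3]}]).   %% native-compile via HiPE
{ok,test_sum}
2> code:is_module_native(test_sum).       %% true only if it really is native
true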