[erlang-questions] Erlang math libraries

Thu May 17 02:23:48 CEST 2007

On 16 May 2007, at 3:39 pm, Jay Anderson wrote:
> That's too bad, many big numerical computations are highly  
> parallelizable which seems to be the big sell of erlang.

There's a distinction between "parallel" and "concurrent".
Erlang is designed for concurrency and distribution rather than
parallelism.

> Are there reasons these problems are not solved with erlang?

If I want Fortran, I know where to find it.  If I have a numeric
problem which is big enough and hard enough that I want to run it
in parallel, I want it to be efficient.  There is no substitute
for a language that provides a compiler with enough information to
pare overheads to the bone, and unfortunately compilers that are
really good at optimising numeric code are not cheap to develop.
Putting that kind of support into Erlang would have taken scarce
development resources away from the kind of problems Erlang was
invented for.

Mind you, it would be nice to have good numeric performance *as well*,
and one could certainly imagine an amalgam of Erlang and APL (but not,
please, J; there are good ideas in J but readability isn't one of them).

Let's take an example.

8000 dot products of 3000-element vectors:
    Fortran 90:  0.79 seconds  (Sun Fortran compiler, -O3)
    C 89:        1.96 seconds  (Sun C compiler, -xO3)
    Erlang:     42.33 seconds  (erlc +native)

You would have to be running on at least 54 machines (assuming linear
speedup) for parallel Erlang code to catch up with Fortran, but then
the Fortran compiler I use would do automatic parallelisation for me
if I had the right licence (quick check: by golly I *DO*, it's just
that I only have a uniprocessor machine).

> I'd want to do much larger matrix operations.
> Any thoughts on how to represent a matrix with erlang? (dense or
> sparse) Would a list of lists be good enough or would there be a  
> better way?

A list of lists is good enough *expressively*.  You can even do
Gaussian elimination with partial pivoting with surprising ease.
However, think about the space and memory reference count overheads
compared with Fortran or C arrays numbers: both are a factor of two.

Another representation is better for parallelism:
     1x1 matrix: a number
     2nx2n matrix: {TL,TR,
                    BL,BR}
For example,

     matrix_add({Atl,Atr,Abl,Abr}, {Btl,Btr,Bbl,Bbr}) ->
         {matrix_add(Atl,Btl), matrix_add(Atr,Btr),
          matrix_add(Abl,Abl), matrix_add(Abr,Bbr)};
     matrix_add(A, B) -> A+B.

The child adds can proceed in parallel without having to wade through
a whole lot of list cells to find their work.