[erlang-questions] Representation of a map in bytecode?

Wed Sep 19 00:33:29 CEST 2007

To repeat a point that has already been made:
(1) Providing a parallel map that people can call IF THEY WANT TO is  
both a
     good idea and easy to do in Erlang.  A low level implementation  
of pmap
     might be a good idea.
(2) Turning ALL maps into parallel ones is a dumb idea.  Think about  
a tree.
     There are two ways to measure how big it is:  SIZE (total number  
of nodes)
     and DEPTH (length of longest path).  Now any computation can be  
seen as a
     tree of dependencies (this is calculated in terms of that).  In  
a serial
     system, the time is proportional to the SIZE.  In a system with  
more
     processors than you could ever want, the time is proportional to  
the
     DEPTH.  It is only worth parallelising a map when the SIZE of the
     computation significantly exceeds the DEPTH, that is, when the  
amount
     of work per element is "high" compared with the cost of simply  
getting to
     all the elements.  Given the current implementation of lists,
	map(fun(X) -> X+1 end, L)
     is going to be O(length(L)) even given infinitely many processors.

Suppose we had tuple maps, specified as

        tmap(F, T) -> list_to_tuple(map(F, tuple_to_list(T))).

but implemented directly.  It is straightforward to implement this with
depth proportional to log(size(T)), so automatically parallelising tuple
maps would very nearly be sensible.  (And it's what things like  
parellising
C and Fortran compilers do.)  Even then, for cheap enough F or small
enough T, it might not be worth while.

I am *not* saying that automatic parallisation is a bad idea, only that
*blind* automatic parallelisation is a bad idea.  Doing it right  
requires
reasonably thorough whole-module analysis (at least!), not simply  
pattern
matching byte code.

Something like the approach that Clean used to take might work pretty
well.  Clean is a pure lazy functional language like Haskell, but with
lightweight arrays and usually getting better performance.  It was
called Concurrent Clean because it really did have a multiprocessor
concurrent implementation.  Clean used programmer supplied annotations
to indicate where parallelising would be a good idea.

Another approach is the "parallel skeletons" one, where there are
preprogrammed patterns of concurrency and you just plug your details
in.  Not unlike Erlang behaviours, come to think of it.  And not
unlike Joe's parallel map that you can call when it's appropriate.