Upcoming article in Dr. Dobbs'

Mon Jan 10 12:36:29 CET 2005

My view is basically the same as Vlad's. While the
tradeoffs of a chip multiprocessor might be different
from those on an SMP, a multithreaded VM incurs costs
in efficiency and implementation effort, and it's not
obvious that the possible gains outweigh those costs.
An Erlang layer on top of the current system might
extract most of the gains at a small fraction of the
effort of rewriting the VM.

(Pekka Hedqvist wrote his MSc thesis on the topic:
http://www.erlang.se/publications/xjobb/0089-hedqvist.pdf)

That said, some improvements might still be made.

One could speed up distributed Erlang by adding a
faster communications channel than sockets when
running on the same host. (Pekka suggested this after
his MSc was done, and it seemed sensible to me, at
least.)

Some systems have used mmap() to good effect here, to
avoid serialization/socket/deserialization of terms;
regular Erlang might get along with just the ordinary
serialize/copy/deserialize, which should improve
things a bit, and a shared-heap Erlang might do even
better if GC is not a problem. (But first, what _are_
the internode communication costs?)
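
One rough way to get a first number for the
serialization part of that cost is to time a
term_to_binary/binary_to_term round trip. This is only a
minimal sketch: the module name ser_cost and the function
roundtrip_us/2 are made up for illustration, and it
measures encode/decode inside a single VM, not the socket
hop between nodes.

```erlang
-module(ser_cost).
-export([roundtrip_us/2]).

%% Hypothetical helper, not part of OTP.
%% Returns the average number of microseconds needed to
%% serialize and deserialize Term, taken over N iterations.
roundtrip_us(Term, N) when is_integer(N), N > 0 ->
    {Micros, ok} = timer:tc(fun() -> loop(Term, N) end),
    Micros / N.

loop(_Term, 0) ->
    ok;
loop(Term, N) ->
    %% The match checks that the decoded term equals the original.
    Term = binary_to_term(term_to_binary(Term)),
    loop(Term, N - 1).
```

Something like ser_cost:roundtrip_us(lists:seq(1, 1000),
10000) would give a per-message figure to compare against
the measured cost of sending the same term to another
node on the same host.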

Another improvement would be to implement process
migration, though it's again unclear how much this
would gain. Short-lived processes probably won't gain
anything. Long-lived processes might gain, at least
sometimes. But replicating the migrating process might
do even better than that.

In some cases, the underlying OS imposes limitations:
blocking system calls, for example, have to be run off
somewhere else. I think that's handled today, but I'm
not sure.

New OS primitives for chip multiprocessors might speed
things up, and might make multithreading more
palatable. I'm not sure what those would be, though.
Running multiple processors as independently as
possible is likely to remain the main technique to
extract speedups.

So, a modest proposal: perhaps one should instead just
write an Erlang layer to hide nodes even further;
e.g., spawn_lb/1 to spawn a process on the least
loaded node; support for pickling and migrating
gen_servers (transparently), etc. 
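
A minimal sketch of what spawn_lb/1 might look like,
assuming "least loaded" means the shortest scheduler run
queue as reported by erlang:statistics(run_queue). The
module name lb and the helper least_loaded/1 are invented
here, the one-second timeout is arbitrary, and a real
version would want a better load metric and some caching.

```erlang
-module(lb).
-export([spawn_lb/1, least_loaded/1]).

%% Hypothetical API: spawn Fun on whichever known node
%% currently reports the shortest run queue. The module
%% that defines Fun must be loaded on the target node.
spawn_lb(Fun) when is_function(Fun, 0) ->
    spawn(least_loaded([node() | nodes()]), Fun).

%% Ask each node for its run queue length. Nodes that do
%% not answer within one second return {badrpc, _} and are
%% skipped. Falls back to the local node if none answer.
least_loaded(Nodes) ->
    Loads = [{Load, N}
             || N <- Nodes,
                Load <- [rpc:call(N, erlang, statistics,
                                  [run_queue], 1000)],
                is_integer(Load)],
    case lists:sort(Loads) of
        [{_, Best} | _] -> Best;
        [] -> node()
    end.
```

Polling every node on each spawn is itself internode
traffic, so this only pays off for processes that live
long enough to amortize it.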

If distribution performance turns out to be a problem,
start by mapping suitable processes to the same node,
replicating certain processes, etc.

Best,
Thomas
