[erlang-questions] Concurrent processes on multi-core platforms with lots of chatter

Wed Dec 2 00:46:32 CET 2009

> This vision was never really developed and all that process groups are used
> for today is default I/O. This also means that they are very sparingly used.
I have code that manipulates the group_leader precisely to control where default IO happens and where SASL logs go.  I would rather not have this break.
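
To illustrate the kind of code at stake, here is a minimal sketch (not my actual code) of redirecting default IO through group_leader/2. The module name gl_capture and the function capture/1 are made up for this example. It assumes the worker only issues put_chars requests from the standard I/O protocol; anything else is refused.

```erlang
%% Hypothetical module, for illustration only.
-module(gl_capture).
-export([capture/1]).

%% Run Fun in a new process whose group leader is the calling process,
%% and return everything it wrote to default IO as a flat string.
capture(Fun) ->
    Self = self(),
    %% The worker waits for 'go' so its group leader can be swapped first.
    Pid = spawn(fun() -> receive go -> Fun() end end),
    Ref = monitor(process, Pid),
    true = group_leader(Self, Pid),
    Pid ! go,
    collect(Ref, []).

%% Act as a very small I/O server until the worker exits.
collect(Ref, Acc) ->
    receive
        {io_request, From, ReplyAs, Request} ->
            {Reply, Chars} = handle(Request),
            From ! {io_reply, ReplyAs, Reply},
            collect(Ref, [Chars | Acc]);
        {'DOWN', Ref, process, _Pid, _Reason} ->
            lists:flatten(lists:reverse(Acc))
    end.

%% Only the two put_chars forms are handled in this sketch.
handle({put_chars, _Encoding, Chars}) ->
    {ok, unicode:characters_to_list(Chars)};
handle({put_chars, _Encoding, M, F, A}) ->
    {ok, unicode:characters_to_list(apply(M, F, A))};
handle(_Other) ->
    {{error, request}, []}.
```

Calling gl_capture:capture(fun() -> io:format("hello ~p~n", [42]) end) hands the output back to the caller instead of printing it. Any change to what group_leader means has to keep code of this shape working.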

> - I think the fact that process groups today are used for stuff common to
> the group is something which should affect affinity. Where better to put
> stuff common to a group of processes than close to the processes?
I don't think that the current semantics of process groups can accommodate this.  By doing this, you either bind logging, default IO, and affinity together, or you break any code that uses group_leader to manage logging and default IO.  I would imagine that the latter is pretty painful.

> - As process groups are used for very little, then they are very little
> used. There is nothing as far as I know which demands that an application is
> in only one group, it is done so today because that is sufficient for
> what process groups provide. In the same way you have supervisor trees there is
> nothing which prohibits you from having group leader trees with different
> types of group leaders depending on where the group common stuff is to be
> handled; a group leader could pass on its requests up the tree until the
> right group leader is reached.

Again, we're overloading a concept.  If we had keyed group leaders, this might make sense.  Something like group_leader(GroupLeader, Pid, GroupKey), so you could do group_leader(Leader, self(), affinity).  However, I don't particularly see this as more effective than a process_flag, and I am loath to break any custom code for directing SASL logs or default IO.  This is a *big thing*.

Another mark in favor of process_flag is that it doesn't require an arbitrary leader process.  From my limited testing, it appears that when a group_leader exits, the group_leader function still returns the process's old group_leader.  I'm not sure that this is the right behavior for affinity.  Similarly, there may not be a suitable leader process in all situations, and tossing around extra processes for this seems like more work than necessary.

I can't imagine that this makes it any easier on the scheduling side either.  Migration of a group_leader would be a big deal, as tons of processes might follow it.  Allocating affine processes from a hash is a static matter with low overhead.  Making group_leaders "static" doesn't really help either, as it's more interface overhead (and may create load issues unless you rebalance them, again with a hash, just like I propose with process_flag).
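
As a rough sketch of what "allocating from a hash" could look like: the module affinity_hash and its functions are hypothetical, and the choice of erlang:phash2/2 as the hash is an assumption made for this example, not a statement about how the VM would do it. The sketch only computes a scheduler index for an affinity key; it does not pin anything.

```erlang
%% Hypothetical module, for illustration only.
-module(affinity_hash).
-export([scheduler_for/1, scheduler_for/2]).

%% Map an affinity key (any term) to a scheduler id in 1..N, where N is
%% the number of schedulers in the running VM.
scheduler_for(Key) ->
    scheduler_for(Key, erlang:system_info(schedulers)).

%% Same mapping with an explicit scheduler count.
%% erlang:phash2(Term, Range) returns a value in 0..Range-1.
scheduler_for(Key, N) when is_integer(N), N > 0 ->
    erlang:phash2(Key, N) + 1.
```

Every process tagged with the same key lands on the same index without any leader process to consult or migrate, which is the "static matter with low overhead" above.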

Again, affinity with process_flag requires one line of code per worker, ever.  No processes are special, which is nice.  The little bits of extra work associated with managing processes are one more thing on the pile of stuff that makes Erlang code unnecessarily verbose (which, honestly, is already quite significant).

One thing I did notice was that group_leader is explicitly inherited.  This is important, because I think that's a behavior of process affinity that has been assumed, but nobody mentioned it yet.  Using group_leader would definitely imply inheritance.
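
The inheritance is easy to check with a small sketch (the module name gl_inherit is made up for this example): a freshly spawned child reports its group_leader() back to the parent, which can compare it with its own.

```erlang
%% Hypothetical module, for illustration only.
-module(gl_inherit).
-export([child_leader/0]).

%% Spawn a child and return the group leader that the child sees.
child_leader() ->
    Parent = self(),
    Pid = spawn(fun() -> Parent ! {leader, self(), group_leader()} end),
    receive
        {leader, Pid, Leader} -> Leader
    end.
```

In the shell, gl_inherit:child_leader() =:= group_leader() should evaluate to true, since a spawned process gets the group leader of the process that spawned it.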

> I don't like the idea of adding new features to the language if there is
> something already there which could be used, the language is growing enough
> as it is, not all bad of course.

Ironically, that's the same reason I recommended process_flag.  process_flag exists already and is designed to be extensible.  I didn't really see this as an added feature, so much as a completely backwards-compatible extension to an intentionally extensible interface.  Unless you're talking about the feature of affinity, which is added in either case.

-- 
Jayson Vantuyl
kagato@REDACTED