[erlang-questions] Concurrent processes on multi-core platforms with lots of chatter

Tue Dec 1 10:23:26 CET 2009

+1
And then some :)

On Tue, Dec 1, 2009 at 5:54 AM, Jayson Vantuyl <kagato@REDACTED> wrote:

> Off the top of my head, I would expect this to be a process_flag.
>
> Something like process_flag(scheduler_affinity, term()), possibly with a
> generic group specified by an atom like undefined.  This feels more
> functional than the proposed paf module, and has the benefit of being
> data-centric.
>
> The reason I would use a term (and then group by the hash of the term) is
> because it gives an elegant way to group processes by an arbitrary (possibly
> application specific) key.  Imagine if, for example, Mnesia grouped
> processes by a transaction ID, or if CouchDB grouped them by socket
> connection, etc.  By not specifying it as an atom or an integer, it lets you
> just use whatever is appropriate for the application.
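
To make the "group by the hash of the term" idea concrete, here is a minimal
sketch. It only computes which scheduler a key would map to; it does not pin
anything, since process_flag(scheduler_affinity, ...) is a proposal and not an
existing flag. The module and function names are made up for illustration;
erlang:phash2/2 and erlang:system_info(schedulers) are existing BIFs.

```erlang
-module(affinity_sketch).
-export([scheduler_for/1, scheduler_for/2]).

%% Hypothetical helper (not part of OTP): map any term to a scheduler
%% number in 1..N, where N defaults to the number of scheduler threads.
scheduler_for(Key) ->
    scheduler_for(Key, erlang:system_info(schedulers)).

%% erlang:phash2(Term, Range) returns an integer in 0..Range-1, so equal
%% keys always land on the same scheduler number for a given N.
scheduler_for(Key, NumSchedulers)
  when is_integer(NumSchedulers), NumSchedulers > 0 ->
    erlang:phash2(Key, NumSchedulers) + 1.
```

With this, affinity_sketch:scheduler_for({txn, TxnId}) gives the same answer
for every process working on one transaction, whatever the key looks like.
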
>
> I'm not too keen on reusing process groups primarily because group leaders
> are used for some really common stuff like IO, which shouldn't affect
> affinity at all.
>
> If we want to be really crazy, we could provide the ability to specify
> something like a MatchSpec to map a process group to a processor.  Call it a
> SchedSpec.  This has the added bonus that you could have multiple handlers
> that would match in order without having the full blown load of a gen_event
> or arbitrary fun.  This might also provide the beginnings of more powerful
> prioritization than the existing process_flag(priority, Level) we have now.
>
> Currently, the use case that people seem to be concerned with is ensuring
> locality of execution.  However, some people might also want to use it to
> provide dedicated cores to things like system processing.  I have no idea
> how this would fit with things like the AIO threads, but I'm pretty sure
> that HPC could benefit from, for example, dedicating 1 scheduler to system
> management tasks, 1 core to IO, and 6 cores to computation.  This is a
> higher bar, but it's important nonetheless.
>
> Of course, this would have the user thinking about the underlying CPU
> topology (which I agree is bad).  However, this is simply unavoidable in
> HPC, so it's best that we accept it.  Let me state this emphatically: if we
> try to make Erlang "smart" about scheduling, what is going to happen is that
> HPC people will dig down, figure out what it's doing wrong, then come back
> with complaints.  We will never be able to make it work right for everyone
> without exposing these same tunables (but likely with a crappier interface).
> It's better to give them powerful hooks to customize the scheduler, with
> smart default behavior for everyone else.
>
> The reason I like the process_flag(scheduler_affinity) / SchedSpec option
> is that it can easily start out with just the process_flag, and add
> something like SchedSpecs later, without having to change the API (or
> particularly the default behavior).  Basically, you get three groups of
> users:
>
> * Normal People: They don't use affinity, although pieces of the system
> might. (effectively implemented already)
> * Locality Users: They use affinity for locality using the convenient
> process_flag interface. (easily done with additional process_flag)
> * HPC: They use affinity, and plug in SchedSpecs that are custom to their
> deployment. (can be provided when demanded without breaking first two
> groups)
>
> On Nov 30, 2009, at 6:49 PM, Robert Virding wrote:
>
> > Another solution would be to use the existing process groups as these are
> > not really used very much today. A process group is defined as all the
> > processes which have the same group leader. It is possible to change group
> > leader. Maybe the VM could try to migrate processes to the same core as
> > their group leader.
> >
> > One problem today is that afaik the VM does not keep track of groups as
> > such, it would have to do this to be able to load balance efficiently.
> >
> > Robert
> >
> > 2009/11/30 Evans, Matthew <mevans@REDACTED>
> >
> >> Hi,
> >>
> >> I've been running messaging tests on R13B02, using both 8 core Intel and 8
> >> core CAVIUM processors. The tests involve two or more processes that do
> >> nothing more than sit in a loop exchanging messages as fast as they can.
> >> These tests are, of course, not realistic (as in real applications do more
> >> than sit in a tight loop sending messages), so my findings will likely not
> >> apply to a real deployment.
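
For anyone who wants to try this kind of test, a rough sketch of a two-process
ping-pong loop follows. It is not Matt's actual test code, just a guess at its
shape; the module name is invented, and erlang:monotonic_time/1 is from
releases much newer than R13B02.

```erlang
-module(pingpong_sketch).
-export([run/1]).

%% Bounce N ping/pong round trips between two processes and return
%% {N, ElapsedMicroseconds}. Illustrative only, not a real benchmark.
run(N) when is_integer(N), N > 0 ->
    Parent = self(),
    Ponger = spawn(fun() -> pong() end),
    Start = erlang:monotonic_time(microsecond),
    Pinger = spawn(fun() -> ping(N, Ponger, Parent) end),
    receive
        {done, Pinger} -> ok
    end,
    Elapsed = erlang:monotonic_time(microsecond) - Start,
    Ponger ! stop,
    {N, Elapsed}.

%% Send a ping, wait for the pong, repeat N times, then report back.
ping(0, _Ponger, Parent) ->
    Parent ! {done, self()};
ping(N, Ponger, Parent) ->
    Ponger ! {ping, self()},
    receive
        pong -> ping(N - 1, Ponger, Parent)
    end.

%% Answer every ping until told to stop.
pong() ->
    receive
        {ping, From} ->
            From ! pong,
            pong();
        stop ->
            ok
    end.
```

Comparing pingpong_sketch:run(1000000) on a VM started with "erl +S 1" against
the default SMP setup is one way to see whether cross-core messaging costs
show up on a given machine.
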
> >>
> >> First the good news: When running tests that do more than just message
> >> passing, the SMP features of R13B02 are leaps and bounds ahead of R12B05,
> >> which I was running previously. What I have noticed, however, is that in a
> >> pure messaging test (lots of messages, in a tight loop) we appear to run
> >> into caching issues where messages are sent between processes that happen
> >> to be scheduled on different cores. This got me thinking about a future
> >> enhancement to the Erlang VM: Process affinity.
> >>
> >> In this mode two or more processes that have a lot of IPC chatter would be
> >> associated into a group and executed on the same core. If the scheduler
> >> needed to move one process to another core - they would all be relocated.
> >>
> >> Although this grouping of processes could be done automatically by the VM I
> >> believe the decision making overhead would be too great, and it would likely
> >> make some poor choices as to what processes should be grouped together.
> >> Rather I would leave it to the developer to make these decisions, perhaps
> >> with a library similar to pg2.
> >>
> >> For example, library process affinity (paf) could have the functions:
> >>
> >> paf:create(Name,[Opts]) -> ok | {error, Reason}
> >> paf:join(Name,Pid,[Opts]) -> ok | {error, Reason}
> >> paf:leave(Name,Pid) -> ok
> >> paf:members(Name) -> MemberList
> >>
> >> An affinity group would be created with options for specifying the maximum
> >> size of the group (to ensure we don't have all processes on one core), a
> >> default membership time within a group (to ensure we don't unnecessarily
> >> keep a process in the group when there is no longer a need) and maybe an
> >> option to allow the group to be split over different cores if the group size
> >> reaches a certain threshold.
> >>
> >> A process would join the group with paf:join/3, and would be a member for
> >> the default duration (with options here to override the settings specified
> >> in paf:create). If the group is full the request is rejected (or maybe
> >> queued). After a period of time the process is removed from the group and a
> >> message {paf_leave, Pid} is sent to the process that issued the paf:join
> >> command. If needed the process could be re-joined at that time with another
> >> paf:join call.
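
The membership bookkeeping described above could look roughly like the sketch
below. Everything in it is an assumption made for illustration: the module
name, the max_size option, the error atoms, and the explicit Groups argument
(the proposed paf API keeps that state somewhere inside the VM or a server).
It does nothing to scheduling and leaves out the membership timeout and the
{paf_leave, Pid} message.

```erlang
-module(paf_sketch).
-export([new/0, create/3, join/3, leave/3, members/2]).

%% Purely functional stand-in for the proposed paf library.
%% Groups is a map of Name => {MaxSize, [Pid]}; every call returns
%% {Result, NewGroups} except members/2.
new() ->
    #{}.

%% Create a group. Opts is a proplist; max_size is a hypothetical
%% option that defaults to infinity (no limit).
create(Name, Opts, Groups) ->
    case maps:is_key(Name, Groups) of
        true ->
            {{error, already_exists}, Groups};
        false ->
            Max = proplists:get_value(max_size, Opts, infinity),
            {ok, Groups#{Name => {Max, []}}}
    end.

%% Join a group; rejected when the group is full. Joining twice is a no-op.
join(Name, Pid, Groups) ->
    case maps:find(Name, Groups) of
        error ->
            {{error, no_such_group}, Groups};
        {ok, {Max, Pids}} ->
            Full = (Max =/= infinity) andalso (length(Pids) >= Max),
            case {lists:member(Pid, Pids), Full} of
                {true, _} ->
                    {ok, Groups};
                {false, true} ->
                    {{error, group_full}, Groups};
                {false, false} ->
                    {ok, Groups#{Name => {Max, [Pid | Pids]}}}
            end
    end.

%% Leave a group; always ok, even for unknown groups or non-members.
leave(Name, Pid, Groups) ->
    case maps:find(Name, Groups) of
        {ok, {Max, Pids}} ->
            {ok, Groups#{Name => {Max, lists:delete(Pid, Pids)}}};
        error ->
            {ok, Groups}
    end.

%% List the current members of a group (empty list if it does not exist).
members(Name, Groups) ->
    case maps:find(Name, Groups) of
        {ok, {_Max, Pids}} -> Pids;
        error -> []
    end.
```
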
> >>
> >> Any takers? R14B01 perhaps ;-)
> >>
> >> Thanks
> >>
> >> Matt
> >>
>
>
>
> --
> Jayson Vantuyl
> kagato@REDACTED
>