[erlang-questions] A problem with ETS + SMP

Thu Mar 27 07:59:39 CET 2008

A (probably bad) alternative to ETS tables would be to use the
process dictionary: put(Key, Value) and get(Key).

I am interested in arguments for why this is a good or bad approach, but I
personally know too little about the inner workings of the process dictionary.
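
For comparison, here is a minimal sketch of the two approaches side by side. The module name, function names, and table name are made up for illustration and are not taken from Gene's program. It assumes one value is stored per sending Pid, as described in the quoted message below. The process dictionary keeps its data on the owning process's heap, while an ETS table keeps it outside the heap and copies terms on insert and lookup.

```erlang
-module(pd_vs_ets).
-export([store_pd/2, fetch_pd/1, store_ets/3, fetch_ets/2, demo/0]).

%% Process dictionary: the data lives on the calling process's own heap.
store_pd(Key, Value) ->
    put(Key, Value).          % returns the previous value, or undefined

fetch_pd(Key) ->
    get(Key).                 % returns undefined if Key is absent

%% Private ETS table: the data lives in the table, outside the process heap.
store_ets(Tab, Key, Value) ->
    ets:insert(Tab, {Key, Value}).

fetch_ets(Tab, Key) ->
    case ets:lookup(Tab, Key) of
        [{Key, Value}] -> Value;
        []             -> undefined
    end.

%% Hypothetical usage: store one input value under a sender's Pid.
demo() ->
    Tab = ets:new(neuron_inputs, [set, private]),
    Sender = self(),          % stand-in for the Pid of a sending neuron
    store_pd(Sender, 0.5),
    store_ets(Tab, Sender, 0.5),
    {fetch_pd(Sender), fetch_ets(Tab, Sender)}.
```

Calling pd_vs_ets:demo() from the shell stores the same value both ways and reads it back, which gives a starting point for timing each variant with and without -smp.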

Kind regards
Andreas Hillqvist

2008/3/26, Gene Sher <corticalcomputer@REDACTED>:
> My program has a large number of processes operating in parallel (it's a
> neural network, each process is a neuron), and each process has its own ETS
> table (set, private). Each process does a bit of work on its own table and
> then sends a message to another process. But even though each process has
> its own ETS table, when moving the program from a single core to a quad core
> (going from SMP disabled to SMP enabled on the same machine), the execution
> time increases twofold (with the same CPU speed, RAM, FSB...). It goes from
> 200us to 400us per single traversal of the entire net of processes (the
> processes are traversed thousands of times). I reprogrammed the entire thing
> using only a single public ETS table (just to see the difference): the
> single-CPU program didn't change its speed, but the quad core's execution
> time increased even further. Rewriting the program with dict, on the other
> hand, does speed up the execution when moving from single core to quad core.
> But as you know from running benchmarks, dict, though it has a very small
> (<<1us) fetch time, has a huge store time that grows with the number of
> elements stored (10us-100us...), and in my case each process needs to scale
> up to deal with millions of elements, hence the use of ETS. Using SET, on
> the other hand, is even worse than dict, in both insertion and random
> fetching.
>
> The question is: Why does it slow down with smp activated when each of the
> processes has its own ets table?
>
> Here is how far I've got with the problem:
> I don't think the mailboxes are the bottleneck. For one, each process has
> no more than 100 other processes connected to it (during a standard test),
> and in a smaller test where each process is connected to only 2 or 4 other
> processes, the same thing occurs. I use an ETS table so that messages won't
> build up in the mailbox: as soon as a message arrives at a process, it is
> immediately entered into the table, with the Pid of the sending process as
> the key and the value the sender sent along with its Pid as the value
> (e.g. {self(), prediction, Value}). With an insertion time of ~2
> microseconds and only, let's say, 4 other processes connected to each
> process, there is no bottleneck due to the mailbox. (That's why I'm using
> ETS: it's essential for the speed of the network, and for the ability to
> efficiently and quickly access any value, any input, at any time, at random.)
>
> I've tested this setup with most of the calculations in each process
> disabled (second-order derivatives and other calculations), to see what
> happens with just message passing: the same problem occurs. I've now tried
> the code on a single-CPU laptop, and a very peculiar thing happens. Without
> SMP enabled it runs at ~300us per pass; with SMP enabled (it still has only
> 1 CPU, I simply run erl -smp), it goes up to ~450us. Something funny is
> going on with SMP and ETS.
>
> On the quad core I've gathered the following data. Keeping everything else
> constant, the only thing I changed was the number of schedulers:
> smp disabled: 200us per network pass
> -smp +S 1: 300us
> -smp +S 4: 350us
> -smp +S 8: 1.14733e+4 us
>
> Has anyone ever come across a similar problem with ETS tables?
> Regards,
> -Gene