[erlang-questions] A problem with ETS + SMP

Ulf Wiger ulf@REDACTED
Thu Mar 27 08:34:26 CET 2008

Good: no copying, and only the process itself needs to be locked.
Also good: the process dictionary is implemented as a linear hash
table, so it scales about as well as ets does.

Bad: No search functions, other than fetching all objects and sifting
through them.
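
As a rough illustration of that limitation, here is a minimal sketch (the
module name pd_search, the function select/1, and the {neuron, N} keys are
made up for this example). A lookup by key is a single get/1, but any kind of
search means calling get/0 to fetch every {Key, Value} pair and filtering the
resulting list:

```erlang
-module(pd_search).
-export([demo/0, select/1]).

%% Return all {Key, Value} pairs in the calling process's dictionary
%% for which Pred(Key, Value) is true. get/0 returns the whole
%% dictionary as a list, so the cost is linear in the number of entries.
select(Pred) ->
    [{K, V} || {K, V} <- get(), Pred(K, V)].

demo() ->
    put({neuron, 1}, 0.25),
    put({neuron, 2}, 0.75),
    put({neuron, 3}, 0.50),
    %% Direct lookup by key is cheap ...
    0.75 = get({neuron, 2}),
    %% ... but "all values above 0.4" needs a full scan. The is_float/1
    %% check skips unrelated entries that may already be in the dictionary.
    %% The order of get/0 is unspecified, hence the sort.
    lists:sort(select(fun(_K, V) -> is_float(V) andalso V > 0.4 end)).
```

Calling pd_search:demo() should return the entries for {neuron, 2} and
{neuron, 3}.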

Bad(?): There is no efficient way for another process to access a
single object; it has to ask the owner for it, or pull the entire
dictionary and do a linear search through it.
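
The "ask the owner" option can be sketched roughly as follows. This is only an
illustration under assumed names (the module pd_owner, the message formats,
and the 5 second timeout are all hypothetical): the owner keeps its data in
its own process dictionary and answers requests for single keys, so only the
requested value is copied to the caller.

```erlang
-module(pd_owner).
-export([start/0, store/3, lookup/2]).

%% Spawn a process that keeps its data in its own process dictionary.
start() ->
    spawn(fun loop/0).

%% Ask the owner to store a value (asynchronous).
store(Owner, Key, Value) ->
    Owner ! {put, Key, Value},
    ok.

%% Ask the owner for a single object; only that value is copied back.
lookup(Owner, Key) ->
    Ref = erlang:monitor(process, Owner),
    Owner ! {get, self(), Ref, Key},
    receive
        {Ref, Value} ->
            erlang:demonitor(Ref, [flush]),
            Value;
        {'DOWN', Ref, process, _Pid, Reason} ->
            exit({owner_down, Reason})
    after 5000 ->
        erlang:demonitor(Ref, [flush]),
        exit(timeout)
    end.

%% Owner loop: all put/get calls happen inside the owning process.
loop() ->
    receive
        {put, Key, Value} ->
            put(Key, Value),
            loop();
        {get, From, Ref, Key} ->
            From ! {Ref, get(Key)},   % undefined if the key is absent
            loop()
    end.
```

The other option, pulling the entire dictionary, corresponds to
erlang:process_info(Pid, dictionary), which copies every entry to the caller
before any searching can start.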

Bad(?): If the process dies, the entire contents of the process
dictionary are included in the crash report. This is often helpful,
but may be a terrible idea if the dictionary is very large.

Ulf W

2008/3/27, Andreas Hillqvist <andreas.hillqvist@REDACTED>:
> A (probably bad) alternative to ets tables would be to use the
>  process dictionary: put(Key, Value) and get(Key).
>  I am interested in arguments for why this is a good/bad approach, but I
>  personally know too little about the inner workings of the process dictionary.
>  Kind regards
>  Andreas Hillqvist
>  2008/3/26, Gene Sher <corticalcomputer@REDACTED>:
> > My program has a large number of processes operating in parallel (it's a neural
>  > network; each process is a neuron), and each process has its own ETS table
>  > (set, private). Each process does a bit of work on its own table and then
>  > sends a message to another process. But even though each process has its
>  > own ETS table, when moving the program from a single core to a quad core
>  > (going from smp disabled to smp enabled on the same machine), the execution
>  > time increases twofold (with the same CPU speed, RAM, FSB...). So
>  > it goes from 200us to 400us per single traversal of the entire net of
>  > processes (the processes are traversed thousands of times...). I
>  > reprogrammed the entire thing using only a single public ets table (just to
>  > see the difference), the single cpu program didn't change its speed, but the
>  > quad core increased execution time even further. Rewriting the program with
>  > dict, on the other hand, does speed up execution when moving from single
>  > core to quad core. Though, as you guys know from running benchmarks, dict,
>  > despite having a very small (<<1us) fetch time, has a huge store time that
>  > grows larger and larger with the number of elements stored
>  > (10us-100us...), and in my case each process needs to scale up to deal
>  > with millions of elements, hence the use of ETS. Using SET, on the other
>  > hand, is even worse than dict in both insertion and random fetching.
>  >
>  > The question is: Why does it slow down with smp activated when each of the
>  > processes has its own ets table?
>  >
>  > Here is how far I've got with the problem:
>  > I don't think there is a bottleneck due to mail
>  > boxes. For one, each process does not have more than 100 other processes
>  > connected to it (during a standard test), and in a smaller test where each
>  > process is connected to only 2 or 4 other processes, the same thing occurs. I
>  > use an ETS table so that messages won't build up in the mail
>  > box: as soon as a message arrives at a process, the process immediately enters it
>  > into the table, with the Pid of the sending process as the key and
>  > the value the sender sent along with its Pid (e.g. {self(), prediction,
>  > Value}). With an insertion time of ~2 microseconds and only, let's say,
>  > 4 other processes connected to each process, there is no bottleneck
>  > due to the mail box. (That's why I'm using ETS: it's essential for the speed of
>  > the network, and for the ability to efficiently and quickly access any
>  > value, any input, any time... at random.)
>  >
>  > I've tested this setup with most of the calculations in each process disabled
>  > (second-order derivatives and other calculations) to see what happens with
>  > just message passing; the same problem occurs. I've now tried the code on
>  > a single-CPU laptop, and a very peculiar thing happens. Without smp enabled it
>  > runs at ~300us per pass; with smp enabled (and it still only has 1 CPU, I
>  > simply ran: erl -smp), it goes up to ~450us. Something funny is going on
>  > with smp and ets.
>  >
>  > On the quad core I've gathered the following data:
>  > With everything else held constant, the only thing I changed was the
>  > number of schedulers:
>  > smp disabled: 200us per network pass
>  > -smp +S 1: 300us
>  > -smp +S 4: 350us
>  > -smp +S 8: 1.14733e+4 us
>  >
>  > Has anyone ever come across a similar problem with ets tables?
>  > Regards,
>  > -Gene
>  >
> > _______________________________________________
>  > erlang-questions mailing list
>  > erlang-questions@REDACTED
>  > http://www.erlang.org/mailman/listinfo/erlang-questions
>  >