[erlang-questions] A problem with ETS + SMP
Ulf Wiger
ulf@REDACTED
Thu Mar 27 08:34:26 CET 2008
Good: no copying, and no locking beyond the process itself.
Also good: the process dictionary is implemented as a linear hash
table, so it scales about as well as ets does.
Bad: no search functions, other than fetching all objects (get/0) and
sifting through them.
Bad(?): There is no efficient way for another process to access a
single object; it has to ask the owner for it, or pull the entire
dictionary (e.g. via process_info/2) and do a linear search through
it. See the sketch below.
Bad(?): If the process dies, the entire contents of the process
dictionary are included in the crash report. This is often helpful,
but may be a terrible idea if the dictionary is very large.
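
To illustrate the cross-process access problem, a minimal sketch (the
module name and message formats are made up):

    -module(pdict_owner).
    -export([loop/0]).

    %% The owner itself just calls put/2 and get/1; any other
    %% process has to go through the owner's mailbox:
    loop() ->
        receive
            {put, Key, Value} ->
                put(Key, Value),
                loop();
            {get, Key, From} ->
                From ! {Key, get(Key)},
                loop()
        end.

Another process has to do Owner ! {get, Key, self()} and then wait for
the reply, where a public ets table would have allowed a single
ets:lookup/2 call.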
BR,
Ulf W
2008/3/27, Andreas Hillqvist <andreas.hillqvist@REDACTED>:
> A (probably bad) alternative to ets tables would be to use the
> process dictionary: put(Key, Value) and get(Key).
>
> I am interested in arguments for why this is a good/bad approach, but I
> personally know too little about the inner workings of the process
> dictionary.
>
>
> Kind regards
> Andreas Hillqvist
>
> 2008/3/26, Gene Sher <corticalcomputer@REDACTED>:
>
> > My program has a large number of parallel operating processes (it's a neural
> > network; each process is a neuron), and each process has its own ETS table
> > (set, private). Each process does a bit of work on its own table, and then
> > sends a message to another process... But even though each process has its
> > own ETS table, when moving the program from a single core to a quad core
> > (going from SMP disabled to SMP enabled on the same machine), the execution
> > time increases twofold (when retaining the same CPU speed, RAM, FSB...). So
> > it goes from 200us to 400us per single traversal of the entire net of
> > processes (the processes are traversed thousands of times...). I
> > reprogrammed the entire thing using only a single public ETS table (just to
> > see the difference); the single-core program didn't change its speed, but on
> > the quad core the execution time increased even further. Rewriting the
> > program with dict, on the other hand, does speed up the execution when
> > moving from single core to quad core. Though, as you guys know from running
> > benchmarks, dict, while having a very small (<<1us) fetch time, has a huge
> > store time that grows larger and larger with an increasing number of stored
> > elements (10us-100us...), and in my case each process needs to scale up to
> > deal with millions of elements, hence the use of ETS. Using SET, on the
> > other hand, is even worse than dict in both insertion and random fetching.
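> >
> > (Roughly the kind of micro-benchmark in question; the module and
> > function names are illustrative:)
> >
> >     -module(bench).
> >     -export([ets_insert/1, dict_store/1]).
> >
> >     %% N inserts into a fresh private set table
> >     ets_insert(N) ->
> >         T = ets:new(t, [set, private]),
> >         lists:foreach(fun(K) -> ets:insert(T, {K, K}) end,
> >                       lists:seq(1, N)),
> >         ets:delete(T).
> >
> >     %% N dict:store/3 operations on a growing dict
> >     dict_store(N) ->
> >         lists:foldl(fun(K, D) -> dict:store(K, K, D) end,
> >                     dict:new(), lists:seq(1, N)).
> >
> >     %% compare, e.g.:
> >     %% timer:tc(bench, ets_insert, [1000000]).
> >     %% timer:tc(bench, dict_store, [1000000]).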
> >
> > The question is: why does it slow down with SMP activated when each of
> > the processes has its own ETS table?
> >
> > Here is how far I've gotten with the problem:
> > I don't think there is bottlenecking due to mailboxes. For one, each
> > process has no more than 100 other processes connected to it (during a
> > standard test), and in a smaller test where each process is connected
> > to only 2 or 4 other processes, the same thing occurs. I use an ETS
> > table so that messages won't build up in the mailbox: as soon as a
> > message arrives at a process, it right away enters it into the table,
> > keyed on the Pid of the process that sent it the message, with the
> > value the sender had sent along with its Pid (e.g. {self(), prediction,
> > Value}). With an insertion time of ~2 microseconds and, let's say, only
> > 4 other processes connected to each process, there is no bottlenecking
> > due to the mailbox. (That's why I'm using ETS: it's essential for the
> > speed of the network, and for the ability to efficiently and quickly
> > access any value, any input, any time... at random.)
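> >
> > (Schematically, each neuron does something like this; the function and
> > message names are illustrative, not the actual code:)
> >
> >     start_neuron() ->
> >         spawn(fun() ->
> >             %% table is owned by, and private to, the neuron process
> >             Tab = ets:new(inputs, [set, private]),
> >             neuron_loop(Tab)
> >         end).
> >
> >     neuron_loop(Tab) ->
> >         receive
> >             {From, prediction, Value} ->
> >                 %% keyed on the sender's Pid; in a set table this
> >                 %% simply overwrites that sender's previous value
> >                 ets:insert(Tab, {From, prediction, Value}),
> >                 neuron_loop(Tab)
> >         end.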
> >
> > I've tested this setup with most of the calculations in each process
> > disabled, to see what happens with just the message passing (second-order
> > derivatives and other calculations turned off); the same problem occurs.
> > I've now tried the code on a single-CPU laptop, and a very peculiar
> > thing happens. Without SMP enabled it runs at ~300us per pass; with SMP
> > enabled (and it still only has 1 CPU; I simply run: erl -smp), it goes
> > up to ~450us. Something funny is going on with SMP and ETS.
> >
> > On the quad core I've gathered the following data, letting everything
> > else stay constant; the only thing I changed was the number of
> > schedulers:
> >
> > smp disabled: 200us per network pass
> > -smp +S 1:    300us
> > -smp +S 4:    350us
> > -smp +S 8:    1.14733e+4 us (~11473us)
> >
> > Has anyone ever come across a similar problem with ETS tables?
> > Regards,
> > -Gene
> >
>