<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
The problem with quantifying those numbers is that I've got several
different plausible designs for the system, and each has different
values for them.<br>
<br>
E.g., one design called for one process per CPU core. In that
design each process would need an ets table and a mnesia table. The
mnesia table would be disk-only. The ets table would hold perhaps
100,000 entries, each of which would carry a time stamp for its time
of last access. When the table started getting full, stale entries
would need to be rolled out to the database and purged. <br>
This design uses a lot less RAM, and far fewer processes.<br>
<br>
Now the tables would be keyed by a programmatically generated key
value, so that each item can be referred to uniquely when it is rolled
out, which makes it possible to roll it back in. <br>
In this design there would be perhaps (at a wild guess!!) one i/o
operation for every sqrt(# of CPUs * # of entries/table) function
calls. But they would tend to come in bursts, so i/o would
definitely slow things down considerably.<br>
<br>
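A minimal sketch of that roll-out step, in Erlang. The module name
entry_cache, its function names, and the Persist callback (standing in
for the write to the disk-only mnesia table) are made up for
illustration; only the ets and erlang calls are real library functions:<br>
<pre wrap="">-module(entry_cache).
-export([new/0, store/3, fetch/2, evict_stale/3]).

%% Rows are {Key, Value, LastAccess}; LastAccess is monotonic time in ms.
new() ->
    ets:new(entries, [set, private]).

store(Tab, Key, Value) ->
    true = ets:insert(Tab, {Key, Value, now_ms()}),
    ok.

%% Look up an entry and refresh its time stamp.
fetch(Tab, Key) ->
    case ets:lookup(Tab, Key) of
        [{Key, Value, _Old}] ->
            true = ets:update_element(Tab, Key, {3, now_ms()}),
            {ok, Value};
        [] ->
            %% A real version would try to roll the entry back in here.
            not_found
    end.

%% Persist and purge every entry not touched within MaxAgeMs.
%% Persist is a caller-supplied fun(Key, Value) (hypothetical hook).
%% Returns the number of entries rolled out.
evict_stale(Tab, MaxAgeMs, Persist) ->
    Cutoff = now_ms() - MaxAgeMs,
    %% Select {Key, Value} for rows whose time stamp is older than Cutoff.
    Spec = [{{'$1', '$2', '$3'}, [{'>', Cutoff, '$3'}], [{{'$1', '$2'}}]}],
    Stale = ets:select(Tab, Spec),
    lists:foreach(fun({K, V}) ->
                          Persist(K, V),
                          ets:delete(Tab, K)
                  end, Stale),
    length(Stale).

now_ms() ->
    erlang:monotonic_time(millisecond).
</pre>
With this shape, evict_stale(Tab, 60000, Persist) would roll out
everything untouched for the last minute. Because the table is private,
only the owning process can call these functions on it.<br>
<br>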
Well, that design wasn't optimized for Erlang. I've been
contemplating variations of it in many different languages. <br>
<br>
Now if I can have one process per entry, and if dormant processes
(blocked in a receive) can sleep in virtual memory, then I will
need a few million processes, but I will be able to let the system
manage the activation/sleep cycle of the processes. In that case
each external signal is likely to induce around 1,000 to 100,000
activations in a chain before relaxing into a settled state. Each
activation will likely cause only about 500 bytes of data to be
copied (another wild guess, with part of the uncertainty being how
many internal pointers Erlang will need to adjust). But in this
case the data can be passed on the stack, and tail calls can be
used.<br>
<br>
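A rough sketch of what one such entry process might look like. The
module name entry_proc, the message shapes, the integer state, and the
5 second idle time are all assumptions for illustration. The call
erlang:hibernate/3 is a real BIF that shrinks an idle process's memory
footprint until its next message arrives; whether that is close enough
to "sleeping in virtual memory" is a separate question:<br>
<pre wrap="">-module(entry_proc).
-export([start/1, bump/1, read/1, loop/1]).

-define(IDLE_MS, 5000).  %% hypothetical idle time before hibernating

start(InitialState) ->
    spawn(?MODULE, loop, [InitialState]).

%% Asynchronous update: one "activation" of the entry.
bump(Pid) ->
    Pid ! bump,
    ok.

%% Synchronous read of the current state.
read(Pid) ->
    Ref = make_ref(),
    Pid ! {read, self(), Ref},
    receive
        {Ref, State} -> {ok, State}
    after 1000 ->
        timeout
    end.

%% The state lives in the loop argument; every clause that continues
%% does so with a tail call, so the stack does not grow.
loop(State) ->
    receive
        bump ->
            loop(State + 1);
        {read, From, Ref} ->
            From ! {Ref, State},
            loop(State);
        stop ->
            ok
    after ?IDLE_MS ->
        %% Idle: drop the call stack, compact the heap, and wait.
        %% The next message re-enters loop/1 with the same state.
        erlang:hibernate(?MODULE, loop, [State])
    end.
</pre>
To run a few million of these the process limit has to be raised at
startup, e.g. erl +P 5000000 (the number is only an example), and
erlang:system_info(process_limit) reports the limit actually in
effect.<br>
<br>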
My problem has been that when I searched for limits on the number of
Erlang processes I got:<br>
<i>The maximum number of simultaneously alive Erlang processes is by
default 32,768. This limit can be configured at startup. For more
information, see the </i><i><span class="bold_code bc-13"><a
href="http://erlang.org/doc/man/erl.html#max_processes"><span
class="code">+P</span></a></span></i><i> command-line flag
in the </i><i><span class="bold_code bc-18"><a
href="http://erlang.org/doc/man/erl.html"><span class="code">erl(1)</span></a></span></i><i>
manual page in ERTS.</i><br>
and:<br>
<i>The best thing to do is create a lagom number of processes. Not
too many, not too few.</i><br>
and:<br>
<i><span class="ui_qtext_rendered_qtext">The actual scalability
achieved depends on your problem, on your design choices, and on
the underlying execution framework.<br>
<br>
Erlang has some things going for it, and while synthetic
benchmarks have been produced, e.g. that show linear scalability
within one node up to some 30-40 cores, and linear scalability
in an Erlang cluster up to 100 nodes and a total of 1200 cores,
the scalability story in Erlang is not so much about that, as it
is about achieving real-world scalability in systems that
actually do something useful.<br>
<br>
</span></i><span class="ui_qtext_rendered_qtext">Since I have a
single system, this left me with the impression that I shouldn't
use too many processes, and that the system's best guess at a
reasonable maximum was a bit under 32,768. It *was* clear that I
could raise that limit, but raising it by more than an order of
magnitude, while allowed, seemed probably unwise.<br>
<br>
It now appears that this was a mistaken assumption, but I still
don't see why I should have guessed differently.<br>
<br>
</span>On 02/08/2018 12:05 PM, Joe Armstrong wrote:<br>
<blockquote type="cite"
cite="mid:CAANBt-pzhjsCoBXk+QNN7D26SnThx_4nvRV-=B4JXo9LCj2=hA@mail.gmail.com">
<pre wrap="">In order to even think about your question I'd need certain data -
words like "huge" as in "huge amounts of copying" and "limited numbers
of processes" etc. do not convey much meaning.

Huge means different things to different people - to some people Huge
means Gbytes (I talked the other day to somebody who used the word
Huge - and I said "how big" he said tens of PetaBytes)

To me huge means a data structure that is larger than the RAM on my
machine (which is 16GB) - so not only do you have to say what you
meant by huge but also how your numbers relate to your machine(s).

Also how long do you have to do what? - Handling huge amounts of data
is easy if you have a big enough disks and enough time - you also need
to say (roughly) how long you have to do what (are we talking seconds,
milliseconds, hours, days???)

The more numbers you add to questions like this the better answers
you'll get :-)

Cheers

/Joe
On Wed, Feb 7, 2018 at 5:56 PM, Charles Hixson
<a class="moz-txt-link-rfc2396E" href="mailto:charleshixsn@earthlink.net">&lt;charleshixsn@earthlink.net&gt;</a> wrote:
</pre>
<blockquote type="cite">
<pre wrap="">When should a private ets table be preferred over the process dictionary?

To give some context, I'm expecting to have nearly as many processes as I can
run, and that each one will need internal mutable state. Also, that the
mutable state will be complex (partially because of the limited number of
processes), so passing the state as function parameters would entail huge
amounts of copying. (Essentially I'd be modifying nodes deep within trees.)

Mutable state would allow me to avoid the copying, and the state is not
exported from the process. I'm concerned that a huge number of private ets
tables would use excessive memory, decreasing the number of processes I
could use...but all the references keep saying not to use the process
dictionary.

I'm still designing things now, so this is the ideal time to decide. An
alternative is that I could use a single public ets table, with each process
only accessing its own data, but I suspect that might involve a lot of
locking overhead, even though in principle nothing should need to be locked.
</pre>
</blockquote>
</blockquote>
<br>
</body>
</html>