[erlang-questions] Millions of processes?

Tue Sep 23 19:09:30 CEST 2008

NPTL is fast, but AFAIK uses a minimum stack size of 8KB
per thread (the minimum heap size for Erlang processes
seems to be 932 bytes on a 32-bit system). There also seem
to be other limits, making it very difficult in practice
to reach anywhere near 100,000 threads, and it's not
encouraged either.
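
To put those two figures side by side, here is a rough back-of-the-envelope
sketch. It assumes exactly the numbers quoted above (8 KB minimum stack per
NPTL thread, 932 bytes minimum heap per Erlang process on 32-bit) and nothing
else; the function names are made up for illustration, and real memory use
will be higher on both sides because neither figure includes per-thread or
per-process bookkeeping.

```python
# Rough memory floor for N threads vs. N Erlang processes.
# Both constants are the figures quoted in this message, taken as
# assumptions; they are lower bounds, not measurements.

THREAD_STACK_BYTES = 8 * 1024   # assumed minimum NPTL stack per thread
ERLANG_HEAP_BYTES = 932         # assumed minimum Erlang heap, 32-bit


def total_mib(count, bytes_each):
    """Total memory in MiB for `count` units of `bytes_each` bytes."""
    return count * bytes_each / (1024 * 1024)


def compare(count):
    """Return the memory floor in MiB for threads and for processes."""
    return {
        "threads_mib": total_mib(count, THREAD_STACK_BYTES),
        "processes_mib": total_mib(count, ERLANG_HEAP_BYTES),
    }


if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        result = compare(n)
        print(f"{n:>9} units: "
              f"threads {result['threads_mib']:.1f} MiB, "
              f"processes {result['processes_mib']:.1f} MiB")
```

Under these assumptions, 100,000 threads need about 781 MiB for stacks alone,
while 100,000 Erlang processes need about 89 MiB of minimum heap, a factor of
nearly nine.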

http://nptl.bullopensource.org/Tests/NPTL-limits.html

In the Linux kernel FAQ, the philosophy on threads is
explained thus:

"Avoid the temptation to create large numbers of threads in your
application. Threads should only be used to take advantage of multiple
processors or for specialised applications (i.e. low-latency real-time),
not as a way of avoiding programmer effort (writing a state machine or
an event callback system is quite easy). A good rule of thumb is to have
up to 1.5 threads per processor and/or one thread per RT input stream.
On a single processor system, a normal application would have at most
two threads, over 10 threads is seriously flawed and hundreds or
thousands of threads is progressively more insane.
A common request is to modify the Linux scheduler to better handle large
numbers of running processes/threads. This is always rejected by the
kernel developer community because it is, frankly, stupid to have large
numbers of threads. Many noted and respected people will extol the
virtues of large numbers of threads. They are wrong. Some languages and
toolkits create a thread for each object, because it fits into a
particular ideology. A thread per object may be appealing in the
abstract, but is in fact inefficient in the real world. Linux is not a
good computer science project. It is, however, good engineering.
Understand the distinction, and you will understand why many widely
acclaimed ideas in computer science are held with contempt in the Linux
kernel developer community."

http://www.kernel.org/pub/linux/docs/lkml/#s7-21

BR,
Ulf W

Zvi wrote:
> I'm no Linux expert, but
> 
> http://en.wikipedia.org/wiki/Native_POSIX_Thread_Library
> 
> "The Native POSIX Thread Library (NPTL) is a software feature that
> enables the Linux kernel to run programs written to use POSIX Threads
> fairly efficiently. In tests, NPTL succeeded in starting 100,000
> threads on a IA-32 in two seconds. In comparison, this test under a
> kernel without NPTL would have taken around 15 minutes."
> 
> I guess a future Erlang VM will offer a more generic MxN threading
> model, i.e. M Erlang user-level processes implemented on N
> "schedulers" (native threads). Today SMP Erlang has only limited
> support (i.e. command-line options) for specifying the number of
> schedulers, and no programmatic support for setting the affinity of
> schedulers to cores or of Erlang processes to schedulers.
> 
> Zvi
> 
> 
> Bob Ippolito wrote:
>> We've got a couple of applications that use thousands of processes
>> per node. If those were pthreads, we'd be out of RAM before actually
>> doing anything.
>> 
>> 2008/9/23 Bard Bloom <bardb@REDACTED>:
>>> I've seen in Erlang promotional materials some rather impressive
>>> claims about how cheap Erlang processes are, and how many of them
>>> one can spawn. Which is pretty cool. But, what Erlang programs
>>> take advantage of that kind of power? Are there any examples of
>>> programs which use huge numbers of processes in interesting ways?
>>> (I am the local Erlang fancier. I got challenged on that point,
>>> and didn't have a very good answer.)
>>> 
>>> Thanks very much, Bard Bloom
>>> 
>>> _______________________________________________ erlang-questions
>>> mailing list erlang-questions@REDACTED 
>>> http://www.erlang.org/mailman/listinfo/erlang-questions
>>> 