[erlang-questions] How to quick calculation max Erlang's processes and scheduler can alive based on machine specs

I Gusti Ngurah Oka Prinarjaya okaprinarjaya@REDACTED
Sun Jul 14 08:24:30 CEST 2019


Hi Dániel,

>> Try experimenting with different number of processes while monitoring
>> the scheduler utilisation (e.g. with observer): if you're much below 100%
>> utilisation (across all schedulers), you have too few
I am lucky: I always get 100% utilisation.

>> If, on the other hand, you see the run queue going up (the number of
>> runnable processes that are waiting for a CPU slice to run), you have too
>> many.
Where can I see this?

Thank you :)


On Sun, 14 Jul 2019 at 03:36, Dániel Szoboszlay <
dszoboszlay@REDACTED> wrote:

> Your second and third questions are easy to answer. To measure the
> execution time of a function, use timer:tc. As for schedulers: even with a
> single scheduler you can run as many processes as you want. They will
> compete for a single core though, and each will have to wait a long time to
> get some CPU time again once scheduled out. So just stick to the default,
> and use as many schedulers as you have cores.
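
A minimal sketch of the two points above, assuming a list of binaries as input. The module name timing_sketch and both function names are made up for illustration; timer:tc/3, binary:longest_common_prefix/1 and erlang:system_info(schedulers_online) are standard OTP calls.

```erlang
-module(timing_sketch).
-export([time_prefix/1, schedulers/0]).

%% Time one call to binary:longest_common_prefix/1 on a list of binaries.
%% timer:tc/3 returns {Microseconds, Result}.
time_prefix(Binaries) ->
    timer:tc(binary, longest_common_prefix, [Binaries]).

%% Number of schedulers currently online. By default the VM starts one
%% per logical processor it detects.
schedulers() ->
    erlang:system_info(schedulers_online).
```

Since a single call is very fast, it is usually more informative to wrap the whole job (spawning the workers and collecting their results) in one timer:tc call and compare those totals.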
>
> Now, finding the maximum (or rather: optimal) number of processes to
> perform this particular task on your particular machine is hard. A very
> dumb calculation would be that because all of the processes will be doing
> the same, CPU-bound task, they will all compete for the same HW resources,
> so you won't gain much by having more processes than CPU cores (4 in your
> case). If accessing the rows involves some I/O, then you should use more
> processes, so some processes can run the CPU-bound text calculations while
> others wait for I/O. Try experimenting with different numbers of processes
> while monitoring the scheduler utilisation (e.g. with observer): if you're
> much below 100% utilisation (across all schedulers), you have too few. If,
> on the other hand, you see the run queue going up (the number of runnable
> processes that are waiting for a CPU slice to run), you have too many.
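
Both figures can also be read from the shell without observer. The following is a rough sketch, not the only way to do it: the module name load_sketch and its functions are hypothetical, while erlang:statistics(run_queue), erlang:statistics(scheduler_wall_time) and the scheduler_wall_time system flag are standard. Depending on the OTP release, the wall-time list may include dirty CPU schedulers as well, so treat the result as an approximation.

```erlang
-module(load_sketch).
-export([run_queue/0, utilisation/1]).

%% Total number of runnable processes waiting in the normal run queues.
run_queue() ->
    erlang:statistics(run_queue).

%% Average utilisation (0.0 to 1.0) of the reported schedulers, sampled
%% over IntervalMs milliseconds.
utilisation(IntervalMs) ->
    %% Wall-time accounting is off by default and must be switched on.
    erlang:system_flag(scheduler_wall_time, true),
    T0 = lists:sort(erlang:statistics(scheduler_wall_time)),
    timer:sleep(IntervalMs),
    T1 = lists:sort(erlang:statistics(scheduler_wall_time)),
    erlang:system_flag(scheduler_wall_time, false),
    %% Each entry is {SchedulerId, ActiveTime, TotalTime}; sum the deltas.
    {Active, Total} =
        lists:foldl(
          fun({{_, A0, W0}, {_, A1, W1}}, {A, W}) ->
                  {A + (A1 - A0), W + (W1 - W0)}
          end, {0, 0}, lists:zip(T0, T1)),
    case Total of
        0 -> 0.0;
        _ -> Active / Total
    end.
```

Calling load_sketch:run_queue() a few times while the job runs shows whether the queue keeps growing; load_sketch:utilisation(1000) gives the average over one second.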
>
> But you can safely use a few more processes than the minimum needed to
> saturate the CPU. It can even speed up the whole job a bit if not all rows
> take equal time to process. Consider one process getting a chunk of rows
> that are super slow to process: at the end, all the other processes will
> have finished and you'll have to wait for this one worker to do its work on
> a single core. Having twice as many processes would cut the chunk into two
> halves, also halving the time to wait at the end. However, past some (hard
> to find) point, adding more processes would hurt performance: more processes
> means more cache misses and more synchronisation overhead at the beginning
> and end of the job.
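
To experiment with the number of workers, something like the following sketch could be used. It assumes the rows are already loaded into a list and that the per-row work is a fun; the module chunk_sketch and its functions are hypothetical names, not the original poster's code.

```erlang
-module(chunk_sketch).
-export([run/3, chunk/2]).

%% Apply Fun to every row, using about NumWorkers processes.
%% Results come back in the same order as Rows.
run(Fun, Rows, NumWorkers) when is_integer(NumWorkers), NumWorkers > 0 ->
    %% Chunk size rounded up, so at most NumWorkers chunks are created.
    Size = max(1, (length(Rows) + NumWorkers - 1) div NumWorkers),
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn_link(fun() ->
                                   Parent ! {Ref, lists:map(Fun, Chunk)}
                           end),
                Ref
            end || Chunk <- chunk(Rows, Size)],
    %% Collect the results in spawn order by matching on each Ref.
    lists:append([receive {Ref, Result} -> Result end || Ref <- Refs]).

%% Split List into consecutive chunks of at most Size elements.
chunk(List, Size) when is_integer(Size), Size > 0 ->
    chunk(List, Size, []).

chunk([], _Size, Acc) ->
    lists:reverse(Acc);
chunk(List, Size, Acc) when length(List) =< Size ->
    lists:reverse([List | Acc]);
chunk(List, Size, Acc) ->
    {Head, Tail} = lists:split(Size, List),
    chunk(Tail, Size, [Head | Acc]).
```

Timing the whole run, for example timer:tc(chunk_sketch, run, [Fun, Rows, 40]), for 4, 10, 20 and 40 workers makes the differences visible.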
>
> The theoretical maximum number of processes is probably constrained by
> your RAM: measure how much memory one process needs, and divide 8 GB (minus
> some for the OS and other programs) by this number. You won't be able to
> fit more processes in RAM, and swapping will only slow down your
> computation. But this limit is probably in the thousands of processes range.
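
The division described above can be sketched as follows, assuming Pid is a representative worker that is still alive and BudgetBytes is the RAM you are willing to give the VM. The module and function names are hypothetical. Two caveats: the memory figure from erlang:process_info/2 does not include large binaries stored off the process heap, and the VM has its own cap on the number of processes (adjustable with the +P flag), so the sketch takes the smaller of the two limits.

```erlang
-module(mem_sketch).
-export([estimate_max/2]).

%% Rough upper bound on the number of processes like Pid that fit into
%% BudgetBytes. Crashes with badmatch if Pid is no longer alive.
estimate_max(Pid, BudgetBytes) ->
    {memory, PerProcess} = erlang:process_info(Pid, memory),
    ByMemory = BudgetBytes div PerProcess,
    %% The VM-wide process limit applies regardless of available RAM.
    min(ByMemory, erlang:system_info(process_limit)).
```

For example, mem_sketch:estimate_max(WorkerPid, 6 * 1024 * 1024 * 1024) would leave roughly 2 GB of an 8 GB machine for the OS and other programs.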
>
> Hope this helps,
> Daniel
>
>
> On Sat, 13 Jul 2019 at 10:47, I Gusti Ngurah Oka Prinarjaya <
> okaprinarjaya@REDACTED> wrote:
>
>> Hi,
>>
>> I'm a complete newbie. I have done some very simple parallel processing
>> using Erlang. I am experimenting with my database, which contains hundreds
>> of thousands of rows. I split the rows into ranges by offset, then assign
>> each worker process a different range. For each row I do a simple text
>> similarity calculation using binary:longest_common_prefix/1.
>>
>> Let's assume I have 200,000 rows of data in total.
>> First, I create 10 worker processes and assign 20,000 rows to each
>> worker process.
>> Second, I create 20 worker processes and assign 10,000 rows to each
>> worker process.
>> Third, I create 40 worker processes and assign 5,000 rows to each
>> worker process.
>>
>> My machine specs:
>> - MacBook Pro (13-inch, 2017, Four Thunderbolt 3 Ports)
>> - Processor 3.1 GHz Intel Core i5 (2 physical cores, with HT)
>> - RAM 8 GB 2133 MHz LPDDR3
>>
>> My questions are:
>>
>> 1. Is there a quick, rough calculation for the maximum number of Erlang
>> processes based on the machine specs above?
>>
>> 2. The text similarity processing ran blazingly fast with 10, 20 and 40
>> workers alike, so I cannot see any difference. How can I measure the
>> running time, e.g. print the total elapsed time, so that I can see the
>> difference?
>>
>> 3. How many schedulers need to be active / available when I create 10
>> processes? Or 20 processes? 40 processes? And so on.
>>
>> Please enlighten me.
>>
>> Thank you very much