[erlang-questions] Running a fleet of OS processes
Aaron Seigo
aseigo@REDACTED
Tue Jul 10 09:12:10 CEST 2018
On 2018-07-09 16:33, Yevhenii Kurtov wrote:
> - continuously poll their state (JSON RPC API) and report updated state
> if it changed since last poll
Be careful with polling: if each poll job takes 10ms of processing
time, then in a perfect world (and the world is not perfect) one core
can run at most 100 polls per second before simply running out of CPU
time. At one poll per second per target, that is 100 connections per
core. If most of your targets are not regularly updating, then it's a
real burn: targets with updates have to wait until they get on the CPU
while targets with nothing to report get in the way.
The usual result is lag-under-load: polling tasks are not done
back-to-back but spread out over time (e.g. every N seconds), and
changes are relayed with a delay that increases in proportion to the
number of polling targets.
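
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. The function names and the 1000-target, 4-core figures are made
up for illustration; only the 10ms poll cost comes from the example
above. It assumes the CPU does nothing except poll, so real numbers will
be worse.

```python
def max_polls_per_second(poll_cost_ms: float, cores: int = 1) -> float:
    """Upper bound on polls per second if the CPU did nothing but poll."""
    return cores * 1000.0 / poll_cost_ms


def min_cycle_seconds(targets: int, poll_cost_ms: float, cores: int = 1) -> float:
    """Shortest time to visit every target once.

    A change on a target can go unnoticed for up to this long, and the
    figure grows linearly with the number of targets.
    """
    return targets * poll_cost_ms / (1000.0 * cores)


print(max_polls_per_second(10))        # 100.0
print(min_cycle_seconds(1000, 10, 4))  # 2.5
```

So with 1000 targets on 4 otherwise idle cores, an update can sit for
about 2.5 seconds before anyone looks at it, and doubling the fleet
doubles that.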
And of course, the world is not perfect. The Erlang VM needs time on the
CPU, whatever else your application does also needs CPU time, the OS
itself and whatever other things are running will as well (e.g. your
external processes). The BEAM will help somewhat with its ability to
interleave processing between the various polling processes, but there
are limits that polling brings with it.
If you need to provide low-latency updates to even a moderate number of
requests, polling will likely become your bottleneck. If at all
possible, avoid polling and move to push-on-updates as close to your
source of truth as you can.
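
As a rough illustration of what push-on-updates means, here is a minimal
Python sketch. The `Source` class and its `subscribe`/`set` methods are
hypothetical, not an API of your external processes; the point is only
that work happens when something changes, not on every tick.

```python
from typing import Any, Callable, Dict, List

_MISSING = object()  # sentinel: key has never been set


class Source:
    """Hypothetical source of truth that pushes changes to subscribers."""

    def __init__(self) -> None:
        self._state: Dict[str, Any] = {}
        self._subscribers: List[Callable[[str, Any], None]] = []

    def subscribe(self, callback: Callable[[str, Any], None]) -> None:
        self._subscribers.append(callback)

    def set(self, key: str, value: Any) -> None:
        # Unchanged values cost one comparison and notify nobody.
        if self._state.get(key, _MISSING) == value:
            return
        self._state[key] = value
        for callback in self._subscribers:
            callback(key, value)


events = []
source = Source()
source.subscribe(lambda key, value: events.append((key, value)))
source.set("worker-1", "running")
source.set("worker-1", "running")  # no change, nothing is pushed
source.set("worker-1", "stopped")
print(events)  # [('worker-1', 'running'), ('worker-1', 'stopped')]
```

Targets with nothing to report never show up in the consumer's workload
at all, which is exactly what polling cannot give you.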
> Then there will be a pool of workers that will go and poll a fleet once
> in a while.
IME it is usually better to have one process per external exec for such
tasks, because it gives you concurrent polling with minimal fuss. If
you serialize the polling in a single process, then the Nth poll target
has to wait until the previous N-1 polling jobs are done. If you fire
off a bunch of poll requests and wait for them to come back async, you
have to write all the bookkeeping that tracks which request goes with
which poll target. It is usually easier (and, IME, often faster) to let
the Erlang schedulers rotate through a set of processes doing the
polling, with each process handling one poll target.
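
The shape of that design, sketched in Python with asyncio tasks standing
in for Erlang processes. This is only an analogy: asyncio tasks are
cooperative and share one thread, whereas BEAM processes are preemptively
scheduled across cores. All names here (`poll_target`, `poll_fleet`,
`make_fake_fetch`) are hypothetical, and the fake fetch replaces the real
JSON RPC call.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Tuple

_UNSEEN = object()  # sentinel: target has not been polled yet


async def poll_target(name: str,
                      fetch: Callable[[str], Awaitable[Any]],
                      report: Callable[[str, Any], None],
                      interval: float,
                      rounds: int) -> None:
    """Poll one target. Its last seen state lives in this task alone."""
    last: Any = _UNSEEN
    for _ in range(rounds):
        state = await fetch(name)
        if state != last:  # report only when the state changed
            report(name, state)
            last = state
        await asyncio.sleep(interval)


async def poll_fleet(targets: List[str],
                     fetch: Callable[[str], Awaitable[Any]],
                     report: Callable[[str, Any], None],
                     interval: float = 0.01,
                     rounds: int = 3) -> None:
    """Start one task per target and let the event loop interleave them."""
    await asyncio.gather(
        *(poll_target(t, fetch, report, interval, rounds) for t in targets))


def make_fake_fetch(scripts: Dict[str, List[Any]]):
    """Stand-in for the RPC call: replays a scripted list of states."""
    positions = {name: 0 for name in scripts}

    async def fetch(name: str) -> Any:
        i = positions[name]
        positions[name] = i + 1
        return scripts[name][min(i, len(scripts[name]) - 1)]

    return fetch


if __name__ == "__main__":
    scripts = {"a": ["up", "up", "down"], "b": ["up", "up", "up"]}
    reports: List[Tuple[str, Any]] = []
    asyncio.run(poll_fleet(list(scripts), make_fake_fetch(scripts),
                           lambda name, state: reports.append((name, state))))
    print(sorted(reports))
```

Note there is no table mapping requests to targets anywhere: each worker
knows its own target and its own last state, which is the bookkeeping
you get for free with one process per target.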
--
Aaron