[erlang-questions] Process swarm vs Queues

Mikael Pettersson mikpelinux@REDACTED
Fri Jul 21 13:09:54 CEST 2017


lud writes:
 > Hi,
 > 
 > I would like some advice regarding the design of my system.
 > 
 > I have a couple of processes monitoring data endpoints on the internet, 
 > and sending events in my system when data changes.
 > Then, I have a swarm of processes dedicated to handling those events. 
 > These processes differ: the code that handles changes is not the same 
 > for different resource types (comments, feeds,
 > 
 > The specifics:
 >   * Processes register themselves in an ETS table to tell which resources 
 > they monitor.
 >   * Process state is very important and I can't afford to lose it. 
 > So the state is saved to disk after every handled event, and retrieved 
 > on restart.
 >   * Events are quite rare, about 10 per second, whereas I could have 1000 
 > or 10,000 monitored resources.
 > 
 > At this point, I was thinking: why use a process for each resource and 
 > have so many hibernating processes? Why not just use a job queue with 10 
 > workers: receive an event, load the data from disk, handle the event, 
 > save to disk?
 > 
 > It seems correct to me. I started with processes because it feels 
 > natural, but now I'm quite lost, I don't need all those idling 
 > processes. I feel like I was just thinking OOP (SHAME ! Just kidding …).
 > 
 > How do you choose? Why would you choose one of these two designs, or 
 > another one?

Your problem seems underspecified.  In particular, we don't know
the overhead of processing an event, or what you're trying to optimize
(latency? throughput? robustness?).  Are the processes that monitor
resources part of the external API of your system, or an implementation
detail within it?

Anyway, since you mention reading and writing state to disk around each
handled event, minimizing latency cannot be critical.  Therefore I don't
see the need for anything complicated here: just spawn a temporary process
for each event as it occurs.  If you need serialization, add a gen_server.
If state is precious, have a supervisor own it.
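
As a rough illustration of the "temporary process per event" idea, here
is a minimal sketch.  Everything in it is an assumption made for the
example, not something from the original post: the module name
event_dispatch, the {ResourceId, Event} tuple shape, integer resource
ids, the HandlerFun callback, and the "state" directory holding one
file per resource.

```erlang
-module(event_dispatch).
-export([handle_event/2, load_state/1, save_state/2]).

%% Hypothetical sketch: one short-lived process per incoming event.
%% State lives on disk, one file per resource (assumed layout).

%% Spawn a temporary, monitored process that loads the state,
%% applies the handler, and saves the result.
%% HandlerFun is fun(Event, State) -> NewState.
%% Returns {Pid, MonitorRef}; the caller receives a 'DOWN' message
%% when the worker finishes or crashes.
handle_event({ResourceId, Event}, HandlerFun) ->
    spawn_monitor(fun() ->
        State = load_state(ResourceId),
        NewState = HandlerFun(Event, State),
        ok = save_state(ResourceId, NewState)
    end).

%% Assumed file layout: state/<ResourceId>.bin
state_file(ResourceId) when is_integer(ResourceId) ->
    filename:join("state", integer_to_list(ResourceId) ++ ".bin").

%% Returns the stored term, or 'undefined' if nothing was saved yet.
load_state(ResourceId) ->
    case file:read_file(state_file(ResourceId)) of
        {ok, Bin} -> binary_to_term(Bin);
        {error, enoent} -> undefined
    end.

%% Writes the state as an external-term-format binary.
save_state(ResourceId, State) ->
    File = state_file(ResourceId),
    ok = filelib:ensure_dir(File),
    file:write_file(File, term_to_binary(State)).
```

Note that this sketch does nothing to stop two events for the same
resource from racing on the same file; that is where the serialization
point above comes in.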

Don't overengineer unless you have proof that the simple solution doesn't work.


