[erlang-questions] Process swarm vs Queues

Fri Jul 21 12:03:41 CEST 2017

Hello,

I was not considering third-party services because I only have one 
app, built with pure Erlang libraries only.

I still lean towards the process swarm because the implementation feels 
more natural to me. I will finish it and see how it behaves.
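
For reference, a minimal sketch of the process-per-resource design 
discussed in this thread could look like the following. It is an 
illustration under assumptions, not the actual code: the module name 
resource_worker, the named ETS table `resources` (assumed to be created 
elsewhere as a public named table), the string resource ids and the 
one-file-per-resource layout are all hypothetical.

```erlang
-module(resource_worker).
-behaviour(gen_server).

-export([start_link/2, notify/2]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Hypothetical sketch: one gen_server per monitored resource.
%% ResourceId is assumed to be a string, Dir an existing directory.

start_link(ResourceId, Dir) ->
    gen_server:start_link(?MODULE, {ResourceId, Dir}, []).

%% Synchronous call, so the caller only gets `ok` once the new
%% state has been written to disk.
notify(Pid, Event) ->
    gen_server:call(Pid, {event, Event}).

init({ResourceId, Dir}) ->
    File = filename:join(Dir, ResourceId ++ ".state"),
    %% Restore the previous state on (re)start, if a state file exists.
    Events = case file:read_file(File) of
                 {ok, Bin} -> binary_to_term(Bin);
                 {error, enoent} -> []
             end,
    %% Register which resource this process handles. The `resources`
    %% table is an assumption and must already exist.
    true = ets:insert(resources, {ResourceId, self()}),
    {ok, #{file => File, events => Events}, hibernate}.

handle_call({event, Event}, _From,
            State = #{file := File, events := Events}) ->
    NewEvents = [Event | Events],
    %% Save the state after every handled event.
    ok = file:write_file(File, term_to_binary(NewEvents)),
    %% Go back to hibernation until the next event arrives.
    {reply, ok, State#{events := NewEvents}, hibernate}.

handle_cast(_Msg, State) ->
    {noreply, State, hibernate}.
```

A monitoring process would look up the pid with 
ets:lookup(resources, ResourceId) and call resource_worker:notify/2. 
With about 10 events per second spread over thousands of resources, 
each worker would spend most of its time hibernated.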

Thank you both for your answers :)

Ludovic

On 2017-07-21 10:00, Bastien CHAMAGNE wrote:
> Hi,
> 
> In contrast with Dmitry, I'd choose a third-party message broker
> (such as RabbitMQ). Its resilience is proven and I trust that its
> uptime is better than my app's. Then you can have a consumer (or a pool
> of consumers) per queue.
> 
> If nothing is implemented yet, I'd choose this path.
> 
> 
> On 20/07/2017 19:27, Dmitry Kolesnikov wrote:
>> Hello,
>> 
>> Short answer: Process swarm
>> 
>> You are going to implement task scheduling routines if you choose a 
>> queuing approach. Similar task scheduling routines are already 
>> implemented by the VM. You are not going to implement them more 
>> efficiently than the VM already does. The process pool + queue is an 
>> approach for dealing with external systems.
>> 
>> A process per action is the natural approach in actor systems. The 
>> memory overhead is reasonable, and the hibernate feature helps you 
>> shrink each process's heap to its live data. The CPU overhead to 
>> spawn a new process is reasonable as well. You can run millions of 
>> processes on a node. A process per action helps you with GC and 
>> fault tolerance, and keeps the implementation simple.
>> 
>> Best Regards,
>> Dmitry
>>> -|-|-(*>
>>> On 20 Jul 2017, at 19.49, lud <ludovic@REDACTED> wrote:
>>> 
>>> Hi,
>>> 
>>> I would like some advice regarding the design of my system.
>>> 
>>> I have a couple of processes monitoring data endpoints on the 
>>> internet, and sending events in my system when data changes.
>>> Then, I have a swarm of processes dedicated to handling those events. 
>>> These processes differ: the code that handles changes is not the 
>>> same for different resource types (comments, feeds,
>>> 
>>> The specifics:
>>> * Processes register themselves in an ETS table to tell which 
>>> resources they monitor.
>>> * Process states are very important and I can't afford to lose 
>>> them. So the state is saved to disk after every handled event, and 
>>> retrieved on restart.
>>> * Events are quite rare, about 10 per second, whereas I could have 
>>> 1000 or 10,000 monitored resources.
>>> 
>>> At this point, I was thinking: why use a process for each resource 
>>> and have so many hibernating processes? Why not just use a job queue 
>>> with 10 workers: receive an event, load the data from disk, handle 
>>> the event, save to disk?
>>> 
>>> It seems correct to me. I started with processes because it felt 
>>> natural, but now I'm quite lost: I don't need all those idle 
>>> processes. I feel like I was just thinking OOP (SHAME! Just 
>>> kidding…).
>>> 
>>> How do you choose? Why would you pick one of these two designs, or 
>>> yet another one?
>>> 
>>> Thank you for reading
>>> 
>>> Best regards
>>> 
>>> Ludovic
>>> 
>>> 
>>> _______________________________________________
>>> erlang-questions mailing list
>>> erlang-questions@REDACTED
>>> http://erlang.org/mailman/listinfo/erlang-questions