[erlang-questions] VM leaking memory

Michael Truog mjtruog@REDACTED
Sat Feb 2 08:59:54 CET 2019


On 2/1/19 10:53 PM, Frank Muller wrote:
> Hi Michael
>
> All packets in transit have a “seq_id” (sequential number).
>
> This means that in theory packet1, packet2...packetN can be checked in 
> parallel and in any order (which is not the case in my current 
> design), but they must be sent to the next processing stage in order: 
> packet1 first, then packet2...
>
> I would love to hear from you how I can turn this long-lived process 
> into multiple short-lived ones while enforcing ordering.

An easy way to think about it comes from a module I used in the past 
called immediate_gc 
(https://gist.github.com/okeuday/dee991d580eeb00cd02c).  The sync_fun/2 
function is below:


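%% Run F with arguments A in a temporary linked process and block
%% until its result arrives.
sync_fun(F, A) when is_function(F), is_list(A) ->
    Parent = self(),
    Child = erlang:spawn_opt(fun() ->
        Parent ! {self(), erlang:apply(F, A)},
        %% collect the child's garbage right away, before it exits
        erlang:garbage_collect()
    end, [link, {fullsweep_after, 0}]),
    receive
        {Child, Result} -> Result
    end.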

That is all you need to use a temporary process in a blocking way, so 
that all the temporary binary data it consumes is freed as quickly as 
the BEAM allows.  However, that example is more complex than it needs 
to be: the child process uses the fullsweep_after option and calls 
erlang:garbage_collect/0.  That extra complexity is not really 
necessary or desirable, because the memory of the child process is 
released when it exits anyway, though it does force the garbage 
collection of the temporary binary data (binary data that nothing else 
references) to occur as quickly as possible.
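Following the point above that fullsweep_after and 
erlang:garbage_collect/0 are optional, a simpler variant might look 
like the sketch below.  The module name temp_proc is only an 
assumption for illustration; it is not part of the immediate_gc gist.

```erlang
-module(temp_proc).
-export([sync_fun/2]).

%% Run F with arguments A in a short-lived linked process and block
%% until its result arrives.  No fullsweep_after and no explicit
%% garbage_collect/0: the child's heap is freed when the child exits.
sync_fun(F, A) when is_function(F), is_list(A) ->
    Parent = self(),
    Child = erlang:spawn_link(fun() ->
        Parent ! {self(), erlang:apply(F, A)}
    end),
    receive
        {Child, Result} -> Result
    end.
```

For example, temp_proc:sync_fun(fun erlang:byte_size/1, [<<"abc">>]) 
returns 3.  Keep the returned result small: if it is a sub-binary of 
the large binary, the caller still holds a reference to the large 
binary (binary:copy/1 on the result avoids that).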

Spawn a similar child process before you start decoding a large binary, 
so that the temporary Erlang process lives no longer than a single 
request (a packet in your situation). Spawning Erlang processes is 
cheap, so you shouldn't hesitate to use them; just ensure they are 
linked so their failures may be tracked.
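For the ordering requirement in the question, one possible approach is 
to check each packet in its own temporary process and use a selective 
receive to forward the results in seq_id order.  This is only a 
minimal sketch: the module name ordered_check, the function 
check_in_order/3, and the {SeqId, Packet} input shape are assumptions 
made for illustration.

```erlang
-module(ordered_check).
-export([check_in_order/3]).

%% Packets    :: [{SeqId, Binary}], assumed to be sorted by SeqId.
%% CheckFun   :: fun((Binary) -> Result), run in one temporary linked
%%               process per packet, so the checks run in parallel.
%% ForwardFun :: fun((SeqId, Result) -> any()), called in the calling
%%               process, strictly in SeqId order.
check_in_order(Packets, CheckFun, ForwardFun) ->
    Parent = self(),
    Workers = [begin
                   Pid = erlang:spawn_link(fun() ->
                       Parent ! {self(), SeqId, CheckFun(Packet)}
                   end),
                   {Pid, SeqId}
               end || {SeqId, Packet} <- Packets],
    %% Selective receive: wait for each worker in sequence order, so a
    %% later packet that finishes early just waits in the mailbox.
    lists:foreach(fun({Pid, SeqId}) ->
        receive
            {Pid, SeqId, Result} -> ForwardFun(SeqId, Result)
        end
    end, Workers),
    ok.
```

This spawns one process per packet with no upper bound, so a real 
pipeline would probably call it on batches of limited size.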

CloudI (https://cloudi.org) internal services use temporary processes 
for handling service requests.  This is tunable with the service 
configuration options request_pid_uses and info_pid_uses, so you can 
control how many requests are processed in a temporary Erlang process 
before a new one is created (the exit exception is used to terminate 
the Erlang process with its last result).  CloudI internal services 
also have the hibernate service configuration option, with hibernation 
based on the request rate, which is checked every few seconds (the 
service configuration options are described at 
http://cloudi.org/api.html#2_services_add_config_opts).
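The general idea of reusing a temporary process for a fixed number of 
requests can be sketched in plain Erlang as below.  This is not 
CloudI's implementation and does not use the CloudI API: the module 
pid_uses and its functions are hypothetical, and the worker simply 
returns after its last request instead of using an exit exception.

```erlang
-module(pid_uses).
-export([start/2, call/2]).

%% Start a dispatcher that runs Handler in a temporary process and
%% replaces that process after every Uses requests.
start(Handler, Uses)
  when is_function(Handler, 1), is_integer(Uses), Uses > 0 ->
    erlang:spawn_link(fun() -> dispatch(Handler, Uses, undefined, 0) end).

%% Send one request through the dispatcher and wait for the reply.
call(Dispatcher, Request) ->
    Ref = erlang:make_ref(),
    Dispatcher ! {request, self(), Ref, Request},
    receive {Ref, Reply} -> Reply end.

%% Left0 counts the requests the current worker may still handle.
dispatch(Handler, Uses, Worker0, Left0) ->
    receive
        {request, From, Ref, Request} ->
            {Worker, Left} = case Left0 of
                0 -> {erlang:spawn_link(fun() -> worker(Handler) end), Uses};
                _ -> {Worker0, Left0}
            end,
            %% the final element tells the worker this is its last request
            Worker ! {handle, From, Ref, Request, Left =:= 1},
            dispatch(Handler, Uses, Worker, Left - 1)
    end.

worker(Handler) ->
    receive
        {handle, From, Ref, Request, Last} ->
            From ! {Ref, Handler(Request)},
            case Last of
                true -> ok;               % return, so the process exits
                false -> worker(Handler)
            end
    end.
```

With Uses set to 1 every request gets a fresh process; larger values 
trade some memory retention for fewer process spawns.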

The idea of using a temporary Erlang process for consuming temporary 
binary data is likely unusual for people new to Erlang/Elixir and may 
not have found its way into Erlang/Elixir books.  It is important to 
know about, though, if you want to avoid excessive memory consumption 
(and potentially the BEAM dying due to memory use).

Best Regards,
Michael




