Semantic Monitors: a proposal

Wed Feb 17 12:37:20 CET 2021

On 2/17/21 2:40 AM, Roger Lipscombe wrote:
> On Wed, 17 Feb 2021 at 10:28, Michael Truog <mjtruog@REDACTED> wrote:
>> It should be beneficial to have a more accurate time-of-process-death value than is currently possible when the DOWN message is received.
> Why? What's the use case here?
Any time you want to understand the lifetime of an Erlang process, you 
need an accurate record of when it was created and when it died.  The 
creation (spawn) time is not difficult to obtain: it can be the 
monotonic time taken immediately before the spawn.  However, the 
process death time is currently at the mercy of the Erlang process that 
owns the monitor, because the DOWN message may wait in that process's 
message queue before it is handled.  The use case I have in mind is 
this code in CloudI:
https://github.com/CloudI/CloudI/blob/228d09fe64e86f1316221de514482a82486e1034/src/lib/cloudi_core/src/cloudi_core_i_services_monitor.erl#L585
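
A minimal sketch of the measurement described above, assuming a
hypothetical module lifetime_sketch (illustrative only, not CloudI
code).  The start time is taken just before the spawn, and the end time
can only be taken once the DOWN message is actually handled:

```erlang
-module(lifetime_sketch).
-export([measure/1]).

%% Hypothetical helper, for illustration only.
%% Runs Fun in a new monitored process and returns {Reason, Elapsed},
%% where Elapsed is the lifetime as seen by the monitor owner, in
%% native time units.
measure(Fun) when is_function(Fun, 0) ->
    %% Start time: monotonic time immediately before the spawn.
    Start = erlang:monotonic_time(),
    {Pid, MonitorRef} = erlang:spawn_monitor(Fun),
    receive
        {'DOWN', MonitorRef, process, Pid, Reason} ->
            %% End time: taken when the DOWN message is handled, so it
            %% includes any time the message spent in the queue.
            End = erlang:monotonic_time(),
            {Reason, End - Start}
    end.
```

The selective receive here picks out the DOWN message directly.  In a
gen_server, which handles messages in arrival order, the DOWN message
waits behind everything already queued, and that wait is added to the
measured lifetime.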

The restart time in that code is taken after the DOWN message is 
received, though I would prefer to know when the death of the process 
really occurred, to have a better understanding of the Erlang process 
"uptime".  By that I mean how long this particular Erlang process 
actually spent doing its own work, excluding the extra latency 
contributed by other Erlang processes, such as the one that held the 
monitor and received the DOWN message.  A more accurate process death 
time could be used in other Erlang source code too.  The Erlang/OTP 
supervisor currently relies on link/trap_exit for restarts, so it would 
not benefit unless a separate trap_exit message was possible, and only 
if there was motivation to modify the supervisor restart time (to avoid 
any delay the 'EXIT' message spends in the supervisor's message queue).

In CloudI, the services_status CloudI Service API function provides 
various time-related information about the lifetime of CloudI services.  
For that source code I would prefer to have the most accurate monotonic 
time values possible, so that the service lifetime data is accurate.

I understand that an argument against a monitor option is that you 
could create a separate Erlang process to own each monitor, so that no 
other messages are queued ahead of the DOWN message.  However, I don't 
think that is a realistic solution; it would only make things more 
complex than they need to be.
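
For comparison, the separate-process workaround might look roughly like
the sketch below.  The module name, function name, and the 'DOWN_AT'
message shape are all hypothetical, chosen only for illustration:

```erlang
-module(monitor_owner_sketch).
-export([monitor_with_time/1]).

%% Hypothetical workaround, for illustration only.
%% Spawns a helper whose only job is to own one monitor on Pid, so the
%% helper's message queue holds nothing but the DOWN message.  The
%% helper timestamps the DOWN message and forwards it to the caller as
%% {'DOWN_AT', Ref, Pid, Reason, MonotonicTime}.
monitor_with_time(Pid) when is_pid(Pid) ->
    Caller = self(),
    Ref = erlang:make_ref(),
    _Helper = erlang:spawn(fun() ->
        MonitorRef = erlang:monitor(process, Pid),
        receive
            {'DOWN', MonitorRef, process, Pid, Reason} ->
                %% Timestamp taken as soon as the helper is scheduled.
                DeathTime = erlang:monotonic_time(),
                Caller ! {'DOWN_AT', Ref, Pid, Reason, DeathTime}
        end
    end),
    Ref.
```

Even this small version has gaps: the helper is not cleaned up if the
caller dies first, the timestamp still depends on when the helper gets
scheduled, and if Pid dies before the helper sets up its monitor the
reason is noproc and the timestamp is only an upper bound.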

Best Regards,
Michael