[erlang-questions] Measuring message queue delay

Fri May 1 02:01:56 CEST 2015

On 04/29/2015 02:12 AM, Roger Lipscombe wrote:
> For various reasons, I want a metric that measures how long messages
> spend in a process message queue. The process is a gen_server, if that
> has any bearing on the solution. Also, this is for production, so it
> will be always-on and must be low-impact.
>
> I could do this by timestamping _every_ message sent to the process
> and then reporting the deltas, but I think that's ugly.
>
> I thought of posting a message to self(), with a timestamp and then
> measuring the delta. I could then post another one as soon as that
> message is processed. Obviously, this leaves the process continually
> looping, which is not good.
>
> So, instead, I could use erlang:send_after, but that requires two
> messages: a delayed one triggered by erlang:send_after, and an
> immediate one for measuring the delay.

If you used a CloudI service (http://cloudi.org) for the logic the gen_server process currently handles (using the cloudi_service behaviour instead), the CloudI service request timeout would be decremented by any delay the service request encounters, including queuing delays. Queuing delays are always subtracted. Two separate service configuration options, request_timeout_adjustment and response_timeout_adjustment, also subtract the time spent actually handling the service request and the time spent actually providing the response, respectively.

If you want to replicate just the Erlang source code that handles the queuing delay, you can do that with a single erlang:send_after timer (you don't need two, as you suggested). Start the timer with the timeout value of the request as soon as you take the request from the process message queue. For this to work, the gen_server process must not block: it needs to consume messages as quickly as they arrive and add them to a queue in process heap memory. Once you are actually able to handle the request, take it from the queue in process heap memory and cancel the timer, which provides you with the time remaining. Subtracting the time remaining from the timeout you initially provided to erlang:send_after gives you the time elapsed. However, I find it easier to use the logic already implemented for CloudI services.
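
A minimal sketch of the single-timer approach described above may make the bookkeeping clearer. It is not CloudI's source code: the module name queue_delay, the functions enqueue/3 and dequeue/1, and the {request_timeout, Tag} message shape are all hypothetical names chosen for illustration. It relies only on erlang:send_after/3, erlang:cancel_timer/1 (which returns the remaining milliseconds, or false if the timer is no longer active) and the stdlib queue module.

```erlang
-module(queue_delay).
-export([enqueue/3, dequeue/1, demo/0]).

%% Hypothetical helper: call this as soon as a request is taken from the
%% process message queue. It starts the request's single timer and stores
%% the request in a queue held in process heap memory.
enqueue(Request, Timeout, Queue) ->
    Tag = make_ref(),
    TimerRef = erlang:send_after(Timeout, self(), {request_timeout, Tag}),
    queue:in({Tag, TimerRef, Timeout, Request}, Queue).

%% Hypothetical helper: call this when the process is ready to handle the
%% next request. Cancelling the timer yields the time remaining, so the
%% time spent queued is Timeout - Remaining (in milliseconds).
dequeue(Queue0) ->
    case queue:out(Queue0) of
        {empty, _} ->
            empty;
        {{value, {Tag, TimerRef, Timeout, Request}}, Queue1} ->
            case erlang:cancel_timer(TimerRef) of
                false ->
                    %% The timer already fired: the request timed out while
                    %% queued. Drop the timeout message if it was delivered.
                    receive {request_timeout, Tag} -> ok after 0 -> ok end,
                    {expired, Request, Queue1};
                Remaining ->
                    {ok, Request, Timeout - Remaining, Queue1}
            end
    end.

%% Small demonstration: queue one request, wait a little, then dequeue it
%% and print the measured queuing delay.
demo() ->
    Q0 = enqueue(hello, 5000, queue:new()),
    timer:sleep(100),
    {ok, hello, Elapsed, _Q1} = dequeue(Q0),
    io:format("queued for about ~p ms~n", [Elapsed]).
```

In a gen_server, enqueue/3 would typically be called from handle_call/handle_cast/handle_info with the queue kept in the server state, and dequeue/1 whenever the server is free to do the actual work. The timeout used only needs to be larger than any queuing delay you expect to measure.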