[erlang-questions] gen_server:multi_call/3 causing unexpected invocation of handle_info/2

Jack Orenstein jao@REDACTED
Thu Dec 27 07:21:50 CET 2007


I have written an Erlang application that uses gen_server:multi_call/3
(so the timeout is infinity). I have also written a little test
program that does the following:

- Starts three VMs on one machine.

- Occasionally kills one of the VMs (using kill -9) and then restarts  
it. (This is intended to simulate a node crashing in a cluster with  
one VM per node.)

- Checks to see that the application is responding to the failure as  
expected.

I will typically do 1000 VM bounces in a single test run. I
occasionally see errors due to unexpected calls to the handle_info/2
callback of my gen_server, which appear to be connected to the
multi_call. The cases I've examined have these characteristics:

- The multi_call invocation returns one of the nodes in the BadNodes  
part of the multi_call result.

- I retry the request to the nodes listed in BadNodes and the request  
succeeds.

- The ORIGINAL invocation to the node actually did reach the node and  
eventually execute. The response eventually reaches the node that  
issued the original request, and is routed to handle_info.
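
The call-and-retry pattern described in the list above can be sketched
roughly as follows. This is a minimal illustration, not the actual
application code: the module name multi_call_retry and the function
call_with_retry/3 are hypothetical, and only a single retry is shown.

```erlang
-module(multi_call_retry).
-export([call_with_retry/3]).

%% Hypothetical helper (not from the original application).
%% Sends Request to the gen_server registered as Name on every node in
%% Nodes, then retries once against the nodes reported in BadNodes.
call_with_retry(Nodes, Name, Request) ->
    %% multi_call/3 waits with an infinite timeout.
    {Replies, BadNodes} = gen_server:multi_call(Nodes, Name, Request),
    case BadNodes of
        [] ->
            {Replies, []};
        _ ->
            %% Retry only the nodes that were reported as bad.
            {RetryReplies, StillBad} =
                gen_server:multi_call(BadNodes, Name, Request),
            {Replies ++ RetryReplies, StillBad}
    end.
```

Note that with this pattern a request may run twice on a node that was
reported in BadNodes, since the original request can still reach that
node and execute, as described above.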

I've read the documentation for multi_call, and this behavior does  
not appear to be documented. This paragraph from the documentation  
addresses late delivery:

     To avoid that late answers (after the timeout) pollutes the
     caller's message queue, a middleman process is used to do the
     actual calls. Late answers will then be discarded when they
     arrive to a terminated process.

but 1) my timeout is infinity, and 2) the calling process has not  
terminated.
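
For illustration, the middleman idea from the quoted paragraph could be
applied by hand even when the timeout is infinity. The sketch below is
an assumption about how one might do that, not how gen_server does it
internally: the module name multi_call_middleman is hypothetical, and
the code relies on spawn_monitor/1 being available.

```erlang
-module(multi_call_middleman).
-export([multi_call/3]).

%% Hypothetical wrapper: run gen_server:multi_call/3 in a short-lived
%% process so that any reply arriving after the call has returned is
%% addressed to a process that no longer exists, not to the caller.
multi_call(Nodes, Name, Request) ->
    Caller = self(),
    Ref = make_ref(),
    {Pid, MRef} = spawn_monitor(
        fun() ->
            Result = gen_server:multi_call(Nodes, Name, Request),
            Caller ! {Ref, Result}
        end),
    receive
        {Ref, Result} ->
            %% Drop the monitor and flush any pending 'DOWN' message.
            erlang:demonitor(MRef, [flush]),
            Result;
        {'DOWN', MRef, process, Pid, Reason} ->
            %% The middleman died before delivering a result.
            exit(Reason)
    end.
```

The return value has the same {Replies, BadNodes} shape as
gen_server:multi_call/3, so a caller could use this wrapper in place of
the direct call. Whether it removes the stray messages in my test is
something I have not verified.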

Is the behavior I'm seeing expected?

I'm running on a MacBook Pro, and erl -version says: Erlang  
(ASYNC_THREADS) (BEAM) emulator version 5.5.1.

Jack



