Broken gen:call/3?

Tue Nov 1 09:25:04 CET 2005

In the best of worlds you would be right, but since this strange
behaviour has been tested and in production for many many years
you just _might_ not be right. And a behaviour change would
expose new bugs.

Therefore we assume a behaviour change would make our major
paying customers, who are in the maintenance phase of their
products, avoid taking a new OTP release. That would force us to
maintain one release more than necessary, stealing resources from
new development...
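
For callers that need to notice the crash, the workaround from my
earlier mail quoted below (check the result of gen_server:call and
also set a monitor of your own) could be sketched roughly like this.
The module name watched_call, both function names, the return tuples
and the 1000 ms grace period are made up for illustration; this is
not OTP code, only a sketch under those assumptions.

```erlang
-module(watched_call).
-export([watch/1, call/3]).

%% Hypothetical helper module, not part of OTP.

%% Set a monitor owned by the caller. The caller gets its own 'DOWN'
%% message for this reference, independent of any link to the server.
watch(Server) when is_pid(Server) ->
    erlang:monitor(process, Server).

%% Make the call. If it exits, wait briefly for our own 'DOWN' message
%% so that the death of the server is reported to the caller even when
%% no 'EXIT' message from the link is left in the mailbox.
call(Server, MRef, Request) ->
    try gen_server:call(Server, Request) of
        Reply ->
            {ok, Reply}
    catch
        exit:CallReason ->
            receive
                {'DOWN', MRef, process, Server, DownReason} ->
                    {server_down, DownReason}
            after 1000 ->   % arbitrary grace period (an assumption)
                {call_failed, CallReason}
            end
    end.
```

A caller would do MRef = watched_call:watch(Server) once, and then use
watched_call:call(Server, MRef, Request) instead of calling
gen_server:call directly. A {call_failed, _} result means the call
exited (for example on a timeout) without the server being seen to die
within the grace period.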

sean.hinde@REDACTED (Sean Hinde) writes:

> Indeed !
> 
> I wonder how much code there is out there which is currently broken
> because the author did not realise this happens vs code which would
> be broken if it was changed.
> 
> My guess, based on the assumption that people would expect to have to
> handle 'EXIT' messages if they have chosen to link, is that this
> behaviour hides many more latent bugs than would be introduced if it
> were changed..
> 
> Sean
> 
> On 31 Oct 2005, at 14:18, Raimo Niskanen wrote:
> 
> > Aaah, well, yes.. This is an old flaw.
> >
> > Once upon a time there were only links to supervise other
> > processes, so the only way to know if a server died during
> > a library call, e.g. inside gen_server:call after sending
> > the request while receiving the response, was that an
> > 'EXIT' message was received instead; and then the library
> > code for gen_server:call would have to trap exit messages
> > and set a link to the server.
> >
> > But that can not be done by library code, since there can
> > be only one link between any pair of processes. Possibly
> > exit message trapping could be done, but there is a time
> > window after receive before disabling exit message trapping
> > that can not be controlled, so the library code can not
> > be sure to not accidentally convert a link exit to an
> > exit message.
> >
> > So, it was then designed so that _if_ the calling process
> > had activated exit message trapping _and_ set a link to the
> > server, then the gen_server:call could receive the 'EXIT'
> > message and return an error code as a result of the server call.
> >
> > Later, when monitors were introduced we could not change
> > the behaviour of gen_server:call to not consume 'EXIT'
> > messages at all (which would be the right(TM) way, in
> > the presence of monitors); the result would be passing
> > undesired 'EXIT' messages onto old calling applications.
> >
> > So, there we are today. The calling process should check
> > the result from gen_server:call plus receive 'EXIT' messages.
> > Or set a monitor of its own.
> >
> > sean.hinde@REDACTED (Sean Hinde) writes:
> >
> >
> >> Hi,
> >>
> >> This behaviour seems broken to me:
> >>
> >> 1. One process is linked to another (for supervision reasons), and a
> >> gen_*:call/2 synchronous request is made from one to the other.
> >>
> >> 2. The called process crashes while handling the call.
> >>
> >> 3. gen:call consumes *both* its own monitor 'DOWN' message *and* the
> >> 'EXIT' message arising from the link
> >>
> >> Result: calling process doesn't get 'EXIT' message, and hence doesn't
> >> know about the crash. It does not then function well as a
> >> supervisor...
> >>
> >> Sean
> >>
> >
> > -- 
> >
> > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >
> 

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB