[erlang-questions] Cast versus Call and timeouts

Bob Ippolito bob@REDACTED
Mon Jan 26 20:18:11 CET 2015


On Mon, Jan 26, 2015 at 9:39 AM, Bryan <bryan@REDACTED> wrote:

> Hi Everyone,
>
> I was hoping to better understand an interesting condition I recently
> encountered, and was able to alleviate though I am not 100% clear why.
>
> In our system, we have two types of main processes: for simplicity's sake,
> let's just call them groups (A) and endpoints (B). Each of the processes is
> implemented as a gen_server.
>
> Process A implements functionality and represents a group of endpoints.
> These endpoints are then each instantiated as a process B. Each endpoint
> can then be in multiple groups. If I have two groups, then I will have two
> A processes. If I have five endpoints, I will then have five B processes.
> In our example, endpoint process #3 is a member in groups one and two.
>
> The system is very simple. If a change occurs in A, a message is then sent
> to each endpoint process B that is a member. In our example, group #1
> process would send a message to five endpoint processes. If a change occurs
> in the endpoint process B, a message is sent to each group process A it is
> a member of. In our example, if this is endpoint #3, it sends a message to
> both group one and two.
>
> Seems simple enough. The interesting condition I ran into was that one of
> the messages from the group process A to the endpoint process B was a
> cast. All others for both gen_servers are calls. When A sends the cast
> message to B, B simply updates its state. For reasons that are not clear to
> me, this ultimately reaches a timeout state, where all the processes start
> timing out, even though there are no calling/casting cycles.
>
> I know that calling cycles introduce a deadlock condition, but I am trying
> to understand why a cast, which is supposed to return immediately and be
> handled asynchronously, would produce a timeout.
>
> When I move this message from a cast to a call, the system works perfectly.
>

Just a guess, but I would check to make sure that the code for handle_cast
in the recipient "B" process isn't doing something that makes it
unresponsive, such that the next call to that process would time out. Are
you sure that no call is made as a result of that handle_cast?
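
To illustrate the kind of hidden cycle this guess is about, here is a minimal
sketch. It is not the original poster's code: the module name
cast_cycle_demo, the messages (change, group_changed, member_changed,
status), and the 200 ms timeout are all made up for the example. The group
casts to the endpoint and then calls it; in calls_back mode the endpoint's
handle_cast calls the group back, so both processes wait on each other until
the group's call times out.

```erlang
-module(cast_cycle_demo).
-behaviour(gen_server).

-export([run/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Hypothetical demo entry point.
%% run(calls_back) reproduces the stall; run(quiet) is the well-behaved case.
run(EndpointMode) ->
    {ok, Group} = gen_server:start(?MODULE, group, []),
    {ok, Endpoint} = gen_server:start(?MODULE, {endpoint, EndpointMode}, []),
    Result = gen_server:call(Group, {change, Endpoint}),
    gen_server:stop(Endpoint),
    gen_server:stop(Group),
    Result.

init(Role) ->
    {ok, Role}.

%% Group (A): cast the change to the endpoint, then make an ordinary call.
%% The short timeout is caught only so the demo can report what happened.
handle_call({change, Endpoint}, _From, group) ->
    gen_server:cast(Endpoint, {group_changed, self()}),
    Reply = try gen_server:call(Endpoint, status, 200)
            catch exit:{timeout, _} -> {error, timeout}
            end,
    {reply, Reply, group};
%% Group (A): an endpoint reports that it changed.
handle_call(member_changed, _From, group) ->
    {reply, ok, group};
%% Endpoint (B): answer a status request.
handle_call(status, _From, {endpoint, _Mode} = State) ->
    {reply, ok, State}.

%% Endpoint (B): the cast returned immediately for the sender, but this
%% handler blocks on a call back to the group, which is still busy inside
%% its own handle_call above.
handle_cast({group_changed, Group}, {endpoint, calls_back} = State) ->
    ok = gen_server:call(Group, member_changed),
    {noreply, State};
%% Endpoint (B): only update local state, no call back.
handle_cast({group_changed, _Group}, {endpoint, quiet} = State) ->
    {noreply, State}.

%% Ignore stray messages, such as a late reply to a call that timed out.
handle_info(_Msg, State) ->
    {noreply, State}.
```

With this sketch, cast_cycle_demo:run(quiet) returns ok, while
cast_cycle_demo:run(calls_back) returns {error, timeout}: the cast never
blocks the sender, but the work done inside handle_cast can still make the
receiver unresponsive to the next call.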