process waits forever spawn_opt/5
Vyacheslav Levytskyy
v.levytskyy@REDACTED
Thu Jun 17 13:45:49 CEST 2021
Thank you for the details; I think they explain most of the
situation. I did check the messages: they were all specific to my
application, with no "{spawn_reply, Ref, ok|error, Pid|Error}" for sure,
just the usual '$gen_cast' and system messages. Judging from the
messages, the caller had been blocked for about 4 hours when I noticed
it. The node is an ordinary Erlang node, nothing special except for the
complicated environment.
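
For anyone wanting to repeat the mailbox check, here is a minimal sketch. The module and function names (mailbox_check, has_spawn_reply/1, is_spawn_reply/1) are hypothetical and only illustrate the idea; it assumes the stuck process is local to the node where the check runs, since process_info/2 only accepts local pids.

```erlang
-module(mailbox_check).
-export([has_spawn_reply/1, is_spawn_reply/1]).

%% True if Msg has the shape {spawn_reply, Ref, ok|error, Result}.
is_spawn_reply({spawn_reply, Ref, Tag, _Result})
  when is_reference(Ref), (Tag =:= ok orelse Tag =:= error) ->
    true;
is_spawn_reply(_Other) ->
    false.

%% Inspect the mailbox of a local process without consuming any message.
has_spawn_reply(Pid) when is_pid(Pid) ->
    case erlang:process_info(Pid, messages) of
        {messages, Msgs} ->
            lists:any(fun is_spawn_reply/1, Msgs);
        undefined ->
            %% process_info/2 returns undefined when the process is dead
            false
    end.
```

Calling mailbox_check:has_spawn_reply(P) on the blocked gen_server would return false in the situation described here, since only application messages were queued.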

The environment is Kubernetes with Istio used for networking. It's
possible that one of the nodes of the cluster was restarted abruptly,
and maybe that was related to a version upgrade of the Istio networking.
So we have either a node restart or a possible networking glitch that
broke the connection, plus a generally interesting networking
implementation.

One surprising issue, however, is that there were no timeouts and
spawn_opt/5 just got stuck in that state. Could it be related to the
environment? If so, and the caller can be blocked under unfortunate
circumstances in a K8s/Istio environment, would you suggest a way to
prevent such situations?
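
One direction that follows from the spawn_request() remark quoted below is to issue the spawn asynchronously and bound the wait in the caller. The following is only a sketch under assumptions: it requires OTP 23 or later, the module name bounded_spawn and the function spawn_with_timeout/5 are hypothetical, and the timeout value is up to the caller.

```erlang
-module(bounded_spawn).
-export([spawn_with_timeout/5]).

%% Returns {ok, Pid} | {error, Reason} | {error, timeout}.
spawn_with_timeout(Node, M, F, A, TimeoutMs) ->
    %% Asynchronous: returns a request id immediately; the result
    %% arrives later as a {spawn_reply, ReqId, ok|error, Result} message.
    ReqId = erlang:spawn_request(Node, M, F, A, []),
    case await_reply(ReqId, TimeoutMs) of
        timeout ->
            case erlang:spawn_request_abandon(ReqId) of
                true ->
                    %% Abandoned: no reply will be delivered any more.
                    {error, timeout};
                false ->
                    %% The reply was delivered in the meantime; fetch it.
                    case await_reply(ReqId, 0) of
                        timeout -> {error, timeout};
                        Late -> Late
                    end
            end;
        Reply ->
            Reply
    end.

%% Wait at most TimeoutMs milliseconds for the reply to request ReqId.
await_reply(ReqId, TimeoutMs) ->
    receive
        {spawn_reply, ReqId, ok, Pid} -> {ok, Pid};
        {spawn_reply, ReqId, error, Reason} -> {error, Reason}
    after TimeoutMs ->
        timeout
    end.
```

A call such as bounded_spawn:spawn_with_timeout(Node, Mod, Fun, Args, 5000) would then return within roughly five seconds instead of depending on the connection being taken down. Note that abandoning a request does not withdraw it, so the remote process may still be created; inside a gen_server it may also be preferable to handle the spawn_reply message in handle_info/2 rather than block in a receive.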

Thank you,
Vyacheslav
On 16.06.2021 17:03, Rickard Green wrote:
>
>
> On Wed, Jun 16, 2021 at 9:15 AM Vyacheslav Levytskyy
> <v.levytskyy@REDACTED> wrote:
>
>
> The function doesn't interact with the gen_server that calls spawn/4,
> although I'd expect spawn/4 to run a process and return immediately
> anyway, am I wrong?
>
>
> All spawn operations except for spawn_request() (introduced in OTP 23)
> are synchronous and block until the new process has been created and
> the caller of the BIF has received the process identifier of the newly
> created process or an error is detected. In case the connection
> between the nodes stalls the caller will be blocked until the network
> ticker takes down the connection (default 60 seconds).
>
> >>
> >> I'm surprised to see my gen_server process hanging forever when
> >> executing spawn/4 call. Process info shows spawn_opt/5 as a current
> >> function and status is waiting:
> >>
> >> > process_info(P).
> >> [{current_function,{erlang,spawn_opt,5}},
> >> {status,waiting},
> >> {message_queue_len,13},
> >> {trap_exit,false},
> >> {priority,normal},
> >> ...]
> >>
>
>
> It would have been interesting to know what process_info(P, messages)
> had returned. In the distributed case spawn_opt() is waiting for a
> message of the form: {spawn_reply, Ref, ok|error, Pid|Error}
>
> What type of node is the node that you are trying to spawn the new
> process on? Ordinary Erlang node, C-node, ...? OTP release of that node?
>
> Regards,
> Rickard
> --
> Rickard Green, Erlang/OTP, Ericsson AB