[erlang-bugs] net_kernel hang, perhaps blocked by busy_dist_port race?

Scott Lystig Fritchie fritchie@REDACTED
Sun May 16 09:07:53 CEST 2010


Scott Lystig Fritchie <fritchie@REDACTED> wrote:

slf> New update: recipe to duplicate.

Nothing like replying to myself again ... so, here's a kludge fix:
Allow 'max' priority processes (such as 'net_kernel') to send messages
(well, queue them really) on busy distribution ports.

--- dist.c	2009-11-20 07:29:24.000000000 -0600
+++ dist.c.slf	2010-05-16 01:23:46.000000000 -0500
@@ -1496,7 +1496,7 @@
 	dep->qsize += size_obuf(obuf);
 	if (dep->qsize >= ERTS_DE_BUSY_LIMIT)
 	    dep->qflgs |= ERTS_DE_QFLG_BUSY;
-	if (!force_busy && (dep->qflgs & ERTS_DE_QFLG_BUSY)) {
+	if (!force_busy && (dep->qflgs & ERTS_DE_QFLG_BUSY) && c_p->prio != PRIORITY_MAX) {
 	    erts_smp_spin_unlock(&dep->qlock);
 
 	    plp = erts_proclist_create(c_p);

The change isn't really specific to net_kernel, but there aren't many
processes (within OTP, at least) that run at max priority and communicate
with the outside world, right?  And the worst that could happen would be
for the port's queue to grow past ERTS_DE_BUSY_LIMIT until a tick timeout
closes the connection (and thus frees the port's queued data), perhaps?
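
For illustration only, here is a toy model of the patched busy check. It is
not the ERTS code: every toy_/TOY_ name and the 1024-byte limit are made up
for the sketch, and it assumes (as the hunk context suggests) that the data
is added to the queue before the busy test runs. The return value stands in
for "the caller would be suspended".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for ERTS_DE_BUSY_LIMIT and ERTS_DE_QFLG_BUSY;
 * the real values live in the ERTS sources. */
#define TOY_BUSY_LIMIT 1024u
#define TOY_QFLG_BUSY  0x1u

/* Hypothetical priority levels; only "max" vs. "not max" matters here. */
enum toy_prio { TOY_PRIO_NORMAL, TOY_PRIO_MAX };

/* Toy version of the per-connection queue state kept in dep. */
struct toy_dist_entry {
    size_t   qsize;  /* bytes currently queued on the port */
    unsigned qflgs;  /* queue flags; only the busy bit is modelled */
};

/* Account for `size` bytes on the queue, then apply the patched test.
 * Returns true if the sender would be suspended, false if it carries on.
 * Note that qsize grows in both cases, so max-priority senders can push
 * the queue past the limit without ever being suspended. */
static bool toy_enqueue(struct toy_dist_entry *dep, size_t size,
                        enum toy_prio prio, bool force_busy)
{
    assert(dep != NULL);

    dep->qsize += size;
    if (dep->qsize >= TOY_BUSY_LIMIT)
        dep->qflgs |= TOY_QFLG_BUSY;

    return !force_busy
        && (dep->qflgs & TOY_QFLG_BUSY) != 0
        && prio != TOY_PRIO_MAX;
}
```

With an empty queue, two 512-byte sends at normal priority reach the toy
limit and the second one reports "suspend"; a third send at max priority is
let through and leaves the queue at 1536 bytes with the busy flag still set.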

-Scott

