[erlang-questions] linked-in driver blocks net_kernel tick?

Sun Jan 23 17:04:51 CET 2011

There are really two different problems:

1. No run-queue checking if the only living scheduler (schedulers ?) is blocked.
2. zlib is written in a blocking way.

Both should be fixed, though the first is the more serious. It will also become more pressing as NIFs see wider use. While "hardliner me" says that NIF writers have only themselves to blame if they block the system and that they should RTFM, "softliner me" says that we should probably try to help them and make it easier to get it right.
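
For the second problem, the workaround Dan mentions in the quoted thread is to chunk the input yourself with the zlib API. A minimal sketch of what that could look like follows. The module name chunked_gzip, the function names, and the 64 KB default chunk size are assumptions made for illustration, not what the zlib module does or what an eventual patch will do. It only uses the documented zlib calls open/0, deflateInit/6, deflate/2,3, deflateEnd/1 and close/1.

```erlang
%% Hypothetical helper module; the name and default chunk size are
%% illustrative assumptions, not part of OTP.
-module(chunked_gzip).
-export([gzip/1, gzip/2]).

%% Compress a binary to gzip format, 64 KB of input per driver call.
gzip(Data) ->
    gzip(Data, 65536).

gzip(Data, ChunkSize) when is_binary(Data), is_integer(ChunkSize), ChunkSize > 0 ->
    Z = zlib:open(),
    try
        %% WindowBits = 16 + 15 asks zlib for a gzip header and trailer.
        ok = zlib:deflateInit(Z, default, deflated, 16 + 15, 8, default),
        Out = deflate_chunks(Z, Data, ChunkSize, []),
        ok = zlib:deflateEnd(Z),
        iolist_to_binary(Out)
    after
        zlib:close(Z)
    end.

%% Feed the driver one chunk at a time so that no single call has to
%% process the whole input.
deflate_chunks(Z, Data, ChunkSize, Acc) when byte_size(Data) > ChunkSize ->
    <<Chunk:ChunkSize/binary, Rest/binary>> = Data,
    Compressed = zlib:deflate(Z, Chunk),
    deflate_chunks(Z, Rest, ChunkSize, [Acc, Compressed]);
deflate_chunks(Z, Data, _ChunkSize, Acc) ->
    %% Last (possibly empty) chunk: flush and write the gzip trailer.
    [Acc, zlib:deflate(Z, Data, finish)].
```

Calling chunked_gzip:gzip(Data) in place of zlib:gzip(Data) should bound the time spent in any one driver call, which gives the scheduler a chance to run other processes in between. It does nothing about the first problem: any other long-running driver or NIF call can still block the only scheduler that is awake.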

Robert

----- "Dan Gudmundsson" <dgud@REDACTED> wrote:

> Rickard, who implemented this, should explain it.
> 
> If I have understood it correctly, it works like this:
> If a scheduler does not have any work to do, it will be disabled.
> It stays disabled until a live scheduler thread discovers that it has
> too much work and wakes a sleeping scheduler. The run-queues are only
> checked when processes are scheduled.
> 
> Since in this case the only living scheduler is busy for a very long
> time, no queue checking will be done, and all schedulers will be
> blocked until the call to the driver is complete.
> 
> We had a long discussion about it during lunch, and we didn't agree
> on how it should work. :-)
> 
> I agree that zlib is broken and should be fixed, but I still believe
> that this breaks the rule of least astonishment: if I have 16
> schedulers and one is blocked in a long function call, I still expect
> other code to be run. Rickard's view is that such a call should never
> happen and should instead go through an async driver or a separate
> thread. I guess it will take a couple more lunches to come to a
> conclusion :-)
> 
> /Dan
> 
> On Fri, Jan 21, 2011 at 10:25 PM, Ryan Zezeski <rzezeski@REDACTED> wrote:
> > Dan,
> >
> > Thanks for the reply, I'll be sure to chunk my data.  I was using the
> > gzip/1 call for convenience.
> >
> > That said, I'm still a little fuzzy on something you said.  Why is it
> > that the "distribution" process is scheduled on the same scheduler
> > that's running the call to the driver?  Why not schedule it on one of
> > the 15 other schedulers that are currently sleeping?  Does this mean
> > any other message I send will also be blocked?  Dare I ask, how does
> > the scheduling work exactly?
> >
> > -Ryan
> >
> > On Fri, Jan 21, 2011 at 5:16 AM, Dan Gudmundsson <dgud@REDACTED> wrote:
> >
> >> All C calls block a scheduler if they are not pushed out to a thread.
> >>
> >> In this case it's a bug in the zlib module (probably by me): gzip
> >> should chunk up the input before invoking the driver.
> >>
> >> What happens is that all schedulers go to sleep because there is no
> >> work to do, except the one invoking the driver. A ping is received
> >> and wakes up the "distribution" process, which gets queued on the
> >> only scheduler that is awake, but that scheduler is blocked in an
> >> "eternal" call. The pings never get processed and the distribution
> >> times out.
> >>
> >> You can wait for a patch or use the zlib API to chunk up the
> >> compression yourself; see the implementation of gzip in the zlib
> >> module.
> >>
> >> /Dan
> >>
> >> On Fri, Jan 21, 2011 at 2:48 AM, Ryan Zezeski <rzezeski@REDACTED> wrote:
> >> > So...can anyone explain to me why zlib:gzip/1 is causing the
> >> > net_kernel tick to be blocked?  Do linked-in drivers block their
> >> > scheduler like NIFs?  I'm really curious on this one :)
> >> >
> >> > -Ryan
> >> >
> >> > On Tue, Jan 18, 2011 at 6:53 PM, Ryan Zezeski <rzezeski@REDACTED> wrote:
> >> >
> >> >> Apologies, the example I copied was run on my mac.
> >> >>
> >> >> This is what I have on the actual production machine:
> >> >>
> >> >> Erlang R14A (erts-5.8) [source] [64-bit] [smp:16:16] [rq:16]
> >> >> [async-threads:0] [hipe] [kernel-poll:false]
> >> >>
> >> >> To be certain, I ran the same example (except this time using two
> >> >> physical machines) and achieved the same result.  Namely, the 'bar'
> >> >> node claims 'foo' is not responding and thus closes the connection.
> >> >> Whatever this is, I've now easily reproduced it on two different
> >> >> OSs, with 2 different Erlang versions.
> >> >>
> >> >> -Ryan
> >> >>
> >> >> On Tue, Jan 18, 2011 at 6:04 PM, Alain O'Dea <alain.odea@REDACTED> wrote:
> >> >>
> >> >>> On 2011-01-18, at 18:54, Ryan Zezeski <rzezeski@REDACTED> wrote:
> >> >>>
> >> >>> > Hi everyone,
> >> >>> >
> >> >>> > Some of you may remember my latest question, where I was having
> >> >>> > weird node timeout issues that I couldn't explain and I thought
> >> >>> > it might be related to the messages I was passing between my
> >> >>> > nodes.  Well, I pinpointed the problem to a call to zlib:gzip/1.
> >> >>> > At first I was really surprised by this, as such a harmless line
> >> >>> > of code surely should have nothing to do with the ability of my
> >> >>> > nodes to communicate.  However, as I dug further I realized gzip
> >> >>> > was implemented as a linked-in driver, and I remember reading
> >> >>> > that one has to take care with them because they can trash the
> >> >>> > VM.  I don't remember reading anything about them blocking code,
> >> >>> > and even if they do I fail to see why my SMP enabled node (16
> >> >>> > cores) would allow this one thread to block the tick.  It
> >> >>> > occurred to me that maybe the scheduler responsible for that
> >> >>> > process is the one blocked by the driver.  Do processes have
> >> >>> > scheduler affinity?  That would make sense, I guess.
> >> >>> >
> >> >>> > I've "fixed" this problem simply by using a plain port (i.e. run
> >> >>> > in its own OS process).  For my purposes, this actually makes
> >> >>> > more sense in the majority of the places I was making use of
> >> >>> > gzip.  Can someone enlighten me as to exactly what is happening
> >> >>> > behind the scenes?
> >> >>> >
> >> >>> > To reproduce I create a random 1.3GB file:
> >> >>> >
> >> >>> > dd if=/dev/urandom of=rand bs=1048576 count=1365
> >> >>> >
> >> >>> > Then start two named nodes 'foo' and 'bar', connect them, read
> >> >>> > in the file, and then compress said file.  Sometime later (I
> >> >>> > think around 60+ seconds) the node 'bar' will claim that 'foo'
> >> >>> > is not responding.
> >> >>> >
> >> >>> > [progski@REDACTED ~/tmp_code/node_timeout] erl -name foo
> >> >>> > Erlang R14B (erts-5.8.1) [source] [64-bit] [smp:2:2] [rq:2]
> >> >>>
> >> >>> Your SMP node seems to be capped at smp:2:2 when it ought to be
> >> >>> smp:16.  Some resource limit may be holding back the system. That
> >> >>> said, zlib should not ever cause this issue.
> >> >>>
> >> >>> > [async-threads:0] [hipe] [kernel-poll:false]
> >> >>> >
> >> >>> > Eshell V5.8.1  (abort with ^G)
> >> >>> > (foo@REDACTED)1> net_adm:ping('bar@REDACTED').
> >> >>> > pong
> >> >>> > (foo@REDACTED)2> nodes().
> >> >>> > ['bar@REDACTED']
> >> >>> > (foo@REDACTED)3> {ok,Data} = file:read_file("rand").
> >> >>> > {ok,<<103,5,115,210,177,147,53,45,250,182,51,32,250,233,
> >> >>> >      39,253,102,61,73,242,18,159,45,185,232,80,33,...>>}
> >> >>> > (foo@REDACTED)4> zlib:gzip(Data).
> >> >>> > <<31,139,8,0,0,0,0,0,0,3,0,15,64,240,191,103,5,115,210,
> >> >>> >  177,147,53,45,250,182,51,32,250,233,...>>
> >> >>> > (foo@REDACTED)5>
> >> >>> >
> >> >>> >
> >> >>> > [progski@REDACTED ~/tmp_code/node_timeout] erl -name bar
> >> >>> > Erlang R14B (erts-5.8.1) [source] [64-bit] [smp:2:2] [rq:2]
> >> >>> > [async-threads:0] [hipe] [kernel-poll:false]
> >> >>> >
> >> >>> > Eshell V5.8.1  (abort with ^G)
> >> >>> > (bar@REDACTED)1> nodes().
> >> >>> > ['foo@REDACTED']
> >> >>> > (bar@REDACTED)2>
> >> >>> > =ERROR REPORT==== 18-Jan-2011::17:16:10 ===
> >> >>> > ** Node 'foo@REDACTED' not responding **
> >> >>> > ** Removing (timedout) connection **
> >> >>> >
> >> >>> >
> >> >>> > Thanks,
> >> >>> >
> >> >>> > -Ryan
> >> >>>
> >> >>
> >> >>
> >> >
> >>
> >
> 

-- 
Robert Virding, Erlang Solutions Ltd.