[erlang-questions] linked-in driver blocks net_kernel tick?

Sat Jan 22 17:46:23 CET 2011

Rickard, who implemented this, should explain it.

If I have understood it correctly, it works like this:
If a scheduler does not have any work to do, it will be disabled.
It stays disabled until a live scheduler thread discovers that it has too
much work and wakes a sleeping scheduler. The run queues are only checked
when processes are scheduled.

Since in this case the only living scheduler is busy for a very long time,
no queue checking is done, so all the other schedulers stay asleep until
the call to the driver is complete.
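
One way to watch this happen is a small probe process that wakes up on a
timer and reports how late it actually ran. This is only a sketch: the module
name tick_probe and its functions are made up for illustration, and whether
any gaps show up depends on the ERTS version and on how the driver call is
made.

```erlang
-module(tick_probe).
-export([start/1, stop/1]).

%% Hypothetical helper, not part of OTP.
%% Spawns a process that wakes up every IntervalMs milliseconds and prints
%% how long it really took to get scheduled again.
start(IntervalMs) when is_integer(IntervalMs), IntervalMs > 0 ->
    spawn(fun() -> loop(IntervalMs, os:timestamp()) end).

stop(Pid) ->
    Pid ! stop,
    ok.

loop(IntervalMs, Last) ->
    receive
        stop ->
            ok
    after IntervalMs ->
        Now = os:timestamp(),
        %% timer:now_diff/2 returns microseconds.
        ElapsedMs = timer:now_diff(Now, Last) div 1000,
        io:format("woke after ~p ms (asked for ~p), run_queue=~p~n",
                  [ElapsedMs, IntervalMs, erlang:statistics(run_queue)]),
        loop(IntervalMs, Now)
    end.
```

Start it with tick_probe:start(1000) before making the long driver call. If
the probe is stuck behind the busy scheduler, the reported elapsed time
should be far larger than the requested interval.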

We had a long discussion during lunch about it, and we didn't agree on
how it should work. :-)

I agree that zlib is broken and should be fixed, but I still believe this
breaks the rule of least astonishment: if I have 16 schedulers and one is
blocked in a long function call, I still expect other code to be invoked.
Rickard's view is that such a call should never happen and should instead
go through an async driver or a separate thread. I guess it will take a
couple more lunches to come to a conclusion :-)
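
Until zlib is fixed, the chunking workaround mentioned further down the
thread could look roughly like this. It is a minimal sketch, not the patch:
the module name chunked_gzip and the 64 kB default chunk size are
assumptions, and it only uses the public zlib API (open/0, deflateInit/6,
deflate/2,3, deflateEnd/1, close/1). Each driver call then works on one
chunk, so the scheduler gets control back between calls.

```erlang
-module(chunked_gzip).
-export([gzip/1, gzip/2]).

%% Assumed default; tune it for your data.
-define(DEFAULT_CHUNK, 65536).

gzip(Data) ->
    gzip(Data, ?DEFAULT_CHUNK).

%% Compress a binary in gzip format, feeding zlib ChunkSize bytes at a time.
gzip(Data, ChunkSize)
  when is_binary(Data), is_integer(ChunkSize), ChunkSize > 0 ->
    Z = zlib:open(),
    try
        %% WindowBits = 16 + 15 asks zlib for a gzip header and trailer.
        ok = zlib:deflateInit(Z, default, deflated, 16 + 15, 8, default),
        Out = deflate_chunks(Z, Data, ChunkSize, []),
        ok = zlib:deflateEnd(Z),
        iolist_to_binary(Out)
    after
        zlib:close(Z)
    end.

%% More than one chunk left: compress the first chunk and continue.
deflate_chunks(Z, Data, ChunkSize, Acc) when byte_size(Data) > ChunkSize ->
    <<Chunk:ChunkSize/binary, Rest/binary>> = Data,
    Compressed = zlib:deflate(Z, Chunk),
    deflate_chunks(Z, Rest, ChunkSize, [Compressed | Acc]);
%% Last (possibly empty) chunk: flush and finish the stream.
deflate_chunks(Z, Data, _ChunkSize, Acc) ->
    Last = zlib:deflate(Z, Data, finish),
    lists:reverse([Last | Acc]).
```

The result of chunked_gzip:gzip(Data) should decompress with
zlib:gunzip/1 back to the original binary.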

/Dan

On Fri, Jan 21, 2011 at 10:25 PM, Ryan Zezeski <rzezeski@REDACTED> wrote:
> Dan,
>
> Thanks for the reply, I'll be sure to chunk my data.  I was using the gzip/1
> call for convenience.
>
> That said, I'm still a little fuzzy on something you said.  Why is it that
> the "distribution" process is scheduled on the same scheduler that's running
> the call to the driver?  Why not schedule it on one of the 15 other
> schedulers that are currently sleeping?  Does this mean any other message I
> send will also be blocked?  Dare I ask, how does the scheduling work
> exactly?
>
> -Ryan
>
> On Fri, Jan 21, 2011 at 5:16 AM, Dan Gudmundsson <dgud@REDACTED> wrote:
>
>> All C calls block a scheduler if they are not pushed out to a thread.
>>
>> In this case it's a bug in the zlib module (probably by me): gzip should
>> chunk up the input before invoking the driver.
>>
>> What happens is that all schedulers go to sleep because there is no work
>> to do, except the one invoking the driver. A ping is received and wakes up
>> the "distribution" process, which gets queued up on the only scheduler that
>> is awake, but that scheduler is blocked in an "eternal" call. The pings
>> never get processed and the distribution times out.
>>
>> You can wait for a patch or use the zlib API to chunk up the compression
>> yourself; see the implementation of gzip in the zlib module.
>>
>> /Dan
>>
>> On Fri, Jan 21, 2011 at 2:48 AM, Ryan Zezeski <rzezeski@REDACTED> wrote:
>> > So...can anyone explain to me why zlib:gzip/1 is causing the net_kernel
>> > tick to be blocked?  Do linked-in drivers block their scheduler like NIFs?
>> > I'm really curious on this one :)
>> >
>> > -Ryan
>> >
>> > On Tue, Jan 18, 2011 at 6:53 PM, Ryan Zezeski <rzezeski@REDACTED> wrote:
>> >
>> >> Apologies, the example I copied was run on my mac.
>> >>
>> >> This is what I have on the actual production machine:
>> >>
>> >> Erlang R14A (erts-5.8) [source] [64-bit] [smp:16:16] [rq:16]
>> >> [async-threads:0] [hipe] [kernel-poll:false]
>> >>
>> >> To be certain, I ran the same example (except this time using two
>> >> physical machines) and achieved the same result.  Namely, the 'bar' node
>> >> claims 'foo' is not responding and thus closes the connection.  Whatever
>> >> this is, I've now easily reproduced it on two different OSs, with 2
>> >> different Erlang versions.
>> >>
>> >> -Ryan
>> >>
>> >> On Tue, Jan 18, 2011 at 6:04 PM, Alain O'Dea <alain.odea@REDACTED> wrote:
>> >>
>> >>> On 2011-01-18, at 18:54, Ryan Zezeski <rzezeski@REDACTED> wrote:
>> >>>
>> >>> > Hi everyone,
>> >>> >
>> >>> > Some of you may remember my latest question where I was having weird
>> >>> > node timeout issues that I couldn't explain and I thought it might be
>> >>> > related to the messages I was passing between my nodes.  Well, I
>> >>> > pinpointed the problem to a call to zlib:gzip/1.  At first I was really
>> >>> > surprised by this, as such a harmless line of code surely should have
>> >>> > nothing to do with the ability of my nodes to communicate.  However,
>> >>> > as I dug further I realized gzip was implemented as a linked-in driver
>> >>> > and I remember reading things about how one has to take care with them
>> >>> > because one can trash the VM with them.  I don't remember reading
>> >>> > anything about them blocking code, and even if they do I fail to see
>> >>> > why my SMP enabled node (16 cores) would allow this one thread to block
>> >>> > the tick.  It occurred to me that maybe the scheduler responsible for
>> >>> > that process is the one blocked by the driver.  Do processes have
>> >>> > scheduler affinity?  That would make sense, I guess.
>> >>> >
>> >>> > I've "fixed" this problem simply by using a plain port (i.e. run in
>> >>> > its own OS process).  For my purposes, this actually makes more sense
>> >>> > in the majority of the places I was making use of gzip.  Can someone
>> >>> > enlighten me as to exactly what is happening behind the scenes?
>> >>> >
>> >>> > To reproduce I create a random 1.3GB file:
>> >>> >
>> >>> > dd if=/dev/urandom of=rand bs=1048576 count=1365
>> >>> >
>> >>> > Then start two named nodes 'foo' and 'bar', connect them, read in the
>> >>> > file, and then compress said file.  Sometime later (I think around 60+
>> >>> > seconds) the node 'bar' will claim that 'foo' is not responding.
>> >>> >
>> >>> > [progski@REDACTED ~/tmp_code/node_timeout] erl -name foo
>> >>> > Erlang R14B (erts-5.8.1) [source] [64-bit] [smp:2:2] [rq:2]
>> >>>
>> >>> Your SMP node seems to be capped at smp:2:2 when it ought to be smp:16.
>> >>> Some resource limit may be holding back the system. That said, zlib
>> >>> should not ever cause this issue.
>> >>>
>> >>> > [async-threads:0] [hipe] [kernel-poll:false]
>> >>> >
>> >>> > Eshell V5.8.1  (abort with ^G)
>> >>> > (foo@REDACTED)1> net_adm:ping('bar@REDACTED').
>> >>> > pong
>> >>> > (foo@REDACTED)2> nodes().
>> >>> > ['bar@REDACTED']
>> >>> > (foo@REDACTED)3> {ok,Data} = file:read_file("rand").
>> >>> > {ok,<<103,5,115,210,177,147,53,45,250,182,51,32,250,233,
>> >>> >      39,253,102,61,73,242,18,159,45,185,232,80,33,...>>}
>> >>> > (foo@REDACTED)4> zlib:gzip(Data).
>> >>> > <<31,139,8,0,0,0,0,0,0,3,0,15,64,240,191,103,5,115,210,
>> >>> >  177,147,53,45,250,182,51,32,250,233,...>>
>> >>> > (foo@REDACTED)5>
>> >>> >
>> >>> >
>> >>> > [progski@REDACTED ~/tmp_code/node_timeout] erl -name bar
>> >>> > Erlang R14B (erts-5.8.1) [source] [64-bit] [smp:2:2] [rq:2]
>> >>> > [async-threads:0] [hipe] [kernel-poll:false]
>> >>> >
>> >>> > Eshell V5.8.1  (abort with ^G)
>> >>> > (bar@REDACTED)1> nodes().
>> >>> > ['foo@REDACTED']
>> >>> > (bar@REDACTED)2>
>> >>> > =ERROR REPORT==== 18-Jan-2011::17:16:10 ===
>> >>> > ** Node 'foo@REDACTED' not responding **
>> >>> > ** Removing (timedout) connection **
>> >>> >
>> >>> >
>> >>> > Thanks,
>> >>> >
>> >>> > -Ryan
>> >>>
>> >>
>> >>
>> >
>>
>