[erlang-questions] Why does an idle erlang process (couchdb, to be precise) call epoll_wait so often?

Lukas Larsson garazdawi@REDACTED
Thu May 23 11:13:15 CEST 2013


On Thu, May 23, 2013 at 10:56 AM, Jann Horn <jann@REDACTED> wrote:

> On Wed, May 22, 2013 at 06:42:59PM +0200, Lukas Larsson wrote:
> > You have to poll like that because fd's are not the only thing which can
> > trigger load on the system. Timeouts for instance are triggered by calling
> > gettimeofday which means you have to break out of epoll_wait before the
> > next timeout happens.
>
> Why not by specifying a timeout in the epoll_wait call?
>

If you look closely at the strace output you can see that every n-th call
to epoll_wait is given a non-zero timeout. This timeout is calculated from
the time of the next timer expiry and a number of other factors.
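
As a rough illustration of the timer part of that calculation (a sketch in
Python, not the actual ERTS code; the function names and the use of integer
milliseconds are assumptions made for the example, and the "other factors"
are left out):

```python
import heapq


def add_timer(deadlines, deadline_ms):
    """Record an absolute timer deadline (in ms) in a min-heap."""
    heapq.heappush(deadlines, deadline_ms)


def next_poll_timeout_ms(deadlines, now_ms):
    """Return the timeout to pass to a blocking poll call.

    deadlines: min-heap of absolute deadlines in milliseconds.
    now_ms:    current time in milliseconds on the same clock.

    Returns -1 (block indefinitely) if no timer is pending, 0 if the
    earliest timer is already due, else the ms left until it is due.
    """
    if not deadlines:
        return -1
    remaining = deadlines[0] - now_ms
    return remaining if remaining > 0 else 0
```

With timers due at 1228 ms and 2000 ms and the clock at 1000 ms, the poll
call would be given a timeout of 228 ms, so it returns in time for the
first timer even if no fd becomes ready.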


>
> > Also the spinning is done to make the system respond
> > faster to events by delaying sleeping in the kernel.
>
> Ah, ok... and that brings a performance gain?
>

It brings a latency gain, which in turn can bring a performance gain.
Going to sleep in the kernel is a (relatively) expensive thing to do, so by
spinning for a while before sleeping the schedulers stay more responsive.
You can configure this behaviour through the runtime flags +sbwt, +sws and
+swt. See http://www.erlang.org/doc/man/erl.html for some more details.
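
The spin-then-sleep pattern can be sketched roughly like this (Python with
the standard selectors module, purely for illustration; wait_for_events and
SPIN_COUNT are made-up names, and the real scheduler logic in ERTS is more
involved than this):

```python
import selectors

# Hypothetical constant; 10 mirrors the spin count mentioned in this thread.
SPIN_COUNT = 10


def wait_for_events(sel, sleep_timeout_s, spin_count=SPIN_COUNT):
    """Poll without blocking up to spin_count times, then block once.

    sel:             a selectors.BaseSelector with fds already registered.
    sleep_timeout_s: timeout in seconds for the final blocking wait
                     (None blocks until an fd becomes ready).

    Returns (events, zero_timeout_polls).
    """
    polls = 0
    for _ in range(spin_count):
        polls += 1
        events = sel.select(timeout=0)  # returns immediately
        if events:
            return events, polls        # caught the event while spinning
    # Nothing arrived while spinning: now pay the cost of a real sleep.
    return sel.select(timeout=sleep_timeout_s), polls
```

An event that arrives during the spin phase is picked up by one of the
zero-timeout polls without the thread ever going to sleep, which is where
the latency gain comes from. The price is the run of zero-timeout calls
that shows up in the strace when nothing is happening.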


>
>
> > On Wed, May 22, 2013 at 3:56 PM, Jann Horn <jann@REDACTED> wrote:
> >
> > > On Wed, May 22, 2013 at 03:46:17PM +0200, Benoit Chesneau wrote:
> > > > How would it work to receive an event to wake up if it doesn't
> > > > listen on them?
> > >
> > > Uh... register all fds you're interested in using epoll, then do
> > > epoll_wait(fd, events, maxevents, -1)? Isn't that actually quite
> > > normal? The syscall will return as soon as something happens, but
> > > not earlier.
> > >
> > > You don't have to poll different event sources, there
> > > are many facilities that can multiplex those event sources for
> > > you and will wake you up as soon as something interesting happens.
> > > E.g. select, poll, epoll, kqueue and event ports (the last two aren't
> > > available on linux). I can't imagine a reason why you'd have to
> > > poll events like this in any OS.
> > >
> > >
> > > > On Wed, May 22, 2013 at 1:09 PM, Jann Horn <jann@REDACTED> wrote:
> > > > > This is strace output from a totally idle couchdb process:
> > > > >
> > > > > [pid 18350] 17:40:15.086577 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.086610 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.086643 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.086675 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.086736 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.086771 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.086940 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.087180 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.087219 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.087251 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.087287 epoll_wait(3, {}, 256, 228) = 0
> > > > > [pid 18350] 17:40:15.315646 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.315718 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.315751 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.315800 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.315833 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.315865 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.315900 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.315953 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.315992 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.316081 epoll_wait(3, {}, 256, 651) = 0
> > > > > [pid 18350] 17:40:15.967979 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.969211 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.969926 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.970484 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.970616 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.970717 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.970806 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.970895 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.970983 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.971071 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.971158 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.971243 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:15.971343 epoll_wait(3, {}, 256, 345) = 0
> > > > > [pid 18350] 17:40:16.316908 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:16.317266 epoll_wait(3, {}, 256, 650) = 0
> > > > > [pid 18350] 17:40:16.969333 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:16.970057 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:16.970580 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:16.970736 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:16.970827 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:16.970917 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:16.971011 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:16.971118 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:16.971206 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:16.971300 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:16.971386 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:16.971485 epoll_wait(3, {}, 256, 115) = 0
> > > > > [pid 18350] 17:40:17.087042 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:17.087421 accept(10, 0x7fb208dbfba0, [28]) = -1 EAGAIN (Resource temporarily unavailable)
> > > > > [pid 18350] 17:40:17.087915 epoll_ctl(3, EPOLL_CTL_DEL, 10, {EPOLLIN, {u32=10, u64=73199780460757002}}) = 0
> > > > > [pid 18350] 17:40:17.088035 epoll_ctl(3, EPOLL_CTL_ADD, 10, {EPOLLIN, {u32=10, u64=73199780460757002}}) = 0
> > > > > [pid 18350] 17:40:17.088139 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:17.088231 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:17.088327 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:17.088477 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:17.088565 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:17.088660 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:17.088746 epoll_wait(3, {}, 256, 0) = 0
> > > > > [pid 18350] 17:40:17.088833 epoll_wait(3, {}, 256, 0) = 0
> > > > >
> > > > > This seems to be, at least partly, intentional – erts/emulator/beam/erl_process.c
> > > > > contains a constant named "ERTS_SCHED_SYS_SLEEP_SPINCOUNT" which is set to 10.
> > > > >
> > > > > Can anyone tell me what the rationale behind this excessive busylooping is? A few
> > > > > dozen syscalls per second for nothing seems a bit weird to me.
> > > > >
> > > > > _______________________________________________
> > > > > erlang-questions mailing list
> > > > > erlang-questions@REDACTED
> > > > > http://erlang.org/mailman/listinfo/erlang-questions
> > > > >