Scheduler threads high CPU utilization on Erlang 11.0.3

Gerhard Lazu gerhard@REDACTED
Mon Oct 12 12:32:08 CEST 2020


Hi Lukas & Gopi,

There is a week-long thread in rabbitmq-users
<https://groups.google.com/g/rabbitmq-users/c/BYJzgySEdr8>, the RabbitMQ
equivalent of erlang-questions, which has some interesting details from
David Cook that you may find useful, Lukas. For example, David picked up on
some interesting behaviour coming from
https://github.com/erlang/otp/blob/master/erts/lib_src/pthread/ethr_event.c#L65-L161
when +sbwt none is used.
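If anyone wants to reproduce that behaviour on a RabbitMQ node, here is a minimal
sketch of disabling scheduler busy waiting at startup, assuming a standard install
where RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS gets appended to the emulator flags:

# Turn off busy waiting for normal, dirty CPU and dirty IO schedulers,
# then start the node; drop the extra flags to return to the defaults.
export RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+sbwt none +sbwtdcpu none +sbwtdio none"
rabbitmq-server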

Sharing some insights that might help:

Erlang in the RabbitMQ Docker image is compiled with extra microstate accounting
<https://github.com/docker-library/rabbitmq/blob/35b41e318d9d9272126f681be74bcbfd9712d71b/Dockerfile-ubuntu.template#L134>,
which helps to better understand where scheduler time is spent. This is
what I get on a freshly booted RabbitMQ node, with no queues /
connections / channels:

rabbitmq-diagnostics runtime_thread_stats --sample-interval 60
Will collect runtime thread stats on rabbit@REDACTED for 60 seconds...
Average thread real-time    : 60001084 us
Accumulated system run-time :  1498785 us
Average scheduler run-time  :   179363 us

         Thread    alloc      aux      bif busy_wait check_io emulator      ets       gc  gc_full      nif    other     port     send    sleep    timer
Stats per type:
          async    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
            aux    0.00%    0.00%    0.00%    0.00%    0.01%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%   99.98%    0.00%
 dirty_cpu_sche    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
 dirty_io_sched    0.00%    0.00%    0.00%    0.01%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%   99.99%    0.00%
           poll    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
      scheduler    0.01%    0.01%    0.02%    0.19%    0.00%    0.04%    0.00%    0.01%    0.00%    0.00%    0.01%    0.00%    0.00%   99.70%    0.00%
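For reference, the same per-thread breakdown can be collected by hand with msacc
from runtime_tools; a minimal sketch, assuming rabbitmqctl can reach the node (a
10 second sample keeps it well within the default command timeout):

# msacc:start/1 samples microstate accounting for the given number of
# milliseconds, msacc:print/0 prints the per-thread table shown above,
# and msacc:stop/0 makes sure accounting is switched off afterwards.
rabbitmqctl eval 'msacc:start(10000), msacc:print(), msacc:stop().'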

All beam.smp flags that RabbitMQ v3.8.9 (latest stable) uses on Erlang/OTP
v23.1.1 by default:

/usr/local/lib/erlang/erts-11.1.1/bin/beam.smp -W w -K true -A 128 \
  -MBas ageffcbf -MHas ageffcbf -MBlmbcs 512 -MHlmbcs 512 -MMmcs 30 \
  -P 1048576 -t 5000000 -stbt db -zdbbl 128000 -B i
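To double-check what a running node actually booted with, one option is to look
at the beam.smp command line from the OS; a sketch, assuming procps is available
(run it on the VM, or inside the container if ps is installed there):

# Print the full argument list of the beam.smp process
ps -o args= -C beam.smp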

It takes one gcloud CLI command
<https://github.com/rabbitmq/tgir/blob/2d4dc1bde3f9ead0fe3555bcc2dffafae4422978/s01/e01/Makefile#L50-L58>
to spin up this container image on a dedicated VM with the same
performance as if RabbitMQ was installed from a package.
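If GCP is not an option, the same image also runs locally with plain Docker; a
sketch, where the tag is an assumption, so use whichever 3.8.x tag you are on:

# Run the management-enabled image, exposing AMQP (5672) and the management UI (15672)
docker run -d --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3.8.9-management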

By default, every 5 seconds a lot of metrics are collected from all queues,
channels, connections and even the OS itself, then stored in ETS. This will
cause periodic CPU spikes (every 5s) even when there is no
publishing/consuming activity. Erlang processes to watch for:
rabbit_mgmt_external_stats (~6k reductions every 5s), vm_memory_monitor
(~2.5k reductions every 5s) and rabbit_disk_monitor (~2k reductions every 5s).
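A quick way to confirm those numbers without the full observer UI is recon, which
ships with RabbitMQ; a minimal sketch, assuming rabbitmqctl can reach the node:

# Top 3 processes by reductions, sampled over a 5000 ms window
rabbitmqctl eval 'recon:proc_window(reductions, 3, 5000).'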

rabbitmq-diagnostics observer will show you, Gopi, which processes use the most
reductions in your case. Quick how-to: start it
<https://gerhard.io/slides/observe-understand-rabbitmq/#/53>, then filter
by reductions <https://gerhard.io/slides/observe-understand-rabbitmq/#/54>.
This is what the text form looks like for me on a freshly booted node:

|Home(H)|Network(N)|System(S)|Ets(E)/Mnesia(M)|App(A)|Doc(D)|Plugin(P)  recon:proc_window(reductions, 29, 5000)  Interval | 0Days 0:8:28 |
|Erlang/OTP 23 [erts-11.1.1] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:128] |
|System     | Count/Limit    | System Switch             | Status  | Memory Info        | Size             |
|Proc Count | 565/1048576    | Version                   | 23.1.1  | Allocted Mem       | 123.8125 MB | 100.0% |
|Port Count | 11/1048576     | Multi Scheduling          | enabled | Use Mem            | 88.6950 MB  | 71.64% |
|Atom Count | 46994/5000000  | Logical Processors        | 8       | Unuse Mem          | 35.1175 MB  | 28.36% |
|Mem Type   | Size           | Mem Type                  | Size    | IO/GC              | Interval: 5000ms |
|Total      | 99.6300 MB | 100.0% | Binary               | 147.4844 KB | 00.14% | IO Output          | 7.1914 KB |
|Process    | 36.7505 MB | 36.89% | Code                 | 31.0206 MB  | 31.14% | IO Input           | 140 B     |
|Atom       | 1.3694 MB  | 01.37% | Reductions           | 266508               | Gc Count           | 20        |
|Ets        | 3.5130 MB  | 03.53% | RunQueue/ErrorLoggerQueue | 1/0             | Gc Words Reclaimed | 434099    |
||1  00.52% |5  00.01% |9   00.00% |13  00.00% |
||2  00.01% |6  00.01% |10  00.00% |14  00.00% |
||3  00.01% |7  00.01% |11  00.00% |15  00.00% |
||4  00.01% |8  00.01% |12  00.00% |16  00.03% |
|No | Pid        | Reductions | Name or Initial Call                   | Memory      | MsgQueue | Current Function                   |
|1  | <0.702.0>  | 5827       | rabbit_mgmt_external_stats             | 139.6016 KB | 0        | gen_server:loop/7                  |
|2  | <0.352.0>  | 2472       | vm_memory_monitor                      | 119.2305 KB | 0        | gen_server:loop/7                  |
|3  | <0.354.0>  | 2095       | rabbit_disk_monitor                    | 119.2305 KB | 0        | gen_server:loop/7                  |
|4  | <0.237.0>  | 1277       | aten_emitter                           | 6.7227 KB   | 0        | gen_server:loop/7                  |
|5  | <0.59.0>   | 496        | file_server_2                          | 66.3164 KB  | 0        | gen_server:loop/7                  |
|6  | <0.44.0>   | 403        | application_controller                 | 674.3008 KB | 0        | gen_server:loop/7                  |
|7  | <0.724.0>  | 374        | node_coarse_metrics_metrics_collector  | 33.6211 KB  | 0        | gen_server:loop/7                  |
|8  | <0.599.0>  | 305        | rabbit_node_monitor                    | 21.2813 KB  | 0        | gen_server:loop/7                  |
|9  | <0.357.0>  | 290        | file_handle_cache                      | 10.2695 KB  | 0        | gen_server2:process_next_msg/1     |
|10 | <0.723.0>  | 262        | node_persister_metrics_metrics_collect | 25.8945 KB  | 0        | gen_server:loop/7                  |
|11 | <0.591.0>  | 237        | rabbit_event                           | 33.5430 KB  | 0        | gen_event:fetch_msg/6              |
|12 | <0.725.0>  | 236        | node_metrics_metrics_collector         | 106.7383 KB | 0        | gen_server:loop/7                  |
|13 | <0.726.0>  | 234        | node_node_metrics_metrics_collector    | 28.8398 KB  | 0        | gen_server:loop/7                  |
|14 | <0.209.0>  | 219        | timer_server                           | 11.7031 KB  | 0        | gen_server:loop/7                  |
|15 | <0.727.0>  | 178        | connection_churn_metrics_metrics_colle | 21.1211 KB  | 0        | gen_server:loop/7                  |
|16 | <0.618.0>  | 175        | rabbit_memory_monitor                  | 11.6758 KB  | 0        | gen_server2:process_next_msg/1     |
|17 | <0.266.0>  | 156        | sysmon_handler_filter                  | 24.0313 KB  | 0        | gen_server:loop/7                  |
|18 | <0.741.0>  | 139        | cowboy_clock                           | 5.6836 KB   | 0        | gen_server:loop/7                  |
|19 | <0.3.0>    | 139        | erts_dirty_process_signal_handler:star | 2.5391 KB   | 0        | erts_dirty_process_signal_handler: |
|20 | <0.999.0>  | 84         | inet_tcp_dist:do_accept/7              | 8.5078 KB   | 0        | dist_util:con_loop/2               |
|21 | <0.350.0>  | 72         | rabbit_alarm                           | 8.6094 KB   | 0        | gen_event:fetch_msg/6              |
|22 | <0.238.0>  | 63         | aten_detector                          | 23.9883 KB  | 0        | gen_server:loop/7                  |
|23 | <0.722.0>  | 53         | queue_coarse_metrics_metrics_collector | 5.6836 KB   | 0        | gen_server:loop/7                  |
|24 | <0.50.0>   | 47         | code_server                            | 416.4180 KB | 0        | code_server:loop/1                 |
|25 | <0.721.0>  | 44         | queue_metrics_metrics_collector        | 5.6836 KB   | 0        | gen_server:loop/7                  |
|26 | <0.720.0>  | 44         | consumer_created_metrics_collector     | 5.6836 KB   | 0        | gen_server:loop/7                  |
|27 | <0.719.0>  | 44         | channel_process_metrics_metrics_collec | 5.6836 KB   | 0        | gen_server:loop/7                  |
|28 | <0.718.0>  | 44         | channel_exchange_metrics_metrics_colle | 5.6836 KB   | 0        | gen_server:loop/7                  |
|29 | <0.717.0>  | 44         | channel_queue_metrics_metrics_collecto | 5.6836 KB   | 0        | gen_server:loop/7                  |
|q(quit) p(pause) r/rr(reduction) m/mm(mem) b/bb(binary mem) t/tt(total heap size) mq/mmq(msg queue) 9(proc 9 info) F/B(page forward/back |

PerfTest
<https://rabbitmq.github.io/rabbitmq-perf-test/stable/htmlsingle/#using-environment-variables-as-options>
can be configured to periodically stop all publishing via e.g. --variable-rate
0:60 --variable-rate 100:60, which translates to: do not publish for 60
seconds, publish 100 msg/s for 60 seconds, then repeat. Consumers will not
put any pressure on RabbitMQ if messages are pushed instead of pulled (the
default PerfTest behaviour), so as soon as all messages are consumed and
publishing stops, there should be a few seconds of inactivity between the
periodic metric collections.
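A full invocation implementing that pattern might look like this; a sketch, where
the URI and credentials are assumptions, so adjust them to your environment:

# Alternate 60s of silence with 60s at 100 msg/s, one publisher and one consumer
bin/runjava com.rabbitmq.perf.PerfTest \
  --uri amqp://guest:guest@localhost:5672 \
  --producers 1 --consumers 1 \
  --variable-rate 0:60 --variable-rate 100:60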

In some cases, applications pull messages from RabbitMQ with constant
basic.get operations, which can result in excessive CPU usage even though
there are no messages. I don't think that this is your case, Gopi, but I'm
mentioning it since it is known to cause massive expeditions down
rabbit holes.

I hope this moves the discussion in the right direction.

Thanks all, Gerhard.