Jesper Louis Andersen
Fri Apr 9 11:41:44 CEST 2021
On Thu, Apr 8, 2021 at 9:51 PM valentin@REDACTED <valentin@REDACTED> wrote:
> Presuming that by "micro-batching" you mean more than one message being
> delivered at the same time, I don’t think that would be the only way to
> accomplish rates of 100k messages per second.
> In fact, in one of our projects we have accomplished rates of above 200k
> UDP messages per second per Erlang node (using Erlang’s gen_udp module).
It just means that you avoid processing messages one-by-one: instead, you
set up a short interval (say 5ms) and process everything that arrived
within it as a batch. In practice, this still "feels" instant in the
system, but it amortizes some back-and-forth overhead over multiple
messages, reducing it. In a certain sense, the kernel buffer on a UDP
socket is already doing micro-batching when it delivers messages up to the
VM. The actual number of messages you can process also depends somewhat on
the hardware configuration, and probably on kernel tuning at these rates.
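The loop above can be sketched roughly like this (a minimal illustration,
not from any real project; the module and function names are made up):

```erlang
-module(microbatch).
-export([loop/1]).

%% Receive loop that collects all messages arriving within a 5 ms
%% window and hands them to BatchFun as one batch.
loop(BatchFun) ->
    receive
        Msg ->
            %% The first message opens the window; drain the rest.
            Deadline = erlang:monotonic_time(millisecond) + 5,
            Batch = drain([Msg], Deadline),
            BatchFun(lists:reverse(Batch)),
            loop(BatchFun)
    end.

%% Accumulate messages until the deadline passes.
drain(Acc, Deadline) ->
    Left = Deadline - erlang:monotonic_time(millisecond),
    if
        Left > 0 ->
            receive
                Msg -> drain([Msg | Acc], Deadline)
            after Left ->
                Acc
            end;
        true ->
            Acc
    end.
```

The trade-off is the usual one: a slightly larger worst-case latency (here
bounded by the window length) in exchange for fewer per-message round
trips.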
> I am not sure what process_flag( message_queue_data, off_heap ) actually
> does. I mean, I do understand that a particular process queue is allocated
> from some "private space", but not sure how such memory is managed.
> It would be great if anyone can shed some light on this.
In normal operation, the message queue is part of the process heap. So
when it's time to do a GC run, all of the messages on the heap are also
scanned. This takes time proportional to the size of the message queue,
which is especially costly when the queue is large and the rest of the
heap is small. But note that a message arriving in the queue can't point
to other data in the heap. This leads to the idea of storing the messages
off-heap, in their own private space. That improves GC times because we
no longer have to traverse data in the queue, and it's safe because the
lifetime analysis tells us that messages in the queue can't keep heap
data alive.
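For reference, the flag can be set either at spawn time via spawn_opt or
from within the process via process_flag; both are documented OTP APIs
(the module and loop here are just a toy example):

```erlang
-module(offheap_demo).
-export([start/0]).

%% Spawn a process whose signal/message queue is kept outside the
%% process heap, so GC never scans pending messages.
start() ->
    spawn_opt(fun loop/0, [{message_queue_data, off_heap}]).

loop() ->
    receive
        {From, Msg} ->
            From ! {ok, Msg},
            loop();
        stop ->
            ok
    end.
```

Equivalently, the process can call
`process_flag(message_queue_data, off_heap)` itself early in its init, and
you can inspect the current setting with
`erlang:process_info(Pid, message_queue_data)`.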
The flip side is that when messages are stored off_heap, sending messages
to the process is slightly slower. This has to do with the optimizations
and locking pertaining to message passing. The normal way is that the
sending process allocates temporary space for the messages and then hooks
these small allocations into the message queue of the receiving process.
They are "temporary" private off-heap spaces, which don't require holding
a lock for long, and don't require a lock on the memory area of the
receiving process heap at all. Messages are then "internalized" into the
process, and the next garbage collection moves them onto the main heap,
so all the smaller memory spaces can be freed. With off_heap, we need to
manage the off-heap area instead, and this requires some extra locking,
which can conflict more, potentially slowing down message delivery.