[erlang-questions] process priority

Wed Jul 6 11:30:31 CEST 2011

On 5 July 2011 18:07, Jachym Holecek <freza@REDACTED> wrote:
> # Mazen Harake 2011-07-05:
>> Writing to mnesia/dets requires no locking, nothing mentionable
>> anyway, since the messages (internal ones) are serialized just like
>> your gen_server will have its messages serialized (also, use
>> dirty-operations).
>
> I was talking about VM level locks -- but like I said, I don't
> know if the impact is measurable here.
>

Not sure what you mean by this. The two scenarios (a log message
going to a gen_server or to a table process) are the same in this
respect. I don't think the locks are measurable in this instance.

>
>> This creates end to end flow control, actually even more so because
>> you won't need the extra process (your logging process) in between
>> the write.
>
> By end-to-end I mean a feedback between message producers and their
> consumer (or consumers) -- I don't see how you get such behaviour
> with table-based approach. What prevents producers from generating
> messages faster than consumer/s can read them?
>

My mistake, I misunderstood what you meant by flow control :). The
inherent flow control is: the more processes try to write to a log
table, the slower it will get, BUT it will only get slower, not stop
(due to I/O hogging). That is the idea with using a table.

>> Perhaps your small 100B messages work so well with delayed_write
>> because you can write many of them to memory before they are flushed
>> to a file thus not hogging the disk, but the bigger messages you
>> have the larger your in-memory size needs to be to avoid this.
>
> Sure, delayed_write parameters are configurable in the library I have
> in mind. It's really more about avoiding OS overhead for many writes,
> the disk itself just has to be fast enough to handle the load -- if
> it's not, every buffer will overrun eventually. It's also a matter of
> how much delay are you willing to tolerate between enqueuing message
> and seeing it on disk; and how many messages are you willing to lose
> on tragical VM crash.
>

The point is to touch the disk as seldom as possible. A fast disk will
perform better overall, of course, but as you said, the OS overhead
for many writes is what you try to avoid. So buffering up and writing
everything in one go is the better way to do it. Whether that is done
with delayed_write or by buffering in tables is another question. At
least we agree on this :)
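
For reference, a minimal sketch of the delayed_write variant, using
the {delayed_write, Size, Delay} option of file:open/2. The module
name, the 64 KiB buffer and the 2 s delay are made-up values for
illustration, not something either of us measured:

```erlang
-module(delayed_log).
-export([open/1, write/2, close/1]).

%% Hypothetical tuning values: flush once 64 KiB are buffered,
%% or after buffered data has waited 2000 ms.
-define(BUF_BYTES, 65536).
-define(FLUSH_MS, 2000).

%% Open the log in raw append mode with delayed writes, so many
%% small writes are collected into fewer OS-level write calls.
open(Path) ->
    file:open(Path, [append, raw, binary,
                     {delayed_write, ?BUF_BYTES, ?FLUSH_MS}]).

%% Msg may be any iodata (binary or iolist); one entry per line.
write(Fd, Msg) ->
    file:write(Fd, [Msg, $\n]).

%% Closing the file flushes whatever is still buffered.
close(Fd) ->
    file:close(Fd).
```

Anything still in the buffer is lost if the VM dies before the flush,
which is the trade-off mentioned above.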

>> Delayed write does of course work well but I have experience that says
>> that writing and buffering it up in tables can be helpful to avoid
>> disk thrashing when messages are large (or higher volume). I don't
>> remember exactly how much throughput we had (and I don't want to guess
>> since it will be mere speculation without having hard data) but it
>> helped immensely.
>>
>> So I guess OP now have 2 suggestions which of course isn't bad ;)
>
> Certainly. :-)
>
>> One should also keep in mind though that different situation may have
>> different needs, would be interesting to see how they would measure
>> up.
>
> Sure -- you can't get persistent queues with gen_server-based approach
> for instance; it's designed & optimized for relatively small messages
> arriving at very high rates.
>

This makes me wonder whether you thought I meant that the tables I
propose would be disc copies. My suggestion was in-memory tables only;
making them disc based would make them slower but persistent, as you
say.
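
A rough sketch of that choice in mnesia terms, assuming mnesia is
already started. The module, record and field names are hypothetical,
and disc_copies additionally needs a disc schema on the node
(mnesia:create_schema/1) before it will work:

```erlang
-module(log_table).
-export([create/1, write/1]).

%% Hypothetical record; the table gets the same name as the record.
-record(log_entry, {key, msg}).

%% Storage is ram_copies (in memory only, the suggestion here) or
%% disc_copies (slower, but survives a restart).
create(Storage) when Storage =:= ram_copies; Storage =:= disc_copies ->
    mnesia:create_table(log_entry,
                        [{Storage, [node()]},
                         {type, ordered_set},
                         {attributes, record_info(fields, log_entry)}]).

%% Dirty write, no transaction. The monotonic unique integer keeps
%% entries in insertion order within one VM instance.
write(Msg) ->
    Key = erlang:unique_integer([monotonic]),
    mnesia:dirty_write(#log_entry{key = Key, msg = Msg}).
```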

> If you can recall some details about your workload (average message size,
> were they iolists/binaries, if iolists how complex were they, how did
> flusher process work roughly -- this sort of thing) I could probably
> measure the two approaches in various situations (different message
> sizes and producer concurrency levels) over the weekend and share the
> results (but not the code, sorry, proprietary stuff).

IIRC:
Message size: around 1K +- 200B, possibly iolists (not certain).
Messages per second: Don't remember... dare I guess around 10k? (Don't
ask me to bet my money on it ;))
Flusher: Just like delayed write but using the table as the buffer.
I.e. read the table size every x second(s); if size > N -> flush(size),
else if x' seconds have passed since the last flush -> flush(size).
Then immediately check the size again.
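
To make the flusher description concrete, here is a sketch from memory
of the idea, not the original code. It assumes a public ETS
ordered_set as the buffer table (the real thing may well have used
mnesia), and the module name, table name and parameters (N, CheckMs
for x, MaxAgeMs for x') are all made up. Skipping the timed flush when
the table is empty is my own addition:

```erlang
-module(table_flusher).
-export([start/4, log/1, flush_due/4]).

-define(TAB, log_buffer).   %% hypothetical table name

%% Start the flusher and wait until the buffer table exists.
start(Path, N, CheckMs, MaxAgeMs) ->
    Parent = self(),
    Pid = spawn_link(fun() ->
        ets:new(?TAB, [named_table, public, ordered_set]),
        {ok, Fd} = file:open(Path, [append, raw, binary]),
        Parent ! {self(), ready},
        loop(Fd, N, CheckMs, MaxAgeMs, now_ms())
    end),
    receive {Pid, ready} -> {ok, Pid} end.

%% Producers write straight into the table; no process in between.
log(Msg) ->
    ets:insert(?TAB, {erlang:unique_integer([monotonic]), Msg}).

%% Flush when more than N entries are buffered, or when MaxAgeMs
%% have passed since the last flush and there is something to write.
flush_due(Size, N, _AgeMs, _MaxAgeMs) when Size > N -> true;
flush_due(Size, _N, AgeMs, MaxAgeMs) when Size > 0, AgeMs >= MaxAgeMs -> true;
flush_due(_Size, _N, _AgeMs, _MaxAgeMs) -> false.

loop(Fd, N, CheckMs, MaxAgeMs, LastFlush) ->
    Size = ets:info(?TAB, size),
    case flush_due(Size, N, now_ms() - LastFlush, MaxAgeMs) of
        true ->
            flush(Fd),
            %% Check the size again immediately after a flush.
            loop(Fd, N, CheckMs, MaxAgeMs, now_ms());
        false ->
            timer:sleep(CheckMs),
            loop(Fd, N, CheckMs, MaxAgeMs, LastFlush)
    end.

%% One disk write for the whole batch, then delete only the keys
%% that were read, so entries inserted meanwhile are not lost.
flush(Fd) ->
    Entries = ets:tab2list(?TAB),
    ok = file:write(Fd, [[Msg, $\n] || {_Key, Msg} <- Entries]),
    lists:foreach(fun({Key, _Msg}) -> ets:delete(?TAB, Key) end, Entries).

now_ms() ->
    erlang:monotonic_time(millisecond).
```

Usage would be something like table_flusher:start("app.log", 1000,
1000, 5000) once at startup and table_flusher:log(Msg) from any
process; those numbers are placeholders, not the ones we ran with.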

Would be interesting to see how it performs.

/M