gen_server locked for some time

I Gusti Ngurah Oka Prinarjaya okaprinarjaya@REDACTED
Fri Dec 27 10:44:29 CET 2019


Hi,

What is the form of your bulk data? Is it a CSV that contains millions of
lines (rows)?


On Wed, 4 Dec 2019 at 02:59 Roberto Ostinelli <
ostinelli@REDACTED> wrote:

> Thanks for the tips, Max and Jesper.
> In those solutions, though, how do you guarantee the order of the calls? My
> main issue is to avoid the slow process overriding more recent but faster
> data chunks. Do you pile them up in a queue in the order they were received
> and process them after that?
>
> On Mon, Dec 2, 2019 at 3:57 PM Jesper Louis Andersen <
> jesper.louis.andersen@REDACTED> wrote:
>
>> Another path is to make the bulk write cooperative within the process.
>> Write in small chunks and return to the gen_server loop in between those
>> chunks being written. You now have progress, but no separate process.
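>>
>> A minimal sketch of that idea, assuming a hypothetical write_chunk/1
>> that persists a single chunk (names here are illustrative only):
>>
>>     handle_cast({bulk_write, Chunks}, State) ->
>>         %% Queue the first step to ourselves.
>>         self() ! {write_next, Chunks},
>>         {noreply, State}.
>>
>>     %% Each step writes one chunk and re-enqueues the rest, so any calls
>>     %% or casts already sitting in the mailbox are served in between.
>>     handle_info({write_next, []}, State) ->
>>         {noreply, State};
>>     handle_info({write_next, [Chunk | Rest]}, State) ->
>>         ok = write_chunk(Chunk),
>>         self() ! {write_next, Rest},
>>         {noreply, State}.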
>>
>> Another useful variant is to have two processes, but having the split
>> skewed. You prepare iodata() in the main process, and then send that to the
>> other process as a message. This message will be fairly small since large
>> binaries will be transferred by reference. The queue in the other process
>> acts as a linearizing write buffer so ordering doesn't get messed up. You
>> have now moved the bulk write call out of the main process, so it is free
>> to do other processing in between. You might even want a protocol between
>> the two processes to exert some kind of flow control on the system.
>> However, you don't have an even balance between the processes. One is the
>> intelligent orchestrator. The other is the worker, taking the block on the
>> bulk operation.
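>>
>> A minimal sketch of that split, assuming hypothetical encode/1 and
>> write_iodata/1 helpers; the writer's mailbox is the ordering buffer:
>>
>>     %% Main gen_server: do the cheap preparation, hand off the write.
>>     handle_call({update, Data}, _From, State = #{writer := Writer}) ->
>>         IoData = encode(Data),   %% big binaries go by reference
>>         Writer ! {write, IoData},
>>         {reply, ok, State}.
>>
>>     %% Writer process: drains its queue in arrival order and takes the
>>     %% block on the slow bulk write, so the main process never does.
>>     writer_loop() ->
>>         receive
>>             {write, IoData} ->
>>                 ok = write_iodata(IoData),
>>                 writer_loop()
>>         end.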
>>
>> Another thing is to improve the observability of the system. Start doing
>> measurements on the lag time of the gen_server and plot this in a
>> histogram. Measure the amount of data written in the bulk message. This
>> gives you some real data to work with. The thing is: if you experience
>> blocking in some part of your system, it is likely there is some kind of
>> traffic/request pattern which triggers it. Understand that pattern. It is
>> often covering for some important behavior among users you didn't think
>> about. Anticipation of future uses of the system allows you to be proactive
>> about latency problems.
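>>
>> One way to get that lag number, sketched with hypothetical
>> histogram_record/2 and handle_request/3 helpers; the caller stamps each
>> request and the server records how long it sat in the queue:
>>
>>     call_with_lag(Server, Request) ->
>>         Stamped = {stamped, erlang:monotonic_time(), Request},
>>         gen_server:call(Server, Stamped).
>>
>>     handle_call({stamped, SentAt, Request}, From, State) ->
>>         Lag = erlang:monotonic_time() - SentAt,
>>         LagUs = erlang:convert_time_unit(Lag, native, microsecond),
>>         histogram_record(gen_server_lag_us, LagUs),
>>         handle_request(Request, From, State).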
>>
>> It is sometimes better to gate the problem by limiting what a
>> user/caller/request is allowed to do. As an example, you can reject large
>> requests to the system and demand they happen cooperatively between a
>> client and a server. This slows down the client because they have to wait
>> for a server response before they can issue the next request. If the
>> internet is in between, you just injected an artificial RTT + server
>> processing in between calls, implicitly slowing the client down.
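>>
>> A minimal sketch of such a gate, assuming a hypothetical do_write/2; the
>> 1 MB limit is only an illustrative number:
>>
>>     -define(MAX_REQUEST_BYTES, 1024 * 1024).
>>
>>     handle_call({bulk_write, Bin}, _From, State)
>>             when byte_size(Bin) > ?MAX_REQUEST_BYTES ->
>>         %% Force the client to split the work into smaller round trips.
>>         {reply, {error, too_large}, State};
>>     handle_call({bulk_write, Bin}, _From, State) ->
>>         {reply, do_write(Bin, State), State}.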
>>
>>
>> On Fri, Nov 29, 2019 at 11:47 PM Roberto Ostinelli <ostinelli@REDACTED>
>> wrote:
>>
>>> All,
>>> I have a gen_server that at periodic intervals becomes busy, sometimes
>>> for over 10 seconds, while writing bulk incoming data. This gen_server
>>> also receives smaller individual data updates.
>>>
>>> I could offload the bulk writing routine to separate processes but the
>>> smaller individual data updates would then be processed before the bulk
>>> processing is over, hence generating an incorrect scenario where smaller
>>> more recent data gets overwritten by the bulk processing.
>>>
>>> I'm trying to see how to solve the fact that all the gen_server calls
>>> during the bulk update would timeout.
>>>
>>> Any ideas of best practices?
>>>
>>> Thank you,
>>> r.
>>>
>>
>>
>> --
>> J.
>>
>

