<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></head><body style='font-size: 10pt; font-family: Verdana,Geneva,sans-serif'>
<p>Hi all,</p>
<p>My idea was to be able to monitor the execution, but I will have to explore the gen_server + synchronous call approach in the future.</p>
<p>I was able to fix the bug by following Maria's suggestion (thank you Maria!).</p>
<p>The failing processes were dying due to a Redis timeout, probably because I used a Redis MULTI/EXEC transaction, which can lead to race conditions on the Redis side.</p>
<p>I implemented a small database to track failing processes and respawn them. The idea is to track only the timeout errors, so I changed the server to match the "good" and "timeout" DOWN cases, like this:</p>
<p>...</p>
<p>{'DOWN', Reference, process, _Pid, normal} -><br /> indexerDaemon(RunningWorker-1,FilesProcessed+1, maps:remove(Reference,MonitorRefMap) );</p>
<p><br />{'DOWN', Reference, process, Pid, {timeout, Detail}} -><br /> %% MMMmm we must assume still files to be processed?<br /> #{ Reference := FailedFile } = MonitorRefMap,<br /> io:format("!! Timeout Error on ~p ~n Detail: ~p~n", [FailedFile, {'DOWN', Reference, process, Pid, {timeout, Detail}}]), <br /> % We suppose a timeout error and we push back <br /> % Remove old Reference<br /> UpdatedRefMap=maps:remove(Reference,MonitorRefMap),<br /> NewPid=spawn(er_zauker_util, load_file_if_needed,[FailedFile]), <br /> MonitorRef = erlang:monitor(process,NewPid),<br /> NewRecoveryRefMap=UpdatedRefMap#{ MonitorRef => FailedFile },<br /> indexerDaemon(RunningWorker,FilesProcessed,NewRecoveryRefMap);</p>
<p>I do not know whether there is a smarter way of doing it.</p>
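<p>One possible variation, as a rough sketch only: spawn_monitor/3 spawns and monitors the worker in a single step, and an attempt counter stored next to the file name keeps a file that always times out from being respawned forever. The module name retry_sketch, the functions run/2, start/3, loop/4 and flaky/1, and the MaxRetries limit are all made up for illustration and are not part of er_zauker; flaky/1 only simulates a worker.</p>
<pre style="margin: 0; padding: 0; font-family: monospace">
-module(retry_sketch).
-export([run/2, flaky/1]).

%% Hypothetical driver: start one worker per file and retry workers
%% that exit with {timeout, _} at most MaxRetries times each.
%% Returns {FilesDone, FilesGivenUp}.
run(Files, MaxRetries) ->
    Refs = lists:foldl(fun(File, Acc) -> start(File, 0, Acc) end, #{}, Files),
    loop(Refs, MaxRetries, 0, []).

%% spawn_monitor/3 spawns the worker and monitors it atomically.
start(File, Attempt, Refs) ->
    {_Pid, Ref} = spawn_monitor(?MODULE, flaky, [{File, Attempt}]),
    Refs#{Ref => {File, Attempt}}.

%% No monitor references left: every worker finished or was given up on.
loop(Refs, _Max, Done, Failed) when map_size(Refs) =:= 0 ->
    {Done, lists:sort(Failed)};
loop(Refs, Max, Done, Failed) ->
    receive
        {'DOWN', Ref, process, _Pid, Reason} when is_map_key(Ref, Refs) ->
            {{File, Attempt}, Rest} = maps:take(Ref, Refs),
            case Reason of
                normal ->
                    loop(Rest, Max, Done + 1, Failed);
                {timeout, _} when Max > Attempt ->
                    %% Timeout with retries left: respawn the same file.
                    loop(start(File, Attempt + 1, Rest), Max, Done, Failed);
                _ ->
                    %% Retries exhausted or a different error: give up.
                    loop(Rest, Max, Done, [File | Failed])
            end
    end.

%% Stand-in worker (assumption): "slow" times out on its first attempt,
%% "broken" always times out, every other file succeeds.
flaky({"slow", 0}) -> exit({timeout, simulated});
flaky({"broken", _}) -> exit({timeout, simulated});
flaky({_File, _Attempt}) -> ok.
</pre>
<p>With this sketch, retry_sketch:run(["a", "slow", "broken"], 2) returns {2, ["broken"]}: "slow" succeeds on its second attempt and "broken" is dropped after its retries are used up.</p>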
<p>Thank you for your hints!!</p>
<p>...</p>
<p id="reply-intro">On 2021-04-13 17:29, dieter@schoen.or.at wrote:</p>
<blockquote type="cite" style="padding: 0 0.4em; border-left: #1010ff 2px solid; margin: 0">
<div id="replybody1">
<div>
<div style="font-family: Tahoma; font-size: 16px; direction: ltr;">
<div style="font-family: Tahoma; font-size: 16px;"> </div>
Hi Giovanni,
<div> </div>
<div>I had a quick look at the code, and I think that the messages do not match.</div>
<div> </div>
<div>You wait for {worker, 0} in the waitAllWorkerDone function, but I do not think that this message</div>
<div>is generated anywhere.</div>
<div>In that function you send a {self(), report} to the daemon, to which it will respond with </div>
<div>
<div style="line-height: 19px;">
<div>{worker,RunningWorker,files_processed,FilesProcessed}.</div>
<div>RunningWorker can eventually become 0 when all workers exit, so the tuple then starts with</div>
<div>{worker, 0</div>
<div>but the two remaining elements prevent it from matching the expected {worker, 0} pattern.</div>
<div>Could this be the issue?</div>
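<div> </div>
<div>To illustrate the point about tuple size, here is a minimal sketch. The module match_sketch and the function classify/1 are hypothetical and only demonstrate the matching rule; they are not taken from er_zauker.</div>
<pre style="margin: 0; padding: 0; font-family: monospace">
-module(match_sketch).
-export([classify/1]).

%% A 2-tuple pattern only ever matches 2-tuples.
classify({worker, 0}) -> done;
%% The report is a 4-tuple, so it needs its own clause,
%% even when the worker count is 0.
classify({worker, 0, files_processed, _Total}) -> done_with_count;
classify({worker, _N, files_processed, _Total}) -> still_running;
classify(_Other) -> unknown.
</pre>
<div>Here match_sketch:classify({worker, 0, files_processed, 42}) returns done_with_count, not done, because the first clause is skipped for any tuple that does not have exactly two elements.</div>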
<div> </div>
<div>Anyway, thanks for sharing your code, it is always interesting to see how somebody else is tackling a problem!</div>
<div> </div>
<div>From a higher-level view, are you really interested in intermediate results?</div>
<div>If I wanted only the end result, I think I would use OTP with a gen_server and a synchronous call.</div>
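<div> </div>
<div>A very rough sketch of what that could look like, under stated assumptions: the module sync_indexer_sketch, the function index/2 and the placeholder process_file/1 are invented for illustration, and the files are processed one after another inside the server just to keep the example short.</div>
<pre style="margin: 0; padding: 0; font-family: monospace">
-module(sync_indexer_sketch).
-behaviour(gen_server).
-export([start_link/0, index/2]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Synchronous call: the caller blocks until the server replies,
%% so no polling loop is needed on the client side.
index(Server, Files) ->
    gen_server:call(Server, {index, Files}, infinity).

init([]) ->
    {ok, no_state}.

handle_call({index, Files}, _From, State) ->
    Results = lists:map(fun process_file/1, Files),
    {reply, {ok, length(Results)}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Placeholder (assumption): the real per-file indexing would go here.
process_file(_File) ->
    ok.
</pre>
<div>After {ok, Pid} = sync_indexer_sketch:start_link(), the call sync_indexer_sketch:index(Pid, ["a", "b", "c"]) returns {ok, 3} only once all three files have been handled.</div>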
<div> </div>
<div>kind regards,</div>
<div>dieter</div>
<div> </div>
</div>
<br />
<div> </div>
<br /><br />
<div>On Mon, Apr. 12, 2021 18:36, Giovanni Giorgi &lt;jj@gioorgi.com&gt; wrote:</div>
<blockquote type="cite" style="padding: 0 0.4em; border-left: #1010ff 2px solid; margin: 0">
<div>
<div style="font-size: 10pt; font-family: Verdana,Geneva,sans-serif;">Hi all,<br /> a newbie question here.<br />I have written a small Erlang server following the <span style="font-size: 10pt;">application behaviour, here:</span><br /><span style="font-size: 10pt;"><a style="cursor: inherit;" href="https://github.com/daitangio/er_zauker/blob/erlang-24-migration/src/er_zauker_app.erl" target="_blank" rel="noopener noreferrer">https://github.com/daitangio/er_zauker/blob/erlang-24-migration/src/er_zauker_app.erl</a></span><br /><span style="font-size: 10pt;">To make a long story short, my server scans a set of directories and indexes the files, using Redis as the </span>backend<span style="font-size: 10pt;"> database.</span><br /><span style="font-size: 10pt;">It works well when I run it on a small set of files.</span><br /><span style="font-size: 10pt;">But when I run it on a very large set of files, it seems to "finish" before indexing all the files. When it starts, the client waits until every file is processed, and the server can send it a report about the status:</span><br />
<div>
<div><span>er_zauker_indexer!{self(),directory,".</span><span>"},</span></div>
<div><span>er_zauker_app:wait_worker_done().</span></div>
<div> </div>
<div><span>The relevant part seems correct (see below), but </span><span style="font-size: 10pt;">I think I have made a stupid mistake and I cannot work out where it is.</span></div>
<div><span style="font-size: 10pt;">Where can I find an example for this use case?</span></div>
<div> </div>
<div> </div>
<div>
<div><span>wait_worker_done</span><span>()</span><span>-></span></div>
<div><span>    </span><span>waitAllWorkerDone</span><span>(</span><span>1</span><span>,</span><span>erlang</span><span>:</span><span>monotonic_time</span><span>(</span><span>second</span><span>)).</span></div>
<br />
<div><span>waitAllWorkerDone</span><span>(RunningWorker,StartTimestamp) </span><span>when</span><span> RunningWorker </span><span>></span><span>0</span><span> </span><span>-></span></div>
<div><span>    </span><span>er_zauker_indexer</span><span>!</span><span>{</span><span>self</span><span>(),</span><span>report</span><span>},</span></div>
<div><span>    </span><span>receive</span><span> </span></div>
<div><span>    {</span><span>worker</span><span>,</span><span>0</span><span>} -></span></div>
<div><span>        </span><span>io</span><span>:</span><span>format</span><span>(</span><span>"All workers done</span><span>~n~n</span><span>"</span><span>);</span></div>
<div><span>    {</span><span>worker</span><span>, RunningGuys, </span><span>files_processed</span><span>, TotalFilesDone} -></span></div>
<div><span>        </span><span>if</span><span> </span></div>
<div><span>        RunningGuys  </span><span>/=</span><span> RunningWorker -> </span><span>            </span></div>
<div><span>            SecondsRunning</span><span>=</span><span> </span><span>erlang</span><span>:</span><span>monotonic_time</span><span>(</span><span>second</span><span>)</span><span>-</span><span>StartTimestamp,</span></div>
<div><span>            FilesSec</span><span>=</span><span>TotalFilesDone</span><span>/</span><span>SecondsRunning,</span></div>
<div><span>            </span><span>io</span><span>:</span><span>format</span><span>(</span><span>"[</span><span>~p</span><span>]s Workers[</span><span>~p</span><span>]  Files processed:</span><span>~p</span><span> Files/sec: </span><span>~p</span><span> </span><span>~n</span><span>"</span><span>,[SecondsRunning,RunningGuys,TotalFilesDone,FilesSec]),</span></div>
<div><span>            </span><span>timer</span><span>:</span><span>sleep</span><span>(</span><span>200</span><span>);</span></div>
<div><span>           </span><span>true</span><span> -> </span></div>
<div><span>            </span><span>%% Okey so nothing changed so far...sleep a bit</span></div>
<div><span>            </span><span>timer</span><span>:</span><span>sleep</span><span>(</span><span>100</span><span>)</span></div>
<div><span>        </span><span>end</span><span>,</span></div>
<div><span>        </span><span>%% Master sleep value</span></div>
<div><span>        </span><span>timer</span><span>:</span><span>sleep</span><span>(</span><span>990</span><span>),</span></div>
<div><span>        </span><span>waitAllWorkerDone</span><span>(RunningGuys,StartTimestamp)</span></div>
<div><span>    </span><span>after</span><span> </span><span>5000</span><span> -></span></div>
<div><span>        </span><span>io</span><span>:</span><span>format</span><span>(</span><span>"</span><span>~n</span><span>-----------------------------</span><span>~n</span><span>"</span><span>),</span></div>
<div><span>        </span><span>io</span><span>:</span><span>format</span><span>(</span><span>" Mmmm no info in the last 5 sec... when was running:</span><span>~p</span><span> Workers</span><span>~n</span><span>"</span><span>,[RunningWorker]),</span></div>
<div><span>        </span><span>io</span><span>:</span><span>format</span><span>(</span><span>" ?System is stuck? "</span><span>),</span></div>
<div><span>        </span><span>io</span><span>:</span><span>format</span><span>(</span><span>"------------------------------</span><span>~n</span><span>"</span><span>),</span></div>
<div><span>        </span><span>waitAllWorkerDone</span><span>(RunningWorker,StartTimestamp)</span></div>
<div><span>    </span><span>end</span><span>;</span></div>
<div><span>waitAllWorkerDone</span><span>(</span><span>0</span><span>,</span><span>_</span><span>) </span><span>-></span></div>
<div><span>    </span><span>io</span><span>:</span><span>format</span><span>(</span><span>"All worker Finished"</span><span>).</span></div>
</div>
<div> </div>
<div> </div>
</div>
<br /><br /><br /><br />
<div>-- <br />
<div style="margin: 0; padding: 0; font-family: monospace;">Giovanni Giorgi via webmail</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
<p><br /></p>
<div id="signature">-- <br />
<div class="pre" style="margin: 0; padding: 0; font-family: monospace">Giovanni Giorgi via webmail</div>
</div>
</body></html>