<div dir="ltr"><div>Hi Peter,<br><br><br></div>Thanks for getting back to me with this. Now I've had chance to look at it, and to experiment with some different options, I can explain a bit more about what I think is going on.<br>
<div class="gmail_extra"><br><div class="gmail_quote">On 6 August 2013 16:47, Peter Andersson <span dir="ltr"><<a href="mailto:peppe@erlang.org" target="_blank">peppe@erlang.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Printouts from CT hook functions are "safe" in the sense that they<br>
execute sequentially pre/post test suite/group/case execution. It's not<br>
possible that a call to ct:log/2 from a hook function "comes in too soon<br>
or late", i.e. gets handled before/after Common Test has<br>
started/finished executing.<br>
<br></blockquote><div><br></div><div><snip><br></div><div> </div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
In general, if one knows that Common Test is running (which is the time<br>
between the CT hook init and terminate call, or the start_logging and<br>
stop_logging event message), it is safe to call ct:log/2 or ct:pal/2<br>
from anywhere and find the data in either the test case logs or in the<br>
unexpected i/o log (depending on the group leader setting).<br>
<br></blockquote><div><br></div><div><snip><br></div><div> </div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
The reason for the exit you reported initially, is that if a log call<br>
happens during startup or shutdown of Common Test, then, during a short<br>
window, it's possible that Common Test fails to communicate with Test<br>
Server and crashes.<br>
<br></blockquote><div><br></div><div>So I've definitely been hitting a race condition in my code here. I tried adding/removing the event hander that routes logging messages to ct:{log,pal}/2 around the {start,stop}_logging events, however that didn't help at all. In my ct_hook module, the call that triggers this explosion occurs inside the init/1 function:<br>
<br></div><div>%% from systest_cth.erl<br></div><div><br>init(systest, Opts) -><br> case application:start(systest, permanent) of<br> {error, {already_started, systest}} -> systest:reset();<br> {error, _Reason}=Err -> Err;<br>
ok -> ok<br> end,<br></div><div> etc ....<br></div><div><br></div><div>%% from systest.erl<br><br></div><div>reset() -><br> %% both these operations are synchronous<br>
systest_watchdog:reset(),<br> systest_results:reset(),<br> ok.<br></div><div><br></div><div>%% systest_watchdog.erl<br><br>reset() -><br> gen_server:call(?MODULE, reset).<br></div><div><br></div><div><br>Further down in systest_watchdog, the handle_call/3 clause that deals with the 'reset' signal ends up calling the systest_log, which leads to a gen_event handler calling ct:pal/2.<br>
</div><div><br></div><div>According to what you've said above, about the timing of logging attempts within the bounds of a ct_hook's init/terminate functions, that seems like it should work shouldn't it? Attempting to trigger the logging by only (de)registering the ct:pal handler in response to the {start,stop}_logging events didn't though. I guess there must be another potential race there - however I haven't attempted to do that via the hook's init/terminate functions - I'll try that now and let you know if it resolves my issues.<br>
</div><div><br></div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Before I try to answer your question below, I need to understand better<br>
what you want to happen to the log printouts that take place during your<br>
configuration/setup phase (before the test run starts) and/or during the<br>
teardown phase (when Common Test has shut down).</blockquote><div><br></div><div>I think/hope that all my setup/teardown is synchronous with regards each test run - that is to say I allow for parallel test cases within suites and I don't actually return from the hook's systest_cth:stop/4 callback until all the resources configured at that scope (be it suite, group, testcase) have been killed and reported (via monitors) as dead.<br>
</div><div><br></div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> If Common Test - in an<br>
offline mode (i.e. not running) - should attempt to write incoming<br>
ct:log/2 strings to a file, the best it can do really, is to<br>
write/append them to say a circular log file in the current working<br>
directory. This is possible, but it will be difficult to know which<br>
printouts belong to which test runs when analyzing the logs, and as far<br>
as I understand, this is the sort of thing you're trying to avoid anyway.<br>
<br></blockquote><div><br></div><div>Indeed. As a temporary work-around, I've stopped logging internal events to the common_test logs and put them in a separate log file instead, but that's not really what I wanted to end up with.<br>
</div><div> </div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
As far as I see, it's quite possible to do something clever with log<br>
printouts that happen *before* Common Test starts. They could be<br>
buffered in a temporary file then read and resent by a CT hook init<br>
function so that this data ends up first in the unexpected i/o log for<br>
the test run. The problem here is what to do with printouts that happen<br>
*after* Common Test has stopped but before your teardown is finished.<br>
Another possibility could maybe be, if possible, to change the order of<br>
the whole session so that Common Test is always started before your<br>
configuration/setup (CT hooks can for example be added dynamically) and<br>
not stopped until teardown is also finished. Perhaps an init and<br>
terminate function in a high prio CT hook module could be used to<br>
synchronize this. Sounds feasible to me.<br>
<br>
Let me know if I understand your problem correctly and tell me what<br>
ideas/requests you have and let's move on from there.<br>
<div class="im"><br>
</div></blockquote><div><br></div><div>I'm going to try again with a high prio hook to turn on/off the logging handler and see if this works. By all accounts it sounds as if it should. The other thing I might do to alleviate this problem is swallow any exception from ct:pal/2, which is *cough* bad form as a rule, but in this case might actually be the right thing to do.<br>
<br></div><div>Let me experiment with those two options first and get back to you. It might be that all I end up asking for is a bit more info in the documentation explaining the constraints.<br></div><div> <br></div><div>
I'll try to post back later today.<br><br></div><div>Cheers,<br>Tim<br></div></div></div></div>