[erlang-questions] Mysterious gen_server timeouts in MIX

John R. Ashmun john.ashmun@REDACTED
Sun Nov 3 16:34:29 CET 2013


Thanks, Bob.

I blush to write that I had considered this approach, but then I decided to
try to learn why a gen_server timeout could ever happen to the emulator
before making any changes.
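
For reference, the change Bob suggests amounts to passing a third argument to gen_server:call. The following is only a minimal sketch: the registered name io_controller and the two request tuples are taken from the error reports quoted below, while the module name io_controller_api and the 30-second value are assumptions made for illustration.

```erlang
-module(io_controller_api).
-export([wait_until_not_busy/1, set_ready/1]).

%% gen_server:call/2 gives up after a default timeout of 5000 ms.
%% gen_server:call/3 takes an explicit timeout, either a number of
%% milliseconds or the atom infinity.

%% Wait for as long as it takes (hypothetical wrapper).
wait_until_not_busy(Device) ->
    gen_server:call(io_controller, {wait_until_not_busy, Device}, infinity).

%% Allow up to 30 seconds instead of the default 5 (value is an assumption).
set_ready(Device) ->
    gen_server:call(io_controller, {set_ready, Device}, 30000).
```

A longer timeout only hides the symptom if the server can never answer, which is why it still seems worth learning the cause first.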

This is an emulation of a really simple CPU:  It fetches an instruction,
increments its program counter, carries out the instruction, and repeats.
There are not a lot of moving parts, and everything except I/O instructions
is synchronous.
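
The cycle described above can be sketched as follows. This is not the emulator's code: the module name cpu_loop_sketch, the map-based memory, and the three toy instructions ({add, N}, {jmp, Addr}, halt) are all assumptions chosen to keep the example short, and they are not MIX instructions.

```erlang
-module(cpu_loop_sketch).
-export([run/2]).

%% Memory is a map from address to instruction; PC is the program counter.
%% The accumulator starts at 0 and is returned when halt is executed.
run(Memory, PC) ->
    step(Memory, PC, 0).

step(Memory, PC, Acc) ->
    Instruction = maps:get(PC, Memory),          % fetch
    NextPC = PC + 1,                             % increment program counter
    case execute(Instruction, NextPC, Acc) of    % carry out the instruction
        {halt, FinalAcc} ->
            FinalAcc;
        {continue, NewPC, NewAcc} ->
            step(Memory, NewPC, NewAcc)          % repeat
    end.

%% Hypothetical instruction set, for illustration only.
execute(halt, _PC, Acc) -> {halt, Acc};
execute({add, N}, PC, Acc) -> {continue, PC, Acc + N};
execute({jmp, Addr}, _PC, Acc) -> {continue, Addr, Acc}.
```

Because every step is an ordinary function call, nothing in a loop of this shape can time out by itself; only the I/O instructions involve messages to other processes.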

Regards,
John Ashmun



On Fri, Nov 1, 2013 at 10:08 PM, Bob Ippolito <bob@REDACTED> wrote:

> gen_server has a default timeout of 5 seconds, so if your call takes long
> enough for whatever reason this can happen. Try changing all of your
> gen_server calls to explicitly specify a longer or infinite timeout.
>
> On Friday, November 1, 2013, John R. Ashmun wrote:
>
>> I have implemented an emulator for MIX, Donald Knuth's notional 1960s
>> computer, as an Erlang application, as a way of learning to use Erlang.
>>
>> As part of the emulator's startup currently, MIX loads Knuth's Program P
>> into its memory, then jumps to the starting location of Program P.  Program
>> P calculates the first 500 prime numbers, pops them into a table, and then
>> prints 10 prime numbers at a time on what it considers to be its line
>> printer.
>>
>> The emulator usually runs fine, and when it does, it carries out Program
>> P perfectly.
>>
>> Sometimes, however, things get stuck, and I have been unable to learn
>> what goes wrong.  When there is a problem, I usually see these two error
>> reports:
>>
>> =ERROR REPORT==== 1-Nov-2013::11:33:54===
>> ** Generic server <0.38.0> terminating
>> ** Last message in was timeout
>> ** When Server state == []
>> ** Reason for termination ==
>> ** {timeout,{gen_server,call,[io_controller,{wait_until_not_busy,18}]}}
>>
>> =ERROR REPORT==== 1-Nov-2013::11:33:54===
>> ** Generic server <0.4095.0> terminating
>> ** Last message in was {'$gen_cast',{write,1995}}
>> ** When Server state == {state,18,<0.4058.0>,24}
>> ** Reason for termination ==
>> ** {timeout,{gen_server,call,[io_controller,{set_ready,18}]}}
>>
>> The first one seems clearly to tell me that MIX's io_controller
>> gen_server's API wait_until_not_busy( ) function sent the message that it
>> should have, but the handle_call( ) never found the readiness status for
>> MIX device 18, its line printer, not to be busy, and then the gen_server
>> was timed out.  This should never happen, but then there is the second
>> report to consider.
>>
>> The second report confuses me.  Program P attempts to write a title
>> (FIRST FIVE HUNDRED PRIMES) to MIX's line printer.  This is what the
>> $gen_cast is attempting to do (1995 is the MIX address of the first
>> character of the title).  A MIX I/O operation (in this case, the write)
>> begins by waiting until the I/O unit that's addressed is not busy.
>> Apparently the unit never became not busy within the timeout period on this
>> run.  Here is the mystery:  the emulator's I/O device gen_servers are all
>> initialized with their readiness state set to ready, and their readiness is
>> changed to busy only by a request to perform an I/O operation.  The
>> operation is actually carried out by an io_operation gen_server that is
>> started by the I/O device gen_server.  That io_operation gen_server is the
>> entity that asks the io_controller to set the readiness of the device back
>> to ready once it has performed the I/O (in this case, output to the file
>> that plays the role of the line printer).  It shouldn't be possible for
>> io_controller:set_ready( Device ) to be called before
>> io_controller:write( Device, Address ).  Is this what the error report is
>> telling me happened?
>>
>> All right, you say, you've programmed your emulator incompetently, and I
>> suppose I may have.  Why, then, does all this work perfectly, over and over
>> again, on another machine, or even on the same machine, before going into
>> failure mode?  I have been unable to identify anything that causes things
>> to start failing or to start working.  I'm not making changes other than
>> sometimes running using application:start( 'MIX' ) and sometimes booting
>> MIX as an Erlang release.  Sometimes compiling with +debug_info (or
>> without it) has seemed to change success to failure (or, equally likely,
>> the opposite).
>>
>> I need advice, please.
>>
>> Regards,
>> John Ashmun
>>
>
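
Both reports above name calls into io_controller that timed out in the same second. One pattern that can produce such a pair, offered here only as an assumption since the emulator's source is not shown, is a handle_call clause that blocks while waiting for a device to become ready: the server then cannot process the set_ready call that would end the wait. A minimal sketch of the usual alternative, deferring the reply with gen_server:reply/2, follows. The module name io_controller_sketch, the set_busy/1 function, and the state layout are hypothetical.

```erlang
-module(io_controller_sketch).
-behaviour(gen_server).
-export([start_link/0, set_busy/1, set_ready/1, wait_until_not_busy/1]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

set_busy(Device) -> gen_server:call(?MODULE, {set_busy, Device}).
set_ready(Device) -> gen_server:call(?MODULE, {set_ready, Device}).
wait_until_not_busy(Device) ->
    gen_server:call(?MODULE, {wait_until_not_busy, Device}, infinity).

%% State maps each busy device to the list of callers waiting on it.
%% A device that is absent from the map is ready.
init([]) -> {ok, #{}}.

handle_call({set_busy, Device}, _From, State) ->
    Waiters = maps:get(Device, State, []),
    {reply, ok, State#{Device => Waiters}};
handle_call({wait_until_not_busy, Device}, From, State) ->
    case maps:find(Device, State) of
        error ->
            {reply, ok, State};                   % device is already ready
        {ok, Waiters} ->
            %% Do not block here: remember the caller and return, so the
            %% server stays free to handle the later set_ready request.
            {noreply, State#{Device => [From | Waiters]}}
    end;
handle_call({set_ready, Device}, _From, State) ->
    Waiters = maps:get(Device, State, []),
    %% Release every caller that was parked on this device.
    lists:foreach(fun(W) -> gen_server:reply(W, ok) end, Waiters),
    {reply, ok, maps:remove(Device, State)}.

handle_cast(_Msg, State) -> {noreply, State}.
```

With this shape, a caller of wait_until_not_busy/1 stays suspended in its own process while the server itself keeps handling messages, so a set_ready request is never queued behind the wait it is meant to end.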