Fault-tolerance and distributed system
Scott Lystig Fritchie
Sat Nov 30 19:57:23 CET 2002
>>>>> "hs" == Hal Snyder <> writes:
hs> Scott's example shows application but not supervision.
Wow, I forgot that I wrote that. Nowadays, I simply use old .rel and
.app files from earlier projects as templates for new projects.
Mailing list archives are a wonderful thing.
I've put a dumb, brute-force, but functional supervisor example at
extracting the source:
1. Change directory to the "src" subdirectory.
2. Run GNU make (or compatible): "make" or "gmake" or however it's
installed on your system.
3. Run "erl -pz ../ebin -boot foo" to start the "foo" application.
4. At the Erlang prompt, execute "appmon:start()."
The "foo" application has a single supervisor that monitors two
worker processes, both of which are generic servers that implement
simple integer counters. Each worker also spawn_links three
processes in order to make the appmon process tree look more
Note that there are a bunch of io:format() debugging messages that
(hopefully) demonstrate how arguments are passed from start
functions to init functions.
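
For readers without the source at hand, here is a minimal sketch of what such a supervisor could look like. It is not the code from the example itself: the module name foo_sup_sketch, the worker function increment:start_link/2, the initial counter values, and the restart limits are all assumptions made for illustration. The one_for_all strategy is chosen only because it matches the "kill all children and then restart them" behaviour described in this message.

```erlang
%% Sketch only: NOT the code shipped with the example.
-module(foo_sup_sketch).
-behaviour(supervisor).

-export([start_link/1, init/1]).

%% StartArgs is handed unchanged to init/1 by supervisor:start_link/3.
start_link(StartArgs) ->
    io:format("~p:start_link(~p)~n", [?MODULE, StartArgs]),
    supervisor:start_link({local, ?MODULE}, ?MODULE, StartArgs).

%% Called in the new supervisor process with the same StartArgs.
init(StartArgs) ->
    io:format("~p:init(~p)~n", [?MODULE, StartArgs]),
    %% one_for_all: if any child dies, terminate the others and
    %% restart them all. The supervisor itself gives up if more than
    %% 5 restarts occur within 10 seconds (assumed limits).
    RestartSpec = {one_for_all, 5, 10},
    Children = [child(counter1, 0), child(counter2, 0)],
    {ok, {RestartSpec, Children}}.

%% Child spec for one counter. The worker module 'increment' and its
%% start_link(Name, InitialValue) function are assumptions.
child(Name, InitialValue) ->
    {Name,
     {increment, start_link, [Name, InitialValue]},
     permanent,      % always restart this child
     2000,           % give it 2000 ms to shut down cleanly
     worker,
     [increment]}.
```

Calling foo_sup_sketch:start_link(some_args) prints the argument twice, once from start_link/1 in the caller and once from init/1 in the new supervisor process, which is the same kind of trace the io:format() messages in the real example are meant to give.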
5. Use commands like "increment:get1(counter1)." and
"increment:get_many(counter2, 100000)." to communicate with the two
6. Run "appmon:start()." to start the application monitor
application. A GUI window will pop up.
Click on the "foo" button. To demonstrate that the supervisor
is working correctly, click the "Kill" button and then the
"counter1" or "counter2" boxes. If either dies, the supervisor
will kill all children and then restart them. Note that the PIDs
of the counters' "children" change and that the SASL application
will spit out some messages to the console.
Killing one of the unnamed "child" processes will kill the "parent"
counter process because the counter process is not trapping process
exits ... which will then cause the foo_example_sup supervisor to
One of the three "child" processes will exit after 15 seconds, but
because it exits with a 'normal' status, its parent is not killed.
Killing foo_example_sup will result in shutting down the
application because there is nobody to restart it.
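
The link behaviour described above can also be checked without appmon. The following is a stand-alone sketch, not part of the example: the module name exit_demo and its functions are made up for illustration. It spawns a "parent" that does not trap exits, links it to one child, lets the child exit with a given reason, and then reports whether the parent is still alive.

```erlang
%% Sketch only: a stand-alone demo, not code from the example.
-module(exit_demo).
-export([run/0]).

%% Spawn a parent that links to one child. The child exits at once
%% with Reason. Return true if the parent is still alive afterwards.
parent_survives(Reason) ->
    Parent = spawn(fun() ->
                       spawn_link(fun() -> exit(Reason) end),
                       receive stop -> ok end
                   end),
    timer:sleep(100),              % let the exit signal propagate
    Alive = is_process_alive(Parent),
    Parent ! stop,                 % harmless if Parent is already dead
    Alive.

run() ->
    [{normal,  parent_survives(normal)},
     {crashed, parent_survives(crashed)}].
```

Running exit_demo:run(). should return [{normal,true},{crashed,false}]: an exit signal with reason 'normal' is ignored by a linked process that is not trapping exits, while any other reason terminates it.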
Now I can wait for Lennart or another OTP guru to criticise my example!