This tutorial attempts to show by example how to build a proper OTP-based system.
A quick way to get started might be to copy this example system, and modify the configuration files. I will try to explain the purpose of each file and suggest how they may be modified.
The example system (called "example") runs on two processors (note that the two processors can easily be two Erlang nodes on the same physical machine), and contains two applications: base and dist.
Note: The code is riddled with comments. If you view it with e.g. Emacs using fontification, it may be easier to read.
With this file structure in place, you are ready to support in-service upgrade, but that will be described in a future tutorial.
...Nothing to it, really. I didn't bother with make scripts and stuff like that. Go into each src/ directory and type:
erlc -W -o ../ebin *.erl
The easiest way to build the boot script is to place yourself in the $DIR/releases/1.0 directory, start an Erlang shell, and type the following:
Eshell V5.2 (abort with ^G)
1> <... output snipped>
=PROGRESS REPORT==== 4-Dec-2002::16:55:34 ===
         application: sasl
          started_at: nonode@nohost
1> Dir = "/home/etxuwig/work/erlang/release_tutorial".
"/home/etxuwig/work/erlang/release_tutorial"
2> Path = [Dir ++ "/lib/*/ebin"].
["/home/etxuwig/work/erlang/release_tutorial/lib/*/ebin"]
3> Var = {"MYAPPS", Dir}.
{"MYAPPS","/home/etxuwig/work/erlang/release_tutorial"}
4> systools:make_script("example",[{path,Path},{variables,[Var]}]).
ok
Now, you should be able to see an example.script file in releases/1.0/. It contains instructions for the Erlang/OTP boot loader. The .script file is converted into an Erlang binary which is stored in example.boot in the same directory.
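For reference, systools:make_script/2 reads its input from a release resource file, here example.rel, in the current directory. The actual file is not reproduced in this text, so the following is only a sketch of what it might look like. The erts version is taken from the shell banner above; every application version below is a placeholder and must be replaced with the version in the corresponding .app file, since systools checks that they match.

```erlang
%% example.rel (sketch; all application versions are placeholders)
{release,
 {"example", "1.0"},        % release name and version (cf. releases/1.0)
 {erts, "5.2"},             % emulator version, as printed by the shell banner
 [{kernel, "2.8.0"},        % placeholder: use the version from kernel.app
  {stdlib, "1.11.0"},       % placeholder: use the version from stdlib.app
  {sasl,   "1.9.4"},        % placeholder: use the version from sasl.app
  {base,   "1.0"},          % assumed version of the base application
  {dist,   "1.0"}]}.        % assumed version of the dist application
```

If a version listed here differs from the one found via the path option, make_script/2 reports an error instead of returning ok.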
Using systools:make_tar("example", Options) (where Options is the same list of options as for make_script/2), you can pack your release into a tar file and unpack it on a target system. The -boot_var option makes the code relocatable. See erl -man systools for more detailed instructions.
There are tricks for starting an embedded system and being able to attach a shell to a node, but that's another tutorial.
I will show how one could easily get something up and running on a Unix workstation. Windows users will have to translate.
Go to the $DIR/releases/1.0 directory and start the two nodes, each in its own terminal:
erl -boot ./example -config ./sys -boot_var MYAPPS $DIR -sname n1
erl -boot ./example -config ./sys -boot_var MYAPPS $DIR -sname n2
It doesn't really matter if you start both nodes at once, or one at a time. In the sys.config file, a node synchronization timeout of 10 seconds was specified. After that, the first node will continue alone if the other node has not yet appeared.
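The sys.config file itself is not reproduced in this text. As a rough sketch, the behaviour described here could be obtained with kernel parameters along the following lines. The node names are taken from the shell prompts in this tutorial; the exact choice of parameters is an assumption about what the real file contains.

```erlang
%% sys.config (sketch under the assumptions stated above)
[{kernel,
  [%% dist is a distributed application; nodes are listed in priority
   %% order, so it prefers n2 and falls back to n1.
   {distributed, [{dist, ['n2@cbe1066', 'n1@cbe1066']}]},
   %% Wait for the other node at startup, but carry on without it ...
   {sync_nodes_optional, ['n1@cbe1066', 'n2@cbe1066']},
   %% ... once 10 seconds (given in milliseconds) have passed.
   {sync_nodes_timeout, 10000}]}].
```

Using sync_nodes_optional rather than sync_nodes_mandatory is what allows a node to continue alone; with a mandatory node missing, the starting node would terminate when the timeout expires.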
Starting one node well ahead of the other is of course an interesting thing to try. If you start n1 first, you may see the following output:
[etxuwig@cbe1066]: erl -boot ./example -config ./sys -boot_var MYAPPS $DIR -sname n1
Erlang (BEAM) emulator version 5.2 [hipe] [threads:0]

Eshell V5.2 (abort with ^G)
(n1@cbe1066)1>
=PROGRESS REPORT==== 5-Dec-2002::16:40:27 ===
          supervisor: {local,sasl_safe_sup}
             started: [{pid,<0.45.0>},
                       {name,alarm_handler},
                       {mfa,{alarm_handler,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,2000},
                       {child_type,worker}]

=PROGRESS REPORT==== 5-Dec-2002::16:40:27 ===
          supervisor: {local,sasl_safe_sup}
             started: [{pid,<0.46.0>},
                       {name,overload},
                       {mfa,{overload,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,2000},
                       {child_type,worker}]

=PROGRESS REPORT==== 5-Dec-2002::16:40:27 ===
          supervisor: {local,sasl_sup}
             started: [{pid,<0.44.0>},
                       {name,sasl_safe_sup},
                       {mfa,{supervisor,
                             start_link,
                             [{local,sasl_safe_sup},sasl,safe]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 5-Dec-2002::16:40:27 ===
          supervisor: {local,sasl_sup}
             started: [{pid,<0.47.0>},
                       {name,release_handler},
                       {mfa,{release_handler,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,2000},
                       {child_type,worker}]

=PROGRESS REPORT==== 5-Dec-2002::16:40:27 ===
         application: sasl
          started_at: n1@cbe1066
base_server starting.

=PROGRESS REPORT==== 5-Dec-2002::16:40:27 ===
          supervisor: {local,base_super}
             started: [{pid,<0.53.0>},
                       {name,server},
                       {mfa,{base_server,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,10000},
                       {child_type,worker}]

=PROGRESS REPORT==== 5-Dec-2002::16:40:27 ===
         application: base
          started_at: n1@cbe1066
dist_app:start(normal, _)
dist_server starting.

=PROGRESS REPORT==== 5-Dec-2002::16:40:27 ===
          supervisor: {local,dist_super}
             started: [{pid,<0.58.0>},
                       {name,server},
                       {mfa,{dist_server,
                             start_link,
                             [#Fun, #Fun ]}},
                       {restart_type,permanent},
                       {shutdown,10000},
                       {child_type,worker}]
dist_app:start_phase(takeover, _)
dist_app:start_phase(go, _)
handle_call({go, normal},...)

=PROGRESS REPORT==== 5-Dec-2002::16:40:27 ===
         application: dist
          started_at: n1@cbe1066
(n1@cbe1066)1>
(n1@cbe1066)1> global:whereis_name(dist_server).
<0.58.0>
(n1@cbe1066)2> dist_server:get_value().
undefined
(n1@cbe1066)3> dist_server:set_value(17).
{ok,undefined}
We can see that the globally registered dist_server is running locally, and we can call the API functions get_value/0 and set_value/1.
If we now start n2, dist_server should migrate over to that node (since it is so specified in the sys.config file).
[etxuwig@cbe1066]: erl -boot ./example -config ./sys -boot_var MYAPPS $DIR -sname n2
Erlang (BEAM) emulator version 5.2 [hipe] [threads:0]

Eshell V5.2 (abort with ^G)
(n2@cbe1066)1>
=PROGRESS REPORT==== 5-Dec-2002::17:27:15 ===
          supervisor: {local,sasl_safe_sup}
             started: [{pid,<0.46.0>},
                       {name,alarm_handler},
                       {mfa,{alarm_handler,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,2000},
                       {child_type,worker}]

=PROGRESS REPORT==== 5-Dec-2002::17:27:15 ===
          supervisor: {local,sasl_safe_sup}
             started: [{pid,<0.47.0>},
                       {name,overload},
                       {mfa,{overload,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,2000},
                       {child_type,worker}]

=PROGRESS REPORT==== 5-Dec-2002::17:27:15 ===
          supervisor: {local,sasl_sup}
             started: [{pid,<0.45.0>},
                       {name,sasl_safe_sup},
                       {mfa,{supervisor,
                             start_link,
                             [{local,sasl_safe_sup},sasl,safe]}},
                       {restart_type,permanent},
                       {shutdown,infinity},
                       {child_type,supervisor}]

=PROGRESS REPORT==== 5-Dec-2002::17:27:15 ===
          supervisor: {local,sasl_sup}
             started: [{pid,<0.48.0>},
                       {name,release_handler},
                       {mfa,{release_handler,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,2000},
                       {child_type,worker}]

=PROGRESS REPORT==== 5-Dec-2002::17:27:15 ===
         application: sasl
          started_at: n2@cbe1066
base_server starting.

=PROGRESS REPORT==== 5-Dec-2002::17:27:15 ===
          supervisor: {local,base_super}
             started: [{pid,<0.54.0>},
                       {name,server},
                       {mfa,{base_server,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,10000},
                       {child_type,worker}]

=PROGRESS REPORT==== 5-Dec-2002::17:27:15 ===
         application: base
          started_at: n2@cbe1066
dist_app:start({takeover,n1@cbe1066}, _)
dist_server starting.

=PROGRESS REPORT==== 5-Dec-2002::17:27:15 ===
          supervisor: {local,dist_super}
             started: [{pid,<0.59.0>},
                       {name,server},
                       {mfa,{dist_server,
                             start_link,
                             [#Fun, #Fun ]}},
                       {restart_type,permanent},
                       {shutdown,10000},
                       {child_type,worker}]
dist_app:start_phase(takeover, {takeover,n1@cbe1066}, _)
dist_app:start_phase(go, _)

=PROGRESS REPORT==== 5-Dec-2002::17:27:15 ===
         application: dist
          started_at: n2@cbe1066
(n2@cbe1066)1>
(n2@cbe1066)1> global:whereis_name(dist_server).
<0.59.0>
(n2@cbe1066)2> dist_server:get_value().
17
In the first node, n1, we can see the following output:
=INFO REPORT==== 5-Dec-2002::17:27:15 ===
    application: dist
    exited: stopped
    type: permanent
We can see that dist_server brought the state variable along when migrating to the other node (it did not bring the special function objects along, in order to avoid nasty surprises).
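The dist_app:start_phase(takeover, ...) and dist_app:start_phase(go, ...) lines in the logs above are the result of the start_phases key in the application resource file. The real dist.app is not shown in this text, so the following is only a sketch. The names dist_app, dist_server, dist_super and the two phase names come from the logs; the version, the description, the module list, the dependency list and the empty phase arguments are assumptions.

```erlang
%% dist.app (sketch; see the assumptions listed above)
{application, dist,
 [{description, "Distributed example application"},  % assumed text
  {vsn, "1.0"},                                      % assumed version
  {modules, [dist_app, dist_super, dist_server]},    % dist_super assumed to be a module name
  {registered, [dist_super]},                        % locally registered supervisor
  {applications, [kernel, stdlib, sasl, base]},      % assumed dependencies
  {mod, {dist_app, []}},                             % application callback module
  %% After dist_app:start/2 has returned, the application master calls
  %% dist_app:start_phase(Phase, StartType, PhaseArgs) once per phase,
  %% in the order given here.
  {start_phases, [{takeover, []},
                  {go, []}]}]}.
```

Splitting the start into phases gives the new instance a well-defined point (the takeover phase) at which to fetch state from the old instance before it starts serving requests in the go phase.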
We can now try different combinations of starting and killing the two nodes.