4 Release Handling
4.1 Introduction
A new release is assembled into a release package. Such a package is installed in a running system by giving commands to the release handler, which is an SASL process. A system has a unique system version, which is updated whenever a new release is installed. The system version is the version of the entire system, not just the OTP version.
If the system consists of several nodes, each node has its own system version. Release handling can be synchronized between nodes, or be done at one node at a time.
Changes may require a node to be brought down. If that is the case and the system consists of several nodes, the release upgrade can be done as follows:
- move all applications from the node to be changed to other nodes,
- take down the node,
- do the change,
- restart the node and move the applications back.
There are several different types of releases:
- Operating system change.
  Can only be done by taking down the node. This kind of change is not supported by the release handler and therefore has to be performed manually. It is not possible to roll back automatically to a previous release, if there is an error.
- Application code or data change.
  The release is installed without bringing down the running node. Some changes, for example change of C-programs, may be done by shutting down and restarting the affected processes.
- Erlang emulator change.
  Can only be made by taking down the node. However, the release handler supports this type of change.
4.2 Administering Releases
This section describes how to build and install releases. Also refer to the SASL Reference Manual, release_handler, for more details.
The following steps are involved in administering releases:
- A release package is built by using release building commands in the systools module. The package is assembled from application specification files, code files, data files, and a file which describes how the release is installed in the system.
- The release package is transferred to the target machine, e.g. by using ftp.
- The release package is unpacked, which makes the system version in the release package available for installation by the release_handler, which interprets the release upgrade script, containing instructions for updating to the new version. If an installation fails in some way, the entire system is restarted from the old system version.
- When the installation is complete, the system version must be made permanent. When permanent, the new version is used if the system restarts.
It is also possible to reinstall an old version, or reboot the system from an old version. There are functions to remove old releases from disk as well.
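The installation steps above might be driven from the target node roughly as follows. This is a minimal sketch rather than a complete procedure: the module name rel_admin and the release name passed in are hypothetical, and error handling is reduced to pattern matching on the success results of the release_handler functions.

```erlang
-module(rel_admin).
-export([install/1]).

%% RelName is the name of the release package without extension
%% (hypothetical example: "my_system_1.2"), already transferred to
%% the releases directory of the target node.
install(RelName) ->
    %% Unpack the package; Vsn is the system version it contains.
    {ok, Vsn} = release_handler:unpack_release(RelName),
    %% Interpret the release upgrade script in the running system.
    {ok, _FromVsn, _Descr} = release_handler:install_release(Vsn),
    %% Without this step, a restart boots the old system version.
    ok = release_handler:make_permanent(Vsn),
    {ok, Vsn}.
```

Any other return value makes the function crash with a badmatch, which leaves the decision on how to proceed to the operator.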
4.3 File Structure
The file structure used in an OTP system is described in Release Directories. There are two ways of using this file structure together with the release handler.
The simplest way is to store all user-defined applications under $OTP_ROOT/lib in the same way as other OTP applications. The release handler takes care of everything, from unpacking a release to the removal of it. The release packages should be stored in the releases directory (default $OTP_ROOT/releases). This is where release_handler:unpack_release/1 searches for the packages, and where the release handler stores its files. Each package is a compressed tar file. The files in the tar file are named relative to the $OTP_ROOT directory. For example, if a new version (say 1.3) of the application snmp is contained in the release package, the files in the tar file should be named lib/snmp-1.3/*.
The second way is to store all user-defined applications in some other place in the file system. In this case, some more work has to be done outside the release handler. Specifically, the release packages must be unpacked in some way and the release handler must be notified of where the new release is located. The following three functions are available in the module release_handler to handle this case:
- set_unpacked/2
- set_removed/1
- install_file/2
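As a rough illustration of the second way, the sketch below registers a release that has been unpacked outside the release handler. The module name rel_external and the function register_release/3 are hypothetical; the sketch assumes that set_unpacked/2 takes the path of the .rel file plus a list of {App, AppVsn, Dir} tuples and returns {ok, Vsn}, and that install_file/2 takes the system version and a file name.

```erlang
-module(rel_external).
-export([register_release/3]).

%% RelFile : path to the .rel file of the new release
%% AppDirs : [{App, AppVsn, Dir}], where Dir is the directory in which
%%           App-AppVsn was unpacked
%% Files   : files such as relup, start.boot and sys.config that belong
%%           to the new release
register_release(RelFile, AppDirs, Files) ->
    %% Tell the release handler that the release is already unpacked.
    {ok, Vsn} = release_handler:set_unpacked(RelFile, AppDirs),
    %% Copy the release installation files to their correct places.
    lists:foreach(
      fun(File) -> ok = release_handler:install_file(Vsn, File) end,
      Files),
    {ok, Vsn}.
```

When such a release is later removed from disk by hand, set_removed/1 is the corresponding call that informs the release handler.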
4.4 Release Installation Files
The following files must be present when a release is installed. All file names are relative to the releases directory.
ReleaseFileName.rel
Vsn/relup
Vsn/start.boot
Vsn/sys.config
The location of the releases directory is specified with the configuration parameter releases_dir (default $OTP_ROOT/releases). In a target system, the default location is preferred, but during testing it may be more convenient to let the release handler write its files in a user specified directory, than in the $OTP_ROOT directory.
The files listed above are either present in the release package, or generated at the target machine and copied to their correct places using release_handler:install_file/2. Vsn is the system version string.
4.4.1 ReleaseFileName.rel
The ReleaseFileName.rel file contains the name of the system, version of the release, the version of erts (the Erlang runtime system) and the applications, which are parts of the release. The file must contain the following Erlang term:
    {release, {Name, Vsn}, {erts, EVsn},
     [{App, AVsn} | {App, AVsn, AType} | {App, AVsn, [App]} | {App, AVsn, AType, [App]}]}.
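A concrete instance of this term might look as follows. The system name, the application foo and all version strings are made up for illustration; the load type shown for snmp is only one possible value of AType.

```erlang
%% Hypothetical file: my_system.rel
{release, {"my_system", "1.2"}, {erts, "4.4"},
 [{kernel, "1.1"},          % {App, AVsn}
  {stdlib, "1.1"},
  {sasl, "1.1"},
  {foo, "1.2"},
  {snmp, "1.3", load}]}.    % {App, AVsn, AType}
```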
Name, Vsn, EVsn and AVsn are strings, App and AType are atoms. ReleaseFileName is a string given in the call to release_handler:unpack_release(ReleaseFileName). Name is the name of the system (the same as found in the boot file). This file is further described in Release Structure.
4.4.2 relup
The relup file contains instructions on how to install the new version in the system. It must contain one Erlang term:
    {Vsn, [{FromVsn, Descr, RuScript}], [{ToVsn, Descr, RuScript}]}.
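To make the shape of this term more tangible, here is a small sketch of a relup file for a system going from version "1.1" to "1.2". The application my_app, the module my_mod, the library version and the description strings are all hypothetical; the scripts use only low-level instructions.

```erlang
%% Hypothetical relup file for system version "1.2"
{"1.2",
 %% Upgrade scripts: one tuple per old version that can be upgraded
 [{"1.1", "Upgrade my_mod",
   [{load_object_code, {my_app, "2.0", [my_mod]}},
    point_of_no_return,
    {load, {my_mod, soft_purge, soft_purge}}]}],
 %% Downgrade scripts: one tuple per old version that can be reverted to
 [{"1.1", "Revert my_mod",
   [{load_object_code, {my_app, "1.9", [my_mod]}},
    point_of_no_return,
    {load, {my_mod, soft_purge, soft_purge}}]}]}.
```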
Vsn, FromVsn and ToVsn are strings, RuScript is a release upgrade script. Descr is a user defined parameter, which is not processed by any release handling functions. It can be used to describe the release to an operator. Finally, it will be returned by release_handler:install_release/1 and release_handler:check_install_release/1.
There is one tuple {FromVsn, Descr, RuScript} for each old system version which can be upgraded to the new version, and one tuple {ToVsn, Descr, RuScript} for each old version to which the new version can be downgraded.
4.4.3 start.boot
The start.boot file is the compiled start.script file. It is used to boot the Erlang machine.
4.4.4 sys.config
The sys.config file is the system configuration file.
4.5 Release Handling Principles
The following sections describe the principles for updating parts of an OTP system.
4.5.1 Erlang Code
The code change feature in Erlang is made possible because Erlang allows two versions of a module to be present in the system: the current version and the old version. There is always a current version of a loaded module, but an old version of a module only exists if the module has been replaced in run-time by loading a new version. When a new version is loaded, the previously current version becomes the old version, and the new version becomes the current version. However, if there are both a current and old version of a module, a new version cannot be loaded, unless the old version is first explicitly purged.
A global function call is a call where a qualified module name is used, i.e. the call is of the form M:F(A) (or apply(M, F, A)). A global call causes M:F to be dynamically linked into the run-time code, which means that M:F(A) will be evaluated using the latest available version of the module, i.e. the current version.
A local function call is a call without a qualified module name, i.e. the call is of the form F(A). The reference to F is resolved at compile time (irrespective of whether F is exported or not). By the very nature of F(A) being a local function call, F can only be called by a function that is defined in the very same module as that where F is defined. Hence a local function call is always evaluated in the same version of a module as that of the caller.
A fun is a function without a name. Like ordinary functions (i.e. functions which have names) its implementation is always bound to some module, and therefore funs are affected by code change as well. A reference to a fun is always indirect, as is the case for a global function call, where the reference is M:F (through an export table entry for the module), but the reference is not necessarily global. In fact, if a fun is called in the same module where it is defined, its reference will be resolved in the same way as a local function call is resolved. If a fun is called from a different module, its reference will be resolved as if the call was a global call, but with the additional requirement that the reference also match the particular implementation of the module where the fun was defined.
For each process there is a current function, i.e. the function that the process is currently evaluating. That function resides in some module. Hence a process always has a reference to at least one module. It may of course have references to other modules as well, because of nested, not yet finished calls.
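The difference between local and global calls can be seen in a small process loop. The module echo_loop below is a hypothetical example, not part of OTP: the loop normally recurses with a local call and therefore stays in the version of the module it is already executing, and only the fully qualified call in the upgrade clause moves the process to the current version.

```erlang
-module(echo_loop).
-export([start/0, loop/1]).

start() ->
    spawn(fun() -> loop(0) end).

loop(Count) ->
    receive
        {ping, From} ->
            From ! {pong, Count + 1},
            %% Local call: evaluated in the same version of the module
            %% as the caller.
            loop(Count + 1);
        upgrade ->
            %% Global call: evaluated in the current version of the
            %% module, so a process running old code leaves it here.
            ?MODULE:loop(Count);
        stop ->
            ok
    end.
```

Because loop/1 is reached through a global call, it has to be exported.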
Before a new version of a module can be loaded, the current version must be made old. If there is no old version, the new version is simply loaded: the previously current version becomes the old version, and the new version becomes current. All processes that execute the version which became old continue to do so until they have no unfinished calls within the old version.
If there is an old version, it must first be purged to make room for the current version to become old. However, an old version should not be purged if there are processes that have references to it. Such processes must either be terminated, or the loading of the new version must be postponed until they have terminated by themselves or no longer have references to the old version. There are options for controlling this in release upgrade scripts.
To prevent processes from making calls to other processes during the release installation, they may be suspended. All processes implemented with the standard behaviors, or with sys, can be suspended. When suspended a process enters a special suspend loop instead of its usual main process loop. In the suspend loop, the process can only receive system messages and shut-down messages from its supervisor. The code change message is a special system message, and this message causes the process to change code to the new version, and possibly to transform its internal state. After the code change a process is resumed, i.e. it returns to its main loop.
We highlight here three different types of modules.
- Functional module.
  A module, which does not contain a process loop, i.e. no process has constant references to this kind of module. lists is an example of a functional module.
- Process module.
  A module, which contains a process loop, i.e. some process has constant reference to the module. init is an example of a process module.
- Call-back module.
  A special case of a functional module which serves as a call-back module for a generic behavior such as gen_server. file is an example of a call-back module. A call to a call-back module is always a global call (i.e. it refers to the latest version of the module). This has some impacts upon how updates must be handled.
Modules of the above types are handled differently when changing code.
4.5.1.1 Functional Module
If the API of a new version of a functional module is backward compatible, as may be the case of a bug fix or new functionality, we simply load the new version. After a short while, when no processes have references to the old version, the old module is purged.
A more complicated situation arises if the API of a functional module is changed so that it is no longer backwards compatible. We must then make sure that no processes, directly or indirectly, try to call functions that have changed. We do this by writing new versions of all modules that use the API. Then, when performing the code change, all potential caller processes are suspended, new versions of the modules that use the API are loaded, the new version of the functional module is loaded, and finally all suspended processes are resumed.
There are two alternatives available to manage this type of change:
- Find all calls to the module, change them, and write dependencies in your release upgrade script. This may be manageable, if a function that has been incompatibly changed is called from only a few other functions.
- Avoid this type of change. This is the only reasonable solution, if an incompatible function is called from many other modules. Instead a completely new function should be introduced, and the original function should be kept for backward compatibility. In the next release, when all other modules are changed as well, the original function can be deleted.
4.5.1.2 Process Module
A process module should never contain global calls to itself (except for code that makes explicit code change). Therefore, a new version of a process module is merely loaded and all processes which are executing the module are told to change their code and, if required, to transform their internal state.
In practice, few modules are pure in the sense that they never contain global calls to themselves. If you use higher-order functions such as lists:map/2 in a process module, there will be global calls to the module. Therefore, we cannot merely load the module, because a process that is still running the old version of the module might make a call to the new version, which might be incompatible.
The only safe way to change code for a process module is to have its implementation understand system messages, and to change code by first suspending all processes that run the module, then ordering them to change code, and finally resuming them.
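A process module that understands system messages could be sketched as follows. The module counter_proc and its messages are hypothetical; the sketch assumes the usual proc_lib and sys conventions for special processes, where system messages are handed to sys:handle_system_msg/6 and the module exports system_continue/3, system_terminate/4 and system_code_change/4.

```erlang
-module(counter_proc).
-export([start_link/0, init/1]).
-export([system_continue/3, system_terminate/4, system_code_change/4]).

start_link() ->
    proc_lib:start_link(?MODULE, init, [self()]).

init(Parent) ->
    proc_lib:init_ack(Parent, {ok, self()}),
    loop(Parent, sys:debug_options([]), 0).

%% Main loop. System messages (suspend, change code, resume) are
%% delegated to sys, which runs the suspend loop on our behalf.
loop(Parent, Deb, Count) ->
    receive
        {system, From, Request} ->
            sys:handle_system_msg(Request, From, Parent, ?MODULE, Deb, Count);
        bump ->
            loop(Parent, Deb, Count + 1);
        {get, Pid} ->
            Pid ! {count, Count},
            loop(Parent, Deb, Count)
    end.

%% Called when the process is resumed.
system_continue(Parent, Deb, Count) ->
    loop(Parent, Deb, Count).

system_terminate(Reason, _Parent, _Deb, _Count) ->
    exit(Reason).

%% Called while suspended; this is where the internal state would be
%% transformed. Here the state format is unchanged.
system_code_change(Count, _Module, _OldVsn, _Extra) ->
    {ok, Count}.
```

With such a module, the suspend, change code and resume sequence corresponds to calling sys:suspend/1, sys:change_code/4 and sys:resume/1 on the process.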
4.5.1.3 Call-back Module
As long as the type of the internal state of a call-back module has not changed, we can just simply load the new version of the module without suspending and resuming the processes involved in the code change. This case is similar to the case of a functional module.
If the type of the internal state has changed, we must first suspend the processes, tell them to change code and at the same time give them the possibility to transform their states, and finally resume them. This is similar to the case of a process module.
4.5.1.4 Dependencies Between Processes
It is possible that a group of processes, which communicate, must perform code changes while they are suspended. Some of the processes may otherwise use the old protocol while others use the new protocol. On the other hand, there may be time-out dependencies which restrict the number of processes that can perform a synchronized code change as one set. The more processes that are included in the set, the longer the processes are suspended.
There may also be problems with circular dependencies. The following scenario illustrates this situation.
- two modules a and b are dependent on each other,
- each module is executed by one process with the same name as the corresponding module,
- both are updated at the same time because the internal protocol between them has changed.
The following sequence of events may occur:
- a is suspended.
- The release handler tries to suspend b, but some microsecond before this happens, b tries to communicate with a, which is now suspended.
- If b hangs in its call to a, the suspension of b fails and only a is updated.
- If b notices that a does not answer and is able to deal with it, then b receives the suspend message and is suspended. Then both modules are updated and the processes are resumed.
- When a resumes, there is a message waiting from b. This message may be of an old format which a does not recognize.
Situations of the type described, and many others, are highly application dependent. The author of the release upgrade script has to predict and avoid them. If the consequences are too difficult to manage, it may be better to entirely shut down and restart all affected processes. This reduces the problem of introducing new code and removes the need to do a synchronized change.
4.5.1.5 Finding Processes
For each application the .appup file specifies how the application is upgraded. The file contains specifications of which modules to change, and how to change them. The relup file is an assembly of all the .appup files.
For each application the release handler searches for all processes that have to perform a code change. It traverses the application supervision tree to find all child specifications of every supervisor in the tree. Each child specification lists all modules of the application that the child uses.
Hence it is by combining the list of modules to change with all children of supervisors that the release handler finds all processes that are subject to code change.
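The two pieces of information that are combined here can be sketched as follows. Both terms are hypothetical examples for an application foo with a module foo_server. The .appup layout shown, {Vsn, [{UpFromVsn, Instructions}], [{DownToVsn, Instructions}]}, is an assumption about the file format, which this section does not spell out; the child specification uses the tuple form whose last element is the list of modules.

```erlang
%% Hypothetical file foo.appup: which modules to change, and how.
{"1.2",
 [{"1.1", [{update, foo_server, {advanced, []}, soft_purge, soft_purge, []}]}],
 [{"1.1", [{update, foo_server, {advanced, []}, soft_purge, soft_purge, []}]}]}.

%% Hypothetical child specification as returned from a supervisor's
%% init/1. The final list names the modules the child uses, which is
%% how a process is matched against the modules listed in the .appup.
{foo_server, {foo_server, start_link, []},
 permanent, 2000, worker, [foo_server]}.
```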
4.5.2 Port Programs
A port program runs as an external program in the operating system. The simplest way to do code change for a port program is to terminate it, and then start a new version of it.
If that is not adequate, code change may be performed by sending the port program a message telling it to return any data that must survive the termination. Then the program is terminated, the new version is started, and the surviving data is passed to the new version of the port program.
Changing code for port programs is very application dependent. There is no special support for it in SASL.
4.5.3 Application Specification and Configuration Parameters
In each release, each application specification (i.e. the contents of the .app file of the application) is known to the release handler. Before any code change is performed for an application, the new environment variables are made available for the application, i.e. those parameters specified by the env tag in the application specification. When the new version of an application is running it will be informed of any changed, new or removed environment variables (see application(Module) in the KERNEL Reference Manual). This means that old processes may read new variables before they are informed of the new release. We advise against the immediate removal of the old variables. Neither do we recommend that they be syntactically changed, although they may of course change their values. They can be safely removed in the next release, by which time it is known that no processes will read the old variables.
4.5.4 Mnesia Data or Schema Changes
Changing data or schemas in Mnesia is similar to changing code for functional modules. Many processes may read or write in the same table at the same time. If we change a table definition, we must make sure that all code which uses the table is changed at the same time.
One way of doing it is to let one process be responsible for one or several tables. This process creates the tables and changes the table definitions or table data. In this way a set of tables is connected with a module (process module or call-back module). When the process performs a code change, the tables are changed as well.
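As a sketch of this approach, the process that owns a table could call a helper like the one below from its code change function. The module person_tab, the table person and both record layouts are hypothetical; the sketch assumes mnesia:transform_table/3, which rewrites every record of a table with the given fun and installs a new attribute list.

```erlang
-module(person_tab).
-export([upgrade_person/1, upgrade_table/0]).

%% Assumed layouts: old {person, Name, Age},
%%                  new {person, Name, Age, Email}.
upgrade_person({person, Name, Age}) ->
    {person, Name, Age, undefined}.

%% Requires a running Mnesia that holds the person table.
upgrade_table() ->
    mnesia:transform_table(person, fun upgrade_person/1,
                           [name, age, email]).
```

Keeping the record transformation in a function of its own makes it possible to check it without a running Mnesia.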
4.5.5 Upgrade vs. Downgrade
When a new release is installed, the system is upgraded to the new release. The release handler reads the relup file of the new release, and finds the upgrade script that corresponds to an upgrade from the current version to the new version of the system.
When an old release is reinstalled, the release handler reads the relup in the current release, and finds the downgrade script that corresponds to a downgrade from the current version to the old version of the system.
Usually a relup file for a new release contains one upgrade script and one downgrade script for each old version. If a soft downgrade is not wanted (an alternative is to reboot the system from the old release) the downgrade script is left out.
For each modified module in the new release, there are some instructions that specify how to install that module in a system. When performing an upgrade, the following steps are typically involved:
- Suspend the processes running the module.
- Load the new code.
- Tell the processes to switch to new code.
- Tell the processes to change the internal state. This usually involves calling, in the new module, a code_change function that is responsible for state updates, e.g. transforming the state from the old format to the new.
- Resume the processes.
The code change step is always performed when new code has been loaded and all processes are running the new code. The reason for this is that it is always the new version of the module that knows how to change the state from the old version.
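Expressed as a release upgrade script, the upgrade steps for a single module might look like the sketch below. The application foo, its version and the module foo_server are hypothetical; the point is only the order, with load placed before code_change.

```erlang
%% Sketch of a low-level upgrade script for one module
[{load_object_code, {foo, "1.2", [foo_server]}},  % read new code from file
 point_of_no_return,
 {suspend, [foo_server]},
 {load, {foo_server, soft_purge, soft_purge}},    % new code becomes current
 {code_change, up, [{foo_server, []}]},           % new code transforms state
 {resume, [foo_server]}].
```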
When performing a downgrade the situation is different. The old module does not know how to transform the new state to the old version: the new format is unknown to the old code. Therefore, it is the responsibility of new code to revert the state back to the old version during downgrade. The following steps are involved:
- Suspend the processes running the module.
- Tell the processes to change the internal state. This usually involves calling, in the current module, a code_change function that is responsible for state reversals, i.e. transforming the state from the current format to the old.
- Load the new code.
- Tell the processes to switch code.
- Resume the processes.
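The downgrade steps can be sketched in the same way. Again, the application foo, its version and the module foo_server are hypothetical; compared with an upgrade, code_change now comes before load, so that the state is reverted while the current code is still in place.

```erlang
%% Sketch of a low-level downgrade script for one module
[{load_object_code, {foo, "1.1", [foo_server]}},  % read old release's code
 point_of_no_return,
 {suspend, [foo_server]},
 {code_change, down, [{foo_server, []}]},         % current code reverts state
 {load, {foo_server, soft_purge, soft_purge}},
 {resume, [foo_server]}].
```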
We note that for a process module, it is possible to load the code before a process changes its internal state (since a process module never contains global calls to itself), thus making the steps needed for downgrade almost the same as for upgrade. The difference between the two cases is still in the order of switching code and changing state.
For a call-back module it is not actually necessary to tell the processes to switch code, since all calls to the call-back module are global calls. The difference between upgrade and downgrade is still in the order of loading code and performing state change.
The difference between how process modules and call-back modules are handled in the downgrade case comes from the fact that a process module never contains global calls to itself. The code is thus static in the sense that a process executing a process module does not spontaneously switch to newly loaded code. The opposite situation is a dynamic module, where a process executing the module spontaneously switches to the new code when it is loaded. A call-back module is always dynamic, and a process module static. A functional module is always dynamic.
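For a gen_server call-back module, both directions can be handled in the same code_change function of the newer version. The module gs_state and its two state formats are hypothetical; the sketch assumes that the first argument is the tuple {down, Vsn} during a downgrade and the old version otherwise.

```erlang
-module(gs_state).
-behaviour(gen_server).
-export([init/1, handle_call/3, handle_cast/2, code_change/3]).

%% Assumed formats: the old version keeps a bare counter, the new
%% version keeps {state, Counter, LastReset}.
init(_Args) ->
    {ok, {state, 0, never}}.

handle_call(get, _From, {state, N, _Last} = State) ->
    {reply, N, State}.

handle_cast(bump, {state, N, Last}) ->
    {noreply, {state, N + 1, Last}}.

%% Downgrade: evaluated in this (newer) code before the old code is
%% loaded, so this version must know the old format.
code_change({down, _Vsn}, {state, N, _Last}, _Extra) ->
    {ok, N};
%% Upgrade: evaluated in this code after it has been loaded.
code_change(_OldVsn, N, _Extra) when is_integer(N) ->
    {ok, {state, N, never}}.
```

The old version of the module needs no knowledge of the new format, which is the reason for the different ordering of the steps in the two cases.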
4.6 Release Handling Instructions
This section describes the release upgrade and downgrade scripts. A script is a list of instructions which are interpreted by the release handler when an upgrade or downgrade is made.
There are two levels of instructions: high-level instructions and low-level instructions. High- and low-level instructions may be mixed in one script. However, the high-level instructions are translated to low-level instructions by the systools:make_relup/3 command, because the release handler understands only low-level instructions.
Scripts have to be placed in the .appup file for each application. systools:make_relup/3 assembles the scripts in all .appup files to form a relup file containing low-level instructions.
4.6.1 High-level Instructions
The high-level instructions are:
{update, Module, Change, PrePurge, PostPurge, [Mod]} |
{update, Module, Timeout, Change, PrePurge, PostPurge, [Mod]} |
{update, Module, ModType, Timeout, Change, PrePurge, PostPurge, [Mod]}
  Module = atom()
  Timeout = default | infinity | int() > 0
  ModType = static | dynamic
  Change = soft | {advanced, Extra}
  PrePurge = soft_purge | brutal_purge
  PostPurge = soft_purge | brutal_purge
  Mod = atom(). If the module is dependent on changes in other modules, these other modules are listed here.
The instruction is used to update a process module or a call-back module. All processes that run the code of Module are suspended, and if the change is advanced they have to transform their states into the new states. Then the processes are resumed. If Module is dependent on other modules, the release handler will suspend processes in Module before suspending processes in the [Mod] modules. In case of circular dependencies, it will suspend processes in the order that update instructions appear in the script.
soft means backwards compatible changes and advanced means internal data changes, or changes which are not backwards compatible. Extra is any term, which is used in the argument list of the code_change function in Module (call-back module); otherwise it becomes part of a code change message (process module).
The optional parameter Timeout defines the time-out for the call to sys:suspend. It specifies how long to wait for a process to handle a suspend message and to get suspended. If no value is specified (or default is given), the default value defined in sys is used.
The optional parameter ModType specifies if the code is static or dynamic, as defined in Upgrade vs. Downgrade above. It needs to be specified only in the case of soft downgrades. Its value defaults to dynamic. Note: if this parameter is specified, Timeout is needed as well.
PrePurge controls what action to take with processes that are executing an old version of this module. These are processes, which are left since an earlier release upgrade (or downgrade). Usually there are no such processes. If the value is soft_purge and such processes are found, the release will not be installed and the install_release/1 function returns {error, {old_processes, Module}}. If the value is brutal_purge, the processes which run old code are killed.
PostPurge controls what action to take with processes that are executing old code when the new module has been installed. If the value is soft_purge, the release handler will purge the old code when no remaining processes execute the code. If the value is brutal_purge, the code is purged when the release is made permanent. All processes, which still are running old code are killed.
The update instruction can also be used for functional modules. However, no processes will be suspended because no processes will have the functional module as its main module. Therefore, no processes perform code change.
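The three forms of the update instruction could be used as in the following sketch. All module names are hypothetical, and the time-out value assumes milliseconds, as used by the functions in sys.

```erlang
%% Sketch of a list of high-level update instructions
[
 %% Shortest form: backwards compatible change
 {update, foo_server, soft, soft_purge, soft_purge, []},
 %% State format changes; Extra = [] is handed to the code change
 %% function. foo_db depends on changes in foo_server.
 {update, foo_db, {advanced, []}, soft_purge, soft_purge, [foo_server]},
 %% Longest form: static code with an explicit suspend time-out
 {update, foo_loop, static, 10000, {advanced, []},
  brutal_purge, brutal_purge, []}
].
```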
{load_module, Module, PrePurge, PostPurge, [Mod]}
  Module = atom()
  PrePurge = soft_purge | brutal_purge
  PostPurge = soft_purge | brutal_purge
  Mod = atom(). If the module is dependent on changes in other modules, these other modules are listed here.
The instruction is used to update a functional module or a call-back module. It only loads the module. A call-back module which must perform a code change, or synchronize by being suspended, should use update instead.
The object code is fetched in the beginning of the release upgrade, but the module is loaded when this instruction occurs.
{add_module, Mod}
The instruction adds a new module to the system. It loads the module.
{remove_application, Appl}
Removes an application. It calls application:stop and application:unload for the application.
{add_application, Appl}
Adds a new application. It calls application:load and application:start for the application.
{restart_application, Appl}
Restarts an existing application. The current version of the application is stopped and removed, and the new version of the application is loaded and started. The instruction is useful when the simplest way to change code for an application is to stop and restart the whole application.
4.6.2 Low-level Instructions
The low-level instructions are:
{load_object_code, {Lib, LibVsn, [Module]}}
Reads each Module from the library Lib-LibVsn as a binary. It does not install the code, it just reads the files. The instruction should be placed first in the script in order to read all new code from file. This makes the suspend-load-resume cycle less time consuming. After this instruction has been executed, the code server is updated with the new version of Lib. Calls to code:priv_dir(Lib) which are made after this instruction return the new priv dir. Lib is typically the application name.
point_of_no_return
If a crash occurs after this instruction, the system cannot recover and is restarted from the old version. The instruction must only occur once in a script. It should be placed after all load_object_code operations and after user defined checks, which are performed with apply. The function check_install_release/1 tries to evaluate all instructions before this command occurs in the script. Therefore, user defined checks must not have side effects, as they may be evaluated many times.
{load, {Module, PrePurge, PostPurge}}
Before this instruction occurs, the Module object code must have been loaded with the load_object_code instruction. This instruction makes code out of the binary. PrePurge = soft_purge | brutal_purge, and PostPurge = soft_purge | brutal_purge.
{remove, {Module, PrePurge, PostPurge}}
Makes the current version of Module old. When it has been executed, there is no current version in the system. PrePurge = soft_purge | brutal_purge, and PostPurge = soft_purge | brutal_purge.
{purge, [Module]}
Kills all processes that run the old versions of the modules in [Module] and deletes all old versions.
{suspend, [Module | {Module, Timeout}]}
Tries to suspend all processes that execute Module. If a process does not respond, it is ignored. This may cause the process to die, either because it crashes when it spontaneously switches to new code, or as a result of a purge operation. If no Timeout is specified (or if default is given), the default time-out defined in the module sys is used.
{code_change, [{Module, Extra}]} | {code_change, Mode, [{Module, Extra}]}
This instruction sends a code_change system message using the function change_code in the module sys with the Extra argument to the suspended processes that run this code. Mode is either up or down. Default is up. In case of an upgrade, the message is sent to the suspended process after the new code is loaded (the new version must contain functions to convert from the old internal state to the new internal state). In case of a downgrade, the message is sent to the suspended process before the new code is loaded (the current version must contain functions to convert from the current internal state to the old internal state).
Module uses the Extra argument internally in its code change function. Refer to the Reference Manual, module sys for further details.
One of the arguments to the function sys:change_code is OldVsn. In the case of an upgrade it obtains its value from the attribute vsn in the old code, or undefined if no such attribute was defined. In the case of downgrade, it is the tuple {down, Vsn}, where Vsn is the version of the module as defined in the .app file, or undefined otherwise.
{resume, [Module]}
Resumes all previously suspended processes which execute in any of the modules in the list [Module].
{stop, [Module]}
Stops all processes which are in any of the modules in the list [Module]. The instruction is useful when the simplest way to change code for the [Module] is to stop and restart the processes which run the code. If a supervisor is stopped, all its children are stopped as well.
{start, [Module]}
Starts all previously stopped processes which are in any member of [Module]. The processes will regain their positions in the supervision tree.
{sync_nodes, Id, [Node] | {M, F, A}}
If {M, F, A} is specified, apply(M, F, A) is evaluated and must return a list of nodes. The instruction synchronizes the release installation with other nodes. Each node in the list of nodes must evaluate this command, with the same Id. The local node waits for all other nodes to evaluate the instruction before execution continues. In case a node goes down, it is considered to be an unrecoverable error, and the local node is restarted from the old release. There is no time-out for this instruction, which implies that it may hang forever if a user defined apply enters an infinite loop at some node. It is up to the user to ensure that the apply command eventually returns or makes the node crash.
{apply, {M, F, A}}
Applies the function to the arguments. If the instruction appears before the point_of_no_return instruction, a failure of the application M:F(A) is caught, causing release_handler:install_release/1 to return {error, {'EXIT', Reason}}. If {error, Error} is thrown or returned by M:F, install_release/1 returns {error, Error}.
If the instruction appears after the point_of_no_return instruction, and if the application M:F(A) fails, the system is restarted.
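A function intended for use in an apply instruction before point_of_no_return can report a problem by returning {error, Error}. The sketch below assumes a hypothetical module upgrade_hooks whose check_table/1 verifies that a named ETS table exists; the module, the function, and the table name are not part of SASL.

```erlang
-module(upgrade_hooks).
-export([check_table/1]).

%% Hypothetical pre-upgrade check: succeed only if the named ETS table exists.
%% ets:info/1 returns undefined when there is no such table.
check_table(Tab) ->
    case ets:info(Tab) of
        undefined -> {error, {missing_table, Tab}};
        _Info     -> ok
    end.
```

It could then be referenced in the script as {apply, {upgrade_hooks, check_table, [my_tab]}}, placed before point_of_no_return so that a missing table aborts the installation rather than restarting the system.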
restart_new_emulator
Shuts down the current emulator and starts a new one. All processes are terminated gracefully. The new release must still be made permanent when the new emulator is up and running. Otherwise, the old emulator is started in case of an emulator restart. This instruction should be used when a new emulator is introduced, or if a complete reboot of the system should be done.
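After the node has come up on the new emulator, the release can be made permanent from the Erlang shell. The fragment below is a sketch that assumes a release version string "B"; release_handler:which_releases/0 can be used to look up the actual version.

```erlang
%% List the known releases together with their status
%% (for example current or permanent).
release_handler:which_releases().

%% "B" is a hypothetical release version; make_permanent/1 returns ok on success.
ok = release_handler:make_permanent("B").
```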
4.7 Release Handling Examples
This section includes several examples that show how different types of upgrades are handled. In call-back modules having the gen_server behavior, all call-back functions have been provided for reasons of clarity.
4.7.1 Update of Erlang Code
Several update examples are shown. Unless otherwise stated, it is assumed that all original modules are in the application foo, version "1.1", and the updated version is "1.2".
4.7.1.1 Simple Functional Module
This example is about a pure functional module, i.e. a module whose functions have no side effects. The original version of the module lists2 has the following contents:
    -module(lists2).
    -vsn(1).
    -export([assoc/2]).

    assoc(Key, [{Key, Val} | _]) -> {ok, Val};
    assoc(Key, [H | T]) -> assoc(Key, T);
    assoc(Key, []) -> false.
The new version of the module adds a new function:
    -module(lists2).
    -vsn(2).
    -export([assoc/2, multi_map/2]).

    assoc(Key, [{Key, Val} | _]) -> {ok, Val};
    assoc(Key, [H | T]) -> assoc(Key, T);
    assoc(Key, []) -> false.

    %% Tuple funs such as {erlang, hd} are not accepted by current
    %% Erlang/OTP versions, so explicit funs are used instead.
    multi_map(Func, [[] | ListOfLists]) -> [];
    multi_map(Func, ListOfLists) ->
        [apply(Func, lists:map(fun erlang:hd/1, ListOfLists)) |
         multi_map(Func, lists:map(fun erlang:tl/1, ListOfLists))].
The release upgrade instructions are:
    [{load_module, lists2, soft_purge, soft_purge, []}]
Alternatively, the low-level instructions are:
    [{load_object_code, {foo, "1.2", [lists2]}},
     point_of_no_return,
     {load, {lists2, soft_purge, soft_purge}}]
4.7.1.2 A More Complicated Functional Module
Here we have a functional module bar that uses the module lists2 of the previous example. The original version is only dependent on the original version of lists2.
    -module(bar).
    -vsn(1).
    -export([simple/1, complicated_sum/1]).

    simple(X) ->
        case lists2:assoc(simple, X) of
            {ok, Val} -> Val;
            false -> false
        end.

    complicated_sum([X, Y, Z]) -> cs(X, Y, Z).

    cs([HX | TX], [HY | TY], [HZ | TZ]) ->
        NewRes = cs(TX, TY, TZ),
        [HX + HY + HZ | NewRes];
    cs([], [], []) -> [].
The new version of bar uses the new functionality of lists2 in order to simplify the implementation of the useful function complicated_sum/1. It does not change its API in any way.
    -module(bar).
    -vsn(2).
    -export([simple/1, complicated_sum/1]).

    simple(X) ->
        case lists2:assoc(simple, X) of
            {ok, Val} -> Val;
            false -> false
        end.

    complicated_sum(X) ->
        lists2:multi_map(fun(A,B,C) -> A+B+C end, X).
The release upgrade instructions, including instructions for lists2, are as follows:
    [{load_module, lists2, soft_purge, soft_purge, []},
     {load_module, bar, soft_purge, soft_purge, [lists2]}]
We must state that bar is dependent on lists2 so that the release handler loads lists2 before it loads bar.
The low-level instructions are:
    [{load_object_code, {foo, "1.2", [lists2, bar]}},
     point_of_no_return,
     {load, {lists2, soft_purge, soft_purge}},
     {load, {bar, soft_purge, soft_purge}}]
4.7.1.3 Advanced Functional Module
Suppose now that we modify the return value of lists2:assoc/2 from {ok, Val} to {Key, Val}. In order to do an upgrade, we would have to find all modules that call lists2:assoc/2 directly or indirectly, and specify that these modules are dependent on lists2. In practice this might be an unwieldy task if many other modules use the lists2 module, and the only reasonable way may then be to perform an upgrade which restarts the whole system.
If we insist on doing a soft upgrade, the modification should be made backward compatible by introducing a new function (assoc2/2, say) that has the new return value, and not making any changes to the original function at all.
4.7.1.4 Advanced gen_server
This example assumes that we have a gen_server process that must be updated because we have introduced a new function, and added a new data field in our internal state. The contents of the original module are as follows:
    -module(gs1).
    -vsn(1).
    -behaviour(gen_server).

    -export([get_data/0]).
    -export([init/1, handle_call/3, handle_cast/2, handle_info/2,
             terminate/2, code_change/3]).

    -record(state, {data}).

    get_data() -> gen_server:call(gs1, get_data).

    init([Data]) -> {ok, #state{data = Data}}.

    handle_call(get_data, _From, State) ->
        {reply, {ok, State#state.data}, State}.

    handle_cast(_Request, State) -> {noreply, State}.

    handle_info(_Info, State) -> {noreply, State}.

    terminate(_Reason, _State) -> ok.

    code_change(_OldVsn, State, _Extra) -> {ok, State}.
The new module must translate the old state into the new state. Recall that a record is just syntactic sugar for a tuple:
    -module(gs1).
    -vsn(2).
    -behaviour(gen_server).

    -export([get_data/0, get_time/0]).
    -export([init/1, handle_call/3, handle_cast/2, handle_info/2,
             terminate/2, code_change/3]).

    -record(state, {data, time}).

    get_data() -> gen_server:call(gs1, get_data).

    get_time() -> gen_server:call(gs1, get_time).

    init([Data]) -> {ok, #state{data = Data, time = erlang:time()}}.

    handle_call(get_data, _From, State) ->
        {reply, {ok, State#state.data}, State};
    handle_call(get_time, _From, State) ->
        {reply, {ok, State#state.time}, State}.

    handle_cast(_Request, State) -> {noreply, State}.

    handle_info(_Info, State) -> {noreply, State}.

    terminate(_Reason, _State) -> ok.

    code_change(1, {state, Data}, _Extra) ->
        {ok, #state{data = Data, time = erlang:time()}}.
The release upgrade instructions are as follows:
    [{update, gs1, {advanced, []}, soft_purge, soft_purge, []}]
The alternative low-level instructions are:
    [{load_object_code, {foo, "1.2", [gs1]}},
     point_of_no_return,
     {suspend, [gs1]},
     {load, {gs1, soft_purge, soft_purge}},
     {code_change, [{gs1, []}]},
     {resume, [gs1]}]
If we want to handle soft downgrade as well, the code would be as follows:
    -module(gs1).
    -vsn(2).
    -behaviour(gen_server).

    -export([get_data/0, get_time/0]).
    -export([init/1, handle_call/3, handle_cast/2, handle_info/2,
             terminate/2, code_change/3]).

    -record(state, {data, time}).

    get_data() -> gen_server:call(gs1, get_data).

    get_time() -> gen_server:call(gs1, get_time).

    init([Data]) -> {ok, #state{data = Data, time = erlang:time()}}.

    handle_call(get_data, _From, State) ->
        {reply, {ok, State#state.data}, State};
    handle_call(get_time, _From, State) ->
        {reply, {ok, State#state.time}, State}.

    handle_cast(_Request, State) -> {noreply, State}.

    handle_info(_Info, State) -> {noreply, State}.

    terminate(_Reason, _State) -> ok.

    code_change(1, {state, Data}, _Extra) ->
        {ok, #state{data = Data, time = erlang:time()}};
    code_change({down, 1}, #state{data = Data}, _Extra) ->
        {ok, {state, Data}}.
Note that we take care of translating the new state to the old format as well. The low-level instructions are:
    [{load_object_code, {foo, "1.2", [gs1]}},
     point_of_no_return,
     {suspend, [gs1]},
     {code_change, [{gs1, []}]},
     {load, {gs1, soft_purge, soft_purge}},
     {resume, [gs1]}]
4.7.1.5 Advanced gen_server with Dependencies
This example assumes that we have a gen_server process that uses gs1 as defined in the previous example.
The contents of the original module are as follows:
    -module(gs2).
    -vsn(1).
    -behaviour(gen_server).

    -export([is_operation_ok/1]).
    -export([init/1, handle_call/3, handle_cast/2, handle_info/2,
             terminate/2, code_change/3]).

    is_operation_ok(Op) -> gen_server:call(gs2, {is_operation_ok, Op}).

    init([Data]) -> {ok, []}.

    handle_call({is_operation_ok, Op}, _From, State) ->
        Data = gs1:get_data(),
        Reply = lists2:assoc(Op, Data),
        {reply, Reply, State}.

    handle_cast(_Request, State) -> {noreply, State}.

    handle_info(_Info, State) -> {noreply, State}.

    terminate(_Reason, _State) -> ok.

    code_change(_OldVsn, State, _Extra) -> {ok, State}.
The new version does not have to transform the internal state, hence the code_change/3 function is not really needed (it will not be called since the upgrade of gs2 is soft).
    -module(gs2).
    -vsn(2).
    -behaviour(gen_server).

    -export([is_operation_ok/1]).
    -export([init/1, handle_call/3, handle_cast/2, handle_info/2,
             terminate/2, code_change/3]).

    is_operation_ok(Op) -> gen_server:call(gs2, {is_operation_ok, Op}).

    init([Data]) -> {ok, []}.

    handle_call({is_operation_ok, Op}, _From, State) ->
        Data = gs1:get_data(),
        Time = gs1:get_time(),
        Reply = do_things(lists2:assoc(Op, Data), Time),
        {reply, Reply, State}.

    handle_cast(_Request, State) -> {noreply, State}.

    handle_info(_Info, State) -> {noreply, State}.

    terminate(_Reason, _State) -> ok.

    code_change(_OldVsn, State, _Extra) -> {ok, State}.

    do_things({ok, Val}, Time) -> Val;
    do_things(false, Time) -> {false, Time}.
The release upgrade instructions are:
    [{update, gs1, {advanced, []}, soft_purge, soft_purge, []},
     {update, gs2, soft, soft_purge, soft_purge, [gs1]}]
The corresponding low-level instructions are:
    [{load_object_code, {foo, "1.2", [gs1, gs2]}},
     point_of_no_return,
     {suspend, [gs1, gs2]},
     {load, {gs1, soft_purge, soft_purge}},
     {load, {gs2, soft_purge, soft_purge}},
     {code_change, [{gs1, []}]},    % No gs2 here!
     {resume, [gs1, gs2]}]
4.7.1.6 Other Worker Processes
All other worker processes in a supervision tree, such as processes of the types gen_event, gen_fsm, and processes implemented by using proc_lib and sys, are handled in exactly the same way as processes of type gen_server are handled. Examples follow.
4.7.1.7 Simple gen_event
This example shows how an event handler may be updated. We do not make any assumptions about which event manager processes the handler is installed in; it is the responsibility of the release handler to find them. The contents of the original module are as follows:
    -module(ge_h).
    -vsn(1).
    -behaviour(gen_event).

    -export([get_events/1]).
    -export([init/1, handle_event/2, handle_call/2, handle_info/2,
             terminate/2, code_change/3]).

    get_events(Mgr) -> gen_event:call(Mgr, ge_h, get_events).

    init(_) -> {ok, undefined}.

    handle_event(Event, _LastEvent) -> {ok, Event}.

    handle_call(get_events, LastEvent) -> {ok, [LastEvent], LastEvent}.

    handle_info(Info, LastEvent) -> {ok, LastEvent}.

    terminate(Arg, LastEvent) -> ok.

    code_change(_OldVsn, LastEvent, _Extra) -> {ok, LastEvent}.
The new module decides to keep the two latest events in a list and must translate the old state into the new state.
    -module(ge_h).
    -vsn(2).
    -behaviour(gen_event).

    -export([get_events/1]).
    -export([init/1, handle_event/2, handle_call/2, handle_info/2,
             terminate/2, code_change/3]).

    get_events(Mgr) -> gen_event:call(Mgr, ge_h, get_events).

    init(_) -> {ok, []}.

    handle_event(Event, []) -> {ok, [Event]};
    handle_event(Event, [Event1 | _]) -> {ok, [Event, Event1]}.

    %% A gen_event handle_call/2 must return {ok, Reply, NewState}.
    handle_call(get_events, Events) -> {ok, Events, Events}.

    handle_info(Info, Events) -> {ok, Events}.

    terminate(Arg, Events) -> ok.

    code_change(1, undefined, _Extra) -> {ok, []};
    code_change(1, LastEvent, _Extra) -> {ok, [LastEvent]}.
The release upgrade instructions are:
    [{update, ge_h, {advanced, []}, soft_purge, soft_purge, []}]
The low-level instructions are:
    [{load_object_code, {foo, "1.2", [ge_h]}},
     point_of_no_return,
     {suspend, [ge_h]},
     {load, {ge_h, soft_purge, soft_purge}},
     {code_change, [{ge_h, []}]},
     {resume, [ge_h]}]
These instructions are identical to those used for the gen_server.
4.7.1.8 Process Implemented with sys and proc_lib
Processes implemented with sys and proc_lib are changed in the same way as processes that are implemented according to the gen_server behavior (which should not come as a surprise, since gen_server et al. are implemented on top of sys and proc_lib). However, the code change function is defined differently. The original is as follows:
    -module(sp).
    -vsn(1).

    -export([start/0, get_data/0]).
    -export([init/1, system_continue/3, system_terminate/4]).

    -record(state, {data}).

    start() ->
        Pid = proc_lib:spawn_link(?MODULE, init, [self()]),
        {ok, Pid}.

    get_data() ->
        sp_server ! {self(), get_data},
        receive
            {sp_server, Data} -> Data
        end.

    init(Parent) ->
        register(sp_server, self()),
        process_flag(trap_exit, true),
        loop(#state{}, Parent).

    loop(State, Parent) ->
        receive
            {system, From, Request} ->
                sys:handle_system_msg(Request, From, Parent, ?MODULE, [], State);
            {'EXIT', Parent, Reason} ->
                cleanup(State),
                exit(Reason);
            {From, get_data} ->
                From ! {sp_server, State#state.data},
                loop(State, Parent);
            _Any ->
                loop(State, Parent)
        end.

    cleanup(State) -> ok.

    %% Here are the sys call back functions
    system_continue(Parent, _, State) ->
        loop(State, Parent).

    system_terminate(Reason, Parent, _, State) ->
        cleanup(State),
        exit(Reason).
The new code, which takes care of up- and downgrade, is as follows:
    -module(sp).
    -vsn(2).

    -export([start/0, get_data/0, set_data/1]).
    -export([init/1, system_continue/3, system_terminate/4,
             system_code_change/4]).

    -record(state, {data, last_pid}).

    start() ->
        Pid = proc_lib:spawn_link(?MODULE, init, [self()]),
        {ok, Pid}.

    get_data() ->
        sp_server ! {self(), get_data},
        receive
            {sp_server, Data} -> Data
        end.

    set_data(Data) ->
        sp_server ! {self(), set_data, Data}.

    init(Parent) ->
        register(sp_server, self()),
        process_flag(trap_exit, true),
        loop(#state{last_pid = no_one}, Parent).

    loop(State, Parent) ->
        receive
            {system, From, Request} ->
                sys:handle_system_msg(Request, From, Parent, ?MODULE, [], State);
            {'EXIT', Parent, Reason} ->
                cleanup(State),
                exit(Reason);
            {From, get_data} ->
                From ! {sp_server, State#state.data},
                loop(State, Parent);
            {From, set_data, Data} ->
                loop(State#state{data = Data, last_pid = From}, Parent);
            _Any ->
                loop(State, Parent)
        end.

    cleanup(State) -> ok.

    %% Here are the sys call back functions
    system_continue(Parent, _, State) ->
        loop(State, Parent).

    system_terminate(Reason, Parent, _, State) ->
        cleanup(State),
        exit(Reason).

    system_code_change({state, Data}, _Mod, 1, _Extra) ->
        {ok, #state{data = Data, last_pid = no_one}};
    system_code_change(#state{data = Data}, _Mod, {down, 1}, _Extra) ->
        {ok, {state, Data}}.
The release upgrade instructions are:
    [{update, sp, static, default, {advanced, []}, soft_purge, soft_purge, []}]
The low-level instructions are the same for upgrade and downgrade:
    [{load_object_code, {foo, "1.2", [sp]}},
     point_of_no_return,
     {suspend, [sp]},
     {load, {sp, soft_purge, soft_purge}},
     {code_change, [{sp, []}]},
     {resume, [sp]}]
4.7.1.9 Supervisor
This example assumes that a new version of an application adds a new process, and deletes one process from a supervisor. The original code is as follows:
    -module(sup).
    -vsn(1).
    -behaviour(supervisor).
    -export([init/1]).

    init([]) ->
        SupFlags = {one_for_one, 4, 3600},
        Server = {my_server, {my_server, start_link, []},
                  permanent, 2000, worker, [my_server]},
        GS1 = {gs1, {gs1, start_link, []}, permanent, 2000, worker, [gs1]},
        {ok, {SupFlags, [Server, GS1]}}.
The new code is as follows:
    -module(sup).
    -vsn(2).
    -behaviour(supervisor).
    -export([init/1]).

    init([]) ->
        SupFlags = {one_for_one, 4, 3600},
        GS1 = {gs1, {gs1, start_link, []}, permanent, 2000, worker, [gs1]},
        GS2 = {gs2, {gs2, start_link, []}, permanent, 2000, worker, [gs2]},
        {ok, {SupFlags, [GS1, GS2]}}.
The release upgrade instructions are:
    [{update, sup, {advanced, []}, soft_purge, soft_purge, []},
     {apply, {supervisor, terminate_child, [sup, my_server]}},
     {apply, {supervisor, delete_child, [sup, my_server]}},
     {apply, {supervisor, restart_child, [sup, gs2]}}]
The low-level instructions are:
    [{load_object_code, {foo, "1.2", [sup]}},
     point_of_no_return,
     {suspend, [sup]},
     {load, {sup, soft_purge, soft_purge}},
     {code_change, [{sup, []}]},
     {resume, [sup]},
     {apply, {supervisor, terminate_child, [sup, my_server]}},
     {apply, {supervisor, delete_child, [sup, my_server]}},
     {apply, {supervisor, restart_child, [sup, gs2]}}]
The high-level update instruction for a supervisor is mapped to a low-level advanced code change instruction. In the code_change function of the supervisor, the new child specification is installed, but no children are explicitly terminated or started. Therefore, children must be terminated, deleted and started by using the apply instruction.
4.7.1.10 Complex Dependencies
As already mentioned, sometimes the simplest and safest way to introduce a new release is to terminate parts of the system, load the new code, and restart those parts. However, individual processes cannot simply be killed, since their supervisors will restart them again. Instead, supervisors must first be ordered to stop their children before new code can be loaded. Then the supervisors are ordered to restart their children. All this is done by issuing the stop and start instructions.
The following example assumes that we have a supervisor a with two children b and c, where b is a worker and c is a supervisor for d. We want to restart all processes except for a. The upgrade instructions are as follows:
    [{load_object_code, {foo, "1.2", [b,c,d]}},
     point_of_no_return,
     {stop, [b, c]},
     {load, {b, soft_purge, soft_purge}},
     {load, {c, soft_purge, soft_purge}},
     {load, {d, soft_purge, soft_purge}},
     {start, [b, c]}]
We do not need to explicitly stop d; this is done by the supervisor c.
A whole application cannot be stopped and started with the stop and start instructions. The instruction restart_application has to be used instead.
4.7.1.11 New Application
The examples shown so far have dealt with changing an existing application. In order to introduce a completely new application we just have to have an add_application instruction, but we also have to make sure that the boot file of the new release contains enough in order to start it. The following example shows how to introduce the application new_appl, which has just one module: new_mod.
The release upgrade instructions are:
    [{add_application, new_appl}]
The corresponding low-level instructions are as follows (note that the application specification is used as argument to application:start_application/1):
    [{load_object_code, {new_appl, "1.0", [new_mod]}},
     point_of_no_return,
     {load, {new_mod, soft_purge, soft_purge}},
     {apply, {application, start,
              [{application, new_appl,
                [{description, "NEW APPL"},
                 {vsn, "1.0"},
                 {modules, [new_mod]},
                 {registered, []},
                 {applications, [kernel, foo]},
                 {env, []},
                 {mod, {new_mod, start_link, []}}]},
               permanent]}}].
4.7.1.12 Remove an Application
An application is removed in the same way as new applications are introduced. This example assumes that we want to remove the new_appl application:
    [{remove_application, new_appl}]
The corresponding low-level instructions are:
    [point_of_no_return,
     {apply, {application, stop, [new_appl]}},
     {remove, {new_mod, soft_purge, soft_purge}}].
4.7.2 Update of Port Programs
Each port program is controlled by an Erlang process called the port controller. A port program is updated by the port controller process. This is always done by terminating the old port program and starting the new one.
4.7.2.1 Port Controller
In this example we have a port controller process, where we must take care of the termination and restart of the port program ourselves. Also, we may prepare for the possibility of changing the Erlang code of the port controller only. The gen_server behavior is used to implement the port controller. The contents of the original module are as follows.
    -module(portc).
    -vsn(1).
    -behaviour(gen_server).

    -export([get_data/0]).
    -export([init/1, handle_call/3, handle_info/2, code_change/3]).

    -record(state, {port, data}).

    get_data() -> gen_server:call(portc, get_data).

    init([]) ->
        PortProg = code:priv_dir(foo) ++ "/bin/portc",
        Port = open_port({spawn, PortProg}, [binary, {packet, 2}]),
        {ok, #state{port = Port}}.

    handle_call(get_data, _From, State) ->
        {reply, {ok, State#state.data}, State}.

    handle_info({Port, Cmd}, State) ->
        NewState = do_cmd(Cmd, State),
        {noreply, NewState}.

    code_change(_, State, change_port_only) ->
        Port = State#state.port,
        Port ! {self(), close},       % the port owner asks the port to close
        receive
            {Port, closed} -> true
        end,
        NPortProg = code:priv_dir(foo) ++ "/bin/portc",   % get new version
        NPort = open_port({spawn, NPortProg}, [binary, {packet, 2}]),
        {ok, State#state{port = NPort}}.
To change the port program without changing the Erlang code, we can use the following code:
    [point_of_no_return,
     {suspend, [portc]},
     {code_change, [{portc, change_port_only}]},
     {resume, [portc]}]
Here we used low-level instructions only. In this example we also make use of the Extra argument of the code_change/3 function.
Suppose now that we wish to change only the Erlang code. The new version of portc is as follows:
    -module(portc).
    -vsn(2).
    -behaviour(gen_server).

    -export([get_data/0]).
    -export([init/1, handle_call/3, handle_info/2, code_change/3]).

    -record(state, {port, data}).

    get_data() -> gen_server:call(portc, get_data).

    init([]) ->
        PortProg = code:priv_dir(foo) ++ "/bin/portc",
        Port = open_port({spawn, PortProg}, [binary, {packet, 2}]),
        {ok, #state{port = Port}}.

    handle_call(get_data, _From, State) ->
        {reply, {ok, State#state.data}, State}.

    handle_info({Port, Cmd}, State) ->
        NewState = do_cmd(Cmd, State),
        {noreply, NewState}.

    code_change(_, State, change_port_only) ->
        Port = State#state.port,
        Port ! {self(), close},       % the port owner asks the port to close
        receive
            {Port, closed} -> true
        end,
        NPortProg = code:priv_dir(foo) ++ "/bin/portc",   % get new version
        NPort = open_port({spawn, NPortProg}, [binary, {packet, 2}]),
        {ok, State#state{port = NPort}};
    code_change(1, State, change_erl_only) ->
        NState = transform_state(State),
        {ok, NState}.
The high-level instruction is:
    [{update, portc, {advanced, change_erl_only}, soft_purge, soft_purge, []}]
The corresponding low-level instructions are:
    [{load_object_code, {foo, "1.2", [portc]}},
     point_of_no_return,
     {suspend, [portc]},
     {load, {portc, soft_purge, soft_purge}},
     {code_change, [{portc, change_erl_only}]},
     {resume, [portc]}]