Release Handling

4 Release Handling
4.1 Introduction
A new release is assembled into a release package. Such a package is installed in a running system by giving commands to the release handler, which is a SASL process. A system has a unique system version, which is updated whenever a new release is installed. The system version is the version of the entire system, not just the OTP version.
If the system consists of several nodes, each node has its own system version. Release handling can be synchronized between nodes, or be done at one node at a time.
Changes may require a node to be brought down. If that is the case and the system consists of several nodes, the release upgrade can be done as follows:

move all applications from the node to be changed to other nodes,

take down the node,

do the change,

restart the node and move the applications back.

There are several different types of releases:

Operating system change.

Can only be done by taking down the node. This kind of change is not supported by the release handler and therefore has to be performed manually. If an error occurs, it is not possible to roll back automatically to a previous release.

Application code or data change.

The release is installed without bringing down the running node. Some changes, for example change of C-programs, may be done by shutting down and restarting the affected processes.

Erlang emulator change.

Can only be made by taking down the node. However, the release handler supports this type of change.

4.2 Administering Releases
This section describes how to build and install releases. Also refer to the SASL Reference Manual, release_handler, for more details.
The following steps are involved in administering releases:

A release package is built by using release building commands in the systools module. The package is assembled from application specification files, code files, data files, and a file, which describes how the release is installed in the system.

The release package is transferred to the target machine, e.g. by using ftp.

The release package is unpacked, which makes the system version in the release package available for installation. The release is then installed by the release_handler, which interprets the release upgrade script containing the instructions for updating to the new version. If an installation fails in some way, the entire system is restarted from the old system version.

When the installation is complete, the system version must be made permanent. When permanent, the new version is used if the system restarts.

It is also possible to reinstall an old version, or reboot the system from an old version. There are functions to remove old releases from disk as well.
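The steps above can be sketched as two small helper functions, one for the build machine and one for the target node. This is a minimal illustration, not a complete tool: the module name rel_admin and its function names are hypothetical, the release names are placeholders supplied by the caller, and the target-side function assumes that SASL is running and that the package has already been copied to the releases directory.

```erlang
-module(rel_admin).
-export([build/2, upgrade/1]).

%% Build side. RelName and PrevRelName are .rel file names without the
%% extension (placeholders chosen by the caller).
build(RelName, PrevRelName) ->
    %% Generate the relup for upgrading from, and downgrading to, PrevRelName.
    ok = systools:make_relup(RelName, [PrevRelName], [PrevRelName]),
    %% Generate the boot script for the new release.
    ok = systools:make_script(RelName),
    %% Assemble the release package (a compressed tar file).
    ok = systools:make_tar(RelName).

%% Target side. Assumes RelName.tar.gz is in the releases directory.
upgrade(RelName) ->
    {ok, Vsn} = release_handler:unpack_release(RelName),
    case release_handler:install_release(Vsn) of
        {ok, _FromVsn, _Descr} ->
            %% Without this step the old version is used after a restart.
            ok = release_handler:make_permanent(Vsn),
            {ok, Vsn};
        Other ->
            %% For example {error, Reason}; left to the caller to handle.
            Other
    end.
```

Keeping make_permanent as a separate, explicit step mirrors the description above: an installed release can be observed for a while before it is made the version used at restart.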
4.3 File Structure
The file structure used in an OTP system is described in Release Directories. There are two ways of using this file structure together with the release handler.
The simplest way is to store all user-defined applications under $OTP_ROOT/lib in the same way as other OTP applications. The release handler takes care of everything, from unpacking a release to the removal of it. The release packages should be stored in the releases directory (default $OTP_ROOT/releases). This is where release_handler:unpack_release/1 searches for the packages, and where the release handler stores its files. Each package is a compressed tar file. The files in the tar file are named relative to the $OTP_ROOT directory. For example, if a new version (say 1.3) of the application snmp is contained in the release package, the files in the tar file should be named lib/snmp-1.3/*.
The second way is to store all user-defined applications in some other place in the file system. In this case, some more work has to be done outside the release handler. Specifically, the release packages must be unpacked in some way and the release handler must be notified of where the new release is located. The following three functions are available in the module release_handler to handle this case:

set_unpacked/2

set_removed/1

install_file/2.
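A rough sketch of the second approach follows, assuming the package has already been unpacked by some external means. The module name custom_unpack and the argument names are hypothetical. AppDirs is a list of {App, Vsn, Dir} tuples that tells the release handler where each user-defined application was placed.

```erlang
-module(custom_unpack).
-export([register/3, forget/1]).

%% RelFile: path to the .rel file of the new release.
%% AppDirs: [{App, Vsn, Dir}] for applications stored outside $OTP_ROOT/lib.
%% Files:   release-dependent files (relup, start.boot, sys.config) to copy
%%          into the release structure.
register(RelFile, AppDirs, Files) ->
    {ok, Vsn} = release_handler:set_unpacked(RelFile, AppDirs),
    lists:foreach(
      fun(File) -> ok = release_handler:install_file(Vsn, File) end,
      Files),
    {ok, Vsn}.

%% Tell the release handler that the release has been removed from disk
%% by some external means.
forget(Vsn) ->
    release_handler:set_removed(Vsn).
```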

4.4 Release Installation Files
The following files must be present when a release is installed. All file names are relative to the releases directory.

ReleaseFileName.rel

Vsn/relup

Vsn/start.boot

Vsn/sys.config

The location of the releases directory is specified with the configuration parameter releases_dir (default $OTP_ROOT/releases). In a target system, the default location is preferred, but during testing it may be more convenient to let the release handler write its files in a user-specified directory rather than in the $OTP_ROOT directory.
The files listed above are either present in the release package, or generated at the target machine and copied to their correct places using release_handler:install_file/2.
Vsn is the system version string.
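For testing, the releases_dir parameter mentioned above can be set for the sasl application in a system configuration file. The fragment below is only an illustration; the directory path is a made-up placeholder.

```erlang
%% sys.config fragment (the path is a hypothetical test location)
[{sasl, [{releases_dir, "/tmp/rel_test/releases"}]}].
```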
4.4.1 ReleaseFileName.rel
The ReleaseFileName.rel file contains the name of the system, the version of the release, the version of erts (the Erlang runtime system), and the applications that are part of the release. The file must contain the following Erlang term:
    {release, {Name, Vsn}, {erts, EVsn}, 
     [{App, AVsn} | {App, AVsn, AType} | {App, AVsn, [App]} |
        {App, AVsn, AType, [App]}]}.
      
Name, Vsn, EVsn and AVsn are strings, App and AType are atoms. ReleaseFileName is a string given in the call to release_handler:unpack_release(ReleaseFileName). Name is the name of the system (the same as found in the boot file). This file is further described in Release Structure.
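As an illustration, a .rel file for a small system could look as follows. The system name, the application foo, and all version strings are placeholders; real values must match the erts and application versions actually installed.

```erlang
%% mysys-2.rel (all names and versions are placeholders)
{release, {"mysys", "2.0"}, {erts, "13.2"},
 [{kernel, "8.5.4"},
  {stdlib, "4.3"},
  {sasl, "4.2"},
  {foo, "1.2", permanent}]}.
```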
4.4.2 relup
The relup file contains instructions on how to install the new version in the system. It must contain one Erlang term:
    {Vsn, [{FromVsn, Descr, RuScript}], [{ToVsn, Descr, RuScript}]}.
      
Vsn, FromVsn and ToVsn are strings, RuScript is a release upgrade script. Descr is a user-defined parameter, which is not processed by any release handling functions. It can be used to describe the release to an operator. It is also returned by release_handler:install_release/1 and release_handler:check_install_release/1.
There is one tuple {FromVsn, Descr, RuScript} for each old system version which can be upgraded to the new version, and one tuple {ToVsn, Descr, RuScript} for each old version to which the new version can be downgraded.
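A minimal relup might look like the sketch below. It assumes a hypothetical system that goes from version "1.0" to "2.0" and in which only the functional module lists2 of application foo (version "1.1" to "1.2") changes. The scripts use low-level instructions, which is the form the release handler interprets.

```erlang
%% relup for system version "2.0" (versions and descriptions are placeholders)
{"2.0",
 %% Upgrade scripts: one entry per old version that can be upgraded from.
 [{"1.0", "Upgrade: new lists2 in foo-1.2",
   [{load_object_code, {foo, "1.2", [lists2]}},
    point_of_no_return,
    {load, {lists2, soft_purge, soft_purge}}]}],
 %% Downgrade scripts: one entry per old version that can be downgraded to.
 [{"1.0", "Downgrade: back to lists2 in foo-1.1",
   [{load_object_code, {foo, "1.1", [lists2]}},
    point_of_no_return,
    {load, {lists2, soft_purge, soft_purge}}]}]}.
```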
4.4.3 start.boot
The start.boot file is the compiled start.script file. It is used to boot the Erlang machine.
4.4.4 sys.config
The sys.config is the system configuration file.
4.5 Release Handling Principles
The following sections describe the principles for updating parts of an OTP system.
4.5.1 Erlang Code
The code change feature in Erlang is made possible because Erlang allows two versions of a module to be present in the system: the current version and the old version. There is always a current version of a loaded module, but an old version of a module only exists if the module has been replaced in run-time by loading a new version. When a new version is loaded, the previously current version becomes the old version, and the new version becomes the current version. However, if there are both a current and old version of a module, a new version cannot be loaded, unless the old version is first explicitly purged.
A global function call is a call where a qualified module name is used, i.e. the call is of the form M:F(A) (or apply(M, F, A)). A global call causes M:F to be dynamically linked into the run-time code, which means that M:F(A) will be evaluated using the latest available version of the module, i.e. the current version.
A local function call is a call without a qualified module name, i.e. the call is of the form F(A). The reference to F is resolved at compile time (irrespective of whether F is exported or not). By the very nature of F(A) being a local function call, F can only be called by a function that is defined in the very same module as that where F is defined. Hence a local function call is always evaluated in the same version of a module as that of the caller.
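The difference between the two kinds of call can be seen in a small process loop. The module below is a made-up example, not part of OTP. The local calls to loop/1 keep the process in whatever version of the module it is already running, while the fully qualified call moves it to the current version.

```erlang
-module(code_switch_demo).
-export([start/0, loop/1]).

start() ->
    spawn(fun() -> loop(0) end).

loop(Count) ->
    receive
        incr ->
            %% Local call: evaluated in the same version as the caller.
            loop(Count + 1);
        {get, From} ->
            From ! {count, Count},
            loop(Count);
        switch_code ->
            %% Global (fully qualified) call: evaluated in the current
            %% version of the module, whichever version that is.
            ?MODULE:loop(Count)
    end.
```

loop/1 has to be exported for the global call to work. If no new version has been loaded, the switch_code message has no visible effect, because the current version is the one already running.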
A fun is a function without a name. Like ordinary functions (i.e. functions which have names) its implementation is always bound to some module, and therefore funs are affected by code change as well. A reference to a fun is always indirect, as is the case for a global function call, where the reference is M:F (through an export table entry for the module), but the reference is not necessarily global. In fact, if a fun is called in the same module where it is defined, its reference will be resolved in the same way as a local function call is resolved. If a fun is called from a different module, its reference will be resolved as if the call was a global call, but with the additional requirement that the reference also match the particular implementation of the module where the fun was defined.
For each process there is a current function, i.e. the function that the process is currently evaluating. That function resides in some module. Hence a process has always a reference to at least one module. It may of course have references to other modules as well, because of nested, not yet finished calls.
Before a new version of a module can be loaded, the current version must be made old. If there is no old version, the new version is simply loaded: the previously current version becomes the old version, and the new version becomes current. All processes that execute the version that became old continue to do so until they have no unfinished calls within the old version.
If there is an old version, it must first be purged to make room for the current version to become old. However, an old version should not be purged if there are processes that have references to it. Such processes must either be terminated, or the loading of the new version must be postponed until they have terminated by themselves or no longer have references to the old version. There are options for controlling this in release upgrade scripts.
To prevent processes from making calls to other processes during the release installation, they may be suspended. All processes implemented with the standard behaviors, or with sys, can be suspended. When suspended a process enters a special suspend loop instead of its usual main process loop. In the suspend loop, the process can only receive system messages and shut-down messages from its supervisor. The code change message is a special system message, and this message causes the process to change code to the new version, and possibly to transform its internal state. After the code change a process is resumed, i.e. it returns to its main loop.
We highlight here three different types of modules.

Functional module.

A module, which does not contain a process loop, i.e. no process has constant references to this kind of module. lists is an example of a functional module.

Process module.

A module, which contains a process loop, i.e. some process has constant reference to the module. init is an example of a process module.

Call-back module.

A special case of a functional module which serves as a call-back module for a generic behavior such as gen_server. file is an example of a call-back module. A call to a call-back module is always a global call (i.e. it refers to the latest version of the module). This has some impacts upon how updates must be handled.

Modules of the above types are handled differently when changing code.
4.5.1.1 Functional Module
If the API of a new version of a functional module is backward compatible, as may be the case of a bug fix or new functionality, we simply load the new version. After a short while, when no processes have references to the old version, the old module is purged.
A more complicated situation arises if the API of a functional module is changed so that it is no longer backward compatible. We must then make sure that no processes, directly or indirectly, try to call functions that have changed. We do this by writing new versions of all modules that use the API. Then, when performing the code change, all potential caller processes are suspended, new versions of the modules that use the API are loaded, the new version of the functional module is loaded, and finally all suspended processes are resumed.
There are two alternatives available to manage this type of change:

Find all calls to the module, change them, and write dependencies in your release upgrade script. This may be manageable, if a function that has been incompatibly changed is called from only a few other functions.

Avoid this type of change. This is the only reasonable solution, if an incompatible function is called from many other modules. Instead a completely new function should be introduced, and the original function should be kept for backward compatibility. In the next release, when all other modules are changed as well, the original function can be deleted.

4.5.1.2 Process Module
A process module should never contain global calls to itself (except for code that makes explicit code change). Therefore, a new version of a process module is merely loaded and all processes which are executing the module are told to change their code and, if required, to transform their internal state.
In practice, few modules are pure in the sense that they never contain global calls to themselves. If you use higher-order functions such as lists:map/2 in a process module, there will be global calls to the module. Therefore, we cannot merely load the module, because a process that is still running the old version of the module might make a call to the new version, which might be incompatible.
The only safe way to change code for a process module is to have its implementation understand system messages, and to change code by first suspending all processes that run the module, then ordering them to change code, and finally resuming them.
4.5.1.3 Call-back Module
As long as the type of the internal state of a call-back module has not changed, we can simply load the new version of the module without suspending and resuming the processes involved in the code change. This case is similar to the case of a functional module.
If the type of the internal state has changed, we must first suspend the processes, tell them to change code and at the same time give them the possibility to transform their states, and finally resume them. This is similar to the case of a process module.
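To make the state transformation concrete, here is a sketch of a gen_server call-back module whose state changed type between two versions. The module counter_server and both state formats are invented for this illustration: the old version is assumed to have kept a bare integer, and the new version keeps a {Count, History} tuple.

```erlang
-module(counter_server).
-behaviour(gen_server).

-export([start_link/0, incr/0, value/0]).
-export([init/1, handle_call/3, handle_cast/2, code_change/3]).

start_link() -> gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).
incr()       -> gen_server:cast(?MODULE, incr).
value()      -> gen_server:call(?MODULE, value).

%% New state format: {Count, History}.
init([]) -> {ok, {0, []}}.

handle_call(value, _From, {Count, _} = State) -> {reply, Count, State}.

handle_cast(incr, {Count, History}) -> {noreply, {Count + 1, [Count | History]}}.

%% Downgrade: the new code converts its state back to the old format.
code_change({down, _Vsn}, {Count, _History}, _Extra) ->
    {ok, Count};
%% Upgrade: the new code converts the old format (a bare integer).
code_change(_OldVsn, Count, _Extra) when is_integer(Count) ->
    {ok, {Count, []}};
%% Anything else is assumed to be in the new format already.
code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```

Both directions live in the new version of the module, since only the new code knows both formats.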
4.5.1.4 Dependencies Between Processes
It is possible that a group of processes, which communicate, must perform code changes while they are suspended. Some of the processes may otherwise use the old protocol while others use the new protocol. On the other hand, there may be time-out dependencies which restrict the number of processes that can perform a synchronized code change as one set. The more processes that are included in the set, the longer the processes are suspended.
There may also be problems with circular dependencies. The following scenario illustrates this situation.

two modules a and b are dependent on each other,

each module is executed by one process with the same name as the corresponding module,

both are updated at the same time because the internal protocol between them has changed.

The following sequence of events may occur:

a is suspended.

The release handler tries to suspend b, but some microseconds before this happens, b tries to communicate with a, which is now suspended.

If b hangs in its call to a, the suspension of b fails and only a is updated.

If b notices that a does not answer and is able to deal with it, then b receives the suspend message and is suspended. Then both modules are updated and the processes are resumed.

When a resumes, there is a message waiting from b. This message may be of an old format which a does not recognize.

Situations of the type described, and many others, are highly application dependent. The author of the release upgrade script has to predict and avoid them. If the consequences are too difficult to manage, it may be better to entirely shut down and restart all affected processes. This reduces the problem of introducing new code and removes the need to do a synchronized change.
4.5.1.5 Finding Processes
For each application the .appup file specifies how the application is upgraded. The file contains specifications of which modules to change, and how to change them. The relup file is an assembly of all the .appup files.
For each application the release handler searches for all processes that have to perform a code change. It traverses the application supervision tree to find all child specifications of every supervisor in the tree. Each child specification lists all modules of the application that the child uses.
Hence it is by combining the list of modules to change with all children of supervisors that the release handler finds all processes that are subject to code change.
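The module list that the release handler relies on is the modules entry of each child specification. The supervisor below is a hypothetical example using the map form of child specifications; foo_sup and foo_server are made-up names.

```erlang
-module(foo_sup).
-behaviour(supervisor).

-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    Child = #{id       => foo_server,
              start    => {foo_server, start_link, []},
              restart  => permanent,
              shutdown => 5000,
              type     => worker,
              %% Consulted during code change to decide which processes
              %% run a module that is being updated.
              modules  => [foo_server]},
    {ok, {#{strategy => one_for_one, intensity => 1, period => 5}, [Child]}}.
```

If a child's modules entry does not name the module being updated, the process is not found by this search and is therefore not suspended or told to change code.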
4.5.2 Port Programs
A port program runs as an external program in the operating system. The simplest way to do code change for a port program is to terminate it, and then start a new version of it.
If that is not adequate, code change may be performed by sending the port program a message telling it to return any data that must survive the termination. Then the program is terminated, the new version is started, and the surviving data is sent to the new version of the port program.
Changing code for port programs is very application dependent. There is no special support for it in SASL.
4.5.3 Application Specification and Configuration Parameters
In each release, each application specification (i.e. the contents of the .app file of the application) is known to the release handler. Before any code change is performed for an application, the new environment variables are made available for the application, i.e. those parameters specified by the env tag in the application specification. When the new version of an application is running it will be informed of any changed, new or removed environment variables (see application(Module) in the KERNEL Reference Manual). This means that old processes may read new variables before they are informed of the new release. We advise against the immediate removal of the old variables. Neither do we recommend that they be syntactically changed, although they may of course change their values. They can be safely removed in the next release, by which time it is known that no processes will read the old variables.
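The notification takes the form of a call to the optional config_change/3 function in the application call-back module. The sketch below only logs what changed; foo_app and foo_sup are hypothetical names, and a real application would typically forward the new values to the processes that use them.

```erlang
-module(foo_app).
-behaviour(application).

-export([start/2, stop/1, config_change/3]).

start(_Type, _Args) ->
    %% foo_sup is a hypothetical top supervisor of the application.
    foo_sup:start_link().

stop(_State) ->
    ok.

%% Changed: [{Par, Val}] parameters whose values changed.
%% New:     [{Par, Val}] parameters added in the new version.
%% Removed: [Par]        parameters no longer present.
config_change(Changed, New, Removed) ->
    logger:info("config change: changed=~p new=~p removed=~p",
                [Changed, New, Removed]),
    ok.
```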
4.5.4 Mnesia Data or Schema Changes
Changing data or schemas in Mnesia is similar to changing code for functional modules. Many processes may read or write in the same table at the same time. If we change a table definition, we must make sure that all code which uses the table is changed at the same time.
One way of doing it is to let one process be responsible for one or several tables. This process creates the tables and changes the table definitions or table data. In this way a set of tables is connected with a module (process module or call-back module). When the process performs a code change, the tables are changed as well.
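One possible shape for such a table-owning module is sketched below. The table subscriber, its attributes, and the module name are all invented; the example assumes the old table stored {subscriber, Name, Phone} records and that the new release adds an email attribute. The upgrade_table/0 function would be called from the owning process when it performs its code change.

```erlang
-module(foo_table_owner).
-export([upgrade_table/0, add_email/1]).

%% Convert one record from the old layout to the new one.
add_email({subscriber, Name, Phone}) ->
    {subscriber, Name, Phone, undefined}.

%% Rewrite every record in the table and register the new attribute list.
%% Assumes Mnesia is running and the subscriber table exists.
upgrade_table() ->
    {atomic, ok} = mnesia:transform_table(subscriber,
                                          fun add_email/1,
                                          [name, phone, email]),
    ok.
```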
4.5.5 Upgrade vs. Downgrade
When a new release is installed, the system is upgraded to the new release. The release handler reads the relup file of the new release, and finds the upgrade script that corresponds to an upgrade from the current version to the new version of the system.
When an old release is reinstalled, the release handler reads the relup in the current release, and finds the downgrade script that corresponds to a downgrade from the current version to the old version of the system.
Usually a relup file for a new release contains one upgrade script and one downgrade script for each old version. If a soft downgrade is not wanted (an alternative is to reboot the system from the old release) the downgrade script is left out.
For each modified module in the new release, there are some instructions that specify how to install that module in a system. When performing an upgrade, the following steps are typically involved:

Suspend the processes running the module.

Load the new code.

Tell the processes to switch to new code.

Tell the processes to change the internal state. This usually involves calling, in the new module, a code_change function that is responsible for state updates, e.g. transforming the state from the old format to the new.

Resume the processes.

The code change step is always performed when new code has been loaded and all processes are running the new code. The reason for this is that it is always the new version of the module that knows how to change the state from the old version.
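For a single process that handles system messages, the upgrade steps correspond roughly to the following calls. This is a hand-written approximation for illustration only, not what the release handler literally executes; the module and function names are hypothetical, and the new object code is assumed to be first in the code path.

```erlang
-module(manual_upgrade).
-export([upgrade/4]).

%% Pid:    a process implemented with a standard behavior or with sys.
%% Mod:    the module whose code is being replaced.
%% OldVsn: version of the old code, passed on to the code change function.
%% Extra:  any term, passed on to the code change function.
upgrade(Pid, Mod, OldVsn, Extra) ->
    %% Suspend the process.
    ok = sys:suspend(Pid),
    %% Remove a leftover old version, if any. Note that code:purge/1
    %% kills processes that still run that old version.
    _ = code:purge(Mod),
    %% Load the new code; the running version becomes old.
    {module, Mod} = code:load_file(Mod),
    %% Switch to the new code and transform the internal state.
    ok = sys:change_code(Pid, Mod, OldVsn, Extra),
    %% Resume the process.
    ok = sys:resume(Pid).
```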
When performing a downgrade the situation is different. The old module does not know how to transform the new state to the old version: the new format is unknown to the old code. Therefore, it is the responsibility of new code to revert the state back to the old version during downgrade. The following steps are involved:

Suspend the processes running the module.

Tell the processes to change the internal state. This usually involves calling, in the current module, a code_change function that is responsible for state reversals, i.e. transforming the state from the current format to the old.

Load the new code.

Tell the processes to switch code.

Resume the processes.

We note that for a process module, it is possible to load the code before a process changes its internal state (since a process module never contains global calls to itself), thus making the steps needed for downgrade almost the same as for upgrade. The difference between the two cases is still in the order of switching code and changing state.
For a call-back module it is not actually necessary to tell the processes to switch code, since all calls to the call-back module are global calls. The difference between upgrade and downgrade is still in the order of loading code and performing state change.
The difference between how process modules and call-back modules are handled in the downgrade case comes from the fact that a process module never contains global calls to itself. The code is thus static in the sense that a process executing a process module does not spontaneously switch to newly loaded code. The opposite situation is a dynamic module, where a process executing the module spontaneously switches to the new code when it is loaded. A call-back module is always dynamic, and a process module static. A functional module is always dynamic.
4.6 Release Handling Instructions
This section describes the release upgrade and downgrade scripts. A script is a list of instructions which are interpreted by the release handler when an upgrade or downgrade is made.
There are two levels of instructions: the high-level instructions and the low-level instructions. High- and low-level instructions may be mixed in one script. However, the high-level instructions are translated to low-level instructions by the systools:make_relup/3 command, because the release handler understands only low-level instructions.
Scripts have to be placed in the .appup file for each application. systools:make_relup/3 assembles the scripts in all .appup files to form a relup file containing low-level instructions.
4.6.1 High-level Instructions
The high-level instructions are:

{update, Module, Change, PrePurge, PostPurge, [Mod]} |
{update, Module, Timeout, Change, PrePurge, PostPurge, [Mod]} |
{update, Module, ModType, Timeout, Change, PrePurge, PostPurge, [Mod]}

Module = atom()

Timeout = default | infinity | int() > 0

ModType = static | dynamic

Change = soft | {advanced, Extra}

PrePurge = soft_purge | brutal_purge

PostPurge = soft_purge | brutal_purge

Mod = atom(). If the module is dependent on changes in other modules, these other modules are listed here.

The instruction is used to update a process module or a call-back module. All processes that run the code of Module are suspended, and if the change is advanced they have to transform their states into the new states. Then the processes are resumed. If Module is dependent on other modules, the release handler will suspend processes in Module before suspending processes in the [Mod] modules. In case of circular dependencies, it will suspend processes in the order that update instructions appear in the script.
soft means backwards compatible changes and advanced means internal data changes, or changes which are not backwards compatible. Extra is any term, which is used in the argument list of the code_change function in Module (call-back module); otherwise it becomes part of a code change message (process module).
The optional parameter Timeout defines the time-out for the call to sys:suspend. It specifies how long to wait for a process to handle a suspend message and to get suspended. If no value is specified (or default is given), the default value defined in sys is used.
The optional parameter ModType specifies if the code is static or dynamic, as defined in Upgrade vs. Downgrade above. It needs to be specified only in the case of soft downgrades. Its value defaults to dynamic. Note: if this parameter is specified, Timeout is needed as well.
PrePurge controls what action to take with processes that are executing an old version of this module. These are processes, which are left since an earlier release upgrade (or downgrade). Usually there are no such processes. If the value is soft_purge and such processes are found, the release will not be installed and the install_release/1 function returns {error, {old_processes, Module}}. If the value is brutal_purge, the processes which run old code are killed.
PostPurge controls what action to take with processes that are executing old code when the new module has been installed. If the value is soft_purge, the release handler will purge the old code when no remaining processes execute the code. If the value is brutal_purge, the code is purged when the release is made permanent, and all processes that are still running old code are killed.
The update instruction can also be used for functional modules. However, no processes will be suspended because no process will have the functional module as its main module. Therefore, no processes perform code change.

{load_module, Module, PrePurge, PostPurge, [Mod]}

Module = atom().

PrePurge = soft_purge | brutal_purge

PostPurge = soft_purge | brutal_purge

Mod = atom(). If the module is dependent on changes in other modules, these other modules are listed here.

The instruction is used to update a functional module or a call-back module. It only loads the module. A call-back module which must perform a code change, or synchronize by being suspended, should use update instead.
The object code is fetched at the beginning of the release upgrade, but the module is loaded when this instruction occurs.

{add_module, Mod} The instruction adds a new module to the system. It loads the module.

{remove_application, Appl} Removes an application. It calls application:stop and application:unload for the application.

{add_application, Appl} Adds a new application. It calls application:load and application:start for the application.

{restart_application, Appl} Restarts an existing application. The current version of the application is stopped and removed, and the new version of the application is loaded and started. The instruction is useful when the simplest way to change code for an application is to stop and restart the whole application.
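As an example of how these instructions are used, an .appup file for the application foo could look as follows. The file layout is {Vsn, [{UpFromVsn, Instructions}], [{DownToVsn, Instructions}]}. The module names are illustrative: lists2 is assumed to be a functional module and foo_server a call-back module whose internal state changes between the versions.

```erlang
%% foo.appup for version "1.2" (module names are placeholders)
{"1.2",
 %% Upgrade from "1.1"
 [{"1.1",
   [{load_module, lists2, soft_purge, soft_purge, []},
    {update, foo_server, {advanced, []}, soft_purge, soft_purge, [lists2]}]}],
 %% Downgrade to "1.1"
 [{"1.1",
   [{update, foo_server, {advanced, []}, soft_purge, soft_purge, [lists2]},
    {load_module, lists2, soft_purge, soft_purge, []}]}]}.
```

The term [] in {advanced, []} is the Extra value handed to the code_change function of foo_server.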

4.6.2 Low-level instructions
The low-level instructions are:

{load_object_code, {Lib, LibVsn, [Module]}} Reads each Module from the library Lib-LibVsn as a binary. It does not install the code, it just reads the files. The instruction should be placed first in the script in order to read all new code from file. This makes the suspend-load-resume cycle less time consuming. After this instruction has been executed, the code server is updated with the new version of Lib. Calls to code:priv_dir(Lib) which are made after this instruction return the new priv dir.
Lib is typically the application name.

point_of_no_return If a crash occurs after this instruction, the system cannot recover and is restarted from the old version. The instruction must only occur once in a script. It should be placed after all load_object_code operations and after user defined checks, which are performed with apply. The function check_install_release/1 tries to evaluate all instructions before this command occurs in the script. Therefore, user defined checks must not have side effects, as they may be evaluated many times.

{load, {Module, PrePurge, PostPurge}} Before this instruction occurs, the Module object code must have been loaded with the load_object_code instruction. This instruction makes code out of the binary. PrePurge = soft_purge | brutal_purge, and PostPurge = soft_purge | brutal_purge.

{remove, {Module, PrePurge, PostPurge}} Makes the current version of Module old. When it has been executed, there is no current version in the system. PrePurge = soft_purge | brutal_purge, and PostPurge = soft_purge | brutal_purge.

{purge, [Module]} Kills all processes that run the old versions of the modules in [Module] and deletes all old versions.

{suspend, [Module | {Module, Timeout}]} Tries to suspend all processes that execute Module. If a process does not respond, it is ignored. This may cause the process to die, either because it crashes when it spontaneously switches to new code, or as a result of a purge operation. If no Timeout is specified (or if default is given), the default time-out defined in the module sys is used.

{code_change, [{Module, Extra}]} | {code_change, Mode, [{Module, Extra}]} This instruction sends a code_change system message using the function change_code in the module sys with the Extra argument to the suspended processes that run this code. Mode is either up or down. Default is up. In case of an upgrade, the message is sent to the suspended process after the new code is loaded (the new version must contain functions to convert from the old internal state to the new internal state). In case of a downgrade, the message is sent to the suspended process before the new code is loaded (the current version must contain functions to convert from the current internal state to the old internal state).
Module uses the Extra argument internally in its code change function. Refer to the Reference Manual, module sys for further details.
One of the arguments to the function sys:change_code is OldVsn. In the case of an upgrade it obtains its value from the attribute vsn in the old code, or undefined if no such attribute was defined. In the case of downgrade, it is the tuple {down, Vsn}, where Vsn is the version of the module as defined in the .app file, or undefined otherwise.

{resume, [Module]} Resumes all previously suspended processes which execute in any of the modules in the list [Module].

{stop, [Module]} Stops all processes which are in any of the modules in the list [Module]. The instruction is useful when the simplest way to change code for the [Module] is to stop and restart the processes which run the code. If a supervisor is stopped, all its children are stopped as well.

{start, [Module]} Starts all previously stopped processes which are in any member of [Module]. The processes will regain their positions in the supervision tree.

{sync_nodes, Id, [Node] | {M, F, A}} If {M, F, A} is specified, apply(M, F, A) is evaluated and must return a list of nodes. The instruction synchronizes the release installation with other nodes. Each node in the list of nodes must evaluate this command, with the same Id. The local node waits for all other nodes to evaluate the instruction before execution continues. In case a node goes down, it is considered to be an unrecoverable error, and the local node is restarted from the old release. There is no time-out for this instruction, which implies that it may hang forever if a user-defined apply enters an infinite loop at some node. It is up to the user to ensure that the apply command eventually returns or makes the node crash.

{apply, {M, F, A}} Applies the function to the arguments. If the instruction appears before the point_of_no_return instruction, a failure of the call apply(M, F, A) is caught, causing release_handler:install_release/1 to return {error, {'EXIT', Reason}}. If {error, Error} is thrown or returned by M:F, install_release/1 returns {error, Error}.
If the instruction appears after the point_of_no_return instruction, and the call apply(M, F, A) fails, the system is restarted.
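For illustration, a function intended for use before point_of_no_return might look like the following sketch. The module pre_check, the function require_env/2 and the parameter name data_dir are hypothetical:

```erlang
-module(pre_check).

-export([require_env/2]).

%% Hypothetical pre-upgrade check for use in an instruction such as
%%   {apply, {pre_check, require_env, [foo, data_dir]}}
%% Returns ok if the configuration parameter Key is set for App,
%% and {error, Reason} otherwise.
require_env(App, Key) ->
    case application:get_env(App, Key) of
        {ok, _Value} -> ok;
        undefined -> {error, {missing_env, App, Key}}
    end.
```

Placed before point_of_no_return, a returned {error, {missing_env, foo, data_dir}} would be passed on as the return value of install_release/1, as described above.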

restart_new_emulator Shuts down the current emulator and starts a new one. All processes are terminated gracefully. The new release must still be made permanent when the new emulator is up and running. Otherwise, the old emulator is started in case of an emulator restart. This instruction should be used when a new emulator is introduced, or if a complete reboot of the system should be done.
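Making the release permanent is done with release_handler:make_permanent/1. As a sketch, assuming the new release has the hypothetical version string "1.2", the call could be evaluated in the shell of the restarted node once the system has been checked:

```erlang
%% Evaluate on the restarted node, after verifying that it works.
%% "1.2" is a hypothetical release version, used for illustration only.
ok = release_handler:make_permanent("1.2").
```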

4.7 Release Handling Examples
This section includes several examples that show how different types of upgrades are handled. In call-back modules having the gen_server behavior, all call-back functions have been provided for reasons of clarity.
4.7.1 Update of Erlang Code
Several update examples are shown. Unless otherwise stated, it is assumed that all original modules are in the application foo, version "1.1", and the updated version is "1.2".
4.7.1.1 Simple Functional Module
This example is about a pure functional module, that is, a module whose functions have no side effects. The original version of the module lists2 has the following contents:
-module(lists2).
-vsn(1).

-export([assoc/2]).

assoc(Key, [{Key, Val} | _]) -> {ok, Val};
assoc(Key, [H | T]) -> assoc(Key, T);
assoc(Key, []) -> false.
The new version of the module adds a new function:
-module(lists2).
-vsn(2).

-export([assoc/2, multi_map/2]).

assoc(Key, [{Key, Val} | _]) -> {ok, Val};
assoc(Key, [H | T]) -> assoc(Key, T);
assoc(Key, []) -> false.

multi_map(Func, [[] | ListOfLists]) -> [];
multi_map(Func, ListOfLists) ->
    %% Apply Func to the heads of all lists, then recurse on the tails.
    [apply(Func, lists:map(fun erlang:hd/1, ListOfLists)) |
     multi_map(Func, lists:map(fun erlang:tl/1, ListOfLists))].
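To illustrate what multi_map/2 computes, the following self-contained sketch repeats the function in a hypothetical module multi_map_demo (written with fun erlang:hd/1 and fun erlang:tl/1) and adds an assumed helper sum_columns/1 that adds three lists element by element:

```erlang
-module(multi_map_demo).

-export([multi_map/2, sum_columns/1]).

%% Stop as soon as the first list is exhausted.
multi_map(_Func, [[] | _ListOfLists]) -> [];
%% Apply Func to the heads of all lists, then recurse on the tails.
multi_map(Func, ListOfLists) ->
    [apply(Func, lists:map(fun erlang:hd/1, ListOfLists)) |
     multi_map(Func, lists:map(fun erlang:tl/1, ListOfLists))].

%% Hypothetical helper: element-wise sum of exactly three lists.
sum_columns(Lists) ->
    multi_map(fun(A, B, C) -> A + B + C end, Lists).
```

For example, multi_map_demo:sum_columns([[1, 2], [10, 20], [100, 200]]) returns [111, 222]. All lists are assumed to have the same length.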
The release upgrade instructions are:
[{load_module, lists2, soft_purge, soft_purge, []}]
        
Alternatively, the low-level instructions are:
[{load_object_code, {foo, "1.2", [lists2]}},
 point_of_no_return,
 {load, {lists2, soft_purge, soft_purge}}]
        
4.7.1.2 A More Complicated Functional Module
Here we have a functional module bar that uses the module lists2 of the previous example. The original version is only dependent on the original version of lists2.
-module(bar).
-vsn(1).

-export([simple/1, complicated_sum/1]).

simple(X) ->
    case lists2:assoc(simple, X) of
        {ok, Val} -> Val;
        false -> false
    end.

complicated_sum([X, Y, Z]) -> cs(X, Y, Z).

cs([HX | TX], [HY | TY], [HZ | TZ]) ->
    NewRes = cs(TX, TY, TZ),
    [HX + HY + HZ | NewRes];
cs([], [], []) -> [].
The new version of bar uses the new functionality of lists2 in order to simplify the implementation of the useful function complicated_sum/1. It does not change its API in any way.
-module(bar).
-vsn(2).

-export([simple/1, complicated_sum/1]).

simple(X) ->
    case lists2:assoc(simple, X) of
        {ok, Val} -> Val;
        false -> false
    end.

complicated_sum(X) ->
    lists2:multi_map(fun(A,B,C) -> A+B+C end, X).
The release upgrade instructions, including instructions for lists2, are as follows:
[{load_module, lists2, soft_purge, soft_purge, []},
 {load_module, bar, soft_purge, soft_purge, [lists2]}]
        
We must state that bar is dependent on lists2 so that the release handler loads lists2 before it loads bar.

The low-level instructions are:
[{load_object_code, {foo, "1.2", [lists2, bar]}},
 point_of_no_return,
 {load, {lists2, soft_purge, soft_purge}},
 {load, {bar, soft_purge, soft_purge}}]
        
4.7.1.3 Advanced Functional Module
Suppose now that we modify the return value of lists2:assoc/2 from {ok, Val} to {Key, Val}. In order to do an upgrade, we would have to find all modules that call lists2:assoc/2 directly or indirectly, and specify that these modules are dependent on lists2. In practice this might be an unwieldy task if many other modules use the lists2 module, and the only reasonable way to perform the upgrade may then be to restart the whole system.
If we insist on doing a soft upgrade, the modification should be made backward compatible by introducing a new function (assoc2/2, say) that has the new return value, and not making any changes to the original function at all.
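A backward compatible version of lists2 could be sketched as follows. The version number 3 and the body of assoc2/2 are assumptions made for illustration; multi_map/2 is left out for brevity:

```erlang
-module(lists2).
-vsn(3).

-export([assoc/2, assoc2/2]).

%% Unchanged: existing callers still get {ok, Val} or false.
assoc(Key, [{Key, Val} | _]) -> {ok, Val};
assoc(Key, [_ | T]) -> assoc(Key, T);
assoc(_Key, []) -> false.

%% New function with the new return value {Key, Val}.
assoc2(Key, [{Key, Val} | _]) -> {Key, Val};
assoc2(Key, [_ | T]) -> assoc2(Key, T);
assoc2(_Key, []) -> false.
```

Callers can then be moved to assoc2/2 one module at a time, each with an ordinary load_module instruction that lists lists2 as a dependency.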
4.7.1.4 Advanced gen_server
This example assumes that we have a gen_server process that must be updated because we have introduced a new function, and added a new data field in our internal state. The contents of the original module are as follows:
-module(gs1).
-vsn(1).
-behaviour(gen_server).

-export([get_data/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, 
         terminate/2, code_change/3]).

-record(state, {data}).

get_data() -> 
    gen_server:call(gs1, get_data).

init([Data]) ->
    {ok, #state{data = Data}}.

handle_call(get_data, _From, State) ->
    {reply, {ok, State#state.data}, State}.

handle_cast(_Request, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
The new module must translate the old state into the new state. Recall that a record is just syntactic sugar for a tuple:
-module(gs1).
-vsn(2).
-behaviour(gen_server).

-export([get_data/0, get_time/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, 
         terminate/2, code_change/3]).

-record(state, {data, time}).

get_data() -> 
    gen_server:call(gs1, get_data).

get_time() -> 
    gen_server:call(gs1, get_time).

init([Data]) ->
    {ok, #state{data = Data, time = erlang:time()}}.

handle_call(get_data, _From, State) ->
    {reply, {ok, State#state.data}, State};
handle_call(get_time, _From, State) ->
    {reply, {ok, State#state.time}, State}.

handle_cast(_Request, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(1, {state, Data}, _Extra) ->
    {ok, #state{data = Data, time = erlang:time()}}.
The release upgrade instructions are as follows:
[{update, gs1, {advanced, []}, soft_purge, soft_purge, []}]
        
The alternative low-level instructions are:
[{load_object_code, {foo, "1.2", [gs1]}},
 point_of_no_return,
 {suspend, [gs1]},
 {load, {gs1, soft_purge, soft_purge}},
 {code_change, [{gs1, []}]},
 {resume, [gs1]}]
        
If we want to handle soft downgrade as well, the code would be as follows:
-module(gs1).
-vsn(2).
-behaviour(gen_server).

-export([get_data/0, get_time/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, 
         terminate/2, code_change/3]).

-record(state, {data, time}).

get_data() -> 
    gen_server:call(gs1, get_data).
get_time() -> 
    gen_server:call(gs1, get_time).

init([Data]) ->
    {ok, #state{data = Data, time = erlang:time()}}.

handle_call(get_data, _From, State) ->
    {reply, {ok, State#state.data}, State};
handle_call(get_time, _From, State) ->
    {reply, {ok, State#state.time}, State}.

handle_cast(_Request, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(1, {state, Data}, _Extra) ->
    {ok, #state{data = Data, time = erlang:time()}};
code_change({down, 1}, #state{data = Data}, _Extra) ->
    {ok, {state, Data}}.
Note that we take care of translating the new state to the old format as well. The low-level instructions for the downgrade, where the code_change message is sent before the code is loaded, are:
[{load_object_code, {foo, "1.2", [gs1]}},
 point_of_no_return,
 {suspend, [gs1]},
 {code_change, [{gs1, []}]},
 {load, {gs1, soft_purge, soft_purge}},
 {resume, [gs1]}]
        
4.7.1.5 Advanced gen_server with Dependencies
This example assumes that we have a gen_server process that uses gs1 as defined in the previous example.
The contents of the original module are as follows:
-module(gs2).
-vsn(1).
-behaviour(gen_server).

-export([is_operation_ok/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, 
         terminate/2, code_change/3]).

is_operation_ok(Op) -> 
    gen_server:call(gs2, {is_operation_ok, Op}).

init([Data]) ->
    {ok, []}.

handle_call({is_operation_ok, Op}, _From, State) ->
    Data = gs1:get_data(),
    Reply = lists2:assoc(Op, Data),
    {reply, Reply, State}.

handle_cast(_Request, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
The new version does not have to transform the internal state, hence the code_change/3 function is not really needed (it will not be called since the upgrade of gs2 is soft).
-module(gs2).
-vsn(2).
-behaviour(gen_server).

-export([is_operation_ok/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, 
         terminate/2, code_change/3]).

is_operation_ok(Op) -> 
    gen_server:call(gs2, {is_operation_ok, Op}).

init([Data]) ->
    {ok, []}.

handle_call({is_operation_ok, Op}, _From, State) ->
    Data = gs1:get_data(),
    Time = gs1:get_time(),
    Reply = do_things(lists2:assoc(Op, Data), Time),
    {reply, Reply, State}.

handle_cast(_Request, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.

do_things({ok, Val}, Time) ->
    Val;
do_things(false, Time) ->
    {false, Time}.

    
The release upgrade instructions are:
[{update, gs1, {advanced, []}, soft_purge, soft_purge, []},
 {update, gs2, soft, soft_purge, soft_purge, [gs1]}]
        
The corresponding low-level instructions are:
[{load_object_code, {foo, "1.2", [gs1, gs2]}},
 point_of_no_return,
 {suspend, [gs1, gs2]},
 {load, {gs1, soft_purge, soft_purge}},
 {load, {gs2, soft_purge, soft_purge}},
 {code_change, [{gs1, []}]},    % No gs2 here!
 {resume, [gs1, gs2]}]
        
4.7.1.6 Other Worker Processes
All other worker processes in a supervision tree, such as processes of the types gen_event, gen_fsm, and processes implemented by using proc_lib and sys, are handled in exactly the same way as processes of type gen_server are handled. Examples follow.
4.7.1.7 Simple gen_event
This example shows how an event handler may be updated. We do not make any assumptions about which event manager processes the handler is installed in; it is the responsibility of the release handler to find them. The contents of the original module are as follows:
-module(ge_h).
-vsn(1).
-behaviour(gen_event).

-export([get_events/1]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2, 
         terminate/2, code_change/3]).

get_events(Mgr) -> 
    gen_event:call(Mgr, ge_h, get_events).

init(_) -> {ok, undefined}.

handle_event(Event, _LastEvent) -> 
    {ok, Event}.

handle_call(get_events, LastEvent) -> 
    {ok, [LastEvent], LastEvent}.

handle_info(Info, LastEvent) ->
    {ok, LastEvent}.

terminate(Arg, LastEvent) ->
    ok.

code_change(_OldVsn, LastEvent, _Extra) ->
    {ok, LastEvent}.
The new module decides to keep the two latest events in a list and must translate the old state into the new state.
-module(ge_h).
-vsn(2).
-behaviour(gen_event).

-export([get_events/1]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2, 
         terminate/2, code_change/3]).

get_events(Mgr) -> 
    gen_event:call(Mgr, ge_h, get_events).

init(_) -> {ok, []}.

handle_event(Event, []) -> 
    {ok, [Event]};
handle_event(Event, [Event1 | _]) -> 
    {ok, [Event, Event1]}.

handle_call(get_events, Events) -> 
    {ok, Events, Events}.

handle_info(Info, Events) ->
    {ok, Events}.

terminate(Arg, Events) ->
    ok.

code_change(1, undefined, _Extra) -> 
    {ok, []};
code_change(1, LastEvent, _Extra) -> 
    {ok, [LastEvent]}.
The release upgrade instructions are:
[{update, ge_h, {advanced, []}, soft_purge, soft_purge, []}]
        
The low-level instructions are:
[{load_object_code, {foo, "1.2", [ge_h]}},
 point_of_no_return,
 {suspend, [ge_h]},
 {load, {ge_h, soft_purge, soft_purge}},
 {code_change, [{ge_h, []}]},
 {resume, [ge_h]}]
        
These instructions are identical to those used for the gen_server.

4.7.1.8 Process Implemented with sys and proc_lib
Processes implemented with sys and proc_lib are changed in the same way as processes that are implemented according to the gen_server behavior (which should not come as a surprise, since gen_server et al. are implemented on top of sys and proc_lib). However, the code change function is defined differently. The original is as follows:
-module(sp).
-vsn(1).

-export([start/0, get_data/0]).
-export([init/1, system_continue/3, system_terminate/4]).

-record(state, {data}).

start() ->
    Pid = proc_lib:spawn_link(?MODULE, init, [self()]),
    {ok, Pid}.

get_data() ->
    sp_server ! {self(), get_data},
    receive
        {sp_server, Data} -> Data
    end.

init(Parent) ->
    register(sp_server, self()),
    process_flag(trap_exit, true),
    loop(#state{}, Parent).

loop(State, Parent) ->
    receive
        {system, From, Request} ->
            sys:handle_system_msg(Request, From, Parent, ?MODULE, [], State);
        {'EXIT', Parent, Reason} ->
            cleanup(State),
            exit(Reason);
        {From, get_data} ->
            From ! {sp_server, State#state.data},
            loop(State, Parent);
        _Any ->
            loop(State, Parent)
    end.

cleanup(State) -> ok.

%% Here are the sys call back functions
system_continue(Parent, _, State) ->
    loop(State, Parent).

system_terminate(Reason, Parent, _, State) ->
    cleanup(State),
    exit(Reason).
The new code, which takes care of both upgrade and downgrade, is as follows:
-module(sp).
-vsn(2).

-export([start/0, get_data/0, set_data/1]).
-export([init/1, system_continue/3, system_terminate/4, 
        system_code_change/4]).

-record(state, {data, last_pid}).

start() ->
    Pid = proc_lib:spawn_link(?MODULE, init, [self()]),
    {ok, Pid}.

get_data() ->
    sp_server ! {self(), get_data},
    receive
        {sp_server, Data} -> Data
    end.

set_data(Data) ->
    sp_server ! {self(), set_data, Data}.

init(Parent) ->
    register(sp_server, self()),
    process_flag(trap_exit, true),
    loop(#state{last_pid = no_one}, Parent).

loop(State, Parent) ->
    receive
        {system, From, Request} ->
            sys:handle_system_msg(Request, From, Parent, 
                                  ?MODULE, [], State);
        {'EXIT', Parent, Reason} ->
            cleanup(State),
            exit(Reason);
        {From, get_data} ->
            From ! {sp_server, State#state.data},
            loop(State, Parent);
        {From, set_data, Data} ->
            loop(State#state{data = Data, last_pid = From}, Parent);
        _Any ->
            loop(State, Parent)
    end.

cleanup(State) -> ok.

%% Here are the sys call back functions
system_continue(Parent, _, State) ->
    loop(State, Parent).

system_terminate(Reason, Parent, _, State) ->
    cleanup(State),
    exit(Reason).

system_code_change({state, Data}, _Mod, 1, _Extra) ->
    {ok, #state{data = Data, last_pid = no_one}};
system_code_change(#state{data = Data}, _Mod, {down, 1}, _Extra) ->
    {ok, {state, Data}}.
The release upgrade instructions are:
[{update, sp, static, default, {advanced, []}, soft_purge, soft_purge, []}]
        
The low-level instructions are the same for upgrade and downgrade:
[{load_object_code, {foo, "1.2", [sp]}},
 point_of_no_return,
 {suspend, [sp]},
 {load, {sp, soft_purge, soft_purge}},
 {code_change, [{sp, []}]},
 {resume, [sp]}]
        
4.7.1.9 Supervisor
This example assumes that a new version of an application adds a new process, and deletes one process from a supervisor. The original code is as follows:
-module(sup).
-vsn(1).
-behaviour(supervisor).
-export([init/1]).

init([]) ->
    SupFlags = {one_for_one, 4, 3600},
    Server = {my_server, {my_server, start_link, []},
              permanent, 2000, worker, [my_server]},
    GS1 = {gs1, {gs1, start_link, []}, permanent, 2000, worker, [gs1]},  
    {ok, {SupFlags, [Server, GS1]}}.
The new code is as follows:
-module(sup).
-vsn(2).
-behaviour(supervisor).
-export([init/1]).

init([]) ->
    SupFlags = {one_for_one, 4, 3600},
    GS1 = {gs1, {gs1, start_link, []}, permanent, 2000, worker, [gs1]},  
    GS2 = {gs2, {gs2, start_link, []}, permanent, 2000, worker, [gs2]},  
    {ok, {SupFlags, [GS1, GS2]}}.
The release upgrade instructions are:
[{update, sup, {advanced, []}, soft_purge, soft_purge, []},
 {apply, {supervisor, terminate_child, [sup, my_server]}},
 {apply, {supervisor, delete_child, [sup, my_server]}},
 {apply, {supervisor, restart_child, [sup, gs2]}}]
        
The low-level instructions are:
[{load_object_code, {foo, "1.2", [sup]}},
 point_of_no_return,
 {suspend, [sup]},
 {load, {sup, soft_purge, soft_purge}},
 {code_change, [{sup, []}]},
 {resume, [sup]},
 {apply, {supervisor, terminate_child, [sup, my_server]}},
 {apply, {supervisor, delete_child, [sup, my_server]}},
 {apply, {supervisor, restart_child, [sup, gs2]}}]
        
The high-level update instruction for a supervisor is mapped to a low-level advanced code change instruction. In the code_change function of the supervisor, the new child specification is installed, but no children are explicitly terminated or started. Therefore, children must be terminated, deleted and started by using the apply instruction.
4.7.1.10 Complex Dependencies
As already mentioned, sometimes the simplest and safest way to introduce a new release is to terminate parts of the system, load the new code, and restart those parts. However, individual processes cannot simply be killed, since their supervisors will restart them again. Instead, supervisors must first be ordered to stop their children before new code can be loaded. Then the supervisors are ordered to restart their children. All this is done by issuing the stop and start instructions.
The following example assumes that we have a supervisor a with two children b and c, where b is a worker and c is a supervisor for d. We want to restart all processes except for a. The upgrade instructions are as follows:
[{load_object_code, {foo, "1.2", [b,c,d]}},
 point_of_no_return,
 {stop, [b, c]},
 {load, {b, soft_purge, soft_purge}},
 {load, {c, soft_purge, soft_purge}},
 {load, {d, soft_purge, soft_purge}},
 {start, [b, c]}]
        
We do not need to explicitly stop d; this is done by the supervisor c.

A whole application cannot be stopped and started with the stop and start instructions. The instruction restart_application has to be used instead.
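As a sketch, assuming that it is the application foo from the earlier examples that should be stopped and started again, the high-level instruction could be written as:

```erlang
%% Assumption: the whole application foo is to be restarted.
[{restart_application, foo}]
```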
4.7.1.11 New Application
The examples shown so far have dealt with changing an existing application. In order to introduce a completely new application we just need an add_application instruction, but we also have to make sure that the boot file of the new release contains enough to start it. The following example shows how to introduce the application new_appl, which has just one module: new_mod.
The release upgrade instructions are:
[{add_application, new_appl}]
        
The corresponding low-level instructions are as follows (note that the application specification is passed as the first argument in the call to application:start):
[{load_object_code, {new_appl, "1.0", [new_mod]}},
 point_of_no_return,
 {load, {new_mod, soft_purge, soft_purge}},
 {apply, {application, start,
           [{application, new_appl,
             [{description, "NEW APPL"},
              {vsn, "1.0"},
              {modules, [new_mod]},
              {registered, []},
              {applications, [kernel, foo]},
              {env, []},
              {mod, {new_mod, start_link, []}}]},
            permanent]}}].
        
4.7.1.12 Remove an Application
An application is removed in the same way as new applications are introduced. This example assumes that we want to remove the new_appl application:
[{remove_application, new_appl}]
        
The corresponding low-level instructions are:
[point_of_no_return,
 {apply, {application, stop, [new_appl]}},
 {remove, {new_mod, soft_purge, soft_purge}}].
        
4.7.2 Update of Port Programs
Each port program is controlled by an Erlang process called the port controller. A port program is updated by the port controller process. It is always done by terminating the old port program, and starting the new one.
4.7.2.1 Port Controller
In this example we have a port controller process, where we must take care of the termination and restart of the port program ourselves. We also prepare for the possibility of changing only the Erlang code of the port controller. The gen_server behavior is used to implement the port controller. The contents of the original module are as follows:
-module(portc).
-vsn(1).
-behaviour(gen_server).

-export([get_data/0]).
-export([init/1, handle_call/3, handle_info/2, code_change/3]).

-record(state, {port, data}).

get_data() -> gen_server:call(portc, get_data).

init([]) ->
    PortProg = code:priv_dir(foo) ++ "/bin/portc",
    Port = open_port({spawn, PortProg}, [binary, {packet, 2}]),
    {ok, #state{port = Port}}.

handle_call(get_data, _From, State) ->
    {reply, {ok, State#state.data}, State}.

handle_info({Port, Cmd}, State) ->
    NewState = do_cmd(Cmd, State),
    {noreply, NewState}.

code_change(_, State, change_port_only) ->
    Port = State#state.port,
    Port ! {self(), close},            % ask the old port program to close
    receive
        {Port, closed} -> true         % wait for the acknowledgement
    end,
    NPortProg = code:priv_dir(foo) ++ "/bin/portc",   % get new version
    NPort = open_port({spawn, NPortProg}, [binary, {packet, 2}]),
    {ok, State#state{port = NPort}}.
To change the port program without changing the Erlang code, we can use the following code:
[point_of_no_return,
 {suspend, [portc]},
 {code_change, [{portc, change_port_only}]},
 {resume, [portc]}]
        
Here we used low-level instructions only. In this example we also make use of the Extra argument of the code_change/3 function.
Suppose now that we wish to change only the Erlang code. The new version of portc is as follows:
-module(portc).
-vsn(2).
-behaviour(gen_server).

-export([get_data/0]).
-export([init/1, handle_call/3, handle_info/2, code_change/3]).

-record(state, {port, data}).

get_data() -> gen_server:call(portc, get_data).

init([]) ->
    PortProg = code:priv_dir(foo) ++ "/bin/portc",
    Port = open_port({spawn, PortProg}, [binary, {packet, 2}]),
    {ok, #state{port = Port}}.

handle_call(get_data, _From, State) ->
    {reply, {ok, State#state.data}, State}.

handle_info({Port, Cmd}, State) ->
    NewState = do_cmd(Cmd, State),
    {noreply, NewState}.

code_change(_, State, change_port_only) ->
    Port = State#state.port,
    Port ! {self(), close},            % ask the old port program to close
    receive
        {Port, closed} -> true         % wait for the acknowledgement
    end,
    NPortProg = code:priv_dir(foo) ++ "/bin/portc",   % get new version
    NPort = open_port({spawn, NPortProg}, [binary, {packet, 2}]),
    {ok, State#state{port = NPort}};
code_change(1, State, change_erl_only) ->
    NState = transform_state(State),
    {ok, NState}.
The high-level instruction is:
[{update, portc, {advanced, change_erl_only}, soft_purge, soft_purge, []}]
        
The corresponding low-level instructions are:
[{load_object_code, {foo, "1.2", [portc]}},
 point_of_no_return,
 {suspend, [portc]},
 {load, {portc, soft_purge, soft_purge}},
 {code_change, [{portc, change_erl_only}]},
 {resume, [portc]}]