6 Types and Function Specifications
6.1 The Erlang Type Language
Erlang is a dynamically typed language. Still, it comes with a notation for declaring that a set of Erlang terms forms a particular type, that is, a specific sub-type of the set of all Erlang terms.
Subsequently, these types can be used to specify types of record fields and the argument and return types of functions.
Type information can be used to document function interfaces, provide more information for bug detection tools such as Dialyzer, and can be exploited by documentation tools such as Edoc for generating program documentation of various forms. It is expected that the type language described in this document will supersede and replace the purely comment-based @type and @spec declarations used by Edoc.
6.2 Types and their Syntax
Types describe sets of Erlang terms. Types consist of, and are built from, a set of predefined types (e.g. integer(), atom(), pid(), ...) described below. A predefined type represents a typically infinite set of Erlang terms which belong to this type. For example, the type atom() stands for the set of all Erlang atoms.
For integers and atoms, we allow for singleton types (e.g. the integers -1 and 42 or the atoms 'foo' and 'bar'). All other types are built using unions of either predefined types or singleton types. In a type union between a type and one of its sub-types the sub-type is absorbed by the super-type and the union is subsequently treated as if the sub-type was not a constituent of the union. For example, the type union:
atom() | 'bar' | integer() | 42
describes the same set of terms as the type union:
atom() | integer()
Because of sub-type relations that exist between types, types form a lattice where the topmost element, any(), denotes the set of all Erlang terms and the bottom-most element, none(), denotes the empty set of terms.
The set of predefined types and the syntax for types is given below:
Type :: any()            %% The top type, the set of all Erlang terms
      | none()           %% The bottom type, contains no terms
      | pid()
      | port()
      | reference()
      | []               %% nil
      | Atom
      | Bitstring
      | float()
      | Fun
      | Integer
      | List
      | Tuple
      | Union
      | UserDefined      %% described in Section 6.3

Atom :: atom()
      | Erlang_Atom      %% 'foo', 'bar', ...

Bitstring :: <<>>
           | <<_:M>>             %% M is a positive integer
           | <<_:_*N>>           %% N is a positive integer
           | <<_:M, _:_*N>>

Fun :: fun()                     %% any function
     | fun((...) -> Type)        %% any arity, returning Type
     | fun(() -> Type)
     | fun((TList) -> Type)

Integer :: integer()
         | Erlang_Integer                    %% ..., -1, 0, 1, ... 42 ...
         | Erlang_Integer..Erlang_Integer    %% specifies an integer range

List :: list(Type)                           %% Proper list ([]-terminated)
      | improper_list(Type1, Type2)          %% Type1=contents, Type2=termination
      | maybe_improper_list(Type1, Type2)    %% Type1 and Type2 as above
      | nonempty_list(Type)                  %% Proper non-empty list

Tuple :: tuple()                             %% stands for a tuple of any size
       | {}
       | {TList}

TList :: Type
       | Type, TList

Union :: Type1 | Type2
The general form of bitstrings is <<_:M, _:_*N>>, where M and N are non-negative integers. It denotes a bitstring that is M + (k*N) bits long (i.e., a bitstring that starts with M bits and continues with k segments of N bits each, where k is a non-negative integer). The notations <<_:_*N>>, <<_:M>>, and <<>> are convenient shorthands for the cases where M, N, or both, respectively, are zero; when M or N is written out, it must be positive.
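As a rough illustration of the bitstring notation, the sketch below declares a type for a 16-bit header followed by whole bytes. The module name bits_demo, the type packet() and the function header/1 are hypothetical names chosen for this example only.

```erlang
-module(bits_demo).
-export([header/1]).

%% Hypothetical type: 16 leading bits, then segments of 8 bits each.
-type packet() :: <<_:16, _:_*8>>.

%% Extracts the 16-bit header as an unsigned integer.
-spec header(packet()) -> 0..65535.
header(<<H:16, _Rest/binary>>) -> H.
```

The spec is documentation for tools such as Dialyzer; at run time the function is guarded only by its binary pattern.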
Because lists are commonly used, they have shorthand type notations. The types list(T) and nonempty_list(T) have the shorthands [T] and [T,...], respectively. The only difference between the two shorthands is that [T] may be an empty list but [T,...] may not.
Notice that the shorthand for list(), i.e. the list of elements of unknown type, is [_] (or [any()]), not []. The notation [] specifies the singleton type for the empty list.
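The difference between the two list shorthands can be sketched as follows. The module list_types_demo and its functions are hypothetical and serve only to show where each notation fits.

```erlang
-module(list_types_demo).
-export([first/1, total/1]).

%% [T,...] is shorthand for nonempty_list(T): the contract documents
%% that callers are expected to pass at least one element.
-spec first([integer(),...]) -> integer().
first([H | _]) -> H.

%% [T] is shorthand for list(T) and includes the empty list.
-spec total([integer()]) -> integer().
total(Ints) -> lists:sum(Ints).
```

Calling first([]) is not rejected by the compiler; it fails at run time with a function_clause error, and a tool such as Dialyzer may report the call as a contract violation.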
For convenience, the following types are also built-in. They can be thought of as predefined aliases for the type unions shown in the table.
Built-in type             Defined as
term()                    any()
binary()                  <<_:_*8>>
bitstring()               <<_:_*1>>
boolean()                 'false' | 'true'
byte()                    0..255
char()                    0..16#10ffff
number()                  integer() | float()
list()                    [any()]
maybe_improper_list()     maybe_improper_list(any(), any())
nonempty_list()           nonempty_list(any())
string()                  [char()]
nonempty_string()         [char(),...]
iodata()                  iolist() | binary()
iolist()                  maybe_improper_list(byte() | binary() | iolist(), binary() | [])
module()                  atom()
mfa()                     {atom(),atom(),arity()}
arity()                   0..255
node()                    atom()
timeout()                 'infinity' | non_neg_integer()
no_return()               none()
In addition, the following three built-in types exist and can be thought of as defined below, though strictly their "type definition" is not valid syntax according to the type language defined above.
Built-in type             Can be thought of as defined by the syntax
non_neg_integer()         0..
pos_integer()             1..
neg_integer()             ..-1
Users are not allowed to define types with the same names as the predefined or built-in ones. This is checked by the compiler and its violation results in a compilation error.
The following built-in list types also exist, but they are expected to be rarely used. Hence, they have long names:
nonempty_maybe_improper_list() :: nonempty_maybe_improper_list(any(), any())
nonempty_improper_list(Type1, Type2)
nonempty_maybe_improper_list(Type1, Type2)
where the last two types define the set of Erlang terms one would expect.
Also for convenience, we allow for record notation to be used. Records are just shorthands for the corresponding tuples.
Record :: #Erlang_Atom{} | #Erlang_Atom{Fields}
Records have been extended to possibly contain type information. This is described in the sub-section "Type information in record declarations" below.
6.3 Type declarations of user-defined types
As seen, the basic syntax of a type is an atom followed by closed parentheses. New types are declared using '-type' and '-opaque' compiler attributes as in the following:
-type my_struct_type() :: Type.
-opaque my_opaq_type() :: Type.
where the type name is an atom ('my_struct_type' in the above) followed by parentheses. Type is a type as defined in the previous section. A current restriction is that Type can contain only predefined types, or user-defined types which are either module-local (i.e., with a definition that is present in the code of the module) or remote types (i.e., types defined in and exported by other modules; see below). For module-local types, the requirement that their definition exists in the module is enforced by the compiler; a violation results in a compilation error. (A similar restriction currently exists for records.)
Type declarations can also be parameterized by including type variables between the parentheses. The syntax of type variables is the same as Erlang variables (starts with an upper case letter). Naturally, these variables can - and should - appear on the RHS of the definition. A concrete example appears below:
-type orddict(Key, Val) :: [{Key, Val}].
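A parameterized type is instantiated by supplying concrete types for its variables. The following sketch assumes a hypothetical module orddict_type_demo with a hypothetical lookup/2 function; only the orddict(Key, Val) declaration comes from the text above.

```erlang
-module(orddict_type_demo).
-export([lookup/2]).

%% Parameterized type from the text: a list of key/value pairs.
-type orddict(Key, Val) :: [{Key, Val}].

%% The type is instantiated with atom() keys and integer() values.
-spec lookup(atom(), orddict(atom(), integer())) -> {ok, integer()} | error.
lookup(Key, Dict) ->
    case lists:keyfind(Key, 1, Dict) of
        {Key, Val} -> {ok, Val};
        false -> error
    end.
```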
A module can export some types in order to declare that other modules are allowed to refer to them as remote types. This declaration has the following form:
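-export_type([T1/A1, ..., Tk/Ak]).
where each Ti is an atom (the name of the type) and each Ai is the number of its type parameters. For example:
-export_type([my_struct_type/0, orddict/2]).
Assuming that these types are exported from module 'mod', other modules can refer to them using remote type expressions such as:
mod:my_struct_type()
mod:orddict(atom(), term())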
Types declared as opaque represent sets of terms whose structure is not supposed to be visible in any way outside of their defining module (i.e., only the module defining them is allowed to depend on their term structure). Consequently, such types do not make much sense as module local - module local types are not accessible by other modules anyway - and should always be exported.
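One possible shape for a module that defines and exports an opaque type is sketched below. The module demo_counter, the type counter() and its tagged-tuple representation are assumptions made for this illustration.

```erlang
-module(demo_counter).
-export([new/0, increment/1, value/1]).
-export_type([counter/0]).

%% The representation (a tagged tuple) is meant to be private to this module.
-opaque counter() :: {counter, non_neg_integer()}.

-spec new() -> counter().
new() -> {counter, 0}.

-spec increment(counter()) -> counter().
increment({counter, N}) -> {counter, N + 1}.

-spec value(counter()) -> non_neg_integer().
value({counter, N}) -> N.
```

Other modules would refer to the type as demo_counter:counter() and go through new/0, increment/1 and value/1 rather than matching on the tuple. Opacity is not enforced at run time; it is up to analysis tools such as Dialyzer to report code that inspects the structure from outside.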
6.4 Type information in record declarations
The types of record fields can be specified in the declaration of the record. The syntax for this is:
-record(rec, {field1 :: Type1, field2, field3 :: Type3}).
For fields without type annotations, their type defaults to any(). I.e., the above is a shorthand for:
-record(rec, {field1 :: Type1, field2 :: any(), field3 :: Type3}).
In the presence of initial values for fields, the type must be declared after the initialization as in the following:
-record(rec, {field1 = [] :: Type1, field2, field3 = 42 :: Type3}).
Naturally, the initial values for fields should be compatible with (i.e. a member of) the corresponding types. This is checked by the compiler and results in a compilation error if a violation is detected. For fields without initial values, the singleton type 'undefined' is added to all declared types. In other words, the following two record declarations have identical effects:
-record(rec, {f1 = 42 :: integer(),
              f2      :: float(),
              f3      :: 'a' | 'b'}).

-record(rec, {f1 = 42 :: integer(),
              f2      :: 'undefined' | float(),
              f3      :: 'undefined' | 'a' | 'b'}).
For this reason, it is recommended that records contain initializers, whenever possible.
Any record, containing type information or not, once defined, can be used as a type using the syntax:
#rec{}
In addition, the record fields can be further specified when using a record type by adding type information about the field in the following manner:
#rec{some_field :: Type}
Any unspecified fields are assumed to have the type in the original record declaration.
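The two uses of a record as a type can be sketched as follows. The record point and the module record_type_demo are hypothetical; every field has an initializer, as recommended above.

```erlang
-module(record_type_demo).
-export([origin/0, abs_x/1]).

-record(point, {x = 0 :: integer(),
                y = 0 :: integer()}).

%% The record used as a type with its declared field types.
-spec origin() -> #point{}.
origin() -> #point{}.

%% The x field is refined in the return type; y keeps its declared type.
-spec abs_x(#point{}) -> #point{x :: non_neg_integer()}.
abs_x(P = #point{x = X}) -> P#point{x = abs(X)}.
```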
6.5 Specifications for functions
A specification (or contract) for a function is given using the new compiler attribute '-spec'. The general format is as follows:
-spec Module:Function(ArgType1, ..., ArgTypeN) -> ReturnType.
The arity of the function has to match the number of arguments, or else a compilation error occurs.
This form can also be used in header files (.hrl) to declare type information for exported functions. Then these header files can be included in files that (implicitly or explicitly) import these functions.
For most uses within a given module, the following shorthand suffices:
-spec Function(ArgType1, ..., ArgTypeN) -> ReturnType.
Also, for documentation purposes, argument names can be given:
-spec Function(ArgName1 :: Type1, ..., ArgNameN :: TypeN) -> RT.
A function specification can be overloaded. That is, it can have several types, separated by a semicolon (;):
-spec foo(T1, T2) -> T3
       ; (T4, T5) -> T6.
A current restriction, which results in a compiler warning (note: not an error), is that the domains of the argument types cannot overlap. For example, the following specification results in a warning:
-spec foo(pos_integer()) -> pos_integer()
       ; (integer()) -> integer().
Type variables can be used in specifications to specify relations for the input and output arguments of a function. For example, the following specification defines the type of a polymorphic identity function:
-spec id(X) -> X.
However, note that the above specification does not restrict the input and output type in any way. We can constrain these types by guard-like subtype constraints and provide bounded quantification:
-spec id(X) -> X when X :: tuple().
Currently, the :: constraint (read as is_subtype) is the only guard constraint which can be used in the 'when' part of a '-spec' attribute.
The above function specification, using multiple occurrences of the same type variable, provides more type information than the function specification below where the type variables are missing:
-spec id(tuple()) -> tuple().
The latter specification says that the function takes some tuple and returns some tuple, while the one with the X type variable specifies that the function takes a tuple and returns the same tuple.
However, it's up to the tools that process the specs to choose whether to take this extra information into account or ignore it.
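As a small sketch of type variables with subtype constraints, the hypothetical module spec_demo below pairs the id/1 contract from the text with a second, illustrative function swap/1 that relates two variables between argument and result.

```erlang
-module(spec_demo).
-export([id/1, swap/1]).

%% Same variable in argument and result: the tuple is returned unchanged.
-spec id(X) -> X when X :: tuple().
id(T) when is_tuple(T) -> T.

%% Two constrained variables: the result holds the same elements, swapped.
-spec swap({A, B}) -> {B, A} when A :: term(), B :: term().
swap({A, B}) -> {B, A}.
```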
The scope of a :: constraint is the (...) -> RetType specification after which it appears. To avoid confusion, we suggest that different variables are used in different constituents of an overloaded contract as in the example below:
-spec foo({X, integer()}) -> X when X :: atom()
       ; ([Y]) -> Y when Y :: number().
For backwards compatibility the following form is also allowed:
-spec id(X) -> X when is_subtype(X, tuple()).
but its use is discouraged. It will be taken out in a future Erlang/OTP release.
Some functions in Erlang are not meant to return, either because they define servers or because they are used to throw exceptions, as in the function below:
my_error(Err) -> erlang:throw({error, Err}).
For such functions we recommend the use of the special no_return() type for their "return", via a contract of the form:
-spec my_error(term()) -> no_return().
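Put together in a module, the contract might be used as sketched below. The module no_return_demo and the function safe_div/2 are hypothetical; my_error/1 and its contract are taken from the text.

```erlang
-module(no_return_demo).
-export([safe_div/2]).

%% Never returns normally: it always throws.
-spec my_error(term()) -> no_return().
my_error(Err) -> erlang:throw({error, Err}).

%% The first clause contributes no return value, so the contract
%% only needs to mention float().
-spec safe_div(number(), number()) -> float().
safe_div(_A, B) when B == 0 -> my_error(division_by_zero);
safe_div(A, B) -> A / B.
```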