The journey of Maps and this EEP has been long and by no means a straight-forward and continuous one. I had a crystal clear picture of what I wanted Maps to be when we first started discussing it within OTP about two-three years ago. This EEP resembles that vision but it has had a lot of contributions of other ideas from both within and outside of OTP.
The idea was a data-type, a syntax aware mapping of key-value associations with pattern matching. A syntax similar to records but without the hassle of compile-time dependency and with arbitrary terms as keys. Order was not important and it could be implemented with a Hash-Array-Mapped-Trie with good performance and memory trade-offs. This was a different approach than to replace records. It was meant to replace records where suitable and in other regards not be a replacement but its own thing.
From the community there has been many wishes of a Map like data-type and a few suggestions. The one suggestion that stands out is of course the Frames proposal from Richard O’Keefe. It is the most complete proposal I’ve seen and is very well thought out. Its goal is to be a record replacement and the proposal satisfies this goal very well.
A record replacement is just that, a replacement. It’s like asking the question, “What do we have?” instead of “What can we get?” The instant rebuttal would be “What do we need?” I say Maps.
Frames has certainly inspired and influenced Maps. In many regards Maps also encompasses Frames but Maps tries to do more. In the end I believe they are two different things and have different goals.
This EEP suggests a new built in data-type for Erlang, the map,
#{ Key => Value }
.
The new data-type shall have semantics, syntax and operations that:
Similar data-types exists in other languages, i.e. perl hashes, ruby hashes, python dictionaries, or scala maps.
A map M
consists of a number of associations and keeps an association
from key terms K1..Kn
to value terms V1..Vn
where no two keys match.
Any term, compound or otherwise, is a viable key or value. Terms of type Map
are recognized by guard tests erlang:is_map/1
. There are no operators
acting on maps. Within maps there are two infix operators. An association
operator, =>
, pairs a key to a value and is used in creation and updates.
A set-value operator, :=
, is used to update a value on an already
existing and matching key. The set-value operator is also used in matching
to get the associated value from a key.
The size of a map is the number of associations in its set.
An association is a key-value pair of key K to value V in a Map.
Two keys, K1
and K2
are matching if, true = K1 =:= K2
.
Defined syntax for declaring, updating and matching maps.
Constructing a new map is done by letting an expression K
be associated to
another expression V
:
#{ K => V }
New maps may include multiple associations at construction by listing every association:
#{ K1 => V1, .. Kn => Vn }
An empty map is constructed by not associating any terms with each other:
#{}
All keys and values in the map are terms. Any expression is first evaluated and then the resulting terms are used as key and value respectively.
Keys and values are separated by the =>
arrow and associations are
separated by ,
.
Examples:
M0 = #{}, % empty map
M1 = #{ a => <<"hello">> }, % single association with literals
M2 = #{ 1 => 2, b => b }, % multiple associations with literals
M3 = #{ A => B }, % single association with variables
M4 = #{ {A, B} => f() }. % compound key associated to an evaluated expression
where, A
and B
are any expressions and M0
through M4
are the resulting
map terms.
If two matching keys are declared, the latter key will take precedent.
Example:
1> #{ 1 => a, 1 => b }.
#{ 1 => b }
2> #{ 1.0 => a, 1 => b }.
#{ 1 => b, 1.0 => a }
The order in which the expressions constructing the keys and their associated values are evaluated is not defined. The syntactic order of the key-value pairs in the construction is of no relevance, except in the above mentioned case of two matching keys.
A simple BNF grammar for the construction follows:
<map-construct> ::= '#{' <key-value-exprs> '}'
<key-value-exprs> ::= /* empty */
| <key-value-list>
<key-value-list> ::= <key-value-assoc>
| <key-value-assoc> ',' <key-value-list>
<key-value-assoc> ::= <expr> '=>' <expr>
<expr> ::= <Erlang-expression>
Updating a map has similar syntax as constructing it.
An expression defining the map to be updated is put in front of the expression defining the keys to be updated and their respective values.
M#{ K => V }
where M
is an term of type map and K
and V
are any expression.
If key K
does not match any existing key in the map, a new association
will be created from key K
to value V
. If key K
matches an existing
key in map M
its associated value will be replaced by the new value V
. In both
cases the evaluated map expression will return a new map.
If M
is not of type map an exception of type badmap
is thrown.
To only update an existing value, the following syntax is used,
M#{ K := V }
where M
is an term of type map, V
is an expression and K
is an expression
which evaluates to an existing key in M
.
If key K
does not match any existing keys in map M
an exception of type
badarg
will be triggered at runtime. If a matching key K
is present in
map M
its associated value will be replaced by the new value V
and the
evaluated map expression returns a new map.
If M
is not of type map an exception of type badmap
is thrown.
Examples:
M0 = #{},
M1 = M0#{ a => 0 },
M2 = M1#{ a => 1, b => 2 },
M3 = M2#{ "function" => fun() -> f() end },
M4 = M3#{ a := 2, b := 3 }. % 'a' and 'b' was added in `M1` and `M2`.
where M0
is any map. It follows that M1 .. M4
are maps as well.
More Examples:
1> M = #{ 1 => a }.
#{ 1 => a }
2> M#{ 1.0 => b }.
#{ 1 => a, 1.0 => b }.
3> M#{ 1 := b }.
#{ 1 => b }
4> M#{ 1.0 := b }.
** exception error: bad argument
As in construction, the order in which the key and value expressions are evaluated are not defined. The syntactic order of the key-value pairs in the update is of no relevance, except in the case where two keys match, in which case the latter value is used.
A simple BNF grammar for map updates follows:
<map-construct> ::= <map-expr> '#{' <key-value-exprs> '}'
<key-value-exprs> ::= /* empty */
| <key-value-list>
<key-value-list> ::= <key-value>
| <key-value> ',' <key-value-list>
<key-value> ::= <key-value-assoc>
| <key-value-exact>
<key-value-assoc> ::= <expr> '=>' <expr>
<key-value-exact> ::= <expr> ':=' <expr>
<map-expr> ::= <Erlang expression evaluating to a term of type map>
<expr> ::= <Erlang expression>
For accessing single values in maps, let us use an de-association:
V = M#{ K }.
Where M
is a Map and K
is any term.
If key K
matches to an existing key in map M
the associated value
will be bound to V
. If key K
does not match to any existing key in
map M
an exception badarg
will occur in runtime.
Examples:
M1 = #{ a => 1, c => 3 },
3 = M1#{ c }.
M2 = #{ 1.0 => a },
a = M2#{ 1 }.
Matching of key-value associations from maps, exemplified with the matching operator, is done in the following way:
#{ K := V } = M
where M
is any map. The key K
has to be an expression with bound variables
or a literals, and V
can be any pattern with either bound or unbound variables.
If variables in V
are unbound, it will be bound to the value associated
with the key K
, which has to exist in the map M
. If variables in V
are
bound, it has to match the value associated with K
in M
.
Example:
M = #{ a => {1,2}},
#{ a := {1,B}} = M.
This will bind variable B
to integer 2
.
Similarly, multiple values from the map may be matched:
#{ K1 := V1, .., Kn := Vn } = M
where keys K1 .. Kn
are any expressions with literals or bound variables. If all
keys exists in map M
all variables in V1 .. Vn
will be matched to the
associated values of there respective keys.
If the matching conditions are not met, the match will fail, either with
badmatch
exception, if used in the context of the matching operator
as in the example,Matching in maps only allows for :=
as delimiters of associations.
The order in which keys are declared in matching has no relevance.
Duplicate keys are allowed in matching and will match each pattern associated to the keys.
#{ K := V1, K := V2 } = M
Matching an expression against an empty map literal will match its type but no variables will be bound:
#{} = Expr
This expression will match if the expression Expr
is of type map, otherwise
it will fail with an exception badmatch
.
The grammar for the matching syntax is similar to that of construction.
Matching of literals as keys are allowed in function heads.
%% only start if not_started
handle_call(start, From, #{ state := not_started } = S) ->
...
{reply, ok, S#{ state := start }};
%% only change if started
handle_call(change, From, #{ state := start } = S) ->
...
{reply, ok, S#{ state := changed }};
More matching syntax, calculating frequency of terms in a list.
freq(Is) -> freq(Is, #{}).
freq([I|Is], #{I := C} = M) -> freq(Is, M#{ I := C + 1});
freq([I|Is], M) -> freq(Is, M#{ I => 1 });
freq([], M) -> maps:to_list(M).
Equivalent code with gb_trees
for comparison:
freq(Is) -> freq(Is, gb_trees:empty()).
freq([I|Is], T) ->
case gb_trees:lookup(I, T) of
none -> freq(Is, gb_trees:enter(I, 1), T);
{value, V} -> freq(Is, gb_trees:enter(I, V + 1, T))
end;
freq([], T) -> gb_trees:to_list(T).
Old API’s could be refined to use map syntax:
1> {ok, #{ type := Type, mtime := Mtime }} = file:read_file_info(File).
2> io:format("type: ~p, mtime: ~p~n", [Type, Mtime]).
type: regular, mtime: {{2012,7,18},{19,59,18}}
ok
3>
Map comprehension declaration:
M1 = #{ E0 => E1 || K := V <- M0 }
where M0
is any Map, E0
and E1
are any erlang expression, K
and V
constitutes the pattern to be matched by each association in M0
.
For each sequence in the generator an association is created from the evaluated
expression E0
to the evaluated expression E1
.
If M0
is not a Map, then a runtime exception of type {bad_generator, M0}
will be generated.
A simple BNF grammar for map comprehension follows:
<comprehension> ::= '#{' <key-value-assoc> '||' <comprehension-exprs> '}'
<comprehension-exprs> ::= <comprehension-expr>
| <comprehension-exprs> ',' <comprehension-expr>
<comprehension-expr> ::= <generator>
| <filter>
<generator> ::= <key-value-exact> '<-' <expr>
<filter> ::= <expr>
<key-value-assoc> ::= <expr> '=>' <expr>
<key-value-exact> ::= <expr> ':=' <expr>
<expr> ::= <Erlang expression>
Each association from all generators, which satisfies the filters, has an environment that consist of the initial environment and the environment for the association.
Examples:
M0 = #{ K => V*2 || K := V <- map() },
M1 = #{ I => f(I) || I <- list() },
M2 = #{ K => V || <<L:8,K:L/binary,V/float>> <= binary() }.
Map generators may also be used in binary and list comprehensions.
Examples:
B1 = << <<V:8>> || _ := V <- map() >>,
L1 = [ {K,V} || K := V <- map() ].
Keys known before hand can be specified directly and uniquely for a map.
-spec func(Opt, M) -> #{ 'status' => S, 'c' => integer() } when
Opt :: 'inc' | 'dec',
M :: #{ 'status' => S, 'c' => integer() },
S :: 'update' | 'keep'.
func(inc, #{ status := update, c := C} = M) -> M#{ c := C + 1};
func(dec, #{ status := update, c := C} = M) -> M#{ c := C - 1};
func(_, #{ status := keep } = M) -> M.
It could also be specified by type only.
-spec plist_to_map(Ls) -> #{ binary() => integer() } when
Ls :: [{binary(), integer()}].
plist_to_map([], M) ->
M;
plist_to_map([{K,V}|Ls], M) when is_binary(K), is_integer(V) ->
plist_to_map(Ls, M#{ K => V });
plist_to_map([_|Ls], M) ->
plist_to_map(Ls, M).
It can similarly be specified as a type.
-type map1() :: #{ binary() => integer() }.
-type map2() :: #{ <<"list1">> | <<"list2">> => [numbers()] }.
The module implementing the functions is currently named in plural, maps
in the
same spirit as lists
, gb_trees
, sets
etc but the singular map
is shorter
and may be more desirable.
Functions and semantics for maps. Originally much inspiration was from Richard O’Keefes frames proposal.
erlang:is_map(M :: term()) -> boolean().
#
This function is a guard function.
Syntax equivalence: try #{} = M, true catch error:{badmatch,_} -> false end
.
The function returns true
if M is a map otherwise false
.
erlang:map_size(M :: map()) -> non_neg_integer().
#
This function is a guard function.
Syntax equivalence: none.
The function returns the number of key-value pairs in the map. This operation happens in constant time.
Same as maps:size(M)
.
maps:remove(K0 :: term(), M0 :: map()) -> M1 :: map().
#
Syntax equivalence: #{ K => V || K := V <- M0, K =/= K0 }
.
Only with comprehensions
The function removes the key K0
, if it exists, and its associated value from
map M0
and returns a new map M1
without key K0
.
Same as, maps:from_list([{K,V}||{K,V} <- maps:to_list(M0), K =/= K0])
maps:get(K :: term(), M :: map()) -> V :: term().
#
Syntax equivalence: M#{ K }
.
Returns the value V
associated with key K
if map M
contains a key
that matches K
. If no value is associated with key K
then the call will
fail with an exception.
maps:keys(M :: map()) -> [K1, .., Kn].
#
Syntax equivalence: [K || K := _ <- M]
.
Returns a complete list of Keys, in arbitrary order, which resides within
map M
.
Same as, [K || {K,_} <- maps:to_list(M)]
.
maps:find(K :: term(), M :: map()) -> {ok, V :: term()} | error.
#
Syntax equivalence: try #{ K := V } = M, V catch error:{badmatch,_} -> error end
.
Returns a tuple {ok, V}
with value V
associated with key K
if map M
contains key K
. If no value is associated with key K
then the function
will return error
.
maps:fold(F :: fun((K :: term(), V :: term(), In :: term()) -> Out :: term()), A :: term(), M :: map()) -> Result :: term().
#
Syntax equivalence: none
Calls F(K, V, In)
for every key K
to value V
association in map M
in
arbitrary order. The function fun F/3
must return a new accumulator
which is passed to the next successive call. maps:fold/3
returns the final
value of the accumulator. The initial accumulator value A
is returned if
the map is empty.
Same as, lists:foldl(fun({K,V}, In) -> F(K,V,In) end, A, maps:to_list(M))
.
maps:from_list([{K1,V1}, .., {Kn,Vn}]) -> M :: map().
#
Syntax equivalence: #{ K1 => V1, .., Kn => Vn }
The function takes a list of key-value tuples elements and builds a map. The associations may be in any order and both keys and values in the association may be of any term. If the same key appears more than once, the latter (rightmost) value is used and the previous values are ignored.
maps:is_key(K :: term(), M :: map()) -> bool().
#
Syntax equivalence: try #{ K := _ } = M, true catch error:{badmatch, _} -> false end
.
Returns true
if map M
contains a key that matches K
.
maps:map(F :: function(), M0 :: map()) -> M1 :: map().
#
Syntax equivalence: #{ K => F(K,V) || K := V <- M }
.
Only with comprehensions
The function produces a new map M1
by calling the function fun F(K, V)
for
every key K
to value V
association in map M0
in arbitrary order.
The function fun F/2
must return the value to be associated with key K
for
the new map M1
.
Same as, maps:from_list(lists:map(fun({K,V}) -> {K, F(K,V)} end, maps:to_list(M)))
.
maps:new() -> M :: map().
#
Syntax equivalence: #{}
.
Returns a new empty map.
Same as, maps:from_list([])
.
maps:size(M :: map()) -> Size :: non_neg_integer().
#
Syntax equivalence: none.
The function returns the number of key-value associations in the map. This operation happens in constant time.
Same as erlang:map_size(M)
.
maps:put(K :: term(), V :: term(), M0 :: map()) -> M1 :: map().
#
Syntax equivalence: M0#{ K => V }
.
Associates key K
with value V
and inserts the association into map M0
.
If a key exists that matches K
, the old associated value is
replaced by value V
.
The function returns a new map M1
containing the new association.
Same as, maps:from_list(maps:to_list(M0) ++ [{K,V}])
.
maps:to_list(M :: map()) -> [{K1,V1}, ..., {Kn,Vn}].
#
Syntax equivalence: [{K, V} || K := V <- M]
.
Where the pairs, [{K1,V1}, ..., {Kn,Vn}]
, are returned in arbitrary order.
maps:update(K :: term(), V :: term, M0 :: map()) -> M1 :: map()
#
Syntax equivalence: M0#{ K := V }
.
If a key exists that matches K
, the old associated value is
replaced by value V
. The function returns a new map M1
containing
the new associated value.
Same as, maps:from_list(maps:to_list(M0) ++ [{K,V}])
.
maps:values(M :: map()) -> [V1, .., Vn].
#
Syntax equivalence: [V || _ := V <- M]
.
Returns a complete list of values, in arbitrary order, contained in map M
.
Same as, [V || {_,V} <- maps:to_list(M)]
.
maps:without([K1, .., Kn] = Ks, M0 :: map()) -> M1 :: map().
#
Syntax equivalence: #{ K => V || K := V <- M0, not lists:member(K, Ks) }
.
Only with comprehensions
Removes keys K1
through Kn
, and their associated values, from map M0
and
returns a new map M1
.
Same as, maps:from_list([{K,V}||{K,V} <- maps:to_list(M0), not lists:member(K, Ks)])
.
maps:merge(M0 :: map(), M1 :: map()) -> M2 :: map().
#
Syntax equivalence: none
Merges two maps into a single map. If two matching keys exists in both maps the
value in map M0
will be superseded by the value in map M1
.
In the case of term A
and term B
both are maps,
A
and B
have different sizes, then A
and B
are not equal.A
and B
are pair-
wise equal with their corresponding values, then A
and B
are equal.A
and B
are not equal.It follows that two maps are equal if, and only if, they are of the same, type, size and all corresponding key-value associations are pairwise equal.
The term order is defined in Erlang specification 4.7 and quoted below:
The terms are primarily ordered according to their type, in the following order:
numbers < atoms < refs < ports < PIDs < tuples < empty list < conses < binary
The specification is incomplete here, the actual term order is:
numbers < atoms < refs < funs < ports < pids < tuples < empty list < conses < binaries
The Maps data-type are ordered next after tuples:
numbers < atoms < refs < funs < ports < pids < tuples < maps < empty list < conses < binaries
----
Maps are then ordered first by their size and then according to their respective keys and lastly by the associated values in Erlang term order.
Given two maps, M1
and M2
, with the same size, they are compared
so that each key, in Erlang term order of the keys, in M1
is compared to the corresponding key of M2
. All keys are
compared first, then the values, until a difference is found. If a key
or value differs, the order of the respective terms, in Erlang term
order, is the order of the maps. If no key-value pairs differ, the
maps are considered equal.
Example:
> #{ b => 2 } > #{ a => 2 }. % b > a
true
> #{ b => 2 } > #{ a => 1, b => 2 }. % size 1 < size 2
false
> #{ b => 1 } > #{ b => 2}. % 1 < 2
false
> #{ b => 2, c => 3 } > #{ b => 1, d => 3}. % c > d, compared before 2 and 1
false
> #{ b => 1 } > #{ 1 => 1}. % b > 1
true
> #{ 1.0 => a } == #{ 1 => a }. % 1.0 == 1
true
> #{ 1.0 => a } =:= #{ 1 => a }. % 1.0 =:= 1
false
Maps are printed with keys in arbitrary order.
Map association operator and set-value operator is ordered last,
after match-operator and catch
.
:
#
Unary + - bnot not
/ * div rem band and Left associative
+ - bor bxor bsl bsr or xor Left associative
++ -- Right associative
== /= =< < >= > =:= =/=
andalso
orelse
= ! Right associative
catch
=> :=
It follows that the map expression:
#{ key => C = 1 + 2 }
will evaluate in the following order:
#{ key => ( C = ( 1 + 2 ) ) }
Pattern matching is very powerful Erlang tool. Maps introduces a couple of new features with its pattern matching.
We will exemplify using the match operator.
Pattern matching with maps is similar to records on the surface. Keys requested in a LHS pattern will be bound with the values which is found in the RHS association map.
1> #{ a := V } = #{ a => 1 }.
#{ a => 1 }
2> 1 = V.
1
Keys requested in a LHS pattern which is not found in the RHS map will produce
an exception, exception error: no match of right hand side value ...
.
1> #{ not_in_map := V } = #{ a => 1 }.
** exception error: no match of right hand side value #{ a => 1 }
Similarly, if a value for a requested key in the LHS pattern does not match the keys associated value in the RHS map the match will produce an exception.
1> #{ a := 10 } = #{ a => 1 }.
** exception error: no match of right hand side value #{ a => 1 }
Only the keys requested will bind to associations. Any unrequested keys which resides within the map being matched will be ignored.
1> #{ a := V1, b := V2 } = #{ a => 1, b => 2, c => 3}.
#{ a => 1, b => 2, c => 3 }
2> 1 = V1.
1
3> 2 = V2.
2
The order of keys requested has no significance when pattern matching a map.
1> #{ a := "1", b := "2" } = #{ a => "1", b => "2" }.
#{ a => "1", b => "2" }
2> #{ b := "2", a := "1" } = #{ a => "1", b => "2" }.
#{ a => "1", b => "2" }
The example below is a constructed example to illustrate the power of map pattern matching.
A match expression is evaluated so that variables used as keys in the expression are bound before they are evaluated (if possible).
As an example, keys can be bound by other key-value associations.
1> #{ K := V, id := K } = M = #{ id => b, a => 1, b => 2, c => 3}.
#{ id => b, a => 1, b => 2, c => 3}
2> b = K.
b
3> 2 = V.
2
In this case, the bound key id
is evaluated first and looked up in
M, binding the variable K
. The K
bound to b
can then be used to
bind V
to 2.
Binding variables used as keys requires that there is a possible order of binding without cycles. The reordering extends to all terms in a matching expression, so that:
1> { #{ X := Y }, X } = { #{ 5 => 10 }, 5 }.
with X
and Y
unbound, results in a successful match binding X
to 5 and
Y
to 10.
This is particular useful when updating specifics in map associations:
%% Function declared in module map_example
update_values([{K, V1}|Ls], #{ K := V0 } = M) -> update_values(Ls, M#{ K := V0 + V1 });
update_values([_|Ls], M) -> update_values(Ls, M);
update_values([], M) -> M.
The first function clause is important here. Key K
is bound in the tuple and
will be used to request value V0
from map M
. The map M
is then updated to
associate key K
with the new value V0 + V1
.
%% In the Erlang shell
1> M = #{ "a" => 1, "b" => 2, "c" => 3 }.
#{ "a" => 1, "b" => 2, "c" => 3 }
2> map_example:update_values([{"b", 10}, {"c", 20}, {"d", 30 }], M).
#{ "a" => 1, "b" => 12, "c" => 23 }
Note that since key "d"
does not reside in map M
it fails to match the first
clause and does not update the map with the association "d" => 40
.
An expression where the dependencies in the LHS of the match are cyclic, like:
1> #{X := Y, Y := X} = #{5 => 10, 10 => 5}.
will result in an evaluator error (variable is unbound) or a compilation error.
There are 255 tags that can be used to encode terms for external binary
distribution. All tags are defined in external.h
. The encoding starts by a
magic byte 131
.
The encoding tags used in R15B01 are the following:
SMALL_INTEGER_EXT 'a' 97
INTEGER_EXT 'b' 98
FLOAT_EXT 'c' 99
ATOM_EXT 'd' 100
SMALL_ATOM_EXT 's' 115
REFERENCE_EXT 'e' 101
NEW_REFERENCE_EXT 'r' 114
PORT_EXT 'f' 102
NEW_FLOAT_EXT 'F' 70
PID_EXT 'g' 103
SMALL_TUPLE_EXT 'h' 104
LARGE_TUPLE_EXT 'i' 105
NIL_EXT 'j' 106
STRING_EXT 'k' 107
LIST_EXT 'l' 108
BINARY_EXT 'm' 109
BIT_BINARY_EXT 'M' 77
SMALL_BIG_EXT 'n' 110
LARGE_BIG_EXT 'o' 111
NEW_FUN_EXT 'p' 112
EXPORT_EXT 'q' 113
FUN_EXT 'u' 117
DIST_HEADER 'D' 68
ATOM_CACHE_REF 'R' 82
ATOM_INTERNAL_REF2 'I' 73
ATOM_INTERNAL_REF3 'K' 75
BINARY_INTERNAL_REF 'J' 74
BIT_BINARY_INTERNAL_REF 'L' 76
COMPRESSED 'P' 80
For Maps we define tag MAP_EXT
to 116 (t
).
Data layout:
MAP_EXT
|-----------------------------------
| 1 | 4 | | |
|-----------------------------------
| 116 | Size | Keys | Values |
|-----------------------------------
The Size
specifies the number of keys and values that follows the size
descriptor.
An open questions, optimize for:
Memory size should be a priority since we send this data over the wire. We should promote ease of access so other languages can integrate towards the format.
This leads to a flat and simple structure. It follows that encoding/decoding takes a performance hit.
Why would we need maps when we have records, dicts, gb_trees, ets and proplists?
Maps are envisioned to be an easy to use, lightweight yet powerful key-value association store.
Maps utilizes one of Erlang’s major strengths, pattern matching, to enrich user experience and provide a powerful tool to simplify code development. Pattern matching gives Maps a clear edge over dicts, gb_trees or proplists in usability.
Maps provides the possibility to associate arbitrary terms as keys, not only atoms, with arbitrary terms as values in a matching capable data-type.
Maps does not claim to be an replacement to records as the frames proposal does. Instead maps targets a larger usage domain and wishes to be a complement to records and supersede them where suitable.
Maps were not envisioned as a record replacement at first, it was a hopeful requirement added later. A record replacement approach does not necessarily restrain any semantics but it may put some constraint on the implementation and underlying structure.
Records are powerful under the right circumstances:
However some of the drawbacks are:
When comparing maps with records the drawbacks are easily remedied by Maps, however the positive effects is not as easy to replicate in a built-in data-type where values are determined at runtime instead of at compile time.
Arguments for a simpler a JSON representation using frames or maps has been raised. Using frames and thereby atoms for dynamic creation of keys would be a serious drawback. Using maps would grant the possibility of string binaries to represent keys and would not put the global atom pool in disarray.
Changing json-decoding. Mochiwebs mochijson decodes Json dictionaries the as the following:
{"key": "value"} -> {struct, [{"key", "value"}]}
This could instead be:
{"key": "value"} -> #{ "key" => "value"}
Consider the following JSON examples, from json.org.
{"menu": {
"id": "file",
"value": "File",
"popup": {
"menuitem": [
{"value": "New", "onclick": "CreateNewDoc()"},
{"value": "Open", "onclick": "OpenDoc()"},
{"value": "Close", "onclick": "CloseDoc()"}
]
}
}}
mochijson:decode/1
will currently look like:
{struct, [
{"menu", {struct, [
{"id","file"},
{"value","File"},
{"popup", {struct, [
{"menuitem", {array, [
{struct, [{"value","New"},{"onclick","CreateNewDoc()"}]},
{struct, [{"value","Open"},{"onclick","OpenDoc()"}]},
{struct, [{"value","Close"}, {"onclick","CloseDoc()"}]}
]}}
]}}
]}}
]}
mochijson:decode/1
could look like:
#{ "menu" => #{
"id" => "file",
"value" => "File",
"popup" => #{
"menuitem" => [
#{ "value" => "New", "onclick" => "CreateNewDoc()"},
#{ "value" => "Open", "onclick" => "OpenDoc()"},
#{ "value" => "Close", "onclick" => "CloseDoc()"}
]
}
}}
Let us find "menu"
-> "popup"
-> "menuitem"
.
Traversing the first structure is a bit awkward. We would have to do the following:
Decoded = mochijson:decode(Json),
{struct, Menu} = proplists:get_value("menu", Decoded),
{struct, PopUp} = proplists:get_value("popup", Menu),
{struct, Items} = proplists:get_value("menuitem", PopUp),
With maps it could look like the following:
#{ "menu" := Menu } = mochijson:decode(Json),
#{ "popup" := PopUp } = Menu,
#{ "menuitem" := Items } = PopUp,
or even:
Decoded = mochijson:decode(Json),
#{ "menu" := #{ "popup" := #{ "menuitem" := Items } } } = Decoded,
With maps, and single value access, it could look really simple:
Decoded = mochijson:decode(Json),
Items = Decoded#{ "menu" }#{ "popup" }#{ "menuitem "}.
We have some usage scenarios that are still open for debate. A proposed answer is given for each question and stems from discussions in this proposal.
behavior
where key
behaviour
was actually intended. Instead of getting two different keys,
a runtime exception occurs at this point.The answers from these questions are instrumental to how we should design and implement Maps. What we also should keep in the back of our minds is that we will never get rid of records completely and some of the frames arguments might thus be superfluous.
As stated earlier, the current syntax is not set in stone but what should we expect from it?
First and foremost it has to be unambiguous, meaning the syntax must produce single clearly defined behavior that cannot be misinterpreted by humans nor machines.
1.1 Here we also include the notion that similar syntax should have similar
behavior, or at least not completely different behavior.
For example records, #record{ key = Value }
, have O(1) performance and
2 + N words memory overhead. If we use a similar syntax,
i.e. #{ key = Value }
, we should also expect similar behavior both in
regard to semantics and performance for any and all sizes of the map.
The syntax must be as short as possible. It must not use more characters than necessary to describe a certain behavior as long as it does not violate rule 1.
2.2 We want to avoid verbosity in the language. Verbosity pushes away information from our field of vision and obfuscates it. This needs to be avoided.
The author argues for: =>
as delimiter in ‘set-or-update’ and
:=
in ‘set-existing’ and ‘matching’ syntax.
In the examples below we use #{ Key => Value }
to describe map semantics and
use-cases, but this is only one suggestion out of many.
Several syntax proposals exists, frames proposes <{ key ~ Value }>
syntax and
another syntax suggestion is very similar to record syntax #{ key = Value }
.
The current variable and atom definitions puts restrictions on what we can use
as delimiters and leaves us ~
and =
as the only sane single character
delimiters we can use. The case is very well argued in Richard O’Keefes
No more need for records (fifth draft).
Arguments against a =
delimiter are:
#{ A = B = v }
,
does A
match B
or does B
match v
?
Which =
is a match operation and which =
delimits the key-value pair?Key
‘equal’ Value
,and hence =
is in violation of rule #1 from What do we expect from our syntax?.
The interpretation of this syntax is ambiguous.
Arguments against a ~
delimiter are:
Key
‘NOT’ Value
as ~
is the bitwise NOT operator in C,Key
‘about equal’ Value
as ~
is similar to mathematics ≃
,-
in certain fonts,
ex. K ~ V
might be interpreted as K - V
, consider #{ K - 1 ~ 2 - V }
,and hence this is in violation of rule #1 from What do we expect from our syntax?. The interpretation of this syntax is ambiguous.
Two two-character delimiter suggestions are #{ Key := Value }
and
#{ Key => Value}
, where :=
is a common denominator for assignment and =>
should be read as maps to. A two-character delimiter should be avoided if at
all possible since it increases the syntax footprint of the source code.
The assignment delimiter reads well for just assignment but suffers from the
same reversed logic flaw as records when it comes to pattern matching. The match
#{ key := Value } = M
reads match M to the map pattern where Value is equal
to key. That does not read well unless we call the assignment delimiter :=
for something that its not meant to be.
However, :=
is also similar to =:=
which means “is exactly equal”,
i.e. matches. This is a valuable meaning since we have a difference
between ==
and =:=
when dealing with numbers and thus :=
could be a more
correct delimiter for matching syntax.
The delimiter ->
would be suitable choice if it weren’t for the fact that it
would overload the function clause meaning.
Both ->
and =>
might be confusing when dealing with binaries.
Consider #{ <<"key">> -> <<"value">> }
and #{ <<"key">> => <<"value">> }
,
where =>
appears to be slightly more confusing than ->
.
Listing of delimiters from above perceived desirability:
#{ K => V }
- No ambiguity, no overloading, reads as an association#{ K := V }
- No ambiguity, no overloading, reads as an assignment or exact match#{ K ~ V }
- Typographical ambiguity, no overloading, no clear meaning#{ K -> V }
- Overloads function clause head and body separator, reads as an association#{ K = V }
- Overloads match operator, reads as a match or an assignmentUsing :=
assignment for existing keys seems as a good choice. The choice for
set-or-update is between =>
and ~
.
A case for two different ways to update values in a Map is proposed.
One syntax if, and only if, we want to update a value for an already existing key and another if we want to update the Map with any key.
M#{ K => V }
to declare new key value pairs or update already existing keysM#{ K := V }
to update already existing keys.#{ K := V } = M
to match maps.Example 1:
foo() ->
M = #{ key1 => 1, key2 => 2 }, % M is declared with keys 'key1' and 'key2'
bar(M).
bar(M) ->
M#{
key1 := "1", %% 'key1' will be set to "1"
key2 := "2", %% 'key2' will be set to "2"
key3 := "3" %% this causes an exception since 'key3' does not exist in M
}.
> foo().
** exception error: no match of 'key3' in map
Example 2:
foo() ->
M = #{ key1 => 1, key2 => 2 }, % M is declared with keys 'key1' and 'key2'
bar(M).
bar(M) ->
M#{
key1 => "1", %% 'key1' will be set to "1"
key2 => "2", %% 'key2' will be set to "2"
key3 => "3" %% 'key3' will be set to "3"
}.
> foo().
#{ key1 => 1, key2 => "2", key3 => "3" }
We must lessen the syntax footprint impact on the source code and the language.
Currently the two normal ways of sending options to a functions are either via records or property lists. Both have some drawbacks. Records are compile time dependent and syntactic sugar for tuples. Property lists are generic but produces a lot of texts when defining them and operating on them.
Consider this example when parsing a list of arguments:
args(Args) ->
args(Args, [{analyze, false}, {suppression, false}, {target, none}]).
args(["-r" | Args], Opts) ->
args(Args, [{analyze, true} | proplists:delete(analyze, Opts)]);
args(["-s="++File | Args], Opts) ->
args(Args, [{suppression, File} | proplists:delete(suppression, Opts)]);
args([Target], Opts) ->
[{target, Target} | proplists:delete(target, Opts)].
The textual impact, the number of characters, is quite heavy when operating on property lists.
If we instead use some kind of map with syntax, how would that look?
args(Args) ->
args(Args, #{ analyze => false, suppression => false, target => none}).
args(["-r" | Args], Opts) -> args(Args, Opts#{ analyze := true });
args(["-s="++File | Args], Opts) -> args(Args, Opts#{ suppression := File});
args([Target], Opts) -> Opts#{ target := Target }.
This looks cleaner in my opinion but that is a very subjective view. To use some data we can count the characters, and we see that the property lists example has 390 characters versus the map examples 306. Property lists uses almost 30% more characters in this example.
Perhaps the most sane maps:from_list/1
semantics would be to have the key-value
significance order in left to right, meaning the first association is used and
the latter values with matching keys are ignored.
This differs from the dict:from_list/1
behavior.
Consider the following dict
example:
[{a,2}] = dict:to_list(dict:from_list([{a,1}, {a,2}])).
By letting the leftmost be the most significant key we could simplify conversion from and to lists.
Current suggestion has the following semantics:
Ls = [{a,old}],
#{ a := old } = maps:from_list([{a,new}|Ls]).
The reversal would be:
Ls = [{a,old}],
#{ a := new } = maps:from_list([{a,new}|Ls]).
A restriction set on the implementation by the Erlang specification is that order is total, i.e. satisfies antisymmetry, transitivity and totality.
M1 =< M2
and M2 =< M1
then M1 == M2
,M1 =< M2
and M2 =< M3
then M1 =< M3
,M1 =< M2
or M2 =< M1
(always comparable)
where M1
, M2
and M3
are any Map term.This only holds true in Erlang if we treat floats and integers as union of types,
namely numbers. In the case of a Maps, true = #{ 1.0 => V } == #{ 1 => V}
.
lists:sort([M1, .., Mn])
M1
< M2
, would result in an internal sort but would cost
O( N1 * lg N1 + N2 * lg N2 ), where
N1 = maps:size(M1) and N2 = maps:size(M2)
Do we need to have single access or is matching sufficient?
Consider the following,
V = M#{ K }
is shorter than
#{ K := V } = M
It also allows for easy access of associated values in deep structures.
The syntax for single value access is the least developed (and contemplated) feature in this proposal and certainly could use some input.
More over, the dot syntax must be abolished. Currently it is used for records but it will not be used for maps. Dot represents end of expression list in last clause, or end of attribute.
It cannot be used to distinguish between floats or associations.
Example:
1> M = #{ 1.1 => a, 1 => #{ 1 => b } }.
#{ 1 => #{ 1 => b }, 1.1 => a }.
2> #M.1.1.
a | b ?
Erlang code written with Maps will only be parseable, loadable and executable on Erlang/OTP R17A and later releases of Erlang/OTP but not on previous releases.
Erlang code written before Erlang/OTP R17A will be perfectly compatible, i.e. parseable, loadable and executable with these Maps changes.
Distribution will not be backwards compatible.
This document has been placed in the public domain.