[erlang-questions] Maps branch and disclaimers

Fri Oct 25 18:37:45 CEST 2013

Hi!

Here you go, Maps!

I've pushed a Maps branch to the Erlang/OTP repository at GitHub.

To get the branch,

   git fetch git@REDACTED:erlang/otp.git egil/maps/eep-implementation

or find it at 
https://github.com/erlang/otp/tree/egil/maps/eep-implementation

I want to state the following so there is no room for uncertainty:
- This branch contains a *development stage* of the *experimental* Maps 
feature for Erlang.

This means:
  - Do not use it in production since it is not stable,
  - Do not base any git branch on this branch since it will most likely 
be rebased,
  - and finally, we reserve the right to change any API or interfaces to 
Maps currently implemented.

The implementation is based on EEP 43 - Maps, see 
http://github.com/erlang/eep/blob/master/eeps/eep-0043.md, for details.

_What is implemented?_

The maps module API and erlang guard BIFs as defined in the EEP are 
implemented. There are however some semantic mismatches with the EEP. I 
think those are where the definition contradicts itself. For instance 
maps:is_key/2 compares with =:= as stated first in the definition but 
the later example uses lists:keymember which compares with ==.
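
To make the mismatch concrete, here is a minimal sketch of how the two 
comparisons differ for an integer and a float of equal value. The module 
name key_equality is made up for illustration, and the maps:is_key/2 line 
assumes the behaviour of released OTP (17 and later), which may not be 
what this development branch does.

```erlang
-module(key_equality).
-export([demo/0]).

%% Hypothetical demo module, not part of the branch.
demo() ->
    %% Exact equality distinguishes integers from floats.
    false = (1 =:= 1.0),
    %% Arithmetic equality does not.
    true = (1 == 1.0),
    %% lists:keymember/3 compares with ==, so the float finds the integer key.
    true = lists:keymember(1.0, 1, [{1, a}]),
    %% Assumption: released OTP matches map keys exactly, so this lookup fails.
    false = maps:is_key(1.0, #{1 => a}),
    ok.
```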

The syntax and all that it entails is implemented. The compiler will 
handle the map syntax and produce loadable beam-code. I believe this is 
what people want to test and is what I want people to test. Test the 
usability, that is.

I recommend people look at the EEP for information and also the 
testsuite located at erts/emulator/test/map_SUITE.erl for information on 
how to use Maps since no other documentation is available.

Roughly,
   M0 = #{ key => Value1, "key" => Value2}, % for construction
   M1 = M0#{ "key" := Value3, <<"key">> => Value4 }, % for updates
   #{ "key" := V } = M1. % for matching

Where the operator '=>' (assoc operator) is used for extending and 
creating new Maps and the operator ':=' is used to update existing 
key/values. The ':=' operator is the only operator allowed in patterns. 
I'm guessing some confusion will arise from these two types of operators 
on where you can and/or should use them.
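
As a rough illustration of where each operator fits, the sketch below 
builds a map with '=>', updates an existing key with ':=', matches with 
':=', and shows that ':=' on a missing key fails at runtime. The module 
and function names are hypothetical, and the exact error reason is left 
unmatched on purpose since it may differ between this branch and later 
releases.

```erlang
-module(map_ops).
-export([demo/0, try_update/1]).

%% Hypothetical demo module, not part of the branch.
demo() ->
    %% '=>' creates associations.
    M0 = #{key => 1, "key" => 2},
    %% ':=' updates a key that must already exist, '=>' adds a new one.
    M1 = M0#{"key" := 3, <<"key">> => 4},
    %% Only ':=' is allowed in patterns.
    #{"key" := V} = M1,
    3 = V,
    3 = map_size(M1),
    %% M0 has no key 'missing', so the ':=' update fails.
    failed = try_update(M0),
    %% With the key present, the same update succeeds.
    updated = try_update(M0#{missing => 0}),
    ok.

%% Attempt a ':=' update of the key 'missing' and report the outcome.
try_update(M) ->
    try M#{missing := 0} of
        _ -> updated
    catch
        error:_ -> failed
    end.
```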

Look at the tests and EEP for details and inspiration.

A major difference from the EEP is variables in keys. Variables in keys 
are not allowed at all. This is because we want to reduce the scope for 
this first stage. Plenty to do besides that.

Here are some additional disclaimers to make people sad.

_What is not implemented?_

- No variable keys.
- No single value access.
- No map comprehensions.
- No datastructure to handle large Maps.
- No MatchSpecs that use the Maps syntax will work.

_Known issues_

- Dialyzer will not work with maps in the code, this includes PLT 
building with erts and stdlib.
- HiPE, the native compiler, will not work with maps code.
- EDoc will not work with maps.

I'm sure there are other issues as well, it is a development branch 
after all. =)

I would also like to point out that no optimizations are done either 
with respect to the generated code. This means that the instruction set 
may change. We know of several optimizations we want to do before R17, 
especially for the match compiler, so keep that in mind.

We will continue stabilizing the Maps implementation as we move forward 
towards R17 and take appropriate action depending on the feedback you 
give us.

I would like to continue with saying a few words about possible changes 
that we are thinking about.

_Variables in Keys_

This feature is actually furthest down on the work prio list. We want to 
stabilize the current features before moving forward and variable keys 
is the one most likely to be dropped if we get pressed for time. 
Meaning, it might not be implemented for R17 but instead implemented for 
R18. The plan right now is to keep it though.

_The External Format_

The current external format /needs/ ordered keys as input for 
binary_to_term/1 and in distribution.

This is of course an inconvenience when dealing with other language 
interfaces which have no idea of what the erlang term order is. I instead 
propose that the external format should handle unordered input of 
key-value pairs. The trade-off is a more complicated decoding which will 
take longer.
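
For readers who want to see what unordered input would look like, here is 
a small sketch. It assumes the map encoding documented for released OTP 
(tag 116 followed by a 32-bit pair count and then the key-value pairs), 
which is not necessarily the encoding used in this branch. The module 
name ext_map is made up for illustration.

```erlang
-module(ext_map).
-export([demo/0]).

%% Hypothetical demo module, not part of the branch.
demo() ->
    %% 131 = version tag, 116 = map tag, then a 32-bit pair count.
    %% Each 97,N is a small integer N. The pairs are deliberately
    %% given as 2 => 20 first and 1 => 10 second, out of term order.
    Bin = <<131, 116, 0,0,0,2, 97,2, 97,20, 97,1, 97,10>>,
    Map = binary_to_term(Bin),
    %% Assumption: the decoder accepts the pairs in any order.
    Map = #{1 => 10, 2 => 20},
    %% A round trip through the external format preserves the map.
    Map = binary_to_term(term_to_binary(Map)),
    ok.
```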

The distribution format should also be extended to be able to cache keys. 
This is similar to the atom cache except we cache the entire key array 
for maps. This has been the intention all along but it is not mentioned 
in the EEP.

_Term order and sorting_

Finally the term order. This has been a sore point from the get go.

Maps currently respect the Erlang term order for their keys.

The Erlang term order is what I call arithmetic term order. I propose 
that we extend Erlang with true term order where an integer compares less 
than a float, i.e. total term order.

This would allow newer ordered data structures, like maps, to be more 
useful. We don't have to take special care for the odd cases like keys 
1.0 and 1 inhabiting the same slot in the data structure. gb_trees and 
such structures could also be extended to use this as those structures 
have the same limitations.

With this type of ordering we could have maps with keys like these, #{ 1 
=> "integer", 1.0 => "float" }, without causing confusion.

I've been told that ETS ordered set tables used to have this behaviour, 
distinguishing between floats and integers. This was supposedly before 
the open source era, way back when dinosaurs roamed the planet .. I'm 
not clear on the details of why this behaviour was removed. Probably 
because of inconsistencies.

For maps to work with this I only need two things. First, a compare 
operation in the runtime that can distinguish between floats and 
integers, very easy. Secondly, a BIF that sorts a list of terms with this 
new compare operation, which will be used in the compiler.

But for completeness, the following operators should also be implemented:

     =:=         term exactly equal to, already implemented
     =/=         term exactly not equal to, already implemented
     =:<         term less than or equal to
     >:=         term greater than or equal to
     <:<         term less than
     >:>         term greater than

So, true = 1 <:< 1.0.
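
Since none of these operators exist, the idea can only be emulated with 
ordinary functions. The sketch below is one possible reading of the 
proposal: lt/2 stands in for '<:<' and sort/1 for the proposed sorting 
BIF. Both names and the module name total_order are hypothetical. It 
only treats top-level numbers specially, so numbers nested inside tuples 
or lists still compare arithmetically, which a real implementation in 
the runtime would have to handle.

```erlang
-module(total_order).
-export([lt/2, sort/1]).

%% Hypothetical stand-in for the proposed '<:<' operator.
%% When an integer and a float are arithmetically equal,
%% the integer is ordered before the float.
lt(A, B) when is_integer(A), is_float(B), A == B -> true;
lt(A, B) when is_float(A), is_integer(B), A == B -> false;
%% Everything else falls back to the standard term order.
lt(A, B) -> A < B.

%% Hypothetical stand-in for the proposed sorting BIF.
%% lists:sort/2 expects a "less than or equal" fun, which is
%% expressed here as "B is not strictly less than A".
sort(Terms) ->
    lists:sort(fun(A, B) -> not lt(B, A) end, Terms).
```

With this sketch, total_order:lt(1, 1.0) returns true while 
total_order:lt(1.0, 1) returns false, and sorting a list that contains 
both 1 and 1.0 always places the integer first.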

I don't know Prolog but perhaps these semantics should mimic Prolog to 
respect Erlang's heritage. I have no strong opinion on this.

This syntax would mimic the already present =:= and =/= relational 
operators, however this syntax is another topic and should be a separate EEP.

Happy testing!

Regards,
Björn-Egil Dahlberg
Erlang/OTP