[erlang-questions] Maps

Tue May 14 11:18:46 CEST 2013

On Mon, 13 May, 2013 at 11:15 AM, Joe Armstrong <erlang@REDACTED> 
wrote:
> 
> 
> On Thu, May 9, 2013 at 10:56 PM, Robert Virding 
> <robert.virding@REDACTED> wrote:
>> Even before taking a really deep dive into studying the EEP one 
>> thing I can immediately say: get rid of this having both equal and 
>> match and USE ONLY MATCH. Keys are the same when they match. Period. 
>> This paragraph:
>> 
>> If key K does not equal any existing key in the map, a new 
>> association will be created from key K to value V. If key K is equal 
>> to an existing key in map M its associated value will be replaced by 
>> the new value V and the new key K will replace the old key if the 
>> key does not match. In both cases the evaluated map expression will 
>> return a new map.
>> 
>> is weird. Yes I know what it means but it is not intuitive. That 
>> keys are replaced or not replaced when they are equal can seem 
>> very strange to those not deep into Erlang semantics. 
>> 
> 
> I think the problem here is the description - not the semantics. It's 
> not the keys which are replaced, the crucial
> idea is to have two different syntaxes. This needs a longer 
> explanation to make sense.
> 
> Short explanation
> ==============
> 
>     ':=' means "update an existing key - crash if the key is not 
> present"
>     '=>' means "update an existing key OR add a new key" 
> 
> The value is pretty much irrelevant
> 
> Long explanation of why this is good
> =============================
> 
> We can update an existing map M with the syntax:
> 
>     M#{ K1 Op V1, K2 Op V2, ... Kn Op Vn}
> 
> Where Op is either => or :=.
> 
> The syntax K => V never generates an error and is used to introduce a 
> new key
> or to update an existing key with a new value.
> 
> The syntax K :=  V is used to update the value of an *existing* key 
> and will
> raise an exception if the key K does not already exist in the map M.
> 
> The difference between these two modes of update is crucial, but needs
> a couple of examples to explain:
> 
> Assume we define a map M as follows:
> 
>     M = #{ foo => 1, bar => 2, baz => 3 }
> 
> 
> The update
> 
>     M # {foo := 12,  bary := 24}
>  
> Will fail (raise an exception) since there is no key called bary in 
> the
> original map. This is a good idea (ROK suggested this in his frames 
> paper)
> since we don't want to accidentally create a new key due to a 
> spelling 
> error. This is the crash-early property of the := update syntax.
> 
> Crashing later would make debugging difficult: we would accidentally 
> add
> a bad key to a map and learn about it way later.
> 
> The update
> 
>    M # {foo := 12, bar := 24}
> 
> Will succeed, but more importantly the new map has exactly the same
> keys as the old map (since all the updates are ':=' updates) - and so
> can *share* the same key descriptor. So if we have a very long list of
> maps they can be stored in a space efficient manner. (Again this idea
> comes from ROK's frames paper. Björn-Egil's EEP didn't mention this,
> but the fact that we know that two maps have the same keys from the
> syntax makes a lot of optimizations possible.)
> 
> All of this is possible because there are two operators not one :-)
> 

Exactly and that is very, very useful. Being used to other languages 
that have maps but don't differentiate between those two use cases, I 
regularly find myself writing code like:

if key in map then do this else do that

So being able to differentiate between both behaviours by just using 
the right operator for what I want to do at the time is nice.
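
To make that concrete, here is a minimal sketch of the two operators 
side by side. The module and function names are made up for 
illustration, and it assumes the syntax as it shipped in Erlang/OTP 
releases that include maps, where ':=' on a missing key raises an 
error exception with reason {badkey, Key}:

```erlang
-module(map_update_demo).
-export([update_existing/1, upsert/1, try_update/1]).

%% ':=' updates only keys that are already present, so the result
%% has exactly the same keys as M.
update_existing(M) ->
    M#{foo := 12, bar := 24}.

%% '=>' updates foo if it exists and adds bary if it does not.
upsert(M) ->
    M#{foo => 12, bary => 24}.

%% If bary is not a key of M (e.g. it is a misspelling of bar),
%% ':=' fails instead of silently adding it to the map.
try_update(M) ->
    try M#{foo := 12, bary := 24} of
        Updated -> {ok, Updated}
    catch
        error:{badkey, Key} -> {error, {missing_key, Key}}
    end.
```

With M = #{foo => 1, bar => 2, baz => 3}, update_existing(M) keeps the 
three original keys, upsert(M) returns a map with a fourth key bary, 
and try_update(M) reports bary as missing. No explicit "is the key 
there?" test is needed in any of the three.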

> 
> ---
> 
> As regards efficiency, utility, beauty and so on these are subjective.
> 
> If you want the last ounce of efficiency records and dicts are not
> going to go away when maps arrive. So if maps have the wrong
> performance characteristics then use the existing mechanisms.
> 
> In the latest edition of my book I've been documenting the changes 
> to maps
> - this chapter has changed three times and has tended to be 
> conservative 
> so I haven't (yet) mentioned that keys can be any term (and not just 
> atoms).
> 
> I'm rather looking forward to being able to represent things like XML
> and JSON and property lists in maps and to have a one-size-fits-all
> replacement for dicts and records. I've never really worried about the
> last ounce of efficiency - if I want real efficiency I'd change
> language and go to C or program an FPGA.
> 

I'd debate that. As you explain very well in chapter 7 of your book, 
Erlang is very good at network protocol processing because of its bit 
syntax. For simple protocols, I'd really want to keep it lightweight 
and use something efficient like records. Records also allow me to say: 
this data has a fixed structured format, don't add random keys to it.
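
As a rough sketch of what I mean by a simple protocol: the header 
layout below (4-bit version, 4-bit type, 16-bit payload length) is 
invented for illustration, as are the module, record and function 
names. The bit syntax does the parsing and the record pins down the 
set of fields:

```erlang
-module(header_demo).
-export([parse/1]).

%% The record fixes the set of fields at compile time: writing
%% #header{colour = red} is rejected by the compiler.
-record(header, {version, type, length}).

%% Hypothetical wire format: 4-bit version, 4-bit type, 16-bit
%% payload length, then the payload itself and any trailing bytes.
parse(<<Version:4, Type:4, Length:16, Payload:Length/binary, Rest/binary>>) ->
    Header = #header{version = Version, type = Type, length = Length},
    {ok, Header, Payload, Rest};
parse(_Other) ->
    {error, incomplete}.
```

A record is a tagged tuple underneath, so the parsed header above is 
just {header, Version, Type, Length} at runtime.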

On the other hand, if I'm processing JSON or XML, I'd want the whole 
power of maps.

Now if I can have maps behave as efficiently as records for simple 
structures while giving me a single syntax to deal with both simple and 
complex structures then I'm all for it.

Cheers,

Bruno
