[erlang-questions] map elements in defined order

Ulf Wiger ulf@REDACTED
Thu Oct 26 22:33:22 CEST 2017


There are some things in Erlang that people have come to rely on, but for
which there's a price. Having a globally defined sort order is such a
thing. And maps must be implemented in such a way that the following is not
only supported, but efficient:

comp(#{} = A, #{} = B) when A > B -> greater;
...

That is, maps must behave, efficiently, as an ordered mapping on request.

I've had reason to ponder the Erlang term comparison semantics while
maintaining the sext library. The ordering semantics of maps threw a wrench
into the works, with its non-standard ordering of keys (I should have left
some room between the type tags for such unexpected developments.) The jury
(i.e. QuickCheck) is still out on whether I'll be able to support it with
the existing tag scheme.

Now, for sext, maps are at least better than refs, pids and ports, since
there is no documentation on how they are sorted internally. As it turns
out (for now), using term_to_binary() and some bit syntax seems to work in
practice for those.

For the external term format for maps, actually, the documentation says:

«Key and value pairs (Ki => Vi) are encoded in section Pairs in the
following order: K1, V1, K2, V2,..., Kn, Vn. »

... which does seem to imply that the ordering in the external term format
is defined. I assume this is not the intended meaning, but rather that
whichever key K1 is, the corresponding value V1 is the item that follows
it, and so on.

Anyway, I've come across code that uses maps as a serialization format for
a cryptographic hash. Based on what is known about maps, this would seem
like a Bad Idea, since in that case, presumably, the result of the hash
operation might change between implementations.

BR,
Ulf W

Den 26 okt. 2017 21:43 skrev "Michael Truog" <mjtruog@REDACTED>:

On 10/26/2017 12:13 PM, Ulf Wiger wrote:

> But again, Jesper, just about everyone relies on the fact that maps follow
> the general principle that there is a well-defined term comparison order.
> Otherwise, maps would be highly unsuitable to use in keys, and generally
> treacherous to use as a replacement for records. Following the Principle of
> Least Surprise, it's a darned good thing that Erlang doesn't randomize the
> key order in its maps.
>
> I doubt that anyone would abuse an extra function that produces the map
> element pairs in the internally defined sort order, given that the
> documentation would clearly state that it's more expensive than
> maps:to_list/1 (though likely faster than lists:sort(maps:to_list(M), fun
> custom_sort_fun/2) - not to mention less error-prone.)
>
> But it's not a feature I'm willing to go to war over. If no one else sees
> a use for it, I'm willing to concede that it has low priority. :)
>
>
At a low-level, a hash array mapped trie is unordered because it relies on
a hash function, so using the Erlang map in an ordered way might be
possible but would be inefficient.  If you need an ordered mapping, I think
my Erlang trie is a better option at https://github.com/okeuday/trie ,
though it requires Erlang string keys (binary keys are supported by the
btrie module, but that is slow).  That doesn't necessarily help msgpack
source code though, unless it would be a separate output option that was
added.

Best Regards,
Michael
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20171026/dc92ae40/attachment.htm>


More information about the erlang-questions mailing list