strictly less-than (Re: [erlang-questions] Ordered set and DETS)

Ulf Wiger ulf.wiger@REDACTED
Mon Sep 6 13:56:24 CEST 2010


So when exercising my QuickCheck suite for the sext
(sortable serialization) library, I stumbled upon an intermittent
test case failure, which was particularly vexing.

The failing case was the sorting property, which says that,
for any terms, comparing the encoded terms should yield
the same result as comparing the original terms.

The failing term pairs are {-1.0, 1}, {1.0, 1}, {2.0, 2} etc.
The terms are not strictly equal, so they have to be encoded
differently (otherwise, the encoding property breaks).
That is, decode(encode(2.0)) =:= 2.0 must hold true*.
In this sense, the values in each pair are distinctly different.

* For some definition of true. In the case of sext, I specify that
the IEEE 754 binary64 representation of the float, as used in
the bit syntax notation, must be the same.

Yet, they have no defined ordering!

Eshell V5.7.5  (abort with ^G)
1> 1 < 1.0.
false
2> 1.0 < 1.
false
3> 1.0 =:= 1.
false

This means that the result of lists:sort/1 is not defined in this
particular case:

Eshell V5.7.5  (abort with ^G)
1> lists:sort([1,1.0]).
[1,1.0]
2> lists:sort([1.0,1]).
[1.0,1]

Naturally, lists:usort/1 has a corresponding behaviour:

3> lists:usort([2,2.0]).
[2]
4> lists:usort([2.0,2]).
[2.0]


I think this is a good argument in favour of adding operators
"strictly <" and "strictly >", which I suggest could be :< and >:.

I made a simple hack where the parser recognized these tokens and
expanded it into something similar to:

':<'(A,B) ->
    A == B
	andalso (is_integer(A)
		 andalso is_float(B))
	orelse A < B.

'>:'(A,B) ->
    A == B
	andalso (is_float(A)
		 andalso is_integer(B))
	orelse A < B.

but rather as a form rewriting excercise in erl_parse.yrl:

wrap_op({op,Pos,':<',L,R}) ->
    {op,Pos,'orelse',
     {op,Pos,'andalso',
         {op,Pos,'==',L,R},
         {op,Pos,'andalso',
             {call,13,{atom,Pos,is_integer},[L]},
             {call,14,{atom,Pos,is_float},[R]}}},
     {op,Pos,'<',L,R}};
wrap_op({op,Pos,'>:',L,R}) ->
    {op,Pos,'orelse',
     {op,Pos,'andalso',
         {op,Pos,'==',L,R},
         {op,Pos,'andalso',
             {call,13,{atom,Pos,is_float},[L]},
             {call,14,{atom,Pos,is_integer},[R]}}},
     {op,Pos,'>',L,R}};
wrap_op(Op) ->
    Op.


This way, it works in guards as well, and will be BW
compatible with existing tools, but with the disadvantage
that the original source cannot be reconstituted.

I could try to formulate this as an EEP, if you all think
it's a reasonable (although quite marginal) suggestion.

An alternative would be to re-formulate the property of sext
such that the encoded terms sort the same way as the original
terms in all cases where it is reasonable to rely on the sort
order in the first place. This is to some extent already true,
at least in regard to encode/decode symmetry, in other cases
involving floats. This alternative likely has less impact. :)

BR,
Ulf W

On 09/04/2010 07:16 PM, Ulf Wiger wrote:
> On 09/03/2010 10:28 PM, Evans, Matthew wrote:
>> Hi,
>>
>> Does anyone know if there are any roadmap plans to implement ordered_set table types in DETS?
>>
>> Thanks
> 
> It may be possible to build on the combination of Tokyo Tyrant
> and my sortable serialization library, which I mentioned in
> this post a while ago:
> 
> http://www.erlang.org/cgi-bin/ezmlm-cgi/4/47285
> 
> That was part of the justification for adding prefix
> matching and matchspec-style functionality in the sext
> library in the first place.
> 
> BR,
> Ulf W
> 
> 
> ________________________________________________________________
> erlang-questions (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-questions-unsubscribe@REDACTED
> 



More information about the erlang-questions mailing list