[erlang-questions] Bit syntax matching gotchas

José Valim jose.valim@REDACTED
Wed Feb 3 12:00:48 CET 2016


Björn, in solution #1, would you warn only when matching or also when
constructing? Is the warning only at compile-time or also at runtime? For
example, would you warn for:

    X = -1.
    <<X>>.
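
For reference, here is a minimal sketch of what that snippet does today when
the value is only known at runtime, going by the construction behaviour Björn
describes below. The module and function names are made up for illustration.

```erlang
-module(construct_demo).
-export([build/1]).

%% Build a one-byte binary from an integer that is only known at runtime.
%% Construction keeps the low 8 bits of X, so no exception is raised
%% for values outside 0..255.
build(X) when is_integer(X) ->
    <<X>>.
```

With this, construct_demo:build(-1) and construct_demo:build(16#FFFF) both
give <<255>>, so any warning for this case could not come from looking at a
literal in the source.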

We may have a third option, which is to control the masking behaviour with a
new flag. Starting with Erlang 19, we could warn if you are relying on
masking, and you would then have to either opt in to masking explicitly
(/integer-mask) or fix your code. Once the warning is removed, the default
would be to always raise. The mask option wouldn't be supported when
matching, making it clear that the masking behaviour is construction only.
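
Since /integer-mask does not exist, here is a rough sketch of the two
behaviours written as ordinary helper functions in today's Erlang. The module
and function names are hypothetical, and raising badarg in the strict case is
my assumption; the idea above does not say which error would be raised.

```erlang
-module(mask_sketch).
-export([masked_byte/1, strict_byte/1]).

%% Explicit opt-in to masking: keep only the low 8 bits before building.
masked_byte(X) when is_integer(X) ->
    <<(X band 16#FF)>>.

%% Strict construction: only accept values that fit in one unsigned byte.
%% Raising badarg for anything else is an assumption made for this sketch.
strict_byte(X) when is_integer(X), X >= 0, X =< 16#FF ->
    <<X>>;
strict_byte(X) ->
    erlang:error(badarg, [X]).
```

Here mask_sketch:masked_byte(-1) gives <<255>>, while
mask_sketch:strict_byte(-1) raises instead of silently truncating.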

I am not sure this is a *good* option but I thought I would mention it
anyway. :)




*José Valim*
www.plataformatec.com.br
Skype: jv.ptec
Founder and Director of R&D

On Wed, Feb 3, 2016 at 7:17 AM, Björn Gustavsson <bjorn@REDACTED> wrote:

> There are some gotchas in the bit syntax that come up now and then on
> the mailing list, and also as a bug report at the end of last year:
> http://bugs.erlang.org/browse/ERL-44
>
> We have decided to try to do something about it in OTP 19. We have
> discussed various solutions internally in the OTP group, and I have
> spent far too much time lately thinking about it.
>
> Here follows first my summary of the issues, and then my suggestion
> how the compiler should be modified.
>
> BACKGROUND ABOUT BIT SYNTAX CONSTRUCTION
>
> When constructing binaries, there is an implicit masking of the
> values. All of the following constructions give the same result:
>
> <<255>>
> <<16#FF>>
> <<16#FFFF>>
> <<-1>>
>
> There have been complaints about the implicit masking behaviour, but
> there is a lot of code that depends on it, so it would be unwise to
> change it.
>
> THE PROBLEM
>
> There is no similar masking when matching values. That means that all
> of the following expressions will fail to match:
>
> <<-1>> = <<-1>>
> <<-1/unsigned>> = <<-1>>
> <<16#FFFF>> = <<16#FFFF>>
> <<-12345/signed>> = <<-12345/signed>>
>
> Let's look at how the compiler internally implements matching. Take
> this function as an example:
>
> f(<<-1:8/unsigned>>) -> ok.
>
> It will be rewritten to:
>
> f(<<V:8/unsigned>>) when V =:= -1 -> ok.
>
> That is, an unsigned value (in the range 0-255) will be stored in the
> variable V, which will then be compared to -1.
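
To make the rewrite concrete, here is a small sketch that spells out the
guard form by hand, next to a /signed variant. The module and function names
are made up, and the extra catch-all clauses are only there so the functions
return nomatch instead of crashing.

```erlang
-module(rewrite_demo).
-export([f_unsigned/1, f_signed/1]).

%% Hand-written form of the rewritten clause: V is extracted as an
%% unsigned byte (0..255), so the comparison with -1 can never succeed.
f_unsigned(<<V:8/unsigned>>) when V =:= -1 -> ok;
f_unsigned(_) -> nomatch.

%% Same shape with /signed: the byte 255 is read back as -1,
%% so the guard succeeds for that byte.
f_signed(<<V:8/signed>>) when V =:= -1 -> ok;
f_signed(_) -> nomatch.
```

rewrite_demo:f_unsigned(<<255>>) returns nomatch, while
rewrite_demo:f_signed(<<255>>) returns ok.
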
>
> POSSIBLE SOLUTION #1
>
> The most obvious solution is probably to let the compiler warn for the
> above cases. The matching would still fail. The developer will need to
> fix their code. For example:
>
> <<-1/signed>> = <<-1>>
>
>
> POSSIBLE SOLUTION #2
>
> There is one problem with solution #1. It is not possible to
> produce a warning for the following example:
>
> f(Val) ->
>   <<Val:8>> = <<Val:8>>,
>   Val.
>
> So in addition to warning when possible, another solution is to mask
> values also when matching. Internally, the compiler could rewrite the
> function to something like:
>
> f(Val) ->
>   <<NewVar:8>> = <<Val:8>>,
>   NewVar = Val band 16#FF,
>   Val.
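
As a sketch of the difference, the following module puts today's behaviour
for a bound variable next to the behaviour solution #2 aims for, where the
bound value is masked before it is compared with the extracted byte. The
module and function names are hypothetical, and this is hand-written code,
not what the compiler would actually generate.

```erlang
-module(bound_match_demo).
-export([plain/1, masked/1]).

%% Today's behaviour: Val is already bound, so the pattern <<Val:8>>
%% compares the extracted unsigned byte with Val itself.
plain(Val) when is_integer(Val) ->
    case <<Val:8>> of
        <<Val:8>> -> {ok, Val};
        _ -> nomatch
    end.

%% Behaviour aimed for by solution #2: extract the byte into a fresh
%% variable and compare it with the low 8 bits of Val.
masked(Val) when is_integer(Val) ->
    <<NewVar:8>> = <<Val:8>>,
    NewVar = Val band 16#FF,
    {ok, Val}.
```

For values in 0..255 both functions agree. For -1 or 256, plain/1 returns
nomatch while masked/1 returns {ok, Val}.
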
>
> Similar rewriting should be done for literal integers, so the following
> expression would now match:
>
> <<-1>> = <<-1>>
>
>
> WHICH SOLUTION?
>
> Just to make sure that I don't reject solution #2 just because it
> seems like a lot of work to implement, I have actually implemented it.
>
> Now that I have implemented solution #2, I want to reject it.
>
> The reason I reject it is that matching previously bound variables
> is uncommon. Even in the compiler test suites it is uncommon (test
> suites typically match bound variables more often than production code
> does).
>
> Therefore, solution #2 would make behaviour of matching more
> consistent with construction, but would not significantly increase the
> number of correct programs. Also, if clauses that previously didn't
> match start to match, then code that has not been executed before will
> be executed. Code that has not been tested usually doesn't work.
>
> Solution #1 would point out all cases when literal integers could not
> possibly match and force the developer to fix them.
>
> Therefore I choose solution #1.
>
>
> YOUR INPUT?
>
> Is there a better way to fix bit syntax matching? Anything I have
> forgotten or not thought about?
>
> /Björn
>
> --
> Björn Gustavsson, Erlang/OTP, Ericsson AB