[erlang-questions] Bit Syntax and compiler Erlang/OTP R21.1

Fred Hebert mononcqc@REDACTED
Tue Aug 20 15:13:51 CEST 2019


On Tue, Aug 20, 2019 at 8:26 AM Valentin Micic <v@REDACTED> wrote:

> Hi all,
>
> Recently I’ve made a silly mistake. I wrote:
>
> case Payload of
>    <<_:4/binary-unit:8, _:255, _:7/binary-unit:8, 0:16>>   -> Payload;
>    _                                                       -> throw( drop )
> end
>
>
> Considering that overall pattern (which erroneously references 255 bits
> long field, instead of an octet with a value of 255 ) is not aligned to
> 8-bit boundary, is it unreasonable to expect the compiler to report this as
> a potential problem, or  at least generate a warning (support for
> bit-fields notwithstanding).
>
> What am I missing here?
>
>
Informally, there are two kinds of binaries: 8-bit aligned ones, which are
what people usually call 'binaries', and bitstrings, which don't need any
alignment whatsoever. Your pattern can be made to match by supplying any
bitstring of the right size. For example:

1> <<_:4/binary-unit:8, _:255, _:7/binary-unit:8, 0:16>> = <<0:(32+255+7*8+16)>>.
<<0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,...>>

But more generally, the binary/bitstring distinction will make sense when
pattern matching:

3> <<_:4/binary-unit:8, _/binary>> = <<0:(32+255+7*8+16)>>.

** exception error: no match of right hand side value
<<0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                                                        0,...>>
4> <<_:4/binary-unit:8, _/bitstring>> = <<0:(32+255+7*8+16)>>.
<<0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,...>>
5> <<_:4/binary-unit:8, _/bits>> = <<0:(32+255+7*8+16)>>.
<<0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,...>>

Note that bits is shorthand for bitstring. Do note as well that you can
always cheat the pattern match by specifying units:

6> <<_:4/binary-unit:8, _/binary-unit:1>> = <<0:(32+255+7*8+16)>>.
<<0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,...>>
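
To check at runtime which of the two kinds a value is, the is_binary/1 and
is_bitstring/1 guards together with bit_size/1 should be enough. A minimal
sketch (the module name align_demo and both function names are made up for
illustration):

```erlang
-module(align_demo).
-export([kind/1, padding_bits/1]).

%% Classify a term: a 'binary' is a bitstring whose size is a multiple of 8,
%% so the is_binary/1 clause has to come first.
kind(B) when is_binary(B)    -> binary;
kind(B) when is_bitstring(B) -> bitstring;
kind(_)                      -> other.

%% Number of bits missing to reach the next 8-bit boundary.
padding_bits(B) when is_bitstring(B) ->
    (8 - bit_size(B) rem 8) rem 8.
```

The value used in the shell session above is 32+255+7*8+16 = 359 bits long,
so kind/1 classifies it as a bitstring and padding_bits/1 reports 1.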

Nothing in Erlang actually mandates exact alignment; it's just that the
default unit sizes of the various types affect pattern matching. Dialyzer,
however, does enforce some of these semantics. Here's a sample module:

-module(chk).
-export([f/0, f/1, g/0, g/1]).

f() -> f(<<0:17>>).
g() -> g(<<0:17>>).

-spec f(binary()) -> ok.
f(Bin) ->
    <<_/binary>> = Bin,
    ok.

-spec g(binary()) -> ok.
g(Bin) ->
    <<_/bits>> = Bin,
    ok.

If you run dialyzer on it, you'll find out the following:

chk.erl:4: Function f/0 has no local return
chk.erl:4: The call chk:f(<<_:17>>) will never return since the success
typing is (binary()) -> 'ok' and the contract is (binary()) -> 'ok'
chk.erl:5: Function g/0 has no local return
chk.erl:5: The call chk:g(<<_:17>>) breaks the contract (binary()) -> 'ok'

The type binary(), to Dialyzer, implies the 8-bit alignment you're looking
for. The bitstring() type does not care about alignment. This is because
Dialyzer supports defining binary types as:

 <<>>             %% empty binary
 <<_:M>>          %% fixed-size binary, where M is a positive integer
 <<_:_*N>>        %% variable-size binary with an alignment on N
 <<_:M, _:_*N>>   %% binary of at least M size, with a variable-sized
                  %% tail aligned on N
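
These forms can be used directly in -type and -spec declarations. A small
sketch, where the module, type, and function names are hypothetical and only
there to show the syntax:

```erlang
-module(bintypes).
-export([split_header/1]).

%% At least 32 bits, followed by a tail made of whole bytes.
-type framed() :: <<_:32, _:_*8>>.

%% Returns the fixed 32-bit header and the byte-aligned remainder.
-spec split_header(framed()) -> {<<_:32>>, binary()}.
split_header(<<Header:4/binary, Rest/binary>>) ->
    {Header, Rest}.
```

With a spec like this, Dialyzer has enough information to complain about a
caller that passes something like <<0:17>>, although whether it does depends
on what it can infer about the call site.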

Essentially, binary() is defined as <<_:_*8>> and bitstring() is defined as
<<_:_*1>>. This lets you encode whatever check semantics you'd like within
type specifications, and Dialyzer can try to figure it out for you. But
nothing, by default, would necessarily warrant compiler warnings since
alignment on 8 bits is not mandated by the runtime.
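
For the snippet in the question, the stated intent (an octet with a value of
255) could be written as follows. This is only a sketch: the module and
function names are made up, and the field layout is taken from the pattern
as posted.

```erlang
-module(payload_check).
-export([check/1]).

%% 4 bytes, one octet equal to 255, 7 bytes, then a 16-bit zero: 14 bytes.
%% Integer segments default to 8 bits, so a bare 255 matches exactly one octet.
check(<<_:4/binary, 255, _:7/binary, 0:16>> = Payload) -> Payload;
check(_) -> throw(drop).
```

Since every segment here is a whole number of bytes, only byte-aligned
payloads can match, and a 359-bit bitstring is dropped.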