[erlang-questions] Bit Syntax and compiler Erlang/OTP R21.1
Tue Aug 20 17:04:26 CEST 2019
I should have used the more appropriate Erlang term “bitstring” rather than “bit field” in my email. Thank you for correcting this.
I’ve used bitstrings in the past, though never really as a bunch of “loose” bits, but always as part of some other binary pattern aligned to an 8-bit boundary.
If you cannot write 17 loose bits to a file or, better yet, send 13 loose bits over a socket, one has to wonder how useful non-aligned bitstrings (and by this I mean “loose” bits) really are.
And it gets worse. Consider this:
(tsdb_1_1@REDACTED)433> term_to_binary( <<0:8>> ).
<<131,109,0,0,0,1,0>>
(tsdb_1_1@REDACTED)434> term_to_binary( <<0:1>> ).
<<131,77,0,0,0,1,1,0>>
It follows that, at least in the external term format, it takes more space to store 1 loose bit than 8 aligned bits.
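The difference can be checked directly. Below is a minimal sketch (the module and function names are made up for illustration) that compares the encoded sizes. Note that it measures the serialized external term format, not the in-memory representation; the extra byte is the field that records how many bits of the last byte are in use.

```erlang
-module(ext_size).
-export([sizes/0]).

%% Size in bytes of the external term format encoding of
%% 8 aligned bits versus 1 loose bit.
sizes() ->
    Aligned = byte_size(term_to_binary(<<0:8>>)),
    Loose   = byte_size(term_to_binary(<<0:1>>)),
    {Aligned, Loose}.
```

Calling ext_size:sizes() should return {7, 8}: one byte more for the single loose bit.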
And just to prove that if it walks like a duck, and quacks like a duck, it's probably… an elephant.
(tsdb_1_1@REDACTED)449> is_binary( <<0:8>> ).
true
(tsdb_1_1@REDACTED)448> is_binary( <<0:1>> ).
false
But, if you put two elephants next to each other, you get — a duck!
is_binary( <<0:5, 0:3>> ).
true
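The duck/elephant behaviour comes down to one rule: every binary is a bitstring, but a bitstring is a binary only when its size in bits is a multiple of 8. A small sketch of that rule, with a hypothetical module name:

```erlang
-module(duck).
-export([classify/1]).

%% Every binary is a bitstring; a bitstring is a binary only when
%% its size in bits is a multiple of 8. The first clause therefore
%% catches aligned values, the second catches the "loose" ones.
classify(B) when is_binary(B)    -> {binary, bit_size(B)};
classify(B) when is_bitstring(B) -> {bitstring, bit_size(B)}.
```

With this, duck:classify(<<0:1>>) gives {bitstring,1}, while both duck:classify(<<0:8>>) and duck:classify(<<0:5, 0:3>>) give {binary,8}, since 5 + 3 bits add up to a whole byte.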
Given all this, why would anyone find bitstrings useful?
But, the above notwithstanding, I understood your point that the run-time does not mandate any kind of alignment, hence the compiler has nothing to report.
Makes sense — Thank you.
> On 20 Aug 2019, at 15:13, Fred Hebert <mononcqc@REDACTED> wrote:
> On Tue, Aug 20, 2019 at 8:26 AM Valentin Micic <v@REDACTED> wrote:
> Hi all,
> Recently I’ve made a silly mistake. I wrote:
> case Payload of
>     <<_:4/binary-unit:8, _:255, _:7/binary-unit:8, 0:16>> -> Payload;
>     _ -> throw( drop )
> end
> Considering that the overall pattern (which erroneously references a 255-bit-long field instead of an octet with a value of 255) is not aligned to an 8-bit boundary, is it unreasonable to expect the compiler to report this as a potential problem, or at least generate a warning (support for bit-fields notwithstanding)?
> What am I missing here?
> There are informally two kinds of binaries: 8-bit aligned binaries (regular ones) are those people call 'binaries', and then you have bitstrings. Bitstrings don't need any alignment whatsoever. Your pattern can be made to work by using any fitting bitstring. For example:
> 1> <<_:4/binary-unit:8, _:255, _:7/binary-unit:8, 0:16>> = <<0:(32+255+7*8+16)>>.
> But more generally, the binary/bitstring distinction will make sense when pattern matching:
> 3> <<_:4/binary-unit:8, _/binary>> = <<0:(32+255+7*8+16)>>.
> ** exception error: no match of right hand side value <<0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> 4> <<_:4/binary-unit:8, _/bitstring>> = <<0:(32+255+7*8+16)>>.
> 5> <<_:4/binary-unit:8, _/bits>> = <<0:(32+255+7*8+16)>>.
> Note that bits is shorthand for bitstring. Do note as well that you can always cheat the pattern match by specifying units:
> 6> <<_:4/binary-unit:8, _/binary-unit:1>> = <<0:(32+255+7*8+16)>>.
> Nothing in Erlang actually mandates exact alignment; it's just that the default widths of the various types affect pattern matching. Dialyzer, however, does enforce some semantic values. Here's a sample module:
> -module(chk).
> -export([f/0, f/1, g/0, g/1]).
>
> f() -> f(<<0:17>>).
> g() -> g(<<0:17>>).
>
> -spec f(binary()) -> ok.
> f(Bin) ->
>     <<_/binary>> = Bin,
>     ok.
>
> -spec g(binary()) -> ok.
> g(Bin) ->
>     <<_/bits>> = Bin,
>     ok.
> If you run dialyzer on it, you'll find out the following:
> chk.erl:4: Function f/0 has no local return
> chk.erl:4: The call chk:f(<<_:17>>) will never return since the success typing is (binary()) -> 'ok' and the contract is (binary()) -> 'ok'
> chk.erl:5: Function g/0 has no local return
> chk.erl:5: The call chk:g(<<_:17>>) breaks the contract (binary()) -> 'ok'
> The type binary(), to Dialyzer, implies the 8-bit alignment you're looking for. The bitstring() type does not care about alignment. This is because Dialyzer supports defining binary types as:
> <<>> %% empty binary
> <<_:M>> %% fixed-size binary, where M is a positive integer
> <<_:_*N>> %% variable-size binary with an alignment on N
> <<_:M, _:_*N>> %% binary of at least M size, with a variable-sized tail aligned on N
> Essentially, binary() is defined as <<_:_*8>> and bitstring() is defined as <<_:_*1>>. This lets you encode whatever check semantics you'd like within type specifications, and Dialyzer can try to figure it out for you. But nothing, by default, would necessarily warrant compiler warnings since alignment on 8 bits is not mandated by the runtime.
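Coming back to the pattern from the original question, here is one way the intended check might be written. This is only a sketch: the module name and the check/1 function are hypothetical, and the literal 255 stands for "an octet with a value of 255" (an integer segment defaults to 8 bits). The is_binary/1 guard makes the 8-bit alignment requirement explicit at run time, and the spec gives Dialyzer the aligned return type to work with.

```erlang
-module(payload_check).
-export([check/1]).

%% Hypothetical rewrite of the pattern from the original question:
%% 4 bytes, one octet equal to 255, 7 bytes, then 16 zero bits.
%% The is_binary/1 guard rejects any payload that is not aligned
%% to an 8-bit boundary before the pattern is even tried.
-spec check(bitstring()) -> binary().
check(Payload) when is_binary(Payload) ->
    case Payload of
        <<_:4/binary-unit:8, 255, _:7/binary-unit:8, 0:16>> -> Payload;
        _ -> throw(drop)
    end;
check(_) ->
    throw(drop).
```

A 14-byte payload such as <<0:32, 255, 0:56, 0:16>> is returned unchanged, while a 359-bit bitstring such as <<0:359>> is dropped.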