[erlang-questions] bit syntax: 0-sized segments
Fri Mar 8 08:38:49 CET 2013
Another nice thing about 0-sized integers shows up when you add alignment
bits in protocols, in both matching and construction:
<<Bits:BitSize/binary, _Align:AlignSize, AlignedData/binary>> = Input
Output = <<Bits:BitSize/binary, 0:AlignSize, AlignedData/binary>>,
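A minimal sketch of how AlignSize might be computed so that it is 0 whenever the leading bits already end on a byte boundary. The module name align_sketch, the helper align_size/1, the functions pad/2 and unpad/2, and the choice of byte alignment are all assumptions for illustration. The leading segment uses /bitstring here so that BitSize counts bits rather than bytes.

```erlang
-module(align_sketch).
-export([pad/2, unpad/2]).

%% Number of zero bits needed to reach the next byte boundary.
%% Yields 0 when BitSize is already a multiple of 8.
align_size(BitSize) ->
    (8 - BitSize rem 8) rem 8.

%% Construction: unaligned bits, then padding, then byte-aligned data.
pad(Bits, AlignedData) when is_bitstring(Bits), is_binary(AlignedData) ->
    AlignSize = align_size(bit_size(Bits)),
    <<Bits/bitstring, 0:AlignSize, AlignedData/binary>>.

%% Matching: the same pattern works whether AlignSize is 0 or 1..7.
unpad(BitSize, Input) ->
    AlignSize = align_size(BitSize),
    <<Bits:BitSize/bitstring, _Align:AlignSize, AlignedData/binary>> = Input,
    {Bits, AlignedData}.
```

Because a 0-sized integer segment is legal, neither function needs a separate clause for the already-aligned case.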
On Fri, Mar 8, 2013 at 1:45 AM, Richard A. O'Keefe wrote:
> Integers in Erlang do not have silly size restrictions
> reflecting the underlying hardware. It's not the case
> that only 8, 16, 32, 64 make sense as integer sizes,
> for example.
> Let B and V be integers such that
> B >= 0
> 0 <= V < 2**B
> Then <<V:B>> is a bitstring containing exactly B bits such that
> <<R:B>> = <<V:B>>
> will exactly recover R == V.
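The round-trip property can be checked mechanically. The following is only a sketch: the module name roundtrip_sketch and the functions roundtrip/2 and check_all/1 are made up for illustration. It treats B = 0 exactly like any other size.

```erlang
-module(roundtrip_sketch).
-export([roundtrip/2, check_all/1]).

%% Encode V in B bits and decode it again.
roundtrip(V, B) when is_integer(B), B >= 0 ->
    Bin = <<V:B>>,
    B = bit_size(Bin),          % the bitstring holds exactly B bits
    <<R:B>> = Bin,
    R.

%% True if every V in 0..2^B-1 survives the round trip.
check_all(B) ->
    Max = (1 bsl B) - 1,
    lists:all(fun(V) -> roundtrip(V, B) =:= V end, lists:seq(0, Max)).
```

For B = 0 the only value in range is 0, and check_all(0) tests precisely that one case.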
> The interesting thing here is that B == 0 is NOT a special case.
> It is not, or _should_ not be, in any way surprising.
> What _has_ repeatedly caused surprise, as expressed on this mailing
> list, is the quiet truncation of integer values outside the [0,2**B)
> range. That would definitely justify an exception.
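Until construction itself raises such an exception, a range check can be written by hand. This is a sketch under assumptions: the module name checked_sketch and the function encode/2 are hypothetical, and badarg is just one plausible choice of error reason.

```erlang
-module(checked_sketch).
-export([encode/2]).

%% Like <<V:B>>, but refuses values outside 0..2^B-1
%% instead of silently keeping only the low B bits.
encode(V, B) when is_integer(V), is_integer(B), B >= 0,
                  V >= 0, V < (1 bsl B) ->
    <<V:B>>;
encode(V, B) ->
    erlang:error(badarg, [V, B]).
```

With this wrapper, encode(0, 0) succeeds and returns the empty bitstring, while encode(257, 8) raises instead of producing the same byte as <<1:8>>.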
> Suppose I am constructing an XML compressor, taking advantage of
> knowing the DTD. About to emit an element, I want to say "let P
> be the number of elements allowed here, counting #PCDATA as an
> element. Let B be the smallest integer such that 2**B >= P.
> Let V be the zero-origin index of the element type. Now encode
> V:B." Considering the number of elements where only #PCDATA is
> allowed, quite often I am going to want to encode 0:0.
> Zero is a perfectly good size, even for an integer.
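The compressor step described above could be sketched as follows. The module name choice_sketch and the functions bits_needed/1, encode_choice/2 and decode_choice/2 are invented for illustration and are not taken from any real XML tool.

```erlang
-module(choice_sketch).
-export([bits_needed/1, encode_choice/2, decode_choice/2]).

%% Smallest B >= 0 such that 2^B >= P, for P >= 1.
bits_needed(P) when is_integer(P), P >= 1 ->
    bits_needed(P, 0).

bits_needed(P, B) when (1 bsl B) >= P -> B;
bits_needed(P, B) -> bits_needed(P, B + 1).

%% Encode the zero-origin index V of one of P alternatives.
encode_choice(V, P) when is_integer(V), V >= 0, V < P ->
    B = bits_needed(P),
    <<V:B>>.

%% Decode an index from the front of Bits; returns {V, Rest}.
decode_choice(P, Bits) ->
    B = bits_needed(P),
    <<V:B, Rest/bitstring>> = Bits,
    {V, Rest}.
```

When only one alternative is allowed (P = 1), bits_needed/1 returns 0, the encoder emits no bits at all, and the decoder recovers index 0 without consuming any input. No special case is needed on either side.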
> Floats are very very different.
> Erlang *does* let a hardware size show through.
> The *only* size that makes sense is 64.
> And you _do_ get an exception if you specify the type as 'float'
> and the size as anything that doesn't resolve to 64.
>> When one moves to integer segments the above property does not make much sense anymore (esp. since bit_size is not defined for anything other than bitstrings). In particular, the current implementation of binary pattern matching has chosen to return an "arbitrary" integer, namely 0, as the result.
> But it is not arbitrary at all. It is forced by the rule for non-zero sizes.
> If the legal range for B bits (where B > 0) is 0..2**B-1, then the legal
> range for 0 bits *has* to be 0..0. This is the *only* consistent value.
>> I can well see that many would consider the following binding of X to 0 a bit weird.
>> 1> <<X:0/integer>> = <<42:0>>.
> That is a completely different issue. The thing that is weird here is
> in the *expression*, not the pattern, and it's allowing a value that
> does not in fact fit into the field and quietly truncating it.
> <<X:8/integer>> = <<257:8>>
> gives you X = 1. *THAT* is weird, but it has nothing whatever to do with
> zero sizes, and banning the perfectly sensible zero sizes will do nothing
> to stop the weirdness.
>> Moreover, the situation is arguably even more weird for floats:
>> 3> <<F:0/float>> = <<42:0>>.
>> 4> F.
> *That* I grant you. Since the only legal size in a construction is 64,
> and you get an exception if you try any other number, then the only legal
> size for a float in a pattern should also be 64.
>> I am not so convinced that pattern matching with 0-size segments makes sense for types other than bitstrings (binaries).
> It doesn't make sense for floats.
> But it _does_ make sense for integers.
> More precisely, it makes sense for *unsigned* integers.
> A *signed* integer has to be at least one bit, because
> there has to be somewhere to put the sign, otherwise it
> isn't signed.
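To illustrate the point about the sign bit: a B-bit two's-complement field covers -2^(B-1)..2^(B-1)-1, so the smallest meaningful signed field is one bit wide and holds only -1 or 0. The sketch below uses a made-up module name, signed_sketch, and hypothetical helpers decode_signed/1 and decode_unsigned/1 to read the same bits both ways.

```erlang
-module(signed_sketch).
-export([decode_signed/1, decode_unsigned/1]).

%% Interpret the whole bitstring as one signed integer.
decode_signed(Bits) when is_bitstring(Bits) ->
    B = bit_size(Bits),
    <<X:B/signed>> = Bits,
    X.

%% Interpret the whole bitstring as one unsigned integer.
decode_unsigned(Bits) when is_bitstring(Bits) ->
    B = bit_size(Bits),
    <<X:B/unsigned>> = Bits,
    X.
```

A single set bit decodes as 1 when unsigned and as -1 when signed, which is the sense in which the one bit is entirely taken up by the sign.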
> We AGREE that size 0 is sensible for bit strings.
> We AGREE that size 0 is not sensible for floats.
> We AGREE that size 0 is not sensible for signed integers.
> Do we really disagree much about unsigned integers?
> We AGREE that an explicit size 0 is odd enough to
> deserve a compiler _warning_. After all, if you
> _know_ you don't want any bits, why mention them?
> A warning is not a refusal to compile.
> Our disagreement seems to be limited to
> - whether an unsigned integer with a size of zero
> determined at run time has semantics forced by
> and consistent with the semantics of nonzero
> sizes and should certainly be allowed or is so
> weird that it should raise an exception.