[eeps] EEP for bitstrs

Raimo Niskanen raimo+eeps@REDACTED
Fri Aug 10 14:11:45 CEST 2007


I am resending this EEP for the archives. They have just
recently been made operational.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB
-------------- next part --------------
EEP: 4
Title: New BIFs for bit-level binaries (bitstrs)
Version: $Revision: 18 $
Last-Modified: $Date: 2007-08-10 14:05:57 +0200 (Fri, 10 Aug 2007) $
Author: Per Gustafsson
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Aug-2007
Erlang-Version: R12B-0
Post-History: 

Abstract
========

This EEP describes the introduction of bit-level binaries to the
Erlang programming language. They can be constructed and manipulated
using the bit syntax and a new set of BIFs that operate on bit-level
binaries. The new BIFs implement operations similar to those of the
existing BIFs for binaries, so that the semantics of the existing BIFs
do not have to be altered.

Definitions
===========

A bit-level binary, in this document called a *bitstr*, is a sequence of
bits of any length. A *binary*, on the other hand, is a sequence of bits
where the number of bits is evenly divisible by eight. These
definitions imply that any *binary* is also a *bitstr*.

Manipulating *bitstrs* using the bit syntax
===========================================

A bit syntax expression:

``<<Seg1,...,SegN>>``

evaluates to a *bitstr*. If the sum of the sizes of all segments in the
expression is divisible by eight, the result is also a binary.
Previously such expressions could only evaluate to binaries, and a
runtime error was raised if this was not the case.

With this extension the expression ``Bin = <<1:9>>``, which previously
caused a runtime error, now creates a 9-bit *bitstr*. To be able to use
this *bitstr* to build a new, bigger *bitstr* we can write:

``<<Bin/bitstr, 0:1>>``

Note the use of ``bitstr`` as the type. This expands to ``binary-unit:1``,
whereas the ``binary`` type would have expanded to ``binary-unit:8``.

To match out a bit-level binary we also use the bitstr type as in ::
    
    case Bin of
      <<1:1,Rest/bitstr>> -> Rest;
      <<0:1,_/bitstr>> -> 0
    end

This allows us to avoid situations where we would have to calculate
padding.
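
As a minimal sketch of the construction and matching described above,
the following module can be compiled on a released Erlang/OTP system.
This relies on an assumption that goes beyond this draft: released
systems spell the segment type ``bitstring`` rather than the ``bitstr``
proposed here. The module and function names are hypothetical and serve
only as illustration::

    -module(bitstr_syntax_demo).
    -export([append_zero/1, strip_flag/1]).

    %% Append a single 0 bit to any bit-level binary.
    %% NOTE: 'bitstring' is the released spelling of this draft's 'bitstr'.
    append_zero(Bin) ->
        <<Bin/bitstring, 0:1>>.

    %% Same shape as the case expression in the text: return the
    %% remaining bits when the first bit is 1, and 0 when it is 0.
    %% An empty input matches neither clause and raises case_clause.
    strip_flag(Bin) ->
        case Bin of
            <<1:1, Rest/bitstring>> -> Rest;
            <<0:1, _/bitstring>> -> 0
        end.

With these definitions, ``append_zero(<<1:9>>)`` yields a 10-bit value,
and ``strip_flag(<<1:1, 5:3>>)`` yields the 3-bit remainder ``<<5:3>>``
without any padding calculation.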
 

Specifications
==============

``bitsize/1::bitstr() -> integer()``

Returns the size of a *bitstr* in bits. This BIF is allowed in guards.

``list_to_bitstr/1::bitstr_list() -> bitstr()``

``bitstr_list() = cons(char() | bitstr() | bitstr_list(), bitstr() | bitstr_list()) | nil()``

Concatenates the *bitstrs* and chars in the bitstr_list to create a
*bitstr*. Each char becomes an 8-bit *bitstr*.

``is_bitstr/1::any() -> bool()``

Returns true if the argument is a *bitstr*, otherwise it returns
false. This BIF is allowed in guards.

``bitstr_to_list/1::bitstr() -> [char()|bitstr()]``

Turns a *bitstr* into a list of characters. If the number of bits
in the *bitstr* is not evenly divisible by eight, the last element of
the list is a *bitstr* consisting of the last 1-7 bits of the original
*bitstr*.
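
The four BIFs above can be exercised together as in the following
sketch. It assumes the names under which these operations exist in
released Erlang/OTP systems (``list_to_bitstring/1``,
``is_bitstring/1``, ``bit_size/1`` and ``bitstring_to_list/1``), which
differ from the names proposed in this draft. The module and function
names are hypothetical::

    -module(bitstr_bif_demo).
    -export([round_trip/1]).

    %% Concatenate a bitstr_list into one bit-level binary, then report
    %% its size in bits together with its list representation.
    %% NOTE: the BIF names are the released ones, not this draft's.
    round_trip(BitstrList) ->
        Bits = list_to_bitstring(BitstrList),
        true = is_bitstring(Bits),
        {bit_size(Bits), bitstring_to_list(Bits)}.

For instance, ``round_trip([$a, <<1:3>>])`` concatenates 8 + 3 bits and
returns ``{11, [97, <<1:3>>]}``: one whole byte followed by the trailing
three bits as a *bitstr*.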

Rationale
=========

The current definition of binaries makes it complicated to use the bit
syntax for decoding when the format is not byte oriented, because the
programmer is always forced to pad the binaries being used so that they
become a sequence of whole bytes. Allowing bit-level binaries alleviates
this problem.

The new BIFs proposed here are intended to give programmers the same
tools for manipulating bit-level binaries as they are used to for
binaries, without changing the semantics of already existing BIFs. This
also maintains properties of those BIFs; for example, if the expression:

``is_binary(X) andalso size(X) =:= 0``

evaluates to true, then that implies that ``X = <<>>``.

Implementation
==============

The extensions described in this document are either already
implemented in R11B-4, but protected by a compiler switch, or can be
easily implemented.

Backwards Compatibility
=======================

This change will not be entirely backward compatible. For example,
``N=9, <<1:N>>`` would cause an error in the old system, and now it would
evaluate to a *bitstr*.

The new BIFs are intended to give the same expressiveness for handling
bit-level binaries as we have for ordinary binaries without changing
the semantics of the BIFs for binaries such as size/1,
binary_to_list/1, list_to_binary/1 etc. This means that all such BIFs
will throw an exception if their arguments contain *bitstrs* that are
not binaries.
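
A sketch of what this separation means for calling code is given below.
It assumes a released Erlang/OTP system, where the guard test is spelled
``is_bitstring/1`` and where ``binary_to_list/1`` signals the failure as
a ``badarg`` error; this draft itself only says that an exception is
thrown. The module and function names are hypothetical::

    -module(bitstr_compat_demo).
    -export([classify/1, to_list_checked/1]).

    %% Every binary is also a bitstr, so the more specific is_binary/1
    %% test has to come first.
    classify(X) when is_binary(X) -> binary;
    classify(X) when is_bitstring(X) -> bitstr;
    classify(_) -> other.

    %% binary_to_list/1 keeps its old meaning: it accepts binaries only
    %% and fails for a bitstr whose bit count is not a multiple of 8.
    to_list_checked(X) ->
        try binary_to_list(X) of
            List -> {ok, List}
        catch
            error:badarg -> {error, badarg}
        end.

Here ``classify(<<1:9>>)`` gives ``bitstr`` and
``to_list_checked(<<1:9>>)`` gives ``{error, badarg}``, while both
functions treat ``<<1, 2>>`` as an ordinary binary.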

