BEAM documentation (was Re: Packages in Erlang...)

Tue Sep 9 18:19:28 CEST 2003

"Erik Stenman" <Erik.Stenman@REDACTED> writes:

> I would suggest using the Icode language in the HiPE compiler.
> It is a very simple language with really only 15 different instructions.
> And there is already a compiler from Erlang or BEAM to Icode.
> (An Erlang->Core Erlang->Icode compiler is on its way.)

Do you implement e.g. the bit syntax in Icode, or is that done in the
runtime system? And will those 15 ops be enough or do you fear Icode
feature-creep?

One thing I find attractive about James's idea of using a "mid-level"
language as the target is that it makes it easy to write things like
the bit syntax: you take the bit-pattern and generate straightforward
code e.g. in terms of logand, bitshift, etc. Then you let the
mid-level-language compiler worry about making _that_ fast.
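
To make "in terms of logand, bitshift" concrete, here is a minimal
sketch of the kind of code such a translation might emit for two pieces
of the IP header. The function names are hypothetical and are not taken
from any compiler mentioned in this thread.

```lisp
;; Hypothetical helpers, for illustration only.
(defun decode-version-and-hlen (octet)
  "Split the first octet of an IP header into its two 4-bit fields."
  (declare (type (unsigned-byte 8) octet))
  (values (logand (ash octet -4) #x0F)   ; high nibble: version
          (logand octet #x0F)))          ; low nibble: header length

(defun decode-flags-and-offset (hi lo)
  "Split two octets into the 3-bit flags and the 13-bit fragment offset."
  (declare (type (unsigned-byte 8) hi lo))
  (let ((word (logior (ash hi 8) lo)))   ; combine into one 16-bit word
    (values (ash word -13)               ; top 3 bits: flags
            (logand word #x1FFF))))      ; low 13 bits: fragment offset
```

Nothing here is clever: it is plain shifts and masks on small integers,
which is the sort of code a mid-level compiler can be left to optimise.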

For example, I have implemented a bit syntax in Common Lisp, and it
was very easy. (Actually it was designed by Frode Vatvedt Fjeld, and I
wrote a compiler-based backend for it as outlined above.) My compiler
is 147 lines of code, very simple, and generates code that AFAICT the
CL compiler should be able to do great things with: it includes full
type information and does "the business" with low-level LDB/DPB/AREF
operations, i.e. it is a "production quality" bit syntax.

It's a different bit syntax to the Erlang one. Essentially you define
structures/records which include the details of how to encode/decode
them as binary. For example, here is the Internet Protocol (IP) header:

  (define-binary-bitfield-struct iph ()
    (version         nil   :binary-type ip-version)
    (hlen            nil   :binary-type ip-header-length)
    (tos             nil   :binary-type ip-type-of-service)
    (total-len       nil   :binary-type ip-total-length)
    (id              nil   :binary-type ip-identification)
    (flags           nil   :binary-type ip-flags)
    (fragment-offset nil   :binary-type ip-fragment-offset)
    (ttl             nil   :binary-type ip-time-to-live)
    (protocol        nil   :binary-type ip-protocol)
    (checksum        nil   :binary-type ip-header-checksum)
    (source          nil   :binary-type ip-addr)
    (dest            nil   :binary-type ip-addr)
    (options         '()))

Then there are separate declarations saying that `ip-addr' is four
octets and maps onto a vector/array, that ip-fragment-offset is an
unsigned 13-bit value, and so on.
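
As a sanity check on those widths, here is a small sketch that lists
the fixed fields with their sizes in bits and adds them up. The variable
and function names are assumptions made for this example, and this is
not the declaration syntax the library itself uses; the widths are the
ones that appear in the generated code further down.

```lisp
;; Hypothetical table of field widths, for illustration only.
(defparameter *iph-fixed-field-bits*
  '((version . 4) (hlen . 4) (tos . 8) (total-len . 16)
    (id . 16) (flags . 3) (fragment-offset . 13)
    (ttl . 8) (protocol . 8) (checksum . 16)
    (source . 32) (dest . 32))
  "Width in bits of each fixed field of the IP header.")

(defun total-bits (field-widths)
  "Sum the widths of an alist of (name . bits) pairs."
  (reduce #'+ field-widths :key #'cdr))

;; (total-bits *iph-fixed-field-bits*) => 160, i.e. 20 whole octets
```

The sub-byte fields pair up into whole octets (4+4 and 3+13), so a
writer that buffers bits into bytes ends the header with nothing left
over in its buffer.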

Then you can ask the compiler to make a function that reads this
structure from, or writes it to, a byte buffer. Here is the write
function that it generates (as pretty-printed by Lisp):

  (defun iph-to-vector (object &optional (buffer-spec 512))
    (let ((buffer (make-buffer buffer-spec)))
      "Encode an IP header as a vector."
      (let ((bit-buffer 0) (bits-buffered 0) (bytes-written 0))
        (declare (type (simple-array (unsigned-byte 8) (*)) buffer)
                 (type iph object)
                 (type (unsigned-byte 8) bit-buffer)
                 (type (integer 0 8) bits-buffered)
                 (type fixnum bytes-written))
        (labels ((output-byte! (value)
                   (setf (aref buffer bytes-written) value)
                   (incf bytes-written))
                 (output-bits! (value bits)
                   "Output BITS of VALUE (buffered into whole bytes.)"
                   (declare (type fixnum value bits))
                   (loop
                    (let ((take-bits (min bits (- 8 bits-buffered))))
                      (setf bit-buffer
                              (dpb
                               (ldb (byte take-bits (- bits take-bits)) value)
                               (byte take-bits (- 8 (+ take-bits bits-buffered)))
                               bit-buffer))
                      (incf bits-buffered take-bits)
                      (decf bits take-bits)
                      (when (= 8 bits-buffered)
                        (output-byte! bit-buffer)
                        (setf bits-buffered 0))
                      (when (zerop bits) (return t))))))
          (let ((object object))
            (output-bits! (slot-value object 'version) 4)
            (output-bits! (slot-value object 'hlen) 4)
            (output-bits! (slot-value object 'tos) 8)
            (output-bits! (slot-value object 'total-len) 16)
            (output-bits! (slot-value object 'id) 16)
            (output-bits! (slot-value object 'flags) 3)
            (output-bits! (slot-value object 'fragment-offset) 13)
            (output-bits! (slot-value object 'ttl) 8)
            (output-bits! (slot-value object 'protocol) 8)
            (output-bits! (slot-value object 'checksum) 16)
            (let ((object (slot-value object 'source)))
              (let ((object (slot-value object 'value)))
                (dotimes (i 4) (output-bits! (aref object i) 8))))
            (let ((object (slot-value object 'dest)))
              (let ((object (slot-value object 'value)))
                (dotimes (i 4) (output-bits! (aref object i) 8)))))
          (adjust-array buffer (list bytes-written))))))
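
Only the writer is shown above. For comparison, here is a minimal sketch
of what a matching read primitive could look like, consuming bits most
significant first just as `output-bits!' produces them. The name
`make-bit-reader' is hypothetical and this is not the code the compiler
generates; it reads one bit at a time for clarity rather than speed.

```lisp
;; Hypothetical reader, for illustration only.
(defun make-bit-reader (buffer)
  "Return a closure that reads BITS bits from BUFFER, MSB first."
  (let ((bit-position 0))                ; absolute bit offset into BUFFER
    (lambda (bits)
      (let ((value 0))
        (dotimes (i bits value)
          (multiple-value-bind (byte-index bit-index)
              (floor bit-position 8)
            ;; Shift the accumulated value left and append the next bit.
            (setf value
                  (logior (ash value 1)
                          (ldb (byte 1 (- 7 bit-index))
                               (aref buffer byte-index))))
            (incf bit-position)))))))

;; Usage: read version, hlen, tos and total-len from four octets.
;; (let ((read-bits (make-bit-reader (vector #x45 #x00 #x00 #x54))))
;;   (list (funcall read-bits 4) (funcall read-bits 4)
;;         (funcall read-bits 8) (funcall read-bits 16)))
;; => (4 5 0 84)
```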

I like this a lot:
- Compiler is very simple.
- Generated code is perfectly readable (if you know CL, and can
  understand my probably suboptimal `output-bits!' algorithm!)
- Generated code includes full type information for the Lisp compiler.
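
To give a feel for why such a compiler can stay small, here is a sketch
of the central step: turning a list of field widths into the
`output-bits!' calls seen in the generated function. The function name
and the alist input format are assumptions for this example; the real
compiler works from the struct definition and also handles nested types
such as `ip-addr'.

```lisp
;; Hypothetical generator step, for illustration only.
(defun generate-field-writers (field-widths)
  "Return one (output-bits! ...) form per (name . bits) pair."
  (loop for (name . bits) in field-widths
        collect `(output-bits! (slot-value object ',name) ,bits)))

;; (generate-field-writers '((version . 4) (hlen . 4)))
;; => ((OUTPUT-BITS! (SLOT-VALUE OBJECT 'VERSION) 4)
;;     (OUTPUT-BITS! (SLOT-VALUE OBJECT 'HLEN) 4))
```

The resulting forms are ordinary list data, so they can be spliced into
a `defun' template and handed to the Lisp compiler.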

Is there room for this programming style in the guts of an Erlang
implementation? I promise to be an enthusiastic contributor if so :-)

PS,
  Compiler is here:
    http://www.bluetail.com/~luke/misc/lisp/binary-rw-gen.html
  Definitions of the TCP/IP suite's PDUs are here:
    http://www.bluetail.com/~luke/misc/lisp/netlib-structures.html

PPS, wanna see a working ethernet switch that fits on one screen? :-)
    http://www.bluetail.com/~luke/misc/lisp/switch.html

Cheers,
Luke