[erlang-questions] Is there a good source for documentation on BEAM?

Thu May 10 08:22:38 CEST 2012

"As for what I see would cause a slowdown: the attention of the key
hackers would be spent on writing this documentation (and then
maintaining it, I assume)."

Perhaps better: volunteers could document it (on a relatively
controlled wiki, for example). Then the "key hackers" could mention
any needed corrections.

As for maintenance, you say yourself (in a later e-mail) that you can
only remember significant changes happening twice. Documenting such
infrequent changes doesn't exactly sound like some grinding daily
burden for already-overworked Ericsson programmers. If they have to
propose these changes in writing anyway (at least in internal e-mail),
it sounds like most of the documentation work gets done before the
changes are made.

-michael turner

On Thu, May 10, 2012 at 3:03 AM, Thomas Lindgren
<thomasl_erlang@REDACTED> wrote:
>
> ----- Original Message -----
>> From: Richard O'Keefe <ok@REDACTED>
>> To: Thomas Lindgren <thomasl_erlang@REDACTED>
>> Cc: Michael Turner <michael.eugene.turner@REDACTED>; "erlang-questions@REDACTED" <erlang-questions@REDACTED>
>> Sent: Tuesday, May 8, 2012 3:07 AM
>> Subject: Re: [erlang-questions] Is there a good source for documentation on BEAM?
>>
>>
>> On 8/05/2012, at 7:15 AM, Thomas Lindgren wrote:
>>>  There has been a substantial number of non-BEAM Erlang implementations already, so I'm
>>>  not convinced detailed BEAM docs is the key property* to spread Erlang.
>>
>> And how many of those non-BEAM implementations still exist?
>> Does GERL?  Is E2S still maintained?  How much of OTP can it handle?
>
> This, to my mind, says more about the (lack of) need for a second source implementation than any inherent
> problems with learning BEAM. If you want to try your hand, quite a bit of the complexity is not in handling BEAM
> as such but in reimplementing ERTS: writing the BIFs, SMP, memory management, etc.
>
>>>  Indeed, requiring detailed docs of every change of BEAM seems likely to
>>>  slow innovation down instead.
>>
>> I not only *don't* believe that, I *can't* believe that.
>> Joe has informed us that there are TWO levels of BEAM,
>> one of which has been very stable, and one of which has
>> changed many times.
>>
>> I don't even believe your claim if made about the low level
>> much changed "BEAM", but let's suppose it true for the sake
>> of argument.  If the high level of BEAM has remained pretty
>> stable for quite a while, how would documenting it have
>> slowed innovation down?
>
> Note that BEAM files are not guaranteed to be compatible across releases, and they do change incompatibly
> every now and then. (Not very often, to be sure. I recall it happening twice.) Check the mailing list for some discussions.
>
> The "sub-BEAM" implementation can change more rapidly, of course. I assume implementors there can do
> platform specific things like inline expanding instructions into native, mapping VM registers to native registers,
> constructing superinstructions, etc. (I seem to recall all of these being tried at one time or another.)
>
> As for what I see would cause a slowdown: the attention of the key hackers would be spent on writing this
> documentation (and then maintaining it, I assume). Perhaps people will start depending on documented details
> of implementation, explicitly or implicitly. Major changes would also mean major internal docs rewrites.
>
> See below for one option.
>
>> ... [pace of innovation, see below on kickstarter for my comment]
>>>
>
>>>  If the motive is education, I think someone interested in compilers and virtual machine architectures
>>>  would have little trouble with BEAM as such.
>>
>> I have an interest in compilers and VMs.  I worked professionally on Quintus
>> Prolog and the real WAM (not the one in the papers or Aït-Kaci's book).  And
>> trying to figure out the BEAM was such a slog that to be honest, I said to
>> myself "the hell with it, if they don't *WANT* me to understand the BEAM,
>> I'm not going to waste any more of my time trying to penetrate the obscurity".
>
>
> At that level of knowledge, I assume the BEAM instruction set in itself is no big hurdle.
> If you want to learn the internals beyond that, what level of detail are you looking for?
>
>>>  In a real sense, BEAM is just a vehicle to express compiler optimizations for a
>>>  restricted part of ERTS (the sequential execution part, basically).
>>
>> No, compiler optimisations are expressed in the executable code of the
>> compiler.  BEAM lets you express the *results* of such optimisations,
>> which is a different thing.  It's just like the Quintus compiler:  I could
>> figure out in that case what the *results* were, but the actual process
>> remained obscure.  (More precisely, what the 'invariants' were.)
>
>
> Here is how I see it: The instruction set of BEAM has been chosen for the purpose of expressing, and then used to express, various optimizations.
> Consider a simple example: targeting BEAM vs JAM (a stack machine used previously to implement Erlang).
> In order to optimize register use on JAM, you first have to translate the JAM code to a new intermediate language (and then probably never
> try to translate it back to JAM), while BEAM (like its uncle WAM) expresses registers explicitly and so makes such optimizations straightforward.
>
>> ...
>> Yes, I'm shouting.  "We don't need it" and "you don't need it" are utterly
>> different propositions, and too many people in too many areas of life fail
>> to realise that.
>
> (To avoid any confusion, let me add that I last worked at Ericsson CSLAB in 1998. So I'm hardly an OTP insider.)
>
> So perhaps the right approach is to do a Kickstarter to fund someone writing a deep-dive Erlang/OTP internals book?
> Complexity: roughly the level of writing a Linux kernel book, at a quick guess. Perhaps a bit easier.
>
>>>  Another argument might be that BEAM should be specified in detail in order to be a suitable binary format for distribution,
>>>  which is essentially what the JVM instruction set has become.
>>
>> I suggested many years ago that Erlang should take a leaf out of Kistler's
>> book (or PhD thesis).  The "Juice" system for Oberon compiled source files
>> to abstract syntax trees, then cleverly compressed the ASTs and used them
>> as the binary distribution form.  They came in smaller than .class files
>> and had no presuppositions about the target hardware (not even primitive
>> size and alignment if I recall correctly).  The cost of decompressing and
>> generating native code was low, to the point where it was faster to
>> dynamically load Juice files than their equivalent of .so/.dll files, and
>> the generated code actually ran faster because the code generator knew
>> more about the environment of the target, including existing code.  (I
>> don't know if the Juice runtime did cross-module inlining, but it would
>> have been possible.)
>
>
> Not a bad idea.
>
> Best regards,
> Thomas
>
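
Thomas's JAM-versus-BEAM comparison above (stack machine versus explicit
registers) can be made concrete with a toy sketch. Everything below is
hypothetical: the instruction names, the register names (x0, x1, ...) and the
convention that the result ends up in x0 are invented for illustration and are
not real JAM or BEAM instructions. It only tries to show why a register-level
cleanup can be a plain pass over the instruction list when operands are named
explicitly, whereas the stack version keeps operand locations implicit in stack
positions.

```python
# Toy illustration only: these are NOT real JAM or BEAM instructions.

def run_stack(code, env):
    """Stack-style code: operands live at implicit stack positions."""
    stack = []
    for op, *args in code:
        if op == "push_var":
            stack.append(env[args[0]])
        elif op in ("add", "mul"):
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if op == "add" else a * b)
        else:
            raise ValueError(f"unknown stack op: {op}")
    return stack.pop()


def run_reg(code, regs):
    """Register-style code: each instruction names its sources and destination."""
    regs = dict(regs)
    for op, *args in code:
        *srcs, dst = args
        vals = [regs[s] for s in srcs]
        if op == "move":
            regs[dst] = vals[0]
        elif op == "add":
            regs[dst] = vals[0] + vals[1]
        elif op == "mul":
            regs[dst] = vals[0] * vals[1]
        else:
            raise ValueError(f"unknown register op: {op}")
    return regs["x0"]  # assumed convention: result is left in x0


def propagate_copies(code):
    """Rewrite uses of a copied register to read the original register."""
    copies = {}  # dst -> src for moves that are still valid
    out = []
    for op, *args in code:
        *srcs, dst = args
        srcs = [copies.get(s, s) for s in srcs]
        # Writing dst invalidates every recorded copy that mentions it.
        copies = {d: s for d, s in copies.items() if d != dst and s != dst}
        if op == "move" and srcs[0] != dst:
            copies[dst] = srcs[0]
        out.append((op, *srcs, dst))
    return out


def drop_dead(code, live_out=("x0",)):
    """Backward liveness scan: drop instructions whose result is never read."""
    live = set(live_out)
    out = []
    for op, *args in reversed(code):
        *srcs, dst = args
        if dst in live:
            live.discard(dst)
            live.update(srcs)
            out.append((op, *args))
    out.reverse()
    return out


# (A + B) * A, with A in x0 and B in x1 for the register version.
STACK_CODE = [("push_var", "A"), ("push_var", "B"), ("add",),
              ("push_var", "A"), ("mul",)]
NAIVE_REG_CODE = [("move", "x0", "x2"), ("move", "x1", "x3"),
                  ("add", "x2", "x3", "x4"), ("mul", "x4", "x2", "x0")]

if __name__ == "__main__":
    optimized = drop_dead(propagate_copies(NAIVE_REG_CODE))
    print(optimized)
    print(run_stack(STACK_CODE, {"A": 3, "B": 4}),
          run_reg(NAIVE_REG_CODE, {"x0": 3, "x1": 4}),
          run_reg(optimized, {"x0": 3, "x1": 4}))
```

In this sketch the two moves disappear after the two passes because each
instruction already says which registers it reads and writes. Doing the
equivalent on STACK_CODE would first require working out which stack slot
holds which value, which amounts to translating it into some other
representation, roughly the extra step described for JAM above.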