[erlang-questions] Is there a good source for documentation on BEAM?
Thu May 10 08:22:38 CEST 2012
"As for what I see would cause a slowdown: the attention of the key
hackers would be spent on writing this
documentation (and then maintaining it, I assume)."
Perhaps better: volunteers could document it (on a relatively
controlled wiki, for example). Then the "key hackers" could mention
any needed corrections.
As for maintenance, you say yourself (in a later e-mail) that you can
only remember significant changes happening twice. Documenting such
infrequent changes doesn't exactly sound like some grinding daily
burden for already-overworked Ericsson programmers. If they have to
propose these changes in writing anyway (at least in internal e-mail),
it sounds like most of the documentation work gets done before the
changes are made.
On Thu, May 10, 2012 at 3:03 AM, Thomas Lindgren wrote:
> ----- Original Message -----
>> From: Richard O'Keefe
>> To: Thomas Lindgren
>> Cc: Michael Turner
>> Sent: Tuesday, May 8, 2012 3:07 AM
>> Subject: Re: [erlang-questions] Is there a good source for documentation on BEAM?
>> On 8/05/2012, at 7:15 AM, Thomas Lindgren wrote:
>>> There has been a substantial number of non-BEAM Erlang implementations already, so I'm
>>> not convinced detailed BEAM docs is the key property* to spread Erlang.
>> And how many of those non-BEAM implementations still exist?
>> Does GERL? Is E2S still maintained? How much of OTP can it handle?
> This, to my mind, says more about the (lack of) need for a second-source implementation than about any inherent
> problems with learning BEAM. If you want to try your hand, quite a bit of the complexity is not in handling BEAM
> as such but in reimplementing ERTS: writing the BIFs, SMP, memory management, etc.
>>> Indeed, requiring detailed docs of every change of BEAM seems likely to
>>> slow innovation down instead.
>> I not only *don't* believe that, I *can't* believe that.
>> Joe has informed us that there are TWO levels of BEAM,
>> one of which has been very stable, and one of which has
>> changed many times.
>> I don't even believe your claim if made about the low level
>> much changed "BEAM", but let's suppose it true for the sake
>> of argument. If the high level of BEAM has remained pretty
>> stable for quite a while, how would documenting it have
>> slowed innovation down?
> Note that BEAM files are not guaranteed to be compatible across releases, and they do change incompatibly
> every now and then. (Not very often, to be sure. I recall it happening twice.) Check the mailing list for some discussions.
> The "sub-BEAM" implementation can change more rapidly, of course. I assume implementors there can do
> platform specific things like inline expanding instructions into native, mapping VM registers to native registers,
> constructing superinstructions, etc. (I seem to recall all of these being tried at one time or another.)
> As for what I see would cause a slowdown: the attention of the key hackers would be spent on writing this
> documentation (and then maintaining it, I assume). Perhaps people will start depending on documented details
> of implementation, explicitly or implicitly. Major changes would also mean major internal docs rewrites.
> See below for one option.
>> ... [pace of innovation, see below on kickstarter for my comment]
>>> If the motive is education, I think someone interested in compilers and
>>> virtual machine architectures would have little trouble with BEAM as such.
>> I have an interest in compilers and VMs. I worked professionally on Quintus
>> Prolog and the real WAM (not the one in the papers or Aït-Kaci's book). And
>> trying to figure out the BEAM was such a slog that to be honest, I said to
>> myself "the hell with it, if they don't *WANT* me to understand the
>> I'm not going to waste any more of my time trying to penetrate the
> At that level of knowledge, I assume the BEAM instruction set in itself is no big hurdle.
> If you want to learn the internals beyond that, what level of detail are you looking for?
>>> In a real sense, BEAM is just a vehicle to express compiler optimizations for a
>>> restricted part of ERTS (the sequential execution part, basically).
>> No, compiler optimisations are expressed in the executable code of the
>> compiler. BEAM lets you express the *results* of such optimisations,
>> which is a different thing. It's just like the Quintus compiler: I could
>> figure out in that case what the *results* were, but the actual process
>> remained obscure. (More precisely, what the 'invariants' were.)
> Here is how I see it: The instruction set of BEAM has been chosen for the purpose of expressing, and then used to express, various optimizations.
> Consider a simple example: targeting BEAM vs JAM (a stack machine used previously to implement Erlang).
> In order to optimize register use on JAM, you first have to translate it to a new intermediate language (and then probably never
> try to translate it back to JAM), while BEAM (like its uncle WAM) expresses registers explicitly and so makes such optimizations straightforward.
>> Yes, I'm shouting. "We don't need it" and "you don't need it" are utterly
>> different propositions, and too many people in too many areas of life fail
>> to realise that.
> (To avoid any confusion, let me add that I last worked at Ericsson CSLAB in 1998. So I'm hardly an OTP insider.)
> So perhaps the right approach is to do a kickstarter to fund someone writing a deep dive Erlang/OTP internals book?
> Complexity: roughly the level of writing a Linux kernel book, at a quick guess. Perhaps a bit easier.
>>> Another argument might be that BEAM should be specified in detail in order
>>> to be a suitable binary format for distribution,
>>> which is essentially what the JVM instruction set has become.
>> I suggested many years ago that Erlang should take a leaf out of Kistler's
>> book (or PhD thesis). The "Juice" system for Oberon compiled source
>> to abstract syntax trees, then cleverly compressed the ASTs and used them
>> as the binary distribution form. They came in smaller than .class files
>> and had no presuppositions about the target hardware (not even primitive
>> size and alignment if I recall correctly). The cost of decompressing and
>> generating native code was low, to the point where it was faster to
>> dynamically load Juice files than their equivalent of .so/.dll files, and
>> the generated code actually ran faster because the code generator knew
>> more about the environment of the target, including existing code. (I
>> don't know if the Juice runtime did cross-module inlining, but it would
>> have been possible.)
> Not a bad idea.
> Best regards,