[erlang-questions] Road-map for Erlang compiler?

Thu Dec 7 13:32:12 CET 2006

Richard A. O'Keefe wrote:
> Is there a road-map for the Erlang compiler anywhere?
> The last I heard there wasn't any description of the BEAM instructions,
> but surely there must be some sort of overview of the compiler structure.

Not that I know of. Let me try to give an overview:

- The BEAM compiler sources are found under lib/compiler/src/.

- The main file (the front end) is compile.erl, and it is
   fairly easy to look at the code and see what it does. The
   normal entry point is compile:file/2 (either called directly,
   or from the shell function 'c(Module)' implemented in the
   file lib/stdlib/src/c.erl, or from the erl_compile.erl module
   in lib/stdlib/src/ when using the erlc command line tool).

- The source files are read and parsed via the preprocessor,
   lib/stdlib/src/epp.erl, which uses erl_scan.erl for
   tokenization and erl_parse.erl for parsing (both also
   found under lib/stdlib/src/). erl_parse.erl is generated by
   running yecc on erl_parse.yrl.

- The main loop in compile.erl, select_passes/2, goes through a
   list of passes which transform the code step by step. Which
   passes are selected depends on the compiler options.

- The main passes are, in order:
     * run any parse transforms on the syntax tree

     * lint: checks that the tree is well-formed (many
       syntax checks are deferred to this stage, rather
       than being handled by the parser grammar); this
       is implemented in lib/stdlib/src/erl_lint.erl

     * save the current syntax tree as "abstract code" to
       be included as debug information

     * do Erlang-level rewriting of certain constructs and
       expressions, implemented in sys_pre_expand.erl.

     * convert to Core Erlang (done by v3_core.erl)

     * run the inliner (cerl_inline.erl) if requested

     * optimize the Core Erlang code, e.g. by constant
       folding (done by sys_core_fold.erl)

     * convert to a more linear code representation called
       "kernel" Erlang (done by v3_kernel.erl), doing
       pattern matching compilation and closure conversion,
       and computing variable usage.

     * convert again, to "annotated BEAM format" with variable
       lifetime information (done by v3_life.erl)

     * convert to proper BEAM (in v3_codegen.erl), and run
       several cleanup and optimization passes

     * validate the BEAM code

     * encode the BEAM code as a chunk of binary data

- The HiPE compiler is called if the 'native' option is specified
   and can take its input either from the final BEAM code or from
   the Core Erlang code. Native code is placed in a separate chunk
   in the .beam file, and is only loaded if it matches the platform
   that the system is running on.
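
To make the pipeline above a bit more concrete, here is a minimal sketch that drives some of the stages by hand from Erlang. The module names roadmap_demo and demo, the helper functions, and the tiny source text are all made up for illustration; the library calls (erl_scan:string/1, erl_parse:parse_form/1, erl_lint:module/1, compile:forms/2, core_pp:format/1, beam_lib:chunks/2, code:load_binary/3) are the standard ones. It skips epp, so it assumes a source text without macros or includes.

```erlang
-module(roadmap_demo).
-export([run/0]).

%% Hypothetical example source; not taken from the compiler sources.
source() ->
    "-module(demo).\n"
    "-export([double/1]).\n"
    "double(X) -> X * 2.\n".

%% Front end by hand: tokenize with erl_scan, then parse one form at a
%% time with erl_parse (epp normally drives both when reading a file).
parse(Source) ->
    {ok, Tokens, _EndLoc} = erl_scan:string(Source),
    [begin {ok, Form} = erl_parse:parse_form(Ts), Form end
     || Ts <- split_forms(Tokens, [], [])].

%% Split the token list after every 'dot' token, since parse_form/1
%% expects the tokens of exactly one form, terminated by a dot.
split_forms([{dot, _} = Dot | Rest], Cur, Acc) ->
    split_forms(Rest, [], [lists:reverse([Dot | Cur]) | Acc]);
split_forms([Tok | Rest], Cur, Acc) ->
    split_forms(Rest, [Tok | Cur], Acc);
split_forms([], [], Acc) ->
    lists:reverse(Acc).

run() ->
    Forms = parse(source()),

    %% The lint pass on its own: checks that the tree is well-formed.
    {ok, _Warnings} = erl_lint:module(Forms),

    %% Stop after the conversion to Core Erlang and pretty-print it.
    {ok, demo, Core} = compile:forms(Forms, [to_core]),
    io:format("~s~n", [core_pp:format(Core)]),

    %% Run all passes; debug_info keeps the abstract code in the result.
    {ok, demo, Beam} = compile:forms(Forms, [debug_info]),
    {ok, {demo, [{abstract_code, {_Vsn, _Abstract}}]}} =
        beam_lib:chunks(Beam, [abstract_code]),

    %% Load the freshly compiled module and call it.
    {module, demo} = code:load_binary(demo, "demo.beam", Beam),
    {length(Forms), demo:double(21)}.
```

When compiling from a file instead, compile:file/2 accepts options such as 'P', 'E', 'S' and to_core, which write a listing of the code after the corresponding stage rather than producing a .beam file. That is usually the quickest way to see what a given pass did to your module.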

Hope this helps,

     /Richard