[erlang-questions] Erlang + LLVM

Wed Feb 3 09:47:58 CET 2010

Tony Arcieri wrote:
> On Tue, Feb 2, 2010 at 9:06 AM, Ulf Wiger <ulf.wiger@REDACTED> wrote:
> 
>> It does indeed.
>>
>> http://old.nabble.com/benchmarks-llvm-vs-gcc-tt18837645.html#a18837645
>>
>> (I have no more info than that, though.)
>>
> 
> This is a bit different than what I was proposing.  This is using LLVM as
> the backend of the C compiler used to build the C source code of the Erlang
> interpreter.
> 
> I am suggesting LLVM could be used to compile BEAM bytecode to native code,
> enabling faster execution.  I'm aware HiPE already offers native code
> compilation.

To compile to native code, the HiPE compiler uses various internal 
representations which go roughly as follows:

   BEAM bytecode -> HiPE Icode -> HiPE RTL -> native (x86, x86_64, ...)

<aside>
   Incidentally, let me also comment on the following:

     In the first post of this thread, Tony Arcieri wrote:
     > ...
     > BEAM is a register machine.  LLVM is a register machine.
     > Seems like it might be possible?

   The answer to this question is "of course it is possible", but the way
   the question is phrased (probably due to haste) seems a bit naive.
   Yes, BEAM and LLVM are register-based virtual machines alright, but
   they are VMs at so different a level that the fact that they are
   register-based is not something that helps you much if you want to map
   one to the other. There is a reason why HiPE does not go directly from
   BEAM bytecode to native code.
</aside>
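
To make the difference in level concrete, here is a toy sketch in Python
of lowering one high-level "add" through two intermediate stages. The
instruction names, the tag layout, and the helper names (lower_to_mid,
lower_to_low) are all invented for illustration; they are not the real
BEAM, HiPE Icode, or HiPE RTL formats. The point is only that a single
high-level instruction hides type tests, tag arithmetic, and a runtime
fallback that must all become explicit before native code can be emitted.

```python
# Toy illustration only: instruction names and tag layout are hypothetical,
# NOT the actual BEAM, HiPE Icode, or HiPE RTL representations.

FIXNUM_TAG = 0b1111  # assumed: low 4 bits of a word mark a small integer


def lower_to_mid(instr, n):
    """High-level 'bytecode' op -> mid-level ops with explicit type tests.

    n is used to build labels that are unique per lowered instruction.
    """
    op, dst, a, b = instr
    if op != "add":
        raise ValueError(f"unsupported op: {op}")
    slow, done = f"slow{n}", f"done{n}"
    return [
        ("branch_unless_fixnums", a, b, slow),
        ("fixnum_add", dst, a, b, slow),        # slow path also taken on overflow
        ("goto", done),
        ("label", slow),
        ("call_runtime", dst, "generic_add", a, b),
        ("label", done),
    ]


def lower_to_low(mid_ops):
    """Mid-level ops -> register-transfer style ops on raw tagged words."""
    low = []
    for op in mid_ops:
        kind = op[0]
        if kind == "branch_unless_fixnums":
            _, a, b, target = op
            low += [
                ("and", "t0", a, b),            # tag bits survive only if set in both
                ("and_imm", "t0", "t0", FIXNUM_TAG),
                ("cmp_imm", "t0", FIXNUM_TAG),
                ("jump_if_ne", target),
            ]
        elif kind == "fixnum_add":
            _, dst, a, b, target = op
            low += [
                ("sub_imm", "t1", a, FIXNUM_TAG),  # drop one tag so the sum keeps one
                ("add", dst, "t1", b),
                ("jump_if_overflow", target),
            ]
        else:
            low.append(op)                      # labels, gotos, calls pass through
    return low


bytecode = [("add", "x0", "x1", "x2")]
mid = [m for n, ins in enumerate(bytecode) for m in lower_to_mid(ins, n)]
low = lower_to_low(mid)

for stage, ops in (("high", bytecode), ("mid", mid), ("low", low)):
    print(f"{stage}: {len(ops)} instruction(s)")
    for o in ops:
        print("   ", o)
```

Even in this toy, the runtime call on the slow path is still opaque at
the lowest level, which hints at how much of the semantics lives in the
runtime system rather than in the instruction stream.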

Anyway, about one and a half years ago, I gave a group of two graduate
students a project to map HiPE's RTL to LLVM's intermediate
representation (IR), with the aim of using the LLVM backend optimizations
and seeing how LLVM performs compared to HiPE.

The project never matured to the point where we could get measurements,
so I do not have anything very concrete to report here, other than some
thoughts.  If you are about to do this sort of thing, you have to ask
yourself the following questions:

   - Why am I doing this?  Is it just out of curiosity to see what sort
     of speedup I can get, or do I want to do it for real?

   In the latter case:

   - Is the performance from BEAM bytecode not good enough?
     Have I used HiPE on my code and failed to get a decent speedup?
     Do I have good reasons to believe that the speedup with LLVM will be
     (considerably) better in big applications?

   - What is the design I want?  Do I want to be able to run just native
     code produced by LLVM, or should I be able to load and run BEAM code
     too?

   - Who is responsible for memory management?  Do I use the same memory
     organization for processes as the current Erlang/OTP runtime system?

   - Do I maintain Erlang semantics as far as loading/reloading code of
     modules is concerned or not?

   - How is the scheduling of Erlang processes performed?
     Do I want a single-threaded or a multi-threaded implementation?

Note that the devil is not just in the details, but in major decisions.
The BEAM bytecode is not really self-contained.  A lot of issues are 
handled by the BEAM loader, which performs mucho magic, and/or appear in 
parts of the runtime system of Erlang/OTP.  Understanding and foreseeing 
these issues is the tricky part...

Kostis