[erlang-questions] Erlang R13B Multicore Efficiency Questions

Tue Jun 2 21:04:11 CEST 2009

Hi,

On Tue, Jun 2, 2009 at 6:25 PM, Greg Perry <Greg.Perry@REDACTED> wrote:
> Hello again list members,
>
> I have a few questions about the efficiency of Erlang using multiple
> core architectures and SMP enabled kernels.  I am in the process of
> testing Erlang for a high performance computing application in a
> virtualized clustering environment, using a fibre channel storage fabric
> and VMware vSphere ESXi 4.0 as the hypervisor.  With vSphere ESXi 4.0
> you have the option of passing up to 8 physical cores into a single VM,
> with unprivileged instructions being executed by the VM directly on the
> hardware cores; in VMware bare metal hypervisor environments, the CPU is
> not virtualized to get near-native performance inside the VM.

I don't understand why you need to involve VMware and virtualization
at all if all you want is to build a highly parallel application with
high performance.
Why not just run one OS instance (Linux would do well) over the whole
multi-processor, multi-core system?
On top of that you run one Erlang VM with as many internal schedulers
as there are logical processors (i.e. cores and hyperthreads).

You will then get a very simple system that scales very well.
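
To check that the emulator really runs one scheduler per logical
processor, something like the following could be used. It is only a
sketch: the module name sched_check is made up for illustration, and
erlang:system_info(logical_processors) can return the atom 'unknown'
on platforms where the emulator cannot detect the processor count.

```erlang
-module(sched_check).
-export([report/0]).

%% Hypothetical helper, not part of OTP.
%% Returns a property list describing the scheduler setup of the
%% running emulator.
report() ->
    [%% logical processors seen by the emulator (integer or 'unknown')
     {logical_processors, erlang:system_info(logical_processors)},
     %% scheduler threads created at emulator start
     {schedulers, erlang:system_info(schedulers)},
     %% scheduler threads currently allowed to run Erlang code
     {schedulers_online, erlang:system_info(schedulers_online)}].
```

Calling sched_check:report() from the shell shows whether the three
numbers agree. If they do not, the number of schedulers can be set
explicitly when starting the emulator, e.g. "erl +S 8".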

The Erlang VM with SMP support performs very well and is reliable. It
is used in many "non-stop" applications.
>
> With a multiple core architecture (say for example an Intel 5300 series
> quad core with two hardware processors on the motherboard) there are two
> camps of thought within the world of virtualized clustering:

Again, why involve virtualization in this?
>
> 1) Assign only a single CPU core to each VM and use a single processor
> kernel/hardware abstraction layer inside the VM's guest operating
> system; let the hypervisor handle scheduling and distribution of CPU
> resources across multiple VMs.
> 2) Assign multiple cores into a single VM, use an SMP-enabled
> kernel/hardware abstraction layer inside the VM's guest operating
> system, and rely upon the guest OS' SMP-capabilities and the guest OS
> application's ability to maximize concurrency.
>
> Each architecture has its merits, for example processor affinity can be
> used with Option #1 to "pin" a VM onto a specific core.  For a high
> performance cluster I could create 8 VMs on a single host, pin VM1 to
> core0, VM2 to core1, VM3 to core2 etc and therefore get an 8:1 hardware
> consolidation ratio with near-native performance.  The downside to this
> architecture is that extra memory and CPU resources are expended on the
> guest OS for each of the 8 VMs, although VMware vSphere/VI3.5 use
> transparent page sharing to "deduplicate" redundant 4K memory pages
> provided each VM has similar architecture and guest OS components.
>
> Option #2 has the benefit of a single VM without the overhead of 7
> additional VMs, but this requires that the guest OS has robust SMP
> support and the application (Erlang) has the ability to take advantage
> of multiple core architectures.  An added benefit to this architecture
> is that VMware ESX/ESXi use a relaxed co-scheduling method that will
> detect the idle loop being executed in secondary cores and will
> de-schedule those CPU resources to be used by other VMs, so in theory
> when those extra cores that are assigned to a single VM are not being
> used the hypervisor will free them up for other VMs.
>
> So the question is this - how robust are the SMP/multicore capabilities
> of Erlang R13B, and would a single Erlang node given multiple CPU cores
> achieve the same level of efficiency and utilization as, say, 8 separate
> non-SMP VMs, each with a single dedicated CPU core using hypervisor
> processor affinity?

The SMP/multicore capabilities of Erlang R13B are very robust.
Running one Erlang VM on multiple cores will give very good
performance and a simple system.
Of course, there are some internal shared data structures for
administrative purposes which can cause significant lock contention,
depending on how frequently they are accessed.
Message passing between Erlang processes also implies the use of
shared data structures that need to be protected by locks.

If you have many parallel computations that do not communicate with
each other at all during their work, then it can sometimes be more
efficient to run several Erlang VMs on the same system, for example
one per physical processor.
But the Erlang VM is continually being improved when it comes to
multi-core performance, so a single VM will always be a good choice,
as the application is simple and the total memory footprint is smaller.
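
As an illustration of the single-VM case, here is a minimal sketch of
independent computations that only send one message each, when they
are done, so there is very little message-passing overhead. The
module and function names (pwork, pmap, start_worker) are made up for
this example and are not part of OTP.

```erlang
-module(pwork).
-export([pmap/2]).

%% Apply F to every element of List, each in its own process, and
%% return the results in the original order. The workers do not talk
%% to each other; each one sends a single message back to the caller.
pmap(F, List) ->
    Parent = self(),
    Refs = [start_worker(Parent, F, X) || X <- List],
    %% Selective receive on each reference keeps the results in order.
    [receive {Ref, Result} -> Result end || Ref <- Refs].

%% Spawn one linked worker and return the unique reference that tags
%% its reply.
start_worker(Parent, F, X) ->
    Ref = make_ref(),
    spawn_link(fun() -> Parent ! {Ref, F(X)} end),
    Ref.
```

For example, pwork:pmap(fun(X) -> X * X end, [1, 2, 3]) returns
[1,4,9]. With one scheduler per core the emulator is free to run the
workers on different cores. Whether this is faster than several
separate VMs depends on the workload and has to be measured.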

/Kenneth, Erlang/OTP, Ericsson
> Thanks in advance
>
> Greg