[erlang-questions] Erlang and the learning curve
Wed Jan 5 10:19:56 CET 2011
Re: Performance my take
What do I think about performance, C, NIFs etc.?
Erlang came from an industrial (Ericsson) lab - so our ideas about performance,
backward compatibility etc. are highly influenced by the environment we work in.
What follows is my take on efficiency in our particular (industrial) setting.
It's not a general position statement, but rather a reflection of how
I view efficiency etc. in our environment.
#1 - Performance only has meaning with respect to a specified requirement.
The spec says: we have to do this in 10 ms. If you do it in 9 ms it's
ok - 11 ms is not.
It's either good enough - or it's not.
If it's not good enough we can tweak the algorithms (ie stay in the
same language) or drop to a lower level
language (C). Tweaking the algorithm is preferred since it's easier to
stay in one language than use
two languages. Also a bloody sight easier to maintain.
#2 - The cost of optimization must be weighed against the cost of
a) buying more hardware b) maintenance c) time to market d) quality e) delivery time.
It is by no means clear that optimization is good. In small runs -
"throw more hardware at it" is often far
cheaper than spending programmer hours on the problem. (Small run
means we ship one product.)
If you ship a volume product - tens of thousands of units then you can
spend enormous amounts of money
on optimizations - since the *total* costs are minimized.
#3 - NIFs are a convenient way to call C functions - you often want to
call C to gain access to
libraries that you do not want to recode in Erlang, or hit parts of
the system that Erlang cannot reach.
#4 - Optimizations in Erlang (think binaries) were made when there
was a *generic* benefit from the optimization, ie we thought all users
would benefit. Other optimizations (like making reverse a native
function) were made because they were easy to do and have generic
benefit. Doing reverse in C rather than Erlang is
a purely pragmatic thing - there is no philosophy involved here.
#5 - Erlang is not "just a layer on top of C" as some postings in this
thread have suggested.
It would be more accurate to think of the underlying "C" as composed
of two parts: a) a run-time
b) an interface to assembler.
The run-time a) includes things like the garbage collector and
process-scheduler (actually the gc etc. should have been written in a
low level dialect of erlang and cross compiled to C (or assembler),
but that's a different story).
The generated beam code is an example of b) - beam started as a set of
C macros, but the macros themselves
looked like instructions in some weird assembler language. For
pragmatic reasons it's better to
interpret rather than compile the C macros (don't ask ...). In this
sense C is used as a very thin layer
over assembler. I guess one day in the not too distant future we'll
JIT the beam code and bypass C.
When this happens Erlang will in no way represent a layer over C.
#6 - efficient linking of foreign code assumes that caller and callee
both obey the same ABI (Application Binary Interface) - ABIs reflect
either a) the real machine b) a VM
Examples of a) are the 32 bit ABI on an intel X86, or the AMD64
conventions for 64 bit systems.
All ABIs reflecting real processors have an incredibly anal approach to
data. An integer passed as
an argument in a register or on the stack must have *exactly* 32 or 64 bits.
Saying that integers are (say) always exactly 32 bits and that "it is
the programmer's responsibility to
avoid integer overflows" results in systems that are highly efficient,
but the code is error prone, difficult to
prove correct and (very) difficult to extend if you need to extend the domain.
Examples: In 32 bits you can address 4 GBytes of memory - addressing
more memory will cause a
major rewrite of your code. If 32 bit precision doesn't cut it and
you need 64 bits - then a major rewrite is needed.
b) VMs suffer no such problems. If you want 87 bit registers and
bignum arithmetic then this is easy
(if you design your own VM) - if you choose a "well known" VM like
.NET or the JVM then you have to
live with all the design constraints imposed by the VM - these can be
just as bothersome as living with
an ABI. You can JIT the VM code to real machine code, but JIT'ing will
never be efficient if there is
not a 1:1 relationship between the VM and the underlying machine. If
the VM defined registers of 33 bits
then JIT'ing to a 32 bit processor would never be efficient.
#7 - Erlang was designed for building fault-tolerant systems
A lot of the design decisions in erlang were made to reduce the number
of programmer errors.
bignums are there to eliminate precision problems in integer arithmetic.
We *don't* use a binary
ABI to include foreign code since this would make the system unsafe.
These design decisions make the system inherently less efficient - it
is *designed* to be safe not efficient.
The amazing thing is that we can even compete in terms of efficiency
with systems that were designed to be efficient but not safe.
Safety costs - if you need the extra performance do it *outside*
erlang in C - that's the way we designed it to
be. This is not a bug or a feature - it was designed this way.
NIFs BTW make it *somewhat* easier to include foreign code and use an
ABI interface, at the expense of
making the system unsafe.
There is a trade off between safety and performance here.
If you want safety - don't use NIFs. If you want performance AND
ACCEPT THE RISKS INVOLVED,
use NIFs - this is the same trade off as offered by external port
programs and linked-in drivers.
I'd almost always go with external port programs - they are pretty
easy to write and you don't get into
ABI nightmares. (I tried making a NIF with a mixture of 64 bit and 32
bit shared libraries on a mac - mixing
32 and 64 bit native compiled code is a complete and total mess,
unless you know exactly what you are doing
(which I don't).)
#8 - real efficiency is done by moving from low level languages into hardware
For example, parsing XML is about 3000 times faster on an FPGA than
the fastest C++ parser
Decisions to go to hardware involve massive escalations in cost and
time to market.
#9 - finally
The old masters said it all 'first make it right, then make it fast'
This is what we do: (We = ///)
1) hack it in erlang
2) fast enough - ship it
3) tweak the erlang
4) fast enough? - ship it
5) hack it in C?
6) fast enough - ship it
7) tweak the C
8) fast enough? - ship it
9) make an FPGA
10) fast enough - ship it
11) make an ASIC
12) fast enough - ship it
As you go down this list things cost more and more - step 11 costs
1M$/try - to pass step 9
you need to have a high volume product (10's to hundreds of thousands of units)
If you ship one unit you'll hover around 1) + 3) and throw more
hardware at the problem.
Decisions to optimize or not have far-reaching consequences for cost
of maintenance, quality, etc.
In mail groups like this most discussion will hover around steps 1) - 7);
what happens beyond this, in 8) - 12),
is cloaked in secrecy - yeah we make chips - what do they do? how do
they work? what are they programmed
in? - that's all secret - you'll have to guess.
Now I have to do some real work - somewhere in the 8) - 12) region -
can't tell you what :-)
On Wed, Jan 5, 2011 at 7:47 AM, Ulf Wiger wrote:
> On 5 Jan 2011, at 05:03, Steve Vinoski wrote:
>> I thought it provided a richer way of working with binaries. Are
>> binaries present in the language only for performance reasons? Klacke
>> can tell us for sure, but I don't think so; they're there to provide a
>> simple yet extremely powerful abstraction for manipulating bytes and
>> bits. When you're dealing with network protocols, like you do in the
>> telecoms world for which Erlang was originally created, the need to do
>> that easily and in a way that makes for readable and maintainable code
>> is critically important. The fact that binaries and bitstrings also
>> turn out to be useful in other app domains should come as no surprise.
> Just to straighten out the ancient history part, binaries were indeed
> introduced into the language for performance reasons - or rather, in
> order to provide a more semantically suitable datatype for handling
> network packets. This was back in the Good Old Days, when network
> protocols tended to be byte-coded rather than bulky text, so the most
> common ways to handle the packets were:
> - decode them, or extract some component part, based on offsets and
> byte codes
> - pass them along as payload
> Binaries are much more appropriate for this than linked lists of bytes,
> and protocol handling is so commonplace in Erlang that it makes perfect
> sense to provide the most suitable datatypes for the task.
> Note that originally, we had binaries without the bit syntax. You manipulated
> binaries using split_binary/2, binary_to_list/1, list_to_binary/1, and decoded
> them (after converting to lists) mainly with bsl, bsr, band and bor, which was
> quite horrible, especially when the data was bit-oriented rather than byte-oriented.
> The bit syntax was added to make the important task of encoding and decoding
> network packets more tractable. It was added for performance, and indeed,
> for several years, a hand-coded encode/decode would outperform the bit
> syntax. Still, practically everyone started using the bit syntax.
> The binary module was largely created because with the bit syntax, binaries
> turned out to be more suitable for many text processing tasks than lists
> (especially after providing a fast regexp library). Given that text representations
> are now ubiquitous also in protocol handling, and in other areas where
> Erlang has started becoming popular, it feels pretty natural to have a module
> that provides efficient text searching and manipulation operations - the kind
> that practically every mainstream language supports.
> Ulf W
> Ulf Wiger, CTO, Erlang Solutions, Ltd.
> erlang-questions (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html