Erlang is named after the mathematician Agner Erlang. Among other things, he worked on queueing theory. The original implementors also tend to avoid denying that Erlang sounds a little like "ERicsson LANGuage".
During the 1980s there was a project at the Ericsson Computer Science Laboratory which aimed to find out what aspects of computer languages made it easier to program telecommunications systems. Erlang emerged in the second half of the 80s as the result of taking those features which made writing such systems simpler and avoiding those which made them more complex or error prone.
The people involved at the start were Joe Armstrong, Robert Virding and Mike Williams. Others joined later and added things like distribution and OTP.
Mostly from Prolog. Erlang started life as a modified Prolog. ! as the send-message operator comes from CSP. Eripascal was probably responsible for , and ; being separators and not terminators.
(The following is my personal impression. I don't speak for Ericsson!)
Nothing to lose: Ericsson's core business is telecommunications products, selling programming tools is not really a business Ericsson is interested in.
Stimulate adoption: Erlang is a great language for many sorts of systems. Releasing a good, free development environment is likely to make Erlang catch on faster.
Generate goodwill: Giving away cool software can only improve Ericsson's image, especially given the current level of media attention around "open software".
Implement useful systems using Erlang. Spread the word!
Erlang is the easiest way to write fault-tolerant realtime software, but people won't use it if they don't know about it.
The more people we have using Erlang, the better quality the product becomes, the more cool applications we get and the more libraries are added to Erlang. Volunteers have already fixed several important bugs, created a Debian GNU/Linux package and ported Erlang to new platforms.
Cynics will say "basically nothing".
A hard realtime system is one which can guarantee that a certain action will always be carried out in less than a certain time. Many simple embedded systems can make hard realtime guarantees, e.g. it is possible to guarantee that a particular interrupt service routine on a Z80 CPU will never take more than 34us. It gets progressively harder to make such guarantees for more complex systems.
Many telecomms systems have less strict requirements, for instance they might require a statistical guarantee along the lines of "a database lookup takes less than 20ms in 97% of cases". Soft realtime systems, such as Erlang, let you make that sort of guarantee.
A rule of thumb is that it is straightforward to write Erlang programs which can respond to external events within a few milliseconds. The parts of Erlang which help with this are:
(or: how was Erlang bootstrapped?) In Joe's words:
First I designed an abstract machine to execute Erlang. This was called the JAM machine; JAM = Joe's Abstract Machine.
Then I wrote a compiler from Erlang to JAM and an emulator to see if the machine worked. Both these were written in Prolog.
At the same time Mike Williams wrote a C emulator for the JAM.
Then I rewrote the erlang-to-jam compiler in Erlang and used the Prolog compiler to compile it. The resultant object code was run in the C emulator. Then we threw away Prolog.
Some of this is described in an old paper
When in doubt about exactly what the language allows or does, the best place to start is the Erlang Specification. This is still a work in progress, but it covers most of the language.
Yes, but only within one process.
If there is a live process and you send it message A and then message B, it's guaranteed that if message B arrived, message A arrived before it.
On the other hand, imagine processes P, Q and R. P sends message A to Q, and then message B to R. There is no guarantee that A arrives before B. (Distributed Erlang would have a pretty tough time if this was required!)
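The pairwise ordering guarantee can be seen in a small experiment. This is only a sketch: the module name order_sketch and the stop message are made up for illustration, and it runs on a single node.

```erlang
-module(order_sketch).
-export([demo/0]).

%% Spawn a collector, send it a and then b, and return the order in
%% which the collector saw them.
demo() ->
    Parent = self(),
    Pid = spawn(fun() -> collect(Parent, []) end),
    Pid ! a,
    Pid ! b,
    Pid ! stop,
    receive
        {Pid, Seen} -> Seen
    end.

%% Accumulate every message until stop arrives, then report back.
collect(Parent, Acc) ->
    receive
        stop -> Parent ! {self(), lists:reverse(Acc)};
        Msg  -> collect(Parent, [Msg | Acc])
    end.
```

Because all three messages travel from one sender to one receiver, demo() returns [a,b]. Nothing comparable can be written for the P, Q, R case, since no ordering is promised there.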
Most people find it simplest to program as though the answer was "yes, always".
Per Hedeland covered the issues on the mailing list (edited a bit):
"Delivery is guaranteed if nothing breaks" - and if something breaks, you will find out provided you've used link/1. I.e. you will get an EXIT signal not only if the linked process dies, but also if the entire remote node crashes, or the network is broken, or if any of these happen before you do the link.
It seems this issue of "guaranteed delivery" comes up every now and then, but I've never managed to find out exactly what it is those that are asking for it actually want:
Add to this that any guarantee would have to entail some form of ack from the remote in at least a distributed system, even if it wasn't directly visible to the programmer. E.g. you could have '!' block until the ack comes back from the remote saying that the message had progressed however far you required - i.e. synchronous communication of sorts. But this would penalize those that don't require the "guarantee" and want asynchronous communication.
So, depending on your requirements, Erlang offers you at least these levels of "guarantee":
Receiver sends ack after processing; sender links, sends, waits for ack or EXIT. This means the sender knows, for each message, whether it was fully processed or not.
Receiver doesn't send acks; Sender links, sends message(s). This means an EXIT signal informs the sender that some messages may never have been processed.
Receiver doesn't send acks; sender sends messages. :-)
There are any number of combinations of these (e.g. receiver sends ack not after each message but at some critical points in the processing).
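The first of these levels might look roughly like the sketch below. It is not Per's code: the module name, the message shapes ({msg, From, Ref, Msg} and {ack, Ref}) and the timeout are all assumptions, and it uses erlang:monitor/2 where the text says link, because a monitor delivers the failure as an ordinary 'DOWN' message without the sender having to trap exits.

```erlang
-module(ack_sketch).
-export([send_acked/3, server/0, demo/0]).

%% Send Msg to Pid and wait until it has been processed.
%% Returns ok, or {error, Reason} if the receiver died or timed out.
send_acked(Pid, Msg, Timeout) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! {msg, self(), Ref, Msg},
    receive
        {ack, Ref} ->
            erlang:demonitor(Ref, [flush]),
            ok;
        {'DOWN', Ref, process, Pid, Reason} ->
            {error, Reason}
    after Timeout ->
        erlang:demonitor(Ref, [flush]),
        {error, timeout}
    end.

%% Hypothetical receiver: handle the message, then acknowledge it.
server() ->
    receive
        {msg, From, Ref, _Msg} ->
            %% ... real processing would go here ...
            From ! {ack, Ref},
            server()
    end.

%% One successful send, then one send to a killed receiver.
demo() ->
    Pid = spawn(fun server/0),
    ok = send_acked(Pid, hello, 1000),
    exit(Pid, kill),
    {error, _} = send_acked(Pid, hello, 1000),
    ok.
```

Dropping the ack clause and the wait gives the second level; dropping the monitor as well gives the third.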
Per concluded by pointing out that "if you think TCP guarantees delivery, which most people probably do, then so does Erlang".
The Erlang language doesn't specify any limits, but different implementations have different limits on the number of processes, the maximum amount of RAM and so on. These are documented for each implementation.
Yes and no. A fun is a reference to code; it is not the code itself. Thus a fun can only be evaluated if the code it refers to is actually loaded on the evaluating node. In some cases, we can be sure that execution will never return to a fun. In these cases the old code can be purged without problems. The following code causes no surprises:
Parent = self(),
F = fun() -> loop(Parent) end,
spawn(F).
On the other hand, a bound fun will not be replaced. In the following example, the old version of F is executed even after a code change:
-module(cc).
-export([go/0, loop/1]).

go() ->
    F = fun() -> 5 end,
    spawn(fun() -> loop(F) end).

loop(F) ->
    timer:sleep(1000),
    F(),
    cc:loop(F).
This type of problem can be solved in the code_change callback of the standard behaviours (code_change/3 in gen_server, for example).
Some typical ways to get in trouble are:
A general way to avoid problems with funs which refer to code which doesn't exist is to store the function by name instead of by reference, e.g. by writing fun M:F/A.
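As a sketch of that advice, here is the cc example reworked to hold the function by name. The module name cc_named and the helper five/0 are invented for the illustration.

```erlang
-module(cc_named).
-export([go/0, loop/1, five/0]).

five() -> 5.

go() ->
    %% fun M:F/A is looked up by name each time it is called, so it
    %% follows whatever version of cc_named is currently loaded.
    F = fun cc_named:five/0,
    spawn(cc_named, loop, [F]).

loop(F) ->
    timer:sleep(1000),
    F(),
    cc_named:loop(F).
```

The price is that five/0 has to be exported, and the fun can no longer capture variables from its environment.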
Compared to the rest of Erlang, records are rather ugly and error prone. They're ugly because they require an awful lot of typing (no pun intended). They're error prone because the usual method of defining records, the -include directive, provides no protection against multiple, incompatible definitions of records.
Several ways forward have been explored. One is lisp-like structs which have been discussed on the mailing list. Another is Richard O'Keefe's abstract patterns which was also posted. Then there is also a suggestion for making records more reliable.
Sure. Here's an example of how to do it using nested receives:
receive
    {priority_msg, Data1} -> priority(Data1)
after
    0 ->
        receive
            {priority_msg, Data1} -> priority(Data1);
            {normal_msg, Data2} -> normal(Data2)
        end
end.
The following "questions" all relate to topics which have generated long discussions in public forums, often with some amount of stepping on people's toes. If you're going to post a news article (or write a report, or...) about any of these, reading the answers here might help you avoid some arguments we've already been through.
Current versions of Erlang (R12B onwards) ship with a static type analysis system called the Dialyzer. Using the Dialyzer is optional, though many or most serious projects use it. The Dialyzer does not require source code to be modified or annotated, though annotations increase the number of problems the Dialyzer can find.
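For readers who haven't seen such annotations, they look something like this. The module, the shape() type and area/1 are invented for the illustration; -type and -spec are the standard attributes.

```erlang
-module(spec_sketch).
-export([area/1]).

%% A user-defined type: a tagged tuple for each kind of shape.
-type shape() :: {circle, number()} | {square, number()}.

%% The spec states what area/1 accepts and returns. The compiler does
%% not enforce it; the Dialyzer checks callers and the body against it.
-spec area(shape()) -> float().
area({circle, R}) -> math:pi() * R * R;
area({square, S}) -> float(S * S).
```

A call such as spec_sketch:area(triangle) still compiles, but it is the sort of mistake the Dialyzer can report once the spec is there.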
Until about 2005, static type checking was used rarely in commercial Erlang-based systems. Several people experimented with various approaches to the problem, including Sven-Olof Nyström, Joe Armstrong, Philip Wadler, Simon Marlow and Thomas Arts.
Erlang itself, i.e. ignoring the Dialyzer, uses a dynamic type system. All type checking is done at run-time; the compiler does not check types at compile time. The run-time type system cannot be defeated. This is comparable to the type systems in Lisp, Smalltalk, Python, JavaScript, Prolog and others.
Java, Eiffel and some other languages have type systems which are mostly checked at compile time, but with some remaining checking done at run time. The combination of checking cannot be defeated. Such type systems provide some guarantees about types which can be exploited by the compiler, which can be useful for optimisation.
Haskell, Mercury and some other languages have type systems which are completely checked at compile time. This type system cannot be defeated. The type system in this type of language is also a design tool: it increases the language's expressiveness.
C, Pascal and C++ have type systems which are checked at compile time, but can be defeated by straightforward means provided by the language.
It's undisputed that pretty much any programming language can do what every other language can, so in that sense Erlang is redundant.
On the other hand, we'd hope that some tools are better for some things than other tools. Erlang was born from systematic experiments to determine what would make a language good at solving telecommunications-related problems, and empirical evidence from large projects within Ericsson suggests that Erlang succeeded in doing that.
This might result in a faster Erlang system, or it might not. It would be interesting to see some research done in this area.
People coming to Erlang from object-oriented languages sometimes spend a while trying to write programs in an object-oriented style in Erlang before "seeing the light" and realising that the benefits that may give in other languages don't materialise in Erlang. Several papers have been published about how to "do OO in Erlang", including a chapter in the old (Armstrong, Virding, Wikström, Williams) Erlang book.
A common conservative position is to say that processes, asynchronous messages, functions and modules provide the same ability to structure systems as do threads, classes, methods, inheritance and aggregation.
An aggressive position is to say that OO is just snake oil, that inheritance is error prone and that any system which doesn't model concurrent problems with concurrency in the program is defective. Taking this position in newsgroups/mailing lists tends to trigger a flamewar.
The WOOPER project is one example of a serious effort to put an OO layer on top of Erlang.
Strings are represented as linked lists, so each character takes 8 octets of memory on a 32 bit machine, twice as much on a 64 bit machine. Access to the Nth element is O(N). This makes it easy to accidentally write O(N^2) code. Here's an example of how that can happen:
slurp(Socket) -> slurp(Socket, "").

slurp(Socket, Acc) ->
    case gen_tcp:recv(Socket, 1) of
        %% Acc ++ [Byte] copies all of Acc for every byte read
        {ok, [Byte]} -> slurp(Socket, Acc ++ [Byte]);
        _ -> Acc
    end.
The bit-syntax provides an alternative way of handling strings with different space/time tradeoffs to the list implementation.
Some techniques for improving performance of string-related code:
gen_tcp:send(Socket, ["GET ", Url, " HTTP/1.0", "\r\n\r\n"]).
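To make the cost of appending concrete, here is a socket-free sketch of the same accumulation pattern as slurp/2, next to the usual linear alternative. The module and function names are invented for the illustration.

```erlang
-module(acc_sketch).
-export([collect_slow/1, collect_fast/1]).

%% Quadratic: every ++ copies the whole accumulator built so far.
collect_slow(Bytes) -> collect_slow(Bytes, []).

collect_slow([B | Rest], Acc) -> collect_slow(Rest, Acc ++ [B]);
collect_slow([], Acc) -> Acc.

%% Linear: prepend each element, then reverse once at the end.
collect_fast(Bytes) -> collect_fast(Bytes, []).

collect_fast([B | Rest], Acc) -> collect_fast(Rest, [B | Acc]);
collect_fast([], Acc) -> lists:reverse(Acc).
```

Both return the same list; only the amount of copying differs. When the result is only going to be written to a port or socket, a nested list like the one passed to gen_tcp:send/2 above avoids even the final reverse.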
The current default GC is a "stop the world" generational mark-sweep collector. Each Erlang process has its own heap and these are collected individually, so although every process is stopped while GC happens for one process, this stop time is expected to be short because each process is expected to have a small heap.
The GC for a new process is full-sweep. Once the process' live data grows above a certain size, the GC switches to a generational strategy. If the generational strategy reclaims less than a certain amount, the GC reverts to a full sweep. If the full sweep also fails to recover enough space, then the heap size is increased.
In practice, this works quite well. It scales well because larger systems tend to have more processes rather than (just) larger processes. Measurements in AXD301 (the large ATM switch) showed that about 5% of CPU time is spent garbage collecting.
Problems arise when the assumptions are violated, e.g. having processes with rapidly growing large heaps.
There are some alternative approaches to memory management which can be enabled at run-time, including a shared heap.
Maybe, but you might be better off expending effort on thinking of other ways to make your system go faster.
The spawn_opt/4 variant of spawn accepts a list of options as the last argument. Ulf Wiger (AXD301) says that the gc_switch and min_heap_size options can be used to obtain better performance if you do some measuring, benchmarking and thinking. One 'win' happens when you can completely avoid GC in short-lived processes by setting their initial heap large enough to avoid all GC during the process' life.
min_heap_size can be useful when you know that a certain process will rapidly grow its heap to well above the system's default size. Under such circumstances you get particularly bad GC performance with the current GC implementation.
gc_switch affects the point at which the garbage collector changes from its full-sweep algorithm to its generational algorithm. Again, in a rapidly growing heap which doesn't contain many binaries, GC might perform better with a higher threshold.
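A sketch of the short-lived-process trick, using spawn_opt/2 with the min_heap_size option (the size is given in words). The module name and the figure of 100000 words are assumptions for illustration only; a real value should come from measurement. gc_switch is not shown because it may not be available in the release you are running.

```erlang
-module(heap_sketch).
-export([run/1]).

%% Run Fun in a fresh process that starts with a large heap, so a
%% short job may finish without ever being garbage collected.
%% Returns {ok, Result}, or {error, Reason} if the worker died.
run(Fun) ->
    Parent = self(),
    Ref = make_ref(),
    {Pid, MRef} = spawn_opt(fun() -> Parent ! {Ref, Fun()} end,
                            [monitor, {min_heap_size, 100000}]),
    receive
        {Ref, Result} ->
            erlang:demonitor(MRef, [flush]),
            {ok, Result};
        {'DOWN', MRef, process, Pid, Reason} ->
            {error, Reason}
    end.
```

For example, heap_sketch:run(fun() -> lists:sum([1, 2, 3]) end) returns {ok,6}. Whether the larger heap actually helps is exactly the sort of thing that needs benchmarking.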
Yes, and it would be rather nice to be able to benefit from the considerable effort being put into making the JVM go faster.
There are a couple of obstacles: the JVM does not provide support for tail recursion or tagged data types. Both of these deficiencies can be worked around at a cost. The result will probably be slower and use more memory than a VM written for Erlang.
Other languages with roughly similar characteristics to Erlang have been written to run on the JVM. Benchmarks from such systems (e.g. KAWA) show a significant performance penalty from running on the JVM.
One of the touted advantages of functional programming languages is that it is easier to formally reason about programs and prove certain properties of a given program.
There is an active research group looking at problems related to formal verification; they have a home page.
The Erlang coding guidelines suggest avoiding defensive programming. The choice of the term "defensive programming" is unfortunate, because it is usually associated with good practice. The point of the recommendation is that allowing an Erlang process to exit when things go wrong inside the Erlang program is a good approach, i.e. writing code which attempts to avoid an exit is usually a bad idea.
For example, when parsing an integer it makes perfect sense to just write
I = list_to_integer(L)
If L does not hold the digits of an integer, the process will exit and a supervisor somewhere will restart that part of the system, reporting an error:
=ERROR REPORT==== 12-Mar-2003::13:04:08 ===
Error in process <0.25.0> with exit value: {badarg,[{erlang,list_to_integer,[bla]},{erl_eval,expr,3},{erl_eval,exprs,4},{shell,eval_loop,2}]}
** exited: {badarg,[{erlang,list_to_integer,[bla]},
{erl_eval,expr,3},
{erl_eval,exprs,4},
{shell,eval_loop,2}]} **
If a more descriptive diagnostic is required, use a manual exit:
uppercase_ascii(C) when C >= $a, C =< $z ->
    C - ($a - $A);
uppercase_ascii(X) ->
    exit({"uppercase_ascii given non-lowercase argument", X}).
This separation of error detection and error handling is a key part of Erlang. It reduces complexity in fault-tolerant systems by keeping the normal and error-handling code separate.
As with most advice, there are exceptions to the recommendation. One example is the case where input is coming from an untrusted interface, e.g. a user or an external program.
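At such a boundary it is reasonable to check the input and return an error value instead of exiting. A minimal sketch, assuming the input is a TCP port number typed by a user (the module name, function name and error atoms are invented for the illustration):

```erlang
-module(input_sketch).
-export([parse_port/1]).

%% Validate a port number arriving as a string from an untrusted
%% source. Bad input is an expected case here, not a program error.
parse_port(Str) ->
    try list_to_integer(Str) of
        N when N >= 1, N =< 65535 -> {ok, N};
        _ -> {error, out_of_range}
    catch
        error:badarg -> {error, not_an_integer}
    end.
```

Code further inside the system can then go back to assuming its arguments are valid and let the process exit if they are not.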
Joe's original explanation is available online.