What are BIFs?

Tue Oct 6 00:11:26 CEST 2009

There has always been much confusion about what a BIF really is and how BIFs
relate to the Erlang language. An example of this was a discussion earlier
this year when someone wanted to have some more functions in lists coded in
C and so become part of Erlang. Saying that they are functions coded in C
just tells us how they are implemented, not what they *are*.

The source of this confusion is very old and exists in the earliest
documentation, where a BIF was described as a function "which could not be
written in Erlang or written in C for efficiency". BIFs were in some sense
implicitly part of Erlang and considered to be in the module 'erlang'. Some
could be used without specifying the module, but there was no clear reason
why some needed the module and others did not.
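As a small illustration of the two calling styles, here is a sketch for a
present-day Erlang/OTP system. The module name bif_demo and the choice of
self/0 and system_info/1 are only for illustration; erl_internal:bif/2 is the
stdlib helper that reports which names are auto-imported, and the sketch
assumes it behaves as in current OTP releases.

```erlang
-module(bif_demo).
-export([run/0]).

%% Hypothetical demo module, not part of OTP.
run() ->
    %% self/0 is auto-imported: it can be called with or without 'erlang:'.
    Me = self(),
    Me = erlang:self(),

    %% system_info/1 lives in the same module but is not auto-imported,
    %% so the 'erlang:' prefix is required.
    true = is_integer(erlang:system_info(process_count)),

    %% erl_internal:bif/2 reports which names the compiler auto-imports.
    true  = erl_internal:bif(self, 0),
    false = erl_internal:bif(system_info, 1),
    ok.
```

Both functions are in the module 'erlang'; nothing in their names or their
behaviour explains why only one of them needs the prefix.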

A first proper attempt to define what a BIF is came when Jonas Barklund and
I wrote the Erlang specification. There we proposed that a BIF was part of
the Erlang language that did not have a special syntax but looked like a
normal function call. Spawn, link and trapexit are just as much part of
Erlang as ! and receive even though they use normal function syntax and
semantics. We also decided that it was irrelevant in what language the
function was written; what was important was the function itself. This is
analogous to processes, where the semantics are what is important. I think
this definition is correct but calling them BIFs was wrong.

Our BIF proposals were put on ice together with the Erlang specification.

The problem still remains and has not become better - there is still much
confusion as to what a BIF is, whether being coded in C has anything to do
with it, and whether being coded in C means that a function automatically
becomes part of the language. BEAM and the compiler also handle BIFs in many
different ways, which does not always help.

Currently BIF is used to describe at least three different things:

- Functions which are part of Erlang; spawn and link are examples of this.
- Functions which are not part of Erlang but which are part of a basic
runtime system; the trace BIFs are an example of this.
- Functions which have been coded in C for efficiency, for example
lists:reverse.

(Some might not agree with separating the first two.)
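
As a rough sketch of why these usages get mixed up, the two questions a
running system can answer ("is it auto-imported?" and "is it implemented
natively?") can be asked separately. The module name bif_kinds and the
function classify/3 are hypothetical; erl_internal:bif/2 and
erlang:is_builtin/3 are existing OTP functions, used here on the assumption
that they behave as in current releases. Neither of them answers the question
"is it part of the language?".

```erlang
-module(bif_kinds).
-export([classify/3, demo/0]).

%% Hypothetical helper: returns {AutoImported, Native} for Mod:Fun/Arity.
%% AutoImported: callable without the 'erlang:' prefix.
%% Native: implemented in the emulator rather than in Erlang.
classify(Mod, Fun, Arity) ->
    AutoImported = (Mod =:= erlang) andalso erl_internal:bif(Fun, Arity),
    Native = erlang:is_builtin(Mod, Fun, Arity),
    {AutoImported, Native}.

demo() ->
    [{erlang, self, 0, classify(erlang, self, 0)},     % {true, true}
     {lists, reverse, 2, classify(lists, reverse, 2)}, % {false, true}
     {lists, map, 2, classify(lists, map, 2)}].        % {false, false}
```

lists:reverse/2 is native but is an ordinary library function, which is the
third usage above; the result for a given function says something about the
implementation, not about the language.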

Moving the module 'erlang' from kernel to erts is a step in the right
direction. So is putting the new unicode BIFs (?) in a separate module,
because they are not part of the language, just a library where some
functions are coded in C.

OK, so why worry about it? Does it really matter? I think it does (or else I
wouldn't be writing this mail) for the following reasons:

- We need a proper language specification and this would be a part of it.
Some day I plan to resurrect the old Erlang spec and get it up to date, and
we have to start somewhere.
- It would clarify what is part of the language and what is not. This would
put various discussions about adding things to the implementation/language
in their proper context. Coding a function in lists in C would then just
be an implementation detail, where the issue is whether it is worth the
extra effort, and not a misplaced discussion about adding to the language.
- I think it would allow you to do more with the FFI than is proposed today
and still be consistent with the language.
- And it would help people realise and understand what is what - lack of
clarity and confusion are never good.

Sorry this became a bit long, but I have managed to restrain myself. :-)

Robert