[erlang-questions] Messing with heart. Port and NIF, which one is better?

Wed Feb 13 16:01:15 CET 2013

Hi Gokul,

On Wed, Feb 13, 2013 at 1:03 AM, Gokul Evuri <chandu.gokul138@REDACTED> wrote:
> Hello,
> I am a beginner-level Erlang user and I have two questions.
> Is there a way to save my processes if the VM crashes?

Welcome!

If you use gen_server (or the simplified equivalent e2_service -- see
http://e2project.org) you have a clear life cycle for your process:

- init is used to define the initial state for the process when it starts

- handle_xxx can be used to modify the state in response to various messages

If you want to "persist your processes", you're talking about a)
persisting the process state as it changes and b) providing a way to
load the last known state at process start.

E.g.

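init(InitArgs) ->
    %% Restore the last saved state when the process starts.
    {ok, load_state_from_disk(InitArgs)}.

handle_msg(Msg, _From, State) ->
    NewState = do_something_with_msg(Msg, State),
    %% Persist every new state before replying.
    save_state_to_disk(NewState),
    {reply, ok, NewState}.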

> And the second question,
> Is there any good argument for using a NIF instead of creating a
> connected process for a port?

The NIF interface is appropriate for defining simple functions in C.
There are lots of third-party libraries where NIFs are used to plug in
long-running, multi-threaded facilities, but this seems misguided to
me.

If there's even a small chance that your C program will crash, use a C
port. Don't assume that the overhead of serializing messages over
stdio is going to ruin your application performance. The safety of an
external port is profoundly valuable.

To rant slightly, it's surprisingly common to see people readily
choose "speed" (use a NIF) over "safety" (use an external C port) in
this trade-off. This is an unfortunate trend. I use various NIF-based
libraries in production and routinely deal with Erlang crashes as a
result (I plan to rewrite the more troubling ones as C ports). So
what goes very fast (or so the thinking goes) suddenly goes to *zero*
when the VM crashes -- and stays at zero until everything is restarted
and initialized. This can be devastating in the very case where
speed/throughput is most important -- very high load.

Those are cases where a slightly slower interface pays for itself
through a *much* faster recovery on error. The benefit goes up if you
can distribute your work across multiple C ports -- the death of any
one will only impact 1/N of your system.

As an additional point, an external C port lets you write your C code
using the Very Good Pattern of crash-early. I.e. if you don't like a
particular state in your C process (e.g. an assertion failure), call
exit() and die! If you're forced to be defensive because you're afraid
of killing the entire VM, you're going *backward* to the days of
bug-hiding and mysterious behavior. Why bother with Erlang in the
first place?
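
A rough sketch of what a crash-early port program might look like in
C, assuming the Erlang side opens the port with the {packet, 2} option
so that every message in either direction carries a 2-byte big-endian
length prefix. The function names, the echo behaviour and the "an
empty request is a bug" rule are all made up for illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MAX_PACKET 65535 /* largest payload a 2-byte length can describe */

/* Read exactly len bytes. Returns 1 on success, 0 if the stream was
 * already at EOF, and exits on a truncated read or an I/O error. */
int read_exact(int fd, unsigned char *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);
        if (n == 0 && got == 0)
            return 0;            /* Erlang closed the port: normal stop */
        if (n <= 0)
            exit(EXIT_FAILURE);  /* don't limp along, just die */
        got += (size_t)n;
    }
    return 1;
}

/* Read one packet into buf (at least MAX_PACKET bytes).
 * Returns the payload length, or -1 on clean EOF. */
long read_packet(int fd, unsigned char *buf)
{
    unsigned char hdr[2];
    if (!read_exact(fd, hdr, 2))
        return -1;
    size_t len = ((size_t)hdr[0] << 8) | hdr[1];
    if (len > 0 && !read_exact(fd, buf, len))
        exit(EXIT_FAILURE);      /* header arrived without its payload */
    return (long)len;
}

/* Write all len bytes or exit. */
void write_exact(int fd, const unsigned char *buf, size_t len)
{
    size_t put = 0;
    while (put < len) {
        ssize_t n = write(fd, buf + put, len - put);
        if (n <= 0)
            exit(EXIT_FAILURE);
        put += (size_t)n;
    }
}

/* Write one packet: 2-byte big-endian length, then the payload. */
void write_packet(int fd, const unsigned char *buf, size_t len)
{
    unsigned char hdr[2];
    assert(len <= MAX_PACKET);   /* internal invariant: abort if broken */
    hdr[0] = (unsigned char)(len >> 8);
    hdr[1] = (unsigned char)(len & 0xff);
    write_exact(fd, hdr, 2);
    write_exact(fd, buf, len);
}

/* Hypothetical service loop: echo each request back to Erlang. */
void port_loop(int in_fd, int out_fd)
{
    unsigned char buf[MAX_PACKET];
    long len;
    while ((len = read_packet(in_fd, buf)) >= 0) {
        if (len == 0)
            exit(EXIT_FAILURE);  /* made-up rule: empty request is a bug */
        write_packet(out_fd, buf, (size_t)len);
    }
}
```

A real program would call port_loop(STDIN_FILENO, STDOUT_FILENO) from
main(). If the port is also opened with exit_status, the owning Erlang
process receives an {Port, {exit_status, N}} message when the C
program dies and can simply restart it.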

All that said, if your extension in C is a simple, side-effect-free
function or is otherwise "safe" -- a NIF gives you a simpler path to
implementing it and avoids the overhead of a system process.

Garrett