[erlang-questions] per function/process locals

Mon Apr 30 17:35:55 CEST 2007

All this discussion about the process dictionary seems like a good
time to jump in and talk about a pet idea.  Maybe this will turn out
to be more generally useful than process dictionary enhancements?
I've already written a working proof-of-concept preprocessor for this
in Perl, one with horrific syntax :)

Frequently I find myself with either a many-clause function or group
of functions that take a bunch of parameters.  Some of these
parameters are essentially constants.  They get passed around, but
never change.  More often I find that I change the value of a single
parameter in each call.  As a contrived example, here's code to count
the number of integers, floats, and atoms in a list:

   count(L) -> count(L, 0, 0, 0).
   count([H|T], Ints, Floats, Atoms) ->
      if
         is_integer(H) -> count(T, Ints+1, Floats, Atoms);
         is_float(H) -> count(T, Ints, Floats+1, Atoms);
         is_atom(H) -> count(T, Ints, Floats, Atoms+1);
         true -> count(T, Ints, Floats, Atoms)
      end;
   count([], Ints, Floats, Atoms) ->
      {Ints, Floats, Atoms}.

This code is about as tight as you can get in Erlang.  Only a single
tuple is heap-allocated, and all values are kept in BEAM registers
until the end.  The same code with records is bulkier all around.  The
interesting thing about the code is that Ints, Floats, and Atoms are
essentially local variables that get destructively updated.  It would
be nice to capture this pattern into Erlang proper.  With my ugly
preprocessor, the code looks like:

   ~LOCAL{Ints, Floats, Atoms}

   count(L) -> count(L, ~{Ints = 0 :: Floats = 0 :: Atoms = 0}).
   count([H|T], ~LOCAL) ->
      if
         is_integer(H) -> count(T, ~{Ints = Ints+1});
         is_float(H)  -> count(T, ~{Floats = Floats+1});
         is_atom(H) -> count(T, ~{Atoms = Atoms+1})
      end;
   count([], ~LOCAL) ->
      {Ints, Floats, Atoms}.

   ~LOCAL{}

Now this is a simple example, but imagine that we also wanted to count
PIDs, Refs, Tuples, Conses, and Binaries.  With the ~LOCAL syntax, the
extension is trivial.  The first version would get prohibitively
messy.  Again, records could come to the rescue, but the LOCAL version
still hardly touches the heap, and the BEAM code is dramatically
shorter.  My preprocessor turns the second version of the code into
the first (apart from the catch-all "true" clause, which the second
version leaves out).
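
For comparison, a record-based version might look roughly like the
sketch below.  The module name count_rec and the record name counts
are made up for illustration; nothing here comes from the
preprocessor.

```erlang
-module(count_rec).
-export([count/1]).

%% Hypothetical record holding the three counters (illustrative names).
-record(counts, {ints = 0, floats = 0, atoms = 0}).

%% Start with all counters at their default of 0.
count(L) -> count(L, #counts{}).

%% Walk the list, bumping one field per element.
count([H|T], C) ->
    if
        is_integer(H) -> count(T, C#counts{ints = C#counts.ints + 1});
        is_float(H)   -> count(T, C#counts{floats = C#counts.floats + 1});
        is_atom(H)    -> count(T, C#counts{atoms = C#counts.atoms + 1});
        true          -> count(T, C)   % anything else is skipped
    end;
%% End of list: unpack the record into the same result tuple as before.
count([], #counts{ints = I, floats = F, atoms = A}) ->
    {I, F, A}.
```

Adding a new counter only touches the record definition and one
clause, but each C#counts{...} update generally builds a new tuple
on the heap, which is the allocation the argument-passing version
avoids.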

If you can see past the syntax--especially the ~LOCAL tags in the
function headers--what this really comes down to is having local
functions and local variables inside of another function, where any
of the local functions can modify the local variables.  This could
easily extend to an entire process: have one main function for the
process, and let all of its sub-functions access the per-process
"globals."  Or maybe it's just of personal use to me :)