How to do table lookups?

Wed Mar 3 18:57:11 CET 2004

Hi all!

I am looking for the "right" way to do the following: I am running a C
program using ports, and need to parse the binary output.

I figured the easiest way would be to get the binary bytes, and then call
a function to do the actual parsing:

loop(Port) ->
  receive
    {Port, {data, Bytes}} ->
      io:format("Received ~w~n", [Bytes]),
      parse(Bytes),
      loop(Port);
    {Port, eof} ->
      eof
  end.

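parse(Bytes) ->
  %% Take the first byte as an integer; a 1/binary segment would yield a
  %% one-byte binary, which never matches an integer such as 16#01.
  <<First:8, _Rest/binary>> = Bytes,
  %% ... do magic here to convert the integer to an atom ...
  io:format("First=~w~n", [First]).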

Now, I could declare a function to do the conversion:

lookup_dict(Value) ->
  case Value of
    16#01 -> atom_a;
    16#02 -> atom_b;
    16#03 -> atom_c;
    _     -> error
  end.

parse(Bytes) ->
  %% First:8 extracts an integer, so it can match 16#01 etc. in lookup_dict/1.
  <<First:8, _Rest/binary>> = Bytes,
  io:format("First=~w~n", [lookup_dict(First)]).

But this would seem slow (as we need to sequentially scan down the
matching statements), and I cannot do a reverse lookup (convert the atom
into the appropriate integer).
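
A minimal sketch of the two-way lookup, written as a pair of functions with
mirrored clauses. The module name lookup_sketch and the function names
to_atom/1 and to_code/1 are hypothetical. As far as I know, the compiler
turns clauses on literal integers or atoms into a selection on the value
rather than testing them one by one, so the sequential-scan worry may not
apply, but that is worth measuring rather than assuming.

```erlang
-module(lookup_sketch).
-export([to_atom/1, to_code/1]).

%% Forward lookup: byte value -> atom.
to_atom(16#01) -> atom_a;
to_atom(16#02) -> atom_b;
to_atom(16#03) -> atom_c;
to_atom(_)     -> error.

%% Reverse lookup: atom -> byte value.
to_code(atom_a) -> 16#01;
to_code(atom_b) -> 16#02;
to_code(atom_c) -> 16#03;
to_code(_)      -> error.
```

The drawback is that the two tables have to be kept in sync by hand.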

I thought about using a dict object:

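parse(Bytes) ->
  %% Note: the dictionary is rebuilt on every call.
  Lookup_dict = dict:from_list([
        {16#01, atom_a},
        {16#02, atom_b},
        {16#03, atom_c}]),
  %% First:8 extracts an integer, matching the integer keys above.
  <<First:8, _Rest/binary>> = Bytes,
  io:format("First=~w~n", [dict:fetch(First, Lookup_dict)]).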

But obviously the above would rebuild the dictionary on each and every
call. Is there a way to get the equivalent of a "cached" value? Or
essentially to curry the function (replace it with a new one that has an
instance of the lookup table in its scope)? The dictionary is static, so
any such method would be "pure" (in the sense that calls to such a
function with the same argument would always return the same value).
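
The "function with the table in its scope" idea can be sketched with a fun
that closes over the dictionary. The names closure_sketch, make_lookup/0
and parse/2 are hypothetical, and this is only one way to do it: the
dictionary is built once in make_lookup/0, but the resulting fun still has
to be passed to whoever does the parsing.

```erlang
-module(closure_sketch).
-export([make_lookup/0, parse/2]).

%% Build the dictionary once and return a fun that closes over it.
make_lookup() ->
    Dict = dict:from_list([{16#01, atom_a},
                           {16#02, atom_b},
                           {16#03, atom_c}]),
    fun(Value) ->
        case dict:find(Value, Dict) of
            {ok, Atom} -> Atom;
            error      -> error
        end
    end.

%% Lookup is the fun returned by make_lookup/0.
parse(Lookup, <<First:8, _Rest/binary>>) ->
    Lookup(First).
```

Usage would be along the lines of Lookup = closure_sketch:make_lookup()
once at startup, then closure_sketch:parse(Lookup, Bytes) for each chunk.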

I thought of using the process dictionary, but that seems like the wrong
place for it, as I would need a separate copy in each process rather than
a single shared instance. What I really want is something like a
per-module dictionary, though I am not sure that makes sense, given that
values cannot be shared across processes.
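
One candidate for a single shared instance is a named ETS table, sketched
below under some assumptions: the module name ets_sketch and the table name
lookup_table are hypothetical, and init/0 must be called once by a
long-lived process, because the table is owned by its creator and is
deleted when that process exits. Other processes can then read it by name.

```erlang
-module(ets_sketch).
-export([init/0, to_atom/1]).

%% Create and fill the table. The calling process becomes the owner.
init() ->
    _ = ets:new(lookup_table, [named_table, set, protected]),
    true = ets:insert(lookup_table, [{16#01, atom_a},
                                     {16#02, atom_b},
                                     {16#03, atom_c}]),
    ok.

%% Look a byte value up from any process; returns error if it is unknown.
to_atom(Value) ->
    case ets:lookup(lookup_table, Value) of
        [{Value, Atom}] -> Atom;
        []              -> error
    end.
```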

Now I could create some kind of state variable (with all the appropriate
dictionaries), but this would be awkward to keep passing around, as it is
needed only at the lowest levels.

Should I create a process, and use message passing? This would seem to
consume extra overhead, and I would need to pass along a process ID
(unless I wanted to do a lookup each time) similar to the state idea.
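
For comparison, a sketch of the process-based variant. Registering the
process under a name avoids passing a process ID around, at the cost of a
name lookup and a message round trip per call. The names server_sketch and
lookup_server are hypothetical, and there is no timeout or error handling.

```erlang
-module(server_sketch).
-export([start/0, to_atom/1]).

%% Start the lookup process and register it under a fixed name.
start() ->
    Dict = dict:from_list([{16#01, atom_a},
                           {16#02, atom_b},
                           {16#03, atom_c}]),
    Pid = spawn(fun() -> loop(Dict) end),
    true = register(lookup_server, Pid),
    ok.

%% Ask the registered process; the reference ties the reply to the request.
to_atom(Value) ->
    Ref = make_ref(),
    lookup_server ! {lookup, self(), Ref, Value},
    receive
        {Ref, Result} -> Result
    end.

%% Server loop: the dictionary lives in the loop argument.
loop(Dict) ->
    receive
        {lookup, From, Ref, Value} ->
            Reply = case dict:find(Value, Dict) of
                        {ok, Atom} -> Atom;
                        error      -> error
                    end,
            From ! {Ref, Reply},
            loop(Dict)
    end.
```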

Is there a better way?

Regards!
Ed

P.S. Is there a way to ensure that a port program is killed automatically
upon the termination of the Erlang process? The executable I am using does
not die automatically when the Erlang process is killed, and so I am
finding lots of active instances of the program running. I am using:
   Port = open_port({spawn, "EXE"}, [stream, binary]).