[erlang-questions] Process scope variable

Wed Feb 18 04:05:57 CET 2015

On Wednesday, 18 February 2015 02:05:39 Imants Cekusins wrote:
> > It has problems, but it can do the job, and the problems are a
> > helpful reminder that maybe the job should not be done.
> 
> Could you go into more detail about the problems with the process
> dictionary, please?

At this point all the major sides of the discussion surrounding the wisdom of 
global variables have been recounted. If you're really only interested in 
constants, then you have four main choices (in the order I prefer):

1. A function wrapper around the value:

  some_value() -> 7.

The function call involved is cheap, and the compiler can inline it away if 
you ask it to (for example with the inline compile option), but this shouldn't 
concern you anyway unless you discover that this is a bottleneck (and you can 
never know until you actually profile it in use...).

2. An element in a #state{} record:

  S = #s{some_value = 7}.
  % From here on grab a hold of S#s.some_value whenever you need it.

3. Hey, Marco, a Macro!

  -define(SOME_VALUE, 7).

This sucks for various reasons and achieves the same effect as #1 in any case 
(once the function is inlined, *literally* the same effect after compilation, 
but now with far less flexibility in terms of refactoring your code later if 
you use ?SOME_VALUE in several places).
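
To put #1 and #3 side by side, here is a minimal sketch. The module name 
consts_demo and the functions scaled_fun/1 and scaled_macro/1 are made up for 
illustration; only some_value/0 and ?SOME_VALUE come from the examples above.

```erlang
-module(consts_demo).
-export([some_value/0, scaled_fun/1, scaled_macro/1]).

%% Option 3: a macro. The preprocessor substitutes it textually, and it is
%% only visible in this module (or in modules that include the same header).
-define(SOME_VALUE, 7).

%% Option 1: a function wrapper. Other modules can call it, and it can later
%% be changed to look the value up somewhere else without touching callers.
some_value() -> 7.

%% Hypothetical caller using the function wrapper.
scaled_fun(X) -> X * some_value().

%% Hypothetical caller using the macro.
scaled_macro(X) -> X * ?SOME_VALUE.
```

Both callers return the same result; the difference only shows up later, when 
the value has to move or change.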

4. Process dictionary.

Ugh... Just too easy to abuse (so easy to start using it as a mutable global 
hack), and seriously, for a value that isn't going to change doing this is 
sort of ridiculous:

  some_fun(X, Y) ->
    Computrons = compute(get(z), X, Y).

More favorable is:

  some_fun(X, Y) ->
    Computrons = compute(z(), X, Y).

While the latter is easier to refactor than using a macro, it is still 
generally less smooth than just working with the OTP assumptions you're 
already operating within anyway:

  some_fun(X, Y, #s{z = Z}) ->
    Computrons = compute(Z, X, Y).

The record version has two advantages. It makes perfect sense as a very clear 
part of process initialization (it's pretty standard to see people pack 
everything up front right there instead of tucking magic values away somewhere 
else later in the code). It also prevents any need for hunting around when 
things crash -- because every relevant value was in the function arguments, 
you can re-test the code *exactly* as it was run yourself and quickly narrow 
down what caused the crash.
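
As a rough sketch of what "pack everything up front" looks like in a 
gen_server: the module name z_server, the API function compute/3 and the 
arithmetic in do_compute/3 are all invented for this example; only the #s{} 
record with a z field follows the snippets above.

```erlang
-module(z_server).
-behaviour(gen_server).

-export([start_link/0, compute/3]).
-export([init/1, handle_call/3, handle_cast/2]).

%% All process state lives in one record.
-record(s, {z = 7}).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Hypothetical client API.
compute(Pid, X, Y) ->
    gen_server:call(Pid, {compute, X, Y}).

%% Every value the process depends on is set here, in one visible place.
init([]) ->
    {ok, #s{z = 7}}.

%% Z arrives as part of the state argument, not from a hidden lookup.
handle_call({compute, X, Y}, _From, S = #s{z = Z}) ->
    {reply, do_compute(Z, X, Y), S}.

handle_cast(_Msg, S) ->
    {noreply, S}.

%% Placeholder computation, made up for the example.
do_compute(Z, X, Y) ->
    Z * (X + Y).
```

To reproduce a failing call you only need the message and the state record, 
and both are plain function arguments.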

On the other hand -- think carefully about this question:
When will you ever desire to insert a constant into a module that is not a 
library?

(I discussed the "global variable" thing a bit on SO, using different examples 
and wording things differently here:
http://stackoverflow.com/questions/25770042/variable-in-erlang/25775401#25775401)

If you need a dynamic value then the only practical way to go is using a state 
tuple/record. Technically you could go with the process dictionary for this, 
but as an experiment try writing a program that does everything through the 
process dictionary instead of a state record, then the same one that does 
everything through a state record -- and compare the readability, 
discoverability and post-crash trace meaningfulness of the two versions. 
There is a reason you've never seen any documentation that does everything 
through the process dictionary; writing a non-trivial gen_server that works 
that way is a great way to discover why.
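
A tiny version of that experiment, as a sketch only (counter_demo, pd_bump/0 
and st_bump/1 are names I made up; the key count is arbitrary):

```erlang
-module(counter_demo).
-export([pd_bump/0, st_bump/1]).

%% Process dictionary version: the real input is hidden inside the calling
%% process, so the result depends on what happened there earlier.
pd_bump() ->
    N = case get(count) of
            undefined -> 0;
            Old -> Old
        end,
    put(count, N + 1),
    N + 1.

%% State-passing version: the input is the argument and the output is the
%% return value, so any call can be repeated in isolation.
st_bump(N) ->
    N + 1.
```

Calling pd_bump() twice in the same process gives two different answers for 
the same (empty) argument list; st_bump(0) gives 1 every time.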

You replied to ROK something to the effect that "I need state, for obvious 
reasons". Well, everyone does. And this is what we do: we use a tuple that 
represents the state of the process. If it's a lot of state then a record is 
definitely called for, but it's not impossible even without this (it's just a 
syntactic convenience anyway). (Incidentally, using a tuple is a good way to 
discover that there is a healthy balance somewhere between data structure 
depth and the ideal of collection flatness...)
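
To illustrate the "syntactic convenience" point: a record is compiled to a 
tuple tagged with the record name. The module record_demo, the extra count 
field and both functions below are made up for the illustration.

```erlang
-module(record_demo).
-export([as_tuple/0, z_of/1]).

-record(s, {z, count = 0}).

%% Built with record syntax; at runtime this is the tuple {s, 7, 0}.
as_tuple() ->
    #s{z = 7}.

%% The same state matched as a bare tuple. This works, but every clause
%% written this way breaks as soon as a field is added or reordered.
z_of({s, Z, _Count}) ->
    Z.
```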

When using a record, refactoring doesn't have to involve changing anything in 
the middle of a chain of passthrough functions. It actually makes this so easy 
that many people commit to long chains of passthrough when they really 
shouldn't. But I digress...
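
A sketch of what I mean by passthrough (everything here is hypothetical: the 
module passthrough_demo, the scale field and the step functions). Imagine 
scale was added to the record after the rest was written:

```erlang
-module(passthrough_demo).
-export([run/2]).

%% 'scale' is the field added later in this imagined refactoring.
-record(s, {z = 7, scale = 1}).

run(X, Y) ->
    step_one(X, Y, #s{}).

%% These two only hand S along, so adding 'scale' did not touch them.
step_one(X, Y, S) ->
    step_two(X + 1, Y, S).

step_two(X, Y, S) ->
    finish(X, Y, S).

%% Only the function that actually uses the new field had to change.
finish(X, Y, #s{z = Z, scale = Scale}) ->
    Scale * Z * (X + Y).
```

With a bare tuple, step_one/3 and step_two/3 would be safe too as long as they 
never match on its shape, but anything that does match would need editing.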

In any case, I think the best thing for you to do now is sally forth and write 
useful code. Lots of it. Use the process dictionary. Watch the world spin 
faster! Don't use the process dictionary! Watch it spin... about the same! Use 
an ETS table for every bit you can think you might want to pack in there. Who 
cares, so long as your code works, right? And then discover which approaches 
are easier or harder to debug in actual practice.

Several of us have given our opinions and the reasoning behind them. I don't 
have a long litany of anecdotal evidence for "wow, I'm SOOOO glad I didn't use 
the process dictionary here!" because I learned that lesson a long time ago in 
Scheme, C and Python, and so have just steered clear of the process dictionary 
in Erlang so far because it hasn't been an *obvious* correct answer to any 
problem I've faced thus far. Beyond this point the discussion will necessarily 
devolve into either religion or astronautics, so best to avoid that and at 
least produce something useful and have fun on your way to developing your own 
insights into the issue.

-Craig