the point of 'public' digraphs?

Ulf Wiger (AL/EAB) ulf.wiger@REDACTED
Tue Mar 1 10:13:14 CET 2005


I noticed that it's possible to specify the attribute 'public' in digraph:new/1.

Shouldn't it be very clearly stated in the manual then that digraph is by
no means thread safe?

One example is the new_edge_id/1 function, below:


%%
%% Generate a "unique" edge identifier (relative this graph)
%% ['$e' | N]
%%
new_edge_id(G) ->
    NT = G#graph.ntab,
    [{'$eid', K}] = ets:lookup(NT, '$eid'),
    true = ets:delete(NT, '$eid'),
    true = ets:insert(NT, {'$eid', K+1}),
    ['$e' | K].

The ntab table is a bag table, so update_counter can't be 
used (BTW, the man page for ets:update_counter() doesn't 
state the perhaps obvious fact that it doesn't support bag tables.)

Here are two possible interleavings that may cause the function
to fail:

Example 1:
----------

Process A          Process B
NT = G#graph.ntab,
[{'$eid', K}] = ets:lookup(NT, '$eid'),
true = ets:delete(NT, '$eid'),
% scheduled out
                   NT = G#graph.ntab,
                   [{'$eid', K}] = ets:lookup(NT, '$eid'),
                   % crashes since all instances of '$eid' gone

Example 2:
----------

Process A          Process B
NT = G#graph.ntab,
[{'$eid', K}] = ets:lookup(NT, '$eid'),
% scheduled out
                   NT = G#graph.ntab,
                   [{'$eid', K}] = ets:lookup(NT, '$eid'),
                   true = ets:delete(NT, '$eid'),
                   true = ets:insert(NT, {'$eid', K+1}),
                   ['$e' | K].
                   % ... scheduled out
true = ets:delete(NT, '$eid'),
true = ets:insert(NT, {'$eid', K+1}),
['$e' | K].

Process A and Process B now get the same key,
and '$eid' is incremented by 1 instead of 2, which may
be a smaller problem.


These two problems can occur even with today's scheduler,
but if we move to multi-pro architectures, the probability
of such race conditions occurring increases tremendously.

Since digraph contains several functions that are implemented
using multiple ets accesses, the most reasonable fix seems to 
be to remove the 'public' option. As a more backwards compatible,
but IMO worse alternative, clearly document that some kind of 
locking mechanism is required to protect accesses to a public 
digraph.


/Uffe



More information about the erlang-questions mailing list