[erlang-questions] Performance Testing with C++, Java, Ruby and Erlang
Ulf Wiger (TN/EAB)
ulf.wiger@REDACTED
Fri Sep 7 15:22:41 CEST 2007
shahzad bhatti wrote:
> I posted that blog entry yesterday and didn't expect that much
> controversy. I am learning Erlang and just trying any interesting
> distributed problems I see. This was not meant as Erlang bashing,
> in fact I like Erlang a lot.
With comments enabled, you might have received a lot of helpful
pointers on how to improve the code, but this forum will also
work. I've not spent much time trying to understand the outline
of your code, but some comments about programming style might
be in order:
You should strive to eliminate the get() and put() operations,
and have the functions return all the structures that were
modified. For example:
355 set_node_value(Round) ->
356 Config = get(config),
357 ProcIds = lists:seq(0, config:get_num_procs(Config)-1),
358 lists:foreach(
359 fun(Id) ->
360 set_node_value(Round, Id) end, ProcIds).
361
362 set_node_value(Round, Id) ->
363 L = repository:get_rank_list(Round, Id),
364 lists:foreach(
365 fun(Path) ->
366 set_node_value(Round, Id, Path) end, L).
367
368 set_node_value(_Round, _Id, Path) ->
369 Nodes = get(nodes),
370 Node = get_node(Path),
371 Value = find_majority(Path),
372 Nodes1 = dict:store(Path, node:set_output(Node, Value), Nodes),
373 put(nodes, Nodes1).
The set_node_value/1 function is called from within a lists:foreach(),
when you should rather use e.g. lists:foldl() with Nodes as an
accumulator. That would eliminate all the get() and put() operations,
as the updated Nodes variable is passed on to the next iteration.
set_node_value(Round, #config{proc_ids = Proc_ids, nodes = Nodes}) ->
lists:foldl(
fun(Id, Ns) ->
set_node_value(Round, Id, Nodes)
end, Nodes, ProcIds).
set_node_value(Round, Id, Nodes) ->
L = repository:get_rank_list(Round, Id),
lists:foldl(
fun(Path, Ns) ->
Node = get_node(Path),
Value = find_majority(Path),
dict:store(Path, node:set_output(Node, Value), Ns).
end, Nodes, L).
These are just minor changes, to illustrate the point:
- No need to build Proc_ids in every iteration. Do it once,
since it's static, and store it in your record. Also, keep
nodes in the record, rather than in the process dictionary.
Besides, each Erlang process has a unique Id - no need to
invent a corresponding value, when the Pid will do nicely.
- No need to have a separate module that selects attributes
from a record. It's much faster to do it inline, using
record syntax.
You can apply this to many places in the code.
And if you then look at find_majority/1:
393 find_majority(Path) ->
394 Counts = dict:new(),
395 Counts1 = dict:store(?ONE, 0, Counts),
396 Counts2 = dict:store(?ZERO, 0, Counts1),
397 Counts3 = dict:store(?UNKNOWN, 0, Counts2),
398 put(counts, Counts3),
399 L = repository:get_children_path(Path),
400 N = length(L),
401 lists:foreach(fun(Child) ->
402 increment_count(Child) end, L),
403 Counts4 = get(counts),
404 OneCount = dict:fetch(?ONE, Counts4),
405 ZeroCount = dict:fetch(?ZERO, Counts4),
Here, you rely on the process dictionary to save
the side-effect of increment_count/1, so that you
can retrieve Counts4 right afterwards. You also use dict
to keep track of the only three counters that
increment_count/1 can update. It would be _much_ more
efficient to simply fold over the list with the counters
as an accumulator:
lists:foldl(fun(Child, {C1,C2,C3}) ->
Node = get_node(Child),
Node of ?ONE -> {C1+1,C2,C3};
?ZERO -> {C1,C2+1,C3};
?UNKNOWN -> {C1,C2,C3+1}
end, {0,0,0}, L)
(There are several other ways to write this, achieving
the same effect.)
Finally, in repository.erl:
519 table_lookup(Tab, Key) ->
520 Result = ets:lookup(Tab, Key),
521 case Result of
522 [] ->
523 [];
524 error ->
525 [];
526 {ok, Value} ->
527 lists:reverse(Value);
528 [{Key, Value}] ->
529 lists:reverse(Value)
530 end.
ets:lookup/2 _only_ returns a (possibly empty) list
of objects. error and {ok,Value} are not possible
return values.
Also, it seems odd that you reverse Value at every
lookup, but prepend to the (reversed) list at insert.
The insert sequence [1,2,3,4,5] would then give the
value list [5,3,1,2,4]. Is this really what you want?
If so, you should write a comment explaining the
virtues of the arrangement.
BR,
Ulf W
More information about the erlang-questions
mailing list