[erlang-questions] List Question

zxq9 zxq9@REDACTED
Mon Aug 7 14:53:11 CEST 2017


On Monday, 2017-08-07 22:29:31 you wrote:
> Hello zxq9,
> 
> Thanks. Unfortunately I do not know the value of the string that will
> be there. It's an extensible hierarchy that can be several lists deep -
> or not. Might need to revise the data structure.

In this case it can be useful to consider a way of tagging values.

Imagine we want to represent a directory tree structure and have a depth-first traversal function recurse over it, creating the tree as it goes. Two things can happen at each node: there may be a flat list of new directories that need to be created, and the tree may extend deeper.

The naive version would look like what you have:

["top_dir_1",
 "top_dir_2",
 ["next_level_1",
  "next_level_2"]]

This leaves a bit to be desired, not only because of the problem you have pointed out that makes it difficult to know what is deep and what is shallow, but also because we don't really have a good way to represent a full tree (what would be the name of a directory containing other directories?).
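To make the ambiguity concrete: in Erlang a string literal is itself a list of character codes, so a guard cannot tell a name from a nested level. A minimal sketch (the module and function names are made up for illustration):

```erlang
-module(naive_demo).
-export([looks_nested/1]).

%% Hypothetical helper: tries to decide whether an element of the naive
%% structure is a nested level. Because "top_dir_1" is a list of
%% integers, is_list/1 is true for names and for sub-lists alike.
looks_nested(Element) when is_list(Element) ->
    true;
looks_nested(_Other) ->
    false.
```

Both naive_demo:looks_nested("top_dir_1") and naive_demo:looks_nested(["next_level_1", "next_level_2"]) return true, so this check cannot separate shallow from deep.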

So consider instead something like this:

[{"top_dir_1", []},
 {"top_dir_2", []},
 {"top_dir_3",
  [{"next_level_1", []},
   {"next_level_2", []}]}]

Now we have a representation of each directory's name AND its contents.

We can traverse this laterally AND in depth without any ambiguity or need for carrying around a record of where we have been (by using body recursion to descend into each directory and tail recursion to move across its siblings):


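%% log/3 is assumed to be your own logging function that returns ok.
make_tree([{Dir, Contents} | Rest]) ->
    %% Create the directory only if it is missing. Any error from
    %% file:make_dir/1 fails the `ok =` match and crashes right here.
    ok =
        case filelib:is_dir(Dir) of
            true ->
                ok;
            false ->
                ok = log(info, "Creating dir: ~p", [Dir]),
                file:make_dir(Dir)
        end,
    %% Step into the directory, build its contents, then step back out.
    ok = file:set_cwd(Dir),
    ok = make_tree(Contents),
    ok = file:set_cwd(".."),
    %% Carry on with the siblings at this level.
    make_tree(Rest);
make_tree([]) ->
    ok.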


Not so bad.
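If you want to check the traversal order without touching the file system, the same clause structure can return the paths it would visit. This is only a sketch: paths/2 and the module name are invented here, and filename:join/2 stands in for the set_cwd calls.

```erlang
-module(tree_paths).
-export([paths/2]).

%% Hypothetical helper, not part of the original make_tree/1.
%% Walks the same [{Name, Contents}] structure and returns every path
%% under Root, parents before children, in traversal order.
paths(Root, [{Dir, Contents} | Rest]) ->
    Path = filename:join(Root, Dir),
    %% Descend into this directory first, then continue with siblings.
    [Path | paths(Path, Contents)] ++ paths(Root, Rest);
paths(_Root, []) ->
    [].
```

Called as tree_paths:paths("/tmp/demo", Tree) with the three-directory structure above, it lists "/tmp/demo/top_dir_3" immediately before its two children.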

In your case we could represent things perhaps a bit better by separating the types and tagging them. Instead of just "FT" and whatever other string labels you might want, you could either use atoms (totally unambiguous) or tuples as we have in the example above (also totally unambiguous). I prefer tuples, though, because they are easier to read.

[{value, "foo"},
 {tree,
  [{value, "bar"},
   {value, "foo"}]},
 {value, "baz"}]


So then we do something like:


%% do_thing/1 stands for whatever you need to do with each value;
%% it is assumed to return ok.
traverse([{value, Value} | Rest]) ->
    ok = do_thing(Value),
    traverse(Rest);
traverse([{tree, Contents} | Rest]) ->
    ok = traverse(Contents),
    traverse(Rest);
traverse([]) ->
    ok.
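If you need a result back rather than side effects, the same two tags work just as well with an accumulator. A rough sketch, assuming a fold-style interface that is my own invention here (fold/3 and the module name are hypothetical):

```erlang
-module(tagged_tree).
-export([fold/3]).

%% Hypothetical variation on a tagged-tree traversal: instead of calling
%% a fixed function for its side effect, thread an accumulator through
%% every {value, _} element, descending into each {tree, _} on the way.
fold(Fun, Acc, [{value, Value} | Rest]) ->
    fold(Fun, Fun(Value, Acc), Rest);
fold(Fun, Acc, [{tree, Contents} | Rest]) ->
    fold(Fun, fold(Fun, Acc, Contents), Rest);
fold(_Fun, Acc, []) ->
    Acc.
```

For example, tagged_tree:fold(fun(_V, N) -> N + 1 end, 0, Tree) counts the values at every depth, and folding with fun(V, Acc) -> [V | Acc] end followed by lists:reverse/1 gives them in traversal order.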


Anyway, don't be afraid of varying your value types to say exactly what you mean. If your strings like "FT" only have meaning within your system, consider NOT USING STRINGS, and use atoms instead. That makes it even easier:


[foo,
 bar,
 [foo,
  bar],
 foo]


So then we can do:


traverse([foo | Rest]) ->
    ok = do_foo(),
    traverse(Rest);
traverse([bar | Rest]) ->
    ok = do_bar(),
    traverse(Rest);
traverse([Value | Rest]) when is_list(Value) ->
    ok = traverse(Value),
    traverse(Rest);
traverse([]) ->
    ok.


And of course, you can skip the guard and match on a list shape directly in the list-handling clause if you prefer, but that is a minor detail. The point is to make your data types MEAN SOMETHING REASONABLE within your system. Use atoms when your values are meaningful only within your system. Strings are for the birds.
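As a sketch of the guard-free form: a nested level can be matched by its shape, [_ | _] for a non-empty list and [] for an empty one. The count/1 function and module name below are hypothetical stand-ins so the example returns something checkable instead of calling do_foo/0 and do_bar/0.

```erlang
-module(atom_tree).
-export([count/1]).

%% Hypothetical example: counts foo and bar atoms at any depth and
%% returns {FooCount, BarCount}. No is_list/1 guard is used anywhere.
count(Tree) ->
    count(Tree, {0, 0}).

count([foo | Rest], {Foo, Bar}) ->
    count(Rest, {Foo + 1, Bar});
count([bar | Rest], {Foo, Bar}) ->
    count(Rest, {Foo, Bar + 1});
%% A non-empty nested list, matched purely by shape.
count([[_ | _] = Nested | Rest], Acc) ->
    count(Rest, count(Nested, Acc));
%% An empty nested list contributes nothing.
count([[] | Rest], Acc) ->
    count(Rest, Acc);
count([], Acc) ->
    Acc.
```

With the atom list above, atom_tree:count([foo, bar, [foo, bar], foo]) returns {3, 2}. This only works cleanly because the elements are atoms; with strings the shape match would hit the names too.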

-Craig


