[erlang-questions] Filtering hierarchical data structure

Sat Jul 14 10:54:49 CEST 2018

On Fri, Jul 13, 2018 at 3:25 PM, Stanislav Ledenev <s.ledenev@REDACTED> wrote:
> Hello,
>
> I am new to Erlang and functional programming (1-1.5 months) and have task:
>
> 1. Suppose we have hierarchical tree-like data structure:
> L = [
> {section, "id1", [
> {section, "id1.1", [
> {policy, "p1", "v1"},
> {policy, "p2", "v2"}
> ]},
> {section, "id1.2", [
> ]}
> ]},
> {section, "id2", [
> {policy, "p3", "v3"}
> ] },
> {section, "id3", [
> ] }
> ]
>
> 2. List of some "section"'s with children elements any of which could be
> another "section" with children or "policy" with name and value.
>
> 3. Also we have another list which contains identifiers of policies:
> F = ["p1", "p3"].
> 4. Now I need to write a function which takes lists L, F and returns list L2
> such as L2 contains only those "section"'s which has "policies" wich Id
> equals to those in F. Empty sections must be removed.
>
> For lists L and F result should be:
> L2 = [
> {section, "id1", [
> {section, "id1.1", [
> {policy, "p1", "v1"},
> ]},
> ]},
> {section, "id2", [
> {policy, "p3", "v3"}
> ] },
> ]
>
> 5. I have solution (please see below) but it seems to me not as good as
> it can be and even little weird. I am sure that it could be improved
> and done in proper way.
> If anyone could give me any glimpse or advice I would appreciate.
>
> Source:
> ---------------------------------------------------------------------------
> main() ->
>     L = [
>             {section, "id1", [
>                 {section, "id1.1", [
>                     {policy, "p1", "v1"},
>                     {policy, "p2", "v2"}
>                 ]},
>                 {section, "id1.2", [
>                 ]}
>             ]},
>             {section, "id2", [
>                 {policy, "p3", "v3"}
>             ] },
>             {section, "id3", [
>             ] }
>         ],
>     %filter(L, ["p1", "p3"]).
>     filter(L, ["p3"]).
>
> filter([H | T], Search) ->
>     lists:flatten([filter(H, Search, []) | filter(T, Search, [])]).
>
> filter({section, I, C}, Search, _Acc) ->
>     NewC = filter(C, Search, []),
>     case NewC of
>         [] -> [];
>         _ -> {section, I, NewC}
>     end;
>
> filter([{section, I, C} | T], Search, Acc) ->
>     NewC = filter(C, Search, []),
>     case NewC of
>         [] -> filter(T, Search, Acc);
>         _ -> filter(T, Search, [{section, I, NewC} | Acc])
>     end;
>
> filter([{policy, I, V} | T], Search, Acc) ->
>     case lists:member(I, Search) of
>         true -> filter(T, Search, [{policy, I, V} | Acc]);
>         false -> filter(T, Search, Acc)
>     end;
> filter([], _S, Acc) -> Acc.

Just some high-level comments.

- Your data is clearly layered (there are items which can be policies
or sections, and sections in turn contain lists of items), but your
filtering code conflates these and in turn gets confused as to what
it's processing, hence the large number of cases in filter/3.  Also,
your code would crash if the top-level lists starts with a policy, but
accepts it elsewhere, which smells like a bug.
- You roll your own logic for reassembling the filtered-out fragments,
which gets unnecessarily fiddly, and in this case I think wrong (you
reorder items).
- Using lists:filtermap/2 would greatly simplify your code.