[erlang-questions] traversal of multiple lists (or list of lists)
Richard A. O'Keefe
ok@REDACTED
Wed Oct 26 01:47:40 CEST 2016
On 22/10/16 9:41 AM, Karlo Kuna wrote:
> i want to implement what i call Select-Consume function:
>
> take list of lists [L1 ... Ln]
> select some (potentially all) heads from L1 ... Ln and do something with
> them
> recurse with tails of lists where selection happened and with complete
> lists where there wasn't selection preserving original order
> end when all L1 ... Ln are exhausted or when there was no selected heads.
This is rather unclear.
Here's a function that takes a list of lists.
If all the lists are empty, it returns [].
If all of the lists are non-empty, it returns [Heads|Tails].
If some are empty and some not, it crashes.
unfold_lists([[Head|Tail] | []]) ->
[[Head] | [Tail]];
unfold_lists([[Head|Tail] | Lists]) ->
[Heads|Tails] = unfold_lists(Lists),
[[Head|Heads] | [Tail|Tails]];
unfold_lists([[] | []]) ->
[];
unfold_lists([[] | Lists]) ->
[] = unfold_lists(Lists),
[].
Now we can write things like
multilist_map(F, Lists) ->
case unfold_lists(Lists)
of [] -> []
; [Heads|Tails] -> [F(Heads) | multilist_map(Tails)]
end.
(Something not entirely unlike this was in the DEC-10 Prolog library.)
For example,
> multi:unfold_lists(["abc","def","ghi"]).
["adg","bc","ef","hi"]
> multi:multilist_map(fun erlang:list_to_atom/1, ["abc","def","ghi"]).
[adg,beh,cfi]
> multi:multilist_map(fun erlang:list_to_atom/1, ["abc","def","gh"]).
** exception error: no match of right hand side value []
in function multi:unfold_lists/1 (multi.erl, line 7)
in call from multi:unfold_lists/1 (multi.erl, line 7)
in call from multi:multilist_map/2 (multi.erl, line 16)
in call from multi:multilist_map/2 (multi.erl, line 18)
Another approach, also found in the DEC-10 Prolog library, might be
clearer:
all_empty([[_|_]|_]) ->
false;
all_empty([[]|Lists]) ->
all_empty(Lists);
all_empty([]) ->
true.
heads([[Head|_]|Lists]) ->
[Head | heads(Lists)];
heads([]) ->
[].
tails([[_|Tail]|Lists]) ->
[Tail | tails(Lists)];
tails([]) ->
[].
cons_lists([Head|Heads], [Tail|Tails]) ->
[[Head|Tail] | cons_lists(Heads, Tails)];
cons_lists([], []) ->
[].
With those definitions, we'd have
multilist_filter(F, Lists) ->
case all_empty(Lists)
of true -> []
; false ->
Heads = heads(Lists),
Rest = multilist_filter(F, tails(Lists)),
case F(Heads)
of true -> [Heads|Rest]
; false -> Rest
end
end.
There's a trade-off here: unfold_lists/1 does a single traversal,
but allocates twice as much memory as it retains, while
all_empty/1, heads/1, and tails/1 allocate no unretained memory,
but take two traversals (not three: in the non-empty case the
all_empty/1 function will just check the first element).
>
> my worry is about performance, as far as i know in erlang there is no
> way to coordinate traversal of multiple lists!
Why not write the code then measure it and see if you have a problem?
> i looked at implementation of lists module and there i found horror of
> manual implementation of case combinatorics in merge/3
> which obviously does not scale at all.
merge/3 and its support functions have been micro-optimised to reduce
memory references and pattern matching work. In the service of
something as commonly used as sorting, that makes sense. It's not
*meant* to scale to more than 2 lists, or even to be something that
ordinary programmers would imitate.
> i am also aware that forwarding strategy in my case is making new list
> every time, which is BAD but i don't know of any other way
> to handle variadic function call other than passing list of arguments
> (in my case list of lists).
Presumably you're referring to code similar to the code above.
Why do you call it "BAD" with shrieking capitals?
You're right that there *isn't* any way to write a "variadic" function
in Erlang than to give it a list (or tuple), and that if the arguments
are different, that list (or tuple) will need to be different.
Erlang's garbage collector *expects* to be given a fair bit of work
with intermediate values being created instead of possibly shared
data being smashed. It doesn't mind. It's a good idea in any
programming language to avoid repeated work and garbage creation,
but not at the expense of getting the code working and appropriately
maintainable.
It really would help a lot if you could be clearer about what exactly
you are trying to do.
More information about the erlang-questions
mailing list