push_element/2 - efficient sliding window

Mon Aug 15 05:52:57 CEST 2005

With reference to (fixed size) sliding windows,
I asked whether we really needed built-in support.

"Ulf Wiger" <ulf@REDACTED> replied:
	It's about as justified as erlang:make_tuple/2
	and erlang:append_element/2.

I guess that's a NO, then.  (:-)

It is far from clear why make_tuple/2 should ever have been
a built-in.

    make_tuple(Size, Initial) ->
	tabulate(Size, fun (_) -> Initial end).

    tabulate(Size, F) ->  % the built-in I'd prefer
	list_to_tuple(lists:map(F, lists:seq(1, Size))).

For what kind of application is constructing a tuple with all elements
equal a large help?  Is this really something that is so common that
it *pays* to cut its overhead?

    erlang:append_element(Tuple, Element) ->
	list_to_tuple(tuple_to_list(Tuple) ++ [Element]).

Again, making this a built-in operation saves a constant factor in
time, but it is *still* an O(N) operation and therefore best avoided.

Here is *one* operation which could handle them both:

    set_or_extend(Index, Tuple, New_Element, Fill)
	when tuple(Tuple), integer(Index), 1 =< Index, Index =< size(Tuple)
	-> setelement(Index, Tuple, New_Element);
    set_or_extend(Index, Tuple, New_Element, Fill)
	when tuple(Tuple), integer(Index), Index > size(Tuple)
	-> list_to_tuple(tuple_to_list(Tuple) ++
		lists:map(fun (N) -> if N < Index -> Fill
				      ; N = Index -> New_Element
				     end end,
		    lists:seq(size(Tuple)+1, Index))).

That is, if the index is outside the existing range, the tuple is
extended with the Fill element, and then the resulting tuple has
the element at Index replaced by New_Element.

    make_tuple(S, E) -> set_or_extend(S, {}, E, E).

    append_element(T, E) -> set_or_extend(size(S)+1, T, E, E).

Am I seriously proposing set_or_extend/4?
Well, this is half fun and full earnest.  This operation would be quite
a bit more useful to me (as in, I would have occasion to use it) than
make_tuple/2 or append_element/2 (which I have not yet had occasion to use).
My real point is simply that hacking oodles of special cases is not the
best way to design built-in functions; there are generalisations which
should be exploted.

In the same way, the proposed 'push' operation (which is, contrary to the
usual sense of 'push' in computing, also a truncate; such clashes are the
snake's warning rattle that something bad is about to happen) is a quirky
special case.  There must be some good generalisations waiting to be
thought of.

I should also point out that I am not saying that a good candidate for
built-in status is a good candidate for everyday use; "wrapper" functions
for common special cases may also be a good idea.

Let's push this one a little further.
Can we find ONE operation which covers make_tuple/2, append_element/2,
the proposed push_element, and a whole lot of other things, which could
reasonably go in C?

Yep.

    tuple_stuff(Tuple, Amount, Fill, Rotation, Drop, Take)
	Tuple		a tuple
	Amount		a non-negative integer
	Fill		any term	
	Rotation	an integer in the range -N..+N,
			where N = size(Tuple) + Amount	
	Drop		an integer in the range -N..+N
	Take		an integer in the range -M..+M
			where M = N - |Drop|

    Step 1:  extend Tuple on the right with Amount copies of Fill
	e.g. tuple_stuff({a,b,c}, 2, d, ...) starts with {a,b,c,d,d}.

    Step 2:  rotation Tuple by Rotation.  If Rotation >= 0, take the
	last Rotation elements and move them to the front; if Rotation =< 0,
	take the first |Rotation| elements and move them to the back.

    Step 3:  if Drop >= 0, discard the first |Drop| elements;
	if Drop =< 0, discard the last |Drop| elements.

    Step 4:  if Take >= 0, take the first |Take| elements;
	if Take =< 0, take the last |Take| elements.

Now,
    make_tuple(Size, Element) ->
	tuple_stuff({}, Size, Element, 0, 0, Size).

    append_element(Tuple, Element) ->
	tuple_stuff(Tuple, 1, Element, 0, 0, size(Tuple)+1).

    prepend_element(Tuple, Element) ->
	tuple_stuff(Tuple, 1, Element, 1, 0, size(Tuple)+1).

    add_element_at_left_of_window(Element, Tuple) ->
	tuple_stuff(Tuple, 1, Element, 1, -1, size(Tuple)).

    add_element_at_right_of_window(Tuple, Element) ->
	tuple_stuff(Tuple, 1, Element, 0, 1, size(Tuple)).

There has to be a better name than tuple_stuff/6, but I hope you get
my point.  ONE built-in goes into erlang:, lots of special-purpose
wrappers go in a new tuples: modules, and further special cases can
be added without penalty.

	The cheapest way I've found so far to do a "left shift"
	of a tuple is:

The obvious question has to be this:
    if you want a sliding window, is a tuple the right tool for the job?
The equally obvious answer is "NO WAY!"  Why?  Because even with a BIF,
it's O(N) time and O(N) space turn-over to add an element (and remove
another), whereas with the right kind of tree it can be done in O(lgN)
time and O(lgN) space.  (Proof:  think of it as a priority queue.  At
every step we remove the element with the smallest timestamp and add a
new element with a greater timestamp than any existing element.)
Using a tuple only makes sense for SMALL fixed windows.

	Since a shift BIF would cost about the same as

The tuple_stuff/6 operation described above handles shifts and rotates.

	I see no reason why we should grace the list type with
	optimized versions of append, subtract, keymember, member,
	keysearch and reverse, but not offer similar optimizations
	for tuples.

I see no reason why the list type should be graced with optimised versions
of append, subtract, member, keymember, keysort, and reverse.  At least,
in an ideal world.

Just last week I responded to a question from someone in a Prolog mailing
list.  It turned out that the crude unordered list representation of sets
was looking pretty good in the Prolog system he was using, and more
sophisticated approaches weren't paying off.  Why?  Because he was using
a SLOW Prolog, where memberchk/2 was coded in C.  In a commercial Prolog,
where memberchk/2 *wasn't* special-cased like that, not only did the
more sophisticated approaches pay off after all, but even the code based
on memberchk/2 was faster!

The only reason for having BIF support for list operations is a compiler
that doesn't generate fast code for plain Erlang versions.  With time, one
may be permitted to HiPE that the need for such things will be reduced.

	I think tuples could be used to great advantage
	in many cases where we use ETS today, if only a few steps
	were taken to ease their use a little bit.

I think that's a good idea, *BUT* hacking one special case at a time
may not be the best way to do that.

	Another thing I'd like to see for tuples is a syntax
	addition:

	tag_search([{Tag | _} = T|_], Tag) ->
	    {value, T};

Agreed.