[erlang-questions] Backtracking in Erlang, part 1 - control

Wed Mar 18 03:01:14 CET 2009

Sometimes you need to be able to do search algorithms which can step back
and try other alternate ways of finding a solution, "backtracking".
Generally this is difficult to do in functional languages like Erlang (most
other languages too for that matter). A classic example of this type of
problem is parsing ambiguous grammars.

Prolog syntax is like that, in some cases you need almost indefinite
look-ahead before you can decide how input is to be parsed. When I
implemented Erlog (a prolog in and for Erlang) I decided to do the parser in
Erlang so that it could run stand-alone and so I ran straight into this
problem.

Here is *a* solution to the problem. It is in two parts, the first
describing the control structures and the second (more difficult) describing
handling of data. There may also be a third part discussing some problems
with the solution to control, if anyone is still interested.

The main problem we have to overcome is that normally when you choose a path
you "commit" to it and can't go back and try an alternate path. For example
in:

    case get_a_value() of
        Choice1 -> ... ;
        Choice2 -> ... ;
        ...
    end

once you have chosen a path, say Choice1, there is no (simple) way to go
back try Choice2 if you later decide that Choice1 was wrong. We can't
backtrack.

Another problem is that if we have:

    a() -> foo(), bar(), baz().

and once we have completed foo() then even if it gave an answer which later
was shown to be wrong there is no way to step back into it, backtrack, and
let it give a new answer.

In this first section we will (almost) completely ignore arguments and
return values.

When looking at control we will (almost) completely ignore arguments and
returning values. We will use the convention that functions return either
'{succeed,Value}' when they succeed or 'fail' when they can't find a
solution.

The trick is to ensure that functions only return at all when they fail,
otherwise they just continue by explicitly calling the next thing which
should be done. We will do this by using a Continue Passing Style (CPS)
where each function will have an argument which is the next function call to
continue. (CPS is not new and is generally known) So functions will usually
look like this:

    a(Next) ->
        <do our stuff>,
        OurNext = fun () -> even_more_stuff(Next) end,
        more_stuff(OurNext).

We will use the convention that when a function fails and cannot go on it
will return 'fail'. When *the* solution has been found then the final test
will return {succeed,Value}. So the top level function will look something
like:

top_call() ->
    Succeed = fun () -> {succeed,yes} end,        %We have succeeded
    find_solution(Succeed).

So now our program will keep going forwards until it finally calls the
Succeed fun when it all returns back to the initial caller.

The next problem is to introduce 'choice points' where we can choose one
path but still allow the code to backtrack and make another choice. This is
the done in the function cp/1 where the argument is a list functions which
are separate paths to choose from:

    cp([fun () -> do_stuff_a(Next) end,
        fun () -> do_stuff_b(Next) end,
        fun () -> do_stuff_c(Next) end,
        ...]).

Cp/1 will try the functions in order, if one succeeds then the choice point
will succeed and return the same succeed value, if it fails then the next
function in the list will be tried, until no more exist when the choice
point will fail: The code for cp is simple:

    cp([G|Gs]) ->
        case G() of
            {succeed,Value} -> {succeed,Value};
            fail -> cp(Gs)
        end;
    cp([]) -> fail.

That's it, we're done. A lot of explanation for very little code.

A slight variation is to not chain the continuation together as shown here
but to have Next as a list of functions to call. Instead of directly calling
Next you call next/1 with argument Next, next(Next), where next/1 is defined
as:

    next([N|Ns]) -> N(Ns).

There is no functional difference it just make writing some code a bit
easier. For example the function defined above:

    a() -> foo(), bar(), baz().

would look like this in the normal way:

    a(Next) ->
        Next1 = fun () -> baz(Next) end,
        Next2 = fun () -> bar(Next1) end,
        foo(Next2).

while using the list variant it would look like:

    a(Next) ->
        foo([fun (Ns) -> bar(Ns) end,
             fun (Ns) -> baz(Ns) end | Next]).

which can be easier to read, especially if there many calls in the
continuation.

After I introduce handling data in the next installment I can give a proper
example.

I have also posted this to my blogg for those who prefer to read there,

Robert
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20090318/b8555a16/attachment.htm>