Recursive list comprehension

Richard A. O'Keefe ok@REDACTED
Fri Jan 20 03:24:04 CET 2006


I wrote:
	> Er, WHAT "rampant memory consumption"?
	> Prolog-style backtracking is a way of ECONOMISING on memory,
	> by holding different things at different TIMES rather than in
	> different PLACES.

Peter-Henry Mander <erlang@REDACTED> replied:
	
	I agree that it avoids *explicitly* holding terms of computation
	in your program code, but (and I speak from a C programming
	perspective with all the incumbent memory management paranoia)
	this information is _still_ being implicitly stored in memory to
	enable the backtracking mechanism to rewind to a previous
	decision branch in the search tree.

For every variable that gets bound there is at most ONE word in the
trail to enable it to be unbound.  In practice, the trail is usually
TINY in comparison with all the other data.  For every choice point with
outstanding untried choices, there are typically about 8 words.  While
one does prefer not to have too many choice points hanging around, the
reason has nothing to do with their size:  the space required for choice
points is almost never of any practical concern.

Back in the days when 4 megabytes was considered a lot of memory,
people were running Prolog interpreters on serious problems quite
happily.  This even applied to Prolog interpreters (like "C Prolog")
that did not have a garbage collector:  BACKTRACKING RECLAIMS MEMORY
QUICKLY AND CHEAPLY.

	A red or green cut (I believe) discards sub-trees that can be
	reclaimed as free memory.  Otherwise if memory was not an issue,
	Prolog would not need cuts at all.
	
Not to be insulting, but don't you think it would be a good idea to
actually learn something about Prolog semantics and practice before
pontificating about it?  So-called "Green" cuts are cuts that have no
*logical* effect but simply tell the compiler or interpreter about
determinism it is too dumb to discover for itself.  They *do* have
memory implications, but that's not their main point.  So-called "red"
cuts have *nothing* to do with memory; the thing that *makes* them "red"
is that THEY CHANGE THE SEMANTICS OF THE PROGRAM.  If memory were
infinite and free, Prolog would still need "red" cuts for the same
reasons that they have always been used.

	> It is perfectly possible to have a language with backtracking but
	> without cuts.  (The logic programming language Mercury, for example.)
	
	My point above echoes yours. But I guess that Mercury either has very
	clever dead branch pruning to reclaim unreachable backtrack paths, or
	just doesn't care and guzzles memory.
	
Why do you find it necessary to say bad things ("guzzles memory") about
a system you aren't acquainted with?  As it happens, Mercury has an
*extremely* clever compiler.

In both Prolog and Mercury, it is (usually) NOT backtracking that
is responsible for using memory, but constructing data structures.
What backtracking offers is fast cheap storage reclamation; faster
than either GC *or* free().

If you are worried about memory use, then the languages to be frightened
of are the ones like C and C++ with manual storage reclamation.

	Prolog is designed to solve problems like Zurg, I can't contest
	that.  But I don't agree that Prolog somehow magically stores a
	search space in thin air.

Straw man.  Nobody ever said it did.  What Prolog *does* do is that
when it stops exploring a branch of the search space, it reclaims
all the memory allocated along that branch in a single very cheap
step.

	Backtracking needs memory,

Yes, it needs memory.  Of course it does.  But what it needs is *LESS*
memory.  For example, the classical A* algorithm needs a lot of memory
because it keeps around all the states you have visited in case you
bump into them again.  The IDA* algorithm (Iterative Deepening A*)
uses backtracking *IN ORDER TO SAVE MEMORY*; that algorithm is willing to
pay the price of repeated visits because it only needs to store the states
along the current branch.  That's an *exponential* saving in memory.
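
To see the trade-off in code, here is a rough Erlang sketch of plain
iterative deepening (not full IDA*, which bounds on estimated cost
rather than depth); the IsGoal test and the Next successor function
are whatever the caller supplies, and the graph itself is not shown:

    -module(iddfs).
    -export([search/3]).

    %% Run a depth-limited DFS over and over, raising the bound each
    %% time, until a goal turns up.  The only state kept is the path
    %% currently being explored.  (Loops forever if no goal exists.)
    search(Start, IsGoal, Next) ->
        deepen(Start, IsGoal, Next, 0).

    deepen(Start, IsGoal, Next, Bound) ->
        case dls(Start, IsGoal, Next, Bound, [Start]) of
            {found, Path} -> lists:reverse(Path);
            not_found     -> deepen(Start, IsGoal, Next, Bound + 1)
        end.

    %% Depth-limited depth-first search; Path is the branch we are on.
    dls(Node, IsGoal, Next, Bound, Path) ->
        case IsGoal(Node) of
            true                   -> {found, Path};
            false when Bound =:= 0 -> not_found;
            false                  -> kids(Next(Node), IsGoal, Next, Bound - 1, Path)
        end.

    kids([], _IsGoal, _Next, _Bound, _Path) ->
        not_found;
    kids([C | Cs], IsGoal, Next, Bound, Path) ->
        case dls(C, IsGoal, Next, Bound, [C | Path]) of
            {found, _} = Hit -> Hit;
            not_found        -> kids(Cs, IsGoal, Next, Bound, Path)
        end.

You would call it as, say, iddfs:search(Start, fun(S) -> S =:= Goal end,
fun neighbours/1), for some neighbours/1 of your own.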

Something like Haskell or Erlang working over a "spatial"
representation of the search space has to rely on the garbage collector
to reclaim parts of the search space that are no longer interesting.
It is possible to set things up so that you are in effect running a
backtracking search, so you don't *need* (much) more memory than a
backtracking search would, BUT it takes the garbage collector time to
notice this, so even with fairly clever programming you end up
*temporarily* holding onto a few hundred percent more memory.
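
To make the contrast concrete, here is a minimal Erlang sketch of the
"spatial" style, using a throwaway stand-in problem (Pythagorean
triples up to N): every candidate is built first as a data structure,
and the rejected ones sit around until the collector finds them.

    -module(spatial).
    -export([solutions/1]).

    %% Build the whole candidate space as a list, then filter it.
    %% A backtracking search would look at one candidate at a time
    %% and reclaim it the moment it failed.
    solutions(N) ->
        Candidates = [{A, B, C} || A <- lists:seq(1, N),
                                   B <- lists:seq(A, N),
                                   C <- lists:seq(B, N)],
        [T || {A, B, C} = T <- Candidates, A*A + B*B =:= C*C].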

	and please try hard to shake this
	perception, 'cos the argument will be great fun!  :-)
	
Not for me.  This is an old and well documented issue.

	I agree, side effects must be one of the reasons Dr. Joe et al. omitted
	backtracking, not memory (sorry, another fart). But I'm not convinced
	that backtracking in Prolog has no memory cost.
	
"no memory cost" COMPARED WITH *WHAT*?

	There seems to be mutual suspicion between FP/logic programmers and
	etch-bare-metal programmers. You imply that there's no cost at all in
	using the high level features of Prolog, and I don't believe so.
	
That is an outright lie.  I have never said, implied, hinted, or even
accepted any such thing.  There are ALWAYS costs in using the high level
features of ANY language that HAS high level features.  The question is
always "what are the alternatives and what are THEIR costs"?

	A programmer whose professional obsession revolves around
	pointers and other low level resource management knows by
	experience that these wonderfully powerful tools *must* have a
	computational and resource cost.

Henry Ledgard invented this wonderful term:  "a P-sub-A programmer".
This is a programmer who THINKS he is a professional but is REALLY
an amateur.  Your "professional" programmer seems to be a P-sub-A one.

It is true that high level features have a cost.
It is ALSO true that low level features have a cost.
Your P-sub-A programmer isn't even CLOSE to worrying about the
costs of the features HE is (ab)using.

One of my favourite examples is a Prolog-vs-Fortran race.
I had some data that I wished to analyse using a particular
statistical technique.  I had a textbook with a full listing
of a Fortran program to do the analysis.  So I typed the
program in.  It didn't work.  I spent a couple of hours trying
to debug it and got nowhere.  So I wrote a program in Prolog
to do the job.  (It took me less time than failing to debug
the Fortran code had.)  That gave me the answer.

COST ONE:  time to market.  A fast program that isn't finished yet
isn't earning you money (or praise).  A slow program that IS finished
CAN earn you money and praise.

But that's not the punch-line.  The punch-line is that I went back
to the Fortran program and proof-read it with extreme care and found
the one typo that had caused the problems and got it going.  Now
Fortran was designed for numerical calculations like this.  And the
compiler was an optimising compiler with optimisation level set high.
The Prolog program was faster!

How did that happen?  Well, because the Prolog code was simpler and
easier to debug, I had been able to use a better data structure.

COST TWO:  time spent fighting low level details is time NOT spent
choosing appropriate high level data structures.

This is not rare.  I have an example of a program that was first
written in C and then rewritten in AWK.  The AWK program is faster.
Reason?  Making the AWK program use memoisation took about five lines
of AWK code.  Changing the C program would have meant a lot more work.
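
(The AWK and C programs themselves are beside the point here; as a
rough illustration of how little code memoisation needs in a
high-level language, here is a throwaway Erlang version using the
process dictionary as the cache, very much in the quick-and-dirty
spirit of those five lines of AWK.)

    -module(memo).
    -export([fib/1]).

    %% Naively exponential function made fast by remembering answers
    %% in the process dictionary.  A cleaner version would thread an
    %% explicit table through instead.
    fib(0) -> 0;
    fib(1) -> 1;
    fib(N) when N > 1 ->
        case get({fib, N}) of
            undefined ->
                Val = fib(N - 1) + fib(N - 2),
                put({fib, N}, Val),    % remember the answer
                Val;
            Val ->
                Val                    % already computed
        end.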

Then there is the fact that the alleged "costs" of high level features
are sometimes imaginary.  It's not unusual for me to find SML code
compiled with the MLton compiler running at close to the speed of C,
certainly within a factor of two.  (SML is a strict functional language
with strong polymorphic types and mutable arrays.)  When Mercury (a pure
logic programming language with strong polymorphic types and strong
mode/determinism checking) was still new, the Mercury team did some
benchmarks and found Mercury running faster than C.

I have text processing code where AWK (using a modified version of
mawk-1.3.3) beats C.  I have some other text processing code in a
functional language that's a *lot* faster than the C code it was
based on because it uses a "lazy concatenation" representation for
strings that suits the problem at hand, a representation which would
be quite impractical to use in C.

As a practical matter, I have an XML processing toolkit written in
Scheme.  The tool-chain is
    (XML to S-expression converter in C) ->
    (Scheme program uses 'read' to read S-expression) ->
    (Scheme program uses list processing to transform the tree) ->
    (Scheme program calls a custom C function to write the tree as XML)
and this runs at least a factor of 10 faster than any XSLT
processor I've been able to benchmark (and the Scheme code is about
a factor of 6 or 7 shorter than XSLT).

	The lisp/prolog weenies seem blissfully unaware of this cost.

Why do you find it necessary to slander ("weenies", "blissfully unaware")
a group of people you don't even know?  *Professional* programmers who
use Lisp or Prolog (or Erlang!) *are* aware of costs (one way to tell a
real professional programmer is that they actually *measure* their
programs rather than relying on prejudice).  They are also aware that
low-level languages have costs of their own.

	I may be ignorant,

You are.

	maybe they're right to dismiss the cost as a sacrifice for
	provably correct programs,

But they *DON'T* dismiss cost as an issue.  They just aren't one-eyed
and prejudiced about it.  They are aware that development time is a cost.
They are aware that a high level language enlarges the scope of what you
*CAN* program at all.  They are aware that cheap ($1000) computers are
now more than a thousand times faster than expensive ($1000000) mainframes
of 25 years ago and have about a thousand times as much memory.  They
are aware that programs that give the wrong answers impose heavy costs
on customers.  They are aware that buffer overruns (far too frequent in
C programs) create security holes that too many "black hats" will
maliciously exploit, and that creates costs for all of us.  They are
aware that the *overall* cost of a computing system depends on more than
just how fast you can do strcpy(): disc speed and data volume, and
network speed and data volume, are likely to be at least as
important.
	
There aren't just people running web servers in Erlang, SML, Scheme, Lisp,
and Prolog; there are people *happily* running such web servers because
they are fast *enough*, and the other costs (like security) are *less*.

	Erlang bridges a gap between these two camps.  The design of the
	language contains pragmatic compromises that keeps resource
	consumption low, and preserves enough of the functional paradigm
	to enable proof of correctness.
	
But you could say this of SML or Clean or Mercury or ...

	So my question(s) changes:
	    Is backtracking essential?

Once you have Turing equivalence, NO language feature is "essential".
"Essential" is the wrong question.  Backtracking is ***USEFUL***.
Software engineers care about what will let them get usable working
code in time, not about essences.  Backtracking is useful for SOME
problems.  Those are the problems that Prolog is meant for.
Backtracking is pretty much irrelevant to other problems.

	How much do Erlang programmers suffer for not having backtracking?

Very little, because they mostly don't use Erlang for problems where
backtracking is useful.  Just as they suffer very little from not having high speed
floating-point vector operations, because they mostly don't use Erlang
for doing heavy-duty statistical calculations.

	What other FP/logic features are unavailable in Erlang
	that would be cool to have, and more pertinently, why are they
	*NOT* included?

From logic programming, the only things missing are unification (and other
forms of constraint solving) and backtracking search.  Unification implies
"instantaneous communication" between remote parts of the program, so it
doesn't suit Erlang's distributed style.
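
The difference is easy to see in the shell: Erlang's = is a one-way
match against a term that already exists, not two-way unification, so
an unbound variable on the right-hand side is simply an error
(transcript for illustration only):

    1> {X, 2} = {1, 2}.        % one-way match: binds X to 1
    {1,2}
    2> {1, Y} = {1, Z}.        % Z is unbound; Prolog would unify Y and Z
    * 1: variable 'Z' is unbound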

Some functional languages have lazy evaluation.  But that's a tricksy
sort of control structure, and it is very hard to write reliable code
that mixes lazy evaluation with side effects.  (For a language that
tries, see the statistics environment S-PLUS or its free cousin R.)

List comprehension syntax *used* to be missing, but it's there now.
Lambda-expressions *used* to be missing, but they're there now.
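
For example, both in one throwaway snippet:

    Square = fun(N) -> N * N end,              % a lambda-expression
    [Square(X) || X <- lists:seq(1, 10),       % a list comprehension
                  X rem 2 =:= 1].
    %% gives [1,9,25,49,81]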

From some logic programming languages and from some functional languages,
*the* big thing that's missing would be strong polymorphic type checking
with typeclasses.  The sticking point here is mixing that with hot loading,
which is one of Erlang's strong points.  There is an optional type checker
for Erlang; in fact there have been several.
	
Why are (or were) various features omitted?
Basically because Erlang wasn't an academic research project.
It had to become a usable tool for real commercial products very quickly,
which meant it had to start *small*.  Really good concurrent garbage
collection, for example, was more important than streamlined syntax.


