Optimized lists

Fri Mar 18 00:01:57 CET 2005

The question about CDR-coding came up.

The experience Quintus had putting Prolog on the Xerox Lisp Machines
(the Dandelion, Dandetiger, Daybreak &c: 1108, 1109, 1185, 1186) may
be informative here.

The Xerox machines ran Interlisp-D.  It's the classic CDR-coding system.
On this machine, the memory bus was 16 bits wide, and the address space'
was 24 bits.  A cons cell was 32 bits, holding a 24-bit CAR and an 8-bit
CDR.  The CDR codes included
    CDR is NIL
    CDR is a pointer to cell N on the current page
    CDR is the pointer _in_ cell N on the current page
and there may have been an "I've moved to another address" case or not,
I can't recall.  They were able to divide up the word this way because
they used the BiBOP (Big Bag Of Pages) technique for run time typing
(_what_ a word is depends on _where_ it is) and because their garbage
collector (a version of reference counting that assumes the reference
count is 1 unless there is evidence to the contrary) put its associated
information elsewhere.

The Xerox machines were microcoded.  They were the same machines that
ran Mesa (with different microcode) for the Xerox Star office machines,
and the same machines that ran Smalltalk (with different microcode).

Xerox Quintus Prolog was given 4k of microcode space and 4MB of main
address space.  We wrote specifications of all our instructions in Interlisp,
only to discover that the microcoder didn't read it.  That worked out OK;
Herb Jellinek translated for him.  This was important, because any
unimplemented instruction could trap out to Interlisp, so we could run with
all our instructions in Lisp, all of them in microcode, or any mix, for
debugging.  Nice system.

The big thing was that we just didn't have *space* in our block of microcode
to implement CDR-coding, so we didn't.  We did the traditional WAM thing of
2 pointers per cons cell, except we put the tag in the top 8 bytes where the
microcode dispatch hardware could find it.

So there were two list-processing languages on the _same_ hardware using
similar levels of technology (non-optimisation to virtual instructions
emulated in microcode) with the microcode written by the _same_ very clever
microcoder.

And Prolog did basic list processing about 5 times as fast as Lisp.

A large part of that was Prolog's light-weight procedure calling mechanism
(heck, benchmarks doing nothing but procedure calls ran _faster_ in WAM-
emulated code than native C code did on Sun-2 computers...), but another
large part was _not_ doing CDR coding.

Lightweight function calls (including as much TRO as possible)
and lightweight access to major data structures (excluding CDR coding).
5 times faster.