Re: Anybody done this

Tue Apr 6 05:54:31 CEST 2004

One of the features of DEC-10 Prolog was that you could save a memory
image and later restore it, resuming from exactly the save call.

Quintus Prolog supported this, in a very Emacs-like way.

But eventually we deprecated it.  Let me quote the relevant part of the manual.

    Unfortunately, it has not been possible to retain the semantics of
    save/[1,2] available in previous releases of Quintus Prolog.  This
    is regrettable because it means that programs that incorporate code
    for building saved-states will need to be changed.  This section
    explains why it was necessary to remove these predicates.  Note,
    however, that save_program/1 is available and has the same semantics
    as previous releases (except for foreign code), although it is based
    on a new implementation using QOF files.  A new predicate
    save_program/2, described in {manual(g-5-4)}, has been provided
    which supports the most common usage of save/[1,2] which was to
    specify an initial goal for a saved-state to call when run.

    The difference between save_program/1 and save/[1,2] in previous
    releases of Quintus Prolog was that save_program/1 saved only the
    Prolog database, whereas save/[1,2] saved both the Prolog database
    and the Prolog execution stacks.  It has not been possible to retain
    the saving of the Prolog execution stacks in a way which is
    consistent with the Release 3 support of embeddability and the
    general portability of QOF files.  This is why save/[1,2] have been
    removed.  The reasoning goes as follows:

       1. QOF files are a completely portable machine-independent
	  representation of Prolog data.

       2. It is difficult, if not impossible, to make the Prolog execution
	  state portable in the same way as facts and rules in QOF files
	  (see further points).

       3. QOF files can also be combined and loaded in flexible ways, and it
	  is unclear what this would mean for execution states.

*      4. The QOF file saved-states do not save any C (or other foreign
*         language) state. This is a change from the previous Quintus Prolog
*         saved-states, and is further discussed below.

*      5. In the general case, Prolog execution can now be arbitrarily
*         interleaved with C (or other) function calls since Prolog and C
*         are completely intercallable and can call each other recursively.

       6. Since the C state is not saved, it is not possible to meaningfully
	  save the Prolog execution state in the general case where it
	  depends on interleaved C execution state.

*      7. In addition, Prolog code embedded in a C (or other) application is
*         highly likely to be manipulating C data, such as pointers and
*         other process-specific information. This data would be meaningless
*         if restored into another process, and indeed would be likely to
*         cause faults.

    The model that an arbitrary Prolog execution state can be saved thus
    only works well within a Prolog-only situation.  In the complex
    embedded environments supported by Quintus Prolog Release 3 this
    model cannot work properly.  Hence the removal of the facility.

    As mentioned in points 4-7 above, an additional important aspect
    here is that Prolog no longer makes any attempt to save the state of
    C (or other foreign language) code.  This was a feature of
    saved-states in previous releases where both the C code and its data
    structures were saved (as a memory image) into saved-states.  This
    was a feature that caused many problems.  A primary problem was that
    the saved C state was initialized (variables retained their values
    when restored) and yet the initialized C state could contain many
    items that were no longer valid in the new process, such as
    addresses and file descriptors.  Such code would often fail when
    restored.  In addition, Prolog was unable to guarantee that it had
    saved all the necessary foreign code state.  With the advent of
    shared libraries and other complex memory management facilities in
    the operating system, it became impossible for Prolog to control and
    manage the states of other tools in the address space.

    When one takes a step back and looks at Prolog in the light of the
    goals of Release 3 ({manual(a-2)}) - where Prolog code is a
    component that can be embedded in complex applications written in
    many languages - it is clearly unreasonable for Prolog to try and
    control, let alone save, arbitrary non-Prolog state.  The Prolog
    operations for saving and loading QOF files now operate solely on
    the Prolog database and these operations do not involve making any
    assumptions about non-Prolog state.  This is a much cleaner and more
    robust approach, and is the most appropriate when Prolog
    applications become embedded software components.

For example, in a couple of operating systems that Quintus Prolog ran on,
a C program could *not* rely on getting the same addresses whenever it
ran.  Here's the kind of thing I'm getting at:

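    #include <stdio.h>
    static char c = 1;
    int main(void) {
        /* %p expects a void *, so cast both arguments.  Converting a
           function pointer this way is a POSIX-ism, not strict ISO C. */
        printf("%p %p\n", (void *)&c, (void *)main);
        return 0;
    }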

Compile that in a UNIX environment, and you expect to get the same numbers
every time.  Well, in a couple of operating systems, you didn't.  So if QP
saved out a memory image, and then you ran QP and restored that image, all
the C code and static-storage-duration data would now be somewhere else, and
all the addresses would be wrong.  With dynamically loaded code, you can't
even trust this kind of thing in UNIX any more; if foreign files foo.o and
bar.o are loaded, when you restore things on startup, foo.o may now load a
*different* version of the .so files it depends on, so bar.o may be loaded
at a different address.
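
To make the address problem concrete, here is a minimal sketch that
simulates a save and restore inside a single process.  Copying the raw
bytes of a structure to a new location stands in for restoring a memory
image at a different address.  Everything in it (struct image,
image_init, image_restore, abs_ptr_is_valid, resolve_rel) is a
hypothetical name made up for illustration; none of it comes from
Quintus Prolog or QOF.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical "memory image": a value plus two ways of referring to it. */
struct image {
    int    value;    /* the datum we want to find again after a restore  */
    int   *abs_ptr;  /* absolute address: only valid where it was saved  */
    size_t rel_off;  /* offset from the start of the image: relocatable  */
};

/* Fill in the image at `img`, recording both kinds of reference. */
void image_init(struct image *img, int value) {
    img->value   = value;
    img->abs_ptr = &img->value;
    img->rel_off = offsetof(struct image, value);
}

/* "Restore" a saved image at a new address by copying its raw bytes,
   which is all that a memory-image restore can do. */
void image_restore(struct image *dst, const struct image *saved) {
    memcpy(dst, saved, sizeof *dst);
}

/* Does the absolute pointer still refer to this image's own value? */
int abs_ptr_is_valid(const struct image *img) {
    return img->abs_ptr == &img->value;
}

/* Resolve the relative reference against wherever the image lives now. */
int *resolve_rel(struct image *img) {
    return (int *)((char *)img + img->rel_off);
}
```

If you image_init one struct image and image_restore it into a second
one, abs_ptr_is_valid holds for the first but not for the copy, whose
abs_ptr still points back into the original.  resolve_rel on the copy
finds the copy's own value.  A saved state that only holds
position-independent data avoids the problem; one that holds raw C
addresses does not.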

Then there were things like people mapping a frame buffer into their
address space, shared memory, System V message queues that didn't work
on restoration because the other process wasn't there any more, socket
connections that couldn't be restored, &c &c.
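
The same thing happens with operating-system handles.  The following
is a rough sketch, assuming a POSIX system, of why a descriptor number
sitting in a restored image is useless: the number survives, but the
kernel object it named does not.  Closing the pipe here stands in for
the original process exiting.  The function names fd_is_open and
saved_fd_survives are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if `fd` names an open descriptor in this process, else 0. */
int fd_is_open(int fd) {
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Simulates "save, exit, restore": remember a descriptor number, close
   the descriptor (as process exit would), then test the remembered
   number.  Returns 1 if the saved number is still usable, 0 if it has
   gone stale, and -1 if the pipe could not be created. */
int saved_fd_survives(void) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    int saved = fds[0];               /* what a memory image would hold */
    int was_open = fd_is_open(saved); /* valid at "save" time           */
    close(fds[0]);
    close(fds[1]);
    return was_open && fd_is_open(saved);
}
```

In a single-threaded program saved_fd_survives() should return 0: the
integer is unchanged, but nothing stands behind it any more.  Worse, in
a real restore the same number might by then name some unrelated file.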

The key sentences from the manual are these:

>>> The model that an arbitrary Prolog execution state can be saved thus
>>> only works well within a Prolog-only situation.
>>> In the complex embedded environments supported by Quintus Prolog
>>> Release 3 this model cannot work properly.

The lesson for Erlang (and Joe) is that if you want to save out a suspended
version of an Erlang node, and then restore it, even on the exact same
machine one millisecond later, you must be running an Erlang(-and-runtime-
library)-ONLY system; any drivers had better be statically linked into the
runtime system and they had better be d--- careful about what they do.