Antw: Anybody done this
Richard A. O'Keefe
ok@REDACTED
Tue Apr 6 05:54:31 CEST 2004
One of the features of DEC-10 Prolog was that you could save a memory
image and later restore it, resuming from exactly the save call.
Quintus Prolog supported this, in a very emacs-like way.
But eventually we deprecated it. Let me quote the relevant part of the manual.
Unfortunately, it has not been possible to retain the semantics of
save/[1,2] available in previous releases of Quintus Prolog. This
is regrettable because it means that programs that incorporate code
for building saved-states will need to be changed. This section
explains why it was necessary to remove these predicates. Note,
however, that save_program/1 is available and has the same semantics
as previous releases (except for foreign code), although it is based
on a new implementation using QOF files. A new predicate
save_program/2, described in {manual(g-5-4)}, has been provided
which supports the most common usage of save/[1,2] which was to
specify an initial goal for a saved-state to call when run.
The difference between save_program/1 and save/[1,2] in previous
releases of Quintus Prolog was that save_program/1 saved only the
Prolog database, whereas save/[1,2] saved both the Prolog database
and the Prolog execution stacks. It has not been possible to retain
the saving of the Prolog execution stacks in a way which is
consistent with the Release 3 support of embeddability and the
general portability of QOF files. This is why save/[1,2] have been
removed. The reasoning goes as follows:
1. QOF files are a completely portable machine-independent
representation of Prolog data.
2. It is difficult, if not impossible, to make the Prolog execution
state portable in the same way as facts and rules in QOF files
(see further points).
3. QOF files can also be combined and loaded in flexible ways, and it
is unclear what this would mean for execution states.
4. The QOF file saved-states do not save any C (or other foreign
language) state. This is a change from the previous Quintus Prolog
saved-states, and is further discussed below.
5. In the general case, Prolog execution can now be arbitrarily
interleaved with C (or other) function calls since Prolog and C
are completely intercallable and can call each other recursively.
6. Since the C state is not saved, it is not possible to meaningfully
save the Prolog execution state in the general case where it
depends on interleaved C execution state.
7. In addition, Prolog code embedded in a C (or other) application is
highly likely to be manipulating C data, such as pointers and
other process-specific information. This data would be meaningless
if restored into another process, and indeed would be likely to
cause faults.
The model that an arbitrary Prolog execution state can be saved thus
only works well within a Prolog-only situation. In the complex
embedded environments supported by Quintus Prolog Release 3 this
model cannot work properly. Hence the removal of the facility.
As mentioned in points 4-7 above, an additional important aspect
here is that Prolog no longer makes any attempt to save the state of
C (or other foreign language) code. This was a feature of
saved-states in previous releases where both the C code and its data
structures were saved (as a memory image) into saved-states. This
was a feature that caused many problems. A primary problem was that
the saved C state was initialized (variables retained their values
when restored) and yet the initialized C state could contain many
items that were no longer valid in the new process, such as
addresses and file descriptors. Such code would often fail when
restored. In addition, Prolog was unable to guarantee that it had
saved all the necessary foreign code state. With the advent of
shared libraries and other complex memory management facilities in
the operating system, it became impossible for Prolog to control and
manage the states of other tools in the address space.
When one takes a step back and looks at Prolog in the light of the
goals of Release 3 ({manual(a-2)}) - where Prolog code is a
component that can be embedded in complex applications written in
many languages - it is clearly unreasonable for Prolog to try and
control, let alone save, arbitrary non-Prolog state. The Prolog
operations for saving and loading QOF files now operate solely on
the Prolog database and these operations do not involve making any
assumptions about non-Prolog state. This is a much cleaner and more
robust approach, and is the most appropriate when Prolog
applications become embedded software components.
For example, in a couple of operating systems that Quintus Prolog ran on,
a C program could *not* rely on getting the same addresses whenever it
ran. Here's the kind of thing I'm getting at:
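    #include <stdio.h>

    static char c = 1;

    int main(void) {
        /* %p expects a void *; casting a function pointer to void *
           is a common extension, not strictly portable C. */
        printf("%p %p\n", (void *)&c, (void *)main);
        return 0;
    }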
Compile that in a UNIX environment, and you expect to get the same numbers
every time. Well, in a couple of operating systems, you didn't. So if QP
saved out a memory image, and then you ran QP and restored that image, all
the C code and static-storage-duration data would now be somewhere else, and
all the addresses would be wrong. With dynamically loaded code, you can't
even trust this kind of thing in UNIX any more; if foreign files foo.o and
bar.o are loaded, when you restore things on startup, foo.o may now load a
*different* version of the .so files it depends on, so bar.o may be loaded
at a different address.
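
To make the failure mode concrete, here is a minimal sketch of the check a restore step would have to pass. It is not how Quintus Prolog did anything; struct saved_image, image_save and image_address_still_valid are hypothetical names invented for illustration. The "image" records the raw address of a static object, and the check asks whether that address still names the same object in the process doing the restore.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "memory image": the raw address of a static object,
   as seen by the process that did the save. */
struct saved_image {
    uintptr_t counter_addr;
};

static int counter = 1;

/* Save step: record where `counter` lives in this process. */
void image_save(struct saved_image *img) {
    assert(img != NULL);
    img->counter_addr = (uintptr_t)&counter;
}

/* Restore-time check: 1 if the saved address still names `counter`
   in the current process, 0 if the object now lives elsewhere. */
int image_address_still_valid(const struct saved_image *img) {
    assert(img != NULL);
    return img->counter_addr == (uintptr_t)&counter;
}
```

Within one process the check trivially succeeds. The point is that nothing guarantees it across processes: if the image were written to disk and read back by a fresh run whose data segment was loaded elsewhere, the check would fail, and every saved pointer of this kind would be wrong in the same way.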
Then there were things like people mapping a frame buffer into their
address space, shared memory, System V message queues that didn't work
on restoration because the other process wasn't there any more, socket
connections that couldn't be restored, &c &c.
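
File descriptors show the same problem in miniature. The following is a sketch only, assuming a POSIX system; fd_is_open_here and stale_fd_demo are hypothetical helper names, not part of any Prolog or Erlang runtime. It uses fcntl with F_GETFD to ask whether a saved descriptor number refers to anything open in the current process.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if `fd` is an open descriptor in *this* process, else 0.
   Even a 1 does not prove it is the same file or socket as when the
   number was saved; it only proves the number is in use. */
int fd_is_open_here(int fd) {
    errno = 0;
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Simulates a stale descriptor: remember the number, close the
   descriptor, then ask whether the remembered number is still valid.
   Returns the result of the check, or -1 if pipe() failed. */
int stale_fd_demo(void) {
    int p[2];
    if (pipe(p) != 0)
        return -1;
    int saved = p[0];   /* the value a memory image would contain */
    close(p[0]);        /* stand-in for "the old process is gone" */
    close(p[1]);
    assert(saved >= 0);
    return fd_is_open_here(saved);
}
```

A saved image holds only the integer, not the kernel object behind it, so after a restore the best case is that the number is reported as not open; the worse case is that it has been reused for something unrelated.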
The key sentences from the manual are these:
>>> The model that an arbitrary Prolog execution state can be saved thus
>>> only works well within a Prolog-only situation.
>>> In the complex embedded environments supported by Quintus Prolog
>>> Release 3 this model cannot work properly.
The lesson for Erlang (and Joe) is that if you want to save out a suspended
version of an Erlang node, and then restore it, even on the exact same
machine one millisecond later, you must be running an Erlang(-and-runtime-
library)-ONLY system; any drivers had better be statically linked into the
runtime system and they had better be d--- careful about what they do.