More restricted execution silly ideas

Sat Jun 14 19:07:29 CEST 2003

erlang@REDACTED wrote:
> > I think this is kind of the direction I had been suggesting earlier.
> [snip bit about defining trusted module directory] 
> > When a module is loaded, if it comes from a "trusted" directory,
> > the code server loads it as is.
> 
> How about if we just allow OTP contents, and then cherrypick from those
> which actual calls we will permit?

I thought about that one, but then you are constrained to providing
only those features of OTP which are considered safe, and what do you
do with the rest?  Ignore it?  What if I choose to offer most of
mnesia (the user functions, mnesia:transaction, mnesia:match_object),
but not the dets or ets modules?

I thought about this case last night while grilling dinner, and realized
that a .erl + parse_transform -> safe_.beam is the only way to go and
still preserve performance.  The "administrator" can define which modules
and functions are trusted to be called from untrusted code, which should
go through a security_manager module (as I've described), and which
should be flagged as illegal operations at compile time, preventing
compilation from occurring.
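
To make the three-way split concrete, here is a rough sketch of what such
a parse_transform might look like.  The module name safe_transform, the
policy/3 table and its entries are all made up for illustration; a real
policy would come from whatever the administrator configures.  It only
looks at remote calls (M:F(...)); apply/3, spawn, the send operator and
other BIFs would need the same treatment and are not handled here.

```erlang
-module(safe_transform).
-export([parse_transform/2, rewrite/1]).

%% Hypothetical policy table (an assumption, not an existing API):
%%   allow -> call is left exactly as written
%%   guard -> call is redirected through security_manager:call/N
%%   deny  -> compilation is aborted
policy(lists, _F, _Arity)   -> allow;
policy(sets, _F, _Arity)    -> allow;
policy(dict, _F, _Arity)    -> allow;
policy(file, open, 2)       -> guard;
policy(dets, open_file, 2)  -> guard;
policy(_M, _F, _Arity)      -> deny.

%% Entry point invoked by the compiler for modules compiled with
%% {parse_transform, safe_transform}.
parse_transform(Forms, _Options) ->
    [rewrite(Form) || Form <- Forms].

%% Remote call where module and function are literal atoms.
rewrite({call, L, {remote, _, {atom, _, M}, {atom, _, F}}, Args0}) ->
    Args = rewrite(Args0),
    case policy(M, F, length(Args)) of
        allow ->
            {call, L, {remote, L, {atom, L, M}, {atom, L, F}}, Args};
        guard ->
            {call, L,
             {remote, L, {atom, L, security_manager}, {atom, L, call}},
             [{atom, L, M}, {atom, L, F} | Args]};
        deny ->
            %% Crashing the transform stops the compile.
            erlang:error({illegal_call, L, {M, F, length(Args)}})
    end;
%% Remote call with a computed module or function: always guarded.
rewrite({call, L, {remote, _, M, F}, Args}) ->
    {call, L,
     {remote, L, {atom, L, security_manager}, {atom, L, call}},
     [rewrite(M), rewrite(F) | rewrite(Args)]};
%% Generic traversal of the rest of the abstract format.
rewrite(T) when is_tuple(T) -> list_to_tuple(rewrite(tuple_to_list(T)));
rewrite(L) when is_list(L)  -> [rewrite(E) || E <- L];
rewrite(Other)              -> Other.
```

With this sketch, lists:map(F, L) in untrusted source is untouched,
file:open(P, Modes) becomes security_manager:call(file, open, P, Modes),
and something like os:cmd(X) makes the compile fail.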

> > If the module is from an untrusted directory, the code server
> > rips apart the .beam file and begins to "verify" it:
> [snip approval algorithm] 
> > Basically we allow 'system' code to run without penalty, and allow user
> > 'unsafe' code to run with minimal penalty, basically anytime they want
> > to access external resources.  Basic calls to stdlib such as lists,
> > sets, dict are all allowed as-is.  Its ets, file, code, message sending,
> > modifying the process dictionary, etc that are slower.  By putting all
> > calls through a security_manager module, the security_manager can
> > decide how a call should react, and can either emulate the real call,
> > or check access (based on arguments, etc) and forward or exit/1 as
> > necessary.
> 
> This seems a bit overhead-rich.  The idea (to me, at least) is that if
> one can simply define a standard subset of standard functions which can
> be permitted (typically, aside from message passing and spawn/3, anything
> which has particular side-effects with reference to machine state), then
> this subset can be permitted without any interference at runtime, which is
> good for things like realtime behaviour.  Ultimately, the execution becomes
> more functional, in the sense of isolation from side effects, until the 
> program hands back the answers in the form of messages.

I think the end result allows overhead only where necessary.  What if I
want/need to offer the ability to read/write files in directory A, but disallow
all access in any other directory?  There must be a runtime check, and adding
that check to file:open and dets:open through a dynamic security_manager module
would be the way to do it.  But the rest of the dets: routines could be allowed
to stay exactly as-is.
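
As a sketch of that runtime check (not a finished design): the directory
/var/sandbox/a and the helper is_under/2 are hypothetical, paths are
assumed to be plain strings, and symlinks are not resolved, so a real
implementation would need more care.  Anything the manager does not
explicitly know about is refused here, which is my assumption about a
sensible default.

```erlang
-module(security_manager).
-export([call/4, is_under/2]).

%% Hypothetical: the only directory untrusted code may open files in.
-define(ALLOWED_DIR, "/var/sandbox/a").

%% Guarded call: check the path argument, then forward or exit/1.
call(file, open, Path, Modes) ->
    case is_under(Path, ?ALLOWED_DIR) of
        true  -> file:open(Path, Modes);
        false -> exit({access_denied, {file, open, Path}})
    end;
%% Default (assumption): refuse anything without an explicit rule.
call(M, F, A, B) ->
    exit({access_denied, {M, F, [A, B]}}).

%% True if Path lies inside Dir.  Paths containing ".." are rejected
%% outright instead of being normalised.
is_under(Path, Dir) ->
    Parts = filename:split(filename:absname(Path)),
    not lists:member("..", Parts) andalso
        lists:prefix(filename:split(filename:absname(Dir)), Parts).
```

A dets:open_file clause would look the same, checking the file option
instead of the first argument.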

There is a concern that dynamic module calls have to always go through the
security manager, and this will slow down the call.  Is M:F(A,B) today faster
than apply(M, F, [A,B])?  Because if so, I was going to have the
security_manager module implement something like:

	call(M,F) -> M:F().
	call(M,F,A) -> M:F(A).
	call(M,F,A,B) -> M:F(A,B).
	call(M,F,A,B,C) -> M:F(A,B,C).
	call(M,F,A,B,C,D) -> M:F(A,B,C,D).
	callv(M,F,ArgList) -> apply(M, F, ArgList).

So it's not as ugly.  I contend you won't get safe execution without losing some
performance in some areas of the application, but you can still have the
security_manager provide a soft-realtime bound on the overhead it adds.
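
One way to answer the M:F(A,B) versus apply/3 question on a given system
is simply to time both.  A crude sketch, with the module name call_bench
made up for the purpose; the argument list is passed in as a variable so
the compiler cannot know its length in advance.  The numbers will vary by
release and machine, so I make no claim about which one wins.

```erlang
-module(call_bench).
-export([run/1]).

%% Time N dynamic calls made as M:F(A,B) and N made via apply/3.
%% Results are in microseconds, as returned by timer:tc/1.
run(N) ->
    M = erlang,
    F = max,
    {T1, ok} = timer:tc(fun() -> direct(M, F, N) end),
    {T2, ok} = timer:tc(fun() -> applied(M, F, [1, 2], N) end),
    [{direct_us, T1}, {apply_us, T2}].

direct(_M, _F, 0) -> ok;
direct(M, F, N) ->
    _ = M:F(1, 2),
    direct(M, F, N - 1).

applied(_M, _F, _Args, 0) -> ok;
applied(M, F, Args, N) ->
    _ = apply(M, F, Args),
    applied(M, F, Args, N - 1).
```

Something like call_bench:run(1000000) from the shell gives a rough
comparison.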

> In addition, I'm unsure why analysis of .beam files would be really preferable
> to analysis of .erl files ...

The only advantage of a .beam file is that I can receive code from over the
network which I did not compile locally.  But given that Erlang/OTP is usually
only sent in source form between parties, doing analysis of .erl files is
acceptable.

> If one were to simply slurp through standard erlang/OTP source code, and 
> ruthlessly mutilate anything with outside references beyond, say, messages,
> then one could pretty much guarantee that what happens inside the virtual
> machine is the concern of that program and its spawn only (load on available
> resources notwithstanding).

I'd rather not get into rebuilding OTP to be safe.  OTP is big, and most likely
we'll never be able to keep up with the entire erlang kernel, stdlib, plus
OTP and all standard applications.  Not to mention that gutting them may remove
critical functionality.  What if I need to use ets tables for performance, as
they are used only as lookup tables by my "untrusted" code, while other code
which is trusted can send messages to the ets table owner to update the table?
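
The ets case can be sketched with a protected table, where only the owner
process may write and any process may read.  The module and function
names (lookup_owner, update/3, read/2) are invented for this example; the
policy would still have to restrict untrusted code to ets:lookup.

```erlang
-module(lookup_owner).
-export([start/0, update/3, read/2]).

%% Spawn the owner process, which creates a protected table:
%% only the owner can write, every process can read.
start() ->
    Parent = self(),
    Pid = spawn(fun() ->
                        T = ets:new(lookup, [set, protected]),
                        Parent ! {self(), T},
                        loop(T)
                end),
    receive {Pid, Tab} -> {Pid, Tab} end.

%% Owner loop: apply updates sent by trusted code.
loop(T) ->
    receive
        {update, From, Key, Value} ->
            ets:insert(T, {Key, Value}),
            From ! {self(), ok},
            loop(T);
        stop ->
            ok
    end.

%% Trusted code asks the owner to update the table.
update(Pid, Key, Value) ->
    Pid ! {update, self(), Key, Value},
    receive {Pid, ok} -> ok end.

%% Untrusted code reads the table directly, at full ets speed.
read(Tab, Key) ->
    case ets:lookup(Tab, Key) of
        [{_, Value}] -> {ok, Value};
        []           -> error
    end.
```

A direct ets:insert/2 from any process other than the owner fails with
badarg, so the read path costs nothing extra.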

-- 
Shawn.

  Nondeterminism means never having to say you are wrong.