[erlang-questions] Right direction ?

Wed Sep 28 12:28:59 CEST 2011

Ok - let's talk about how to ensure we get the correct beam code.

I guess you know how code gets loaded, but I'll
summarize it, since it's probably not that well known.

There are several ways to load code:

1) Method 1

   You call foo:bar(1,2,3)

    If the module foo has not been loaded
    the system converts this to the call

      error_handler:undefined_function(foo, bar, [1,2,3])

    this is in  .../kernel/src/error_handler.erl

     The function undefined_function delegates the problem of actually finding
the code to the code_server, which is supposed to know how to find the code.
Once the code has been found and loaded, which is the job of the code_server,
undefined_function can then evaluate apply(foo, bar, [1,2,3]).

2) The code is pre-loaded in a boot file and loaded by init - I'm not
sure if the boot file has a list of filenames or the content of the
beam files, so there might be a security problem here.

3) Some program (actually any program) evaluates the BIF

    erlang:load_module(Mod, Bin)

    This BIF assumes Bin contains the beam code for Mod - this is
potentially dangerous since *any* process can execute this. Note: you
might have to do purge_module(Mod) before calling load_module/2 (because
we can run up to two different versions of the module at the same time).

    When you compile a program in the shell method 3) is used to
reload the code. Running compiled programs usually autoloads the code
using method 1)
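
A minimal sketch of methods 1) and 3), for illustration only. The module
name load_demo and its two functions are made up for this example; the
calls to code:is_loaded/1, code:get_object_code/1, code:purge/1 and
erlang:load_module/2 are the standard ones. It assumes the node runs in
interactive mode (the default), since autoloading does not happen in
embedded mode.

```erlang
-module(load_demo).
-export([autoload/1, explicit_load/1]).

%% Method 1: call a function in a module that may not be loaded yet.
%% If it is not loaded, the call is routed through
%% error_handler:undefined_function/3, which asks the code server
%% to find and load the module, and then applies the function.
autoload(Mod) ->
    Before = code:is_loaded(Mod),      % false | {file, Path}
    Mod = Mod:module_info(module),     % any exported function would do
    After = code:is_loaded(Mod),
    {Before, After}.

%% Method 3: get the beam code as a binary and hand it to the BIF.
%% Note that nothing here checks where Bin came from.
explicit_load(Mod) ->
    {Mod, Bin, _File} = code:get_object_code(Mod),
    _ = code:purge(Mod),               % drop any old version first
    {module, Mod} = erlang:load_module(Mod, Bin),
    ok.
```

Calling load_demo:autoload(sofs) in a fresh shell would typically show
the module going from not loaded to loaded, and
load_demo:explicit_load(sofs) reloads it from whatever binary the code
path happens to return.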

A simple security measure would be to modify the code server
to make sure it authenticates the code before loading.

The problem is that any program can perform a type 3) operation.
So the only way to be really secure is to hack the emulator so that
load_module will fail if the code in Bin has not been correctly signed.
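
To make the idea concrete, here is a rough user-space sketch of a
"check the signature, then load" step. It is not the emulator change
described above: the module signed_load and the function
load_if_signed/4 are hypothetical, and nothing stops another process
from calling erlang:load_module/2 directly. It assumes an RSA signature
over the raw beam binary with a SHA-256 digest, checked with
public_key:verify/4.

```erlang
-module(signed_load).
-export([load_if_signed/4]).

%% Load Bin as module Mod only if Sig is a valid signature of Bin
%% under PubKey (an RSA public key as used by the public_key app).
%% Returns {module, Mod} on success, {error, bad_signature} if the
%% signature does not verify, or whatever error load_module/2 gives.
load_if_signed(Mod, Bin, Sig, PubKey)
  when is_atom(Mod), is_binary(Bin), is_binary(Sig) ->
    case public_key:verify(Bin, sha256, Sig, PubKey) of
        true ->
            _ = code:purge(Mod),       % make room for the new version
            erlang:load_module(Mod, Bin);
        false ->
            {error, bad_signature}
    end.
```

Where the signature and the public key come from (a file next to the
beam, a boot script, somewhere else) is left open here; that is the
part that needs the real design work.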

You could possibly run a two node distributed system, where one node
is guaranteed to be secure and the other untrusted. Then you highly restrict
which processes are running.

Thinking out loud - the root problem is that "any" process can evaluate
erlang:load_module/2. I guess we could hack the system so that *only*
the registered process code_server can call this ... this sounds doable.
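
The check itself would be tiny. As a sketch only: the module
guarded_load below is made up, and a wrapper like this gives no real
protection, since the restriction would have to live inside the BIF
itself. It just shows the test "is the caller the process registered
as code_server?".

```erlang
-module(guarded_load).
-export([load/2]).

%% Refuse to load unless the calling process is the one registered
%% under the name code_server.
load(Mod, Bin) when is_atom(Mod), is_binary(Bin) ->
    case whereis(code_server) =:= self() of
        true  -> erlang:load_module(Mod, Bin);
        false -> {error, not_code_server}
    end.
```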

I'll ask around a bit to see.

/Joe

On Wed, Sep 28, 2011 at 3:25 AM, David Goehrig <dave@REDACTED> wrote:
>
>
>>
>> I don't really understand. The only (legal) way to modify the beam
>> is to change the source and recompile. I think you have to
>> decide exactly what the semantics of require are.
>
> I'm actually most concerned about the illegal way of modifying a beam:
> a.) Sysadmin gets clever and runs rsync patching the beam with a diff from
> one on another server (bad if that server doesn't have the right version)
> b.) Developer gets clever and uses source control as a deployment mechanism,
> "git push production master", overriding the version there
> c.) Nefarious type replaces beams with other beams that have been compiled
> with compromised security built in
> This is why I want a loader that can check at run time (late binding).
>
>>
>>   a) We check the require targets *before* compilation with
>>        a separate program
>
> rebar (http://github.com/basho/rebar) already does a pretty good job at this
> (as long as you list all your dependencies as git repos) and I've been
> making heavy use of this over the past year.  It handles checking out and
> compiling all the dependencies, and you can specify which specific tags you
> depend upon.
>>
>>     c) we check at usage time. The first time we call daves_module and
>>         find it has not been loaded we check the cache and so on
>
>
> Right now I'm most worried about c.) in the context
> of lib/kernel/src/code_server.erl:
>
> try_load_module(Mod, Dir, Caller, St) ->
>     File = filename:append(Dir, to_path(Mod) ++
>                            objfile_extension()),
>     case erl_prim_loader:get_file(File) of
>         error ->
>             {reply,error,St};
>         {ok,Binary,FName} ->
>             try_load_module(absname(FName), Mod, Binary, Caller, St)
>     end.
>
> Where the file pointed to by FName is now "trusted" and will then be read
> into memory and passed off to hipe.
> Part of the problem is I'm also introducing a new risk, because I'm
> replacing this load bit with code that can read a URL rather than just a
> filename, so I'd like a way to hook in to check that the file I've
> downloaded is the same as the signature I have on file in a dets store.
>
>>
>> Good stuff - needs some thought though. I was thinking of
>> signing/validating
>> the source with an RSA public/private keypair.
>
> I've thought about adding RSA public/private key signing, but that
> ultimately goes down the route of having a CA to form a trust network, and
> since CAs tend to prove to be unworthy of trust, I'm wary.  Self-publishing
> an RSA public key + signing a SHA hash of the source and putting both in DNS
> seems like a reasonable way of doing it, but can also be exploited to deny
> service by DNS cache poisoning.
>
> If one were to implement a pub/private key signature check, would it best be
> done in code_server.erl or somewhere else?  That seems to be the first place
> the files are loaded into memory at run time.
> Dave
>
> --
> -=-=-=-=-=-=-=-=-=-=- http://blog.dloh.org/
>