[erlang-questions] Right direction ?

Tue Sep 27 11:47:22 CEST 2011

On Wed, Sep 21, 2011 at 7:40 PM, David Goehrig <dave@REDACTED> wrote:
> Last night I began hacking on code.erl, code_server.erl, and looking to extend load_file(Module :: atom()) to include a load_file(Module, Url) function that would look for the module at the associated URL rather than searching for the file via abs path. It would then compare the sha256 hash of the file against the copy in cache and bail if they are different (no hash in cache adds it and uses as the baseline).
>
> I was wondering if there was a good way to verify that a .beam has not been modified since last load.
>
> For example:
>
> -module(my_mod).
> -require(daves_mod,"http://erlang.dloh.org/").
>
> Could then look for http://erlang.dloh.org/daves_mod.erl and download and compile a local beam. Once I have that beam, I can just load from cache, but what happens if the beam is modified after compilation?

I don't really understand. The only (legal) way to modify the beam
is to change the source and recompile. I think you have to
decide exactly what the semantics of require are. There are several
possible meanings:

   a) We check the require targets *before* compilation with
      a separate program.

      That is, imagine a program:

        > check_dependencies *.erl

      This parses all Erlang files (*.erl), extracts the require
      information, then checks the cache to see if it has all the
      necessary files.

      Even this program could have two modes:
          a1) always check with the original source for new versions
          a2) check once every N days (or minutes or hours or something)

   b) We check at compile time.

   c) We check at usage time. The first time we call daves_mod and find
      it has not been loaded, we check the cache and so on.

   a) represents an early binding scenario, c) a very late binding one.
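
A minimal sketch of the extraction step in a), assuming the
-require(Module, "Url") syntax from the example above. The module name
require_scan and its function names are made up for illustration. It
matches on erl_scan tokens rather than parsed forms because, as far as
I know, the stock parser rejects a user-defined attribute with two
arguments.

```erlang
-module(require_scan).
-export([requires/1, requires_in_file/1]).

%% Return [{Module, Url}] for every -require(Module, "Url"). in Source.
%% Crashes (badmatch) if the source cannot be tokenized.
requires(Source) when is_list(Source) ->
    {ok, Tokens, _End} = erl_scan:string(Source),
    scan(Tokens).

%% Same as requires/1, but reads the source text from a file.
requires_in_file(Path) ->
    {ok, Bin} = file:read_file(Path),
    requires(binary_to_list(Bin)).

%% Walk the token list looking for the exact shape
%%   - require ( Atom , String ) .
scan([{'-', _}, {atom, _, require}, {'(', _},
      {atom, _, Mod}, {',', _}, {string, _, Url},
      {')', _}, {dot, _} | Rest]) ->
    [{Mod, Url} | scan(Rest)];
scan([_ | Rest]) ->
    scan(Rest);
scan([]) ->
    [].
```

A check_dependencies program could call requires_in_file/1 on each
*.erl file and then look up every {Module, Url} pair in the cache.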

If you are in a development scenario I'd favor a), because having code
change rapidly under my feet would worry me. (Actually, a) is the
easiest to implement.)

If I am in a deployment scenario I might choose c). I might even want to
*push* changes rather than rely on polling or some other way of
finding out that the code has changed.

The point is that you have to have a clear idea of which of these
particular problems you are solving.

Doing a really good job on the a) scenario interests me - I'd just like
to type "make" and be told - "foo123.erl on http:/..../ has changed from the
cached version, do you want to update?" ....
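
The "has changed from the cached version" test could be as simple as
comparing SHA-256 digests. A rough sketch follows, assuming the cache
is just a copy of the source on disk and the upstream text has already
been fetched; the module name cache_check and the three result atoms
are invented for illustration.

```erlang
-module(cache_check).
-export([sha256_hex/1, status/2]).

%% Lowercase hex-encoded SHA-256 digest of a binary or iolist.
sha256_hex(Data) ->
    Digest = crypto:hash(sha256, Data),
    lists:flatten([io_lib:format("~2.16.0b", [Byte]) || <<Byte>> <= Digest]).

%% Compare freshly fetched source text against the cached copy on disk.
%% Returns up_to_date | changed | not_cached.
%% Any read error other than enoent crashes with case_clause.
status(CachePath, FetchedSource) ->
    case file:read_file(CachePath) of
        {ok, Cached} ->
            case sha256_hex(Cached) =:= sha256_hex(FetchedSource) of
                true  -> up_to_date;
                false -> changed
            end;
        {error, enoent} ->
            not_cached
    end.
```

A "make" wrapper would prompt for an update on changed and populate the
cache on not_cached.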

Only a) fits nicely with unit testing, type checking, etc. Delaying
the check to load time makes testing difficult. If things can change under
your feet without you knowing, life might become difficult.

>
> The other thing I would like to add is DNS TXT records that could be published sha256 hashes of each source module.
>
> http://erlang.dloh.org/daves_mod.erl 45663AFDA....
>
> Adding a
>
> -signature("http://erlang.dloh.org/daves_mod.erl","45663AFDA....")
>
> Would allow a 3 part verification of the source:
>
> 1.) I can compute the source has the right hash
> 2.) I can look up that the module has the same published signature
> 3.) I can verify against the original at the specified URL
>
> In this scenario it is not enough to modify the source and rehash, nor enough to replace the upstream file; an attacker would also have to replace the DNS entry without anyone noticing.
>
> Thoughts?

Good stuff - needs some thought though. I was thinking of signing/validating
the source with an RSA public/private keypair.
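
Roughly what I have in mind, as a sketch only: sign the source text with
the author's private key and let anyone holding the public key validate
it. This uses the public_key application from OTP; the module name
source_sig, the 2048-bit key size and the SHA-256 digest are my
assumptions, and key distribution is left out entirely.

```erlang
-module(source_sig).
-export([new_keypair/0, sign/2, verify/3]).

%% Record definitions for 'RSAPrivateKey' and 'RSAPublicKey'.
-include_lib("public_key/include/public_key.hrl").

%% Generate a fresh RSA keypair; returns {PublicKey, PrivateKey}.
new_keypair() ->
    Priv = public_key:generate_key({rsa, 2048, 65537}),
    Pub = #'RSAPublicKey'{
             modulus = Priv#'RSAPrivateKey'.modulus,
             publicExponent = Priv#'RSAPrivateKey'.publicExponent},
    {Pub, Priv}.

%% Sign the source text (a binary); returns the signature as a binary.
sign(Source, Priv) ->
    public_key:sign(Source, sha256, Priv).

%% true if Sig is a valid signature of Source under Pub, else false.
verify(Source, Sig, Pub) ->
    public_key:verify(Source, sha256, Sig, Pub).
```

The author would publish the signature next to the source; the build
would then report a module as *validated* only when verify/3 returns
true for a key it already trusts.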

I'd like to see this as part of the build process, if I did "make" I
might like to see:

$ make
module foo123.erl is up to date. Written by joearms *validated*
module bar23.erl has a newer version
module bingo23.erl is up to date. Written by cleverperson *untrusted*
...
etc.

/Joe

>
> Dave
> -=-=- dave@REDACTED -=-=-