[erlang-questions] A Generic API for controlling software components

Wed Nov 25 22:50:03 CET 2009

> I guess you mean OTP applications? - If so then I disagree. To start
> with the directory structure within
> an OTP application is fixed which I don't really like. Also there is
It's fine that you don't like it, but what use case justifies departing from it?  "Because I want to" is not really a reason.  If you can't name a reason for not following it, that's probably bad design.  No offense, really.

Here's what I don't want.  I don't want each component to have radically different layouts that require a ton of work to adapt to.  I want contributing to projects and deploying them to be easy.  Conventions solve this.  Unless you have a problem that "rigid directory structures" prevent you from solving, that's just not good enough.

> no construction rule
> that says you are not allowed to write anything in the application
> directory tree. I'd like to put the
> entire component in a read-only area of disk for safety reasons (once
> an initial install has been performed).
This is a good idea.  Separate the mutable data from code.  This can be done perfectly well by convention, though.

> I have a different proposal for reducing duplicates - this is a
> different problem.
If the problem is that deployment and installation are a hassle, I beg to differ.  Part of the problem is that the core Erlang developers are often changing internal modules, so it is natural that their deployment system duplicates the entire system.  For most of us, we just need to overlay our little modules on top of the giant Erlang environment.  That's why I've been suggesting an overlay approach.

> Yes to deploying Erlang components *and upgrading them*
Applications are in modules and have versions.  The upgrade case for Erlang applications is pretty much solid.  Now, the way that releases do it (mutating an existing installation) is not exactly ideal, and I'd love to adapt it to something smarter.

> Sure. epkg etc. will be just a thin shell script layer over the API I suggested.
> The CLI can be written as a thin layer over the API as well.
I would welcome that.  I think that a gen_event is a good model for handling runtime CLI functions.

> because 90% is not enough
Perhaps I wasn't clear.  I was suggesting you start from 90% done instead of start from 0% done.  Also, components and applications have enough conceptual overlap that it would only serve to be confusing.

If you want to deprecate applications, that's fine, but understand that two packaging / distribution formats is a horrible idea.  It took forever in Python to unify eggs and setuptools.

> Of course. I just want to download yaws, ejabberd etc. drop them into
> my eComponent Directory
> and that's it - they are installed and runnable.
No, you don't.  :)  A list of steps is not what you want.  You want "installing components to be easy".  Downloading a file and dropping it in can't run any code, can't make any decisions custom to your system, and can't autogenerate configuration.  All of these things are great strengths of "gem install" and "easy_install".  The last thing you want is to have a dozen different component downloads.

For example, assume that RabbitMQ gets a bunch of C components that need building for three OSes.  Now assume that it gets a special federated mode that requires different configuration.  Next assume that it develops a multi-hosted version for hosting providers.  If each combination has to be packaged separately (i.e. because we can't build the C extensions or mutate the configuration on install) then we end up with 12 (3*2*2) packages!
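
The arithmetic can be checked with a short sketch (Python, purely illustrative).  The OS names, variant labels, and package naming scheme below are made up for the example; the scenario itself is hypothetical.

```python
from itertools import product

# Hypothetical variant axes for the scenario above; none of these labels
# come from a real RabbitMQ distribution.
operating_systems = ["linux", "macos", "windows"]  # the "three OSes"
modes = ["standalone", "federated"]                # federated mode or not
hosting = ["single", "multi-hosted"]               # hosting-provider variant or not

# One prebuilt package per combination, if install cannot build or configure.
packages = ["rabbitmq-%s-%s-%s" % combo
            for combo in product(operating_systems, modes, hosting)]

print(len(packages))
```

Every additional either/or option doubles the count again, which is the reason for letting the install step make those decisions.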

Playing devil's advocate here, let's say we do it your way and have "install code" that runs the first time a deployment is started.  Even with this code, the behavior is very weird.  The first startup may take forever, or fail outright if the extension fails to build.  Now we have to keep track of that process and recover gracefully from it.  It would have been far better to have had the "install" or "deploy" step fail.

Basically, I don't see any value to the extra step.  It should be "install" then "deploy", and install should be smart.  If "install" is downloading a file and working with directories it's not smart.

Ironically, it's also not really that beneficial to anyone in terms of user experience.  Power users won't mind a single command in the shell.  Non-power users would rather click an icon (moving files around for them is scary).  It lands in a middle-ground that is probably less populated than you would think.

> Ok, I didn't fully specify this - fine to change the details here or
> use an environment variable to point
> to the directory.
I think an unobtrusive default + environment variable is good here.
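
A minimal sketch of that lookup (Python, illustrative only).  The variable name ERL_COMPONENT_DIR is an assumption, since nothing in this thread settles on one, and the default borrows the $HOME/.erlang/library path suggested elsewhere in this message.

```python
import os

# Hypothetical variable name; the thread does not define one.
ENV_VAR = "ERL_COMPONENT_DIR"

def component_dir(environ=None):
    """Return the override from the environment if set, else the default."""
    environ = os.environ if environ is None else environ
    default = os.path.join(os.path.expanduser("~"), ".erlang", "library")
    # An unset or empty variable falls back to the default.
    return environ.get(ENV_VAR) or default

print(component_dir({}))                                    # default location
print(component_dir({ENV_VAR: "/srv/erlang/components"}))  # override wins
```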

> This is the bit that is not enforced in OTP applications - I think
> users are even encouraged to use the /priv
> directory in an otp application
Is there something about this that is hard to retrofit into applications?  The great thing is that it's all abstracted already--just change what comes back from code:priv_dir/1.  If people are doing it properly, their data should go where you want it to go.

I'd really recommend having a cascading/overlaying/shadowing setup.  Have everything search the deploy location, user location, and system location in that order.  Mutable data is never found outside of the deploy location.
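
As a rough sketch of that search order (Python, illustrative only; the layer labels and directory layout are assumptions, not an existing Erlang facility):

```python
import os
import tempfile

def find_component(name, layers):
    """Return (label, path) for the first layer containing `name`, else None.

    `layers` is an ordered list of (label, root_directory) pairs; earlier
    layers shadow later ones.
    """
    for label, root in layers:
        candidate = os.path.join(root, name)
        if os.path.isdir(candidate):
            return label, candidate
    return None

# Small demonstration with throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    layers = [(label, os.path.join(tmp, label))
              for label in ("deploy", "user", "system")]
    os.makedirs(os.path.join(tmp, "system", "yaws"))
    os.makedirs(os.path.join(tmp, "user", "yaws"))  # shadows the system copy
    print(find_component("yaws", layers)[0])
```

The user's copy is found first here because the deploy layer has no "yaws" directory; a copy in the deploy location would shadow both.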

>    Ok .eplist then?
That sounds good.

> Because I might not want to follow that particular convention. For
> simple applications I usually put everything in one directory.
> Actually this could be dropped, the system could scan the filesystem
> to find the code paths
> the first time you run the program.
> directory
One of the most misunderstood things about the Ruby community is their intolerance.  Don't get me wrong, some of them are intolerant bigots, just like you get anywhere.  However, a lot of the intolerance comes from a desire to have solid conventions.  If you have to write applications all day long, knobs are only helpful if they're invisible.  Every required parameter and every choice about where to put a directory is extra time before I have a working project and extra trouble when I come back to it six months later.

Having the ability to override a convention is fine, but it should have a default and it should require no configuration.

The part about scanning the filesystem is a good one.  One of the ugly things about Python, Java, Ruby, and Erlang is management of the code path.  Java has it down with a set of mostly ugly conventions.  Ruby still does a lot of manhandling (mostly generating stub files that know where the real code is).  Python has, in my opinion, the best solution.  Specifically, in the "system" directory, you can drop .pth files that point to additional code paths.  This is fabulous in that it helps package managers (e.g. dpkg, rpm) plug in code in a manageable way.  If Erlang were to have, for example, .epth files in /usr/lib/erlang/bin, that would make an awesome hook for distribution managers.  Adding a .erlang directory (in the $HOME) with the same semantics would make it easy for users.  It would be nice to move the cookie and hosts file into that directory, but I understand that might be asking too much too soon.
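
To make the .epth idea concrete, here is a minimal sketch (Python, illustrative only).  The .epth format does not exist; the rules below (one path per line, blank lines and # comments skipped, relative paths resolved against the directory holding the file) are assumptions loosely modelled on Python's .pth files.

```python
import glob
import os

def read_epth_dir(directory):
    """Collect code paths from every *.epth file in `directory`.

    Hypothetical format: one path per line; blank lines and lines starting
    with '#' are ignored; relative paths are relative to `directory`.
    """
    paths = []
    for epth in sorted(glob.glob(os.path.join(directory, "*.epth"))):
        with open(epth) as handle:
            for line in handle:
                entry = line.strip()
                if not entry or entry.startswith("#"):
                    continue
                # os.path.join leaves an absolute entry unchanged.
                full = os.path.normpath(os.path.join(directory, entry))
                if full not in paths:
                    paths.append(full)
    return paths
```

A package manager would then only need to drop or remove one small file per package, and the runtime would add the resulting list to its code path at startup.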

> It's a hint to tell the package manager how often to check for updates
> - I haven't thought through all the details
> it could be something else:
> 
>     {checkForUpdates, everyTimeYouStart | dayly | hourly | weekly |
> {after,Year,Month,Day} | {every, 10, minutes}}
> 
> etc.
Most system administration tasks are done rarely (e.g. running updates with the package manager, which this is a hint for).  In this case, everybody else just downloads the current package list and compares it locally.  I don't think that hints are all that useful in comparison.  They can only generate false positives (expired but no update) and false negatives (not expired, but new update is out there).
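
The "download the list and compare locally" approach is small enough to sketch (Python, illustrative only).  The dotted-integer version format, the package names, and the version numbers are assumptions for the example.

```python
def parse_version(text):
    """Turn a dotted version such as '2.1.0' into a comparable tuple (2, 1, 0)."""
    return tuple(int(part) for part in text.split("."))

def available_updates(installed, package_list):
    """Return {name: (installed, latest)} for packages with a newer version.

    `installed` and `package_list` both map package name to version string;
    `package_list` stands in for the freshly downloaded index.
    """
    updates = {}
    for name, current in installed.items():
        latest = package_list.get(name)
        if latest is not None and parse_version(latest) > parse_version(current):
            updates[name] = (current, latest)
    return updates

# Made-up version numbers for illustration.
installed = {"yaws": "1.85", "ejabberd": "2.1.0"}
package_list = {"yaws": "1.86", "ejabberd": "2.1.0", "rabbitmq": "1.7.0"}
print(available_updates(installed, package_list))
```

Because the comparison runs against the actual index, it cannot report an update that isn't there or miss one that is, which is exactly where an expiry hint falls short.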

> The designer of the program knows what is a sensible value here
Upgrades can trash data.  It is not the designer's data.  I don't think it takes more than blowing away the data in your RabbitMQ or losing your ejabberd user database to see why this could be extraordinarily bad on a large scale.  Bugs happen.  Best to let the administrator be in the loop.

>> Again, I humbly suggest more traditional Unix pathnames.  How about $HOME/.erlang/library/<component>/<vsn>/prefs.epl?
> 
> No worries ...
Thank you.

> I guess the component framework would use the check for updates or
> expiry date or whatever
> to decide whether to do a check. The check would return a list of
> improvements and the user would decide
> whether or not to install them.
Almost everything under the sun has an update command, but never does anything automatically.  I really think it would be prudent to follow that model.  If you want to do "automatic updates", I would at least suggest that you have it turned off by default, and have it turned on by a project, not the user.  The rationale is that the user of a component probably doesn't know enough about how the data is stored to keep it safe, but that the component developer should know.

> I guess we should also be able to roll back a version (which is why I
> have eLibrary/ComponentName/VSN/data ..
> tags.
Definitely.  This is another reason that I suggested having a separate "deploy" step.  You no longer have to worry about upgrading an installation.  Each "version" is another deployment.  In this scenario, you can specify "migrations" to upgrade/downgrade the data, but the code is managed as a separate entity.
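
A minimal sketch of what such migrations might look like (Python, illustrative only).  The version numbers, the data fields, and the migration functions are all hypothetical; the point is only that data moves between versions one step at a time, in either direction, while the code for each version stays in its own deployment.

```python
# Hypothetical ordered list of deployed versions.
VERSIONS = ["1.0", "1.1", "1.2"]

def add_vhosts(data):         # 1.0 -> 1.1
    return {**data, "vhosts": ["/"]}

def drop_vhosts(data):        # 1.1 -> 1.0
    return {k: v for k, v in data.items() if k != "vhosts"}

def users_to_accounts(data):  # 1.1 -> 1.2
    new = dict(data)
    new["accounts"] = new.pop("users")
    return new

def accounts_to_users(data):  # 1.2 -> 1.1
    new = dict(data)
    new["users"] = new.pop("accounts")
    return new

# Each table is keyed by the version the data is currently at.
UPGRADES = {"1.0": add_vhosts, "1.1": users_to_accounts}
DOWNGRADES = {"1.1": drop_vhosts, "1.2": accounts_to_users}

def migrate(data, source, target):
    """Walk the data from `source` to `target`, one adjacent version at a time."""
    i, j = VERSIONS.index(source), VERSIONS.index(target)
    while i < j:
        data = UPGRADES[VERSIONS[i]](data)
        i += 1
    while i > j:
        data = DOWNGRADES[VERSIONS[i]](data)
        i -= 1
    return data

print(migrate({"users": ["alice"]}, "1.0", "1.2"))
```

Rolling back is then a matter of running the downgrade steps and pointing at the older deployment, rather than un-mutating an installation in place.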

> It is pretty similar
One is better than two.  :)

-- 
Jayson Vantuyl
kagato@REDACTED