[erlang-questions] A Generic API for controlling software components

Wed Nov 25 13:40:21 CET 2009

First, let me take a stab at describing the goal (or what I think is and should be the goal) of components.

It seems that you're trying to address:

1.  Unifying execution / configuration conventions across popular Erlang projects.
2.  Putting some sort of management conventions around deployments of said products.

So, I don't think you're going to get a lot of traction on #2 unless it makes deployment easier.  Right now, those programs all have ad-hoc conventions precisely because Erlang's deployment conventions, while rock solid, are nearly impossible to use outside of an embedded environment.

If you really want to make things better (and I'm sure you do), here's what I suggest doing.

Don't create yet-another-packaging-abstraction (i.e. components).  Applications are perfectly fine for this.  Really.

Address the issues that make it hard to deploy Erlang applications.  Specifically, applications should probably be deployed as a set of modules in a second LIBDIR, that shadows the existing one, thereby reducing the number of duplicates of the entire environment (which is a current problem with Erlang's release stuff).  I already do this, and it's great.
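
A rough sketch of the shadowing idea, assuming a second library directory laid out like the standard one (one <app>/ebin per application). The module name shadow_libdir and its functions are made up for illustration; only the filelib, filename and code calls are real OTP. I believe the ERL_LIBS environment variable covers similar ground, so treat this as an illustration of the mechanism rather than a recommendation to hand-roll it.

```erlang
-module(shadow_libdir).
-export([ebin_dirs/1, add/1]).

%% List every <LibDir>/<app>/ebin directory that actually exists.
ebin_dirs(LibDir) ->
    [D || D <- filelib:wildcard(filename:join([LibDir, "*", "ebin"])),
          filelib:is_dir(D)].

%% Prepend those directories to the code path, so that an application
%% found here is picked up before a same-named one in the standard
%% library directory.
add(LibDir) ->
    code:add_pathsa(ebin_dirs(LibDir)).
```

Calling shadow_libdir:add("/some/second/libdir") early in node startup would be one way to use it; the path is a placeholder.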

Address issues that make deploying applications hard, in general.  The CLI itch is a good one to scratch.  Make it trivial to write a CLI event handler.  Make it a normal gen_event.  Maybe require a key in the app config (maybe {cli_handler,Module}) or perhaps just have a convention (something like an atom of the form '<application>_cli').  For that matter, make a distribution container that is more like a gem/egg.
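
To make the gen_event suggestion concrete, here is a minimal sketch of what such a CLI handler might look like. Everything application-specific is an assumption: the module name myapp_cli, the {cli, Command, Args} request shape, and the counter used as state are invented for the example. The handler_for/1 helper shows both lookup options mentioned above (a cli_handler key in the app config, falling back to the '<application>_cli' naming convention).

```erlang
-module(myapp_cli).
-behaviour(gen_event).
-export([handler_for/1]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

%% Resolve the handler module for an application: an explicit
%% {cli_handler, Module} in the app config wins, otherwise fall back
%% to the '<application>_cli' naming convention.
handler_for(App) ->
    case application:get_env(App, cli_handler) of
        {ok, Module} -> Module;
        undefined    -> list_to_atom(atom_to_list(App) ++ "_cli")
    end.

%% State is just a count of flushes, purely for illustration.
init(_Args) -> {ok, 0}.

%% Commands arrive as synchronous calls so the CLI can print a reply.
handle_call({cli, flush_cache, _Args}, Count) ->
    {ok, ok, Count + 1};
handle_call({cli, Unknown, _Args}, Count) ->
    {ok, {error, {unknown_command, Unknown}}, Count}.

handle_event(_Event, State) -> {ok, State}.
handle_info(_Info, State) -> {ok, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

With an event manager Mgr started via gen_event:start_link(), the handler would be attached with gen_event:add_handler(Mgr, myapp_cli, []) and driven with gen_event:call(Mgr, myapp_cli, {cli, flush_cache, []}).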

I think this is really what you want.  Installing an Erlang application should be as easy as running commands something like:

# epkg install rabbitmq
...downloads, builds, and installs rabbitmq code somewhere sane...
# epkg deploy rabbitmq mydeploy
...generates a "deployment" somewhere, basically just config and state data, mydeploy is used to generate nodenames (i.e. mydeploy_rabbitmq_main, mydeploy_rabbitmq_helper, etc)...
# ectl mydeploy start
...starts nodes specified for the deploy...
# ectl mydeploy flush_cache
...since flush_cache is not implemented by ectl, it should be passed through to an event handler...

Obviously there's a lot to go on under the covers, but anything more complex than that is, at best, a well engineered user-experience failure.  More importantly, I think the above captures what you should be doing.

All of that said, here are my comments on the proposal, so far (most of which I've already stated above):

> What I would like to do is manage all the components in a uniform manner.
> Much of this design is inspired by how Mac OS-X manages applications.
Why not just use the application framework, again?  It's 90% there for this.

Is there anything about an application that makes it unsuitable as (i.e. smaller than) a component?  I would just add this to an application.

> By manage I mean start and stop the component, upgrade the code,
> change the behavior of the component at run time and so on.
I think you're thinking in terms of actual user-level management, right?  Basically, this is to unify things like ejabberdctl and rabbitmqctl.  I think that's a fantastic idea.

Looking forward, I'd suggest taking a hard look at Ruby gems and Python distutils/setuptools/eggs.  I really think that it would make a lot of sense to take the existing Erlang release system and use it to make something not unlike a gem or egg (which it already kind of does, but at a whole-Erlang-system level, not a component level).

As for the command line, it would be extremely handy to be able to have a convention where the nodename of a certain component was known, and there was a command line wrapper to send it commands.  Perhaps, have the component run a gen_event, and have "erlctl <component> <command> [args...]" generate an event in the node that the component runs in.  Extending edoc to generate commandline documentation would be killer here, too.
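
A sketch of how the wrapper side might work, under stated assumptions: the node is named <component>@<host>, and the component registers an event manager called <component>_cli_events. Both conventions, and the module name erlctl_sketch, are hypothetical. The rpc and gen_event calls are standard OTP, and dispatch/2 only works from a node that is itself running distributed.

```erlang
-module(erlctl_sketch).
-export([parse/2, dispatch/2]).

%% Turn the argv of "erlctl <component> <command> [args...]" into the
%% target node, the event manager name, and the event to raise there.
%% list_to_atom/1 on user input is tolerable only because a CLI
%% wrapper is short-lived.
parse([Component, Command | Args], Host) ->
    Node = list_to_atom(Component ++ "@" ++ Host),
    Manager = list_to_atom(Component ++ "_cli_events"),
    {ok, Node, Manager, {cli, list_to_atom(Command), Args}};
parse(_Argv, _Host) ->
    {error, usage}.

%% Raise the event in the component's node. Returns ok on success or
%% {badrpc, Reason} if the node cannot be reached.
dispatch(Argv, Host) ->
    case parse(Argv, Host) of
        {ok, Node, Manager, Event} ->
            rpc:call(Node, gen_event, notify, [Manager, Event]);
        {error, _} = Error ->
            Error
    end.
```

For example, erlctl_sketch:parse(["rabbitmq", "flush_cache"], "myhost") yields the node 'rabbitmq@myhost' and the event {cli, flush_cache, []}.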

> Here's a suggestion for a set of rules for managed components:
> 
>    Draft 1 - 25 Nov 2009
> 
>    Rule1: All components are unpacked into the same top-level
>           directory (default $HOME/eComponent) and have the extension
>           .ec (Erlang component)
> 
>           Example: Imagine I have installed mochiweb ejabberd and couchDB
> 	   then after installation I should see the following:
> 
> 	   $pwd
>           /home/joe/eComponents
>           $ls
>           mochiweb.ec ejabberd.ec couchDB.ec
I don't think eComponents, as a non-hidden top-level directory, is going to thrill the aesthetic sense of most Unix users.  I would humbly suggest something more like $HOME/.erlang/components/.  Or, better yet, a system directory (probably just in the Erlang code root) and a user directory ($HOME/.erlang/lib).

>    Rule2: A normal user should *never* have to examine any of the
>           files *inside* the component. To do so causes
>           "abstraction leakage" - I want to consider each of these
>           components as black boxes. Once installed the component
> 	   should be installed in a read-only disk area.
Certainly.

> 
>    Rule3: The component should not break if relocated to a different
>           top-level directory. A simple "mv" command should suffice
> 	   So no hard-wired paths or links please.
Certainly.

>    Rule4: All components C must have a file called	
> 
>           $HOME/eComponents/C.ec/Preferences.pl
> 	
> 	   The extension .pl means the file contains a property
>           list. Here is an example:
.pl is bad mojo, as it's universally accepted to be Perl.  How about .epl (i.e. Erlang Property List)?

> 
>           {codePath, "/bin"}.
>           {expiryDate, {2009,12,24}}.
> 	   {icon, "/images/myIcon.png"}
> 	   {version,1}	
>           {myKey1, ...}
> 
>           The keys codePath, version, and expiryDate are obligatory
> 
>           Why do we need code path? - so that Erlang can find
>           a module called C.erl
If the code path is relative, what is gained by having it specifiable?  Why not just have a convention that it's ebin (or maybe have component/lib be added as an additional LIBDIR)?

>           Expiry date has a "time to live for the component"
>           Version is used for local configuration data (see later)
>           Version numbers should start at one and be increased by one
>           for each new release.
The expiry time thing is confusing to me.  Is this automatically done by some program, or is it metadata for a person?

>    Rule5: Local configuration data
> 
>           Local configuration data must not be stored under the
>           eComponents directory - (you can't remember we said the
>           component is in a read-only disk area - see rule2)
> 
>           Local data for the component C must be stored
>           in the directory
> 
>           $HOME/eLibrary/ComponentName/Vsn
> 
> 	   Thus local preferences for ejabberd version 1
> 	   would be stored in the file
> 
> 	   $HOME/eLibrary/ejabberd/1/Preferences.pl
Again, I humbly suggest more traditional Unix pathnames.  How about $HOME/.erlang/library/<component>/<vsn>/prefs.epl?
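
For illustration, building that path and reading the file could be as small as the sketch below. The module name prefs_sketch is invented, and the layout is just the one suggested above. file:consult/1 expects every term in the file to end with a period, so a prefs.epl would need lines like {version,1}. rather than the unterminated ones in the quoted example.

```erlang
-module(prefs_sketch).
-export([path/3, load/3]).

%% Build <Home>/.erlang/library/<component>/<vsn>/prefs.epl
path(Home, Component, Vsn) ->
    filename:join([Home, ".erlang", "library",
                   atom_to_list(Component), integer_to_list(Vsn),
                   "prefs.epl"]).

%% Read the property list: {ok, Terms} on success, {error, Reason}
%% if the file is missing or not valid Erlang terms.
load(Home, Component, Vsn) ->
    file:consult(path(Home, Component, Vsn)).
```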

>     Rule6: Code upgrade
> 
>           We should upgrade an component C when it's expiry date has
>           been reached. To update an component we delete the entire
>           component under $HOME/eComponents/C.ec we install the
>           new component and run the command: C:install().
> 
>           Data that is to be carried over between different versions
>           of the component is stored in $HOME/eLibrary/C/V/...
Is this done by the component framework?  Is this really a good idea to be done automatically?  Automatically deployed upgrades have generally been associated with tears and gnashing of teeth in my experience.

>     Rule7: management
> 
>     	  All components C must provide a module in the file
> 	  C_control.erl - so ejabberd provides ejabberd_control.erl
> 
> 	  The management API is as follows:
> 	
> 	  C:start_link() -> Pid
> 	
> 	     create a controller process for the component.
> 
> 	  Pid has the following protocol [see note 1 for notation]
> 	  PL is a property list (list of {Key::atom(), Value::any()}
> 
>          Pid !! {start, PL} => ok | {error, Why}
> 
> 	      Cold start the component. PL is a property
>              list describing how the component should behave
> 
> 	  Pid !! {stop, PL} => ok
> 
> 	      Stop the component
> 
> 	  Pid !! {modify, PL} => ok | {error, Why}
> 	      Change the behavior of the component
> 
> 	  Pid !! info => PL
> 	  Pid !! {info, [Keys]} => [Values]
> 
> 	  Pid !! suspendLocal => ok
> 	  Pid !! resumeLocal => ok
> 
> 	        Suspend the component data can be written to
> 		$HOME/eLibrary/ComponentName/Vsn/...
> 
> 	   Pid !! suspendRemote => <<BinaryClosure>>
> 	   Pid !! {resumeRemote, <<BinaryClosure>>} => ok
> 
> 	         This ton suspend an component on one machine
> 		 and resume it on another	

This looks a lot like it could just use the existing application framework.  I don't think that most of this would even require breaking backwards-compatibility.
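
As a sketch of what I mean, the start/stop/modify/info part of the proposed protocol maps fairly directly onto calls the application module already has. The wrapper module app_control and the choice to treat PL as application environment are my assumptions, not part of the proposal. The suspend/resume messages have no obvious counterpart and are left out.

```erlang
-module(app_control).
-export([control/2]).

%% {start, PL}: load the app, apply PL as its environment, start it.
control(App, {start, PL}) ->
    case application:load(App) of
        ok                             -> start_with(App, PL);
        {error, {already_loaded, App}} -> start_with(App, PL);
        {error, _} = Error             -> Error
    end;
control(App, {stop, _PL}) ->
    application:stop(App);
%% {modify, PL}: only updates the environment; the application still
%% has to re-read it to change behavior.
control(App, {modify, PL}) ->
    set_all(App, PL);
control(App, info) ->
    application:get_all_env(App);
control(App, {info, Keys}) ->
    [application:get_env(App, Key) || Key <- Keys].

start_with(App, PL) ->
    ok = set_all(App, PL),
    application:start(App).

set_all(App, PL) ->
    lists:foreach(fun({K, V}) -> application:set_env(App, K, V) end, PL).
```

Note that {info, Keys} here returns {ok, Value} or undefined per key, which differs slightly from the bare [Values] in the proposal.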

-- 
Jayson Vantuyl
kagato@REDACTED