Architectural Suggestions for Job Queuing

Wed Apr 14 08:49:18 CEST 1999

On Wed, 14 Apr 1999, Claes Wikstrom wrote:

> 
> Hello,
> 
>  > be missing some big picture.  What I am looking for are some
>  > productive hints.  I want to code the applications myself.
>  > 
> 
> Ok, I'll just indicate the direction where you should look.
> 
>  > The first application I could use would limit the number of
>  > simultaneous jobs to run.  In particular, we have a limited number
>  > of software licenses.  I'd like to start up a server and give it a
>  > list of job queues and their simultaneous limits.
> 
> This means creating sockets; this is done with gen_tcp.erl

    Hang on, this implies a design decision that is not yet made.

    You have to decide:

    1) Server in Erlang + clients in Erlang + Distributed Erlang for
       communication, or,

    2) Server in Erlang + clients in Erlang + socket communication,
       or,

    3) Server in Erlang + clients in (C, C++, ...) + socket communication
       + API for (C, C++, ...) client side applications

    These are in order of complexity (simplest first).

	1) would be OK for a quick prototype to get the protocols right
	3) would be for a commercial product that you could sell
	2) is a halfway house to implementing 3)
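
    To make option 1) concrete, here is a rough sketch of what such a
prototype might look like. It is only an illustration under my own
assumptions: the module name license_server, the message shapes and the
{QueueName, Limit} list are all made up, and it ignores failure
completely. With Distributed Erlang the Server argument can simply be a
pid on another node.

```erlang
-module(license_server).
-export([start/1, acquire/2, release/2]).

%% Limits is a list of {QueueName, MaxSimultaneous}, e.g. [{cad, 2}].
%% State maps each queue name to {FreeLicenses, WaitingClients}.
start(Limits) ->
    State = maps:from_list([{Q, {Max, queue:new()}} || {Q, Max} <- Limits]),
    spawn(fun() -> loop(State) end).

%% Blocks the caller until a license for Queue has been granted.
acquire(Server, Queue) ->
    Server ! {acquire, self(), Queue},
    receive {granted, Queue} -> ok end.

%% Hands a license back; the oldest waiting client (if any) gets it.
release(Server, Queue) ->
    Server ! {release, Queue},
    ok.

%% Note: an unknown queue name crashes the server (no error handling).
loop(State) ->
    receive
        {acquire, Pid, Q} ->
            {Free, Waiting} = maps:get(Q, State),
            case Free of
                0 ->
                    %% Nothing free: park the client in FIFO order.
                    loop(State#{Q := {0, queue:in(Pid, Waiting)}});
                _ ->
                    Pid ! {granted, Q},
                    loop(State#{Q := {Free - 1, Waiting}})
            end;
        {release, Q} ->
            {Free, Waiting} = maps:get(Q, State),
            case queue:out(Waiting) of
                {{value, Pid}, Rest} ->
                    %% Pass the license straight to the next waiter.
                    Pid ! {granted, Q},
                    loop(State#{Q := {Free, Rest}});
                {empty, _} ->
                    loop(State#{Q := {Free + 1, Waiting}})
            end
    end.
```

    A client would call license_server:acquire(Server, cad), run its
job, and then call license_server:release(Server, cad). Options 2) and
3) keep the same loop but replace the message passing with a socket
protocol.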

> 
>  > Throughout the network
>  > whenever someone wants to run a job which uses one of those licenses
>  > they communicate with the server.  If a license is not available then
>  > the job blocks until the license is available.

    I think the key architectural/design problem is one of deciding
what you want to do in the event of failure. If the server crashes,
then all the clients are blocked. If there is a communication failure
then you may lose licenses etc.
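
    One possible way of handling both failure cases inside Erlang is
to use monitors. The sketch below is an assumption of mine, not a
recommendation: the module name license_guard is invented, there is a
single pool of licenses, and each client is assumed to hold at most one
license. The server monitors every holder and reclaims the license if
the holder dies; the client monitors the server so that it gets an
error back instead of blocking for ever.

```erlang
-module(license_guard).
-export([start/1, acquire/1, release/1]).

%% Max is the number of licenses in this (single) pool.
start(Max) ->
    spawn(fun() -> loop(Max, #{}, queue:new()) end).

%% Blocks until granted, or returns an error if the server is dead.
acquire(Server) ->
    Ref = erlang:monitor(process, Server),
    Server ! {acquire, self(), Ref},
    receive
        {granted, Ref} ->
            erlang:demonitor(Ref, [flush]),
            ok;
        {'DOWN', Ref, process, _Pid, Reason} ->
            {error, {server_down, Reason}}
    end.

release(Server) ->
    Server ! {release, self()},
    ok.

%% Holders maps each client pid to the monitor the server holds on it.
loop(Free, Holders, Waiting) ->
    receive
        {acquire, Pid, Ref} when Free > 0 ->
            loop(Free - 1, grant(Pid, Ref, Holders), Waiting);
        {acquire, Pid, Ref} ->
            loop(Free, Holders, queue:in({Pid, Ref}, Waiting));
        {release, Pid} ->
            case maps:take(Pid, Holders) of
                {MonRef, Rest} ->
                    erlang:demonitor(MonRef, [flush]),
                    next(Free, Rest, Waiting);
                error ->
                    loop(Free, Holders, Waiting)
            end;
        {'DOWN', MonRef, process, Pid, _Reason} ->
            %% A holder died without releasing: reclaim its license.
            case maps:take(Pid, Holders) of
                {MonRef, Rest} -> next(Free, Rest, Waiting);
                _ -> loop(Free, Holders, Waiting)
            end
    end.

%% Monitor the client and tell it the license is granted. If the
%% client is already dead the monitor fires at once and the license
%% comes straight back through the 'DOWN' clause.
grant(Pid, Ref, Holders) ->
    MonRef = erlang:monitor(process, Pid),
    Pid ! {granted, Ref},
    Holders#{Pid => MonRef}.

%% A license has just come back: give it to the next waiter, if any.
next(Free, Holders, Waiting) ->
    case queue:out(Waiting) of
        {{value, {Pid, Ref}}, Rest} ->
            loop(Free, grant(Pid, Ref, Holders), Rest);
        {empty, _} ->
            loop(Free + 1, Holders, Waiting)
    end.
```

    This only covers process and node failure as seen by the Erlang
runtime. What the clients should do after {error, {server_down, _}}
(retry, find a standby, give up) is exactly the design decision that
still has to be made.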

    Here you can virtually design whatever you want.

    In a small (closed) system, maybe there should be no server - the
clients could alternately take on the roles of client or server and
use a broadcast/lock strategy to negotiate licenses.
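
    As one possible reading of the "no server" idea, the locks in the
standard global module could stand in for the negotiation. This is a
sketch under my own assumptions: the module license_locks and the
{License, SeatNumber} lock names are invented, and every node has to
agree on the number of seats. Each seat of a license is one lock, and a
client simply tries the seats in turn.

```erlang
-module(license_locks).
-export([try_seat/3, free_seat/3]).

%% Try each of the Seats locks for License in turn. Requester
%% identifies the client (typically self()). Returns {ok, SeatNumber}
%% or busy.
try_seat(License, Seats, Requester) ->
    try_seat(License, 1, Seats, Requester).

try_seat(_License, N, Seats, _Requester) when N > Seats ->
    busy;
try_seat(License, N, Seats, Requester) ->
    Id = {{License, N}, Requester},
    %% Retries = 0: ask once on all known nodes, do not sleep and retry.
    case global:set_lock(Id, [node() | nodes()], 0) of
        true  -> {ok, N};
        false -> try_seat(License, N + 1, Seats, Requester)
    end.

%% Give seat N back so that another requester can take it.
free_seat(License, N, Requester) ->
    global:del_lock({{License, N}, Requester}, [node() | nodes()]).
```

    Two things to be aware of: a global lock may be set again by the
same requester, so a client that asks twice is given the same seat
twice, and a client that gets busy has to poll (or pass a larger retry
count to global:set_lock/3) since nobody queues it.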

    In a commercial system all nodes might not be equal. One server
(placed on a reliable node) might service hundreds of clients, etc. -
you still might want to have some hot-standby/fail-over behaviour for
the server,...

    This kind of stuff soon gets complicated (but that's OK - our
telecoms stuff is like this :-)

    Firstly, I'd like to see a simple "ball park" analysis of the
problem, in terms of:

	a) How many clients
	b) How many servers
	c) Holding times (how long does a client use a license)
	d) Reliability requirements.
	   - Is it acceptable that clients block if the server crashes?
	   - Do you want hot standby for server crashes?
	e) Is this a LAN/WAN application?
	f) Required response times for obtaining/freeing a license
	g) Security levels (none, ... full). How much effort do you
	   want to put into making sure that you cannot forge a license
	   (you can have anything up to a full public/private key system)
	h) Maintenance levels (do you want a remote management system)
	   If so what ...

    Once you have some idea of the answers to questions like these you
can *begin* to think about an architecture.
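
    To show the kind of arithmetic I have in mind for a) and c), here
is a small sketch that turns a request rate and a holding time into an
offered load and then into a rough "all licenses busy" probability
using the standard Erlang B recurrence. The module name ballpark and
the numbers in the usage note are invented for illustration. Erlang B
assumes that a blocked request goes away rather than waits, so for a
queueing system like yours it is only a first approximation.

```erlang
-module(ballpark).
-export([offered_load/2, blocking/2]).

%% Offered load in erlangs: request rate (per hour) multiplied by the
%% mean holding time (in hours).
offered_load(RequestsPerHour, HoldingHours) ->
    RequestsPerHour * HoldingHours.

%% Erlang B blocking probability for Load erlangs and a given number
%% of Licenses, computed with the recurrence
%%   B(0) = 1,  B(K) = Load * B(K-1) / (K + Load * B(K-1)).
blocking(Load, Licenses) ->
    blocking(Load, Licenses, 1, 1.0).

blocking(_Load, Licenses, K, B) when K > Licenses ->
    B;
blocking(Load, Licenses, K, B) ->
    blocking(Load, Licenses, K + 1, Load * B / (K + Load * B)).
```

    For example, 4 requests per hour that each hold a license for half
an hour give an offered load of 2 erlangs, and with 2 licenses
ballpark:blocking(2.0, 2) comes out at about 0.4, i.e. roughly four
requests in ten would find every license taken.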

    It may be that you have a very specific set of answers - fine, then
we can talk architectures. Or, you might want to "grow" a solution for
a very simple idealized system.

	/Joe

--
Joe Armstrong  Computer Science Laboratory  +46 8 719 9452
AT2/ETX/DN/SU  Ericsson Telecom SE-126 25 Stockholm Sweden 
joe@REDACTED    http://www.ericsson.se/cslab/~joe