Thought of the day: was RE: Gen_server and Gen_fsm questions

Joe Armstrong (AL/EAB) joe.armstrong@REDACTED
Tue Jan 4 14:37:53 CET 2005




Great - I like BIG pictures - wide-screen surround sound - the works 
give it to me ...

On my way to work I was thinking about Remote Procedure Calls (RPCs)
 and !! and HTTP and SIP and all that kind of stuff.

Why are there so many *different* formats and ways of doing the *same* thing?

I think we should use ONE format for everything - bear with me and I'll
try to explain: 

Firstly, what is an RPC?

In Erlang, to do an RPC you send something a message and wait for a reply.

	rpc(Pid, Q) ->
	    Pid ! {self(), Q},
	    receive
	        {Pid, Reply} ->
	            Reply
	    end.
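To try rpc/2 you need something at the other end. Here is a minimal sketch of a process that answers it - the module name rpc_demo, the {double, N} request and demo/0 are made up for illustration, they are not from any library.

```erlang
-module(rpc_demo).
-export([rpc/2, start/0, demo/0]).

%% rpc/2 as in the text: send a request tagged with our own pid,
%% then wait for a reply tagged with the server's pid.
rpc(Pid, Q) ->
    Pid ! {self(), Q},
    receive
        {Pid, Reply} ->
            Reply
    end.

%% Hypothetical server: it only knows how to double a number.
start() ->
    spawn(fun loop/0).

loop() ->
    receive
        {From, {double, N}} ->
            From ! {self(), 2 * N},
            loop();
        {From, Other} ->
            From ! {self(), {error, {unknown_request, Other}}},
            loop()
    end.

%% Start a server and make one call to it.
demo() ->
    Pid = start(),
    rpc(Pid, {double, 21}).
```

Note that this rpc/2 waits forever if the server dies - a real one would probably want an "after" clause or a monitor.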

How do you do this in http?

The URL: "http://www.erlang.org/index.html" is a neat way of saying
"open port 80 on the host www.erlang.org then send a GET HTTP 1.0 ... request
to port 80 and wait for a reply"

i.e. "http://www.erlang.org/index.html" serves to name an RPC. It's very neat since it
manages to say several things in one simple string.

Now let's imagine an Erlang equivalent: 

What might: "http://www.erlang.org/Mod/Func?Arg1=val1&Arg2=Val2" mean?

Let's interpret this as:

	Go to port 80 on www.erlang.org and write a GET Mod/Func?Arg1=val1& ...
string to the port.

      What does the server do? - Yes - evaluate Mod:Func(Args) ... assume this
returns a term T - then convert this to a binary and send it back with a mime type
text/erlangBinterm.
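A rough sketch of what the server side might look like, assuming the request has already been parsed into a module, a function and an argument list. The module name eval_demo and the allowed/2 whitelist are my own assumptions - evaluating any Mod:Func a stranger sends you is obviously not something to do unchecked.

```erlang
-module(eval_demo).
-export([handle/3]).

%% Illustrative whitelist: only these calls may be made from outside.
allowed(lists, reverse) -> true;
allowed(lists, sum)     -> true;
allowed(_, _)           -> false.

%% Evaluate Mod:Func(Args), convert the resulting term T to a binary
%% and tag it with the mime type suggested in the text.
handle(Mod, Func, Args) ->
    case allowed(Mod, Func) of
        true ->
            T = apply(Mod, Func, Args),
            {ok, "text/erlangBinterm", term_to_binary(T)};
        false ->
            {error, forbidden}
    end.
```

The client gets the term back with binary_to_term/1.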

     Now suppose I want to make "something like email" based on http -
easy! define a URL like

	"http://www.erlang.org/mail/deliver?who=joe&subject=mail...."

     To mean "deliver mail to joe ...."

     Now structure your software like this:
     
      HTTP GET ...   +--------+  {[mail,deliver],[{"who","joe"}]} +--------+
    ---->------------| driver |-------->--------------------------| server |
                     +--------+                                   +--------+

      The driver does HTTP packet reassembly etc - it parses the request
into a normalised Erlang term and sends it to the server.
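As a sketch of the parsing half of such a driver (no sockets, no packet reassembly), assuming the request line has already been read as a string. The module name http_driver and the function names are hypothetical.

```erlang
-module(http_driver).
-export([normalise/1]).

%% Turn a request line such as "GET /mail/deliver?who=joe HTTP/1.0"
%% into the normalised term {[mail,deliver],[{"who","joe"}]}.
normalise(RequestLine) ->
    ["GET", Target | _] = string:lexemes(RequestLine, " "),
    {Path, Query} = split_target(Target),
    %% list_to_atom/1 on untrusted input can fill the atom table;
    %% a real driver would check the names against a known set first.
    Names = [list_to_atom(S) || S <- string:lexemes(Path, "/")],
    Args = [pair(KV) || KV <- string:lexemes(Query, "&")],
    {Names, Args}.

%% Split "path?query" at the first "?"; the query may be missing.
split_target(Target) ->
    case string:split(Target, "?") of
        [Path, Query] -> {Path, Query};
        [Path]        -> {Path, ""}
    end.

%% Split "key=value" at the first "="; the value may be missing.
pair(KV) ->
    case string:split(KV, "=") of
        [K, V] -> {K, V};
        [K]    -> {K, ""}
    end.
```

The result is what gets sent on to the server, e.g. Server ! {self(), http_driver:normalise(Line)}.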

       Why go to all this trouble?

      Suppose we change transport medium - to FTP

	"ftp://joe@www.erlang.org/mail/deliver?who=joe&subject=mail...."

      Again this has to be interpreted and parsed, so we add a new front-end

     
      HTTP GET ...   +--------+  {[mail,deliver],[{"who","joe"}]} +--------+
    ---->------------| driver |-------->---+----------------------| server |
                     +--------+            |                      +--------+
                                           |
     FTP put request +--------+            |
    ----->-----------| driver |----->------+
                     +--------+

     Now the back-end server only understands Erlang messages - the drivers translate
between these messages and HTTP, or FTP or whatever is the flavour of the day (even XML RPC)


      What we have to recognise is that all these different syntaxes are just different
ways of doing an RPC.

     
      RPC format 1   +---------+  Universal term format             +--------+
    ---->------------| driver1 |-------->---+-----------------------| server |
                     +---------+            |                       +--------+
                                            |
      RPC format 2   +---------+            |
    ----->-----------| driver2 |----->------+
                     +---------+            |
                                            |
      RPC format 3   +---------+            |
    ----->-----------| driver3 |----->------+
                     +---------+
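The picture above could be sketched like this: two toy drivers that accept differently shaped requests and turn both into the same term for one server. Everything here (module multi_driver, the request tuples, the {delivered, Who} reply) is invented for illustration - the "HTTP" and "FTP" inputs are assumed to be already read off the wire.

```erlang
-module(multi_driver).
-export([start_server/0, http_driver/2, ftp_driver/2, demo/0]).

%% The back-end: it only ever sees normalised Erlang terms.
start_server() ->
    spawn(fun server/0).

server() ->
    receive
        {From, {[mail, deliver], Args}} ->
            Who = proplists:get_value("who", Args),
            From ! {self(), {delivered, Who}},
            server();
        {From, _Other} ->
            From ! {self(), {error, unknown_request}},
            server()
    end.

%% Shared by both drivers: plain send-and-wait RPC.
call(Server, Term) ->
    Server ! {self(), Term},
    receive
        {Server, Reply} -> Reply
    end.

%% Driver 1: a request that arrived as an HTTP GET.
http_driver(Server, {get, "/mail/deliver", Query}) ->
    call(Server, {[mail, deliver], Query}).

%% Driver 2: the same operation arriving as an FTP put.
ftp_driver(Server, {put, User, "mail/deliver"}) ->
    call(Server, {[mail, deliver], [{"who", User}]}).

demo() ->
    S = start_server(),
    {http_driver(S, {get, "/mail/deliver", [{"who", "joe"}]}),
     ftp_driver(S, {put, "joe", "mail/deliver"})}.
```

The server cannot tell which driver a request came through, which is the whole point.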
 
     Now why people get all excited about the different formats (XML-RPC, SOAP,
HTTP, FTP, sun-rpc) etc. is beyond me - THEY ARE ALL JUST DIFFERENT SYNTAXES
FOR RPCs.

     Whether you fetch a file with HTTP or FTP or rcp or XML-RPC is *irrelevant* -
the semantics ("fetching a file") are identical.

     Still with me? - good.

     Let's generalise

	P://H/Function?Args

     means 
       1) let's use a Protocol called P
       2) to talk to a host H
       3) and tell it to do Function
       4) with arguments Args

     That's why it's a very nice notation (4 things in one string)
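Pulling the four things out of the string can be sketched with the uri_string module from the standard library (this assumes OTP 21 or later; the module name uri_demo and split/1 are made up).

```erlang
-module(uri_demo).
-export([split/1]).

%% Split "P://H/Function?Args" into {P, H, Function, Args}.
%% Crashes with badmatch if the string has no scheme, host or path.
split(Uri) ->
    #{scheme := P, host := H, path := Function} = Map = uri_string:parse(Uri),
    %% The query part is optional, so default to the empty string.
    Query = maps:get(query, Map, ""),
    {P, H, Function, uri_string:dissect_query(Query)}.
```

For example uri_demo:split("http://www.erlang.org/Mod/Func?Arg1=val1&Arg2=Val2") gives the protocol, the host, the path "/Mod/Func" and a list of {Key, Value} string pairs.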

     How do we find H? - there are three alternatives.

     1) If H *is* the server hostname then use DNS
     2) If H contains no hostname use a distributed hash table (chord, pastry, DKS, CAN etc)
     3) If H is a mixture of a server name and a key use SIP

     To do 1) you need to own some DNS domain that you can easily modify.
2) is research - there are no public name servers (or am I wrong?). 3) implies a
SIP proxy at a fixed hostname. Given name@host the host bit can be resolved by DNS
and the name bit can be resolved by a SIP server at host.

     So here's an idea:

     Let's define a new URL (or is it a URI - I can never remember)

	erl://name@host/Mod/Func?arg1=val1&arg2=val2
       
     To mean something like: use SIP to locate a joe@REDACTED, open a socket to an Erlang
server on this machine and send it the message {rpc, From, "Mod/Func", [{"arg1","val1"}, ...]}
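A sketch of just the translation step, from an erl:// string to the message. The SIP lookup and the socket are left out entirely; the module name erl_url and to_message/2 are hypothetical, and uri_string needs OTP 21 or later.

```erlang
-module(erl_url).
-export([to_message/2]).

%% Turn "erl://name@host/Mod/Func?arg1=val1&arg2=val2" into
%% {{Name, Host}, Message}: who to locate, and what to send them.
%% Only the erl scheme is accepted; anything else is a badmatch.
to_message(From, Url) ->
    #{scheme := "erl", userinfo := Name, host := Host, path := Path} = Map =
        uri_string:parse(Url),
    Args = uri_string:dissect_query(maps:get(query, Map, "")),
    %% Drop the leading "/" so the path reads "Mod/Func" as in the text.
    Target = string:trim(Path, leading, "/"),
    {{Name, Host}, {rpc, From, Target, Args}}.
```

Whatever does the locating (SIP, DNS, a hash table) only needs the first element of the result; the second element is what goes down the socket.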

     SIP stands for "session initiation protocol" - I assume the designers of
SIP were thinking of "Erlang sessions". I guess SIP is really just "a rather complicated
way of connecting two Erlang processes together" - once you've done this then the
processes can get on with the real job of "doing something useful."

     This is, of course, the tricky bit - discussing syntax (should we use XML, HTTP,
FTP, SIP, DNS, SOAP) distracts attention away from semantics (what should we do with
this stuff).

     The former question usually attracts much more attention than the latter :-)

     Cheers

/Joe

<reading through this I think I've rambled far off the original question that set me off,
so I'll try a quick summary.

Get rid of proprietary formats/protocols etc. as soon as possible - use drivers to
convert to a universal messaging format (Erlang terms). Write all your programs using
the internal formats. Introduce a universal naming scheme for everything>

-----Original Message-----
From: Casper [mailto:casper2000a@REDACTED]
Sent: den 4 januari 2005 13:25
To: Joe Armstrong (AL/EAB); 'Vance Shipley'
Cc: erlang-questions@REDACTED
Subject: RE: Gen_server and Gen_fsm questions


Hi Joe,

Wish you a happy new year too. And thanks a lot for your valuable advice.

I have a BIG picture. A picture of a common platform, very generalized,
which has Telecom applications such as SMSC, IVR, Prepaid, HLR, SCP, etc as
pluggable modules (or applications). One module to handle ISUP Call control,
one module for IVR functions, one for Prepaid functions, one for TCAP, one
for SMS handling, one OAM, etc., distributed and having full redundancy.

I'm kind of tired by doing various platforms in various languages and
platforms. MMSC runs on any, since it's done using Java, SMSC on Linux
C/C++, IVR on Win32 VC, etc. These developments are started in different
levels/times, so has not come under one platform. Also maintenance and
debugging takes a lot of time. DBMS is not giving the required transaction
speed, etc. So I want all of them to come under one platform, and I'm
getting very much convinced, under Erlang/OTP platform.

I know it'll be difficult to start, but I'm sure it's worth doing. So at the
moment I'm investigating the architectures of other platforms developed
using Erlang. It's kind of hard to find any good documentation of such a
system. 

If any of you can give me any advice/reference materials regarding above
discussion, I greatly appreciate.

Thanks!
Eranga







-----Original Message-----
From: Joe Armstrong (AL/EAB) [mailto:joe.armstrong@REDACTED] 
Sent: Tuesday, January 04, 2005 4:16 PM
To: 'Casper'; 'Vance Shipley'
Cc: erlang-questions@REDACTED
Subject: RE: Gen_server and Gen_fsm questions

> If I have one/two process for each call, then if I maintain let's say
> 100,000 simultaneous calls, I will have to create 200,000 gen_fsm, ie.
> Processes. 
  
   Yes

> Is that a Good method?
  
   Yes

> Will that create unnecessary system overhead?

   No

It's exactly the right way to think.

You have to get used to thinking in terms of processes - creating processes
is a light-weight operation (this means you can create lots of them very
quickly).

Now you might run into memory problems - I don't know what the minimum size
of a process is, but let's guess 1KB - so your 200 K processes might take
200M of memory and that might be a problem.
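Rather than guess, the per-process cost can be measured. A minimal sketch, assuming idle processes that just sit in a receive (a real call process would carry more state); the module name proc_mem is made up.

```erlang
-module(proc_mem).
-export([measure/1]).

%% Spawn N idle processes and add up what process_info reports.
%% Returns {N, TotalBytes, AverageBytesPerProcess}.
measure(N) when N > 0 ->
    Pids = [spawn(fun() -> receive stop -> ok end end)
            || _ <- lists:seq(1, N)],
    %% process_info/2 returns undefined for a dead process;
    %% the generator pattern simply skips those.
    Total = lists:sum([Bytes || Pid <- Pids,
                                {memory, Bytes} <- [erlang:process_info(Pid, memory)]]),
    %% Tell every process to finish so nothing is left behind.
    [Pid ! stop || Pid <- Pids],
    {N, Total, Total div N}.
```

Multiply the average by 200,000 to see whether the 200M guess is in the right area on your machine. For very large N the emulator's process limit may have to be raised with the +P flag to erl.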

But suppose you were to do it some other way - suppose you "suspend" a
process when it's not doing anything useful - you have to store its data
structures somewhere - you have to make it go away, store its data
structures, then at a later stage wake it up and restore its data
structures etc. All of this takes lots of unnecessary code and there's no
guarantee that it's quicker.

Even storing the data structures required by suspended processes takes space
so doing this might not be a good idea.

The Erlang "way" is to identify all the truly parallel activities in your
application and then assign exactly ONE process per activity. (The exactly
ONE bit is important) - this makes the code isomorphic to the problem - and
easy to write, understand and debug.
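As a toy illustration of one process per call, here is a sketch using plain processes rather than gen_fsm. The states (idle, connected), the messages (answer, hangup) and the module name call_proc are invented for the example.

```erlang
-module(call_proc).
-export([start/1, demo/0]).

%% One process per call; the state of the call is simply
%% which function the process is currently waiting in.
start(CallId) ->
    spawn(fun() -> idle(CallId) end).

idle(CallId) ->
    receive
        {From, answer} ->
            From ! {CallId, connected},
            connected(CallId)
    end.

connected(CallId) ->
    receive
        {From, hangup} ->
            From ! {CallId, released}
            %% the process ends here, and its memory goes with it
    end.

%% Run three calls side by side and collect the ids as they release.
demo() ->
    Ids = lists:seq(1, 3),
    Calls = [start(Id) || Id <- Ids],
    [C ! {self(), answer} || C <- Calls],
    [C ! {self(), hangup} || C <- Calls],
    [receive {Id, released} -> Id end || Id <- Ids].
```

Each call's code reads as if it were the only call in the system - there is no table of call states to look up and nothing to suspend or restore.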

So first you do as I have suggested - THEN you measure and possibly
optimise.

First make it right - then make it fast.

Happy new Year



