Programming components

Joe Armstrong (AL/EAB) <>
Tue Aug 23 11:28:05 CEST 2005

  Hello world,

  This document contains a few ideas that Uffe and I have been playing with.

  Erlang lacks a notion of types and a way of specifying components.

  This note describes a system of dynamic types and type serializations
which seems to us to be easy to understand and implement.
  Think of it as a sort of grand union of XML-RPC and UBF.

  This is what XML-RPC, SOAP etc. should have been :-)

  Later this year I will release a library which simplifies writing
components - for now I just want to discuss the ideas involved. 



On Types, serializations and components


Any programming language must have a notion of variables and types.

When we say

	Var = Value

we must know the domain that Value belongs to.

We state the domain of Var with a type declaration:

	Type Var = T

Meaning that Var has type T.

Type systems always have three parts:

	1) A set of primitive types
	2) Glue for making complex types from primitive types
	3) Aliases for defining new types

We can make a pretty powerful type system with two primitive types,
three types of glue and one way of making new types.

The two primitive types are

	int() - meaning an integer
	str() - meaning a string

The three types of glue are

	{T1, T2, ..., Tn}   - tuple glue
	[T]                 - list glue
	T1 | T2 | ... | Tn  - choice glue

The way of making new types is:

	defType Name = T
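
A minimal sketch of how this type language could be written down and
checked dynamically in Erlang. The module name, the term shapes (int, str,
{tuple, Ts}, {list, T}, {alt, Ts}, {alias, Name}) and the function check/3
are assumptions made up for illustration, and strings are taken to be
binaries:

```erlang
-module(dyn_types).
-export([check/3]).

%% Hypothetical encoding of the type language as Erlang terms:
%%   int | str | {tuple, [T]} | {list, T} | {alt, [T]} | {alias, Name}
%% Defs is a map from alias names to types (the "defType Name = T" part).

%% check(Type, Value, Defs) -> boolean()
check(int, V, _Defs) -> is_integer(V);
check(str, V, _Defs) -> is_binary(V);
check({tuple, Ts}, V, Defs) when is_tuple(V), tuple_size(V) =:= length(Ts) ->
    %% tuple glue: same arity, and every element matches its type
    lists:all(fun({T, E}) -> check(T, E, Defs) end,
              lists:zip(Ts, tuple_to_list(V)));
check({list, T}, V, Defs) when is_list(V) ->
    %% list glue: every element matches the single element type
    lists:all(fun(E) -> check(T, E, Defs) end, V);
check({alt, Ts}, V, Defs) ->
    %% choice glue: at least one alternative matches
    lists:any(fun(T) -> check(T, V, Defs) end, Ts);
check({alias, Name}, V, Defs) ->
    %% a defined type: look up its definition and check against that
    check(maps:get(Name, Defs), V, Defs);
check(_Type, _V, _Defs) -> false.
```

For example, dyn_types:check({tuple, [int, str]}, {23, <<"joe">>}, #{})
returns true.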

Serialization of type instances

Given the above type system we now ask: how can we serialize instances
of the types? Easy - here are two possible serializations of type
instances, one as Erlang terms and one as XML.

First the primitive types

Type    Instance   Erlang      Xml
int()   23         23          <i>23</i>
str()   "joe"      <<"joe">>   <s>joe</s>

Then the glue

Type             Instance     Erlang           Xml
{int(), str()}   {23,"joe"}   {23,<<"joe">>}   <t><i>23</i><s>joe</s></t>
[int()]          [45,67]      [45,67]          <l><i>45</i><i>67</i></l>

Note the 1:1 correspondence between the Erlang and XML serializations
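
The mapping from the Erlang form to the XML form can be sketched as a
short recursive function. This assumes that a tuple or list simply
encloses the encodings of its elements in <t>...</t> or <l>...</l>, and
it makes no attempt to escape <, > or & inside strings; the module and
function names are hypothetical:

```erlang
-module(xml_ser).
-export([to_xml/1]).

%% to_xml(Value) -> binary()
%% Integers become <i>..</i>, binaries (strings) <s>..</s>,
%% tuples <t>..</t> and lists <l>..</l>.
%% Assumption: string contents are emitted as-is, without escaping.
to_xml(V) -> iolist_to_binary(enc(V)).

enc(I) when is_integer(I) ->
    [<<"<i>">>, integer_to_binary(I), <<"</i>">>];
enc(B) when is_binary(B) ->
    [<<"<s>">>, B, <<"</s>">>];
enc(T) when is_tuple(T) ->
    [<<"<t>">>, [enc(E) || E <- tuple_to_list(T)], <<"</t>">>];
enc(L) when is_list(L) ->
    [<<"<l>">>, [enc(E) || E <- L], <<"</l>">>].
```

Because strings are binaries and lists are lists, each Erlang term maps
to exactly one XML form without any guessing about whether a list of
small integers was meant as a string.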


What is a component?

Definition: A component is a black box with ports. If you send the port a
well-typed message it will respond with a well-typed reply.

Below I discuss two styles of specifying the behavior of components.

One uses a CSP-like notation, the other an RPC-like notation.

CSP Components 1

         s  +------------------+
         ---|   Black box xx   |
            +------------------+

I define a component as a black box with one or more ports.

All black boxes have at least one port called "s" (standard I/O).

To *specify* the behavior of a black box I can write CSP-style equations
like this:

 Type fileName = str().
 Start = s ? {"get", fileName()} -> s ! ({"ok", str()} | "enoFile") -> Start
       | s ? "ls" -> [fileName()]
       | s ? {"put", fileName(), str()}.

 This tells me all that I need to know about the FTP server!

 << The notation is
    Process = Port [?|!] Type -> ... -> Process
    P ? T means receive a message of type T from port P
    P ! T means send a message of type T to port P >>


 A serialization needs a "wrapper" to say what it is.

 So to use my server I might make up a message like this, using the
 XML encoding:

	Content-Type: text/myXml
	Content-Length: NNNN
	[blank line]
	 <s>some file</s>

 or the Erlang text encoding:

	Content-Type: text/erlText
	Content-Length: NNNN
	[blank line]
	{<<"get">>, <<"some file">>}

 or the Erlang binary encoding:

	Content-Type: text/erlBin
	Content-Length: NNNN
	[blank line]
	term_to_binary({<<"get">>, <<"some file">>})
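
Building such a wrapped message could look roughly like the sketch
below. The module and function names are hypothetical, and the CRLF line
endings are an assumption borrowed from HTTP; the note itself does not say
which line ending the wrapper uses:

```erlang
-module(wrap_sketch).
-export([wrap/2, wrap_term/1]).

%% wrap(ContentType, Body) -> binary()
%% Prepends the two header lines and the blank line to an already
%% serialized body. Content-Length is the size of the body in bytes.
%% Assumption: lines end in CRLF, as in HTTP.
wrap(ContentType, Body) when is_binary(ContentType), is_binary(Body) ->
    Len = integer_to_binary(byte_size(Body)),
    <<"Content-Type: ", ContentType/binary, "\r\n",
      "Content-Length: ", Len/binary, "\r\n",
      "\r\n",
      Body/binary>>.

%% The text/erlBin case: the body is the Erlang external term format.
wrap_term(Term) ->
    wrap(<<"text/erlBin">>, term_to_binary(Term)).
```

A receiver would read the headers, take Content-Length bytes as the
body, and pick a decoder based on Content-Type (binary_to_term/1 for the
text/erlBin case).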

CSP Components 2

What happens if my components call modules or other components?

My ftp component will use the file component and the lists module.

 -import_ports(p is (s from file)).
 Type fileName = str().
 Start = s ? {"get", fileName()} ->
	   p ! {"read", fileName()} -> Reply
       | s ? "ls" -> [fileName()].

 Reply = p ? {"ok", str()} -> ...
	     lists ! {"member", str(), [str()]} -> Ok

 Ok = lists ? "true" -> ...

 This type of specification is precise but rapidly becomes unreadable.

 "Stateless RPC" seems nicer.

RPC Components

 Type fileName = str().

    {"get", fileName()} => {"ok", str()};
    "ls" => [fileName()]


 A => B; means that if we send the component ftp a message of type A it
will reply with a message of type B. This is a normal RPC.
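
To make the request/reply contract concrete, here is a sketch of the two
ftp rules written by hand as an Erlang predicate on a request and its
reply. The module and function names are hypothetical, and strings are
assumed to be binaries:

```erlang
-module(ftp_contract).
-export([conforms/2]).

%% conforms(Request, Reply) -> boolean()
%% True when Reply has the type that the ftp spec promises for Request.

%% {"get", fileName()} => {"ok", str()}
conforms({<<"get">>, Name}, {<<"ok">>, Data})
  when is_binary(Name), is_binary(Data) ->
    true;
%% "ls" => [fileName()]
conforms(<<"ls">>, Names) when is_list(Names) ->
    lists:all(fun is_binary/1, Names);
%% anything else breaks the contract
conforms(_Request, _Reply) ->
    false.
```

A proxy sitting in front of the component could call conforms/2 on every
exchange and report a contract violation when it returns false.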

 Reverse RPCs are written:

    X <= Y

 Messages that get no reply are written:

    A => $empty

 and notifications:

    $empty => E

  (or E <= $empty)


Bind associates an IP address and a port with a component.

	> Bind 2345 ftp

means that port 2345 on the IP address obeys the ftp protocol (if it
is running).

To use the server:

	1) Open a connection to port 2345
	2) Send it some correctly typed message, like:

	Content-Type: text/myXml
	Content-Length: NNNN
	[blank line]
	 <s>some file</s>

Open questions

1) What should the *semantics* of X => Y be?
   - immediate reply
   - will eventually reply
   - will eventually reply or timeout

2) Can we interleave A => B and X <= Y?

3) Timeouts

   Should we write

	A => B within Time

4) Which style of component specification is best:
   CSP or RPC?

5) Introspection - yes :-)
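
For question 3, one possible reading of "A => B within Time" on the
caller's side is an ordinary Erlang receive with an after clause. This is
only a sketch of the "will eventually reply or timeout" option: the
message shapes {From, Ref, Request} and {Ref, Reply}, and the module and
function names, are assumptions and not part of the proposal:

```erlang
-module(rpc_within).
-export([call/3]).

%% call(Pid, Request, Time) -> {ok, Reply} | timeout
%% Sends Request to Pid and waits at most Time milliseconds for the
%% matching reply. The unique reference ties the reply to this request.
call(Pid, Request, Time) ->
    Ref = make_ref(),
    Pid ! {self(), Ref, Request},
    receive
        {Ref, Reply} -> {ok, Reply}
    after Time ->
        timeout
    end.
```

Note that a reply arriving after the timeout stays in the caller's
mailbox, so a real library would also have to decide what to do with late
replies.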

Things to think about

 1) Erlang strings.
    To obtain a beautiful mapping from abstract types to
    Erlang and XML I need to represent strings as <<"string">> and
    NOT "string".
    The change to the pretty printer that I posted earlier
    would encourage this usage.

    Note: representing strings as binaries *everywhere* has great advantages
    (think about it).

 2) Dynamic type checking of these types is trivial - but can a type
    checker *prove* that your program follows a given component spec?

All comments welcome.



