scala (was checksum for distributed debugging?)

Mon Dec 12 12:38:11 CET 2005

Thanks for your lengthy post.  I have little time on my hands, but I 
make time to read such useful reports.
I agree with your analysis and am pleased to know that an Erlang expert 
had a hand in Scala development.  It would be nice to see some of 
these ideas implemented one day, as I will be keeping my eye on Scala for 
future suitability in real-world apps.
thanks again, ke han

happi wrote:
> [quote="Bob.Smart@REDACTED"]
>  P.S. On a completely different matter, I'd be interested in
>  comments on the Scala language (http://scala.epfl.ch/) by
>  Erlang experts. It can be used in a fairly pure functional
>  style using val declarations, and it seems to support
>  Erlang-style message exchange (see chapter 3 of "Scala by
>  Example" http://scala.epfl.ch/docu/files/ScalaByExample.pdf).
>  Far be it from me to suggest that Erlang might not last
>  forever, but if not this seems a possible migration path. 
> [/quote]
> 
> ( I've been trying to answer this a couple of times now, 
>   but my answer always becomes too long and confusing.
>   But since I spent the time writing it I'm posting it anyway. 
>   If you have the time to read half of it you have too much
>   time on your hands. )
> 
> In 2003, after seven years of hacking Erlang, I joined 
> Prof. Odersky at EPFL and spent a year and a half
> working in the Scala project.  
> 
> As both Ulf Wiger and Ke Han note, the biggest problem
> with the Scala concurrency model is that it is based on
> the underlying model of the VM it is running on.
> I will try to go a bit deeper into this problem.
> 
> In his thesis Joe Armstrong lists a number of requirements on
> a programming language and its libraries. In short, these are 
>   R1) Concurrency, 
>   R2) Error encapsulation, 
>   R3) Fault detection, 
>   R4) Fault identification, 
>   R5) Code upgrade, and 
>   R6) Stable storage.
> 
> According to Joe these requirements are fulfilled by Erlang
> in the following way: 
>   R1--by Erlang's processes, 
>   R2--processes are designed as units of errors, 
>   R2--processes fail if functions are called with the wrong arguments, 
>   R3 and R4--when a process fails the reason for failure is
>        broadcast to linked processes, 
>   R5--by hot code loading, and 
>   R6--by the Erlang libraries dets and mnesia. 
>  
> There are some key elements in Erlang that, according
> to Joe, make it easier (or indeed possible) to build
> fault-tolerant software systems. These are:
>   K1 -- Processes,
>   K2 -- Asynchronous message passing, 
>   K3 -- Process links, 
>   K4 -- No shared memory,
>   K5 -- Code updates, and
>   K6 -- Lightweight processes.
> 
> Most of this can be implemented in Scala today, and
> what is left might actually be fixed some time in the
> future.
> 
>  K1 Processes & K6 Lightweight processes.
>  ===========================
> Concurrency in Scala is provided by the underlying
> backend, at the moment either the JVM or the CLR, which is assumed
> to provide a thread model compatible with the thread model of Java.
> 
> This gives Scala concurrency "for free" and the possibility to 
> use Java or .NET libraries that use the native thread model.
> 
> A disadvantage is that threads in most Java implementations
> are not very lightweight, and starting new threads as well as
> context switching between them can be relatively slow. 
> But there is nothing in the Scala language definition that
> requires a heavy thread implementation. And there are some
> experimental Java implementations that actually have very
> lightweight processes.
> 
> Scala is extensible in many ways which makes it possible
> to get a language within the language that looks very much
> like Erlang. (We will use imports, def parameters, quoting,
> and higher order functions to achieve this.)
> 
> Given an implementation of processes in the class Process, 
> we can in Scala define the spawn function as:
> 
> package scala.concurrent;
> object Process {
>   def spawn(def body:unit):Process = {
>     val p = new Process(body);
>     p.start();
>     p;
>   }
> }
> 
> Here the def parameter body:unit takes a closure of the
> type () => unit.  One nice aspect of the def parameter is that
> the compiler automatically infers that a closure is needed at 
> all call sites of spawn, making it unnecessary for the programmer
> to write code to create the closure.  We can now create a new
> process by just calling spawn with the code that the new process
> should start executing. 
> 
> Assuming we have a Server object with a method 
>  loop:Int=>Unit, 
> we can create a process like this:
>  import scala.concurrent.Process.spawn;
>  ...
>  val pid = spawn(Server.loop(0));
> 
> So far we have assumed an implementation of the class Process, 
> let us now look at how this can be implemented in Scala.
> All we need to do is to extend the Thread class and implement
> the abstract run method of Thread:
> 
> class Process(def body:unit) extends Thread {
>     override def run() = body;
> }
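
A minimal sketch of the same idea in current Scala syntax, for readers who
want to try it on a recent compiler. It assumes that a by-name parameter
(body: => Unit) plays the role of the def parameter used above; the name
SimpleProcess is made up for this illustration and is not part of any
library.

```scala
// Sketch only: a thread whose body is passed by name, so the caller
// does not have to build the closure explicitly.
class SimpleProcess(body: => Unit) extends Thread {
  // body is evaluated here, on the new thread, not at construction time.
  override def run(): Unit = body
}

object SimpleProcess {
  // Create the process, start it, and hand back a reference (the "pid").
  def spawn(body: => Unit): SimpleProcess = {
    val p = new SimpleProcess(body)
    p.start()
    p
  }
}
```

With this in scope, SimpleProcess.spawn { println("hello") } runs the
println on a freshly started thread and returns that thread.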
> 
> Now we can start a process, but how do we stop it? Just as in
> Erlang, a process stops when it has no more code to execute, i.e.,
> when the code in body reaches its end. Sometimes we would like to
> kill the process prematurely. In Erlang this is done by calling the
> BIF exit(p:pid, reason:Term), and we can implement exit in Scala
> as well. We implement it both in the Process class and in the
> Process object in order to get an Erlang-like syntax:
> 
> object Process {
>   def spawn(def body:unit):Process = {
>     val p = new Process(body);
>     p.start();
>     p;
>   }
>  def exit(p:Process,reason:AnyRef) =
>    p.exit(reason);
> }
> 
> class Process(def body:unit) extends Thread {
>   private var exitReason:AnyRef = null;
>   override def run() = {
>     try {body}
>     catch {
>       case e:java.lang.InterruptedException =>
>         exitReason match {
>           case null =>
>             Console.println("Process exited abnormally " + e);
>           case _ =>
>             Console.println("Process exited with reason: " + exitReason);
>         }
>     }
>   }
> 
>   def exit(reason:AnyRef):unit = {
>     exitReason = reason;
>     interrupt();
>   }
> }
> 
> Processes in Erlang can get their own pid by calling the BIF
> self(). This can easily be simulated in the Process class
> by the method: def self = this;
> 
> We have one small problem though: in the example above we started
> the process by calling Server.loop(0), but the object Server
> does not inherit from the class Process, and hence the method 
> self is not available in the code of loop. 
> We can fix this by implementing self in the Process object:
> 
>   def self:Process = {
>     if (Thread.currentThread().isInstanceOf[Process]) 
>       Thread.currentThread().asInstanceOf[Process]
>     else error("Self called outside a process");
>   }
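
The exit and self mechanics described above can be sketched in current
Scala roughly as follows. KillableProcess and exitedWith are hypothetical
names chosen for this note. One caveat that follows from how
Thread.interrupt works: the body only stops early if it is blocked in (or
later enters) an interruptible call such as sleep or wait, or checks the
interrupt flag itself.

```scala
// Sketch only: a process that can be asked to exit with a reason.
class KillableProcess(body: => Unit) extends Thread {
  @volatile private var exitReason: Option[Any] = None

  // The reason passed to exit(), if exit() was called.
  def exitedWith: Option[Any] = exitReason

  override def run(): Unit =
    try body
    catch {
      // Raised by blocking calls after exit() interrupted the thread.
      case _: InterruptedException => ()
    }

  // Record the reason first, then interrupt the thread.
  def exit(reason: Any): Unit = {
    exitReason = Some(reason)
    interrupt()
  }
}

object KillableProcess {
  def spawn(body: => Unit): KillableProcess = {
    val p = new KillableProcess(body)
    p.start()
    p
  }

  // Erlang-style self(): only meaningful on a KillableProcess thread.
  def self: KillableProcess = Thread.currentThread() match {
    case p: KillableProcess => p
    case _ => throw new IllegalStateException("self called outside a process")
  }
}
```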
> 
> 
>   K2 Asynchronous message passing
>   ======================
> 
> So far our processes can only be created and execute code,
> which is not too bad in itself, but in order to get Erlang
> like processes we need to give the processes the ability to
> communicate. In Erlang processes communicate through 
> asynchronous message passing implemented with mailboxes.
> We can implement the same mechanism in Scala, although here
> we will go one step further and first implement a more general
> mailbox which can be read by several processes.
> 
> On page 138 of the document "Scala by Example" 
> (http://scala.epfl.ch/docu/files/ScalaByExample.pdf)
> you can find an implementation of Erlang like mailboxes 
> with the following signature: 
>  class MailBox {
>    def send(msg: Any): unit;
>    def receive[a](f: PartialFunction[Any, a]): a;
>    def receiveWithin[a](msec: long)(f: PartialFunction[Any, a]): a;
>  }
> 
> There is a special message TIMEOUT which is used to signal a
> time-out, implemented as:
>  case class TIMEOUT;
> 
> The receive method first checks whether the message processor
> function f can be applied to a message that has already been sent
> but that was not yet consumed. If yes, the thread continues
> immediately by applying f to the message. Otherwise, a new
> receiver is created and linked into the receivers list,
> and the thread waits for a notification on this receiver.
>  Once the thread is woken up again, it continues by applying
> f to the message that was stored in the receiver.
> 
> The mailbox class also offers a method receiveWithin
> which blocks for only a specified maximal amount of time.  If no
> message is received within the specified time interval (given in
> milliseconds), the message processor argument f will be unblocked
> with the special TIMEOUT message. 
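
To make the receive and receiveWithin behaviour concrete, here is a
simplified sketch in current Scala. It is not the implementation from
"Scala by Example": instead of a list of receivers it assumes a single
monitor and rescans the pending messages on every wake-up, which is easier
to read but less efficient. The signatures follow the ones quoted above.

```scala
import scala.collection.mutable.ListBuffer

// Marker delivered to the handler when receiveWithin times out.
case object TIMEOUT

class MailBox {
  // Messages that were sent but not yet consumed, oldest first.
  private val pending = ListBuffer.empty[Any]

  def send(msg: Any): Unit = synchronized {
    pending += msg
    notifyAll() // wake up blocked receivers so they can rescan
  }

  // Block until some pending message is accepted by f, then apply f to it.
  def receive[A](f: PartialFunction[Any, A]): A = {
    val msg = synchronized {
      var idx = pending.indexWhere(m => f.isDefinedAt(m))
      while (idx < 0) {
        wait()
        idx = pending.indexWhere(m => f.isDefinedAt(m))
      }
      pending.remove(idx)
    }
    f(msg)
  }

  // Like receive, but hands TIMEOUT to f if nothing matches within msec.
  def receiveWithin[A](msec: Long)(f: PartialFunction[Any, A]): A = {
    val deadline = System.currentTimeMillis() + msec
    val msg = synchronized {
      var idx = pending.indexWhere(m => f.isDefinedAt(m))
      var remaining = deadline - System.currentTimeMillis()
      while (idx < 0 && remaining > 0) {
        wait(remaining)
        idx = pending.indexWhere(m => f.isDefinedAt(m))
        remaining = deadline - System.currentTimeMillis()
      }
      if (idx >= 0) pending.remove(idx) else TIMEOUT
    }
    f(msg) // throws MatchError if f has no TIMEOUT case and time ran out
  }
}
```

Because receive picks the first message that f accepts rather than the
first message in the queue, a message the handler does not understand stays
in the mailbox, as in Erlang's selective receive.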
> 
> With an implementation of mailboxes we can now add mailboxes to
> our Process class by mixing in MailBox in Process, and we can make
> the syntax more Erlang like if we want by defining the method !:
> 
>  class Process(def body:unit) extends Thread with MailBox {
>    def !(msg:Message) = send(msg);
>    ...
> 
> In order to be able to do send and receive in code that does
> not inherit from Process we supply some methods in the Process
> object:
> 
>  object Process {
>     def send(p:Process,msg:Message) =
> 	p.send(msg);
>     def receive[a](f: PartialFunction[Message, a]): a = 
> 	self.receive(f);
> 
>     def receiveWithin[a](msec: long)(f: PartialFunction[Message, a]):a =
>         self.receiveWithin(msec)(f);
>   ...
> 
> We can also get named processes by, as in Erlang, using a name server:
> 
> object NameServer {
>   val names = new scala.collection.mutable.HashMap[Symbol, Process];
> 
>   def register(name: Symbol, proc: Process) = {
>     if (names.contains(name)) error("Name:" + name 
>                                                    + " already registered");
>     names += name -> proc;
>   }	
> 
>   def unregister(name: Symbol) = {
>     if (names.contains(name)) 
>       names -= name;
>     else 
>       error("Name:" + name + " not registered");
>   }
>   
>   def whereis(name: Symbol): Option[Process] = 
>     names.get(name);
> 
>   def send(name: Symbol, msg: Actor#Message) =
>     names(name).send(msg);
> 
>   def view(name: Symbol): Process = names(name);
> 
> }
> 
> Then we can just write code like
>   register('myServer, spawn(Server.loop(0)));
>   'myServer ! Tuple2('myMessage, self);
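
One thing to keep in mind is that a plain mutable HashMap is not safe when
several threads register names at the same time. A possible variant in
current Scala, sketched under the assumption that String keys are
acceptable and with the hypothetical name NameRegistry, uses the standard
library's concurrent TrieMap so that the check and the insert happen
atomically.

```scala
import scala.collection.concurrent.TrieMap

// Sketch only: a name table that tolerates concurrent callers.
// P stands for whatever process type is being registered.
class NameRegistry[P] {
  private val names = TrieMap.empty[String, P]

  // putIfAbsent returns the previous binding, if there was one.
  def register(name: String, proc: P): Unit =
    if (names.putIfAbsent(name, proc).isDefined)
      throw new IllegalStateException(s"Name $name already registered")

  // remove returns the removed binding, or None if the name was unknown.
  def unregister(name: String): Unit =
    if (names.remove(name).isEmpty)
      throw new IllegalStateException(s"Name $name not registered")

  def whereis(name: String): Option[P] = names.get(name)
}
```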
> 
>  K3 Process links
>  ==========
> One of the most important aspects of Erlang is the ability to link
> processes together. When a process is linked to another process it
> will send a signal to the other process when it dies. This makes it
> possible to monitor the failure of processes and to implement
> supervision trees where a supervisor process monitors worker processes
> and can restart them if they fail.
> 
> In Erlang a process can be linked to its father (creator) by using
> the spawn_link BIF when spawning a new process. It is also possible to
> link to another process at a later time by calling the link BIF.
> To implement this in Scala we have to add a list of links to the
> Process class and provide the link methods, as well as signal a
> failure to all linked processes. 
> We can now see the complete Process class in all its g(l)ory:
> 
> class Process(def body:unit) extends Thread with MailBox {
>     private var exitReason:AnyRef = null;
>     private var links:List[Process] = Nil;	
>     override def run() = {
> 	try {body;signal('Normal)}
> 	catch {
> 	    case _:java.lang.InterruptedException =>
>     	      signal(exitReason);
> 	    case (exitSignal) => 
> 	      signal(exitSignal);
> 	}
>     }
>     
>     private def signal(s:Message) = {
> 	links.foreach((p:Process) => p.send(Tuple3('EXIT,this,s)));
>     }
> 
>     def !(msg:Message) = send(msg);
> 
>     def link(p:Process) = links = p::links;
>     def unlink(p:Process) = 
>       links = links.remove((p2) => p == p2);
>     
>     def spawn_link(def body:unit) = {
> 	val p = new Process(body);
> 	p.link(this);
> 	p.start();
> 	p
>     }
>      
>     def self = this;
> 
>     def exit(reason:AnyRef):unit = {
> 	exitReason = reason;
> 	interrupt();
>     }
> }
> 
> In Erlang links actually work slightly differently, when a process
> (p1) dies and that process is linked to another process (p2) then p2
> will also be killed unless that process has the flag trap_exit set to
> true. With this behavior a whole set of processes that are linked can
> all be killed by killing just one of the processes. Often this is the
> desired behavior, if the set of processes are dependent upon
> each other then there is no reason for any of the processes to continue
> executing if one of them dies. 
> 
> When the process flag trap_exit is set to true, the linked process,
> p2, will not be killed; instead it will receive a message about the
> cause of the death of p1. This is the behaviour of the Scala code
> above.
> 
> It would not be hard to mimic the Erlang behavior in Scala by
> adding the trap_exit flag and then test it in the signal method and
> either use send or exit.
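
A rough sketch of that trap_exit variant in current Scala. Everything here
is illustrative: LinkedProcess, Exit and Normal are made-up names, the
mailbox is reduced to a plain LinkedBlockingQueue, and, following Erlang, a
normal exit is assumed not to kill linked processes.

```scala
import java.util.concurrent.LinkedBlockingQueue

// Exit signal delivered as a message to processes that trap exits.
final case class Exit(from: LinkedProcess, reason: Any)
// Reason used when the body ran to completion.
case object Normal

class LinkedProcess(body: => Unit) extends Thread {
  @volatile var trapExit = false
  @volatile private var exitReason: Any = null
  @volatile private var links: List[LinkedProcess] = Nil
  val mailbox = new LinkedBlockingQueue[Any]()

  def link(p: LinkedProcess): Unit = synchronized { links = p :: links }

  def exit(reason: Any): Unit = {
    exitReason = reason
    interrupt()
  }

  // Trapping processes get a message; the others are killed in turn,
  // except when this process ended normally.
  private def signal(reason: Any): Unit =
    links.foreach { p =>
      if (p.trapExit) p.mailbox.put(Exit(this, reason))
      else if (reason != Normal) p.exit(reason)
    }

  override def run(): Unit =
    try { body; signal(Normal) }
    catch {
      case _: InterruptedException => signal(exitReason)
      case e: Throwable            => signal(e)
    }
}
```

As with any interrupt-based scheme, a linked process that never blocks and
never checks its interrupt flag will not notice that it has been told to
exit.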
> 
>   K5 Code updates
>   ===========
> Currently there is no real support for code updates in Scala,
> but this might not be as big a problem as one might suspect.
> The hot code replacement is a cool feature of Erlang but
> it has some shortcomings. There is no automatic way to
> update or even detect changes in data structures.
> 
> Therefore it seems like many real systems go through
> special upgrade states and conversion functions, often 
> even using redundant hardware for upgrades. By just following
> some conventions you can actually upgrade running Scala
> systems in a similar fashion.
> 
>   K4 No shared memory
>   ==============
> Now, here is the big difference between Scala and Erlang.
> Since Scala uses the underlying concurrency model of
> Java or .NET, data sent as messages will be shared
> between processes.
> 
> If you (like Joe) believe that sharing is evil, there are
> four possible solutions to the problem of shared data.
> 
>  1. Make send copying. 
>      This requires a copy or serialize method in all messages,
>      which could be implemented in two ways in Scala.
>      a) Add an abstract copy method to the toplevel class AnyRef,
>          requiring all classes to implement this method.
>      b) Have a special Message class that has a copy method,
>          requiring all classes that need to be sent to implement
>          this method.
>      The problem with this solution is that the system would have
>      to somehow ensure that a deep copy is performed. An open
>      problem at the moment.
> 
>  2. The Erlang way
>      No updateable structures. 
>      You could for example implement all Erlang terms in Scala
>      and define the type Message as Term. 
>      A simple solution which would make Scala processes as 
>      powerful as Erlang processes, but you would lose OO for
>      all messages.
> 
>  3. A new type of type analysis that can determine that there are
>      no mutable structures in objects used as messages. 
>      Another open problem.
> 
>  4. Head in the sand.
>      Just ask the programmer to not send mutable data,
>      (unless he knows what he is doing).
>      This is the current approach of Scala (and Java).
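
For option 1, one way to approximate a copying send on the JVM is to push
the message through Java serialization. The sketch below is only an
illustration of that idea, with the hypothetical names CopyingSend and
deepCopy. It assumes every message (and everything reachable from it)
implements java.io.Serializable, and it is considerably slower than passing
a reference.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream,
                ObjectInputStream, ObjectOutputStream}

object CopyingSend {
  // Serialize the message to bytes and read it back, which yields a
  // structurally equal object graph that shares nothing with the original.
  def deepCopy[A <: java.io.Serializable](msg: A): A = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(msg) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[A] finally in.close()
  }
}
```

A send method could call deepCopy on the message before placing it in the
receiver's mailbox, so that later mutation by the sender is not visible to
the receiver. Fields marked transient are not copied, so this does not by
itself settle the open problem mentioned above.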
> 
> 
> I will not present a conclusion here; instead I hope to inspire
> some debate.
> 
> Anyway... Scala is a really cool language and I encourage
> you all to try it out, you just might like it as much as I do. 
> Still, though, I am back in the Erlang world now, and I'm
> loving every minute of it.
> 
> /Erik Happi Stenman
> 
> PS.
>   Sorry for the badly formatted code, I'm using the trap-exit
>   forum and it seems like code either looks bad in the forum
>   or in the email version... or perhaps both.
>   Any suggestions are welcome.
> _________________________________________________________
> Sent using Mail2Forum (http://m2f.sourceforge.net)
>