[erlang-questions] suitability of erlang

Thu Oct 11 19:42:23 CEST 2012

On Thu, Oct 11, 2012 at 4:17 PM, Rustom Mody <rustompmody@REDACTED> wrote:
>
>
> On Thu, Oct 11, 2012 at 12:00 AM, Joe Armstrong <erlang@REDACTED> wrote:
>>
>> On Wed, Oct 10, 2012 at 6:06 PM, Rustom Mody <rustompmody@REDACTED>
>> wrote:
>> > On Wed, Oct 10, 2012 at 3:45 PM, Matthias Lang <matthias@REDACTED>
>> > wrote:
>> >>
>> >> On Wednesday, October 10, Rustom Mody wrote:
>> >
>> >
>> >>
>> >> Another thing: you write "under severe time-crunch", but that could
>> >> mean "we need to ship a product in a very short time" or it could mean
>> >> "the data has to be processed in a very short time".
>> >>
>> >> Matt
>> >
>> >
>> > Well mainly the latter
>> > But the former is also true. (When is it ever not :-))
>> > My concern is that there is some desire to try out erlang which may be
>> > fine... However I am wondering whether the difference between
>> > concurrency
>> > and parallelism has not been appreciated.
>>
>> Very true - I have given several lectures on this ... many people use the
>> words concurrent and parallel as synonyms - but there is a real
>> difference.
>>
>> In a nutshell - parallel has to do with real hardware that can do
>> several things "at the same time"
>> whereas concurrency is a software structuring abstraction which
>> pretends that different
>> parts of the software can run at the same time (if suitable hardware
>> were available)
>>
>> Parallel has to do with hardware - on a single cpu you can write your
>> programs
>> using pmap (instead of map) and parBegin and parEnd  and it won't make
>> any difference
>> since the machine is still only doing one instruction at a time.
>>
>> It's only because machines are fast that we think they do several
>> things in parallel -
>> in the single CPU + time sharing case if we slowed the clock down we would
>> see
>> that no parallelism is involved - the CPU does one thing at a time - even
>> when executing
>> concurrent programs :-)
>>
>> There are only a few real sources of true parallelism - multi-cores,
>> what happens in the pipeline, and how data is fetched/stored
>>
>> Multicores do give you true parallelism, but you must be careful
>> about the granularity
>> of computation - ie it should not be more work to shift the work to a
>> remote CPU than do it
>> locally. Pipeline messing is old for compiler writers. Data fetching
>> and storing can be done in
>> parallel, this needs a bit of thought.
>>
>> Parallel programs are inherently non portable - in the sense that they
>> specifically depend
>> upon details of the hardware.
>>
>> Concurrent programs are programs written using a concurrency
>> abstraction. In Erlang
>> processes are the unit of concurrency - processes are pretty coarse-grain
>> things
>> ie they take a wee bit of effort to get going, so they actually map
>> well onto multi-cores,
>> even if you can detect parallelism in your program (easy in a pure FP,
>> just evaluate
>> all the arguments to a function in parallel) - it's pretty difficult
>> (ie an open research problem)
>> to map this onto the available hardware.
>>
>> Even if you know how long computations take, and you know what
>> resources you have
>> to solve the computations, then mapping the computations onto the
>> resources involves
>> solving the knapsack problem which is NP hard.
>>
>> If you write programs in Erlang you at least have a head start over
>> sequential code
>> since the programmer has at least said (using processes) which
>> activities should be run in parallel.
>> Optimally scheduling these is NP hard - but non-optimal first on
>> demand, best effort scheduling
>> works pretty well for non-pathological code.
>>
>> My 10 c.
>>
>> /Joe
>
>
> Thanks Joe for a detailed answer.
>
> I have one more question that sits somewhere between parallelism and
> concurrency.
> In the 'normal' world (ie C++, Java, Python) there is a general
> recommendation that converting a threaded solution to an event-driven
> solution usually speeds up the program and removes hard-to-find bugs.
> And at the heart of an event-driven system is usually an FSM.
> Now I am particularly interested in the combined (regular-exp+FSM) approach
> that ragel provides: http://www.complang.org/ragel/
>
> In particular, here is the use of ragel to write an http server:
> http://www.zedshaw.com/essays/ragel_state_charts.html
>
> Now since ragel is more or less language agnostic -- it has backends for
> generating C, C++, Java, Ruby etc -- would an Erlang backend for ragel make
> sense?

I have no idea - you haven't told me what problem you want to solve -
you've told me the
technique you wish to use to solve the problem.

This is a bit "back to front" - I'd like to *start* with the
problem then figure out how to solve
it - not *start* with the idea that you want FSMs and events and so on
and not tell me what the
problem is.

I can't possibly recommend any technology without knowing what the problem is.

Regarding FSM in erlang - these are easy:

     state1() ->
          receive
               event1 -> state2();
               event2 -> state3()
          end.

     state2() ->
          receive
               event3 -> state4()
               %% ..
          end.

     whatever - not really worth getting grey hair over
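
A minimal runnable sketch of the same pattern, assuming a hypothetical
turnstile with two states (locked, unlocked) and two events (coin, push).
The module name, the events and the state/1 helper are invented for
illustration and are not part of any particular problem:

```erlang
-module(turnstile).
-export([start/0, state/1]).

%% Hypothetical example: each state is a function, each event a message.
start() -> spawn(fun locked/0).

%% Ask the FSM process which state it is in (illustrative helper).
state(Pid) ->
    Pid ! {state, self()},
    receive
        {Pid, S} -> S
    after 1000 -> timeout
    end.

locked() ->
    receive
        coin          -> unlocked();
        push          -> locked();     % ignored: still locked
        {state, From} -> From ! {self(), locked}, locked();
        stop          -> ok
    end.

unlocked() ->
    receive
        push          -> locked();
        coin          -> unlocked();   % ignored: already unlocked
        {state, From} -> From ! {self(), unlocked}, unlocked();
        stop          -> ok
    end.
```

Each state handles every event explicitly, so unexpected messages are
consumed rather than left waiting in the mailbox.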

Event driven programming is - with Erlang spectacles on - not a wonderful idea.

Events are a poor substitute for concurrency and non-blocking reads.
If I had a language with a) only one thread of execution and b)
blocking I/O then you
know what I'd do - I'd invent event based programming.

Let me explain what I think happened ...

Once upon a time there was a sequential processor, that could run only
one sequential program.

The program did the following

     a:   start -> read -> compute -> write -> compute -> read ->
compute ... -> stop

ie it interleaved computation with reads and writes.

    If the reads blocked - who cared - it's only doing one thing - a
consequence of this
architecture was "the program should not crash" - since there is only
one thread if it crashes
you have a BIG problem - so we invent "defensive programming"

   Now we want to do TWO or more things at the same time:

     a:   start -> read -> compute -> read -> compute -> read ->
compute ... -> stop
     b:   start -> read -> read -> write -> compute -> read ...

Now - horrors - read stops the system - it blocks, and writes are
problematic when two things
might write the same object at the same time.

Enter - the big global write lock and event based programming

We transform a: into

    a: start -> read and on_read_complete(a1)
    a1: compute -> read and on_read_complete(a2)
    a2: compute -> read and on_read_complete(a3)

etc. and the same for b - the result is the software is a big mess

look at Javascript

   $('#result').load('ajax/test.html', function() {
       alert('Load was performed.');
   });

This is typical jQuery JavaScript code

     load("blaa", fun)

means read blaa and when the read completes call fun.

Now look what happened:

Simple code like

   a: start -> read -> compute -> write -> read ->

got broken into a mess of callbacks and you need a TOOL to help you - enter
FSM modeling. The callback style a -> ... read, on_read_complete(a1) becomes
a total mess and needs a graphic tool to solve a problem that should never
have occurred in the first place.

If read is NON blocking (ie does not block other processes) then you
don't need callbacks.

What about crashing?

When you have ONE thread crashing is a BIG DEAL

But now we have millions of threads - who cares if one crashes - nobody
- "let it crash"

But wait a moment - what about fault tolerance then? - "let some other guy fix the error"

Who? some other process. Which process? - enter the link - "the linked
processes"

So now we can chuck the idea of defensive programming - don't do it.
Bye bye to 80% of your code that did error checking :-)
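
A small sketch of the link idea, assuming a hypothetical worker that
parses its input with no error checking at all. The module and function
names are made up for illustration; a real system would more likely use
an OTP supervisor than hand-written code like this:

```erlang
-module(let_it_crash).
-export([demo/0]).

%% Hypothetical demo: the worker does no defensive checking;
%% the linked process sees the crash and decides what to do.
%% Note: this leaves trap_exit set in the calling process.
demo() ->
    process_flag(trap_exit, true),   % exit signals arrive as messages
    Pid = spawn_link(fun() -> worker("not a number") end),
    receive
        {'EXIT', Pid, Reason} ->
            {worker_crashed, Reason}
    after 1000 ->
        timeout
    end.

%% Crashes with badarg when Input is not a valid integer string.
worker(Input) ->
    list_to_integer(Input).
```

Calling let_it_crash:demo() returns a {worker_crashed, Reason} tuple,
where Reason carries the badarg error and a stack trace. The worker
contains only the happy path; the decision about what to do on failure
lives in the linked process.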

Erlang encourages a different mind-set - you can have hundreds of
thousands of processes
each running sequential, easy-to-write code

    a: start -> read -> compute -> read -> read -> write

and not a FSM or event framework on the horizon
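
As a rough illustration, here is a sketch that spawns N processes, each
running plain sequential read -> compute -> write code. The module name,
the message shapes and the "multiply by two" computation are assumptions
chosen only to keep the example small:

```erlang
-module(many).
-export([run/1]).

%% Hypothetical sketch: N processes, each with straight-line code.
run(N) ->
    Parent = self(),
    Pids = [spawn(fun() -> worker(Parent) end) || _ <- lists:seq(1, N)],
    %% Hand each worker its input.
    lists:foreach(fun({I, Pid}) -> Pid ! {read, I} end,
                  lists:zip(lists:seq(1, N), Pids)),
    %% Collect one result per worker and add them up.
    lists:sum([receive {Pid, Result} -> Result end || Pid <- Pids]).

worker(Parent) ->
    X = receive {read, V} -> V end,  % read: blocks only this process
    Y = X * 2,                       % compute
    Parent ! {self(), Y}.            % write
```

For example, many:run(1000) starts a thousand workers and returns the
sum of their results. No worker contains a callback or an explicit state
table; a blocked receive in one process does not hold up the others.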

Try it - since (again) I don't know what your problem is I can't
really make more than generic observations - but from the Erlang point
of view events and so on are a very "70's" way of programming.

Cheers

/Joe

>
> Rusi
>
>