[erlang-questions] Glossary

Mon Jul 20 16:39:12 CEST 2009

2009/7/20 Richard Andrews <bflatmaj7th@REDACTED>:
> Apologies. This got long. I think it is reasonably accurate.

Since it's just that bit different from other system+language setups,
I think it's worth getting the understanding right. Thanks for your patience.

>>>> Node
<snip/>

>> Node = (approx) a single processor/computer/machine?
>> That's how I had it.  Then one of those is 'the erlang node', i.e. where
>> whole programs are started from.
>
> Hmmm. A node is a program/process started in the operating system on a
> machine. You would see it as a task under Windows task manager or a
> process in UNIX top.

So not necessarily 'one per CPU'? With your definition I can
see I could start several 'programs' from one machine. OK.

> It is a virtual machine program for running
> erlang code in.

That's clear!

> When a node program starts it is configured to
> internally start various erlang processes based on what that node is
> designed to achieve. Erlang nodes can start other nodes, but systems
> don't always emanate from a nucleus. It can be detrimental to fault
> tolerance (which is a big reason for using erlang).

OK, I'm happy with that one.

>
>
>>>> Process
>>>
>>> A light-weight state machine with a mailbox for receiving messages
>>> from other processes or system IO resources. In java you might know
>>> them as green threads. The node schedules processes to run when there
>>> is something for them to do (like a message in the mailbox).
>>
>> This seems the key bit. Yet least natural to get hold of.
>> Receiving messages is a key part.
>
> An erlang program as a whole is event driven.

[A program being a number of processes?]

> An external event (eg.
> IO or timer) will cause a message to arrive at a process (eg. data
> from a socket) which will cause other messages to flow between
> processes, aggregating data, checking what should be done. Some
> processes are connected to the outside world (eg. network socket, file
> handle) and data will flow out of the erlang node via those.

I'm clear on the idea of messaging as the way events are implemented.

>
> Erlang was designed for highly asynchronous applications with lots of
> partially completed tasks running concurrently and safely (not
> interfering with each other). Because erlang processes CANNOT access
> the information stored in any other process, the only way to get to
> that info is to ask nicely and wait for the response message.

I like that clean interface. One of the Erlang differences I guess.
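
A minimal sketch of the "ask nicely and wait for the response message"
pattern described above. The module name counter_sketch, the message
shapes and the one-second timeout are assumptions made for illustration;
they do not come from the thread. The count lives only inside the spawned
process, so the only way for anyone else to read it is to send a request
and wait for the reply.

```erlang
-module(counter_sketch).
-export([start/0, increment/1, value/1, loop/1]).

%% Start a process whose only state is an integer count.
start() ->
    spawn(?MODULE, loop, [0]).

%% Fire-and-forget: ask the process to bump its count.
increment(Pid) ->
    Pid ! increment,
    ok.

%% Ask for the current count and wait for the reply message.
%% The unique reference lets us match the reply to this request.
value(Pid) ->
    Ref = make_ref(),
    Pid ! {value, self(), Ref},
    receive
        {Ref, Count} -> Count
    after 1000 ->
        timeout
    end.

%% The state is carried in the argument of this recursive loop.
%% No other process can reach it except by sending a message.
loop(Count) ->
    receive
        increment ->
            loop(Count + 1);
        {value, From, Ref} ->
            From ! {Ref, Count},
            loop(Count)
    end.
```

Calling counter_sketch:start(), then increment/1 twice, then value/1
should hand back the count held by the process. Messages sent from one
process to another arrive in the order they were sent, which is why the
two increments are seen before the query.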

> This
> might seem inefficient to a C coder

I've done similar things in Assembler, and suffered the failures ;-)

>  * Because state cannot be shared, when a process suffers a terminal
> fault (eg. unhandled situation), only that small process is killed.
> The rest of the system can be guaranteed to be unaffected and keeps on
> ticking. So what would typically be a segfault or abort in a C program
> becomes an internal process restart. Errors are contained and the
> system keeps running.

I read that in the O'Reilly book. I've yet to play with it. Propagating
errors to an appropriate handler level makes sense.
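
A small sketch of the error containment described above, assuming a
hypothetical module crash_sketch. A worker process exits with an
unhandled-situation reason; the process that spawned it is told about
the death through a monitor message and carries on running. Restarting
the worker is not shown here; in OTP that job normally belongs to a
supervisor.

```erlang
-module(crash_sketch).
-export([run/0]).

%% Spawn a worker that dies straight away, observe its death through
%% a monitor, and show that the calling process is still running.
run() ->
    {Pid, MonRef} = spawn_monitor(fun() -> exit(unhandled_situation) end),
    receive
        %% The runtime delivers this message when the monitored
        %% process terminates, together with its exit reason.
        {'DOWN', MonRef, process, Pid, Reason} ->
            {worker_died, Reason, caller_alive}
    after 1000 ->
        timeout
    end.
```

The failure reaches the caller as an ordinary message, so the "handler
level" is simply whichever process chose to monitor (or link to) the
worker.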

>> Is it true that a process is generally a single module ('a chunk of code')?
>> Even if code from other modules is used?
>> And what's the relationship between scheduling, the VM and processes/modules?
>
> A process is started by specifying to the erlang node the
> module+function+arguments to start the process running (called
> spawning). This is often abbreviated to MFA
> (Module/Function/Arguments). A process runs until the code calls the
> exit function (or gets killed).

Or hangs waiting for a message?
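
A sketch of spawning by Module/Function/Arguments, assuming a
hypothetical module mfa_sketch. It also illustrates the question above:
a process sitting in receive with an empty mailbox is suspended, not
finished. It stays alive and uses no CPU until a matching message
arrives. A process also ends normally when the function it was spawned
with returns, without any explicit call to exit.

```erlang
-module(mfa_sketch).
-export([start/1, greeter/1]).

%% spawn/3 takes the Module, the Function and the list of Arguments.
start(Name) ->
    spawn(?MODULE, greeter, [Name]).

%% The process waits here until a message arrives.
greeter(Name) ->
    receive
        {hello, From} ->
            From ! {greeting, Name},
            greeter(Name);
        stop ->
            %% Returning from the initial function ends the process.
            ok
    end.
```

The function has to be exported for spawn/3 to find it, which is why
greeter/1 appears in the export list.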

> OTP (which you are probably using)
> provides some common boiler-plate process templates like gen_server.
> These use one module to drive core process behaviour. So in this
> respect you are correct. And yes from that core module the code can
> run code from other modules.

Thanks.
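
A minimal gen_server sketch, assuming a hypothetical key/value module
kv_sketch. The names store/3 and fetch/2 and the use of a map for the
state are choices made for illustration only. One module drives the
behaviour of the process, while the callbacks call into another module
(maps) to do the actual work. Only the required callbacks are defined,
which a recent OTP release accepts; older releases may warn about the
missing optional ones.

```erlang
-module(kv_sketch).
-behaviour(gen_server).

%% Client API
-export([start_link/0, store/3, fetch/2]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Asynchronous request: returns ok without waiting for the server.
store(Pid, Key, Value) ->
    gen_server:cast(Pid, {store, Key, Value}).

%% Synchronous request: blocks until the server replies.
fetch(Pid, Key) ->
    gen_server:call(Pid, {fetch, Key}).

%% The process state is a map, held only by the server process.
init([]) ->
    {ok, #{}}.

handle_call({fetch, Key}, _From, State) ->
    %% maps:find/2 returns {ok, Value} or the atom error.
    {reply, maps:find(Key, State), State}.

handle_cast({store, Key, Value}, State) ->
    {noreply, maps:put(Key, Value, State)}.
```

The receive loop, reply matching and timeouts that would otherwise be
written by hand are supplied by the gen_server boiler-plate.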

>
> The erlang VM is like an OS kernel: it receives events when there is work
> for a process to perform (eg. timer expires, file descriptor/handle
> gets data, etc) and manages swapping CPU time between the internal
> erlang processes to achieve this goal. Modules are just containers of
> code which processes call as they run. You can think of modules as
> shared libraries that all the erlang processes can use.

I'm good with that. In my background I've written schedulers for real-time
systems which implement that 'kernel' type of functionality. I guess I can
ignore the VM when thinking about design and just let it do its thing.

>
>
> Unsolicited advice: Erlang programming syntax can seem just plain
> wrong to a newcomer.

I'll settle for 'odd' :-) Having played with Scheme and Lisp, it isn't
much worse!

> If you hit one of these issues where you think
> erlang must be the dumbest most inefficient language on earth, ask
> about it. Erlang is different, but it works the way it does for very
> good reasons.

It seems to have that quirkiness that comes from doing a specific job?

The O'Reilly book contains quite a bit of 'experience' put forth as advice
and some solid logic as to why some things are as they are.

I'm still quite intrigued by it.

Again, thanks Richard.

regards

-- 
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk