[erlang-questions] OOP in Erlang

Wed Aug 11 14:44:47 CEST 2010

On Wed, Aug 11, 2010 at 12:24 PM, Ulf Wiger
<ulf.wiger@REDACTED> wrote:

I simply cannot resist replying to this one :)

> It's certainly useful to have the slightly abstracted discussion:
> What types of programs are a breeze to write in language X, but
> cumbersome in Erlang? What parts of Erlang make it cumbersome?
> Could they be fixed? Do we care enough to fix it?

I think your last paragraph is the most important. A language is
defined by what you leave out rather than what you let slip in. So it
is entirely possible to punt on certain problems and eschew their
addition to the programming language in question. There are two kinds
of programs I still prefer to write in OCaml rather than Erlang:

1) Programs which rely heavily on algebraic datatypes (ADTs).
That is, symbolic manipulation. A struct where the first field is
tagged with an atom() type simply doesn't feel the same as having
a rigorously defined variant type with fast member access.

2) Programs with a computationally intensive kernel. OCaml stores
arrays of floats unboxed by default, which means that numeric code
actually runs fast in OCaml. Add all the virtues of functional
programming on top of this and you have an imperative/functional
hybrid which rips apart most other programming languages. In Erlang,
one way around this trouble is to use NIFs and write the kernels in C.
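
To make point 1) concrete, here is a minimal sketch of symbolic
manipulation with atom-tagged tuples in Erlang. The module name
expr_demo, the expr() type and the tags num/add/mul are made up for
illustration. The -type declaration documents the variants, but the
compiler does not enforce it; a misspelled tag only shows up at run
time (or possibly under Dialyzer).

```erlang
-module(expr_demo).
-export([simplify/1, eval/1]).

%% Hypothetical expression type: every variant is a tuple whose
%% first element is an atom tag.
-type expr() :: {num, number()}
              | {add, expr(), expr()}
              | {mul, expr(), expr()}.

%% Remove additions of 0 and multiplications by 0 or 1.
-spec simplify(expr()) -> expr().
simplify({add, A, B}) ->
    case {simplify(A), simplify(B)} of
        {{num, 0}, E} -> E;
        {E, {num, 0}} -> E;
        {SA, SB}      -> {add, SA, SB}
    end;
simplify({mul, A, B}) ->
    case {simplify(A), simplify(B)} of
        {{num, 0}, _} -> {num, 0};
        {_, {num, 0}} -> {num, 0};
        {{num, 1}, E} -> E;
        {E, {num, 1}} -> E;
        {SA, SB}      -> {mul, SA, SB}
    end;
simplify({num, _} = E) ->
    E.

%% Evaluate an expression tree to a number.
-spec eval(expr()) -> number().
eval({num, N})    -> N;
eval({add, A, B}) -> eval(A) + eval(B);
eval({mul, A, B}) -> eval(A) * eval(B).
```

An unknown tag such as {sub, A, B} fails with a function_clause error
when it is reached, where an ML-style variant type would reject it at
compile time.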

OCaml doesn't fare well on concurrency, however. It can do parallelism
with some trouble (there is a rather nice MPI interface), but for
concurrency, I'd much rather have Erlang.

The only way to add this to Erlang is to open up Pandora's box. This
box contains the destructive updates, and I would rather that it is not
opened.

[...]

> - Strong static (a la Haskell & ML) typing is praised by many
>  as a great boon to software quality and productivity.

With a channel primitive, it is possible to add strong static typing
to a concurrent language. The upgradeable processes part is far more
interesting, however. One simple way to achieve it is to postpone
type-checking until load time, but you will still need to resolve
types around channels if they change. In principle you need
functionality akin to code_change/3 on the type level - and perhaps
some fairly deep dependency tracking features if more than one module
is loaded. Indeed, the ramifications are deep.

> - Lazy evaluation is superb for some problems, and e.g. QuickCheck
>  relies heavily on it. While it can be done in Erlang, you have
>  to make do with "poor-man's lazy evaluation", rolling your own.
>  Again, going all the way seems to be somewhat at odds with soft
>  real-time characteristics and predictable memory utilization.

Lazy evaluation is also a curse for other problems. Evaluating lazily
is slower, so you need strictness analysis to get it to run fast. If
the strictness analyzers can't "see" what is going on in the program
you end up with a program that in the worst case fills up all of your
memory. In the common case, the program will fill up memory and then
consume it, fill up, consume, ..., and so on. This way of computing is
hard on the memory bandwidth - the scarcest resource we have in modern
computers. Also, writing daemons which do not leak takes time as you
have to go over all the details of strictness in the program. I much
prefer strict evaluation and then the ability to "go lazy" on a
by-need basis.
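
As a rough illustration of going lazy on a by-need basis in a strict
language, here is a small Erlang sketch that uses zero-argument funs as
thunks. The module lazy_demo and its functions are hypothetical names,
and there is no memoization, so forcing the same thunk twice recomputes
it.

```erlang
-module(lazy_demo).
-export([from/1, take/2, force/1]).

%% A thunk is a zero-argument fun; nothing runs until it is forced.
force(Thunk) when is_function(Thunk, 0) ->
    Thunk().

%% Infinite stream of integers starting at N, represented as
%% {Head, TailThunk}. Only the head is computed eagerly.
from(N) ->
    {N, fun() -> from(N + 1) end}.

%% Take the first K elements, forcing only the cells that are needed.
take(0, _Stream) ->
    [];
take(1, {H, _Tail}) ->
    [H];
take(K, {H, Tail}) when K > 1 ->
    [H | take(K - 1, force(Tail))].
```

Everything outside the thunks stays strict, so memory use remains easy
to reason about: only the cells that take/2 forces are ever built.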

Note that rpc:async_call/4 and its cousin rpc:yield/1 define a lazy
promise if the call is *not* executed by a separate thread in the
background. Alice ML uses lazy promises like this for lazy evaluation
and in addition for concurrent background work as in Erlang.
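
For reference, a minimal sketch of the promise-style use of these two
calls. The module promise_demo and the function sum_later/1 are
hypothetical; the rpc calls are the standard ones. In stock OTP,
rpc:async_call/4 starts evaluating in a separate process right away,
so this is a background future rather than a lazy one.

```erlang
-module(promise_demo).
-export([sum_later/1]).

%% Start lists:sum/1 on the local node and fetch the result later.
sum_later(List) ->
    %% Returns a key immediately; the call runs in the background.
    Key = rpc:async_call(node(), lists, sum, [List]),
    %% ... other work could be done here ...
    %% Blocks until the result of the asynchronous call is available.
    rpc:yield(Key).
```

Calling promise_demo:sum_later([1, 2, 3]) blocks in rpc:yield/1 only
for as long as the background call still needs to finish.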

> - Shared-memory concurrency is the cat's miau for some problems,
>  but Erlang carefully stays away from it, since it was designed
>  for problems where shared memory is more trouble than it's
>  worth - not least because all bets are off in terms of fault
>  tolerance if a process with write access to your memory dies
>  in the process of modifying it.

There is one idea here I have been toying with. One problem with Erlang's
memory model is that sending a large data structure, several megabytes
in size, as a capability to another process will mean a copy. In the
default VM setup, that is. But if you had a region into which it got
allocated, then that region could safely be sent under a proof that
the original process will not touch it anymore. It is possible to
infer regions like these (see Tofte/Talpin and later
Niss/Makholm/Henglein) but I would opt for a simpler variant with
explicit mention of regions for starters. If the (albeit cumbersome)
model of explicit region management can't work out, why have any hope
for an automatic variant? There is also a tangent here with
substructural type systems, linear types or uniqueness types in
particular. A destructive update can be allowed if we can prove we are
the only one holding a reference to the data.

But unless such a system works out rather clearly, it is not worth the
hassle of adding. Practical usefulness is more important for Erlang -
so it should be kept in a theoretical toy language first for play, fun,
and profit.

> My own experience from telecoms tells me that OO gives the wrong
> signals to people in complex projects. They tend to want to
> maximize design-time dependencies rather than going with a style
> more similar to electrical component design - black boxes, with
> well-defined interfaces.

This is my experience as well. The trap of OO is to be an architecture
astronaut and define a wild system - only to find that your wrapping
of objects 4-5 times is what kills your Java heap. And garbage
collectors with one big heap greatly dislike having to walk large
amounts of live data...

> The hardest part of designing a language is probably not what to
> include, but what to leave out. Erlang is not a multi-paradigm
> language[2], and shouldn't strive to become one IMHO.

Precisely. It is about choosing a few concepts which can support each
other well with little overlap. And aggressively shaving off
everything that Occam's Razor suggests. For that reason, Joe Armstrong's
idea of removing something whenever you add something is a good one.
Improvement should come from "silent" optimizations, like the
gen_server:call/2,3 mailbox optimization recently added. I for one
have enjoyed that every time a new version of Erlang came out, my
program got faster.

-- 
J.