Erlang philosophy explained (was Re: Joe's "deep trickery" )
Chris Pressey
cpressey@REDACTED
Sat Mar 1 02:56:38 CET 2003
On Fri, 28 Feb 2003 17:15:44 +0100 (CET)
Joe Armstrong <joe@REDACTED> wrote:
> On Fri, 28 Feb 2003, Chris Pressey wrote:
>
> > But I've been using a very much simpler method for starting generic
> > TCP/IP servers, roughly:
> >
> ... cut ...
>
> > This version seems to work ok for me, and even with those
> > improvements, it could hardly be called deep trickery; so in deference
> > to Joe's wisdom, and knowing how much of a fan he is of simplicity, I
> > must assume either:
> >
> > a) mine has a fatal flaw or flaws that I can't presently perceive, or
> > b) Joe forgot the word "efficiently" before "arrange" in his sentence
> >
>
> Neither really. If you compare the two you'll find our code is
> pretty similar - I have a bit of extra stuff so I can limit the
> maximum number of simultaneously open sockets, and close down all
> sockets etc.
>
> I think if you added this to your code they'd end up being pretty
> similar in length - they *should* be similar since they solve the same
> problem.
Yes, I realize now, after thinking about how to code the max servers
thing, that there appears to be no super simple way like I thought there
might be.
I still think that the warning in tcp_server, without any explanation of
what is going on, is pretty harsh for a tutorial, though.
> > My concern is mainly that an intimidating module like tcp_server.erl
> > could scare off new Erlang programmers by giving them the impression
> > that working with TCP/IP in Erlang is a bit nasty. It's not nasty at
> > all, at least I don't think it is unless you want to push the
> > envelope.
>
> Perhaps for the tutorial I should put in the simpler version.
Yes :) If you could include a simpler version in the body of the
tutorial, and say something about how tcp_server basically boils down to
it, I think that would make it mind-blowingly approachable.
>
> > This is otherwise a fine tutorial. I like it.
> >
> > Middlemen gets me thinking: are there generic consumer/producer
> > patterns that can be packaged? I find I'm writing a lot of processes
> > along the lines of: accept data in arbitrary-sized chunks from one
> > process and dispense it in chunks of some calculated size to another
> > process, sometimes reformatting it on the way.
>
> yes yes yes ^ 100
>
> Congratulations you have discovered the Erlang philosophy
Cool :)
I guess what tripped the switch for me was seeing that any problem can be
decomposed into a bunch of middlemen with state - that is, a middleman
that translates carriage returns to linefeeds isn't very interesting, but
one that buffers data until the next linefeed, then sends a whole line, is
much more interesting.
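
A minimal sketch of such a line-buffering middleman, assuming chunks arrive
as {From, {data, Chunk}} with Chunk a string and the end of input as
{From, eof}. The module name, the function names and the message shapes are
made up for illustration; they are not taken from tcp_server or the tutorial.

```erlang
-module(line_buffer).
-export([start/1, split_lines/1]).

%% Spawn a middleman that forwards only complete lines to Out.
start(Out) ->
    spawn(fun() -> loop(Out, []) end).

%% Buffer holds the characters received since the last linefeed.
loop(Out, Buffer) ->
    receive
        {_From, {data, Chunk}} ->
            {Lines, Rest} = split_lines(Buffer ++ Chunk),
            lists:foreach(fun(L) -> Out ! {self(), {line, L}} end, Lines),
            loop(Out, Rest);
        {_From, eof} ->
            %% Flush a trailing partial line, if any, then pass eof on.
            case Buffer of
                [] -> ok;
                _  -> Out ! {self(), {line, Buffer}}
            end,
            Out ! {self(), eof}
    end.

%% Split a string into its complete lines (linefeeds removed) and
%% the unterminated remainder.
split_lines(Str) ->
    split_lines(Str, [], []).

split_lines([$\n | T], Cur, Acc) ->
    split_lines(T, [], [lists:reverse(Cur) | Acc]);
split_lines([C | T], Cur, Acc) ->
    split_lines(T, [C | Cur], Acc);
split_lines([], Cur, Acc) ->
    {lists:reverse(Acc), lists:reverse(Cur)}.
```

The only state is the partial line, carried as an argument of loop/2; the
consumer never sees how the data happened to be chunked on the way in.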
> Let me reformulate what you said in another way to clarify this.
>
> > Middlemen gets me thinking: are there generic consumer/producer
> > patterns that can be packaged? I find I'm writing a lot of processes
> > along the lines of: accept data in arbitrary-sized chunks from one
> > process and SEND IT AS AN ERLANG MESSAGE TO ANOTHER PROCESS
>
> I'm kicking myself here, this way of programming was so obvious to
> me that I never explicitly wrote it down. I always used to *say* it
> when giving lectures but never actually committed it to paper.
>
> The Erlang "philosophy" is "everything is an Erlang process"
>
> Remember, Erlang processes share no data and only interact by
> exchanging Erlang messages.
>
> So if you have a non-Erlang thing you should fake it up so that the
> other things in the system think that it *is* an Erlang process.
>
> Then everything becomes ridiculously easy.
>
> That's where the middle-man comes in:
>
>
> Back to my tutorial. A web server is like this:
>
>              +---------------------+                  +--------+
> ------>------| Middle man          |--------->--------| Web    |
>  TCP/packets | defragments packets |  {get,URL,Args}  | server |
>              | parse HTTP requests |                  |        |
> ------<------| and formats HTTP    |---------<--------|        |
>              | responses           |  {Header,Data}   +--------+
>              +---------------------+
>
> The middle man turns the HTTP data stream (where TCP can fragment the
> packets) into a nice stream of fully parsed Erlang terms.
>
> An HTTP/1.0 server is trivial:
>
>   server() ->
>       receive
>           {From, {get,URL,Args}} ->
>               Response = process_get(URL, Args),
>               From ! {self(), Response}
>       end.
>
> And an HTTP/1.1 server with keep-alive sockets
>
>   server() ->
>       loop().
>
>   loop() ->
>       receive
>           {From, {get,URL,Args}} ->
>               Response = process_get(URL, Args),
>               From ! {self(), Response},
>               loop()
>       after 10000 ->
>           exit(timeout)
>       end.
>
> Which is *very* clear and easy to write etc.
Yes.
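
To see the message protocol run end to end, here is a rough sketch of the
keep-alive loop together with the call a middleman might make once it has
parsed a request. The module name, request/3 and the body of process_get/2
are placeholders of mine, not code from the tutorial; the reply is given a
{Header, Data} shape only because that is what the diagram shows.

```erlang
-module(web_sketch).
-export([start/0, request/3]).

%% Start one server process (one per connection in the real thing).
start() ->
    spawn(fun loop/0).

%% What a middleman would do after parsing a complete HTTP request:
%% send it as a term and wait for the reply from that server.
request(Server, URL, Args) ->
    Server ! {self(), {get, URL, Args}},
    receive
        {Server, Response} -> Response
    after 10000 ->
        exit(timeout)
    end.

%% The keep-alive loop: serve requests until idle for 10 seconds.
loop() ->
    receive
        {From, {get, URL, Args}} ->
            From ! {self(), process_get(URL, Args)},
            loop()
    after 10000 ->
        exit(timeout)
    end.

%% Placeholder: a real server would look the URL up somewhere.
process_get(URL, _Args) ->
    {[{status, 200}], "You asked for " ++ URL}.
```

Nothing in loop/0 knows about sockets or packet boundaries; that is all the
middleman's business.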
> If you munge these into a single process you get an unholy mess
> (this is what I call getting the concurrency model wrong) - using one
> process per connection is simple obvious and highly efficient (as I've
> said earlier YAWS beats the socks off Apache)
>
> <aside> - in a sequential language you are virtually *forced* to get
> the concurrency model wrong - remember the world *is* parallel, in the
> world things really do happen *concurrently* and trying to program
> concurrent things in a sequential language is just plain stupid -
> often the biggest mistake people make in Erlang is not using enough
> processes - the best code maps the concurrent structure of the problem
> 1:1 onto a set of processes.
>
> If you think about the web server - when a server has 12,456
> simultaneous connections there are actually at that instant in time
> 12,456 clients connected to the server, and 12,456 people are staring
> at the screen waiting for an answer - kind of scary really :-) - this
> problem should at this point of time have spawned exactly 24,912
> processes to handle this (which is why you can't do it in Java or
> anything that eventually creates an OS process to do this)
> </aside>
>
> Look what we've done here, we've kind of "lifted" the abstraction
> level of a device driver.
>
> In unix things are nice because *everything* is a producer or
> consumer of flat streams of bytes - sockets and pipes are just the
> plumbing that carry the data from a producer to a consumer.
>
> In Erlang the data level is lifted: instead of a flat stream of bytes,
> everything is an object of type "term", but *no parsing or deparsing is
> necessary* and no fragmentation of the term can occur.
>
> We might like to ask what a unix pipe:
>
> cat <file1 | x | y | z > file2
>
> might look like in Erlang.
>
> This is surely 4 processes linked together
>
> cat is a process which sends a stream of
>
> {self(), {line, Str}}
>
> followed by a stream of
>
> {self(), eof}
>
> messages
>
> x and y are processes that look like
>
>   loop(In, Out) ->
>       receive
>           {In, Msg} ->
>               ...
>               Out ! {self(), Msg2},
>               loop(In, Out)
>       end.
>
>
> etc.
>
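
A minimal sketch of that pipeline, under some assumptions of mine: cat reads
from a list of strings rather than a file, each filter is an ordinary fun
applied to one line at a time, and the calling process plays the part of
file2. The module and function names are hypothetical. Unlike loop(In, Out)
above, the stages here do not check who the sender is, which keeps the
wiring short.

```erlang
-module(pipe_sketch).
-export([run/2]).

%% Push Lines through one filter process per fun in Funs and
%% return the lines that come out of the far end.
run(Lines, Funs) ->
    Sink = self(),
    %% Build the chain back to front so every stage knows its Out.
    First = lists:foldr(
              fun(F, Out) -> spawn(fun() -> stage(F, Out) end) end,
              Sink, Funs),
    spawn(fun() -> cat(Lines, First) end),
    collect([]).

%% The producer: a stream of line messages followed by eof.
cat(Lines, Out) ->
    lists:foreach(fun(L) -> Out ! {self(), {line, L}} end, Lines),
    Out ! {self(), eof}.

%% A filter: transform each line, pass eof along and terminate.
stage(F, Out) ->
    receive
        {_In, {line, Str}} ->
            Out ! {self(), {line, F(Str)}},
            stage(F, Out);
        {_In, eof} ->
            Out ! {self(), eof}
    end.

%% The consumer: gather lines until eof arrives.
collect(Acc) ->
    receive
        {_In, {line, Str}} -> collect([Str | Acc]);
        {_In, eof} -> lists:reverse(Acc)
    end.
```

For example, pipe_sketch:run(["ab", "cd"], [fun lists:reverse/1]) sends each
line through a single reversing stage. Lines come out in order because
messages between any one pair of processes are delivered in the order they
were sent.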
> All of this makes me wonder if perhaps the modules-with-APIs way of
> programming is wrong.
Well, I definitely have some thoughts on APIs that coincide with that.
Current conceptions of what makes an API are crude. What you generally
have is a list of entry points (names of synchronous function calls) with
the number of arguments and their types for each. This is, I think, because
it fits in well with current systems-construction linker technology.
But it's not as powerful as it could be if it were to provide more
information (such as the complexity of the exported functions) and to
provide it in a more flexible way (in patterns, which may or may not be
synchronous, i.e. like Erlang messages.)
> Perhaps we should be thinking more in terms of abstractions that
> allow us to glue things together with pipes etc.
>
> This seems to be related to my "bang bang" notation but I haven't yet
> made the connection - I'm still thinking about it.
>
> > Is there something like a
> > gen_stream that I've overlooked in OTP?
>
> No
>
> > then I start thinking: why the hell do I want more gen_* modules when
> > I rarely ever use the existing ones? For better or worse, I usually
> > build my own with receive and !, which I find easier to read (at least
> > while coding,) with the assumption that some day, if it becomes really
> > important, I'll rewrite them to be gen_*'s. So I sat myself down the
> > other day and forced myself to write one each gen_server, gen_event
> > and gen_fsm, to practice.
>
> Me too :-) The gen_ things were put together for projects with lots
> of programmers in the same team - without gen_server (say) in a
> 20-programmer project we'd end up with 20 ways of writing a server -
> using one way means the people can understand each other's code.
Yes, I guess I can see how any convention would help with that.
> For small projects you can happily "roll your own"
And for that, it's just as important to know how they work, that is, to
know what they look like (that is, to be able to recognize a design
pattern instead of just knowing how to use the off-the-shelf
implementation of the pattern.) I take that point well now...
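
For what it's worth, here is roughly the shape I mean, hand-rolled with
receive and !. It is only a sketch of the call/reply loop that a generic
server packages up; the module name and the Handler convention (a fun from
Request and State to {Reply, NewState}) are my own assumptions, and the real
gen_server does a good deal more than this.

```erlang
-module(mini_server).
-export([start/2, call/2, stop/1]).

%% Start a server. Handler is a fun(Request, State) -> {Reply, NewState}
%% that supplies the specific behaviour; the loop below is the generic part.
start(Handler, InitState) ->
    spawn(fun() -> loop(Handler, InitState) end).

%% Synchronous call: send the request, then wait for the reply
%% tagged with our unique reference.
call(Server, Request) ->
    Ref = make_ref(),
    Server ! {call, self(), Ref, Request},
    receive
        {Ref, Reply} -> Reply
    after 5000 ->
        exit(timeout)
    end.

%% Ask the server to terminate; does not wait for it to do so.
stop(Server) ->
    Server ! stop,
    ok.

%% The server loop: state lives in the argument, never in shared data.
loop(Handler, State) ->
    receive
        {call, From, Ref, Request} ->
            {Reply, NewState} = Handler(Request, State),
            From ! {Ref, Reply},
            loop(Handler, NewState);
        stop ->
            ok
    end.
```

A counter is then just mini_server:start(fun(incr, N) -> {N + 1, N + 1} end, 0)
followed by mini_server:call(Pid, incr).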
-Chris