[erlang-questions] Newbie - Erlang VS other functional languages

Sat Jan 17 20:46:25 CET 2015

Hello,

Wow! Thanks all for the terrific insights. It will take me a while to digest all this wisdom and apply it to my current project.

Over the next few days I'll try to organize, summarize, and re-post these ideas for the benefit of the noobies who follow.

Best wishes to all,

Lloyd

-----Original Message-----
From: "Garrett Smith" <g@REDACTED>
Sent: Friday, January 16, 2015 2:28pm
To: "Lloyd R. Prentice" <lloyd@REDACTED>
Cc: "Jesper Louis Andersen" <jesper.louis.andersen@REDACTED>, "erlang-questions" <erlang-questions@REDACTED>
Subject: Re: [erlang-questions] Newbie - Erlang VS other functional languages

Web apps naturally federate activities into separate threads - each one handling a request. You already have a lot of isolation in effect off the bat - no HTTP request should be able to (directly) impact another.
The material benefit is that your app can handle a *lot* of concurrent access without becoming corrupt. You can even be sloppy and still probably get away with it. This is *not* true of say a Java app - if you don't pay very close attention to the way memory is allocated and accessed, a long running app like a web server will fall over eventually.
This btw is why you still see advocates of CGI style web apps (fork exec). They introduce substantial overhead by using separate OS processes but they're *stable* in the face of load, sloppiness, bugs, unexpected events. Given that web apps are trivial to scale horizontally, people who prefer their software to run without falling over will opt for the extra cost per HTTP request.

That part's easy - you're already getting it. A win!
As for the rest, you're also already getting it :) You have no choice here. Sorry. You cannot create the ball-of-twine that often emerges from a monolithic app managing a single shared heap. Another win!
As for the rest... just solve the problems that you have! You have expressed angst and wonder and hope. These are all good and important topics but they're not problems to your app.
What's the next thing you need to do for your app? Provide a specific problem and it will be easier to elaborate some options - and then pick the one that is the most sensible.
I think one of the reasons Erlangers don't talk a lot about architecture is that process oriented apps can evolve very happily without much forethought. That's true of systems. As much as we like to fancy ourselves as architects and designers, good systems evolve incrementally. If you have the right underlying abstractions to support evolution, you'll get that without trying. I believe Erlang provides those abstractions:
- Isolated processes
- Links (enable supervision and a cascading process death)
- OTP managed processes (servers/services and supervisors)
- OTP apps (system startup and upgrades)
Okay, that's all high level and more of the stuff you've heard and I know it doesn't answer your questions :)
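
To make the "links" item in the list above a little more concrete, here is a minimal sketch of a parent process that links to a worker and is told when the worker dies. The module and function names (link_demo, run/0) are hypothetical, made up for illustration. A real system would normally let an OTP supervisor do this job rather than trapping exits by hand.

```erlang
-module(link_demo).
-export([run/0]).

%% Start a linked worker that crashes, and report the exit signal the
%% parent receives. Names here are illustrative only.
run() ->
    %% Trap exits so the link delivers a message instead of killing us.
    OldFlag = process_flag(trap_exit, true),
    Pid = spawn_link(fun() -> exit(boom) end),
    Result = receive
                 {'EXIT', Pid, Reason} -> {worker_died, Reason}
             after 1000 ->
                 timeout
             end,
    %% Restore the previous setting so the caller is left unchanged.
    process_flag(trap_exit, OldFlag),
    Result.
```

Without trap_exit the crash would propagate to the parent as well, which is the "cascading process death" that supervision trees rely on.
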
Starting with your app, each HTTP request handler is running in a separate process. That's how they run concurrently. Each handler will want to do something: serve a page, read from a database, write to a database, etc. You ought to provide a nice API for each of the things these handlers do. The API will be implemented via functions that live in a module. You have no choice.
The decision then becomes how are these functions implemented. Do they just crank through sequential Erlang without sending or receiving a single message? That's your best bet if you can swing it - a nice side effect free function! A good example of that would be serving a web template. Using a great library like Erlydtl you'd just call your compiled template module - it will render the data and off it goes to the client. There's no gen_server here. It's just each HTTP request handler doing work alongside the other. Super scalable. Super simple.
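
As a rough sketch of the "just crank through sequential Erlang" case, the function below stands in for a compiled template module. It is not Erlydtl's API - the module name page_view and the function render_greeting/1 are hypothetical - but it shows the shape: no process, no messages, same input gives same output.

```erlang
-module(page_view).
-export([render_greeting/1]).

%% Pure function: no messages sent or received, no shared state.
%% Every request handler process can call it directly, in parallel.
render_greeting(Name) when is_binary(Name) ->
    iolist_to_binary([<<"<h1>Hello, ">>, escape(Name), <<"!</h1>">>]).

%% Escape the few characters that matter for HTML text, byte by byte.
escape(Bin) ->
    << <<(escape_char(C))/binary>> || <<C>> <= Bin >>.

escape_char($<) -> <<"&lt;">>;
escape_char($>) -> <<"&gt;">>;
escape_char($&) -> <<"&amp;">>;
escape_char(C)  -> <<C>>.
```

Because nothing is shared, a thousand handlers calling render_greeting/1 at once need no coordination at all.
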
But now accessing a database...
This depends on the database library you're using and whether or not its API supports concurrent access. I imagine most would - but via what? There'll be something that is shared here - a connection, a pool, a registered process. You should understand what that thing is and then consider how all your HTTP request handlers will access it. If you're sharing a single connection, for example, that can only manage a single transaction at a time, obviously you don't want each request piling onto the same transaction. In that case you'll be either serializing access to this connection (you'd use a gen_server for that - or an e2 service) or use multiple connections via a connection pool (which also requires serialized access, again via a gen_server/e2 service).
But really, if you get this wrong, you're going to run smack dab into a very concrete problem that you *must* solve to move forward. You don't really need to design this stuff. Once you run into a few cases of these problems, you're going to recognize them and know how to fix them.
I *think* it comes down to this: do you need to serialize access to something? If yes, your API sits in front of a process, which implements the functionality safely in the server loop (no concurrent access). If no, your API implements the functionality directly.
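
A minimal sketch of the "API sits in front of a process" case, assuming a single shared resource that must not be used concurrently. Everything named here is hypothetical (the module db_gateway, the functions store/2 and lookup/1), and a plain map stands in for the real database connection so the example runs on its own.

```erlang
-module(db_gateway).
-behaviour(gen_server).

%% Public API: this is what the HTTP request handlers call.
-export([start_link/0, store/2, lookup/1]).
%% gen_server callbacks.
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

store(Key, Value) ->
    gen_server:call(?MODULE, {store, Key, Value}).

lookup(Key) ->
    gen_server:call(?MODULE, {lookup, Key}).

init([]) ->
    %% A real implementation would open the shared connection here.
    %% A map stands in for it in this sketch.
    {ok, #{}}.

%% Calls are handled one at a time in the server loop, so the
%% "connection" is never touched by two requests at once.
handle_call({store, Key, Value}, _From, Conn) ->
    {reply, ok, Conn#{Key => Value}};
handle_call({lookup, Key}, _From, Conn) ->
    {reply, maps:find(Key, Conn), Conn}.

handle_cast(_Msg, Conn) ->
    {noreply, Conn}.
```

The handlers only ever see store/2 and lookup/1. Whether those are backed by one process, a pool, or no process at all can change later without touching the callers.
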
More...
Use OTP (or e2) and run your processes under supervision.
If your processes are short lived, use a simple-one-for-one supervisor (e2 task supervisor + tasks) to start them.
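
For the short-lived case, a sketch of a simple_one_for_one supervisor using plain OTP (not e2) might look like the following. The module name task_sup and the functions start_task/1 and run_task/1 are hypothetical, and the map-based supervisor flags and child spec assume OTP 18 or later.

```erlang
-module(task_sup).
-behaviour(supervisor).

-export([start_link/0, start_task/1]).
-export([init/1]).
-export([run_task/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Start one short-lived task under the supervisor.
start_task(Fun) when is_function(Fun, 0) ->
    %% The argument list is appended to the child spec's start args.
    supervisor:start_child(?MODULE, [Fun]).

init([]) ->
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 5,
                 period => 10},
    ChildSpec = #{id => task,
                  start => {?MODULE, run_task, []},
                  %% temporary: a finished or crashed task is not restarted.
                  restart => temporary},
    {ok, {SupFlags, [ChildSpec]}}.

%% Child start function: runs in the supervisor, so the new process
%% is linked to it, and must return {ok, Pid}.
run_task(Fun) ->
    {ok, proc_lib:spawn_link(Fun)}.
```

A request handler would then call something like task_sup:start_task(fun() -> do_work() end), where do_work/0 is whatever the task needs to do.
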
Don't solve problems you don't have. In particular, don't add or use something unless you have to. That goes for OTP apps. Just use one. Don't implement functionality behind a gen_server unless you know why you're doing that (e.g. to strictly control access to something). Don't use a separate db for each customer unless you know why you need that. Etc.
I don't know, I'm running out of ideas here :)

On Fri, Jan 16, 2015 at 12:31 PM, Lloyd R. Prentice <lloyd@REDACTED> wrote:
Ah Professor Smith,

 Once again I must confess my ignorance in hope that you can bring clarity. I've been working happily with sequential Erlang now for several years--- most recently an ambitious Nitrogen app. And I've heard from the beginning the virtues of distributed concurrent processes happily chattering among themselves.

 But where the rubber meets the road, e.g. as I design my current app, I just don't get how to transform these virtues from virtual to real.

 Conceptually, I understand spawned processes, message passing, and supervisors. I've read and reread the Erlang canon. I've built several gen-servers and have worked my way through your e2 tutorial. It all makes good sense.

 But as I consider these principles in context of the architectural design of my current app, it's not at all clear how to apply them.

 The goal of my webapp is to deliver a set of data-based "tools" to "users" (wouldn't it be nice if we had many of these elusive critters) where each user owns his/her own data.

 In my case, users are non-technical author-publishers where each has a personal page for accessing tools and managing data related to project, marketing, and business management.

 Best I can tell, Cowboy and Nitrogen deal with "connections" and "sessions" under the hood, so I don't have to worry about them. But I struggle with such questions as:

 -- Should each "tool" be implemented as a gen-server?
 -- Should the user interface and database be further factored as separate processes?
 -- Should each "tool" be developed as a separate Erlang application then integrated as dependencies of a higher-level portal? Or should they be developed as a set of modules in one application?
 -- Or maybe each tool should be a totally independent "microservice," whatever that means.

 In other words, my attentive studies of Erlang have left me with very few PRACTICAL architectural techniques and tools; that is an insufficient bridge between the PRINCIPLES of concurrent Erlang and the PRACTICE of building robust Erlang systems.

 I'd much appreciate any guidelines to help me thrash through my confusion.

 But more, I wonder if others struggle with the same issues? And if so, how can I work with wizards like you to shed light on this corner of Erlang technology?

 All the best,

 Lloyd

 Sent from my iPad

 > On Jan 16, 2015, at 10:03 AM, Garrett Smith <g@REDACTED> wrote:
 >
 > I don't think Erlang has an edge over any other language in terms of
 > scaling across multiple servers - not at all. Other functional
 > languages have access to network APIs the same way Erlang does.
 >
 > I suppose it's less fiddly to send Erlang terms across the wire.
 > Erlang has some built in facilities for building distributed
 > applications. Erlang's been doing this sort of thing for a long time.
 > But other languages can do it - particularly with the many
 > multi-language messaging libraries and tools available today.
 >
 > What Erlang is special for IMO is its concurrency model, which is
 > implemented in and enforced by the *VM* - there's no shared memory
 > across threads of execution. One thread cannot corrupt the memory used
 > by another. To communicate between threads of execution, threads must
 > pass and receive messages (copied data).
 >
 > I use the word threads generically here - in Erlang they're called processes.
 >
 > This changes everything. It's completely transformative of the way you
 > build software. But the real payoff is in your locally running program
 > and less so in the ability to "distribute".
 >
 > Try building software using single threaded, isolated processes (no
 > shared memory). How do you do it? If you use Erlang, you're forced to
 > do it - there's no choice.
 >
 > It's the same at the operating system level. How do you build a LAMP
 > stack? (sorry, I'm older than 24) You install independent components
 > (Apache, PHP (fork exec'd), MySQL). Then you configure them to work
 > together. Then you start them up in a particular order. It's
 > coordinated communication between independent functions. If one blows up, the
 > others keep working.
 >
 > Imagine every part of your program working like this and you have an
 > Erlang application - actually, an Erlang *system*.
 >
 > To replicate this model at the OS level, you'd write dozens of small,
 > independent applications that communicated with each other over pipes
 > or sockets. The ZeroMQ community is familiar with this approach.
 >
 > The payoff for this, as I see it, is flexibility and speed of
 > introducing new functionality and fixing bugs.
 >
 > To illustrate at a higher level, imagine a smart phone today that
 > didn't provide isolation across applications. What would you expect
 > from it? I'd expect it to not work unless the applications were all
 > nearly perfect. That means it will either not work, or the app
 > ecosystem would be very limited. But today, smart phones all have
 > kernels and user space where apps are isolated from one another. So I
 > can install some random thing from an app store and have a pretty high
 > confidence that it's not going to ruin my phone. The result is *huge*
 > app ecosystems with phones that pretty much work (shockingly well for
 > what they're asked to do).
 >
 > When you use Erlang, your program becomes this ecosystem of "apps"
 > that you can add to, modify, remove very freely without concern for
 > the whole system. Your program will be evolvable *and stable* in the
 > same way your phone is evolvable and stable. It's a scalable
 > programming model!
 >
 > It's truly fantastic - and seldom mentioned. (Folks here have heard
 > this line from me before - this is my particular drum that I like to
 > beat on :)
 >
 > Use Erlang!
 >
 > Garrett
 >
 > P.S. I routinely run into critical memory problems in Java apps that
 > host ecosystems of other "apps" (plugins). It's a completely
 > unsolvable problem when your language/VM encourages intractable memory
 > graphs across threads unless you are incredibly careful about what you
 > run. If you want to make a system pluggable in Java (in one JVM), be
 > prepared for it to stop working at some point. So you're either limited
 > in what you can do (and how fast) or in the stability of your program.
 >
 > On Fri, Jan 16, 2015 at 6:29 AM, Jesper Louis Andersen
 > <jesper.louis.andersen@REDACTED> wrote:
 >> I think your observation is correct.
 >>
 >> An Erlang program works by having many small processes, all isolated from
 >> each other. The way to communicate between processes is to send a message,
 >> asynchronously. This in turn leads to the key observation: when you send
 >> messages, you don't care about *where* the other process is. It could be
 >> local or on a completely different machine. The syntax and the semantics are
 >> the same, and you would program the system much in the same way. The
 >> environment is thus very homogeneous, compared to other solutions where you
 >> need to communicate on two levels: one for local messaging and one for
 >> distributed messaging.
 >>
 >> I also second Bob's observation: The design feature of being functional
 >> forces a lot of properties which are beneficial to programs where
 >> correctness matters more than squeezing out the last ounces of performance
 >> from a tight computational kernel. But there is more to it than that. A good
 >> example is the choice of standard data structures which have no pathological
 >> problems in corner cases. Or the deep continued focus on scaling to multiple
 >> cores rather than looking for efficient single-core performance.
 >>
 >>
 >>> On Thu Jan 15 2015 at 11:38:52 PM Bob Ippolito <bob@REDACTED> wrote:
 >>>
 >>> I'd agree with that observation. Erlang is particularly well designed for
 >>> reliability and ease of maintenance/debugging. I wouldn't necessarily say
 >>> that these properties are due to the language, it's really the environments
 >>> that Erlang has been deployed in that shaped the VM and libraries in this
 >>> way. The tooling and libraries have at least a decade head start for this
 >>> kind of industrial usage over just about any other functional language.
 >>>
 >>>> On Fri, Jan 16, 2015 at 9:01 AM, Ken Wayne <kwayne@REDACTED> wrote:
 >>>>
 >>>> I've been investigating functional languages and the concepts that lead
 >>>> to increased speed, reliability, and decreased maintenance.  Erlang seems to
 >>>> have a distinct advantage over other functional languages when you need to
 >>>> scale across multiple servers because it's a natural part of the language.
 >>>> Can anyone confirm/deny or elaborate on the observation?
 >>>>
 >>>> Without wax,
 >>>> Ken Wayne
 >>>> kwayne@REDACTED
 >>>> Desk: 715.261.9412
 >>>> _______________________________________________
 >>>> erlang-questions mailing list
 >>>> erlang-questions@REDACTED
 >>>> http://erlang.org/mailman/listinfo/erlang-questions
 >>>
 >>
 >>