[erlang-questions] What are the "Most valuable libraries?"...and a few other questions

Wed May 18 02:59:21 CEST 2011

On 2011-05-17, at 20:25 PM, Todd wrote:

> The whole "reply-all" debate has stirred me to ask some festering questions...
> 
> 1. In general, what are the most valuable libraries to learn, both within the Erlang dist and external?
> 
In my opinion, learning the OTP behaviour modules did a lot for my Erlang programming efficiency, and also taught me a lot about writing abstractions. They're pretty useful in that regard. Add the sys module to the mix; it's pretty good for debugging gen_* behaviours.
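
As a rough illustration of the sys module on a gen_* process, here is a minimal sketch. The module name sys_demo and its counter logic are made up for the example; sys:trace/2 and sys:get_status/1 are the standard calls, and gen_server:stop/1 assumes a reasonably recent OTP.

```erlang
-module(sys_demo).
-behaviour(gen_server).

-export([start_link/0, bump/1, demo/0]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Hypothetical counter server, only here to have something to inspect.
start_link() ->
    gen_server:start_link(?MODULE, 0, []).

bump(Pid) ->
    gen_server:call(Pid, bump).

init(Count) ->
    {ok, Count}.

handle_call(bump, _From, Count) ->
    {reply, Count + 1, Count + 1}.

handle_cast(_Msg, Count) ->
    {noreply, Count}.

%% Turn tracing on, make one call, then fetch the process status.
demo() ->
    {ok, Pid} = start_link(),
    ok = sys:trace(Pid, true),     % system events are printed while this is on
    1 = bump(Pid),
    ok = sys:trace(Pid, false),
    Status = sys:get_status(Pid),  % {status, Pid, {module, gen_server}, Items}
    ok = gen_server:stop(Pid),
    Status.
```

Calling sys_demo:demo() from the shell prints the traced call and returns the status tuple; none of this needs any debugging code inside the server itself.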

For external libraries, I'd say gproc, ibrowse, proper and meck are pretty neat. The riak_core stuff is also impressive.

> 2. Is there a consolidated/curated repository of libraries that is industry standard? I know the erlware folks have a repo...is that both a complete and accepted authoritative repo? From reading the list, it sounds like there's also a fair bit of stuff scattered about in github, too.
> 
Not really. You have erlware, as you mentioned, but I personally tend to use agner (http://erlagner.org/).

> 3. How does one easily multithread an app? For instance, there's pmap in clojure and something similar in akka that lets you map a function across a list, and it allocates threads accordingly...
> 
> literally something like: "pmap(myfun, mylist);"

In my own use cases, I tend to write Erlang for multi-user stuff, server-side software. In these cases, parallelism on things like mapping over a list is rather useless. By this, I mean that if you only map over one list at a time, then breaking it into N processes might win you some time (assuming the function you apply takes more time to run than spawning and communicating data). But if you have N lists for N users, each having their own process, then the system is already processing a lot of data in parallel, probably more than what could be useful if you had M processes per list on top of the N users. At that point you get to play with the scheduler, and processes might fight for CPU time.

This is pretty complex, but my point is that if you're writing server software where concurrency units are already large and there are many of them, low-level parallelism as in a pmap function isn't the most useful enhancement. Smart application architecture might matter a lot more in the long run.

In any case, if you really want a pmap, there's rpc:pmap/3 (http://erldocs.com/R14B02/kernel/rpc.html?i=0&search=pmap#pmap/4) and conc lists (http://dustin.github.com/2010/03/04/erlang-conc.html, also on agner). Measure and see what fits.
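
For reference, a naive pmap can be sketched in a few lines. This is an illustration only: the module name pmap_demo is made up, it spawns one process per element with no batching or error handling, and it is not how rpc:pmap/3 or conc are implemented. The via_rpc/1 helper shows the shape of an rpc:pmap/3 call, which spreads the work over the current node and any connected ones.

```erlang
-module(pmap_demo).
-export([pmap/2, via_rpc/1]).

%% Naive pmap: one linked process per element, results returned in list order.
pmap(F, List) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn_link(fun() -> Parent ! {Ref, F(X)} end),
                Ref
            end || X <- List],
    %% Ref is already bound by the generator, so each receive waits for
    %% the reply that belongs to that element.
    [receive {Ref, Result} -> Result end || Ref <- Refs].

%% rpc:pmap/3 takes {Module, Function} and extra arguments, and calls
%% Module:Function(Elem, ExtraArgs...) for each element of the list.
via_rpc(List) ->
    rpc:pmap({erlang, abs}, [], List).
```

Something like pmap_demo:pmap(fun(X) -> X * 2 end, [1,2,3]) gives the same result as lists:map/2; whether it is any faster depends entirely on how expensive the function is.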
> 
> 4. Along that note, does anyone have any ideas as to how to tackle the Typesafe 'getting started tutorial?'
> 
> http://typesafe.com/resources/getting-started/tutorials/getting-started-first-scala.html
> 
> (Typesafe is the funded version of Jonas Bonér's Akka combined with Scala)

This is a tough one. There are plenty of guides to get started with Erlang. You have 2-3 versions of them in the official documentation, there are screencasts over at Pragmatic Programmers, I write http://learnyousomeerlang.com, and you have plenty of blog posts around the web doing a good, short job of introducing users to the language. The latest one doing that is from IBM (http://www.ibm.com/developerworks/xml/library/os-erlang1/index.html?ca=drs-).

For the part about getting an executable running fast, things in Erlang are a bit complex. The standard way to get an application running has to do with OTP Applications and OTP Releases; this is complex, requires you to learn the whole framework and can't be explained fast. You could probably do something basic with erl -run and -noshell options, but I'd also like to see one or two tutorials focusing on Escripts as a quick jumpstart option.
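
To give an idea of the escript route, here is a minimal sketch. The file name hello.escript and the greeting are made up for the example; the only requirement is that the script defines main/1, which receives the command-line arguments as a list of strings.

```erlang
#!/usr/bin/env escript
%% -*- erlang -*-
%% hello.escript (hypothetical file name)

%% Entry point: called with the command-line arguments as strings.
main([Name]) ->
    io:format("Hello, ~s!~n", [Name]);
main(_) ->
    io:format("usage: hello.escript NAME~n"),
    halt(1).
```

Run it with "escript hello.escript Todd", or make the file executable and call it directly. The erl equivalent for a compiled module would be along the lines of "erl -noshell -run mymod start -s init stop", where mymod and start are placeholder names.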

> 
> 4b. Side note: is anyone concerned about Akka/Typesafe stealing mindshare?

Not really. I don't think many established Erlangers are going to leave for Scala/Akka. I think many Java people who were somewhat interested but hesitating about Erlang might study Scala/Akka first. If they like it, I wouldn't be surprised to see them trying Erlang at some point, given Akka is heavily inspired by its design choices. 

The more languages try to borrow Erlang idioms, the better (although I'd like it if they focused more on fault-tolerance).

> 
> And lastly, the most burning of questions:
> 
> 5. How does one push an app such that it self instantiates its processes across the cluster? I can see how OTP is great at managing an app on a single node, but how do you say something like: "create one of these processes on each node in the cluster, and restart 1-for-1 if they die"... or something similar. I see mention of gproc, but honestly, I don't see how to use it. Likewise, if nodes are added to the cluster, how would you ensure that the necessary processes are pushed to the new node after it joins the cluster?

You have many design options. Using OTP applications, you can specify a takeover/failover mechanism, but not something that would be instantly started on all nodes. To do something like that, the simplest thing I can think of is to start your application on each node. You could add some kind of 'sync' function to ask each node to synchronise itself with the rest of the cluster (this is what the global module does, as far as I know), or try to work out some mechanism that does it automatically.

Generally, distributed applications whose instances all run at the same time and need to share some state are a harder problem than independent components distributed across a cluster. Things will be very application specific. Is the state shared or independent? Can one of the nodes disappear without impacting the rest of the application? What do you do in case of a netsplit, when you can't know whether a node is down or the connection is broken? There are many questions like this that will drive your design. You can possibly look at riak_core for some design decisions, then at the global module for an entirely different (and smaller scale) approach.
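
The "start your application on each node" idea could look roughly like the sketch below. The module name cluster_start is made up, and the code assumes the nodes are already connected, that the application is in each node's code path, and that application:ensure_started/1 is available in your OTP version. It says nothing about shared state or netsplits, which is where the real design work is.

```erlang
-module(cluster_start).
-export([start_everywhere/1, watch/1]).

%% Start App on this node and on every currently connected node.
%% Returns {Results, BadNodes} as given by rpc:multicall/4.
start_everywhere(App) ->
    rpc:multicall([node() | nodes()], application, ensure_started, [App]).

%% Meant to run in its own process on a distributed node:
%% starts App on each node that joins the cluster later on.
watch(App) ->
    ok = net_kernel:monitor_nodes(true),
    loop(App).

loop(App) ->
    receive
        {nodeup, Node} ->
            %% Result ignored here; a real system would log or retry.
            _ = rpc:call(Node, application, ensure_started, [App]),
            loop(App);
        {nodedown, _Node} ->
            loop(App)
    end.
```

Within each node, the application's own supervision tree still handles the "restart 1-for-1 if they die" part.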

> 
> 6. How do you deploy and live code upgrade in real life? I've been looking at some of the work by the 'Dukes of Erl' ... is erlrc what folks commonly use?
> 
> Dukes of Erl project (Paul Mineiro):
> https://code.google.com/p/erlrc/
> 
> Paul Mineiro's Erlang factory 2009 presentation:
> http://www.erlang-factory.com/conference/SFBayAreaErlangFactory2009/speakers/PaulMineiro

Releases have mechanisms in place to handle this. The basic idea is that every gen_* behaviour implements callbacks from the sys module. These callbacks allow various operations such as suspending a behaviour (switching it into an 'I only accept sys messages' mode), calling its code change function, then resuming it. This generally allows safe code updates, but always remember to test things before deploying them (and test the deployment itself).
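
The suspend / code change / resume cycle can be tried by hand with the sys module. The sketch below is only an illustration: the module name upgrade_demo and its state shapes are made up, and a real release upgrade would load the new module version between the suspend and the code change, driven by the release handler rather than by manual calls.

```erlang
-module(upgrade_demo).
-behaviour(gen_server).

-export([demo/0]).
-export([init/1, handle_call/3, handle_cast/2, code_change/3]).

init(Count) ->
    {ok, Count}.

handle_call(get, _From, State) ->
    {reply, State, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Convert the old state (a bare integer) to the new shape (a tagged tuple).
code_change(_OldVsn, Count, _Extra) when is_integer(Count) ->
    {ok, {counter, Count}};
code_change(_OldVsn, State, _Extra) ->
    {ok, State}.

demo() ->
    {ok, Pid} = gen_server:start_link(?MODULE, 42, []),
    ok = sys:suspend(Pid),                 % only system messages are handled now
    %% In a real upgrade, the new version of the module is loaded here.
    ok = sys:change_code(Pid, ?MODULE, undefined, []),
    ok = sys:resume(Pid),
    NewState = gen_server:call(Pid, get),  % state after code_change/3 ran
    ok = gen_server:stop(Pid),
    NewState.
```

Here upgrade_demo:demo() returns the converted state, which shows that code_change/3 ran while the process was suspended.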
> 
> 7. Does anyone use dynamic load balancing of demand across a cluster (e.g. spinning up erlang processes to meet the demand curve?)
> 
I'll leave this to be answered by people with more production experience than I have, but I generally write my programs so that each concurrent (independent) unit of computation has its own process. By this I mean that in the case of a web server, I'll usually create one process per query rather than one process for the data fetching, then one for the templating, etc. Some things are sequential and should remain that way in the code.

The VM also does a lot of heavy lifting for me with that approach, distributing processes across schedulers in ways that keep things reasonably balanced.

> 8. What's the best way to integrate w/ other code bases. In akka, you'd use camel as an integration bus. What are the common ways to integrate with erlang? Is that what ports and nifs are for? Forgive my ignorance, but I always considered those as simply ways to write code in a different, perhaps more comfortable language...not as integration mechanisms.

Ports, Port Drivers, C Nodes and NIFs are all standard ways to communicate with the outside world. There are also interfaces to Java and communication layers to PHP, Ruby, Python, etc. You can also add BERT-RPC as a protocol that can be used with Erlang, among others.
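
As a small example of the simplest of these, a port talking to an external program might be sketched as follows. The module name port_demo is made up, and the sketch assumes the command exits on its own and that its output fits comfortably in memory.

```erlang
-module(port_demo).
-export([run/1]).

%% Run an external command through a port and collect its output.
run(Command) ->
    Port = open_port({spawn, Command},
                     [binary, exit_status, stderr_to_stdout]),
    collect(Port, []).

%% Accumulate data chunks until the program reports its exit status.
collect(Port, Acc) ->
    receive
        {Port, {data, Bytes}} ->
            collect(Port, [Bytes | Acc]);
        {Port, {exit_status, Status}} ->
            {Status, iolist_to_binary(lists:reverse(Acc))}
    end.
```

On a Unix-like system, port_demo:run("echo hello") returns the exit status together with the program's output. The external program runs in its own OS process, so a crash on that side does not take the VM down, unlike a misbehaving NIF or linked-in driver.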
>  
> Also, I've continued to peck away at various newbie tutorials. Any comments/suggestions/corrections are welcome.
> 
> https://github.com/ToddG/experimental/tree/master/erlang/wilderness
> 
> -Todd

Hopefully, this is helpful :)

--
Fred Hébert
http://www.erlang-solutions.com