[erlang-questions] Advanced Erlang Subtleties

Joe Armstrong erlang@REDACTED
Thu May 19 23:02:35 CEST 2011


On Thu, May 19, 2011 at 10:02 PM, James Churchman
<jameschurchman@REDACTED> wrote:

> Monday's excellent London User Group about Misultin ( by Roberto Ostinelli
> ) brought up many fascinating discussions, one was about supervisors vs
> custom rolling the functionality & storing Pids in an Ets table due to
> supervisors storing children as a list, how to make Misultin 0.8 OTP
> compatible but still quick to get started with & directly embeddable etc..
>
> One thing I decided to do after one conversation was to document all the erlang
> syntax ( eg, no otp ) subtleties or bits i find counter-intuitive that i have
> discovered, for the use of others & also to prompt the occasional "why was
> it implemented this way" out of anybody who knows. None of this is a
> criticism ( and some of it points out things that are lesser used and that i very
> much like), as so far i have found erlang to have very few
> inconsistencies, including in error conditions!, but knowing all the
> subtleties for any language I use is a personal favourite of mine ( coming
> from js gives you that! )
>
> so here we go :
>
> *A) List comprehensions are personally my favourite part of erlang, but have
> the most subtleties of anything, partially because they pack a lot of
> functionality in a tiny bit of syntax :
>
> 1) LC's act as a list filter, then list map
> therefore :
> List = [{a,b},{c,d,e,f},{h,i}]
> [I1||{I1,I2}<-List] will not crash but instead filter out the 4 elem tuple.
> This is great, but where is the opposite version of a LC that is only a map
> and will crash ?
>

lists:map(fun({I,J}) -> I end, List)

it is "only a map"

:-)
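
A minimal sketch of the contrast, using the list from the question. The module and function names are made up for illustration.

```erlang
-module(lc_vs_map).
-export([filtered/0, mapped/0]).

%% The list from the question: one tuple has four elements.
list() -> [{a,b},{c,d,e,f},{h,i}].

%% The generator pattern acts as a filter: {c,d,e,f} is skipped silently.
filtered() -> [I || {I,_} <- list()].

%% lists:map/2 applies the fun to every element, so the 4-tuple
%% raises a function_clause error instead of being skipped.
mapped() -> lists:map(fun({I,_}) -> I end, list()).
```

Calling filtered() returns a list, while mapped() raises on the third element.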




> The standard behaviour seems to go against "let it crash" & on a few
> occasions have had code that gives a final [] rather than the output
> expected.
> Using :
> [fun({I1,I2}) -> I1 end(Elem)||Elem<-List]
>

But this does crash and does not produce []

1> L = [{1,2},{3,4},{1,2,3,4}].
[{1,2},{3,4},{1,2,3,4}]
2> [fun({I,J})->I end(E) || E <- L].
** exception error: no function clause matching
                    erl_eval:'-inside-an-interpreted-fun-'({1,2,3,4})


same in compiled code - I checked



> or
> [begin {I1,I2}=Elem, I1 end||Elem<-List]
> give a behaviour equivalent to a list map but are less elegant than a :
> [I1||{I1,I2}<---List] or something similar. ( notice the <--- )
> I would really welcome this addition to the language ( and many many many
> users i have met presume this is the default behaviour for a LC :-)
>
> 2) List comprehensions can't normally contain multiple statements, possibly
> because it might confuse users that [do(),do2(),do3()||_<-List] would add
> three elements to a list each time (that syntax could be really useful to do
> that tho!). You can however use the "begin end", also shown above, syntax
> like : [begin do(),do2(),do3() end ||_<-List] to make it all "one statement"
> (and you will end up with a list of the results of do3()).
>

It's a grammar thing. It's extremely difficult (impossible?) to
write a grammar where you can put a sequence of forms
at any place where a single form is expected. This is what
begin .. end is for. It's like the Lisp (progn a b c), which evaluates
a, b and c in order and whose result is c.
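
As a small illustration of begin .. end inside a comprehension (the module and function names are hypothetical):

```erlang
-module(begin_end).
-export([squares_plus_one/1]).

%% begin ... end groups several expressions into a single one;
%% its value is the value of the last expression, like progn.
squares_plus_one(List) ->
    [begin
         Sq = X * X,
         Sq + 1
     end || X <- List].
```

Each element of the result is the value of the last expression in the block.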


>
> 3) This, i feel, is a VERY annoying scoping bug :
>
> OuterVar = 'a', List = [{a,b},{c,d},{e,f}],
>
> Y : [I2||{OuterVar,I2}<-List]
>
> Z : [I2||{I1,I2}<-List, I1==OuterVar]
>
> these do not produce the same result.
>
>
It's a feature :-)
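
A rough sketch of the two comprehensions quoted above, with the filter in Z written after a comma, which is the form the compiler accepts. The module and function names are invented for the example. The compiler is expected to warn that OuterVar is shadowed in the generator of shadowed/0.

```erlang
-module(lc_scope).
-export([shadowed/0, compared/0]).

list() -> [{a,b},{c,d},{e,f}].

%% OuterVar in the generator pattern is a fresh variable that shadows
%% the outer one, so every pair matches. The outer binding is untouched.
shadowed() ->
    OuterVar = a,
    Result = [I2 || {OuterVar, I2} <- list()],
    {OuterVar, Result}.

%% Comparing in a filter uses the outer binding, so only {a,b} is kept.
compared() ->
    OuterVar = a,
    [I2 || {I1, I2} <- list(), I1 == OuterVar].
```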


> in "Y:" the "OuterVar" gets overridden, not matched, so you end up with
> every element matching. In "Z: " it behaves (as "Y:" should) and filters out
> all but the matches with 'a'
> this is unlike a case where
> case input of Input -> ok; _ -> other end,
> will give ok but if you set Input before to be say 'not_an_input_atom' it
> will give 'other'
>
> I hate this behaviour... :-)
>
> 4) you can produce generatorless LC's
> these act like a "if the entire list does not match the guards replace with
> an empty list otherwise return it".. they look crazy! but can be useful when
> generating iolists (or *not* iolists :-) to avoid case statements
>
> 5) custom guards and error handling :
> after much thought i feel this is the correct behaviour, though it's
> surprising :
>

> List = [1,2,three,4,5],
>
> Y : [I1||I1<-List, I1+1>0 ]
>
> Z : Guard = fun(I1) -> I1+1>0 end,
> [I1||I1<-List, Guard(I1)]
>
> "Y :" will filter out 'three' and "Z : " will crash, and importantly
> crashes when it hits 'three' and not before. same code! however having
> guards that return false rather than crash saves writing complex is_type() checks etc..
> whereas, when you call a fun you could be calling huge functions, even with side
> effects ; silencing all that makes no sense. It also allows you to make
> guards that crash, so make the code a tad less defensive.
>
> 6) list comprehensions crash with an invalid list, but only when it reaches
> the tail element :
> [io:format("~p~n",[Elem])||Elem<-[1,2,3,4|invalid_tail]]
> will first print out 1 to 4, then crash
>
>
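
The difference described in item 5 can be sketched as follows. The module and function names are hypothetical, and the filters are written after a comma.

```erlang
-module(lc_guard).
-export([guard_filter/0, fun_filter/0]).

list() -> [1,2,three,4,5].

%% I + 1 > 0 is a valid guard expression, so the badarith raised for
%% 'three' just makes the filter fail and the element is dropped.
guard_filter() -> [I || I <- list(), I + 1 > 0].

%% A call to a fun is not a guard expression; it is evaluated as an
%% ordinary expression, so the badarith for 'three' propagates.
fun_filter() ->
    Check = fun(I) -> I + 1 > 0 end,
    [I || I <- list(), Check(I)].
```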
> *B) variable name repetition matchings :
>
> you can write :
> function_name(Elem1,Elem1)->this_matches;
> function_name(Elem1,Elem2)->this_does_not_match;
>
> and when you put function_name(100,100) it will match. this may seem
> obvious but in many cases it can reduce a lot of code where otherwise you
> are writing guards or dropping into a case, especially when you have more
> complex disassembly pattern matches combined with recursion. I use this in
> much of my code, as it feels more functional than lots of == or case etc..
> :-)
>
>
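
A short sketch of repeated variables in function heads, including one recursive use. same/2 and dedup/1 are illustrative names, not taken from any library. Note that the match is exact, so 100 and 100.0 are not considered the same.

```erlang
-module(same_var).
-export([same/2, dedup/1]).

%% The first clause matches only when both arguments are identical.
same(X, X) -> true;
same(_, _) -> false.

%% Drop adjacent duplicates by matching two equal heads in one pattern.
dedup([X, X | T]) -> dedup([X | T]);
dedup([X | T])    -> [X | dedup(T)];
dedup([])         -> [].
```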
> *C) pattern matches can be anywhere
> All things in erlang always return, even pattern matches which return
> themselves so :
>
> [elem1,Var1 = elem2,elem3] will give the list [elem1,elem2,elem3] but also
> set Var1 to elem2. In a more complex case this can (with reasonable use)
> reduce code a lot as it lets you change the order you build up code, so the
> last element that returns is the one you want.. you don't need to build up
> an intermediate value, do some code, then place that var at the end to
> return.
>
> They can also be used in function heads like :
> function(Elem1={_,_,_})-> Elem1.
>

This is very useful. I use this all the time.

Suppose I want to do this

foo({a,{b,X},c}) -> bar({b,X}).

Better is

foo({a,{b,X}=Z,c}) -> bar(Z).


which avoids reconstructing the {b,X} term and is 2 characters shorter
and reduces the possibility of error if the subterm is large


>
> *D) "begin end" statements can be anywhere
>
> this can be used to embed more than one statement when building a list etc
> :
> [elem1,
> elem2,
> begin
> {Elem3,_,_,_ } = fetch_elem3(),
> Elem3
> end,
> elem4]
>
> it's also possible to put case, if etc.. ( as everything in erlang returns)
> statements there too like :
> [elem1,
> elem2,
> case Elem3 of
>  _ when is_atom(Elem3) -> atom_to_list(Elem3);
> _ -> Elem3
> end,
> elem4]
>
> *E) case ( and other like try catch etc.. ) drop their matched variables
> out the bottom ( eg they stay in scope even after they are used)
> case input_atom of Input -> ok; _ -> other end,
>  Input is now still 'input_atom'
>
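
A minimal sketch of a variable bound inside a case and used afterwards. The names are invented. In compiled code the variable has to be bound in every clause, otherwise the compiler rejects the later use as unsafe. The shell is more forgiving because it only evaluates the clause that matched.

```erlang
-module(case_export).
-export([classify/1]).

classify(N) ->
    case N rem 2 of
        0 -> Kind = even;
        _ -> Kind = odd
    end,
    %% Kind was bound inside the case but is still in scope here.
    {N, Kind}.
```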
> *F) Operators have Type Precedence
> "hello < 10" etc.. does not throw but returns false.
> the atom 'aaa' is less than the atom 'aab' etc..
>
> There may be small use cases for this like implementing generic sorting
> functions etc.. but 99% of the time if you magnitude compare two different
> types ( without intermediate type conversion, and ignoring float/int
> differences) this is a huge bug in your code.
>

Disagree - having a total order over all terms means it is
possible to write extremely useful generic functions. Key-Value
stores for example (like dicts and ets tables) would be
very difficult to implement without this.


>
> I can see the equivalent evaluation of "250"<250 being a common mistake,
> and always being false is hardly useful.
>

A "<" B when A and B have different types means A is
"less complex" than B. Lists are more complex than integers and
"250" is a list, so "250"<250 is false.

Suppose I want to make a Key-Value database, where the Key can be anything,
including 250 and "250". To make any form of ordered tree we need a total
order over all the keys.
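
To make this concrete, here is a small sketch that sorts mixed terms and stores both 250 and "250" as keys in an ordered tree. gb_trees is used only as one example of an ordered structure from the standard library; the module and function names are made up.

```erlang
-module(term_order).
-export([sorted/0, lookup/0]).

%% Numbers sort before atoms, atoms before tuples, tuples before lists.
sorted() -> lists:sort(["250", 250, abc, {1,2}, 2.5]).

%% Both 250 and "250" can be keys in the same ordered tree because
%% any two terms can be compared.
lookup() ->
    T0 = gb_trees:empty(),
    T1 = gb_trees:insert(250, int_key, T0),
    T2 = gb_trees:insert("250", string_key, T1),
    {gb_trees:get(250, T2), gb_trees:get("250", T2)}.
```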



> Also this behaviour can hardly help the Dialyzer spot common type mistakes
> as this is valid code. Does this ever prove to be a big problem in large
> scale systems? or not really? or does producing a module full of :
>
> '>'(I1,I2) when is_number(I1), is_number(I2) -> I1>I2. ( '>' could be
> "greater_than" or "gt" or similar)
> which crashes on non numeric input produce helpful crashes at the point of
> error? or just overkill?
>
> *G) {module,function}(Inputs)
> The tuple pair can be used as a way of calling functions, tho the
> recommended style is: fun mod:funct/arity but still fun to know.
>
> *H) Funs can be self executing by wrapping () after them like : fun
> (FunInput) -> some_code() end(Input)
>
> *J) try/catch:
> 1) - in the ERROR_TYPE:ERROR catch the throw: can be left out, and it means
> the same as catching a throw of type "Error".. nice short hand.
> There is also a 3rd parameter in the AST that is always a match all ( a bit
> like ERROR_TYPE:ERROR:_ ) does anybody know what this means? is there an
> extra option in a catch?
>
> ( maybe everybody knows these last few but just to be sure )
> 2) if nothing matches in the catch statements the error propagates up as if
> there was no try catch at all, very useful and stops the need for
> re-throwing like other languages!
> 3) in the try Code of ... catch the "of" is optional & returns the CODE
> value if left out, also ( maybe v obvious) the code in the "of" section is
> not caught
> 4) the old, strange and broken "catch" requires wrapping in parens if you
> want its return value:
> Var = catch throw(100). seems to be a syntax error
> Var = (catch throw(100)). is fine
> also distinguishing between certain throws and an ordinary return can be
> impossible.
>
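
Points 1 to 3 above can be sketched in one function. run/1 is a hypothetical helper that takes a zero-argument fun.

```erlang
-module(try_demo).
-export([run/1]).

run(F) ->
    try F() of
        %% Code in the 'of' section is not protected by the catch.
        Value -> {ok, Value}
    catch
        %% No class given, so this is the same as throw:Thrown.
        Thrown -> {thrown, Thrown};
        error:badarith -> {error, badarith}
        %% Anything else (other errors, exits) propagates unchanged.
    end.
```

An error such as badarg matches neither catch clause, so it reaches the caller as if there were no try at all.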
> *K) records
> ( this got too long so I made it a follow on email :-) basically I wish to
> write a (better) parse transform fix )
>
> If after corrections/some extra suggestions someone wants to put this on
> their blog then feel free, I don't have one :-)
>  James
>