[erlang-questions] Advanced Erlang Subtleties

Thu May 19 23:42:54 CEST 2011

>
> But this does crash and does not produce []
>
> 1> L = [{1,2},{3,4},{1,2,3,4}].
> [{1,2},{3,4},{1,2,3,4}]
> 2> [fun({I,J})->I end(E) || E <- L].
> ** exception error: no function clause matching
>                     erl_eval:'-inside-an-interpreted-fun-'({1,2,3,4})
>
>
> same in compiled code - I checked
>

Yes, that's what I meant (sorry if I was unclear): you have to put a fun or
a "begin match end" in to make it crash :-)
But it would be nice to have a "map" equivalent of LCs that had a small (but
clear) syntax difference. (When I worry about this I tend to write [begin
{I1,I2}=Elem, I1 end||Elem<-List] rather than use lists:map or a self-executing
fun .. :-)
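
To make the difference concrete, here is a minimal sketch of the two styles side by side. The module and function names (lc_strict, filtering/1, strict/1) are made up for illustration only.

```erlang
-module(lc_strict).
-export([filtering/1, strict/1]).

%% A pattern in the generator silently skips elements that do not match it.
filtering(List) ->
    [I1 || {I1, _I2} <- List].

%% Matching inside begin..end raises badmatch on the first element
%% that is not a 2-tuple, so it behaves like a map that can crash.
strict(List) ->
    [begin {I1, _I2} = Elem, I1 end || Elem <- List].
```

With List = [{a,b},{c,d,e,f},{h,i}], filtering(List) returns [a,h], while strict(List) fails with a badmatch error on {c,d,e,f}.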

> It's a grammar thing. It's extremely difficult (impossible?) to
> write a grammar where you can put a sequences of forms
> at any place where a single form is expected - this is what
> begin .. end is for it's like the lisp (progn a b c) which evaluates
> a b and c in order and whose result is c.
>

I see, very interesting!
Not a problem though, as an extra begin end fixes it!

> 3) This, i feel, is a VERY annoying scoping bug :
>>
>> OuterVar = 'a', List = [{a,b},{c,d},{e,f}],
>>
>> Y : [I2||{OuterVar,I2}<-List]
>>
>> Z : [I2||{I1,I2}<-List, I1==OuterVar]
>>
>> these do not produce the same result.
>>
>
> It's a feature :-)
>

Really?? It seems totally inconsistent with the rest of Erlang to me .. though
I'll take your word for it. It just annoys me, especially as the vars in the
guards even get affected by this "overwriting" behaviour. I guess it's just
my personal style that shows it up.
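
For anyone following along, a small sketch of the two variants as compiled code. The module name lc_scope and its function names are hypothetical, and the filter is written with a comma, which is how a condition is attached to a generator in a list comprehension.

```erlang
-module(lc_scope).
-export([shadowed/2, filtered/2]).

%% OuterVar in the generator pattern is a fresh variable that shadows
%% the function argument, so every 2-tuple matches.
shadowed(OuterVar, List) ->
    [I2 || {OuterVar, I2} <- List].

%% Here the argument is compared explicitly, so only matching keys survive.
filtered(OuterVar, List) ->
    [I2 || {I1, I2} <- List, I1 =:= OuterVar].
```

With OuterVar = a and List = [{a,b},{c,d},{e,f}], shadowed/2 returns [b,d,f] and filtered/2 returns [b]. The compiler does emit warnings for shadowed/2 (the shadowing and the unused variables), which is at least a hint that something unintended is going on.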

This is very useful I use this all the time
>
> Suppose I want to do this
>
>  foo({a,{b,X},c}) -> bar({b,X}).
>
> Better is
>
> foo({a,{b,X}=Z,c}) -> bar(Z).
>
>
> which avoids reconstructing the {b,X} term and is 2 characters shorter
> and reduces the possibility of error if the subterm is large
>

I agree entirely :-) ... the problem with email is it's hard to convey what you
mean... I like this a huge amount!
A fair bit of the Erlang code I see does not use "= pattern match" in the
function head when it could..
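
A small sketch of the two styles, for comparison. The module name alias_demo and the helper wrap/1 are placeholders standing in for bar/1 above, not anything from real code.

```erlang
-module(alias_demo).
-export([rebuild/1, alias/1]).

%% Takes the subterm apart and writes it out again in the body.
rebuild({a, {b, X}, c}) ->
    wrap({b, X}).

%% Binds the whole subterm to Z in the head and passes it on as-is.
alias({a, {b, _X} = Z, c}) ->
    wrap(Z).

%% Hypothetical consumer of the subterm.
wrap(Term) ->
    {wrapped, Term}.
```

Both return {wrapped,{b,1}} for the input {a,{b,1},c}; the second form just never has to spell the subterm out twice.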
That's a very good point; I had not thought of the hash maps case... It
still leads to errors though, and makes Erlang feel a bit like a loosely typed
language in that regard. My question, I guess, was more: have other people hit
this silent error (which can be hard to trace), or have large products
produced numeric-only comparison modules? If you start off with, say, XML
etc.. it's easy to forget a list_to_integer() and the comparisons don't throw,
you just get weird results!
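
The kind of numeric-only comparison I have in mind would look roughly like this. It is only a sketch; the module name num_cmp and the function name gt/2 are invented for the example.

```erlang
-module(num_cmp).
-export([gt/2]).

%% Strict "greater than": fails with function_clause unless both
%% arguments are numbers, instead of falling back to the term order.
gt(A, B) when is_number(A), is_number(B) ->
    A > B.
```

So "250" < 250 quietly evaluates to false, whereas num_cmp:gt("250", 250) crashes with function_clause at the point of the mistake, and num_cmp:gt(list_to_integer("250"), 249) returns true. The total order is still there when you want it: lists:sort([b, 2, "250", {x}, 1.5, a]) gives [1.5,2,a,b,{x},"250"].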

The main point of the email was that I had never seen a detailed list of all
the lesser-known syntax features and thought it would interest the list :-)
Most of the above are great features (apart from the lack of map LCs :-) ) so I
guess many thanks for that, Joe!

james

On 19 May 2011 22:02, Joe Armstrong <erlang@REDACTED> wrote:

>
>
> On Thu, May 19, 2011 at 10:02 PM, James Churchman <
> jameschurchman@REDACTED> wrote:
>
>> Monday's excellent London User Group about Misultin ( by Roberto
>> Ostinelli ) brought up many fascinating discussions, one was about
>> supervisors vs custom rolling the functionality & storing Pids in an Ets
>> table due to supervisors storing children as a list, how to make Misultin
>> 0.8 OTP compatible but still quick to get started with & directly embeddable
>> etc..
>>
>> One thing I decided to do after one conversation was to document all the
>> erlang syntax ( ie, no otp ) subtleties or bits i find counter-intuitive that
>> i have discovered, for the use of others & also to prompt the occasional "why
>> was it implemented this way" out of anybody who knows. None of this is a
>> criticism ( and some of it points out things that are lesser used and that i
>> very much like), as so far i have found erlang to have very few
>> inconsistencies, including in error conditions!, but knowing all the
>> subtleties of any language I use is a personal favourite of mine ( coming
>> from js gives you that! )
>>
>> so here we go :
>>
>> *A) List comprehensions are personally my favourite part of erlang, but
>> have the most subtleties of anything, partially because they pack a lot of
>> functionality in a tiny bit of syntax :
>>
>> 1) LC's act as a list filter, then list map
>> therefore :
>> List = [{a,b},{c,d,e,f},{h,i}]
>> [I1||{I1,I2}<-List] will not crash but instead filter out the 4 elem
>> tuple. This is great, but where is the opposite version of a LC that is only
>> a map and will crash ?
>>
>
> lists:map(fun({I,J}) -> I end, List)
>
> it is "only a map"
>
> :-)
>
>
>
>
>> The standard behaviour seems to go against "let it crash" & on a few
>> occasions have had code that gives a final [] rather than the output
>> expected.
>> Using :
>> [fun({I1,I2}) -> I1 end(Elem)||Elem<-List]
>>
>
> But this does crash and does not produce []
>
> 1> L = [{1,2},{3,4},{1,2,3,4}].
> [{1,2},{3,4},{1,2,3,4}]
> 2> [fun({I,J})->I end(E) || E <- L].
> ** exception error: no function clause matching
>                     erl_eval:'-inside-an-interpreted-fun-'({1,2,3,4})
>
>
> same in compiled code - I checked
>
>
>
>> or
>> [begin {I1,I2}=Elem, I1 end||Elem<-List]
>> give a behaviour equivalent to a list map but are less elegant than a :
>> [I1||{I1,I2}<---List] or something similar. ( notice the <--- )
>> I would really welcome this addition to the language ( and many many many
>> users i have met presume this is the default behaviour for a LC :-)
>>
>> 2) List comprehensions can't normally contain multiple statements,
>> possibly because it might confuse users that [do(),do2(),do3()||_<-List]
>> would add three elements to a list each time (that syntax could be really
>> useful to do that tho!). You can however use the "begin end", also shown
>> above, syntax like : [begin do(),do2(),do3() end ||_<-List] to make it all
>> "one statement"
>> (and you will end up with a list of the results of do3()).
>>
>
> It's a grammar thing. It's extremely difficult (impossible?) to
> write a grammar where you can put a sequences of forms
> at any place where a single form is expected - this is what
> begin .. end is for it's like the lisp (progn a b c) which evaluates
> a b and c in order and whose result is c.
>
>
>>
>> 3) This, i feel, is a VERY annoying scoping bug :
>>
>> OuterVar = 'a', List = [{a,b},{c,d},{e,f}],
>>
>> Y : [I2||{OuterVar,I2}<-List]
>>
>> Z : [I2||{I1,I2}<-List, I1==OuterVar]
>>
>> these do not produce the same result.
>>
>>
> It's a feature :-)
>
>
>> in "Y:" the "OuterVar" gets overridden, not matched, so you end up with
>> every element matching. In "Z: " it behaves (as "Y:" should) and filters out
>> all but the matches with 'a'
>> this is unlike a case where
>> case input of Input -> ok; _ -> other end,
>> will give ok but if you set Input before to be say 'not_an_input_atom' it
>> will give 'other'
>>
>> I hate this behaviour... :-)
>>
>> 4) you can produce generatorless LC's
>> these act like a "if the entire list does not match the guards replace
>> with an empty list otherwise return it".. they look crazy! but can be useful
>> when generating iolists (or *not* iolists :-) to avoid case statements
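
A tiny sketch of a generatorless LC, using a made-up module lc_nogen with a made-up function optional/2:

```erlang
-module(lc_nogen).
-export([optional/2]).

%% No generator at all, only a filter: the result is a one-element
%% list when Cond is true and the empty list when it is false.
optional(Value, Cond) ->
    [Value || Cond].
```

lc_nogen:optional(x, true) returns [x] and lc_nogen:optional(x, false) returns [], which is handy for dropping optional pieces into an iolist without a case.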
>>
>> 5) custom guards and error handling :
>> after much thought i feel this is the correct behaviour, though it's
>> surprising :
>>
>
>> List = [1,2,three,4,5],
>>
>> Y : [I1||I1<-List, I1+1>0 ]
>>
>> Z : Guard = fun(I1) -> I1+1>0 end,
>> [I1||I1<-List, Guard(I1)]
>>
>> "Y :" will filter out 'three' and "Z : " will crash, and importantly
>> crashes when it hits 'three' and not before. same code! however having
>> guards that return false rather than crash avoids the need for complex
>> is_type() checks etc.., whereas when you call a fun you could be calling huge
>> functions, even with side effects ; silencing all that makes no sense. It also
>> allows you to make guards that crash, so makes the code a tad less defensive.
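
The same pair written out as a compilable sketch (module name lc_guard and the function names are invented for the example, and the filter is attached with a comma):

```erlang
-module(lc_guard).
-export([inline/1, via_fun/1]).

%% The filter is a guard expression, so an arithmetic failure
%% inside it just counts as false and the element is dropped.
inline(List) ->
    [I || I <- List, I + 1 > 0].

%% The filter is an ordinary fun call, so the failure inside
%% the fun is a real exception (badarith).
via_fun(List) ->
    Guard = fun(I) -> I + 1 > 0 end,
    [I || I <- List, Guard(I)].
```

lc_guard:inline([1,2,three,4,5]) returns [1,2,4,5], while lc_guard:via_fun([1,2,three,4,5]) fails with badarith when it reaches 'three'.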
>>
>> 6) list comprehensions crash with an invalid list, but only when it
>> reaches the tail element :
>> [io:format("~p~n",[Elem])||Elem<-[1,2,3,4|invalid_tail]]
>> will first print out 1 to 4, then crash
>>
>>
>> *B) variable name repetition matchings :
>>
>> you can write :
>> function_name(Elem1,Elem1)->this_matches;
>> function_name(Elem1,Elem2)->this_does_not_match;
>>
>> and when you put function_name(100,100) it will match. this may seem
>> obvious but in many cases it can reduce a lot of code where otherwise you
>> are writing guards or dropping into a case, especially when you have more
>> complex disassembly pattern matches combined with recursion. I use this in
>> much of my code, as it feels more functional than lots of == or case etc..
>> :-)
>>
>>
>> *C) pattern matches can be anywhere
>> All things in erlang always return, even pattern matches which return
>> themselves so :
>>
>> [elem1,Var1 = elem2,elem3] will give the list [elem1,elem2,elem3] but also
>> set Var1 to elem2. In a more complex case this can (with reasonable use)
>> reduce code a lot as it lets you change the order you build up code, so the
>> last element that returns is the one you want.. you don't need to build up
>> an intermediate value, do some code, then place that var at the end to
>> return.
>>
>> They can also be used in function heads like :
>> function(Elem1={_,_,_})-> Elem1.
>>
>
> This is very useful I use this all the time
>
> Suppose I want to do this
>
>  foo({a,{b,X},c}) -> bar({b,X}).
>
> Better is
>
> foo({a,{b,X}=Z,c}) -> bar(Z).
>
>
> which avoids reconstructing the {b,X} term and is 2 characters shorter
> and reduces the possibility of error if the subterm is large
>
>
>>
>> *D) "begin end" statements can be anywhere
>>
>> this can be used to embed more than one statement when building a list etc
>> :
>> [elem1,
>> elem2,
>> begin
>> {Elem3,_,_,_ } = fetch_elem3(),
>> Elem3
>> end,
>> elem4]
>>
>> its also possible to put case, if etc.. ( as everything in erlang returns)
>> statements there too like :
>> [elem1,
>> elem2,
>> case Elem3 of
>>  _ when is_atom(Elem3) -> atom_to_list(Elem3);
>> _ -> Elem3
>> end,
>> elem4]
>>
>> *E) case ( and other like try catch etc.. ) drop their matched variables
>> out the bottom ( eg they stay in scope even after they are used)
>> case input_atom of Input -> ok; _ -> other end,
>>  Input is now still 'input_atom'
>>
>> *F) Operators have Type Precedence
>> "hello < 10" etc.. does not throw but returns false.
>> the atom 'aaa' is less than the atom 'aab' etc..
>>
>> There may be small use cases for this like implementing generic sorting
>> functions etc.. but 99% of the time if you magnitude compare two different
>> types ( without intermediate type conversion, and ignoring float/int
>> differences) this is a huge bug in your code.
>>
>
> Disagree - having a total order over all terms means it is
> possible to write extremely useful generic functions. Key-Value
> stores for example (like dicts and ets tables) would be
> very difficult to implement without this
>
>
>>
>> I can see the equivalent evaluation of "250"<250 being a common mistake,
>> and always being false is hardly useful.
>>
>
> A "<" B when A and B have different types means A is
> "less complex" than B. Lists are more complex than integers;
> "250" is a list so "250"<250 is false.
>
> Suppose I want to make a Key-Value database, where the Key can be anything,
> including 250 and "250" to make any form of ordered tree we need a total
> order over all the keys.
>
>
>
>> Also this behaviour can hardly help Dialyzer spot common type mistakes
>> as this is valid code. Does this ever prove to be a big problem in large
>> scale systems? or not really? or does producing a module full of :
>>
>> '>'(I1,I2) when is_number(I1), is_number(I2) -> I1>I2. ( '>' could be
>> "greater_than" or "gt" or similar)
>> which crashes on non numeric input produce helpful crashes at the point of
>> error? or just overkill?
>>
>> *G) {module,function}(Inputs)
>> The tuple pair can be used as a way of calling functions, tho the
>> recommended style is: fun mod:funct/arity but still fun to know.
>>
>> *H) Funs can be self executing by wrapping () after them like : fun
>> (FunInput) -> some_code() end(Input)
>>
>> *J) try/catch:
>> 1) - in the ERROR_TYPE:ERROR catch the throw: can be left out, and it
>> means the same as catching a throw of type "Error".. nice short hand.
>> There is also a 3rd parameter in the AST that is always a match all ( a
>> bit like ERROR_TYPE:ERROR:_ ) does anybody know what this means? is there an
>> extra option in a catch?
>>
>> ( maybe everybody knows these last few but just to be sure )
>> 2) if nothing matches in the catch statements the error propagates up as
>> if there was no try catch at all, very useful and stops the need for
>> re-throwing like other languages!
>> 3) in the try Code of ... catch the "of" is optional & returns the CODE
>> value if left out, also ( maybe v obvious) the code in the "of" section is
>> not caught
>> 4) the old, strange and broken "catch" requires wrapping in parens if you
>> want its return value:
>> Var = catch throw(100). seems to be a syntax error
>> Var = (catch throw(100)). is fine
>> also distinguishing between certain throws and an ordinary return can be
>> impossible.
>>
>> *K) records
>> ( this got too long so I made it a follow on email :-) basically I wish to
>> write a (better) parse transform fix )
>>
>> If after corrections/some extra suggestions someone wants to put this on
>> their blog then feel free, I don't have one :-)
>>  James
>>
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
>>
>>
>