[erlang-questions] Must and May convention

Wed Sep 27 12:46:19 CEST 2017

On 09/27/2017 11:08 AM, Joe Armstrong wrote:
> For several years I've been using a convention in my hobby
> projects. It's what I call the must-may convention.
> 
> I'm wondering if it should be widely used.
> 
> What is it?
> 
> There are two commonly used conventions for handling bad arguments to
> a function. We can return {ok, Val} or {error, Reason} or we can
> return a value if the arguments are correct, and raise an exception
> otherwise.
> 
> The problem is that when I read code and see a function like
> 'foo:bar(a,12)' I have no idea if it obeys one of these conventions or
> does something completely different. I have to read the code to find
> out.
> 
> My convention is to prefix the function name with 'must_' or 'may_'

I've been debating this in my head for a long time. I came to the 
conclusion that 99% of the time I do not want to handle errors. 
Therefore 99% of the functions should not return an error.

What happens for the 1% of the time where I do want to handle an error 
and the function doesn't allow it? Well, I catch the exception. And 
that's why I started using more meaningful exceptions for these cases. 
For example, Cowboy 2.0 has the following kind of code when it fails to 
validate input:

     try
         cow_qs:parse_qs(Qs)
     catch _:_ ->
         erlang:raise(exit, {request_error, qs,
             'Malformed query string; application/x-www-form-urlencoded expected.'
         }, erlang:get_stacktrace())
     end.

99% of the time I don't care about it because Cowboy will properly 
notice it's an input error and will return a 400 automatically (instead 
of 500 for other crashes). It still contains the full details of the 
error should I wish to debug it, and if it is necessary to provide more 
details to the user I can catch it and do something with it.

(The exception probably won't make it as a documented feature in 2.0 due 
to lack of time but I will rectify this in future releases.)

This strategy also helps with writing clearer code because I don't need
nested case statements: I can have a single try/catch with multiple
catch clauses for the errors I do want to handle, and let the others go
through.

     try
         Qs = cowboy_req:parse_qs(Req),
         Cookies = cowboy_req:parse_cookies(Req),
         doit(Qs, Cookies)
     catch
         exit:{request_error, qs, _} ->
             bad_qs(Req);
         exit:{request_error, {header, <<"cookie">>}, _} ->
             bad_cookie(Req)
     end

Write for the happy path and handle all errors I care about in the same 
place. Goodbye nested cases for error handling!
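
As a minimal, self-contained sketch of the same pattern outside Cowboy: the
module name and the parse_age/1 and parse_name/1 helpers below are
hypothetical, invented only for illustration. Each helper either returns a
value or exits with a tagged {request_error, ...} reason, and the caller
handles only the tags it cares about.

```erlang
-module(happy_path).
-export([handle/1]).

%% Hypothetical validator: returns an integer or exits with a tagged reason.
parse_age(Bin) ->
    try
        binary_to_integer(Bin)
    catch error:badarg ->
        exit({request_error, age, 'Age must be an integer.'})
    end.

%% Hypothetical validator: rejects the empty binary.
parse_name(<<>>) ->
    exit({request_error, name, 'Name must not be empty.'});
parse_name(Name) when is_binary(Name) ->
    Name.

%% Happy path first; the errors we care about are handled in one place.
%% Anything else (for example a missing map key) is left to crash.
handle(#{age := AgeBin, name := NameBin}) ->
    try
        Age = parse_age(AgeBin),
        Name = parse_name(NameBin),
        {ok, Name, Age}
    catch
        exit:{request_error, age, _} ->
            {error, bad_age};
        exit:{request_error, name, _} ->
            {error, bad_name}
    end.
```

Calling happy_path:handle(#{age => <<"x">>, name => <<"joe">>}) takes the
first catch clause, while a map without the expected keys still crashes with
function_clause, which is the "let the others go through" part.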

I have also been using exceptions in a "different" way for parsing
Asciidoc files. Asciidoc input is *always* correct: there cannot be a
malformed Asciidoc file (as far as parsing is concerned). When input
looks wrong, it is parsed as a paragraph.

In that case I can simply write a function for parsing each possible
element, and try them one by one on the input until I find a parsing
function that doesn't crash. If it doesn't crash, then that means I
found the type of block for the input. If it crashes, I try the next
type of block.

So I have a function like this defining the block types:

block(St) ->
     skip(fun empty_line/1, St),
     oneof([
         fun eof/1,
         %% Section titles.
         fun section_title/1,
         fun long_section_title/1,
         %% Block macros.
         fun block_id/1,
         fun comment_line/1,
         fun block_macro/1,
         %% Lists.
         fun bulleted_list/1,
         fun numbered_list/1,
...

And then one of those parse functions would be like this for example:

comment_line(St) ->
     <<"//", C, Comment0/bits>> = read_line(St),
     true = ?IS_WS(C),
     Comment = trim(Comment0),
     %% Good!
     {comment_line, #{}, Comment, ann(St)}.

If it crashes, then it's not a comment line!

The oneof function is of course defined like this:

oneof([], St) ->
     throw({error, St}); %% @todo
oneof([Parse|Tail], St=#state{reader=ReaderPid}) ->
     Ln = asciideck_line_reader:get_position(ReaderPid),
     try
         Parse(St)
     catch _:_ ->
         asciideck_line_reader:set_position(ReaderPid, Ln),
         oneof(Tail, St)
     end.

This allows me to do some parsec-like parsing by abusing exceptions. But
the great thing about it is that, once again, I don't need to worry
about error handling here: I just try calling parse functions until one
doesn't crash.
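
The same idea can be tried in isolation with a reduced sketch. It is not the
asciideck code: the module name, the three parse functions and the use of a
plain binary as input are assumptions made so the example is self-contained.
Because the input here is an immutable binary rather than a stateful line
reader process, there is no position to reset between attempts.

```erlang
-module(oneof_sketch).
-export([parse_line/1]).

%% Try each parser in order; the last one accepts any binary, so
%% input that "looks wrong" ends up as a paragraph.
parse_line(Line) ->
    oneof([fun comment_line/1, fun section_title/1, fun paragraph/1], Line).

oneof([], Line) ->
    error({no_parser_matched, Line});
oneof([Parse|Tail], Line) ->
    try
        Parse(Line)
    catch _:_ ->
        %% Any crash means "not this block type": move on to the next.
        oneof(Tail, Line)
    end.

%% "//" followed by a space or tab, then the comment text.
comment_line(<<"//", C, Comment/bits>>) when C =:= $\s; C =:= $\t ->
    {comment_line, Comment}.

%% "== " followed by a non-empty title; the match on true crashes otherwise.
section_title(<<"== ", Title/bits>>) ->
    true = Title =/= <<>>,
    {section_title, Title}.

paragraph(Line) when is_binary(Line) ->
    {paragraph, Line}.
```

For example, oneof_sketch:parse_line(<<"//nospace">>) fails the comment_line/1
guard, fails the section_title/1 head, and is returned as a paragraph.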

So to go back to the topic at hand, I would say forget about the 
distinction between must and may, and truly embrace "happy path" 
programming and make smart use of exceptions. Deal with errors in one 
place instead of having nested cases/many functions. There are of course 
other ways to do this, but only exceptions let you do this both in the 
local process and in a separate process, depending on your needs.
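
To illustrate the "separate process" half of that claim, here is one possible
sketch using a monitor. The module name and run/1 are hypothetical; the
{request_error, What, Msg} shape is borrowed from the Cowboy example earlier
in this message. The work runs in its own process and the caller inspects the
exit reason from the 'DOWN' message instead of using try/catch.

```erlang
-module(remote_catch).
-export([run/1]).

%% Run Fun in a monitored process. A normal return comes back as a
%% message; an exit is observed through the monitor's 'DOWN' message.
run(Fun) ->
    Parent = self(),
    Tag = make_ref(),
    {Pid, MRef} = spawn_monitor(fun() -> Parent ! {Tag, Fun()} end),
    receive
        {Tag, Result} ->
            %% Drop the monitor and any 'DOWN' message already queued.
            erlang:demonitor(MRef, [flush]),
            {ok, Result};
        {'DOWN', MRef, process, Pid, {request_error, What, _Msg}} ->
            {error, What};
        {'DOWN', MRef, process, Pid, Reason} ->
            {crashed, Reason}
    end.
```

The worker code is still written for the happy path; only the place where the
error is handled has moved to another process.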

(I will now expect horrified replies from purists. Do not disappoint.)

Cheers,

-- 
Loïc Hoguin
https://ninenines.eu