[erlang-questions] Idiomatically handling multiple validation checks

Tue Dec 6 07:57:03 CET 2016

On 2016年12月5日 月曜日 18:16:16 qp wrote:
> Hi, I am new to Erlang and wrote this code for validating that the Name,
> Action & Target atoms passed in to validRequest are all valid.
> 
> validRequest(valid, valid, Target) ->
>   case validName(Target) of
>       true -> true;
>       false -> false
>   end;
> validRequest(valid, Action, Target) ->
>   case validAction(Action) of
>       true -> validRequest(valid, valid, Target);
>       false -> false
>   end;
> validRequest(Name, Action, Target) ->
>   case validName(Name) of
>       true -> validRequest(valid, Action, Target);
>       false -> false
>   end.
> 
> I've since refactored into
> 
> validRequest(Name, Action, Target) ->
>   validName(Name) and validAction(Action) and validName(Target).
> 
> I'm curious what a more idiomatic way of writing this would be? I've seen a
> mix of styles using booleans for return values and tuples, with errors
> listed. I don't think that's needed here but I'm just curious how you would
> handle multiple validation checks, especially if they were more complicated
> than this example. Thank you!

We don't often actually do many validation checks, more like a circumstance
check. Many times you will have a case where you want to take some specific
action only if *everything* is good to go and some other action if
*anything* is not good:

validate_request(valid, valid, valid) -> true;
validate_request(_, _, _)             -> false.

But this is sort of weird, because normally I wouldn't want a boolean, I
would want to be doing something already (also... notice that the function
name is not wonkyCamelCase -- use underscores between parts of a word in
atoms, modules and function names; its sort of a thing around here):

validate_request(valid, valid, valid) ->
	do_whatever();
validate_request(_, _, _) ->
	{error, invalid!}.

But why are we resolving things to the atom 'valid' in the first place?
This makes me think something else is going on way back in the code that is
unidiomatic to start with.

Consider if maybe you have a process, and you want that process to be doing
something specific once a certain point is reached. Let's say you have the
state of that process stored in a record. Maybe it is parsing a stream of
bytes that is supposed to be a chat message and you need to pull out the
channel, user, and timestamp before taking some action:

-record(message,
        {channel   = none  :: none | string(),
         user      = none  :: none | string(),
         timestamp = none  :: none | timestamp(),
         message   = none  :: none | string()}).

The parts after the '::' are the type specification. In this case we are
saying that the 'channel' attribute could be either the atom 'none' or
a string -- but that it will default to being 'none' if we just create the
record without defining the 'channel' attribute.

So we are receiving some data in a stream and calling a parse/2 function
over it every time we receive another segment. Which part of the message
we are in might have to do with what parts of the record are still 'none':

parse(Segment, Chat = #message{channel = none}) ->
    case parse_channel(Segment) of
        {ok, Channel, Remainder} ->
            parse(Remainder, Chat#message{channel = Channel});
        {not_enough_data} ->
            {incomplete, Segment, Chat}
    end;
parse(segment, Chat = #message{user = none}) ->
    case parse_user(Segment) of
        {ok, User, Remainder} ->
            parse(Remainder, Chat#message{user = User});
        {not_enough_data} ->
            {incomplete, Segment, Chat}
    end;
parse(Segment, Chat = #message{timestamp = none}) ->
    % ...you get the idea...

That could be naked arguments, it could be in a record, it could be
any form where you can match to make a decision.

Notice how utterly the same the code above is. It is common to see code
sort of like this, but pretty soon you'll figure out how to refactor the
kind of thing above into something more concise. But concise code wasn't
the subject of your question, idiomatic situation checking was -- and
the code above is doing more like what we would be used to seeing.

If you handle you data an unidiomatic way, or enter your functions in an
unidiomatic way you can often find yourself forced into writing some pretty
unidiomatic function bodies -- and that sucks. It happens to me all the time
that long after starting a project I realize something fundamental about the
nature of the problem I'm solving and in one fell swoop a hundred or more
lines of frustrating, unidiomatic code just becomes obsolete. Poof!

You'll probably have this experience at least a few times as you play with
Erlang a bit more.

Speaking of unidiomatic entries to functions... This reminds me of a
question on SO a while back that was sort of like this, where I suspected
the guy had a bit of an X-Y problem, in that he was asking how to make
a function better (un-nest some nested cases, specifically) that really
seemed to call for refactoring further back in the program before the call
in question is made:
http://stackoverflow.com/questions/28673437/erlang-nested-cases/28673997#28673997

Blah blah. I hope I gave you some ideas. It is good that you are not
letting yourself just be satisfied with annoyingly unidiomatic code!

Have fun!

-Craig