[erlang-questions] Idiomatically handling multiple validation checks

Fred Hebert mononcqc@REDACTED
Tue Dec 6 14:46:03 CET 2016


On 12/06, zxq9 wrote:
>That's one way to do it... but most code I've seen that makes heavy use
>of `throw` has insane structural issues (either actual bugs, or mental
>grenades with C++ style confusing execution).
>
> [...]
>
>Why not return `{error, Reason}` back up the chain of calls? Then the
>top-level caller can choose whether to crash on an `{ok, Value}` or
>`{error, Reason}` assertion match, or engage in their own insanity with
>`throw` .. `catch` and other nonsense.
>

I find this depends a whole lot on how much validation there is to do.  
If there's 3-5 rules, then nested cases are easy to handle, to write, 
and to read.
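
For a handful of rules, the nested version might look like the sketch 
below. The module name, the map keys `type` and `card_verified`, and the 
error atoms are assumptions made up for illustration; each rule is 
assumed to return `ok` or `{error, Reason}`.

```erlang
-module(nested_validate).
-export([validate/1]).

%% Three rules checked in order; the first failure is returned as-is.
-spec validate(map()) -> {ok, map()} | {error, term()}.
validate(Data) ->
    case min_age(18, Data) of
        ok ->
            case user_type(admin, Data) of
                ok ->
                    case verified_credit_card(Data) of
                        ok -> {ok, Data};
                        {error, _} = CardErr -> CardErr
                    end;
                {error, _} = TypeErr -> TypeErr
            end;
        {error, _} = AgeErr -> AgeErr
    end.

min_age(Req, #{age := Given}) when Given >= Req -> ok;
min_age(_, #{age := Given}) -> {error, {too_young, Given}};
min_age(_, _) -> {error, age_missing}.

%% Hypothetical rules: the keys below are not from the original post.
user_type(Type, #{type := Type}) -> ok;
user_type(_, _) -> {error, wrong_user_type}.

verified_credit_card(#{card_verified := true}) -> ok;
verified_credit_card(_) -> {error, unverified_credit_card}.
```

Three levels is still readable; every extra rule adds one more level of 
indentation and one more error clause.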

When you've got 15-30 rules if not more, then they become cumbersome 
extremely rapidly. For larger cases, there's a few discriminating 
questions I like to ask:

- Do you have to interoperate with other pieces of code you haven't 
  written or is this a thing you fully control?
- Do you need to tell the user what failed, or are you just pruning bad 
  data?
- Do you need to find all the errors, or will the first one you find do?

The first question can pretty much decide the solution for you, but 
adapters can always be built.
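
As one possible shape for such an adapter, the sketch below wraps a 
validator that signals failure with `throw` so that it returns a tagged 
tuple instead. The module and function names are hypothetical, and it 
assumes the wrapped code only uses `throw` (not `error` or `exit`) for 
validation failures.

```erlang
-module(validation_adapter).
-export([to_result/2]).

%% Run F on Data. A normal return counts as success, whatever the
%% value; a thrown term is turned into {error, Reason}. Errors and
%% exits are deliberately not caught, so real crashes still crash.
-spec to_result(fun((term()) -> term()), term()) -> ok | {error, term()}.
to_result(F, Data) ->
    try F(Data) of
        _ -> ok
    catch
        throw:Reason -> {error, Reason}
    end.
```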

The second question tells me how much information you have to be able to 
provide back. If you're just pruning and/or returning data, a quick and 
clean-ish pattern is really just to use throw and exceptions, but I like 
to have them named and in order:

    -spec validate(map()) -> {ok, map()} | {error, term()}.
    validate(Data) ->
        try
            min_age(18, Data),
            user_type(admin, Data),
            verified_credit_card(Data),
            ...
            user_status(regular, Data), % not suspended or validating
            {ok, Data}
        catch
            Reason -> {error, Reason}
        end.
    
    min_age(Req, #{age := Given}) when Given >= Req -> ok;
    min_age(_, #{age := Given}) -> throw({too_young, Given});
    min_age(_, _) -> throw(age_missing).
    
    ...

This preserves a seemingly functional interface and makes it easy to 
rearrange the validation order (for instance, to put cheap or frequently 
failing checks early).

Now the downside with this is that if your user is an actual person and 
they have to massage their data, you're gonna give them the real shitty 
experience of "submitting the form 30 times to see what gives". Instead 
we can use validation functions like:

    -spec validate(map()) -> {ok, map()} | {error, [term()]}.
    validate(Data) ->
        Funs = [fun(D) -> min_age(18, D) end
               ,fun(D) -> user_type(admin, D) end
               ,fun verified_credit_card/1
                ...
               ,fun(D) -> user_status(regular, D) end % not suspended or validating
               ],
        case validate(Data, Funs) of
            [] -> {ok, Data};
            List -> {error, List}
        end.

    -spec validate(map(), [fun()]) -> [term()].
    validate(Data, Funs) ->
        lists:filtermap(
                fun(F) ->
                    case F(Data) of
                        {error, Term} -> {true, Term};
                        _ -> false
                    end
                end,
                Funs
        ).
    
    min_age(Req, #{age := Given}) when Given >= Req -> ok;
    min_age(_, #{age := Given}) -> {error, {too_young, Given}};
    min_age(_, _) -> {error, age_missing}.
    
    ...

What this does is change filters like min_age/2 to be functional (no 
exception throwing!) and use the lists:filtermap/2 function to keep only 
the error terms in a list. This list is then packaged as an error and 
lets the caller know everything that went wrong.
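
To make the returned shape concrete, here is a cut-down, self-contained 
sketch with only two rules. It is not the code above: it swaps 
lists:filtermap/2 for a list comprehension, which is just another 
spelling of the same filtering. The module name, the `card_verified` 
key, and the `unverified_credit_card` atom are assumptions.

```erlang
-module(collect_errors).
-export([validate/1]).

-spec validate(map()) -> {ok, map()} | {error, [term()]}.
validate(Data) ->
    Funs = [fun(D) -> min_age(18, D) end,
            fun verified_credit_card/1],
    %% Keep the Reason of every rule that returned {error, Reason}.
    %% Rules returning ok do not match the generator pattern and are
    %% skipped. Every rule runs, in list order.
    case [Reason || F <- Funs, {error, Reason} <- [F(Data)]] of
        [] -> {ok, Data};
        Errors -> {error, Errors}
    end.

min_age(Req, #{age := Given}) when Given >= Req -> ok;
min_age(_, #{age := Given}) -> {error, {too_young, Given}};
min_age(_, _) -> {error, age_missing}.

%% Hypothetical rule: the card_verified key is an assumption.
verified_credit_card(#{card_verified := true}) -> ok;
verified_credit_card(_) -> {error, unverified_credit_card}.
```

With these two rules, `collect_errors:validate(#{age => 12})` returns 
`{error, [{too_young, 12}, unverified_credit_card]}`: both failures are 
reported at once, in the order the rules were listed.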

The interesting effect is that all of the small validation routines end 
up following a clear functional pattern. They are simpler to test and 
reason about, and they can still be used with a try ... catch as earlier 
by wrapping each result in a function like:

    throw_error({error, R}) -> throw(R);
    throw_error(X) -> X.
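
As a rough sketch of how that wrapper could be combined with the earlier 
try ... catch layout, assuming the rules return `ok` or `{error, Reason}` 
(the module name, the `card_verified` key, and the credit card rule's 
error atom are made up for the example):

```erlang
-module(first_error).
-export([validate/1]).

%% Functional rules, but validation stops at the first failure.
-spec validate(map()) -> {ok, map()} | {error, term()}.
validate(Data) ->
    try
        throw_error(min_age(18, Data)),
        throw_error(verified_credit_card(Data)),
        {ok, Data}
    catch
        %% Only thrown terms are caught; real crashes propagate.
        throw:Reason -> {error, Reason}
    end.

throw_error({error, R}) -> throw(R);
throw_error(X) -> X.

min_age(Req, #{age := Given}) when Given >= Req -> ok;
min_age(_, #{age := Given}) -> {error, {too_young, Given}};
min_age(_, _) -> {error, age_missing}.

%% Hypothetical rule: the card_verified key is an assumption.
verified_credit_card(#{card_verified := true}) -> ok;
verified_credit_card(_) -> {error, unverified_credit_card}.
```

The rules themselves stay free of exceptions, so they can be tested one 
by one; only validate/1 decides that the first error ends the run.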

This looks good, but the downside of the functional approach is that 
each validation function may repeat previously done steps, since every 
rule runs whether or not earlier ones failed. For example, a lot of 
validation functions may only make sense once some basic prerequisite 
validation has passed (e.g. the user being fully registered may be a 
requirement before many of the other fields are worth validating).

The exception-based approach makes this very simple by just aborting 
ASAP (depending on how you wrote it). It also means the control flow may 
be a bit harder to follow.

What you may end up with for complex rules is a series of stepwise 
validations, or a hybrid format:

    validate(Data) ->
        try
            throw_errs(validate(Data, format_rules())),
            throw_errs(validate(Data, user_info_rules())),
            ...
            {ok, Data}
        catch
            Err -> {error, Err}
        end.

    throw_errs([]) -> ok;
    throw_errs(Other) -> throw(Other).

    user_info_rules() ->
        [fun verified_credit_card/1,
         ...].
    ...

The same code can be organized as nested case ... of expressions with an 
explicit check at every level, but I'll leave that one for you to 
imagine.

Regards,
Fred.


