[erlang-questions] Debug support on for guards?

Tue May 4 04:40:29 CEST 2010

Richard,

thanks a lot for the detailed discussion. I follow almost all of it; 
here's where I am not sure.

For context, the function is part of the encoding and decoding of data 
that is coming in from and going out to a VoltDB server. It will be used 
at a lower level of granularity to read out and create the contents of 
arrays and SQL result table data. It is certain to run into invalid 
input at some point, whether through errors in my programming of this 
API, errors on the part of the business logic programmer, or 
transmission errors when processing serialized data. There is potential 
for range errors, too, because {M,S,U} has a broader range than what 
VoltDB can store. Disallowing any negative values in M,S,U means for now 
disallowing timestamps before Christ. Theoretically, VoltDB could store 
timestamps ~4000 B.C., and at some point this range should be covered, 
with negative values treated as leniently as appropriate.

The encoder function in question would receive time data from the domain 
logic into the API. It may be in any kind of shape if somebody did 
calculations with it. It may also be used to convert serialized data.

1

I do not produce the function_clause Error Reason myself, but this is 
what Erlang produces if I have the >= 0 checks in the guard.

My original question was whether the language offers some means to 
enhance the error that a failing guard produces.

At this point I have come to believe that this would be a useful thing 
and actually the best solution, if it were possible. It would preserve 
the advantages of literate programming that you point out, with the 
least, well, bloat.
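
One middle road, sketched under assumptions: keep every check in the 
guard of the working clause and add a final catch-all clause that raises 
a more descriptive reason than function_clause. The module name 
wire_time_sketch, the reason {invalid_timestamp, Other}, and the value 
of ?BILLION (10^9, milliseconds per megasecond) are assumptions made for 
illustration; Micro div 1000 stands in for trunc(Micro / 1000).

```erlang
-module(wire_time_sketch).
-export([wire_time/1]).

%% Assumption: ?BILLION is 10^9, the number of milliseconds in a megasecond.
-define(BILLION, 1000000000).

%% Normal case: all type and sign checks stay in the guard.
wire_time({Mega, Sec, Micro})
  when is_integer(Mega), is_integer(Sec), is_integer(Micro),
       Mega >= 0, Sec >= 0, Micro >= 0 ->
    %% Integer division truncates toward zero, like trunc(Micro / 1000).
    MilliEpoch = Mega * ?BILLION + Sec * 1000 + Micro div 1000,
    <<MilliEpoch:64>>;
%% Catch-all: whatever the guard above rejected ends up here, so the
%% caller sees the offending term instead of a bare function_clause.
wire_time(Other) ->
    erlang:error({invalid_timestamp, Other}).
```

The catch-all does not say which test failed, only which term was 
rejected, so the guard stays the single place where validity is defined.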

Because if I can't, but still want a custom message to be thrown, I 
would have to pull the checks out of the guards and into the code, which 
is the first step towards bloat, and gives away the elegance of the 
guard and the literate code structure.

Or, as I understood from recent posts, split as below, which is another 
kind of bloat, adding boilerplate in the form of function heads and 
repetitive guards. (And not quite the way of splitting diagnosis and job 
that you proposed.)

But despite the function head repetition, my understanding is growing 
that this is the way to do it in Erlang (I won't assume this to be 
important but I also come to like it):

        [untested]

        % orelse: any one negative element is an error; commas would
        % require all three to be negative
        wire_time({Mega, Sec, Micro})
          when is_integer(Mega), is_integer(Sec), is_integer(Micro),
               (Mega < 0 orelse Sec < 0 orelse Micro < 0) ->
            erlang:error(negative_input_value);

        wire_time({Mega, Sec, Micro})
          when is_integer(Mega), is_integer(Sec), is_integer(Micro) ->

            MilliEpoch = Mega * ?BILLION + Sec * 1000
                         + trunc(Micro / 1000), % (*)
            <<MilliEpoch:64>>.

        --- Instead of: ---

        wire_time({Mega, Sec, Micro})
          when is_integer(Mega), is_integer(Sec), is_integer(Micro) ->

            if (Mega >= 0), (Sec >= 0), (Micro >= 0) -> ok;
            true -> throw(negative_input_value) end,

            MilliEpoch = Mega * ?BILLION + Sec * 1000
                         + trunc(Micro / 1000), % (*)
            <<MilliEpoch:64>>.

        --- Or: ---

        wire_time({Mega, Sec, Micro})
          when is_integer(Mega), is_integer(Sec), is_integer(Micro),
               Mega >= 0, Sec >= 0, Micro >= 0 ->

            MilliEpoch = Mega * ?BILLION + Sec * 1000
                         + trunc(Micro / 1000), % (*)
            <<MilliEpoch:64>>.

However, once a better error detection is put into place, possibly 
accepting a mix of positive and negative values in the tuple, then it 
cannot be in the guards anymore in any event. It has to be calculated 
and then the result tested against the limits, instead of simply testing 
the tuple elements for >= 0.

Would this then best be:

        [untested]

        -define(WIRE_MIN, ...).
        -define(WIRE_MAX, ...).

        wire_time({Mega, Sec, Micro})
          when is_integer(Mega), is_integer(Sec), is_integer(Micro) ->
            wire_time(milli_epoch({Mega, Sec, Micro}));

        % is_integer/1 first: any tuple compares greater than any number
        wire_time(MilliEpoch)
          when is_integer(MilliEpoch), MilliEpoch < ?WIRE_MIN ->
            erlang:error(time_underrun);

        wire_time(MilliEpoch)
          when is_integer(MilliEpoch), MilliEpoch > ?WIRE_MAX ->
            erlang:error(time_overrun);

        wire_time(MilliEpoch) when is_integer(MilliEpoch) ->
            <<MilliEpoch:64>>.

        milli_epoch({Mega, Sec, Micro}) ->
            Mega * ?BILLION + Sec * 1000 + trunc(Micro / 1000).


        --- or, with briefer variable names: ---

        -define(WIRE_MIN, ...).
        -define(WIRE_MAX, ...).

        wire_time({M,S,U}=N)
          when is_integer(M), is_integer(S), is_integer(U) ->
            wire_time(milli_epoch(N));

        % is_integer/1 first: any tuple compares greater than any number
        wire_time(T) when is_integer(T), T < ?WIRE_MIN ->
            erlang:error(time_underrun);

        wire_time(T) when is_integer(T), T > ?WIRE_MAX ->
            erlang:error(time_overrun);

        wire_time(T) when is_integer(T) ->
            <<T:64>>.

        milli_epoch({M,S,U}) ->
            M * ?BILLION + S * 1000 + trunc(U / 1000).


This would be split up then, though again not into (pre-)diagnosis and 
job. It shows a lot of boilerplate repetition of function heads, doesn't it?
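
A minimal sketch of one way to trim the repeated heads: compute the 
millisecond value once and do the range check in a single case 
expression. The limits are passed in as arguments here only because the 
real WIRE_MIN and WIRE_MAX values are not given above; the module name 
wire_range_sketch, the arity-3 signature, and ?BILLION = 10^9 are 
assumptions for illustration.

```erlang
-module(wire_range_sketch).
-export([wire_time/3, milli_epoch/1]).

%% Assumption: ?BILLION is 10^9, the number of milliseconds in a megasecond.
-define(BILLION, 1000000000).

%% Convert {Mega, Sec, Micro} to milliseconds; mixed signs are allowed.
milli_epoch({M, S, U}) when is_integer(M), is_integer(S), is_integer(U) ->
    %% div truncates toward zero, like trunc(U / 1000), without going
    %% through a float.
    M * ?BILLION + S * 1000 + U div 1000.

%% Min and Max are hypothetical stand-ins for ?WIRE_MIN and ?WIRE_MAX.
wire_time(Time, Min, Max) when is_integer(Min), is_integer(Max) ->
    case milli_epoch(Time) of
        T when T < Min -> erlang:error(time_underrun);
        T when T > Max -> erlang:error(time_overrun);
        T -> <<T:64>>
    end.
```

A term that is not a tuple of three integers still fails with 
function_clause, now raised by milli_epoch/1 rather than wire_time/3.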

2

I maintain that in this case I am able to predict a wrong, naive 
interpretation of the error message. Or rather, that I can predict what 
will NOT come to mind as error source for the given Reason 
'function_clause', until one inspects the source, or the docs; namely: 
that a part of the time tuple is negative.

It's a less ambitious prediction than what you are proposing and defeating.

But a programmer other than me who uses this function and happens to 
feed negative values into it will get 'function_clause' back, and she 
will not know what to look for without reading the implementation of the 
function and parsing the error message quite closely. Giving 
"negative_time_value" instead of "function_clause" would be a legitimate 
help to speed things up for her, and not wrong.

Trying to respond back to the server would mostly affect other 
functions, and not this one as I had erroneously proposed.
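
What such an "other function" might look like, as a rough sketch: the 
caller, not the encoder, turns a crash into a term phrased in protocol 
terms, here the {type_error,timestamp,T} shape suggested in the quoted 
reply below. The module name wire_caller_sketch, the function 
encode_timestamp/2, and passing the encoder in as a fun are all 
assumptions for illustration.

```erlang
-module(wire_caller_sketch).
-export([encode_timestamp/2]).

%% Encoder is any fun of arity 1 that returns the encoded value or
%% raises an error on bad input (for example, a wire_time/1 function).
encode_timestamp(Encoder, T) when is_function(Encoder, 1) ->
    try
        {ok, Encoder(T)}
    catch
        %% Deliberately coarse: every error class exception from the
        %% encoder is reported the same way, with the offending term
        %% and no internal detail such as function_clause.
        error:_Reason ->
            {error, {type_error, timestamp, T}}
    end.
```

With this split the encoder can keep its guards as they are, and only 
the caller decides what is reported and to whom.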

I am willing to assume that even if I write a doc (and I will) it will 
not be read, or the remarks about the use of this function forgotten. I 
think such is life and it is forgivable. Giving a bit of a hint in 
errors can go a long way to boost GDP I think. Well, ok, reduce pain.

Regards,
Henning

Richard O'Keefe wrote:
>
> On May 3, 2010, at 11:53 PM, Henning Diedrich wrote:
>
>> Thank you for the advice Richard!
>>
>> The function is to be called often and directly on non-tested data 
>> coming in over the network.
>>
>> So at first sight, splitting up seems to bloat things instead of 
>> making them clearer in this case. There is no moment in time, long up 
>> front, when the diagnosis should be done. I'll give splitting up a 
>> try to learn how that looks.
>>
>
> Funny, I thought the style I proposed was all about *removing* bloat.
>
> One of the reasons I got interested in literate programming, back when
> that was a new thing, was the fact that in typical imperative code,
> the normal case could easily get lost in code dedicated to checking
> arguments and reporting errors, and with literate programming I could
> at least reduce that to
>     <<check arguments>>
>     if (ok) {
>         <<do the real work>>
>     } else {
>         <<report some sort of error>>
>     }
> and then <<do the real work>> could be uncluttered.
>
>>> Why?  What is the programmer supposed to do about it?
>>> More to the point, what is the *program* supposed to do about it?
>> 1) Discard the data he/it is trying to parse and tell the data source 
>> that it was no good, and why.
>
> But to what level of detail?
> Recalling that the data is supposed to have the form
> {M,S,U} where M,S,U are all non-negative integers,
> "why" a term T might be no good could be
>
>     - T is not a tuple
>     - T is a tuple but not of arity 3
>     - T is a tuple of arity >= 1
>       whose first element is not a number
>     - T is a tuple of arity >= 1
>       whose first element is not an integer
>     - T is a tuple of arity >= 1
>       oh heck there are just TOO many possibilities.
>
> And the whole thing is ambiguous.
> Take {-1.2,fred,999999999999999999999}.
> Is it wrong
>     - because the first argument ISN'T AN INTEGER
>     - because the first argument IS NEGATIVE
>     - because the second argument IS AN ATOM
>     - because the third argument IS OUT OF RANGE
> (I don't think the code we've seen so far bothered to check that
> the arguments were within reasonable bounds, but it should have.)
>
> I've been through this once, when I designed the exception handling
> facility in Quintus Prolog and then had to go through all the source
> code ensuring that every single operation available to users
> reported reasonable errors.  I'm going through it again in my
> Smalltalk system, gradually replacing
>     "self error: 'I do not like thee Dr Fell'"
> with
>     (NastyLecturerError subject: drFell critic: self) signal
> and the answer is that as soon as there is more than one value to
> inspect you *can't* report every possible error in a perfect way.
> Take something like
>     anArray copyFrom: firstIndex to: lastIndex
> which does what it looks like (inclusive bounds).
> If lastIndex - firstIndex < -1, which one of them is at fault?
> Any guess you make might be wrong.
>
> I therefore respectfully suggest that any attempt to provide
> fine-grained diagnosis of "why" is misguided.  To start with,
> if someone sets up a data source that is supposed to be sending
> you well formed time stamps, THEY MUST KNOW what a well formed
> time stamp is supposed to look like.  You *can* tell them
> "<this> is not a well formed timestamp".  And that's *all* you
> can or should tell them.  THEY have all the information they
> need to make a diagnosis in a way that makes sense for their
> situation.  You do NOT.
>
> So here's what you should be doing.
>
> You have a protocol between you and your data source.
> Typically, you will have {Action,Operand1,...,Operandn}.
> Define types for the operands (in the UBF sense, which
> includes ranges). If an operand does not conform to its
> type, report that fact, AND NO MORE DETAIL THAN THAT.
>
> Better still, use UBF, and let UBF do the checking.
>
> You *can't* put the reporting in the function that
> decodes a timestamp, for the simple reason that it doesn't
> know who to report the error *to*, or how.  The function
> that decodes a timestamp has that job and no other.
> It could possibly do
>
>     decode({M,S,U}) when ... -> {ok,...};
>     decode(T) -> {error,invalid_timestamp,T}.
>
> But no more.
>
>>
>> 2) If the data was self-generated, e.g. for testing, the programmer 
>> to use this function may make assumptions that are wrong and the 
>> programmer to program this function (me) wants a way to inform him in 
>> case he runs into trouble.
>
> WHY might the programmer make wrong assumptions?
> Is there something desperately wrong with your documentation?
>
> What is it about your situation that lets someone know "this
> operation needs a timestamp" but NOT know what a timestamp
> is supposed to look like?  If they don't know what a timestamp
> is supposed to look like, how are they supposed to make sense
> of your detailed diagnostic?  Above all, HOW ON EARTH are they
> supposed to make sense of a diagnostic that talks about which
> test in some guard failed, when they should never even see the
> guard in question?
>
> If you are testing, why can't you use UBF or a similar scheme of
> your own devising?
>
>> By providing more information than function_clause. Especially if I 
>> can imagine what the wrong assumption may be.
>
> My absolutely uniform experience has been that when people imagine what
> the wrong assumption may be, they are WRONG.  It is worse than useless
> to make such guesses.  Don't ever waste anyone's time or bandwidth on
> imagined causes of errors, just report the errors themselves simply and
> clearly.
>
> You should not be reporting function_clause back to your data source in
> the first place.  You should be reporting {type_error,timestamp,T} or
> something like that which is framed in terms of your PROTOCOL between
> your program and the data source, which neither reveals nor depends on
> ANY internal aspect of your program whatever.
>
> It may be that there is more about your program that you could tell us.
>
>