[erlang-questions] List comprehension puzzler

Wed Sep 21 00:43:13 CEST 2016

Thanks Joe and Jesper.

In my case the ISBN numbers come as callback (if I'm using the right term) from an HTML form. Thus they are by nature untrustworthy. It would be much better to catch format errors before I pass them on to a book info API.
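For concreteness, here is a minimal sketch of the kind of check I mean. The name isbn_format_valid is the one used in this thread, but the body is an illustrative assumption and not any of the versions proposed on the list; the module name isbn_sketch is hypothetical. It assumes the input is a plain string, ignores hyphens and spaces, and applies the standard ISBN-10 and ISBN-13 checksum rules.

```erlang
-module(isbn_sketch).
-export([isbn_format_valid/1]).

%% Hypothetical sketch: Input is a string such as "0-306-40615-2".
%% Hyphens and spaces are dropped, then the checksum is verified.
isbn_format_valid(Input) when is_list(Input) ->
    Chars = [C || C <- Input, C =/= $-, C =/= $\s],
    case length(Chars) of
        10 -> valid_isbn10(Chars);
        13 -> valid_isbn13(Chars);
        _  -> false
    end;
isbn_format_valid(_) ->
    false.

%% ISBN-10: weights 10 down to 1, sum must be divisible by 11.
%% Only the last character may be X (worth 10).
valid_isbn10(Chars) ->
    {Body, [Check]} = lists:split(9, Chars),
    BodyOk = lists:all(fun is_digit/1, Body),
    CheckOk = is_digit(Check) orelse Check =:= $X,
    case BodyOk andalso CheckOk of
        true ->
            Values = [C - $0 || C <- Body] ++ [check_value(Check)],
            Weights = lists:seq(10, 1, -1),
            Sum = lists:sum([V * W || {V, W} <- lists:zip(Values, Weights)]),
            Sum rem 11 =:= 0;
        false ->
            false
    end.

%% ISBN-13: weights alternate 1,3,1,3,...; sum must be divisible by 10.
valid_isbn13(Chars) ->
    case lists:all(fun is_digit/1, Chars) of
        true ->
            Weights = lists:flatten(lists:duplicate(6, [1, 3])) ++ [1],
            Sum = lists:sum([(C - $0) * W || {C, W} <- lists:zip(Chars, Weights)]),
            Sum rem 10 =:= 0;
        false ->
            false
    end.

is_digit(C) -> is_integer(C) andalso C >= $0 andalso C =< $9.

check_value($X) -> 10;
check_value(C) -> C - $0.
```

Anything that fails the check can be rejected at the form handler, so the book info API only ever sees well-formed numbers.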

I do understand that in the scheme of things ISBN validation will have little to no influence on user experience or the performance of the system as a whole. You guys have taught me well that premature optimization is wasted effort.

To confess, I was fishing to learn and, perhaps, help others learn a new skill. Given that folks have proposed several algorithmic solutions to the same problem, it seemed like a teachable opportunity to learn how to compare execution time of the respective solutions. Someday I may need to know how to do this for real.

I do hope I'm not imposing too much on list readers. But it does seem to me like a worthwhile exercise.

Thanks again,

Lloyd

-----Original Message-----
From: "Joe Armstrong" <erlang@REDACTED>
Sent: Tuesday, September 20, 2016 4:34pm
To: "Jesper Louis Andersen" <jesper.louis.andersen@REDACTED>
Cc: "Lloyd R. Prentice" <lloyd@REDACTED>, "Erlang" <erlang-questions@REDACTED>
Subject: Re: [erlang-questions] List comprehension puzzler

On Tue, Sep 20, 2016 at 9:55 PM, Jesper Louis Andersen
<jesper.louis.andersen@REDACTED> wrote:
>
> On Tue, Sep 20, 2016 at 6:52 PM, Lloyd R. Prentice <lloyd@REDACTED>
> wrote:
>>
>> Question: how can we time the proposed solutions to compare performance?
>
>
> One solution is https://github.com/jlouis/eministat
>
> But one has to weigh the most efficient solution against two things:
>
> * Which solution is the most readable and elegant. It is more likely to be
> correct.
> * How many ISBN numbers per second are we looking at?
>
> Modern computers are unfairly quick at computation once data are on the CPU
> itself. So unless you have a very large count of ISBN numbers to verify, I
> would perhaps spend my time elsewhere in the code base. Your system's overall
> efficiency is likely to suffer from other factors than a single ISBN
> verification.

Agreed - if you think about it, the ISBN numbers must have come from
*somewhere* - if they have come from disk then at a guess most of the
time will be spent reading and parsing the input. If they come from a
database most of the time will be in format conversions.

Once the ISBN is in RAM then any processing involved will be fast.
Modern processors execute billions of instructions per second, so
converting millions of ISBNs per second should not be a problem
(reading and *parsing* millions of numbers per second would be a
problem).

The 'old truth' was that most time in most applications was spent in I/O,
not in computation (apart from computationally heavy problems).

The problem with comparing different versions of the routine
isbn_format_valid is that you don't see the big picture. If isbn_format_valid
only takes 1% of the total time then it doesn't matter if you
optimize it.
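To put a number on the 1% case, here is a small sketch using the standard Amdahl's law formula (the formula is not from this thread, and the module and function names are made up for illustration). Fraction is the share of total run time spent in the routine, LocalSpeedup is how much faster the routine itself becomes.

```erlang
-module(speedup_sketch).
-export([overall_speedup/2, max_speedup/1]).

%% Amdahl's law: overall speedup when a part taking Fraction of the
%% total time is made LocalSpeedup times faster.
overall_speedup(Fraction, LocalSpeedup) ->
    1 / ((1 - Fraction) + Fraction / LocalSpeedup).

%% Upper bound: the routine takes no time at all after optimizing.
max_speedup(Fraction) ->
    1 / (1 - Fraction).
```

With Fraction = 0.01, making the routine ten times faster speeds the whole program up by a factor of about 1.009, and even reducing the routine to zero cost gives only about 1.0101, i.e. roughly a 1% gain.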

The old advice was: write as clearly as possible, measure, then
optimize the least efficient part *if necessary*.

In my experience optimization is hardly ever needed - of all the code I've
ever written only a tiny fraction ever really needed optimizing.

By optimization I mean choosing complex code that is fast, rather than simple
code that is easy to understand. Choosing a smart algorithm that is
intrinsically efficient is not something that I consider to be an
optimisation but rather good design.

The advice - "write first - then measure" is difficult to follow since
measurements are difficult to make and interpret. The only thing I
know is that my intuition as to where the time really goes is almost
always wrong (apart from in I/O which is always slow)
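As a starting point for the measuring step, here is a minimal sketch built on timer:tc/1 from the standard library, which returns {Microseconds, Result}. The module name timing_sketch and the helper names are hypothetical. It times N calls in one go and reports the average, since a single call is usually too short to time reliably.

```erlang
-module(timing_sketch).
-export([time_per_call/3]).

%% Calls Fun(Arg) N times and returns the average wall-clock time
%% per call in microseconds (as a float).
time_per_call(Fun, Arg, N) when is_function(Fun, 1), is_integer(N), N > 0 ->
    {Micros, ok} = timer:tc(fun() -> repeat(Fun, Arg, N) end),
    Micros / N.

%% Tail-recursive loop; the result of each call is discarded.
repeat(_Fun, _Arg, 0) ->
    ok;
repeat(Fun, Arg, N) ->
    _ = Fun(Arg),
    repeat(Fun, Arg, N - 1).
```

For example, timing_sketch:time_per_call(fun Mod:isbn_format_valid/1, "0-306-40615-2", 1000000), where Mod is whichever module holds the candidate. Treat the result as a rough indication only: it includes loop overhead, is affected by garbage collection and scheduling, and says nothing about how much of the whole system's time the routine accounts for.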

/Joe

>
>
> --
> J.