[erlang-questions] Trying to learn the Erlang Way

Mon Feb 10 05:30:22 CET 2014

The first thing I note is that there is no need for you ever
to use square root.
[Jon L. Bentley mentions this trick of avoiding the square root
in one of his books.]

\sqrt{x^2 + y^2 + z^2} \le r
if and only if
x^2 + y^2 + z^2 \le r^2

in_sphere({X0,Y0,Z0,R}, {X,Y,Z}) ->
    (DX = abs(X - X0)) =< R andalso
    (DY = abs(Y - Y0)) =< R andalso
    (DZ = abs(Z - Z0)) =< R andalso
    DX*DX + DY*DY + DZ*DZ =< R*R.

Unpack the sphere and precompute R*R once:

in_sphere(X0, Y0, Z0, R, R2, {X,Y,Z}) ->
    (DX = abs(X - X0)) =< R andalso
    (DY = abs(Y - Y0)) =< R andalso
    (DZ = abs(Z - Z0)) =< R andalso
    DX*DX + DY*DY + DZ*DZ =< R2.

cull({X0,Y0,Z0}, R, Vectors) ->
    R2 = R*R,
    [V || V <- Vectors, in_sphere(X0, Y0, Z0, R, R2, V)].

If your task (like selecting elements from a list) fits what
a list comprehension can express clearly, you should normally
use a list comprehension rather than your own recursive
function, just because it is so much easier to see at a glance
what is going on.
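
To make the comparison concrete, here is a minimal sketch of the same
selection written both ways.  The module name sphere_demo and the
function names inside/3, keep_lc/3 and keep_rec/3 are made up for
illustration; they are not from the original code.

```erlang
-module(sphere_demo).
-export([keep_lc/3, keep_rec/3]).

%% True if point {X,Y,Z} lies within squared radius R2 of {X0,Y0,Z0}.
inside({X0,Y0,Z0}, R2, {X,Y,Z}) ->
    DX = X - X0,
    DY = Y - Y0,
    DZ = Z - Z0,
    DX*DX + DY*DY + DZ*DZ =< R2.

%% List comprehension: the selection is visible at a glance.
keep_lc(C, R, Points) ->
    R2 = R*R,
    [P || P <- Points, inside(C, R2, P)].

%% Hand-written recursion doing the same job with an accumulator.
keep_rec(C, R, Points) ->
    keep_rec(C, R*R, Points, []).

keep_rec(_C, _R2, [], Acc) ->
    lists:reverse(Acc);
keep_rec(C, R2, [P|Ps], Acc) ->
    case inside(C, R2, P) of
        true  -> keep_rec(C, R2, Ps, [P|Acc]);
        false -> keep_rec(C, R2, Ps, Acc)
    end.
```

Both functions return the kept points in their original order; the
recursive one needs an accumulator, a reverse, and a case to say what
the comprehension says in one line.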

On 7/02/2014, at 8:25 PM, kraythe . wrote:

> I am a newbie to erlang so please excuse the newbie questions. To learn the language I have a use case, simply to take a vector C which defines the center of a sphere with radius R and then cull a list of vectors and return only the vectors in the sphere. A simple math problem: the most efficient method is to calculate the difference between C and a vector V, and if any component is greater than R then the vector can't be in the sphere; if not, then we have to do the expensive sqrt magnitude calculation.
> 
> To this end I have created a simple set of functions: 
> -module(vector3d).
> -author("rsimmons").
> 
> %% API
> -export([subtract/2, magnitude/1, scale/2, cull/3]).
> 
> %% Subtract the second vector from the first
> subtract({X1, Y1, Z1}, {X2, Y2, Z2}) -> {(X1 - X2), (Y1 - Y2), (Z1 - Z2)}.
> 
> %% Compute the magnitude of the vector
> magnitude({X, Y, Z}) -> math:sqrt((X * X) + (Y * Y) + (Z * Z)).
> 
> %% Determines if any coordinate in the given vector is bigger than the passed in value V.
> coordinateGreaterThan(V, {X, Y, Z}) when X > V; Y > V; Z > V -> true;
> coordinateGreaterThan(_, _) -> false.
> 
> %% Culls the list of vectors X to only those that are in the sphere defined by vector C as a center and R as a radius.
> cull(C, R, Vectors) when is_number(R), is_tuple(C) -> cull(C, R, Vectors, []).
> cull(C, R, Culled, []) -> Culled;
> cull(C, R, Culled, [Head|Tail]) ->
>   D = subtract(C, Head),
>   case coordinateIsGreaterThan(R, Head)	of
>     true -> cull(C, R, Culled, Tail);
>     false -> cullOnMagnitude(C, R, D, Culled, Head, Tail)
>   end.
> 
> cullOnMagnitude(C, R, D, Culled, Head, Tail) when is_tuple(D), is_number(R) ->
>   M = magnitude(D),
>   if M > R -> cull(C, R, Culled, Tail);
>      true -> cull(C, R, [Head | Culled], Tail)
>   end.
> 
> When I load these into a file and compile them I get the following result: 
> 
> 16> c(vector3d).
> vector3d.erl:53: function coordinateIsGreaterThan/2 undefined
> vector3d.erl:45: Warning: function coordinateGreaterThan/2 is unused
> vector3d.erl:50: Warning: variable 'C' is unused
> vector3d.erl:50: Warning: variable 'R' is unused
> 
> Note that there are some other methods in this file so the line numbers will vary. My questions are: 
> 
> 1) Why is it saying my coordinateIsGreaterThan/2 method is undefined when I can see it here and then to say it is unused in the next line?

Because, asYouMightHavePredicted, itIsHardToRead wordsJammedTogetherLikeThis.

You *define* (coordinate)(Greater)(Than)/2 but
you *call*   (coordinate)(Is)(Greater)(Than)/2.

If you separated the words so that you could read them like text
-- which is indeed the usual Erlang style, thisIsNotJava --
it would be easier to see the presence of "is" in
coordinate_is_greater_than or the absence of "is" in
coordinate_greater_than.
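
For illustration, the quoted guard-based test might be renamed along
these lines.  The module name naming_demo is hypothetical, and the
clauses are copied unchanged from the quoted function; only the
spelling of the name differs.

```erlang
-module(naming_demo).
-export([coordinate_is_greater_than/2]).

%% Same clauses as the quoted coordinateGreaterThan/2, with the words
%% separated by underscores so a missing "is" is easy to spot.
coordinate_is_greater_than(V, {X, Y, Z}) when X > V; Y > V; Z > V -> true;
coordinate_is_greater_than(_, _) -> false.
```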

> 3) I understand the variables not being used but it seems odd to me from the java language world to not define the names for the variables. Is this normal in erlang?

There is a big difference between "I have to have _some_ variable name
in here so I've slapped something in and I don't care what it is" and
"OOPS!  I _meant_ to use this name but accidentally left out a line."

In fact this happened to me in this reply.  In in_sphere/6, I had
introduced R2, but forgot to replace R*R by R2.  Erlang warned me
that R2 was unused.  I *needed* that warning.

It is not unusual for compilers, even for imperative languages like
Java (yes, it is OO, but OO is a subclass of imperative) to warn
about variables whose values are not used.  See the "-Wunused..."
command line options for GCC, for example.  Those warnings are
available because they very often indicate an error.

I would expect good Java programmers to use checking tools at least
as good as FindBugs.  "DLS: Dead store to local variable (DLS_DEAD_LOCAL_STORE)"
is one of the things FindBugs checks for.  "This instruction assigns a value
to a local variable, but the value is not read or used in any subsequent
instruction.  Often, this indicates an error, because the value computed
is never used."

So it is normal in Erlang to use

	A_Normal_Variable_Name
	    when you intend the value to be used
	_
	    when you intend the value NOT to be used
	_Flagged_Variable_Name
	    when you intend the value NOT to be used
	    and want to be clear about what it is you
	    are not using.
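
A small sketch of the three spellings in use.  The module and function
names (unused_demo, x_coord/1, first/2) are invented for this example.

```erlang
-module(unused_demo).
-export([x_coord/1, first/2]).

%% X is used.  _Y and _Z are deliberately ignored, and the flagged
%% names still document which components they are.
x_coord({X, _Y, _Z}) -> X.

%% The second argument is ignored and needs no description.
first(A, _) -> A.
```

This compiles without unused-variable warnings.  Writing {X, Y, Z} in
x_coord/1 instead would produce a warning for Y and for Z.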

Turn to the definition

> magnitude({X, Y, Z}) -> math:sqrt((X * X) + (Y * Y) + (Z * Z)).

That has a style problem: the obfuscatory excess parentheses.

magnitude({X,Y,Z}) ->
    math:sqrt(X*X + Y*Y + Z*Z).

It also has a numeric problem.  There's a reason why C has the
hypot() function.  It's a crying shame that Erlang doesn't have it.
If you are willing to live with avoidable problems of X*X+Y*Y+Z*Z,
then as noted above you can avoid sqrt by comparing against R*R.

What are those problems?  Overflow when overflow is not necessary;
underflow (and possible flush to zero) when underflow is not
necessary.  Yes, it is possible to find X,Y,Z,R such that in real
arithmetic sqrt(X*X+Y*Y+Z*Z) > R but in floating point arithmetic
X*X+Y*Y+Z*Z is zero.  This has nothing to do with Erlang and
everything to do with floating point arithmetic.
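
The following sketch shows both failure modes, plus one common way of
avoiding them by scaling before squaring.  The names float_demo,
naive_inside/4 and hypot3/3 are hypothetical; hypot3/3 is only an
illustration of the scaling idea, not a library function and not a
carefully tuned hypot().

```erlang
-module(float_demo).
-export([naive_inside/4, hypot3/3]).

%% The sum-of-squares test, applied to coordinate differences.
naive_inside(DX, DY, DZ, R) ->
    DX*DX + DY*DY + DZ*DZ =< R*R.

%% Scaled magnitude: divide by the largest component before squaring,
%% so the intermediate squares stay in range.
hypot3(DX, DY, DZ) ->
    M = max(abs(DX), max(abs(DY), abs(DZ))),
    if
        M == 0 -> 0.0;
        true ->
            A = DX / M,
            B = DY / M,
            C = DZ / M,
            M * math:sqrt(A*A + B*B + C*C)
    end.
```

With DX = 1.0e-200 and R = 1.0e-201, naive_inside/4 answers true
because both squares underflow to 0.0, even though DX is ten times R.
With DX = 1.0e200 the multiplication overflows, which Erlang reports
as a badarith error rather than returning an infinity.  hypot3/3
handles both inputs, at the price of three divisions and the square
root.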

But wait!  Did I say this has nothing to do with Erlang?
Wrong!  If DX, DY, DZ, and R are *integers*, then
DX*DX + DY*DY + DZ*DZ =< R*R
will NOT overflow in Erlang, it will just give you precisely
correct answers.  If in your problem, your co-ordinates and
radii are integers, Erlang lets you stop worrying IF you
avoid the square root.
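
A sketch of the integer case, using deliberately huge values.  The
module int_demo and the helper pow10/1 are made up for the example.

```erlang
-module(int_demo).
-export([inside/4, pow10/1]).

%% Same comparison as before; with integer arguments every product and
%% sum is computed exactly, however large.
inside(DX, DY, DZ, R) ->
    DX*DX + DY*DY + DZ*DZ =< R*R.

%% 10 to the power N as an exact integer, for N >= 0.
pow10(N) when is_integer(N), N >= 0 ->
    pow10(N, 1).

pow10(0, Acc) -> Acc;
pow10(N, Acc) -> pow10(N - 1, Acc * 10).
```

Taking Big = int_demo:pow10(200), the call inside(Big, 0, 0, Big) is
true and inside(Big + 1, 0, 0, Big) is false: a difference of one part
in 10^200 is still detected, where floats of that size would have
raised badarith.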

With the square root eliminated, it's not clear that the
early exit checks pay for themselves.  You might find that

in_sphere(X0, Y0, Z0, _R, R2, {X,Y,Z}) ->
    DX = X - X0,
    DY = Y - Y0,
    DZ = Z - Z0,
    DX*DX + DY*DY + DZ*DZ =< R2.

is as fast as you need.
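
The only way to know is to measure.  A rough sketch using timer:tc/1
follows; the module name sphere_bench and the function names
with_exits/6, plain/5 and compare/3 are invented here, and a single
timer:tc/1 run is a crude measurement, so treat the numbers as a hint
rather than a benchmark.

```erlang
-module(sphere_bench).
-export([compare/3]).

%% Version with the per-axis early exits.
with_exits(X0, Y0, Z0, R, R2, {X,Y,Z}) ->
    (DX = abs(X - X0)) =< R andalso
    (DY = abs(Y - Y0)) =< R andalso
    (DZ = abs(Z - Z0)) =< R andalso
    DX*DX + DY*DY + DZ*DZ =< R2.

%% Version without them.
plain(X0, Y0, Z0, R2, {X,Y,Z}) ->
    DX = X - X0,
    DY = Y - Y0,
    DZ = Z - Z0,
    DX*DX + DY*DY + DZ*DZ =< R2.

%% Returns the elapsed microseconds for each version on the same input.
compare({X0,Y0,Z0}, R, Points) ->
    R2 = R*R,
    {T1, _} = timer:tc(fun() ->
                  [P || P <- Points, with_exits(X0, Y0, Z0, R, R2, P)]
              end),
    {T2, _} = timer:tc(fun() ->
                  [P || P <- Points, plain(X0, Y0, Z0, R2, P)]
              end),
    {{with_exits_us, T1}, {plain_us, T2}}.
```

Run it on a list that resembles your real data: how often the early
exits fire depends on how many points lie far outside the sphere.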

By the way, can I suggest _not_ using the word 'cull'.

It is usually easier to understand if you focus on the
description of the elements you DO want (that is, what to KEEP)
rather than the description of the elements you DON'T want.
Again, this is real: I got confused in exactly this way by your
code.  In English, when we "cull", we specify which animals
are to *go* (cull the sick and deformed), not the ones we want
to *keep*.  Cull "cull" from your programming vocabulary.

A better name in this case would be something like

	filter_points_in_sphere(Sphere, Points) -> Filtered_Points.

Why "filter"?  When you filter coffee, the coffee is what you
*keep*, so natural language does not mislead us, and "filter" is
the conventional name in functional programming generally and
Erlang specifically for "KEEP selected elements of this list".
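
One possible shape for such a function, built on lists:filter/2.  This
is a sketch only: the module name sphere_filter is hypothetical, and
Sphere is assumed to be the 4-tuple {X0,Y0,Z0,R} used in the first
in_sphere definition above.

```erlang
-module(sphere_filter).
-export([filter_points_in_sphere/2]).

%% Keep the points that lie inside or on the sphere.
%% Assumes Sphere = {X0,Y0,Z0,R} and each point is {X,Y,Z}.
filter_points_in_sphere({X0,Y0,Z0,R}, Points) ->
    R2 = R*R,
    lists:filter(
      fun({X,Y,Z}) ->
              DX = X - X0,
              DY = Y - Y0,
              DZ = Z - Z0,
              DX*DX + DY*DY + DZ*DZ =< R2
      end,
      Points).
```

The predicate describes what is kept, so the name and the code agree.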