[erlang-questions] Refactoring in anger

Sun Aug 9 00:07:51 CEST 2015

We've had an ongoing thread on how to handle some problems
idiomatically in Erlang:

http://erlang.org/pipermail/erlang-questions/2015-August/085382.html

The OP's questions are coming out of an exercise here:

http://chimera.labs.oreilly.com/books/1234000000726/ch05.html#CH05-ET01

This in turn points to a proposed solution here:

http://chimera.labs.oreilly.com/books/1234000000726/apa.html#_literal_geom_erl_literal_9

Upon reading this solution, I became so enraged [1] that I rewrote the
module and want to make a number of points.

Here's my rewrite:

https://gist.github.com/gar1t/7bb80d728f804554ac32

The tone of my points below is *very preachy* which is going to annoy
some people here. I apologize in advance - but hey, ranters gotta
rant.

# Clarity of "what's going on"

Compare my area/0 to the original area/0. Which is easier to see
"what's going on"? I'm not boasting here but rather making the most
important point I can about programming: take the time to be clear in
your intent! If you're actually clear - in your brain - making the
code reflect your understanding *is not that hard*. If your code is
not clear, chances are your brain is not clear.

Maybe the original code works, maybe it doesn't. I can't tell from
looking at that function, at all. I have to dig around various
implementation details and hold a bunch of information in my brain to
understand what the author is maybe trying to do. At some point I
can't keep it all straight and have to run the thing and observe its
behavior. *Now* I'm thinking, what happens to the pour schlep who has
to modify this code. In the real world, this causes fear, and
loathing, and tests - lots and lots of tests!

We don't have to live this way.

# Separation of concerns (in this case IO vs calculations)

The original calculate/3 mixes user facing behavior (print error
messages) with a calculation. If your function returns a mishmash of
completely unrelated values (e.g. the result of io:format and also the
result of an area calculation) *it's a mess*.

In the rewrite I've added print_area/0, which is responsible for
displaying results to the user. area/0 returns an area or raises an
exception. print_area/0 handles both the result and any error.

# Handling invalid input

The original thread here involves a discussion on how to handle bad
input to a function. My rewrite does one of two things on this front:

- If input is from a user, there's an explicit exception raised on bad
input that can be used by an error handler to inform the user

- If input is not from a user but rather internal, I don't handle bad
input at all, but let Erlang crash with a function or case clause
error

The first case address the user experience. We could let exceptions
just propagate and upset users with arcane Erlang messages. Or we can
handle errors politely with intelligible messages.

The original ask_erl.erl handles invalid input by passing atoms like
'error' and 'unknown' along a call chain. This is tedious and finally
culminates in the original calculate/3 - a monster mishmash function
of error handling, IO, and calculation.

My rewrite raises an exception for those functions that take user
provided input. I prefer exceptions in this case as they keep return
values on the happy path, which makes code easier to read.

I don't care about handling internal errors, as long as they manifest
in an obvious way.

# Avoid variable assignment/binding inside case expressions

Just don't do this:

  case Shape of
      rectangle -> Numbers = get_dimensions("width", "height");
      triangle -> Numbers = get_dimensions("base", "height");
      ellipse -> Numbers = get_dimensions("major axis", "minor axis")
  end

Now that you're not allowed to do that, what? Hey, a function!

  Numbers = get_dimensions(Shape)

Every time, all the time.

# Consider not using case expressions at all

Think of a function as a named case expression with a well defined scope.

You'll find that having to name the function forces you to think about
"what's going on" there. It will help your reader (often that means
you, later on) to understand your intention.

My rewrite doesn't use a single case expression. Is it somehow worse?
It's better!

# If it looks confusing, it's bad!

Erlang is not C, or bash, or Perl, or Ruby. It's possible to write
really easy-to-read code in Erlang. People who complain about Erlang
syntax are probably complain about terrible code in Erlang. Terrible
code in any language is worth complaining about - but it's unrelated
to syntax.

It's easy to spot bad code in Erlang:

- Long functions

- Excessive nesting (more than two levels is a train wreck, and IMO
more than one is bad)

I hate nesting so much that I'll go to the trouble of writing
to_number/1 as a list of "try" attempts (see rewrite). Some people
call this monadic, which I like because it sounds super cool.

- Variable assignment/binding inside case and if expressions (see above)

- Functions that are a litany of imperative style instructions, like this:

  get_number(Prompt) ->
      Str = io:get_line("Enter " ++ Prompt ++ " > "),
      {Test, _} = string:to_float(Str),
      case Test of
          error -> {N, _} = string:to_integer(Str);
          _ -> N = Test
      end,
      N.

This fails the "long functions" test - but lines per function is just
a proxy. The real problem here is that it takes too much effort to
read and understand. Really, this is an imperative pattern applied
naively to Erlang. No!

Try this:

  number_from_user(Prompt) ->
      to_positive_number(prompt_user(["Enter ", Prompt, " > "])).

Where's the rest of the code? It's there, inside the other functions.
But *this* function doesn't make you deal with that detail because it
wants you to understand what *it* is doing.

Okay, so I've been pretty critical here of this online training
resource, which some will consider bad form. So let's turn this into
something nice and kind!

The subject of this email is "refactoring in anger", which turns out
to be something I routinely do with my own code. It's hard to write
the perfect code in one pass. So I tend to just "get it to work" and
then examine what seems to be working very carefully - and then change
it so that it becomes *super obvious* what's going on. This state can
take a few passes and still might have warts. Okay, no sweat - it's
just code. Nothing to get angry about. Calm down and fix it.

Over time, when you practice this pattern of refactoring, you'll start
writing things clearly the first time. That's because you'll start
*thinking* more clearly the first time.

Now that's positive and nice and kind right?

Garrett

---

[1] Of course I'm joking here, but only partly. The original solution
is presented to learners - and it's full of bad practices, so that
makes me cranky. I suppose it's good to have teaching material for
Erlang in any form - at least it serves as a point of discussion, even
if contentious.