[erlang-questions] A question of style

Thu Mar 19 01:43:04 CET 2015

On 18/03/2015, at 2:55 am, Joe Armstrong <erlang@REDACTED> wrote:
> The other day I was writing a program that called some library code
> that I had written.
> 
> The library had a routine that could parse a string:
> 
>    -module(my_lib).
>    -export([parse_string/1]).
>     ...
> 
> In the code I was writing I had a binary B containing a string that I
> wanted parsing.

For one thing, it's always possible to have
two modules:

   -module(my_lib_core).
   -export(... only the things that *have* to be exported ...).

   -module(my_lib).
   -export(... everything from my_lib_core ...).
   -export(... a whole bunch of convenience functions ...).

A module should encapsulate something, hide some information.
If it is impossible or very difficult to implement some
function without access to that information, then that
function, or something to support it, belongs in the module.
Also, if putting the function (or support element) inside the
module will make a practically important difference to space
or time costs, consider putting it in the module.

Otherwise, if something *can* stay outside the core module,
it *should* stay outside the core.

One important exception is if you decide to change the
interface.  If you found that parse_binary/1 was needed
more often than parse_string/1, you might make that the
core function, in which case the old function ought to
move out, but you might leave it in for a release or two
to support old clients.
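
A rough sketch of what "leave it in for a release or two" might
look like.  The module name my_lib_compat and both function bodies
are made up for illustration (the real parser is not shown in this
thread); the -deprecated attribute is the standard one that xref
knows how to report on.

```erlang
-module(my_lib_compat).
-export([parse_binary/1, parse_string/1]).

%% Mark the old entry point so that xref can list remaining callers.
-deprecated([{parse_string, 1, next_major_release}]).

%% The new core function.  Stand-in body (an assumption, not the
%% real parser): split the binary on single spaces.
parse_binary(Bin) when is_binary(Bin) ->
    binary:split(Bin, <<" ">>, [global]).

%% The old function, kept for a release or two as a thin wrapper
%% around the new core function.
parse_string(Str) when is_list(Str) ->
    parse_binary(unicode:characters_to_binary(Str)).
```

The old function carries no logic of its own, so when it finally
moves out there is nothing left behind to test or document.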

Doing this keeps the core simpler, easier to test, and easier
to document.  It also makes the core *easier to use*, in the
sense that someone who wants to use it has less to learn and
fewer mistakes to make.

In the case of string -vs- binary, the conversion *can* be
done outside the module, and doing it outside is not likely
to make any difference to performance.
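
As a sketch of keeping the conversion outside: the glue below is
hypothetical (binary_glue and parse_binary/2 are names invented
here), it assumes the binary holds UTF-8, and it takes the string
parser as an argument so that it can be tried without the real
my_lib.  If the bytes were Latin-1, binary_to_list/1 would do
instead.

```erlang
-module(binary_glue).
-export([parse_binary/2, demo/0]).

%% Glue that lives OUTSIDE the library: decode the binary to a list
%% of code points, then hand it to whatever string parser the caller
%% supplies.
parse_binary(ParseString, Bin)
  when is_function(ParseString, 1), is_binary(Bin) ->
    case unicode:characters_to_list(Bin, utf8) of
        Chars when is_list(Chars) ->
            ParseString(Chars);
        _ErrorOrIncomplete ->
            %% {error, _, _} or {incomplete, _, _}: not valid UTF-8.
            erlang:error(badarg, [ParseString, Bin])
    end.

%% Stand-in for my_lib:parse_string/1, whose body the original
%% post does not show: split the string on spaces.
demo() ->
    Parse = fun(S) -> string:tokens(S, " ") end,
    parse_binary(Parse, <<"hello world">>).
```

With the real library the call would presumably be just
binary_glue:parse_binary(fun my_lib:parse_string/1, B), and the
library itself stays unchanged.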

> So is this true:
> 
>    easy to use the library == not easy to understand the library code

It depends on what you mean by "easy to use".
If you mean "does not require writing much if any glue
code", possibly.  If you mean "easy to write code that
uses the library correctly", maybe not.

In the specific case of text, you are *never* going to
handle all the ways people might want to represent text.

Taking a Smalltalk perspective here,
if there is a parsing function, I would like
to be able to give it
 - a string
 - a stretchy string (think StringBuffer)
 - a decoded byte array
 - an external file
 - ...
and the best way to do that is to provide a
*stream* of characters.  Something that accepts that
is "easy to understand" (only one interface method)
and "easy to use".  I can use the *same* technique to
convert a character source to a stream no matter which
parser I'm giving it to:
    JSON parse: aString readStream
    XML parse: anotherString readStream
    (FileStream read: 'foo.json') bindOwn: [:s | JSON parse: s]
    (FileStream read: 'bar.xml' ) bindOwn: [:s | XML parse: s]
It's actually *better* than having a lot of so-called
convenience methods, because each thing that accepts a stream
can be combined with *any* way of making a character stream.
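
The same idea can be approximated in Erlang.  This is only a
sketch under assumptions of my own: a "stream" here is a fun of no
arguments that returns either eof or {Char, NextStream}, and the
module and function names are invented.  The word counter stands
in for a parser; it knows nothing about where the characters come
from.

```erlang
-module(char_stream).
-export([from_string/1, from_binary/1, count_words/1, demo/0]).

%% Turn a list of characters into a stream.
from_string(Str) when is_list(Str) ->
    fun() ->
        case Str of
            []       -> eof;
            [C | Cs] -> {C, from_string(Cs)}
        end
    end.

%% Turn a UTF-8 binary into a stream, decoding one code point at a
%% time.  Invalid UTF-8 makes the case fail; a real version would
%% want to report that properly.
from_binary(Bin) when is_binary(Bin) ->
    fun() ->
        case Bin of
            <<>>                     -> eof;
            <<C/utf8, Rest/binary>>  -> {C, from_binary(Rest)}
        end
    end.

%% The one interface: consume any stream and count the words in it
%% (runs of characters separated by space, tab, or newline).
count_words(Stream) when is_function(Stream, 0) ->
    count_words(Stream(), out, 0).

count_words(eof, _State, N) ->
    N;
count_words({C, Next}, _State, N) when C =:= $\s; C =:= $\t; C =:= $\n ->
    count_words(Next(), out, N);
count_words({_C, Next}, out, N) ->
    count_words(Next(), in, N + 1);
count_words({_C, Next}, in, N) ->
    count_words(Next(), in, N).

demo() ->
    {count_words(from_string("one two three")),
     count_words(from_binary(<<"four five">>))}.
```

Here demo() returns {3,2}.  Supporting another source, such as a
file, means writing one more from_... function; count_words/1 and
every other stream consumer are untouched.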

So I suggest a paradox: a "lean" module may be *easier* to
use than a "rich" module with a lot of special cases.

Hmm.  Excuse me while I go and rewrite a module or two...