[erlang-questions] OO programming style in Erlang?

Richard A. O'Keefe ok@REDACTED
Thu Jan 25 03:26:16 CET 2007


Ladislav Lenart <lenartlad@REDACTED> wrote:
	No, this is not what I was worrying about. It seemed
	strange to me that I have to know exactly which function
	I need to invoke on a particular item.

That is because it is precisely the *function* that you are expected
to be primarily interested in; the data is there to let the function
do its job, as it were.

	In Smalltalk, I just write
	
	   "aCollection can be anything - List, Set, Dictionary, ..."
	   aCollection collect: [:each | each something].
	
	because aCollection knows what kind of collection it
	actually is.

Speaking as someone who is still trying to finish his own Smalltalk
compiler, it really isn't quite that simple, even in Smalltalk.
To start with, what does #collect: return?

    List:  my Smalltalk doesn't have one, but LinkedList>>collect:
    unexpectedly returns an Array, not a LinkedList;

    Dictionary: you would *expect* this definition,
	collect: aBlock
	    |result|
	    result := self copyEmpty.
	    self keysAndValuesDo: [:key :value |
	        result at: key put: (aBlock value: value)].
	    ^result
    but in fact you get an OrderedCollection

    Set>>collect: returns a Set, and surprisingly to many people,
    so does IdentitySet>>collect:

The real gem is ByteVector>>collect: where I have had to file bug reports
in two Smalltalk systems because it was originally written to return a
ByteVector, but there is of course no guarantee that the function passed
to it will provide byte results.  For real grins/groans, try this:

    'abc' collect: [:x | 2]

Arguably a system that follows the ANSI Smalltalk standard must get this
and the ByteVector case wrong.  My SequencedReadableCollection class goes
to quite a lot of trouble: first it builds an array of the results, then
it checks whether something just like the receiver could hold them.  If
it could, it returns something like the receiver; if not, an Array.  So
    'abc' collect: [:each | each asUppercase]   => 'ABC'
    'abc' collect: [:each | 2]                  => #(2 2 2)
Ahem.

The point I want to establish here is
    You don't need to know what the receiver is in order to know
    >>what selector to use<<, but you DO need to know what the
    receiver is in order to know
       - whether it will work at all
       - what you will get if it does.

Let's take one more example, Set>>collect:.  If we are mapping over a
set, there are at least four cases:

    - we want exactly one element in the result for each element in
      the source, whether any of them are equal or not
    => Array withAll: set collect: aBlock    "mine"
    or (Array withAll: set) collect: aBlock  "ANSI, less efficient"

    - we want a set of results, and the way to tell whether two results
      are equal is the built in #= method
    => Set withAll: set collect: aBlock
    or set collect: aBlock

    - we want a set of results, but two results are to count as equal
      if and only if they are the very same object (#==)
    => IdentitySet withAll: set collect: aBlock "mine"

    - we want a set of results, but we need another definition of
      equality (say, ignoring alphabetic case)
    => (PluggableSet equalBlock: #equalsIgnoringCase:
                     hashBlock:  #hashIgnoringCase:)
         addAll: set collecting: aBlock                     
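
The same distinction shows up if the first two cases are transcribed into
Erlang. This is only a minimal sketch: the module name set_collect and the
function names collect_list/2 and collect_set/2 are made up for illustration,
and only sets:to_list/1, sets:from_list/1 and lists:map/2 come from the
standard library. The caller still has to say which kind of result is wanted.

```erlang
-module(set_collect).
-export([collect_list/2, collect_set/2]).

%% Hypothetical helpers, not part of OTP.

%% Case 1: exactly one result per source element, duplicates kept.
%% The order follows sets:to_list/1, which is not specified.
collect_list(F, Set) ->
    lists:map(F, sets:to_list(Set)).

%% Case 2: a set of results; results the sets module treats as
%% equal are merged into one element.
collect_set(F, Set) ->
    sets:from_list(collect_list(F, Set)).
```

Mapping fun(X) -> X rem 2 end over the set {1,2,3,4} gives four results
from collect_list/2 but a two-element set from collect_set/2.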


	In Erlang, I have to know
	
	   lists:map(...)
	   dict:map(...)
	   orddict:map(...)
	
	Furthermore lists:map expects fun/1 while dict/orddict:map
	expects fun/2.
	
Not "furthermore": BECAUSE list mapping and dictionary mapping have
different interfaces, you have to know which one you are going to call.
lists:map(F, L) is the equivalent of OrderedCollection>>collect:.
dict:map(F, D) is the equivalent of Dictionary>>keysAndValuesCollect:,
or it would be if #keysAndValuesCollect: existed.
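
A minimal sketch of the two interfaces side by side. The module name
map_shapes and the function names double_list/1 and double_dict/1 are
made up for illustration; lists:map/2 and dict:map/2 are the real
library calls, and both take the fun as their first argument.

```erlang
-module(map_shapes).
-export([double_list/1, double_dict/1]).

%% lists:map/2 calls the fun with one argument, the element.
double_list(List) ->
    lists:map(fun(X) -> 2 * X end, List).

%% dict:map/2 calls the fun with two arguments, the key and the
%% value; the term it returns becomes the new value for that key.
double_dict(Dict) ->
    dict:map(fun(_Key, Value) -> 2 * Value end, Dict).
```

Passing the fun/1 from double_list/1 to dict:map/2 would fail at run time,
which is why the caller has to know which mapping is being asked for.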

The orddict module is extremely unusual in having the same interface
as the dict module.  However, it does NOT have the same PERFORMANCE
characteristics.  So once again, if you want to have any idea how well
your code is likely to work, you had BETTER know which version you are
using.  (And wouldn't it be nice if the documentation said something
about performance?)  In fact, there really aren't that many places that
you would want to use orddict at all.
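
A minimal sketch of "same interface, different data structure". The
module name dict_vs_orddict and the functions build/2 and demo/0 are
made up for illustration; new/0, store/3 and fetch/2 exist in both dict
and orddict. The comment about lookup cost is an inference from the
documented representation (an orddict is a list of pairs ordered by
key), not a measured result.

```erlang
-module(dict_vs_orddict).
-export([build/2, demo/0]).

%% Mod is either dict or orddict; the calls are identical for both.
build(Mod, Pairs) ->
    lists:foldl(fun({K, V}, Acc) -> Mod:store(K, V, Acc) end,
                Mod:new(), Pairs).

demo() ->
    Pairs = [{b, 2}, {a, 1}],
    O = build(orddict, Pairs),
    D = build(dict, Pairs),
    %% O is a plain key-sorted list of pairs, so a lookup has to walk
    %% the list; D is an opaque term managed by the dict module.
    {O, orddict:fetch(a, O), dict:fetch(a, D), is_list(O), is_list(D)}.
```

Here demo() returns {[{a,1},{b,2}], 1, 1, true, false}: the answers agree,
but the terms underneath are quite different, and so is the cost of
getting those answers as the number of keys grows.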

	I got some good suggestions to deal with this issue,
	wrapper functions and macros.

No, those are not good suggestions, because they are all about
helping you hide information that is actually important to you.

One thing I have found in writing my own Smalltalk system (27 kSLOC
in the library and rising) is that every time I add a new leaf class to an
existing tree of classes, refactoring is needed, sometimes extensive
refactoring, despite modelling the interfaces on a 26+-year-old success.
One thing I have found in using existing Smalltalk systems is that very
often inherited methods *don't* work (see String>>collect: above).

	The thing I don't like is that I have to solve it myself...
	
For the reasons given above, I just plain CANNOT believe that there is a
genuine problem with Erlang that you need to solve.  The problem that you
have is in your expectation that things *should* work in Erlang as they
*don't in fact* work in OO languages.

There really are very very very few cases where one data structure can
be substituted for another in Erlang.



