[erlang-questions] Additions to lists module

Thu Nov 27 04:08:18 CET 2008

Richard O'Keefe wrote:
>> pos([Ele | Tail], Ele, Pos) ->
>>     Pos;
>> pos([_ | Tail], Ele, Pos) ->
>>     pos(Tail, Ele, Pos+1);
>> pos([], _Ele, _) ->
>>     0.
> 
> The main problem with this is its name;
> it's abbreviated to meaninglessness.
> 
> If I may quote the Haskell List module:
> 
>     findIndex :: (a -> Bool) -> [a] -> Maybe Int
>     elemIndex :: Eq a => a -> [a] -> Maybe Int
> 
> Leaving aside the fact that Haskell uses 0 origin,
> the Erlang analogue would be
> 
>     find_index(P, Xs) ->
>     find_index_loop(P, Xs, 1).

I agree that find_index is a more clear name reflecting the purpose.

> This raises the question of what value should be returned
> when no element of the list satisfies the search criterion.
> 0 is a popular decision, going back to BASIC, but one that
> makes very little sense.  The version that actually makes
> the most sense is length(Xs)+1, so that
> 
>     elem_index(E, Xs) -> find_index(fun (X) -> X =:= E end).
>     find_index(P, Xs) -> 1 + length(drop_while(P, Xs)).
> 
> because that makes reasoning about programs that use it
> noticeably freer of special cases.
 >
> Returning something that isn't a number (my choice here would
> have been 'absent') has the advantage that failing to find
> something would be unlikely to go un-noticed.  It's awkward
> with a type checker though.
> 
> The only advantages of returning 0 seem to be "consistency
> with old fashioned BASIC" and "simplicity for the type checker".
> It makes reasoning about a program more awkward than it need
> be, and it makes it far too easy to fail to notice that
> something wasn't found.

On the other hand returning a value of length(L)+1 means that you would 
need to traverse the list twice - once to figure out the "unsuccessful" 
return value, and the second one to find the actual index of an element. 
  Doesn't it sound like an unnecessary piece of work to compute length(L)+1?

> Summary: I recommend adopting/adapting the Haskell names,
> and there's a choice between
> Index|absent, Index|0, Index|length+1, {present,Index}|absent
> for the result.

My preference is Index|absent or Index|0.

> So there are two things about the name.
> The first one for me was that the "l" after the "d" was awfully
> hard to see.  Separated_words_really_are_much_easier_to_read
> then runtogetherrunsofwordssuchasyouseebeforeyou.  (My mail
> program likes them a lot more too. (:-) (:-))

Though I agree with your reasoning, the only counter argument I can come 
up with is that all functions in the lists module don't have separators, 
so if we choose to respect the naming style in that module, we'd need to 
drop the underscore.

On the other hand in the string module we find co-existing names like 
substr/2 and sub_string/2...

> Something like
> 
>     foldl_prefix(Stepper, Acc0, [X|Xs]) ->
>         case Stepper(X, Acc0)
>           of {continue,Acc} -> foldl_prefix(Stepper, Acc, Xs)
>            ; {stop,Result}  -> Result
>         end;
>     foldl_prefix(S, Acc, []) when is_function(S, 2) ->
>         Acc.
> 
> which _does_ at least fold a prefix, might be less confusing.
> The function results are now completely unambiguous, there
> is no mention of any 'predicate' (because there is none),
> and while the name doesn't completely describe the function,
> everything it says is true.  The Stepper function here is closer
> to being the Fun of a fold than the Pred of a takewhile.
> 
> I'm sure someone else can think of a better name.
> 
> I am rather concerned that the final value of a logically
> empty list (one for which it is not the case that the list
> has an element and the stepper function says to continue)
> has its result derived in two different ways:
>  - if the list is physically empty ([]), Acc0 is the result
>  - if the list is logically but not physically empty ([X|_]),
>    Acc1 is the result where {stop,Acc1} = Stepper(X, Acc0).
> Somehow this just doesn't feel right.

Why is the second case a concern?  Wouldn't it be at the discretion of 
the Stepper function to make the determination of when to stop and 
return the accumulator whatever it wants one to be?

Serge