[erlang-questions] [enhancement] string:split/2

Mon Oct 13 06:00:38 CEST 2008

On 13 Oct 2008, at 9:22 am, Robert Virding wrote:

> Can we end the language discussion? :-)
>
> I suggest a function string:split/2 which splits the input string  
> with the separator and assumes that there is no separator and either  
> end of the string. So:
>
> string:split("ab:de:fg", ":") ==> ["ab","de","fg"]
> string:split(":ab:de:fg:", ":") ==> [[],"ab","de","fg",[]]

This violates the assumption "that there is no separator [at]
either end of the string".  To make that assumption is to
assume that something that _looks_ like a separator at either
end _isn't_ one, so
string:split(":ab:de:fg:", ":") ==> [":ab","de","fg:"]
under that assumption.

I _think_ you are talking about the issue that came up
last week with *multicharacter* separators:
    string:split("a:::b", "::")
    ==> ["a:","b"]
    or  ["a",":b"]  -- the code I posted does this

As I recall it, we don't have a resolution for that.
The posted 'unjoin' takes the _leftmost_ occurrence of
a multicharacter separator as a separator, but it could
just as well take the _rightmost_.  (Existence proof:
reverse . map reverse . (\ s -> unjoin s (reverse sep)) . reverse)
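
To make the two readings concrete, here is a minimal Erlang sketch. It is not the code posted earlier in the thread; the module name unjoin_sketch and the function unjoin_right/2 are made up for illustration, and a non-empty separator is assumed. unjoin/2 cuts at the leftmost match, and unjoin_right/2 gets the rightmost reading by reversing the string, the separator, and the resulting fields.

```erlang
-module(unjoin_sketch).
-export([unjoin/2, unjoin_right/2]).

%% Leftmost-match unjoin: scan left to right and cut wherever the
%% separator is a prefix of what remains.  Sep must be non-empty
%% (an assumption of this sketch).
unjoin(String, Sep) when Sep =/= [] ->
    unjoin(String, Sep, [], []).

%% Acc is the current field, reversed; Fields is the list of
%% finished fields, also reversed.
unjoin([], _Sep, Acc, Fields) ->
    lists:reverse([lists:reverse(Acc) | Fields]);
unjoin([C | Rest] = String, Sep, Acc, Fields) ->
    case lists:prefix(Sep, String) of
        true ->
            %% Drop the separator and start a new field.
            unjoin(lists:nthtail(length(Sep), String), Sep, [],
                   [lists:reverse(Acc) | Fields]);
        false ->
            unjoin(Rest, Sep, [C | Acc], Fields)
    end.

%% Rightmost-match variant: run the leftmost version on the reversed
%% string with the reversed separator, then undo both reversals.
unjoin_right(String, Sep) ->
    Reversed = unjoin(lists:reverse(String), lists:reverse(Sep)),
    lists:reverse([lists:reverse(F) || F <- Reversed]).
```

With this sketch, unjoin_sketch:unjoin("a:::b", "::") gives ["a",":b"] and unjoin_sketch:unjoin_right("a:::b", "::") gives ["a:","b"]; the two agree whenever occurrences of the separator cannot overlap.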

> We call it split because it sounds better inspite of it being an  
> unjoin.

s/inspite of/in spite of/ or more idiomatically,
s/inspite of/despite/

> This also matches what will be in the re module (if they get it  
> right) and what is in the old regexp module.

No, it DOESN'T match it.  It CONFLICTS with regexp splitting.
In fact, that's why it is useful!

Call it disjoin/2 if you will (that being an existing English
word).  But please do not give it the same name as an existing
function with significantly incompatible behaviour.