pain (and stripping whitespace from text)

Steve Davis <>
Sun Mar 14 08:45:29 CET 2010


On Mar 14, 12:35 am, Robert Virding <> wrote:
> I did a regexp version in Erlang based on these principles and it is
> actually time linear in the size of the input. It is fun to see it zip
> through what would be for Perl/PCRE a super backtracking pathological
> case in a flash. One day when I get the time to clean up the code I
> will release it.

For sure, I'll keep an eye out for this :)

Meantime, I conceded defeat to the regex gods and solved my immediate
issue with...

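%% Remove whitespace that falls outside double-quoted sections of a binary.
strip(Bin) ->
	strip(Bin, [], false).

%% The third argument is true while inside a quoted section.
%% A double quote toggles that flag and is kept in the output.
strip(<<$", Rest/binary>>, Acc, false) ->
	strip(Rest, [$"|Acc], true);
strip(<<$", Rest/binary>>, Acc, true) ->
	strip(Rest, [$"|Acc], false);
%% Outside quotes, drop space, tab, CR and LF.
strip(<<$\s, Rest/binary>>, Acc, false) ->
	strip(Rest, Acc, false);
strip(<<$\t, Rest/binary>>, Acc, false) ->
	strip(Rest, Acc, false);
strip(<<$\r, Rest/binary>>, Acc, false) ->
	strip(Rest, Acc, false);
strip(<<$\n, Rest/binary>>, Acc, false) ->
	strip(Rest, Acc, false);
%% Any other byte, including whitespace inside quotes, is kept.
strip(<<X, Rest/binary>>, Acc, State) ->
	strip(Rest, [X|Acc], State);
%% End of input. An unterminated quote (State = true) matches no clause
%% and fails with function_clause.
strip(<<>>, Acc, false) ->
	list_to_binary(lists:reverse(Acc)).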

...while not supremely elegant, it seems perfectly adequate and fast
enough for my current need.
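
For anyone who wants to try it, here is a rough sketch of the same idea as a self-contained module. The module name strip_demo and the demo/0 function are made up for illustration, and the four whitespace clauses are folded into a single guarded clause, so this is a compact variant of the function above rather than the exact code. It is meant to behave the same way, but treat that as an assumption.

```erlang
-module(strip_demo).
-export([strip/1, demo/0]).

%% Compact variant of the posted strip/1 (hypothetical module name).
strip(Bin) when is_binary(Bin) ->
    strip(Bin, [], false).

%% A double quote toggles the in-quotes flag and is kept in the output.
strip(<<$", Rest/binary>>, Acc, InQuotes) ->
    strip(Rest, [$" | Acc], not InQuotes);
%% Outside quotes, drop space, tab, CR and LF.
strip(<<C, Rest/binary>>, Acc, false)
  when C =:= $\s; C =:= $\t; C =:= $\r; C =:= $\n ->
    strip(Rest, Acc, false);
%% Every other byte, including whitespace inside quotes, is copied through.
strip(<<C, Rest/binary>>, Acc, InQuotes) ->
    strip(Rest, [C | Acc], InQuotes);
%% End of input is only accepted outside quotes, as in the posted code.
strip(<<>>, Acc, false) ->
    list_to_binary(lists:reverse(Acc)).

%% Small self-check: each match fails with badmatch if the result differs.
demo() ->
    <<"{\"a b\":1}">> = strip(<<"{ \"a b\" :\t1 }\r\n">>),
    <<>> = strip(<<" \t\r\n">>),
    ok.
```

Calling strip_demo:demo() should return ok. Note that input with an unbalanced quote, such as <<"\"abc">>, fails with function_clause because no clause accepts end of input while still inside quotes, and that backslash-escaped quotes are not treated specially.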

/s


More information about the erlang-questions mailing list