[erlang-questions] Binary pattern matching inconsistencies with R12B
Rory Byrne
rory@REDACTED
Fri Feb 29 18:43:44 CET 2008
Hello
I'm writing a scanner for a query language and I'm encountering
intermittent segmentation faults and other odd errors. The
code I'm working on appears to work fine on 11.b.2-4
(linux/amd64), but gives problems on r12b-0 (linux/i386) and
r12b-1 (linux/amd64). I didn't add any fancy options when I
compiled r12b, just a --prefix.
I'm an Erlang newbie, so it's highly likely I've written something
stupid. I just hope it's obvious, whatever it is!
The scanner is quite large so I've reduced it down to two
smaller programs which show similar symptoms. The first one
just throws exceptions from time to time. The second program
ends up dying as a result of a segmentation fault sooner
or later.
The following code won't make real sense. The full scanner
makes sense but this is only a mutated 10% of that. Sorry
the code is so unintelligible - but on the bright side
it fails more frequently and predictably than the full
scanner does.
%% START OF CODE: weird.erl %%
-module(weird).
-compile(export_all).
%% For testing - runs scanner N number of times with same input
run(N) ->
    lists:foreach(fun(_) ->
                      scan(<<"region:whatever">>, [])
                  end, lists:seq(1, N)).

scan(<<>>, TokAcc) ->
    lists:reverse(['$thats_all_folks$' | TokAcc]);
scan(<<D, $\s, Rest/binary>>, TokAcc) when
      (D =:= $D) or (D =:= $d) ->
    scan(Rest, ['AND' | TokAcc]);
scan(<<D>>, TokAcc) when
      (D =:= $D) or (D =:= $d) ->
    scan(<<>>, ['AND' | TokAcc]);
scan(<<N, Z, Rest/binary>>, TokAcc) when
      (N =:= $N) or (N =:= $n),
      (Z =:= $\s) ->
    scan(<<Z, Rest/binary>>, ['NOT' | TokAcc]);
scan(<<C, Rest/binary>>, TokAcc) when
      (C >= $A) and (C =< $Z);
      (C >= $a) and (C =< $z);
      (C >= $0) and (C =< $9) ->
    case Rest of
        <<$:, R/binary>> ->
            scan(R, [{'FIELD', C} | TokAcc]);
        _ ->
            scan(Rest, [{'KEYWORD', C} | TokAcc])
    end.
%% END OF CODE %%
Here's what I see from the shell on an i386 machine:
1> c(weird).
{ok,weird}
2> weird:run(1000).
ok
3> weird:run(1000).
ok
4> weird:run(1000).
ok
5> weird:run(1000).
** exception error: no function clause
matching weird:scan(<<"whatever">>,
[{'FIELD',110},
{'KEYWORD',111},
{'KEYWORD',105},
{'KEYWORD',103},
{'KEYWORD',101},
{'KEYWORD',114}])
in function lists:foreach/2
6> weird:run(1000).
** exception error: no function clause
matching weird:scan(<<"whatever">>,
[{'FIELD',110},
{'KEYWORD',111},
{'KEYWORD',105},
{'KEYWORD',103},
{'KEYWORD',101},
{'KEYWORD',114}])
in function lists:foreach/2
7>
It will then keep throwing exceptions from this point on. On an
amd64 machine I'm getting similar output, but it usually has
the sequence ok, error, ok, error... And if I bump it from
1,000 up to 10,000 iterations the errors usually stop (on amd64).
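For anyone trying to reproduce this, a small harness that stops at the first bad iteration might make the pattern above easier to pin down than watching lists:foreach/2 blow up. The following is only a sketch: the module name scan_check and the function run/3 are hypothetical helpers, not part of the scanner, and it assumes the caller supplies the result it expects from each call.

```erlang
-module(scan_check).
-export([run/3]).

%% Hypothetical helper, not part of the original scanner.
%% Calls Fun() up to N times and compares each result with Expected.
%% Returns {ok, N} if every call matched, otherwise
%% {failed, Iteration, Reason} for the first call that did not.
run(Fun, Expected, N) when is_function(Fun, 0), is_integer(N), N >= 0 ->
    loop(Fun, Expected, 1, N).

loop(_Fun, _Expected, I, N) when I > N ->
    {ok, N};
loop(Fun, Expected, I, N) ->
    %% Expected is already bound, so the first clause is an equality test.
    case catch Fun() of
        Expected ->
            loop(Fun, Expected, I + 1, N);
        {'EXIT', Reason} ->
            %% The call raised an error, e.g. function_clause.
            {failed, I, {crashed, Reason}};
        Other ->
            %% The call returned, but with a different value.
            {failed, I, {unexpected, Other}}
    end.
```

It could be called as scan_check:run(fun() -> weird:scan(<<"region:whatever">>, []) end, Expected, 1000), where Expected is the token list worked out by reading the clauses by hand. A segmentation fault kills the whole emulator, so this can only report the exception case, not the crash.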
The second block of code is:
%% START OF CODE: scanner.erl %%
-module(scanner).
-compile(export_all).
%% For testing - runs scanner N number of times with same input
run(N) ->
    lists:foreach(fun(_) ->
                      scan(<<"region:whatever">>, [])
                  end, lists:seq(1, N)).

scan(<<>>, TokAcc) ->
    lists:reverse(['$thats_all_folks$' | TokAcc]);
scan(<<D, Z, Rest/binary>>, TokAcc) when
      (D =:= $D orelse D =:= $d) and
      ((Z =:= $\s) or (Z =:= $() or (Z =:= $))) ->
    scan(<<Z, Rest/binary>>, ['AND' | TokAcc]);
scan(<<D>>, TokAcc) when
      (D =:= $D) or (D =:= $d) ->
    scan(<<>>, ['AND' | TokAcc]);
scan(<<N, Z, Rest/binary>>, TokAcc) when
      (N =:= $N orelse N =:= $n) and
      ((Z =:= $\s) or (Z =:= $() or (Z =:= $))) ->
    scan(<<Z, Rest/binary>>, ['NOT' | TokAcc]);
scan(<<C, Rest/binary>>, TokAcc) when
      (C >= $A) and (C =< $Z);
      (C >= $a) and (C =< $z);
      (C >= $0) and (C =< $9) ->
    case Rest of
        <<$:, R/binary>> ->
            scan(R, [{'FIELD', C} | TokAcc]);
        _ ->
            scan(Rest, [{'KEYWORD', C} | TokAcc])
    end.
%% END OF CODE %%
When I use this code in the shell (on i386) it usually works okay
for a small number of iterations, but once you get into the
hundreds it dies fast:
1> c(scanner).
{ok,scanner}
2> scanner:run(10). % Start with 10
ok
3> scanner:run(10).
ok
4> scanner:run(100). % Bumped up to 100
** exception error: no function clause
matching weird:scan(<<"whatever">>,
[{'FIELD',110},
{'KEYWORD',111},
{'KEYWORD',105},
{'KEYWORD',103},
{'KEYWORD',101},
{'KEYWORD',114}])
in function lists:foreach/2
5> scanner:run(100).
Segmentation fault
Anyone got any ideas?
Cheers,
Rory