[erlang-bugs] Dialyzer bug with binary:compile_pattern/1

Kostis Sagonas kostis@REDACTED
Tue May 3 18:09:47 CEST 2011


Jay Nelson wrote:
> The following snippet of code gives a dialyzer error:
> 
>    NL = list_to_binary(io_lib:nl()),
>    LinesPattern = binary:compile_pattern(NL),
>    Lines = [Line || Line <- binary:split(Bin, LinesPattern, [global, trim]),
>                     size(Line) > 0],
> 
> The call binary:split(Bin::binary(),LinesPattern::{'bm',binary()},['global' | 'trim',...]) will never return since it differs in the 2nd argument from the success typing arguments: (binary(),binary() | [binary()] | {'cp',binary()},['global' | 'trim' | {'scope',{_,_}}])
> 
> If the LinesPattern in the list comprehension is changed to NL, then there is no error.
> 
> It seems the return value of binary:compile_pattern/1 is not compatible with the call to binary:split/3.

Well,

It's definitely not a dialyzer bug, meaning it has nothing to do with 
dialyzer's analysis, but something indeed seems wrong there...

The published Erlang/OTP documentation mentions that the second argument 
of binary:split/3 is:

    Pattern = binary() | [ binary() ] | cp()

where cp() is described as:

    Opaque data-type representing a compiled search-pattern.
    Guaranteed to be a tuple() ...

The developer who wrote the 'binary' module added the following as the 
hard-coded type information for the above in erl_bif_types.erl:

   t_binary_pattern() ->
     t_sup([t_binary(),
            t_list(t_binary()),
            t_binary_compiled_pattern()]).

   t_binary_compiled_pattern() ->
     t_tuple([t_atom('cp'), t_binary()]).

On the other hand, strangely, the entry for binary:compile_pattern/1 
does not use this type but instead specifies the return type as:

   type(binary, compile_pattern, 1, Xs) ->
     strict(arg_types(binary, compile_pattern, 1), Xs,
            fun(_) -> t_tuple([t_atom(bm),t_binary()]) end);

i.e. a pair tagged with the atom 'bm' (as it actually is).
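
As a sanity check that the problem lies in the type information and not 
in the code being analysed, the reported snippet can be wrapped in a 
small module and run.  This is only a sketch: the module name cp_demo 
and the function lines/1 are made up for illustration, and byte_size/1 
is used in place of size/1.

```erlang
-module(cp_demo).
-export([lines/1]).

%% Hypothetical wrapper around the snippet from the report.
%% Splits Bin on the platform newline using a precompiled pattern
%% and drops empty lines.
-spec lines(binary()) -> [binary()].
lines(Bin) ->
    NL = list_to_binary(io_lib:nl()),
    %% The result is whatever binary:compile_pattern/1 returns;
    %% it is passed on untouched as the second argument of split/3.
    LinesPattern = binary:compile_pattern(NL),
    [Line || Line <- binary:split(Bin, LinesPattern, [global, trim]),
             byte_size(Line) > 0].
```

In the shell, cp_demo:lines(<<"a\n\nb\n">>) returns [<<"a">>,<<"b">>], 
so binary:split/3 does accept the compiled pattern at run time and the 
"will never return" claim comes from the hard-coded types alone.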

I have no idea whether the 'cp' atom there in the definition of 
t_binary_compiled_pattern() should read 'bm', whether a 'bm' tagged pair 
should be added to the list of binary_compiled_patterns or whether this 
is intentional...  The responsible person at OTP who added this type 
information should answer this (and possibly fix it).  But I suspect that 
the proper fix here is to change the definition to:

   t_binary_compiled_pattern() ->
     t_tuple([t_atom('bm'), t_binary()]).

and then use this type in:

   type(binary, compile_pattern, 1, Xs) ->
     strict(arg_types(binary, compile_pattern, 1), Xs,
            fun(_) -> t_binary_compiled_pattern() end);

Oh, probably this should also be declared as opaque. Right?

Kostis


