[erlang-questions] Dialyzer underspecs and overspecs options

Dmitry Kakurin dima_kakurin@REDACTED
Tue Mar 28 21:03:19 CEST 2017


Hi Jesper,


We need to separate what constitutes a specified contract violation for function inputs and function outputs.

I have reworked my demo example (below) and produced warnings (also below) to make it absolutely clear.


Let SpecIn be a set of @spec inputs and RealIn be a set of inputs as inferred by Dialyzer from real code, then:

* The Input Contract is satisfied when SpecIn <= RealIn (where <= is a non-strict subset operation). See over_in in demo code below.

* The Input Contract violation is detected by -Wunderspecs option when SpecIn > RealIn. See under_in below.

It is easy to see in the code:

* It's OK for over_in to declare that it only accepts :a and :b while it also happens to accept :c. Maybe suboptimal, but fine.

* It's NOT OK for under_in to claim that it accepts :a, :b and :c and break if :c is passed. Rejecting :c would break the caller.


Let SpecOut be a set of @spec outputs and RealOut be a set of outputs as inferred by Dialyzer from real code, then:

* The Output Contract is satisfied when SpecOut >= RealOut (where >= is a non-strict superset operation). See under_out below.

* The Output Contract violation is detected by -Woverspecs option when SpecOut < RealOut. See over_out below.

It is easy to see in the code:

* It's OK for under_out to declare that it returns :a, :b and :c, while currently it only returns :a and :b. Maybe future implementations will return :c as well.
* It's NOT OK for over_out to declare that it returns :a and :b, but to also return :c sometimes. Returning :c would break the caller.


We can see the perfect asymmetry for inputs and outputs.

These are pretty-much covariance and contravariance rules: you cannot accept less than you claim and you cannot produce more than you claim.


Also we see how:

* -Wunderspecs option detects Input Contract violations and produces false positives for acceptable Output contracts

* -Woverspecs option detects Output Contract violations and produces false positives for acceptable Input contracts


That's why I'd like to enable underspec_in and overspec_out options (if they existed), as they are the 2 options detecting real problems (out of 4 possible combinations).


  # suboptimal implementation, fine for caller
  @spec over_in(x :: :a | :b) :: nil
  def over_in(x) when x == :a or x == :b or x == :c do
    nil
  end

  # BROKEN caller contract, rejects input allowed by the spec
  @spec under_in(x :: :a | :b | :c) :: nil
  def under_in(x) when x == :a or x == :b do
    nil
  end

  # current implementation detail, does not concern caller
  @spec under_out() :: :a | :b | :c
  def under_out() do
    produce_ab()
  end

  # BROKEN caller contract, could return unexpected value
  @spec over_out() :: :a | :b
  def over_out() do
    produce_abc()
  end

  # helpers

  @spec produce_ab() :: :a | :b
  def produce_ab() do
    case :rand.uniform(2) do
      1 -> :a
      _ -> :b
    end
  end

  @spec produce_abc() :: :a | :b | :c
  def produce_abc() do
    case :rand.uniform(3) do
      1 -> :a
      2 -> :b
      _ -> :c
    end
  end


and the warnings are:


demo.ex:6: Type specification
'Elixir.Virt':over_in('a' | 'b') -> 'nil' is a subtype of the success typing:
'Elixir.Virt':over_in('a' | 'b' | 'c') -> 'nil'

demo.ex:12: Type specification
'Elixir.Virt':under_in('a' | 'b' | 'c') -> 'nil' is a supertype of the success typing:
'Elixir.Virt':under_in('a' | 'b') -> 'nil'

demo.ex:18: The specification for
'Elixir.Virt':under_out/0 states that the function might also return
'c' but the inferred return is
'a' | 'b'

demo.ex:24: Type specification
'Elixir.Virt':over_out() -> 'a' | 'b' is a subtype of the success typing:
'Elixir.Virt':over_out() -> 'a' | 'b' | 'c'


Thank you, Dmitry

________________________________
From: Jesper Louis Andersen <jesper.louis.andersen@REDACTED>
Sent: Tuesday, March 28, 2017 8:46:00 AM
To: Dmitry Kakurin; erlang-questions@REDACTED
Subject: Re: [erlang-questions] Dialyzer underspecs and overspecs options

On Fri, Mar 24, 2017 at 11:37 PM Dmitry Kakurin <dima_kakurin@REDACTED<mailto:dima_kakurin@REDACTED>> wrote:
Two of them that seem especially useful are underspecs and overspecs.

If you think about it, an overspecification is not really a problem with the code. The dialyzer uses inference analysis and makes a best effort at determining the type of a given function. Your contract specifies your usage of the function. When a overspec happens, it is because your contract is more restrictive than the inference. This is usually not a problem. A small change in your function body might change the inferred type entirely. But unless you want to go change your contract all the time, it may be better to keep it more restrictive. Lots of functions are more general than their intended use. Another problem is when the dialyzer fails to make a precise inference, so falls back to type any(). Enabling overspecs will make the system complain all the time in this situation, but it may not be a problem.

Underspecs, on the other hand, could spell trouble. Here the inference is more restrictive, and thus there are inputs to the function which is within the contract, but won't make the code terminate as expected according to the inference scheme. If your code never makes use of the extended contract, there is no problem, but as soon as some user tries to do so, things will go bad. The dialyzer will usually catch such a use quickly, but it is nicer to be strict in the contract if possible.

There are, however, situations where underspecs isn't that desirable as a property either. Suppose you are returning a map, such as #{ a => 37 }. The dialyzer can infer its success type to be #{ a := pos_integer() }[0]. But you may have written a specification where you say a map() is returned. This is more general than the inferred type, and as such it is an underspec violation. Your contract isn't wrong, but on the other hand, it isn't precise either. A common problem that also hits a lot are when you are declaring a binary(), but the dialyzer is able to figure out a more precise type scheme, for instance <<_:64, _:_*24>>. You will perhaps be more happy with just saying "this is a binary".

My usual recommendation is to start with the default set of warnings. Remove every error there, because the slogan is

   The dialyzer is never wrong!

and this is absolutely true. Once you have a dialyzer clean piece of code with the default warnings, you start turning on underspecs, no_return, and so on. But often this is more for understanding if there is anything that sticks out as being too general.


[0] This is strictly not true. If you *always* return 37, then it figures out the type is exactly the same response every time and tags its type as #{ a := 37 }. But for this exposition, I'm assuming your code returns "some positive integer" where 37 is an example.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20170328/e838580e/attachment.htm>


More information about the erlang-questions mailing list