[erlang-questions] On OTP rand module difference between OTP 19 and OTP 20

Wed Aug 30 10:29:12 CEST 2017

On Wed, Aug 30, 2017 at 04:14:56PM +0900, zxq9 wrote:
> On Wednesday, 30 August 2017 08:54:30 Raimo Niskanen wrote:
> > On Wed, Aug 30, 2017 at 03:48:16PM +0900, zxq9 wrote:
> > > On Wednesday, 30 August 2017 08:42:02 Raimo Niskanen wrote:
> > > > On Wed, Aug 30, 2017 at 11:44:57AM +1200, Richard A. O'Keefe wrote:
> > > > > 
> > > > > 
> > > > > On 29/08/17 8:35 PM, Raimo Niskanen wrote:
> > > > > >
> > > > > > Regarding the changed uniform float behaviour: it is the functions
> > > > > > rand:uniform/0 and rand:uniform_s/1 this concerns.  They were previously
> > > > > > (OTP-19.3) documented to output a value 0.0 < X < 1.0 and are now
> > > > > > (OTP-20.0) documented to return 0.0 =< X < 1.0.
> > > > > 
> > > > > There are applications of random numbers for which it is important
> > > > > that 0 never be returned.  Of course, nothing stops me writing
> > > > 
> > > > What kind of applications?  I would like to get a grip on how much
> > > > this function is needed.
> > > 
> > > Any function where a zero would propagate.
> > > 
> > > This can be exactly as bad as accidentally comparing a NULL in SQL.
> > 
> > That's vague for me.
> > 
> > Are you saying it is a common enough use pattern to divide by a
> > random number?  Are there other cases where a float() =:= 0.0 is fatal?
> 
> It is relatively common whenever it is guaranteed to be safe! Otherwise it becomes a guarded expression.
> 
> Sure, that is a case of "well, just write it so that it can't do that" -- but the original function spec told us we didn't need to do that, so there is code out there that would rely on not using a factor of 0.0. I've probably written some in game servers, actually.
> 
> Propagating the product of multiplication by 0.0 is the more common problem I've seen, by the way, as opposed to division.
> 
> Consider: character stat generation in games, offset-by-random-factor calculations where accidentally getting exactly the same result is catastrophic, anti-precision routines in some aiming devices and simulations, adding wiggle to character pathfinding, unstuck() type routines, mutating a value in evolutionary design algorithms, and so on.
> 
> Very few of these cases are catastrophic and many would simply be applied again if the initial attempt failed, but a few can be very bad depending on how the system in which they are used is designed. The problem isn't so much that "there aren't many use cases" or "the uses aren't common" as much as the API was originally documented that way, and it has changed for no apparent reason. Zero has a very special place in mathematics and should be treated carefully.

The spec did not match the reality.  One of the two had to be corrected.
It is in general safer to change the documentation to match the reality.

So I do not agree that the spec changed for no apparent reason.

Furthermore, Java's Random.nextFloat(), Python's random.random() and
Ruby's Random.rand all generate values in the same half-open interval:
    http://docs.oracle.com/javase/6/docs/api/java/util/Random.html#nextFloat()
    https://docs.python.org/3/library/random.html#random.random
    http://ruby-doc.org/core-2.0.0/Random.html#method-i-rand

I think this all boils down to the fact that digital floating point values
(IEEE 754) have limited precision, and in the interval 0.0 to 1.0 they are
better regarded as 53 bit fixed point values.
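
To make the fixed point view concrete, here is a minimal sketch in Python
(whose random.random(), cited above, uses the same interval).  It is an
illustration only, not a description of how the rand module is implemented;
the names BITS, uniform_from_bits and uniform are made up for this example.

```python
import random

BITS = 53  # significand width of an IEEE 754 double


def uniform_from_bits(n):
    """Map an integer 0 <= n < 2**53 to the float n / 2**53.

    Scaling by a power of two is exact, so the result lies on a grid of
    2**53 equally spaced points in the half-open interval [0.0, 1.0).
    """
    return n * 2.0 ** -BITS


def uniform():
    """Hypothetical generator: 53 random bits viewed as a fixed point value."""
    return uniform_from_bits(random.getrandbits(BITS))


smallest = uniform_from_bits(0)             # lowest grid point: exactly 0.0
largest = uniform_from_bits(2 ** BITS - 1)  # highest grid point: below 1.0
```

On this grid 0.0 is an ordinary point like any other, while 1.0 is never
reached.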

A half open interval matches integer random number generators that also
in general use half open intervals.

With half open intervals you can generate numbers in [0.0,1.0) and other
numbers in [1.0,2.0), where the number 1.0 belongs to only one of these intervals.
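
A small sketch of why the half-open interval fits integer generation, again
in Python and under assumed names (index_from_unit is a hypothetical helper,
not part of any library): scaling a value from [0.0,1.0) by N and truncating
gives an index in 0..N-1, whereas a closed interval would let 1.0 produce the
out-of-range index N.

```python
def index_from_unit(x, n):
    """Turn a float x in [0.0, 1.0) into an integer index in 0..n-1."""
    if not 0.0 <= x < 1.0:
        # 1.0 is rejected on purpose: it would map to index n.
        raise ValueError("x must satisfy 0.0 <= x < 1.0")
    return int(x * n)


faces = [1, 2, 3, 4, 5, 6]

# The two extremes of the 53 bit grid land on the first and last face.
low_face = faces[index_from_unit(0.0, len(faces))]
high_face = faces[index_from_unit(1.0 - 2.0 ** -53, len(faces))]
```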

This I think is a good default behaviour.
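
For code that relied on never seeing 0.0, one possible workaround is to flip
the interval, sketched here in Python.  The helper name uniform_nonzero and
its rng parameter are assumptions made for this example; the result is in
(0.0,1.0], so it excludes 0.0 but can return exactly 1.0, which is not the
same as the previously documented 0.0 < X < 1.0.

```python
import random


def uniform_nonzero(rng=random.random):
    """Return a float in (0.0, 1.0], given rng() in [0.0, 1.0).

    Subtracting from 1.0 mirrors the interval: the excluded endpoint 1.0
    becomes the excluded endpoint 0.0, and 0.0 becomes 1.0.
    """
    return 1.0 - rng()


# Safe as a factor or divisor: the sample is never zero.
scaled = 10.0 / uniform_nonzero()
```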

> 
> I think ROK would have objected a lot less had the original spec been 0.0 =< X =< 1.0 (which is different from being 0.0 =< X < 1.0; which is another point of potentially dangerous weirdness). I'm curious to see what examples he comes up with. The ones above are just off the top of my head, and like I mentioned most of my personal examples don't happen to be really catastrophic in most cases because many of them involve offsetting from a known value (which would be relatively safe to reuse) or situations where failures are implicitly assumed to provoke retries.
> 
> -Craig

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB