8 bit signed float possible?

Sun Jun 18 10:55:22 CEST 2006

Christian S wrote:

> And once that is understood you can of course deconstruct 8 bits into
> small integers and construct a floating point number out of them
> yourself.
> 
> Given 1 sign bit, 2 exponent bits and 5 mantissa bits you can have 32
> different fractions between 1.0 and 0.0, and you have (as a
> suggestion) 10^1, 10^0, 10^-1, 10^2
> multipliers from the exponent bits. Then the closest you can come to 3.14 is
> 10*(1.0/32) * 10^1 = 0.3125 * 10^1 = 3.125.
> 
> Good enough for government work, or what is it they say?

I am assuming that you are being flippant, so I send this return message 
in the same vein. ;)

What they say is, "The net provides a thousand wrong answers as well as 
the right one."

First, mixing bases for exponent (10) and fractional mantissa (2) is 
unusual.  Think there might be a reason for that?  (Hint: there is.)

In addition, you implicitly assume that the floating point numbers are 
allowed to be denormal without explicit indication (i.e. there is no 
implicit leading 1 in the mantissa).  That is not incorrect, but it is 
certainly an unusual assumption and should be noted.
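
To put a rough number on what the implicit leading 1 buys, here is a 
minimal sketch.  It assumes a base-2 exponent running over -2..1 and 
the same 5 mantissa bits; that layout is my own choice for illustration 
and not anything from the quoted message.  It counts the distinct 
non-negative values available with and without the hidden bit.

```python
from fractions import Fraction

MANTISSA_BITS = 5
EXPONENTS = (-2, -1, 0, 1)  # assumed base-2 exponent range (illustrative)

def value(mantissa, exponent, implicit_one):
    """Decode one (mantissa, exponent) pair exactly as a Fraction."""
    frac = Fraction(mantissa, 2 ** MANTISSA_BITS)
    if implicit_one:
        frac += 1  # hidden leading 1: significand lies in [1, 2)
    return frac * Fraction(2) ** exponent

def distinct_values(implicit_one):
    """Set of all non-negative values the 7 non-sign bits can encode."""
    return {value(m, e, implicit_one)
            for e in EXPONENTS
            for m in range(2 ** MANTISSA_BITS)}

with_hidden_bit = distinct_values(True)
without_hidden_bit = distinct_values(False)
print(len(with_hidden_bit), len(without_hidden_bit))
```

Under these assumptions the hidden bit gives 128 distinct values, while 
the unnormalized form gives only 80, because many bit patterns decode 
to the same number (16/32 * 2^0 and 8/32 * 2^1, for instance).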

To top things off, two's complement arithmetic normally implies that the 
negative range is larger than the positive range.  Thus, the normal 
sequence for your exponent would be -2, -1, 0, 1.
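
As a quick check of that range, here is a small sketch that decodes 
every 2-bit pattern as a two's complement integer.  The helper name 
twos_complement is just something I made up for the illustration.

```python
def twos_complement(bits, width):
    """Interpret the low `width` bits of `bits` as a signed integer."""
    sign_bit = 1 << (width - 1)
    # Low bits count positively; the top bit carries weight -2^(width-1).
    return (bits & (sign_bit - 1)) - (bits & sign_bit)

exponents = sorted(twos_complement(b, 2) for b in range(4))
print(exponents)  # [-2, -1, 0, 1]
```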

Additionally, since you allow unindicated denormal numbers, you would be 
better off representing the number as 5/16 (0.3125) so that you 
could use an exponent of 1 and preserve more bits of accuracy.

Finally, the request to store 3.14 normally carries an implication of 
error of approximately +/- 1/200 (1/2 ulp).  Your floating point format 
exceeds that error.  Not indicating that the nearest value and its two 
neighbors are {9/32*10, 10/32*10, 11/32*10} = {2.8125, 3.125, 3.4375} 
is just unsporting.
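
For anyone who wants to check the arithmetic, here is a minimal sketch 
that enumerates the format as I read it from the quoted message: a 
value is (m/32) * 10^e with m in 0..31 and e taken from the suggested 
multipliers.  The variable names are mine, and the sign bit is ignored 
since the target is positive.

```python
from fractions import Fraction

EXPONENTS = (1, 0, -1, 2)    # powers of ten suggested in the quoted message
TARGET = Fraction(314, 100)  # 3.14, kept exact

# Every non-negative value the 2 exponent bits and 5 mantissa bits can form.
values = sorted({Fraction(m, 32) * Fraction(10) ** e
                 for e in EXPONENTS
                 for m in range(32)})

nearest = min(values, key=lambda v: abs(v - TARGET))
i = values.index(nearest)
below, above = values[i - 1], values[i + 1]
error = abs(nearest - TARGET)

print(float(below), float(nearest), float(above))
print(error, error > Fraction(1, 200))
```

The error comes out as 3/200, three times the 1/200 implied by writing 
"3.14".  Swapping the exponent set for -2..1 leaves these three values 
unchanged, since they all come from the 10^1 multiplier.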

-a