# 8 bit signed float possible?

Andrew Lentvorski bsder@REDACTED
Sun Jun 18 10:55:22 CEST 2006

```
Christian S wrote:

> And once that is understood you can of course deconstruct 8 bits into
> small integers and construct a floating point number out of them
> yourself.
>
> Given 1 sign bit, 2 exponent bits and 5 mantissa bits you can have 32
> different fractions between 1.0 and 0.0, and you have (as a
> suggestion) 10^1, 10^0, 10^-1, 10^2
> multipliers from the exponent bits. Then the closest you can come to 3.14 is
> 10*(1.0/32) * 10^1 = 0.3125 * 10^1 = 3.125.
>
> Good enough for government work, or what is it they say?

I am assuming that you are being flippant, so I send this return message
in the same vein. ;)

What they say is, "The net provides a thousand wrong answers as well as
the right one."

First, mixing bases for exponent (10) and fractional mantissa (2) is
unusual.  Think there might be a reason for that?  (Hint: there is.)

In addition, you make the implicit assumption that the floating point
numbers are allowed to be denormal without explicit indication (i.e. no
implicit leading 1 in the mantissa).  While not incorrect, dropping the
implicit leading 1 without saying so is certainly an unusual assumption
and should be noted.

To top things off, two's complement arithmetic normally implies that the
negative range is larger than the positive range.  Thus, the normal
sequence for your exponent would be -2, -1, 0, 1.

Additionally, since you allow unindicated denormal numbers, you would be
better off representing the number as 5/16 (0.3125) so that you
could use an exponent of 1 and preserve more bits of accuracy.

Finally, the request to store 3.14 normally carries an implication of
error of approximately +/- 1/200 (1/2 ulp).  Your floating point exceeds
that error: 3.125 is off by 3/200.  Not indicating that the nearest
number and its two neighbors are {9/32*10, 10/32*10, 11/32*10} =
{2.8125, 3.125, 3.4375} is just unsporting.

-a

```
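The arithmetic in the reply can be checked with a short sketch. It assumes the layout from the quoted suggestion (1 sign bit, 2 exponent bits, 5 mantissa bits, base-10 exponent, no implicit leading 1) and the two's-complement exponent range -2..1 proposed in the reply. The bit positions and the `decode` helper are illustrative choices, not something either poster specified.

```python
from fractions import Fraction


def decode(byte):
    """Decode one byte of the hypothetical 8-bit format.

    Assumed bit layout (an illustration, not from the thread):
    bit 7 = sign, bits 6-5 = exponent, bits 4-0 = mantissa.
    """
    sign = -1 if byte & 0x80 else 1
    exp_field = (byte >> 5) & 0x3
    mantissa = byte & 0x1F
    # Read the 2-bit exponent as two's complement: 0, 1, -2, -1.
    exponent = exp_field - 4 if exp_field >= 2 else exp_field
    # No implicit leading 1: the value is mantissa/32 scaled by 10**exponent.
    return sign * Fraction(mantissa, 32) * Fraction(10) ** exponent


# Every distinct representable value, in increasing order.
values = sorted({decode(b) for b in range(256)})

target = Fraction("3.14")
nearest = min(values, key=lambda v: abs(v - target))
i = values.index(nearest)
below, above = values[i - 1], values[i + 1]

error = abs(nearest - target)
half_ulp = Fraction(1, 200)  # the +/- 0.005 implied by writing "3.14"

print(float(below), float(nearest), float(above))  # 2.8125 3.125 3.4375
print(float(error), error > half_ulp)              # 0.015 True
```

Using `Fraction` keeps every value exact, so the comparison against 1/200 is not itself subject to binary rounding.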