Cosc 2150: Computer Organization: Chapter 9, Part 3 Floating Point Numbers
Computer Organization
Chapter 9, Part 3
Floating point numbers
Real Numbers
Example:
Express 32_10 in the simplified 14-bit floating-point model.
We know that 32 is 2^5. So in (binary) scientific notation
32 = 1.0 x 2^5 = 0.1 x 2^6.
In a moment, we'll explain why we prefer the second notation to the first.
Using this information, we put 110_2 (= 6_10) in the exponent field and a 1 in the leading bit of the significand.
2.5 Floating-Point Representation
Example:
Express 32_10 in the revised 14-bit floating-point model.
We know that 32 = 1.0 x 2^5 = 0.1 x 2^6.
To use our excess-16 biased exponent, we add 16 to 6, giving 22_10 (= 10110_2).
So we have (sign, exponent, significand):
0 10110 10000000
Example 2
Example:
Express 0.0625_10 in the revised 14-bit floating-point model.
We know that 0.0625 is 2^-4. So in (binary) scientific notation
0.0625 = 1.0 x 2^-4 = 0.1 x 2^-3.
To use our excess-16 biased exponent, we add 16 to -3, giving 13_10 (= 01101_2).
Example 3
Example:
Express -26.625_10 in the revised 14-bit floating-point model.
We find 26.625_10 = 11010.101_2. Normalizing, we have:
26.625_10 = 0.11010101 x 2^5.
To use our excess-16 biased exponent, we add 16 to 5, giving 21_10 (= 10101_2). We also need a 1 in the sign bit.
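The three examples above can be checked with a short Python sketch. It assumes the field layout the examples imply (1 sign bit, a 5-bit excess-16 exponent, an 8-bit significand normalized as 0.1xxx with no implied bit) and it truncates any bits that do not fit. The function name encode14 and the all-zero pattern for zero are illustrative assumptions, not part of the model as stated on the slides.

```python
import math

def encode14(value):
    """Return (sign, exponent, significand) bit strings for the 14-bit model.

    Assumed layout: 1 sign bit, 5-bit excess-16 exponent, 8-bit significand.
    """
    if value == 0:
        # Assumption: represent zero as all zero bits.
        return "0", "00000", "00000000"
    sign = "1" if value < 0 else "0"
    # math.frexp returns (m, e) with abs(value) == m * 2**e and 0.5 <= m < 1,
    # which matches the 0.1xxx x 2^e normalization used in the examples.
    m, e = math.frexp(abs(value))
    biased = e + 16                      # excess-16 bias
    if not 0 <= biased <= 31:
        raise ValueError("exponent does not fit in 5 bits")
    significand = int(m * 256)           # keep the first 8 fraction bits, truncate the rest
    return sign, format(biased, "05b"), format(significand, "08b")

for x in (32, 0.0625, -26.625):
    print(x, encode14(x))
```

For -26.625 this yields sign 1, exponent 10101, significand 11010101, which agrees with the hand calculation.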
Floating-Point Standards
Since we have an implied 1 in the significand, this equates to
-(1).111_2 x 2^(128 - 127) = -1.111_2 x 2^1 = -11.11_2 = -3.75.
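The same decoding can be reproduced in Python. This sketch assumes the format in question is the 32-bit single-precision layout (1 sign bit, 8-bit exponent with bias 127, 23-bit fraction with an implied leading 1) and handles normalized values only. The helper name decode_single is hypothetical.

```python
import struct

def decode_single(value):
    """Split a number, stored as a 32-bit float, into (sign, exponent, fraction) fields."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF        # 23 bits, implied leading 1 not stored
    return sign, exponent, fraction

sign, exponent, fraction = decode_single(-3.75)
print(sign, exponent, format(fraction, "023b"))

# Rebuild the value: (-1)^sign x 1.fraction x 2^(exponent - 127).
# Valid for normalized numbers only (exponent not 0 or 255).
rebuilt = (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)
print(rebuilt)
```

The stored exponent comes out as 128 and the fraction begins with 111, matching the worked value above.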
FP Ranges
For a 32-bit number
8-bit exponent
+/- 2^256 ≈ 1.2 x 10^77
Accuracy
The effect of changing the lsb of the significand
23-bit significand: 2^-23 ≈ 1.2 x 10^-7
About 6 decimal places
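These accuracy figures can be checked numerically. The sketch below assumes a 23-bit stored significand as stated above, and uses Python's struct module to round a value to a 32-bit float; the helper name to_single is hypothetical.

```python
import math
import struct

def to_single(x):
    """Round a Python float (a 64-bit double) to the nearest 32-bit float."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

lsb = 2.0 ** -23                      # weight of the least significant significand bit
decimal_digits = 23 * math.log10(2)   # number of decimal digits that 23 bits can carry

print(lsb)             # roughly 1.19e-07
print(decimal_digits)  # between 6 and 7

# A change of one lsb near 1.0 is visible; a change of a quarter lsb is rounded away.
print(to_single(1.0 + lsb) != 1.0)
print(to_single(1.0 + lsb / 4) == 1.0)
```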
Expressible Numbers
Rounding and Errors
(128.5 - 128) / 128.5 ≈ 0.39%
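A minimal sketch of where this error comes from, assuming 128.5 is stored in a format with an 8-bit significand (as in the 14-bit model) and that the extra bits are truncated rather than rounded. The function names store_truncated and relative_error are illustrative.

```python
import math

def store_truncated(value, sig_bits=8):
    """Keep only the first sig_bits bits of the 0.1xxx significand (positive values)."""
    m, e = math.frexp(value)          # value == m * 2**e with 0.5 <= m < 1
    kept = int(m * 2**sig_bits)       # drop the bits that do not fit
    return math.ldexp(kept / 2**sig_bits, e)

def relative_error(true_value, stored_value):
    return abs(true_value - stored_value) / abs(true_value)

stored = store_truncated(128.5)       # 10000000.1 in binary needs 9 bits, so the last bit is lost
err = relative_error(128.5, stored)
print(stored)                         # 128.0
print(f"{err:.2%}")                   # 0.39%
```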