Module2.1 of nothing
Engineers (IEEE)
Floating-point representation stores real numbers in a fixed number of bits, covering a
wide range of values at a limited precision. There are several ways to represent
floating-point numbers, but IEEE 754 is the most widely used. The IEEE 754 standard defines
how floating-point numbers are stored and manipulated in computers.
Example
Convert 85.125 to IEEE 754 format.
85 = 1010101
0.125 = 0.001
85.125 = 1010101.001
= 1.010101001 x 2^6
sign = 0 (the number is positive)
1. Single precision:
biased exponent 127+6=133
133 = 10000101
Normalised mantissa = 010101001
we will add 0's to complete the 23 bits (010101001 followed by 14 zeros)
2. Double precision:
biased exponent 1023+6=1029
1029 = 10000000101
Normalised mantissa = 010101001
we will add 0's to complete the 52 bits (010101001 followed by 43 zeros)
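The worked example above can be checked on a machine. The following is a minimal sketch, assuming Python and its standard struct module; the helper name float_bits is made up for illustration. It packs 85.125 as a single and as a double precision value and splits the resulting bits into sign, biased exponent and mantissa fields.

```python
import struct

def float_bits(value, fmt, total_bits, exponent_bits):
    """Return (sign, exponent, mantissa) bit strings for value packed as IEEE 754.

    fmt is ">f" for single precision or ">d" for double precision.
    """
    int_fmt = {">f": ">I", ">d": ">Q"}[fmt]          # unsigned int of the same width
    (raw,) = struct.unpack(int_fmt, struct.pack(fmt, value))
    bits = format(raw, "0{}b".format(total_bits))    # zero-padded binary string
    return bits[0], bits[1:1 + exponent_bits], bits[1 + exponent_bits:]

single = float_bits(85.125, ">f", 32, 8)
double = float_bits(85.125, ">d", 64, 11)
print("single:", single)
print("double:", double)
```

The exponent fields should read 10000101 and 10000000101, and both mantissa fields should start with 010101001 and then contain only zeros.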
Advantages
Disadvantages
Real-World Applications
To store 123.75 as a floating point number, first convert it to binary using place values:
64  32  16  8  4  2  1  0.5  0.25
1   1   1   1  0  1  1  1    1
64 + 32 + 16 + 8 + 2 + 1 + 0.5 + 0.25 = 123.75
123.75 in binary is 1111011.11
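The place-value conversion can also be done step by step in code. This is a minimal sketch in Python; the function name to_binary and the max_fraction_bits limit are assumptions made for illustration. The whole part is converted directly, and the fractional part is doubled repeatedly, taking the digit that appears in front of the point each time.

```python
def to_binary(value, max_fraction_bits=16):
    """Convert a non-negative number to a binary string such as '1111011.11'."""
    whole = int(value)
    fraction = value - whole
    whole_bits = bin(whole)[2:]              # whole part, without the '0b' prefix
    fraction_bits = ""
    # Doubling the fraction shifts the next binary digit in front of the point.
    while fraction > 0 and len(fraction_bits) < max_fraction_bits:
        fraction *= 2
        bit = int(fraction)
        fraction_bits += str(bit)
        fraction -= bit
    return whole_bits + ("." + fraction_bits if fraction_bits else "")

print(to_binary(123.75))   # 1111011.11
print(to_binary(85.125))   # 1010101.001
```

Values such as 0.1 have no finite binary expansion, so the sketch stops after max_fraction_bits digits.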
Key fact
The computer will not store the actual decimal point as part of the floating point
number but it is used here for illustrative purposes.
To find the mantissa, move the decimal point so that it sits immediately to the left of the
most significant bit:
1111011.11 → 0.111101111
To calculate the exponent, count how many places the decimal point moved to give the
mantissa. In this case the decimal point moved seven places to the left, so the exponent
for our number is 7.
4 2 1
1 1 1
In binary, the number 7 is 111 as 4 + 2 + 1 = 7
In order to represent 123.75 the mantissa would be 111101111 and the exponent would be
111. This can be thought of as:
0.111101111 x 2^111 (the exponent 111 is written in binary, so this is 2^7)
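The same split into mantissa and exponent can be sketched in Python. The standard function math.frexp returns a fraction between 0.5 and 1 together with an exponent, which matches the 0.1... form used here. The helper name mantissa_exponent and the default of 9 mantissa digits are assumptions chosen to fit this example.

```python
import math

def mantissa_exponent(value, mantissa_bits=9):
    """Split a positive value into a '0.1...' binary mantissa and an exponent."""
    # value == fraction * 2**exponent, with 0.5 <= fraction < 1
    fraction, exponent = math.frexp(value)
    digits = ""
    for _ in range(mantissa_bits):
        fraction *= 2            # shift the next binary digit in front of the point
        bit = int(fraction)
        digits += str(bit)
        fraction -= bit
    return "0." + digits, exponent

mantissa, exponent = mantissa_exponent(123.75)
print(mantissa, "x 2^" + format(exponent, "b"))   # 0.111101111 x 2^111
```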
Sign bit
As well as the mantissa, base and exponent, we have a digit before the decimal point. This is
used as a sign bit and is represented in binary as a 0 for positive and a 1 for negative.
How many bits?
There will always be a trade-off between accuracy and range when using floating point
notation, as there will always be a set number of bits allocated to storing real numbers:
increasing the number of bits devoted to the mantissa will improve the accuracy of a
floating point number
increasing the number of bits devoted to the exponent will increase the range of
numbers that can be held
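The trade-off can be made concrete with a small experiment. This is a sketch only: the function names are made up for illustration, spare mantissa bits are cut off rather than rounded, only positive values are handled, and the exponent is assumed to be stored as a signed two's complement number (the text above does not say how the exponent is encoded).

```python
import math

def store_with_mantissa_bits(value, mantissa_bits):
    """Return the positive value that survives when only mantissa_bits bits are kept."""
    fraction, exponent = math.frexp(value)        # value == fraction * 2**exponent
    scale = 2 ** mantissa_bits
    truncated = math.floor(fraction * scale) / scale   # drop the extra bits
    return math.ldexp(truncated, exponent)

def largest_exponent(exponent_bits):
    """Largest exponent, ASSUMING a two's complement exponent field."""
    return 2 ** (exponent_bits - 1) - 1

for bits in (4, 6, 9, 15):
    print(bits, "mantissa bits:", store_with_mantissa_bits(123.75, bits))

for bits in (3, 8):
    print(bits, "exponent bits: largest exponent", largest_exponent(bits))
```

With 4 or 6 mantissa bits, 123.75 is stored as 120.0 or 122.0; from 9 bits upwards it is stored exactly. Under the stated assumption, going from 3 to 8 exponent bits raises the largest exponent from 3 to 127.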
In the Higher course, floating point numbers are represented as follows:
1 bit for the sign
15 bits for the mantissa
8 bits for the exponent
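As an illustration of this 24-bit layout, here is a rough Python sketch that produces the three fields for a given number. Several details are assumptions not stated above: the function name encode_higher is hypothetical, extra mantissa bits are cut off rather than rounded, negative exponents are stored in two's complement, and no check is made that the exponent fits in 8 bits.

```python
import math

def encode_higher(value, mantissa_bits=15, exponent_bits=8):
    """Return (sign, mantissa, exponent) bit strings for a non-zero value."""
    sign = "1" if value < 0 else "0"
    # abs(value) == fraction * 2**exponent, with 0.5 <= fraction < 1
    fraction, exponent = math.frexp(abs(value))
    mantissa = format(int(fraction * 2 ** mantissa_bits),
                      "0{}b".format(mantissa_bits))
    # Assumption: two's complement exponent, so -1 becomes 11111111.
    exponent_field = format(exponent % (2 ** exponent_bits),
                            "0{}b".format(exponent_bits))
    return sign, mantissa, exponent_field

print(encode_higher(123.75))   # ('0', '111101111000000', '00000111')
print(encode_higher(0.25))     # ('0', '100000000000000', '11111111')
```

For 123.75 the mantissa is the 111101111 found earlier, padded with zeros to 15 bits, and the exponent 7 is stored as 00000111.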
FLOATING POINT ADDITION AND SUBTRACTION