L-5 Floating Point Representation of Numbers
Md. Monarul Islam Mithu
Lecturer, Daffodil International University
Dept. of CSE
Types of Computer Arithmetic
Computers require two types of arithmetic operations:
(i) Integer arithmetic,
(ii) Real or floating point arithmetic.
Integer arithmetic, as the name implies, deals
with integer operands, that is, operands without
fractional parts.
Real arithmetic, on the other hand, uses numbers
with fractional parts and is used in most
computations.
Integer representation
Can you remember how to represent an integer in a register?
Do you know what signed and unsigned numbers are?
Do you know what a computer register is?
Example: -12 is stored in an 8-bit register as 10001100
(sign bit 1, followed by the 7-bit magnitude 0001100).
Can it be represented in a 16-bit or 64-bit register? How? Why?
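The pattern 10001100 for -12 matches the sign-magnitude convention (one sign bit followed by the magnitude). A minimal Python sketch of that convention follows, as an illustration only; the function name `sign_magnitude` is hypothetical and not part of the lecture material. It also shows that a wider register simply pads the magnitude with leading zeros.

```python
def sign_magnitude(value: int, bits: int = 8) -> str:
    """Return the sign-magnitude bit pattern of `value` for a `bits`-bit register."""
    magnitude_bits = bits - 1  # one bit is reserved for the sign
    if abs(value) >= 2 ** magnitude_bits:
        raise ValueError(f"{value} does not fit in {bits} bits")
    sign = "1" if value < 0 else "0"
    # Pad the magnitude with leading zeros to fill the register.
    return sign + format(abs(value), f"0{magnitude_bits}b")


print(sign_magnitude(-12, 8))   # 10001100
print(sign_magnitude(-12, 16))  # 1000000000001100
```

Note that many machines store negative integers in two's complement instead, which gives a different pattern for -12.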
Integer representation
Q. How many numbers can be represented in an 8-bit
register, and what is the range of those numbers?
Try yourself: 16 bits, 32 bits, or 64 bits
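One way to check an answer to this question is to compute the counts directly. The sketch below is an illustration, not the lecture's own solution: an n-bit register holds 2^n distinct bit patterns, and the range depends on the convention assumed (unsigned, sign-magnitude, or two's complement). The helper name `register_ranges` is hypothetical.

```python
def register_ranges(bits: int) -> dict:
    """Number of bit patterns and value ranges for a `bits`-bit register."""
    patterns = 2 ** bits
    half = 2 ** (bits - 1)
    return {
        "patterns": patterns,
        "unsigned": (0, patterns - 1),
        # Sign-magnitude has both +0 and -0, so one value is lost.
        "sign_magnitude": (-(half - 1), half - 1),
        "twos_complement": (-half, half - 1),
    }


for name, result in register_ranges(8).items():
    print(name, result)
```

Calling `register_ranges(16)`, `register_ranges(32)`, or `register_ranges(64)` gives the figures for the wider registers.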
Fixed Point and Floating Point Numbers
Fixed-point and floating-point numbers: In digital
systems, data is stored in memory registers as binary
bits (0s and 1s), because the computer works only
with binary values.
Fixed point arithmetic
Sign bit - Fixed-point numbers in binary use a sign bit.
A positive number has a sign bit of 0, while a negative
number has a sign bit of 1.
Integral part - The length of the integral part varies
from one format to another and depends on the register's
size; for example, in an 8-bit register the integral part is 4 bits.
Fractional part - The length of the fractional part also varies
from one format to another and depends on the register's
size; for example, in an 8-bit register the fractional part is 3 bits.
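Mind it!!
8 bits = 1 sign bit + 4 bits (integral part) + 3 bits (fractional part)
Example: the number 4.5 is 100.1 in binary, so it is stored as
0 0100 100 (sign, integral part, fractional part).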
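The 1 + 4 + 3 layout can be sketched in Python as follows. This is a minimal illustration assuming a sign-magnitude encoding, with values that are not exactly representable rounded to the nearest multiple of 2^-3; the function name `to_fixed_8bit` is hypothetical.

```python
INT_BITS = 4   # bits for the integral part
FRAC_BITS = 3  # bits for the fractional part


def to_fixed_8bit(x: float) -> str:
    """Encode x as 'sign integral fractional' in the 1 + 4 + 3 layout."""
    # Scaling by 2**FRAC_BITS moves the binary point to the right end.
    scaled = round(abs(x) * 2 ** FRAC_BITS)
    if scaled >= 2 ** (INT_BITS + FRAC_BITS):
        raise ValueError(f"{x} is out of range for this format")
    sign = "1" if x < 0 else "0"
    body = format(scaled, f"0{INT_BITS + FRAC_BITS}b")
    return f"{sign} {body[:INT_BITS]} {body[INT_BITS:]}"


print(to_fixed_8bit(4.5))   # 0 0100 100
print(to_fixed_8bit(-4.5))  # 1 0100 100
```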
What is the smallest negative number in this fixed-point representation?
What is the largest number in this fixed-point representation?
Fixed Point Arithmetic
One method of representing real numbers in a computer
is to assume a fixed position for the binary point and
store every number relative to that assumed point.
For example, a memory location with 9 integer bits and
6 fractional bits can store +101101101.101101.
Fixed Point Arithmetic (continued)
If such a convention is used, the maximum and
minimum (in magnitude) numbers that may be
stored are:
• Maximum: $111111111.111111_2 = (2^9 - 1) + (1 - 2^{-6}) = 511.984375_{10}$
• Minimum: $000000000.000001_2 = 2^{-6} = 0.015625_{10}$
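These two limits can be checked numerically. The short sketch below assumes the format with 9 integer bits and 6 fractional bits used in this example; the variable names are illustrative.

```python
INT_BITS = 9
FRAC_BITS = 6

# All-ones pattern: 111111111.111111 in binary.
largest = int("1" * INT_BITS, 2) + int("1" * FRAC_BITS, 2) / 2 ** FRAC_BITS

# Only the last fractional bit set: 000000000.000001 in binary.
smallest = 1 / 2 ** FRAC_BITS

print(largest)   # 511.984375
print(smallest)  # 0.015625
```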
Floating Point Arithmetic
Shifting the mantissa to the left until its most
significant bit is non-zero is called normalization.
Normalization is done to preserve the maximum
number of useful (information-carrying) bits.
The leading zeros in 0.000010101 serve only to locate
the binary point.
This information can therefore be transferred to the
exponent part of the number, and the number is stored
as $0.10101 \times 2^{-4}$.
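The normalization step can be sketched in Python as below. This is a minimal illustration that works on the string of bits after the binary point; the function name `normalize` and its string-based interface are assumptions made for the example.

```python
def normalize(fraction_bits: str, exponent: int = 0) -> tuple:
    """Normalize 0.<fraction_bits> x 2**exponent.

    Returns (mantissa_bits, new_exponent) so that the first mantissa bit is 1.
    """
    stripped = fraction_bits.lstrip("0")
    if not stripped:
        return "0", 0  # zero has no normalized form
    # Each leading zero removed is one left shift of the mantissa,
    # which is compensated by decreasing the exponent by 1.
    shift = len(fraction_bits) - len(stripped)
    return stripped, exponent - shift


mantissa, exp = normalize("000010101")
print(f"0.{mantissa} x 2^{exp}")  # 0.10101 x 2^-4
```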
When numbers are stored using this notation, the range
of numbers (in magnitude) that may be stored is:
• Maximum = 0.11111111 E 0111111 = $(1 - 2^{-8}) \times 2^{63}$
• Minimum = 0.10000000 E 1111111 = $2^{-1} \times 2^{-63} = 2^{-64}$
This range is much larger than the range $2^9$ to $2^{-6}$
obtained with the fixed point representation.
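The two limits can be reproduced with exact rational arithmetic. The sketch below assumes the layout implied by the two patterns above: an 8-bit normalized mantissa and a 7-bit exponent whose first bit is a sign bit followed by a 6-bit magnitude. The function name `float_value` is hypothetical.

```python
from fractions import Fraction


def float_value(mantissa_bits: str, exponent_bits: str) -> Fraction:
    """Value of 0.<mantissa_bits> E <exponent_bits> (sign-magnitude exponent)."""
    mantissa = Fraction(int(mantissa_bits, 2), 2 ** len(mantissa_bits))
    magnitude = int(exponent_bits[1:], 2)
    exponent = -magnitude if exponent_bits[0] == "1" else magnitude
    return mantissa * Fraction(2) ** exponent


maximum = float_value("11111111", "0111111")  # (1 - 2^-8) x 2^63
minimum = float_value("10000000", "1111111")  # 2^-1 x 2^-63 = 2^-64

print(float(maximum))
print(float(minimum))
```

Using `Fraction` keeps the results exact, so they can be compared directly with the closed-form expressions.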
Key Words/Phrases
Integer arithmetic
Mantissa
Exponent
Normalization