Fixed _And_Floating_Point_representation

The document discusses fixed and floating point representation in computer systems, focusing on how numbers are stored in binary format. It explains the differences between unsigned integers, sign-and-magnitude, and two's complement representations, as well as the significance of floating-point representation for real numbers. Additionally, it covers the IEEE standard for floating-point representation, including normalization and the handling of special cases like zero and truncation errors.

Uploaded by

goutam sanyal


Course: Computer System Architecture
Class: Sem-1
Lesson: Fixed and floating point representation
By: Goutam Sanyal
Fixed and floating point representation
STORING NUMBERS

A number is changed to the binary system before being stored in the computer
memory. However, there are still two issues that need to be
handled:

1. How to store the sign of the number.

2. How to show the decimal point.

For the decimal point, computers use two different representations: fixed-point and
floating-point. The first is used to store a number as an integer, without a fractional
part; the second is used to store a number as a real, with a fractional part.
Storing integers

Integers are whole numbers (numbers without a fractional part). For
example, 134 and −125 are integers, whereas 134.23 and −0.235 are not.
An integer can be thought of as a number in which the position of the
decimal point is fixed: the decimal point is to the right of the least significant
(rightmost) bit. For this reason, fixed-point representation is used to store an
integer, as shown in the figure. In this representation the decimal point is
assumed but not stored.
Unsigned representation
An unsigned integer is an integer that can never be negative and can take only 0 or
positive values. Mathematically its range is between 0 and positive infinity, but an
n-bit memory location can hold only values in the range
0 → $2^n - 1$

An input device stores an unsigned integer using the following steps:

1. The integer is changed to binary.

2. If the number of bits is less than n, 0s are added to the left.
Example

Store 7 in an 8-bit memory location using unsigned representation.

Solution
First change the integer to binary, $(111)_2$. Add five 0s to make a total of eight
bits, $(00000111)_2$. The integer is stored in the memory location. Note that the
subscript 2 is used to emphasize that the integer is binary, but the subscript is
not stored in the computer.
Example

Store 258 in a 16-bit memory location.

Solution
First change the integer to binary, $(100000010)_2$. Add seven 0s to make a total
of sixteen bits, $(0000000100000010)_2$. The integer is stored in the memory
location.
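The two storage steps and both examples above can be sketched in Python as follows. This is only an illustration: the function name `store_unsigned` and the choice to raise an error for values that do not fit are assumptions, not part of the lesson.

```python
def store_unsigned(value: int, n: int) -> str:
    """Return the n-bit unsigned pattern for value as a string of 0s and 1s."""
    if not 0 <= value <= 2**n - 1:
        raise ValueError(f"{value} does not fit in {n} bits")
    binary = bin(value)[2:]    # step 1: change the integer to binary
    return binary.zfill(n)     # step 2: add 0s to the left until there are n bits


print(store_unsigned(7, 8))     # 00000111
print(store_unsigned(258, 16))  # 0000000100000010
```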

Retrieving unsigned integers

An output device retrieves a bit string from memory as a bit pattern and
converts it to an unsigned decimal integer.
The figure shows what happens if we try to store an integer that is larger than $2^4 - 1 = 15$
in a memory location that can only hold four bits.
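A small sketch of retrieval and of this overflow case, assuming that the location simply keeps the n rightmost bits and loses the rest. The value 20 and the helper names are illustrative choices, not taken from the lesson's figure.

```python
def retrieve_unsigned(bits: str) -> int:
    """Interpret a bit pattern as an unsigned decimal integer."""
    return int(bits, 2)


def store_with_overflow(value: int, n: int) -> str:
    """Keep only the n rightmost bits, as an n-bit location would."""
    kept = value & ((1 << n) - 1)   # bits beyond position n-1 are lost
    return format(kept, f"0{n}b")


pattern = store_with_overflow(20, 4)        # 20 is (10100) in binary
print(pattern, retrieve_unsigned(pattern))  # 0100 4
```

Under this assumption, the integer read back (4) is not the integer that was stored (20), which is what overflow means here.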

Figure 3.5 Overflow in unsigned integers

Applications of unsigned integers: counting, addressing, and storing other data types (text, images, audio and video).
Sign-and-magnitude representation

In this method, the available range for unsigned integers (0 to $2^n - 1$) is divided into
two equal sub-ranges. The first half represents positive integers, the second half,
negative integers. The leftmost bit stores the sign (0 for positive, 1 for negative) and
the remaining n − 1 bits store the magnitude.

Figure Sign-and-magnitude representation

Note that we have two 0s: positive zero and negative zero.
Range: $-(2^{n-1} - 1)$ to $+(2^{n-1} - 1)$
Example
Store +28 in an 8-bit memory location using sign-and-magnitude
representation.

Solution
The integer is changed to 7-bit binary, $(0011100)_2$. The leftmost bit is set to 0. The 8-bit
number $(00011100)_2$ is stored.
Example
Store -28 in an 8-bit memory location using sign-and-magnitude
representation.

Solution
The integer is changed to 7-bit binary, $(0011100)_2$. The leftmost bit is set to 1. The 8-bit
number $(10011100)_2$ is stored.
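The two sign-and-magnitude examples can be reproduced with a short sketch. The function names are hypothetical, and the range check is an assumption about how an out-of-range value should be handled.

```python
def store_sign_magnitude(value: int, n: int) -> str:
    """Return the n-bit sign-and-magnitude pattern for value."""
    magnitude = abs(value)
    if magnitude > 2**(n - 1) - 1:
        raise ValueError(f"{value} does not fit in {n} bits")
    sign_bit = "1" if value < 0 else "0"               # leftmost bit is the sign
    return sign_bit + format(magnitude, f"0{n - 1}b")  # n-1 bits of magnitude


def retrieve_sign_magnitude(bits: str) -> int:
    """Convert an n-bit sign-and-magnitude pattern back to a decimal integer."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude


print(store_sign_magnitude(28, 8))    # 00011100
print(store_sign_magnitude(-28, 8))   # 10011100
```

Both `00000000` and `10000000` are read back as zero, which is the pair of zeros noted above.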
Two’s complement representation

Almost all computers use two’s complement representation to store a signed integer
in an n-bit memory location. In this method, the available range for an unsigned
integer (0 to $2^n - 1$) is divided into two equal sub-ranges. The first sub-range is
used to represent nonnegative integers, the second to represent negative
integers. The bit patterns are then assigned to negative and nonnegative (zero and
positive) integers as shown in the figure.
Example
The following shows that we always get the original integer if we apply the
two’s complement operation twice.
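As an illustration, the two’s complement operation can be sketched as "invert every bit, then add 1", keeping only n bits. The function name and the 8-bit pattern used here are illustrative choices.

```python
def twos_complement(bits: str) -> str:
    """Apply the two's complement operation: invert every bit, then add 1."""
    n = len(bits)
    inverted = "".join("1" if b == "0" else "0" for b in bits)
    # Adding 1 may carry out of the leftmost bit; only n bits are kept.
    return format((int(inverted, 2) + 1) % (1 << n), f"0{n}b")


original = "00011100"
once = twos_complement(original)
twice = twos_complement(once)
print(once)    # 11100100
print(twice)   # 00011100
```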
Storing an integer in two’s complement format:
• The integer is changed to n-bit binary.
• If it is positive or zero, it is stored as it is. If it is negative, the computer takes
the two’s complement and then stores it.

Retrieving an integer in two’s complement format:

• If the leftmost bit is 1, the computer applies the two’s
complement operation to the integer. If the leftmost bit is 0,
no operation is applied.
• The computer changes the integer to decimal.
Example
Store the integer 28 in an 8-bit memory location using two’s complement
representation.

Solution
The integer is positive (no sign means positive), so after decimal to binary
transformation no more action is needed. Note that three extra 0s are added to
the left of the integer $(11100)_2$ to make it eight bits: $(00011100)_2$.
Example
Store −28 in an 8-bit memory location using two’s complement
representation.

Solution
The integer is negative, so after changing to binary, the computer applies the
two’s complement operation on the integer: $(00011100)_2$ becomes $(11100100)_2$.
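The storing and retrieving steps for two’s complement, applied to the two examples above, might look like the sketch below. The function names are hypothetical, and the range check (from $-2^{n-1}$ to $2^{n-1} - 1$) is the usual two’s complement range rather than something stated in the lesson.

```python
def twos_complement(bits: str) -> str:
    """Invert every bit, then add 1, keeping the same number of bits."""
    n = len(bits)
    inverted = "".join("1" if b == "0" else "0" for b in bits)
    return format((int(inverted, 2) + 1) % (1 << n), f"0{n}b")


def store_twos_complement(value: int, n: int) -> str:
    """Change |value| to n-bit binary; apply the operation if value is negative."""
    if not -(2**(n - 1)) <= value <= 2**(n - 1) - 1:
        raise ValueError(f"{value} does not fit in {n} bits")
    bits = format(abs(value), f"0{n}b")
    return twos_complement(bits) if value < 0 else bits


def retrieve_twos_complement(bits: str) -> int:
    """If the leftmost bit is 1, apply the operation first, then negate."""
    if bits[0] == "1":
        return -int(twos_complement(bits), 2)
    return int(bits, 2)


print(store_twos_complement(28, 8))           # 00011100
print(store_twos_complement(-28, 8))          # 11100100
print(retrieve_twos_complement("11100100"))   # -28
```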
There is only one zero in two’s complement notation.

Overflow in two’s complement representation

Applications: it is the standard representation for storing integers in computers
today.
Storing reals

A real is a number with an integral part and a fractional part. For example, 23.7 is a
real number—the integral part is 23 and the fractional part is 7/10. Although a fixed-
point representation can be used to represent a real number, the result may not be
accurate or it may not have the required precision. The next two examples explain
why.

Real numbers with very large integral parts or very small fractional parts should
not be stored in fixed-point representation.
Example

In the decimal system, assume that we use a fixed-point representation with
two digits at the right of the decimal point and fourteen digits at the left of
the decimal point, for a total of sixteen digits. The precision of a real number
in this system is lost if we try to represent a decimal number such as 1.00234:
the system stores the number as 1.00.

Example

In the decimal system, assume that we use a fixed-point representation with
six digits to the right of the decimal point and ten digits to the left of the
decimal point, for a total of sixteen digits. The accuracy of a real number in
this system is lost if we try to represent a decimal number such as
236154302345.00. The system stores the number as 6154302345.00: the
integral part is much smaller than it should be.
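Both examples can be imitated with Python's `decimal` module. This is a rough sketch that assumes the system silently drops digits that do not fit on either side of the point; the function name `store_fixed_point` is hypothetical and only nonnegative values are handled.

```python
from decimal import Decimal, ROUND_DOWN


def store_fixed_point(text: str, int_digits: int, frac_digits: int) -> Decimal:
    """Keep int_digits digits left of the point and frac_digits right of it."""
    value = Decimal(text)
    # Digits beyond frac_digits to the right of the point are dropped.
    value = value.quantize(Decimal(10) ** -frac_digits, rounding=ROUND_DOWN)
    # Digits beyond int_digits to the left of the point are dropped.
    return value % (Decimal(10) ** int_digits)


first = store_fixed_point("1.00234", 14, 2)            # loses precision
second = store_fixed_point("236154302345.00", 10, 6)   # loses accuracy
print(first, second)
```

The first call loses the digits 234 after the point, and the second loses the two leading digits of the integral part, matching the values given in the examples.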
Floating-point representation

The solution for maintaining accuracy or precision is to use floating-point
representation.

Figure The three parts of a real number in floating-point representation

A floating-point representation of a number is made up of three parts: a sign, a
shifter and a fixed-point number.

Floating-point representation is used in science to represent very small or very large
decimal numbers. In this representation, called scientific notation, the fixed-point
section has only one digit to the left of the point and the shifter is the power of 10.
Example

The following shows the decimal number

7,425,000,000,000,000,000,000.00

in scientific notation (floating-point representation): $+7.425 \times 10^{21}$.

The three sections are the sign (+), the shifter (21) and the fixed-point part
(7.425). Note that the shifter is the exponent.
Some programming languages and calculators show the number as +7.425E21.
Example

Show the number

−0.0000000000000232

in scientific notation (floating-point representation).

Solution
We use the same approach as in the previous example: we move the decimal
point after the digit 2, which gives $-2.32 \times 10^{-14}$.

The three sections are the sign (−), the shifter (−14) and the fixed-point part
(2.32). Note that the shifter is the exponent.
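One possible way to extract the three sections from a decimal numeral is shown below, using Python's `decimal` module. The function name `scientific_parts` is hypothetical, and the two inputs are built programmatically so that the number of zeros is explicit.

```python
from decimal import Decimal


def scientific_parts(text: str):
    """Split a nonzero decimal numeral into (sign, fixed-point part, shifter)."""
    sign, digits, exponent = Decimal(text).normalize().as_tuple()
    shifter = exponent + len(digits) - 1     # the power of 10
    fixed = str(digits[0])                   # one digit to the left of the point
    if len(digits) > 1:
        fixed += "." + "".join(str(d) for d in digits[1:])
    return ("-" if sign else "+"), fixed, shifter


large = "7425" + "0" * 18 + ".00"    # 7,425 followed by eighteen zeros
small = "-0." + "0" * 13 + "232"     # thirteen zeros after the point, then 232
print(scientific_parts(large))   # ('+', '7.425', 21)
print(scientific_parts(small))   # ('-', '2.32', -14)
```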
$(.1)_2 = 1 \times 2^{-1}$
$(.01)_2 = 1 \times 2^{-2}$
$(.001)_2 = 1 \times 2^{-3}$

$(1)_2 = 1 \times 2^{0}$
$(10)_2 = 1 \times 2^{1}$
$(100)_2 = 1 \times 2^{2}$

$(.011)_2 = (.01)_2 + (.001)_2 = 1 \times 2^{-2} + 1 \times 2^{-3}$
$= (10)_2 \times 2^{-3} + (1)_2 \times 2^{-3}$
$= (10 + 1)_2 \times 2^{-3} = (11)_2 \times 2^{-3}$

Binary        Exponent representation    Integer part    Exponent
(.1)2         1 × 2^-1                   1               -1
(.01)2        1 × 2^-2                   1               -2
(.001)2       1 × 2^-3                   1               -3
(.00001)2     1 × 2^-5                   1               -5
(1)2          1 × 2^0                    1               0
(10)2         1 × 2^1                    1               1
(100)2        1 × 2^2                    1               2
(.011)2       11 × 2^-3                  11              -3
Example

Show the number

−(0.00000000000000000000000101)2

in floating-point representation.

Solution
We use the same idea, keeping only one digit to the left of the decimal point.
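The same idea can be sketched for binary numerals written as strings. The function name `normalize_binary` is hypothetical. The second call uses an illustrative input built with 23 zeros after the point; that count is an assumption of this sketch, not a value read from the example.

```python
def normalize_binary(text: str):
    """Return (sign, fixed-point part, shifter) with one 1 left of the point."""
    sign = "-" if text.startswith("-") else "+"
    integral, _, fraction = text.lstrip("+-").partition(".")
    digits = integral + fraction
    first_one = digits.find("1")
    if first_one == -1:
        raise ValueError("zero cannot be normalized")
    # Number of places the point moves (positive means it moves left).
    shifter = len(integral) - 1 - first_one
    significant = digits[first_one:].rstrip("0")
    fixed = significant[0]
    if len(significant) > 1:
        fixed += "." + significant[1:]
    return sign, fixed, shifter


print(normalize_binary("101.11"))                    # ('+', '1.0111', 2)
print(normalize_binary("-0." + "0" * 23 + "101"))    # ('-', '1.01', -24)
```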
Normalization

To make the fixed part of the representation uniform, both scientific notation (for the
decimal system) and the floating-point method (for the binary system) use only one non-zero
digit on the left of the decimal point. This is called normalization. In the decimal system this
digit can be 1 to 9, while in the binary system it can only be 1. In the following, d is a non-zero
digit, x is a digit, and y is either 0 or 1.

Decimal: ± d.xxxxxxxxxxxxxx
Binary: ± 1.yyyyyyyyyyyyyy
Note that the point and the bit 1 to the left of the fixed-point section are not stored—
they are implicit.

The mantissa is a fractional part that, together with the sign, is treated like an integer
stored in sign-and-magnitude representation.
Excess_127 and Excess_1023 system
• The exponent, the power that shows how many bits the decimal point
should be moved to the left or right, is a signed number.
• Although this could have been stored using two’s complement
representation, a new representation, called the Excess system, is used
instead.
• In the Excess system, both positive and negative integers are stored as
unsigned integers.
• To represent a positive or negative integer, a positive integer (called a bias)
is added to each number to shift them uniformly to the non-negative side.
• The value of this bias is $2^{m-1} - 1$, where m is the size of the memory
location to store the exponent.
Example

We can express sixteen integers in a number system with 4-bit allocation. By adding
seven units to each integer in this range, we can uniformly translate all integers to the
right and make all of them positive without changing the relative position of the
integers with respect to each other, as shown in the figure. The new system is referred
to as Excess-7, or biased representation with biasing value of 7.

Figure Shifting in Excess representation
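A minimal sketch of the Excess system, using the bias $2^{m-1} - 1$ given above. The function names are hypothetical. With m = 4 this is Excess-7, and with m = 8 it is Excess_127.

```python
def excess_encode(value: int, m: int) -> str:
    """Add the bias and store the result as an m-bit unsigned integer."""
    bias = 2**(m - 1) - 1
    stored = value + bias
    if not 0 <= stored <= 2**m - 1:
        raise ValueError(f"{value} is outside the Excess-{bias} range")
    return format(stored, f"0{m}b")


def excess_decode(bits: str) -> int:
    """Read the pattern as unsigned and subtract the bias."""
    bias = 2**(len(bits) - 1) - 1
    return int(bits, 2) - bias


print(excess_encode(-7, 4))   # 0000
print(excess_encode(0, 4))    # 0111
print(excess_encode(8, 4))    # 1111
print(excess_encode(2, 8))    # 10000001
```

Because the same bias is added to every value, the stored patterns keep the order of the original integers.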

IEEE Standard

Figure IEEE standards for floating-point representation

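IEEE Specifications

Parameter                      Single precision    Double precision
Memory location size (bits)    32                  64
Sign size (bits)               1                   1
Exponent size (bits)           8                   11
Mantissa size (bits)           23                  52
Bias                           127                 1023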

Storage of IEEE standard floating point numbers:

1. Store the sign in S (0 or 1).
2. Change the number to binary.
3. Normalize.
4. Find the values of E and M.
5. Concatenate S, E, and M.
Example

Show the Excess_127 (single precision) representation of the decimal
number 5.75.
Solution

a. The sign is positive, so S = 0.
b. Decimal to binary transformation: 5.75 = $(101.11)_2$.
c. Normalization: $(101.11)_2 = (1.0111)_2 \times 2^{2}$.
d. E = 2 + 127 = 129 = $(10000001)_2$, M = 0111. We need to add nineteen
zeros at the right of M to make it 23 bits.
e. The representation is shown below:

The number is stored in the computer as

01000000101110000000000000000000
Example

Show the Excess_127 (single precision) representation of the decimal number
–161.875.
Solution

a. The sign is negative, so S = 1.
b. Decimal to binary transformation: 161.875 = $(10100001.111)_2$.
c. Normalization: $(10100001.111)_2 = (1.0100001111)_2 \times 2^{7}$.
d. E = 7 + 127 = 134 = $(10000110)_2$ and M = $(0100001111)_2$, padded with thirteen
zeros at the right to make it 23 bits.
e. Representation:

The number is stored in the computer as

11000011001000011110000000000000
Example

Show the Excess_127 (single precision) representation of the decimal number
–0.0234375.

Solution

a. S = 1 (the number is negative).
b. Decimal to binary transformation: 0.0234375 = $(0.0000011)_2$.
c. Normalization: $(0.0000011)_2 = (1.1)_2 \times 2^{-6}$.
d. E = –6 + 127 = 121 = $(01111001)_2$ and M = $(1)_2$, padded with twenty-two
zeros at the right to make it 23 bits.
e. Representation:

The number is stored in the computer as

10111100110000000000000000000000
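The five storage steps can be sketched with exact rational arithmetic. This is an illustrative sketch only: the function name `store_single` is hypothetical, it handles only numbers in the normalized range, and it truncates any mantissa bits beyond the 23rd.

```python
from fractions import Fraction


def store_single(text: str) -> str:
    """Build the 32-bit Excess_127 pattern for a decimal numeral."""
    x = Fraction(text)
    if x == 0:
        return "0" * 32                       # zero is the agreed special case
    s = "1" if x < 0 else "0"                 # step 1: store the sign
    x = abs(x)                                # step 2: exact value, used in base 2
    shifter = 0
    while x >= 2:                             # step 3: normalize to 1.yyy * 2**shifter
        x /= 2
        shifter += 1
    while x < 1:
        x *= 2
        shifter -= 1
    if not 1 <= shifter + 127 <= 254:
        raise ValueError("outside the normalized single-precision range")
    e = format(shifter + 127, "08b")          # step 4: E = shifter + bias
    fraction = x - 1                          # the leading 1 is implicit
    m = ""
    for _ in range(23):                       # step 4: 23 mantissa bits
        fraction *= 2
        bit = 1 if fraction >= 1 else 0
        m += str(bit)
        fraction -= bit
    return s + e + m                          # step 5: concatenate S, E and M


for numeral in ("5.75", "-161.875", "-0.0234375"):
    print(numeral, store_single(numeral))
```

For values that a 32-bit float represents exactly, the result can be cross-checked against Python's `struct.pack(">f", value)`.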
Retrieving numbers stored in IEEE standard floating point format:
1. Find the values of S, E, and M.
2. If S = 0, set the sign to positive; otherwise set the sign to negative.
3. Find the shifter (E − 127).
4. Denormalize the mantissa.
5. Change the denormalized number from binary to decimal to find the absolute value.
6. Add the sign.
Example

The bit pattern $(11001010000000000111000100001111)_2$ is stored in
Excess_127 format. Show the value in decimal.

Solution

a. The first bit represents S, the next eight bits, E and the remaining 23 bits, M.
b. The sign is negative.
c. The shifter = E − 127 = 148 − 127 = 21.
d. This gives us $(1.00000000111000100001111)_2 \times 2^{21}$.
e. The binary number is $(1000000001110001000011.11)_2$.
f. The absolute value is 2,104,387.75.
g. The number is −2,104,387.75.
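The retrieval steps can be checked with a short sketch. The function name `retrieve_single` is hypothetical, and the sketch assumes a normalized number (E between 1 and 254), so the special patterns for zero and other reserved values are not handled.

```python
def retrieve_single(bits: str) -> float:
    """Decode a 32-bit Excess_127 pattern holding a normalized number."""
    s, e, m = bits[0], bits[1:9], bits[9:]    # step 1: find S, E and M
    shifter = int(e, 2) - 127                 # step 3: shifter = E - 127
    mantissa = 1 + int(m, 2) / 2**23          # restore the implicit leading 1
    magnitude = mantissa * 2.0**shifter       # steps 4 and 5: denormalize, to decimal
    return -magnitude if s == "1" else magnitude   # steps 2 and 6: apply the sign


print(retrieve_single("11001010000000000111000100001111"))   # -2104387.75
```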
Overflow and Underflow

Figure Overflow and underflow in floating-point representation of reals

Storing Zero

A real number with an integral part and the fractional part set to zero, that is,
0.0, cannot be stored using the steps discussed above. To handle this special
case, it is agreed that in this case the sign, exponent and the mantissa are set
to 0s.
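This convention can be observed with Python's standard `struct` module, whose `">f"` format packs a value as a 32-bit IEEE single-precision number. The variable name is illustrative.

```python
import struct

# Pack 0.0 as a 32-bit single-precision value and show its bit pattern.
zero_bits = format(struct.unpack(">I", struct.pack(">f", 0.0))[0], "032b")
print(zero_bits)   # 00000000000000000000000000000000
```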
Truncation errors

The value of the number stored using floating-point representation may not
be exactly as we expect it to be. For example, assume we need to store the number
$(1111111111111111.11111111111)_2$
in memory using Excess_127 representation. After normalization, we have:
$(1.11111111111111111111111111)_2 \times 2^{15}$
The normalized number has 27 1s, 26 of them in the mantissa. This mantissa needs to be
truncated to 23 1s, so the number that is retrieved is
$(1111111111111111.11111111)_2$
The difference between the original number and what is retrieved is called
the truncation error.
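The size of this error can be worked out with exact fractions. The sketch assumes the original number has sixteen 1s before the point and eleven after it, and that extra mantissa bits are simply dropped, as described above. The function name is hypothetical.

```python
from fractions import Fraction


def truncate_to_23_bits(x: Fraction) -> Fraction:
    """Normalize a positive x, keep 23 mantissa bits, and drop the rest."""
    shifter = 0
    while x >= 2:
        x /= 2
        shifter += 1
    while x < 1:
        x *= 2
        shifter -= 1
    scaled = x * 2**23
    kept = Fraction(scaled // 1, 2**23)   # implicit 1 plus 23 mantissa bits
    return kept * Fraction(2) ** shifter


original = Fraction(int("1" * 27, 2), 2**11)   # sixteen 1s, point, eleven 1s
stored = truncate_to_23_bits(original)
error = original - stored
print(error, float(error))   # 7/2048 0.00341796875
```

Under these assumptions the three dropped bits are worth $2^{-9} + 2^{-10} + 2^{-11} = 7/2048$.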
Thank You
