4.4_1 New Floating Point.pptx

DIGITAL DESIGN & COMPUTER ORGANISATION

Floating Point

Sudarshan T S B., Ph.D.

Department of Computer Science & Engineering
FLOATING POINT
Course Outline

🔵 Digital Design
► Combinational logic design
► Sequential logic design
★ Floating Point
🔵 Computer Organisation
► Architecture (microprocessor instruction set)
► Microarchitecture (microprocessor operation)

Concepts covered
🔵 Floating Point Representation
FLOATING POINT
Not Just Integers

🔵 Real numbers can be represented using:

► Fixed point
► Floating point

🔵 In fixed point notation the position of the radix point is fixed: digits to its right form the fraction portion and digits to its left form the integer portion.
► Limited by the number of digits used
► Not suitable for representing very small or very large numbers

🔵 Programming languages support numbers with a fractional part, called floating point numbers

► Example: 3.14159265… (𝜋); 2.71828… (𝑒)
► Data types used: float, double
Number Systems
Fixed-Point Number Systems
❖ Signed fixed-point numbers can use either two's complement or sign/magnitude notation.
❖ The figure shows the fixed-point representation of −2.375 in both notations, with four integer and four fraction bits.
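The figure itself is not reproduced in this text, so here is a minimal Python sketch of the two encodings of −2.375 with four integer and four fraction bits. The helper name `to_bits` is hypothetical and only formats the result.

```python
FRAC_BITS = 4  # four fraction bits, so values are scaled by 2**4 = 16


def to_bits(code: int) -> str:
    """Format an 8-bit code as iiii.ffff (hypothetical helper)."""
    s = format(code, "08b")
    return s[:4] + "." + s[4:]


x = -2.375
magnitude = round(abs(x) * 2**FRAC_BITS)   # 2.375 * 16 = 38

sign_magnitude = (1 << 7) | magnitude       # top bit is the sign, rest is |x|
twos_complement = (-magnitude) & 0xFF       # wrap the negative value to 8 bits

print(to_bits(magnitude))         # 0010.0110  (+2.375)
print(to_bits(sign_magnitude))    # 1010.0110
print(to_bits(twos_complement))   # 1101.1010
```

The sign/magnitude form only flips the top bit, while the two's complement form is the bit pattern that adds to +2.375 to give zero modulo 2^8.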
FLOATING POINT
Fixed Point Example

🔵 Represent 6.75 using 4 integer bits and 4 fraction bits:

► 6 => 0110 ($2^{2} + 2^{1}$)
► 0.75 => 0.1100 ($2^{-1} + 2^{-2}$)
► 6.75 => 0110.1100

► Here the binary point is implied, and the number of bits used is decided beforehand

🔵 Represent -7.5 using 4 integer and 4 fraction bits

► +7.5 => 0111.1000
► 2's complement -7.5 => 1000.1000
FLOATING POINT
Fixed Point Example

🔵 Perform the following operation: 7.5 – 0.625 => 7.5 + (-0.625)

► 7.5 => 0111.1000
► -0.625 => 1111.0110 (2's complement)
► 0111.1000 + 1111.0110 = 0110.1110 (6.875), after discarding the carry out of the top bit

🔵 The range and accuracy are very limited.

► Ex: 8.9375 + 8.3125 = 17.25
► 8.9375 => 1000.1111 (unsigned)
► 8.3125 => 1000.0101 (unsigned)
► Add: 0001.0100 (1.25); the carry out of the 4 integer bits is lost, so the wrong result is a consequence of the limited range

🔵 How to increase the range and improve the accuracy?

► Go for Floating Point Representation
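The two additions above can be reproduced with a small sketch of 8-bit fixed-point arithmetic (4 integer bits, 4 fraction bits). The function names are hypothetical; the sketch assumes values are stored as integers scaled by 2^4 and that sums wrap modulo 2^8.

```python
INT_BITS, FRAC_BITS = 4, 4
WIDTH = INT_BITS + FRAC_BITS
MASK = (1 << WIDTH) - 1


def encode(x: float) -> int:
    """Two's-complement 4.4 code for x; wraps silently if x is out of range."""
    return round(x * 2**FRAC_BITS) & MASK


def decode_signed(code: int) -> float:
    """Interpret an 8-bit code as a two's-complement 4.4 value."""
    if code & (1 << (WIDTH - 1)):
        code -= 1 << WIDTH
    return code / 2**FRAC_BITS


def decode_unsigned(code: int) -> float:
    """Interpret an 8-bit code as an unsigned 4.4 value."""
    return code / 2**FRAC_BITS


def add(a: int, b: int) -> int:
    """Add two codes and drop any carry out of the top bit."""
    return (a + b) & MASK


diff = add(encode(7.5), encode(-0.625))
print(format(diff, "08b"), decode_signed(diff))       # 01101110 6.875

total = add(encode(8.9375), encode(8.3125))
print(format(total, "08b"), decode_unsigned(total))   # 00010100 1.25
```

The second result is 1.25 instead of 17.25 because 17.25 needs five integer bits and the carry is dropped.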
FLOATING POINT
Not Just Integers

🔵 Floating point notation is used to represent real numbers over a wide range, from very small to very large

🔵 We use scientific notation to represent these numbers

► $\pm d.f_1 f_2 f_3 \ldots \times 10^{\pm e_1 e_2}$
► $\pm M \times B^{\pm E}$

🔵 This representation covers very small numbers like $1.0 \times 10^{-23}$ and very large numbers like $9.546 \times 10^{12}$

🔵 Floating point numbers should be normalized

► Use exactly one non-zero digit as the integer part
► In decimal it will be from 1 to 9
► In binary this should be 1
► Ex:
Normalised floating point: $2.234 \times 10^{3}$ or $1.101_2 \times 2^{-4}$
Non-normalized floating point: $0.0234 \times 10^{5}$ or $110.1_2 \times 2^{-6}$
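As a rough illustration of binary normalization, the sketch below rewrites a value as m × 2^e with 1 ≤ |m| < 2. It relies on the standard `math.frexp`, which returns a mantissa in [0.5, 1), so the result is shifted by one position. The function name `normalize` is hypothetical.

```python
import math


def normalize(x: float) -> tuple[float, int]:
    """Return (m, e) with x == m * 2**e and 1 <= |m| < 2 (x must be non-zero)."""
    m, e = math.frexp(x)   # frexp gives 0.5 <= |m| < 1
    return m * 2, e - 1


# 110.1 (binary) x 2^-6 is 6.5 x 2^-6
x = 6.5 * 2**-6
m, e = normalize(x)
print(m, e)   # 1.625 -4, i.e. 1.101 (binary) x 2^-4
```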
FLOATING POINT
IEEE 754-2008 Standard

🔵 The IEEE standard defines the structure of floating point number representation

🔵 Developed in response to diverging representations and arithmetic operations
► Portability issues for scientific code
► Now almost universally adopted
🔵 Defines four representations:
► Single Precision (32 bits)
► Double Precision (64 bits)
► Extended Double Precision, 10 bytes (80 bits)
► Quadruple Precision, 16 bytes (128 bits)
🔵 A real number is represented in the IEEE 754-2008 standard in three parts:
► Sign bit
► Exponent bits
► Mantissa bits or Significand bits
FLOATING POINT
IEEE 754-2008 Standard (Single Precision)

🔵 Sign bit 0 indicates a positive number and 1 indicates a negative number

🔵 Mantissa holds the fraction and determines the accuracy of the number
🔵 Exponent determines the range of numbers that can be represented
🔵 General form: $\pm 1.\text{Mantissa} \times 2^{\text{Exponent}}$
🔵 For Single precision (32-bit) representation:
► Sign is 1 bit
► Biased Exponent is 8 bits
► Mantissa is 23 bits
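A minimal sketch of how the three single-precision fields can be pulled out of a 32-bit pattern with shifts and masks. It assumes the usual layout (sign in bit 31, biased exponent in bits 30 to 23, mantissa in bits 22 to 0); the function name `split_single` is hypothetical.

```python
def split_single(bits: int) -> tuple[int, int, int]:
    """Split a 32-bit pattern into (sign, biased exponent, mantissa)."""
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF   # 8 exponent bits
    mantissa = bits & 0x7FFFFF              # 23 mantissa bits
    return sign, biased_exponent, mantissa


sign, be, mantissa = split_single(0x3FC00000)   # bit pattern of 1.5
print(sign, format(be, "08b"), format(mantissa, "023b"))
# 0 01111111 10000000000000000000000
```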
FLOATING POINT
IEEE 754-2008 Standard (Double Precision)

🔵 Sign bit 0 indicates a positive number and 1 indicates a negative number

🔵 Mantissa holds the fraction and determines the accuracy of the number
🔵 Exponent determines the range of numbers that can be represented
🔵 General form: $\pm 1.\text{Mantissa} \times 2^{\text{Exponent}}$
🔵 For Double precision (64-bit) representation:
► Sign is 1 bit
► Biased Exponent is 11 bits
► Mantissa is 52 bits
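Python's `float` is a double on common platforms, so the standard `struct` module can expose the 64-bit pattern directly. The sketch below splits it into the three fields; `split_double` is a hypothetical name.

```python
import struct


def split_double(x: float) -> tuple[int, int, int]:
    """Return (sign, biased exponent, mantissa) of x as a 64-bit double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    biased_exponent = (bits >> 52) & 0x7FF   # 11 exponent bits
    mantissa = bits & ((1 << 52) - 1)        # 52 mantissa bits
    return sign, biased_exponent, mantissa


sign, be, mantissa = split_double(1.5)
print(sign, be, hex(mantissa))   # 0 1023 0x8000000000000
```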
FLOATING POINT
Biased Exponent

🔵 Biased Exponent, BE = Bias + Exponent

🔵 Actual Exponent, E = Biased Exponent – Bias
🔵 Recall that for single precision the biased exponent is 8 bits (range: 0 to 255)
🔵 BE = 0 and BE = 255 are reserved for special use
🔵 So, BE = 1 to 254 are used for normalized floating point numbers.
🔵 Bias = 127 => ($2^{n-1} - 1$, where $n = 8$ is the number of exponent bits)
🔵 Therefore the range of actual exponents that can be represented is:
► Min = 1 – 127 = -126

► Max = 254 – 127 = 127

► So range is from -126 to +127

🔵 FP Representation:
$N = (-1)^{s} \times (1 + .M) \times 2^{(BE - \text{Bias})}$
Bias = 127 for SP
Bias = 1023 for DP
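The representation formula can be written almost literally in code. This is a sketch for normalized numbers only (it ignores the reserved exponents 0 and all ones); the function name and parameter names are hypothetical.

```python
def decode(sign: int, biased_exponent: int, mantissa: int,
           mantissa_bits: int = 23, bias: int = 127) -> float:
    """N = (-1)^s * (1 + .M) * 2^(BE - Bias), normalized numbers only."""
    fraction = mantissa / 2**mantissa_bits   # .M as a value in [0, 1)
    return (-1) ** sign * (1 + fraction) * 2.0 ** (biased_exponent - bias)


print(decode(0, 127, 0))                 # 1.0
print(decode(1, 128, 0x400000))          # -3.0
print(decode(0, 1023, 0, 52, 1023))      # 1.0 (double precision parameters)
```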
FLOATING POINT
FP Example

What is the value of the following number:

0 100 0011 0 110 0100 0000 0000 0000 0000

In Hexadecimal this is represented as 0x43640000

Solution:
Sign = 0; Positive number
Biased Exponent = $(1000\ 0110)_2 = 134$
Actual Exponent = 134 – 127 = 7
Mantissa = $(1.1100\ 10\ldots000)_2 = 1.78125$ (1. is implicit)

So, the decimal value = $1.1100100_2 \times 2^{7} = 11100100_2 = 228$

FLOATING POINT
FP Example

What is the value of the following number:

1 011 1110 0 010 0000 0000 0000 0000 0000

In Hexadecimal this is represented as 0xBE200000

Solution:
Sign = 1; Negative number
Biased Exponent = $(0111\ 1100)_2 = 124$
Actual Exponent = 124 – 127 = -3
Mantissa = $(1.0100\ 00\ldots000)_2 = 1.25$ (1. is implicit)

So, the decimal value = $-1.25 \times 2^{-3} = -0.15625$
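Both decoding examples can be cross-checked by reinterpreting the 32-bit pattern as a single-precision float with the standard `struct` module. The helper name `single_from_bits` is hypothetical.

```python
import struct


def single_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit integer pattern as an IEEE 754 single."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


print(single_from_bits(0x43640000))   # 228.0
print(single_from_bits(0xBE200000))   # -0.15625
```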

FLOATING POINT
FP Example

Write $-58.25_{10}$ in Single Precision Floating Point (IEEE 754)

1. Convert decimal to binary:

$58.25_{10} = 111010.01_2$
2. Write in normalized scientific notation:
$1.1101001_2 \times 2^{5}$ (1. is implicit)
3. Fill in fields:
Sign bit: 1 (negative)
8 exponent bits: (127 + 5) = 132 = $10000100_2$
23 fraction bits: 110 1001 0000 0000 0000 0000

1 100 0010 0 110 1001 0000 0000 0000 0000

In Hexadecimal this is represented as 0xC2690000
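The three steps above can be sketched as a small encoder. This is a simplified illustration, not a full implementation: it assumes the input is a normal, non-zero value whose fraction fits exactly in 23 bits, and it does no rounding or special-case handling. The name `encode_single` is hypothetical.

```python
import math


def encode_single(x: float) -> int:
    """Pack x into a 32-bit pattern (normal, exactly representable values only)."""
    sign = 1 if x < 0 else 0
    m, e = math.frexp(abs(x))               # abs(x) = m * 2**e, 0.5 <= m < 1
    significand, exponent = m * 2, e - 1    # normalize to 1.f x 2^exponent
    biased = exponent + 127                 # add the single-precision bias
    fraction = int((significand - 1) * 2**23)
    return (sign << 31) | (biased << 23) | fraction


print(hex(encode_single(-58.25)))   # 0xc2690000
```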

FLOATING POINT
FP Example

Write $-58.25_{10}$ in Double Precision Floating Point (IEEE 754)

1. Convert decimal to binary:

$58.25_{10} = 111010.01_2$
2. Write in normalized scientific notation:
$1.1101001_2 \times 2^{5}$ (1. is implicit)
3. Fill in fields:
Sign bit: 1 (negative)
11 exponent bits: (1023 + 5) = 1028 = $100\ 0000\ 0100_2$
52 fraction bits: 1101 0010 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

1 100 0000 0100 1101 0010 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

In Hexadecimal this is represented as 0xC04D200000000000
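The double-precision result can be checked by asking Python for the 64-bit pattern of -58.25 with the standard `struct` module and reading the fields back out.

```python
import struct

# Reinterpret the double -58.25 as a 64-bit unsigned integer.
(bits,) = struct.unpack(">Q", struct.pack(">d", -58.25))
print(f"0x{bits:016X}")              # 0xC04D200000000000

sign = bits >> 63
biased_exponent = (bits >> 52) & 0x7FF
fraction = bits & ((1 << 52) - 1)
print(sign, biased_exponent)         # 1 1028
print(f"{fraction:052b}"[:8])        # 11010010 (leading fraction bits)
```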

FLOATING POINT
Smallest and Largest Normalised FP value

Single Precision FP:

Exponents 00000000 and 11111111 are reserved

Smallest value
Biased Exponent: 00000001
⇒ Actual Exponent = 1 – 127 = –126
Fraction: 000…00 ⇒ significand = 1.0
$\pm 1.0 \times 2^{-126} \approx \pm 1.17549\ldots \times 10^{-38}$

Largest value
Biased Exponent: 11111110
⇒ Actual Exponent = 254 – 127 = +127
Fraction: 111…11 ⇒ significand ≈ 2.0
$\pm (2 - 2^{-23}) \times 2^{+127} \approx \pm 2^{+128} \approx \pm 3.4028\ldots \times 10^{+38}$
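As a quick check of these limits, the sketch below builds the two bit patterns described above (biased exponent 1 with a zero fraction, and biased exponent 254 with an all-ones fraction) and compares them with the closed-form values. The helper name `single_from_bits` is hypothetical.

```python
import struct


def single_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit integer pattern as an IEEE 754 single."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


smallest = single_from_bits(0x00800000)   # BE = 00000001, fraction all zeros
largest = single_from_bits(0x7F7FFFFF)    # BE = 11111110, fraction all ones

print(smallest == 2.0**-126)                    # True
print(largest == (2 - 2.0**-23) * 2.0**127)     # True
print(smallest, largest)
```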
FLOATING POINT
Special Cases

Number   Sign   Exponent   Fraction

0        X      00000000   00000000000000000000000
∞        0      11111111   00000000000000000000000
-∞       1      11111111   00000000000000000000000
NaN      X      11111111   non-zero

*NaN is Not a Number

Ex: 0 ÷ 0, √ of a negative number (a non-zero number ÷ 0 gives ±∞)
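The special bit patterns in the table can be observed directly. The sketch below reinterprets a few single-precision patterns; the NaN pattern 0x7FC00000 is just one of many with a non-zero fraction, and `single_from_bits` is a hypothetical helper.

```python
import math
import struct


def single_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit integer pattern as an IEEE 754 single."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


print(single_from_bits(0x00000000))               # 0.0
print(single_from_bits(0x80000000))               # -0.0 (sign bit is "X" for zero)
print(single_from_bits(0x7F800000))               # inf
print(single_from_bits(0xFF800000))               # -inf
print(math.isnan(single_from_bits(0x7FC00000)))   # True

inf = float("inf")
print(math.isnan(inf - inf))                      # True: an invalid operation gives NaN
```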
FLOATING POINT
Special Cases

Source: Computer Organisation & Design by Patterson & Hennessy, Morgan Kaufmann
FLOATING POINT
Rounding Modes

🔵 Overflow: number too large to be represented

🔵 Underflow: number too small to be represented
🔵 Rounding modes:
► Down
► Up
► Toward zero
► To nearest
🔵 Example: round 1.100101 (1.578125) to only 3 fraction bits
► Down: 1.100
► Up: 1.101
► Toward zero: 1.100
► To nearest: 1.101 (1.625 is closer to 1.578125 than 1.5 is)
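The rounding example can be reproduced with exact rational arithmetic. This sketch is only an illustration of the four modes on a value scaled to a fixed number of fraction bits; the function name and mode strings are hypothetical, and "nearest" uses Python's `round`, which breaks exact ties toward the even neighbour.

```python
import math
from fractions import Fraction


def round_to_bits(x: Fraction, frac_bits: int, mode: str) -> Fraction:
    """Round x to a multiple of 2**-frac_bits using the given mode."""
    scaled = x * 2**frac_bits
    if mode == "down":
        n = math.floor(scaled)
    elif mode == "up":
        n = math.ceil(scaled)
    elif mode == "toward_zero":
        n = math.trunc(scaled)
    elif mode == "nearest":
        n = round(scaled)   # exact ties go to the even neighbour
    else:
        raise ValueError(f"unknown mode: {mode}")
    return Fraction(n, 2**frac_bits)


x = Fraction(0b1100101, 2**6)   # 1.100101 (binary) = 1.578125
for mode in ("down", "up", "toward_zero", "nearest"):
    print(mode, float(round_to_bits(x, 3, mode)))
# down 1.5
# up 1.625
# toward_zero 1.5
# nearest 1.625
```

For negative values "down" and "toward zero" give different results, which is why they are listed as separate modes.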
FLOATING POINT
Think about it

🔵 What are the largest normalized double precision FP numbers?

► Hint: double precision exponent is 11 bits and mantissa is 52 bits

🔵 What relative precision, in terms of decimal fractional digits, do single precision and double precision offer?
► Hint: mantissa bits

🔵 Give an example of a valid denormalized floating point number.
► Hint: Biased Exponent = 0 & Mantissa = Nonzero
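One possible way to check answers to these questions is the sketch below. It assumes Python's `float` is an IEEE 754 double (true on common platforms), estimates decimal digits as mantissa bits × log10(2), and uses the smallest positive single-precision denormal as the example pattern.

```python
import math
import struct
import sys

# Largest normalized double: significand of all ones, actual exponent 2046 - 1023.
largest_double = (2 - 2.0**-52) * 2.0**1023
print(largest_double == sys.float_info.max)   # True

# Rough count of decimal fraction digits carried by the mantissa bits.
print(math.floor(23 * math.log10(2)))         # 6  (single precision)
print(math.floor(52 * math.log10(2)))         # 15 (double precision)

# Denormalized single: biased exponent 0, mantissa 00...01.
denormal = struct.unpack(">f", struct.pack(">I", 0x00000001))[0]
print(denormal == 2.0**-149)                  # True: 2^-23 x 2^-126
```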
THANK YOU

Sudarshan T S B., Ph.D.
Department of Computer Science & Engineering
+91 80 6666 3333 Extn 215
