Lecture3

Computer Architecture - Bitwise Standards

Uploaded by

minulo
CSC 252/452: Computer Organization

Fall 2024: Lecture 3

Instructor: Yanan Guo


Department of Computer Science
University of Rochester
Carnegie Mellon

Announcement
• Programming Assignment 1 is out
• Details: https://www.cs.rochester.edu/courses/252/fall2024/labs/assignment1.html
• Due on Sep 16th, 11:59 PM
• You have 3 slip days


Announcement
• Programming Assignment 1 is in C.
• Seek help from the TAs.
• TAs are best positioned to answer your questions about
programming assignments!
• Programming assignments do NOT repeat the lecture
materials. They ask you to synthesize what you have
learned from the lectures and work out something new.
• Pay attention to Blackboard announcements.
• There are changes to the office hour locations/times.
• I have to move my office hour tomorrow to early next
week.


Last Lecture
• Why Binary (bits)?
• Bit-level manipulations
• Integers
• Representation: unsigned and signed
• Conversion, casting
• Expanding, truncating
• Addition, negation, multiplication, shifting
• Summary

Encoding Negative Numbers

• Two’s Complement (3-bit example)

Number line: -4 -3 -2 -1 0 1 2 3

Signed  Unsigned  Binary (b2 b1 b0)
 0       0        000
 1       1        001
 2       2        010
 3       3        011
-4       4        100
-3       5        101
-2       6        110
-1       7        111

• Weights in unsigned: $2^2, 2^1, 2^0$
• Weights in signed: $-2^2, 2^1, 2^0$
• Example: $101_2 = 1 \cdot 2^0 + 0 \cdot 2^1 + (-1 \cdot 2^2) = -3_{10}$
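
The weight rule above can be sketched in C. This is a minimal illustration, not part of the original slides; the function names `decode_unsigned` and `decode_twos_complement` are hypothetical, and the code assumes 1 <= w <= 31.

```c
#include <stdint.h>

/* Interpret the low w bits of `bits` as unsigned:
 * bit i contributes +2^i. Assumes 1 <= w <= 31. */
int64_t decode_unsigned(uint32_t bits, int w) {
    int64_t value = 0;
    for (int i = 0; i < w; i++) {
        if ((bits >> i) & 1u) value += (int64_t)1 << i;
    }
    return value;
}

/* Interpret the low w bits as two's complement:
 * same weights, except the top bit contributes -2^(w-1). */
int64_t decode_twos_complement(uint32_t bits, int w) {
    int64_t value = 0;
    for (int i = 0; i < w - 1; i++) {
        if ((bits >> i) & 1u) value += (int64_t)1 << i;
    }
    if ((bits >> (w - 1)) & 1u) value -= (int64_t)1 << (w - 1);
    return value;
}
```

With w = 3, the pattern 101 (0x5) decodes to 5 as unsigned and to -3 as two's complement, matching the table.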

Two’s Complement Implications

• Only 1 zero
• There is (still) a bit that represents the sign!
• Unsigned arithmetic still works:

    010        2
 +) 101     +) -3
    111        -1

Signed  Binary
 0      000
 1      001
 2      010
 3      011
-4      100
-3      101
-2      110
-1      111

• 3 + 1 becomes -4 (called overflow; more on it later)

Data Types (in C)

• Suppose you want to define a variable that represents a
person’s age. What assumptions can you make about this
variable’s numerical value?
  • Integer
  • Non-negative
  • Between 0 and 255 (8 bits)
• Define a data type that captures all these attributes:
unsigned char in C
  • Internally, an unsigned char variable is represented as an 8-bit,
  non-negative binary number

Data Types (in C)

• What if you want to define a variable that could take
negative values?
  • That’s what signed data types (e.g., int, short, etc.) are for
• How are int values internally represented?
  • Theoretically could be either sign-magnitude or two’s complement
  • Virtually every modern machine uses two’s complement; the C standard
  left the choice to the implementation until C23, which requires it

Data Types (in C)

Sizes in bytes:

C Data Type        32-bit  64-bit
(unsigned) char      1       1
(unsigned) short     2       2
(unsigned) int       4       4
(unsigned) long      4       8

• C Language
  • #include <limits.h>
  • Declares constants, e.g.,
    • ULONG_MAX
    • LONG_MAX
    • LONG_MIN
  • Values platform specific
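
Since the values are platform specific, one way to see them on your own machine is to print them. A minimal sketch, assuming a hosted C99 compiler; the function name `print_integer_limits` is made up for this example.

```c
#include <limits.h>
#include <stdio.h>

/* Print the size and range of the basic integer types on this platform. */
void print_integer_limits(void) {
    printf("char : %zu byte(s), UCHAR_MAX = %u\n",
           sizeof(char), (unsigned)UCHAR_MAX);
    printf("short: %zu bytes, SHRT_MIN = %d, SHRT_MAX = %d\n",
           sizeof(short), SHRT_MIN, SHRT_MAX);
    printf("int  : %zu bytes, INT_MIN = %d, INT_MAX = %d\n",
           sizeof(int), INT_MIN, INT_MAX);
    printf("long : %zu bytes, LONG_MIN = %ld, LONG_MAX = %ld, ULONG_MAX = %lu\n",
           sizeof(long), LONG_MIN, LONG_MAX, ULONG_MAX);
}
```

Calling `print_integer_limits()` from `main` on a typical 64-bit Linux machine should report 8 bytes for long; other platforms may differ.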


Mapping Between Signed & Unsigned


• Mappings between unsigned and two’s complement
numbers: Keep bit representations and reinterpret

Signed Unsigned Binary


0 0 000
1 1 001
2 2 010
3 3 011
-4 4 100
-3 5 101
-2 6 110
-1 7 111


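Mapping Signed ↔ Unsigned

Bits   Signed  Unsigned
0000     0        0
0001     1        1
0010     2        2
0011     3        3
0100     4        4
0101     5        5
0110     6        6
0111     7        7
1000    -8        8
1001    -7        9
1010    -6       10
1011    -5       11
1100    -4       12
1101    -3       13
1110    -2       14
1111    -1       15

• Bit patterns 0000 to 0111: signed value = unsigned value
• Bit patterns 1000 to 1111: the two values differ by 16 (T2U adds 16, U2T subtracts 16)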
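
The T2U/U2T mapping in the table can be written as arithmetic on w-bit values. This is a sketch under stated assumptions: the names `t2u`, `u2t`, and `reinterpret_as_unsigned` are illustrative, inputs are assumed to be in range for w bits, and 1 <= w <= 62.

```c
#include <limits.h>

/* T2U for w-bit values: negative x maps to x + 2^w, others are unchanged. */
long long t2u(long long x, int w) {
    return x < 0 ? x + (1LL << w) : x;
}

/* U2T for w-bit values: u >= 2^(w-1) maps to u - 2^w, others are unchanged. */
long long u2t(long long u, int w) {
    return u >= (1LL << (w - 1)) ? u - (1LL << w) : u;
}

/* The C cast performs the same mapping at the width of int:
 * the result is x modulo 2^N, where N is the width of unsigned int. */
unsigned int reinterpret_as_unsigned(int x) {
    return (unsigned int)x;
}
```

For the 4-bit table, `t2u(-8, 4)` gives 8 and `u2t(10, 4)` gives -6.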

Today: Representing Information in Binary


• Why Binary (bits)?
• Bit-level manipulations
• Integers
• Representation: unsigned and signed
• Conversion, casting
• Expanding, truncating
• Addition, negation, multiplication, shifting
• Summary

The Problem

    short int x = 15213;
    int ix = (int) x;
    short int y = -15213;
    int iy = (int) y;

C Data Type  Size in bytes (64-bit)
char         1
short        2
int          4
long         8

• Converting from smaller to larger integer data type
  • Should we preserve the value?
  • Can we preserve the value?
  • How?

     Decimal   Hex           Binary
x     15213          3B 6D                     00111011 01101101
ix    15213    00 00 3B 6D   00000000 00000000 00111011 01101101
y    -15213          C4 93                     11000100 10010011
iy   -15213    FF FF C4 93   11111111 11111111 11000100 10010011

Sign Extension
• Task:
  • Given w-bit signed integer x
  • Convert it to (w+k)-bit integer with same value
• Rule:
  • Make k copies of sign bit:
  • $X' = \underbrace{x_{w-1}, \ldots, x_{w-1}}_{k \text{ copies of MSB}}, x_{w-1}, x_{w-2}, \ldots, x_0$

[Figure: the w bits of X are copied into the low w bits of X′; the upper k bits of X′ are all copies of X’s MSB.]
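
The rule can be applied by hand to a bit pattern. A minimal sketch, assuming 1 <= w <= 32 and a 32-bit target; `sign_extend_bits` and `widen_short` are hypothetical names, and the second function simply shows that an ordinary C widening conversion produces the same bits.

```c
#include <stdint.h>

/* Sign-extend the low w bits of x to 32 bits by copying bit w-1
 * into the upper 32-w positions. Assumes 1 <= w <= 32. */
uint32_t sign_extend_bits(uint32_t x, int w) {
    uint32_t mask = (w == 32) ? 0xFFFFFFFFu : (((uint32_t)1 << w) - 1);
    uint32_t sign_bit = (uint32_t)1 << (w - 1);
    x &= mask;                      /* keep only the w-bit pattern */
    if (x & sign_bit) x |= ~mask;   /* fill the upper bits with the MSB */
    return x;
}

/* The compiler does the same thing when a short is widened to an int. */
uint32_t widen_short(int16_t y) {
    int32_t iy = y;                 /* value-preserving conversion */
    return (uint32_t)iy;            /* expose the resulting bit pattern */
}
```

For y = -15213 (bits C4 93), both functions yield FF FF C4 93.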

Another Problem
unsigned short x = 47981;
unsigned int ux = x;

Decimal Hex Binary


x 47981 BB 6D 10111011 01101101
ux 47981 00 00 BB 6D 00000000 00000000 10111011 01101101


Unsigned (Zero) Extension


• Task:
  • Given w-bit unsigned integer x
  • Convert it to (w+k)-bit integer with same value
• Rule:
  • Simply pad zeros:
  • $X' = \underbrace{0, \ldots, 0}_{k \text{ copies of } 0}, x_{w-1}, x_{w-2}, \ldots, x_0$

[Figure: the w bits of X are copied into the low w bits of X′; the upper k bits of X′ are all 0.]
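
Zero extension is even simpler to sketch. The names below are illustrative and the code assumes 1 <= w <= 32; `widen_unsigned_short` shows the conversion the compiler performs for unsigned types.

```c
#include <stdint.h>

/* Zero-extend the low w bits of x to 32 bits: the upper bits become 0. */
uint32_t zero_extend_bits(uint32_t x, int w) {
    uint32_t mask = (w == 32) ? 0xFFFFFFFFu : (((uint32_t)1 << w) - 1);
    return x & mask;
}

/* Widening an unsigned short to an unsigned int pads with zeros. */
uint32_t widen_unsigned_short(uint16_t x) {
    return x;
}
```

For x = 47981 (bits BB 6D), the widened pattern is 00 00 BB 6D, even though the MSB of the 16-bit pattern is 1.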
Yet Another Problem

    int x = 53191;
    short sx = (short) x;

     Decimal   Hex           Binary
x     53191    00 00 CF C7   00000000 00000000 11001111 11000111
sx   -12345          CF C7                     11001111 11000111

• Truncating (e.g., int to short OR unsigned int to unsigned short)
  • In typical C implementations, the leading bits are discarded and the remaining bits are reinterpreted
  • So can’t always preserve the numerical value
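
Truncation followed by reinterpretation can be sketched with unsigned types, where the conversion is fully defined by the language. The helper names `truncate_to_16` and `as_signed16` are assumptions for this example.

```c
#include <stdint.h>

/* Keep only the low 16 bits (conversion to an unsigned type is modulo 2^16). */
uint16_t truncate_to_16(uint32_t x) {
    return (uint16_t)x;
}

/* Reinterpret a 16-bit pattern as two's complement using arithmetic only. */
int32_t as_signed16(uint16_t bits) {
    return bits >= 0x8000 ? (int32_t)bits - 0x10000 : (int32_t)bits;
}
```

Starting from 53191 (00 00 CF C7), truncation leaves CF C7, which reads as -12345 in two's complement.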


Today: Representing Information in Binary


• Why Binary (bits)?
• Bit-level manipulations
• Integers
• Representation: unsigned and signed
• Conversion, casting
• Expanding, truncating
• Addition, negation, multiplication, shifting
• Summary
• Representations in memory, pointers, strings

Unsigned Addition
• Similar to decimal addition
• Suppose we have a new data type that is 3 bits wide (cf. short, which has 16 bits)
• Might overflow: result can’t be represented within the size of the data type

Unsigned  Binary
0         000
1         001
2         010
3         011
4         100
5         101
6         110
7         111

Normal case:
    010        2
 +) 101     +) 5
    111        7

Overflow case:
    110        6
 +) 101     +) 5
   1011       11    True sum
    011        3    Sum with same number of bits

Unsigned Addition in C

Operands: w bits                u, v
True sum: w+1 bits              u + v
Discard carry: w bits           $\mathrm{UAdd}_w(u, v) = (u + v) \bmod 2^w$
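
A small sketch of w-bit unsigned addition, with an overflow check. The names are hypothetical; the code assumes 1 <= w <= 31 and that u and v already fit in w bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* w-bit unsigned add: keep the low w bits of the true sum. */
uint32_t uadd_w(uint32_t u, uint32_t v, int w) {
    uint32_t mask = ((uint32_t)1 << w) - 1;
    return (u + v) & mask;
}

/* The sum overflowed exactly when the truncated result is smaller than u. */
bool uadd_overflows(uint32_t u, uint32_t v, int w) {
    return uadd_w(u, v, w) < u;
}

/* The same behavior with a real C type: uint8_t arithmetic wraps modulo 256. */
uint8_t uadd8(uint8_t u, uint8_t v) {
    return (uint8_t)(u + v);
}
```

With w = 3, `uadd_w(6, 5, 3)` returns 3, the truncated form of the true sum 11.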

Two’s Complement Addition

• Has identical bit-level behavior as unsigned addition (a big advantage
over sign-magnitude)
• Overflow can also occur

Signed  Binary
 0      000
 1      001
 2      010
 3      011   (Max)
-4      100   (Min)
-3      101
-2      110
-1      111

Normal case:
    010         2
 +) 101     +) -3
    111        -1

Negative overflow:
    110        -2
 +) 101     +) -3
   1011        -5    True sum
    011         3    After truncating to 3 bits

Positive overflow:
    011         3
 +) 001      +) 1
   0100         4    True sum
    100        -4    After truncating to 3 bits

Two’s Complement Addition in C

Operands: w bits                u, v
True sum: w+1 bits              u + v
Discard carry: w bits           $\mathrm{TAdd}_w(u, v)$

Is This Signed Addition an Overflow?

    111        -1
 +) 110     +) -2
   1101        -3
    101        -3    After truncating to 3 bits

Signed  Binary
 0      000
 1      001
 2      010
 3      011
-4      100
-3      101
-2      110
-1      111

• This is not an overflow by definition
• Because the actual result can be represented using
the bit width of the datatype (3 bits here)
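
The definition suggests checking the true sum against the representable range, not the carry bit. A sketch with hypothetical names: `tadd_w` simulates w-bit two's complement addition for small w (2 <= w <= 16, inputs in range), and `tadd_ok` tests full-width int addition before performing it, since signed overflow is undefined behavior in C.

```c
#include <limits.h>
#include <stdbool.h>

/* Simulate w-bit two's complement addition on in-range inputs. */
int tadd_w(int x, int y, int w) {
    int range = 1 << w;          /* 2^w */
    int half = 1 << (w - 1);     /* 2^(w-1) */
    int sum = x + y;             /* true sum; cannot overflow int for small w */
    if (sum >= half) sum -= range;        /* positive overflow wraps down */
    else if (sum < -half) sum += range;   /* negative overflow wraps up */
    return sum;
}

/* Returns true when x + y is representable as an int. */
bool tadd_ok(int x, int y) {
    if (y > 0 && x > INT_MAX - y) return false;  /* would overflow upward */
    if (y < 0 && x < INT_MIN - y) return false;  /* would overflow downward */
    return true;
}
```

For w = 3, `tadd_w(-1, -2, 3)` is -3 (no overflow), while `tadd_w(-2, -3, 3)` is 3 and `tadd_w(3, 1, 3)` is -4.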

Multiplication
• Goal: Computing product of w-bit numbers x, y
• Exact results can be bigger than w bits
  • Up to 2w bits (both signed and unsigned)

Original number (w bits):
  OMax = $2^{w-1} - 1$
  OMin = $-2^{w-1}$

Product (2w bits):
  PMax = $\mathrm{OMin}^2 = 2^{2w-2}$
  PMin = $\mathrm{OMin} \cdot \mathrm{OMax} = -2^{2w-2} + 2^{w-1}$

Unsigned Multiplication in C

Operands: w bits                u, v
True product: 2w bits           u · v
Discard high w bits: w bits     $\mathrm{UMult}_w(u, v)$

• Standard multiplication function
  • Ignores high-order w bits
• Effectively implements the following:
  $\mathrm{UMult}_w(u, v) = u \cdot v \bmod 2^w$

Example (w = 8):
              1110 1001        E9       233
            * 1101 0101      * D5     * 213
    1100 0001 1101 1101      C1DD     49629    True product
              1101 1101        DD       221    Low 8 bits
Signed Multiplication in C

Operands: w bits                u, v
True product: 2w bits           u · v
Discard high w bits: w bits     low w bits of u · v

• Standard multiplication function
  • Ignores high-order w bits
  • Some of which are different for signed vs. unsigned multiplication
  • Lower bits are the same

Example (w = 8):
              1110 1001        E9       -23
            * 1101 0101      * D5     * -43
    0000 0011 1101 1101      03DD       989    True product
              1101 1101        DD       -35    Low 8 bits
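
The 8-bit example (bit patterns E9 and D5) can be checked in C. A sketch with made-up function names; it relies only on conversions to unsigned types, which are defined as reduction modulo 2^8.

```c
#include <stdint.h>

/* Full 16-bit product of two 8-bit unsigned operands. */
unsigned full_unsigned_product(uint8_t u, uint8_t v) {
    return (unsigned)u * (unsigned)v;
}

/* 8-bit unsigned multiply: keep the low 8 bits of the true product. */
uint8_t umult8(uint8_t u, uint8_t v) {
    return (uint8_t)full_unsigned_product(u, v);
}

/* Low 8 bits of a signed product, for comparison with the unsigned case. */
uint8_t low8_of_signed_product(int x, int y) {
    return (uint8_t)(x * y);
}
```

The full products differ (C1DD unsigned, 03DD signed), but both leave DD in the low 8 bits.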
Power-of-2 Multiply with Shift

• Operation
  • u << k gives $u \cdot 2^k$
  • Both signed and unsigned

Operands: w bits                u and $2^k$ (binary 0…010…0: a single 1 followed by k zeros)
True product: w+k bits          $u \cdot 2^k$ (the bits of u followed by k zeros)
Discard k bits: w bits          the low w bits of $u \cdot 2^k$

• Examples
  • u << 3 == u * 8
  • (u << 5) - (u << 3) == u * 24
  • Most machines shift and add faster than multiply
  • Compiler generates this code automatically
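
The two examples as C functions. This sketch uses unsigned operands on purpose, because left-shifting a negative signed value is undefined behavior in C; the function names are illustrative.

```c
/* u * 8 computed with a single shift. */
unsigned times8(unsigned u) {
    return u << 3;
}

/* u * 24 computed as u * 32 - u * 8. */
unsigned times24(unsigned u) {
    return (u << 5) - (u << 3);
}
```

In practice you would write `u * 24` and let the compiler pick the instruction sequence.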

Arithmetic: Basic Rules

• Addition:
  • Unsigned/signed: normal addition followed by truncation;
  same operation at the bit level
• Multiplication:
  • Unsigned/signed: normal multiplication followed by truncation;
  same operation at the bit level
• Shift: power-of-2 multiply

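Why Should I Use Unsigned?

• Don’t use without understanding implications
• Easy to make mistakes: i is unsigned, so i >= 0 is always true and the loop never terminates

    unsigned int i;
    for (i = cnt-2; i >= 0; i--)
        a[i] += a[i+1];

• Can be very subtle: sizeof yields an unsigned value (size_t), so i-DELTA is computed as unsigned and the test i-DELTA >= 0 is always true

    #define DELTA sizeof(int)
    int i;
    for (i = CNT; i-DELTA >= 0; i-= DELTA)
        . . .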
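
One possible way to repair both loops is sketched below. The function names and the suffix-sum framing are assumptions made for this example, not the course's reference solution.

```c
#include <stddef.h>

/* a[i] += a[i+1] for i = cnt-2 down to 0, using an unsigned index safely:
 * the test `i-- > 0` checks before decrementing, so i never wraps
 * inside the loop body. */
void suffix_sums(int *a, size_t cnt) {
    if (cnt < 2) return;
    for (size_t i = cnt - 1; i-- > 0; ) {
        a[i] += a[i + 1];
    }
}

/* Counts iterations of the second loop once the comparison is kept signed
 * by casting sizeof(int) to int. */
int count_steps(int cnt) {
    const int delta = (int)sizeof(int);
    int steps = 0;
    for (int i = cnt; i - delta >= 0; i -= delta) {
        steps++;
    }
    return steps;
}
```

For the array {1, 2, 3, 4}, `suffix_sums` produces {10, 9, 7, 4} and terminates.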
Why Should I Use Unsigned? – Bit Set

• Use bits to represent my availability of the week

b6   b5   b4   b3   b2   b1   b0
Sun  Mon  Tue  Wed  Thu  Fri  Sat
1    0    1    1    0    0    1

• Use 1 bit per day, 7 bits in total.
• If bit x is set to 1 then I’m available on day mapped to bit x.
• In C: unsigned int aval;

$\text{aval} = 1 \cdot 2^0 + 0 \cdot 2^1 + 0 \cdot 2^2 + 1 \cdot 2^3 + 1 \cdot 2^4 + 0 \cdot 2^5 + 1 \cdot 2^6 = 89_{10}$
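
The bit set can be manipulated with shifts and masks. A minimal sketch; the enum constants and helper names are hypothetical and follow the bit assignment in the table (b0 = Sat up to b6 = Sun).

```c
/* Bit positions for each day, matching the table: b0 = Sat ... b6 = Sun. */
enum { SAT = 0, FRI, THU, WED, TUE, MON, SUN };

/* Mark a day as available. */
unsigned set_day(unsigned aval, int day) {
    return aval | (1u << day);
}

/* Mark a day as unavailable. */
unsigned clear_day(unsigned aval, int day) {
    return aval & ~(1u << day);
}

/* Returns 1 if the day's bit is set, 0 otherwise. */
int is_available(unsigned aval, int day) {
    return (int)((aval >> day) & 1u);
}
```

Setting Sun, Tue, Wed, and Sat on an empty set yields 89, the value computed above.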



Today: Floating Point


• Background: Fractional binary numbers and fixed-point
• Floating point representation
• IEEE 754 standard
• Rounding, addition, multiplication
• Floating point in C
• Summary

Can We Represent Fractions in Binary?

• What does $10.01_2$ mean?
• Cf. decimal:

$12.45 = 1 \cdot 10^1 + 2 \cdot 10^0 + 4 \cdot 10^{-1} + 5 \cdot 10^{-2}$

$10.01_2 = 1 \cdot 2^1 + 0 \cdot 2^0 + 0 \cdot 2^{-1} + 1 \cdot 2^{-2} = 2.25_{10}$

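Fractional Binary Numbers

$b_i \, b_{i-1} \cdots b_2 \, b_1 \, b_0 \, . \, b_{-1} \, b_{-2} \, b_{-3} \cdots b_{-j}$

• Weights left of the binary point: $2^i, 2^{i-1}, \ldots, 4, 2, 1$
• Weights right of the binary point: $1/2, 1/4, 1/8, \ldots, 2^{-j}$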

Can We Represent Fractions in Binary?

• What does $10.01_2$ mean?
• Cf. decimal: $12.45 = 1 \cdot 10^1 + 2 \cdot 10^0 + 4 \cdot 10^{-1} + 5 \cdot 10^{-2}$
• $10.01_2 = 1 \cdot 2^1 + 0 \cdot 2^0 + 0 \cdot 2^{-1} + 1 \cdot 2^{-2} = 2.25_{10}$

Decimal  Binary
0        00.00
0.25     00.01
0.5      00.10
0.75     00.11
1        01.00
1.25     01.01
1.5      01.10
1.75     01.11
2        10.00
2.25     10.01
2.5      10.10
2.75     10.11
3        11.00
3.25     11.01
3.5      11.10
3.75     11.11

[Figure: number line marked 0, 1, 2, 3]

Integer Arithmetic Still Works!

    01.10        1.50
  + 01.01      + 1.25
    10.11        2.75
Fixed-Point Representation

• Binary point stays fixed
• Fixed interval between two representable numbers as long as the binary point stays fixed
  • The interval in this example is $0.25_{10}$
• Fixed-point representation of numbers
  • Integer is one special case of fixed-point

Bits   Value as b3b2.b1b0   Value as b3b2b1b0. (integer)
0000   0                    0
0001   0.25                 1
0010   0.5                  2
0011   0.75                 3
0100   1                    4
0101   1.25                 5
0110   1.5                  6
0111   1.75                 7
1000   2                    8
1001   2.25                 9
1010   2.5                  10
1011   2.75                 11
1100   3                    12
1101   3.25                 13
1110   3.5                  14
1111   3.75                 15

[Figure: number line marked 0, 1, 2, 3, 4, 5, 6, 7, …, 15]
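
A sketch of the 4-bit format with two fractional bits, stored in the low nibble of a byte. The type name `fix2_t` and the helpers are hypothetical; the point is that addition is plain integer addition and only the interpretation divides by 4.

```c
#include <stdint.h>

/* Hypothetical 4-bit fixed-point format b3b2.b1b0, kept in the low nibble. */
typedef uint8_t fix2_t;

/* Interpret the bits: the stored integer counts quarters. */
double fix_to_double(fix2_t f) {
    return (f & 0xF) / 4.0;
}

/* Add two fixed-point values with ordinary integer addition,
 * truncated to 4 bits. */
fix2_t fix_add(fix2_t a, fix2_t b) {
    return (fix2_t)((a + b) & 0xF);
}
```

Adding 01.10 (0x6) and 01.01 (0x5) gives 10.11 (0xB), which reads as 2.75.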
Limitations of Fixed-Point (#1)

• Can exactly represent numbers only of the form $x/2^k$
• Other rational numbers have repeating bit representations

Value   Binary Representation
1/3     0.0101010101[01]…
1/5     0.001100110011[0011]…
1/10    0.0001100110011[0011]…

[Figure: number line of the values representable as b3b2.b1b0: 0, 1/4, 1/2, 3/4, 1, 5/4, 3/2, 7/4, 2, …, 15/4]
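
The repeating patterns can be reproduced with the usual doubling method: double the remainder and emit a 1 whenever it reaches the denominator. A sketch with a hypothetical function name, assuming 0 <= p < q and q small enough that 2*p does not overflow.

```c
/* Write the first n bits after the binary point of p/q into out,
 * which must have room for n + 1 characters. Assumes 0 <= p < q. */
void fraction_bits(unsigned p, unsigned q, int n, char *out) {
    for (int i = 0; i < n; i++) {
        p *= 2;                 /* shift the fraction left by one bit */
        if (p >= q) {
            out[i] = '1';       /* the bit just shifted out is 1 */
            p -= q;
        } else {
            out[i] = '0';
        }
    }
    out[n] = '\0';
}
```

For 1/4 the remainder reaches 0 and the bits terminate; for 1/3, 1/5, and 1/10 the remainder cycles, which is why the bits repeat.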

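Limitations of Fixed-Point (#2)

• Can’t represent very small and very large numbers at
the same time
  • To represent very large numbers, the (fixed) interval needs to be
  large, making it hard to represent small numbers
  • To represent very small numbers, the (fixed) interval needs to
  be small, making it hard to represent large numbers

[Figure: number lines from 0 toward +∞. With a large interval, a large number is reachable but the small numbers near 0 are unrepresentable. With a small interval, a small number is representable but the large numbers toward +∞ are unrepresentable.]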
Primer: (Normalized) Scientific Notation

• In decimal: $M \times 10^E$
  • M is the significand, 10 is the base, E is the exponent
  • E is an integer
  • Normalized form: $1 \le |M| < 10$

Decimal Value        Scientific Notation
2                    $2 \times 10^0$
-4,321.768           $-4.321768 \times 10^3$
0.000 000 007 51     $7.51 \times 10^{-9}$

Primer: (Normalized) Scientific Notation

• In binary: $(-1)^s \, M \times 2^E$
  • s is the sign, M is the significand, 2 is the base, E is the exponent
• Normalized form:
  • $1 \le M < 2$
  • $M = 1.b_0 b_1 b_2 b_3 \ldots$, where $b_0 b_1 b_2 b_3 \ldots$ is the fraction

Binary Value      Scientific Notation
1110110110110     $(-1)^0 \times 1.110110110110 \times 2^{12}$
-101.11           $(-1)^1 \times 1.0111 \times 2^2$
0.00101           $(-1)^0 \times 1.01 \times 2^{-3}$

Primer: (Normalized) Scientific Notation

• If I tell you that there is a number where:
  • Fraction = 0101
  • s = 1
  • E = 10
• You could reconstruct the number as (-1)^1 × 1.0101 × 2^10
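
The reconstruction above can be sketched in C. This is only an illustration: the function name reconstruct and the choice to pass the fraction as a string of '0'/'1' characters are assumptions made for this example, not something from the lecture. It also reads E = 10 as decimal ten, which the slide does not state explicitly.

```c
#include <assert.h>

/* Hypothetical helper: rebuild (-1)^s * 1.<fraction> * 2^E as a double.
 * `fraction` holds the bits after the binary point as '0'/'1' characters. */
double reconstruct(int s, const char *fraction, int E)
{
    assert(s == 0 || s == 1);

    double M = 1.0;       /* implicit leading 1 of the normalized significand */
    double weight = 0.5;  /* value of the first bit after the binary point */
    for (const char *p = fraction; *p != '\0'; p++) {
        if (*p == '1')
            M += weight;
        weight /= 2.0;
    }

    /* Scale by 2^E: multiply for positive E, divide for negative E. */
    double v = M;
    for (int i = 0; i < E; i++)
        v *= 2.0;
    for (int i = 0; i > E; i--)
        v /= 2.0;

    return s ? -v : v;
}
```

Under those assumptions, reconstruct(1, "0101", 10) gives -1344.0, since 1.0101 in binary is 1.3125 and 1.3125 × 2^10 = 1344.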


Primer: Floating Point Representation

• In binary: (-1)^s × M × 2^E
• s is the sign, M is the significand, 2 is the base, E is the exponent
• Normalized form:
  • 1 <= M < 2
  • M = 1.b0b1b2b3…, where the bits b0b1b2b3… are the fraction
• Encoding: three fields, laid out as s | exp | frac
  • MSB is the sign bit s
  • exp field encodes the exponent E (but not exactly the same, more later)
  • frac field encodes the fraction (but not exactly the same, more later)
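
To see the s | exp | frac layout on a real machine, the raw fields of a C float can be pulled apart with shifts and masks. This is a sketch under stated assumptions: it presumes float is the 32-bit IEEE 754 single-precision format (1 sign bit, 8 exp bits, 23 frac bits), which holds on most current platforms but is not guaranteed by the C language. The names float_fields and split_float are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw fields of a float, assuming the IEEE 754 single-precision layout. */
struct float_fields {
    uint32_t s;    /* 1 bit: sign */
    uint32_t exp;  /* 8 bits: raw exp field, not yet converted to E */
    uint32_t frac; /* 23 bits: raw frac field */
};

struct float_fields split_float(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);  /* the sketch assumes a 32-bit float */
    memcpy(&bits, &f, sizeof bits);   /* copy the bit pattern, not the value */

    struct float_fields out;
    out.s    = bits >> 31;            /* most significant bit */
    out.exp  = (bits >> 23) & 0xFFu;  /* next 8 bits */
    out.frac = bits & 0x7FFFFFu;      /* low 23 bits */
    return out;
}
```

The function only separates the fields. How exp maps to E and how frac maps to M is the "more later" part of the slide.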


6-bit Floating Point Example

v = (-1)^s × M × 2^E

Field layout: s (1 bit) | exp (3 bits) | frac (2 bits)

• exp has 3 bits, interpreted as an unsigned value
• If exp were E, we could represent exponents from 0 to 7
• How about negative exponents?
• Subtract a bias term: E = exp - bias (i.e., exp = E + bias)
• bias is always 2^(k-1) - 1, where k is the number of exponent bits
• Example when we use 3 bits for exp (i.e., k = 3):
  • bias = 3
  • If E = -2, exp is 1 (001₂)
  • Reserve 000 and 111 for other purposes (more on this later)
  • We can now represent exponents from -2 (exp 001) to 3 (exp 110)

E     exp
-3    000
-2    001
-1    010
 0    011
 1    100
 2    101
 3    110
 4    111
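
The bias arithmetic above is small enough to check in code. A minimal sketch, with hypothetical helper names (bias_for, encode_exp, decode_exp) and the convention, chosen only for this example, that encode_exp returns -1 when the result would land on a reserved pattern (all zeros or all ones) or outside the field:

```c
#include <assert.h>

/* bias = 2^(k-1) - 1 for a k-bit exp field. */
int bias_for(int k)
{
    assert(k >= 2 && k <= 30);  /* keep the shift within int range */
    return (1 << (k - 1)) - 1;
}

/* exp = E + bias; returns -1 if exp would be reserved or out of range. */
int encode_exp(int E, int k)
{
    int exp = E + bias_for(k);
    int all_ones = (1 << k) - 1;
    if (exp <= 0 || exp >= all_ones)
        return -1;
    return exp;
}

/* E = exp - bias. */
int decode_exp(int exp, int k)
{
    return exp - bias_for(k);
}
```

With k = 3, bias_for(3) is 3, encode_exp(-2, 3) is 1 (001₂) and encode_exp(3, 3) is 6 (110₂), matching the table. E = -3 and E = 4 are rejected because they would need the reserved patterns 000 and 111.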

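6-bit Floating Point Example

v = (-1)^s × M × 2^E

Field layout: s (1 bit) | exp (3 bits) | frac (2 bits)

• frac has 2 bits; append them after "1." to form M
• frac = 10 implies M = 1.10
• Putting it together, an example:

-10.1₂ = (-1)^1 × 1.01 × 2^1

• s = 1, because the number is negative
• E = 1, so exp = E + bias = 1 + 3 = 4 = 100₂
• M = 1.01, so frac = 01
• Encoded 6-bit pattern: s = 1 | exp = 100 | frac = 01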
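
The worked example can be replayed in C. This is a sketch of the 6-bit toy format only (1 sign bit, 3 exp bits, 2 frac bits, bias 3), restricted to normalized values, so it rejects the reserved exp patterns 000 and 111. The names encode6 and decode6 are invented for this illustration.

```c
#include <assert.h>

enum { EXP_BITS = 3, FRAC_BITS = 2, BIAS = 3 };

/* Pack sign s, exponent E and the 2 fraction bits into the pattern s|exp|frac. */
unsigned encode6(unsigned s, int E, unsigned frac)
{
    assert(s <= 1 && frac < (1u << FRAC_BITS));
    assert(E >= -2 && E <= 3);            /* exp 000 and 111 are reserved */
    unsigned exp = (unsigned)(E + BIAS);  /* exp = E + bias */
    return (s << (EXP_BITS + FRAC_BITS)) | (exp << FRAC_BITS) | frac;
}

/* Decode a normalized 6-bit pattern back to its value. */
double decode6(unsigned bits)
{
    unsigned s    = (bits >> 5) & 1u;
    unsigned exp  = (bits >> 2) & 7u;
    unsigned frac = bits & 3u;
    assert(exp != 0 && exp != 7);         /* reserved patterns not handled here */

    int E = (int)exp - BIAS;              /* E = exp - bias */
    double v = 1.0 + frac / 4.0;          /* M = 1.frac with 2 fraction bits */
    for (int i = 0; i < E; i++)
        v *= 2.0;
    for (int i = 0; i > E; i--)
        v /= 2.0;
    return s ? -v : v;
}
```

For the slide's example, encode6(1, 1, 1) produces 110001₂ (0x31), and decode6(0x31) returns -2.5, which is -10.1₂.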
