Patterson6e_MIPS_Ch03_PPT_r2
6th
Edition
The Hardware/Software Interface
Chapter 3
Arithmetic for Computers
§3.1 Introduction
Arithmetic for Computers
Operations on integers
Addition and subtraction
Multiplication and division
Dealing with overflow
Floating-point real numbers
Representation and operations
Length of product is the sum of operand lengths
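[Figure: multiplier hardware: 4-bit ALU; 8-bit register, initially 0]
Surviving rows of the worked example (8-bit register contents):
2nd: 0110 0001, then 0011 0000
4th: 0001 1000, then 0000 1100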
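The surviving rows appear consistent with a shift-and-add multiplier computing 0110 × 0010, although the operands themselves are not stated in the text, so that reading is an assumption. A minimal C sketch of such a multiplier follows; the name `multiply4` and the use of a wider `unsigned` register to hold the carry out of the 4-bit add are illustrative choices, not taken from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of a 4-bit shift-and-add multiplier with a shared 8-bit register.
 * The register starts with 0 in its left half and the multiplier in its
 * right half; the 8-bit result illustrates that the product length is the
 * sum of the operand lengths (4 + 4). */
uint8_t multiply4(uint8_t multiplicand, uint8_t multiplier)
{
    assert(multiplicand < 16 && multiplier < 16);

    /* Wider than 8 bits so the carry out of the 4-bit add is not lost. */
    unsigned reg = multiplier;

    for (int step = 0; step < 4; step++) {
        if (reg & 1u) {
            /* Low multiplier bit is 1: add multiplicand to the left half. */
            reg += (unsigned)multiplicand << 4;
        }
        /* Shift the whole register right by one bit. */
        reg >>= 1;
    }
    return (uint8_t)reg;
}
```

Under these assumptions, `multiply4(6, 2)` passes through 0110 0001 and 0011 0000 in its second iteration and ends at 0000 1100, matching the surviving rows.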
Can be pipelined
Several multiplications performed in parallel
Chapter 3 — Arithmetic for Computers — 12
MIPS Multiplication
Two 32-bit registers for product
HI: most-significant 32 bits
LO: least-significant 32 bits
Instructions
mult rs, rt / multu rs, rt
64-bit product in HI/LO
mfhi rd / mflo rd
Move from HI/LO to rd
Can test HI value to see if product overflows 32 bits
mul rd, rs, rt
Least-significant 32 bits of product → rd
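The HI/LO behaviour listed above can be modelled in C as a rough sketch. The type `hilo_t` and the `model_*` and `*_overflows32` function names are hypothetical, introduced only for illustration; the overflow tests reflect the usual reading of "test HI" (HI must be zero for an unsigned product, or the sign extension of LO for a signed one).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the HI/LO register pair. */
typedef struct {
    uint32_t hi; /* most-significant 32 bits of the product */
    uint32_t lo; /* least-significant 32 bits of the product */
} hilo_t;

/* Models multu rs, rt: unsigned 32 x 32 -> 64-bit product in HI/LO. */
hilo_t model_multu(uint32_t rs, uint32_t rt)
{
    uint64_t product = (uint64_t)rs * (uint64_t)rt;
    hilo_t r = { (uint32_t)(product >> 32), (uint32_t)product };
    return r;
}

/* Models mult rs, rt: signed 32 x 32 -> 64-bit product in HI/LO. */
hilo_t model_mult(int32_t rs, int32_t rt)
{
    uint64_t bits = (uint64_t)((int64_t)rs * (int64_t)rt);
    hilo_t r = { (uint32_t)(bits >> 32), (uint32_t)bits };
    return r;
}

/* Models mul rd, rs, rt: only the low 32 bits of the product reach rd. */
uint32_t model_mul(int32_t rs, int32_t rt)
{
    return model_mult(rs, rt).lo;
}

/* Unsigned product fits in 32 bits only if HI is zero. */
bool multu_overflows32(hilo_t r)
{
    return r.hi != 0u;
}

/* Signed product fits in 32 bits only if HI is the sign extension of LO. */
bool mult_overflows32(hilo_t r)
{
    uint32_t sign_ext = (r.lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    return r.hi != sign_ext;
}
```

For example, `model_multu(0xFFFFFFFFu, 2u)` leaves 1 in HI, so the product overflows 32 bits, whereas `model_mult(-1, 2)` leaves HI as all ones (the sign extension of LO) and does not.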
[Figure: divider hardware: 4-bit ALU; 8-bit register, initially the dividend]
Worked example (8-bit register contents after each step):
Iteration  Shift left   Sub          Result
initial                              0000 0111
1st        0000 1110    1110 111x    0000 1110
2nd        0001 1100    1111 110x    0001 1100
3rd        0011 1000    0001 100x    0001 1001
4th        0011 0010    0001 001x    0001 0011
1st and 2nd iterations: Rem neg, so quotient bit = 0, restore rem
Final register 0001 0011: left half is the remainder, right half is the quotient
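The shift, subtract and restore sequence in the trace above can be sketched in C. The trace appears consistent with dividing 0111 (7) by 0010 (2), but the divisor is not stated in the text, so that is an assumption. The names `divres4_t` and `divide4` are hypothetical, and the register is modelled as a wider `unsigned` so that the bit shifted out of the 8-bit register is not lost for large divisors.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: both halves of the final register. */
typedef struct {
    uint8_t quotient;  /* right half of the register */
    uint8_t remainder; /* left half of the register */
} divres4_t;

/* Sketch of 4-bit restoring division with a shared remainder/quotient
 * register that initially holds the dividend in its right half. */
divres4_t divide4(uint8_t dividend, uint8_t divisor)
{
    assert(dividend < 16 && divisor > 0 && divisor < 16);

    unsigned reg = dividend;

    for (int step = 0; step < 4; step++) {
        /* Shift left: the vacated low bit is the next quotient bit. */
        reg <<= 1;

        /* Sub: subtract the divisor from the left half. */
        int diff = (int)(reg >> 4) - (int)divisor;

        if (diff >= 0) {
            /* Remainder non-negative: keep it and set quotient bit to 1. */
            reg = ((unsigned)diff << 4) | (reg & 0x0Fu) | 1u;
        }
        /* Otherwise remainder is negative: quotient bit stays 0 and the
         * register is left unchanged, which is the "restore" step. */
    }

    divres4_t result = { (uint8_t)(reg & 0x0Fu), (uint8_t)(reg >> 4) };
    return result;
}
```

Under these assumptions, `divide4(7, 2)` ends with the register at 0001 0011, that is, remainder 1 and quotient 3.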
Faster Division
Can’t use parallel hardware as in multiplier
Subtraction is conditional on sign of remainder
Faster dividers (e.g. SRT division)
generate multiple quotient bits per step
Still require multiple steps
$x = (-1)^{S} \times (1 + \text{Fraction}) \times 2^{(\text{Exponent} - \text{Bias})}$
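As an illustration of the formula above, the following C sketch decodes a single-precision value field by field. It assumes that `float` is the IEEE 754 32-bit format (1 sign bit, 8 exponent bits, 23 fraction bits, bias 127) and handles normalized numbers only; the function name `decode_single` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Recomputes x = (-1)^S * (1 + Fraction) * 2^(Exponent - Bias) from the
 * bit fields of a normalized single-precision value (assumed IEEE 754). */
double decode_single(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits); /* reinterpret the 32 bits safely */

    uint32_t s        = bits >> 31;           /* sign bit */
    uint32_t exponent = (bits >> 23) & 0xFFu; /* biased exponent */
    uint32_t fraction = bits & 0x7FFFFFu;     /* 23 fraction bits */

    /* Exponents 0 and 255 are reserved (zero/denormal, infinity/NaN). */
    assert(exponent != 0u && exponent != 255u);

    double significand = 1.0 + (double)fraction / 8388608.0; /* 2^23 */

    /* Scale by 2^(Exponent - 127) using exact multiplications by 2 or 1/2. */
    double scale = 1.0;
    for (int e = (int)exponent - 127; e > 0; e--) scale *= 2.0;
    for (int e = (int)exponent - 127; e < 0; e++) scale *= 0.5;

    return (s ? -1.0 : 1.0) * significand * scale;
}
```

For instance, -0.75 is stored with S = 1, Exponent = 126 and Fraction = 0.5, so the formula gives -1 × 1.5 × 2^(-1).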
Compiled MIPS code:
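f2c: lwc1  $f16, const5($gp)   # $f16 = constant at const5
     lwc1  $f18, const9($gp)   # $f18 = constant at const9
     div.s $f16, $f16, $f18    # $f16 = const5 / const9
     lwc1  $f18, const32($gp)  # $f18 = constant at const32
     sub.s $f18, $f12, $f18    # $f18 = $f12 - const32
     mul.s $f0, $f16, $f18     # $f0 = (const5 / const9) * ($f12 - const32)
     jr    $ra                 # return; result in $f0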
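The C source for this routine does not appear in the text here, so the following is only a plausible reconstruction inferred from the instruction sequence. It assumes the labels `const5`, `const9` and `const32` hold 5.0, 9.0 and 32.0, and that the argument arrives in `$f12` with the result in `$f0`; the parameter name `fahr` is hypothetical.

```c
/* Plausible C counterpart of the f2c routine: Fahrenheit to Celsius.
 * The constants 5.0, 9.0 and 32.0 are assumed from the label names. */
float f2c(float fahr)
{
    /* div.s computes 5.0 / 9.0, sub.s computes fahr - 32.0,
     * and mul.s forms the product returned in $f0. */
    return (5.0f / 9.0f) * (fahr - 32.0f);
}
```

Because 5/9 is not exactly representable in binary, `f2c(212.0f)` is very close to, but not necessarily exactly, 100.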
FP Example: Array Multiplication
X = X + Y × Z
All 32 × 32 matrices, 64-bit double-precision elements
C code:
void mm (double x[][32],
         double y[][32], double z[][32]) {
  int i, j, k;
  for (i = 0; i != 32; i = i + 1)
    for (j = 0; j != 32; j = j + 1)
      for (k = 0; k != 32; k = k + 1)
        x[i][j] = x[i][j]
                  + y[i][k] * z[k][j];
}
Addresses of x, y, z in $a0, $a1, $a2, and
i, j, k in $s0, $s1, $s2
FP Example: Array Multiplication
MIPS code:
li $t1, 32 # $t1 = 32 (row size/loop end)
li $s0, 0 # i = 0; initialize 1st for loop
L1: li $s1, 0 # j = 0; restart 2nd for loop
L2: li $s2, 0 # k = 0; restart 3rd for loop
sll $t2, $s0, 5 # $t2 = i * 32 (size of row of x)
addu $t2, $t2, $s1 # $t2 = i * size(row) + j
sll $t2, $t2, 3 # $t2 = byte offset of [i][j]
addu $t2, $a0, $t2 # $t2 = byte address of x[i][j]
l.d $f4, 0($t2) # $f4 = 8 bytes of x[i][j]
L3: sll $t0, $s2, 5 # $t0 = k * 32 (size of row of z)
addu $t0, $t0, $s1 # $t0 = k * size(row) + j
sll $t0, $t0, 3 # $t0 = byte offset of [k][j]
addu $t0, $a2, $t0 # $t0 = byte address of z[k][j]
l.d $f16, 0($t0) # $f16 = 8 bytes of z[k][j]
…
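The address arithmetic in the visible part of the listing (shift left by 5, add the column index, shift left by 3, add the base) can be mirrored in C as a small sketch. It covers only the instructions shown and does not attempt to fill in the elided remainder of the loop. The names `byte_offset`, `element_address` and `offsets_match` are hypothetical, and the code assumes 8-byte doubles.

```c
#include <stddef.h>

/* Mirrors: sll t, row, 5 ; addu t, t, col ; sll t, t, 3 */
size_t byte_offset(size_t row, size_t col)
{
    size_t t = row << 5; /* row * 32 (elements per row) */
    t = t + col;         /* row * 32 + col */
    t = t << 3;          /* times 8 bytes per double */
    return t;
}

/* Mirrors: addu t, base, t (byte address of element [row][col]). */
double *element_address(double *base, size_t row, size_t col)
{
    return (double *)((unsigned char *)base + byte_offset(row, col));
}

/* Hypothetical self-check: the hand-computed address should agree with
 * the compiler's own indexing for every element of a 32 x 32 array. */
int offsets_match(void)
{
    static double x[32][32];
    for (size_t i = 0; i < 32; i++)
        for (size_t j = 0; j < 32; j++)
            if (element_address(&x[0][0], i, j) != &x[i][j])
                return 0;
    return 1;
}
```

The shifts work because both the row length (32) and the element size (8 bytes) are powers of two; other sizes would need a multiply instead.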
Sticky bit
The standard has a third bit in addition to guard and round; it is set whenever there are nonzero bits to the right of the round bit.
This sticky bit allows the computer to see the difference between $0.50\ldots00_{\text{ten}}$ and $0.50\ldots01_{\text{ten}}$ when rounding.
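The role of the three bits can be illustrated with a small C sketch of round-to-nearest-even on an integer significand. This is a simplified model under stated assumptions, not the hardware datapath: the function name `round_nearest_even` and the `drop` parameter (number of low bits discarded, at least 2) are hypothetical, and any renormalization after a carry out of the kept bits is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Discards the low `drop` bits of `sig`, rounding to nearest with ties
 * going to the even result. The first discarded bit acts as the guard
 * bit, the second as the round bit, and the OR of all remaining
 * discarded bits as the sticky bit. */
uint32_t round_nearest_even(uint32_t sig, unsigned drop)
{
    assert(drop >= 2 && drop <= 31);

    uint32_t kept   = sig >> drop;
    uint32_t guard  = (sig >> (drop - 1)) & 1u;
    uint32_t round  = (sig >> (drop - 2)) & 1u;
    uint32_t sticky = (sig & ((1u << (drop - 2)) - 1u)) != 0u;

    /* Discarded part is above one half if guard is set and anything
     * below it is nonzero; it is exactly one half if only guard is set,
     * in which case round up only when the kept value is odd. */
    if (guard && (round || sticky || (kept & 1u))) {
        kept += 1u;
    }
    return kept;
}
```

With four bits dropped, 0100 1000 is an exact tie and stays at 0100, while 0100 1001 differs only in a bit that the sticky bit captures and rounds up to 0101. This is the binary analogue of telling 0.50…00 apart from 0.50…01.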
Optional variations
I: integer operand
P: pop operand from stack
R: reverse operand order
But not all combinations allowed