Lecture5_Arithmetic for Computers – Part 2

The document discusses floating point representation in computer arithmetic, emphasizing the IEEE 754 standard for single and double precision formats. It explains the significance of normalized and denormalized forms, special numbers like zero, infinity, and NaN, and the range and precision of floating point numbers. Additionally, it provides examples of converting between decimal and floating point representations.


計算機組織

Computer Organization

Arithmetic for Computers – Part 2

Kun-Chih (Jimmy) Chen 陳坤志


[email protected]
Institute of Electronics,
National Yang Ming Chiao Tung University

NYCU EE / IEE
Floating Point: Motivation

❖ What can be represented in n bits?


Unsigned: 0 to 2^n − 1
2's complement: −2^(n−1) to 2^(n−1) − 1
1's complement: −2^(n−1) + 1 to 2^(n−1) − 1
Excess M: −M to 2^n − M − 1
❖ But, what about ...
❖ very large numbers? 1,987,987,987,987,987,987,987,987,987
❖ very small numbers? 0.0000000000000000000000054088
❖ rationals, e.g., 2/3
❖ irrationals, e.g., √2
❖ transcendentals, e.g., e, π
❖ Types float and double in C

P2
Floating Point: Example

❖ Floating Point
❖ A = 31.48
➢ 3 → 3 × 10^1
➢ 1 → 1 × 10^0
➢ 4 → 4 × 10^−1
➢ 8 → 8 × 10^−2

❖ Scientific notation
❖ A = 3.148 × 10^1
➢ 3 → 3 × 10^0 × 10^1
➢ 1 → 1 × 10^−1 × 10^1
➢ 4 → 4 × 10^−2 × 10^1
➢ 8 → 8 × 10^−3 × 10^1

P3
Scientific Notation: Decimal

3.5_ten × 10^−9
(significand: the fraction/mantissa 3.5; radix/base: 10; exponent: −9;
the "." is the decimal point)

❖ Normalized form: no leading 0s
(exactly one nonzero digit to the left of the decimal point)

❖ Alternatives to represent 0.0000000035

❖ Normalized: 3.5 × 10^−9
❖ Not normalized: 0.35 × 10^−8, 35.0 × 10^−10

P4
Scientific Notation: Binary

1.001_two × 2^−9
(significand: the fraction/mantissa 1.001; radix/base: 2; exponent: −9;
the "." is the binary point)

❖ Computer arithmetic that supports it is called floating point, because
the binary point is not fixed, as it is for integers

❖ Normalized form: no leading 0s
(exactly one nonzero digit to the left of the binary point)

❖ Scientific notation
❖ Normalized: 1.001 × 2^−9
❖ Not normalized: 0.1001 × 2^−8, 10.01 × 2^−10

P5
Floating Point Standard

❖ Defined by IEEE Std 754-1985

❖ Developed in response to divergence of representations


❖ Portability issues for scientific code

❖ Now almost universally adopted

❖ Two representations
❖ Single precision (32-bit)
❖ Double precision (64-bit)

P6
FP Representation

❖ Normal format: 1.xxxxxxxxxx_two × 2^yyyy_two

❖ Want to fit it into words: 32 bits for single precision and 64
bits for double precision
❖ A simple single-precision representation:

   31 | 30 … 23  | 22 … 0
    S | Exponent | Fraction
1 bit | 8 bits   | 23 bits

S represents the sign
Exponent represents the y's
Fraction represents the x's

❖ Represents numbers as small as ~2.0 × 10^−38 and as large as
~2.0 × 10^38

P7
Double Precision Representation

❖ Next multiple of the word size (64 bits)

First word:    31 | 30 … 20  | 19 … 0
                S | Exponent | Fraction
            1 bit | 11 bits  | 20 bits

Second word:  Fraction (cont'd)
              32 bits

❖ Double precision (vs. single precision)

❖ Represents numbers almost as small as ~2.0 × 10^−308
and almost as large as ~2.0 × 10^308
❖ But the primary advantage is greater accuracy,
due to the larger fraction

P8
IEEE 754 Standard (1/4)

❖ Shown for single precision; double precision is similar


❖ Sign bit:
1 means negative
0 means positive
❖ Fraction:
❖ To pack more bits, leading 1 implicit for normalized numbers (hidden leading
1 bit)
❖ 1 + 23 bits single, 1 + 52 bits double
❖ always true: 0 ≤ Fraction < 1
(for normalized numbers)
❖ Significand is Fraction with the “1.” restored
❖ Note: 0 has no leading 1, so reserve exponent value 0 just for number 0

P9
IEEE 754 Standard (2/4)

❖ Exponent:
❖ Need to represent positive and negative exponents
❖ Also want to compare FP numbers as if they were integers, to help in
value comparisons
❖ How about using 2's complement to represent it?
Ex: 1.0 × 2^−1 versus 1.0 × 2^+1 (1/2 versus 2)

1/2  0 1111 1111 000 0000 0000 0000 0000 0000   (exponent −1)

2    0 0000 0001 000 0000 0000 0000 0000 0000   (exponent +1)

If we use integer comparison on these two words, we
will conclude that 1/2 > 2!!!

P10
IEEE 754 Standard (3/4)

❖ Instead, let notation 0000 0000 be most negative, and 1111 1111
most positive
❖ Called biased notation, where bias is the number subtracted to get
the real number
❖ IEEE 754 uses bias of 127 for single precision:
Subtract 127 from Exponent field to get actual value for exponent
❖ 1023 is bias for double precision

1/2  0 0111 1110 000 0000 0000 0000 0000 0000   (126 − 127 = −1)
2    0 1000 0000 000 0000 0000 0000 0000 0000   (128 − 127 = +1)

Now we can use integer comparison for floating-point
comparison.

P11
Biased (Excess) Notation
❖ Biased 7 (4 bits): unsigned value / bit pattern / number represented
 0   0000   -7
 1   0001   -6
 2   0010   -5
 3   0011   -4
 4   0100   -3
 5   0101   -2
 6   0110   -1
 7   0111    0
 8   1000    1
 9   1001    2
10   1010    3
11   1011    4
12   1100    5
13   1101    6
14   1110    7
15   1111    8

P12
IEEE 754 Standard (4/4)

❖ Summary (single precision):

   31 | 30 … 23  | 22 … 0
    S | Exponent | Fraction
1 bit | 8 bits   | 23 bits

(−1)^S × (1.Fraction) × 2^(Exponent − 127)

❖ Double precision is the same, except with an exponent bias of 1023

P13
Example 1: FP to Decimal

0 0110 1000 101 0101 0100 0011 0100 0010

❖ Sign: 0 => positive


❖ Exponent:
❖ 0110 1000_two = 104_ten
❖ Bias adjustment: 104 − 127 = −23
❖ Fraction:
❖ 1 + 2^−1 + 2^−3 + 2^−5 + 2^−7 + 2^−9 + 2^−14 + 2^−15 + 2^−17 + 2^−22
= 1.0 + 0.666115
❖ Represents: 1.666115_ten × 2^−23 ≈ 1.986 × 10^−7

P14
Example 2: Decimal to FP

❖ Number = −0.75
= −0.11_two × 2^0 (scientific notation)
= −1.1_two × 2^−1 (normalized scientific notation)

❖ Sign: negative => 1

❖ Exponent:
❖ Bias adjustment: −1 + 127 = 126
❖ 126_ten = 0111 1110_two

1 0111 1110 100 0000 0000 0000 0000 0000

P15
Example 3: Decimal to FP

❖ A more difficult case: representing 1/3?

= 0.33333…_ten = 0.0101010101…_two × 2^0
= 1.0101010101…_two × 2^−2
❖ Sign: 0
❖ Exponent = −2 + 127 = 125_ten = 0111 1101_two
❖ Fraction = 0101 0101 0101 0101 0101 010…

0 0111 1101 0101 0101 0101 0101 0101 010

P16
Double-Precision Range

❖ Exponents 0000…00 and 1111…11 reserved

❖ Smallest value
❖ Exponent: 00000000001
→ actual exponent = 1 − 1023 = −1022
❖ Fraction: 000…00 → significand = 1.0
❖ ±1.0 × 2^−1022 ≈ ±2.2 × 10^−308
❖ Largest value
❖ Exponent: 11111111110
→ actual exponent = 2046 − 1023 = +1023
❖ Fraction: 111…11 → significand ≈ 2.0
❖ ±2.0 × 2^+1023 ≈ ±1.8 × 10^+308

P17
Floating-Point Precision

❖ Relative precision
❖ all fraction bits are significant
❖ Single: approx 2^−23
➢ Equivalent to 23 × log10(2) ≈ 23 × 0.3 ≈ 6 decimal digits of precision
❖ Double: approx 2^−52
➢ Equivalent to 52 × log10(2) ≈ 52 × 0.3 ≈ 16 decimal digits of precision

P18
Zero and Special Numbers

❖ What have we defined so far? (single precision)

Exponent   Fraction   Object

0          0          ???
0          nonzero    ???
1-254      anything   +/- floating-point
255        0          ???
255        nonzero    ???

P19
Representation for 0

❖ Represent 0?
❖ Exponent: all zeroes
❖ Fraction: all zeroes, too
❖ What about sign?
❖ +0: 0 00000000 00000000000000000000000
❖ -0: 1 00000000 00000000000000000000000
❖ Why two zeroes?
❖ Helps in some limit comparisons

P20
Special Numbers

❖ What have we defined so far? (single precision)

Exponent   Fraction   Object

0          0          +/- 0
0          nonzero    ???
1-254      anything   +/- floating-point
255        0          ???
255        nonzero    ???

❖ Range:
Smallest normalized: 1.0 × 2^−126 ≈ 1.2 × 10^−38
What if a result is too small? (>0, < 1.2 × 10^−38 => Underflow!)
Largest: 1.11…1 × 2^127 = (2 − 2^−23) × 2^127 ≈ 3.4 × 10^38
What if a result is too large? (> 3.4 × 10^38 => Overflow!)

P21
Range of Single-Precision Floating-Point Numbers

[Number line: the representable negative range runs from −1.11…11 × 2^127
to −1.0 × 2^−126, then zero, then the positive range from +1.0 × 2^−126
to +1.11…11 × 2^127. Results beyond either end overflow (toward −∞ / +∞);
nonzero results between −1.0 × 2^−126 and +1.0 × 2^−126 underflow.]

P22
Gradual Underflow

❖ Represent denormalized numbers (denorms)

❖ Exponent: all zeroes
❖ Fraction: nonzero
❖ Allow a number to degrade in significance until it becomes 0 (gradual
underflow)

❖ The smallest normalized number

➢ 1.0000 0000 0000 0000 0000 000 × 2^−126
❖ The smallest denormalized number
➢ 0.0000 0000 0000 0000 0000 001 × 2^−126 (= 2^−149)

P23
Special Numbers

❖ What have we defined so far? (single precision)

Exponent   Fraction   Object

0          0          +/- 0
0          nonzero    denorm
1-254      anything   +/- floating-point
255        0          ???
255        nonzero    ???

P24
Representation for +/- Infinity

❖ In FP, divide by zero should produce +/- infinity, not overflow


❖ Why?
❖ OK to do further computations with infinity
Ex: X/0 > Y may be a valid comparison
❖ IEEE 754 represents +/- infinity
❖ Most positive exponent reserved for infinity
❖ Fractions all zeroes

S 1111 1111 0000 0000 0000 0000 0000 000

P25
Special Numbers (cont’d)

❖ What have we defined so far? (single-precision)

Exponent   Fraction   Object

0          0          +/- 0
0          nonzero    denorm
1-254      anything   +/- floating-point
255        0          +/- infinity
255        nonzero    ???

P26
Representation for Not a Number

❖ What do I get if I calculate sqrt(-4.0) or 0/0?


❖ If infinity is not an error, these should not be either
❖ They are called Not a Number (NaN)
❖ Exponent = 255, fraction nonzero

❖ Why is this useful?

❖ NaNs can help with debugging
❖ They contaminate: op(NaN, X) = NaN
❖ OK to compute with a NaN as long as the result is never used

P27
Special Numbers (cont’d)

❖ What have we defined so far? (single-precision)

Exponent   Fraction   Object

0          0          +/- 0
0          nonzero    denorm
1-254      anything   +/- floating-point
255        0          +/- infinity
255        nonzero    NaN

P28
Decimal Addition

❖ A = 3.71345 × 10^2, B = 1.32 × 10^−4, Perform A + B

    3.71345    × 10^2
+ 0.00000132 × 10^2   (B right-shifted 2 − (−4) = 6 digits)
  3.71345132 × 10^2

❖ A = 3.71345 × 10^2
❖ B = 1.32 × 10^−4 = 0.00000132 × 10^2
❖ A + B = (3.71345 + 0.00000132) × 10^2

P29
Floating-Point Addition
Basic addition algorithm:
(1) Align binary points: compute Ye − Xe
❖ right shift the mantissa of the smaller number, say Xm, that many
positions to form Xm × 2^(Xe−Ye)

(2) Add mantissas: compute Xm × 2^(Xe−Ye) + Ym

(3) Normalize & check for over/underflow if necessary:

❖ left shift result, decrement result exponent, or
❖ right shift result, increment result exponent
❖ check for overflow or underflow during the shift

(4) Round the mantissa and renormalize if necessary

P30
Floating-Point Addition Example

❖ Now consider a 4-digit binary example

❖ 1.000_two × 2^−1 + (−1.110_two × 2^−2)   (0.5 + −0.4375)
❖ 1. Align binary points
❖ Shift the number with the smaller exponent
❖ 1.000_two × 2^−1 + (−0.111_two × 2^−1)
❖ 2. Add mantissas
❖ 1.000_two × 2^−1 − 0.111_two × 2^−1 = 0.001_two × 2^−1
❖ 3. Normalize result & check for over/underflow
❖ 1.000_two × 2^−4, with no over/underflow
❖ 4. Round and renormalize if necessary
❖ 1.000_two × 2^−4 (no change) = 0.0625

P31
Floating-Point Addition

P32
[Block diagram of the FP adder datapath (inputs: two sign/exponent/
significand triples):
Step 1 — a small ALU compares the exponents; the exponent difference
controls a shifter that right-shifts the smaller number's significand.
Step 2 — the big ALU adds the significands.
Step 3 — normalize: shift the result left or right and increment or
decrement the exponent accordingly.
Step 4 — rounding hardware rounds the result, feeding back for
renormalization if needed. Output: sign, exponent, significand.]

P33
FP Adder Hardware

❖ Much more complex than integer adder


❖ Doing it in one clock cycle would take too long
❖ Much longer than integer operations
❖ Slower clock would penalize all instructions
❖ FP adder usually takes several cycles
❖ Can be pipelined

P34
Decimal Multiplication

❖ A = 3.12 × 10^2, B = 1.5 × 10^−4, Perform A × B

   3.12 × 10^2
×  1.5 × 10^−4
   4.68 × 10^−2

❖ A = 3.12 × 10^2
❖ B = 1.5 × 10^−4
❖ A × B = (3.12 × 1.5) × 10^(2+(−4))

P35
Floating-Point Multiplication

Basic multiplication algorithm

(1) Add exponents of operands to get exponent of product;
the doubly biased exponent must be corrected:
Excess-8 example:  Xe = 7  →  1111 = 15 = 7 + 8
                   Ye = −3 →  0101 =  5 = −3 + 8
                   sum     → 10100 = 20 = 4 + 8 + 8
need an extra subtraction step of the bias amount
(2) Multiply the operand mantissas
(3) Normalize the product & check for overflow or underflow
during the shift
(4) Round the mantissa and renormalize if necessary
(5) Set the sign of the product

P36
Floating-Point Multiplication

[Flowchart of the multiplication algorithm above.]

P37
Floating-Point Multiplication Example

❖ Now consider a 4-digit binary example

❖ 1.000_two × 2^−1 × (−1.110_two × 2^−2)   (i.e., 0.5 × −0.4375)

1. Add exponents
❖ Unbiased: −1 + −2 = −3
❖ Biased: (−1 + 127) + (−2 + 127) = −3 + 254; subtract 127 → −3 + 127
2. Multiply operand mantissas
❖ 1.000_two × 1.110_two = 1.110_two, i.e., 1.110_two × 2^−3
3. Normalize result & check for over/underflow
❖ 1.110_two × 2^−3 (no change), with no over/underflow
4. Round and renormalize if necessary
❖ 1.110_two × 2^−3 (no change)
5. Determine sign:
❖ −1.110_two × 2^−3 = −0.21875

P38
FP Arithmetic Hardware

❖ FP multiplier is of similar complexity to FP adder


❖ But uses a multiplier for significands instead of an adder
❖ FP arithmetic hardware usually does
❖ Addition, subtraction, multiplication, division, reciprocal, square-root
❖ FP  integer conversion
❖ Operations usually take several cycles
❖ Can be pipelined

P39
P40
FP Instructions in RISC-V
❖ Separate FP registers: f0, …, f31
❖ each register holds a double-precision value
❖ single-precision values are stored in the lower 32 bits

❖ FP instructions operate only on FP registers


❖ Programs generally don’t do integer ops on FP data, or vice versa
❖ More registers with minimal code-size impact

❖ FP load and store instructions


❖ flw, fld
❖ fsw, fsd

P41
FP Instructions in RISC-V
❖ Single-precision arithmetic
❖ fadd.s, fsub.s, fmul.s, fdiv.s, fsqrt.s
➢ e.g., fadd.s f2, f4, f6

❖ Double-precision arithmetic
❖ fadd.d, fsub.d, fmul.d, fdiv.d, fsqrt.d
➢ e.g., fadd.d f2, f4, f6

❖ Single- and double-precision comparison


❖ feq.s, flt.s, fle.s
❖ feq.d, flt.d, fle.d
❖ Result is 0 or 1 in integer destination register
➢ Use beq, bne to branch on comparison result

❖ No branch on FP condition codes

❖ RISC-V has no FP condition codes; branching on an FP condition uses
the integer result of a comparison (unlike ISAs that provide a B.cond
on FP flags)

P42
FP Instructions in RISC-V

2024/10/8 Andy Yu-Guang Chen 43


P43
FP Example: °F to °C

❖ C code:
float f2c (float fahr) {
return ((5.0/9.0)*(fahr - 32.0));
}
❖ fahr in f10, result in f10, literals in global memory space

❖ Compiled RISC-V code:


f2c:
flw    f0,const5(x3)   // f0 = 5.0f
flw    f1,const9(x3)   // f1 = 9.0f
fdiv.s f0,f0,f1        // f0 = 5.0f / 9.0f
flw    f1,const32(x3)  // f1 = 32.0f
fsub.s f10,f10,f1      // f10 = fahr - 32.0f
fmul.s f10,f0,f10      // f10 = (5.0f/9.0f) * (fahr - 32.0f)
jalr   x0,0(x1)        // return

We assume the compiler places the three floating-point constants in
memory within easy reach of register x3

P44
Accurate Arithmetic

❖ IEEE Std 754 specifies additional rounding control


❖ Extra bits of precision (guard, round, sticky)
❖ Choice of rounding modes
❖ Allows programmer to fine-tune numerical behavior of a computation

❖ Not all FP units implement all options


❖ Most programming languages and FP libraries just use defaults

❖ Trade-off between hardware complexity, performance, and market


requirements

P45
Subword Parallelism

❖ Graphics and audio applications can take advantage of performing
simultaneous operations on short vectors
❖ Example: a 128-bit adder can perform:
➢ Sixteen 8-bit adds
➢ Eight 16-bit adds
➢ Four 32-bit adds

❖ Also called data-level parallelism, vector parallelism, or Single
Instruction, Multiple Data (SIMD)

P46
Final 64-bit RISC-V ALU

ALUop   Function
0000    and
0001    or
0010    add
0110    subtract
0111    set-on-less-than
1100    nor

P47
ALU Control and Function

[Diagram of one ALU bit-slice: control inputs Ainvert, Binvert, CarryIn,
and a 2-bit Operation select; inputs a and b pass through optional
inverters, then an AND gate, an OR gate, and a full adder feed a
multiplexer that drives Result; the slice also produces CarryOut and a
"less" input for slt.]

ALU Control (ALUop)   Function
0000                  and
0001                  or
0010                  add
0110                  subtract
0111                  set-on-less-than
1100                  nor

P48
Ripple Carry Adder

❖ Carries ripple from the lower bits to the higher bits

carries   00111111    (c8 … c1; Cin = 1)
          00101010
        + 00010101
        ----------
          01000000

❖ Ripple computation dominates the run time


❖ Higher-bit ALU must wait for carry from lower-bit ALU
❖ Run time complexity: O(n)

P49
Problems with Ripple Carry Adder

❖ The carry bit may have to propagate from LSB to MSB => worst case
delay: N stage delays

[Diagram: four 1-bit ALUs chained; stage i takes Ai, Bi and CarryIn_i,
produces Result_i, and feeds CarryOut_i to the next stage's CarryIn.
Design trick: look for parallelism and throw hardware at it.]

P50
Remove the Dependency

❖ Ripple carry adder

[Diagram: eight full adders for (a7,b7) … (a0,b0) chained from Cin to
Cout, producing s7 … s0; each stage's carry feeds the next.]

❖ Carry lookahead adder

❖ No carry-bit propagation from LSB to MSB

[Diagram: the same eight full adders, but every carry is produced
directly by a carry computation circuit from the inputs and Cin.]

P51
4-bit Carry-Lookahead Adder (CLA)

❖ The ripple carry adder takes a long time to determine the carry bits

❖ The carry-lookahead adder (CLA) is a type of adder that improves speed
by reducing the time required to determine the carry bits

[Diagram: four 1-bit full adders for (A3,B3) … (A0,B0) with carry inputs
C3 … C1 and C0, producing S3 … S0; each adder also outputs its propagate
Pi and generate Gi, which feed a 4-bit CLL (carry look-ahead logic)
block that computes C4 and the group signals PG, GG.]
P52
Carry-Lookahead Adder
Full adder:  S = A ⊕ B ⊕ Cin
             Cout = (A ⊕ B) · Cin + A · B

❖ Ci+1 = (Ai · Bi) + (Ai ⊕ Bi) · Ci
       = Gi + Pi · Ci
❖ Generate:  Gi = Ai · Bi
❖ Propagate: Pi = Ai ⊕ Bi

❖ C1 = G0 + P0 · C0
C2 = G1 + P1 · C1 = G1 + P1 · (G0 + P0 · C0) = G1 + P1 · G0 + P1 · P0 · C0
C3 = G2 + P2 · G1 + P2 · P1 · G0 + P2 · P1 · P0 · C0
C4 = G3 + P3 · G2 + P3 · P2 · G1 + P3 · P2 · P1 · G0 + P3 · P2 · P1 · P0 · C0

❖ Only need A, B and C0 to calculate the carry bit

P53
16-bit CLA

• As before, the p, g's are generated in parallel in 1 gate delay
• Without the input carry, the first-tier CLLs cannot generate the C's.
Instead they generate P, G's (group propagate and group generate) in 2 gate delays:
P => the group will propagate the input carry: P = p0·p1·p2·p3
G => the group will generate an output carry: G = g3 + p3·g2 + p3·p2·g1 + p3·p2·p1·g0
• The second-tier CLL takes the P, G's from the first-tier CLLs and C0 and
generates the "seed C's" for the first-tier CLLs in 2 gate delays (note that
the logic for generating the seed C's from the P, G's is exactly the same as
for generating the C's from the p, g's!)
• With the seed C's as input, the first-tier CLLs use Cin and the p, g's to
generate the C's in 2 gate delays
• With all C's in place, the S's are calculated in 3 gate delays due to the XOR gate
P54
Pi, Gi Generation in a 16-bit CLA
❖ Propagate (P) → 1 gate delay
❖ P0 = p3 · p2 · p1 · p0
❖ P1 = p7 · p6 · p5 · p4
❖ P2 = p11 · p10 · p9 · p8
❖ P3 = p15 · p14 · p13 · p12

❖ Generate (G) → 2 gate delays

❖ G0 = g3 + (p3 · g2) + (p3 · p2 · g1) + (p3 · p2 · p1 · g0)
❖ G1 = g7 + (p7 · g6) + (p7 · p6 · g5) + (p7 · p6 · p5 · g4)
❖ G2 = g11 + (p11 · g10) + (p11 · p10 · g9) + (p11 · p10 · p9 · g8)
❖ G3 = g15 + (p15 · g14) + (p15 · p14 · g13) + (p15 · p14 · p13 · g12)

❖ Carry (C) → 2 gate delays

❖ C1 = G0 + c0 · P0
❖ C2 = G1 + G0 · P1 + c0 · P0 · P1
❖ C3 = G2 + G1 · P2 + G0 · P1 · P2 + c0 · P0 · P1 · P2
❖ C4 = G3 + G2 · P3 + G1 · P2 · P3 + G0 · P1 · P2 · P3 + c0 · P0 · P1 · P2 · P3

Therefore, it takes 1 + 2 + 2 + 3 = 8 gate delays in total to finish the
whole thing!!
P55
16-bit Carry-Lookahead Adder

❖ A 16-bit carry-lookahead adder is composed of four 4-bit carry-lookahead adders

P56
Who Cares About FP Accuracy?

❖ Important for scientific code


❖ But for everyday consumer use?
➢"My bank balance is out by 0.0002¢!" 

❖ The Intel Pentium FDIV (floating-point division) bug, 1994

❖ Recall cost: US$475M
❖ The market expects accuracy
❖ See Colwell, The Pentium Chronicles

[Photo: a 66 MHz Intel Pentium]
P57
