Lecture 4 Roundoff-Error
Errors
There are discrete points on the number line that can be represented by our
computer.
How about the space between these points?
Implication of FP-Representations
• Only a limited range of quantities may be
represented, resulting in:
– Overflow and underflow.
Round-Off / Chopping Errors
(Error Bounds’ Analysis)
Let:
z be a real number we want to represent in a
computer, and let
fl(z) be the floating-point (FP) representation of z in the computer.
Chopping Errors (Error Bounds’ Analysis)
Suppose the mantissa can only support n digits.
$$z = (0.a_1 a_2 \ldots a_n a_{n+1} a_{n+2} \ldots)_\beta \times \beta^e, \quad a_1 \ne 0$$
$$fl(z) = (0.a_1 a_2 \ldots a_n)_\beta \times \beta^e$$
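$$z = \sigma \times (0.a_1 a_2 \ldots a_n a_{n+1} \ldots)_\beta \times \beta^e, \quad a_1 \ne 0$$
where $\sigma = \pm 1$ (sign), $\beta$ = base, $e$ = exponent.
$$fl(z) = \begin{cases}
\sigma \times (0.a_1 a_2 \ldots a_n)_\beta \times \beta^e, & 0 \le a_{n+1} < \frac{\beta}{2} \quad \text{(round down)} \\
\sigma \times \left[(0.a_1 a_2 \ldots a_n)_\beta + (0.00\ldots01)_\beta\right] \times \beta^e, & \frac{\beta}{2} \le a_{n+1} < \beta \quad \text{(round up)}
\end{cases}$$
where $(0.00\ldots01)_\beta = \beta^{-n}$.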
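When rounding down, i.e., when $a_{n+1} < \frac{\beta}{2}$:
$$(.a_{n+1} a_{n+2} \ldots)_\beta \le \frac{1}{2} \;\Rightarrow\; |z - fl(z)| \le \frac{1}{2}\beta^{e-n}$$
Similarly, when rounding up, i.e., when $a_{n+1} \ge \frac{\beta}{2}$:
$$|z - fl(z)| \le \frac{1}{2}\beta^{e-n}$$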
Round-off Errors (Error Bounds’ Analysis)
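Relative Error of fl(z)
Absolute round-off FP error:
$$|z - fl(z)| \le \frac{1}{2}\beta^{e-n}$$
Dividing by $|z|$:
$$\frac{|z - fl(z)|}{|z|} \le \frac{1}{2}\,\frac{\beta^{-n}}{|z|\,\beta^{-e}}$$
$$= \frac{1}{2}\,\frac{\beta^{-n}}{(.a_1 a_2 \ldots)_\beta} \quad \text{because } |z| = (.a_1 a_2 \ldots)_\beta \times \beta^e$$
$$\le \frac{1}{2}\,\frac{\beta^{-n}}{(.1)_\beta} \quad \text{because } (.a_1 \ldots)_\beta \ge (0.1)_\beta$$
$$= \frac{1}{2}\,\frac{\beta^{-n}}{\beta^{-1}} = \frac{1}{2}\beta^{1-n}$$
Relative chopping error:
$$\frac{|z - fl(z)|}{|z|} \le \beta^{1-n}$$
Relative round-off error:
$$\frac{|z - fl(z)|}{|z|} \le \frac{1}{2}\beta^{1-n}$$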
Propagation of Errors
• Each number or value of a variable is represented with
an error:
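$$x_T = x_A + E_x$$
$$y_T = y_A + E_y$$
where the subscript T denotes the true value and A the approximate value.
• These errors ($E_x$ and $E_y$) are carried over to the result
of every arithmetic operation (+, −, ×, ÷).
$$|z - fl(z)| \le \frac{1}{2}\beta^{e-n}$$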
Example 1 Cont.
Rounding-off Error:
$$|z - fl(z)| \le \frac{1}{2}\beta^{e-n}$$
Example 1 Cont.
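Finally, the Total Error is:
$$E_x = -0.4000 \times 10^{-1}, \quad E_y = 0.3000 \times 10^{-3}$$
$$(x_T + y_T) - fl(x_A + y_A) = (0.12346 \times 10^3 + 0.45623 \times 10^1) - 0.1281 \times 10^3$$
$$= 0.1280223 \times 10^3 - 0.1281 \times 10^3$$
$$= -0.7770 \times 10^{-1}$$
$$= \underbrace{(-0.4000 \times 10^{-1} + 0.3000 \times 10^{-3})}_{\text{propagated error}} + \underbrace{(-0.3800 \times 10^{-1})}_{\text{rounding error}}$$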
Propagation of Errors (In General)
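The error between the true result and the computed result is:
$$(x_T \circ y_T) - (x_A \circ^{*} y_A) = (x_T \circ y_T - x_A \circ y_A) + (x_A \circ y_A - x_A \circ^{*} y_A)$$
where $\circ$ is an exact arithmetic operation (+, −, ×, ÷) and $\circ^{*}$ is the same operation as carried out by the computer.
The first term is the propagated error and the second term is the rounding error.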
Analysis of Propagated Errors
Addition and Subtraction
Addition:
(𝑥𝑇 + 𝑦𝑇 ) − (𝑥𝐴 + 𝑦𝐴 ) = (𝑥𝐴 + 𝐸𝑥 + 𝑦𝐴 + 𝐸𝑦 ) − (𝑥𝐴 + 𝑦𝐴 )
= 𝐸𝑥 + 𝐸𝑦
Subtraction:
(𝑥𝑇 − 𝑦𝑇 ) − (𝑥𝐴 − 𝑦𝐴 ) = (𝑥𝐴 + 𝐸𝑥 − 𝑦𝐴 − 𝐸𝑦 ) − (𝑥𝐴 − 𝑦𝐴 )
= 𝐸𝑥 − 𝐸𝑦
Note: |𝐸𝑥 + 𝐸𝑦| or |𝐸𝑥 − 𝐸𝑦| ≤ |𝐸𝑥| + |𝐸𝑦|
Propagated Errors – Multiplication
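$$\varepsilon_{x \times y} = \frac{(x_T \times y_T) - (x_A \times y_A)}{x_T\, y_T}$$
$$= \frac{x_T\, y_T - (x_T - E_x) \times (y_T - E_y)}{x_T\, y_T}$$
$$= \frac{y_T E_x + x_T E_y - E_x E_y}{x_T\, y_T}$$
$$= \frac{E_x}{x_T} + \frac{E_y}{y_T} - \frac{E_x E_y}{x_T\, y_T}$$
$$\approx \varepsilon_x + \varepsilon_y$$
The last term, $\frac{E_x E_y}{x_T y_T} = \varepsilon_x \varepsilon_y$, is very small and can be neglected.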
Propagated Errors – Division
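$$\varepsilon_{x/y} = \frac{\dfrac{x_T}{y_T} - \dfrac{x_A}{y_A}}{\dfrac{x_T}{y_T}} = 1 - \frac{x_A / y_A}{x_T / y_T} = 1 - \frac{x_A\, y_T}{x_T\, y_A}$$
Since $\varepsilon_x = \frac{x_T - x_A}{x_T}$ and $\varepsilon_y = \frac{y_T - y_A}{y_T}$, we have $\frac{x_A}{x_T} = 1 - \varepsilon_x$ and $\frac{y_A}{y_T} = 1 - \varepsilon_y$, so
$$\varepsilon_{x/y} = 1 - \frac{1 - \varepsilon_x}{1 - \varepsilon_y} = \frac{\varepsilon_x - \varepsilon_y}{1 - \varepsilon_y}$$
$$\approx \varepsilon_x - \varepsilon_y \quad \text{if } \varepsilon_y \text{ is small and negligible}$$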
Example 2
Effect of Rounding Errors in Arithmetic Manipulations:
• Assuming 4-digit decimal mantissa.
• Round-off in simple multiplication or division.
Danger of Adding/Subtracting a Small Number
to/from a Large Number
$$8000 + 0.3$$
$$= 0.8000 \times 10^4 + 0.00003 \times 10^4$$
$$= 0.80003 \times 10^4$$
$$= 0.8000 \times 10^4 \quad \text{(rounding off)}$$
$$= 8000$$
Possible workarounds:
1) Sort the numbers by magnitude (if they have the same signs)
and add the numbers in increasing order.
2) Reformulate the formula algebraically.
Associativity Does Not Necessarily Hold for Floating-Point
Addition (or Multiplication)
Note:
In this example, if we simply sort the numbers by
magnitude and add them in increasing order, we get a
worse answer!
A better approach is to analyze the problem algebraically.
Subtraction of Two Close Numbers
$$0.3641 \times 10^2 - 0.2686 \times 10^2 = 0.0955 \times 10^2$$
The result will be normalized into $0.9550 \times 10^1$.
However, note that the zero added to the end of the
mantissa is not significant, but is merely appended to
fill the empty space created by the shift.
Note:
$0.9550 \times 10^1$ implies the error is about $\pm 0.00005 \times 10^1$,
but the actual error could be as big as $\pm 0.00005 \times 10^2$.
Subtractive Cancellation
Subtraction of Two Very Close Numbers
$$x_T = 0.5764 \pm \tfrac{1}{2} \times 10^{-4}$$
$$y_T = 0.5763 \pm \tfrac{1}{2} \times 10^{-4}$$
$$x_T - y_T = 0.0001 \pm 0.0001$$
The error bound is just as large as the estimate of
the result!
Subtraction of nearly equal numbers is a major
cause of errors!
Avoid subtractive cancellation whenever possible.
Avoiding Subtractive Cancellations
Example 1: When x is large, compute:
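$$f(x) = \sqrt{x+1} - \sqrt{x}$$
Is there a way to reduce the errors assuming that we are using the
same number of bits to represent numbers?
$$f(x) = \left(\sqrt{x+1} - \sqrt{x}\right) \times \frac{\sqrt{x+1} + \sqrt{x}}{\sqrt{x+1} + \sqrt{x}} = \frac{1}{\sqrt{x+1} + \sqrt{x}}$$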
Subtraction of Nearly Equal Numbers
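Example 2: Compute the roots of $ax^2 + bx + c = 0$ using:
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \quad \text{when } b^2 \gg 4ac$$
Solve: $x^2 - 26x + 1 = 0$
$$x_T^{(1)} = \frac{26 + \sqrt{26^2 - 4}}{2} = 13 + \sqrt{168}$$
$$x_T^{(2)} = \frac{26 - \sqrt{26^2 - 4}}{2} = 13 - \sqrt{168}$$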
Example 2 Cont.
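Assume a 5-digit decimal mantissa, $\sqrt{168} \approx 12.961$:
$$x_A^{(1)} = 25.961, \quad x_A^{(2)} = 13.000 - 12.961 = 0.039$$
Since $|E_{x^{(1)}}| = |E_{x^{(2)}}| \le 0.0005$ and
$$x_T^{(1)} \approx 25.961, \quad x_T^{(2)} \approx 0.0385186$$
the relative errors are:
$$\varepsilon_{x^{(1)}} = \frac{0.0005}{25.961} = 1.9 \times 10^{-5}, \quad \varepsilon_{x^{(2)}} = \frac{0.0005}{0.0385186} = 1.3 \times 10^{-2}$$
i.e., instead of computing
$$x = \frac{-b - \sqrt{b^2 - 4ac}}{2a}$$
we use
$$x = \frac{4ac}{2a\left(-b + \sqrt{b^2 - 4ac}\right)} = \frac{2c}{-b + \sqrt{b^2 - 4ac}}$$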
Exercise 1
Given $f(x) = \frac{1}{x} - \frac{1}{x+1}$
Assume a 3-digit decimal mantissa with rounding:
Propagation of Errors in a Series
Let the series be: $S = \sum_{i=0}^{m} x_i$
Example
$$S = \sum_{i=0}^{m} x_i$$

    #include <stdio.h>

    int main() {
        float sumx, x;
        float sumy, y;
        double sumz, z;
        int i;

        sumx = 0.0;
        sumy = 0.0;
        sumz = 0.0;
        x = 1.0;
        y = 0.00001;   /* single precision */
        z = 0.00001;   /* double precision */

        /* Add each value 100000 times. */
        for (i = 0; i < 100000; i++) {
            sumx = sumx + x;
            sumy = sumy + y;
            sumz = sumz + z;
        }

        printf("sumx = %f\n", sumx);
        printf("sumy = %f\n", sumy);
        printf("sumz = %.20f\n", sumz);

        return 0;
    }

Output:
    sumx = 100000.000000            (exact)
    sumy = 1.000990                 (less accurate)
    sumz = 0.99999999999808375506   (more accurate)
Exercise 2
Discuss to what extent
(a + b)c = ac + bc
holds in floating-point arithmetic.
Example: Evaluate $e^x$ as
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots + \frac{x^n}{n!} + \ldots$$
#include <stdio.h>
#include <math.h>
int main() {
float x = 10, sum = 1, term = 1, temp = 0;
int i = 0;
Errors vs. Number of Arithmetic Operations