Lecture 4 Roundoff-Error

Round-off errors occur due to the finite precision of floating-point numbers in computers. This document discusses how round-off errors are introduced during arithmetic operations and how they propagate. It analyzes the worst-case bounds on absolute and relative round-off errors, which are shown to be a function of the number of digits in the floating-point mantissa. Examples are given to demonstrate how initial errors in operands propagate and combine with rounding errors during calculations.

Uploaded by

Nour Hesham

Round-off Errors

Edited by: Dr. Wafaa El-Haweet and Dr. Zeinab Eid
1


Key Concepts
• Round-off / Chopping Errors.
• Recognize how Floating-Point (FP) arithmetic operations can introduce and amplify round-off errors.
• What can be done to reduce the effect of round-off errors.

2
There are discrete points on the number line that can be represented by our computer.
What about the space between these points?

3
Implication of FP-Representations
• Only a limited range of quantities may be represented, resulting in:
  – Overflow and underflow.
• Only a finite number of quantities within the range may be represented, resulting in:
  – Round-off errors or chopping errors.

4
Round-Off / Chopping Errors
(Error Bounds’ Analysis)
Let:
z be a real number we want to represent in a computer, and let
fl(z) be the FP-representation of z in the computer.

What is the largest possible value of |z − fl(z)| / |z| ?

That is, in the worst case, how much data are we losing due to round-off or chopping errors?

5
Chopping Errors (Error Bounds’ Analysis)
Suppose the mantissa can only support n digits.
z = (0.a1 a2 … an a(n+1) a(n+2) …)β × β^e,   a1 ≠ 0
fl(z) = (0.a1 a2 … an)β × β^e

Thus, the absolute and relative chopping errors are:

z − fl(z) = (0.00…0 a(n+1) a(n+2) …)β × β^e      (n zeroes)
          = (0.a(n+1) a(n+2) …)β × β^(e−n)

[z − fl(z)] / z = [(0.00…0 a(n+1) a(n+2) …)β × β^e] / [(0.a1 a2 … an a(n+1) a(n+2) …)β × β^e]

Suppose β = 10 (base 10): what values of the ai make these errors largest? (Worst-Case Analysis)
6
Chopping Errors (Error Bounds’ Analysis)
Because (0.a(n+1) a(n+2) a(n+3) …)β < 1,

|z − fl(z)| = (0.a(n+1) a(n+2) …)β × β^(e−n) ≤ β^(e−n)

For the relative error, the denominator is smallest when a1 = 1 and the next n − 1 digits are 0, since (0.a1 a2 … an a(n+1) …)β ≥ (0.100…0)β = β^(−1). Hence:

[z − fl(z)] / z = [(0.00…0 a(n+1) a(n+2) …)β × β^e] / [(0.a1 a2 … an a(n+1) a(n+2) …)β × β^e]
                ≤ β^(e−n) / (β^(−1) × β^e)
                = β^(e−n−(e−1))
                = β^(1−n)

so |z − fl(z)| / |z| ≤ β^(1−n).
7
Round-Off Errors (Error Bounds’ Analysis)

z = σ × (0.a1 a2 … an a(n+1) …)β × β^e,   a1 ≠ 0
σ = ±1 (sign),  β = base,  e = exponent

fl(z) = σ × (0.a1 a2 … an)β × β^e                    if 0 ≤ a(n+1) < β/2   (round down)
fl(z) = σ × [(0.a1 a2 … an)β + (0.00…01)β] × β^e     if β/2 ≤ a(n+1) < β   (round up)

where (0.00…01)β = β^(−n) is one unit in the last mantissa place.

fl(z) is the rounded value of z.
8
Round-off Errors (Error Bounds’ Analysis)
Absolute Error of fl (z)
When rounding down:
z − fl(z) = σ × (0.00…0 a(n+1) a(n+2) a(n+3) …)β × β^e
          = σ × (0.a(n+1) a(n+2) a(n+3) …)β × β^(e−n)
|z − fl(z)| = (0.a(n+1) a(n+2) a(n+3) …)β × β^(e−n)

Since a(n+1) < β/2 here, (0.a(n+1) a(n+2) …)β ≤ 1/2, so |z − fl(z)| ≤ (1/2) β^(e−n).

Similarly, when rounding up, i.e. when β/2 ≤ a(n+1) < β:  |z − fl(z)| ≤ (1/2) β^(e−n).
9
Round-off Errors (Error Bounds’ Analysis)
Relative Error of fl (z)
|z − fl(z)| ≤ (1/2) β^(e−n)          (absolute round-off FP-error)

[z − fl(z)] / z ≤ (1/2) β^(−n) / (|z| × β^(−e))
                = (1/2) β^(−n) / (0.a1 a2 …)β      because z = (0.a1 a2 …)β × β^e
                ≤ (1/2) β^(−n) / (0.1)β            because (0.a1 …)β ≥ (0.1)β
                = (1/2) β^(−n) / β^(−1)
                = (1/2) β^(1−n)

|z − fl(z)| / |z| ≤ (1/2) β^(1−n)    (relative round-off FP-error)
10
Summary of Error Bounds’ Analysis
                Chopping Errors                    Round-off Errors
Absolute        |z − fl(z)| ≤ β^(e−n)              |z − fl(z)| ≤ (1/2) β^(e−n)
Relative        |z − fl(z)| / |z| ≤ β^(1−n)        |z − fl(z)| / |z| ≤ (1/2) β^(1−n)

where β is the base, and n is the number of significant digits (the number of digits in the mantissa).

Regardless of whether chopping or round-off is used, the absolute errors may increase as the numbers grow in magnitude (e >>), but the relative errors are bounded by the same magnitude.
11
Machine Epsilon
Relative chopping error:  |z − fl(z)| / |z| ≤ β^(1−n) = eps_Chopping

Relative round-off error: |z − fl(z)| / |z| ≤ (1/2) β^(1−n) = eps_Round-off

eps is known as the machine epsilon: the smallest number such that 1 + eps > 1.

Algorithm to compute machine epsilon:
  epsilon = 1;
  while (1 + epsilon > 1)
      epsilon = epsilon / 2;
  epsilon = epsilon * 2;

12
Propagation of Errors
• Each number or value of a variable is represented with
an error:
xT = x A + E x
yT = y A + E y
• These errors (Ex and Ey) are carried over to the result of every arithmetic operation (+, −, ×, ÷).

• How much error is propagated to the result of each arithmetic operation?
13
Example 1
𝑥𝑇 = 0.12346 × 103 𝑦𝑇 = 0.45623 × 101
Assume 4 decimal mantissa with rounding-off is used.
𝑥𝐴 = 0.1235 × 103 𝑦𝐴 = 0.4562 × 101
𝑥𝐴 + 𝑦𝐴 = 0.1235 × 103 + 0.4562 × 101
= 0.128062 × 103
fl(xA + yA) = 0.1281 × 10³   (final value after round-off)

How many kinds of error, and how much error, are introduced into the final value?
14
Example 1 Cont.
Propagated Error: (xT + yT) - (xA + yA) = Ex+Ey

𝑥𝑇 = 0.12346 × 103 𝑦𝑇 = 0.45623 × 101


𝑥𝐴 = 0.1235 × 103 𝑦𝐴 = 0.4562 × 101
𝐸𝑥 = −0.4000 × 10−1 𝐸𝑦 = 0.3000 × 10−3

Recall: |z − fl(z)| ≤ (1/2) β^(e−n)

Propagated Error = −0.4000 × 10⁻¹ + 0.3000 × 10⁻³

15
Example 1 Cont.
Rounding-off Error:

𝑥𝐴 + 𝑦𝐴 = 0.1235 × 103 + 0.4562 × 101 = 0.128062 × 103


𝑓𝑙(𝑥𝐴 + 𝑦𝐴 ) = 0.1281 × 103

(xA + yA) − fl(xA + yA) = 0.128062 × 10³ − 0.1281 × 10³
                        = −0.000038 × 10³ = −0.3800 × 10⁻¹

Recall: |z − fl(z)| ≤ (1/2) β^(e−n)
16
Example 1 Cont.
Finally, the Total Error is (with Ex = −0.4000 × 10⁻¹ and Ey = 0.3000 × 10⁻³):

(xT + yT) − fl(xA + yA)
  = (0.12346 × 10³ + 0.45623 × 10¹) − 0.1281 × 10³
  = 0.1280223 × 10³ − 0.1281 × 10³
  = −0.7770 × 10⁻¹
  = (−0.4000 × 10⁻¹ + 0.3000 × 10⁻³) + (−0.3800 × 10⁻¹)
     [propagated error]                 [rounding error]

The total error is the sum of the propagated error and the rounding-off error.
17
Propagation of Errors (In General)
𝑥𝑇 = 𝑥𝐴 + 𝐸𝑥 𝑦𝑇 = 𝑦𝐴 + 𝐸𝑦

• Let ⊙ be the operation between xT and yT;
  ⊙ can be any of +, −, ×, ÷.
• Let * be the corresponding operation carried out by the computer:
  – Note: xA ⊙ yA ≠ xA * yA
18
Propagation of Errors (In General)
Error between the true result and the computed result is:

(xT ⊙ yT) − (xA * yA) = (xT ⊙ yT − xA ⊙ yA) + (xA ⊙ yA − xA * yA)

where the first term is the error in x and y propagated by the operation, and the second term is the rounding error of the result, with

|xA ⊙ yA − xA * yA| = |xA ⊙ yA − fl(xA ⊙ yA)| ≤ |xA ⊙ yA| × eps
19
Analysis of Propagated Errors
Addition and Subtraction
Addition:
(𝑥𝑇 + 𝑦𝑇 ) − (𝑥𝐴 + 𝑦𝐴 ) = (𝑥𝐴 + 𝐸𝑥 + 𝑦𝐴 + 𝐸𝑦 ) − (𝑥𝐴 + 𝑦𝐴 )
= 𝐸𝑥 + 𝐸𝑦

Subtraction:
(𝑥𝑇 − 𝑦𝑇 ) − (𝑥𝐴 − 𝑦𝐴 ) = (𝑥𝐴 + 𝐸𝑥 − 𝑦𝐴 − 𝐸𝑦 ) − (𝑥𝐴 − 𝑦𝐴 )
= 𝐸𝑥 − 𝐸𝑦

Note: |Ex + Ey| and |Ex − Ey| are both ≤ |Ex| + |Ey|
20
Propagated Errors – Multiplication
ε(x×y) = [(xT × yT) − (xA × yA)] / (xT yT)
       = [xT yT − (xT − Ex)(yT − Ey)] / (xT yT)
       = (yT Ex + xT Ey − Ex Ey) / (xT yT)
       = Ex/xT + Ey/yT − (Ex/xT)(Ey/yT)
       ≈ εx + εy      (the product term is very small and can be neglected)
21
Propagated Errors – Division
ε(x/y) = (xT/yT − xA/yA) / (xT/yT)
       = 1 − (xA/yA) / (xT/yT)
       = 1 − (xA yT) / (xT yA)
       = 1 − (1 − εx) / (1 − εy)
       = (εx − εy) / (1 − εy)
       ≈ εx − εy,   if εy is small and negligible
22
Example 2
Effect of Rounding Errors in Arithmetic Manipulations:
• Assuming 4-digit decimal mantissa.
• Round-off in simple multiplication or division.

3333 × 111 = 369963 (EXACT)


Results by the computer:
(0.3333 × 104 ) × (0.1110 × 103 )
= 0.369963 × 106
= 0.3700 × 106 (Rounding−off)
= 370000 (Result)

23
Danger of Adding/Subtracting a Small Number
to/from a Large Number
8000 + 0.3
= 0.8000 × 104 + 0.00003 × 104
= 0.80003 × 104
= 0.8000 × 104 (Rounding−off)
= 8000

Possible workarounds:
1) Sort the numbers by magnitude (if they have the same signs)
and add the numbers in increasing order.
2) Reformulate the formula algebraically.
24
Associativity Does Not Necessarily Hold for Floating-Point Addition (or Multiplication)

𝑎 = 0.8567 × 100 , 𝑏 = 0.1325 × 104 , 𝑐 = −0.1325 × 104


a + (b + c) = 0.8567 × 10⁰
(a + b) + c = 0.1000 × 10¹

The two answers are NOT the same!

Note:
In this example, if we simply sort the numbers by magnitude and add them in increasing order, we get a worse answer!
A better approach is to analyze the problem algebraically.
25
Subtraction of Two Close Numbers
  0.3641 × 10²
− 0.2686 × 10²
= 0.0955 × 10²
The result will be normalized into 0.9550 × 10¹.
However, note that the zero added to the end of the
mantissa is not significant, but is merely appended to
fill the empty space created by the shift.
Note:
0.9550 × 10¹ implies the error is about ± 0.00005 × 10¹,
but the actual error could be as big as ± 0.00005 × 10².
26
Subtractive Cancellation
Subtraction of Two Very Close Numbers
xT = 0.5764 ± (1/2) × 10⁻⁴
yT = 0.5763 ± (1/2) × 10⁻⁴
xT − yT = 0.0001 ± 0.0001
The error bound is just as large as the estimate of the result!
Subtraction of nearly equal numbers is a major cause of errors!
Avoid subtractive cancellation whenever possible.

27
Avoiding Subtractive Cancellations
Example 1: When x is large, compute:
f(x) = √(x+1) − √x
Is there a way to reduce the errors, assuming that we are using the same number of bits to represent numbers?

Answer: One possible solution is via rationalization:

f(x) = (√(x+1) − √x) × (√(x+1) + √x) / (√(x+1) + √x)
     = 1 / (√(x+1) + √x)
28
Subtraction of Nearly Equal Numbers
Example 2: Compute the roots of ax² + bx + c = 0 using:
x = (−b ± √(b² − 4ac)) / (2a),   when b² >> 4ac

Solve: x² − 26x + 1 = 0
xT(1) = (26 + √(26² − 4)) / 2 = 13 + √168
xT(2) = (26 − √(26² − 4)) / 2 = 13 − √168

29
Example 2 Cont.
Assume a 5-decimal mantissa, √168 = 12.961:
xA(1) = 25.961,   xA(2) = 13.000 − 12.961 = 0.039
Since |Ex(1)| = |Ex(2)| ≤ 0.0005 and
xT(1) = 25.961…,  xT(2) = 0.0385186…

εx(1) = 0.0005 / 25.961 = 1.9 × 10⁻⁵,   εx(2) = 0.0005 / 0.0385186 = 1.3 × 10⁻²

εx(2) >> εx(1) implies that one solution is much more accurate than the other one.
30
Example 2 Cont.
Alternatively, a better solution is:

xA(2) = 13 − √168 = (13 − √168) × (13 + √168) / (13 + √168)
      = 1 / (13 + √168) = 1 / 25.961 = 0.038519

with εx(2) = ε(1/25.961) = εx(1)

i.e., instead of computing
x = (−b − √(b² − 4ac)) / (2a)
we use
x = 4ac / [2a(−b + √(b² − 4ac))] = 2c / (−b + √(b² − 4ac))
as the solution for the second root.
31
Notes
• This formula does NOT give more accurate
result in ALL cases.
• We must be careful when writing numerical
programs.
• A prior estimation of the answer, and the
corresponding error, is needed first. If the
error is large, we must use alternative
methods to compute the solution.

32
Exercise 1
Given f(x) = 1/x − 1/(x+1).
Assume a 3-decimal mantissa with rounding-off:

a) Evaluate f(1000) directly.


b) Evaluate f(1000) as accurate as possible, using an
alternative approach.
c) Find the relative error of calculating f(1000) in
parts (a) and (b).

33
Propagation of Errors in a Series
Let the series be:  S = Σ xi,  for i = 1, …, m

Is there any difference between adding:

(((x1 + x2) +x3) +x4) +…+xm and

(((xm + xm-1) +xm-2) +xm-3) +…+x1 ?

34
Example:  S = Σ xi

#include <stdio.h>

int main() {
    float  sumx, x;
    float  sumy, y;
    double sumz, z;
    int i;
    sumx = 0.0;
    sumy = 0.0;
    sumz = 0.0;
    x = 1.0;
    y = 0.00001;
    z = 0.00001;
    for (i = 0; i < 100000; i++) {
        sumx = sumx + x;
        sumy = sumy + y;
        sumz = sumz + z;
    }
    printf("sumx = %f\n", sumx);
    printf("sumy = %f\n", sumy);
    printf("sumz = %f\n", sumz);
    return 0;
}

Output:
sumx = 100000.000000            (exact)
sumy = 1.000990                 (less accurate)
sumz = 0.99999999999808375506   (more accurate)
35
Exercise 2
Discuss to what extent

(a + b)c = ac + bc

is violated in machine arithmetic.

36
Example: Evaluate e^x as  e^x = 1 + x + x²/2! + x³/3! + … + xⁿ/n! + …
#include <stdio.h>
#include <math.h>

int main() {
float x = 10, sum = 1, term = 1, temp = 0;
int i = 0;

    while (temp != sum) {   // stop when old sum == new sum
        i++;
term = term * x / i;
temp = sum;
sum = sum + term;
printf("%2d %-12f %-14f\n", i, term, sum);
}
printf("exact value = %f\n", exp((double)x));
return 0;
}
37
e^x = 1 + x + x²/2! + x³/3! + … + xⁿ/n! + …
Output (when x = 10):

 i  term          sum
 1  10.000000     11.000000
 2  50.000000     61.000000
 3  166.666672    227.666672
 4  416.666687    644.333374
 5  833.333374    1477.666748
 6  1388.889038   2866.555664
 7  1984.127197   4850.682617
 8  2480.158936   7330.841797
 9  2755.732178   10086.574219
10  2755.732178   12842.306641
11  2505.211182   15347.517578
12  2087.676025   17435.193359
13  1605.904541   19041.097656
14  1147.074585   20188.171875
15  764.716431    20952.888672
16  477.947754    21430.835938
17  281.145752    21711.982422
18  156.192078    21868.173828
19  82.206360     21950.380859
20  41.103180     21991.484375
21  19.572943     22011.056641
22  8.896792      22019.953125
23  3.868171      22023.822266
24  1.611738      22025.433594
25  0.644695      22026.078125
26  0.247960      22026.326172
27  0.091837      22026.417969
28  0.032799      22026.451172
29  0.011310      22026.462891
30  0.003770      22026.466797
31  0.001216      22026.468750
32  0.000380      22026.468750

exact value = 22026.465795

38
Example: Evaluate e^x as  e^x = 1 + x + x²/2! + x³/3! + … + xⁿ/n! + …

#include <stdio.h>
#include <math.h>

int main() {
float x = 10, sum = 1, term = 1, temp = 0;
int i = 0;

    while (temp != sum) {
        i++;
        term = term * x / i;   // arithmetic operations that introduce errors
temp = sum;
sum = sum + term;
printf("%2d %-12f %-14f\n", i, term, sum);
}
printf("exact value = %f\n", exp((double)x));
return 0;
}
39
e^x = 1 + x + x²/2! + x³/3! + … + xⁿ/n! + …
Output (when x = −10):

 i  term          sum
 1  -10.000000    -9.000000
 2  50.000000     41.000000
 3  -166.666672   -125.666672
 4  416.666687    291.000000
 5  -833.333374   -542.333374
 6  1388.889038   846.555664
 7  -1984.127197  -1137.571533
 8  2480.158936   1342.587402
 9  -2755.732178  -1413.144775
10  2755.732178   1342.587402
11  -2505.211182  -1162.623779
12  2087.676025   925.052246
13  -1605.904541  -680.852295
…
29  -0.011310     -0.002908
30  0.003770      0.000862
31  -0.001216     -0.000354
32  0.000380      0.000026
33  -0.000115     -0.000089
34  0.000034      -0.000055
35  -0.000010     -0.000065
36  0.000003      -0.000062
37  -0.000001     -0.000063
38  0.000000      -0.000063
39  -0.000000     -0.000063
40  0.000000      -0.000063
41  -0.000000     -0.000063
42  0.000000      -0.000063
43  -0.000000     -0.000063
44  0.000000      -0.000063
45  -0.000000     -0.000063
46  0.000000      -0.000063

exact value = 0.000045

Not just an incorrect answer: we get a negative value!
40
e^x = 1 + x + x²/2! + x³/3! + … + xⁿ/n! + …
41
Errors vs. Number of Arithmetic Operations

Assume a 3-digit mantissa with rounding-off:

a) Evaluate y = x³ − 3x² + 4x + 0.21 for x = 2.73

b) Evaluate y = [(x − 3)x + 4] x + 0.21 for x = 2.73

Compare and discuss the errors obtained in parts (a) and (b).
42
Summary
• Round-off/chopping errors
– Analysis

• Propagation of errors in arithmetic operations
  – Analysis and calculation

• How to minimize propagation of errors
  – Avoid adding a huge number to a small number
  – Avoid subtracting numbers that are close
  – Minimize the number of arithmetic operations involved

43
