Course3&4

Uploaded by

Sana Brahmi
Probability and Statistics for Engineers

Level: 1st Year Students


Section(s): All groups

Major: Mechanical/Civil Engineering


Chapter 3.

RANDOM VARIABLES AND PROBABILITY DISTRIBUTION

A. DEFINITION

▪ A Random Variable (R.V) is a function that associates a real number with each element of a
sample space. An R.V is either DISCRETE (e.g., nbr. of defects in a production line, nbr. of
female students admitted to medical school, etc.) or CONTINUOUS (e.g., dia. of shafts produced
in a turning center, stress value in a part subjected to a varying torque, normal/planar anisotropy of
sheet metal subjected to a forming process, etc.).

Q3.1: Complete by continuous or discrete. A R.V, which measures/counts,

1) The outlet pressure of a hydraulic pump is ……………………..………………..


2) The nbr. of accidents per year in a forging workshop is…………………………
3) The national height of newborn babies is ………………………………………..
4) The IQ score of the Tunisian primary students is ……….………………………..

Q3.2: An urn contains 4 red balls and 3 black balls. We draw two balls without replacement. Let
X be the R.V that represents the nbr. of red balls drawn. Describe X.

B. CONTINUOUS AND DISCRETE PROB. DIST

▪ For a discrete R.V X, the set of pairs (x, f(x)) is the PROBABILITY FUNCTION (or PROBABILITY
MASS FUNCTION) of X:
$f(x) = p(X = x)$, with $f(x) \ge 0$ and $\sum_x f(x) = 1$
▪ A continuous R.V X has zero prob. of assuming exactly any one of its values; hence, its prob. dist.
cannot be given in tabular form, and probabilities are assigned to intervals through the PROBABILITY
DENSITY FUNCTION f(x):
$p(a < X < b) = \int_a^b f(x)\,dx$, with $f(x) \ge 0$ and $\int_{-\infty}^{+\infty} f(x)\,dx = 1$
▪ CDF: CUMULATIVE DIST. FUNCTION, $F(x) = p(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
(for a discrete R.V, $F(x) = \sum_{t \le x} f(t)$)

Stat & Prob. for Engineers_ENSIT2020/2021 Ali TRABELSI, Dr., Eng., 2


Q3.3: A shipment of 8 PCs contains 3 defective PCs. A school makes a random purchase of 2
PCs. Give, Q1. The prob. dist of the number of defective PCs. Q2. The cumulative prob. for X=1.
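
A minimal sketch of how the probability function and the CDF of Q3.3 could be checked numerically. It assumes the 2 PCs are drawn without replacement, so that the counts follow a hypergeometric pattern; the function names `defective_pmf` and `cdf` are illustrative only.

```python
from fractions import Fraction
from math import comb

def defective_pmf(total=8, defective=3, drawn=2):
    """Return {x: P(X = x)} for the number of defective units in the sample."""
    good = total - defective
    return {
        x: Fraction(comb(defective, x) * comb(good, drawn - x), comb(total, drawn))
        for x in range(0, min(defective, drawn) + 1)
    }

def cdf(pmf, x):
    """F(x) = P(X <= x), obtained by summing the probability function."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

pmf = defective_pmf()
for x, p in pmf.items():
    print(f"f({x}) = {p}")
print("F(1) =", cdf(pmf, 1))
```

Exact fractions are used so that the condition ∑ f(x) = 1 can be verified without rounding error.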

Q3.4: In a fan assembling line, the proportion of defective products can be described using a
continuous R.V, let it be X,
��(�� + ��)
��(��) { ��������
��������
���� < ��
< 1 ��

Q1. Is f(x) a probability density function?


Q2. Calculate Prob. (¼<X<½).
Q3. Calculate F(⅔).

Q3.5: X is the error measurement of a physical quantity given by the density function

��(��) {��(�� − ����) �� < �� < 1


�� ����������������

Q1. Determine k so that f(x) is a valid density function.


Q2. Calculate Prob. (X≤0.5) and Prob. (X ≥ 0.8).
Q3. What is Prob. (0.85≤X≤0.93)?

C. JOINT PROB. DIST.(J.P.D)

▪ In some situations we need to record the outcomes of numerous RVs., SIMULTANEOUSLY, e.g.,
the hardening capacity, Hc, the tensile stress (Ts), and the Yield stress (Ys) occurring in a given
material subjected to forming force. This results in a 3D sample space (Hc, Ts, Ys) and the joint
prob. dist. f (Hc, Ts, Ys) is required.

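Given two RVs, X and Y.

Discrete RV: $f(x, y) = \text{Prob.}(X = x, Y = y) \ge 0$ and $\sum_x \sum_y f(x, y) = 1$

Continuous RV: $f(x, y) \ge 0$ and $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$

For any region A of the (x, y) plane, $\text{Prob.}[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy$

Note,
The J.P.D, f(x, y), is a surface lying above the (x, y) plane. In the discrete case, for a region A,
$\text{Prob.}[(X, Y) \in A] = \sum\sum_A f(x, y)$.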

Q3.6: Two ATC systems are randomly selected from a company that commercializes three
brands of ATCs (3 ATCs of brand A, 2 ATCs of brand B and 3 ATCs of brand C). Let the R.Vs X and Y
be the nbr. of selected ATCs of brand A and of brand B, respectively.

Q1. Find the J.P.F, f(x, y) = Prob. (X = x, Y = y).
Q2. Find Prob. {(X, Y) ϵ A}, where A = {(x, y): 2x − y ≥ 1}.
Q3. Find Prob. {(X, Y) ϵ D}, where D = {(x, y): x² + (y − 1)² ≥ ¼}.

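Q3.7: Given
$f(x, y) = \begin{cases} \frac{2}{5}(2x + 3y), & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0, & \text{elsewhere} \end{cases}$

Q1. Prove that $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$.
Q2. Give Prob. {(X, Y) ϵ A}, where A = {0 ≤ x ≤ ½ and ¼ ≤ y ≤ ½}.
Q3. Give Prob. {(X, Y) ϵ A}, where A = {(x − 1/2)² + (y − 1/2)² ≥ 1/4}.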
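
The first two questions of Q3.7 can be sanity-checked numerically. The sketch below assumes the density reads f(x, y) = (2/5)(2x + 3y) on the unit square and 0 elsewhere, and uses a plain midpoint rule; `double_integral` is a hypothetical helper, not part of the course material.

```python
def f(x, y):
    """Assumed Q3.7 density: (2/5)(2x + 3y) on the unit square, 0 elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return 0.4 * (2.0 * x + 3.0 * y)
    return 0.0

def double_integral(func, x_lo, x_hi, y_lo, y_hi, n=200):
    """Midpoint-rule approximation of the double integral over a rectangle."""
    hx = (x_hi - x_lo) / n
    hy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        for j in range(n):
            y = y_lo + (j + 0.5) * hy
            total += func(x, y)
    return total * hx * hy

total_mass = double_integral(f, 0.0, 1.0, 0.0, 1.0)   # should be close to 1
p_rect = double_integral(f, 0.0, 0.5, 0.25, 0.5)      # Prob. over the rectangle of Q2
print(total_mass, p_rect)
```

Because the integrand is linear in x and y, the midpoint rule is exact up to floating-point rounding; for a curved density or a circular region such as the one in Q3, a finer grid or an analytical integration would be needed.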

D. MARGINAL DISTRIBUTION

▪ Given X and Y two RVs. The MARGINAL DISTRIBUTION functions g(x) and h(y) are obtained by
summing/integrating f(x, y) over the values of Y and X, respectively.

Discrete RV: $g(x) = \sum_y f(x, y)$ and $h(y) = \sum_x f(x, y)$

Continuous RV: $g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$ and $h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$

▪ Let X and Y be two RVs. The CONDITIONAL DISTRIBUTION of Y given X = x is,

$f(y \mid x) = \dfrac{f(x, y)}{g(x)}$; g(x) > 0

The CONDITIONAL DISTRIBUTION of X given Y = y is,

$f(x \mid y) = \dfrac{f(x, y)}{h(y)}$; h(y) > 0

Q3.8: For questions Q3.6 and Q3.7, calculate the marginal distribution, g(x) and h(y) of the
RVs, X, and Y.

Q3.9: Two refills of an HP color printer are selected at random from a box that contains 3
blue refills, 2 red refills, and 3 green refills. Let X and Y be the number of blue and green refills
selected, respectively.

Q1. Find the J.P.F, f(x, y).
Q2. Prob. {(X, Y) ϵ A, x+y ≤ 1}.
Q3. Find f(x|1) for x=0,1 and 2.
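
A possible way to tabulate the joint probability function of Q3.9 and derive the requested quantities is sketched below. It assumes the two refills are drawn without replacement and that f(x|1) means the conditional distribution of X given Y = 1; the constant and function names are illustrative.

```python
from fractions import Fraction
from math import comb

BLUE, RED, GREEN, DRAWN = 3, 2, 3, 2

def joint_pmf():
    """f(x, y) for x blue and y green refills among the DRAWN selected."""
    total = comb(BLUE + RED + GREEN, DRAWN)
    table = {}
    for x in range(DRAWN + 1):
        for y in range(DRAWN + 1 - x):
            ways = comb(BLUE, x) * comb(GREEN, y) * comb(RED, DRAWN - x - y)
            table[(x, y)] = Fraction(ways, total)
    return table

f = joint_pmf()

# Prob. over the region x + y <= 1: sum f(x, y) over the points of the region.
p_region = sum(p for (x, y), p in f.items() if x + y <= 1)

# Marginal h(1), then the conditional distribution f(x | y = 1) = f(x, 1) / h(1).
h1 = sum(p for (x, y), p in f.items() if y == 1)
cond = {x: f.get((x, 1), Fraction(0)) / h1 for x in range(DRAWN + 1)}

print(p_region, cond)
```

The conditional values sum to 1 over x, which is a quick check that the division by h(1) was done with the right marginal.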

Q3.10: The J.P.D for two R.Vs X and Y is given by

$f(x, y) = \begin{cases} 10xy^2, & 0 < x < y < 1 \\ 0, & \text{elsewhere} \end{cases}$


Q1. Find the marginal density functions g(x) and h(y). Find f(y|x)
Q2. Find f(Y>½|X=0.25).
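
As a worked sketch for Q3.10, assuming the density reads f(x, y) = 10xy² on 0 < x < y < 1 and that Q2 asks for Prob. (Y > ½ | X = 0.25), the marginal and conditional definitions above give:

```latex
\begin{align*}
g(x) &= \int_x^1 10xy^2\,dy = \tfrac{10}{3}\,x\,(1 - x^3), && 0 < x < 1,\\
h(y) &= \int_0^y 10xy^2\,dx = 5y^4, && 0 < y < 1,\\
f(y \mid x) &= \frac{f(x,y)}{g(x)} = \frac{3y^2}{1 - x^3}, && 0 < x < y < 1,\\
P\!\left(Y > \tfrac12 \,\middle|\, X = \tfrac14\right)
  &= \int_{1/2}^{1} \frac{3y^2}{1 - (1/4)^3}\,dy
   = \frac{1 - 1/8}{63/64} = \frac{8}{9}.
\end{align*}
```

Note that the limits of integration follow the triangular support 0 < x < y < 1, not the full unit square.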

E. STATISTICAL INDEPENDENCE

Given X, Y two RVs having f(x, y), g(x), and h(y) as joint prob. dist., marginal dist. in X and
marginal dist. in Y, resp.; X and Y are STATISTICALLY INDEPENDENT if and only if,

f(x, y) = g(x). h(y) for all (x, y) within their range

Q3.11: Are the RVs X, and Y of Q3.6 and Q3.7 statistically independent?
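
For a discrete pair, the independence condition can be tested point by point. The sketch below does this for the setting of Q3.6, assuming the two ATCs are drawn without replacement; `joint_pmf`, `marginals` and `is_independent` are hypothetical helper names.

```python
from fractions import Fraction
from math import comb

def joint_pmf(n_a=3, n_b=2, n_c=3, drawn=2):
    """f(x, y): x units of brand A and y units of brand B among those drawn."""
    total = comb(n_a + n_b + n_c, drawn)
    return {
        (x, y): Fraction(comb(n_a, x) * comb(n_b, y) * comb(n_c, drawn - x - y), total)
        for x in range(drawn + 1)
        for y in range(drawn + 1 - x)
    }

def marginals(f):
    """Return (g, h): the marginal distributions of X and Y."""
    g, h = {}, {}
    for (x, y), p in f.items():
        g[x] = g.get(x, Fraction(0)) + p
        h[y] = h.get(y, Fraction(0)) + p
    return g, h

def is_independent(f):
    """True only if f(x, y) = g(x) h(y) at every point of the range."""
    g, h = marginals(f)
    return all(f.get((x, y), Fraction(0)) == g[x] * h[y] for x in g for y in h)

f = joint_pmf()
print(is_independent(f))
```

A single point where f(x, y) ≠ g(x)·h(y) is enough to conclude that X and Y are not statistically independent.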
F. GENERALIZATION

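▪ Let f(x1, x2, .., xn) be the J.P.F of the RVs X1, X2, .., Xn.

Marginal distribution of X1

$g(x_1) = \sum_{x_2} \cdots \sum_{x_n} f(x_1, x_2, \ldots, x_n)$ (discrete)

$g(x_1) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, x_2, \ldots, x_n)\,dx_2\,dx_3 \cdots dx_n$ (continuous)

Joint marginal distribution of X1 and X2

$g(x_1, x_2) = \sum_{x_3} \cdots \sum_{x_n} f(x_1, x_2, \ldots, x_n)$ (discrete)

$g(x_1, x_2) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, x_2, \ldots, x_n)\,dx_3\,dx_4 \cdots dx_n$ (continuous)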

Joint conditional distribution of X1, X2 and X3 given X4 = x4, X5 = x5, …, Xn = xn

$f(x_1, x_2, x_3 \mid x_4, x_5, \ldots, x_n) = \dfrac{f(x_1, x_2, \ldots, x_n)}{g(x_4, x_5, \ldots, x_n)}$

Statistical independence of X1, X2, .., Xn having a J.P.F f(x1, x2, .., xn)

$f(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f_i(x_i)$

where fi(xi) is the marginal dist. of Xi.

G. MEANS AND VARIANCES OF RANDOM VARIABLES


▪ MEANS: Given an R.V, X, its average value is the mean/mathematical expectation/expected
value of X and it is denoted E(X). The expected value of G(X) is as follows,

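Discrete RV: $\mu_{G(X)} = E(G(X)) = \sum_x G(x) f(x)$

Continuous RV: $\mu_{G(X)} = E(G(X)) = \int_{-\infty}^{\infty} G(x) f(x)\,dx$

Other useful formulas using the marginal functions, when X and Y have the J.P.D f(x, y):

$E(G(X)) = \sum_x \sum_y G(x) f(x, y) = \sum_x G(x) g(x)$ (discrete)
$E(G(X)) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} G(x) f(x, y)\,dx\,dy = \int_{-\infty}^{\infty} G(x) g(x)\,dx$ (continuous)

$E(G(Y)) = \sum_x \sum_y G(y) f(x, y) = \sum_y G(y) h(y)$ (discrete)
$E(G(Y)) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} G(y) f(x, y)\,dx\,dy = \int_{-\infty}^{\infty} G(y) h(y)\,dy$ (continuous)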

Here g(x) and h(y) are the marginal distributions of the RVs X, Y.

Q3.12: X a discrete R.V which represents the number of PCs sold each Saturday from 4:00 to
9:00 P.M.
X=x            4     5     6    7    8    9
f(x)=P(X=x)    1/12  1/12  1/4  1/4  1/6  1/6

Let G(X) =2X-1 the money in USD paid to the attendant by the shop manager. Find the
attendant’s expected earnings during this period.
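
A short sketch of the discrete expectation formula applied to Q3.12, using the probabilities 1/12, 1/12, 1/4, 1/4, 1/6, 1/6 for x = 4, …, 9 from the table; the helper name `expectation` is illustrative.

```python
from fractions import Fraction

# Probability function of Q3.12: number of PCs sold, x = 4, ..., 9.
pmf = {
    4: Fraction(1, 12), 5: Fraction(1, 12),
    6: Fraction(1, 4),  7: Fraction(1, 4),
    8: Fraction(1, 6),  9: Fraction(1, 6),
}

def expectation(pmf, g=lambda x: x):
    """E[G(X)] = sum over x of G(x) f(x)."""
    return sum(g(x) * p for x, p in pmf.items())

mean_x = expectation(pmf)
earnings = expectation(pmf, lambda x: 2 * x - 1)   # G(X) = 2X - 1, in USD
print(mean_x, earnings, float(earnings))
```

Since G is linear here, the same result follows from E(2X − 1) = 2E(X) − 1, which gives a quick cross-check.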

Q3.13: Given X a RV has the density function.

����
��(��) { �������
�������
��− �� < ����
�� < 2 ��
Find the expected value of G(X) = 4X+3.

▪ Given X, Y two RVs with a J.P.D f(x, y), the mean of G(X, Y) is,

Discrete RV: $\mu_{G(X,Y)} = E(G(X, Y)) = \sum_x \sum_y G(x, y) f(x, y)$

Continuous RV: $\mu_{G(X,Y)} = E(G(X, Y)) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} G(x, y) f(x, y)\,dx\,dy$

Q3.14: For Q3.6, find the expected value of G(X, Y) =XY.

Q3.15: Find E(Y/X) for the density function.

��(�� + ������)
��(��, ��) { ������������
������
���� < �� < 2, 0 <
�� < 1 ��

▪ VARIANCES: The mean or expected value of a R.V., X, describes where the probability
distribution is centered (Location). The variance describes the shape and spread of the
distribution, that is, it characterizes the data variability about the mean. Let X be a R.V with
P.D.F, f(x). The variance of the RV G(X) is,

Discrete RV: $\sigma^2_{G(X)} = E\{[G(X) - \mu_{G(X)}]^2\} = \sum_x [G(x) - \mu_{G(X)}]^2 f(x)$

Continuous RV: $\sigma^2_{G(X)} = E\{[G(X) - \mu_{G(X)}]^2\} = \int_{-\infty}^{\infty} [G(x) - \mu_{G(X)}]^2 f(x)\,dx$

Q3.16: Calculate the variance of G(X)=2X+3 when f(x) is given by the table below

X=x 0 1 2 3
f(x) 0.51 0.38 0.10 0.01
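
The variance formula for a discrete R.V can be applied directly to the table of Q3.16. The sketch below is one way to do it; `mean` and `variance` are illustrative helper names.

```python
# Probability function of Q3.16.
pmf = {0: 0.51, 1: 0.38, 2: 0.10, 3: 0.01}

def mean(pmf, g):
    """E[G(X)] = sum of G(x) f(x)."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf, g):
    """Var[G(X)] = sum of (G(x) - mu_G)^2 f(x)."""
    mu = mean(pmf, g)
    return sum((g(x) - mu) ** 2 * p for x, p in pmf.items())

def G(x):
    return 2 * x + 3

var_x = variance(pmf, lambda x: x)
var_g = variance(pmf, G)
print(round(var_x, 4), round(var_g, 4))
```

The result agrees with the linear-combination rule Var(aX + b) = a² Var(X): the constant 3 does not contribute and the factor 2 enters squared.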

Q3.17: X is a R.V having the density function f(x) below. Is f(x) a P.D.F.? If not, suggest a corrected
form, then calculate the mean and the variance of G(X) = 4X + 3.
�� ��
��(��) { ���������
���� − ���������
������(��)

���� < �� <

▪ The covariance, σXY, of two R.Vs, X and Y, is a scale-dependent (not scale-free) measure of
the nature of the association between X and Y. A positive covariance means that X and Y tend to
increase together; a negative covariance means that one tends to decrease as the other increases.
When X and Y are statistically independent, Cov. (X, Y) = 0. The converse does not hold:
Cov. (X, Y) = 0 only rules out a linear relationship (a nonlinear one may still exist) and does not
automatically imply independence. A more widely used, scale-free measure of the strength of the
linear relationship between X and Y is the correlation coefficient ρXY.

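Let X, Y be two RVs having f(x, y) as a J.P.D.

Discrete RV:
$\sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X\mu_Y = \sum_x \sum_y (x - \mu_X)(y - \mu_Y) f(x, y)$

Continuous RV:
$\sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X\mu_Y = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y) f(x, y)\,dx\,dy$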

Q3.18: Refer to Q3.6 and Q3.7. Calculate Cov. (X, Y).

Q3.19: The fraction of totally nonreworkable (X) and reworkable (Y) parts in a production line
is given by the J.P.D.

��(��, ��) {������ �� < �� < �� < 1


�� ������������������

Q1. Find the marginal density function g(x) and h(y).


Q2. Find E(X), E(Y), and E(XY).
Q3. Find the covariance of X and Y.

▪ Given X and Y two R.Vs having σXY as Cov. and σX and σY as Std Dev.,
respectively. The correlation coef., ρXY, of X and Y is given below,

$\rho_{XY} = \dfrac{\sigma_{XY}}{\sigma_X \sigma_Y}$, with $-1 \le \rho_{XY} \le 1$

Q3.20: Calculate ρxy for Q3.6 and Q3.7.
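
The covariance and correlation formulas can be evaluated exactly for the discrete case of Q3.6. The sketch below assumes, as an interpretation of that exercise, that the two ATCs are drawn without replacement from 3 units of brand A, 2 of brand B and 3 of brand C; the helper `expect` is illustrative.

```python
from fractions import Fraction
from math import comb, sqrt

# Joint pmf of Q3.6 (X: brand A count, Y: brand B count among 2 units drawn).
f = {
    (x, y): Fraction(comb(3, x) * comb(2, y) * comb(3, 2 - x - y), comb(8, 2))
    for x in range(3)
    for y in range(3 - x)
}

def expect(func):
    """E[func(X, Y)] = double sum of func(x, y) f(x, y)."""
    return sum(func(x, y) * p for (x, y), p in f.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
cov_xy = expect(lambda x, y: x * y) - mu_x * mu_y      # E(XY) - mu_X mu_Y
var_x = expect(lambda x, y: x * x) - mu_x ** 2
var_y = expect(lambda x, y: y * y) - mu_y ** 2
rho_xy = float(cov_xy) / sqrt(float(var_x * var_y))

print(cov_xy, round(rho_xy, 4))
```

The negative sign is expected in this setting: with only two units drawn, selecting more units of brand A leaves less room for brand B.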

▪ Means and variances of LINEAR COMBINATIONS of RV both discrete and continuous are given
below.
$E(aX + b) = aE(X) + b$
$E[G(X, Y) \pm H(X, Y)] = E[G(X, Y)] \pm E[H(X, Y)]$
Given X, Y stat. indep., $E(XY) = E(X)E(Y)$
$\sigma^2_{aX+b} = a^2 \sigma_X^2$
$\sigma^2_{aX+bY} = a^2 \sigma_X^2 + b^2 \sigma_Y^2 + 2ab\,\sigma_{XY}$

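▪ Given X a R.V and Y = G(X) nonlinear. The Taylor series approximation of G(X) around
X = E(X) = μX is:

$G(x) = G(\mu_X) + \left.\dfrac{\partial G(x)}{\partial x}\right|_{x=\mu_X}(x - \mu_X) + \left.\dfrac{\partial^2 G(x)}{\partial x^2}\right|_{x=\mu_X}\dfrac{(x - \mu_X)^2}{2} + \cdots$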

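If we truncate after the linear term and take the expected value of both sides, we obtain
$E[G(X)] \approx G(\mu_X)$, since $E(X - \mu_X) = 0$.

For nonlinear cases, keeping the second-order term gives
$E[G(X)] \approx G(\mu_X) + \dfrac{\sigma_X^2}{2}\left[\dfrac{\partial^2 G(X)}{\partial X^2}\right]_{X=\mu_X}$

$\mathrm{Var}[G(X)] \approx \left[\dfrac{\partial G(X)}{\partial X}\right]^2_{X=\mu_X} \sigma_X^2$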

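Given X1, X2, .., Xk a set of k independent RVs with means μ1, μ2, .., μk and variances
σ1², σ2², .., σk², resp. Let Y = H(X1, X2, .., Xk) be a nonlinear function, then,

$E(Y) \approx H(\mu_1, \mu_2, \ldots, \mu_k) + \sum_{i=1}^{k} \dfrac{\sigma_i^2}{2}\left[\dfrac{\partial^2 H(x_1, x_2, \ldots, x_k)}{\partial x_i^2}\right]_{x_i=\mu_i}$

$\mathrm{Var}(Y) \approx \sum_{i=1}^{k} \left[\dfrac{\partial H(x_1, x_2, \ldots, x_k)}{\partial x_i}\right]^2_{x_i=\mu_i} \sigma_i^2$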

Q3.21: Given the RV, X, with μX and σX². Give the second-order approximation to
E[exp(X)].
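
The second-order approximation can be explored numerically. The sketch below applies it to G(X) = exp(X) and compares it with the exact value exp(μ + σ²/2), which holds only under the extra assumption (not made in the exercise) that X is normally distributed; the numerical values of μ and σ² are arbitrary illustrations.

```python
import math

def second_order_mean(G, d2G, mu, var):
    """E[G(X)] ~ G(mu) + (var / 2) * G''(mu)."""
    return G(mu) + 0.5 * d2G(mu) * var

def first_order_variance(dG, mu, var):
    """Var[G(X)] ~ (G'(mu))^2 * var."""
    return dG(mu) ** 2 * var

# Illustrative values only: mu = 0, sigma^2 = 0.01.
mu, var = 0.0, 0.01

# For G = exp, every derivative is exp itself.
approx_mean = second_order_mean(math.exp, math.exp, mu, var)
approx_var = first_order_variance(math.exp, mu, var)

# Exact mean of exp(X) when X is normal (assumption used only for comparison).
exact_if_normal = math.exp(mu + var / 2.0)

print(approx_mean, exact_if_normal, approx_var)
```

The approximation degrades as σ² grows, because the neglected higher-order terms of the Taylor series are no longer small.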

Q3.22: Given the RVs, X and Y, with μX, μY, σX² and σY². Give the
approximation to E[X/Y] and Var [X/Y].

H. CHEBYSHEV’S THEOREM

▪ The probability that a discrete/continuous RV, X, will assume a value within k Std. Dev. of its
mean is at least (1 − 1/k²):
$\text{Prob.}(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \dfrac{1}{k^2}$

The Chebyshev theorem holds for ANY DISTRIBUTION of observations and, for this reason, the
bound it gives is usually conservative (weak): it is only a lower bound. The exact probability can
be determined only when the probability distribution is known.

Q3.23: A RV, X, has mean μ = 8 and variance σ² = 9; the prob. dist. function is unknown. Find
out,

Q1. Prob. (−4 < X < 20).
Q2. Prob. (|X − 8| ≥ 6).
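
A small sketch of how the Chebyshev bound could be evaluated for Q3.23, assuming μ = 8 and σ = 3. The function name `chebyshev_lower_bound` is illustrative, and it only handles intervals that are symmetric about the mean.

```python
from fractions import Fraction

def chebyshev_lower_bound(mu, sigma, lo, hi):
    """Lower bound on Prob.(lo < X < hi) for an interval mu - k*sigma, mu + k*sigma."""
    k = Fraction(hi - mu, sigma)
    if Fraction(mu - lo, sigma) != k:
        raise ValueError("interval must be symmetric about the mean")
    if k <= 1:
        return Fraction(0)          # the bound is not informative for k <= 1
    return 1 - 1 / k ** 2

# Q1: -4 < X < 20 corresponds to k = 4.
bound_q1 = chebyshev_lower_bound(8, 3, -4, 20)

# Q2: |X - 8| >= 6 is the complement of 2 < X < 14 (k = 2), so it is bounded above.
upper_q2 = 1 - chebyshev_lower_bound(8, 3, 2, 14)

print(bound_q1, upper_q2)
```

Both results are bounds (at least 15/16 and at most 1/4), not exact probabilities, since the distribution of X is unknown.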

I. PROJECT

Using Mathematica 11.3, find out,


(Items 1 to 14 are single integrals and items 15 to 20 are double integrals; their integrands
and limits are not legible in this copy.)


Using Mathematica 11.3, find out the derivatives of,

(Items 1 to 7; the expressions are not legible in this copy.)

Chapter 4.

LINEAR CORRELATION AND REGRESSION

A. INTRODUCTION

▪ In many engineering and quality control problems, the relationship between two or more R.Vs
must be estimated, e.g., how does tool life vary with the cutting speed and the DOC? How does
the octane number of gasoline vary with % purity? How does babies' weight vary with age and
sex? How does the stress in a section of a steel shaft vary when a twisting torque is applied? etc.

▪ The statistical relationship between variables is expressed using a REGRESSION/LS/CURVE
FITTING approach. Variables may be CONTROLLABLE or RANDOM. Note that, in most engineering
problems, variables are classified as either RESPONSE/PREDICTED/DEPENDENT/OUTPUT/CRITICAL
variables or PREDICTOR/EXPLANATORY/INDEPENDENT/INPUT/NONCRITICAL variables.

▪ In engineering, regression equations may serve, among other purposes, for: i)
PREDICTION, ii) DESCRIPTION OF THE STRENGTH OF THE RELATIONSHIP BETWEEN VARIABLES, iii)
finding out the IMPORTANT INDEPENDENT VARIABLES, iv) INTERPOLATION between values of a
function, v) determination of the OPTIMUM OPERATING CONDITIONS, vi) discrimination between
ALTERNATIVE MODELS, and/or vii) estimation of the REGRESSION COEFFICIENTS.

Q4.1: During 6 working days, a company kept records of absent workers and defective parts
(see Table).

Day                          1   2   3   4   5   6
X: Nbr. of absent workers    3   5   0   1   2   6
Y: Nbr. of defect. parts     15  22  7   12  20  30

Q1. Plot the scatter diagram/cross plot Y = f(X) of the bi-variate data.
Q2. Comment on the distribution of the dataset (Xi, f(Xi)).

B. SIMPLE LINEAR REGRESSION

When only one independent variable of the system is of interest (other variables are either held
constant or their effect on the response variable is assumed to be small), the problem is a simple
linear regression.

▪ The linear model to be considered is $Y = \alpha + \beta X + \varepsilon$, where ε denotes the
random error due to errors in measuring the Ys, the effects of the non-included variables (noise
factors), and RANDOM independent Xs (errors arising with planned data are assumed negligible).

▪ The purpose of regression is to make predictions about the Yi for some Xi OVER THE RANGE OF
THE EXPERIMENT DATA. The prediction equation is $\hat{y} = a + bx$, where $\hat{y}$ is the
predicted value of Y for a given X, and a and b are the L.S estimates of the parameters α and β of
the model $Y = \alpha + \beta X + \varepsilon$.

[Figure: regression line $\hat{y} = a + bx$ with the residual $e_i = y_i - \hat{y}_i = y_i - (a + b x_i)$ at the data point (Xi, Yi).]

▪ The L.S method gives the best LINEAR UNBIASED estimates of the parameters 'a' and 'b'. One
way to determine the coefs. 'a' and 'b' is to minimize $\sum_{i=1}^{n} |y_i - \hat{y}_i|^2$. Note, no
assumption has been made as to the distribution of the random error, ε.

▪ Whereas REGRESSION is about the form of the relationship given by the equation of the
regression line, the CORRELATION determines the STRENGTH of the linear relationship between the
two variables. Even though the concept of correlation is meaningful when both variables, X and
Y, are random, the LS estimate of parameter ‘b’ (rate of change of Y per unit change in X) has
meaning for both random and controlled/fixed X.

Suppose X and Y are NORMALLY DISTRIBUTED; the product-moment correlation coefficient, r, is
given below:

$r = \dfrac{S_{XY}}{\sqrt{S_{XX} S_{YY}}}$, $r \in [-1, 1]$

$S_{XX} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - n\bar{x}^2$
$S_{YY} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - n\bar{y}^2$
$S_{XY} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - n\bar{x}\bar{y}$

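The regression line of Y on X is, then, $\hat{y} = a + bx = \left(\bar{y} - \dfrac{S_{xy}}{S_{xx}}\bar{x}\right) + \dfrac{S_{xy}}{S_{xx}}x$

The regression line of X on Y is, then, $\hat{x} = a + by = \left(\bar{x} - \dfrac{S_{xy}}{S_{yy}}\bar{y}\right) + \dfrac{S_{xy}}{S_{yy}}y$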

Note, the prediction equation is sometimes expressed in terms of deviation from the average,
hence, ŷ = y̅ + b(x − x̅ )
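
The formulas for SXX, SYY, SXY, r and the regression line of Y on X can be sketched in a few lines. The example below uses the data of Q4.1; the function name `least_squares` is illustrative.

```python
from math import sqrt

# Data of Q4.1: absent workers (x) and defective parts (y) over 6 days.
x = [3, 5, 0, 1, 2, 6]
y = [15, 22, 7, 12, 20, 30]

def least_squares(x, y):
    """Return (a, b, r) for the line y_hat = a + b x and the correlation r."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum(xi * xi for xi in x) - n * x_bar ** 2
    syy = sum(yi * yi for yi in y) - n * y_bar ** 2
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
    b = sxy / sxx                 # slope
    a = y_bar - b * x_bar         # intercept: the line passes through (x_bar, y_bar)
    r = sxy / sqrt(sxx * syy)     # product-moment correlation coefficient
    return a, b, r

a, b, r = least_squares(x, y)
print(f"y_hat = {a:.2f} + {b:.3f} x, r = {r:.3f}")
```

The values obtained this way can be compared with the Minitab output for Q4.1 given in the appendix (Y41 = 8.52 + 3.230 x41, Pearson correlation 0.922).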

▪ After calculation of the prediction equation, the equation should be plotted over the data: i)
roughly half the data points should be above the line and half below it, ii) the line should pass
through $(\bar{x}, \bar{y})$, and iii) the scatter plot may also indicate the presence of outliers
(observations (Xi, Yi) which deviate substantially from the rest of the data).

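▪ An unbiased estimate of the population variance (σ²) is

$s^2 = \dfrac{SSE}{n - 2} = \dfrac{S_{yy} - b\,S_{xy}}{n - 2}$

A 100(1−α)% CI for the parameter β (slope) is

$b - t_{\alpha/2}\dfrac{s}{\sqrt{S_{xx}}} < \beta < b + t_{\alpha/2}\dfrac{s}{\sqrt{S_{xx}}}$

A 100(1−α)% CI for the parameter α (intercept) is

$a - t_{\alpha/2}\, s\sqrt{\dfrac{\sum_{i=1}^{n} x_i^2}{n S_{xx}}} < \alpha < a + t_{\alpha/2}\, s\sqrt{\dfrac{\sum_{i=1}^{n} x_i^2}{n S_{xx}}}$

Here $t_{\alpha/2}$ is the critical value of the t-distribution with n−2 dof; in "100(1−α)%" and
$t_{\alpha/2}$, α is the significance level and should not be confused with the intercept α.

The statistic for testing H0: β = β0 vs H1: β ≠ β0 is

$t = \dfrac{b - \beta_0}{s/\sqrt{S_{xx}}}$ with n−2 dof

The statistic for testing H0: α = α0 vs H1: α ≠ α0 is

$t = \dfrac{a - \alpha_0}{s\sqrt{\sum_{i=1}^{n} x_i^2/(n S_{xx})}}$ with n−2 dof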
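
The interval and test formulas above can be sketched as follows for the data of Q4.1. The critical value t(0.025; 4 dof) = 2.776 is taken from a standard t-table and passed in as a parameter, since the Python standard library has no t-distribution; the function name `simple_regression_inference` is illustrative.

```python
from math import sqrt

def simple_regression_inference(x, y, t_crit, beta0=0.0):
    """s, standard errors, CIs and the t statistic for H0: beta = beta0."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    s = sqrt((syy - b * sxy) / (n - 2))                        # s^2 = SSE / (n - 2)
    se_b = s / sqrt(sxx)
    se_a = s * sqrt(sum(xi * xi for xi in x) / (n * sxx))
    return {
        "a": a, "b": b, "s": s, "se_a": se_a, "se_b": se_b,
        "ci_beta": (b - t_crit * se_b, b + t_crit * se_b),
        "ci_alpha": (a - t_crit * se_a, a + t_crit * se_a),
        "t_beta": (b - beta0) / se_b,
    }

x = [3, 5, 0, 1, 2, 6]
y = [15, 22, 7, 12, 20, 30]
res = simple_regression_inference(x, y, t_crit=2.776)   # 95% level, n - 2 = 4 dof
print(res)
```

The values of s, the standard errors and the t statistic can be compared with the S, SE Coef and T-Value columns of the Minitab output for Q4.1 in the appendix.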

Plot1 Plot2 Plot3 Plot4 Plot5
Pearson 0.94 -0.90 0.23 0.33 0.74
Spearman 0.88 -0.85 0.18 0.42 0.89

Note: Data falling close to the regression line means a strong correlation between X and Y.

Q4.2: For Q4.1 calculate the product-moment correlation coefficient as well as the regression
line.

Q4.3: The summary data for 33 pairs of (x, Y) values are as follows (see data set below)

x: 3 7 11 15 18 27 29 30 30 31 31 32 33 33 34 36 36 36 37 38 39 39 39 40 41 42 42 43 44 45 46 47 50
Y: 5 11 21 16 16 28 27 25 35 30 40 32 34 32 34 37 38 34 36 38 37 36 45 39 41 40 44 37 44 46 46 49 51

$\sum x_i = 1104$, $\sum y_i = 1124$, $\sum x_i y_i = 41355$, $\sum x_i^2 = 41086$, $\sum y_i^2 = 41998$

Q1. Calculate the product-moment correlation coefficient.


Q2. Find the equation of the LS regression line of y on x.

Q3. Using interpolation within x ϵ [3, 50], find out the estimates ŷ(x = 20) and ŷ(x = 53).
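
When only the summary sums are available, as in Q4.3, the correlation coefficient and the regression line follow directly from them. The sketch below uses the five sums quoted in the exercise; `predict` and `in_range` are illustrative helper names.

```python
from math import sqrt

# Summary data of Q4.3 (n = 33 pairs).
n = 33
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = 1104, 1124, 41355, 41086, 41998

x_bar, y_bar = sum_x / n, sum_y / n
sxx = sum_x2 - n * x_bar ** 2
syy = sum_y2 - n * y_bar ** 2
sxy = sum_xy - n * x_bar * y_bar

b = sxy / sxx                 # slope
a = y_bar - b * x_bar         # intercept
r = sxy / sqrt(sxx * syy)     # product-moment correlation coefficient

def predict(x0):
    """Estimate y_hat at x0 from the fitted line."""
    return a + b * x0

def in_range(x0, lo=3, hi=50):
    """True when x0 lies inside the range of the experimental data."""
    return lo <= x0 <= hi

print(f"r = {r:.3f}, y_hat = {a:.2f} + {b:.4f} x")
print(predict(20), in_range(20), in_range(53))
```

Note that x = 53 lies outside [3, 50], so the corresponding estimate is an extrapolation and should be treated with more caution than the value at x = 20.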

Q4.4: A planned experiment has considered a process response (Y: Tool Life in min) against a
controlled factor (X: Cutting Speed in m/min). The summary data is given below.

Test 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
x    (values not legible in this copy)
Y    (values not legible in this copy)

Q1. Calculate the product-moment correlation coefficient.


Q2. Find the equation of the LS regression line of y on x.
Q3. Plot the scatter plot and the LS regression line of y on x. Comment.
Q4. Using the LS regression line, find the cutting speed, which minimizes the tool wear.

Q4.5: A planned experiment has considered a process response (Y: Octane Number) against a
random variable (X: % Purity). The summary data are given below.

Test  1     2     3     4     5     6     7     8     9     10    11
X     99.8  99.7  99.6  99.5  99.4  99.3  99.2  99.1  99.0  98.9  98.8
Y     88.6  86.4  87.2  88.4  87.2  86.8  86.1  87.3  86.4  86.6  87.1

Q1. Calculate the product-moment correlation coefficient.


Q2. Find the equation of the LS regression line of y on x.
Q3. Plot the scatter plot and the LS regression line of y on x. Comment.
Q4. Using interpolation between data, find the estimated Octane Number when the %
purity is 99.45% and 98.6%, respectively.

A. Appendix Minitab (Q4.1, Q4.3 and Q4.4)
Q4.1

[Figure: Scatterplot of Y41 vs x41]

Correlation: x41, Y41
Pearson correlation of x41 and Y41 = 0.922
P-Value = 0.009

Regression Analysis: Y41 versus x41

Analysis of Variance
Source      DF  Adj SS  Adj MS  F-Value  P-Value
Regression  1   279.92  279.92  22.66    0.009
  x41       1   279.92  279.92  22.66    0.009
Error       4   49.42   12.35
Total       5   329.33

Model Summary
S        R-sq    R-sq(adj)  R-sq(pred)
3.51483  85.00%  81.24%     69.91%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 8.52 2.40 3.55 0.024
x41 3.230 0.679 4.76 0.009 1.00

Regression Equation Y41 = 8.52 + 3.230 x41

Q4.3

[Figure: Scatterplot of Y43 vs x43]

Correlation: x43, Y43
Pearson correlation of x43 and Y43 = 0.955
P-Value = 0.000

Regression Analysis: Y43 versus x43

Analysis of Variance
Source         DF  Adj SS  Adj MS   F-Value  P-Value
Regression     1   3390.6  3390.55  325.08   0.000
  x43          1   3390.6  3390.55  325.08   0.000
Error          31  323.3   10.43
  Lack-of-Fit  23  156.0   6.78     0.32     0.984
  Pure Error   8   167.3   20.92
Total          32  3713.9

Model Summary
S        R-sq    R-sq(adj)  R-sq(pred)
3.22954  91.29%  91.01%     90.02%

Coefficients
Term Coef SE Coef T-Value P-Value VIF
Constant 3.83 1.77 2.17 0.038
x43 0.9036 0.0501 18.03 0.000 1.00

Regression Equation Y43 = 3.83 + 0.9036 x43

Q4.4

[Figure: Scatterplot of Y44 vs x44]

Correlation: x44, Y44
Pearson correlation of x44 and Y44 = -0.909
P-Value = 0.000

Regression Analysis: Y44 versus x44

Analysis of Variance
Source         DF  Adj SS   Adj MS   F-Value  P-Value
Regression     1   1619.42  1619.42  66.78    0.000
  x44          1   1619.42  1619.42  66.78    0.000
Error          14  339.51   24.25
  Lack-of-Fit  2   11.76    5.88     0.22     0.809
  Pure Error   12  327.75   27.31
Total          15  1958.94

Model Summary
S        R-sq    R-sq(adj)  R-sq(pred)
4.92454  82.67%  81.43%     77.75%

Coefficients
Term      Coef    SE Coef  T-Value  P-Value  VIF
Constant  160.6   16.9     9.52     0.000
x44       -4.458  0.546    -8.17    0.000    1.00

Regression Equation Y44 = 160.6 - 4.458 x44

