
List of Formula For Unit-3 and Unit-4

The document discusses various statistical concepts such as mean, median, mode, standard deviation, moments, correlation, and regression. It provides formulas to calculate these measures for both ungrouped and grouped data. Specifically, it outlines how to fit straight lines, parabolas, and exponential curves to data using normal equations and defines correlation coefficient and lines of regression.

Uploaded by

Hitika Teckani

Silver Oak College of Engineering and Technology

Subject Name: Probability, Statistics and Numerical Analysis


Faculty Name: Dr. Yogini Vashi
Unit-3 Statistics
Fitting of straight line: $y = a + bx$

Normal equations:

$$\sum y = na + b \sum x$$

$$\sum xy = a \sum x + b \sum x^2$$
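As an illustration, the two normal equations for the straight line can be solved directly for a and b. This is a minimal sketch, not part of the original sheet: the function name `fit_line` and the sample data are assumptions chosen for the example.

```python
def fit_line(xs, ys):
    """Fit y = a + b*x by solving the two normal equations.

    sum(y)  = n*a      + b*sum(x)
    sum(xy) = a*sum(x) + b*sum(x^2)
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Determinant of the 2x2 system (Cramer's rule).
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")
    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n
    return a, b


# Illustrative (made-up) data lying exactly on y = 1 + 2x.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```

For data that does not lie exactly on a line, the same function returns the least-squares values of a and b.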

Fitting of parabola (second degree curve): $y = a + bx + cx^2$

Normal equations:

$$\sum y = na + b \sum x + c \sum x^2$$

$$\sum xy = a \sum x + b \sum x^2 + c \sum x^3$$

$$\sum x^2 y = a \sum x^2 + b \sum x^3 + c \sum x^4$$
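The three normal equations for the parabola form a 3×3 linear system in a, b and c. The sketch below solves it with Cramer's rule using only the standard library; the names `det3` and `fit_parabola` and the sample data are assumptions made for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_parabola(xs, ys):
    """Fit y = a + b*x + c*x^2 by solving the three normal equations."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    t0 = sum(ys)
    t1 = sum(x * y for x, y in zip(xs, ys))
    t2 = sum(x * x * y for x, y in zip(xs, ys))
    matrix = [[n, s1, s2], [s1, s2, s3], [s2, s3, s4]]
    rhs = [t0, t1, t2]
    d = det3(matrix)
    if d == 0:
        raise ValueError("normal equations are singular for this data")
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column by the right-hand side.
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        coeffs.append(det3(replaced) / d)
    return tuple(coeffs)


# Illustrative (made-up) data lying exactly on y = 1 + 2x^2.
a, b, c = fit_parabola([-2, -1, 0, 1, 2], [9, 3, 1, 3, 9])
print(a, b, c)
```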

Fitting of exponential curves: $y = ab^x$ or $y = ax^b$ or $y = ae^{bx}$

(1) For the curve $y = ab^x$

Taking logarithms to base e on both sides, we get

$$\log y = \log(ab^x)$$
$$\log y = \log a + \log(b^x)$$
$$\log y = \log a + x \log b$$

This has the form $Y = A + BX$, where $Y = \log y$, $A = \log a$, $X = x$, $B = \log b$.

To fit the line $Y = A + BX$, use the normal equations

$$\sum Y = nA + B \sum X$$

$$\sum XY = A \sum X + B \sum X^2$$

Solving these equations gives the values of A and B. Then find a and b from $a = e^A$, $b = e^B$.

(2) For the curve $y = ax^b$

Taking logarithms to base e on both sides, we get

$$\log y = \log(ax^b)$$
$$\log y = \log a + \log(x^b)$$
$$\log y = \log a + b \log x$$

This has the form $Y = A + BX$, where $Y = \log y$, $A = \log a$, $X = \log x$, $B = b$.

To fit the line $Y = A + BX$, use the normal equations

$$\sum Y = nA + B \sum X$$

$$\sum XY = A \sum X + B \sum X^2$$

Solving these equations gives the values of A and B. Then find a and b from $a = e^A$, $b = B$.

(3) For the curve $y = ae^{bx}$

Taking logarithms to base e on both sides, we get

$$\log y = \log(ae^{bx})$$
$$\log y = \log a + \log(e^{bx})$$
$$\log y = \log a + bx \log e = \log a + bx$$

since $\log e = 1$. This has the form $Y = A + BX$, where $Y = \log y$, $A = \log a$, $X = x$, $B = b$.

To fit the line $Y = A + BX$, use the normal equations

$$\sum Y = nA + B \sum X$$

$$\sum XY = A \sum X + B \sum X^2$$

Solving these equations gives the values of A and B. Then find a and b from $a = e^A$, $b = B$.
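All three curves reduce to the same straight-line fit after the log transformation; only the definitions of X and of b change. The following sketch shows this under stated assumptions: the function names, the `form` labels and the sample data are hypothetical, and every y (and every x for the power curve) must be positive so that the logarithm exists.

```python
import math


def fit_line(X, Y):
    """Solve the normal equations for Y = A + B*X."""
    n = len(X)
    sx, sy = sum(X), sum(Y)
    sxy = sum(x * y for x, y in zip(X, Y))
    sxx = sum(x * x for x in X)
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all X values are equal")
    B = (n * sxy - sx * sy) / det
    A = (sy - B * sx) / n
    return A, B


def fit_curve(xs, ys, form):
    """Fit one of 'a*b^x', 'a*x^b' or 'a*e^(bx)' via natural logs."""
    Y = [math.log(y) for y in ys]                 # Y = log y in all three cases
    if form == "a*x^b":
        X = [math.log(x) for x in xs]             # X = log x for the power curve
    else:
        X = list(xs)                              # X = x otherwise
    A, B = fit_line(X, Y)
    a = math.exp(A)                               # a = e^A in all three cases
    b = math.exp(B) if form == "a*b^x" else B     # b = e^B only for y = a*b^x
    return a, b


# Illustrative (made-up) data: y = 3 * 2^x and y = 2 * x^3.
a1, b1 = fit_curve([0, 1, 2, 3], [3, 6, 12, 24], "a*b^x")
a2, b2 = fit_curve([1, 2, 3, 4], [2, 16, 54, 128], "a*x^b")
a3, b3 = fit_curve([0, 1, 2, 3], [3, 6, 12, 24], "a*e^(bx)")
print(a1, b1, a2, b2, a3, b3)
```

Note that the fit minimises squared error in log y, not in y itself, which is what the normal equations above imply.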
Mean

It is denoted by the symbol $\bar{X}$.

For ungrouped data

$$\bar{X} = \frac{\sum x_i}{n}$$

where n represents the total number of observations.

For grouped data

$$\bar{X} = \frac{\sum f_i x_i}{n}$$

where $n = \sum f_i$.
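A short sketch of both mean formulas follows. The function names and the sample values are assumptions for illustration; for grouped data, `values` holds the class marks (mid-points) $x_i$ and `freqs` the frequencies $f_i$.

```python
def mean_ungrouped(xs):
    """Mean = sum(x_i) / n."""
    return sum(xs) / len(xs)


def mean_grouped(values, freqs):
    """Mean = sum(f_i * x_i) / n with n = sum(f_i)."""
    n = sum(freqs)
    return sum(f * x for x, f in zip(values, freqs)) / n


# Illustrative (made-up) data.
m1 = mean_ungrouped([2, 4, 6, 8])
m2 = mean_grouped([5, 15, 25], [2, 3, 5])
print(m1, m2)
```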
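Median

It is denoted by the symbol M.

For ungrouped data (observations arranged in ascending order)

$$M = \left(\frac{n+1}{2}\right)^{\text{th}} \text{observation} \quad (\text{when } n \text{ is an odd number})$$

$$M = \frac{1}{2}\left[\left(\frac{n}{2}\right)^{\text{th}} \text{observation} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{observation}\right] \quad (\text{when } n \text{ is an even number})$$

Formula for grouped data

$$M = L + \left(\frac{\frac{n}{2} - F}{f}\right) \times c$$

where
$n = \sum f_i$
L = Lower limit of the median class
F = Cumulative frequency of the class preceding the median class
f = Frequency of the median class
c = Class length of the median class

Note: In a continuous frequency distribution, the class corresponding to the cumulative frequency just greater than or equal to $\frac{n}{2}$ is called the median class.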
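The median rules above can be sketched as follows. This is an illustrative implementation, not the author's: the function names are hypothetical, the classes are assumed to be continuous with equal width, and the sample data is made up.

```python
def median_ungrouped(xs):
    """Median of raw observations (sorted internally)."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]                    # the ((n+1)/2)-th observation
    return (s[mid - 1] + s[mid]) / 2     # mean of the (n/2)-th and (n/2 + 1)-th


def median_grouped(lower_limits, width, freqs):
    """M = L + ((n/2 - F) / f) * c for equal-width continuous classes."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("no observations")
    cum = 0                              # F: cumulative frequency before this class
    for lower, f in zip(lower_limits, freqs):
        if cum + f >= n / 2:             # first class whose cumulative frequency reaches n/2
            return lower + (n / 2 - cum) / f * width
        cum += f


# Illustrative (made-up) data: classes 0-10, 10-20, 20-30, 30-40.
m_odd = median_ungrouped([7, 1, 5])
m_even = median_ungrouped([7, 1, 5, 3])
m_grouped = median_grouped([0, 10, 20, 30], 10, [5, 8, 12, 5])
print(m_odd, m_even, m_grouped)
```

In the grouped example n = 30, so the median class is 20-30 (cumulative frequency 25), with L = 20, F = 13, f = 12 and c = 10.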

Mode

It is denoted by the symbol Z.

For ungrouped data:
The observation that occurs the maximum number of times is called the mode of the data.

Formula for grouped data

$$Z = L + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times c$$

where
L = Lower limit of the modal class
$f_0$ = frequency of the class preceding the modal class
$f_1$ = frequency of the modal class
$f_2$ = frequency of the class succeeding the modal class
c = class length of the modal class
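A sketch of the grouped-data mode formula is given below. The function name and sample data are assumptions. Two further assumptions go beyond the sheet: the modal class is taken as the first class with the highest frequency, and a missing neighbouring class (when the modal class is first or last) is treated as having frequency 0.

```python
def mode_grouped(lower_limits, width, freqs):
    """Z = L + ((f1 - f0) / (2*f1 - f0 - f2)) * c for equal-width classes."""
    # Modal class: index of the highest frequency (first one if tied; an assumption).
    k = max(range(len(freqs)), key=lambda i: freqs[i])
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0                # assumed 0 if no preceding class
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0   # assumed 0 if no succeeding class
    denom = 2 * f1 - f0 - f2
    if denom == 0:
        raise ValueError("formula undefined: 2*f1 - f0 - f2 is zero")
    return lower_limits[k] + (f1 - f0) / denom * width


# Illustrative (made-up) data: classes 0-10, 10-20, 20-30, 30-40.
z = mode_grouped([0, 10, 20, 30], 10, [5, 8, 12, 5])
print(z)
```

Here the modal class is 20-30 with $f_0 = 8$, $f_1 = 12$, $f_2 = 5$, so Z = 20 + (4/11) × 10.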

Relation between mean, median and mode

$$Z = 3M - 2\bar{X}$$
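Standard deviation

It is denoted by SD or the symbol $\sigma$ (sigma).

For ungrouped data

$$\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}} \quad \text{OR} \quad \sigma = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2}$$

For grouped data

$$\sigma = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{n}} \quad \text{OR} \quad \sigma = \sqrt{\frac{\sum f_i x_i^2}{n} - \bar{x}^2}$$

Variance = $\sigma^2$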

Coefficient of Variation

It is denoted by the symbol CV.

$$\text{CV} = \frac{\sigma}{\bar{x}} \times 100$$
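The standard deviation (by both the definition and the shortcut form) and the coefficient of variation can be sketched as below. The function names and data are assumptions for illustration; passing `freqs` switches to the grouped-data formulas, with `values` as class marks.

```python
import math


def mean_and_sd(values, freqs=None):
    """Return (mean, sigma) using sigma = sqrt(sum f (x - mean)^2 / n)."""
    if freqs is None:
        freqs = [1] * len(values)        # ungrouped data: every frequency is 1
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    return mean, math.sqrt(variance)


def sd_shortcut(values, freqs=None):
    """sigma = sqrt(sum f x^2 / n - mean^2), the second form on the sheet."""
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    mean_sq = sum(f * x * x for x, f in zip(values, freqs)) / n
    return math.sqrt(mean_sq - mean ** 2)


# Illustrative (made-up) data.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean, sigma = mean_and_sd(data)
cv = sigma / mean * 100                  # coefficient of variation, in percent
print(mean, sigma, cv)
```

The shortcut form can lose precision when the mean is large relative to the spread, so the definition form is usually the safer one to compute with.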

Moments

Moments about mean (central moments)

For ungrouped data

$$\mu_r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^r}{n}, \quad r = 1, 2, 3, 4$$

For grouped data

$$\mu_r = \frac{\sum_i f_i (x_i - \bar{x})^r}{n}, \quad r = 1, 2, 3, 4$$

where $n = \sum f_i$.

Moments about assumed mean (raw moments):

If we replace $\bar{x}$ by a in the central moment formulae, we get the formulae for raw moments, where a is the assumed mean or arbitrary origin.

$$\mu_r' = \frac{\sum_{i=1}^{n} (x_i - a)^r}{n} \quad \text{for } r = 1, 2, 3, 4$$

$$\mu_r' = \frac{\sum_i f_i (x_i - a)^r}{n} \quad \text{(for grouped data)}$$

Moments about zero (simple moments):

If we replace $\bar{x}$ by 0 in the central moment formulae, we get the formulae for simple moments.

Relation between central moments and raw moments:

$$\mu_1 = \mu_1' - \mu_1' = 0$$

$$\mu_2 = \mu_2' - (\mu_1')^2$$

$$\mu_3 = \mu_3' - 3\mu_2' \mu_1' + 2(\mu_1')^3$$

$$\mu_4 = \mu_4' - 4\mu_3' \mu_1' + 6\mu_2' (\mu_1')^2 - 3(\mu_1')^4$$

Relation between raw moments and central moments:

$$\mu_1' = \bar{x} - a$$

$$\mu_2' = \mu_2 + (\mu_1')^2$$

$$\mu_3' = \mu_3 + 3\mu_2 \mu_1' + (\mu_1')^3$$

$$\mu_4' = \mu_4 + 4\mu_3 \mu_1' + 6\mu_2 (\mu_1')^2 + (\mu_1')^4$$
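To check the conversion formulas numerically, the sketch below computes the raw moments about an assumed mean a and converts them to central moments. The function names, the choice a = 2 and the data are assumptions for illustration only.

```python
def raw_moments(xs, a):
    """Return [mu'_1, mu'_2, mu'_3, mu'_4] about the assumed mean a."""
    n = len(xs)
    return [sum((x - a) ** r for x in xs) / n for r in range(1, 5)]


def central_from_raw(m1, m2, m3, m4):
    """Convert raw moments to central moments (mu_1 .. mu_4)."""
    mu1 = m1 - m1                                    # always zero
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu1, mu2, mu3, mu4


# Illustrative (made-up) data with assumed mean a = 2; the true mean is 3.
xs = [1, 2, 3, 6]
m1, m2, m3, m4 = raw_moments(xs, a=2)
mu1, mu2, mu3, mu4 = central_from_raw(m1, m2, m3, m4)
mean = 2 + m1                                        # mean = a + mu'_1
print(mean, mu1, mu2, mu3, mu4)
```

Computing the central moments directly from deviations about the mean 3 gives the same values, which is a convenient check on hand calculations.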

Skewness:

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}$$

Karl Pearson's coefficient of skewness:

$$S_k = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}$$

Kurtosis:

$$\beta_2 = \frac{\mu_4}{\mu_2^2}$$

Properties of central moments:

(a) The first moment about the mean is always zero. ($\mu_1 = 0$)
(b) The second moment about the mean is the variance. ($\mu_2 = \sigma^2$)
(c) In a symmetric distribution all odd central moments are zero. ($\mu_1 = \mu_3 = \dots = \mu_{2r+1} = 0$)
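The moment-based coefficients $\beta_1$ and $\beta_2$, and properties (a) and (c), can be checked on small data sets. This is an illustrative sketch; the function names and both data sets are assumptions, not taken from the sheet.

```python
def central_moment(xs, r):
    """r-th central moment: sum((x - mean)^r) / n."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** r for x in xs) / n


def beta1(xs):
    """Skewness coefficient: mu_3^2 / mu_2^3."""
    return central_moment(xs, 3) ** 2 / central_moment(xs, 2) ** 3


def beta2(xs):
    """Kurtosis coefficient: mu_4 / mu_2^2."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2


# Illustrative (made-up) data: one symmetric set, one skewed set.
symmetric = [1, 2, 3, 4, 5]
skewed = [1, 2, 3, 6]
print(central_moment(symmetric, 1), central_moment(symmetric, 3))
print(beta1(symmetric), beta2(symmetric))
print(beta1(skewed), beta2(skewed))
```

For the symmetric set the first and third central moments vanish, so $\beta_1 = 0$; the skewed set gives a positive $\beta_1$.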

Unit-4 Correlation and regression

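1 Correlation coefficient: it is denoted by the symbol r ($-1 \le r \le 1$)

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2} \sqrt{\sum (y - \bar{y})^2}}$$

OR

$$r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\sum x^2 - \frac{(\sum x)^2}{n}} \sqrt{\sum y^2 - \frac{(\sum y)^2}{n}}}$$

OR

$$r = \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sqrt{\sum d_x^2 - \frac{(\sum d_x)^2}{n}} \sqrt{\sum d_y^2 - \frac{(\sum d_y)^2}{n}}}$$

where $d_x = x - a$, $d_y = y - b$, and a and b are arbitrary origins.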
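The sum-based form of r is convenient to compute, and applying it to the deviations $d_x$, $d_y$ gives the same value because r does not depend on the choice of origin. The sketch below illustrates this; the function name, the origins a = 3 and b = 4, and the data are assumptions.

```python
import math


def pearson_r(xs, ys):
    """r from the sum-based formula (second form on the sheet)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    num = sxy - sx * sy / n
    den = math.sqrt(sxx - sx ** 2 / n) * math.sqrt(syy - sy ** 2 / n)
    if den == 0:
        raise ValueError("r is undefined when x or y is constant")
    return num / den


# Illustrative (made-up) data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)

# Same formula applied to deviations from arbitrary origins a = 3, b = 4.
dx = [x - 3 for x in xs]
dy = [y - 4 for y in ys]
r_shifted = pearson_r(dx, dy)
print(r, r_shifted)
```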
Spearman's rank correlation coefficient:

$$r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

where $d$ = rank of x − rank of y for each pair of observations.

Formula of Spearman's rank correlation coefficient for tied ranks:

$$r = 1 - \frac{6\left[\sum d^2 + \frac{1}{12}(m_1^3 - m_1) + \frac{1}{12}(m_2^3 - m_2) + \dots\right]}{n(n^2 - 1)}$$

where each $m_i$ is the number of items sharing the same rank.
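A sketch of the rank correlation with the tie correction follows. The function names and data are assumptions. It also assumes the common convention that tied items share the average of the ranks they would otherwise occupy, and that rank 1 goes to the largest value; the sheet does not state either convention.

```python
from collections import Counter


def average_ranks(values):
    """Rank 1 = largest value; tied values share the average of their positions."""
    order = sorted(values, reverse=True)
    counts = Counter(values)
    first_pos = {}
    for pos, v in enumerate(order, start=1):
        first_pos.setdefault(v, pos)     # first position occupied by each value
    # m tied items occupy positions p .. p+m-1, whose average is p + (m-1)/2.
    return [first_pos[v] + (counts[v] - 1) / 2 for v in values]


def spearman(xs, ys):
    """1 - 6*(sum d^2 + sum (m^3 - m)/12) / (n*(n^2 - 1))."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two pairs")
    rx, ry = average_ranks(xs), average_ranks(ys)
    d2 = sum((p - q) ** 2 for p, q in zip(rx, ry))
    # One correction term per group of m tied values, in both series
    # (groups with m = 1 contribute zero).
    correction = sum(
        (m ** 3 - m) / 12
        for series in (xs, ys)
        for m in Counter(series).values()
    )
    return 1 - 6 * (d2 + correction) / (n * (n ** 2 - 1))


# Illustrative (made-up) data: one set without ties, one with a tie in x.
r_plain = spearman([10, 20, 30, 40, 50], [12, 25, 20, 45, 60])
r_tied = spearman([1, 2, 2, 3], [1, 2, 3, 4])
print(r_plain, r_tied)
```

Without ties the correction term is zero and the function reduces to the first formula.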

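2 Regression:

Line of regression of x on y

$$x - \bar{x} = b_{xy} (y - \bar{y})$$

where

$$b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2}$$

(Regression coefficient of x on y)

OR

$$b_{xy} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum y^2 - \frac{(\sum y)^2}{n}}$$

OR

$$b_{xy} = \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sum d_y^2 - \frac{(\sum d_y)^2}{n}}$$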

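Line of regression of y on x

$$y - \bar{y} = b_{yx} (x - \bar{x})$$

where

$$b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$$

(Regression coefficient of y on x)

OR

$$b_{yx} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$$

OR

$$b_{yx} = \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sum d_x^2 - \frac{(\sum d_x)^2}{n}}$$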

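Correlation coefficient $r = \pm\sqrt{b_{xy} b_{yx}}$, where r takes the same sign as the regression coefficients $b_{xy}$ and $b_{yx}$.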
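The two regression coefficients and the correlation coefficient derived from them can be sketched as below. The function name and data are assumptions for illustration; the sign of r is taken from $b_{yx}$, since both regression coefficients share the sign of $\sum (x - \bar{x})(y - \bar{y})$.

```python
import math


def regression(xs, ys):
    """Return (mean_x, mean_y, b_xy, b_yx, r) from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("regression is undefined when x or y is constant")
    b_xy = sxy / syy                     # regression coefficient of x on y
    b_yx = sxy / sxx                     # regression coefficient of y on x
    # r is the geometric mean of the two coefficients, carrying their sign.
    r = math.copysign(math.sqrt(b_xy * b_yx), b_yx)
    return mx, my, b_xy, b_yx, r


# Illustrative (made-up) data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
mx, my, b_xy, b_yx, r = regression(xs, ys)

# Line of y on x: y - my = b_yx * (x - mx), used here to estimate y at x = 6.
y_at_6 = my + b_yx * (6 - mx)
print(b_xy, b_yx, r, y_at_6)
```

The line of y on x is the one to use when estimating y from x, and the line of x on y when estimating x from y.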
