
Random Variable & Probability Distribution

Here are the steps to solve this problem: a) The mean of the population is the sum of all values divided by the total number of values. Thus, the mean is (2 + 3 + 6 + 8 + 11) / 5 = 6 b) To calculate the standard deviation of the population, we first calculate the variance: Variance = Σ(x - μ)^2 / N Where μ is the population mean (6) (2 - 6)^2 + (3 - 6)^2 + (6 - 6)^2 + (8 - 6)^2 + (11 - 6)^2 = 16 + 9 + 0 + 4 + 25 = 54 Variance = 54 / 5

Uploaded by

RISHAB NANGIA
Copyright
© © All Rights Reserved

- Probability Distributions

• Line managers may use probability distributions to generate sample plans and predict process yields.
• Fund managers may use them to determine the possible returns a stock may earn in the future.
• Restaurant managers may use them to resolve future customer complaints.
• Insurance managers may use them to forecast uncertain future claims.
Parameter and Statistics

• A measure calculated from population data is called a Parameter.
• A measure calculated from sample data is called a Statistic.

Measure                    Parameter   Statistic
Size                       N           n
Mean                       μ           x̄
Standard deviation         σ           s
Proportion                 P           p
Correlation coefficient    ρ           r
Random variable - expected value

⚫ The mean of the probability distribution is referred to as the expected value, and is represented by μₓ:

μₓ = E(x) = Σ x·P(x)

which just means that the mean (or expected value) of a random variable is a weighted average of its possible values, with the probabilities as weights.
EXPECTED VALUE
Suppose the random variable x can take on the n values
x1, x2, …, xn.
Also, suppose the probabilities that these values occur are
respectively p1,p2, …, pn. Then the expected value
of the random variable is:
E(x) = x1p1 + x2p2 + …. + xnpn
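The weighted sum above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `expected_value` and the fair-die data are assumptions chosen only to show the formula at work.

```python
def expected_value(values, probs):
    """Return E(x) = x1*p1 + x2*p2 + ... + xn*pn."""
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Multiply each value by its probability and add the products.
    return sum(x * p for x, p in zip(values, probs))


# Illustrative data (an assumption, not from the slides): one roll of a fair die.
die_faces = [1, 2, 3, 4, 5, 6]
die_probs = [1 / 6] * 6
die_mean = expected_value(die_faces, die_probs)
print(round(die_mean, 4))
```

The two checks at the top guard against the most common input mistakes: a missing probability and probabilities that do not add up to 1.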
Variance, Standard Deviation

Variance is often depicted by this symbol: σ². The square root of the variance is the standard deviation (σ), which measures how widely the values are spread around the mean and so helps determine the consistency of an event over a period of time.

Variance without probabilities: σ² = Σ(x - μ)² / N
Variance with probabilities: σ² = Σ P(x)(x - μ)²
Example: expected value, standard
deviation and CV (HW)

x 10 11 12 13 14
P(x) .4 .2 .2 .1 .1

E(x) = 10(.4) + 11(.2) + 12(.2) + 13(.1) + 14(.1) = 11.3
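The remaining quantities for this distribution can be checked with a short Python sketch. It assumes that "CV" on the slide means the coefficient of variation, taken here as standard deviation divided by mean; the function name `mean_sd_cv` is hypothetical.

```python
import math


def mean_sd_cv(values, probs):
    """Return (mean, standard deviation, coefficient of variation)."""
    mean = sum(x * p for x, p in zip(values, probs))
    # Variance with probabilities: sum of P(x) * (x - mean)^2.
    variance = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))
    sd = math.sqrt(variance)
    cv = sd / mean  # assumption: CV = SD / mean
    return mean, sd, cv


x = [10, 11, 12, 13, 14]
p = [0.4, 0.2, 0.2, 0.1, 0.1]
mean, sd, cv = mean_sd_cv(x, p)
print(f"E(x) = {mean:.2f}, SD = {sd:.3f}, CV = {cv:.1%}")
```

The mean reproduces the 11.3 shown above, which gives a quick check that the data were entered correctly before reading off the SD and CV.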


PRACTICE SUMS Q 1 to 6
Example: expected value, standard
deviation and CV

For this probability distribution, the expected value is

E(x) = 0(1/8) + 1(3/8) + 2(3/8) + 3(1/8) = 12/8 = 1.5
Example
Find the expected value of x in the probability
distribution below and S.D, C.V:

X 1 2 3 4 5
P(x) .13 .29 .38 .13 .07
Z-Score 

A Z-score is a numerical measurement used in statistics of a value's relationship to the mean (average) of a group of values, measured in terms of standard deviations from the mean. If a Z-score is 0, it indicates that the data point's score is identical to the mean score.
Z-Score Formula
It is a way to compare the results from a test to a “normal” population. If X is a random variable from a normal distribution with mean (μ) and standard deviation (σ), its Z-score may be calculated by subtracting the mean from X and dividing the result by the standard deviation:
Z = (X - μ) / σ
Scores on a history test have an average of 80 with a standard deviation of 6. What is the z-score for a student who earned a 75 on the test?
z = (75 - 80)/6 = -0.833

The weight of chocolate bars from a particular chocolate factory has a mean of 8 ounces with a standard deviation of .1 ounce. What is the z-score corresponding to a weight of 8.17 ounces?
z = (8.17 - 8)/.1 = 1.7

Books in the library are found to have an average length of 350 pages with a standard deviation of 100 pages. What is the z-score corresponding to a book of length 80 pages?
z = (80 - 350)/100 = -2.7

A particular leg bone for dinosaur fossils has a mean length of 5 feet (60 inches) with a standard deviation of 3 inches. What is the z-score that corresponds to a length of 62 inches?
z = (62 - 60)/3 = .667
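The four worked answers above can be reproduced with a small Python helper. This is a sketch only: the function name `z_score` and the dictionary labels are assumptions made for illustration.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


examples = {
    "history test": z_score(75, 80, 6),
    "chocolate bar": z_score(8.17, 8, 0.1),
    "library book": z_score(80, 350, 100),
    # The mean is given in feet, so convert 5 feet to 60 inches first.
    "dinosaur bone": z_score(62, 5 * 12, 3),
}
for name, z in examples.items():
    print(f"{name}: z = {z:.3f}")
```

The dinosaur example is the one most easily gotten wrong: the value, mean and standard deviation must all be in the same unit before the formula is applied.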
https://www.youtube.com/watch?v=okhrFgaUwio

Video 2

Video 3: https://www.youtube.com/watch?v=fnU42Ue9utk

Video 4
Let’s assume that the mean score for a class of 50 students is 60
and the standard deviation is 15 marks. A student named Emily
asked the teacher if by scoring 70 she has performed well or not. 

Initially by looking at Emily’s score, it appears that she did well considering that 60 is
the mean class score.
However, this doesn’t reflect the variation among 50 students. Considering the
standard deviation of 15, it is very likely that there is a significant variation among
the scores. To answer the question how well Emily performed in the coursework
compared to other students in the class we can use the Z score.
Emily's Z-score is (70 - 60)/15 = 0.67. To find out what share of the class scored higher or lower than Emily, we look up this Z-score in the normal distribution table. For Z = 0.67 the table value is 0.74857.

We can see that Emily performed better than 74.86% of the students in her class (about 37 students).

It also means that the probability of a Z-score being higher than 0.67 is 1 - 0.74857 = 0.25143 = 25.14% (about 13 students). Approximately 25% of all the students scored higher than Emily.
A class of 50 students wrote a science test last week. On result day, the class teacher announced that John scored 93 on the test, while the average score of the class was 68. Determine the Z-score for John's test mark if the standard deviation is 13.

Z score = (93 - 68) / 13
Z score = 1.92

Therefore, John's test score is 1.92 standard deviations above the average score of the class. The normal distribution table value for Z = 1.92 is 0.97257, which means 97.26% of the class (about 49 students) scored less than John.
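The table lookups for Emily and John can be reproduced without a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library for the standard normal cumulative probability; the helper name `share_below` is an assumption, and the Z-score is rounded to two decimals only to mimic reading a printed table.

```python
from statistics import NormalDist


def share_below(score, mean, sd, table_decimals=2):
    """Return (z, fraction of a normal population scoring below `score`)."""
    # Round z as a printed normal table would (two decimal places).
    z = round((score - mean) / sd, table_decimals)
    return z, NormalDist().cdf(z)


class_size = 50
emily_z, emily_below = share_below(70, 60, 15)
john_z, john_below = share_below(93, 68, 13)

for name, z, below in (("Emily", emily_z, emily_below), ("John", john_z, john_below)):
    print(f"{name}: z = {z}, below = {below:.5f}, "
          f"about {round(below * class_size)} of {class_size} students")
```

Both results rest on the assumption that the class scores are normally distributed, which is what the table lookup in the examples also assumes.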
TYPES OF SAMPLING DISTRIBUTION

The types of sampling distribution are as follows:
1) Sampling Distribution of the Mean:
The sampling distribution of the mean is defined as the theoretical probability distribution of the sample means obtained by extracting all possible samples of the same size from the given population.
Given a finite population with mean (μ) and variance (σ²), it can be shown that the distribution of the sample mean has the following property:
The mean of the sampling distribution of the sample mean is always the same as the mean of the original population, whether or not that population is normal. In other words, the mean of all the sample means is equal to the population mean.
2) SAMPLING DISTRIBUTION OF THE PROPORTION :

The sampling distribution of the proportion is the probability distribution of the sample proportion (p) over all possible samples of the same size drawn from a population whose proportion of successes is P.
Statistical Inference
The method of inferring about a population on the basis of sample information is known as statistical inference.

It mainly consists of two parts:
• Estimation
• Testing of Hypothesis
Estimation
Estimation is a process whereby we select a random sample from
a population and use a sample statistic to estimate a population
parameter.
There are two ways for estimation:
• Point Estimation
• Interval Estimation
Point & Interval Estimates
There are two kinds of estimates of population parameters from sample statistics:
• Point estimates
• Interval estimates
A point estimate is a single value and an interval estimate is a range of values.
Point Estimate
Point Estimate – A sample statistic used to estimate the exact value
of a population parameter.
• A point estimate is a single value and has the advantage of
being very precise but there is no information about its
reliability.
• The probability that a single sample statistic is actually equal to the parameter value is extremely small. For this reason, point estimation is rarely used.
Interval Estimate
Confidence interval (interval estimate) – A range of values
defined by the confidence level within which the population
parameter is estimated to fall.
• The interval estimate is less precise, but gives more
confidence.
Example of Point and Interval Estimate
Government wants to know the percentage of cigarette smokers
among college students.
If we say that 10% of college students are smokers, it is a point estimate.
But if we make a statement that 8% to 12% of college students are smokers, it is an interval estimate.
SD formula in Sampling Distribution

Here N is the sample size and Np is the population size.
In case of replacement:
σx̄ = σ / √N
σ²x̄ = σ² / N
In case of no replacement:
σ²x̄ = (σ² / N) × (Np - N) / (Np - 1)
MATHEMATICAL PROBLEMS
Sampling Distribution of means
Prob. 1 :
A population consists of the five numbers 2, 3, 6, 8 and 11. Consider all possible samples of size 2 that can be drawn with and without replacement from this population. Find out:
a) The mean of the population.
b) The standard deviation of the population.
c) The mean of the sampling distribution of means.
d) The standard deviation of the sampling distribution of means (the standard error of means).
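# Answer :
a) Mean of the population, μ = (2 + 3 + 6 + 8 + 11) / 5 = 30 / 5 = 6
b) Standard deviation of the population, σ = √( Σ(x - μ)² / Np ), where Np = 5 is the population size.
σ² = [(2 - 6)² + (3 - 6)² + (6 - 6)² + (8 - 6)² + (11 - 6)²] / 5
= (16 + 9 + 0 + 4 + 25) / 5
= 54 / 5 = 10.8
∴ σ = √10.8 = 3.29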
With replacement :
c)There are 5(5)= 25 samples of size 2 that can be drawn with replacement. These
are :
(2,2) (2,3) (2,6) (2,8) (2,11)
(3,2) (3,3) (3,6) (3,8) (3,11)
(6,2) (6,3) (6,6) (6,8) (6,11)
(8,2) (8,3) (8,6) (8,8) (8,11)
(11,2) (11,3) (11,6) (11,8) (11,11)
The corresponding sample means are :
2.0 2.5 4.0 5.0 6.5
2.5 3.0 4.5 5.5 7.0
4.0 4.5 6.0 7.0 8.5
5.0 5.5 7.0 8.0 9.5
6.5 7.0 8.5 9.5 11.0
And the mean of the sampling distribution of means is
μx̄ = (sum of all sample means) / 25 = 150 / 25 = 6.0
Hence proved the fact that μx̄ = µ.
d) Here, the variance of the sampling distribution of means is
σ²x̄ = [(2 - 6)² + (2.5 - 6)² + … + (11 - 6)²] / 25 (subtracting the mean 6 from each of the 25 sample means, squaring the result, adding all 25 numbers thus obtained and dividing by 25)
= 135 / 25 = 5.40
σx̄ = √5.40 = 2.32
This illustrates the fact that for finite populations involving sampling with replacement, σ²x̄ = σ² / N (with sample size N = 2), since the right-hand side is 10.8 / 2 = 5.40, agreeing with the above value.

Without Replacement:

c) There are 10 samples of size 2 that can be drawn without replacement from the population: 5C2 = 5!/[(5 - 2)! × 2!] = 10

(2,3) (2,6) (2,8) (2,11) (3,6) (3,8) (3,11) (6,8) (6,11) (8,11)
The corresponding sample means are :
2.5, 4.0, 5.0, 6.5, 4.5, 5.5, 7.0, 7.0, 8.5, 9.5
The mean of the sampling distribution of means is
μx̄ = (2.5 + 4.0 + … + 9.5) / 10 = 60 / 10 = 6.0
∴ μx̄ = µ
(d) The variance of the sampling distribution of means is
σ²x̄ = [(2.5 - 6)² + (4.0 - 6)² + … + (9.5 - 6)²] / 10 = 40.5 / 10 = 4.05
And σx̄ = √4.05 = 2.01
This illustrates, with sample size N = 2 and population size Np = 5,
σ²x̄ = (σ² / N) × (Np - N) / (Np - 1)
= (10.8 / 2) × (5 - 2) / (5 - 1)
= 4.05
As obtained above.
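The whole of Prob. 1 can be verified by brute force. The sketch below enumerates every sample of size 2, with and without replacement, using `itertools.product` and `itertools.combinations`, then compares the results with the figures worked out above. Variable names are illustrative choices, not taken from the slides.

```python
from itertools import combinations, product
from statistics import fmean, pvariance

population = [2, 3, 6, 8, 11]
sample_size = 2

pop_mean = fmean(population)     # 6
pop_var = pvariance(population)  # population variance, divisor 5

# With replacement: ordered pairs, 5 * 5 = 25 samples.
with_repl = [fmean(s) for s in product(population, repeat=sample_size)]
# Without replacement: unordered pairs, 5C2 = 10 samples.
without_repl = [fmean(s) for s in combinations(population, sample_size)]

for label, means in (("with replacement", with_repl),
                     ("without replacement", without_repl)):
    print(f"{label}: {len(means)} samples, "
          f"mean of means = {fmean(means):.2f}, "
          f"variance of means = {pvariance(means):.2f}")

# Check the two formulas against the enumerated values.
n, big_n = sample_size, len(population)
print(round(pop_var / n, 2))                                # with replacement
print(round(pop_var / n * (big_n - n) / (big_n - 1), 2))    # without replacement
```

`pvariance` is used rather than `variance` because every possible sample is listed, so the set of sample means is itself a complete population.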
Q 2. The population is the weight of six pumpkins (in pounds) displayed in a carnival
"guess the weight" game booth. You are asked to guess the average weight of the six
pumpkins by taking a random sample without replacement from the population.
Pumpkin Weight (in pounds)
A 19
B 14
C 15
D 9
E 10
F 17

To demonstrate the sampling distribution, let's start by obtaining all of the possible samples of size n = 2 from the population, sampling without replacement. Number of samples = 6!/(4! × 2!) = 15.
Similarly, for all of the possible samples of size n = 5 from the population, sampling without replacement, number of samples = 6!/(1! × 5!) = 6.
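The sample counts for the pumpkin population can be confirmed, and the sampling distributions listed, with a short Python sketch. The helper name `sample_means` is an assumption; the weights are the six values from the table above.

```python
from itertools import combinations
from math import comb
from statistics import fmean

weights = {"A": 19, "B": 14, "C": 15, "D": 9, "E": 10, "F": 17}
pop_mean = fmean(list(weights.values()))


def sample_means(n):
    """Means of every sample of size n drawn without replacement."""
    return [fmean(s) for s in combinations(list(weights.values()), n)]


for n in (2, 5):
    means = sample_means(n)
    print(f"n = {n}: {len(means)} samples (formula gives {comb(6, n)}), "
          f"sample means range from {min(means)} to {max(means)}, "
          f"mean of sample means = {fmean(means):.1f}")
print(f"population mean = {pop_mean:.1f}")
```

Comparing the two ranges shows why a larger sample gives a better guess: with n = 5 the sample means sit much closer to the population mean than with n = 2.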
MATHEMATICAL PROBLEMS
Sampling Distribution of means (HW)
Prob. 3 :
A population consists of the THREE numbers 2, 3 and 4. Consider all possible samples of size 2 that can be drawn with and without replacement from this population. Find out:
a) The mean of the population.
b) The standard deviation of the population.
c) The mean of the sampling distribution of means.
d) The standard deviation of the sampling distribution of means (the standard error of means).
The central limit theorem is a statistical theory which states that when the sample size is large and the population has a finite variance, the sample means will be approximately normally distributed, and the mean of the sample means will be approximately equal to the mean of the whole population.
In other words, the central limit theorem states that for any population with mean μ and standard deviation σ, the distribution of the sample mean for sample size n has mean μ and standard deviation σ / √n.
As the sample size gets bigger and bigger, the mean of the sample will get closer to the actual population mean. If the sample size is small, the distribution of the sample mean may or may not be normal, but as the sample size gets bigger, it can be approximated by a normal distribution. This statistical theory is useful in simplifying analysis while dealing with stock indexes and many other applications.
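A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions, not something from the slides: it draws repeated samples from an exponential population (chosen because it is clearly non-normal, with mean 1 and standard deviation 1), and the sample size, number of replications and random seed are arbitrary choices.

```python
import random
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

POP_MEAN = 1.0  # mean of an exponential population with rate 1
POP_SD = 1.0    # its standard deviation


def sample_mean(n):
    """Mean of one random sample of size n from the exponential population."""
    return fmean(random.expovariate(1.0) for _ in range(n))


n = 50
replications = 2000
means = [sample_mean(n) for _ in range(replications)]

observed_mean = fmean(means)
observed_sd = pstdev(means)
expected_sd = POP_SD / n ** 0.5  # sigma / sqrt(n)

print(f"mean of sample means: {observed_mean:.3f} (population mean {POP_MEAN})")
print(f"SD of sample means:   {observed_sd:.3f} (sigma/sqrt(n) = {expected_sd:.3f})")
```

Even though individual exponential values are heavily skewed, the sample means cluster around the population mean with a spread close to σ / √n.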
The record of weights of the male population follows the normal distribution. Its mean and standard deviation are 70 kg and 15 kg respectively. If a researcher considers the records of 50 males, then what would be the mean and standard deviation of the sample mean?

Mean of the population μ = 70 kg
Standard deviation of the population σ = 15 kg
Sample size n = 50
Mean of the sample mean is given by: μx̄ = μ = 70 kg
Standard deviation of the sample mean is given by σx̄ = σ / √n
= 15/√50
= 15/7.071
= 2.121 ≈ 2.1 kg
Example:
The record of weights of the female population follows the normal distribution. Its mean and standard deviation are 65 kg and 14 kg respectively. If a researcher considers the records of 50 females, then what would be the standard deviation of the sample mean?

Solution:
Mean of the population μ = 65 kg
Standard deviation of the population σ = 14 kg
Sample size n = 50
Standard deviation of the sample mean is given by σx̄ = σ / √n
= 14/√50
= 14/7.071
= 1.98 kg
The average weight of a water bottle is 30 kg with a standard deviation of 1.5 kg. If a sample of 45 water bottles is selected at random from a consignment and their weights are measured, find the mean and standard deviation of the sample mean weight.

Population mean = μ = 30 kg
Population standard deviation = σ = 1.5 kg
Sample size: n = 45 (which is greater than 30)
Mean of the sample mean = μ = 30 kg
Standard deviation of the sample mean = σ / √n
= 1.5/√45
= 1.5/6.7082
= 0.2236 kg
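The three σ / √n calculations above follow the same pattern, so they can be checked together. In this sketch the function name `standard_error` and the dictionary labels are illustrative assumptions.

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)


# (population standard deviation, sample size) for each worked example
cases = {
    "male weights": (15, 50),
    "female weights": (14, 50),
    "water bottles": (1.5, 45),
}
results = {name: standard_error(sigma, n) for name, (sigma, n) in cases.items()}
for name, se in results.items():
    print(f"{name}: {se:.4f}")
```

Because n appears under a square root, quadrupling the sample size only halves the standard error.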
Hypothesis Testing

• Hypothesis testing is used to assess the plausibility of a hypothesis by using sample data.
• The test provides evidence concerning the plausibility of the hypothesis, given the data.
• Statistical analysts test a hypothesis by measuring and examining a random sample of the population being analyzed.
• Null hypothesis, H0 - represents the hypothesis that the sample observations result purely from chance.
• Alternative hypothesis, Ha - represents the hypothesis that the sample observations are influenced by some non-random cause.
Suppose we wanted to check whether a coin was fair and balanced. A null hypothesis might say that half of the flips will be heads and half will be tails, whereas the alternative hypothesis might say that the numbers of heads and tails may be very different.
H0:P=0.5
Ha:P≠0.5
For example, suppose we flipped the coin 50 times and got 40 heads and 10 tails. Given this result, we would reject the null hypothesis and conclude, based on the evidence, that the coin was probably not fair and balanced.
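The slides do not say which test leads to the rejection. One common choice is an exact two-sided binomial test, sketched below as an assumption: the function name `binomial_two_sided_p` and the 5% significance level are illustrative, not taken from the source.

```python
from math import comb


def binomial_two_sided_p(heads, flips, p0=0.5):
    """Exact two-sided p-value for H0: P = p0.

    Adds up the probabilities of every outcome that is no more likely
    than the observed one under the null hypothesis.
    """
    pmf = [comb(flips, k) * p0 ** k * (1 - p0) ** (flips - k)
           for k in range(flips + 1)]
    observed = pmf[heads]
    # Small tolerance so that ties are not lost to floating-point rounding.
    return sum(p for p in pmf if p <= observed * (1 + 1e-9))


alpha = 0.05  # assumed significance level
p_value = binomial_two_sided_p(40, 50)
reject_null = p_value < alpha
print(f"p-value = {p_value:.2e}, reject H0: {reject_null}")
```

A result as lopsided as 40 heads in 50 flips is extremely unlikely for a fair coin, so the p-value falls far below the assumed 5% level and the null hypothesis is rejected, in line with the conclusion above.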

A researcher thinks that if expectant mothers use vitamins, the birth weight of the babies will increase. The average birth weight of the population is 8.6 pounds.
H0: μ = 8.6
H1: μ > 8.6
