Statistics and Probability Notes

⮚ The document discusses random variables and probability distributions. It defines discrete and continuous random variables and explains that a probability distribution describes the possible values and likelihoods of a random variable.
⮚ It then focuses on discrete random variables and probability distributions. It explains that a discrete probability distribution, also called a probability mass function, lists the possible values a discrete random variable can take and their probabilities. It also provides the formulas for calculating the mean, variance, and standard deviation of a discrete probability distribution.
⮚ The document then discusses the normal distribution and its importance. It provides steps for finding the area under the normal curve for a given z-value using the z-table. It also discusses some

Uploaded by

Kent Daniel

STATISTICS AND PROBABILITY

Module 1 - Quarter 3

Random Variables and Probability Distributions

Lesson 1: The Concept of Random Variables

A random variable is a variable whose value is unknown, or a function that assigns a value to each of an experiment's outcomes.

Random variables are often designated by letters and can be classified as discrete, which are variables that have specific values, or continuous, which are variables that can have any value within a continuous range.

Random variables are often used in econometric or regression analysis to determine statistical relationships among variables.

There are two varieties of random variables that are numerical in nature: discrete and continuous.

Note: A sample space is the set of all possible outcomes in a random experiment.

Lesson 2: Probability Distribution of Discrete Random Variables

Understanding Probability Distribution

A probability distribution is a statistical function that describes all the possible values and likelihoods that a random variable can take within a given range. This range is bounded between the minimum and maximum possible values, but precisely where a possible value is likely to be plotted on the probability distribution depends on a number of factors. These factors include the distribution's mean (average), standard deviation, skewness, and kurtosis.

Probability distributions describe the dispersion of the values of a random variable. Consequently, the kind of variable determines the type of probability distribution.

There is a total of 8 possible outcomes with 4 distinct possible values of X. We now assign the probability values of each.

The table shown on the previous page is what we call the probability distribution or probability mass function of the random variable. We can graph the distribution in the form of a histogram as presented below.

What is a histogram?
It is like a bar graph but has no spaces in between the bars.
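The experiment behind the "8 possible outcomes with 4 distinct possible values of X" is not preserved in these notes. As a minimal sketch, assume a hypothetical experiment that matches those counts: a fair coin is tossed three times and X is the number of heads. The names `outcomes`, `counts`, and `pmf` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Hypothetical experiment (an assumption, not taken from the notes):
# toss a fair coin three times and let X be the number of heads.
outcomes = list(product("HT", repeat=3))  # sample space of equally likely outcomes

# Count how many outcomes give each value of X.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability mass function: P(X = x) = (outcomes giving x) / (all outcomes).
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(X = {x}) = {p}")

# A valid probability distribution must sum to exactly 1.
print("Sum of probabilities:", sum(pmf.values()))
```

Exact fractions are used so that the check that the probabilities add up to 1 is not affected by rounding.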
Lesson 3: Solving for the Mean, Variance, and Standard Deviation of Discrete Probability Distributions

The probability distribution of a discrete random variable is analogous to the frequency distribution of a sample in a given population: it needs to be summarized using a central value, which gives the general behavior of the random variable under observation. Any discrete probability distribution has a mean, a variance, and a standard deviation.

The mean is the average of all possible outcomes. It is otherwise referred to as the "expected value" of a probability distribution. When we say expected value, it means that if we repeat any given experiment infinitely many times, the theoretical mean would be the "expected value".

The variance and standard deviation are measures of spread or variability.

\[ \mu = \sum X \cdot P(X) \tag{1} \]
\[ \sigma^2 = \sum (X - \mu)^2 \cdot P(X) \tag{2} \]
\[ \sigma = \sqrt{\sum (X - \mu)^2 \cdot P(X)} \tag{3} \]

As a point of reference, if the values of all the outcomes in an experiment are the same, the variance and standard deviation are both 0. But of course, this rarely happens in real-life applications.

Steps in Finding the Mean:
1. Multiply each value of the random variable by its probability.
2. Use equation 1 to find the mean by adding all products obtained in step 1.

Steps in Finding the Variance:
3. Subtract the computed mean from each value of the random variable.
4. Square the value obtained in step 3.
5. Multiply the value obtained in step 4 by the corresponding probability.
6. Use equation 2 to find the variance by adding all products obtained in step 5.

Steps in Finding the Standard Deviation:
7. Use equation 3 to find the standard deviation by getting the square root of the value obtained in step 6.

SUMMARY

⮚ A random variable is a function that links a specific numerical value to each element in the sample space of any given experiment or situation.

⮚ A discrete random variable is a random variable whose set of possible outcomes is finite (or countable). Its values are separated by a finite gap or space and may be obtained through counting.

⮚ A continuous random variable is a random variable whose set of possible outcomes is infinite. This type of variable may take on a continuous stream of values. Its values can only be obtained through measurement.

⮚ A discrete probability distribution, otherwise known as a probability mass function, is made up of the values that a random variable can take with their corresponding probabilities. In a valid probability distribution, the probabilities must sum to exactly 1, no more, no less.

⮚ Formulas for Mean, Variance, and Standard Deviation.
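The seven steps for the mean, variance, and standard deviation can be sketched in a few lines. This is a minimal illustration, not the notes' own worked example: the function name `discrete_summary` and the sample distribution (the number of heads in three tosses of a fair coin) are assumptions.

```python
import math

def discrete_summary(values, probabilities):
    """Return (mean, variance, standard deviation) of a discrete distribution."""
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")
    # Steps 1-2: multiply each value by its probability, then add the products.
    mean = sum(x * p for x, p in zip(values, probabilities))
    # Steps 3-6: subtract the mean, square, weight by the probability, then add.
    variance = sum((x - mean) ** 2 * p for x, p in zip(values, probabilities))
    # Step 7: the standard deviation is the square root of the variance.
    return mean, variance, math.sqrt(variance)

# Hypothetical distribution: number of heads in three tosses of a fair coin.
values = [0, 1, 2, 3]
probabilities = [1 / 8, 3 / 8, 3 / 8, 1 / 8]

mean, variance, sd = discrete_summary(values, probabilities)
print(f"mean = {mean}, variance = {variance}, standard deviation = {sd:.4f}")
```

The function rejects any list of probabilities that does not add up to 1, which mirrors the requirement for a valid probability distribution.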
Module 2 - Quarter 3

Normal Distribution

Lesson 1: Normal Distribution

The normal distribution is the most important distribution in statistics. Many researchers from different fields use it to test their research hypotheses, which generate new knowledge that is then transformed into new applications that improve the quality of people's lives (Albay 2019, p. 82).

Six Properties

Lesson 2: Areas Under the Normal Curve

Finding the shaded area of a polygon is different from finding the area of the shaded region in the normal distribution. For polygons, we use formulas and simple calculations to find the shaded region. In the normal distribution, however, we use the z-Table to find the area that corresponds to a z-value.

A specific proportion of the area of the region under the curve can be calculated manually using the formula

\[ Y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(X-\mu)^2}{2\sigma^2}} \]

where
Y represents the height of the curve at a particular value of X
X represents any score in the distribution
σ represents the standard deviation of the population
μ represents the population mean
π = 3.1416
e = 2.7183

However, this formula is now rarely used because of the readily available z-Table, which displays the area of the region under the curve for a given z-value.

Steps in finding the area under the normal curve given a z-value
1) Express the given z-value as a three-digit number.
2) Using the z-Table, find the first two digits in the first column.
3) Find the third digit in the first row.
4) Read the area or probability at the intersection of the row (first two digits) and the column (third digit). The value observed at the intersection indicates the area for the given z-value.

Lesson 3: Shaded Region Under the Normal Curve

Probability notations are commonly used to express a lengthy idea about the normal curve in symbols.

The following are the most common probability notations used in studying concepts on the normal curve, where a and b are z-score values.

P(a < z < b) represents the probability that the z-value is between a and b.

P(z > a) represents the probability that the z-value is above a.

P(z < a) represents the probability that the z-value is below a.

P(z = a) = 0 represents the idea that the probability that the z-value is exactly equal to a is 0. A z-value equal to a corresponds to exactly one point on the curve. A line drawn at that single point only separates the area below it from the area above it and has no area of its own. That is why the probability that a z-value is exactly equal to a given value is 0.
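As a rough cross-check on z-Table lookups, the height of the curve and the areas written as P(z < a), P(z > a), and P(a < z < b) can be computed directly. This is a sketch under stated assumptions: the function names are illustrative, and the area is computed with the error function `math.erf` rather than read from a table. Some printed z-Tables list the area from the mean to z instead of the cumulative area, so values may differ by 0.5.

```python
import math

def normal_height(x, mu, sigma):
    """Height Y of the normal curve at score x (the formula with pi and e)."""
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi))

def area_below(z):
    """P(Z < z) for the standard normal curve, computed with the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def area_above(a):
    """P(Z > a): the area to the right of a."""
    return 1 - area_below(a)

def area_between(a, b):
    """P(a < Z < b): the area between two z-values."""
    return area_below(b) - area_below(a)

print(f"Height at the mean of the standard curve: {normal_height(0, 0, 1):.4f}")
print(f"P(z < 1.96)     = {area_below(1.96):.4f}")
print(f"P(z > 1.96)     = {area_above(1.96):.4f}")
print(f"P(-1 < z < 1)   = {area_between(-1, 1):.4f}")
```

Because a single point has no width, `area_between(a, a)` returns 0, which matches the notation P(z = a) = 0.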
Lesson 4: Understanding the Z-Scores

Let us begin understanding z-scores by acknowledging that, for a given distribution, a larger set of data is preferred in order to make good generalizations. However, the raw scores in a distribution may at times be large values, and large values cannot be accommodated at the baseline of the normal curve. Thus, the raw scores have to be transformed into z-scores in order to make meaningful decisions when finding the percentage and probability equivalent to a given measurement's distance from the mean.

For any population, the mean and the standard deviation are fixed. It follows that for every raw score X there corresponds exactly one z-score value, and vice versa. Therefore, if we wish to find the percentage associated with X, we must find its matching z-value using the z formula.

The z formula is given by:

\[ z = \frac{X - \mu}{\sigma} \quad \text{(population)} \qquad z = \frac{X - \bar{X}}{s} \quad \text{(sample)} \]

The area of the region under the normal curve represents the probability, percentage, or proportion of a given measurement value. The z-score is computed by subtracting the sample mean X̄ or population mean μ from the measurement X, then dividing the result by the standard deviation. The z-score indicates the distance between a given measurement X and the mean, expressed in standard deviations. It locates X either within a sample or within a population. A readily available z-Table is then used to obtain the corresponding area for a given z-score.

Steps in finding the z-score given the mean (µ), standard deviation (σ), and the measurement (X)

Lesson 5: Percentiles Under the Normal Curve

A percentile is a measure used in statistics indicating the value below which a given percentage of observations in a group of observations fall. It is a measure of relative standing, as it measures the relationship of a measurement to the rest of the data.
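The conversion from a raw score to a z-score, and from a percentile back to a raw score, can be sketched as follows. The function names and the sample numbers (mean 70, standard deviation 10) are hypothetical, and `statistics.NormalDist` from the Python standard library stands in for the z-Table.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Distance of measurement x from the mean, in standard deviations."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

def score_at_percentile(percent, mean, sd):
    """Raw score below which the given percentage of a normal population falls."""
    z = NormalDist().inv_cdf(percent / 100)  # z-value with that area to its left
    return mean + z * sd

# Hypothetical population: mean 70 and standard deviation 10.
print("z-score of X = 85:", z_score(85, 70, 10))
print(f"90th percentile: {score_at_percentile(90, 70, 10):.2f}")
```

The second function simply reverses the z formula: it finds the z-value for the requested area and then solves for X.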
SUMMARY

⮚ A standard normal curve is a normal probability distribution that has a mean equal to 0 and a standard deviation equal to 1.

⮚ The normal probability distribution has the following properties:

1. The curve of the distribution is bell-shaped.

2. The curve is symmetrical about the mean.

3. The mean, median, and mode are of equal values and, when sketched, they coincide at the center of the graph.

4. The width of the curve is determined by the standard deviation of the distribution.

5. The curve extends indefinitely, approaching the x-axis but never touching it. Thus, the curve is asymptotic to the x-axis.

6. The area of the region under the curve is 1. It represents the probability, percentage, or proportion associated with specific sets of measurement values.

⮚ Formula for computing the z-value.


Module 3 - Quarter 3

Random Sampling and Sampling Distribution

Lesson 1: Random Sampling

Wrong conclusions may be drawn from the samples in letters a and b. These samples will not represent the common brand of cellphone of Mila's classmates and the SHS students' most admired young actors and actresses. The sample in letter c is the best representation of the described population.

Now that you know the idea of representativeness of a sample for a population, the next activity will lead you to the process of getting a sample that is a good representative of a population. This process is called random sampling.

Types of Random Sampling Techniques

✔ Simple Random/Lottery Sampling – a sampling technique by which every member of the population has an equal chance to be chosen as sample (drawn by lot)

✔ Systematic Sampling – a sampling technique by which every k-th member of the population is selected after a random start.

✔ Stratified Random Sampling – a sampling technique that is used when the population can be classified into groups or strata based on some characteristics such as age, gender, or socioeconomic status.

✔ Cluster Sampling – a sampling technique by which the sample is taken from different levels, generally from higher levels to lower levels

✔ Multi-Stage Sampling – a sampling technique that is done using a combination of different sampling techniques.

Illustrative example:
Suppose that your school has a population of 5,000 students and you want to know the average height of the students. It would be impractical to interview or to get the height of all students. All you need to do is to determine the sample size that will estimate the whole population. To do this, we will use Slovin's Formula to get the sample size.

Now that you know how to determine the sample size of a certain population, you are ready to learn how to compute the sample mean, which serves as an estimator for the population mean.

Illustrative example:
The heights in meters of 5 students chosen at random are 1.5, 1.23, 1.6, 1.4, and 1.3.

The mean height of these 5 students is computed as

Mean = (1.5 + 1.23 + 1.6 + 1.4 + 1.3) / 5
Mean = 7.03 / 5
Mean = 1.41 meters (rounded from 1.406)

Lesson 2: Parameter and Statistic

Statistic describes a sample and serves as an estimate for the whole population. It is used when the population is so large that surveying all of it would take too much time and money.

Parameter describes the entire population. It is a fact about the whole population, which is easy to figure out when the groups are small enough to measure.

Lesson 3: Sampling Distribution of the Sample Means from a Finite Population

A sampling distribution of sample means is the probability distribution of the means of all possible random samples of a specific size taken from a population.

A finite population is a population that has a fixed number of elements or observations.
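The two Lesson 1 examples (the sample size for 5,000 students and the mean height of five students) can be sketched as below. The notes do not preserve the formula itself or a margin of error, so this assumes the usual statement of Slovin's Formula, n = N / (1 + N e^2), and a hypothetical 5% margin of error. The function names are illustrative.

```python
import math

def slovin_sample_size(population_size, margin_of_error):
    """Slovin's Formula n = N / (1 + N * e^2), rounded up to a whole person."""
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)

def sample_mean(observations):
    """Sum of all observations divided by the number of observations."""
    return sum(observations) / len(observations)

# The 5% margin of error is an assumption; the notes do not state one.
print("Sample size for 5,000 students:", slovin_sample_size(5000, 0.05))

heights = [1.5, 1.23, 1.6, 1.4, 1.3]  # heights in meters, from the example
print(f"Mean height: {sample_mean(heights):.2f} meters")
```

The result is rounded up because a sample cannot include a fraction of a student.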

Lesson 4: Mean and Variance of the Sampling Distribution of Sample Means

Lesson 5: Sampling Distribution of the Sample Means from an Infinite Population

SUMMARY

The standard deviation of the sampling distribution of the sample means is also called the standard error of the mean. It tells how accurately the sample mean estimates the population mean. If the value of the standard deviation is small or very close to zero, then the sample mean is a good estimate for the population mean. If the value of the standard deviation is large, the sample mean is a poor estimate for the population mean.

A good estimate for the population mean can be obtained if the random sample size n is sufficiently large. This is stated as a theorem called the Central Limit Theorem.
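The sampling distribution of the sample means, its mean, and its standard error can be illustrated by listing every possible sample from a small finite population. This is a sketch under stated assumptions: the population of five values and the sample size of 2 are hypothetical, and samples are drawn without replacement.

```python
from collections import Counter
from itertools import combinations

population = [1, 2, 3, 4, 5]  # hypothetical finite population
n = 2                         # sample size (assumed for illustration)

# Mean of every possible sample of size n, drawn without replacement.
sample_means = [sum(sample) / n for sample in combinations(population, n)]

# Sampling distribution: each distinct sample mean with its probability.
distribution = {
    m: count / len(sample_means)
    for m, count in sorted(Counter(sample_means).items())
}

population_mean = sum(population) / len(population)
mean_of_means = sum(sample_means) / len(sample_means)
variance_of_means = sum((m - mean_of_means) ** 2 for m in sample_means) / len(sample_means)
standard_error = variance_of_means ** 0.5

for m, p in distribution.items():
    print(f"P(sample mean = {m}) = {p}")
print("Population mean:", population_mean)
print("Mean of the sample means:", mean_of_means)
print(f"Standard error of the mean: {standard_error:.4f}")
```

In this sketch the mean of the sample means comes out equal to the population mean, and the standard error measures how tightly the sample means cluster around it.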

Lesson 6: The Central Limit Theorem

Lesson 7: Defining the Sampling Distribution of the Sample Mean using the Central Limit Theorem

Lesson 8: Problems Involving Sampling Distribution of the Sample Mean

SUMMARY

✧ Random Sampling – a method of getting a sample by which every member of a population has an equal chance to be included.

✧ Lottery Sampling – a sampling technique by which every member of the population has an equal chance to be chosen as sample (drawn by lot)

✧ Systematic Sampling – a sampling technique by which every k-th member of the population is selected after a random start.

✧ Stratified Random Sampling – a sampling technique that is used when the population can be classified into groups or strata based on some characteristics such as age, gender, or socioeconomic status.

✧ Cluster Sampling – a sampling technique by which the sample is taken from different levels, generally from higher levels to lower levels

✧ Multi-Stage Sampling – a sampling technique that is done using a combination of different sampling techniques.

✧ Mean is the sum of all observations divided by the total number of observations

✧ Statistic describes a sample and serves as an estimate for the whole population. It is used when the population is so large that surveying all of it would take too much time and money.

✧ Parameter describes the entire population. It is a fact about the whole population, which is easy to figure out when the groups are small enough to measure.

✧ A finite population is a population that has a fixed number of elements or observations

✧ A sampling distribution of sample means is the probability distribution of the means of all possible random samples of a specific size taken from a population.

✧ These are the properties of the sampling distribution of the sample means:

a. The mean (μx̅) of the sampling distribution of the sample means is equal to the mean (μ) of the population from which the samples are taken. That is, μx̅ = μ

✧ The standard deviation of the sampling distribution is also called the standard error of the mean. It tells how accurately the sample mean estimates the population mean. If the value of the standard deviation is small or very close to zero, then the sample mean is a good estimate for the population mean. If the value of the standard deviation is large, the sample mean is a poor estimate for the population mean.

✧ The central limit theorem states that if random samples of size n are drawn from a population with mean μ and variance σ², the sampling distribution of the mean approaches a normal distribution with mean μ and variance σ²/n as n, the sample size, gets larger, regardless of the shape of the original population distribution.

✧ The Central Limit Theorem justifies the use of the formula when computing the probability that x̅ will take on a value within a given range in the sampling distribution of x̅
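The formula that the Central Limit Theorem justifies is not preserved in these notes. As a sketch, assume it is the z formula for sample means, z = (x̅ - μ) / (σ / √n), which follows from the stated mean μ and variance σ²/n. The function name and the numbers below are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_between(low, high, mu, sigma, n):
    """Approximate P(low < sample mean < high) using the Central Limit Theorem."""
    standard_error = sigma / sqrt(n)      # standard deviation of the sample means
    z_low = (low - mu) / standard_error   # z-value of the lower bound
    z_high = (high - mu) / standard_error # z-value of the upper bound
    standard_normal = NormalDist()
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

# Hypothetical population: mean 50, standard deviation 10, samples of size 25.
p = prob_sample_mean_between(48, 52, mu=50, sigma=10, n=25)
print(f"P(48 < sample mean < 52) = {p:.4f}")
```

With these numbers the standard error is 2, so the bounds sit one standard error on either side of the mean.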
Module 4 - Quarter 3

Estimation of Parameters

Lesson 1: The t-Distribution

The t-distribution – is the probability distribution that is used to estimate population parameters when the sample size is small and the population standard deviation is unknown.

Degree(s) of freedom – refers to the number of independent observations in a set of data, or the number of variables that are free to vary. The formula for the degrees of freedom is df = n – 1, where n is the number of observations.

Confidence level – usually expressed in percent, it sets the portion of the sample to be included within a known range of the true population. It also quantifies the probability that a member of the sample would fall within a known interval of the true population. If α (alpha) is the allowable sampling error, the confidence level is equal to 1 – α.

Confidence interval – also called interval estimate, is a range of values that is used to estimate a parameter. This estimate may or may not contain the true parameter value.

Here are several properties of the t-distribution:
1. The mean, median, and mode of the t-distribution are equal to 0.
2. The t-distribution is bell-shaped and symmetric about the mean.
3. The total area under the t-distribution curve is equal to 1.
4. The tails in the t-distribution are "thicker" than those in the standard normal distribution.
5. The standard deviation of the t-distribution varies with the sample size, but it is greater than 1.
6. The t-distribution is a family of curves, each determined by a parameter called the degrees of freedom. The degrees of freedom (sometimes abbreviated as df) are the number of free choices left after a sample statistic such as x̅ is calculated. When you use a t-distribution to estimate a population mean, the degrees of freedom are equal to one less than the sample size: df = n – 1
7. As the degrees of freedom increase, the t-distribution approaches the standard normal distribution, as shown in the figure. After df = 30, the t-distribution is close to the standard normal distribution.

Finding Areas and Percentiles

The process of solving problems on areas and percentiles in the t-distribution is similar to that in the z-distribution. We just refer to the t table to find the critical values.

Example 4
a. What is the 95th percentile of a one-tailed t-distribution when df = 10?

Solution:
Looking at the t table when df = 10, it shows that when α = 0.05, one-tailed, the t value is 1.812. Thus, the 95th percentile is 1.812.

b. What is the 90th percentile of a two-tailed t-distribution when df = 25?

Answer: t value is 1.708

To find the area under the t-distribution for a certain range of t-values, first identify whether a two-tailed or one-tailed t-test was done. The area always corresponds to 1 – α.
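The t table values in Example 4 can be checked numerically without a table. This is only a sketch using the Python standard library: the function names are illustrative, the area is approximated with Simpson's rule on the t-distribution's density, and the percentile is found by bisection. Part b is treated here as 5% in each tail, which is an assumption about how the two-tailed 90% is read from the table.

```python
import math

def t_pdf(t, df):
    """Height of the t-distribution curve with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """Area to the left of t, using Simpson's rule on [0, |t|] plus symmetry."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area

def t_percentile(p, df):
    """t-value with area p to its left, found by bisection."""
    if p < 0.5:
        return -t_percentile(1 - p, df)
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(f"One-tailed, area 0.95, df = 10: t = {t_percentile(0.95, 10):.3f}")
# Two-tailed 90%: 5% in each tail, so 0.95 of the area lies to the left.
print(f"Two-tailed 90%, df = 25:        t = {t_percentile(0.95, 25):.3f}")
```

A statistics library would normally be used for this, but the plain version shows that a t table is simply a record of areas under the curve.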
