
Statistics and Probability Q3

The document provides an overview of statistics and probability, focusing on discrete and continuous random variables, their definitions, and examples. It explains how to compute variance, mean, and standard deviation, as well as the properties of normal distribution and z-scores. Additionally, it discusses various sampling methods, including probability and non-probability sampling techniques, along with their advantages and disadvantages.

Uploaded by

paminawa
Copyright
© © All Rights Reserved

Statistics and Probability

Discrete and Continuous Variable

Variable
• A variable can change its value and may vary based on experiment outcomes.
• If its value depends on a random experiment, it’s called a random variable, which can take any real value.

Discrete Random Variable
• Obtained by counting and has distinct, separate values.
• Examples include:
1. Number of students present
2. Number of red marbles in a jar
3. Number of heads when flipping three coins
4. Students' grade level

Continuous Random Variable
• Obtained by measuring and can take any value within a range. Uses decimals.
• Examples include:
1. Height of students in class
2. Weight of students in class
3. Time it takes to get to school
4. Distance traveled between classes

Computing for Mean

x    P(x)           x ∙ P(x)
1    1/10 or 0.1    1/10 or 0.1
2    3/10 or 0.3    6/10 or 0.6
3    3/10 or 0.3    9/10 or 0.9
4    2/10 or 0.2    8/10 or 0.8
5    1/10 or 0.1    5/10 or 0.5

Mean: $M = \sum [x \cdot P(x)] = \frac{29}{10} = 2.9$

Computing for Variance

x    P(x)           x ∙ P(x)       x² ∙ P(x)
1    1/10 or 0.1    1/10 or 0.1    0.1
2    3/10 or 0.3    6/10 or 0.6    1.2
3    3/10 or 0.3    9/10 or 0.9    2.7
4    2/10 or 0.2    8/10 or 0.8    3.2
5    1/10 or 0.1    5/10 or 0.5    2.5

$M = 2.9$ and $\sum [x^2 \cdot P(x)] = 9.7$

Variance: $\sigma^2 = \sum [x^2 \cdot P(x)] - M^2 = 9.7 - 2.9^2 = 9.7 - 8.41 = 1.29$

Computing for Standard Deviation

Formula: $\sigma = \sqrt{\sigma^2} = \sqrt{1.29} = 1.135781669 \approx 1.14$

Data Distribution

Data can be distributed in various ways:
• Spread more to the left (left-skewed).
• Spread more to the right (right-skewed).
• Jumbled up with no clear pattern.

In many cases, data tends to cluster around a central value with no bias to the left or right, forming a Normal Distribution.
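
Returning to the probability table used in Computing for Mean and Computing for Variance, the arithmetic can be checked with a short Python sketch. The function name `discrete_stats` and the dictionary layout are illustrative choices, not part of the original notes.

```python
import math

# Probability table from the worked example: value x -> P(x)
distribution = {1: 0.1, 2: 0.3, 3: 0.3, 4: 0.2, 5: 0.1}


def discrete_stats(dist):
    """Return (mean, variance, standard deviation) of a discrete random variable."""
    total = sum(dist.values())
    if not math.isclose(total, 1.0):
        raise ValueError("probabilities must sum to 1")
    # Mean: sum of x * P(x)
    mean = sum(x * p for x, p in dist.items())
    # Sum of x^2 * P(x), the last column of the variance table
    second_moment = sum(x ** 2 * p for x, p in dist.items())
    # Variance: sum of x^2 * P(x) minus the squared mean
    variance = second_moment - mean ** 2
    return mean, variance, math.sqrt(variance)


mean, variance, std_dev = discrete_stats(distribution)
print(round(mean, 2), round(variance, 2), round(std_dev, 2))  # 2.9 1.29 1.14
```

The rounded results match the values worked out by hand: M = 2.9, variance 1.29, and standard deviation 1.14.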
Normal Distribution
• The graph of a normal distribution is bell-shaped and symmetric about the mean.
• It is often referred to as a bell curve.
• Data points are more frequent around the mean and become less frequent as they move away from the center.

Properties of Normal Distribution
1. The distribution curve is bell-shaped.
2. The curve is symmetrical about its center. The mean, the median, and the mode are equal and coincide at the center.
3. The width of the curve is determined by the standard deviation of the distribution.
4. The tails of the curve flatten out indefinitely along the horizontal axis, always approaching the axis but never touching it. That is, the curve is asymptotic to the base line.
5. The area under the curve is 1. Exactly half of the values are to the left of the center and exactly half the values are to the right.

z-Table
• The z-Table (or standard normal table) provides the area under the normal curve to the left of a given z-score.
• It is used to find probabilities or percentiles for normally distributed data.

z-Score
• The z-score measures how many standard deviations a data point (x) is from the mean.
• It is a measure of relative standing, showing where a value lies within a distribution.

Calculating the z-Score:

$z = \frac{x - M}{\sigma}$, where x is the data point, M is the mean, and σ is the standard deviation.

Interpretation of z-Score:
• A positive z-score indicates the value is above the mean.
• A negative z-score indicates the value is below the mean.

The magnitude of the z-score tells how far the value is from the mean in terms of standard deviations.
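
As a minimal sketch of how a z-score and its z-table area relate, the Python snippet below computes z for one data point and then the area under the standard normal curve to its left. The score of 85, mean of 75, and standard deviation of 5 are hypothetical values chosen for illustration. The area comes from the error function `math.erf` and stands in for a printed table lookup.

```python
import math


def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


def area_to_left(z):
    """Area under the standard normal curve to the left of z (what a z-table lists)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical example values, not taken from the notes
score, mean, std_dev = 85, 75, 5

z = z_score(score, mean, std_dev)
area = area_to_left(z)
print(f"z = {z:.2f}, area to the left = {area:.4f}")  # z = 2.00, area to the left = 0.9772
```

A positive z of 2.00 places the score two standard deviations above the mean, and the area of about 0.9772 means roughly 97.7% of values fall below it.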
Probability Sampling
• Every member of the population has a known, non-zero chance of being selected.
• Used when the goal is to generalize results to the entire population.
• Advantages: Unbiased, representative, and allows for statistical inference.
• Disadvantages: Can be time-consuming, expensive, and requires a complete population list.

1. Simple Random Sampling
• Definition: Every member of the population has an equal chance of being selected.
• Uses: Ideal for small, homogeneous populations.
• Advantages: Easy to implement and free from bias.
• Disadvantages: Requires a complete list of the population, which may not always be available.

2. Systematic Sampling
• Definition: Selects every k-th element from a list after a random start.
• Sampling interval: $k = \frac{N}{n}$, where N is the population size and n is the sample size.
• Uses: Useful when the population is large and evenly distributed.
• Advantages: Simple and ensures even coverage of the population.
• Disadvantages: Risk of periodicity if the population has a hidden pattern.

3. Stratified Sampling
• Definition: Divides the population into subgroups (strata) and randomly samples from each stratum.
• Sample size per stratum: $\frac{n}{N} \cdot \text{(stratum size)}$, where n is the total sample size and N is the population size.
• Uses: When the population has distinct subgroups that need representation.
• Advantages: Ensures representation of all subgroups and improves accuracy.
• Disadvantages: Requires knowledge of the population structure and can be complex.

4. Cluster Sampling
• Definition: Divides the population into naturally occurring groups (clusters), randomly selects some clusters, and samples the members of the selected clusters.
• Uses: When the population is large or widely spread and a complete list of individuals is not available.
• Advantages: Saves time and cost because only the selected clusters are surveyed.
• Disadvantages: Less precise than simple random sampling when the clusters differ from one another.

5. Multi-Stage
• Definition: Carries out sampling in two or more stages, for example randomly selecting clusters first and then randomly sampling members within each selected cluster.
• Uses: When the population is very large and organized in several levels (e.g., regions, schools, classes).
• Advantages: Reduces cost and effort for large populations.
• Disadvantages: More complex to design, and sampling error can build up at each stage.

6. Area
• Definition: A form of cluster sampling where geographic areas are selected as clusters.
• Uses: Commonly used in surveys covering large regions.
• Advantages: Practical for geographically dispersed populations.
• Disadvantages: May not capture diversity within areas.

Non-Probability Sampling
• Definition: Members of the population do not have a known or equal chance of being selected.
• Uses: Used when the goal is exploratory or when resources are limited.
• Advantages: Cost-effective, quick, and easy to implement.
• Disadvantages: Results may not be generalizable and can be biased.

1. Convenience
• Definition: Samples are taken from a group that is easily accessible.
• Uses: Exploratory research or pilot studies.
• Advantages: Quick, easy, and inexpensive.
• Disadvantages: Highly biased and not representative of the population.

2. Quota
• Definition: Selects samples based on predefined quotas to reflect population subgroups.
• Uses: When time and resources are limited but some representation is needed.
• Advantages: Ensures some diversity and is faster than probability sampling.
• Disadvantages: Non-random selection can introduce bias.

3. Snowball
• Definition: Existing participants recruit future participants from their network.
• Uses: For hard-to-reach or hidden populations (e.g., rare diseases, niche groups).
• Advantages: Useful for studying rare or hidden populations.
• Disadvantages: Risk of bias and lack of representativeness.

4. Purposive
• Definition: Samples are selected based on the researcher’s judgment or specific criteria.
• Uses: When studying a specific subgroup or unique cases.
• Advantages: Targets specific characteristics of interest.
• Disadvantages: Highly subjective and not generalizable.

5. Consecutive
• Definition: Recruits every available participant who meets criteria over a period.
• Uses: Common in medical or clinical research.
• Advantages: Ensures a larger sample size and reduces selection bias.
• Disadvantages: May not be representative of the entire population.

6. Panel Sampling
• Definition: Repeatedly samples the same group of participants over time.
• Uses: Longitudinal studies to track changes or trends.
• Advantages: Allows for tracking changes over time.
• Disadvantages: Risk of participant dropout and high cost.

Population Mean
• Population: {1, 3, 5}. Therefore, N = 3.

$M = \frac{\sum x}{N} = \frac{1 + 3 + 5}{3} = \frac{9}{3} = 3$

Population Variance

$\sigma^2 = \frac{\sum (x - M)^2}{N}$

x    x − M    (x − M)²
1    -2       4
3    0        0
5    2        4

$\sum (x - M)^2 = 8$, so $\sigma^2 = \frac{8}{3} \approx 2.67$

Population Standard Deviation

$\sigma = \sqrt{\frac{\sum (x - M)^2}{N}} = \sqrt{\frac{8}{3}} = \frac{2\sqrt{6}}{3} \approx 1.63$

Sample Mean
• The mean of the sample means equals the population mean.

$M_{\bar{x}} = M$, so $M_{\bar{x}} = 3$

Sample Variance (with replacement)
• If n = 2, using $\sigma^2 = \frac{8}{3}$ from above:

$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n} = \frac{8/3}{2} = \frac{4}{3} \approx 1.33$

Sample Variance (without replacement)
• If n = 2:

$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1} = \frac{8/3}{2} \cdot \frac{3 - 2}{3 - 1} = \frac{2}{3} \approx 0.67$
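
The population formulas and the two variance formulas for the sample mean can be checked numerically. The sketch below uses the population {1, 3, 5} and n = 2 from the worked example. The function names are illustrative assumptions, not notation from the notes.

```python
import math

population = [1, 3, 5]  # population from the worked example, so N = 3


def population_mean(values):
    """M = (sum of x) / N."""
    return sum(values) / len(values)


def population_variance(values):
    """sigma^2 = (sum of (x - M)^2) / N."""
    m = population_mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)


def variance_of_sample_mean(values, n, with_replacement=True):
    """Variance of the sample mean for samples of size n."""
    N = len(values)
    base = population_variance(values) / n
    if with_replacement:
        return base
    # Finite population correction (N - n) / (N - 1) for sampling without replacement
    return base * (N - n) / (N - 1)


M = population_mean(population)
sigma_squared = population_variance(population)
sigma = math.sqrt(sigma_squared)
var_with = variance_of_sample_mean(population, n=2, with_replacement=True)
var_without = variance_of_sample_mean(population, n=2, with_replacement=False)

print(f"M = {M:.2f}, variance = {sigma_squared:.2f}, sigma = {sigma:.2f}")
print(f"with replacement: {var_with:.2f}, without replacement: {var_without:.2f}")
```

With σ² = 8/3, the variance of the sample mean works out to 4/3 with replacement and 2/3 without replacement.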
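Sample Standard Deviation (with replacement)

$\sigma_{\bar{x}} = \sqrt{\frac{\sigma^2}{n}}$

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

Sample Standard Deviation (without replacement)

$\sigma_{\bar{x}} = \sqrt{\frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}}$

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N - n}{N - 1}}$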
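
One way to see why these two formulas hold is to list every possible sample and measure the spread of the sample means directly. The sketch below assumes the population {1, 3, 5} and n = 2 from the worked example. It treats with-replacement samples as ordered pairs and without-replacement samples as unordered pairs, which is an assumption about how the samples are counted.

```python
import itertools
import math

population = [1, 3, 5]  # population from the worked example
n = 2                   # sample size
N = len(population)


def mean(values):
    return sum(values) / len(values)


def spread(values):
    """Standard deviation of a complete list of values (divides by the count)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


sigma = spread(population)

# Every ordered sample of size n drawn with replacement (9 samples)
means_with = [mean(s) for s in itertools.product(population, repeat=n)]
# Every unordered sample of size n drawn without replacement (3 samples)
means_without = [mean(s) for s in itertools.combinations(population, n)]

# Spread of the sample means found by direct enumeration
sd_with = spread(means_with)
sd_without = spread(means_without)

# The same quantities from the formulas
formula_with = sigma / math.sqrt(n)
formula_without = formula_with * math.sqrt((N - n) / (N - 1))

print(f"with replacement:    enumerated {sd_with:.4f}, formula {formula_with:.4f}")
print(f"without replacement: enumerated {sd_without:.4f}, formula {formula_without:.4f}")
```

In both cases the enumerated value agrees with the formula, and the mean of the sample means stays at 3, the population mean.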
