Basics Statistics and Probability

Statistics and Probability Basics

1. Terminology:

● Data: Data refers to any collection of facts, observations, or measurements. It can be
classified into two types: qualitative data (non-numerical) and quantitative data
(numerical).
● Variables: Variables are characteristics or attributes that can take on different values. They
can be classified as qualitative variables (nominal or ordinal) and quantitative variables
(discrete or continuous).
● Frequency Distribution: A frequency distribution is a tabular representation of data that
shows the number of times each value occurs. It helps in understanding the pattern and
distribution of data.
● Measures of Central Tendency: Measures of central tendency are used to determine the
centre or average of a set of data. The three commonly used measures are mean, median,
and mode.
● Measures of Dispersion: Measures of dispersion provide information about the spread or
variability of data. The commonly used measures are range, quartiles, variance, and
standard deviation.

● Probability: Probability is the likelihood of an event occurring. It is expressed as a value
between 0 and 1, where 0 represents impossibility, and 1 represents certainty.
● Random Variables: Random variables are variables whose values depend on the outcome
of a random event. They can be classified as discrete random variables or continuous
random variables.
● Normal Distribution: The normal distribution is a bell-shaped probability distribution that
is symmetric around the mean. It is widely used in statistical analysis due to its properties
and applicability to many real-world phenomena.

● Correlation: Correlation measures the strength and direction of the linear relationship
between two variables. It is represented by the correlation coefficient, which ranges from
-1 to 1.

[email protected] / [email protected] || 8100824600/01 || edusure.in



● Sampling: Sampling involves selecting a subset of individuals or items from a larger
population to make inferences about the entire population. It helps in reducing time, cost,
and effort in data collection.
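
Two of the terms above, the frequency distribution and the correlation coefficient, can be illustrated with a minimal Python sketch. The data values and the helper name `pearson_r` are made up for illustration and do not come from the notes.

```python
from collections import Counter
from math import sqrt

# Hypothetical marks of ten students (illustrative data only).
marks = [4, 7, 7, 5, 4, 7, 9, 5, 7, 4]

# Frequency distribution: each distinct value mapped to how often it occurs.
frequency = Counter(marks)
for value in sorted(frequency):
    print(value, frequency[value])


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


hours = [1, 2, 3, 4, 5]
scores = [2, 4, 6, 8, 10]  # exactly linear in hours, so r should be 1
print(pearson_r(hours, scores))
```

Reversing one of the two lists gives a perfectly decreasing relationship, for which the coefficient is -1.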

2. Measures of Central Tendency:

Measures of central tendency describe a complete data set by a single central value of that
data set. The three commonly used measures of central tendency are the mean, median, and
mode. Let's explore each of them in more detail:

A) MEAN: It is the ratio of the sum of the values of the items in a series to the total number
of items.

i) Arithmetic Mean: The arithmetic mean, often referred to as the average, is the most
commonly used measure of central tendency. It is calculated by summing up all the values
in a dataset and dividing the sum by the total number of values. If $x_1, x_2, \ldots, x_n$
are the observations, then

$$\text{AM} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

ii) Geometric Mean: The geometric mean is used when dealing with quantities that are
multiplicative in nature, such as growth rates, ratios, or compound interest rates. It is
calculated by taking the nth root of the product of n values in a dataset, where n represents
the total number of values. If $x_1, x_2, \ldots, x_n$ are the (positive) observations, then
the G.M. is defined as:

$$\text{GM} = \sqrt[n]{x_1 x_2 \cdots x_n} = (x_1 x_2 \cdots x_n)^{1/n}$$


iii) Harmonic Mean: The harmonic mean is used when dealing with rates, ratios, or
average speeds. It is the reciprocal of the arithmetic mean of the reciprocals of the values in
the dataset. If the observations are $x_1, x_2, \ldots, x_n$, their reciprocals are
$1/x_1, 1/x_2, \ldots, 1/x_n$. Thus, the harmonic mean formula is given by

$$\text{HM} = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}$$

Relation between AM, GM and HM:
For any two positive numbers, the product of the harmonic mean (HM) and the arithmetic
mean (AM) is equal to the square of the geometric mean (GM):
GM^2 = HM × AM.
This identity does not hold in general for more than two observations.
Also, for any set of positive observations, HM ≤ GM ≤ AM, with equality only when all the
observations are equal.

Note the following:
The arithmetic mean is used when the data values have the same units.
The geometric mean is used when the data set values have differing units.
When the values are expressed in rates we use harmonic mean.
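
The three means and their relation can be checked numerically. The sketch below is only an illustration: it assumes Python 3.8 or later (for `statistics.geometric_mean`), and the helper name `three_means` and the sample values are invented here.

```python
from math import isclose
from statistics import geometric_mean, harmonic_mean, mean


def three_means(values):
    """Return (AM, GM, HM) for a sequence of positive numbers."""
    return mean(values), geometric_mean(values), harmonic_mean(values)


# Two positive observations: GM^2 should equal AM * HM.
am2, gm2, hm2 = three_means([2.0, 8.0])
print(am2, gm2, hm2)
print(isclose(gm2 ** 2, am2 * hm2))

# Three observations: the ordering HM <= GM <= AM still holds,
# but the product identity is no longer guaranteed.
am3, gm3, hm3 = three_means([1.0, 2.0, 3.0])
print(hm3 <= gm3 <= am3)
print(isclose(gm3 ** 2, am3 * hm3))
```

The comparison uses `isclose` rather than `==` because the geometric mean is computed in floating point.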

B) MEDIAN: The value of the middle-most observation obtained after arranging the data in
ascending order is called the median of the data.

i) Median Formula when ‘n’ is odd: The median of a given set of numbers having an odd
number 'n' of observations can be expressed as

$$\text{Median} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ term}$$

ii) Median Formula when ‘n’ is even: The median of a given set of numbers having an even
number 'n' of observations can be expressed as:

$$\text{Median} = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{ term} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{ term}}{2}$$
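
The odd and even cases can be sketched in a few lines of Python. The function name `median_of` and the sample data are illustrative assumptions. Positions in the notes are 1-based, while Python indices are 0-based.

```python
def median_of(values):
    """Median via the odd/even position rule on the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th term, which is index mid (0-based).
        return ordered[mid]
    # Even n: average of the (n / 2)-th and (n / 2 + 1)-th terms.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([7, 3, 9, 1, 5]))      # 5
print(median_of([7, 3, 9, 1, 5, 11]))  # 6.0
```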

C) MODE: The mode, or modal value, of a given set of data is the value that occurs most
often, that is, the value with the highest frequency. A data set can have more than one
mode when several values share the highest frequency.
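
For ungrouped data, finding the mode amounts to counting frequencies. The following is a minimal sketch. The function name `modes` and the data are invented for illustration, and all tied values are returned, since a data set may have more than one mode.

```python
from collections import Counter


def modes(values):
    """Return every value that attains the highest frequency, sorted."""
    counts = Counter(values)
    if not counts:
        raise ValueError("mode of empty data is undefined")
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([2, 3, 3, 5, 7, 3, 5]))  # [3]
print(modes([1, 1, 2, 2, 4]))        # [1, 2]
```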
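The arithmetic mean, median, and mode change by the same amount under a change of origin
and in the same ratio under a change of scale, i.e. if y = (x − a)/b, then Cx = a + b·Cy,
where Cx and Cy denote the same measure computed for x and y. (The geometric and harmonic
means scale with a positive b but do not shift with the origin.)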
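
The effect of a change of origin and scale can be verified on a small example. In this sketch the data `x` and the constants `a` and `b` are arbitrary illustrative choices.

```python
from math import isclose
from statistics import mean, median

x = [10, 20, 30, 40, 100]
a, b = 10, 5                    # new origin a and scale b (illustrative)
y = [(xi - a) / b for xi in x]  # transformed data y = (x - a) / b

# Each measure of x is recovered as a + b * (the same measure of y).
mean_ok = isclose(mean(x), a + b * mean(y))
median_ok = isclose(median(x), a + b * median(y))
print(mean_ok, median_ok)
```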

3. Measures of Dispersion:

The dispersion or scatter in the data is measured based on the observations and the type of
measure of central tendency used. The different types of measures of dispersion are:

● Range: Given a data set, the range can be defined as the difference between the maximum
value and the minimum value. It can be expressed as
Range = H − S
where H is the largest value and S is the smallest value in a data set.

● Variance: The average squared deviation from the mean of the given data set is known as
the variance. This measure of dispersion checks the spread of the data about the mean. For
n observations $x_1, x_2, \ldots, x_n$ with mean $\bar{x}$, it can be expressed as

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$$

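● Standard Deviation: The square root of the variance gives the standard deviation. Thus, the
standard deviation also measures the variation of the data about the mean. For n
observations $x_1, x_2, \ldots, x_n$ with mean $\bar{x}$, it can be expressed as

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Standard Deviation of Discrete Random Variable:

$$\sigma = \sqrt{\sum_{i} (x_i - \mu)^2 \, p(x_i)}$$

Standard Deviation of Continuous Random Variable:

$$\sigma = \sqrt{\int_{-\infty}^{\infty} (x - \mu)^2 f(x)\, dx}$$

where $\mu$ is the mean (expected value) of the random variable, $p(x_i)$ is the probability
that it takes the value $x_i$, and $f(x)$ is its probability density function.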
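
The variance and standard deviation can be computed directly from their definitions. The sketch below uses the population form (dividing by n) and cross-checks it against `statistics.pvariance` and `statistics.pstdev`. The data set and the fair-die example for a discrete random variable are illustrative assumptions.

```python
from fractions import Fraction
from math import sqrt
from statistics import pstdev, pvariance

# Population variance and standard deviation of a small data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mu = sum(data) / len(data)
variance = sum((x - mu) ** 2 for x in data) / len(data)
sd = sqrt(variance)
print(variance, sd)
print(pvariance(data), pstdev(data))  # library cross-check

# Discrete random variable: the face shown by a fair six-sided die.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}
mu_die = sum(x * p for x, p in pmf.items())
var_die = sum((x - mu_die) ** 2 * p for x, p in pmf.items())
sd_die = sqrt(var_die)
print(mu_die, var_die, sd_die)
```

Using `Fraction` for the die keeps the mean and variance exact. Only the final square root is a floating-point value.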

● Mean Deviation: The mean deviation gives the average of the data's absolute deviation
about a central point. This central point could be the mean, median, or mode. For n
observations $x_1, x_2, \ldots, x_n$ and central point $A$, it can be expressed as

$$\text{MD} = \frac{1}{n}\sum_{i=1}^{n} |x_i - A|$$

Mean Deviation is least when it is measured around Median.

● Quartile Deviation: Quartile deviation can be defined as half of the difference between the
third quartile and the first quartile in a given data set. It can be expressed as

$$\text{QD} = \frac{Q_3 - Q_1}{2}$$

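
Mean deviation and quartile deviation can be sketched as follows. The helper names and data are illustrative. Note that several conventions exist for locating quartiles, and this example simply uses the default method of Python's `statistics.quantiles`, which may differ from the convention used in a given textbook.

```python
from statistics import mean, median, quantiles


def mean_deviation(values, centre):
    """Average absolute deviation of the values about a chosen centre."""
    return sum(abs(x - centre) for x in values) / len(values)


def quartile_deviation(values):
    """Half the distance between the third and first quartiles."""
    q1, _q2, q3 = quantiles(values, n=4)
    return (q3 - q1) / 2


skewed = [1, 2, 3, 4, 100]
md_about_mean = mean_deviation(skewed, mean(skewed))
md_about_median = mean_deviation(skewed, median(skewed))
print(md_about_mean, md_about_median)  # the second is the smaller one

print(quartile_deviation([1, 2, 3, 4, 5, 6, 7]))
```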
All measures of dispersion above remain unchanged under a change of origin but change in
the same ratio under a change of scale, i.e. if y = (x − a)/b, then Dx = |b|·Dy. The variance,
being a squared measure, changes by the factor b^2 instead.
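
A quick numerical check of this property, with arbitrary illustrative data `x` and constants `a` and `b`:

```python
from math import isclose
from statistics import pstdev, pvariance

x = [10, 20, 30, 40, 100]
a, b = 10, 5                    # change of origin a and scale b (illustrative)
y = [(xi - a) / b for xi in x]

range_x = max(x) - min(x)
range_y = max(y) - min(y)

# Shifting by a has no effect; range and standard deviation scale by |b|,
# while the variance scales by b squared.
range_ok = isclose(range_x, abs(b) * range_y)
sd_ok = isclose(pstdev(x), abs(b) * pstdev(y))
var_ok = isclose(pvariance(x), b ** 2 * pvariance(y))
print(range_ok, sd_ok, var_ok)
```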

4. Probability:
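
Probability can be defined as the ratio of the number of favourable outcomes to the total number
of outcomes of an experiment, provided all outcomes are equally likely.

$$P(E) = \frac{\text{number of favourable outcomes}}{\text{total number of outcomes}} = \frac{n(E)}{n(S)}$$

The following terms in probability theory help in a better understanding of the concepts of
probability.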
❖ Experiment: A trial or an operation conducted to produce an outcome is called an
experiment.
❖ Sample Space: All the possible outcomes of an experiment together constitute a sample
space. For example, the sample space of tossing a coin is {head, tail}.


❖ Favourable Outcome: An outcome that produces the desired result or expected event is
called a favourable outcome. For example, when we roll two dice, the favourable
outcomes for getting a sum of 4 on the two dice are (1,3), (2,2), and (3,1).
❖ Trial: A trial denotes doing a random experiment.
❖ Random Experiment: An experiment that has a well-defined set of outcomes, but whose
outcome cannot be predicted in advance, is called a random experiment. For example, when
we toss a coin, we know that we would get a head or a tail, but we are not sure which one
will appear.
❖ Event: Any collection of outcomes of a random experiment, that is, any subset of the
sample space, is called an event.
❖ Equally Likely Events: Events that have the same chances or probability of occurring are
called equally likely events. For example, when we toss a fair coin, there are equal chances
of getting a head or a tail.
❖ Exhaustive Events: When a set of events together covers every outcome in the sample
space, we call the events exhaustive.
❖ Mutually Exclusive Events: Events that cannot happen simultaneously are called mutually
exclusive events. For example, the weather can be either hot or cold. We cannot experience
both at the same time.
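
The two-dice example from the list of terms can be worked out by enumerating the sample space. This is a minimal sketch assuming two fair six-sided dice. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: every ordered pair of faces from two six-sided dice.
sample_space = list(product(range(1, 7), repeat=2))

# Favourable outcomes: pairs whose faces sum to 4.
favourable = [outcome for outcome in sample_space if sum(outcome) == 4]

# Classical probability: favourable outcomes over total outcomes.
p_sum_is_4 = Fraction(len(favourable), len(sample_space))

print(favourable)   # [(1, 3), (2, 2), (3, 1)]
print(p_sum_is_4)   # 1/12
```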
NOTE: Rules for Probability:

● P(E) = 0 if and only if E is an impossible event.
● The sample space S contains every possible outcome, so it is a certain event and P(S) = 1,
the maximum probability any event can have.
● P(E) = 1 if and only if E is a certain event.
● 0 ≤ P(E) ≤ 1.
● There cannot be a negative probability for an event.
● If A1, A2, …, An are mutually exclusive and exhaustive events, then
P(A1) + P(A2) + P(A3) + … + P(An) = 1.
● If A and B are two mutually exclusive outcomes (two events that cannot occur at the
same time) then the probability of A or B occurring is the probability of A plus the
probability of B.
● Addition Rule: P(A or B) = P(A) + P(B) - P(A∩B)
● Complementary Rule: If A is an event and A′ (not A) is its complement, then
P(A′) = 1 - P(A), that is, P(A) + P(A′) = 1.
● Multiplication Rule: When an event is the intersection of two other events, that
is, events A and B need to occur simultaneously, then:
P(A ∩ B) = P(A)⋅P(B) (in case of independent events)
P(A ∩ B) = P(A)⋅P(B|A) (in case of dependent events)
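
The addition, complementary, and multiplication rules can be checked on a single roll of a fair die. The events `A`, `B`, and `C` and the helper `prob` below are chosen purely for illustration.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # faces of one fair die


def prob(event):
    """Classical probability of an event (a subset of the sample space)."""
    return Fraction(len(event & sample_space), len(sample_space))


A = {2, 4, 6}  # an even number
B = {4, 5, 6}  # a number greater than 3
C = {1, 2}     # a number less than 3

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
addition_ok = prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Complementary rule: P(not A) = 1 - P(A).
complement_ok = prob(sample_space - A) == 1 - prob(A)

# Multiplication rule, dependent events: P(A and B) = P(A) * P(B | A).
p_b_given_a = Fraction(len(A & B), len(A))
dependent_ok = prob(A & B) == prob(A) * p_b_given_a

# A and C happen to be independent: P(A and C) = P(A) * P(C).
independent_ok = prob(A & C) == prob(A) * prob(C)

print(addition_ok, complement_ok, dependent_ok, independent_ok)
```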

Past Paper Questions to be solved:

JNU SSS 2018: Q13, 14, 15, 30
JNU SSS 2020: Q16, 34
JNU SIS 2018: Q14, 16
JNU SIS 2019: Q28, 30
JNU SIS 2020: Q15, 26, 34
JNU SIS 2021: Q48
DSE 2001: Q30
DSE 2021: Q14, 47, 48
DSE 2022: Q46
SAU 2015: Q17, 26
CUET 2019: Q63, 96
CUET 2020: Q72
CUET 2021: Q77, 18
CUET 2022: Q21
CDS 2013: Q89
B.R Ambedkar 2012-13: Q25
B.R Ambedkar 2015-16: Q15
University of Hyderabad 2011: Q20, 62
University of Hyderabad 2017: Q8
GIPE 2021: Q14
APU 2019: Q33
APU 2021: Q21