Statistics w3

Uploaded by

Gunjan Choudhary

STATISTICS I

Methods to describe statistical data


Numerical representation
Pradeep Guin
([email protected])
B.A. (H) Economics, Statistics I (STAT-1004), Fall 2022
September 27 & 29, 2022
Jindal School of Government and Public Policy
O.P. Jindal Global (Institution of Eminence Deemed to be University)
Discussion points

¨ Numerical representation
¤ Measures of central tendency
¤ Measures of dispersion

¤ Measures of skewness

¨ Any other matter?


Measures of central tendency

¨ Summary statistics: a single value that describes the ‘centre
point’ of a dataset (where most values of a distribution fall)
¨ Central position of a dataset?
¨ Measures of central location
¨ Mean, median and mode are the three most widely used
measures of central tendency
¨ Data type determines which measure is most
appropriate
Mean (average)

¨ Most commonly used measure of central tendency


¨ Three types: arithmetic, geometric, harmonic
¨ Arithmetic mean: The value obtained by dividing
sum of values by number of values
¨ Denoted by 𝑥̅ (x-bar)
¨ Assume there are 5 numbers, then arithmetic mean
¤ $\bar{x} = (x_1 + x_2 + x_3 + x_4 + x_5)/5 = \frac{\sum x}{n}$
¨ For grouped data, arithmetic mean
¤ $\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$
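The two formulas above can be sketched in a few lines of Python. The numbers are hypothetical and are not taken from the course's Excel file. The grouped version assumes each class is represented by a single value (for example its mid-point).

```python
# Simple (ungrouped) data: sum of the values divided by the number of values.
values = [4, 8, 15, 16, 23]          # hypothetical observations
simple_mean = sum(values) / len(values)

# Grouped data: class value x_i occurs f_i times, so weight each x_i by f_i.
class_values = [10, 20, 30]          # hypothetical class mid-points (assumption)
frequencies = [2, 5, 3]              # hypothetical frequencies
grouped_mean = (
    sum(f * x for f, x in zip(frequencies, class_values)) / sum(frequencies)
)

print(simple_mean)
print(grouped_mean)
```

Expanding the grouped data into a flat list (10, 10, 20, 20, ...) and taking the simple mean gives the same result, so the grouped formula is only a shortcut.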
Mean (average)…contd.

¨ Refer to “example-data-w2.2.xlsx” file


¤ Compute mean for simple and grouped data
¤ In simple data, if value for each observation is increased by 2.5,
what is the new arithmetic mean?
¤ In grouped data, what happens to the arithmetic mean if the
frequency of each class is increased by 2?
¨ The mean monthly wage of five employees is ₹46,000.
What is the wage of 5th employee if wages for remaining
four employees are ₹10,000, ₹25,000, ₹50,000 and
₹45,000?
¤ What does this result imply?
n Implication of extreme values
n Extreme values as outliers
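One possible way to work through the wage question, using only the figures on the slide: the total wage bill must equal the mean times the number of employees, so the missing wage is whatever remains after subtracting the four known wages. The variable names are illustrative.

```python
mean_wage = 46_000
n_employees = 5
known_wages = [10_000, 25_000, 50_000, 45_000]

# Total wages = mean * n, so the fifth wage is the remainder.
fifth_wage = mean_wage * n_employees - sum(known_wages)
print(fifth_wage)  # 100000

# Mean of the four known wages alone, for comparison with the overall mean.
mean_of_four = sum(known_wages) / len(known_wages)
print(mean_of_four)  # 32500.0

# Adding a constant (here 2.5) to every observation shifts the mean by 2.5.
shifted = [w + 2.5 for w in known_wages]
print(sum(shifted) / len(shifted) - mean_of_four)  # 2.5
```

A single wage of ₹100,000 raises the mean from ₹32,500 to ₹46,000, which is above four of the five actual wages. This illustrates how strongly the arithmetic mean reacts to an extreme value.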
Mean (average)…contd.

¨ Weighted arithmetic mean


¤ Certain values have greater weight (w)
¤ $\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}$
n When weights add to 1: multiply each weight by corresponding value and
sum all of it
¤ Similar to the mean of grouped data
¤ Helps in making decisions: which one is better?
n e.g., you plan to buy a car. You are trying to decide between car A and car
B based on three dimensions: engine type (45%), design (25%), safety
features (30%)
n You rate Car A: 8 (out of 10) for engine type, 5 for design, and 7 for safety
features
n You rate Car B: 6 (engine type), 7 (design), and 8 (safety features)
n Which car would you end up buying? [Your answer]
n e.g., your final grades? Class participation (10%), individual assignment
(20%), group-assignment (20%), final take home (50%)
¤ Impact of outliers?
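A minimal sketch of the car comparison, using the weights and ratings from the slide. The function name `weighted_mean` and the dictionary keys are illustrative choices, not part of the course material.

```python
weights = {"engine": 0.45, "design": 0.25, "safety": 0.30}
car_a = {"engine": 8, "design": 5, "safety": 7}
car_b = {"engine": 6, "design": 7, "safety": 8}


def weighted_mean(ratings, weights):
    """Return sum(w_i * x_i) / sum(w_i) over the shared keys."""
    total_weight = sum(weights.values())
    return sum(weights[k] * ratings[k] for k in weights) / total_weight


score_a = weighted_mean(car_a, weights)
score_b = weighted_mean(car_b, weights)
print(round(score_a, 2), round(score_b, 2))
```

With these weights Car A scores about 6.95 and Car B about 6.85, so the heavier weight on engine type tips the choice towards Car A even though Car B rates higher on two of the three dimensions. An unweighted mean would favour Car B (7.0 against roughly 6.67).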
Mean (average)…contd.

¨ Geometric mean (GM) for a set of n positive values


¤ $G = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} = (x_1 \cdot x_2 \cdots x_n)^{1/n}$
¤ Computation based on higher roots difficult
¤ $\log G = \frac{1}{n} \sum \log x_i$
¤ $G = \mathrm{antilog}(\log G)$
¨ GM for grouped data
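¤ If $x_1, x_2, \ldots, x_k$ are the class values (mid-points of the class intervals) with $f_1, f_2, \ldots, f_k$ as corresponding frequencies,
where $n = \sum_{i=1}^{k} f_i$
¤ $G = \sqrt[n]{x_1^{f_1} \cdots x_k^{f_k}} = (x_1^{f_1} \cdots x_k^{f_k})^{1/n}$
¤ $\log G = \frac{1}{n} \sum f_i \log x_i$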
¨ Typically used when values change exponentially or in case of skewed distribution
(can you think of one?)
¨ GM of 5, 8, 17, and 9 [no logarithm base is given, so assume base 10: Your answer]
¨ GM of 0, 1, 4, 2, 8 [Your answer]
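The log/antilog route above can be sketched as follows, assuming base-10 logarithms as the exercise suggests. The helper `geometric_mean_log10` is a hypothetical name. Returning 0 when a value is 0 is a convention adopted here, since the slide defines the GM for positive values only and log 0 is undefined.

```python
import math


def geometric_mean_log10(values):
    """GM via logs: G = antilog((1/n) * sum(log10 x_i))."""
    if any(v < 0 for v in values):
        raise ValueError("geometric mean needs non-negative values")
    if any(v == 0 for v in values):
        # log(0) is undefined; the product, and hence its n-th root, is 0.
        return 0.0
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log  # antilog in base 10


gm_first = geometric_mean_log10([5, 8, 17, 9])
gm_second = geometric_mean_log10([0, 1, 4, 2, 8])
print(gm_first, gm_second)
```

The first result agrees with the direct fourth root of 5 × 8 × 17 × 9 = 6120 (roughly 8.84). The second set shows why zeros are a problem: a single 0 forces the whole GM to 0, whatever the other values are.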
Mean (average)…contd.

¨ Harmonic mean (HM) of a set of n values (x1, x2,…, xn) is the reciprocal of
the average of the reciprocals
¨ $HM = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}$
¨ Weighted HM
¤ If $x_1, x_2, \ldots, x_k$ are the class values (mid-points of the class intervals) with $f_1, f_2, \ldots, f_k$ as corresponding
frequencies, where $n = \sum_{i=1}^{k} f_i$
¤ $HM = \frac{\sum f_i}{\sum (f_i / x_i)}$

¨ HM is useful for questions involving rates


¤ You travel 10 km at 50 km/h, the next 10 km at 60 km/h, and the final 10 km at 30 km/h.
What is your average speed? [Your answer]
¤ Useful when there are large outliers: large values pull the HM up far less than the AM
n AM vs. HM for 2, 6, 10, 200
¨ When all the (non-zero) numbers are the same: AM = GM = HM
¨ What if they are not the same? [Your answer]
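A short sketch of the speed question and the AM vs. HM comparison, using Python's standard `statistics` module (`fmean`, `geometric_mean` and `harmonic_mean` need Python 3.8 or later). Average speed is computed from first principles as total distance over total time, then checked against the harmonic mean of the three speeds.

```python
from statistics import fmean, geometric_mean, harmonic_mean

speeds = [50, 60, 30]                  # km/h over three equal 10 km legs
leg_km = 10

total_distance = leg_km * len(speeds)            # 30 km
total_time = sum(leg_km / s for s in speeds)     # hours
avg_speed = total_distance / total_time

hm_speed = harmonic_mean(speeds)       # same value, because the legs are equal
print(avg_speed, hm_speed)

data = [2, 6, 10, 200]
am = fmean(data)
gm = geometric_mean(data)
hm = harmonic_mean(data)
print(am, gm, hm)
```

The average speed is about 42.86 km/h, not the arithmetic mean of the speeds (about 46.67 km/h), because more time is spent on the slow leg. For 2, 6, 10, 200 the AM is 54.5 while the HM is only about 5.18, and the three means come out in the order AM > GM > HM. That ordering holds for any set of positive values that are not all equal.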
Median

¨ Middle value
¨ Splits the dataset in half
¨ For an odd number of observations, median is the middle
(centre-most) value, obtained after arranging the dataset in
ascending or descending order
¨ For even number of observations, median is the average of
two centre-most values
¨ Outliers and skewed data have little effect on the median
¤ e.g., income data
¨ In a symmetrical distribution, mean and median almost overlap
¨ In skewed data, mean is typically pulled away from the
centre, more towards the longer tail
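To illustrate the points above, here is a small sketch with hypothetical income figures (not from the course data). It covers the odd and even cases and shows how one extreme value moves the mean much more than the median.

```python
from statistics import mean, median

incomes = [20_000, 25_000, 30_000, 35_000, 40_000]   # hypothetical, odd n
incomes_with_outlier = incomes + [1_000_000]          # even n, one extreme value

# Odd n: the middle value of the sorted data.
print(median(incomes), mean(incomes))

# Even n: the average of the two centre-most values (30,000 and 35,000).
print(median(incomes_with_outlier), mean(incomes_with_outlier))
```

In the symmetric first list the mean and median coincide at 30,000. After adding the outlier the median only moves to 32,500, while the mean jumps to roughly 191,667, towards the long right tail.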
Mode

¨ Most frequently occurring value in the dataset


¨ Mode is the highest bar in a bar chart
¨ There can be more than one mode in a distribution
¤ Bimodal
¤ Multimodal

¨ Used with categorical, ordinal and discrete data


¨ A mode is unlikely in continuous data, since exact values rarely repeat
¨ Ordinal data: median or mode
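A brief sketch of finding modes with the standard library's `statistics.multimode` (Python 3.8 or later), which returns every most-frequent value rather than only one. Both datasets are hypothetical.

```python
from statistics import multimode

# Categorical data with two equally frequent values: a bimodal case.
colours = ["red", "blue", "red", "green", "blue"]
colour_modes = multimode(colours)
print(colour_modes)   # ['red', 'blue']

# Continuous measurements rarely repeat, so every value ties for "most frequent".
heights = [170.2, 165.8, 181.4, 175.1]
height_modes = multimode(heights)
print(height_modes)
```

For the continuous example every observation is returned, which is why the mode is usually reported for continuous data only after grouping values into classes.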
