Chapter 3&4 Stats
Learning Objectives
1. Calculate and interpret measures of central location.
2. Interpret a percentile and a boxplot.
3. Calculate and interpret measures of dispersion.
4. Explain mean-variance analysis and the Sharpe ratio.
5. Apply Chebyshev’s theorem, the empirical rule, and z-scores.
6. Calculate and interpret measures of association.
7. Calculate and interpret a geometric mean return and a compound growth rate.
Introductory Case: Investment Decision
• Dorothy, a financial advisor, compares the annual returns of two mutual funds: Fidelity’s Growth
Index and Value Index.
• Tasks include calculating typical returns, investment risks, and determining which fund provides
a greater return relative to risk.
3.1 Measures of Central Location
• Arithmetic Mean: The primary measure, calculated by summing all observations and dividing
by the number of observations.
• Median: The middle value of the sorted data, dividing it into two equal halves; preferred over
the mean when outliers are present.
• Mode: The most frequently occurring value; less informative when the data have several modes.
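The three measures can be compared on a small data set. The sketch below uses Python's standard `statistics` module; the salary figures are hypothetical and chosen only to show how a single large value pulls the mean away from the median.

```python
import statistics

# Hypothetical salaries (in $1,000s); illustrative only, not data from the text.
salaries = [40, 40, 65, 90, 145, 150, 550]

mean_salary = statistics.mean(salaries)      # sum of observations / number of observations
median_salary = statistics.median(salaries)  # middle value of the sorted data
modes = statistics.multimode(salaries)       # list of all most-frequent values

print(f"mean = {mean_salary:.2f}, median = {median_salary}, mode(s) = {modes}")
```

Here the one large salary raises the mean well above the median, which is why the median is often reported alongside the mean for skewed data.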
3.2 Percentiles and Boxplots
• Percentiles: The pth percentile divides the data so that approximately p% of observations fall
below it. Key percentiles are the quartiles: Q1 (25th), Q2 (50th, the median), and Q3 (75th).
• Boxplots: Graphically display the five-number summary (minimum, Q1, median, Q3, maximum)
and identify outliers.
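A five-number summary can be sketched as follows. The returns are hypothetical, and two choices here are assumptions rather than anything stated in these notes: the quartile interpolation method (software packages differ, so results may not match a textbook table exactly) and the common 1.5 × IQR convention for flagging outliers.

```python
import statistics

# Hypothetical annual returns (%); illustrative only.
returns = [-12.0, -3.5, 1.2, 4.8, 7.1, 9.6, 12.3, 15.0, 18.4, 45.0]

# Quartiles via the "exclusive" interpolation method (an assumed convention).
q1, q2, q3 = statistics.quantiles(returns, n=4, method="exclusive")
five_number = (min(returns), q1, q2, q3, max(returns))

# Assumed outlier convention: more than 1.5 * IQR beyond Q1 or Q3.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in returns if x < lower_fence or x > upper_fence]

print("five-number summary:", five_number)
print("outliers:", outliers)
```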
3.3 Measures of Dispersion
• Range: Difference between the maximum and minimum values.
• Interquartile Range (IQR): Difference between Q3 and Q1, representing the range of the
middle 50% of data.
• Mean Absolute Deviation (MAD): Average of absolute differences from the mean.
• Variance and Standard Deviation: Variance is the average of squared differences from the
mean (the sample variance divides by n - 1 rather than n); standard deviation is the positive
square root of the variance.
• Coefficient of Variation (CV): Standard deviation divided by the mean, useful for comparing
variability across datasets with different units or means.
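The dispersion measures above can be computed side by side. This is a minimal sketch on hypothetical data; it assumes sample (n - 1) formulas for variance and standard deviation and the "exclusive" quartile method for the IQR.

```python
import statistics

# Hypothetical observations; illustrative only.
data = [4.0, 8.0, 6.0, 5.0, 7.0]
mean = statistics.mean(data)

data_range = max(data) - min(data)                   # maximum - minimum
q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1                                        # spread of the middle 50%
mad = sum(abs(x - mean) for x in data) / len(data)   # mean absolute deviation
sample_var = statistics.variance(data)               # divides by n - 1
sample_sd = statistics.stdev(data)                   # square root of the sample variance
cv = sample_sd / mean                                # unit-free relative dispersion

print(data_range, iqr, mad, sample_var, round(sample_sd, 4), round(cv, 4))
```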
3.4 Mean-Variance Analysis and the Sharpe Ratio
• Mean-Variance Analysis: Evaluates investments based on return (mean) and risk (variance).
• Sharpe Ratio: Measures the extra reward per unit of risk, calculated as
(mean return - risk-free rate) / standard deviation. A higher ratio indicates a better return
relative to risk.
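The Sharpe ratio comparison can be sketched as below. The two return series and the risk-free rate are hypothetical, and the function name `sharpe_ratio` is introduced here only for illustration.

```python
import statistics

def sharpe_ratio(returns, risk_free_rate):
    """(mean return - risk-free rate) / sample standard deviation."""
    mean_return = statistics.mean(returns)
    sd = statistics.stdev(returns)
    return (mean_return - risk_free_rate) / sd

# Hypothetical annual returns (%) and risk-free rate (%); illustrative only.
fund_a = [12.0, -5.0, 20.0, 8.0, 15.0]   # higher mean, higher risk
fund_b = [6.0, 4.0, 9.0, 7.0, 4.0]       # lower mean, lower risk
risk_free = 2.0

sharpe_a = sharpe_ratio(fund_a, risk_free)
sharpe_b = sharpe_ratio(fund_b, risk_free)
print(f"Sharpe A = {sharpe_a:.3f}, Sharpe B = {sharpe_b:.3f}")
```

In this made-up example the fund with the lower mean return has the higher Sharpe ratio, which shows why the mean alone is not enough to rank investments.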
3.5 Analysis of Relative Location
• Chebyshev’s Theorem: For any data set, at least 1 - 1/k^2 of the observations lie within
k standard deviations of the mean, for any k > 1.
• Empirical Rule: For bell-shaped distributions, approximately 68% of observations fall within
one standard deviation of the mean, 95% within two, and almost all (about 99.7%) within three.
• Z-Scores: Measure the relative position of an observation as the number of standard deviations
it lies from the mean, z = (x - mean) / standard deviation; useful for detecting outliers.
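The following sketch checks Chebyshev's bound against a small data set and computes z-scores. The exam scores are hypothetical, and the helper names `chebyshev_lower_bound` and `z_score` are introduced only for this illustration.

```python
import statistics

def chebyshev_lower_bound(k):
    """Minimum share of observations within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k**2

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical exam scores; illustrative only.
scores = [55, 62, 68, 70, 71, 74, 75, 78, 80, 97]
mean_score = statistics.mean(scores)
sd_score = statistics.stdev(scores)

z_scores = [z_score(x, mean_score, sd_score) for x in scores]
share_within_2sd = sum(abs(z) <= 2 for z in z_scores) / len(scores)

print("Chebyshev bound for k = 2:", chebyshev_lower_bound(2))
print("observed share within 2 sd:", share_within_2sd)
```

The observed share can never fall below the Chebyshev bound; for bell-shaped data it is typically much closer to the empirical rule's 95%.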
3.6 Measures of Association
• Covariance: Measures the direction of the linear relationship between two variables; its
magnitude depends on the units of measurement, so it does not indicate strength.
• Correlation Coefficient: Measures both the direction and strength of the linear relationship,
ranging from -1 to 1.
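Both measures can be computed directly from their definitions. The sketch below assumes sample (n - 1) formulas and uses hypothetical return series; the function names are illustrative.

```python
import math

def sample_covariance(x, y):
    """Average product of deviations from the means, dividing by n - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Covariance scaled by both standard deviations; always between -1 and 1."""
    sd_x = math.sqrt(sample_covariance(x, x))  # covariance of x with itself is its variance
    sd_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (sd_x * sd_y)

# Hypothetical annual returns (%) for two funds; illustrative only.
fund_x = [12.0, -5.0, 20.0, 8.0, 15.0]
fund_y = [6.0, 4.0, 9.0, 7.0, 4.0]

cov_xy = sample_covariance(fund_x, fund_y)
corr_xy = correlation(fund_x, fund_y)
print(f"covariance = {cov_xy:.2f}, correlation = {corr_xy:.3f}")
```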
3.7 The Geometric Mean
• Geometric Mean: A multiplicative average, suitable for evaluating investment returns over
multiple periods and calculating average growth rates.
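A minimal sketch of both calculations follows, assuming returns are given as decimals (0.10 for 10%). The function names and the numbers are hypothetical.

```python
import math

def geometric_mean_return(returns):
    """[(1 + R1)(1 + R2)...(1 + Rn)]^(1/n) - 1 for decimal returns."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

def compound_growth_rate(first_value, last_value, periods):
    """Average per-period growth rate that takes first_value to last_value."""
    return (last_value / first_value) ** (1 / periods) - 1

# Hypothetical example: gain 10%, then lose 10%.
two_year = [0.10, -0.10]
g_return = geometric_mean_return(two_year)
arithmetic = sum(two_year) / len(two_year)

# Hypothetical example: sales grow from 100 to 121 over two periods.
growth_rate = compound_growth_rate(100, 121, 2)

print(f"arithmetic mean = {arithmetic:.4f}, geometric mean = {g_return:.4f}")
print(f"compound growth rate = {growth_rate:.4f}")
```

The arithmetic mean of the two returns is zero, yet the investment ends below where it started; the slightly negative geometric mean reflects that.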
Key Examples
• Investment Decision: Comparing returns and risks of Growth and Value funds.
• Employee Salaries: Calculating mean, median, and mode to understand central location.
• Boxplots: Constructing boxplots for Growth and Value funds to visualize data distribution.
• Dispersion Measures: Calculating range, IQR, MAD, variance, and standard deviation for
Growth and Value funds.
• Sharpe Ratio: Comparing the reward-to-risk ratio for Growth and Value funds.
• Relative Location: Using Chebyshev’s theorem and the empirical rule to analyze exam scores.
• Association Measures: Calculating covariance and correlation for Growth and Value funds.
• Geometric Mean: Evaluating investment returns and growth rates for a multinational
corporation.
Chapter 4: Introduction to Probability
Learning Objectives
1. Describe fundamental probability concepts.
2. Apply the rules of probability.
3. Calculate and interpret probabilities from a contingency table.
4. Apply the total probability rule and Bayes’ theorem.
5. Describe the various counting rules.
Introductory Case: Fitness Center Annual Membership
• A manager at 24/7 Fitness Center uses data from 400 past open house attendees to
develop a strategy for selecting which new attendees to contact.
• Tasks include constructing a contingency table and calculating empirical probabilities
concerning age and enrollment.
4.1 Fundamental Probability Concepts
• Probability: A numerical value between 0 and 1 that measures the likelihood of an event
occurring.
• Experiment: A process leading to one of several possible outcomes, with the actual
outcome unknown before the experiment.
• Sample Space (S): Contains all possible outcomes of an experiment.
• Event: Any subset of outcomes from the sample space. Simple events contain a single
outcome, while other events may contain several outcomes.
• Exhaustive Events: Include all possible outcomes of an experiment.
• Mutually Exclusive Events: Do not share any common outcomes; the occurrence of one
event precludes the occurrence of others.
• Union of Two Events (A ∪ B): All outcomes in A or B (or both).
• Intersection of Two Events (A ∩ B): All outcomes in both A and B.
• Complement of an Event (A^c): All outcomes in the sample space that are not in A.
4.2 Rules of Probability
• Complement Rule: $ P(A^c) = 1 - P(A) $
• Addition Rule: $ P(A \cup B) = P(A) + P(B) - P(A \cap B) $
• Conditional Probability: $ P(A|B) = \frac{P(A \cap B)}{P(B)} $
• Multiplication Rule: $ P(A \cap B) = P(A|B) \cdot P(B) $
• Independence: Two events A and B are independent if $ P(A|B) = P(A) $ or, equivalently,
$ P(A \cap B) = P(A) \cdot P(B) $
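These rules can be verified on a small sample space. The sketch below assumes a fair six-sided die with equally likely outcomes (classical probability) and uses exact fractions; the events A and B are chosen only for illustration.

```python
from fractions import Fraction

# Assumed experiment: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Classical probability: favorable outcomes / total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}   # roll is even
B = {4, 5, 6}   # roll is greater than 3

p_complement = 1 - prob(A)                        # complement rule
p_union = prob(A) + prob(B) - prob(A & B)         # addition rule
p_a_given_b = prob(A & B) / prob(B)               # conditional probability
p_intersection = p_a_given_b * prob(B)            # multiplication rule
independent = prob(A & B) == prob(A) * prob(B)    # independence check

print(p_complement, p_union, p_a_given_b, p_intersection, independent)
```

Because P(A|B) = 2/3 differs from P(A) = 1/2, these two events are not independent.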
4.3 Contingency Tables and Probabilities
• Contingency Table: Shows frequencies for two categorical variables, with each cell
representing a mutually exclusive combination of the pair of values.
• Joint Probability Table: Converts the contingency table by dividing the frequency in
each cell by the total number of outcomes.
• Marginal Probabilities: The values in the margins of the table, representing
unconditional probabilities.
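Converting a contingency table into joint, marginal, and conditional probabilities can be sketched as follows. The counts and category labels are hypothetical and are not the fitness center data from the introductory case.

```python
from fractions import Fraction

# Hypothetical contingency table: (age group, outcome) -> frequency.
counts = {
    ("younger", "enrolled"): 30,
    ("younger", "not enrolled"): 70,
    ("older", "enrolled"): 50,
    ("older", "not enrolled"): 50,
}

total = sum(counts.values())

# Joint probability table: each cell frequency divided by the total.
joint = {cell: Fraction(n, total) for cell, n in counts.items()}

def marginal(label, position):
    """Sum the joint probabilities over every cell matching the label."""
    return sum(p for cell, p in joint.items() if cell[position] == label)

p_enrolled = marginal("enrolled", 1)
p_older = marginal("older", 0)
p_enrolled_given_older = joint[("older", "enrolled")] / p_older

print(p_enrolled, p_older, p_enrolled_given_older)
```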
4.4 The Total Probability Rule and Bayes’ Theorem
• Total Probability Rule: Expresses the probability of an event as the sum of the probabilities
of its intersections with a set of mutually exclusive and exhaustive events. For an event B and
its complement: $ P(A) = P(A \cap B) + P(A \cap B^c) $
• Bayes’ Theorem: Updates probabilities based on new information.
$ P(B|A) = \frac{P(A|B) \cdot P(B)}{P(A)} $, where
$ P(A) = P(A|B) \cdot P(B) + P(A|B^c) \cdot P(B^c) $
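The two formulas combine naturally in code. In this sketch B is an underlying condition and A is a positive test result; all three input probabilities are hypothetical, and the function names are illustrative.

```python
def total_probability(p_a_given_b, p_a_given_not_b, p_b):
    """P(A) = P(A|B) * P(B) + P(A|B^c) * P(B^c)."""
    return p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

def bayes_posterior(p_a_given_b, p_a_given_not_b, p_b):
    """P(B|A) = P(A|B) * P(B) / P(A)."""
    p_a = total_probability(p_a_given_b, p_a_given_not_b, p_b)
    return p_a_given_b * p_b / p_a

# Hypothetical inputs; illustrative only.
prior = 0.10                 # P(B): condition holds before seeing the test
p_pos_given_b = 0.90         # P(A|B): test is positive when the condition holds
p_pos_given_not_b = 0.20     # P(A|B^c): false positive rate

p_positive = total_probability(p_pos_given_b, p_pos_given_not_b, prior)
posterior = bayes_posterior(p_pos_given_b, p_pos_given_not_b, prior)

print(f"P(A) = {p_positive:.4f}, P(B|A) = {posterior:.4f}")
```

With these inputs a positive result raises the probability of B from the 10% prior to about one third, far short of certainty, because false positives from the large B^c group dominate.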
4.5 Counting Rules
• Factorials: The number of ways to assign every member of a group of size n to n slots.
$ n! = n \cdot (n-1) \cdot (n-2) \cdots 1 $
• Combinations: The number of ways to choose x objects from a total of n objects, where
the order does not matter. $ \binom{n}{x} = \frac{n!}{x!(n-x)!} $
• Permutations: The number of ways to choose x objects from a total of n objects, where
the order does matter. $ P(n, x) = \frac{n!}{(n-x)!} $
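Python's standard `math` module provides all three counting rules directly (`math.comb` and `math.perm` require Python 3.8 or later). The group sizes below are hypothetical.

```python
import math

n = 5   # hypothetical group size
x = 2   # hypothetical number of objects chosen

arrangements = math.factorial(n)   # n! ways to assign n members to n slots
combinations = math.comb(n, x)     # n! / (x! (n - x)!), order does not matter
permutations = math.perm(n, x)     # n! / (n - x)!, order matters

print(arrangements, combinations, permutations)  # 120 10 20
```

Each combination of x objects can be ordered in x! ways, so the number of permutations always equals the number of combinations times x!.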
Key Examples
• Fitness Center Membership: Using a contingency table to calculate probabilities of
enrollment based on age groups.
• Snowboarder's Medal Probabilities: Calculating subjective probabilities for different
medal outcomes.
• Richest Americans' Ages: Using empirical probabilities to determine the likelihood of
different age groups.
• Rolling a Die: Calculating classical probabilities for different outcomes.
• Lie-Detector Test: Applying Bayes’ theorem to update probabilities based on test
results.
• Little-League Coach: Using factorials, combinations, and permutations to determine the
number of ways to assign players to positions.