Statistical Inference
Instruction
Due to COVID-19, this course will be taught remotely, and office hours will be online using
Zoom.
Most of the lectures will follow the second edition of Statistical Inference by Casella and Berger.
Before class meetings, there will be assigned pre-recorded modules you should watch. Class
time will be for discussion and questions about the topics, and homework questions.
The syllabus contains the weekly schedule. Before Wednesday’s class meeting, you should have
already watched
1. Set Theory
2. The Basics of Probability Theory: Axiomatic Foundations and the Calculus of
Probabilities
COURSE SYLLABUS
STA 5352 THEORY OF STATISTICS I
FALL, 2020
COURSE DESCRIPTION: Theory of random variables, distribution and density functions, statistical
estimation, and hypothesis testing. Topics include probability, probability distributions, expectation,
point and interval estimation, and sufficiency.
TEXT: Statistical Inference, Second Edition by George Casella and Roger L. Berger
SOFTWARE: R. As of mid-July, 2020, the most recent version of R is 4.0.2. If your version of R is not
current, you should update it.
INSTRUCTOR INFORMATION:
Name: Dr. Jane L. Harvill
E-mail address: [email protected]
Phone number: (254) 710-1517
Office location: Marrs McLean Science, Office 153
HOMEWORK: The average of the homework scores will count as 20% of your final average for the
course:
- Homework must be completed and turned in on-time. No late homework will be
accepted. Problems should appear in the order they’re assigned.
- Homework should be a PDF (Adobe Acrobat) file, e-mailed to me no later than 4:00
P.M. on the day homework is due. The name of the PDF file should have your last
name and the homework assignment number; for example, Harvill_hw5.pdf. Do
not send me a photo of your homework. If you do, it will be as if you never turned it
in.
- Your work must be neat and legible. If I cannot read it, I will not grade it. LaTeX is
preferred. However, if you can write legibly, that will be acceptable. For STA 5352,
using LaTeX to complete homework will not be required.
- You may work together on homework in the spirit of discussion. You may not share
your results with one another. You should not copy your answers from the solutions
manual, from the Internet, or any other resource. You can’t learn by copying
someone else’s work. You learn by doing. If I suspect you are not doing your own
work, or that you are sharing your work with someone else, there will be a
substantial penalty.
EXAMS: All exams are closed book and closed notes, and are to be done without any outside
resources or collaboration. I expect you will memorize formulas.
- There will be two exams during the semester. The score on each will count as 25% of
your final course average.
- There will be a comprehensive Final Exam. The score on the Final will count as 30%
of your final course average.
- Completed exams should be e-mailed to me no later than 4:00 P.M. on the day the
exam is due. The guidelines for homework (legibility, PDF files, etc.) are to be used
for exams.
FINAL COURSE AVERAGE: The final course average will be computed as a weighted average of
the average score of homework assignments and exams.
Course component      Percent of Final Course Average
Homework average      20%
Exam #1               25%
Exam #2               25%
Final exam            30%
GRADE FOR THE COURSE: Course grades will be assigned based on the final course average.
Final Course Average Course Grade
[95, 100] A
[90, 95) A-
[85, 90) B+
[80, 85) B
[75, 80) C+
[70, 75) C
<70 F
TITLE IX: Baylor University does not discriminate on the basis of sex or gender in any of its
education or employment programs and activities, and it does not tolerate discrimination or
harassment, sexual assault, sexual exploitation, stalking, intimate partner violence, and
retaliation (collectively referred to as prohibited conduct). For more information on how to
report, or to learn more about our policy and process, please visit www.baylor.edu/titleix. You
may also contact the Title IX office directly by phone, (254) 710-8454, or email,
[email protected].
The Title IX office understands the sensitive nature of these situations and can provide
information about available on- and off-campus resources, such as counseling and
psychological services, medical treatment, academic support, university housing, and other
forms of assistance that may be available. Staff members at the office can also explain your
rights and procedural options if you contact the Title IX Office. You will not be required to share
your experience. If you or someone you know feels unsafe or may be in imminent danger,
please call the Baylor Police Department (254-710-2222) or Waco Police Department (9-1-1)
immediately.
MILITARY ADVISORY: Veterans and active duty military personnel are welcomed and
encouraged to communicate, in advance if possible, any special circumstances (e.g., upcoming
deployment, drill requirements, disability accommodations). You are also encouraged to visit
the VETS Program Office with any questions at (254) 710-7264.
CLASS SCHEDULE
Week 1: Introduction and Probability Theory
-- Set Theory
-- The Basics of Probability Theory: Axiomatic Foundations and the Calculus of Probabilities
Week 2: Counting Techniques and Enumerating Outcomes
-- The Fundamental Theorem of Counting and Basic Ideas
-- Ordered, Without Replacement
-- Ordered, With Replacement
-- Unordered, Without Replacement
-- Unordered, With Replacement
-- Enumerating Outcomes
Week 3: Conditional Probability and Independence; Random Variables; Distribution Functions
-- Conditional Probability and Bayes’ Rule
-- Statistical Independence
-- Random Variables
-- Distribution Functions
Week 4: Density and Mass Functions; Transformations and Expectations; Differentiating Under the
Integral Sign
-- Mass Functions
-- Density Functions
-- Distribution of Functions of a Random Variable
-- Expected Values
-- Differentiating Under the Integral Sign; Miscellanea
Week 5: Exam #1 and Discrete Distributions
-- Discrete Distributions
-- Discrete Uniform Distribution
-- Hypergeometric Distribution
-- Binomial Distribution
-- Poisson Distribution
-- Negative Binomial Distribution
-- Geometric Distribution
Week 6: Discrete Distributions (continued) and Continuous Distributions
-- Poisson Distribution
-- Negative Binomial Distribution
-- Geometric Distribution
-- Continuous Distributions
-- Uniform Distribution
-- Gamma Distribution
-- Normal Distribution
Week 7: Continuous Distributions (continued); Exponential Families; Location and Scale
Families
-- Beta Distribution
-- Cauchy Distribution
-- Lognormal Distribution
-- Double Exponential Distribution
-- Exponential Families
-- Location and Scale Families
-- Inequalities and Identities; Miscellanea
Week 8: Multiple Random Variables
-- Joint and Marginal Distributions
-- Conditional Distributions and Independence
-- Bivariate Transformations
Week 9: Multiple Random Variables (continued)
-- Hierarchical Models and Mixture Distributions
-- Covariance and Correlation
-- Multivariate Distributions
-- Inequality and Miscellanea
Week 10: Exam #2 and Properties of a Random Sample
-- Basic Concepts of Random Samples
-- Sums of Random Variables from a Random Sample
-- Sampling from the Normal Distribution
Week 11: Properties of a Random Sample (continued)
-- Order Statistics
-- Convergence Concepts
-- Generating a Random Sample; Miscellanea
Week 12: Principles of Data Reduction
-- Introduction to Data Reduction
-- The Sufficiency Principle and Sufficient Statistics
-- Sufficient Statistics and Minimal Sufficiency
-- Ancillary Statistics
-- Complete Statistics
Week 13: The Likelihood Principle and Equivariance
-- The Likelihood Function
-- The Formal Likelihood Principle
-- The Equivariance Principle
-- Miscellanea
Week 14: Point Estimation
-- Introduction to Point Estimation
-- Method of Moments Estimators
-- Maximum Likelihood Estimators
Week 15: Point Estimation (continued)
-- Bayes Estimators
-- The EM Algorithm
Week 16: Final Exam (Due by 4:00 P.M., Monday, December 14, 2020)
https://www.history.com/news/brooklyn-bridge-construction-deaths
Probability: Building a Bridge
An M.S. or Ph.D. statistician needs a bridge like this.
Class Outline
Probability theory
Set theory
The calculus of probabilities: counting techniques
Conditional probability; Bayes’ theorem
Random variables
Distribution functions
Density and mass functions
Transformations
Expectations
Families of distributions
Discrete distributions
Continuous distributions
Exponential families
Location and scale families
Multiple random variables
Properties of a random sample
Sampling from a normal population
Order statistics
Convergence concepts
Principles of data reduction
Sufficiency
Minimal sufficiency
Completeness
Set Theory
Theory of Statistics I
Outline
1. Conventions and Best Practices
2. Sample Space
3. Events and Venn Diagrams
Solution: For each of the four coin flips, there are two possible outcomes, so an outcome for four
flips is a 4-tuple. Therefore the sample space is
S = { (HHHH), (HHHT), (HHTH), (HTHH),
      (THHH), (HHTT), (HTHT), (HTTH),
      (THHT), (THTH), (TTHH), (HTTT),
      (THTT), (TTHT), (TTTH), (TTTT) }.
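The listing can be checked by enumeration. The following is a minimal Python sketch (the course software is R; Python and the variable names here are illustrative assumptions) that builds every 4-tuple of H and T and counts them.

```python
from itertools import product

# Each flip has two possible outcomes; an outcome of the experiment is a 4-tuple.
flips = 4
sample_space = ["".join(outcome) for outcome in product("HT", repeat=flips)]

# Two choices on each of four flips gives 2 * 2 * 2 * 2 outcomes in total.
print(len(sample_space))
print(sample_space)
```

Changing `flips` to 5 gives the 32 5-tuples used in the later coin-flip example.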
Solution: The sample space for this example is more complicated than in other examples. It is
S = { (THHH), (HHTTH), (HTHTH),
      (HTTHH), (TTHHH), (THTHH), … }.
The cardinalities (numbers of outcomes) in the other sample spaces are finite. This sample space is
infinite, but we can provide a type of listing of the outcomes in S.
A Note on Terminology
Four terms are used to describe the number of elements in a set (the cardinality of the set), and
they can feel misleading. Let C(A) represent the cardinality of the set A.
1. The set A is said to be finite if C(A) < ∞.
2. The set A is said to be countable if the elements of A can be represented in a list.
3. If A is countable, but the number of elements in the list is infinite (like the example on the
previous slide), the set is said to be countably infinite.
4. If the elements of A cannot be represented by any type of list, then the set is said to be
uncountable (uncountably infinite).
b. with replacement?
Solution:
S = { (aa), (ab), (ac), (ad), (bb),
      (bc), (bd), (cc), (cd), (dd) }.
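The ten outcomes listed are the unordered selections of two items, with replacement, from four items. A short Python sketch, assuming the experiment is a draw of two items from {a, b, c, d} in which order does not matter (the statement of the problem is not reproduced here, so this is an inference from the solution):

```python
from itertools import combinations_with_replacement

items = "abcd"

# Unordered selection of two items with replacement: (ab) and (ba) are the same outcome.
sample_space = ["".join(pair) for pair in combinations_with_replacement(items, 2)]

print(len(sample_space))
print(sample_space)
```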
Convention is to denote events with upper-case letters from the beginning of the alphabet, A , B, etc.
Letters at the end of the alphabet are typically reserved for another purpose.
Events
Relations
In the sample space S, let ω represent an outcome, and A and B any two events. Then
A occurs if the outcome of the experiment is in A.
B is a subset of A, or B is contained in A, if every ω ∈ B is also in A; that is,
B ⊂ A ⇔ (ω ∈ B ⇒ ω ∈ A).
A and B are equal if and only if A ⊂ B and B ⊂ A; that is,
A = B ⇔ A ⊂ B and B ⊂ A.
Events: Operations
Union of Events
In S, let ω represent an outcome, and A and B represent any two events. The union of A and B,
written A ∪ B, is the set of all outcomes in A, in B, or in both A and B; that is,
A ∪ B = {ω ∈ S : ω ∈ A ∨ ω ∈ B}.
Intersection of Events
In S, let ω represent an outcome, and A and B represent any two events. The intersection of A and B,
written A ∩ B, or AB, is the set of all outcomes in both A and B; that is,
A ∩ B = AB = {ω ∈ S : ω ∈ A ∧ ω ∈ B}.
Complement of an Event
In S, let ω represent an outcome, and A an event. Then the complement of A, written A^C or A′, is the set
of all outcomes that are not in A; that is,
A^C = A′ = {ω ∈ S : ω ∉ A}.
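The three operations map directly onto set operators in most programming languages. The sketch below uses Python sets on a hypothetical experiment (one roll of a six-sided die, with events chosen only for illustration; neither is from the course examples).

```python
# Hypothetical experiment: one roll of a six-sided die.
S = set(range(1, 7))
A = {w for w in S if w % 2 == 0}   # the roll is even
B = {w for w in S if w > 3}        # the roll is greater than three

union = A | B            # outcomes in A, in B, or in both
intersection = A & B     # outcomes in both A and B
complement_A = S - A     # outcomes in S that are not in A

print(union, intersection, complement_A)
```

Note that the complement is always taken relative to the sample space S, which is why it is written as a set difference here.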
Example 1: Events
Flip the fair coin n=5 times. Consider the following events, and describe them in terms of outcomes in
the sample space.
Note: The sample space consists of 32 5-tuples.
The intersection of A and B is the set of all outcomes in both A and B. Therefore, the
intersection of A and B is
A ∩ B = AB = { (HHHHH), (HHHHT),
               (HHHTH), (HHTHH) }.
The intersection of A and C is the set of all outcomes in both A and C . There are no outcomes
in common, so the intersection is empty. We denote the empty set using a ∅ , and we write
A ∩C=∅ .
3. Solution: E ∪ G is
E ∪ G = { (1, 6), (2, 5), (3, 4), (4, 3),
          (5, 2), (6, 1), (3, 1), (3, 2),
          (3, 3), (3, 5), (3, 6) }.
To prove any of these, you must show the sets are equal. For example, to prove A ∪ B = B ∪ A, you
must show
A ∪ B ⊂ B ∪ A, and
B ∪ A ⊂ A ∪ B.
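As an illustration of the first inclusion, here is one way the element-chasing argument might be written out (a sketch; the text's own proof may be organized differently):

```latex
\begin{align*}
\omega \in A \cup B
  &\Rightarrow \omega \in A \text{ or } \omega \in B \\
  &\Rightarrow \omega \in B \text{ or } \omega \in A \\
  &\Rightarrow \omega \in B \cup A .
\end{align*}
```

Since ω was an arbitrary element of A ∪ B, this gives A ∪ B ⊂ B ∪ A. The second inclusion follows from the same argument with the roles of A and B exchanged.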
∩_{i=1}^∞ A_i = {ω ∈ S : ω ∈ A_i for all i}.
Unions and intersections can be defined over uncountable collections of sets. Let Γ be a set of
elements to be used as indices (an index set). Then
∪_{a∈Γ} A_a = {ω ∈ S : ω ∈ A_a for some a},
∩_{a∈Γ} A_a = {ω ∈ S : ω ∈ A_a for all a}.
Exhaustive
If A_1, A_2, … are such that ∪_{i=1}^∞ A_i = S, then the collection of sets is said to be exhaustive.
Partition of S
If A_1, A_2, … are mutually exclusive and exhaustive, then the collection A_1, A_2, … forms a partition of S.
Since a Smart TV can’t be two brands at the same time, PS = PM = PL = SM = SL = ML = ∅. Since the
store carries only these four brands, the union P ∪ S ∪ M ∪ L is the entire sample space. Therefore,
the events P, S, M, and L partition the sample space.
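The two conditions for a partition (mutually exclusive and exhaustive) can be checked mechanically. The sketch below uses a made-up inventory; the serial numbers, the brand labels as single letters, and the helper name `is_partition` are all hypothetical.

```python
# Hypothetical inventory: each TV (identified by a serial number) has exactly one brand.
inventory = {101: "P", 102: "S", 103: "M", 104: "L", 105: "S", 106: "P"}
sample_space = set(inventory)

# One event per brand: the set of TVs carrying that brand.
events = {brand: {tv for tv, b in inventory.items() if b == brand} for brand in "PSML"}

def is_partition(sets, space):
    """True if the sets are pairwise disjoint and their union is the whole space."""
    sets = list(sets)
    disjoint = all(a.isdisjoint(b) for i, a in enumerate(sets) for b in sets[i + 1:])
    exhaustive = set().union(*sets) == space
    return disjoint and exhaustive

print(is_partition(events.values(), sample_space))
```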
References
Venn, J. (1880) “On the Diagrammatic and Mechanical Representation of Propositions and
Reasonings.” Philosophical Magazine, Series 5, 10(59), pp. 1-18. DOI: 10.1080/14786448008626877.
Basics of Probability
Frequentist Interpretation of Probability
When an experiment is performed, the realization is an outcome ω in the sample space S. The
probability of any outcome ω ∈ S is the limit of the relative frequency of that outcome as the number of
times the experiment is performed goes to infinity.
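A simulation can make the limiting idea concrete. The sketch below is a Python illustration under assumed settings (a fair coin, a fixed random seed, and a hypothetical function name); it reports the proportion of heads for increasing numbers of flips, which should drift toward 0.5.

```python
import random

def relative_frequency(n_trials, seed=0):
    """Proportion of heads in n_trials simulated flips of a fair coin."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(rng.random() < 0.5 for _ in range(n_trials))
    return heads / n_trials

for n in (10, 100, 10_000, 1_000_000):
    print(n, relative_frequency(n))
```

A finite simulation only suggests the limit; it does not prove it.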
Other Properties of B
Since S^C = ∅, then S ∈ B. (Why?)
B is closed under countable intersections. (Why?)
σ -Algebra: Examples
Example. Consider any sample space S. Then the collection of sets B={ ∅ , S } is a σ -algebra.
This specific σ -algebra is the trivial σ -algebra.
Example 1. Consider flipping a fair coin 2 times. Then the sample space is
S = { (HH), (HT), (TH), (TT) }.
If we are interested in the smallest σ-algebra that contains every single-outcome event (and so is not
the trivial σ-algebra), we would use the power set, P_S, which is the set of all subsets of S. Since
C(S) = 4, P_S contains 2^4 = 16 sets.
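For a sample space this small, the power set can be listed and the σ-algebra properties checked directly. The following Python sketch (variable names are illustrative) verifies that the collection contains the empty set and is closed under complements and unions; with finitely many sets, closure under pairwise unions is all that countable unions require.

```python
from itertools import chain, combinations

S = ["HH", "HT", "TH", "TT"]

# All subsets of S, from the empty set up to S itself.
power_set = {
    frozenset(subset)
    for subset in chain.from_iterable(combinations(S, r) for r in range(len(S) + 1))
}
print(len(power_set))

full = frozenset(S)
# Closure checks: the empty set, complements, and unions all stay in the collection.
has_empty = frozenset() in power_set
closed_complement = all(full - A in power_set for A in power_set)
closed_union = all(A | B in power_set for A in power_set for B in power_set)
print(has_empty, closed_complement, closed_union)
```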
Example. Finding a σ-algebra on the unit interval (0, 1] isn’t trivial. The dyadic
intervals are defined by
\[
I_{k,n} = \left( m + \frac{k}{2^n},\; m + \frac{k+1}{2^n} \right],
\]
where k, n = 0, 1, 2, …, and k ≤ 2^n − 1. For the unit interval, m = 0. The σ-algebra for (0, 1] is
the one generated by the dyadic intervals,
\[
\mathcal{B} = \sigma\!\left( \bigcup_{n=1}^{\infty} \{ I_{k,n} : k = 0, 1, \dots, 2^n - 1 \} \right).
\]
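To see what the dyadic intervals look like at a fixed level n, the endpoints can be generated with exact fractions. This Python sketch assumes half-open intervals of the form (left, right] and uses a hypothetical helper name; it lists level n = 2 and checks that the pieces tile the unit interval without gaps.

```python
from fractions import Fraction

def dyadic_intervals(n, m=0):
    """Endpoints (left, right] of the intervals I_{k,n} for k = 0, ..., 2**n - 1."""
    width = Fraction(1, 2**n)
    return [(m + k * width, m + (k + 1) * width) for k in range(2**n)]

level_2 = dyadic_intervals(2)
print(level_2)

# Consecutive intervals share an endpoint, so the half-open pieces tile (0, 1] with no gaps.
tiles = all(level_2[i][1] == level_2[i + 1][0] for i in range(len(level_2) - 1))
print(level_2[0][0], level_2[-1][1], tiles)
```

Each increase in n halves every interval, which is why these intervals can approximate any subinterval of the unit interval as closely as desired.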
Since the dice are fair, each outcome ω_i has p_i = 1/36 for i = 1, 2, …, 36. Notice this satisfies the Axioms
of Probability.
Solution:
E = {(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6)} = {ω_13, ω_14, ω_15, ω_16, ω_17, ω_18}. Therefore
\[
P(E) = \sum_{i=13}^{18} p_i = p_{13} + p_{14} + \dots + p_{18} = \frac{6}{36}.
\]
Solution:
G = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)} = {ω_6, ω_11, ω_16, ω_21, ω_26, ω_31}. Define the index set
Γ = {6, 11, 16, 21, 26, 31}. Then
\[
P(G) = \sum_{i \in \Gamma} p_i = p_6 + p_{11} + p_{16} + p_{21} + p_{26} + p_{31} = \frac{6}{36}.
\]
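Both calculations sum the probabilities of the outcomes in the event. A short Python sketch of the same computation follows; the descriptions of E (first die shows a three) and G (faces sum to seven) are read off the listed outcomes, and the function name `prob` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import product

# 36 equally likely outcomes (first die, second die), each with probability 1/36.
S = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(S))

def prob(event):
    """Add the probabilities of the outcomes in the event."""
    return sum(p for outcome in S if outcome in event)

E = {w for w in S if w[0] == 3}          # first die shows a three
G = {w for w in S if w[0] + w[1] == 7}   # faces sum to seven

print(prob(E), prob(G), prob(E | G))
```

`Fraction` reduces automatically, so 6/36 is displayed as 1/6. The union E ∪ G has 11 outcomes rather than 12 because (3, 4) belongs to both events.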
The Axioms of Probability can be used to build many properties that can be used for calculating more
complicated probabilities.
Proof of (c): The sets A and A^C partition S, and P(S) = 1. Therefore, by the Axiom of Finite Additivity,
1 = P(S) = P(A ∪ A^C) = P(A) + P(A^C)
⇔ P(A^C) = 1 − P(A).
See pp. 10-11 of the text for the proof. For (a) and (b), the “trick” is to find an appropriate collection of
disjoint sets whose union equals the operation on A and B.
Example 2: Probabilities of Set Operations
Find the probabilities of the following events.
The sum of the faces of the dice is not nine.
Solution: This is the complement of F; that is, F^C. Therefore,
\[
P(F^C) = 1 - P(F) = 1 - \frac{4}{36} = \frac{32}{36}.
\]
The face of the first die is a one and the sum of the faces of the dice is exactly seven.
Solution: The described event is D and G, or DG = {(1, 6)}, which has probability 1/36.
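Both answers in this example can be confirmed by counting outcomes. The Python sketch below assumes F is the event that the faces sum to nine and D is the event that the first die shows a one, as the solutions describe; `prob` is a hypothetical helper using equally likely outcomes.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # 36 equally likely outcomes

def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(S))

F = {w for w in S if sum(w) == 9}   # faces sum to nine
D = {w for w in S if w[0] == 1}     # first die shows a one
G = {w for w in S if sum(w) == 7}   # faces sum to seven

print(prob(S - F), 1 - prob(F))   # two routes to P(F^C)
print(D & G, prob(D & G))
```

Counting the complement directly and subtracting from one give the same value, which is the content of the complement rule.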
Result (a) is called the Law of Total Probability, and result (b) is Boole’s inequality.