
STA 5352: Theory of Statistics I

Introduction and Syllabus


STA 5352: Theory of Statistics I
 COURSE DESCRIPTION: Theory of random variables, distribution and density functions,
statistical estimation and hypothesis testing. Topics include probability, probability
distributions, expectation, point and interval estimation, and sufficiency.
 PREREQUISITE: STA 5351 “Introduction to the Theory of Statistics”
 As the course title suggests, this is the first semester of a two-semester sequence on the theory
of statistics. It is my hope that by the end of the second semester, you will see that the
methods you are learning concurrently in your other courses are strongly rooted in the theory we will cover.

Instruction
 Due to COVID-19, this course will be taught remotely, and office hours will be online using
Zoom.
 Most of the lectures will follow the second edition of Statistical Inference by Casella and Berger.
 Before class meetings, there will be assigned pre-recorded modules you should watch. Class
time will be for discussion and questions about the topics, and homework questions.
 The syllabus contains the weekly schedule. Before Wednesday’s class meeting, you should have
already watched
1. Set Theory
2. The Basics of Probability Theory: Axiomatic Foundations and the Calculus of
Probabilities

Homework and Exams


 Homework should be neat and legible if hand-written, or it should be completed using LaTeX.
However, for STA 5352, LaTeX is not required as long as your writing is legible. If I cannot read your
writing, I will not grade your homework.
 No late homework will be accepted. Follow the guidelines on the syllabus for turning in
homework.
 You can collaborate on homework in the spirit of discussion, but your work must be your own. Do
not copy answers from the internet, from one another, from the solutions manual, or
anywhere else. Do the work yourself. You learn by doing, not copying. A severe penalty will be
assessed if I suspect you have violated these basic tenets.
 Exams are closed book and closed notes. I do expect you to learn the formulas; there is
benefit in doing so. Now that you know, get started! Don't let them snowball on you!

Final Course Average and Grade


FINAL COURSE AVERAGE: The final course average will be computed as a weighted average of
homework and exams.
Component Percent of Average
Homework 20 %
Exam #1 25 %
Exam #2 25 %
Final Exam 30 %
GRADE FOR THE COURSE: Grades for the course will be assigned based on the Final Course Average.
Final Course Average Course Grade
[95, 100] A
[90, 95) A-
[85, 90) B+
[80, 85) B
[75, 80) C+
[70, 75) C
<70 F

COURSE SYLLABUS
STA 5352 THEORY OF STATISTICS I
FALL, 2020

COURSE DESCRIPTION: Theory of random variables, distribution and density functions, statistical
estimation, and hypothesis testing. Topics include probability, probability distributions, expectation,
point and interval estimation, and sufficiency.

PREREQUISITE: STA 5351: “Introduction to Theory of Statistics.” Introduction to mathematics of


statistics. Fundamentals of probability theory, convergence concepts, sampling distributions, and
matrix algebra.

TEXT: Statistical Inference, Second Edition, by George Casella and Roger L. Berger

SOFTWARE: R. As of mid-July, 2020, the most recent version of R is 4.0.2. If your version of R is not
current, you should update it.
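
If you are unsure which version you have, you can check from the R console; a minimal snippet using only base R:

    R.version.string            # prints the version of R currently running
    getRversion() >= "4.0.2"    # TRUE if your installation is at least 4.0.2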

INSTRUCTOR INFORMATION:
Name: Dr. Jane L. Harvill
E-mail address: [email protected]
Phone number: (254) 710-1517
Office location: Marrs McLean Science, Office 153

METHOD OF INSTRUCTION AND EVALUATION: Due to COVID-19, and because I am in a high-risk


category, this class will be an online class.
 CONTENT DELIVERY: You are expected to watch the assigned, pre-recorded modules prior to
online class meetings. Modules are available on Canvas. Class meetings using Zoom are for
open discussion about the module content, and help with homework.

 HOMEWORK: The average of the homework scores will count as 20% of your final average for the
course:
- Homework must be completed and turned in on-time. No late homework will be
accepted. Problems should appear in the order they’re assigned.
- Homework should be a PDF (Adobe Acrobat) file, e-mailed to me no later than 4:00
P.M. on the day homework is due. The name of the PDF file should have your last
name and the homework assignment number; for example, Harvill_hw5.pdf. Do
not send me a photo of your homework. If you do, it will be as if you never turned it
in.
- Your work must be neat and legible. If I cannot read it, I will not grade it. LaTeX is
preferred. However, if you can write legibly, that will be acceptable; for STA 5352,
using LaTeX to complete homework will not be required.
- You may work together on homework in the spirit of discussion. You may not share
your results with one another. You should not copy your answers from the solutions
manual, from the Internet, or any other resource. You can't learn by copying
someone else's work. You learn by doing. If I suspect you are not doing your own
work, or that you are sharing your work with someone else, there will be a
substantial penalty.

 EXAMS: All exams are closed book and closed notes, and are to be done without any outside
resources or collaboration. I expect you will memorize formulas.
- There will be two exams during the semester. The score on each will count as 25% of
your final course average.
- There will be a comprehensive Final Exam. The score on the Final will count as 30%
of your final course average.
- Completed exams should be e-mailed to me no later than 4:00 P.M. on the day the
exam is due. The guidelines for homework (legibility, PDF files, etc.) are to be used
for exams.

 FINAL COURSE AVERAGE: The final course average will be computed as a weighted average of
the average score of homework assignments and exams.
Course component Percent of Final Course Average
Homework average 20%
Exam #1 25%
Exam #2 25%
Final exam 30%

 GRADE FOR THE COURSE: Course grades will be assigned based on the final course average.
Final Course Average Course Grade
[95, 100] A
[90, 95) A-
[85, 90) B+
[80, 85) B
[75, 80) C+
[70, 75) C
<70 F

 TITLE IX: Baylor University does not discriminate on the basis of sex or gender in any of its
education or employment programs and activities, and it does not tolerate discrimination or
harassment, sexual assault, sexual exploitation, stalking, intimate partner violence, and
retaliation (collectively referred to as prohibited conduct). For more information on how to
report, or to learn more about our policy and process, please visit www.baylor.edu/titleix. You
may also contact the Title IX office directly by phone, (254) 710-8454, or email,
[email protected].
The Title IX office understands the sensitive nature of these situations and can provide
information about available on- and off-campus resources, such as counseling and
psychological services, medical treatment, academic support, university housing, and other
forms of assistance that may be available. Staff members at the office can also explain your
rights and procedural options if you contact the Title IX Office. You will not be required to share
your experience. If you or someone you know feels unsafe or may be in imminent danger,
please call the Baylor Police Department (254-710-2222) or Waco Police Department (9-1-1)
immediately.

 MILITARY ADVISORY: Veterans and active duty military personnel are welcomed and
encouraged to communicate, in advance if possible, any special circumstances (e.g., upcoming
deployment, drill requirements, disability accommodations). You are also encouraged to visit
the VETS Program Office with any questions at (254) 710-7264.

CLASS SCHEDULE
 Week 1: Introduction and Probability Theory
-- Set Theory
-- The Basics of Probability Theory: Axiomatic Foundations and the Calculus of Probabilities
 Week 2: Counting Techniques and Enumerating Outcomes
-- The Fundamental Theorem of Counting and Basic Ideas
-- Ordered, Without Replacement
-- Ordered, With Replacement
-- Unordered, Without Replacement
-- Unordered, With Replacement
-- Enumerating Outcomes
 Week 3: Conditional Probability and Independence; Random Variables; Distribution Functions
-- Conditional Probability and Bayes’ Rule
-- Statistical Independence
-- Random Variables
-- Distribution Functions
 Week 4: Density and Mass Functions; Transformations and Expectations; Differentiating Under the
Integral Sign
-- Mass Functions
-- Density Functions
-- Distribution of Functions of a Random Variable
-- Expected Values
-- Differentiating Under the Integral Sign; Miscellanea
 Week 5: Exam #1 and Discrete Distributions
-- Discrete Distributions
-- Discrete Uniform Distribution
-- Hypergeometric Distribution
-- Binomial Distribution
-- Poisson Distribution
-- Negative Binomial Distribution
-- Geometric Distribution
 Week 6: Discrete Distributions (continued) and Continuous Distributions
-- Poisson Distribution
-- Negative Binomial Distribution
-- Geometric Distribution
-- Continuous Distributions
-- Uniform Distribution
-- Gamma Distribution
-- Normal Distribution
 Week 7: Continuous Distributions (continued); Exponential Families; Location and Scale
Families
-- Beta Distribution
-- Cauchy Distribution
-- Lognormal Distribution
-- Double Exponential Distribution
-- Exponential Families
-- Location and Scale Families
-- Inequalities and Identities; Miscellanea
 Week 8: Multiple Random Variables
-- Joint and Marginal Distributions
-- Conditional Distributions and Independence
-- Bivariate Transformations
 Week 9: Multiple Random Variables (continued)
-- Hierarchical Models and Mixture Distributions
-- Covariance and Correlation
-- Multivariate Distributions
-- Inequalities and Miscellanea
 Week 10: Exam 2 and Properties of a Random Sample
-- Basic Concepts of Random Samples
-- Sums of Random Variables from a Random Sample
-- Sampling from the Normal Distribution
 Week 11: Properties of a Random Sample (continued)
-- Order Statistics
-- Convergence Concepts
-- Generating a Random Sample; Miscellanea
 Week 12: Principles of Data Reduction
-- Introduction to Data Reduction
-- The Sufficiency Principle and Sufficient Statistics
-- Sufficient Statistics and Minimal Sufficiency
-- Ancillary Statistics
-- Complete Statistics
 Week 13: The Likelihood Principle and Equivariance
-- The Likelihood Function
-- The Formal Likelihood Principle
-- The Equivariance Principle
-- Miscellanea
 Week 14: Point Estimation
-- Introduction to Point Estimation
-- Method of Moments Estimators
-- Maximum Likelihood Estimators
 Week 15: Point Estimation (continued)
-- Bayes Estimators
-- The EM Algorithm
 Week 16: Final Exam (Due by 4:00 P.M., Monday, December 14, 2020)

Introduction to the Theory of Statistics


Introduction and Outline
Introduction to the Theory of Statistics

Introduction to the Theory of Statistics


 Many people believe that statistics is all about data analysis. Descriptive statistics. Inferential
statistics. And that’s mostly true.
 But many people don't realize that all of those methods – the t-tests, the ANOVA F tests, the χ²
goodness-of-fit tests – have a rich mathematical theory behind them.
 Because they are unaware of the theory, many people blindly apply a method – even if it’s not
the correct method – because it seems to fit. It’s obvious – to the most casual observer.
 The material we’re going to learn in this class is that theory. You need to know more. You need
to know why that t-test is appropriate, or not. Knowledge is power (pun intended).

Introduction to the Theory of Statistics


 This semester, we will spend a lot of time studying probability from a variety of perspectives.
 You may be wondering, “Why?”
 Simply put, probability is the bridge that connects descriptive statistics and inferential statistics.
Proper application of inferential statistics comes with understanding the underlying probability
structure.
Probability Bridge

Probability: Building a Bridge


It takes a lot of planning to build a bridge. A foundation must be laid, brick by brick. Cables and supports must be strong. It's hard work. At times, it's hard to see the big picture. But with hard work, you'll have a bridge that will serve its purpose and that is built to last.

https://www.history.com/news/brooklyn-bridge-construction-deaths
Probability: Building a Bridge
An M.S. or Ph.D. statistician needs a bridge like this.

Class Outline
 Probability theory
 Set theory
 The calculus of probabilities: counting techniques
 Conditional probability; Bayes theorem
 Random variables
 Distribution functions
 Density and mass functions
 Transformations
 Expectations
 Families of distributions
 Discrete distributions
 Continuous distributions
 Exponential families
 Location and scale families
 Multiple random variables
 Properties of a random sample
 Sampling from a normal population
 Order statistics
 Convergence concepts
 Principles of data reduction
 Sufficiency
 Minimal sufficiency
 Completeness
Set Theory
Theory of Statistics I
Outline
1. Conventions and Best Practices
2. Sample Space
3. Events and Venn Diagrams

Conventions and Best Practices


Before we begin, there are some best practices you should implement as you learn.
1. Different authors choose different notation to represent the same idea. So don't read the letter
used for the notation, and don't attach an idea to the letter. Instead, attach the terminology to
the letter, and meaning to the terminology. This will be illustrated as we progress.
2. Do NOT memorize the number assigned in the book to a result. For example, do not memorize
that the Probability Integral Transform is Theorem 2.1.10. Because that only makes sense in
Casella and Berger. Don’t refer to the Probability Integral Transform as Theorem 2.1.10.
Remember its name, and refer to that result using its name.
3. As we progress, there will be other best practices we will discuss.

Set Theory: Sample Space


Sample Space
The set S of all possible outcomes of a particular experiment is called the sample space for the
experiment.

This is a good time to illustrate the best practices.


1. In Casella and Berger, the sample space is denoted by S. But in other books, you might see
it denoted by Ω. So when reading this book, if you see S, don't think "S." Instead, think
"sample space."
2. In Casella and Berger, this is Definition 1.1.1 – who cares? It’s the definition of the sample
space.

Set Theory: Examples


We will use these examples throughout this section, and probably beyond. Admittedly, some of these
examples are trite. That is intentional. I do not want you to be thinking so hard about the example that
you miss the set theory concept being illustrated.
 Example 1. The experiment: Flip a fair, two-sided coin n times, where n is an integer that is
at least one. Under this framework, n can be arbitrarily large.
 Example 2. The experiment: Roll two fair dice, one at a time.
 Example 3. The experiment: From a finite population of N individuals, take a sample of size n ,
where n ≤ N .

Example 1: Sample Space


Example 1. Let H represent the event that the flipped coin lands heads up, and T that it lands tails up. What is the
sample space when a fair, two-sided coin is flipped n times if
a. n=1?
Solution: The sample space is the set containing all possible outcomes from the experiment. Since
there is one coin flip, with only two possible results, the sample space is
S= { H , T }.
Example 1: Sample Space (continued)
b. n=4 ?

Solution: For each of the four coin flips, there are two possible outcomes. So an outcome for four
flips will be a 4-tuple. Therefore the sample space is
S = { ( HHHH ) , ( HHHT ) , ( HHTH ) , ( HTHH ) , ( THHH ) , ( HHTT ) , ( HTHT ) , ( HTTH ) ,
( THHT ) , ( THTH ) , ( TTHH ) , ( HTTT ) , ( THTT ) , ( TTHT ) , ( TTTH ) , ( TTTT ) }.
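
Since the course software is R, one way to double-check a listing like this is to enumerate the outcomes by machine; a small sketch using only base R:

    # Enumerate all 4-tuples of H/T for four coin flips
    flips <- expand.grid(rep(list(c("H", "T")), 4), stringsAsFactors = FALSE)
    # Collapse each row into a single outcome string such as "HTTH"
    outcomes <- apply(flips, 1, paste0, collapse = "")
    outcomes          # the 16 elements of S
    length(outcomes)  # 2^4 = 16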

Example 1: Sample Space (continued)


Example 1. Suppose the experiment is flipping the coin until the third H is observed.

Solution: The sample space for this example is more complicated than in other examples. It is
S = { ( THHH ) , ( HHTTH ) , ( HTHTH ) , ( HTTHH ) , ( TTHHH ) , ( THTHH ) , … }.

The cardinalities (numbers of outcomes) in the other sample spaces are finite. This sample space is
infinite, but we can provide a type of listing of the outcomes in S.

A Note on Terminology
There are a few terms used to describe the number of elements in a set, or the cardinality of the set, that can feel misleading. Let C(A) represent the cardinality of the set A.
1. The set A is said to be finite if C(A) < ∞.
2. The set A is said to be countable if the elements of A can be represented in a list.
3. If A is countable, but the number of elements in the list is infinite (like the example on the
previous slide), the set is said to be countably infinite.
4. If the elements of A cannot be represented by any type of list, then the set is said to be
uncountable (or uncountably infinite).

Example 2: Sample Space


Example 2. For a six-sided die, an outcome is the number of pips on the face of the die that faces
upward. When rolling two fair dice, an outcome is a 2-tuple. The sample space is
S = { (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6) }.

Example 3: Sample Space


Example 3. Suppose the population is of size N=4 , and the sample taken is of size n=2. Each
member of the population is uniquely labeled using a lower-case letter a , b , c , or d . Assume the order
that two individuals are selected results in equivalent outcomes; that is, ( ab )=(ba), ( ac )=(ca), etc.
What is the sample space if sampling is
a. without replacement?
Solution:
S={( ab ) , ( ac ) , ( ad ) , ( bc ) , ( bd ) , ( cd ) }.

b. with replacement?
Solution:
S = { (aa), (ab), (ac), (ad), (bb), (bc), (bd), (cc), (cd), (dd) }.
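
Both sample spaces are easy to list in R; a brief sketch with base functions only (the labels a, b, c, d are taken from the example):

    pop <- c("a", "b", "c", "d")

    # Unordered, without replacement: all 2-element subsets
    apply(t(combn(pop, 2)), 1, paste0, collapse = "")   # "ab" "ac" "ad" "bc" "bd" "cd"

    # Unordered, with replacement: keep pairs whose labels are in alphabetical order
    grid <- expand.grid(first = pop, second = pop, stringsAsFactors = FALSE)
    keep <- grid$first <= grid$second
    paste0(grid$first[keep], grid$second[keep])         # the 10 outcomes "aa", "ab", ..., "dd"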

Illustrating Venn Diagrams

Set Theory: Events


Event
An event is any collection of possible outcomes of an experiment; that is, any subset of S, including S
itself.
1. A simple event is an event with a single outcome. A simple event can occur in only one way.
2. A compound event is the combination of two or more simple events. The number of ways a
compound event can occur is determined using counting techniques, and is not necessarily a
simple thing to find.

Convention is to denote events with upper-case letters from the beginning of the alphabet, A , B, etc.
Letters at the end of the alphabet are typically reserved for another purpose.

Set Theory: Venn Diagrams


Venn Diagram
A Venn Diagram is a diagram that is used to illustrate all possible logical relations between a finite
collection of different sets.
 Venn diagrams were conceived in (or around) 1880 by John Venn, an English mathematician,
logician, and philosopher.
 The sample space is often represented as a rectangle. All other sets will fall within the
rectangle.

Events
Relations
In the sample space S, let ω represent an outcome, and A and B any two events. Then
 A occurs if the outcome of the experiment is in A.
 B is a subset of A, or B is contained in A, if every ω ∈ B is also in A; that is,
B ⊂ A ⇔ (ω ∈ B ⇒ ω ∈ A).
 A and B are equal if and only if A ⊂ B and B ⊂ A; that is,
A = B ⇔ A ⊂ B and B ⊂ A.

Events: Operations
Union of Events
In S, let ω represent an outcome, and A and B represent any two events. The union of A and B,
written A ∪ B, is the set of all outcomes in A, in B, or in both A and B; that is,
A ∪ B= {ω ∈ S :ω ∈ A∨ω ∈ B } .

Events: Operations
Intersection of Events
In S, let ω represent an outcome, and A and B represent any two events. The intersection of A and B,
written A ∩ B, or AB, is the set of all outcomes in both A and B; that is,
A ∩ B= AB= { ω ∈ S :ω ∈ A∧ω ∈ B } .

Events: Operations
Complement of an Event
In S, let ω represent an outcome, and A a set. Then the complement of A, written A^C or A′, is the set
of all outcomes that are not in A; that is,
A^C = A′ = { ω ∈ S : ω ∉ A }.

Example 1: Events
Flip the fair coin n=5 times. Consider the following events, and describe them in terms of outcomes in
the sample space.
Note: The sample space consists of 32 5-tuples.

 A = the first two flips are H; then
A = { ( HHHHH ) , ( HHHHT ) , ( HHHTH ) , ( HHTHH ) , ( HHHTT ) , ( HHTHT ) , ( HHTTH ) , ( HHTTT ) }.
 B = the number of heads flipped is at least 4; then
B = { ( HHHHT ) , ( HHHTH ) , ( HHTHH ) , ( HTHHH ) , ( THHHH ) , ( HHHHH ) }.
 C = the number of heads flipped is exactly 1; then
C = { ( HTTTT ) , ( THTTT ) , ( TTHTT ) , ( TTTHT ) , ( TTTTH ) }.

Example 1: Events (continued)


Find
1. the union of A and B and the intersection of A and B.
2. A ∪ C and A ∩C .

Example 1: Events (continued)


1. Solution: The union of A and B is the set of outcomes in A, in B, or in both A and B.
Therefore, the union of A and B is
A ∪ B = { ( HHHHH ) , ( HHHHT ) , ( HHHTH ) , ( HHTHH ) , ( HHHTT ) , ( HHTHT ) , ( HHTTH ) , ( HHTTT ) ,
( HTHHH ) , ( THHHH ) }.

The intersection of A and B is the set of all outcomes in both A and B. Therefore, the
intersection of A and B is
A ∩ B = AB = { ( HHHHH ) , ( HHHHT ) , ( HHHTH ) , ( HHTHH ) }.

Example 1: Events (continued)


2. Solution: A ∪ C is the set of all outcomes in A, in C, or in both A and C. Therefore A ∪ C is
A ∪ C = { ( HHHHH ) , ( HHHHT ) , ( HHHTH ) , ( HHTHH ) , ( HHHTT ) , ( HHTHT ) , ( HHTTH ) , ( HHTTT ) ,
( HTTTT ) , ( THTTT ) , ( TTHTT ) , ( TTTHT ) , ( TTTTH ) }.

The intersection of A and C is the set of all outcomes in both A and C. There are no outcomes
in common, so the intersection is empty. We denote the empty set by ∅, and we write
A ∩ C = ∅.
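
These unions and intersections can also be verified with R's built-in set operations; a quick sketch, with the events typed in as outcome strings such as "HHHHT":

    A <- c("HHHHH", "HHHHT", "HHHTH", "HHTHH", "HHHTT", "HHTHT", "HHTTH", "HHTTT")
    B <- c("HHHHT", "HHHTH", "HHTHH", "HTHHH", "THHHH", "HHHHH")
    C <- c("HTTTT", "THTTT", "TTHTT", "TTTHT", "TTTTH")

    union(A, B)       # the 10 outcomes found above
    intersect(A, B)   # "HHHHH" "HHHHT" "HHHTH" "HHTHH"
    union(A, C)       # the 13 outcomes found above
    intersect(A, C)   # character(0), i.e. the empty set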

Example 2: Events


Consider the following events.
 D = the face of the first die is one
 E = the face of the first die is three
 F = the sum of the faces of the dice is exactly nine
 G = the sum of the faces of the dice is exactly seven
1. Find DF and DG .
2. Find EF and EG .
3. Find E ∪ G .
Example 2: Events (continued)
1. Solution: The outcomes in D are
D= { ( 1 ,1 ) , ( 1, 2 ) , ( 1, 3 ) , ( 1 , 4 ) , ( 1 ,5 ) , ( 1 ,6 ) } .
The outcomes in F are
F={ ( 3 , 6 ) , ( 4 ,5 ) , ( 5 , 4 ) , ( 6 , 3 ) } .
So DF =∅ . The outcomes in G are
G= { ( 1, 6 ) , ( 2 , 5 ) , ( 3 , 4 ) , ( 4 , 3 ) , ( 5 ,2 ) , ( 6 , 1 ) } .
So DG={( 1 , 6 ) }.

Example 2: Events (continued)


2. Solution: The outcomes in E are
E={ ( 3 , 1 ) , ( 3 , 2 ) , ( 3 , 3 ) , ( 3 , 4 ) , ( 3 , 5 ) , ( 3 , 6 ) } .
So EF={( 3 , 6 ) } and EG={ ( 3 , 4 ) }.

3. Solution: E ∪ G is
E ∪ G = { (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1), (3, 1), (3, 2), (3, 3), (3, 5), (3, 6) }.

Properties of Set Operations


For any three events A , B, and C defined on S,
1. Commutativity. A ∪ B=B ∪ A , and A ∩ B=B ∩ A .
2. Associativity. A ∪ ( B∪ C )= ( A ∪ B ) ∪ C , and A ∩ ( B ∩C ) =( A ∩B ) ∩C .
3. Distributive Laws. A ∩ ( B ∪ C )= ( A ∩ B ) ∪( A ∩ C), and A ∪ ( B∩ C )= ( A ∪ B ) ∩( A ∪ C ).
4. DeMorgan's Laws. (A ∪ B)^C = A^C ∩ B^C, and (A ∩ B)^C = A^C ∪ B^C.

To prove any of these, you must show the sets are equal. For example, to prove A ∪ B=B ∪ A , you
must show
 A ∪ B⊂ B ∪ A , and
 B∪ A ⊂ A ∪ B .

The technique is illustrated on pages 3-4 of the text.
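
A formal proof requires showing both containments, but the laws are also easy to sanity-check numerically. A small R check (a check, not a proof), reusing the events D (first die shows one) and G (the sum is seven) from Example 2 in the roles of A and B:

    # Two-dice sample space as strings "11", "12", ..., "66"
    S <- as.vector(outer(1:6, 1:6, paste0))
    A <- S[substr(S, 1, 1) == "1"]                                           # first die is 1
    B <- S[as.integer(substr(S, 1, 1)) + as.integer(substr(S, 2, 2)) == 7]   # sum is 7

    # DeMorgan's Laws, with complements taken relative to S via setdiff()
    setequal(setdiff(S, union(A, B)), intersect(setdiff(S, A), setdiff(S, B)))   # TRUE
    setequal(setdiff(S, intersect(A, B)), union(setdiff(S, A), setdiff(S, B)))   # TRUE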

Infinite Collection of Sets


The operations of union and intersection can be extended to an infinite collection of sets.
∪_{i=1}^∞ A_i = { ω ∈ S : ω ∈ A_i for some i },

∩_{i=1}^∞ A_i = { ω ∈ S : ω ∈ A_i for all i }.

Unions and intersections can be defined over uncountable collections of sets. Let Γ be a set of
elements to be used as indices (an index set). Then
∪_{a ∈ Γ} A_a = { ω ∈ S : ω ∈ A_a for some a },
∩_{a ∈ Γ} A_a = { ω ∈ S : ω ∈ A_a for all a }.

Disjoint (Mutually Exclusive) and Partition


Disjoint (Mutually Exclusive)
Two events A and B defined on S are said to be disjoint (or mutually exclusive) if A ∩ B = ∅. The events
A_1, A_2, … are said to be pairwise disjoint (or mutually exclusive) if A_i ∩ A_j = ∅ for all i ≠ j.

Exhaustive

If A_1, A_2, … are such that ∪_{i=1}^∞ A_i = S, then the collection of sets is said to be exhaustive.

Partition of S
If A1 , A 2 ,… are mutually exclusive and exhaustive, then the collection A1 , A 2 ,… forms a partition of S.

Example 4: Disjoint and Partition


Example 4. Suppose an electronics store carries four brands of Smart TV: Panasonic, Sony, Samsung,
and LG. A customer purchases one Smart TV. Define the following events. The sample space is the set
that has a listing of all Smart TVs the customer could purchase from the store.
 P = the customer purchases a Panasonic,
 S = the customer purchases a Sony,
 M = the customer purchases a Samsung, and
 L = the customer purchases an LG.

Since a Smart TV can't be two brands at the same time, PS = PM = PL = SM = SL = ML = ∅. Since the
store carries only these four brands, P ∪ S ∪ M ∪ L is the entire sample space. Therefore, the sets P, S, M, and L
form a partition of the sample space.

References
Venn, J. (1880). "On the Diagrammatic and Mechanical Representation of Propositions and Reasonings." Philosophical Magazine, Series 5, 10(59), pp. 1-18. DOI: 10.1080/14786448008626877.

Basics of Probability Theory


Theory of Statistics I
Outline
1. Introduction to the Basics of Probability
2. Axiomatic Foundations of Probability
3. Defining Probability Function: Kolmogorov’s Axioms
4. The Calculus of Probabilities

Basics of Probability
Frequentist Interpretation of Probability
When an experiment is performed, the realization is an outcome ω in the sample space S. The
probability of an outcome ω ∈ S is the limit of the relative frequency with which ω occurs as the number of
times the experiment is performed goes to infinity.

This interpretation is intuitively appealing. It is also supported by axiomatic foundations.
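
A quick simulation makes the frequentist interpretation concrete: track the running relative frequency of heads over many fair-coin flips. A sketch in R (the seed and the number of flips are arbitrary choices):

    set.seed(5352)                 # for reproducibility only
    n_flips <- 10000
    flips <- sample(c("H", "T"), n_flips, replace = TRUE)    # fair coin
    rel_freq <- cumsum(flips == "H") / seq_len(n_flips)      # running relative frequency of H
    rel_freq[c(10, 100, 1000, 10000)]                        # settles near 0.5 as flips accumulate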

Axiomatic Foundations: σ -Algebra (Borel Field)


For each event A in the sample space S, we want to associate with A a number between zero and one
that we will call the probability of A, denoted P(A). Notice that the argument A in P(A) is not a number
but a set. The domain of P therefore consists of subsets of S, not of the individual outcomes in S.
σ -Algebra or Borel Field
A collection of subsets of S is called a σ -algebra or a Borel field, denoted B, if it satisfies the following
properties.
1. ∅ ∈ B,
2. If A ∈ B, then A^C ∈ B (B is closed under complementation), and
3. If A_1, A_2, … ∈ B, then ∪_{i=1}^∞ A_i ∈ B (B is closed under countable unions).

Other Properties of B
 Since S^C = ∅, then S ∈ B. (Why?)
 B is closed under countable intersections. (Why?)

With any sample space, there can be many different σ -algebras.

σ -Algebra: Examples
 Example. Consider any sample space S. Then the collection of sets B={ ∅ , S } is a σ -algebra.
This specific σ -algebra is the trivial σ -algebra.
 Example 1. Consider flipping a fair coin 2 times. Then the sample space is
S= { ( HH ) , ( HT ) , (TH ) , ( TT ) } .

For a finite sample space such as this one, the natural choice beyond the trivial σ-algebra is the
power set, P_S, which is the set of all subsets of S; it is the largest σ-algebra on S. Since C(S) = 4, then

Borel Field: Examples (continued)


C(P_S) = 2^4 = 16, and
P_S = { ∅ , { (HH) } , { (HT) } , { (TH) } , { (TT) } ,
{ (HH), (HT) } , { (HH), (TH) } , { (HH), (TT) } , { (HT), (TH) } , { (HT), (TT) } , { (TH), (TT) } ,
{ (HH), (HT), (TH) } , { (HH), (HT), (TT) } , { (HH), (TH), (TT) } , { (HT), (TH), (TT) } , S }.
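
If you would rather not type out all 16 subsets, they can be generated in R; a small sketch that builds the power set from the subsets of each size:

    S <- c("HH", "HT", "TH", "TT")
    # The power set: the empty set plus all subsets of sizes 1 through 4
    power_set <- c(list(character(0)),
                   unlist(lapply(1:4, function(k) combn(S, k, simplify = FALSE)),
                          recursive = FALSE))
    length(power_set)   # 2^4 = 16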

Borel Field: Examples (continued)


Example. σ-Algebra II (in the text). Let S = (−∞, ∞). Then B is chosen to contain all sets of the form
[a, b], (a, b], [a, b), and (a, b).

Example. Finding a σ-algebra on the unit interval isn't trivial. One useful collection is the set of dyadic
intervals, defined by
I_{k,n} = ( m + k/2^n , m + (k+1)/2^n ],
where k, n = 0, 1, 2, … and k ≤ 2^n − 1. For the unit interval, m = 0. The σ-algebra for the unit interval is
the one generated by the collection of dyadic intervals

{ I_{k,n} : n = 1, 2, … ; k = 0, 1, …, 2^n − 1 }.

Probability Function: Kolmogorov’s Axioms


Probability Function: Kolmogorov’s Axioms
Given a sample space S and an associated σ -algebra B, a probability function is a function P with
domain B that satisfies
1. P(A) ≥ 0 for all A ∈ B,
2. P(S) = 1, and
3. If A_1, A_2, … ∈ B are pairwise disjoint, then

P( ∪_{i=1}^∞ A_i ) = ∑_{i=1}^∞ P(A_i).

This result is also sometimes called the “Axioms of Probability.”

Probability Measure Space


A function P is a probability function if and only if it satisfies the Axioms of Probability.

Probability Measure Space


If B is a σ-field on S and P is a probability measure satisfying Kolmogorov's Axioms, then the triple
(S, B, P) is called a probability measure space, or simply a probability space (Billingsley, 1995).

Example 1: Defining Probability Functions


Example 1. Consider the probability space (S, B, P) where S = { (HH), (HT), (TH), (TT) } and B = P_S. Assume
that the coin is fair, so that each outcome in S has an equal chance of occurring. Then
P({HH}) = P({HT}) = P({TH}) = P({TT}).

The four outcomes in S are disjoint (as singleton events), so that

P({HH}) + P({HT}) + P({TH}) + P({TT}) = 1.

Combining these two equations shows that each outcome in S has probability

P({HH}) = P({HT}) = P({TH}) = P({TT}) = 1/4.

This assignment of probabilities depends on knowing the coin is fair.

Defining Probability Functions


Unlike in the previous example, we would like a method for defining a legitimate probability function
that does not require us to check the Axioms of Probability directly.

Defining a Probability Function


Let S={ω 1 , … , ω n } be a finite sample space and B be any σ -algebra of subsets of S. Let p1 , … , pn be
nonnegative real numbers that sum to one. For any A ∈ B , define P( A) by
P(A) = ∑_{i : ω_i ∈ A} p_i.
The sum over ∅ is zero. Then P is a probability function on B. This remains true if S={ω 1 , ω 2 , … } is a
countable set.
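
In R, this construction amounts to storing the p_i in a vector and summing the entries whose outcomes land in A. A minimal sketch (the outcome labels and the equal weights below are illustrative, matching Example 1):

    omega <- c("HH", "HT", "TH", "TT")        # the outcomes omega_1, ..., omega_n
    p     <- c(0.25, 0.25, 0.25, 0.25)        # nonnegative weights summing to one

    prob <- function(A) sum(p[omega %in% A])  # P(A) = sum of p_i over outcomes in A

    prob(c("HH", "HT"))   # 0.5, the event "the first flip is H"
    prob(character(0))    # 0, the sum over the empty set
    prob(omega)           # 1, i.e. P(S)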

Example 2: Defining a Probability Function


When tossing two fair six-sided dice, S has 36 outcomes, so we take B = P_S, for which
C(P_S) = 2^36 = 68,719,476,736 (which is far too many subsets to list here)!

Recall for tossing two six-sided dice


 D = the face of the first die is one
 E = the face of the first die is three
 F = the sum of the faces of the dice is exactly nine
 G = the sum of the faces of the dice is exactly seven

Since the dice are fair, each outcome ω_i has p_i = 1/36 for i = 1, 2, …, 36. Notice this satisfies the Axioms
of Probability.

Find the probabilities that each of D , E , F , and G occur.

Example 2: Defining a Probability Function (continued)


 Solution:
D = { (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6) } = { ω_1, ω_2, ω_3, ω_4, ω_5, ω_6 }. Therefore
P(D) = ∑_{i=1}^{6} p_i = p_1 + p_2 + … + p_6 = 6/36.

 Solution:
E = { (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6) } = { ω_13, ω_14, ω_15, ω_16, ω_17, ω_18 }. Therefore
P(E) = ∑_{i=13}^{18} p_i = p_13 + p_14 + … + p_18 = 6/36.

Example 2: Defining a Probability Function (continued)


 Solution:
F = { (3, 6), (4, 5), (5, 4), (6, 3) } = { ω_18, ω_23, ω_28, ω_33 }. So
P(F) = ∑_{i ∈ {18, 23, 28, 33}} p_i = p_18 + p_23 + p_28 + p_33 = 4/36.

 Solution:
G = { (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1) } = { ω_6, ω_11, ω_16, ω_21, ω_26, ω_31 }. Define the index set
Γ = {6, 11, 16, 21, 26, 31}. Then
P(G) = ∑_{i ∈ Γ} p_i = p_6 + p_11 + p_16 + p_21 + p_26 + p_31 = 6/36.
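
The same four probabilities can be computed in R without tracking the outcome labels, by enumerating the 36 equally likely outcomes directly; a sketch consistent with the calculations above:

    dice <- expand.grid(first = 1:6, second = 1:6)    # the 36 outcomes, each with probability 1/36
    prob <- function(event) sum(event) / nrow(dice)   # 'event' is a logical vector of length 36

    prob(dice$first == 1)                  # P(D) = 6/36
    prob(dice$first == 3)                  # P(E) = 6/36
    prob(dice$first + dice$second == 9)    # P(F) = 4/36
    prob(dice$first + dice$second == 7)    # P(G) = 6/36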

The Calculus of Probabilities


We have all had a course (or three) in the branch of mathematics called "calculus." But the word
calculus has another meaning: "a particular method or system of calculating or reasoning" (Oxford English Dictionary).

The Axioms of Probability can be used to build many properties that are useful for calculating more
complicated probabilities.

Axiom of Finite Additivity


If A ∈ B and B∈ B are disjoint, then
P ( A ∪ B )=P ( A ) + P(B).


The Calculus of Probabilities


P(∅), P(A), and P(A^C)
For the probability space (S, B, P),
a. P(∅) = 0,
b. P(A) ≤ 1,
c. P(A^C) = 1 − P(A).

Proof of (c): The sets A and A^C partition S, and P(S) = 1. Therefore, by the Axiom of Finite Additivity,
1 = P(S) = P(A ∪ A^C) = P(A) + P(A^C) ⇔ P(A^C) = 1 − P(A).

Probabilities of Operations on Sets


Probabilities of Operations on Sets
For the probability space (S , B , P), with A and B any two sets in B,
a. P(A ∩ B^C) = P(A) − P(A ∩ B);
b. P(A ∪ B) = P(A) + P(B) − P(A ∩ B);
c. If A ⊂ B, then P(A) ≤ P(B).

See pp. 10-11 of the text for the proofs. For (a) and (b), the "trick" involves finding an appropriate
collection of disjoint sets whose union equals the operation on A and B.
Example 2: Probabilities of Set Operations
Find the probabilities of the following events.
 The sum of the faces of the dice is not nine.
Solution: This is the complement of F, that is, F^C. Therefore,
P(F^C) = 1 − P(F) = 1 − 4/36 = 32/36.

 The face of the first die is a one and the sum of the faces of the dice is exactly seven.
Solution: The described event is D and G, or DG = { (1, 6) }, which has probability 1/36.

Example 2: Probabilities of Set Operations


 The face of the first die is a one, and the sum is not seven.
Solution: This is D ∩ G^C. Therefore
P(D ∩ G^C) = P(D) − P(DG) = 6/36 − 1/36 = 5/36.

 The face of the first die is a one or the sum is seven.

Solution: This event is D ∪ G, which has probability
P(D ∪ G) = P(D) + P(G) − P(DG) = 6/36 + 6/36 − 1/36 = 11/36.
Bonferroni’s Inequality
Bonferroni’s Inequality
For the probability space (S , B , P), with A and B any two sets in B,
P ( A ∩B ) ≥ P ( A ) + P ( B )−1.
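
For example, applying the bound to D and G from Example 2 gives P(DG) ≥ 6/36 + 6/36 − 1 = −24/36, which is trivially true; the bound is informative only when P(A) + P(B) exceeds one.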

Law of Total Probability and Boole’s Inequality


Law of Total Probability and Boole’s Inequality
For the probability space (S , B , P) with A ∈ B

a. P(A) = ∑_{i=1}^∞ P(A ∩ C_i) for C_1, C_2, … ∈ B, where C_1, C_2, … form a partition of S;

b. P( ∪_{i=1}^∞ A_i ) ≤ ∑_{i=1}^∞ P(A_i) for any collection of sets A_1, A_2, … ∈ B.

Result (a) is called the Law of Total Probability, and result (b) is Boole’s inequality.
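
Both results can be checked on the dice example: partition S by the value of the first die for the Law of Total Probability, and compare P(D ∪ G) with P(D) + P(G) for Boole's inequality. A small R sketch (self-contained, repeating the helper from the earlier dice example):

    dice <- expand.grid(first = 1:6, second = 1:6)
    prob <- function(event) sum(event) / nrow(dice)

    # Law of Total Probability: the events {first die = i}, i = 1, ..., 6, partition S
    F_event <- dice$first + dice$second == 9
    sum(sapply(1:6, function(i) prob(F_event & dice$first == i)))   # 4/36 = P(F)

    # Boole's inequality applied to D and G
    D <- dice$first == 1
    G <- dice$first + dice$second == 7
    prob(D | G) <= prob(D) + prob(G)                                # TRUE: 11/36 <= 12/36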
