
Department of Mathematics

MA241TA: Probability Theory and Linear Programming


Unit - I: Random Variables

Topic Learning Objectives:


• To apply the knowledge of statistical analysis and the theory of probability in the study of
uncertainties.
• To define the degree of dependence between two random variables and to measure it.

Prerequisites:
If an experiment is repeated under essentially homogeneous and similar conditions one
generally comes across two types of situations:
(i) The result or what is usually known as the 'outcome' is unique or certain.
(ii) The result is not unique but may be one of the several possible outcomes.
The phenomena covered by (i) are known as 'deterministic' or 'predictable' phenomena. In a
deterministic phenomenon the result can be predicted with certainty.
For example:
(a) The velocity ′𝑣′ of a particle after time ′𝑡′ is given by 𝑣 = 𝑢 + 𝑎𝑡 where ‘𝑢’ is the initial
velocity and ‘𝑎’ is the acceleration. This equation uniquely determines ‘𝑣’ if the right-hand
quantities are known.
(b) Ohm’s Law, viz., 𝐶 = 𝐸/𝑅 where 𝐶 is the flow of current, 𝐸 the potential difference between
the two ends of the conductor and 𝑅 the resistance, uniquely determines the value 𝐶 as soon
as 𝐸 and 𝑅 are given.
A deterministic model is one which stipulates that the conditions under which an experiment
is performed determine the outcome of the experiment. For a number of situations, the
deterministic model suffices. However, there are phenomena (as covered by (ii) above) which
do not lend themselves to a deterministic approach; they are known as 'unpredictable' or
'probabilistic' phenomena. For example:
(a) In tossing of a coin one is not sure if a head or tail will be obtained.
(b) If a light tube has lasted for 𝑡 hours, nothing can be said about its further life. It may fail to
function at any moment.
In such cases chance or probability comes into picture which is taken to be a quantitative
measure of uncertainty.
Some basic definitions:
Trial and Event: Consider an experiment which, though repeated under essentially identical
conditions, does not give unique results but may result in any one of the several possible
outcomes. The experiment is known as a trial and the outcomes are known as events or cases.
For example:
(i) Throwing of a die is a trial and getting 1 (𝑜𝑟 2 𝑜𝑟 3, . . . 𝑜𝑟 6) is an event.
(ii) Tossing of a coin is a trial and getting head (𝐻) or tail (𝑇) is an event.
(iii) Drawing two cards from a pack of well-shuffled cards is a trial and getting a king and a
queen are events.
Exhaustive Events: The total number of possible outcomes in any trial is known as
exhaustive events or exhaustive cases. For example:
(i) In tossing of a coin there are two exhaustive cases, viz., head and tail.
(ii) In throwing of a die, there are six exhaustive cases, since any one of the 6 faces 1, 2, . . . , 6
may come uppermost.
(iii) In drawing two cards from a pack of cards the exhaustive number of cases is 52𝐶2, since
2 cards can be drawn out of 52 cards in 52𝐶2 ways.
Favourable Events or Cases: The number of cases favourable to an event in a trial is the
number of outcomes which entail the happening of the event. For example:
(i) In drawing a card from a pack of cards the number of cases favourable to drawing of an
ace is 4, for drawing a spade 13 and for drawing a red card is 26.
(ii) In throwing of two dice, the number of cases favourable to getting the sum 5 is : (1,4)
(4,1) (2,3) (3,2), i.e., 4.
Mutually exclusive events: Events are said to be mutually exclusive or incompatible if the
happening of any one of them precludes the happening of all the others, i.e., if no two or
more of them can happen simultaneously in the same trial. For example:
(i) In throwing a die all the 6 faces numbered 1 to 6 are mutually exclusive since if any one
of these faces comes, the possibility of others, in the same trial, is ruled out.
(ii) Similarly in tossing a coin the events head and tail are mutually exclusive.
Equally likely events: Outcomes of a trial are said to be equally likely if, taking into
consideration all the relevant evidence, there is no reason to expect one in preference to the
others. For example:
(i) In tossing an unbiased or uniform coin, head and tail are equally likely events.
(ii) In throwing an unbiased die, all the six faces are equally likely to come.
Independent events: Several events are said to be independent if the happening (or non-
happening) of an event is not affected by the supplementary knowledge concerning the
occurrence of any number of the remaining events. For example:
(i) In tossing an unbiased coin the event of getting a head in the first toss is independent of
getting a head in the second, third and subsequent tosses.



(ii) If one draws a card from a pack of well-shuffled cards and replaces it before drawing the
second card, the result of the second draw is independent of the first draw. However, if the
first card drawn is not replaced, then the second draw is dependent on the first draw.
There are three systematic approaches to the study of probability as mentioned below.
Mathematical or Classical or ‘a priori’ Probability: If a trial results in 𝑛 exhaustive,
mutually exclusive and equally likely cases and 𝑚 of them are favourable to the happening of
an event 𝐸 then the probability ′𝑝′ of happening of 𝐸 is given by
𝑝 = 𝑃(𝐸) = Favourable number of cases / Exhaustive number of cases = 𝑚/𝑛

Since the number of cases favourable to the 'non-happening' of the event 𝐸 are (𝑛 − 𝑚), the
probability '𝑞' that 𝐸 will not happen is given by
𝑞 = (𝑛 − 𝑚)/𝑛 = 1 − 𝑚/𝑛 = 1 − 𝑝, which gives 𝑝 + 𝑞 = 1.
Obviously 𝑝 as well as 𝑞 is non-negative and cannot exceed unity, i.e.,
0 ≤ 𝑝 ≤ 1, 0 ≤ 𝑞 ≤ 1.
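As an illustrative aside (not part of the original notes), the classical rule 𝑝 = 𝑚/𝑛 can be checked by brute-force enumeration; the Python sketch below counts the cases favourable to a sum of 5 with two dice.

# Classical probability by enumeration: P(sum of two dice = 5).
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))    # 36 exhaustive, equally likely cases
favourable = [o for o in outcomes if sum(o) == 5]  # (1,4), (4,1), (2,3), (3,2)
p = len(favourable) / len(outcomes)                # p = m/n = 4/36
q = 1 - p                                          # probability of non-happening
print(len(favourable), p, q)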
Statistical or Empirical Probability: If a trial is repeated a number of times under
essentially homogeneous and identical conditions, then the limiting value of the ratio of the
number of times the event happens to the number of trials, as the number of trials becomes
indefinitely large, is called the probability of happening of the event. (It is assumed that the
limit is finite and unique).
Symbolically, if in 𝑛 trials an event 𝐸 happens 𝑚 times, then the probability ′𝑝′ of the
happening of 𝐸 is given by 𝑝 = 𝑃(𝐸) = lim_{𝑛→∞} 𝑚/𝑛.

Axiomatic Probability: Let 𝐴 be any event in the sample space 𝑆, then 𝑃(𝐴) is called the
probability of event 𝐴, if the following axioms are satisfied.
Axiom 1: 𝑃(𝐴) ≥ 0
Axiom 2: 𝑃(𝑆) = 1, 𝑆 being the sure event
Axiom 3: For two mutually exclusive events 𝐴 & 𝐵, 𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵)
Some important results:
1. The probability of an event always lies between 0 and 1, i.e., 0 ≤ 𝑃(𝐴) ≤ 1.
2. If 𝐴 and 𝐴′ are complementary events to each other defined on a random experiment then
𝑃(𝐴) + 𝑃(𝐴′ ) = 1.
3. Addition Theorem: If 𝐴 and 𝐵 are any two events with respective probabilities 𝑃(𝐴) and
𝑃(𝐵), then the probability of occurrence of at least one of the events is given by
𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴 ∩ 𝐵).
4. The probability of the null event is zero, i.e., 𝑃(∅) = 0.



5. For any two events 𝐴 and 𝐵 of a sample space 𝑆
(i) 𝑃(𝐴 − 𝐵) = 𝑃(𝐴) − 𝑃(𝐴 ∩ 𝐵)
(ii) 𝑃(𝐵 − 𝐴) = 𝑃(𝐵) − 𝑃(𝐴 ∩ 𝐵)
(iii) 𝑃(𝐴′ ∩ 𝐵) = 𝑃(𝐵) − 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴 ∪ 𝐵) − 𝑃(𝐴)
(iv) 𝑃[(𝐴 − 𝐵) ∪ (𝐵 − 𝐴)] = 𝑃(𝐴) + 𝑃(𝐵) − 2𝑃(𝐴 ∩ 𝐵)
6. Addition Theorem for three events: If 𝐴, 𝐵 and 𝐶 are any three events with respective
probabilities 𝑃(𝐴), 𝑃(𝐵) and 𝑃(𝐶), then the probability of occurrence of at least one of the
events is given by
𝑃(𝐴 ∪ 𝐵 ∪ 𝐶)
= 𝑃(𝐴) + 𝑃(𝐵) + 𝑃(𝐶) − 𝑃(𝐴 ∩ 𝐵) − 𝑃(𝐵 ∩ 𝐶) − 𝑃(𝐴 ∩ 𝐶) + 𝑃(𝐴 ∩ 𝐵 ∩ 𝐶).

Random variable
A random variable is a real number 𝑋 connected with the outcome of a random experiment
𝐸. For example, if 𝐸 consists of three tosses of a coin, one can consider the random variable
which is the number of heads (0, 1, 2 or 3).

Outcome:     𝐻𝐻𝐻  𝐻𝐻𝑇  𝐻𝑇𝐻  𝑇𝐻𝐻  𝑇𝑇𝐻  𝑇𝐻𝑇  𝐻𝑇𝑇  𝑇𝑇𝑇
Value of 𝑋:   3    2    2    2    1    1    1    0

Let 𝑆 denote the sample space of a random experiment. A random variable is a rule which
assigns a numerical value to each and every outcome of the experiment. Thus, a random
variable is a function 𝑋(𝜔) with domain 𝑆 and range (−∞, ∞) such that for every real
number 𝑎, the event [𝜔: 𝑋(𝜔) ≤ 𝑎] ∈ 𝐵, the field of subsets of 𝑆. It is denoted as 𝑓: 𝑆 → 𝑅.
Note that all the outcomes of the experiment are associated with a unique number. Therefore,
𝑓 is an example of a random variable. Usually, a random variable is denoted by letters such
as 𝑋, 𝑌, 𝑍, etc. For the coin-tossing experiment above, the image set of the random variable
may be written as 𝑓(𝑆) = {0, 1, 2, 3}.
There are two types of random variables. Namely;
1. Discrete Random Variable (DRV)
2. Continuous Random Variable (CRV).
Discrete Random Variable: A discrete random variable is one which takes only a countable
number of distinct values such as 0, 1, 2, 3, ⋯. Discrete random variables are usually (but not
necessarily) counts. If a random variable takes at most a countable number of values, it is
called a discrete random variable. In other words, a real valued function defined on a
discrete sample space is called a discrete random variable.
Examples of Discrete Random Variable:



(i) In the experiment of throwing a die, define 𝑋 as the number that is obtained. Then 𝑋 takes
any of the values 1 to 6. Thus, 𝑋(𝑆) = {1, 2, 3, ⋯ ,6} which is a finite set and hence 𝑋 is a
DRV.
(ii) If 𝑋 is the random variable denoting the number of marks scored by a student in a subject of
an examination, then 𝑋(𝑆) = {0, 1, 2, 3, ⋯ , 100}, and 𝑋 is a DRV.
(iii) The number of children in a family is a DRV.
(iv) The number of defective light bulbs in a box of ten is a DRV.
Probability Mass Function: Suppose 𝑋 is a one-dimensional discrete random variable
taking at most a countably infinite number of values 𝑥1 , 𝑥2 , ⋯. With each possible outcome
𝑥𝑖 , one can associate a number 𝑝𝑖 = 𝑃(𝑋 = 𝑥𝑖 ) = 𝑝(𝑥𝑖 ), called the probability of 𝑥𝑖 .
The numbers 𝑝(𝑥𝑖); 𝑖 = 1, 2, ⋯ must satisfy the following conditions:
(i) 𝑝(𝑥𝑖 ) ≥ 0 ∀ 𝑖 ,
(ii) ∑_{𝑖=1}^{∞} 𝑝(𝑥𝑖) = 1.

This function 𝑝 is called the probability mass function of the random variable 𝑋 and the
set {𝑥𝑖 , 𝑝(𝑥𝑖 )} is called the probability distribution of the random variable 𝑋.
Remarks:
1. The set of values which 𝑋 takes is called the spectrum of the random variable.
2. For discrete random variable, knowledge of the probability mass function enables us to
compute probabilities of arbitrary events. In fact, if 𝐸 is a set of real numbers, 𝑃(𝑋 ∈ 𝐸) =
∑_{𝑥 ∈ 𝐸∩𝑆} 𝑝(𝑥), where 𝑆 is the sample space.
Cumulative Distribution Function: For a discrete random variable 𝑋 taking the values
{𝑥1, 𝑥2, 𝑥3, …}, the cumulative distribution function is given by 𝐹(𝑥) = 𝑃(𝑋 ≤ 𝑥) = ∑_{𝑥𝑖 ≤ 𝑥} 𝑝(𝑥𝑖).
Mean/Expected Value, Variance, and Standard Deviation of DRV:
The mean or expected value of a DRV 𝑋 is defined as
𝐸(𝑋) = 𝜇 = ∑ 𝑥𝑖 𝑃(𝑋 = 𝑥𝑖 ) = ∑ 𝑝(𝑥𝑖 )𝑥𝑖 .
The variance of a DRV 𝑋 is defined as
𝑉𝑎𝑟(𝑋) = 𝜎² = ∑ 𝑃(𝑋 = 𝑥𝑖)(𝑥𝑖 − 𝜇)² = ∑ 𝑝𝑖 (𝑥𝑖 − 𝜇)² = ∑ 𝑝𝑖 𝑥𝑖² − 𝜇².
The standard deviation of a DRV 𝑋 is defined as
𝑆𝐷(𝑋) = 𝜎 = √𝜎² = √𝑉𝑎𝑟(𝑋).
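As a hedged illustration (not part of the original notes), these formulas translate directly into Python for a hypothetical p.m.f. stored as (value, probability) pairs:

import math

# Hypothetical discrete distribution {x_i, p(x_i)}; the probabilities must sum to 1.
dist = [(0, 0.1), (1, 0.3), (2, 0.4), (3, 0.2)]

assert abs(sum(p for _, p in dist) - 1.0) < 1e-12   # condition (ii) on a p.m.f.
mean = sum(x * p for x, p in dist)                  # E(X) = sum of x_i p(x_i)
var = sum(p * (x - mean) ** 2 for x, p in dist)     # Var(X) = sum of p_i (x_i - mu)^2
sd = math.sqrt(var)                                 # SD(X) = sqrt(Var(X))

def cdf(x):
    # F(x) = P(X <= x), the cumulative distribution function
    return sum(p for xi, p in dist if xi <= x)

print(mean, var, sd, cdf(1))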
Continuous Random Variable: A continuous random variable is not defined at specific
values. Instead, it is defined over an interval of values. Thus, a random variable 𝑋 is said to
be continuous if it can take all possible values between certain limits. In other words, a
random variable is said to be continuous when its different values cannot be put in 1-1
correspondence with a set of positive integers. Here, the probability of observing any single



value is equal to zero, since the number of values which may be assumed by the random
variable is infinite.
A continuous random variable is a random variable that (at least conceptually) can be
measured to any desired degree of accuracy.
Examples of Continuous Random Variable:
(i) Rainfall in a particular area can be treated as CRV.
(ii) Age, height and weight related problems can be included under CRV.
(iii) The amount of sugar in an orange is a CRV.
(iv) The time required to run a mile is a CRV.
Important Remark: In the case of a DRV, the probability at a point, i.e., 𝑃(𝑋 = 𝑐), need not be
zero for a fixed 𝑐. However, in the case of a CRV the probability at a point is always zero,
i.e., 𝑃(𝑋 = 𝑐) = 0 for all possible values of 𝑐.

Probability Density Function: The probability density function (p.d.f.) of a random variable
𝑋, usually denoted by 𝑓𝑋(𝑥) or simply by 𝑓(𝑥), has the following properties:
i) 𝑓(𝑥) ≥ 0, −∞ < 𝑥 < ∞
ii) ∫_{−∞}^{∞} 𝑓(𝑥)𝑑𝑥 = 1
iii) The probability 𝑃(𝐸) given by 𝑃(𝐸) = ∫_𝐸 𝑓(𝑥)𝑑𝑥 is well defined for any event 𝐸.
If 𝑓(𝑥) is the p.d.f. of 𝑋, then the probability that 𝑋 belongs to 𝐴, where 𝐴 is some interval
(𝑎, 𝑏), is given by the integral of 𝑓(𝑥) over that interval,
i.e., 𝑃(𝑋 ∈ 𝐴) = ∫_𝑎^𝑏 𝑓(𝑥)𝑑𝑥.
Cumulative Distribution Function: The cumulative distribution function of a continuous
random variable is defined as 𝐹(𝑥) = ∫_{−∞}^{𝑥} 𝑓(𝑡)𝑑𝑡 for −∞ < 𝑥 < ∞.
Mean/Expectation, Variance and Standard deviation of CRV:
The mean or expected value of a CRV 𝑋 is defined as 𝜇 = 𝐸(𝑋) = ∫_{−∞}^{∞} 𝑥 𝑓(𝑥)𝑑𝑥.
The variance of a CRV 𝑋 is defined as 𝑉𝑎𝑟(𝑋) = 𝜎² = ∫_{−∞}^{∞} 𝑥² 𝑓(𝑥)𝑑𝑥 − 𝜇².
The standard deviation of a CRV 𝑋 is given by 𝜎 = √𝑉𝑎𝑟(𝑋).
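These integrals can be checked numerically. The sketch below (an illustration only, assuming SciPy is available) does so for the density 𝑓(𝑥) = 3𝑥² on (0, 1), which reappears in Example 10 below with mean 3/4 and variance 3/80:

from scipy.integrate import quad

f = lambda x: 3 * x**2                       # p.d.f. on (0, 1), zero elsewhere

total, _ = quad(f, 0, 1)                     # must equal 1 for a valid p.d.f.
mean, _ = quad(lambda x: x * f(x), 0, 1)     # mu = integral of x f(x)
ex2, _ = quad(lambda x: x**2 * f(x), 0, 1)   # E(X^2)
var = ex2 - mean**2                          # sigma^2 = E(X^2) - mu^2
print(total, mean, var)                      # 1.0, 0.75, 0.0375 (= 3/80)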


Examples:
1. The probability distribution of a discrete random variable 𝑋 is given below:
𝑋 0 1 2 3 4 5 6
𝑃(𝑋 = 𝑥) = 𝑓(𝑥) 𝑘 3𝑘 5𝑘 7𝑘 9𝑘 11𝑘 13𝑘

Find (i) 𝑘, (ii) 𝐹(4), (iii) 𝑃(𝑋 ≥ 5), (iv) 𝑃(2 ≤ 𝑋 < 5), (v) 𝐸(𝑋) and (vi) 𝑉𝑎𝑟(𝑋).



Solution:
(i) To find the value of 𝑘, consider the sum of all the probabilities, which equals 49𝑘.
Equating this to 1 and solving for 𝑘, we get 𝑘 = 1/49. Therefore, the distribution of 𝑋 is
written as
𝑋:         0     1     2     3     4     5      6
𝑃(𝑋 = 𝑥):  1/49  3/49  5/49  7/49  9/49  11/49  13/49
ii) 𝐹(4) = 𝑃[𝑋 ≤ 4] = 𝑃[𝑋 = 0] + 𝑃[𝑋 = 1] + 𝑃[𝑋 = 2] + 𝑃[𝑋 = 3] + 𝑃[𝑋 = 4] = 25/49.
iii) 𝑃[𝑋 ≥ 5] = 𝑃[𝑋 = 5] + 𝑃[𝑋 = 6] = 24/49.
iv) 𝑃[2 ≤ 𝑋 < 5] = 𝑃[𝑋 = 2] + 𝑃[𝑋 = 3] + 𝑃[𝑋 = 4] = 21/49.

v) Next, to find 𝐸(𝑋), consider
𝐸(𝑋) = ∑𝑖 𝑥𝑖 𝑓(𝑥𝑖) = 203/49.
vi) To obtain the variance, it is necessary to compute
𝐸(𝑋²) = ∑𝑖 𝑥𝑖² 𝑓(𝑥𝑖) = 973/49.
Thus, the variance of 𝑋 is obtained by using the relation
𝑉𝑎𝑟(𝑋) = 𝐸(𝑋²) − [𝐸(𝑋)]² = 973/49 − (203/49)² ≈ 2.69.

2. A random variable, 𝑋, has the following distribution table.


𝑋 -2 -1 0 1 2 3
𝑓(𝑥𝑖 ) 0.1 𝑘 0.2 2𝑘 0.3 𝑘
Find (i) 𝑘, (ii) 𝐹(2), (iii) 𝑃(−2 < 𝑋 < 2), (iv) 𝑃(−1 < 𝑋 ≤ 2), (v) 𝐸(𝑋) and (vi) 𝑉𝑎𝑟(𝑋).
Solution:
(i) Consider the result, namely, sum of all the probabilities equals 1,
0.1 + 𝑘 + 0.2 + 2𝑘 + 0.3 + 𝑘 = 1 ⇒ 𝑘 = 0.1.
In view of above, the distribution table of 𝑋 is written as
𝑋 -2 -1 0 1 2 3
𝑓(𝑥𝑖 ) 0.1 0.1 0.2 0.2 0.3 0.1
(ii) Note that
𝐹(2) = 𝑃[𝑋 ≤ 2] = 𝑃[𝑋 = −2] + 𝑃[𝑋 = −1] + 𝑃[𝑋 = 0] + 𝑃[𝑋 = 1] + 𝑃[𝑋 = 2] = 0.9.
The same can also be obtained using the result
𝐹(2) = 𝑃[𝑋 ≤ 2] = 1 − 𝑃[𝑋 > 2] = 1 − 𝑃[𝑋 = 3] = 1 − 0.1 = 0.9.



(iii) Next, 𝑃(−2 < 𝑋 < 2) = 𝑃[𝑋 = −1] + 𝑃[𝑋 = 0] + 𝑃[𝑋 = 1] = 0.5.
(iv) Clearly, 𝑃(−1 < 𝑋 ≤ 2) = 0.7 .
(v) Now, consider 𝐸[𝑋] = ∑𝑖 𝑥𝑖 ∗ 𝑓(𝑥𝑖 ) = 0.8
(vi) Then 𝐸(𝑋 2 ) = ∑𝑖 𝑥𝑖2 ∗ 𝑓(𝑥𝑖 ) = 2.8.
𝑉𝑎𝑟(𝑋) = 𝐸[𝑋 2 ] − {𝐸[𝑋]}2 = 2.8 − 0.64 = 2.16 .
3. A shipment of 20 similar laptop computers in a retail outlet contains 3 that are defective. If
a school makes a random purchase of 2 of these computers, find the probability distribution
for the number of defectives.
Solution: Let 𝑋 be a random variable whose values 𝑥 are the possible numbers of defective
computers purchased by the school. Then 𝑥 can take only the values 0, 1, and 2. Now
𝑓(0) = 𝑃(𝑋 = 0) = (3𝐶0 × 17𝐶2)/20𝐶2 = 68/95,
𝑓(1) = 𝑃(𝑋 = 1) = (3𝐶1 × 17𝐶1)/20𝐶2 = 51/190,
𝑓(2) = 𝑃(𝑋 = 2) = (3𝐶2 × 17𝐶0)/20𝐶2 = 3/190.

Thus, the probability distribution of 𝑋 is


𝑥 0 1 2
𝑓(𝑥) 68/95 51/190 3/190
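These three values follow the hypergeometric pattern 𝑓(𝑥) = (3𝐶𝑥 × 17𝐶(2 − 𝑥))/20𝐶2, which can be checked (an illustrative sketch only) with Python's math.comb:

from math import comb

# f(x) = C(3, x) * C(17, 2 - x) / C(20, 2) for x defective laptops in a sample of 2
f = {x: comb(3, x) * comb(17, 2 - x) / comb(20, 2) for x in range(3)}
print(f)                                  # 68/95, 51/190, 3/190 as decimals
assert abs(sum(f.values()) - 1) < 1e-12   # the probabilities sum to 1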

4. If a car agency sells 50% of its inventory of a certain foreign car equipped with side
airbags, find a formula for the probability distribution of the number of cars with side airbags
among the next 4 cars sold by the agency.
Solution: Since the probability of selling an automobile with side airbags is 0.5, the 2⁴ = 16
events in the sample space are equally likely to occur. Therefore, the denominator for all
probabilities, and also for our function, is 16. To obtain the number of ways of selling 3 cars
with side airbags, it is required to consider the number of ways of partitioning 4 outcomes
into two cells, with 3 cars with side airbags assigned to one cell and the model without side
airbags assigned to the other. This can be done in 4𝐶3 = 4 ways. In general, the event of
selling 𝑥 models with side airbags and 4 − 𝑥 models without side airbags can occur in 4𝐶𝑥
ways, where 𝑥 can be 0, 1, 2, 3, or 4. Thus, the probability distribution 𝑓(𝑥) = 𝑃(𝑋 = 𝑥) is
𝑓(𝑥) = (1/16) 4𝐶𝑥 for 𝑥 = 0, 1, 2, 3, 4.



5. Determine the probability mass function for the following cumulative distribution function:
𝐹𝑋(𝑥) = 0 for 𝑥 < −2;  0.2 for −2 ≤ 𝑥 < 0;  0.7 for 0 ≤ 𝑥 < 2;  1 for 2 ≤ 𝑥.
Solution: Clearly, 𝑋 takes the values {−2, 0, 2}, with
𝐹𝑋(−2) = 0.2, 𝐹𝑋(0) = 0.7, 𝐹𝑋(2) = 1.
The probability mass function (p.m.f.) at each point is the jump in the cumulative distribution
function at that point:
𝑃𝑋(−2) = 0.2 − 0 = 0.2
𝑃𝑋(0) = 0.7 − 0.2 = 0.5
𝑃𝑋(2) = 1 − 0.7 = 0.3.
6. A coin is tossed repeatedly until a head occurs for the first time. Let 𝑋 denote the number of
tosses required. Find (i) the p.m.f., (ii) 𝑃(𝑋), (iii) 𝑃(𝑋 = even), (iv) 𝑃(𝑋 divisible by 3).
Solution: Sample space 𝑆 = {𝐻, 𝑇𝐻, 𝑇𝑇𝐻, 𝑇𝑇𝑇𝐻, 𝑇𝑇𝑇𝑇𝐻, ⋯ }
Random variable 𝑋 = {1, 2, 3, 4, 5, ⋯ }
𝑃𝑋(1) = 1/2, 𝑃𝑋(2) = 1/2², 𝑃𝑋(3) = 1/2³.
(i) In general, the probability that the first head occurs on the 𝑛th toss is
𝑃𝑋(𝑛) = 1/2ⁿ, 𝑛 = 1, 2, 3, 4, ⋯, which is the p.m.f.
(ii) 𝑃(𝑋) = ∑𝑥 𝑃𝑋(𝑥) = 1/2 + 1/2² + 1/2³ + 1/2⁴ + ⋯ = (1/2)/(1 − 1/2) = 1
(iii) 𝑃(𝑋 = even) = ∑_{𝑥 even} 𝑃𝑋(𝑥) = 1/2² + 1/2⁴ + 1/2⁶ + ⋯ = (1/2²)/(1 − 1/2²) = 1/3
(iv) 𝑃(𝑋 = 𝑥: 3|𝑥) = ∑_{3|𝑥} 𝑃𝑋(𝑥) = 1/2³ + 1/2⁶ + 1/2⁹ + 1/2¹² + ⋯ = (1/2³)/(1 − 1/2³) = 1/7
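The three geometric-series sums can be confirmed numerically; a small illustrative sketch (with the series truncated at a large 𝑁):

# Geometric p.m.f. P(X = n) = 1/2**n, truncated at N terms for a numerical check.
N = 200
pmf = {n: 0.5**n for n in range(1, N + 1)}

total = sum(pmf.values())                               # -> 1
p_even = sum(p for n, p in pmf.items() if n % 2 == 0)   # -> 1/3
p_div3 = sum(p for n, p in pmf.items() if n % 3 == 0)   # -> 1/7
print(total, p_even, p_div3)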

7. The diameter of an electric cable, say 𝑋, is assumed to be a continuous random variable
with p.d.f. 𝑓(𝑥) = 6𝑥(1 − 𝑥) for 0 ≤ 𝑥 ≤ 1, and 𝑓(𝑥) = 0 otherwise.
(i) Verify that 𝑓(𝑥) is a valid p.d.f.
(ii) Find 𝑃(2/3 < 𝑋 < 1).
(iii) Determine the number 𝑏 such that 𝑃(𝑋 < 𝑏) = 𝑃(𝑋 > 𝑏).



Solution:
(i) 𝑓(𝑥) ≥ 0 in the given interval, and
∫_{−∞}^{∞} 𝑓(𝑥)𝑑𝑥 = ∫_0^1 6𝑥(1 − 𝑥)𝑑𝑥 = [3𝑥² − 2𝑥³]_0^1 = 1.
(ii) 𝑃(2/3 < 𝑋 < 1) = ∫_{2/3}^{1} (6𝑥 − 6𝑥²)𝑑𝑥 = 7/27.
(iii) 𝑃(𝑋 < 𝑏) = 𝑃(𝑋 > 𝑏) ⇒ ∫_0^𝑏 𝑓(𝑥)𝑑𝑥 = ∫_𝑏^1 𝑓(𝑥)𝑑𝑥 ⇒ 6∫_0^𝑏 𝑥(1 − 𝑥)𝑑𝑥 = 6∫_𝑏^1 𝑥(1 − 𝑥)𝑑𝑥
⇒ 3𝑏² − 2𝑏³ = 1 − 3𝑏² + 2𝑏³
⇒ 4𝑏³ − 6𝑏² + 1 = 0
⇒ (2𝑏 − 1)(2𝑏² − 2𝑏 − 1) = 0
From this, 𝑏 = 1/2 is the only root lying between 0 and 1 and satisfying the given
condition.
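The cubic 4𝑏³ − 6𝑏² + 1 = 0 can also be solved numerically; an illustrative check, assuming NumPy is available:

import numpy as np

roots = np.roots([4, -6, 0, 1])               # coefficients of 4b^3 - 6b^2 + 0b + 1
real = roots[np.isclose(roots.imag, 0)].real  # keep the real roots
b = real[(real > 0) & (real < 1)]             # only a root in (0, 1) can split the density
print(roots, b)                               # b = [0.5]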
8. Suppose that the error in the reaction temperature, in °C, for a controlled laboratory
experiment is a continuous random variable 𝑋 having the probability density function
𝑓(𝑥) = 𝑥²/3 for −1 < 𝑥 < 2, and 𝑓(𝑥) = 0 elsewhere.
(i) Verify that 𝑓(𝑥) is a probability density function.
(ii) Find 𝑃(0 < 𝑋 ≤ 1).
Solution:
(i) ∫_{−∞}^{∞} 𝑓(𝑥)𝑑𝑥 = ∫_{−1}^{2} (𝑥²/3)𝑑𝑥 = 1. Hence the given function is a p.d.f.
(ii) 𝑃(0 < 𝑋 ≤ 1) = ∫_0^1 (𝑥²/3)𝑑𝑥 = 1/9.

9. The length of time (in minutes) that a certain lady speaks on the telephone is found to be a
random variable with probability density function 𝑓(𝑥) = 𝐴𝑒^{−𝑥/5} for 𝑥 ≥ 0, and 𝑓(𝑥) = 0
otherwise.
(i) Find 𝐴
(ii) Find the probability that she will speak on the phone
(a) more than 10 min (b) less than 5 min (c) between 5 & 10 min.
Solution:
(i) Given that 𝑓(𝑥) is a p.d.f., ∫_{−∞}^{∞} 𝑓(𝑥)𝑑𝑥 = 1
⇒ ∫_0^∞ 𝐴𝑒^{−𝑥/5} 𝑑𝑥 = 1 ⇒ 5𝐴 = 1 ⇒ 𝐴 = 1/5.
(ii) (a) 𝑃(𝑋 > 10) = ∫_{10}^{∞} (1/5)𝑒^{−𝑥/5} 𝑑𝑥 = 𝑒^{−2} ≈ 0.1353
(b) 𝑃(𝑋 < 5) = ∫_0^5 (1/5)𝑒^{−𝑥/5} 𝑑𝑥 = 1 − 𝑒^{−1} ≈ 0.6321
(c) 𝑃(5 < 𝑋 < 10) = ∫_5^{10} (1/5)𝑒^{−𝑥/5} 𝑑𝑥 = 𝑒^{−1} − 𝑒^{−2} ≈ 0.2325

10. Suppose 𝑋 is a continuous random variable with the following probability density
function
𝑓(𝑥) = 3𝑥² for 0 < 𝑥 < 1. Find the mean and variance of 𝑋.

Solution: Mean = 𝜇 = ∫_{−∞}^{∞} 𝑥𝑓(𝑥)𝑑𝑥 = ∫_0^1 𝑥 · 3𝑥² 𝑑𝑥 = ∫_0^1 3𝑥³ 𝑑𝑥 = 3/4.
Variance = 𝜎² = ∫_{−∞}^{∞} 𝑥²𝑓(𝑥)𝑑𝑥 − 𝜇² = ∫_0^1 3𝑥⁴ 𝑑𝑥 − (3/4)² = 3/5 − 9/16 = 3/80.

11. In a certain city, the daily consumption of electric power (in millions of kWh) is a
continuous random variable having the probability density function
𝑓𝑋(𝑥) = (1/9) 𝑥 𝑒^{−𝑥/3} for 𝑥 ≥ 0, and 𝑓𝑋(𝑥) = 0 for 𝑥 < 0.
If the city's power plant has a daily capacity of 12 million kWh, what is the probability
that this power supply will be insufficient on any given day?
Solution: The power supply will be insufficient on any given day if the power consumption
exceeds 12 million kWh,
i.e., 𝑃(𝑋 > 12) = ∫_{12}^{∞} (1/9) 𝑥 𝑒^{−𝑥/3} 𝑑𝑥 = (1/9)[−3𝑥𝑒^{−𝑥/3} − 9𝑒^{−𝑥/3}]_{12}^{∞} = (45/9)𝑒^{−4} = 5𝑒^{−4}.
12. The density function of 𝑋 is given by 𝑓𝑋(𝑥) = 1 for 0 ≤ 𝑥 ≤ 1, and 0 otherwise.
Let 𝑌 = 𝑒^𝑋. (i) Find the p.d.f. of 𝑌, (ii) 𝐸(𝑌).
Solution: The cumulative distribution function of 𝑌 is
𝐹𝑌(𝑦) = 𝑃(𝑌 ≤ 𝑦) = 𝑃(𝑒^𝑋 ≤ 𝑦) = 𝑃(𝑋 ≤ log 𝑦) = ∫_{−∞}^{log 𝑦} 𝑓𝑋(𝑥) 𝑑𝑥 = ∫_0^{log 𝑦} 1 𝑑𝑥 = log 𝑦,
where 0 ≤ 𝑥 ≤ 1 corresponds to 1 ≤ 𝑦 ≤ 𝑒, i.e., 0 ≤ log 𝑦 ≤ 1.
By differentiating, we get
𝑓𝑌(𝑦) = (𝑑/𝑑𝑦) 𝐹𝑌(𝑦) = 1/𝑦, 1 ≤ 𝑦 ≤ 𝑒.
Hence 𝐸[𝑌] = ∫_1^𝑒 𝑦 𝑓𝑌(𝑦)𝑑𝑦 = 𝑒 − 1, or equivalently 𝐸[𝑌] = ∫_0^1 𝑒^𝑥 𝑑𝑥 = 𝑒 − 1.
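A Monte Carlo check of 𝐸[𝑌] = 𝑒 − 1 ≈ 1.7183 (an illustration only, assuming NumPy):

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=1_000_000)  # X ~ Uniform(0, 1), i.e. f_X(x) = 1 on [0, 1]
y = np.exp(x)                              # Y = e^X
print(y.mean())                            # close to e - 1 = 1.7182...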
13. Given the probability density function 𝑓𝑋(𝑥), find the cumulative distribution function.
𝑓𝑋(𝑥) = 0 for 𝑥 < −2;  0.25 for −2 ≤ 𝑥 < 1;  0.5 for 1 ≤ 𝑥 < 1.5;  0 for 1.5 ≤ 𝑥.
Solution:
(i) If 𝑥 < −2, 𝐹𝑋(𝑥) = ∫_{−∞}^{𝑥} 𝑓𝑋(𝑡)𝑑𝑡 = 0.
(ii) If −2 ≤ 𝑥 < 1, 𝐹𝑋(𝑥) = ∫_{−2}^{𝑥} 0.25 𝑑𝑡 = 0.25(𝑥 + 2).
(iii) If 1 ≤ 𝑥 < 1.5, 𝐹𝑋(𝑥) = ∫_{−2}^{1} 0.25 𝑑𝑡 + ∫_1^{𝑥} 0.5 𝑑𝑡 = 0.25(3) + 0.5(𝑥 − 1) = 0.5𝑥 + 0.25.
(iv) If 1.5 ≤ 𝑥, 𝐹𝑋(𝑥) = ∫_{−2}^{1} 0.25 𝑑𝑡 + ∫_1^{1.5} 0.5 𝑑𝑡 = 0.75 + 0.25 = 1.
Thus the cumulative distribution function is given by
𝐹𝑋(𝑥) = 0 for 𝑥 < −2;  0.25𝑥 + 0.5 for −2 ≤ 𝑥 < 1;  0.5𝑥 + 0.25 for 1 ≤ 𝑥 < 1.5;  1 for 1.5 ≤ 𝑥.
14. Determine the probability density function for the following cumulative distribution
function.
𝐹𝑋(𝑥) = 0 for 𝑥 < 0;  0.2𝑥 for 0 ≤ 𝑥 < 4;  0.04𝑥 + 0.64 for 4 ≤ 𝑥 < 9;  1 for 9 ≤ 𝑥.
Solution: Wherever 𝐹𝑋 is differentiable, 𝑓(𝑥) = (𝑑/𝑑𝑥)𝐹(𝑥).
(i) If 𝑥 < 0, 𝑓(𝑥) = 0
(ii) If 0 ≤ 𝑥 < 4, 𝑓(𝑥) = (𝑑/𝑑𝑥)(0.2𝑥) = 0.2
(iii) If 4 ≤ 𝑥 < 9, 𝑓(𝑥) = (𝑑/𝑑𝑥)(0.04𝑥 + 0.64) = 0.04
(iv) If 𝑥 ≥ 9, 𝑓(𝑥) = (𝑑/𝑑𝑥)(1) = 0
∴ 𝑓𝑋(𝑥) = 0 for 𝑥 < 0;  0.2 for 0 ≤ 𝑥 < 4;  0.04 for 4 ≤ 𝑥 < 9;  0 for 9 ≤ 𝑥.
Exercise:
1. Two cards are drawn randomly, simultaneously from a well shuffled deck of 52 cards. Find
the variance for the number of aces.
2. If 𝑋 is a discrete random variable taking the values 1, 2, 3, … with 𝑃(𝑥) = (1/2)(2/3)^𝑥, find
𝑃(𝑋 being an odd number) by first establishing that 𝑃(𝑥) is a probability function.
3. The probability mass function of a random variable 𝑋 is zero except at the points 𝑥 = 0, 1, 2. At
these points it has the values 𝑝(0) = 3𝑐³, 𝑝(1) = 4𝑐 − 10𝑐² and 𝑝(2) = 5𝑐 − 1 for some 𝑐 > 0.
a) Determine the value of 𝑐.
b) Compute the probabilities 𝑃(𝑋 < 2) and 𝑃(1 < 𝑋 ≤ 2).
c) Find the largest 𝑥 such that 𝐹(𝑥) < 1/2.
d) Find the smallest 𝑥 such that 𝐹(𝑥) ≥ 1/3.
4. If 𝑋 is a random variable with 𝑃(𝑋 = 𝑥) = 1/2^𝑥, where 𝑥 = 1, 2, 3, …, find
(i) 𝑃(𝑋), (ii) 𝑃(𝑋 = even), (iii) 𝑃(𝑋 divisible by 3).


5. A continuous random variable has the density function 𝑓(𝑥) = 𝑘𝑥² for −3 < 𝑥 < 3, and 0
otherwise. Find 𝑘 and hence find 𝑃(𝑋 < 3), 𝑃(𝑋 > 1).

6. Let 𝑋 be a continuous random variable with p.d.f.
𝑓(𝑥) = 𝑎𝑥 for 0 ≤ 𝑥 ≤ 1;  𝑎 for 1 ≤ 𝑥 ≤ 2;  −𝑎𝑥 + 3𝑎 for 2 ≤ 𝑥 ≤ 3;  0 otherwise.
(i) Determine the constant 𝑎. (ii) Compute 𝑃(𝑋 ≤ 1.5).

7. Find the mean and variance of the probability density function 𝑓(𝑥) = (1/2)𝑒^{−|𝑥|}.

8. A continuous distribution of a variable 𝑋 in the range (−3, 3) is defined by
𝑓(𝑥) = (1/16)(3 + 𝑥)² for −3 ≤ 𝑥 ≤ −1;  (1/16)(6 − 2𝑥²) for −1 ≤ 𝑥 ≤ 1;  (1/16)(3 − 𝑥)² for 1 ≤ 𝑥 ≤ 3.
(i) Verify that the area under the curve is unity.
(ii) Find the mean and variance of the above distribution.

Answers: 1) 0.1392 2) 3/5 3) 1/3, 1/3, 2/3, 1, 1 4) 1, 1/3, 1/7 5) 1/18, 1, 13/27
6) 1/2, 1/2 7) Mean = 0 and Variance = 2 8) Unit area and 0, 1.

JOINT PROBABILITY
Two or more random variables:
So far, only single random variables were considered. If one chooses a person at random and
measures his or her height and weight, each measurement is a random variable – but taller
people also tend to be heavier than shorter people, so the outcomes will be related. In order to
deal with such situations, the joint probability distribution of two random variables is studied
in detail.
Joint Probability distribution for discrete random variables
Joint Probability Mass Function:
Let 𝑋 and 𝑌 be random variables defined on the same sample space 𝑆 with respective range
spaces 𝑅𝑋 = {𝑥1 , 𝑥2 , … , 𝑥𝑛 } and 𝑅𝑌 = {𝑦1 , 𝑦2 , … , 𝑦𝑚 }. The joint distribution or joint
probability function of 𝑋 and 𝑌 is the function ℎ on the product space 𝑅𝑋 × 𝑅𝑌 defined by
ℎ(𝑥𝑖 , 𝑦𝑗 ) ≡ 𝑃(𝑋 = 𝑥𝑖 , 𝑌 = 𝑦𝑗 ) ≡ 𝑃({𝑠 ∈ 𝑆 ∶ 𝑋(𝑠) = 𝑥𝑖 , 𝑌(𝑠) = 𝑦𝑗 })
The function ℎ has the following properties:
(i) ℎ(𝑥𝑖 , 𝑦𝑗 ) ≥ 0
(ii) ∑𝑖 ∑𝑗 ℎ(𝑥𝑖 , 𝑦𝑗 ) = 1
Thus, ℎ defines a probability space on the product space 𝑅𝑋 × 𝑅𝑌 .
𝑋 \ 𝑌     𝑦1          𝑦2          …   𝑦𝑗          …   𝑦𝑚          𝑓(𝑥𝑖)
𝑥1        ℎ(𝑥1, 𝑦1)   ℎ(𝑥1, 𝑦2)   …   ℎ(𝑥1, 𝑦𝑗)   …   ℎ(𝑥1, 𝑦𝑚)   𝑓(𝑥1)
𝑥2        ℎ(𝑥2, 𝑦1)   ℎ(𝑥2, 𝑦2)   …   ℎ(𝑥2, 𝑦𝑗)   …   ℎ(𝑥2, 𝑦𝑚)   𝑓(𝑥2)
⋮          ⋮           ⋮               ⋮               ⋮           ⋮
𝑥𝑖        ℎ(𝑥𝑖, 𝑦1)   ℎ(𝑥𝑖, 𝑦2)   …   ℎ(𝑥𝑖, 𝑦𝑗)   …   ℎ(𝑥𝑖, 𝑦𝑚)   𝑓(𝑥𝑖)
⋮          ⋮           ⋮               ⋮               ⋮           ⋮
𝑥𝑛        ℎ(𝑥𝑛, 𝑦1)   ℎ(𝑥𝑛, 𝑦2)   …   ℎ(𝑥𝑛, 𝑦𝑗)   …   ℎ(𝑥𝑛, 𝑦𝑚)   𝑓(𝑥𝑛)
𝑔(𝑦𝑗)     𝑔(𝑦1)       𝑔(𝑦2)       …   𝑔(𝑦𝑗)       …   𝑔(𝑦𝑚)       1

The functions 𝑓 and 𝑔 on the right side and the bottom side, respectively, of the joint
distribution table are defined by
𝑓(𝑥𝑖) = ∑𝑗 ℎ(𝑥𝑖, 𝑦𝑗) and 𝑔(𝑦𝑗) = ∑𝑖 ℎ(𝑥𝑖, 𝑦𝑗).
That is, 𝑓(𝑥𝑖) is the sum of the entries in the 𝑖th row and 𝑔(𝑦𝑗) is the sum of the entries in the
𝑗th column. They are called the marginal distributions of 𝑋 and 𝑌, respectively.

Expectation: Consider a function 𝜑(𝑋, 𝑌) of 𝑋 and 𝑌. Then
𝐸{𝜑(𝑋, 𝑌)} = ∑𝑖 ∑𝑗 ℎ(𝑥𝑖, 𝑦𝑗) 𝜑(𝑥𝑖, 𝑦𝑗)
is called the mathematical expectation of 𝜑(𝑋, 𝑌) in the joint distribution of 𝑋 and 𝑌.
Co-variance and Correlation: Let 𝑋 and 𝑌 be random variables with the joint distribution
ℎ(𝑥, 𝑦), and respective means 𝜇𝑋 and 𝜇𝑌. The covariance of 𝑋 and 𝑌 is denoted by
𝑐𝑜𝑣(𝑋, 𝑌) and is defined as
𝑐𝑜𝑣(𝑋, 𝑌) = ∑𝑖,𝑗 (𝑥𝑖 − 𝜇𝑋)(𝑦𝑗 − 𝜇𝑌) ℎ(𝑥𝑖, 𝑦𝑗) = ∑𝑖,𝑗 𝑥𝑖 𝑦𝑗 ℎ(𝑥𝑖, 𝑦𝑗) − 𝜇𝑋 𝜇𝑌.

The correlation of 𝑋 and 𝑌 is defined by
𝜌(𝑋, 𝑌) = 𝑐𝑜𝑣(𝑋, 𝑌)/(𝜎𝑋 𝜎𝑌).
The correlation 𝜌 is dimensionless and has the following properties:
(i) 𝜌(𝑋, 𝑌) = 𝜌(𝑌, 𝑋),
(ii) −1 ≤ 𝜌 ≤ 1,
(iii) 𝜌(𝑋, 𝑋) = 1, 𝜌(𝑋, −𝑋) = −1,
(iv) 𝜌(𝑎𝑋 + 𝑏, 𝑐𝑌 + 𝑑) = 𝜌(𝑋, 𝑌) if 𝑎, 𝑐 ≠ 0.
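These definitions are mechanical to compute; the sketch below (an illustration only, assuming NumPy) extracts the marginals, covariance, and correlation from a joint table, here the coin-toss table of Problem 1 further below:

import numpy as np

# Joint table h[i, j] = P(X = x_i, Y = y_j)
x = np.array([0, 1])
y = np.array([0, 1, 2, 3])
h = np.array([[0, 1, 2, 1],
              [1, 2, 1, 0]]) / 8

f = h.sum(axis=1)                 # marginal distribution of X (row sums)
g = h.sum(axis=0)                 # marginal distribution of Y (column sums)
mu_x, mu_y = f @ x, g @ y         # means of the marginals
cov = x @ h @ y - mu_x * mu_y     # sum_ij x_i y_j h(x_i, y_j) - mu_X mu_Y
sx = np.sqrt(f @ x**2 - mu_x**2)  # sigma_X
sy = np.sqrt(g @ y**2 - mu_y**2)  # sigma_Y
print(cov, cov / (sx * sy))       # -0.25 and -0.577... = -1/sqrt(3)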

Conditional probability distribution:
We know that the value 𝑥 of the random variable 𝑋 represents an event that is a subset of the
sample space. If we use the definition of conditional probability,
𝑃(𝐵|𝐴) = 𝑃(𝐴 ∩ 𝐵)/𝑃(𝐴), provided 𝑃(𝐴) > 0,
where 𝐴 and 𝐵 are now the events defined by 𝑋 = 𝑥 and 𝑌 = 𝑦, respectively, then
𝑃(𝑌 = 𝑦|𝑋 = 𝑥) = 𝑃(𝑋 = 𝑥, 𝑌 = 𝑦)/𝑃(𝑋 = 𝑥) = ℎ(𝑥, 𝑦)/𝑓(𝑥), provided 𝑓(𝑥) > 0,
where 𝑋 and 𝑌 are discrete random variables.


Conditional expectation and Conditional Variance
The conditional expectation of 𝑋 given 𝑌 = 𝑦 is
𝐸[𝑋|𝑌 = 𝑦] = ∑𝑥 𝑥 𝑝𝑋|𝑌(𝑥|𝑦).
If 𝑍 = 𝑔(𝑋), then 𝐸[𝑍|𝑌 = 𝑦] = ∑𝑥 𝑔(𝑥) 𝑝𝑋|𝑌(𝑥|𝑦).
The conditional variance of 𝑋 given 𝑌 = 𝑦 is
𝑉𝑎𝑟(𝑋|𝑌 = 𝑦) = 𝐸[𝑋²|𝑌 = 𝑦] − (𝐸[𝑋|𝑌 = 𝑦])².
Total probability and expectation theorem:
𝑝𝑋(𝑥) = ∑𝑦 𝑝𝑋,𝑌(𝑥, 𝑦) = ∑𝑦 𝑝𝑌(𝑦) 𝑝𝑋|𝑌(𝑥|𝑦)
𝑝𝑌(𝑦) = ∑𝑥 𝑝𝑋,𝑌(𝑥, 𝑦) = ∑𝑥 𝑝𝑋(𝑥) 𝑝𝑌|𝑋(𝑦|𝑥)
𝐸[𝑋] = ∑𝑦 𝑝𝑌(𝑦) 𝐸[𝑋|𝑌 = 𝑦]
𝐸[𝑌] = ∑𝑥 𝑝𝑋(𝑥) 𝐸[𝑌|𝑋 = 𝑥]

Problem 1. A coin is tossed three times. Let 𝑋 be equal to 0 or 1 according as a head or a tail
occurs on the first toss. Let 𝑌 be equal to the total number of heads which occurs. Determine
(i) the marginal distributions of 𝑋 and 𝑌, and (ii) the joint distribution of 𝑋 and 𝑌, (iii)
expected values of 𝑋, 𝑌, 𝑋 + 𝑌 and 𝑋𝑌, (iv) 𝜎𝑋 and 𝜎𝑌 , (v) 𝐶𝑜𝑣(𝑋, 𝑌) and 𝜌(𝑋, 𝑌).
Solution:
Here the sample space is given by
𝑆 = {𝐻𝐻𝐻, 𝐻𝐻𝑇, 𝐻𝑇𝐻, 𝐻𝑇𝑇, 𝑇𝐻𝐻, 𝑇𝐻𝑇, 𝑇𝑇𝐻, 𝑇𝑇𝑇}
(i) The distribution of the random variable 𝑋 is given by the following table
𝑋:      0 (first toss head)   1 (first toss tail)
𝑃(𝑋):   4/8                   4/8
which is the marginal distribution of the random variable 𝑋.
The distribution of the random variable 𝑌 (total number of heads) is given by the following table
𝑌:      0    1    2    3
𝑃(𝑌):   1/8  3/8  3/8  1/8
which is the marginal distribution of the random variable 𝑌.
(ii) The joint distribution of the random variables 𝑋 and 𝑌 is given by the following table
                          𝑌 = 0   𝑌 = 1   𝑌 = 2   𝑌 = 3
𝑋 = 0 (first toss head):   0      1/8     2/8     1/8
𝑋 = 1 (first toss tail):   1/8    2/8     1/8      0
(iii) 𝐸[𝑋] = 𝜇𝑋 = ∑ 𝑥𝑖 𝑃(𝑥𝑖) = 0 × 4/8 + 1 × 4/8 = 1/2
𝐸[𝑌] = 𝜇𝑌 = ∑ 𝑦𝑗 𝑃(𝑦𝑗) = 0 × 1/8 + 1 × 3/8 + 2 × 3/8 + 3 × 1/8 = 12/8 = 3/2
𝐸[𝑋 + 𝑌] = ∑∑ ℎ(𝑥𝑖, 𝑦𝑗)(𝑥𝑖 + 𝑦𝑗)
= 0(0 + 0) + (1/8)(0 + 1) + (2/8)(0 + 2) + (1/8)(0 + 3) + (1/8)(1 + 0) + (2/8)(1 + 1) + (1/8)(1 + 2) + 0(1 + 3)
= 16/8 = 2.
𝐸[𝑋𝑌] = ∑∑ ℎ(𝑥𝑖, 𝑦𝑗)(𝑥𝑖 𝑦𝑗) = (2/8)(1 × 1) + (1/8)(1 × 2) = 4/8 = 1/2,
since every other product 𝑥𝑖 𝑦𝑗 with nonzero probability vanishes.
(iv) 𝜎𝑋² = 𝐸[𝑋²] − 𝜇𝑋² = 0² × 4/8 + 1² × 4/8 − (1/2)² = 1/4
𝜎𝑌² = 𝐸[𝑌²] − 𝜇𝑌² = 0² × 1/8 + 1² × 3/8 + 2² × 3/8 + 3² × 1/8 − (3/2)² = 3 − 9/4 = 3/4
(v) 𝐶𝑜𝑣(𝑋, 𝑌) = 𝐸[𝑋𝑌] − 𝜇𝑋 𝜇𝑌 = 1/2 − (1/2)(3/2) = −1/4
𝜌(𝑋, 𝑌) = 𝐶𝑜𝑣(𝑋, 𝑌)/(𝜎𝑋 𝜎𝑌) = (−1/4)/((1/2)(√3/2)) = −1/√3.

Problem 2: The joint distribution of two random variables 𝑋 and 𝑌 is given by the following
table:
        𝑌 = 2   𝑌 = 3   𝑌 = 4
𝑋 = 1:   0.06    0.15    0.09
𝑋 = 2:   0.14    0.35    0.21
Determine the individual distributions of X and Y. Also, verify that X and Y are stochastically
independent.
Solution:
X takes values 1, 2 and Y takes the values 2, 3, 4. Also, h11 = 0.06, h12 = 0.15
h13 = 0.09, h21 = 0.14, h22 = 0.35, h23 = 0.21
Therefore, f1 = h11 + h12 + h13 = 0.3, f2 = h21 + h22 + h23 = 0.7,
g1 = h11 + h21 = 0.2, g 2 = h12 + h22 = 0.5, g 3 = h13 + h23 = 0.3.
The distribution of X is given by
xi 1 2
𝑓𝑖 0.3 0.7
The distribution of 𝑌 is given by
yj 2 3 4
gj 0.2 0.5 0.3
f1 g1 = 0.06 = h11 , f1 g 2 = 0.15 = h12, f1 g 3 = 0.09 = h13,
f2 g1 = 0.14 = h21 , f2 g 2 = 0.35 = h22, f2 g 3 = 0.21 = h23,
Thus, fi g j = hij for all values of i and j so, X and Y are stochastically independent.
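The cell-by-cell check 𝑓𝑖 𝑔𝑗 = ℎ𝑖𝑗 is easy to automate; an illustrative sketch, assuming NumPy:

import numpy as np

h = np.array([[0.06, 0.15, 0.09],      # joint table of Problem 2
              [0.14, 0.35, 0.21]])

f = h.sum(axis=1)                      # marginals of X: [0.3, 0.7]
g = h.sum(axis=0)                      # marginals of Y: [0.2, 0.5, 0.3]
print(np.allclose(np.outer(f, g), h))  # True: f_i g_j = h_ij, so X and Y are independent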
Problem 3. The joint distribution of two random variables 𝑋 and 𝑌 is given by the following
table:
        𝑌 = 0   𝑌 = 1
𝑋 = 0:   0.1     0.2
𝑋 = 1:   0.4     0.2
𝑋 = 2:   0.1     0
(a) Find P(X + Y > 1)
(b) Determine the individual (marginal) probability distributions of X and Y and verify that X
and 𝑌 are not independent.
(c) Find 𝑃(𝑋 = 2|𝑌 = 0).
(d) Find the conditional distribution of 𝑋 given 𝑌 = 1.
Solution: Note that 𝑋 takes the values 0, 1, 2 and 𝑌 takes the values 0, 1, with
ℎ11 = 0.1, ℎ12 = 0.2, ℎ21 = 0.4, ℎ22 = 0.2, ℎ31 = 0.1, ℎ32 = 0.
(a) The event 𝑋 + 𝑌 > 1 occurs only when the pair (𝑋, 𝑌) takes the values (1,1), (2,0) and (2,1).
The probability that this event occurs is therefore
P(𝑋 + 𝑌 > 1) = ℎ22 + ℎ31 + ℎ32 = 0.2 + 0.1 + 0 = 0.3.
(b) 𝑓1 = ℎ11 + ℎ12 = 0.1 + 0.2 = 0.3.
𝑓2 = ℎ21 + ℎ22 = 0.4 + 0.2 = 0.6.

𝑓3 = ℎ31 + ℎ32 = 0.1 + 0 = 0.1.


𝑔1 = ℎ11 + ℎ21 + ℎ31 = 0.6
𝑔 2 = ℎ12 + ℎ22 + ℎ32 = 0.4
The distribution of 𝑋 is given by
𝑥𝑖:   0    1    2
𝑓𝑖:   0.3  0.6  0.1

The distribution of 𝑌 is given by
𝑦𝑗 0 1
𝑔𝑗 0.6 0.4
It is verified that 𝑓1 𝑔1 = 0.18 ≠ ℎ11 .
Therefore, 𝑋 and 𝑌 are not stochastically independent.
(c) 𝑃(𝑋 = 2|𝑌 = 0) = ℎ(2,0)/𝑔(0) = ℎ31/𝑔1 = 0.1/0.6 = 1/6

(d) The conditional distribution of 𝑋 given 𝑌 = 1 is
𝑃(𝑋 = 𝑥|𝑌 = 1) = ℎ(𝑥, 1)/𝑔(1) = ℎ𝑖2/𝑔2
𝑃(𝑋 = 0|𝑌 = 1) = ℎ12/𝑔2 = 0.2/0.4 = 0.5
𝑃(𝑋 = 1|𝑌 = 1) = ℎ22/𝑔2 = 0.2/0.4 = 0.5
𝑃(𝑋 = 2|𝑌 = 1) = ℎ32/𝑔2 = 0/0.4 = 0

Problem 4. The joint distribution of two random variables 𝑋 and 𝑌 is given by 𝑝𝑖𝑗 =
𝑘(𝑖 + 𝑗), 𝑖 = 1, 2, 3, 4; 𝑗 = 1, 2, 3. Find (i) 𝑘 and (ii) the marginal distributions of 𝑋 and 𝑌.
Show that 𝑋 and 𝑌 are not independent.
Solution: For the given 𝑝𝑖𝑗,
∑𝑖 ∑𝑗 ℎ𝑖𝑗 = 𝑘 ∑_{𝑖=1}^{4} ∑_{𝑗=1}^{3} (𝑖 + 𝑗) = 𝑘 ∑_{𝑖=1}^{4} {(𝑖 + 1) + (𝑖 + 2) + (𝑖 + 3)} = 𝑘 ∑_{𝑖=1}^{4} (3𝑖 + 6)
= 𝑘{9 + 12 + 15 + 18} = 54𝑘.
Since ∑𝑖 ∑𝑗 ℎ𝑖𝑗 = 1, we get 54𝑘 = 1, i.e., 𝑘 = 1/54.
𝑓𝑖 = ∑_{𝑗=1}^{3} ℎ𝑖𝑗 = 𝑘 ∑_{𝑗=1}^{3} (𝑖 + 𝑗) = (3𝑖 + 6)/54 = (𝑖 + 2)/18
𝑔𝑗 = ∑_{𝑖=1}^{4} ℎ𝑖𝑗 = 𝑘 ∑_{𝑖=1}^{4} (𝑖 + 𝑗) = (4𝑗 + 10)/54 = (2𝑗 + 5)/27
Therefore, the marginal distributions of 𝑋 and 𝑌 are
{𝑓𝑖} = {(𝑖 + 2)/18}, 𝑖 = 1, 2, 3, 4 and {𝑔𝑗} = {(2𝑗 + 5)/27}, 𝑗 = 1, 2, 3.
Finally, note that 𝑓1 𝑔1 = (3/18)(7/27) ≠ 2/54 = ℎ11, so 𝑋 and 𝑌 are not independent.
Problem 5. The joint probability distribution of two random variables 𝑋 and 𝑌 is given by
the following table.
        𝑌 = 1   𝑌 = 3   𝑌 = 9
𝑋 = 2:   1/8     1/24    1/12
𝑋 = 4:   1/4     1/4      0
𝑋 = 6:   1/8     1/24    1/12
Find the marginal distributions of 𝑋 and 𝑌, and evaluate 𝑐𝑜𝑣(𝑋, 𝑌). Find 𝑃(𝑋 = 4|𝑌 = 3) and
𝑃(𝑌 = 3|𝑋 = 4).
Solution: From the table, note that
𝑓1 = 1/8 + 1/24 + 1/12 = 1/4
𝑓2 = 1/4 + 1/4 + 0 = 1/2
𝑓3 = 1/8 + 1/24 + 1/12 = 1/4
𝑔1 = 1/8 + 1/4 + 1/8 = 1/2
𝑔2 = 1/24 + 1/4 + 1/24 = 1/3
𝑔3 = 1/12 + 0 + 1/12 = 1/6
The marginal distribution of 𝑋 is given by the table:
𝑥𝑖:   2    4    6
𝑓𝑖:   1/4  1/2  1/4
And the marginal distribution of 𝑌 is given by the table:
𝑦𝑗:   1    3    9
𝑔𝑗:   1/2  1/3  1/6
Therefore, the means of these distributions are, respectively,
𝜇𝑋 = ∑ 𝑥𝑖 𝑃(𝑥𝑖) = (2 × 1/4) + (4 × 1/2) + (6 × 1/4) = 4
𝜇𝑌 = ∑ 𝑦𝑗 𝑃(𝑦𝑗) = (1 × 1/2) + (3 × 1/3) + (9 × 1/6) = 3
𝐸[𝑋𝑌] = ∑𝑖 ∑𝑗 ℎ𝑖𝑗 𝑥𝑖 𝑦𝑗
= (2 × 1/8) + (6 × 1/24) + (18 × 1/12) + (4 × 1/4) + (12 × 1/4) + (36 × 0)
+ (6 × 1/8) + (18 × 1/24) + (54 × 1/12) = 2 + 4 + 6 = 12
𝐶𝑜𝑣(𝑋, 𝑌) = 𝐸[𝑋𝑌] − 𝜇𝑋 𝜇𝑌 = 12 − 12 = 0, hence 𝜌(𝑋, 𝑌) = 0.
𝑃(𝑋 = 4|𝑌 = 3) = ℎ(4,3)/𝑔(3) = ℎ22/𝑔2 = (1/4)/(1/3) = 3/4
𝑃(𝑌 = 3|𝑋 = 4) = ℎ(4,3)/𝑓(4) = ℎ22/𝑓2 = (1/4)/(1/2) = 1/2

Problem 6. The joint probability function for two discrete random variables 𝑋 and 𝑌 is given by
𝑝(𝑥, 𝑦) = 𝑐(2𝑥 + 𝑦) if 𝑥, 𝑦 ∈ ℤ and 0 ≤ 𝑥 ≤ 2, 0 ≤ 𝑦 ≤ 3, and 𝑝(𝑥, 𝑦) = 0 otherwise.
Find (i) the value of the constant 𝑐, (ii) 𝑃(𝑋 = 2, 𝑌 = 1), (iii) 𝑃(𝑋 ≥ 1, 𝑌 ≤ 2),
(iv) 𝑃(𝑋 + 𝑌 ≤ 1), (v) 𝑃(𝑋 + 𝑌 > 1).
Solution: (i) If 𝑝(𝑥, 𝑦) is a joint p.m.f., then ∑𝑥 ∑𝑦 𝑝(𝑥, 𝑦) = 1. Writing 𝐶 for 𝑐, the table of values is
        𝑌 = 0   𝑌 = 1   𝑌 = 2   𝑌 = 3
𝑋 = 0:    0       𝐶      2𝐶      3𝐶
𝑋 = 1:   2𝐶      3𝐶      4𝐶      5𝐶
𝑋 = 2:   4𝐶      5𝐶      6𝐶      7𝐶
so ∑𝑥 ∑𝑦 𝑝(𝑥, 𝑦) = 42𝐶 = 1 ⇒ 𝐶 = 1/42.
(ii) 𝑃(𝑋 = 2, 𝑌 = 1) = 5𝐶 = 5/42
(iii) 𝑃(𝑋 ≥ 1, 𝑌 ≤ 2) = 2𝐶 + 3𝐶 + 4𝐶 + 4𝐶 + 5𝐶 + 6𝐶 = 24𝐶 = 4/7
(iv) 𝑃(𝑋 + 𝑌 ≤ 1) = 𝑝(0, 0) + 𝑝(0, 1) + 𝑝(1, 0) = 0 + 𝐶 + 2𝐶 = 3𝐶 = 1/14
(v) 𝑃(𝑋 + 𝑌 > 1) = 1 − 𝑃(𝑋 + 𝑌 ≤ 1) = 1 − 1/14 = 13/14

Problem 7. Two ballpoint pens are selected at random from a box that contains 3 blue pens,
2 red pens, and 3 green pens. If 𝑋 is the number of blue pens selected and 𝑌 is the number of
red pens selected, find the conditional distribution of 𝑋, given that 𝑌 = 1, and use it to
determine 𝑃(𝑋 = 0|𝑌 = 1).
Solution: The possible pairs of values (𝑥, 𝑦) are {(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (2, 0)}.

The joint probability distribution of 𝑋 and 𝑌 is given by
𝑝𝑋,𝑌(𝑥, 𝑦) = (3𝐶𝑥 × 2𝐶𝑦 × 3𝐶(2 − 𝑥 − 𝑦)) / 8𝐶2
        𝑋 = 0   𝑋 = 1   𝑋 = 2
𝑌 = 0:   3/28    9/28    3/28
𝑌 = 1:   3/14    3/14     0
𝑌 = 2:   1/28     0       0
𝑃(𝑌 = 1) = 3/14 + 3/14 = 3/7
𝑝𝑋|𝑌(𝑥|1) = 𝑃(𝑋 = 𝑥|𝑌 = 1) = 𝑃(𝑋 = 𝑥, 𝑌 = 1)/𝑃(𝑌 = 1) = 𝑝𝑋,𝑌(𝑥, 1) × 7/3
𝑥:            0                   1                   2
𝑝𝑋|𝑌(𝑥|1):   (3/14)(7/3) = 1/2   (3/14)(7/3) = 1/2   0 × (7/3) = 0
Finally, 𝑃(𝑋 = 0|𝑌 = 1) = 1/2.
Problem 8. Let 𝑋 denote the number of times a certain numerical control machine will
malfunction: 1, 2, or 3 times on any given day. Let 𝑌 denote the number of times a technician
is called on an emergency call. Their joint probability distribution is given as
        𝑋 = 1   𝑋 = 2   𝑋 = 3
𝑌 = 1:   0.05    0.05    0.10
𝑌 = 3:   0.05    0.10    0.35
𝑌 = 5:   0.00    0.20    0.10
(i) Find 𝑃(𝑌 = 3|𝑋 = 2), (ii) 𝐸[𝑋|𝑌 = 3], (iii) 𝑉𝑎𝑟[𝑋|𝑌 = 3].
Solution:
(i) 𝑃(𝑌 = 3|𝑋 = 2) = 𝑃(𝑌 = 3, 𝑋 = 2)/𝑃(𝑋 = 2) = 0.10/(0.05 + 0.10 + 0.20) = 2/7
(ii) 𝑃(𝑌 = 3) = 0.05 + 0.10 + 0.35 = 0.5
𝑃(𝑋 = 1|𝑌 = 3) = 0.05/0.50 = 1/10
𝑃(𝑋 = 2|𝑌 = 3) = 0.10/0.50 = 1/5
𝑃(𝑋 = 3|𝑌 = 3) = 0.35/0.50 = 7/10
𝑥:            1     2    3
𝑝𝑋|𝑌(𝑥|3):   1/10  1/5  7/10
𝐸[𝑋|𝑌 = 3] = ∑𝑥 𝑥 𝑝𝑋|𝑌(𝑥|3) = 1 × 1/10 + 2 × 1/5 + 3 × 7/10 = 2.6
(iii) 𝐸[𝑋²|𝑌 = 3] = ∑𝑥 𝑥² 𝑝𝑋|𝑌(𝑥|3) = 1² × 1/10 + 2² × 1/5 + 3² × 7/10 = 7.2
𝑉𝑎𝑟[𝑋|𝑌 = 3] = 𝐸[𝑋²|𝑌 = 3] − (𝐸[𝑋|𝑌 = 3])² = 7.2 − 6.76 = 0.44
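The same conditional computations can be scripted; an illustrative sketch (assuming NumPy) for the table of Problem 8:

import numpy as np

# Rows indexed by Y in {1, 3, 5}, columns by X in {1, 2, 3}.
x = np.array([1, 2, 3])
h = np.array([[0.05, 0.05, 0.10],
              [0.05, 0.10, 0.35],
              [0.00, 0.20, 0.10]])

row_y3 = h[1]                  # joint probabilities with Y = 3
cond = row_y3 / row_y3.sum()   # p_{X|Y}(x|3) = h(x, 3)/P(Y = 3) -> [0.1, 0.2, 0.7]
e_x = cond @ x                 # E[X|Y = 3] = 2.6
var_x = cond @ x**2 - e_x**2   # Var(X|Y = 3) = 7.2 - 6.76 = 0.44
print(cond, e_x, var_x)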

Problems to practice:
1) The joint probability distribution of two random variables X and Y is given by the
following table.

        𝑌 = −2   𝑌 = −1   𝑌 = 4   𝑌 = 5
𝑋 = 1:   0.1      0.2       0      0.3
𝑋 = 2:   0.2      0.1      0.1      0
(a) Find the marginal distributions of 𝑋 and 𝑌, and evaluate 𝑐𝑜𝑣(𝑋, 𝑌).
(b) Also determine 𝜇𝑋 and 𝜇𝑌.
(c) Find 𝑃(𝑌 = −1|𝑋 = 1) and 𝑃(𝑋 = 2|𝑌 = 4)

2) Two textbooks are selected at random from a shelf containing three statistics texts,
two mathematics texts and three engineering texts. Denoting the number of books
selected in each subject by S, M and E respectively, find (a) the joint distribution of S
and M, (b) the marginal distributions of S, M and E, and (c) Find the correlation of the
random variables S and M.
3) Consider an experiment that consists of 2 throws of a fair die. Let 𝑋 be the number of
4s and 𝑌 be the number of 5s obtained in the two throws. Find the joint probability
distribution of 𝑋 and 𝑌. Also evaluate 𝑃(2𝑋 + 𝑌 < 3).

Joint Probability distribution for continuous random variables:
Let 𝑥 and 𝑦 be two continuous random variables. Suppose there exists a real valued function
ℎ(𝑥, 𝑦) of 𝑥 and 𝑦 such that the following conditions hold:
(i) ℎ(𝑥, 𝑦) ≥ 0 for all 𝑥, 𝑦
(ii) ∫_{−∞}^{∞} ∫_{−∞}^{∞} ℎ(𝑥, 𝑦) 𝑑𝑥 𝑑𝑦 exists and is equal to 1.
Then ℎ(𝑥, 𝑦) is called a joint probability density function.
If [𝑎, 𝑏] and [𝑐, 𝑑] are any two intervals, then the probability that 𝑥 ∈ [𝑎, 𝑏] and 𝑦 ∈ [𝑐, 𝑑],
denoted by 𝑃(𝑎 ≤ 𝑥 ≤ 𝑏, 𝑐 ≤ 𝑦 ≤ 𝑑), is defined by the formula
𝑃(𝑎 ≤ 𝑥 ≤ 𝑏, 𝑐 ≤ 𝑦 ≤ 𝑑) = ∫_𝑎^𝑏 ∫_𝑐^𝑑 ℎ(𝑥, 𝑦) 𝑑𝑦 𝑑𝑥.

For any specified real numbers 𝑢, 𝑣, the function
𝐹(𝑢, 𝑣) = ∫_{−∞}^{𝑢} ∫_{−∞}^{𝑣} ℎ(𝑥, 𝑦) 𝑑𝑦 𝑑𝑥
is called the joint or compound cumulative distribution function,
where 𝐹(𝑢, 𝑣) = 𝑃(−∞ < 𝑥 ≤ 𝑢, −∞ < 𝑦 ≤ 𝑣) and ∂²𝐹/∂𝑢∂𝑣 = ℎ(𝑢, 𝑣).

Further, the function ℎ1(𝑥) = ∫_{−∞}^{∞} ℎ(𝑥, 𝑦) 𝑑𝑦 is called the marginal density function of 𝑥, and
the function ℎ2(𝑦) = ∫_{−∞}^{∞} ℎ(𝑥, 𝑦) 𝑑𝑥 is called the marginal density function of 𝑦; ℎ1(𝑥) is the
density function of 𝑥 and ℎ2(𝑦) is the density function of 𝑦.
The variables 𝑥 and 𝑦 are said to be stochastically independent if ℎ1(𝑥)ℎ2(𝑦) = ℎ(𝑥, 𝑦).
If 𝜑(𝑥, 𝑦) is a function of 𝑥 and 𝑦, then the expectation of 𝜑(𝑥, 𝑦) is defined by
𝐸{𝜑(𝑥, 𝑦)} = ∫_{−∞}^{∞} ∫_{−∞}^{∞} 𝜑(𝑥, 𝑦) ℎ(𝑥, 𝑦) 𝑑𝑥 𝑑𝑦.
The covariance between 𝑥 and 𝑦 is defined as 𝐶𝑜𝑣(𝑥, 𝑦) = 𝐸{𝑥𝑦} − 𝐸{𝑥}𝐸{𝑦}.
Conditional Probability:
The idea of the conditional probability function of discrete random variables extends to the
case of continuous random variables.
If 𝑋 and 𝑌 are continuous random variables, then the conditional probability distribution of 𝑌
given 𝑋 is
𝑓(𝑦|𝑥) = ℎ(𝑥, 𝑦)/ℎ1(𝑥),
where ℎ(𝑥, 𝑦) is the joint density function of 𝑋 and 𝑌, and ℎ1(𝑥) is the marginal density
function of 𝑋. Also,
𝑃(𝑐 < 𝑌 < 𝑑 | 𝑎 < 𝑋 < 𝑏) = 𝑃(𝑎 < 𝑋 < 𝑏, 𝑐 < 𝑌 < 𝑑)/𝑃(𝑎 < 𝑋 < 𝑏).
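Such double integrals are convenient to evaluate numerically. The sketch below is only an illustration (assuming SciPy), using the product density ℎ(𝑥, 𝑦) = 𝑒^{−(𝑥+𝑦)}, 𝑥, 𝑦 ≥ 0, of Problem 4 further below:

import numpy as np
from scipy.integrate import dblquad

h = lambda x, y: np.exp(-(x + y))   # joint density for x >= 0, y >= 0

# P(a <= x <= b, c <= y <= d); dblquad integrates over y first, then x.
p, _ = dblquad(lambda y, x: h(x, y), 0.5, 2, 0, 4)
print(p)                             # (e^{-1/2} - e^{-2})(1 - e^{-4})

# Cov(x, y) = E{xy} - E{x}E{y}; it vanishes for this product-form density.
exy, _ = dblquad(lambda y, x: x * y * h(x, y), 0, np.inf, 0, np.inf)
ex, _ = dblquad(lambda y, x: x * h(x, y), 0, np.inf, 0, np.inf)
ey, _ = dblquad(lambda y, x: y * h(x, y), 0, np.inf, 0, np.inf)
print(exy - ex * ey)                 # approximately 0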

Problem 1: Find the constant ‘𝑘’ so that
ℎ(𝑥, 𝑦) = 𝑘(𝑥 + 1)𝑒^{−𝑦} for 0 < 𝑥 < 1, 𝑦 > 0, and ℎ(𝑥, 𝑦) = 0 elsewhere,
is a joint probability density function. Are 𝑥 and 𝑦 independent?
Solution: Observe that ℎ(𝑥, 𝑦) ≥ 0 for all 𝑥, 𝑦 if 𝑘 ≥ 0, and
∫_{−∞}^{∞} ∫_{−∞}^{∞} ℎ(𝑥, 𝑦) 𝑑𝑥 𝑑𝑦 = 𝑘 {∫_0^1 (𝑥 + 1)𝑑𝑥} {∫_0^∞ 𝑒^{−𝑦} 𝑑𝑦} = 𝑘 (3/2)(1) = (3/2)𝑘.
Hence ∫_{−∞}^{∞} ∫_{−∞}^{∞} ℎ(𝑥, 𝑦) 𝑑𝑥 𝑑𝑦 = 1 if 𝑘 = 2/3.
Therefore, ℎ(𝑥, 𝑦) is a joint probability density function if 𝑘 = 2/3.
With 𝑘 = 2/3, the marginal density functions are
ℎ1(𝑥) = ∫_{−∞}^{∞} ℎ(𝑥, 𝑦) 𝑑𝑦 = (2/3)(𝑥 + 1) ∫_0^∞ 𝑒^{−𝑦} 𝑑𝑦 = (2/3)(𝑥 + 1), 0 < 𝑥 < 1,
ℎ2(𝑦) = ∫_{−∞}^{∞} ℎ(𝑥, 𝑦) 𝑑𝑥 = (2/3)𝑒^{−𝑦} ∫_0^1 (𝑥 + 1) 𝑑𝑥 = (2/3)𝑒^{−𝑦}(3/2) = 𝑒^{−𝑦}, 𝑦 > 0.
Therefore, ℎ1(𝑥)ℎ2(𝑦) = ℎ(𝑥, 𝑦), and hence 𝑥 and 𝑦 are stochastically independent.
Problem 2: The lifetime 𝑥 and brightness 𝑦 of a light bulb are modeled as continuous
random variables with joint density function
ℎ(𝑥, 𝑦) = 𝛼𝛽𝑒^{−(𝛼𝑥 + 𝛽𝑦)}, 0 < 𝑥 < ∞, 0 < 𝑦 < ∞,
where 𝛼 and 𝛽 are appropriate constants. Find (i) the marginal density functions of 𝑥 and 𝑦,
and (ii) the compound cumulative distribution function.
Solution: For the given distribution, the marginal density function of 𝑥 is
ℎ1(𝑥) = ∫_{−∞}^{∞} ℎ(𝑥, 𝑦) 𝑑𝑦 = 𝛼𝛽𝑒^{−𝛼𝑥} ∫_0^∞ 𝑒^{−𝛽𝑦} 𝑑𝑦 = 𝛼𝑒^{−𝛼𝑥}, 0 < 𝑥 < ∞,
and the marginal density function of 𝑦 is
ℎ2(𝑦) = ∫_{−∞}^{∞} ℎ(𝑥, 𝑦) 𝑑𝑥 = 𝛽𝑒^{−𝛽𝑦}, 0 < 𝑦 < ∞.
Further, the compound cumulative distribution function is
𝐹(𝑢, 𝑣) = ∫_{−∞}^{𝑢} ∫_{−∞}^{𝑣} ℎ(𝑥, 𝑦) 𝑑𝑦 𝑑𝑥 = 𝛼𝛽 {∫_0^𝑢 𝑒^{−𝛼𝑥} 𝑑𝑥} {∫_0^𝑣 𝑒^{−𝛽𝑦} 𝑑𝑦}
= 𝛼𝛽 {(1/𝛼)(1 − 𝑒^{−𝛼𝑢})} {(1/𝛽)(1 − 𝑒^{−𝛽𝑣})}
= (1 − 𝑒^{−𝛼𝑢})(1 − 𝑒^{−𝛽𝑣}), 0 < 𝑢 < ∞, 0 < 𝑣 < ∞.
Problem 3: The joint probability density function of two random variables 𝑥 and 𝑦 is given
by ℎ(𝑥, 𝑦) = 2 for 0 < 𝑥 < 𝑦 < 1, and ℎ(𝑥, 𝑦) = 0 elsewhere.
Find the covariance between 𝑥 and 𝑦.
Solution: The marginal density function of 𝑥 is
ℎ1(𝑥) = ∫_𝑥^1 2 𝑑𝑦 = 2(1 − 𝑥) for 0 < 𝑥 < 1, and 0 elsewhere.
The marginal density function of 𝑦 is
ℎ2(𝑦) = ∫_0^𝑦 2 𝑑𝑥 = 2𝑦 for 0 < 𝑦 < 1, and 0 elsewhere.
𝐸[𝑥] = ∫_{−∞}^{∞} 𝑥 ℎ1(𝑥)𝑑𝑥 = ∫_0^1 𝑥 · 2(1 − 𝑥) 𝑑𝑥 = 2(1/2 − 1/3) = 1/3,
𝐸[𝑦] = ∫_{−∞}^{∞} 𝑦 ℎ2(𝑦)𝑑𝑦 = ∫_0^1 𝑦 · 2𝑦 𝑑𝑦 = 2/3,
𝐸[𝑥𝑦] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} 𝑥𝑦 ℎ(𝑥, 𝑦) 𝑑𝑥 𝑑𝑦 = ∫_0^1 2𝑦 {∫_0^𝑦 𝑥 𝑑𝑥} 𝑑𝑦 = ∫_0^1 𝑦³ 𝑑𝑦 = 1/4.
Therefore,
𝐶𝑜𝑣(𝑥, 𝑦) = 𝐸[𝑥𝑦] − 𝐸[𝑥]𝐸[𝑦] = 1/4 − (1/3)(2/3) = 1/36.

Problem 4: Verify that 𝑓(𝑥, 𝑦) = 𝑒^{−(𝑥+𝑦)} for 𝑥 ≥ 0, 𝑦 ≥ 0, and 𝑓(𝑥, 𝑦) = 0 elsewhere, is a
density function of a joint probability distribution. Then evaluate the following:
(i) 𝑃(1/2 < 𝑥 < 2, 0 < 𝑦 < 4) (ii) 𝑃(𝑥 < 1) (iii) 𝑃(𝑥 > 𝑦) (iv) 𝑃(𝑥 + 𝑦 ≤ 1)
(v) 𝑃(0 < 𝑥 < 1|𝑦 = 2).
Solution: Clearly 𝑓(𝑥, 𝑦) ≥ 0, and
∫_{−∞}^{∞} ∫_{−∞}^{∞} 𝑓(𝑥, 𝑦) 𝑑𝑥 𝑑𝑦 = {∫_0^∞ 𝑒^{−𝑥} 𝑑𝑥} {∫_0^∞ 𝑒^{−𝑦} 𝑑𝑦} = (1)(1) = 1.
Therefore, 𝑓(𝑥, 𝑦) is a density function.
(i) 𝑃(1/2 < 𝑥 < 2, 0 < 𝑦 < 4) = {∫_{1/2}^{2} 𝑒^{−𝑥} 𝑑𝑥} {∫_0^4 𝑒^{−𝑦} 𝑑𝑦} = (𝑒^{−1/2} − 𝑒^{−2})(1 − 𝑒^{−4}).
(ii) The marginal density function of 𝑥 is
ℎ1(𝑥) = ∫_0^∞ 𝑒^{−(𝑥+𝑦)} 𝑑𝑦 = 𝑒^{−𝑥} ∫_0^∞ 𝑒^{−𝑦} 𝑑𝑦 = 𝑒^{−𝑥}.
Therefore, 𝑃(𝑥 < 1) = ∫_0^1 ℎ1(𝑥) 𝑑𝑥 = ∫_0^1 𝑒^{−𝑥} 𝑑𝑥 = 1 − 1/𝑒.
(iii) 𝑃(𝑥 ≤ 𝑦) = ∫_0^∞ {∫_0^𝑦 𝑒^{−(𝑥+𝑦)} 𝑑𝑥} 𝑑𝑦 = ∫_0^∞ 𝑒^{−𝑦}(1 − 𝑒^{−𝑦}) 𝑑𝑦
= ∫_0^∞ (𝑒^{−𝑦} − 𝑒^{−2𝑦}) 𝑑𝑦 = 1 − 1/2 = 1/2.
Therefore, 𝑃(𝑥 > 𝑦) = 1 − 𝑃(𝑥 ≤ 𝑦) = 1 − 1/2 = 1/2.
(iv) 𝑃(𝑥 + 𝑦 ≤ 1) = ∫_{𝑥=0}^{1} ∫_{𝑦=0}^{1−𝑥} 𝑒^{−(𝑥+𝑦)} 𝑑𝑦 𝑑𝑥 = ∫_0^1 𝑒^{−𝑥}{1 − 𝑒^{−(1−𝑥)}} 𝑑𝑥
= ∫_0^1 (𝑒^{−𝑥} − 𝑒^{−1}) 𝑑𝑥 = 1 − 2/𝑒.
(v) Putting 𝑦 = 2,
𝑃(0 < 𝑥 < 1|𝑦 = 2) = ∫_0^1 𝑒^{−(𝑥+2)} 𝑑𝑥 / ∫_0^∞ 𝑒^{−(𝑥+2)} 𝑑𝑥 = 1 − 1/𝑒 ≈ 0.632.

Problem 5. Determine the value of 𝐶 that makes the function 𝑓(𝑥, 𝑦) = 𝐶𝑒^{−2𝑥−3𝑦} a joint
p.d.f. over the range 0 < 𝑥 and 𝑥 < 𝑦.
Determine the following:
(i) 𝑃(𝑋 < 1, 𝑌 < 2)
(ii) 𝑃(1 < 𝑋 < 2)
(iii) 𝑃(𝑌 > 3)
(iv) 𝑃(𝑋 < 2, 𝑌 < 2)
(v) 𝐸[𝑋]
(vi) 𝐸[𝑌]
(vii) Marginal probability distribution of 𝑋
(viii) Conditional probability distribution of 𝑌 given 𝑋 = 1
(ix) 𝐸[𝑌|𝑋 = 1]
(x) 𝑃(𝑌 < 2|𝑋 = 1)
(xi) Conditional probability distribution of 𝑋 given 𝑌 = 2
Solution: 𝑓𝑋,𝑌(𝑥, 𝑦) = 𝐶𝑒^{−2𝑥−3𝑦} is defined on 𝐷 = {(𝑥, 𝑦) | 0 < 𝑥 < 𝑦 < ∞}.
To find 𝐶, require ∬_𝐷 𝐶𝑒^{−2𝑥−3𝑦} 𝑑𝐴 = 1:
∫_{𝑦=0}^{∞} 𝑒^{−3𝑦} {∫_{𝑥=0}^{𝑦} 𝐶𝑒^{−2𝑥} 𝑑𝑥} 𝑑𝑦 = (𝐶/2) ∫_0^∞ 𝑒^{−3𝑦}(1 − 𝑒^{−2𝑦}) 𝑑𝑦
= (𝐶/2) ∫_0^∞ (𝑒^{−3𝑦} − 𝑒^{−5𝑦}) 𝑑𝑦 = (𝐶/2)(1/3 − 1/5) = 𝐶/15 = 1 ⇒ 𝐶 = 15.
(i) 𝑃(𝑋 < 1, 𝑌 < 2) = ∫_{𝑥=0}^{1} ∫_{𝑦=𝑥}^{2} 15𝑒^{−2𝑥}𝑒^{−3𝑦} 𝑑𝑦 𝑑𝑥 = 5 ∫_0^1 𝑒^{−2𝑥}(𝑒^{−3𝑥} − 𝑒^{−6}) 𝑑𝑥
= (1 − 𝑒^{−5}) − (5/2)𝑒^{−6}(1 − 𝑒^{−2}) = 0.9879
(ii) 𝑃(1 < 𝑋 < 2) = ∫_{𝑥=1}^{2} ∫_{𝑦=𝑥}^{∞} 15𝑒^{−2𝑥}𝑒^{−3𝑦} 𝑑𝑦 𝑑𝑥 = 5 ∫_1^2 𝑒^{−5𝑥} 𝑑𝑥 = 𝑒^{−5} − 𝑒^{−10} = 0.0067
(iii) 𝑃(𝑌 > 3) = ∫_{𝑦=3}^{∞} ∫_{𝑥=0}^{𝑦} 15𝑒^{−2𝑥}𝑒^{−3𝑦} 𝑑𝑥 𝑑𝑦 = (15/2) ∫_3^∞ (𝑒^{−3𝑦} − 𝑒^{−5𝑦}) 𝑑𝑦
= (5/2)𝑒^{−9} − (3/2)𝑒^{−15} = 0.000308
(iv) 𝑃(𝑋 < 2, 𝑌 < 2) = ∫_{𝑥=0}^{2} ∫_{𝑦=𝑥}^{2} 15𝑒^{−2𝑥}𝑒^{−3𝑦} 𝑑𝑦 𝑑𝑥 = 5 ∫_0^2 𝑒^{−2𝑥}(𝑒^{−3𝑥} − 𝑒^{−6}) 𝑑𝑥
= (1 − 𝑒^{−10}) − (5/2)𝑒^{−6}(1 − 𝑒^{−4}) = 0.9939
(v) 𝐸[𝑋] = ∫_{𝑥=0}^{∞} ∫_{𝑦=𝑥}^{∞} 15𝑥𝑒^{−2𝑥}𝑒^{−3𝑦} 𝑑𝑦 𝑑𝑥 = 5 ∫_0^∞ 𝑥𝑒^{−5𝑥} 𝑑𝑥 = 1/5
(vi) 𝐸[𝑌] = ∫_{𝑥=0}^{∞} ∫_{𝑦=𝑥}^{∞} 15𝑦𝑒^{−2𝑥}𝑒^{−3𝑦} 𝑑𝑦 𝑑𝑥 = 15 ∫_0^∞ 𝑒^{−2𝑥}(𝑥/3 + 1/9)𝑒^{−3𝑥} 𝑑𝑥
= 5 ∫_0^∞ (𝑥 + 1/3)𝑒^{−5𝑥} 𝑑𝑥 = 1/5 + 1/3 = 8/15
(vii) The marginal probability distribution of 𝑋 is
𝑓𝑋(𝑥) = ∫_{𝑦=𝑥}^{∞} 15𝑒^{−2𝑥}𝑒^{−3𝑦} 𝑑𝑦 = 5𝑒^{−5𝑥}, for 𝑥 > 0.
(viii) The conditional probability distribution of 𝑌 given 𝑋 = 1 is
𝑓𝑌|𝑋(𝑦|1) = 𝑓𝑋,𝑌(1, 𝑦)/𝑓𝑋(1) = 15𝑒^{−2−3𝑦}/(5𝑒^{−5}) = 3𝑒^{−3𝑦+3}, 𝑦 > 1.
(ix) 𝐸[𝑌|𝑋 = 1] = ∫_{𝑦=1}^{∞} 3𝑦𝑒^{−3𝑦+3} 𝑑𝑦 = 3[𝑦𝑒^{−3𝑦+3}/(−3) − 𝑒^{−3𝑦+3}/9]_1^∞ = 4/3
(as at 𝑥 = 1, 𝑦 varies from 1 to ∞).
(x) 𝑃(𝑌 < 2|𝑋 = 1) = ∫_{𝑦=1}^{2} 3𝑒^{−3𝑦+3} 𝑑𝑦 = 3[𝑒^{−3𝑦+3}/(−3)]_1^2 = 1 − 𝑒^{−3}
(xi) The marginal density of 𝑌 is
𝑓𝑌(𝑦) = ∫_{𝑥=0}^{𝑦} 15𝑒^{−2𝑥}𝑒^{−3𝑦} 𝑑𝑥 = (15/2)(𝑒^{−3𝑦} − 𝑒^{−5𝑦}), so 𝑓𝑌(2) = (15/2)(𝑒^{−6} − 𝑒^{−10}),
and the conditional probability distribution of 𝑋 given 𝑌 = 2 is
𝑓𝑋|𝑌(𝑥|2) = 𝑓𝑋,𝑌(𝑥, 2)/𝑓𝑌(2) = 15𝑒^{−2𝑥−6}/[(15/2)(𝑒^{−6} − 𝑒^{−10})] = 2𝑒^{−2𝑥}/(1 − 𝑒^{−4}), 0 < 𝑥 < 2.
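An illustrative numerical check of 𝐶 = 15 and of part (ii), assuming SciPy:

import numpy as np
from scipy.integrate import dblquad

f = lambda y, x: 15 * np.exp(-2 * x - 3 * y)   # integrand written as f(y, x) for dblquad

# Total mass over 0 < x < y < infinity: the inner y runs from x to infinity.
total, _ = dblquad(f, 0, np.inf, lambda x: x, np.inf)
# P(1 < X < 2): same inner region, outer x restricted to (1, 2).
p, _ = dblquad(f, 1, 2, lambda x: x, np.inf)
print(total, p)                                 # about 1.0 and 0.0067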

Problems to practice:

1) If the joint probability function 𝑓(𝑥, 𝑦) = 𝑐(𝑥² + 𝑦²) for 0 ≤ 𝑥 ≤ 1, 0 ≤ 𝑦 ≤ 1 (𝑐 ≥ 0), and
𝑓(𝑥, 𝑦) = 0 elsewhere, is a density function of a joint probability distribution, then evaluate
the following:
(i) the value of the constant 𝑐; (ii) the marginal density functions of 𝑥 and 𝑦;
(iii) 𝑃(𝑥 < 1/2, 𝑦 > 1/2); (iv) 𝑃(1/4 < 𝑥 < 3/4); (v) 𝑃(𝑦 < 1/2).

2) For the distribution given by the density function 𝑓(𝑥, 𝑦) = (1/96)𝑥𝑦 for 0 < 𝑥 < 4,
1 < 𝑦 < 5, and 0 elsewhere, evaluate (i) 𝑃(1 < 𝑥 < 2, 2 < 𝑦 < 3), (ii) 𝑃(𝑥 > 3, 𝑦 ≤ 2),
(iii) 𝑃(𝑦 ≤ 𝑥), (iv) 𝑃(𝑥 + 𝑦 ≤ 3).

3) For the distribution defined by the density function 𝑓(𝑥, 𝑦) = 3𝑥𝑦(𝑥 + 𝑦) for 0 ≤ 𝑥 ≤ 1,
0 ≤ 𝑦 ≤ 1, and 0 elsewhere, find the covariance between 𝑥 and 𝑦.

4) For the distribution defined by the density function 𝑓(𝑥, 𝑦) = (1/8)(6 − 𝑥 − 𝑦) for
0 < 𝑥 < 2, 0 < 𝑦 < 4, and 0 elsewhere, evaluate (i) 𝑃(𝑥 < 1, 𝑦 < 3), (ii) 𝑃(𝑥 + 𝑦 < 3),
(iii) the covariance between 𝑥 and 𝑦, and (iv) 𝑃(𝑥 < 1|𝑦 < 3).
Video Links:

https://www.youtube.com/watch?v=82Ad1orN-NA
https://www.youtube.com/watch?v=eYthpvmqcf0
https://www.youtube.com/watch?v=L0zWnBrjhng
https://www.youtube.com/watch?v=Om68Hkd7pfw
https://www.youtube.com/watch?v=RYIb1u3C13I
