Laboratory Probability and Statistics 20 21 Errata Corrected
Laboratory Teacher:
Marius Marinescu
Indications: The laboratory practice has 3 main parts: Random numbers, Statistics, and Markov chains. Each part
has an introduction to the topic and theoretical-practical exercises. Hints about useful Matlab functions are given to help you
solve the exercises. You must solve a minimum of: one exercise from Section 1, one exercise from each of Subsections 2.1, 2.2, and 2.3, and
one exercise from Section 3. You may hand in more or all of the exercises; in that case, the mark will be the mean of the 5 best marks,
following the proportions above. You have to hand in a pdf report detailing and explaining the resolution and the code used.
Teacher contact: [email protected]
Contents
1 Pseudo-Random Number Generation
2 Statistics
2.1 Parameter estimation
2.1.1 Maximum likelihood estimation
2.2 Confidence intervals
2.2.1 Confidence interval for the expected value of a random variable
2.2.2 Batch means method for the mean of a non-Gaussian random variable
2.2.3 Confidence interval for the variance of a Gaussian random variable
2.3 Hypothesis testing
3 Markov Chains
1 The method was named after the Casino of Monte Carlo (Monaco). The term was a code name for a secret job in which von
Neumann and Ulam used this mathematical technique in the well-known project to make the atomic bomb.
2 a ≡ b (mod m) is read “a is congruent to b modulo m” and means that the remainder of dividing a by m is b.
Figure 1: Pairs of consecutive pseudo-random numbers plotted in a plane. The pairs fall along straight lines.
(a) Use the LCG method to generate 200 values using the following parameters: a = 37, b = 1, and m = 64.
Fix any seed x0. Use the Matlab function stem() to plot the sequence of values. Do they look random?
Now use the Matlab function scatter() and plot the pairs of consecutive random values as in figure 1.
Do they follow a pattern? How many unique values did you get?
(b) Now generate 100 values with the parameters a = 7, b = 0, and m = 61. Fix any seed x0. Proceed in
the same way as in (a). Now use the function scatter3() to plot the sequence of triplets of consecutive
random numbers.
(c) In the 70s IBM generated random numbers using the following parameters: a = 2^16 + 3 = 65539, b = 0,
and m = 2^29. Proceed in the same way: fix a value for the seed and generate 200 random numbers with the
LCG method. Plot the pairs of consecutive random numbers and the sequence of triplets of consecutive
random numbers. Do you see a pattern? Is this selection of parameters useful to generate random
numbers? Finally, make a histogram using the Matlab function histogram(). Do the values look uniform?
(d) Generate 500 random numbers with the method in (b), and also another 500 values with the function
rand() in Matlab. Make a histogram of all the 1000 numbers generated. Do they look uniform?
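As a sketch of the recursion behind these exercises, x_{k+1} = (a·x_k + b) mod m, scaled to [0, 1) by dividing by m (shown in Python rather than Matlab; the seed value is an arbitrary choice):

```python
def lcg(a, b, m, seed, n):
    """Generate n pseudo-random numbers in [0, 1) with a linear
    congruential generator: x_{k+1} = (a*x_k + b) mod m."""
    values = []
    x = seed
    for _ in range(n):
        x = (a * x + b) % m
        values.append(x / m)  # scale the integer state to [0, 1)
    return values

# Parameters from part (a); a small modulus exposes the lattice structure.
values = lcg(a=37, b=1, m=64, seed=1, n=200)
print(len(set(values)))
```

Note that with m = 64 the generator can produce at most 64 distinct values, which is why the scatter plots show so few lines.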
Generate the exponential random variable with the above method. Fix any value for the parameter λ and
generate the uniform random numbers U with the function rand() in Matlab. Make a histogram for
n = 100, 1000, and 10000 simulations using the function histogram(...,’Normalization’,’pdf’). For the
case n = 10000 plot the histogram together with the pdf of the exponential. Do they agree?
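The inverse-transform step above can be sketched as follows (Python rather than Matlab; λ = 2 is an arbitrary choice): since F(x) = 1 − e^(−λx), inverting F gives X = −ln(1 − U)/λ with U uniform on (0, 1).

```python
import math
import random

def exponential_inverse_transform(lam, n, rng):
    """Draw n Exponential(lam) samples by inverting the CDF:
    F(x) = 1 - exp(-lam*x)  =>  x = -ln(1 - U)/lam, U ~ Uniform(0, 1)."""
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]

samples = exponential_inverse_transform(lam=2.0, n=10000, rng=random.Random(0))
# The sample mean should be close to E[X] = 1/lam = 0.5.
print(sum(samples) / len(samples))
```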
2 Statistics
The field of statistics plays the key role in bridging probability models to the real world. To apply probability
models to real situations, we must perform experiments and gather data. This is why the main object of study
of statistics is a collection of data from a population. The data is modelled as a random
sample, X = (X1, X2, ..., Xn), consisting of n independent random variables with the same distribution.
Statistics can be classified into descriptive statistics and inferential statistics. The first deals with
describing the data and the second infers properties of the entire population from the sample. At the same
time, classical inferential statistics can be subdivided into three branches: parameter estimation, confidence
intervals, and hypothesis testing. Statistics also plays an important role in Decision Theory, see [2], [3]. Typically,
two classical methodologies or points of view arise when solving an inferential statistical problem: the frequentist
approach and the Bayesian approach. This section deals with inferential statistics through the frequentist approach.
If you want to know more about statistics in general or about Bayesian statistics see [4], [5].
After making the observations, we obtain the n values (x1, x2, ..., xn) and compute the estimate for θ as a
single value, g(x1, x2, ..., xn). Because of that, θ̂(X) is called a point estimator.
As an example, consider that θ = σ² = VAR[X]. It is well known that the sample variance,

Ŝ² = (1/n) ∑_{i=1}^{n} (X_i − X̄)²,  where X̄ = (1/n) ∑_{i=1}^{n} X_i,   (2)

is an estimator of the variance of X. Other parameters to estimate could be the expectation, µ, or, typically,
the parameters of a family of distributions, such as λ when X ∼ Exp(λ).
Two typical properties are usually evaluated in an estimator: the bias and the mean square error
(MSE). The bias is defined as

B(θ̂) = E[θ̂] − θ.   (3)

An estimator is said to be unbiased if E[θ̂] = θ, that is, if the bias is zero. The MSE is defined as:

MSE(θ̂) = E[(θ̂ − θ)²] = VAR[θ̂] + B(θ̂)².   (4)

If the bias is zero the MSE is just the variance of the estimator. The smaller the MSE, the more accurate the
estimator.
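Bias and MSE can be approximated by simulation, which is essentially what the exercises below ask for. A minimal sketch (Python rather than Matlab; the Gaussian population with σ² = 1, the sample size n = 100, and the number of repetitions are all arbitrary choices):

```python
import random

def simulate_bias_mse(estimator, true_value, n, reps, rng):
    """Monte Carlo estimate of the bias and MSE of an estimator:
    draw many samples, apply the estimator to each, and average."""
    estimates = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true variance = 1
        estimates.append(estimator(sample))
    bias = sum(estimates) / reps - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / reps
    return bias, mse

def biased_sample_variance(xs):
    """The 1/n sample variance of equation (2); its bias is -sigma^2/n."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

bias, mse = simulate_bias_mse(biased_sample_variance, true_value=1.0,
                              n=100, reps=2000, rng=random.Random(0))
print(bias, mse)
```

With n = 100 the estimated bias should come out close to −σ²/n = −0.01, matching the known bias of the 1/n variance estimator.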
(a) Peak hours start at 22:00. Simulate the first n = 100 inter-arrival emergency times arriving at
the center4. Represent the arrival process using the function stairs() (arrival times on the x-axis and n
on the y-axis). The functions duration(), minutes() and cumsum() may be helpful. How long did it take
the emergency center to receive 100 arrivals? If each emergency takes a mean of 5 minutes of primary
attention, how many emergency boxes would you prepare? Explain your answer.
(b) Generate a sample of (X1, X2, ..., X1000, ..., X20000) inter-arrival emergency times. Partition the sample
into portions of ni = 100, i = 1, 2, ..., 200. For each portion estimate the mean inter-arrival time with the
estimators θ̂1 and θ̂2.
You can view them as a sample of estimations, (θ̂1,1, θ̂1,2, ..., θ̂1,200) and (θ̂2,1, θ̂2,2, ..., θ̂2,200). Compute
the bias of each estimation and then the (sample) mean bias. Are the estimators unbiased? Estimate
the variance5 of each estimator and the mean square error. Which one, θ̂1 or θ̂2, is a better estimator in
terms of the MSE?
(a) Find the Poisson distribution that best fits the observed data, x = (x1) = 2. For that, find the
likelihood function of θ = λ, where λ is the Poisson distribution parameter. Then, plot the likelihood
function. Does it have a maximum? Compute the maximum likelihood estimator θ̂ analytically. Does
it agree with the plot? And does it agree with the observed data? Why or why not?
(b) Two pages of the laboratory papers of Abraham and Mateo were mixed by mistake. The teacher knows
that Abraham has a mean of 1 typo per page and Mateo has a mean of 5 typos per page. Reading one
of the mixed pages, the teacher found it has 3 typos. Who is the more likely author?
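Both parts reduce to evaluating the Poisson pmf. For (a), the likelihood of the single observation x = 2 is L(λ) = λ²e^(−λ)/2!, maximized at λ̂ = x = 2; for (b), comparing the pmf at x = 3 for λ = 1 and λ = 5 answers the authorship question. A minimal sketch (Python rather than Matlab; the grid of λ values is an arbitrary choice):

```python
import math

def poisson_pmf(k, lam):
    """P[X = k] for X ~ Poisson(lam)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

# (a) Likelihood of the single observation x = 2 on a grid of lambda values.
lams = [i / 10 for i in range(1, 61)]          # 0.1, 0.2, ..., 6.0
likelihoods = [poisson_pmf(2, lam) for lam in lams]
lam_hat = lams[likelihoods.index(max(likelihoods))]
print(lam_hat)  # the grid maximum sits at the analytical MLE, lambda = x = 2

# (b) Which author (lambda = 1 for Abraham, lambda = 5 for Mateo) makes
# observing 3 typos more likely?
print(poisson_pmf(3, 1), poisson_pmf(3, 5))
```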
4 Generate the random numbers using the inverse transform method from the previous exercise or use exprnd().
5 Use the sample variance Ŝ².
2.2.1 Confidence interval for the expected value, µ = E[X], of a random variable
Suppose that we have an i.i.d. sample of Gaussian variables (X1, X2, ..., Xn), with unknown mean µ and
variance σ², and that we are interested in finding a confidence interval for the mean. A well-known formula for
a confidence interval for the mean of Gaussian random variables is

I = ( X̄ − t_{α/2,n−1} · σ̂/√n , X̄ + t_{α/2,n−1} · σ̂/√n )   (9)
where,
• X̄ is the sample mean.
• σ̂ is the square root of the unbiased sample variance: σ̂ = √( (1/(n−1)) ∑_{i=1}^{n} (X_i − X̄)² ).
• c = t_{α/2,n−1} is the critical value of a t_{n−1} Student’s distribution, such that F_{n−1}(c) = 1 − α/2, where
F_{n−1} is the cumulative distribution function. This distribution is related to the Gaussian one and
is characterized by having fatter tails. The t-distribution has only one parameter, called degrees of
freedom, n − 1. In figure 2 the values ±t_{α/2,n−1} are represented together with a graphical comparison
between the Gaussian and Student t-distribution density functions.
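Interval (9) can be computed as in the following sketch (Python rather than Matlab; the data, the confidence level 1 − α = 0.95, and the table-derived critical value are illustrative assumptions):

```python
import math
import statistics

def t_confidence_interval(xs, t_crit):
    """Confidence interval (9) for the mean: X̄ ± t * s/sqrt(n),
    where s is the square root of the unbiased sample variance."""
    n = len(xs)
    xbar = statistics.mean(xs)
    s = statistics.stdev(xs)  # uses the 1/(n-1) variance
    half = t_crit * s / math.sqrt(n)
    return xbar - half, xbar + half

data = [4.9, 5.1, 5.0, 4.8, 5.2, 5.0, 4.9, 5.1, 5.3, 4.7]  # illustrative sample
# For alpha = 0.05 and n - 1 = 9 degrees of freedom, tables (or tinv(0.975, 9)
# in Matlab) give t_{alpha/2, n-1} ≈ 2.262.
lo, hi = t_confidence_interval(data, t_crit=2.262)
print(lo, hi)
```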
Figure 2: On the left, density function of a t-distribution with critical values. On the right, a comparison of
the standard Gaussian density function with the t_n-Student density function. The Student t-distribution is named
after W. S. Gosset, who published under the pseudonym “Student”.
6 Many misunderstandings about confidence intervals, mainly in psychology and the social sciences, by both researchers and students,
have persisted for decades and remain rampant. See for example [6].
2.2.2 Batch means method for the mean of a non-Gaussian random variable
The previous method can easily be misused, since it is only justified if the sample is (at least approxi-
mately) Gaussian distributed. Nevertheless, if the variables are not Gaussian, the method of batch means can
be applied, taking advantage of the central limit theorem. This method involves performing a series of inde-
pendent and identical experiments in which the sample mean X̄ of each experiment is computed. If we assume
that in each experiment each sample mean is calculated from a large enough number of i.i.d. observations, the
central limit theorem implies that the sample mean in each experiment is approximately Gaussian. We can
therefore compute a confidence interval for the mean using as sample set the sample of means, (X̄1, X̄2, ..., X̄n).
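The batch means step can be sketched as follows (Python rather than Matlab; the exponential data with λ = 1, the sample size, and the batch size are arbitrary choices): split the non-Gaussian observations into batches, average each batch, and treat the batch means as an approximately Gaussian sample.

```python
import random
import statistics

def batch_means(xs, batch_size):
    """Split xs into consecutive batches and return the mean of each;
    by the CLT each batch mean is approximately Gaussian."""
    return [statistics.mean(xs[i:i + batch_size])
            for i in range(0, len(xs) - batch_size + 1, batch_size)]

rng = random.Random(0)
data = [rng.expovariate(1.0) for _ in range(10000)]  # non-Gaussian, E[X] = 1
means = batch_means(data, batch_size=100)  # 100 approximately Gaussian values
print(statistics.mean(means))
```

The interval of equation (9) can then be applied to `means` instead of the raw data.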
2.2.3 Confidence interval for the variance, σ 2 = VAR[X], of a Gaussian random variable
As in the case of the mean, it is useful to find a confidence interval for the variance of a random variable. In
general, whenever the sampling distribution of an estimator of a parameter θ is known, a confidence interval
can be computed. If the variables are Gaussian, it turns out that

(n − 1)σ̂²_n / σ² ∼ χ²_{n−1}   (10)

where σ̂²_n is the unbiased sample variance. Using the relation in the above equation (10), the following
confidence interval for the variance can be obtained:

I = [ (n − 1)σ̂²_n / χ²_{1−α/2,n−1} , (n − 1)σ̂²_n / χ²_{α/2,n−1} ]   (11)
7 You can use the Matlab function tinv() to find the value of t_{α/2,n−1}.
Figure 3: Chi-square density function with critical values. The area of each highlighted region is α/2.
where χ²_{α/2,n−1} and χ²_{1−α/2,n−1} are the critical values depicted in figure 3.
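Interval (11) can be computed as in the following sketch (Python rather than Matlab; the data, the confidence level 1 − α = 0.95, and the table-derived chi-square critical values are illustrative assumptions):

```python
import statistics

def variance_confidence_interval(xs, chi2_hi, chi2_lo):
    """Interval (11): [(n-1)*s2/chi2_hi, (n-1)*s2/chi2_lo], where
    chi2_hi = chi2inv(1 - alpha/2, n-1) and chi2_lo = chi2inv(alpha/2, n-1)."""
    n = len(xs)
    s2 = statistics.variance(xs)  # unbiased sample variance
    return (n - 1) * s2 / chi2_hi, (n - 1) * s2 / chi2_lo

data = [4.9, 5.1, 5.0, 4.8, 5.2, 5.0, 4.9, 5.1, 5.3, 4.7]  # illustrative sample
# For alpha = 0.05 and n - 1 = 9, tables (or chi2inv() in Matlab) give
# chi2_{0.975,9} ≈ 19.02 and chi2_{0.025,9} ≈ 2.70.
lo, hi = variance_confidence_interval(data, chi2_hi=19.02, chi2_lo=2.70)
print(lo, hi)
```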
Accept H0 if X ∈ R,   (12)
Reject H0 if X ∈ R^c.   (13)
Two kinds of errors can occur when executing this decision rule:

Type I error: reject H0 when H0 is true.   (14)
Type II error: accept H0 when H0 is false.   (15)

The significance level of a test, α, is defined as the probability of Type I error, i.e.

α = P[X ∈ R^c / H0].   (16)

This value represents our tolerance for Type I errors, that is, for rejecting H0 when in fact it is true. Generally,
what is wanted is to find the best test such that, for a given threshold α, the Type II error is as small as possible.
8 You can use the Matlab function chi2inv() to find the values of χ²_{α/2,n−1} and χ²_{1−α/2,n−1}.
Unfortunately, in many cases there is no information about the true distribution of the observation X, and
hence it is not possible to evaluate the probability of Type II error9.
(a) Compute the mean and variance of X̄25 under the hypothesis H0 .
(b) Compute c such that α = 0.01 = P[X̄25 > 150 + c / H0]. Then the rejection region for the sample mean,
X̄25, is R^c = (150 + c, ∞). That is, if X̄25 > 150 + c the null hypothesis is rejected and it is accepted
that the lifetime of the batteries has significantly increased.
(c) Import the file batteries.txt and compute the sample mean. Can you reject the null hypothesis H0 ?
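The decision rule in (b) and (c) can be sketched as follows (Python rather than Matlab). The original exercise statement fixes the distribution of the lifetimes under H0; here σ = 10 and the sample mean 156.2 are placeholder values, not the real parameters or the contents of batteries.txt:

```python
from statistics import NormalDist

# Hypothetical setup: under H0 the lifetimes have mean mu0 = 150 and standard
# deviation sigma = 10 (placeholder), so X̄_25 ~ N(mu0, sigma^2/25).
mu0, sigma, n, alpha = 150.0, 10.0, 25, 0.01
se = sigma / n ** 0.5  # standard error of the sample mean

# c such that P[X̄_25 > mu0 + c | H0] = alpha.
c = NormalDist(0.0, se).inv_cdf(1.0 - alpha)
print(c)

sample_mean = 156.2  # placeholder for the mean computed from batteries.txt
reject_h0 = sample_mean > mu0 + c
print(reject_h0)
```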
3 Markov Chains
This section is about a certain sort of stochastic process, called a Markov process. The characteristic of
this sort of process is that it retains no memory of where it has been in the past: only the
current state of the process can influence where it goes next. We will be concerned exclusively with the case
where the process can assume only a finite or countable set of states; such processes are called Markov chains.
What makes them important is not only that they model many phenomena of interest, but also that the lack-of-memory
property makes it possible to predict how a Markov chain may behave, and to compute probabilities
and expected values which quantify that behaviour. Thus, Markov processes are quite useful in modelling
many problems found in practice.
A discrete random process {Xn}n∈I is a Markov process if the future of the process given the present
is independent of the past, that is,

P[X_{n+1} = j / X_n = i, X_{n−1} = i_{n−1}, ..., X_1 = i_1] = P[X_{n+1} = j / X_n = i].   (17)

In the above expression, we refer to n as the “present”, to n + 1 as the “future”, and to 1, 2, ..., n − 1 as
the “past”. The value of Xn is called the state of the process at instant n. Let us assume {Xn}n∈I takes values
in a finite set of integers S, and that the one-step probabilities, pij = P[X1 = j / X0 = i], are fixed and do not
change with the steps, that is:

pij = P[X_{n+1} = j / X_n = i], for all n.   (18)

Also let us assume some distribution for the beginning of the process:

pi(0) = P[X0 = i], for i ∈ S.   (19)

It can easily be seen that the joint pmf of (X0, X1, ..., Xn) is given by

P[X0 = i0, X1 = i1, ..., Xn = in] = p_{i0}(0) · p_{i0 i1} · · · p_{i_{n−1} i_n}.   (20)
9 There are more types of hypothesis testing; here we approach just the one mentioned. If you want to know more about statistical
hypothesis testing, visit the very nice web page http://www.randomservices.org/random/hypothesis/Introduction.html of the
University of Alabama in Huntsville.
Actually, the process can be completely specified in an easy way through a stochastic matrix10

        | p00  p01  ···  p0m |
    P = | p10  p11  ···  p1m |      m = |S|     (21)
        |  ⋮    ⋮    ⋱    ⋮  |
        | pm0  pm1  ···  pmm |
Finally, consider the n-step transition probability matrix, named P(n), whose ij element is

p_ij(n) = P[X_n = j / X_0 = i].   (22)

Actually,

P[X_{n+k} = j / X_k = i] = P[X_n = j / X_0 = i],   (23)

since we supposed that the transition probabilities do not change with the steps. It can be shown
(Chapman-Kolmogorov equations) that this matrix is just

P(n) = P^n.   (24)

As an example, imagine we are in state i and want to know the probability of being in state j after n steps.
Then we just have to look at the element ij of the matrix P^n.
Markov chains are often best described by diagrams. For example, the diagram in figure 4 represents a
Markov chain with probability matrix:

    P = | 1−α    α  |
        |  β   1−β  | .
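Equation (24) can be checked numerically, as the following sketch does for the two-state chain above (Python rather than Matlab; α = 0.3 and β = 0.1 are arbitrary choices): repeated matrix multiplication gives P(n) = Pⁿ, whose rows converge to the steady-state distribution.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(p, n):
    """n-step transition matrix P(n) = P^n (Chapman-Kolmogorov)."""
    result = [[float(i == j) for j in range(len(p))] for i in range(len(p))]
    for _ in range(n):
        result = mat_mul(result, p)
    return result

alpha, beta = 0.3, 0.1  # arbitrary transition probabilities
P = [[1 - alpha, alpha], [beta, 1 - beta]]
P50 = mat_pow(P, 50)
print(P50)  # both rows approach (beta, alpha) / (alpha + beta)
```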
(d) Let π = (π1, π2, ..., π5) be the proportions of instants spent in each state in the long term, called the
steady state.
• Compute these probabilities empirically. For that, simulate the process several times and compute
the relative frequency of being in each state.
• Compute P, P^5, P^10, P^20, and P^50 using a computer12. Does the matrix converge to a constant
matrix? Are the probabilities constant by row? How would you interpret this?
• There is a way to compute the steady-state distribution π in a theoretical and direct way. For that
you have to solve the following (indeterminate) linear system of equations:

π = πP

and use the normalization condition ∑_{i=1}^{n} πi = 1, since the πi are probabilities. Compute π analytically.
Are the probabilities similar to the ones you computed empirically? What is the most frequent state?
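The two approaches in (d), empirical and analytical, can be sketched as follows (Python rather than Matlab; a two-state chain with arbitrary probabilities stands in for the five-state chain of the exercise):

```python
import random

P = [[0.7, 0.3], [0.1, 0.9]]  # arbitrary 2-state stand-in for the exercise
rng = random.Random(0)

# Empirical: simulate the chain and record the relative frequency of each state.
steps, state = 100000, 0
counts = [0] * len(P)
for _ in range(steps):
    counts[state] += 1
    state = rng.choices(range(len(P)), weights=P[state])[0]
freqs = [c / steps for c in counts]
print(freqs)

# Analytical (2-state case): solving pi = pi*P with pi1 + pi2 = 1 gives
# pi = (beta, alpha) / (alpha + beta), with alpha = P[0][1], beta = P[1][0].
alpha, beta = P[0][1], P[1][0]
pi = [beta / (alpha + beta), alpha / (alpha + beta)]
print(pi)
```

For larger chains the analytical step is a linear solve of π(P − I) = 0 together with the normalization condition, which Matlab can do directly.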
(e) As seen in Table 1, the surfer usually spends 10 minutes on the first page (his favourite one), 5 minutes
on pages 2, 3, and 4, and 3 minutes on page 5. Find the percentage of the time the system spends in the
different states in the long run13. On which page does the surfer spend most of the time? Is it the same as
the most visited page?
State                      1   2   3   4   5
Expected values (minutes)  10  5   5   5   3

Table 1: Expected time spent in each state.
References
[1] J. E. Gentle, Random Number Generation and Monte Carlo Methods. Springer-Verlag, 1998.
[2] J. O. Berger, Statistical Decision Theory and Bayesian Analysis. Springer-Verlag, 1985.
[3] T. Valdés Sánchez and L. Pardo Llorente, Decisiones Estratégicas. Ediciones Díaz de Santos, 2000.
[4] S. M. Ross, Introducción a la Estadística. Ediciones Reverté, 2007.
12 As a note, it can also be computed by hand using matrix diagonalization methods from Linear Algebra.
13 Hint: you can draw on the probabilities π to do so, or you can perform a simulation and estimate the times.