ML 2 EM Example
The EM strategy can be explained with a coin toss example, which we will use in subsequent iterations to explain the complete flow of the EM algorithm. In this example we assume that we toss a number of coins sequentially to obtain a sequence of Heads and Tails. The context of the coin toss example is given in Table 33.1. Here the problem is defined in terms of X, the observed sequence of Heads and Tails; Y, the identifier of the coin tossed at each position in the sequence, which is hidden; and finally θ, the parameter vector associated with the probabilities of the observed and hidden data. If we assume three coins are tossed, λ is the probability of coin 0 showing H (so 1 − λ is the probability of it showing T), p1 is the probability of coin 1 showing H, and p2 is the probability of coin 2 showing H.
We can also modify the problem to estimate the probability of heads for two coins. Normally the ML estimate can be calculated directly from the results if we know the identity of the coin tossed in each case.
We have two coins, A and B, and let us assume that their probabilities of showing heads are qA and qB respectively. We are given 5 measurement sets, each consisting of 10 coin tosses, and we know which coin was tossed in each measurement. The example sets of experiments are given in Table 33.2, where coin A is indicated in red and coin B in blue. The first column of the table indicates the coin type for each of the 5 measurements, since here we assume the identity of the coin is known. The second column shows the sequence of 10 Heads and Tails observed in each measurement. Columns 3 and 4 give the number of Heads and Tails obtained in each measurement for each coin type. The final row shows the total number of Heads and Tails obtained across all measurements for each of the coin types A and B.
Now we calculate the ML estimates of qA and qB, the probabilities of heads for coins A and B respectively:

qA = 24/(24 + 6) = 24/30 = 0.80
qB = 9/(9 + 11) = 9/20 = 0.45
The above is a basic probability calculation based directly on the observations, since we know whether coin A or B was tossed in each measurement.
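The same complete-data estimate can be written as a minimal Python sketch, assuming the per-coin totals of Table 33.2 (24 Heads / 6 Tails for coin A and 9 Heads / 11 Tails for coin B, consistent with the estimates above):

```python
# Complete-data ML estimate: heads-probability = #Heads / (#Heads + #Tails).
# Counts assumed from the totals row of Table 33.2.
heads = {"A": 24, "B": 9}
tails = {"A": 6, "B": 11}

for coin in ("A", "B"):
    q = heads[coin] / (heads[coin] + tails[coin])
    print(f"q{coin} = {q:.2f}")   # qA = 0.80, qB = 0.45
```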
We will now make the problem more interesting and assume that we do not even know which of the coins was used for each sample set. Now we need to estimate the coin probabilities without knowing which coin is being tossed.
Note that when we do not know which coin is tossed in each set, we cannot calculate the ML estimate directly, and hence we use the EM strategy to find the probability of each coin being the one tossed. Figure 33.3 shows the complete flow of the EM algorithm. Remember, we do not know which of the coins is tossed. Hence we start the process by assuming that for each of the coins A (red) and B (blue), the initial probabilities of heads, qA and qB respectively, take random values. As seen in Figure 33.3, we have randomly fixed qA to be 0.6 and qB to be 0.5. Now we observe the number of Heads and Tails for each of the 5 measurements.
E Step:
The first stage is the Expectation stage, for which we initially use the randomly assumed probabilities qA and qB and the set of coin tosses observed for each measurement. Since we do not know which coin was tossed, we need to calculate the probabilities of the observed Heads and Tails under both coin A and coin B for each measurement. We have shown the calculation for the first set of measurements in Figure 33.3.
Step E-C1
In the first step of the Expectation stage we assume that the coin toss sequence follows a binomial distribution, P(k) = C(n, k) p^k (1 − p)^(n − k), where n is the total number of coin tosses, k is the number of Heads (Tails) observed, and p is the probability of observing heads for the given coin.
Figure 33.3 Flow of EM – Coin Toss Example
Using this distribution we can calculate the probability of observing the Heads and Tails for coins A and B for the first set, as shown in Figure 33.4. Here n, the total number of coin tosses in the first measurement, is 10, and k, the total number of Heads observed, is 5.
We need to calculate the probability of observing these Heads (Tails) under both coins A and B, since we do not know which of the coins was tossed. Hence p = qA, the probability of heads for the A (red) coin, is initially assumed to be 0.6, and p = qB, the probability of heads for the B (blue) coin, is initially assumed to be 0.5.
Figure 33.4 Use of Binomial Distribution
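This likelihood calculation can be checked with a short Python sketch (the helper binom_pmf is ours, not from the text):

```python
from math import comb

def binom_pmf(k, n, p):
    # Binomial probability of k heads in n tosses with heads-probability p.
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, k = 10, 5              # first measurement: 10 tosses, 5 Heads
qA, qB = 0.6, 0.5         # initial random guesses

print(round(binom_pmf(k, n, qA), 3))   # 0.201: likelihood under coin A
print(round(binom_pmf(k, n, qB), 3))   # 0.246: likelihood under coin B
```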
Step E-C2
In the second step of the Expectation stage we calculate the probabilities of the toss having used coin A or coin B (Probability = Favourable Outcome / Total Outcome) as follows:

P(A) = 0.201 / (0.201 + 0.246) = 0.45
P(B) = 0.246 / (0.201 + 0.246) = 0.55
We calculate in a similar manner for all the experiments and the values obtained are shown in
Figure 33.3.
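A minimal sketch of this normalisation step, reusing the likelihoods from Step E-C1:

```python
lA, lB = 0.201, 0.246        # likelihoods from Step E-C1
pA = lA / (lA + lB)          # posterior probability that coin A was tossed
pB = lB / (lA + lB)          # posterior probability that coin B was tossed
print(round(pA, 2), round(pB, 2))   # 0.45 0.55
```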
Step E-C3
In the third step of the Expectation stage, using the probability values obtained in step E-C2, we calculate the number of Heads and Tails we would expect to observe in each experiment if the coin tossed was A and if the coin tossed was B.
First Experiment – values corresponding to the first row:
Coin A: 0.45 × 5 ≈ 2.2 expected Heads and 0.45 × 5 ≈ 2.2 expected Tails
Coin B: 0.55 × 5 ≈ 2.8 expected Heads and 0.55 × 5 ≈ 2.8 expected Tails
Similarly we can do the calculations for all 5 experiments. Using these calculated numbers of Heads and Tails for each experiment, we can compute the total expected number of Heads and Tails for both coins A and B.
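As a small sketch, the expected counts for the first measurement (5 Heads, 5 Tails, with the posteriors from Step E-C2) can be computed as:

```python
h, t = 5, 5                  # observed Heads and Tails in the first set
pA, pB = 0.45, 0.55          # posteriors from Step E-C2
print(pA * h, pA * t)        # ~2.2 Heads and ~2.2 Tails attributed to coin A
print(pB * h, pB * t)        # ~2.8 Heads and ~2.8 Tails attributed to coin B
```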
M Step:
Now we have the Maximization stage, where we calculate the new values (after the 1st iteration) qA(1) and qB(1), the maximum likelihood estimates of the probability of heads when coin A and coin B are tossed respectively, using the expected total numbers of Heads and Tails for coins A and B. This calculation is shown below:

qA(1) = 21.3 (total expected Heads for coin A) / (21.3 + 8.6) (total expected Heads and Tails for coin A) = 0.71
qB(1) = 11.7 (total expected Heads for coin B) / (11.7 + 8.4) (total expected Heads and Tails for coin B) = 0.58
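The same M-step re-estimation as a short sketch, using the expected totals quoted above:

```python
headsA, tailsA = 21.3, 8.6   # expected counts attributed to coin A
headsB, tailsB = 11.7, 8.4   # expected counts attributed to coin B
qA1 = headsA / (headsA + tailsA)
qB1 = headsB / (headsB + tailsB)
print(round(qA1, 2), round(qB1, 2))   # 0.71 0.58
```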
We have now completed one iteration of the EM algorithm. We continue with the second iteration using the above new values of qA and qB, and we keep iterating until the values of qA and qB no longer change from one iteration to the next. For our example this happens at the 10th iteration (shown in Figure 33.3), when qA and qB converge to 0.80 and 0.52 respectively.
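Putting the E and M steps together, the following sketch runs the full loop; the per-measurement Head counts (5, 9, 8, 4 and 7 out of 10 tosses) are assumed from Table 33.2, consistent with the totals used above:

```python
from math import comb

def binom_pmf(k, n, p):
    # Binomial probability of k heads in n tosses with heads-probability p.
    return comb(n, k) * p**k * (1 - p)**(n - k)

heads = [5, 9, 8, 4, 7]      # Heads per measurement (assumed from Table 33.2)
n = 10                       # tosses per measurement
qA, qB = 0.6, 0.5            # random initial guesses

for _ in range(10):
    # E step: posterior responsibility of each coin for each measurement,
    # accumulated into expected Head/Tail counts per coin.
    hA = tA = hB = tB = 0.0
    for k in heads:
        lA, lB = binom_pmf(k, n, qA), binom_pmf(k, n, qB)
        pA = lA / (lA + lB)
        hA += pA * k
        tA += pA * (n - k)
        hB += (1 - pA) * k
        tB += (1 - pA) * (n - k)
    # M step: maximum likelihood re-estimates from the expected counts.
    qA, qB = hA / (hA + tA), hB / (hB + tB)

print(round(qA, 2), round(qB, 2))    # converges to about 0.80 and 0.52
```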