Information Theory (IT): Entropy, Mutual Information, and Use in NLP
• Information theory (IT)
• Entropy
• Mutual Information
• Use in NLP
• Equal probabilities: four equally likely occupancy states
  0: no occupants
  1: first occupant only
  2: second occupant only
  3: both occupants
  With four equiprobable outcomes, H(X) = log2(4) = 2 bits (see the sketch below).
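A minimal sketch of this calculation in Python (the `entropy` helper is introduced here for illustration, not taken from the slides):

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H(X) = -sum(p * log2(p)), skipping zero-probability events."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform = [1/4] * 4          # the four equally likely occupancy states above
print(entropy(uniform))      # 2.0 bits: two yes/no questions (or two code bits) suffice
```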
• Different probabilities, e.g. p = (1/2, 1/4, 1/8, 1/8):
  entropy(X) = E(I) = -1/2 log2(1/2) - 1/4 log2(1/4) - 1/8 log2(1/8) - 1/8 log2(1/8) = 7/4 = 1.75 bits
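The same value can be checked numerically; this is a small sketch using the probabilities given above:

```python
import math

probs = [1/2, 1/4, 1/8, 1/8]
H = -sum(p * math.log2(p) for p in probs)   # H(X) = -sum(p * log2(p))
print(H)                                    # 1.75 bits
```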
[Figure: binary decision tree identifying X with yes/no questions: "X = a?" (yes → a), otherwise "X = b?" (yes → b), otherwise "X = c?" (yes → c; no → the remaining symbol).]
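With this question order, a is resolved in 1 question, b in 2, and the last two symbols in 3 each, so the expected number of questions equals the entropy (1.75). A minimal sketch, assuming the fourth leaf is labelled "d" (the label is not legible in the original figure):

```python
probs     = {"a": 1/2, "b": 1/4, "c": 1/8, "d": 1/8}   # "d" is an assumed label for the fourth leaf
questions = {"a": 1,   "b": 2,   "c": 3,   "d": 3}      # yes/no questions needed per outcome

expected = sum(probs[s] * questions[s] for s in probs)
print(expected)   # 1.75, the same value as H(X) above
```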
p = 0 => 1 - p = 1 H(X) = 0 1
p = 1 => 1 - p = 0 H(X) = 0
p = 1/2 => 1 - p = 1/2 H(X) = 1
0
0 1/2 1 p
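A small sketch of the curve's values (the `binary_entropy` helper is illustrative, not from the slides):

```python
import math

def binary_entropy(p):
    """H(p) = -p*log2(p) - (1-p)*log2(1-p), with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, round(binary_entropy(p), 3))   # 0 and 1 give 0.0; the maximum of 1.0 is at p = 0.5
```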
• Chain rule: P(A,B,C,D,…) = P(A) P(B|A) P(C|A,B) P(D|A,B,C) …
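A minimal sketch of the chain-rule factorization for three events; the probability values are illustrative only, not taken from the slides:

```python
p_A    = 0.5    # P(A)
p_B_A  = 0.4    # P(B | A)
p_C_AB = 0.25   # P(C | A, B)

p_ABC = p_A * p_B_A * p_C_AB   # P(A, B, C) = P(A) P(B|A) P(C|A,B)
print(p_ABC)                   # 0.05
```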