Course 1
1 Sentiment Analysis
Exercise 1 A first mathematical formulation: linguistic intuition leads to
Logistic Regression for Sentiment Analysis classification.
• Derive the gradient descent update formula in the Logistic Regression case.
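A minimal derivation sketch, assuming the standard sigmoid model and the cross-entropy loss (the symbols $\theta$, $x$, $y$, $\eta$ below are generic placeholders, not notation fixed by the course):
$$\hat{y} = \sigma(\theta^\top x), \quad \sigma(z) = \frac{1}{1+e^{-z}}, \quad L(\theta) = -\big[\, y \log \hat{y} + (1-y)\log(1-\hat{y}) \,\big],$$
$$\frac{\partial L}{\partial \theta} = (\hat{y} - y)\, x, \qquad \theta \leftarrow \theta - \eta\, (\hat{y} - y)\, x.$$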
Exercise 2 Derive the Naive Bayes formulation for the Sentiment Analysis application.
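One common form of the resulting decision rule, assuming a bag-of-words document $d = (w_1, \dots, w_n)$ and classes $c \in \{\text{pos}, \text{neg}\}$ (the notation here is illustrative, not prescribed by the course):
$$\hat{c} = \arg\max_{c} P(c)\prod_{i=1}^{n} P(w_i \mid c) = \arg\max_{c}\Big[\log P(c) + \sum_{i=1}^{n} \log P(w_i \mid c)\Big].$$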
2 N-gram model
Exercise 1 Mathematical formulation for n-gram models. For example, how to
determine the probability of observing a sequence of words $\{w_1, w_2, \dots, w_n\}$
in the bi-gram model.
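For reference, the standard bigram factorization, assuming $w_0 = \langle s\rangle$ is the start symbol and maximum-likelihood estimates (whether an end symbol $\langle /s\rangle$ is also used is a convention fixed elsewhere in the course):
$$P(w_1, w_2, \dots, w_n) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-1}), \qquad P(w_i \mid w_{i-1}) = \frac{C(w_{i-1}\, w_i)}{C(w_{i-1})}.$$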
Exercise 2 Consider the following corpus $V$ built on the two letters $\{a, b\}$:
<s> a b
<s> a a
<s> b a
<s> b b
Prove that the total probability of all possible $k$-word sentences over the
alphabet $\{a, b\}$ is equal to 1 for any $k \geq 2$.
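A small numerical check of the claim for $k = 2, \dots, 5$ (not a proof; the helper names corpus, bigram, history and p below are ad hoc, not taken from the course notebooks):

from itertools import product

# The four training sentences of the exercise (note: no end-of-sentence symbol).
corpus = [["<s>", "a", "b"],
          ["<s>", "a", "a"],
          ["<s>", "b", "a"],
          ["<s>", "b", "b"]]

# Maximum-likelihood bigram counts: C(prev, cur) and C(prev) as a history.
bigram, history = {}, {}
for sent in corpus:
    for prev, cur in zip(sent, sent[1:]):
        bigram[(prev, cur)] = bigram.get((prev, cur), 0) + 1
        history[prev] = history.get(prev, 0) + 1

def p(cur, prev):
    """MLE bigram probability P(cur | prev)."""
    return bigram.get((prev, cur), 0) / history[prev]

# Sum the probabilities of all 2**k possible k-word sentences over {a, b}.
for k in range(2, 6):
    total = 0.0
    for words in product("ab", repeat=k):
        seq = ["<s>"] + list(words)
        prob = 1.0
        for prev, cur in zip(seq, seq[1:]):
            prob *= p(cur, prev)
        total += prob
    print(k, total)   # prints 1.0 for every k

The total stays at 1 because every conditional distribution $P(\cdot \mid w)$ sums to 1, which is exactly the structure the proof can exploit.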
Exercise 3 The $k$-smoothing (add-$k$) technique: state its formula and explain its usage.
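The usual add-$k$ smoothed estimate in the bigram case, with $|V|$ the vocabulary size (the course may also state the general n-gram version):
$$P_{\text{add-}k}(w_i \mid w_{i-1}) = \frac{C(w_{i-1}\, w_i) + k}{C(w_{i-1}) + k\,|V|}.$$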
3 Vector embedding
Exercise 1 The vector embedding concept. Static and contextual embeddings.
Exercise 2 Understand the concept of an occurrence matrix: the word-by-document
design and the word-by-word design.
• The tf-idf formula.
• The pointwise mutual information formula. Relate it to the Information Gain
that you learnt in the Machine Learning course.
• Entropy, joint entropy, conditional entropy. Prove the various properties
of entropies stated in the course. (The main formulas are recalled after this list.)
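For reference, one common textbook form of each quantity (conventions vary, e.g. the logarithm base and the exact idf variant, so treat these as illustrative rather than the course's official definitions):
$$\text{tf-idf}(t, d) = \text{tf}(t, d)\cdot \log\frac{N}{\text{df}(t)}, \qquad \text{PMI}(w, c) = \log\frac{P(w, c)}{P(w)\,P(c)},$$
$$H(X) = -\sum_x P(x)\log P(x), \quad H(X, Y) = -\sum_{x, y} P(x, y)\log P(x, y), \quad H(Y \mid X) = H(X, Y) - H(X).$$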
Exercise 3 For the word2vec part, we will only focus on the Bag-of-Words (CBOW) algorithm.
• Definition of the softmax function. How to compute the partial derivatives
of this function (a reminder follows this list).
• The two-layer neural network architecture (a single hidden layer plus an output
layer) for the Bag-of-Words model. How to derive the partial derivatives for the
Gradient Descent update.
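A reminder of the softmax and its partial derivatives, with $\delta_{ij}$ the Kronecker delta (chaining this Jacobian through the network to obtain the full Gradient Descent update is left to the exercise):
$$s_i(z) = \frac{e^{z_i}}{\sum_j e^{z_j}}, \qquad \frac{\partial s_i}{\partial z_j} = s_i\,(\delta_{ij} - s_j).$$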
Exercise 4 - Coding Understand the Jupyter notebooks on the Bag-of-Words
implementation; you will see some exam questions like the following.
What does the following code do?
import numpy as np

def forward_prop(x, W1, W2, b1, b2):
    h = np.dot(W1, x) + b1    # hidden layer pre-activation
    h = np.maximum(0, h)      # ReLU activation
    z = np.dot(W2, h) + b2    # output layer scores (logits)
    return z, h
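A small usage sketch, continuing from forward_prop above, just to make the array shapes concrete (the vocabulary size V = 5 and embedding size N = 3 are invented for illustration, not taken from the notebooks):

V, N = 5, 3                        # vocabulary and embedding sizes (hypothetical)
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(N, V)), np.zeros((N, 1))
W2, b2 = rng.normal(size=(V, N)), np.zeros((V, 1))
x = np.array([[0.0], [0.5], [0.0], [0.5], [0.0]])   # averaged one-hot context vector
z, h = forward_prop(x, W1, W2, b1, b2)
print(z.shape, h.shape)            # (5, 1) (3, 1)

The softmax (and the loss) are presumably applied to z in a later step of the notebook.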