
Deep Learning for Natural Language Processing

Lecture 1: Logistic Regression for Text Classification

Quan Thanh Tho


Faculty of Computer Science and Engineering
Bach Khoa University
Acknowledgement

• Some slides are from the Coursera course of Prof. Andrew Ng.


Agenda

• Classical Machine Learning Techniques for NLP


• Neural-based Roadmap of NLP
• Logistic Regression for Text Classification
• Demonstration and Assignment
Classical Machine Learning Techniques for NLP

• Machine Learning Task


• The tf-idf weights
• Neural-based approach
Document Collection
• A collection of n documents can be represented in the vector space model by a term-document matrix.
• An entry in the matrix corresponds to the "weight" of a term in the document; zero means the term has no significance in the document or simply does not occur in it (see the sketch after the table).

        T1    T2    ...   Tt
  D1    w11   w21   ...   wt1
  D2    w12   w22   ...   wt2
  :     :     :           :
  Dn    w1n   w2n   ...   wtn
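To make the matrix concrete, below is a minimal Python sketch that builds a term-document matrix of raw term counts; the three-document toy collection and all variable names are invented for illustration and are not part of the lecture.

```python
from collections import Counter

# Toy collection (invented for illustration); each document is a list of tokens.
docs = {
    "D1": "the cat sat on the mat".split(),
    "D2": "the dog sat on the log".split(),
    "D3": "the cat chased the dog".split(),
}

# Vocabulary: one column per distinct term occurring anywhere in the collection.
terms = sorted({tok for tokens in docs.values() for tok in tokens})

# Term-document matrix: entry i of row Dj is the raw frequency of term i in document j;
# zero means the term does not occur in that document.
matrix = {name: [Counter(tokens)[t] for t in terms] for name, tokens in docs.items()}

print("terms:", terms)
for name, row in matrix.items():
    print(name, row)
```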
Term Weights: Term Frequency
• More frequent terms in a document are more important, i.e. more indicative of the topic.
  f_ij = frequency of term i in document j
• May want to normalize term frequency (tf) by dividing by the frequency of the most common term in the document:
  tf_ij = f_ij / max_i{f_ij}
Term Weights: Inverse Document Frequency
• Terms that appear in many different documents are less indicative of the overall topic.
  df_i  = document frequency of term i
        = number of documents containing term i
  idf_i = inverse document frequency of term i
        = log2(N / df_i)    (N: total number of documents)
• An indication of a term's discrimination power.
• Log is used to dampen the effect relative to tf.
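As a minimal sketch, the two formulas above can be transcribed directly into Python; the function names and the example numbers below are my own, not taken from the lecture.

```python
import math

def term_frequency(f_ij, max_f_j):
    """Normalized term frequency: tf_ij = f_ij / max_i{f_ij},
    where max_f_j is the frequency of the most common term in document j."""
    return f_ij / max_f_j

def inverse_document_frequency(N, df_i):
    """Inverse document frequency: idf_i = log2(N / df_i),
    where N is the total number of documents and df_i the document frequency of term i."""
    return math.log2(N / df_i)

# e.g. a term occurring 2 times in a document whose most frequent term occurs 4 times,
# and appearing in 100 of 10,000 documents:
print(term_frequency(2, 4) * inverse_document_frequency(10_000, 100))  # 0.5 * log2(100) ≈ 3.32
```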
TF-IDF Weighting
• A typical combined term importance indicator is tf-idf weighting:
  w_ij = tf_ij × idf_i = tf_ij × log2(N / df_i)
• A term occurring frequently in the document but rarely in the rest of the collection is given high weight.
• Many other ways of determining term weights have been proposed.
• Experimentally, tf-idf has been found to work well.
Computing TF-IDF -- An Example
Given a document containing terms with the following frequencies:
  A(3), B(2), C(1)
Assume the collection contains 10,000 documents and the document frequencies of these terms are:
  A(50), B(1300), C(250)
Then:
  A: tf = 3/3; idf = log2(10000/50)   = 7.6; tf-idf = 7.6
  B: tf = 2/3; idf = log2(10000/1300) = 2.9; tf-idf = 2.0
  C: tf = 1/3; idf = log2(10000/250)  = 5.3; tf-idf = 1.8
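A few lines of Python reproduce the arithmetic of this example; this is only a sketch to check the numbers, with the term statistics hard-coded from the slide.

```python
import math

N = 10_000                                   # total number of documents in the collection
stats = {"A": (3, 50), "B": (2, 1300), "C": (1, 250)}   # term -> (frequency in document, document frequency)
max_f = max(f for f, _ in stats.values())    # frequency of the most common term (A, which occurs 3 times)

for term, (f, df) in stats.items():
    tf = f / max_f                           # normalized term frequency
    idf = math.log2(N / df)                  # inverse document frequency
    print(f"{term}: tf = {f}/{max_f}, idf = {idf:.1f}, tf-idf = {tf * idf:.1f}")
# A: tf = 3/3, idf = 7.6, tf-idf = 7.6
# B: tf = 2/3, idf = 2.9, tf-idf = 2.0
# C: tf = 1/3, idf = 5.3, tf-idf = 1.8
```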
Neural-based Approaches
Why Neural?
Neural-based Milestones for NLP
Neural Language Model
Multitask Learning
Word Embedding
Recurrent Neural Networks
RNN Common Architectures
Enhancement from RNN
Gated Recurrent Unit
Seq2seq Architecture
Seq2seq Limitation
Attention Mechanism
Dynamic Memory Model
Pre-trained Language Model
Pre-trained Neural Language Model
Supervised Learning Problem