
NLP Assignment

Name: Ridhyal Chauhan

Registration No: RA2211056010047


Problem-1: Continuous Bag of Words (CBOW)

(a) List of Target Words for Each Context Window


Given the sentence: "The quick brown fox jumps over the lazy dog."

With a context window of size 2 (one context word on each side of the target), the target words
and their corresponding context windows are:

| Context Window | Target Word |
|----------------|-------------|
| [The, brown]   | quick       |
| [quick, fox]   | brown       |
| [brown, jumps] | fox         |
| [fox, over]    | jumps       |
| [jumps, the]   | over        |
| [over, lazy]   | the         |
| [the, dog]     | lazy        |
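These pairs can be generated programmatically. Below is a minimal Python sketch (illustrative only) that follows the same convention as the table: one context word on each side of the target, skipping the first and last words, which lack a full context.

```python
# Minimal sketch: build CBOW (context, target) pairs for the example sentence.
# Convention assumed here (matching the table above): one context word on each
# side of the target; the first and last words are skipped.
sentence = "The quick brown fox jumps over the lazy dog"
words = sentence.split()

pairs = []
for i in range(1, len(words) - 1):
    context = [words[i - 1], words[i + 1]]  # word before and word after
    pairs.append((context, words[i]))

for context, target in pairs:
    print(context, "->", target)
# ['The', 'brown'] -> quick
# ['quick', 'fox'] -> brown
# ... and so on for the remaining targets
```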

(b) How CBOW Works


The CBOW model predicts a target word using the surrounding context words. The steps
involved are listed below, and a short code sketch after the list illustrates them:
1. **Input Representation**: Each context word is converted into a one-hot encoded vector
or an embedding vector.
2. **Averaging the Context Vectors**: The embeddings of context words are averaged or
summed to form a single vector.
3. **Feeding into a Neural Network**: This averaged vector is passed through a neural
network (usually a single hidden layer).
4. **Output Layer (Softmax Function)**: The network outputs probabilities for all words in
the vocabulary, and the most probable word is chosen as the predicted target word.
5. **Backpropagation & Training**: The model adjusts weights based on prediction errors
to improve accuracy over multiple iterations.
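
A minimal NumPy sketch of one CBOW training step is shown below to make steps 1–5 concrete. The vocabulary, embedding size, and learning rate are arbitrary values chosen for the illustration, not part of the assignment.

```python
import numpy as np

# Toy setup: small vocabulary and embedding dimension (both arbitrary).
vocab = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"]
word_to_id = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 10

rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, D))    # input (embedding) matrix
W_out = rng.normal(scale=0.1, size=(D, V))   # output projection matrix

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# One training example: context ["the", "brown"], target "quick".
context_ids = [word_to_id["the"], word_to_id["brown"]]
target_id = word_to_id["quick"]

# Steps 1-2: look up the context embeddings and average them.
h = W_in[context_ids].mean(axis=0)           # shape (D,)

# Steps 3-4: project to vocabulary scores and apply softmax.
probs = softmax(h @ W_out)                   # shape (V,)

# Step 5: cross-entropy gradient and a single SGD update.
lr = 0.1
grad_scores = probs.copy()
grad_scores[target_id] -= 1.0                # d(loss)/d(scores)
grad_h = W_out @ grad_scores                 # gradient w.r.t. the averaged context
W_out -= lr * np.outer(h, grad_scores)
for idx in context_ids:
    W_in[idx] -= lr * grad_h / len(context_ids)

print("P(target | context) =", probs[target_id])
```
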
Problem-2: Skip-gram vs GloVe Model

(a) Skip-gram Model (Word2Vec) Processing the Sentence


Given the sentence: "Natural language processing is amazing."

The Skip-gram model predicts context words given a target word. The steps are:

1. Target Word Selection: A word is chosen as the center (target) word.


2. Context Window Definition: With a window size of 2, it considers two words before and
after the target word.
3. Prediction Pairs Generation: The model generates training pairs in the form (target word,
context word).

For example, with a window size of 2, the Skip-gram model generates pairs like:

| Target Word | Context Words                      |
|-------------|------------------------------------|
| Natural     | (language, processing)             |
| language    | (Natural, processing, is)          |
| processing  | (Natural, language, is, amazing)   |
| is          | (language, processing, amazing)    |
| amazing     | (processing, is)                   |
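These pairs can be produced with a short Python sketch (illustrative only; the sentence and window size follow the example above).

```python
# Minimal sketch: generate Skip-gram (target, context) pairs with window size 2.
sentence = "Natural language processing is amazing"
words = sentence.split()
window = 2

pairs = []
for i, target in enumerate(words):
    lo, hi = max(0, i - window), min(len(words), i + window + 1)
    for j in range(lo, hi):
        if j != i:
            pairs.append((target, words[j]))

for target, context in pairs:
    print(target, "->", context)
# Natural -> language
# Natural -> processing
# language -> Natural
# ... and so on
```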

(b) Difference Between Skip-gram and GloVe


The Skip-gram and GloVe models differ in their approach to learning word embeddings.

1. Skip-gram Model (Word2Vec)


- Predicts context words given a target word.
- Trained using local context windows.
- Maximizes the probability of seeing correct context words for a target word.
- Performs well with small datasets and infrequent words.

2. GloVe Model
- Uses a word co-occurrence matrix instead of predicting context words.
- Trained using a global co-occurrence matrix.
- Factorizes the matrix to capture word relationships.
- Requires a large corpus for effective training.

In summary, Skip-gram is a **predictive model**, while GloVe is a **count-based model**.


Skip-gram learns embeddings through context prediction, whereas GloVe captures word
relationships by analyzing word co-occurrences across the entire corpus.
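
To make the count-based nature of GloVe concrete, the sketch below builds the kind of global word-word co-occurrence matrix that GloVe starts from. The toy corpus and window size are invented for the illustration, and the weighted least-squares factorization that GloVe actually performs on these counts is omitted.

```python
# Minimal sketch: count global word-word co-occurrences within a symmetric
# window, the raw statistics that GloVe factorizes into word vectors.
from collections import defaultdict

corpus = [
    "natural language processing is amazing",
    "language models learn word representations",
]
window = 2

cooc = defaultdict(float)
for line in corpus:
    words = line.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if j != i:
                cooc[(w, words[j])] += 1.0   # GloVe also weights by 1/distance

for (w, c), count in sorted(cooc.items()):
    print(f"{w:>15s}  {c:>15s}  {count:4.1f}")
```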
