SlideShare a Scribd company logo
How Does word2vec Work? 
Andrew Koo - Insight Data Science
word2vec (Google, 2013) 
• Use documents to train a neural network model 
maximizing the conditional probability of context given the 
word 
• Apply the trained model to each word to get its 
corresponding vector 
• Calculate the vector of sentences by averaging the vector 
of their words 
• Construct the similarity matrix between sentences 
• Use Pagerank to score the sentences in graph
1. Use documents to train a neural 
network model maximizing the conditional 
probability of context given the word 
The goal is to optimize the parameters (Θ) maximizing the 
conditional probability of context (c) given the word (w). D is the set 
of all (w, c) pairs 
For example: I ate a “????” at McDonald last night is more likely 
given Big Mac
2. Apply the model to each word 
to get its corresponding vector 
word vector 
(0.12, 0.23, 0.56) 
(0.24, 0.65, 0.72) 
(0.38, 0.42, 0.12) 
(0.57, 0.01, 0.02) 
(0.53, 0.68, 0.91) 
(0.11, 0.27, 0.45) 
(0.01, 0.05, 0.62) 
The 
Cardinals 
will 
win 
the 
world 
series
3. Calculate the vector of sentences 
by averaging the vector of their words 
word vector 
(0.12, 0.23, 0.56) 
(0.24, 0.65, 0.72) 
(0.38, 0.42, 0.12) 
(0.57, 0.01, 0.02) 
(0.53, 0.68, 0.91) 
(0.11, 0.27, 0.45) 
(0.01, 0.05, 0.62) 
The 
Cardinals 
will 
win 
the 
world 
series 
sentence vector 
(0.28, 0.33, 0.49)
4. Construct the similarity 
matrix between sentences 
1 
0.366 
0.243 
0.564 
0.720 
Sentence Vector 
S’1 
S’2 
S’3 
S’4 
S’5 
0.366 
1 
0.623 
0.132 
0.189 
0.243 
0.623 
1 
0.014 
0.523 
0.564 
0.132 
0.014 
1 
0.002 
matrix * matrix.T similarity matrix 
0.720 
0.189 
0.523 
0.002 
1
5. Use Pagerank to score the 
sentences in graph 
• Rank the sentences 
with underlying 
assumption that 
“summary sentences” 
are similar to most 
other sentences
Ad

More Related Content

What's hot (20)

Word2vec slide(lab seminar)
Word2vec slide(lab seminar)Word2vec slide(lab seminar)
Word2vec slide(lab seminar)
Jinpyo Lee
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer models
Ding Li
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Minh Pham
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
ananth
 
Score based generative model
Score based generative modelScore based generative model
Score based generative model
sangyun lee
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
Yuriy Guts
 
Semi supervised approach for word sense disambiguation
Semi supervised approach for word sense disambiguationSemi supervised approach for word sense disambiguation
Semi supervised approach for word sense disambiguation
kokanechandrakant
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers
Arvind Devaraj
 
Rnn & Lstm
Rnn & LstmRnn & Lstm
Rnn & Lstm
Subash Chandra Pakhrin
 
Sequence to sequence (encoder-decoder) learning
Sequence to sequence (encoder-decoder) learningSequence to sequence (encoder-decoder) learning
Sequence to sequence (encoder-decoder) learning
Roberto Pereira Silveira
 
Attention mechanism 소개 자료
Attention mechanism 소개 자료Attention mechanism 소개 자료
Attention mechanism 소개 자료
Whi Kwon
 
Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and Transformer
Arvind Devaraj
 
Rnn and lstm
Rnn and lstmRnn and lstm
Rnn and lstm
Shreshth Saxena
 
Notes on attention mechanism
Notes on attention mechanismNotes on attention mechanism
Notes on attention mechanism
Khang Pham
 
Graph Neural Network - Introduction
Graph Neural Network - IntroductionGraph Neural Network - Introduction
Graph Neural Network - Introduction
Jungwon Kim
 
Language Model (N-Gram).pptx
Language Model (N-Gram).pptxLanguage Model (N-Gram).pptx
Language Model (N-Gram).pptx
HeneWijaya
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
Roelof Pieters
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model
佳蓉 倪
 
Word_Embedding.pptx
Word_Embedding.pptxWord_Embedding.pptx
Word_Embedding.pptx
NameetDaga1
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga Petrova
Alexey Grigorev
 
Word2vec slide(lab seminar)
Word2vec slide(lab seminar)Word2vec slide(lab seminar)
Word2vec slide(lab seminar)
Jinpyo Lee
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer models
Ding Li
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Minh Pham
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
ananth
 
Score based generative model
Score based generative modelScore based generative model
Score based generative model
sangyun lee
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
Yuriy Guts
 
Semi supervised approach for word sense disambiguation
Semi supervised approach for word sense disambiguationSemi supervised approach for word sense disambiguation
Semi supervised approach for word sense disambiguation
kokanechandrakant
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers
Arvind Devaraj
 
Sequence to sequence (encoder-decoder) learning
Sequence to sequence (encoder-decoder) learningSequence to sequence (encoder-decoder) learning
Sequence to sequence (encoder-decoder) learning
Roberto Pereira Silveira
 
Attention mechanism 소개 자료
Attention mechanism 소개 자료Attention mechanism 소개 자료
Attention mechanism 소개 자료
Whi Kwon
 
Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and Transformer
Arvind Devaraj
 
Notes on attention mechanism
Notes on attention mechanismNotes on attention mechanism
Notes on attention mechanism
Khang Pham
 
Graph Neural Network - Introduction
Graph Neural Network - IntroductionGraph Neural Network - Introduction
Graph Neural Network - Introduction
Jungwon Kim
 
Language Model (N-Gram).pptx
Language Model (N-Gram).pptxLanguage Model (N-Gram).pptx
Language Model (N-Gram).pptx
HeneWijaya
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
Roelof Pieters
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model
佳蓉 倪
 
Word_Embedding.pptx
Word_Embedding.pptxWord_Embedding.pptx
Word_Embedding.pptx
NameetDaga1
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga Petrova
Alexey Grigorev
 

Similar to Word2vec algorithm (20)

Word2 vec
Word2 vecWord2 vec
Word2 vec
ankit_ppt
 
Ltc completed slides
Ltc completed slidesLtc completed slides
Ltc completed slides
Roseline Antai
 
Word2vec ultimate beginner
Word2vec ultimate beginnerWord2vec ultimate beginner
Word2vec ultimate beginner
Sungmin Yang
 
Fusing semantic data
Fusing semantic dataFusing semantic data
Fusing semantic data
Andriy Nikolov
 
DL-CO2 -Session 3 Learning Vectorial Representations of Words.pptx
DL-CO2 -Session 3 Learning Vectorial Representations of Words.pptxDL-CO2 -Session 3 Learning Vectorial Representations of Words.pptx
DL-CO2 -Session 3 Learning Vectorial Representations of Words.pptx
Kv Sagar
 
presentation2-180202073525.pptx
presentation2-180202073525.pptxpresentation2-180202073525.pptx
presentation2-180202073525.pptx
KtonNguyn2
 
Data Con LA 2022 - Transformers for NLP
Data Con LA 2022 - Transformers for NLPData Con LA 2022 - Transformers for NLP
Data Con LA 2022 - Transformers for NLP
Data Con LA
 
Detecting Misleading Headlines in Online News: Hands-on Experiences on Attent...
Detecting Misleading Headlines in Online News: Hands-on Experiences on Attent...Detecting Misleading Headlines in Online News: Hands-on Experiences on Attent...
Detecting Misleading Headlines in Online News: Hands-on Experiences on Attent...
Kunwoo Park
 
Week 3.pdf
Week 3.pdfWeek 3.pdf
Week 3.pdf
RupakKadhare
 
Context-based movie search using doc2vec, word2vec
Context-based movie search using doc2vec, word2vecContext-based movie search using doc2vec, word2vec
Context-based movie search using doc2vec, word2vec
JIN KYU CHANG
 
David Barber - Deep Nets, Bayes and the story of AI
David Barber - Deep Nets, Bayes and the story of AIDavid Barber - Deep Nets, Bayes and the story of AI
David Barber - Deep Nets, Bayes and the story of AI
Bayes Nets meetup London
 
AI at Stitch Fix 2017
AI at Stitch Fix 2017AI at Stitch Fix 2017
AI at Stitch Fix 2017
👋 Christopher Moody
 
Mining Arguments from Online Debating Systems
Mining Arguments from Online Debating SystemsMining Arguments from Online Debating Systems
Mining Arguments from Online Debating Systems
Andrea Pazienza
 
stable_diffusion_a_tutorial, How stable_diffusion works, build stable_diffusi...
stable_diffusion_a_tutorial, How stable_diffusion works, build stable_diffusi...stable_diffusion_a_tutorial, How stable_diffusion works, build stable_diffusi...
stable_diffusion_a_tutorial, How stable_diffusion works, build stable_diffusi...
miaoli35
 
Combinatorial Problems2
Combinatorial Problems2Combinatorial Problems2
Combinatorial Problems2
3ashmawy
 
Natural Language Processing word to Vec.pdf
Natural Language Processing word to Vec.pdfNatural Language Processing word to Vec.pdf
Natural Language Processing word to Vec.pdf
SravaniGunnu
 
Text Representation & Fixed-Size Ordinally-Forgetting Encoding Approach
Text Representation & Fixed-Size Ordinally-Forgetting Encoding ApproachText Representation & Fixed-Size Ordinally-Forgetting Encoding Approach
Text Representation & Fixed-Size Ordinally-Forgetting Encoding Approach
Ahmed Hani Ibrahim
 
Science in text mining
Science in text miningScience in text mining
Science in text mining
Tanay Chowdhury
 
Lecture1.pptx
Lecture1.pptxLecture1.pptx
Lecture1.pptx
jonathanG19
 
집합모델 확장불린모델
집합모델  확장불린모델집합모델  확장불린모델
집합모델 확장불린모델
JUNGEUN KANG
 
Word2vec ultimate beginner
Word2vec ultimate beginnerWord2vec ultimate beginner
Word2vec ultimate beginner
Sungmin Yang
 
DL-CO2 -Session 3 Learning Vectorial Representations of Words.pptx
DL-CO2 -Session 3 Learning Vectorial Representations of Words.pptxDL-CO2 -Session 3 Learning Vectorial Representations of Words.pptx
DL-CO2 -Session 3 Learning Vectorial Representations of Words.pptx
Kv Sagar
 
presentation2-180202073525.pptx
presentation2-180202073525.pptxpresentation2-180202073525.pptx
presentation2-180202073525.pptx
KtonNguyn2
 
Data Con LA 2022 - Transformers for NLP
Data Con LA 2022 - Transformers for NLPData Con LA 2022 - Transformers for NLP
Data Con LA 2022 - Transformers for NLP
Data Con LA
 
Detecting Misleading Headlines in Online News: Hands-on Experiences on Attent...
Detecting Misleading Headlines in Online News: Hands-on Experiences on Attent...Detecting Misleading Headlines in Online News: Hands-on Experiences on Attent...
Detecting Misleading Headlines in Online News: Hands-on Experiences on Attent...
Kunwoo Park
 
Context-based movie search using doc2vec, word2vec
Context-based movie search using doc2vec, word2vecContext-based movie search using doc2vec, word2vec
Context-based movie search using doc2vec, word2vec
JIN KYU CHANG
 
David Barber - Deep Nets, Bayes and the story of AI
David Barber - Deep Nets, Bayes and the story of AIDavid Barber - Deep Nets, Bayes and the story of AI
David Barber - Deep Nets, Bayes and the story of AI
Bayes Nets meetup London
 
Mining Arguments from Online Debating Systems
Mining Arguments from Online Debating SystemsMining Arguments from Online Debating Systems
Mining Arguments from Online Debating Systems
Andrea Pazienza
 
stable_diffusion_a_tutorial, How stable_diffusion works, build stable_diffusi...
stable_diffusion_a_tutorial, How stable_diffusion works, build stable_diffusi...stable_diffusion_a_tutorial, How stable_diffusion works, build stable_diffusi...
stable_diffusion_a_tutorial, How stable_diffusion works, build stable_diffusi...
miaoli35
 
Combinatorial Problems2
Combinatorial Problems2Combinatorial Problems2
Combinatorial Problems2
3ashmawy
 
Natural Language Processing word to Vec.pdf
Natural Language Processing word to Vec.pdfNatural Language Processing word to Vec.pdf
Natural Language Processing word to Vec.pdf
SravaniGunnu
 
Text Representation & Fixed-Size Ordinally-Forgetting Encoding Approach
Text Representation & Fixed-Size Ordinally-Forgetting Encoding ApproachText Representation & Fixed-Size Ordinally-Forgetting Encoding Approach
Text Representation & Fixed-Size Ordinally-Forgetting Encoding Approach
Ahmed Hani Ibrahim
 
집합모델 확장불린모델
집합모델  확장불린모델집합모델  확장불린모델
집합모델 확장불린모델
JUNGEUN KANG
 
Ad

Recently uploaded (20)

Conic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptxConic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptx
taiwanesechetan
 
VKS-Python-FIe Handling text CSV Binary.pptx
VKS-Python-FIe Handling text CSV Binary.pptxVKS-Python-FIe Handling text CSV Binary.pptx
VKS-Python-FIe Handling text CSV Binary.pptx
Vinod Srivastava
 
ISO 9001_2015 FINALaaaaaaaaaaaaaaaa - MDX - Copy.pptx
ISO 9001_2015 FINALaaaaaaaaaaaaaaaa - MDX - Copy.pptxISO 9001_2015 FINALaaaaaaaaaaaaaaaa - MDX - Copy.pptx
ISO 9001_2015 FINALaaaaaaaaaaaaaaaa - MDX - Copy.pptx
pankaj6188303
 
VKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptxVKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptx
Vinod Srivastava
 
Modern_Distribution_Presentation.pptx Aa
Modern_Distribution_Presentation.pptx AaModern_Distribution_Presentation.pptx Aa
Modern_Distribution_Presentation.pptx Aa
MuhammadAwaisKamboh
 
LLM finetuning for multiple choice google bert
LLM finetuning for multiple choice google bertLLM finetuning for multiple choice google bert
LLM finetuning for multiple choice google bert
ChadapornK
 
Developing Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response ApplicationsDeveloping Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response Applications
VICTOR MAESTRE RAMIREZ
 
1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf
1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf
1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf
Simran112433
 
Decision Trees in Artificial-Intelligence.pdf
Decision Trees in Artificial-Intelligence.pdfDecision Trees in Artificial-Intelligence.pdf
Decision Trees in Artificial-Intelligence.pdf
Saikat Basu
 
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
ThanushsaranS
 
DPR_Expert_Recruitment_notice_Revised.pdf
DPR_Expert_Recruitment_notice_Revised.pdfDPR_Expert_Recruitment_notice_Revised.pdf
DPR_Expert_Recruitment_notice_Revised.pdf
inmishra17121973
 
Simple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptxSimple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptx
ssuser2aa19f
 
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Abodahab
 
Process Mining and Data Science in the Financial Industry
Process Mining and Data Science in the Financial IndustryProcess Mining and Data Science in the Financial Industry
Process Mining and Data Science in the Financial Industry
Process mining Evangelist
 
Data Analytics Overview and its applications
Data Analytics Overview and its applicationsData Analytics Overview and its applications
Data Analytics Overview and its applications
JanmejayaMishra7
 
Classification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptxClassification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptx
wencyjorda88
 
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptxPerencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
PareaRusan
 
FPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptxFPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptx
ssuser4ef83d
 
Flip flop presenation-Presented By Mubahir khan.pptx
Flip flop presenation-Presented By Mubahir khan.pptxFlip flop presenation-Presented By Mubahir khan.pptx
Flip flop presenation-Presented By Mubahir khan.pptx
mubashirkhan45461
 
Ch3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendencyCh3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendency
ayeleasefa2
 
Conic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptxConic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptx
taiwanesechetan
 
VKS-Python-FIe Handling text CSV Binary.pptx
VKS-Python-FIe Handling text CSV Binary.pptxVKS-Python-FIe Handling text CSV Binary.pptx
VKS-Python-FIe Handling text CSV Binary.pptx
Vinod Srivastava
 
ISO 9001_2015 FINALaaaaaaaaaaaaaaaa - MDX - Copy.pptx
ISO 9001_2015 FINALaaaaaaaaaaaaaaaa - MDX - Copy.pptxISO 9001_2015 FINALaaaaaaaaaaaaaaaa - MDX - Copy.pptx
ISO 9001_2015 FINALaaaaaaaaaaaaaaaa - MDX - Copy.pptx
pankaj6188303
 
VKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptxVKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptx
Vinod Srivastava
 
Modern_Distribution_Presentation.pptx Aa
Modern_Distribution_Presentation.pptx AaModern_Distribution_Presentation.pptx Aa
Modern_Distribution_Presentation.pptx Aa
MuhammadAwaisKamboh
 
LLM finetuning for multiple choice google bert
LLM finetuning for multiple choice google bertLLM finetuning for multiple choice google bert
LLM finetuning for multiple choice google bert
ChadapornK
 
Developing Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response ApplicationsDeveloping Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response Applications
VICTOR MAESTRE RAMIREZ
 
1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf
1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf
1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf
Simran112433
 
Decision Trees in Artificial-Intelligence.pdf
Decision Trees in Artificial-Intelligence.pdfDecision Trees in Artificial-Intelligence.pdf
Decision Trees in Artificial-Intelligence.pdf
Saikat Basu
 
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
ThanushsaranS
 
DPR_Expert_Recruitment_notice_Revised.pdf
DPR_Expert_Recruitment_notice_Revised.pdfDPR_Expert_Recruitment_notice_Revised.pdf
DPR_Expert_Recruitment_notice_Revised.pdf
inmishra17121973
 
Simple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptxSimple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptx
ssuser2aa19f
 
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Abodahab
 
Process Mining and Data Science in the Financial Industry
Process Mining and Data Science in the Financial IndustryProcess Mining and Data Science in the Financial Industry
Process Mining and Data Science in the Financial Industry
Process mining Evangelist
 
Data Analytics Overview and its applications
Data Analytics Overview and its applicationsData Analytics Overview and its applications
Data Analytics Overview and its applications
JanmejayaMishra7
 
Classification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptxClassification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptx
wencyjorda88
 
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptxPerencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
PareaRusan
 
FPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptxFPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptx
ssuser4ef83d
 
Flip flop presenation-Presented By Mubahir khan.pptx
Flip flop presenation-Presented By Mubahir khan.pptxFlip flop presenation-Presented By Mubahir khan.pptx
Flip flop presenation-Presented By Mubahir khan.pptx
mubashirkhan45461
 
Ch3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendencyCh3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendency
ayeleasefa2
 
Ad

Word2vec algorithm

  • 1. How Does word2vec Work? Andrew Koo - Insight Data Science
  • 2. word2vec (Google, 2013) • Use documents to train a neural network model maximizing the conditional probability of context given the word • Apply the trained model to each word to get its corresponding vector • Calculate the vector of sentences by averaging the vector of their words • Construct the similarity matrix between sentences • Use Pagerank to score the sentences in graph
  • 3. 1. Use documents to train a neural network model maximizing the conditional probability of context given the word The goal is to optimize the parameters (Θ) maximizing the conditional probability of context (c) given the word (w). D is the set of all (w, c) pairs For example: I ate a “????” at McDonald last night is more likely given Big Mac
  • 4. 2. Apply the model to each word to get its corresponding vector word vector (0.12, 0.23, 0.56) (0.24, 0.65, 0.72) (0.38, 0.42, 0.12) (0.57, 0.01, 0.02) (0.53, 0.68, 0.91) (0.11, 0.27, 0.45) (0.01, 0.05, 0.62) The Cardinals will win the world series
  • 5. 3. Calculate the vector of sentences by averaging the vector of their words word vector (0.12, 0.23, 0.56) (0.24, 0.65, 0.72) (0.38, 0.42, 0.12) (0.57, 0.01, 0.02) (0.53, 0.68, 0.91) (0.11, 0.27, 0.45) (0.01, 0.05, 0.62) The Cardinals will win the world series sentence vector (0.28, 0.33, 0.49)
  • 6. 4. Construct the similarity matrix between sentences 1 0.366 0.243 0.564 0.720 Sentence Vector S’1 S’2 S’3 S’4 S’5 0.366 1 0.623 0.132 0.189 0.243 0.623 1 0.014 0.523 0.564 0.132 0.014 1 0.002 matrix * matrix.T similarity matrix 0.720 0.189 0.523 0.002 1
  • 7. 5. Use Pagerank to score the sentences in graph • Rank the sentences with underlying assumption that “summary sentences” are similar to most other sentences