Learning Discourse-level Diversity for
Neural Dialog Models Using Conditional
Variational Autoencoders
Tiancheng Zhao, Ran Zhao and Maxine Eskenazi
Language Technologies Institute
Carnegie Mellon University
Code & Data: https://github.com/snakeztc/NeuralDialog-CVAE
Introduction
● End-to-end dialog models based on encoder-decoder models have shown great promise for
modeling open-domain conversations, due to their flexibility and scalability.
[Figure: Dialog History/Context → Encoder → Decoder → System Response]
Introduction
However, these models suffer from the dull response problem [Li et al. 2015, Serban et al. 2016].
Current solutions include:
● Add more info to the dialog context [Xing et al 2016, Li et al 2016]
● Improve decoding algorithm, e.g. beam search [Wiseman and Rush 2016]
[Figure: given the context "… (previous utterances) User: I am feeling quite happy today.", the Encoder-Decoder produces dull responses such as "Yes", "I don’t know", "sure"]
Our Key Insights
● Response generation in conversation is a ONE-TO-MANY mapping problem at the
discourse level.
● A similar dialog context can have many different yet valid responses.
● Learn a probabilistic distribution over the valid responses instead of keeping only the
most likely one.
Our Contributions
1. Present an E2E dialog model adapted from Conditional Variational Autoencoder
(CVAE).
2. Enable integration of expert knowledge via knowledge-guided CVAE.
3. Improve the training method of optimizing CVAE/VAE for text generation.
Conditional Variational Auto Encoder (CVAE)
● C is dialog context
○ B: Do you like cats? A: Yes I do
● Z is the latent variable (Gaussian)
● X is the next response
○ B: So do I.
● Trained by Stochastic Gradient Variational
Bayes (SGVB) [Kingma and Welling 2013]
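The quantity that SGVB optimizes can be written out explicitly. The following is a sketch assuming the standard CVAE variational lower bound, with a recognition network q(z | x, c) and a prior network p(z | c); the parameter symbols θ and φ are notation introduced here and do not appear on the slide.

```latex
\mathcal{L}(\theta, \phi; x, c)
  = \mathbb{E}_{q_\phi(z \mid x, c)}\big[\log p_\theta(x \mid z, c)\big]
  - \mathrm{KL}\big(q_\phi(z \mid x, c) \,\|\, p_\theta(z \mid c)\big)
  \le \log p(x \mid c)
```

The first term rewards reconstructing the response X from Z and C; the second keeps the recognition distribution close to the context-only prior, so that Z can be sampled from C alone when no response is available.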
Knowledge-Guided CVAE (kgCVAE)
● Y is a set of linguistic features extracted from the responses
○ Dialog act: statement -> “So do I”.
● Use Y to guide the learning of latent Z
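One way to express "use Y to guide Z" as an objective is sketched below. It assumes that Y is generated from Z and C, and that X is generated from Z, C and Y; this factorization and the symbols θ and φ are assumptions made for illustration rather than something stated on the slide.

```latex
\mathcal{L}(\theta, \phi; x, c, y)
  = \mathbb{E}_{q_\phi(z \mid x, c, y)}\big[\log p_\theta(x \mid z, c, y)\big]
  + \mathbb{E}_{q_\phi(z \mid x, c, y)}\big[\log p_\theta(y \mid z, c)\big]
  - \mathrm{KL}\big(q_\phi(z \mid x, c, y) \,\|\, p_\theta(z \mid c)\big)
```

Under this reading, the extra term forces Z to carry enough information to predict the linguistic features Y (for example the dialog act), which is what ties the latent space to expert knowledge.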
Training of (kg)CVAE
[Figure: (kg)CVAE training network, annotated with the reconstruction loss and the KL-divergence loss]
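The two loss terms can be illustrated numerically. This is a minimal sketch assuming diagonal Gaussian recognition and prior distributions parameterized by a mean and a log-variance; the function names `sample_latent`, `gaussian_kl` and `training_loss` are hypothetical and are not taken from the released code.

```python
import math
import random


def sample_latent(mu, logvar, rng):
    """Reparameterized sample: z = mu + sigma * eps, with eps ~ N(0, I)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """Closed-form KL(q || p) between two diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def training_loss(reconstruction_nll, kl, kl_weight=1.0):
    """Quantity to minimize: reconstruction loss plus (optionally weighted) KL loss."""
    return reconstruction_nll + kl_weight * kl
```

Here `reconstruction_nll` stands for the decoder's negative log-likelihood of the response, which a real model would compute from its RNN decoder.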
Testing of (kg)CVAE
Optimization Challenge
Training a CVAE with an RNN decoder is hard due to the vanishing latent variable problem
[Bowman et al., 2015]
● The RNN decoder can cheat by relying on language model (LM) information and ignoring Z!
Bowman et al. [2015] described two methods to alleviate the problem:
1. KL annealing (KLA): gradually increase the weight of the KL term from 0 to 1 (needs early stopping).
2. Word drop decoding: set a proportion of the target words to 0 (needs careful parameter
tuning).
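Both heuristics are simple to sketch. The example below assumes a linear annealing schedule (the slide only says "gradually") and assumes that dropped words are replaced by the id 0; `kl_weight` and `word_drop` are hypothetical helper names.

```python
import random


def kl_weight(step, anneal_steps):
    """Linear KL annealing: the weight rises from 0 to 1 over anneal_steps updates."""
    if anneal_steps <= 0:
        return 1.0
    return min(1.0, step / anneal_steps)


def word_drop(token_ids, drop_rate, rng, drop_id=0):
    """Replace each decoder-input token with drop_id with probability drop_rate."""
    return [drop_id if rng.random() < drop_rate else t for t in token_ids]
```

With `anneal_steps=100`, the KL term contributes nothing at step 0, half its value at step 50, and its full value from step 100 onward. Both `anneal_steps` and `drop_rate` are the kind of parameters the slide warns need care.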
BOW Loss
● Predict the bag-of-words in the responses X at once (word counts in the response)
● Break the dependency between words and eliminate the chance of cheating based on LM.
[Figure: z and c feed both the RNN decoder (RNN loss on x) and a feed-forward (FF) network (bag-of-word loss on xwo)]
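The bag-of-word loss can be sketched as follows, assuming the feed-forward network outputs one vector of vocabulary logits from z and c and that every response word is scored against the same softmax. The function names are hypothetical, and the logits are passed in directly rather than computed by a network.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of vocabulary logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def bow_loss(vocab_logits, response_ids):
    """Negative log-likelihood of all response words under one shared softmax.

    The same distribution scores every word, so the loss ignores word order
    and cannot be reduced by language-model context alone.
    """
    log_probs = log_softmax(vocab_logits)
    return -sum(log_probs[t] for t in response_ids)
```

Because the prediction is made in one shot from z and c, any reduction in this loss has to come from information stored in the latent variable or the context.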
Dataset
Data Name: Switchboard Release 2
Number of dialogs: 2,400 (2,316/60/62 train/valid/test)
Number of context-response pairs: 207,833/5,225/5,481
Vocabulary Size: Top 10K
Dialog Act Labels: 42 types, tagged by SVM and human
Number of Topics: 70, tagged by humans
Quantitative Metrics
d(r, h) is a function with values in [0, 1] that measures the similarity between a reference r and a hypothesis h.
Appropriateness
Diversity
[Figure: for each context, Mc human reference responses (Ref resp 1 … Ref resp Mc) are compared with N model hypothesis responses (Hyp resp 1 … Hyp resp N)]
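A sketch of how the two scores could be computed for one context, assuming the usual max-matching form of precision and recall and assuming that appropriateness is read as precision and diversity as recall. The function `precision_recall` and the toy `exact_match` similarity are hypothetical names used only for illustration.

```python
def precision_recall(references, hypotheses, d):
    """Return (precision, recall) for one context.

    precision: average over hypotheses of the best similarity to any reference.
    recall: average over references of the best similarity to any hypothesis.
    """
    precision = sum(max(d(r, h) for r in references) for h in hypotheses) / len(hypotheses)
    recall = sum(max(d(r, h) for h in hypotheses) for r in references) / len(references)
    return precision, recall


def exact_match(r, h):
    """Toy similarity in [0, 1]: 1.0 if the two strings are identical, else 0.0."""
    return 1.0 if r == h else 0.0
```

A model that repeats one safe response can score well on precision while covering few references, which is why recall is reported alongside it.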
Distance Functions used for Evaluation
1. Smoothed sentence-level BLEU (1/2/3/4): lexical similarity.
2. Cosine distance of bag-of-word embeddings: distributed semantic similarity
(GloVe embeddings pre-trained on Twitter).
a. Average of embeddings (A-bow)
b. Extrema of embeddings (E-bow)
3. Dialog act match: illocutionary-force-level similarity
(uses a pre-trained dialog act tagger for tagging).
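The two embedding-based measures can be sketched as below. The code assumes that `emb` is a dictionary from word to vector, that "extrema" means keeping the value of largest absolute magnitude per dimension, and that each sentence contains at least one known word. It returns raw cosine similarity, so any mapping into [0, 1] is left out. All function names are hypothetical.

```python
import math


def average_embedding(vectors):
    """A-bow pooling: element-wise mean of the word vectors."""
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def extrema_embedding(vectors):
    """E-bow pooling: per dimension, keep the value with the largest magnitude."""
    dim = len(vectors[0])
    return [max((v[k] for v in vectors), key=abs) for k in range(dim)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def bow_similarity(ref_tokens, hyp_tokens, emb, pool):
    """Pool the known word vectors of each sentence, then compare by cosine."""
    ref = pool([emb[t] for t in ref_tokens if t in emb])
    hyp = pool([emb[t] for t in hyp_tokens if t in emb])
    return cosine(ref, hyp)
```

Passing `average_embedding` as `pool` gives an A-bow style score and `extrema_embedding` gives an E-bow style score.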
Models (trained with BOW loss)
● Baseline: Encoder → Sampling Decoder
● CVAE: Encoder → sampling z → Greedy Decoder
● kgCVAE: Encoder → sampling z, y → Greedy Decoder
Quantitative Analysis Results
Metrics            Baseline (sample)   CVAE (greedy)   kgCVAE (greedy)
Perplexity (KL)    35.4 (n/a)          20.2 (11.36)    16.02 (13.08)
BLEU-1 (p/r)       0.405/0.336         0.372/0.381     0.412/0.411
BLEU-2 (p/r)       0.3/0.281           0.295/0.322     0.350/0.356
BLEU-3 (p/r)       0.272/0.254         0.265/0.292     0.310/0.318
BLEU-4 (p/r)       0.226/0.215         0.223/0.248     0.262/0.272
A-bow (p/r)        0.387/0.337         0.389/0.361     0.373/0.336
E-bow (p/r)        0.701/0.684         0.705/0.709     0.711/0.712
DA (p/r)           0.736/0.514         0.704/0.604     0.721/0.598
Note: BLEU scores are normalized into [0, 1] so that they are valid precision and recall distance functions.
Qualitative Analysis
Topic: Recycling
Context: A: are they doing a lot of recycling out in Georgia?
Target (statement): well at my workplace we have places for aluminium cans
Baseline + Sampling
1. well I’m a graduate student and have two kids.
2. well I was in last year and so we’ve had lots of recycling.
3. I’m not sure.
4. well I don’t know I just moved here in new york.
kgCVAE + Greedy
1. (non-understand) pardon.
2. (statement) oh you’re not going to have a curbside pick up here.
3. (statement) okay I am sure about a recycling center.
4. (yes-answer) yeah so.
Latent Space Visualization
● Visualization of the posterior Z on the test
dataset in 2D space using t-SNE.
● Different colors are assigned to the 8 most frequent
dialog acts.
● The size of each circle represents the response
length.
● Responses form clear clusters with respect to the
dialog act.
The Effect of BOW Loss
Same setup as Bowman et al. [2015] for language modeling (LM)
on the Penn Treebank. Compare 4 setups:
1. Standard VAE
2. KL Annealing (KLA)
3. BOW
4. BOW + KLA
Goal: low reconstruction loss + small
but non-trivial KL cost
Model Perplexity KL Cost
Standard 122.0 0.05
KLA 111.5 2.02
BOW 97.72 7.41
BOW+KLA 73.04 15.94
KL Cost during Training
● Standard model suffers from vanishing
latent variable.
● KLA requires early stopping.
● BOW leads to stable convergence
with/without KLA.
● The same trend is observed on CVAE.
Conclusion and Future Work
● Identify the ONE-TO-MANY nature of open-domain dialog modeling
● Propose two novel latent variable models for generating diverse yet
appropriate responses.
● Explore further in the direction of leveraging both past linguistic findings and deep models
for controllability and explainability.
● Utilize crowdsourcing to yield more robust evaluation.
Code available here! https://github.com/snakeztc/NeuralDialog-CVAE
Thank you!
Questions?
References
1. Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, and Bill Dolan. 2016a. A persona-based neural conversation model.
arXiv preprint arXiv:1603.06155.
2. Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, and Bill Dolan. 2015. A diversity-promoting objective function for neural
conversation models. arXiv preprint arXiv:1510.03055.
3. Samuel R Bowman, Luke Vilnis, Oriol Vinyals, Andrew M Dai, Rafal Jozefowicz, and Samy Bengio. 2015. Generating
sentences from a continuous space. arXiv preprint arXiv:1511.06349.
4. Diederik P Kingma and Max Welling. 2013. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114.
Training Details
Word Embedding: 200, GloVe pre-trained on Twitter
Utterance Encoder Hidden Size: 300
Context Encoder Hidden Size: 600
Response Decoder Hidden Size: 400
Latent Z Size: 200
Context Window Size: 10 utterances
Optimizer: Adam, learning rate = 0.001
Testset Creation
● Use 10-nearest-neighbour retrieval to collect similar contexts in the training data
● Have 2 human annotators label the appropriateness of the 10 responses for a subset
of the contexts
● Bootstrap via an SVM to the whole test set (5,481 context/response pairs)
● Result: 6.79 reference responses per context on average
● Distinct reference dialog acts: 4.2
Association for Computational Linguistics
 
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Association for Computational Linguistics
 
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Association for Computational Linguistics
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Association for Computational Linguistics
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
Association for Computational Linguistics
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
Association for Computational Linguistics
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Association for Computational Linguistics
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Association for Computational Linguistics
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Association for Computational Linguistics
 
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Association for Computational Linguistics
 
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Association for Computational Linguistics
 

Tiancheng Zhao - 2017 - Learning Discourse-level Diversity for Neural Dialog Models Using Conditional Variational Autoencoders

  • 1. Learning Discourse-level Diversity for Neural Dialog Models Using Conditional Variational Autoencoders. Tiancheng Zhao, Ran Zhao and Maxine Eskenazi, Language Technologies Institute, Carnegie Mellon University. Code & Data: https://github.com/snakeztc/NeuralDialog-CVAE
  • 2. Introduction ● End-to-end dialog models based on encoder-decoder architectures have shown great promise for modeling open-domain conversations, due to their flexibility and scalability. (Diagram: Dialog History/Context → Encoder → Decoder → System Response)
  • 3. Introduction ● However, these models suffer from the dull response problem [Li et al. 2015, Serban et al. 2016]. Current solutions include: ● Add more information to the dialog context [Xing et al. 2016, Li et al. 2016] ● Improve the decoding algorithm, e.g. beam search [Wiseman and Rush 2016] (Diagram: "User: I am feeling quite happy today. … (previous utterances)" → Encoder → Decoder → "Yes" / "I don’t know" / "Sure")
  • 5. Our Key Insights ● Response generation in conversation is a ONE-TO-MANY mapping problem at the discourse level. ○ A similar dialog context can have many different yet valid responses. ● Learn a probability distribution over the valid responses instead of keeping only the most likely one.
  • 6. Our Contributions 1. Present an E2E dialog model adapted from Conditional Variational Autoencoder (CVAE). 2. Enable integration of expert knowledge via knowledge-guided CVAE. 3. Improve the training method of optimizing CVAE/VAE for text generation.
  • 8. Conditional Variational Autoencoder (CVAE) ● C is the dialog context ○ B: Do you like cats? A: Yes I do ● Z is the latent variable (Gaussian) ● X is the next response ○ B: So do I. ● Trained by Stochastic Gradient Variational Bayes (SGVB) [Kingma and Welling 2013]
  • 9. Knowledge-Guided CVAE (kgCVAE) ● Y denotes linguistic features extracted from the responses ○ Dialog act: statement -> “So do I”. ● Use Y to guide the learning of the latent variable Z
  • 10. Training of (kg)CVAE ● The training objective has two terms: a reconstruction loss and a KL-divergence loss.
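The two loss terms named on this slide can be written out as a variational lower bound. The following is a sketch under the standard CVAE formulation, not a transcription of the slide: the recognition network $q_\phi(z \mid x, c)$, the prior network $p_\theta(z \mid c)$ and the decoder $p_\theta(x \mid z, c)$ are notation assumed here, using the slide's symbols C, Z and X in lower case.

```latex
\mathcal{L}(\theta, \phi; x, c)
  = \underbrace{\mathbb{E}_{q_\phi(z \mid x, c)}\big[\log p_\theta(x \mid z, c)\big]}_{\text{reconstruction term}}
  - \underbrace{\mathrm{KL}\big(q_\phi(z \mid x, c)\,\|\,p_\theta(z \mid c)\big)}_{\text{KL-divergence term}}
  \le \log p(x \mid c)
```

Maximizing this bound is the same as minimizing the sum of the reconstruction loss (the negated first term) and the KL-divergence loss.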
  • 12. Optimization Challenge ● Training a CVAE with an RNN decoder is hard due to the vanishing latent variable problem [Bowman et al., 2015] ● The RNN decoder can cheat by relying on language-model (LM) information and ignoring Z! ● Bowman et al. [2015] described two methods to alleviate the problem: 1. KL annealing (KLA): gradually increase the weight of the KL term from 0 to 1 (requires early stopping). 2. Word drop decoding: set a proportion of the target words to 0 (requires careful parameter tuning).
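The two remedies listed above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the linear schedule, the function names, and the use of id 0 as the replacement token are all assumptions made for the example.

```python
import random


def kl_weight(step, anneal_steps):
    """KL annealing: weight grows from 0 to 1 over `anneal_steps` updates.

    A linear schedule is assumed here; other monotone schedules are possible.
    """
    return min(1.0, step / anneal_steps)


def annealed_loss(recon_loss, kl_loss, step, anneal_steps):
    """Total loss with the KL term scaled by the current annealing weight."""
    return recon_loss + kl_weight(step, anneal_steps) * kl_loss


def word_drop(target_ids, drop_rate, unk_id=0, rng=random):
    """Word drop decoding: replace a proportion of target word ids with `unk_id`.

    Each position is dropped independently with probability `drop_rate`.
    """
    return [unk_id if rng.random() < drop_rate else w for w in target_ids]
```

For example, `annealed_loss(2.0, 4.0, step=25, anneal_steps=100)` weights the KL term by 0.25. Both knobs (`anneal_steps` and `drop_rate`) are the hyperparameters the slide warns need care.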
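  • 14. BOW Loss ● Predict the bag-of-words of the response X at once (word counts in the response) ● Break the dependency between words and eliminate the chance of cheating based on the LM. (Diagram: z and c feed the RNN decoder, which yields the RNN loss on x, and a feed-forward network (FF), which yields the bag-of-words loss on the bag-of-words of x.)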
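A minimal sketch of a bag-of-words loss, assuming the feed-forward network maps (z, c) to one vector of vocabulary logits and every word of the response is scored against the same softmax. The name `bow_loss` and the plain-list representation are hypothetical choices for illustration.

```python
import math


def bow_loss(logits, response_ids):
    """Negative log-likelihood of the response words under one shared softmax.

    `logits` is a list of vocabulary scores predicted once from (z, c);
    `response_ids` are the word ids of the response. Word order is ignored,
    so the loss cannot be lowered by exploiting language-model context.
    """
    m = max(logits)
    # Numerically stable log of the softmax normalizer.
    log_norm = m + math.log(sum(math.exp(v - m) for v in logits))
    return -sum(logits[w] - log_norm for w in response_ids)
```

Because the sum runs over word ids without regard to position, permuting the response leaves the loss unchanged, which is the property the slide relies on.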
  • 15. Dataset
      Data Name: Switchboard Release 2
      Number of dialogs: 2,400 (2316/60/62 - train/valid/test)
      Number of context-response pairs: 207,833/5,225/5,481
      Vocabulary Size: Top 10K
      Dialog Act Labels: 42 types, tagged by SVM and human
      Number of Topics: 70, tagged by humans
  • 17. Quantitative Metrics ● d(r, h) is a function with values in [0, 1] that measures the similarity between a reference response r and a hypothesis response h. ● Two scores are computed from it: appropriateness and diversity. (Diagram: for each context, M_c human reference responses are compared with N model hypothesis responses.)
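One way to turn d(r, h) into the two scores is a precision/recall pair over the M_c references and N hypotheses. The sketch below assumes that precision averages, over hypotheses, the best match against any reference (appropriateness), and that recall averages, over references, the best match against any hypothesis (diversity). The `unigram_overlap` function is only a toy stand-in for d and is not one of the metrics used in the slides.

```python
def precision_recall(refs, hyps, d):
    """Return (precision, recall) for one context.

    precision: mean over hypotheses of the best similarity to any reference.
    recall:    mean over references of the best similarity to any hypothesis.
    """
    precision = sum(max(d(r, h) for r in refs) for h in hyps) / len(hyps)
    recall = sum(max(d(r, h) for h in hyps) for r in refs) / len(refs)
    return precision, recall


def unigram_overlap(r, h):
    """Toy similarity in [0, 1]: Jaccard overlap of the two word sets."""
    a, b = set(r.split()), set(h.split())
    return len(a & b) / len(a | b) if a | b else 0.0
```

With references `["so do i", "i love cats"]` and hypotheses `["so do i", "me too"]`, one hypothesis matches a reference exactly and the other matches nothing, so precision is 0.5. Recall is 0.6 because the second reference is only partly covered.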
  • 18. Distance Functions used for Evaluation 1. Smoothed sentence-level BLEU (1/2/3/4): lexical similarity 2. Cosine distance of bag-of-word embeddings: distributed semantic similarity (GloVe embeddings pre-trained on Twitter) a. Average of embeddings (A-bow) b. Extrema of embeddings (E-bow) 3. Dialog act match: illocutionary force-level similarity a. (A pre-trained dialog act tagger is used for tagging)
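The two embedding-based measures can be sketched as follows. The code assumes each response is already a list of word vectors, and it assumes the common "vector extrema" definition for E-bow, where each dimension keeps the value of largest magnitude. The slides do not spell out either detail, and all function names are illustrative.

```python
import math


def average_embedding(vectors):
    """A-bow sentence vector: per-dimension mean of the word vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def extrema_embedding(vectors):
    """E-bow sentence vector: per dimension, the value with the largest magnitude."""
    return [max(col, key=abs) for col in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

A similarity between a reference and a hypothesis is then `cosine(average_embedding(ref_vecs), average_embedding(hyp_vecs))`, or the same call with `extrema_embedding`.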
  • 19. Models (trained with BOW loss) ● Baseline: Encoder → sampling decoder ● CVAE: Encoder → sample z → greedy decoder ● kgCVAE: Encoder → sample z and y → greedy decoder
  • 20. Quantitative Analysis Results
      Metrics              Perplexity (KL)   BLEU-1 (p/r)   BLEU-2 (p/r)   BLEU-3 (p/r)   BLEU-4 (p/r)   A-bow (p/r)    E-bow (p/r)    DA (p/r)
      Baseline (sample)    35.4 (n/a)        0.405/0.336    0.3/0.281      0.272/0.254    0.226/0.215    0.387/0.337    0.701/0.684    0.736/0.514
      CVAE (greedy)        20.2 (11.36)      0.372/0.381    0.295/0.322    0.265/0.292    0.223/0.248    0.389/0.361    0.705/0.709    0.704/0.604
      kgCVAE (greedy)      16.02 (13.08)     0.412/0.411    0.350/0.356    0.310/0.318    0.262/0.272    0.373/0.336    0.711/0.712    0.721/0.598
      Note: BLEU scores are normalized into [0, 1] so that they are valid precision and recall distance functions.
  • 21. Qualitative Analysis
      Topic: Recycling
      Context: A: are they doing a lot of recycling out in Georgia?
      Target (statement): well at my workplace we have places for aluminium cans
      Baseline + Sampling:
        1. well I’m a graduate student and have two kids.
        2. well I was in last year and so we’ve had lots of recycling.
        3. I’m not sure.
        4. well I don’t know I just moved here in new york.
      kgCVAE + Greedy:
        1. (non-understand) pardon.
        2. (statement) oh you’re not going to have a curbside pick up here.
        3. (statement) okay I am sure about a recycling center.
        4. (yes-answer) yeah so.
  • 22. Latent Space Visualization ● Visualization of the posterior Z on the test dataset in 2D space using t-SNE. ● Different colors are assigned to the 8 most frequent dialog acts. ● The size of each circle represents the response length. ● The latent space exhibits clear clustering of responses with respect to dialog act.
  • 23. The Effect of BOW Loss ● Same setup on Penn Treebank for language modeling as [Bowman et al., 2015]. ● Compare 4 setups: 1. Standard VAE 2. KL Annealing (KLA) 3. BOW 4. BOW + KLA ● Goal: low reconstruction loss + small but non-trivial KL cost
      Model      Perplexity   KL Cost
      Standard   122.0        0.05
      KLA        111.5        2.02
      BOW        97.72        7.41
      BOW+KLA    73.04        15.94
  • 24. KL Cost during Training ● Standard model suffers from vanishing latent variable. ● KLA requires early stopping. ● BOW leads to stable convergence with/without KLA. ● The same trend is observed on CVAE.
  • 25. Conclusion and Future Work ● Identify the ONE-TO-MANY nature of open-domain dialog modeling. ● Propose two novel models based on latent variable models for generating diverse yet appropriate responses. ● Explore further in the direction of leveraging both past linguistic findings and deep models for controllability and explainability. ● Utilize crowdsourcing to yield more robust evaluation. Code available here: https://github.com/snakeztc/NeuralDialog-CVAE
  • 27. References
      1. Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, and Bill Dolan. 2016a. A persona-based neural conversation model. arXiv preprint arXiv:1603.06155.
      2. Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, and Bill Dolan. 2015. A diversity-promoting objective function for neural conversation models. arXiv preprint arXiv:1510.03055.
      3. Samuel R Bowman, Luke Vilnis, Oriol Vinyals, Andrew M Dai, Rafal Jozefowicz, and Samy Bengio. 2015. Generating sentences from a continuous space. arXiv preprint arXiv:1511.06349.
      4. Diederik P Kingma and Max Welling. 2013. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.
  • 28. Training Details
      Word Embedding: 200, GloVe pre-trained on Twitter
      Utterance Encoder Hidden Size: 300
      Context Encoder Hidden Size: 600
      Response Decoder Hidden Size: 400
      Latent Z Size: 200
      Context Window Size: 10 utterances
      Optimizer: Adam, learning rate = 0.001
  • 29. Testset Creation ● Use 10-nearest-neighbour retrieval to collect similar contexts from the training data ● Two human annotators label the appropriateness of the 10 responses for a subset of the contexts ● Bootstrap via SVM to the whole test set (5481 context/response pairs) ● Result: an average of 6.79 reference responses per context ● Average number of distinct reference dialog acts: 4.2