Generating Diverse and Consistent QA pairs
from Contexts with Information-Maximizing
Hierarchical Conditional VAEs
Dong Bok Lee¹*, Seanie Lee¹*, Woo Tae Jung², Donghwan Kim² and Sung Ju Hwang¹˒³
¹KAIST, Daejeon, South Korea
²42MARU, Seoul, South Korea
³AITRICS, Seoul, South Korea
Data Scarcity in Question Answering (QA)
One of the most crucial challenges in QA is the scarcity of labeled data, since it is costly to obtain QA pairs for a target text domain through human annotation.
New domain (to be labeled by a human annotator):
Coronavirus (COVID-19)
Coronaviruses are a group of related RNA
viruses that cause diseases in mammals and birds.
In humans, these viruses cause respiratory tract
infections that can range from mild to lethal. Mild
illnesses include some cases of the common
cold (which is also caused by other …
QA Annotation
Q: When did the COVID-19 occur?
A: 2019
Neural Question Generation (NQG)
Neural question (and answer) generation is a popular approach to overcoming this scarcity.
Neural Networks
Input (paragraph)
Philadelphia has more murals than any other
u.s. city, thanks in part to the 1984 creation
of the department of recreation’s mural arts
program, . . . The program has funded more
than 2,800 murals
Q1: which city has more murals than any other city?
A1: Philadelphia
Q2: why Philadelphia has more murals?
A2: the 1984 creation of the department of
recreation’s mural arts program
Q3: when did the department of recreation’s mural
arts program start?
A3: 1984
Q4: how many murals funded the graffiti arts program
by the department of recreation?
A4: more than 2,800
Seq2Seq + attention + copy
Diversity: One-to-many Problem
Existing systems have overlooked that QA generation (QAG) is essentially a one-to-many problem.
Neural Networks
Input (paragraph)
Philadelphia has more murals than any other
u.s. city, thanks in part to the 1984 creation
of the department of recreation’s mural arts
program, . . . The program has funded more
than 2,800 murals
Q1: which city has more murals than any other city?
A1: Philadelphia
Q2: why Philadelphia has more murals?
A2: the 1984 creation of the department of
recreation’s mural arts program
Q3: when did the department of recreation’s mural
arts program start?
A3: 1984
Q4: how many murals funded the graffiti arts program
by the department of recreation?
A4: more than 2,800
Consistency: Answerability of Question
Existing systems impose no constraint on the consistency between a question and its answer.
Neural Networks
Input (paragraph)
Philadelphia has more murals than any other
u.s. city, thanks in part to the 1984 creation
of the department of recreation’s mural arts
program, . . . The program has funded more
than 2,800 murals
Q1: which city has more murals than any other city?
A1: Philadelphia
Q2: why Philadelphia has more murals?
A2: the 1984 creation of the department of
recreation’s mural arts program
Q3: when did the department of recreation’s mural
arts program start?
A3: 1984
Q4: how many murals funded the graffiti arts program
by the department of recreation?
A4: more than 2,800
Info-Maximizing Hierarchical Conditional VAEs
To overcome these challenges, we propose Info-HCVAE for QA pair generation.
Diversity -> deep latent variable model (HCVAE)
+
Consistency -> mutual information maximization
Derivation of ELBO
Formally, our goal is to learn the conditional joint distribution of a question x and an answer y given a context c:
Question
which city has more
murals than any other city?
Answer
Philadelphia
Context (paragraph)
Philadelphia has more murals than
any other u.s. city, thanks in part to
the 1984 creation of the department
of recreation’s mural arts
program, . . . The program has
funded more than 2,800 murals
$x, y \sim p(x, y \mid c)$
Derivation of ELBO
Here, we introduce separate latent spaces for the question and the answer:
$z_x$ (question): continuous R.V. (e.g., Gaussian distribution)
$z_y$ (answer): discrete R.V. (e.g., categorical distribution)
$$p(x, y \mid c) = \int_{z_x} \sum_{z_y} p(x \mid z_x, y, c)\, p(y \mid z_x, z_y, c)\, p(z_y \mid z_x, c)\, p(z_x \mid c)\, dz_x$$
Derivation of ELBO
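We then use variational posteriors to maximize the following Evidence Lower Bound:
$$\log p_\theta(x, y \mid c) \ge \mathbb{E}_{z_x \sim q_\phi(z_x \mid x, c)}[\log p_\theta(x \mid z_x, y, c)]$$
$$\quad + \mathbb{E}_{z_y \sim q_\phi(z_y \mid z_x, y, c)}[\log p_\theta(y \mid z_y, c)]$$
$$\quad - D_{\mathrm{KL}}[q_\phi(z_y \mid z_x, y, c) \,\|\, p_\psi(z_y \mid z_x, c)]$$
$$\quad - D_{\mathrm{KL}}[q_\phi(z_x \mid x, c) \,\|\, p_\psi(z_x \mid c)]$$
$$\quad =: \mathcal{L}_{\mathrm{HCVAE}}$$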
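As a rough illustration of how the four terms of the bound combine, the sketch below evaluates a single-sample estimate of it. It assumes a diagonal Gaussian for the question latent variable and a categorical distribution for the answer latent variable (the slides only give these as examples), uses the standard closed-form KL divergences, and plugs in made-up numbers for the two reconstruction log-likelihoods. All function names are hypothetical and not taken from the authors' code.

```python
import math

def kl_diag_gaussians(mu_q, var_q, mu_p, var_p):
    """KL[N(mu_q, diag(var_q)) || N(mu_p, diag(var_p))], summed over dimensions."""
    total = 0.0
    for mq, vq, mp, vp in zip(mu_q, var_q, mu_p, var_p):
        total += 0.5 * (math.log(vp / vq) + (vq + (mq - mp) ** 2) / vp - 1.0)
    return total

def kl_categoricals(q, p, eps=1e-12):
    """KL[Cat(q) || Cat(p)] for two probability vectors of equal length."""
    return sum(qi * math.log((qi + eps) / (pi + eps))
               for qi, pi in zip(q, p) if qi > 0.0)

def elbo(log_px, log_py, kl_zy, kl_zx):
    """Reconstruction terms minus the two KL terms of the bound."""
    return log_px + log_py - kl_zy - kl_zx

# Toy values (assumptions for illustration only).
kl_zx = kl_diag_gaussians([0.5, -0.2], [0.8, 1.1], [0.0, 0.0], [1.0, 1.0])
kl_zy = kl_categoricals([0.7, 0.2, 0.1], [1 / 3, 1 / 3, 1 / 3])
print(elbo(log_px=-12.3, log_py=-2.1, kl_zy=kl_zy, kl_zx=kl_zx))
```

Both KL terms are non-negative and vanish only when the posterior matches the prior, so they act as regularizers on the two latent spaces.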
Derivation of ELBO
After training, the generative process of HCVAE is as follows:
1. Sample question L.V.: $z_x \sim p_\psi(z_x \mid c)$
2. Sample answer L.V.: $z_y \sim p_\psi(z_y \mid z_x, c)$
3. Generate an answer: $y \sim p_\theta(y \mid z_y, c)$
4. Generate a question: $x \sim p_\theta(x \mid z_x, y, c)$
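The four sampling steps can be sketched as ancestral sampling. The code below is only a toy: every network is replaced by a hypothetical stand-in function (a fixed Gaussian prior, a softmax over made-up logits, a span picker, and a question template), so it shows the order of the dependencies rather than the actual model.

```python
import math
import random

def prior_zx(context):
    """Hypothetical stand-in for p_psi(z_x | c): a 2-d standard Gaussian."""
    return [0.0, 0.0], [1.0, 1.0]  # mean, standard deviation

def prior_zy(z_x, context, num_categories=3):
    """Hypothetical stand-in for p_psi(z_y | z_x, c): softmax over toy logits."""
    logits = [z_x[0] * k for k in range(num_categories)]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_answer(z_y, context):
    """Hypothetical stand-in for p_theta(y | z_y, c): pick a two-token span."""
    tokens = context.split()
    start = min(2 * z_y, len(tokens) - 1)
    return " ".join(tokens[start:start + 2])

def decode_question(z_x, answer, context):
    """Hypothetical stand-in for p_theta(x | z_x, y, c): fill a template."""
    wh_word = "what" if z_x[0] >= 0.0 else "which"
    return f"{wh_word} does the context say about '{answer}'?"

def generate_qa(context, rng):
    mean, std = prior_zx(context)
    z_x = [rng.gauss(m, s) for m, s in zip(mean, std)]           # step 1
    probs = prior_zy(z_x, context)
    z_y = rng.choices(range(len(probs)), weights=probs, k=1)[0]  # step 2
    answer = decode_answer(z_y, context)                         # step 3
    question = decode_question(z_x, answer, context)             # step 4
    return question, answer

rng = random.Random(0)
context = "philadelphia has more murals than any other u.s. city"
for _ in range(3):
    print(generate_qa(context, rng))
```

Calling `generate_qa` repeatedly on the same context draws new latent variables each time, which is what makes the generation one-to-many.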
Hierarchical Conditional VAEs
This is the overall description of our specific implementation for HCVAE.
Hierarchical Conditional VAEs
Conditional posterior / prior networks map the input into the latent space.
Posterior: 3 Bi-LSTMs, 2 MLPs, attention
Prior: 1 Bi-LSTM, 1 MLP
Context: At the end of the main drive, is a simple, modern stone statue of mary.
Question: what is at the end of the main drive?
[Figure (a) Conditional Posterior / Prior Networks: Bi-LSTM encoders and MLPs over the context, question, and answer]
Hierarchical Conditional VAEs
The answer generation (AG) network generates answer spans from the categorical L.V.
Answer Generation: heuristic matching layer from NLI [Mou2016], Bi-LSTM, 2 linear layers to predict answer spans
Context: At the end of the main drive, is a simple, modern stone statue of mary.
Answer: modern stone statue of mary
[Figure (b) Answer Generation Network: heuristic matching, Bi-LSTM decoder, answer]
[Mou2016] Lili Mou et al., Natural language inference by tree-based convolution and heuristic matching, ACL 2016
Hierarchical Conditional VAEs
The question generation (QG) network generates a question from the answer and the continuous L.V.
Question Generation: Bi-LSTM with gated self-attention [Wang2017], Luong attention [Luong2015], LSTM, linear layer to predict words, Maxout copy mechanism [Zhao2018]
Context: At the end of the main drive, is a simple, modern stone statue of mary.
Question: what is at the end of the main drive?
[Figure (c) Question Generation Network: Bi-LSTM encoder, attention, LSTM decoder with copy, question]
[Wang2017] Wenhui Wang et al., Gated self-matching networks for reading comprehension and question answering, ACL 2017
[Luong2015] Thang Luong et al., Effective approaches to attention-based neural machine translation, EMNLP 2015
[Zhao2018] Yao Zhao et al., Paragraph-level neural question generation with maxout pointer and gated self-attention networks, EMNLP 2018
Mutual Information Maximization
Semantic consistency of QA pairs is important.
Context: ...during the age of enlightenment, philosophers such as
john locke advocated the principle in their writings, whereas others,
such as thomas hobbes, strongly opposed it.
Ground Truth: who was an advocate of separation of powers?
Generated: who opposed the principle of enlightenment?
[Zhang 2019] Zhang and Bansal, Addressing Semantic Drift in Question Generation for Semi-Supervised Question Answering, EMNLP2019
Mutual Information Maximization
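Maximizing the mutual information of QA pairs: Intractable
$$I(X; Y) = D_{\mathrm{KL}}(\mathbb{P}_{XY} \,\|\, \mathbb{P}_X \otimes \mathbb{P}_Y) = \int_{\mathcal{X} \times \mathcal{Y}} \log \frac{d\mathbb{P}_{XY}}{d(\mathbb{P}_X \otimes \mathbb{P}_Y)}\, d\mathbb{P}_{XY}$$
$$\ge \mathbb{E}_{\mathbb{P}_{XY}}[T_\omega(x, y)] - \log \mathbb{E}_{\mathbb{P}_X \otimes \mathbb{P}_Y}[\exp(T_\omega(x, y))]$$
$T_\omega : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ is a discriminator
[Belghazi 2018] Belghazi et al., MINE: Mutual Information Neural Estimation, ICML 2018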
Mutual Information Maximization
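Alternative estimator: Jensen-Shannon divergence
$$\mathrm{MI}(X, Y) \ge \mathbb{E}_{\mathbb{P}_{XY}}[\log \sigma(T_\omega(x, y))] + \mathbb{E}_{\mathbb{P}_X \otimes \mathbb{P}_Y}[\log(1 - \sigma(T_\omega(x', y)))], \qquad \sigma(x) = \frac{1}{1 + \exp(-x)}$$
[Hjelm 2019] Hjelm et al., Learning deep representations by mutual information estimation and maximization, ICLR 2019
Mutual Information Maximization
Following Yeh and Chen (2019), we maximize the MI of QA pairs as follows:
$$\mathrm{MI}(X, Y) \ge \mathbb{E}_{x, y \sim \mathbb{P}}[\log g(x, y)] + \frac{1}{2} \mathbb{E}_{\tilde{x}, y \sim \mathbb{N}}[\log(1 - g(\tilde{x}, y))] + \frac{1}{2} \mathbb{E}_{x, \tilde{y} \sim \mathbb{N}}[\log(1 - g(x, \tilde{y}))] =: \mathcal{L}_{\mathrm{Info}}$$
$$g(x, y) = \mathrm{sigmoid}(\bar{x}^{\top} W \bar{y})$$
$\bar{x}$, $\bar{y}$: average pooling of hidden states
$(\tilde{x}, y)$, $(x, \tilde{y}) \sim \mathbb{N}$: negative examples
[Yeh 2019] Yeh and Chen, QAInfomax: Learning Robust Question Answering System by Mutual Information Maximization, EMNLP 2019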
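A minimal sketch of this consistency objective for one QA pair is given below. It assumes a bilinear discriminator that applies a sigmoid to the product of the average-pooled question states, a weight matrix W, and the average-pooled answer states, with one negative question and one negative answer per pair. The vectors, the matrix, and all function names are made up for illustration.

```python
import math

def mean_pool(hidden_states):
    """Average a list of equally sized hidden-state vectors."""
    dim = len(hidden_states[0])
    count = len(hidden_states)
    return [sum(h[d] for h in hidden_states) / count for d in range(dim)]

def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0.0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def g(x_bar, y_bar, W):
    """Bilinear score sigmoid(x_bar^T W y_bar)."""
    Wy = [sum(W[i][j] * y_bar[j] for j in range(len(y_bar)))
          for i in range(len(x_bar))]
    return sigmoid(sum(x_bar[i] * Wy[i] for i in range(len(x_bar))))

def info_objective(x_bar, y_bar, x_neg, y_neg, W, eps=1e-12):
    """One-sample estimate: positive pair plus two half-weighted negatives."""
    positive = math.log(g(x_bar, y_bar, W) + eps)
    negative_question = math.log(1.0 - g(x_neg, y_bar, W) + eps)
    negative_answer = math.log(1.0 - g(x_bar, y_neg, W) + eps)
    return positive + 0.5 * negative_question + 0.5 * negative_answer

W = [[1.0, 0.0], [0.0, 1.0]]                   # toy weight matrix
question = mean_pool([[1.0, 0.0], [3.0, 0.0]])  # pooled question states
answer = [2.0, 0.0]                             # pooled answer states
print(info_objective(question, answer, [-2.0, 0.0], [-2.0, 0.0], W))
```

The objective is largest when the true pair scores close to 1 and both negative pairs score close to 0, which is the sense in which it rewards consistent QA pairs.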
Training Info-HCVAE
We maximize the following objective with Monte Carlo estimators:
$$\max_{\Theta}\; \mathcal{L}_{\mathrm{HCVAE}} + \lambda \mathcal{L}_{\mathrm{Info}}, \qquad \Theta := \{\phi, \psi, \theta, W\}$$
Experimental Setup
1) Tasks and Evaluation Metric
• QA generation: QA-based Evaluation, Reverse QA-based Evaluation
• Semi-supervised QA: F1 and Exact Match (EM)
2) Data
• HarvestingQA (Du and Cardie, 2018)
• SQuAD
• Natural Questions
• TriviaQA
Experimental Setup
3) Baselines
• Harvesting-QG [Du 2018]: Seq2Seq + attention + copy
• Maxout-QG [Zhao 2018]: Seq2Seq + attention + Maxout copy + BERT encoder
• Semantic-QG [Zhang 2019]: Seq2Seq + attention + Maxout copy + BERT encoder + Reinforcement
[Du 2018] Du and Cardie, Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia, NAACL 2018
[Zhao 2018] Zhao et al., Paragraph-level Neural Question Generation with Maxout Pointer and Gated Self-attention Networks, EMNLP 2018
[Zhang 2019] Zhang and Bansal, Addressing Semantic Drift in Question Generation for Semi-Supervised Question Answering, EMNLP 2019
QA-based Evaluation (QAE)
Train a QA model on synthetic data and measure its F1 and EM on annotated data.
Train: synthetic data -> Test: annotated data
[Zhang 2019] Zhang and Bansal, Addressing Semantic Drift in Question Generation for Semi-Supervised Question Answering, EMNLP 2019
Reverse QA-based Evaluation (R-QAE)
Accuracy (F1, EM) of a QA model trained on annotated data and evaluated on synthetic QA pairs.
Train: annotated data -> Test: synthetic data
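Both QAE and R-QAE report Exact Match (EM) and F1. The slides do not spell out how these are computed, so the sketch below assumes the usual SQuAD-style convention: answers are lowercased, stripped of punctuation and English articles, and then compared exactly (EM) or by token overlap (F1). The function names are illustrative.

```python
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def f1_score(prediction, gold):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The 1984", "1984"))
print(f1_score("more than 2,800", "2,800"))
```

For QAE the predictions come from a QA model trained on synthetic pairs and are scored against human annotations; for R-QAE the roles are swapped, and the generated answers serve as the reference.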
QA Generation on SQuAD
SQuAD (EM / F1)
Method          QAE (↑)          R-QAE (↓)
Harvesting-QG   55.11 / 66.40    66.77 / 77.85
Maxout-QG       56.08 / 67.50    62.49 / 78.24
Semantic-QG     60.49 / 71.91    74.23 / 88.54
HCVAE           69.46 / 80.79    37.57 / 61.24
Info-HCVAE      71.18 / 81.51    38.80 / 60.73
QA Generation on SQuAD
More data-efficient than the other baselines.
QA Generation on Natural Questions
Natural Questions (EM / F1)
Method          QAE (↑)          R-QAE (↓)
Harvesting-QG   27.91 / 41.23    49.89 / 70.01
Maxout-QG       30.98 / 44.96    49.96 / 70.03
Semantic-QG     30.59 / 45.29    58.42 / 79.23
HCVAE           31.45 / 46.77    32.78 / 55.12
Info-HCVAE      37.18 / 51.46    29.39 / 53.04
QA Generation on TriviaQA
TriviaQA (EM / F1)
Method          QAE (↑)          R-QAE (↓)
Harvesting-QG   21.32 / 30.21    29.75 / 47.73
Maxout-QG       24.58 / 34.32    31.56 / 49.92
Semantic-QG     27.54 / 38.25    37.45 / 58.15
HCVAE           30.20 / 40.88    34.41 / 48.16
Info-HCVAE      35.45 / 44.11    21.65 / 37.65
QA Generation Examples (One-to-Many)
We sample the question latent variable multiple times with a fixed answer: $z_x \sim p_\psi(z_x \mid c)$
Input (paragraph and answer)
The Scotland act 1998 which was passed by and given royal assent by queen Elizabeth ii on 19 November 1998, governs functions and role of the Scottish parliament and delimits its legislative competence . . .
Q1: which act was passed in 1998?
Q2: which act governs role of the Scottish parliament?
Q3: which act was passed by queen Elizabeth ii?
Q4: which act gave the Scottish parliament the responsibility to determine its legislative policy?
QA Generation Examples (Latent Interpolation)
We generate QA pairs by interpolating between two latent codes.
Input (paragraph and answer)
Atop the main building’s gold dome is a golden statue of the virgin mary. ... Next to the main building is the basilica of the sacred heart. Immediately behind the basilica is the grotto, ...a marian place of prayer and reflection. ... At the end of the main drive ..., is a simple, modern stone statue of mary.
Interpolation
Q1: where is the grotto at?
A1: A marian place of prayer and reflection
Q2: what place is behind the basilica of prayer?
A2: grotto
Q3: what is next to the main building at notre dame?
A3: the basilica of the sacred heart
Q4: what is at the end of the main drive?
A4: stone statue of mary
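Interpolating between two latent codes can be sketched as below. The slides do not state the interpolation scheme or which latent variable is interpolated, so this assumes plain linear interpolation between two continuous codes; the two endpoint vectors are made up, and each intermediate code would then be passed to the decoders to produce one QA pair.

```python
def interpolate(z_start, z_end, num_steps):
    """Return num_steps codes evenly spaced on the line from z_start to z_end."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    codes = []
    for i in range(num_steps):
        t = i / (num_steps - 1)  # 0.0 at the first code, 1.0 at the last
        codes.append([(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return codes

# Toy endpoints (assumptions); four steps mirror the four QA pairs above.
for code in interpolate([0.0, 0.0], [3.0, -3.0], num_steps=4):
    print(code)
```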
Semi-supervised QA
Sample 10 different QA pairs for a single paragraph from target datasets
(+S×10, +N×10, +T×10)
Info-HCVAE
Input (paragraph)
Philadelphia has more murals than any other
u.s. city, thanks in part to the 1984 creation
of the department of recreation’s mural arts
program, . . . The program has funded more
than 2,800 murals
Q1: which city has more murals than any other city?
A1: Philadelphia
Q2: why Philadelphia has more murals?
A2: the 1984 creation of the department of
recreation’s mural arts program
...
Q9: when did the department of recreation’s mural
arts program start?
A9: 1984
Q10: how many murals funded the graffiti arts program
by the department of recreation?
A10: more than 2,800
Semi-supervised QA
Generate a QA pair for each paragraph from HarvestingQA dataset
(+H×10% ~ +H×100%)
Info-HCVAE
Input (paragraph)
… The typical division is into
three branches: a legislature,
an executive, and a judiciary,
which is the trias politica model
Q: what are the three branches of the
government ?
A: a legislature, an executive, and
a judiciary,
Semi-supervised QA on SQuAD
Data                     EM               F1
SQuAD                    80.25            88.23
Semantic-QG (baseline)
+S×10                    81.20 (+0.95)    88.36 (+0.13)
+H×100%                  81.03 (+0.78)    88.79 (+0.56)
+S×10 + H×100%           81.44 (+1.19)    88.72 (+0.49)
Info-HCVAE
+S×10                    82.09 (+1.84)    89.11 (+0.88)
+H×100%                  82.37 (+2.12)    89.63 (+1.40)
+S×10 + H×100%           82.19 (+1.94)    89.84 (+1.59)
Semi-supervised QA on Natural Questions
Data                 EM               F1
SQuAD                42.77            57.29
+N×1                 46.70 (+3.94)    61.08 (+3.79)
+N×2                 46.95 (+4.19)    61.34 (+4.05)
+N×3                 47.73 (+4.96)    61.98 (+4.69)
+N×5                 48.19 (+5.42)    62.21 (+4.92)
+N×10                48.44 (+5.67)    62.69 (+5.40)
Natural Questions    61.55            73.91
Semi-supervised QA on TriviaQA
Data        EM               F1
SQuAD       48.96            57.98
+T×1        49.65 (+0.69)    59.13 (+1.21)
+T×2        50.01 (+1.05)    59.08 (+1.10)
+T×3        49.71 (+0.75)    59.49 (+1.51)
+T×5        50.14 (+1.18)    59.21 (+1.23)
+T×10       49.65 (+0.69)    59.20 (+1.22)
TriviaQA    64.55            70.42
Conclusion
• We propose a novel probabilistic generative model for one-to-many QA generation.
• By maximizing the mutual information of QA pairs, we improve their semantic consistency.
• Results show that further training a BERT QA model with our generated QA pairs significantly improves its performance.
• A gap remains between semi-supervised and supervised learning because of the discrepancy among domains; we hope future research can close it.
Code available at https://github.com/seanie12/Info-HCVAE
Thank you

Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Genera...
MLAI2
 
Mini-Batch Consistent Slot Set Encoder For Scalable Set Encoding
Mini-Batch Consistent Slot Set Encoder For Scalable Set EncodingMini-Batch Consistent Slot Set Encoder For Scalable Set Encoding
Mini-Batch Consistent Slot Set Encoder For Scalable Set Encoding
MLAI2
 
Task Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningTask Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive Learning
MLAI2
 
Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint L...
Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint L...Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint L...
Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint L...
MLAI2
 
Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning
Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-LearningMeta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning
Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning
MLAI2
 
Accurate Learning of Graph Representations with Graph Multiset Pooling
Accurate Learning of Graph Representations with Graph Multiset PoolingAccurate Learning of Graph Representations with Graph Multiset Pooling
Accurate Learning of Graph Representations with Graph Multiset Pooling
MLAI2
 
Contrastive Learning with Adversarial Perturbations for Conditional Text Gene...
Contrastive Learning with Adversarial Perturbations for Conditional Text Gene...Contrastive Learning with Adversarial Perturbations for Conditional Text Gene...
Contrastive Learning with Adversarial Perturbations for Conditional Text Gene...
MLAI2
 
Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Le...
Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Le...Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Le...
Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Le...
MLAI2
 
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and ArchitecturesMetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MLAI2
 
Adversarial Self-Supervised Contrastive Learning
Adversarial Self-Supervised Contrastive LearningAdversarial Self-Supervised Contrastive Learning
Adversarial Self-Supervised Contrastive Learning
MLAI2
 
Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...
Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...
Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...
MLAI2
 
Neural Mask Generator : Learning to Generate Adaptive Word Maskings for Langu...
Neural Mask Generator : Learning to Generate Adaptive WordMaskings for Langu...Neural Mask Generator : Learning to Generate Adaptive WordMaskings for Langu...
Neural Mask Generator : Learning to Generate Adaptive Word Maskings for Langu...
MLAI2
 
Cost-effective Interactive Attention Learning with Neural Attention Process
Cost-effective Interactive Attention Learning with Neural Attention ProcessCost-effective Interactive Attention Learning with Neural Attention Process
Cost-effective Interactive Attention Learning with Neural Attention Process
MLAI2
 

Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs

  • 1. Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs. Dong Bok Lee¹*, Seanie Lee¹*, Woo Tae Jung², Donghwan Kim², and Sung Ju Hwang¹,³. ¹KAIST, Daejeon, South Korea; ²42MARU, Seoul, South Korea; ³AITRICS, Seoul, South Korea.
  • 2. Data Scarcity in Question Answering (QA). One of the most crucial challenges in QA is the scarcity of labeled data, since obtaining QA pairs for a target text domain through human annotation is costly. Example: a human annotator labeling a new domain, Coronavirus (COVID-19).
      Paragraph: Coronaviruses are a group of related RNA viruses that cause diseases in mammals and birds. In humans, these viruses cause respiratory tract infections that can range from mild to lethal. Mild illnesses include some cases of the common cold (which is also caused by other …
      QA annotation: Q: When did the COVID-19 occur? A: 2019
  • 3. Neural Question Generation (NQG). Neural question (and answer) generation is a popular approach to overcome this scarcity. A neural network (Seq2Seq + attention + copy) takes a paragraph as input and produces QA pairs.
      Input (paragraph): Philadelphia has more murals than any other u.s. city, thanks in part to the 1984 creation of the department of recreation’s mural arts program, . . . The program has funded more than 2,800 murals
      Q1: which city has more murals than any other city? A1: Philadelphia
      Q2: why Philadelphia has more murals? A2: the 1984 creation of the department of recreation’s mural arts program
      Q3: when did the department of recreation’s mural arts program start? A3: 1984
      Q4: how many murals funded the graffiti arts program by the department of recreation? A4: more than 2,800
  • 4. Diversity: One-to-many Problem. Existing systems have overlooked that QA generation (QAG) is essentially a one-to-many problem: a single paragraph supports many valid QA pairs.
      Input (paragraph): Philadelphia has more murals than any other u.s. city, thanks in part to the 1984 creation of the department of recreation’s mural arts program, . . . The program has funded more than 2,800 murals
      Q1: which city has more murals than any other city? A1: Philadelphia
      Q2: why Philadelphia has more murals? A2: the 1984 creation of the department of recreation’s mural arts program
      Q3: when did the department of recreation’s mural arts program start? A3: 1984
      Q4: how many murals funded the graffiti arts program by the department of recreation? A4: more than 2,800
  • 5. Consistency: Answerability of Question. Existing systems impose no constraint on the consistency between a generated question and its answer.
      Input (paragraph): Philadelphia has more murals than any other u.s. city, thanks in part to the 1984 creation of the department of recreation’s mural arts program, . . . The program has funded more than 2,800 murals
      Q1: which city has more murals than any other city? A1: Philadelphia
      Q2: why Philadelphia has more murals? A2: the 1984 creation of the department of recreation’s mural arts program
      Q3: when did the department of recreation’s mural arts program start? A3: 1984
      Q4: how many murals funded the graffiti arts program by the department of recreation? A4: more than 2,800
  • 6. Info-Maximizing Hierarchical Conditional VAEs. To overcome these challenges, we propose Info-HCVAE for QA pair generation, which combines two components:
      Diversity -> deep latent variable model (HCVAE)
      Consistency -> mutual information maximization
  • 7. Derivation of ELBO. Formally, our goal is to learn the conditional joint distribution of a question $x$ and an answer $y$ given a context $c$:
      $$x, y \sim p(x, y \mid c)$$
      Context (paragraph): Philadelphia has more murals than any other u.s. city, thanks in part to the 1984 creation of the department of recreation’s mural arts program, . . . The program has funded more than 2,800 murals
      Question: which city has more murals than any other city?
      Answer: Philadelphia
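  • 8. Derivation of ELBO. Here, we introduce separate latent spaces for the question and the answer:
      $$p(x, y \mid c) = \int_{z_x} \sum_{z_y} p(x \mid z_x, y, c) \, p(y \mid z_x, z_y, c) \, p(z_y \mid z_x, c) \, p(z_x \mid c) \, dz_x$$
      The answer latent variable $z_y$ is a discrete random variable (e.g., categorical distribution); the question latent variable $z_x$ is a continuous random variable (e.g., Gaussian distribution).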
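  • 9. Derivation of ELBO. We then use variational posteriors to maximize the following Evidence Lower Bound:
      $$\log p_\theta(x, y \mid c) \ge \mathbb{E}_{z_x \sim q_\phi(z_x \mid x, c)}[\log p_\theta(x \mid z_x, y, c)] + \mathbb{E}_{z_y \sim q_\phi(z_y \mid z_x, y, c)}[\log p_\theta(y \mid z_y, c)] - D_{\mathrm{KL}}[q_\phi(z_y \mid z_x, y, c) \,\|\, p_\psi(z_y \mid z_x, c)] - D_{\mathrm{KL}}[q_\phi(z_x \mid x, c) \,\|\, p_\psi(z_x \mid c)] =: \mathcal{L}_{\mathrm{HCVAE}}$$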
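The two KL terms in the bound have closed forms for the distribution families the slides mention as examples. The following is a minimal sketch, assuming a diagonal Gaussian for the question latent and a single categorical distribution for the answer latent; the function names are hypothetical and the reconstruction log-likelihoods are passed in as plain numbers because the decoders are not shown here.

```python
import math

def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL[N(mu_q, diag(exp(logvar_q))) || N(mu_p, diag(exp(logvar_p)))]."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total

def kl_categorical(q, p, eps=1e-12):
    """KL[q || p] for two probability vectors over the same categories."""
    return sum(
        qi * (math.log(qi + eps) - math.log(pi + eps))
        for qi, pi in zip(q, p)
        if qi > 0.0  # terms with q_i = 0 contribute nothing
    )

def hcvae_elbo(log_px, log_py, kl_zy, kl_zx):
    """Single-sample estimate of the bound: reconstruction terms minus both KLs."""
    return log_px + log_py - kl_zy - kl_zx
```

In practice the posterior and prior parameters would come from the networks described on the later slides; here they are just lists of floats.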
  • 10. Derivation of ELBO. After training, the generative process of HCVAE is as follows:
      1. Sample the question latent variable: $z_x \sim p_\psi(z_x \mid c)$
      2. Sample the answer latent variable: $z_y \sim p_\psi(z_y \mid z_x, c)$
      3. Generate an answer: $y \sim p_\theta(y \mid z_y, c)$
      4. Generate a question: $x \sim p_\theta(x \mid z_x, y, c)$
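The four sampling steps can be sketched as ancestral sampling. This is an illustration only: the `toy_*` functions are hypothetical stand-ins for the trained prior and decoder networks, and the Gaussian and categorical choices follow the "e.g." on the earlier slide rather than any confirmed implementation detail.

```python
import math
import random

def sample_gaussian(mu, logvar, rng):
    """Draw from a diagonal Gaussian via mu + sigma * eps."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]

def sample_categorical(probs, rng):
    """Draw an index from a probability vector by inverting the CDF."""
    r, acc = rng.random(), 0.0
    for k, p in enumerate(probs):
        acc += p
        if r < acc:
            return k
    return len(probs) - 1  # guard against floating-point round-off

def generate_qa(context, prior_zx, prior_zy, answer_decoder, question_decoder, rng):
    mu, logvar = prior_zx(context)                         # 1. z_x ~ p(z_x | c)
    z_x = sample_gaussian(mu, logvar, rng)
    z_y = sample_categorical(prior_zy(z_x, context), rng)  # 2. z_y ~ p(z_y | z_x, c)
    y = answer_decoder(z_y, context)                       # 3. y ~ p(y | z_y, c)
    x = question_decoder(z_x, y, context)                  # 4. x ~ p(x | z_x, y, c)
    return x, y

# Hypothetical toy stand-ins so the sketch runs end to end.
def toy_prior_zx(context):
    return [0.0, 0.0], [0.0, 0.0]

def toy_prior_zy(z_x, context):
    return [1.0 / len(context)] * len(context)

def toy_answer_decoder(z_y, context):
    return context[z_y]

def toy_question_decoder(z_x, y, context):
    return "which word is '" + y + "'?"

context = "at the end of the main drive is a statue of mary".split()
question, answer = generate_qa(
    context, toy_prior_zx, toy_prior_zy, toy_answer_decoder, toy_question_decoder,
    random.Random(0),
)
```

Note that the answer is generated before the question, so the question decoder can condition on it.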
  • 11. Hierarchical Conditional VAEs. This is the overall description of our specific implementation of HCVAE.
  • 12. Hierarchical Conditional VAEs. The conditional posterior and prior networks map the input into the latent space.
      Posterior: 3 Bi-LSTMs, 2 MLPs, attention
      Prior: 1 Bi-LSTM, 1 MLP
      Figure (a) Conditional Posterior / Prior Networks. Inputs: Context ("At the end of the main drive, is a simple, modern stone statue of mary."), Question ("what is at the end of the main drive?"), Answer. Components shown: three Bi-LSTM encoders and two MLPs.
  • 13. Hierarchical Conditional VAEs. The answer generation (AG) network generates answer spans from the categorical latent variable.
      Components: heuristic matching layer from NLI [Mou2016], Bi-LSTM, 2 linear layers to predict answer spans.
      Figure (b) Answer Generation Network. Context: "At the end of the main drive, is a simple, modern stone statue of mary." Answer: "modern stone statue of mary". Components shown: Heuristic Matching, Bi-LSTM Decoder.
      [Mou2016] Lili Mou et al., Natural language inference by tree-based convolution and heuristic matching, ACL 2016
  • 14. Hierarchical Conditional VAEs. The question generation (QG) network generates a question from the answer and the continuous latent variable.
      Components: Bi-LSTM with gated self-attention [Wang2017], Luong attention [Luong2015], LSTM, linear layer to predict words, maxout copy mechanism [Zhao2018].
      Figure (c) Question Generation Network. Context: "At the end of the main drive, is a simple, modern stone statue of mary." Question: "what is at the end of the main drive?" Components shown: Bi-LSTM Encoder, Attention, LSTM Decoder, copy.
      [Wang2017] Wenhui Wang et al., Gated self-matching networks for reading comprehension and question answering, ACL 2017
      [Luong2015] Thang Luong et al., Effective approaches to attention-based neural machine translation, EMNLP 2015
      [Zhao2018] Yao Zhao et al., Paragraph-level neural question generation with maxout pointer and gated self-attention networks, EMNLP 2018
  • 15. Mutual Information Maximization. Semantic consistency of QA pairs is important.
      Context: ...during the age of enlightenment, philosophers such as john locke advocated the principle in their writings, whereas others, such as thomas hobbes, strongly opposed it.
      Ground truth: who was an advocate of separation of powers?
      Generated: who opposed the principle of enlightenment?
      [Zhang 2019] Zhang and Bansal, Addressing Semantic Drift in Question Generation for Semi-Supervised Question Answering, EMNLP 2019
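  • 16. Mutual Information Maximization. We want to maximize the mutual information of QA pairs. The mutual information itself is intractable, so it is bounded from below [Belghazi 2018]:
      $$I(X; Y) = D_{\mathrm{KL}}(\mathbb{P}_{XY} \,\|\, \mathbb{P}_X \otimes \mathbb{P}_Y) = \int_{\mathcal{X} \times \mathcal{Y}} \log \frac{d\mathbb{P}_{XY}}{d(\mathbb{P}_X \otimes \mathbb{P}_Y)} \, d\mathbb{P}_{XY} \ge \mathbb{E}_{\mathbb{P}_{XY}}[T_\omega(x, y)] - \log \mathbb{E}_{\mathbb{P}_X \otimes \mathbb{P}_Y}[\exp(T_\omega(x, y))]$$
      where $T_\omega : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ is a discriminator.
      [Belghazi 2018] Belghazi et al., MINE: Mutual Information Neural Estimation, ICML 2018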
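The right-hand side of the bound can be estimated from finite samples of discriminator scores. The sketch below is a plain-Python illustration, not the MINE reference code: it assumes the scores on joint (paired) samples and on product-of-marginals (shuffled) samples have already been computed, and `dv_lower_bound` is a hypothetical name.

```python
import math

def dv_lower_bound(joint_scores, marginal_scores):
    """Estimate E_joint[T] - log E_marginals[exp(T)] from lists of scores T(x, y)."""
    mean_joint = sum(joint_scores) / len(joint_scores)
    # Log-mean-exp with the maximum factored out for numerical stability.
    m = max(marginal_scores)
    log_mean_exp = m + math.log(
        sum(math.exp(s - m) for s in marginal_scores) / len(marginal_scores)
    )
    return mean_joint - log_mean_exp
```

A discriminator that scores paired samples higher than shuffled ones yields a larger estimate; one that scores everything equally yields zero.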
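  • 17. Mutual Information Maximization. Alternative estimator based on the Jensen-Shannon divergence [Hjelm 2019]:
      $$MI(X, Y) \ge \mathbb{E}_{\mathbb{P}_{XY}}[\log \sigma(T_\omega(x, y))] + \mathbb{E}_{\mathbb{P}_X \otimes \mathbb{P}_Y}[\log(1 - \sigma(T_\omega(x', y)))], \qquad \sigma(x) = \frac{1}{1 + \exp(-x)}$$
      [Hjelm 2019] Hjelm et al., Learning deep representations by mutual information estimation and maximization, ICLR 2019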
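  • 18. Mutual Information Maximization. Following Yeh and Chen (2019), we maximize the MI of QA pairs as follows:
      $$MI(X, Y) \ge \mathbb{E}_{x, y \sim \mathbb{P}}[\log g(x, y)] + \frac{1}{2} \mathbb{E}_{\tilde{x}, y \sim \mathbb{N}}[\log(1 - g(\tilde{x}, y))] + \frac{1}{2} \mathbb{E}_{x, \tilde{y} \sim \mathbb{N}}[\log(1 - g(x, \tilde{y}))] =: \mathcal{L}_{\mathrm{Info}}$$
      $$g(x, y) = \mathrm{sigmoid}(\bar{x}^{\top} W \bar{y})$$
      Here $\bar{x}$ and $\bar{y}$ are obtained by average pooling of hidden states, and $\mathbb{N}$ denotes the negative examples.
      [Yeh 2019] Yeh and Chen, QAInfomax: Learning Robust Question Answering System by Mutual Information Maximization, EMNLP 2019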
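A minimal sketch of this objective with the bilinear score $g(x, y) = \mathrm{sigmoid}(\bar{x}^{\top} W \bar{y})$, written in plain Python so that each term is visible. How negative questions and answers are drawn is not specified on the slide, so the negatives are simply passed in; all function names are hypothetical.

```python
import math

def mean_pool(hidden_states):
    """Average a list of equally sized hidden-state vectors into one vector."""
    n = len(hidden_states)
    return [sum(column) / n for column in zip(*hidden_states)]

def sigmoid(t):
    # Numerically stable for large |t|.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def g(x_bar, W, y_bar):
    """Bilinear score sigmoid(x_bar^T W y_bar); W is a list of rows."""
    Wy = [sum(w * y for w, y in zip(row, y_bar)) for row in W]
    return sigmoid(sum(x * v for x, v in zip(x_bar, Wy)))

def info_objective(positives, negative_questions, negative_answers, W, eps=1e-12):
    """Each argument is a list of (x_bar, y_bar) pairs of pooled vectors.

    positives: matched QA pairs; negative_questions: (x_tilde, y) pairs;
    negative_answers: (x, y_tilde) pairs.
    """
    pos = sum(math.log(g(x, W, y) + eps) for x, y in positives) / len(positives)
    neg_q = sum(math.log(1.0 - g(x, W, y) + eps) for x, y in negative_questions) / len(negative_questions)
    neg_a = sum(math.log(1.0 - g(x, W, y) + eps) for x, y in negative_answers) / len(negative_answers)
    return pos + 0.5 * neg_q + 0.5 * neg_a
```

The small `eps` inside the logarithms is an added safeguard against `log(0)` and is not part of the formula on the slide.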
  • 19. Training Info-HCVAE. We maximize the following objective with Monte Carlo estimators:
      $$\max_{\Theta} \; \mathcal{L}_{\mathrm{HCVAE}} + \lambda \mathcal{L}_{\mathrm{Info}}, \qquad \Theta := \{\phi, \psi, \theta, W\}$$
  • 20. Experimental Setup.
      1) Tasks and evaluation metrics
         - QA generation: QA-based Evaluation, Reverse QA-based Evaluation
         - Semi-supervised QA: F1 and Exact Match (EM)
      2) Data
         - HarvestingQA (Du and Cardie, 2018)
         - SQuAD
         - Natural Questions
         - TriviaQA
  • 21. Experimental Setup.
      3) Baselines
         - Harvesting-QG [Du 2018]: Seq2Seq + attention + copy
         - Maxout-QG [Zhao 2018]: Seq2Seq + attention + Maxout copy + BERT encoder
         - Semantic-QG [Zhang 2019]: Seq2Seq + attention + Maxout copy + BERT encoder + Reinforcement
      [Du 2018] Du and Cardie, Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia, NAACL 2018
      [Zhao 2018] Paragraph-level Neural Question Generation with Maxout Pointer and Gated Self-attention Networks, EMNLP 2018
      [Zhang 2019] Zhang and Bansal, Addressing Semantic Drift in Question Generation for Semi-Supervised Question Answering, EMNLP 2019
  • 22. QA-based Evaluation (QAE). Train a QA model on synthetic data and measure its F1 and EM on annotated data (train: synthetic data; test: annotated data).
      [Zhang 2019] Zhang and Bansal, Addressing Semantic Drift in Question Generation for Semi-Supervised Question Answering, EMNLP 2019
  • 23. Reverse QA-based Evaluation (R-QAE). Accuracy (F1, EM) of a QA model trained on annotated data and evaluated on synthetic QA pairs (train: annotated data; test: synthetic data).
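Both QAE and R-QAE report Exact Match and F1. The slides do not spell out how these are computed, so the sketch below assumes the common SQuAD-style convention: answers are lowercased, stripped of punctuation and English articles, and F1 is computed over token overlap. The function names are illustrative.

```python
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, reference):
    return float(normalize(prediction) == normalize(reference))

def f1_score(prediction, reference):
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, comparing the prediction "stone statue of mary" with the reference "modern stone statue of mary" gives an exact match of 0 but a high F1, because four of the five reference tokens are recovered.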
  • 24. QA Generation on SQuAD (EM / F1)
      Method          QAE (↑)          R-QAE (↓)
      Harvesting-QG   55.11 / 66.40    66.77 / 77.85
      Maxout-QG       56.08 / 67.50    62.49 / 78.24
      Semantic-QG     60.49 / 71.91    74.23 / 88.54
      HCVAE           69.46 / 80.79    37.57 / 61.24
      Info-HCVAE      71.18 / 81.51    38.80 / 60.73
  • 25. QA Generation on SQuAD. Info-HCVAE is more data-efficient than the other baselines. (Plot not reproduced; the value 61.38 is annotated in the figure.)
  • 26. QA Generation on Natural Questions (EM / F1)
      Method          QAE (↑)          R-QAE (↓)
      Harvesting-QG   27.91 / 41.23    49.89 / 70.01
      Maxout-QG       30.98 / 44.96    49.96 / 70.03
      Semantic-QG     30.59 / 45.29    58.42 / 79.23
      HCVAE           31.45 / 46.77    32.78 / 55.12
      Info-HCVAE      37.18 / 51.46    29.39 / 53.04
  • 27. QA Generation on TriviaQA (EM / F1)
      Method          QAE (↑)          R-QAE (↓)
      Harvesting-QG   21.32 / 30.21    29.75 / 47.73
      Maxout-QG       24.58 / 34.32    31.56 / 49.92
      Semantic-QG     27.54 / 38.25    37.45 / 58.15
      HCVAE           30.20 / 40.88    34.41 / 48.16
      Info-HCVAE      35.45 / 44.11    21.65 / 37.65
  • 28. QA Generation Examples (One-to-Many). We sample the question latent variable multiple times with a fixed answer: $z_x \sim p_\psi(z_x \mid c)$.
      Input (paragraph and answer): The Scotland act 1998 which was passed by and given royal assent by queen Elizabeth ii on 19 November 1998, governs functions and role of the Scottish parliament and delimits its legislative competence . . .
      Q1: which act was passed in 1998?
      Q2: which act governs role of the Scottish parliament?
      Q3: which act was passed by queen Elizabeth ii?
      Q4: which act gave the Scottish parliament the responsibility to determine its legislative policy?
  • 29–33. QA Generation Examples (Latent Interpolation). We generate QA pairs by interpolating between two latent codes. Slides 29 to 33 build up the same example one QA pair at a time.
      Input (paragraph and answer): Atop the main building’s gold dome is a golden statue of the virgin mary. ... Next to the main building is the basilica of the sacred heart. Immediately behind the basilica is the grotto, ...a marian place of prayer and reflection. ... At the end of the main drive ..., is a simple, modern stone statue of mary.
      Q1: where is the grotto at? A1: A marian place of prayer and reflection
      Q2: what place is behind the basilica of prayer? A2: grotto
      Q3: what is next to the main building at notre dame? A3: the basilica of the sacred heart
      Q4: what is at the end of the main drive? A4: stone statue of mary
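The slides do not state which interpolation scheme is used. As an illustration only, the sketch below assumes simple linear interpolation between two continuous latent codes; each intermediate code would then be decoded into a QA pair by the generation networks. The function name is hypothetical.

```python
def interpolate(z_start, z_end, num_steps):
    """Return num_steps codes evenly spaced on the segment from z_start to z_end."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2 to include both endpoints")
    codes = []
    for i in range(num_steps):
        alpha = i / (num_steps - 1)  # 0.0 at z_start, 1.0 at z_end
        codes.append([(1.0 - alpha) * a + alpha * b for a, b in zip(z_start, z_end)])
    return codes
```

Linear mixing only makes sense for continuous codes such as the Gaussian question latent; a categorical latent would need a different treatment.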
  • 34. Semi-supervised QA. With Info-HCVAE, sample 10 different QA pairs for a single paragraph from the target datasets (+S×10, +N×10, +T×10).
      Input (paragraph): Philadelphia has more murals than any other u.s. city, thanks in part to the 1984 creation of the department of recreation’s mural arts program, . . . The program has funded more than 2,800 murals
      Q1: which city has more murals than any other city? A1: Philadelphia
      Q2: why Philadelphia has more murals? A2: the 1984 creation of the department of recreation’s mural arts program
      . . .
      Q9: when did the department of recreation’s mural arts program start? A9: 1984
      Q10: how many murals funded the graffiti arts program by the department of recreation? A10: more than 2,800
  • 35. Semi-supervised QA. With Info-HCVAE, generate a QA pair for each paragraph from the HarvestingQA dataset (+H×10% ~ +H×100%).
      Input (paragraph): … The typical division is into three branches: a legislature, an executive, and a judiciary, which is the trias politica model
      Q: what are the three branches of the government ? A: a legislature, an executive, and a judiciary,
  • 36. Semi-supervised QA on SQuAD
      Data                      EM               F1
      SQuAD                     80.25            88.23
      Semantic-QG (baseline)
        +S×10                   81.20 (+0.95)    88.36 (+0.13)
        +H×100%                 81.03 (+0.78)    88.79 (+0.56)
        +S×10 + H×100%          81.44 (+1.19)    88.72 (+0.49)
      Info-HCVAE
        +S×10                   82.09 (+1.84)    89.11 (+0.88)
        +H×100%                 82.37 (+2.12)    89.63 (+1.40)
        +S×10 + H×100%          82.19 (+1.94)    89.84 (+1.59)
  • 37. Semi-supervised QA on Natural Questions
      Data                EM               F1
      SQuAD               42.77            57.29
      +N×1                46.70 (+3.94)    61.08 (+3.79)
      +N×2                46.95 (+4.19)    61.34 (+4.05)
      +N×3                47.73 (+4.96)    61.98 (+4.69)
      +N×5                48.19 (+5.42)    62.21 (+4.92)
      +N×10               48.44 (+5.67)    62.69 (+5.40)
      Natural Questions   61.55            73.91
  • 38. Semi-supervised QA on TriviaQA
      Data       EM               F1
      SQuAD      48.96            57.98
      +T×1       49.65 (+0.69)    59.13 (+1.21)
      +T×2       50.01 (+1.05)    59.08 (+1.10)
      +T×3       49.71 (+0.75)    59.49 (+1.51)
      +T×5       50.14 (+1.18)    59.21 (+1.23)
      +T×10      49.65 (+0.69)    59.20 (+1.22)
      TriviaQA   64.55            70.42
  • 39. Conclusion.
      - We propose a novel probabilistic generative model for one-to-many QA generation.
      - By maximizing the mutual information of QA pairs, we improve their semantic consistency.
      - Results show that further training a BERT QA model on our generated QA pairs significantly improves its performance.
      - A gap remains between semi-supervised and supervised learning because of the discrepancy among different domains; we hope future research can close it.
      Code is available at https://github.com/seanie12/Info-HCVAE