Question Answering System with Deep Learning

Jake Spracher
SCPD, Stanford Law School
[email protected]

Robert M. Schoenhals
Department of Management Science and Engineering, Stanford University
[email protected]
Abstract

The Stanford Question Answering Dataset (SQuAD) challenge is a machine reading comprehension task that has gained popularity in recent years. In this paper, we implement various existing deep learning methods with incremental improvements and conduct a comparative study of their performance on the SQuAD dataset. Our best model achieves 76.1 F1 and 66.1 EM scores on the test set. This project is completed for Assignment 4 of CS224n.

1 Introduction

The Stanford Question Answering Dataset (SQuAD) challenge, a machine comprehension task, has gained popularity in recent years from both theoretical and practical perspectives. The Stanford NLP group published the SQuAD [2] dataset, consisting of more than 100,000 question-answer tuples taken from a set of Wikipedia articles, in which the answer to each question is a consecutive segment of text in the corresponding reading passage. The primary task is to build models that take a paragraph and a question about it as input, and identify the answer to the question from the given paragraph. There has been a lot of research on building state-of-the-art deep learning systems on SQuAD that have been reported to accomplish outstanding performance [3][6][5]. Hence, the objective of this paper is to start with the provided starter code for CS224n: Natural Language Processing with Deep Learning (2017-2018 Winter Quarter), make successive improvements by implementing existing models, and compare their performance on SQuAD.

The rest of the paper is organized as follows. Section 2 illustrates the components and the architecture of our system, Section 3 describes the experiments, Section 4 demonstrates error analysis, and Section 5 concludes the paper and discusses future work.

2 Model

Our model is modular in that different components can be swapped with others thanks to their independent implementations. Thus, we first present all the modules considered in this paper, and then illustrate the specific combinations of the modules that we run in experiments.

2.1 Model Components

2.1.1 Encoding Layer

The $d$-dimensional word embeddings of a question $x_1, \dots, x_M \in \mathbb{R}^d$ and a context $y_1, \dots, y_N \in \mathbb{R}^d$ are fed into a bidirectional LSTM with weights shared between the question and the context. The encoding layer encodes the embeddings into the matrix of context hidden states $H \in \mathbb{R}^{N \times 2h}$ and the matrix of question hidden states $U \in \mathbb{R}^{M \times 2h}$, where $h$ is the size of the hidden states.

2.1.2 Bidirectional Attention Layer

The Bidirectional Attention Layer [3] is one of the attention layers that we use. Given the context hidden states $h_1, \dots, h_N \in \mathbb{R}^{2h}$ and the question hidden states $u_1, \dots, u_M \in \mathbb{R}^{2h}$, a similarity matrix $S \in \mathbb{R}^{N \times M}$ is computed according to $S_{ij} = w^\top [h_i; u_j; h_i \circ u_j] \in \mathbb{R}$, where $w \in \mathbb{R}^{6h}$ is a weight vector learned through training.

We first compute Context-to-Question (C2Q) attention. The C2Q attention distribution is obtained by $\alpha^i = \mathrm{softmax}(S_{i,:}) \in \mathbb{R}^M$, $i \in \{1, \dots, N\}$. The question hidden states $u_j$ are then weighted according to $\alpha^i$ to get the C2Q attention output $a_i = \sum_{j=1}^{M} \alpha^i_j u_j \in \mathbb{R}^{2h}$.

Next, we compute Question-to-Context (Q2C) attention. The Q2C attention distribution is obtained by $\beta = \mathrm{softmax}(m) \in \mathbb{R}^N$ for $m_i = \max_j S_{ij}$, $i \in \{1, \dots, N\}$. The context hidden states $h_i$ are then weighted according to $\beta$ to get the Q2C attention output $c' = \sum_{i=1}^{N} \beta_i h_i \in \mathbb{R}^{2h}$. Then, we get the bidirectional attention encoding $b_i = [h_i; a_i; h_i \circ a_i; h_i \circ c'] \in \mathbb{R}^{8h}$, $i \in \{1, \dots, N\}$.
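To make the bidirectional attention computation above concrete, the following is a minimal PyTorch sketch for a single (unbatched) example, without padding masks. The module name `BiDAFAttention`, the argument `state_dim`, and the tensor names `H` and `U` are our own illustrative choices, not taken from the paper's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiDAFAttention(nn.Module):
    """Sketch of bidirectional attention over one (unbatched) example; no masking."""

    def __init__(self, state_dim):
        super().__init__()
        # state_dim = 2h (each state comes from a bidirectional LSTM),
        # so the similarity weight w lives in R^{6h}, as in the paper.
        self.w = nn.Linear(3 * state_dim, 1, bias=False)

    def forward(self, H, U):
        # H: (N, 2h) context hidden states; U: (M, 2h) question hidden states.
        N, M = H.size(0), U.size(0)
        H_exp = H.unsqueeze(1).expand(N, M, -1)              # (N, M, 2h)
        U_exp = U.unsqueeze(0).expand(N, M, -1)              # (N, M, 2h)
        # S_ij = w^T [h_i; u_j; h_i o u_j]
        S = self.w(torch.cat([H_exp, U_exp, H_exp * U_exp], dim=-1)).squeeze(-1)  # (N, M)

        # Context-to-Question: alpha^i = softmax(S_{i,:}), a_i = sum_j alpha^i_j u_j
        alpha = F.softmax(S, dim=1)                          # (N, M)
        A = alpha @ U                                        # (N, 2h)

        # Question-to-Context: beta = softmax over per-row maxima of S,
        # c' = sum_i beta_i h_i, broadcast to every context position.
        beta = F.softmax(S.max(dim=1).values, dim=0)         # (N,)
        C = (beta @ H).unsqueeze(0).expand(N, -1)            # (N, 2h)

        # b_i = [h_i; a_i; h_i o a_i; h_i o c'] in R^{8h}
        return torch.cat([H, A, H * A, H * C], dim=-1)       # (N, 8h)
```

For instance, with $h = 100$ one would instantiate `BiDAFAttention(200)` and feed it the $N \times 200$ context and $M \times 200$ question encoder outputs; a batched implementation would additionally mask padded positions before each softmax.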
2.1.3 Coattention Layer

Another type of attention layer we implemented is the Coattention Layer [6]. Given the question hidden states $u_1, \dots, u_M \in \mathbb{R}^{2h}$, we first compute projected question hidden states $u'_j = \tanh(W u_j + b) \in \mathbb{R}^{2h}$, $j \in \{1, \dots, M\}$. Also, sentinel vectors $h_\varnothing$ and $u_\varnothing$ are appended to the context and question hidden states, which gives us $\{h_1, \dots, h_N, h_\varnothing\}$ and $\{u'_1, \dots, u'_M, u_\varnothing\}$. We then compute an affinity matrix $L \in \mathbb{R}^{(N+1) \times (M+1)}$, where $L_{ij} = h_i^\top u'_j \in \mathbb{R}$. Using the affinity matrix $L$, we apply the standard attention mechanism in both directions. The Context-to-Question attention output is obtained by $a_i = \sum_{j=1}^{M+1} \alpha^i_j u'_j \in \mathbb{R}^{2h}$ for $\alpha^i = \mathrm{softmax}(L_{i,:}) \in \mathbb{R}^{M+1}$, $i \in \{1, \dots, N\}$. The Question-to-Context attention output is computed in a similar way: $b_j = \sum_{i=1}^{N+1} \beta^j_i h_i \in \mathbb{R}^{2h}$ for $\beta^j = \mathrm{softmax}(L_{:,j}) \in \mathbb{R}^{N+1}$, $j \in \{1, \dots, M+1\}$. Next, we compute the second-level attention output $s_i = \sum_{j=1}^{M+1} \alpha^i_j b_j \in \mathbb{R}^{2h}$. Finally, $[a_i; s_i] \in \mathbb{R}^{4h}$, $i \in \{1, \dots, N\}$, is fed into a bidirectional LSTM, and the resulting hidden states are the coattention encoding.

2.1.4 Modeling Layer

Following the example of BiDAF [3], we implement a modeling layer comprised of two layers of bidirectional LSTMs, which outputs $M \in \mathbb{R}^{N \times 2h}$.

2.2 Self-Attention Layer

A self-attention layer [4] is used as an alternative to the modeling layer. Given the context hidden states $H \in \mathbb{R}^{N \times 2h}$, we apply the attention mechanism to obtain the attention distribution $A = \mathrm{softmax}(HH^\top / \sqrt{2h}) \in \mathbb{R}^{N \times N}$, where the softmax is taken with respect to the rows of $HH^\top / \sqrt{2h}$. Then, the self-attention output is computed as $AH \in \mathbb{R}^{N \times 2h}$.

2.2.1 Output Layers

The basic output layer we consider has the same structure as that of BiDAF [3]. This module is used in conjunction with the modeling layer. Let $G \in \mathbb{R}^{N \times d_G}$ denote the output of an attention layer. Then, the probability distribution of the start index is computed by $p^{\mathrm{start}} = \mathrm{softmax}(w_1^\top [G; M]) \in \mathbb{R}^N$, where $w_1 \in \mathbb{R}^{d_G + 2h}$ is a trainable weight. The modeling output $M$ is then passed to another bidirectional LSTM that outputs $M_2 \in \mathbb{R}^{N \times 2h}$. Finally, the probability distribution of the end index is obtained by $p^{\mathrm{end}} = \mathrm{softmax}(w_2^\top [G; M_2]) \in \mathbb{R}^N$.

Another type of output layer we implemented is the Answer-Pointer Layer [6]. Given the blended representation $G$, the probability distribution of the start index is given by $p^{\mathrm{start}} = \mathrm{softmax}(w^\top F_s + c \otimes e_N) \in \mathbb{R}^N$, where $F_s = \tanh(V G^\top + b \otimes e_N) \in \mathbb{R}^{\ell \times N}$, and $w \in \mathbb{R}^{\ell}$, $c \in \mathbb{R}$, $V \in \mathbb{R}^{\ell \times d_G}$, $b \in \mathbb{R}^{\ell}$ are parameters to be trained. The operator $\otimes e_N$ produces a matrix by repeating the element on its left-hand side $N$ times. Then, we compute the hidden vector $h_a$ by using the attention mechanism, $G^\top p^{\mathrm{start}} \in \mathbb{R}^{d_G}$, and passing it to a standard LSTM. Finally, the probability distribution of the end index is obtained by $p^{\mathrm{end}} = \mathrm{softmax}(w^\top F_e + c \otimes e_N) \in \mathbb{R}^N$, where $F_e = \tanh(V G^\top + (W_h h_a + b) \otimes e_N) \in \mathbb{R}^{\ell \times N}$, and $W_h \in \mathbb{R}^{\ell \times d_G}$ is another trainable weight.

The start and end indices $(i^*, j^*)$ are selected such that the joint probability $p^{\mathrm{start}}_i p^{\mathrm{end}}_j$ is maximized subject to $i \le j$.
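As a concrete illustration of this final decoding step, here is a small Python sketch (our own, not from the paper's code) of the span selection just described: it scans the end positions once, keeping the best start probability seen so far, so the $i \le j$ constraint is enforced in $O(N)$ time. The function name `select_span` and the use of NumPy arrays are assumptions made for illustration.

```python
import numpy as np

def select_span(p_start, p_end):
    """Return (i, j) with i <= j maximizing p_start[i] * p_end[j], and that probability.

    p_start, p_end: length-N arrays holding the start/end index distributions
    produced by one of the output layers above.
    """
    best_prob, best_span = -1.0, (0, 0)
    best_start_prob, best_start_idx = -1.0, 0
    for j in range(len(p_end)):
        # Track the most probable start index among positions <= j.
        if p_start[j] > best_start_prob:
            best_start_prob, best_start_idx = p_start[j], j
        joint = best_start_prob * p_end[j]
        if joint > best_prob:
            best_prob, best_span = joint, (best_start_idx, j)
    return best_span, best_prob

# Example: the most likely span is (1, 2) with joint probability 0.6 * 0.7 = 0.42.
print(select_span(np.array([0.1, 0.6, 0.2, 0.1]),
                  np.array([0.05, 0.15, 0.7, 0.1])))
```

In practice one might additionally cap the span length $j - i$ at a maximum answer length, though the paper does not state such a constraint.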