XCS224N Assignment 4 Neural Machine Translation With Rnns
Guidelines
1. If you have a question about this homework, we encourage you to post your question on our Slack channel, at
http://xcs224n-scpd.slack.com/
2. Familiarize yourself with the collaboration and honor code policy before starting work.
3. For the coding problems, you must use the packages specified in the provided environment description. Since
the autograder uses this environment, we will not be able to grade any submissions which import unexpected
libraries.
Submission Instructions
Coding Submission: Some questions in this assignment require a coding response. For these questions, you should
submit all files indicated in the question to the online student portal. For further details, see Writing Code
and Running the Autograder below.
Honor code
We strongly encourage students to form study groups. Students may discuss and work on homework problems
in groups. However, each student must write down the solutions independently, and without referring to written
notes from the joint session. In other words, each student must understand the solution well enough in order to
reconstruct it by him/herself. In addition, each student should write on the problem set the set of people with
whom s/he collaborated. Further, because we occasionally reuse problem set questions from previous years, we
expect students not to copy, refer to, or look at the solutions in preparing their answers. It is an honor code
violation to intentionally refer to a previous year’s solutions. More information regarding the Stanford honor code
can be found at https://communitystandards.stanford.edu/policies-and-guidance/honor-code.
Before beginning this course, please walk through the Anaconda Setup for XCS Courses to familiarize yourself with
the coding environment. Use the env defined in src/environment.yml to run your code. This is the same environment
used by the online autograder.
Test Cases
The autograder is a thin wrapper over the python unittest framework. It can be run either locally (on your computer)
or remotely (on SCPD servers). The following description demonstrates what test results will look like for both
local and remote execution. For the sake of example, we will consider two generic tests: 1a-0-basic and 1a-1-hidden.
Local Execution - Hidden Tests
All hidden tests rely on files that are not provided to students. Therefore, the tests can only be run remotely. When
a hidden test like 1a-1-hidden is executed locally, it will produce the following result:
Local Execution - Basic Tests
When a basic test like 1a-0-basic fails locally, the error is printed to the terminal, along with a stack trace indicating
where the error occurred:
Remote Execution
Basic and hidden tests are treated the same by the remote autograder. Here are screenshots of failed basic and
hidden tests. Notice that the same information (error and stack trace) is provided as in the local autograder, now
for both basic and hidden tests.
Finally, here is what it looks like when basic and hidden tests pass in the remote autograder.
1 Neural Machine Translation with RNNs
In Machine Translation, our goal is to convert a sentence from the source language (e.g. Cherokee) to the target
language (e.g. English). In this assignment, we will implement a sequence-to-sequence (Seq2Seq) network with
attention, to build a Neural Machine Translation (NMT) system. In this section, we describe the training proce-
dure for the proposed NMT system, which uses a Bidirectional LSTM Encoder and a Unidirectional LSTM Decoder.
Figure 1: Seq2Seq Model with Multiplicative Attention, shown on the third step of the
decoder. Note that for readability, we do not picture the concatenation of the previous
combined-output with the decoder input.
We run the Bidirectional Encoder over the source sentence and concatenate the forward and backward hidden states and cell states at each source position:

h_i^enc = [←h_i^enc ; →h_i^enc] where h_i^enc ∈ R^{2h×1} and ←h_i^enc, →h_i^enc ∈ R^{h×1}, 1 ≤ i ≤ m   (1)

c_i^enc = [←c_i^enc ; →c_i^enc] where c_i^enc ∈ R^{2h×1} and ←c_i^enc, →c_i^enc ∈ R^{h×1}, 1 ≤ i ≤ m   (2)
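As an illustration of Equations (1)-(2), the following sketch (with made-up sizes; not the assignment's required code) runs a bidirectional LSTM and concatenates the forward and backward final states:

```python
import torch
import torch.nn as nn

# Illustrative sketch: a bidirectional LSTM over a source batch, with
# forward/backward states concatenated as in Equations (1)-(2).
e, h, m, batch = 8, 16, 5, 3          # embed dim, hidden dim, src len, batch
encoder = nn.LSTM(input_size=e, hidden_size=h, bidirectional=True)

X = torch.randn(m, batch, e)          # source embeddings (src_len, batch, e)
enc_hiddens, (last_h, last_c) = encoder(X)

# Per-position outputs already stack both directions: (src_len, batch, 2h)
print(enc_hiddens.shape)              # torch.Size([5, 3, 32])

# last_h has shape (2, batch, h): index 0 is the forward LSTM's final
# state, index 1 the backward LSTM's. Concatenating mirrors [←h ; →h].
h_cat = torch.cat((last_h[0], last_h[1]), dim=1)   # (batch, 2h)
c_cat = torch.cat((last_c[0], last_c[1]), dim=1)   # (batch, 2h)
print(h_cat.shape, c_cat.shape)
```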
With the Decoder initialized, we must now feed it a matching sentence in the target language. On the t-th step,
we look up the embedding for the t-th word, y_t ∈ R^{e×1}. We then concatenate y_t with the combined-output vector
o_{t−1} ∈ R^{h×1} from the previous timestep (we will explain what this is later down this page!) to produce ȳ_t ∈ R^{(e+h)×1}.
Note that for the first target word (i.e. the start token) o_0 is a zero-vector. We then feed ȳ_t as input to the Decoder
LSTM.
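A minimal sketch of this decoder-input construction, batched and with illustrative sizes (the variable names are ours, not the assignment's):

```python
import torch

# Concatenate the target embedding y_t with the previous combined-output
# o_{t-1} to form ybar_t in R^{(e+h) x 1}; here everything is batched.
e, h, batch = 8, 16, 3
y_t = torch.randn(batch, e)      # embedding of the t-th target word
o_prev = torch.zeros(batch, h)   # o_0 is a zero-vector for the start token

ybar_t = torch.cat((y_t, o_prev), dim=1)   # (batch, e + h)
print(ybar_t.shape)              # torch.Size([3, 24])
```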
h_t^dec, c_t^dec = Decoder(ȳ_t, h_{t−1}^dec, c_{t−1}^dec)   (5)
where h_t^dec ∈ R^{h×1}, c_t^dec ∈ R^{h×1}   (6)

We then use h_t^dec to compute multiplicative attention over h_1^enc, …, h_m^enc:

e_{t,i} = (h_t^dec)^T W_attProj h_i^enc where e_t ∈ R^{m×1}, W_attProj ∈ R^{h×2h}, 1 ≤ i ≤ m   (7)
α_t = softmax(e_t) where α_t ∈ R^{m×1}   (8)
a_t = Σ_{i=1}^{m} α_{t,i} h_i^enc where a_t ∈ R^{2h×1}   (9)
We now concatenate the attention output a_t with the decoder hidden state h_t^dec and pass this through a linear layer,
Tanh, and Dropout to attain the combined-output vector o_t.

u_t = [h_t^dec ; a_t] where u_t ∈ R^{3h×1}   (10)
v_t = W_u u_t where v_t ∈ R^{h×1}, W_u ∈ R^{h×3h}   (11)
o_t = Dropout(Tanh(v_t)) where o_t ∈ R^{h×1}   (12)
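Equations (10)-(12) can be sketched as follows (illustrative sizes; the layer name combined_proj is our assumption, not a required attribute name):

```python
import torch
import torch.nn as nn

# Build the combined-output vector o_t from the decoder hidden state and
# the attention output, per Equations (10)-(12).
h, batch = 16, 3
h_dec = torch.randn(batch, h)        # decoder hidden state h_t^dec
a_t = torch.randn(batch, 2 * h)      # attention output a_t

combined_proj = nn.Linear(3 * h, h, bias=False)   # plays the role of W_u
dropout = nn.Dropout(p=0.3)

u_t = torch.cat((h_dec, a_t), dim=1)              # (batch, 3h)
v_t = combined_proj(u_t)                          # (batch, h)
o_t = dropout(torch.tanh(v_t))                    # (batch, h)
print(u_t.shape, v_t.shape, o_t.shape)
```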
Then, we produce a probability distribution P_t over target words at the t-th timestep:

P_t = softmax(W_vocab o_t) where P_t ∈ R^{V_t×1}, W_vocab ∈ R^{V_t×h}   (13)

Here, V_t is the size of the target vocabulary. Finally, to train the network we compute the softmax cross-entropy
loss between P_t and g_t, where g_t is the one-hot vector of the target word at timestep t:

J_t(θ) = CrossEntropy(P_t, g_t)   (14)

Here, θ represents all the parameters of the model and J_t(θ) is the loss on step t of the decoder. Now that we have
described the model, let's try implementing it for Cherokee to English translation!
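The output distribution and loss can be sketched as follows (made-up sizes; target_vocab_proj stands in for the vocabulary projection). Note that PyTorch's F.cross_entropy fuses log-softmax with negative log-likelihood, so it consumes raw logits plus the gold word's index rather than P_t and a one-hot vector:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Project o_t to vocabulary logits, form P_t, and compute the loss.
h, V_t, batch = 16, 50, 3
o_t = torch.randn(batch, h)
target_vocab_proj = nn.Linear(h, V_t, bias=False)

logits = target_vocab_proj(o_t)        # (batch, V_t)
P_t = F.softmax(logits, dim=1)         # distribution over target words
gold = torch.randint(0, V_t, (batch,)) # index of the gold word per example
J_t = F.cross_entropy(logits, gold)    # scalar loss, averaged over batch
print(P_t.shape, J_t.shape)
```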
Make sure your code runs locally before attempting to train it on your VM. GPU time is expensive and limited. It
takes approximately 30 minutes to 1 hour to train the NMT system. We don't want you to accidentally use all your
GPU time for the assignment debugging your model rather than training and evaluating it. Finally, make sure that
your VM is turned off whenever you are not using it.
In order to run the model code on your workstation's CPU, or if you have an Apple silicon GPU, run the
following commands to create the proper virtual environment (you did this at the beginning of the course on your
local computer):
$ conda update -n base conda
$ conda env create --file environment.yml
If you have a local CUDA-capable GPU (Nvidia), or are working on your VM, then instead of the XCS224N conda
environment, create a new environment that supports CUDA, XCS224N_CUDA, by running:

$ conda env create --file environment_cuda.yml
$ conda activate XCS224N_CUDA
For local development and testing, feel free to continue using the same XCS224N environment you've used
for all the assignments so far.
Implementation Assignment
(a) [2 points (Coding)] In order to apply tensor operations, we must ensure that the sentences in a given batch
are of the same length. Thus, we must identify the longest sentence in a batch and pad the others to the
same length. Implement the pad_sents function in submission/utils.py, which will produce these padded
sentences.
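The padding idea can be sketched like this (the function name and signature are ours for illustration, not necessarily those required by pad_sents):

```python
# Extend every sentence in the batch to the length of the longest one
# by appending a pad token.
def pad_to_longest(sents, pad_token):
    max_len = max(len(s) for s in sents)
    return [s + [pad_token] * (max_len - len(s)) for s in sents]

batch = [["the", "cat", "sat"], ["hello"]]
print(pad_to_longest(batch, "<pad>"))
# [['the', 'cat', 'sat'], ['hello', '<pad>', '<pad>']]
```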
(b) [3 points (Coding)] Implement the __init__ function in submission/model_embeddings.py to initialize
the necessary source and target embeddings.
(c) [4 points (Coding)] Implement the __init__ function in submission/nmt_model.py to initialize the neces-
sary layers (LSTM, projection, and dropout) for the NMT system.
(d) [8 points (Coding)] Implement the encode function in submission/nmt_model.py. This function converts
the padded source sentences into the tensor X, generates h_1^enc, …, h_m^enc, and computes the initial hidden
state h_0^dec and initial cell state c_0^dec for the Decoder.
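One common pattern for encoding a padded batch (a sketch under our own assumptions, not the required implementation) is to pack the batch so the LSTM skips pad positions, then unpack to recover the per-position hidden states; the initial decoder state is then typically a linear projection of the concatenated final states:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Pack, encode, unpack. Sizes and variable names are illustrative.
e, h = 8, 16
encoder = nn.LSTM(input_size=e, hidden_size=h, bidirectional=True)

source_lengths = [5, 3]            # true lengths, sorted longest-first
X = torch.randn(5, 2, e)           # padded source batch (src_len, batch, e)

packed = pack_padded_sequence(X, source_lengths)
enc_out, (last_h, last_c) = encoder(packed)
enc_hiddens, _ = pad_packed_sequence(enc_out)   # (src_len, batch, 2h)
print(enc_hiddens.shape)           # torch.Size([5, 2, 32])
```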
(e) [8 points (Coding)] Implement the decode function in submission/nmt_model.py. This function constructs
ȳ and runs the step function over every timestep for the input.
(f) [10 points (Coding)] Implement the step function in submission/nmt_model.py. This function applies
the Decoder's LSTM cell for a single timestep, computing the encoding of the target word h_t^dec, the attention
scores e_t, the attention distribution α_t, the attention output a_t, and finally the combined output o_t.
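The multiplicative-attention part of a single step can be sketched with batched tensors as follows (illustrative sizes and names, not the assignment's exact code):

```python
import torch

# One decoder step of multiplicative attention over the encoder states.
h, m, batch = 16, 5, 3
h_dec = torch.randn(batch, h)                  # h_t^dec
enc_hiddens = torch.randn(batch, m, 2 * h)     # h_1^enc ... h_m^enc
W_attProj = torch.randn(h, 2 * h)              # attention projection

# Attention scores e_{t,i} = (h_t^dec)^T W_attProj h_i^enc
proj = enc_hiddens @ W_attProj.T               # (batch, m, h)
e_t = torch.bmm(proj, h_dec.unsqueeze(2)).squeeze(2)   # (batch, m)

alpha_t = torch.softmax(e_t, dim=1)            # attention distribution
a_t = torch.bmm(alpha_t.unsqueeze(1), enc_hiddens).squeeze(1)  # (batch, 2h)
print(e_t.shape, alpha_t.shape, a_t.shape)
```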
Now it’s time to get things running! Execute the following to generate the necessary vocab file (you can do
this on your local computer):
(XCS224N) $ sh run.sh vocab
As noted earlier, we recommend that you develop the code on your personal computer. Confirm that you are
running in the proper conda environment and then execute the following command to train the model on your
local machine:
(XCS224N) $ sh run.sh train_cpu
Once you have ensured that your code does not crash (i.e. let it run until iter 10 or iter 20), power on your
VM from the Azure Web Portal. Then read the Practical Guide for Using the VM section of the XCS224N
Azure Guide for instructions on how to upload your code to the VM. Next, turn to the Managing Processes
on a VM section of the Practical Guide and follow the instructions to create a new tmux session. Concretely,
run the following command to create tmux session called nmt.
(XCS224N_CUDA) $ tmux new -s nmt
Once your VM is configured and you are in a tmux session, activate the CUDA conda environment and start training.
Note that this is a different conda env, XCS224N_CUDA, based on environment_cuda.yml; details can be found in the
XCS224N Azure Guide.
$ conda activate XCS224N_CUDA
(XCS224N_CUDA) $ sh run.sh train_gpu
Once you know your code is running properly, you can detach from the session and close your ssh connection to
the server. To detach from the session, run:

(XCS224N_CUDA) $ tmux detach
You can return to your training model by ssh-ing back into the server and attaching to the tmux session by
running:
(XCS224N_CUDA) $ tmux a -t nmt
(g) [3 points (Coding)] Once your model is done training (this should take about 30 minutes to 1 hour
on the VM), execute the following command to test the model:
(XCS224N_CUDA) $ sh run.sh test_gpu
After running this command, it should generate the file src/submission/test_outputs.txt, which is needed for submission.
Deliverables
For this assignment, please submit all files within the src/submission subdirectory. This includes:
• src/submission/__init__.py
• src/submission/model_embeddings.py
• src/submission/nmt_model.py
• src/submission/utils.py
• src/submission/test_outputs.txt
2 Quiz
[11 points (Online)] The remainder of this homework is a series of multiple choice questions related to the neural
machine translation model. Please input your answers into the A4 Online Assessment on Gradescope.
This handout includes space for every question that requires a written response. Please feel free to use it to handwrite
your solutions (legibly, please). If you choose to typeset your solutions, the README.md for this assignment includes
instructions to regenerate this handout with your typeset LaTeX solutions.