Incorporating BERT Into Neural Machine Translation
Jinhua Zhu, Yingce Xia, Lijun Wu, Di He, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
Ramakrishna Mission Vivekananda Educational And Research
Institute
1 Introduction
2 Preliminary Explorations
3 BERT-fused Model
4 Algorithm
5 Results
6 Conclusion
VoxLing [RKMVERI] Presenters : Ayan Maity, Debanjan Nanda, Debayan Datta 2024-11-22
Introduction
Preliminary Explorations
Preliminary Explorations cont.
Preliminary Explorations cont.
Their Approach:
• Feeding the output of BERT to the NMT encoder as context-aware embeddings outperforms using BERT to initialize the encoder.
Proposed Algorithm: BERT-fused Model
Model Architecture
Notations
Step-1: BERT Encoding
Step-2: Encoder Representation
• Let $H_E^l$ denote the hidden representation of the $l$-th layer in the encoder, and let $H_E^0$ denote the word embeddings of the sequence $x$.
• Denote the $i$-th element of $H_E^l$ by $h_i^l$ for any $i \in [l_x]$.
• In the $l$-th layer, $l \in [L]$, we have:
• The encoder eventually outputs $H_E^L$ from the last layer.
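For reference, the per-layer encoder update in the BERT-fused paper can be written as below. The symbols $\mathrm{attn}_S$ (encoder self-attention), $\mathrm{attn}_B$ (BERT-encoder attention), $H_B$ (the BERT output from Step-1), $\tilde{h}_i^l$ and $\mathrm{FFN}$ (position-wise feed-forward network) are taken from the paper rather than from the text above, so treat this as a recalled sketch, not a transcription of the slide.

```latex
\begin{aligned}
\tilde{h}_i^l &= \frac{1}{2}\Big(\mathrm{attn}_S\big(h_i^{l-1}, H_E^{l-1}, H_E^{l-1}\big)
              + \mathrm{attn}_B\big(h_i^{l-1}, H_B, H_B\big)\Big),
              \quad \forall i \in [l_x], \\
H_E^l &= \big(\mathrm{FFN}(\tilde{h}_1^l), \dots, \mathrm{FFN}(\tilde{h}_{l_x}^l)\big).
\end{aligned}
```

The factor $\frac{1}{2}$ averages the two attention outputs, so each encoder layer attends both to its own previous layer and to the BERT representation.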
Step-3: Decoder Representation
• Let $S_{<t}^l$ denote the hidden states of the $l$-th layer in the decoder preceding time step $t$, i.e., $S_{<t}^l = (s_1^l, \dots, s_{t-1}^l)$.
Step-3: Decoder Representation cont.
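A sketch of the decoder-layer update in the same notation, following the BERT-fused paper. The symbols $s_t^l$, $\hat{s}_t^l$, $\tilde{s}_t^l$, $H_B$ (BERT output), $\mathrm{attn}_B$ (BERT-decoder attention) and $\mathrm{attn}_E$ (encoder-decoder attention) come from the paper and are assumptions relative to the surviving slide text.

```latex
\begin{aligned}
\hat{s}_t^l   &= \mathrm{attn}_S\big(s_t^{l-1}, S_{<t+1}^{l-1}, S_{<t+1}^{l-1}\big), \\
\tilde{s}_t^l &= \frac{1}{2}\Big(\mathrm{attn}_B\big(\hat{s}_t^l, H_B, H_B\big)
               + \mathrm{attn}_E\big(\hat{s}_t^l, H_E^L, H_E^L\big)\Big), \\
s_t^l         &= \mathrm{FFN}\big(\tilde{s}_t^l\big).
\end{aligned}
```

As in the encoder, the two cross-attention outputs are averaged; the last-layer state $s_t^L$ is then passed through a linear transformation and softmax to predict the $t$-th target token.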
Drop-Net Trick
Drop-Net Trick cont.
• In the encoder, $\tilde{h}_i^l$ can be written as:
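In the BERT-fused paper, drop-net draws $U^l \sim \mathrm{Uniform}[0, 1]$ for each layer during training and, given a drop-net rate $p_{net} \in [0, 1]$, uses only the self-attention output when $U^l < p_{net}/2$, only the BERT-encoder attention output when $U^l > 1 - p_{net}/2$, and the average of the two otherwise; at inference the average is always used. A minimal NumPy sketch of that rule follows. The names `attn` and `drop_net_mix`, the single-head attention without learned projections, and the assumption that BERT states and encoder states share one dimension are simplifications for illustration, not the authors' implementation.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, shifted for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def attn(q, k, v):
    """Single-head scaled dot-product attention (no learned projections)."""
    return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v


def drop_net_mix(h_prev, h_bert, p_net, rng=None, training=True):
    """Return h_tilde for one encoder layer under the drop-net rule.

    h_prev: (l_x, d) previous-layer encoder states.
    h_bert: (l_b, d) BERT output; l_b may differ from l_x.
    p_net:  drop-net rate in [0, 1].
    """
    if rng is None:
        rng = np.random.default_rng()
    self_out = attn(h_prev, h_prev, h_prev)  # attn_S: encoder self-attention
    bert_out = attn(h_prev, h_bert, h_bert)  # attn_B: BERT-encoder attention
    average = 0.5 * (self_out + bert_out)
    if not training:
        return average                       # inference: always use both
    u = rng.uniform()                        # one draw per layer call
    if u < p_net / 2:
        return self_out                      # keep only self-attention
    if u > 1 - p_net / 2:
        return bert_out                      # keep only BERT-encoder attention
    return average


rng = np.random.default_rng(0)
H_prev = rng.normal(size=(5, 16))  # 5 source tokens, hidden size 16
H_B = rng.normal(size=(7, 16))     # BERT may tokenize to a different length
h_tilde = drop_net_mix(H_prev, H_B, p_net=1.0, rng=rng)
print(h_tilde.shape)  # (5, 16)
```

With `p_net=0.0` the function always returns the average, which recovers the plain BERT-fused layer; with `p_net=1.0` every training step uses exactly one of the two attention branches.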
Supervised NMT Results
Supervised NMT Results cont.
Semi-Supervised and Unsupervised NMT Results
Conclusion
Thank You