RBBA: ResNet - BERT - Bahdanau Attention for Image Caption Generator
Abstract—In recent years, the topic of image caption generators has gained significant attention, and several successful projects have emerged in this field, showcasing notable advancements. Image caption generators automatically generate descriptive captions for images through encoder and decoder mechanisms. The encoder leverages computer vision models, while the decoder utilizes natural language processing models. In this study, we assess a comprehensive set of seven distinct methodologies, including six existing methods from prior research and one newly proposed. These methods are trained and evaluated with the bilingual evaluation understudy (BLEU) metric on the Flickr8K dataset. In our experiments, the proposed ResNet50 – BERT – Bahdanau Attention model outperforms the other models, achieving a BLEU-1 score of 0.532143 and a BLEU-4 score of 0.126316.

Index Terms—Deep learning, Natural language processing, Encoder-Decoder, Flickr8K, BLEU, Image Caption.

∗ Corresponding author: Duc Ngoc Minh Dang ([email protected])

I. INTRODUCTION

Generating textual descriptions or captions for images is one of the most challenging tasks for artificial intelligence (AI). Despite the difficulty, image caption generators have a wide variety of uses, from providing automatic image descriptions for the blind to enhancing image search outcomes and producing more interesting social media posts. The development of more precise and sophisticated image caption generators has advanced significantly in recent years, and they are now widely used across a variety of industries.

Image caption generators analyze the contents of an image using deep learning techniques and generate a description of what is happening in the image. Typically, image caption generators employ a combination of computer vision techniques to process the images and natural language processing (NLP) algorithms to generate the descriptions. These algorithms can be trained on enormous datasets composed of photos and their captions, teaching the system how to correlate various visual cues with relevant descriptions.

II. RELATED WORKS

A. Convolutional neural networks

Convolutional neural networks (CNNs) play a crucial role in computer vision. The emergence of architectures such as VGGNet [1] and ResNet [2] has significantly enhanced the efficacy of computer vision models across various tasks.

VGGNet [1] is a well-known convolutional neural network architecture consisting of multiple layers with small convolutional filters. It gained popularity for its simplicity and effectiveness in image classification tasks. The network typically follows a consistent pattern of stacking convolutional layers with 3x3 filters, followed by max-pooling layers to reduce the spatial dimensions. The VGGNet architecture offers various configurations, commonly known as VGG16 and VGG19, depending on the depth of the network. It has demonstrated impressive performance on various image classification benchmarks, establishing itself as a reliable and effective choice for deep-learning tasks.
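As a concrete illustration of this stacked 3x3-convolution-plus-pooling pattern, the sketch below builds a couple of VGG-style blocks with Keras; it is an illustrative example only, with assumed filter counts, not a configuration taken from this paper.

```python
# Minimal sketch of VGG-style blocks: a few 3x3 convolutions followed by
# 2x2 max pooling to halve the spatial dimensions. Assumes TensorFlow/Keras;
# illustrative only.
import tensorflow as tf
from tensorflow.keras import layers

def vgg_block(x, filters, convs_per_block=2):
    """Stack `convs_per_block` 3x3 convolutions, then downsample with max pooling."""
    for _ in range(convs_per_block):
        x = layers.Conv2D(filters, kernel_size=3, padding="same", activation="relu")(x)
    return layers.MaxPooling2D(pool_size=2, strides=2)(x)

inputs = tf.keras.Input(shape=(224, 224, 3))
x = vgg_block(inputs, 64)    # 224x224 -> 112x112
x = vgg_block(x, 128)        # 112x112 -> 56x56
```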
ResNet [2] is a groundbreaking convolutional neural network architecture that has revolutionized the field of computer vision. ResNet addresses a common challenge encountered in deep neural networks known as the vanishing gradient problem. As networks become deeper, the gradients can vanish, leading to difficulties in training and optimization. To address this issue, ResNet utilizes skip connections that allow the network to bypass certain layers. By doing so, ResNet enables the direct flow of information from earlier layers to subsequent layers, facilitating the learning process. ResNet has achieved remarkable success, outperforming previous models in various computer vision tasks such as image classification, object detection, and image segmentation. Its deep architecture, with variants like ResNet-50, ResNet-101, and ResNet-152, has become a standard benchmark in the field, and its impact extends beyond computer vision.
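To make the skip-connection idea concrete, the sketch below implements a basic identity residual block in Keras; it illustrates the general technique rather than the exact ResNet50 blocks used later in this paper.

```python
# Minimal sketch of a ResNet-style identity block. The block's output is
# F(x) + x: the convolutional path learns a residual, and the skip connection
# lets information and gradients flow directly past the stacked layers.
# Assumes TensorFlow/Keras; illustrative only.
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters):
    shortcut = x                                    # identity skip connection
    y = layers.Conv2D(filters, 3, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.Add()([y, shortcut])                 # F(x) + x
    return layers.ReLU()(y)

inputs = tf.keras.Input(shape=(56, 56, 64))
outputs = residual_block(inputs, 64)
```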
III. METHODOLOGIES

A. Merge-based Xception – Word2Vec (MXW2V)

In the MXW2V, the merged image and text features are processed using the ReLU activation function, and softmax is used as the activation function to predict the output word. The details of MXW2V are depicted in Figure 1.

B. Merge-based InceptionResnetV2 – GloVe (MIRG)

The MIRG employed in this approach builds upon the MXW2V, which utilized Xception and Word2Vec. However, it introduces notable improvements by leveraging the power of the InceptionResNetV2 [9] network and the GloVe [10] technique; the resulting architecture is illustrated in Figure 2. InceptionResNetV2 [9] is an exceptionally deep network, comprising a total of 164 layers. It combines the innovative ideas of the Inception module with residual connections, which reduce the vanishing gradient problem commonly encountered in deep networks and facilitate the training of highly complex models. The GloVe [10] technique changed the generation of word feature vectors by utilizing global word co-occurrence data. By capturing the semantic relationships between words, GloVe effectively merges the advantages of count-based Latent Semantic Analysis [11] and context-based Word2Vec [8].

C. Inject-based Xception – Word2Vec (IXW2V)

In this inject-based variant, the output of the image and text features is concatenated together and then passed through LSTMs [3]. By feeding the concatenated representation through LSTMs, the model can effectively learn the relationships between the image and text features and capture the contextual information necessary for predicting the next word in the sentence.

D. Inject-based InceptionResnetV2 – GloVe (IIRG)

Fig. 4. Architecture of “Inject InceptionResnetV2 - GloVe” method

Similar to the IXW2V approach, this architecture leverages the InceptionResnetV2 [9] and GloVe [10] methods to extract features from images and text, respectively. These features are subsequently concatenated and fed into LSTMs to predict the next word. Figure 4 illustrates the architecture of the IIRG model.
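To make this concatenate-then-decode pattern concrete, the sketch below shows a minimal Keras captioning decoder that combines a CNN image feature vector with embedded caption-prefix features and passes the result through an LSTM to predict the next word. The feature dimension, vocabulary size, and layer widths are assumed values for illustration, not the configurations of the IXW2V or IIRG models.

```python
# Sketch of a concatenate-then-LSTM caption decoder. Image features come from
# a pretrained CNN encoder (e.g. Xception or InceptionResNetV2); text features
# come from an embedding layer, which could be initialized with Word2Vec or
# GloVe vectors. Dimensions below are illustrative.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE, MAX_LEN, IMG_DIM = 5000, 34, 2048   # assumed values

img_input = tf.keras.Input(shape=(IMG_DIM,), name="cnn_features")
img_feat = layers.Dense(256, activation="relu")(img_input)
img_feat = layers.RepeatVector(MAX_LEN)(img_feat)            # one copy per time step

txt_input = tf.keras.Input(shape=(MAX_LEN,), name="caption_prefix")
txt_feat = layers.Embedding(VOCAB_SIZE, 256)(txt_input)

merged = layers.Concatenate(axis=-1)([img_feat, txt_feat])   # image + text per step
hidden = layers.LSTM(256)(merged)
next_word = layers.Dense(VOCAB_SIZE, activation="softmax")(hidden)

model = tf.keras.Model([img_input, txt_input], next_word)
```

At inference time, such a decoder is typically called repeatedly, feeding each predicted word back into the caption prefix until an end token is produced.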
E. VGG16 – GRU – Bahdanau attention (VGBA)

Fig. 5. Architecture of “VGG16 - GRU - Bahdanau attention” method

Figure 5 illustrates the VGBA architecture, in which the Bahdanau attention mechanism allows the input and output sequences to concentrate on the most crucial elements, resulting in improved performance.

F. Xception – TT-LSTMs with Bi-LSTMs (XBi-LSTM)

Fig. 6. Architecture of “Xception - TT-LSTMs with Bi-LSTMs” method

The encoder-decoder structure in Figure 6 is used by the approach known as TT-LSTM [14], which is built using a combination of the merge and inject models. For both text and image, TT-LSTM suggests creating two sub-encoder models, which are then merged. The Xception network is utilized in the image encoder model. Bi-LSTMs are employed for the decoder, while LSTMs are applied to the language encoder.
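The sketch below illustrates one way such a two-encoder ("two-tier") arrangement can be wired in Keras: an image sub-encoder over Xception features, a language sub-encoder built from an embedding and an LSTM, their outputs merged, and a bidirectional LSTM decoding the next word. The layer sizes and exact wiring are assumptions for illustration, not the TT-LSTM [14] configuration.

```python
# Sketch of a merged two-encoder captioner with a Bi-LSTM decoder.
# Image sub-encoder: dense projection of Xception features, repeated per step.
# Language sub-encoder: embedding followed by an LSTM over the caption prefix.
# The two streams are merged and a bidirectional LSTM predicts the next word.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE, MAX_LEN, IMG_DIM = 5000, 34, 2048   # assumed values

img_in = tf.keras.Input(shape=(IMG_DIM,))
img_enc = layers.RepeatVector(MAX_LEN)(layers.Dense(256, activation="relu")(img_in))

txt_in = tf.keras.Input(shape=(MAX_LEN,))
txt_enc = layers.LSTM(256, return_sequences=True)(
    layers.Embedding(VOCAB_SIZE, 256)(txt_in))

merged = layers.Concatenate()([img_enc, txt_enc])            # merge the two encoders
decoded = layers.Bidirectional(layers.LSTM(256))(merged)     # Bi-LSTM decoder
next_word = layers.Dense(VOCAB_SIZE, activation="softmax")(decoded)

model = tf.keras.Model([img_in, txt_in], next_word)
```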
G. ResNet50 – BERT – Bahdanau attention (RBBA)

Fig. 7. Architecture of “ResNet50 - BERT - Bahdanau attention” method

This architecture utilizes the ResNet50 [2] and BERT [15] models, following the same approach as in the VGBA, to extract features from images and text, respectively. In this architecture, the Bahdanau attention mechanism is combined with LSTMs [3] to generate the final output. The detailed architecture is illustrated in Figure 7.
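For reference, the sketch below shows the standard formulation of Bahdanau (additive) attention [13] as a Keras layer: the decoder's hidden state is scored against each image feature vector with a small feed-forward network, and the softmax-weighted sum of the features becomes the context vector for the next decoding step. This is the textbook mechanism, not the paper's exact implementation.

```python
# Bahdanau (additive) attention: score(h_i, s) = v^T tanh(W1 h_i + W2 s),
# attention weights = softmax over scores, context = weighted sum of features.
import tensorflow as tf
from tensorflow.keras import layers

class BahdanauAttention(layers.Layer):
    def __init__(self, units):
        super().__init__()
        self.W1 = layers.Dense(units)   # projects the image feature vectors
        self.W2 = layers.Dense(units)   # projects the decoder hidden state
        self.V = layers.Dense(1)        # collapses to one scalar score per location

    def call(self, features, hidden):
        # features: (batch, locations, feat_dim); hidden: (batch, hidden_dim)
        hidden_t = tf.expand_dims(hidden, 1)                  # (batch, 1, hidden_dim)
        scores = self.V(tf.nn.tanh(self.W1(features) + self.W2(hidden_t)))
        weights = tf.nn.softmax(scores, axis=1)               # (batch, locations, 1)
        context = tf.reduce_sum(weights * features, axis=1)   # (batch, feat_dim)
        return context, weights
```

In a typical attention decoder, the returned context vector is concatenated with the current word embedding before being fed to the LSTM or GRU cell at each step.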
IV. PERFORMANCE EVALUATION

A. Experiment setup

In this research, we utilized the Flickr8K dataset, which is a widely used and diverse dataset for image captioning tasks. The dataset consists of 8,000 high-quality images sourced from the popular online platform Flickr. These images cover a wide range of life themes, including captivating scenes of animals such as dogs and cats, and people engaging in various activities. The dataset also includes images depicting fun and entertainment activities, sports events, and daily life routines. The diversity of themes and subjects within the Flickr8K dataset makes it a suitable choice for training and testing image caption generator models. By incorporating such a diverse collection of images, we aimed to enhance the model’s ability to generate accurate and meaningful captions for a wide range of visual content.

To assess the performance of our model, we employed the BLEU metric [16], which is a widely used and established evaluation measure in the field of natural language processing. BLEU is commonly utilized to evaluate the quality of machine-generated text by comparing it to one or more reference translations or human-generated texts. It calculates a score ranging from 0 to 1, with a higher score indicating a better match between the generated text and the reference text. The BLEU metric takes into account factors such as n-gram precision and a brevity penalty. It considers both the presence and the ordering of words, thereby capturing the fluency and correctness of the generated captions. By employing the BLEU metric, we aimed to quantitatively evaluate the performance of our model and compare it with other methods in the field. The use of this standard metric allows for a fair and objective assessment of the quality of the captions generated by our model.
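As an illustration, corpus-level BLEU-1 through BLEU-4 scores of the kind reported in Table I can be computed with NLTK's corpus_bleu by adjusting the n-gram weights; this is a common evaluation recipe, not necessarily the exact script used in this work, and the captions below are made up.

```python
# Computing corpus-level BLEU-1..BLEU-4 for generated captions with NLTK.
# Each hypothesis is compared against all reference captions of its image.
from nltk.translate.bleu_score import corpus_bleu

references = [[["a", "dog", "runs", "through", "the", "grass"],
               ["a", "dog", "is", "running", "on", "the", "grass"]]]  # per-image refs
hypotheses = [["a", "dog", "is", "running", "in", "the", "grass"]]    # model output

bleu1 = corpus_bleu(references, hypotheses, weights=(1.0, 0, 0, 0))
bleu2 = corpus_bleu(references, hypotheses, weights=(0.5, 0.5, 0, 0))
bleu3 = corpus_bleu(references, hypotheses, weights=(1/3, 1/3, 1/3, 0))
bleu4 = corpus_bleu(references, hypotheses, weights=(0.25, 0.25, 0.25, 0.25))
print(bleu1, bleu2, bleu3, bleu4)
```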
We carefully selected specific parameter configurations to optimize our research outcomes. These include utilizing the Adam optimizer with a learning rate of 0.001 to guide the training process. For the cost function, we employed the cross-entropy loss, a commonly used measure for multi-class classification tasks. To initialize the LSTMs/GRUs, we applied the Glorot uniform initializer, which helps ensure effective information flow within the model. During training, we used a batch size of 32, which determines the number of samples processed in each iteration. The model was trained for a total of 10 epochs. To prevent overfitting and enhance generalization, we incorporated a dropout rate of 0.5, which randomly deactivates a portion of the neural network units during training. This regularization technique encourages the model to learn more robust and generalized features.
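These hyperparameters map directly onto a standard Keras training setup; a minimal, self-contained sketch is shown below. The stand-in model and random arrays are placeholders for illustration only, not the paper's training pipeline.

```python
# Training setup matching the stated hyperparameters: Adam (lr = 0.001),
# cross-entropy loss, Glorot uniform initialization and 0.5 dropout on the
# recurrent layer, batch size 32, and 10 epochs.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE, MAX_LEN = 5000, 34                           # assumed values

# Minimal stand-in for a captioning decoder (see the earlier sketches for the
# full wiring); the LSTM uses the Glorot uniform initializer and 0.5 dropout.
model = tf.keras.Sequential([
    layers.Embedding(VOCAB_SIZE, 256),
    layers.LSTM(256, kernel_initializer="glorot_uniform", dropout=0.5),
    layers.Dense(VOCAB_SIZE, activation="softmax"),
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="categorical_crossentropy")

# Dummy arrays standing in for the preprocessed training data.
X = np.random.randint(0, VOCAB_SIZE, size=(64, MAX_LEN))
y = tf.keras.utils.to_categorical(np.random.randint(0, VOCAB_SIZE, 64), VOCAB_SIZE)

model.fit(X, y, batch_size=32, epochs=10)                # batch size 32, 10 epochs
```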
B. Experiment results

Tables I and II present the experimental results of the various architectures for image captioning. Among these architectures, the VGBA model demonstrates exceptional performance in terms of both speed and accuracy. With a relatively short training time of 826.8 s per epoch, this architecture achieves the best scores in terms of BLEU-2 and BLEU-3 (Table I). The inclusion of Bahdanau attention yields a significant enhancement in the output, as it allows the model to comprehend the image context more accurately and consistently.

Table II illustrates these methods in action for captioning a sample image. Among them, the RBBA model stands out, as it describes the image with the highest level of detail and accuracy compared to the other methods. It effectively leverages the ResNet50 and BERT models, along with the inclusion of Bahdanau attention, resulting in more precise and comprehensive image captions.

Figure 8 shows the performance of the various methods in generating image captions, as measured by the BLEU scores. The bar chart clearly illustrates the distinct differences in performance among these methods. Notably, the Bahdanau attention mechanism again stands out as a particularly effective approach.
TABLE I
PERFORMANCE COMPARISON ON FLICKR8K DATASET

Methods                                Parameters    Training time (per epoch)   BLEU-1     BLEU-2     BLEU-3     BLEU-4
Inject Xception – Word2Vec             5,002,649     807.6 s                     0.364886   0.190213   0.121723   0.046814
Merge Xception – Word2Vec              5,002,649     732.8 s                     0.375879   0.196083   0.118743   0.042923
Inject InceptionResnetV2 – GloVe       5,512,165     3,387.1 s                   0.360748   0.201883   0.116986   0.062958
Merge InceptionResnetV2 – GloVe        5,315,557     2,096.8 s                   0.381128   0.221911   0.155406   0.069800
VGG16 – GRU – Bahdanau attention       4,872,345     826.8 s                     0.461538   0.339683   0.254815   0.039086
Xception – TT-LSTM with Bi-LSTM        8,517,273     3,139.8 s                   0.426407   0.265075   0.191381   0.089676
ResNet50 – BERT – Bahdanau attention   119,818,810   7,488.0 s                   0.532143   0.227003   0.175572   0.126316
TABLE II
CAPTION COMPARISON RESULTS OF METHODS FOR IMAGES
The Bahdanau attention mechanism has proven to be highly successful in generating accurate and contextually relevant image captions, as it allows the model to focus on different regions of the image while generating the corresponding captions.

Fig. 8. BLEU-score comparison on the Flickr8K dataset

V. CONCLUSIONS

In conclusion, the comparison of these methods sheds light on the advantages and trade-offs inherent in different combinations of models and attention mechanisms. The RBBA approach excels at generating accurate and descriptive captions, as demonstrated by its leading BLEU-1 and BLEU-4 scores and by the meaningfulness of the captions produced for the sample images. On the other hand, although the VGBA approach falls short of the RBBA approach in terms of BLEU-1 and BLEU-4 scores, it outperforms the leading model on BLEU-2 and BLEU-3. In terms of speed, the VGBA model is significantly faster than the RBBA model: the VGBA model trains one epoch in only 826.8 seconds, whereas the RBBA model takes up to 7,488 seconds.

REFERENCES

[1] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
[2] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
[3] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[4] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” arXiv preprint arXiv:1406.1078, 2014.
[5] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in Neural Information Processing Systems, vol. 30, 2017.
[6] M. Tanti, A. Gatt, and K. P. Camilleri, “Where to put the image in an image caption generator,” Natural Language Engineering, vol. 24, no. 3, pp. 467–489, 2018.
[7] F. Chollet, “Xception: Deep learning with depthwise separable convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1251–1258.
[8] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” arXiv preprint arXiv:1301.3781, 2013.
[9] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. Alemi, “Inception-v4, Inception-ResNet and the impact of residual connections on learning,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31, no. 1, 2017.
[10] J. Pennington, R. Socher, and C. D. Manning, “GloVe: Global vectors for word representation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014, pp. 1532–1543.
[11] S. T. Dumais, “Latent semantic analysis,” Annual Review of Information Science and Technology (ARIST), vol. 38, pp. 189–230, 2004.
[12] K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhudinov, R. Zemel, and Y. Bengio, “Show, attend and tell: Neural image caption generation with visual attention,” in International Conference on Machine Learning. PMLR, 2015, pp. 2048–2057.
[13] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” arXiv preprint arXiv:1409.0473, 2014.
[14] P. P. Khaing et al., “Two-tier LSTM model for image caption generation,” International Journal of Intelligent Engineering & Systems, vol. 14, no. 4, 2021.
[15] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint arXiv:1810.04805, 2018.
[16] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, “BLEU: A method for automatic evaluation of machine translation,” in Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002, pp. 311–318.