
International Journal of Innovative Science and Research Technology
Volume 9, Issue 7, July 2024
ISSN No: 2456-2165 | https://doi.org/10.38124/ijisrt/IJISRT24JUL767

Bilingual Neural Machine Translation From English To Yoruba Using A Transformer Model

Adeboje Olawale Timothy¹, Department of Mathematical and Computing, Koladaisi University, Ibadan, Nigeria
Adetunmbi Olusola Adebayo², Department of Computer Science, Federal University of Technology, Akure, Nigeria
Arome Gabriel Junior³, Department of Cybersecurity, Federal University of Technology, Akure, Nigeria
Akinyede Raphael Olufemi⁴, Department of Information Systems, Federal University of Technology, Akure, Nigeria

Abstract:- The necessity for language translation in Nigeria arises from its linguistic diversity, facilitating effective communication and understanding across communities. Yoruba, considered a language with limited resources, has potential for greater online presence. This research proposes a neural machine translation model using a transformer architecture to convert English text into Yoruba text. While previous studies have addressed this area, challenges such as vanishing gradients, translation accuracy, and computational efficiency for longer sequences persist. This research proposes to address these limitations by employing a transformer-based model, which has demonstrated efficacy in overcoming issues associated with Recurrent Neural Networks (RNNs). Unlike RNNs, transformers utilize attention mechanisms to establish comprehensive connections between input and output, improving translation quality and computational efficiency.

Keywords:- NLP, Text to Text, Neural Machine Translation, Encoder, Decoder, BERT, T5.

I. INTRODUCTION

The existence of man on the surface of the earth is identified with the existence of communication. In the beginning, God created man in His image, and since then there has been communication between man and God and between men (Genesis 1:27-29). This shows that communication is fundamental to human existence. Communication is the means of sending and receiving information, ideas, thoughts, and feelings between individuals or groups. It involves the transmission and reception of messages through various channels such as speech, writing, gestures, body language, and even technological means like emails or social media. This is why the desire to communicate is innate in man, exhibited even by babies, who in their own way express their needs and feelings by crying, laughing, or groaning. Even animals communicate with one another, while machines, at their lowest level, communicate using discrete values of 1s and 0s called bits (binary digits).

Communication can be categorized into two forms: verbal/oral and non-verbal. Non-verbal communication, which predates verbal communication, does not rely on words but instead employs gestures, cues, vocal tones, spatial relationships, and other non-linguistic elements to convey messages. It is often used to express emotions such as respect, love, dislike, or discomfort, and tends to be spontaneous and less structured compared to verbal communication.

Verbal/oral communication involves the use of spoken words or language to convey information. It requires organizing words in a structured and grammatically correct manner to effectively communicate ideas to the audience. Verbal communication offers several advantages over non-verbal communication, including efficiency in time and resource utilization, support for teamwork, prompt feedback, flexibility, and enhanced transparency and understanding between communicators.

Nigeria is a culturally diverse society with a multitude of languages [22]. Estimates suggest there are between 450 and 500 languages spoken throughout the country. While English serves as the official language, Nigeria has also designated three regional languages: Hausa, spoken by approximately 20 million people in the North; Yoruba, spoken by about 19 million in the West; and Igbo, spoken by around 17 million in parts of the South-East. These three are considered the major languages.

Additionally, Nigeria recognizes about 500 other languages, some of which hold varying degrees of official status within their regions and serve as lingua francas, such as Efik, Ibibio, and Fulani. Among these, Nigerian Pidgin serves as a national lingua franca.

In the Nigerian context, "majority" languages refer specifically to Hausa, Yoruba, and Igbo, while "minority" languages encompass those spoken by smaller ethnic groups. These distinctions are formally acknowledged and hold significant roles in various sectors, particularly in education within the country.


The English language has continued to gain so much prominence in the country that its dominance has stifled the growth (and even led to the extinction of some) of the 529 indigenous languages in Nigeria [18]. English holds a dominant position across all sectors in Nigeria, including government, education, media, the judiciary, and science and technology. Government officials often refrain from using indigenous languages even in informal settings to avoid perceptions of tribalism. At both the national and state levels, English remains the primary language for debates and official records, despite efforts to promote indigenous languages. Due to Nigeria's linguistic diversity, English plays a central role in national unity and administrative functions. Legal proceedings, particularly in the Supreme Court and High Courts, are primarily conducted in English, with lawyers presenting arguments and judges delivering judgments in the same language.

However, the prevalence of the English language in Nigeria, particularly in the southwestern region, has resulted in the diminishing use of the esteemed Yoruba language, notably among its own speakers. This shift has profound cultural and linguistic implications for national identity, leaving many children unable to communicate in their native tongue. Language and culture are inseparable, and the decline in the use of Yoruba among younger generations has contributed to the erosion of cultural heritage. Consequently, younger people are increasingly losing touch with the core values and traditions of their culture. This shift is also evident in their fashion choices, which now mirror those of the predominant language's speakers.

Research conducted by [19] showed that using the mother tongue in primary education significantly benefits learning outcomes. In a six-year primary education project, pupils taught in Yoruba performed better than those taught solely in English. It was noted that "the child learns better in his mother tongue, which is as natural to him as his mother's milk." However, if English continues to serve as Nigeria's national and educational language, as well as its lingua franca for unity in a diverse and multilingual country, its pervasive use in media, legislation, and throughout society could pose a threat to indigenous languages. This trend suggests that indigenous languages are not merely at risk but are gradually declining [20].

To break this language barrier, language translators arose. Language translators are essential tools that enable communication between individuals or groups who speak different languages. Initially, human translators performed this role. However, despite their invaluable role in facilitating communication across languages, human translators face several challenges, including issues of accuracy, precision, and ambiguity. Neural machine translation (NMT) has revolutionized the field of translation due to several key advantages it offers over traditional approaches.

Neural machine translation encompasses various forms: text to text, text to speech, speech to speech, and speech to text. Natural languages are categorized into high-resource languages (HRL) and low-resource languages (LRL) based on the availability of textual and speech data on the web [17]. High-resource languages, such as English, benefit from abundant data resources that facilitate machine learning and understanding. Similarly, many Western European languages like French and German, as well as Chinese, Japanese, and Russian, fall into this category. In contrast, low-resource languages lack sufficient data resources, making them less studied, resource-scarce, and less commonly taught. These languages are often characterized by limited computational tools and sparse textual and speech data online [21].

The Yoruba language falls under the category of low-resource languages, facing several significant challenges. One of the primary issues is the difficulty of employing alignment techniques across different levels (word, sentence, and document) for annotation, creating a cohesive dataset, and collecting raw text. These tasks are essential for any natural language processing (NLP) endeavor or mapping technique [17].

Furthermore, the lack of adequate textual material poses a critical challenge, as lexicons form the backbone of many NLP tasks. Developing an effective lexicon for LRLs like Yoruba remains particularly challenging due to limited textual resources. Additionally, the morphology of LRLs such as Yoruba is dynamic and continuously evolving, leading to a diverse and expansive vocabulary. This variability complicates the development of a robust framework for morphological pattern recognition, given the presence of multiple roots and complex word formations.

This research therefore proposes a text-to-text neural machine translator from the English language to the Yoruba language, built on a transformer model, to address the challenges of Yoruba as a low-resource language, the risk of its extinction, the inability of many Yoruba children to speak the language, and the vanishing of Yoruba culture. The Transformer is a type of neural network [15] initially recognized for its exceptional performance in machine translation. Today, it has become the standard for constructing extensive self-supervised learning systems. In recent years, Transformers have surged in prominence not only within natural language processing (NLP) but also across diverse domains like computer vision and multi-modal processing. As Transformers advance, they are progressively assuming a pivotal role in advancing both the research and practical applications of artificial intelligence (AI) [16].


II. LITERATURE REVIEW

Many researchers have worked in the area of neural machine translation, most especially the translation of the English language to the Yoruba language. Notable among them are the following:

In their work titled "Web-Based English to Yoruba Machine Translation" [24], the researchers aimed to achieve English-to-Yoruba text translation using a rule-based method. Their approach primarily utilized a dictionary-based rule-based system, considered the most practical among various types of machine translation methods. However, a limitation of this approach was its inability to accurately translate English words with multiple meanings, influenced by their grammatical context.

In their study titled "Development of an English to Yorùbá Machine Translator" [1], the researchers aimed to create a machine translator that converts English text into Yorùbá, thereby making the Yorùbá language more accessible to a wider audience. The translation process employed phrase structure grammar and re-write rules. These rules were developed and evaluated using Natural Language Tool Kits (NLTKs), with analysis conducted using parse tree and Automata theory-based techniques. However, the study noted that accuracy decreases as the length of sentences increases.

In [2], titled "Machine to Man Communication in Yoruba Language," the research aims to establish communication between humans and machines using a text-to-speech system for the Yoruba language. The study involves text analysis, natural language processing, and digital signal processing techniques. An observed outcome of the research is that pronunciation accuracy decreases as sentence length increases.

In [3], titled "Development of a Yorùbá Text-to-Speech System Using Festival," the objective was to develop a speech synthesizer for the Yoruba language. Speech data were recorded in a quiet environment with a noise-cancelling microphone on a typical multimedia computer system using the Speech Filing System (SFS) software, then analysed and annotated using the PRAAT speech processing software. However, the speech produced by the system is of low quality and its measured naturalness is low.

In [4], titled "Token Validation in Automatic Corpus Gathering for Yoruba Language," the research aims to create a language identifier that identifies potentially valid Yoruba words. The approach involves using a dictionary lookup method to collect words, implemented as a finite state machine in Java. However, the system's capability is currently restricted to gathering monolingual Yorùbá language data.

In [5], titled "Developing Statistical Machine Translation System for English and Nigerian Languages," the research aims to create a translation system between English and two Nigerian languages: Igbo and Yorùbá. The study employs a phrase-based statistical machine translation approach, which involves grouping source language words into sequences, translating each phrase into the target language, and optionally reordering them based on target language models and distortion probabilities. However, the research faces challenges such as insufficient data, orthographic errors, and high error rates during system compilation.

In [6], titled "Japanese-to-English Machine Translation Using Recurrent Neural Networks," the objective was to translate the Japanese language to the English language. A bidirectional recurrent neural network (BiRNN) was used as the encoder, with the forward hidden state of an RNN reading the source sentence in order and the backward hidden state of an RNN reading the source sentence in reverse. However, the system did not handle complex sentence translations.

In [7], titled "Arabic Machine Translation Using Bidirectional LSTM Encoder-Decoder," the research aimed to develop a model primarily based on Bidirectional Long Short-Term Memory (BiLSTM). The objective was to map input sequences to vectors using a BiLSTM encoder and subsequently use another Long Short-Term Memory (LSTM) network to decode the target sequences from these vectors. The input sentences were sequentially embedded into the BiLSTM encoder word by word until the end of the English sentence sequence. The hidden and memory states obtained were fed into the LSTM decoder to initialize its state, and the decoder output was processed through a softmax layer and compared with the target data. However, the system faced limitations due to being trained on a limited number of words in the corpus.

In [8], titled "Using Bidirectional Encoder Representations from Transformers for Conversational Machine Comprehension," the goal was to apply the BERT language model to the task of Conversational Machine Comprehension (CMC). The research introduced a novel architecture called BERT-CMC, which builds upon the BERT model and incorporates a new module for encoding conversational history, inspired by the Transformer-XL model. However, experimental findings indicated that Transformer-based architectures struggle to fully capture the contextual nuances of conversations.

In [9], titled "Development of a Recurrent Neural Network Model for English to Yorùbá Machine Translation," the aim was to create a recurrent neural network (RNN) model specifically for translating English to Yorùbá. The model underwent testing and evaluation using both human and automatic assessment methods, demonstrating adequate fluency and providing translations of decent to good quality. However, the research had limitations: it could only handle single tasks and a limited vocabulary size, and it lacked the capability to expand its vocabulary extensively.

In [10], titled "Design and Implementation of English to Yorùbá Verb Phrase Machine Translation System," the research aimed to create a system for translating English verb phrases into Yorùbá. However, the study faced several limitations: the evaluation method was time-consuming, the system's engine lacked robustness, which hindered the addition of new rewrite rules, and the system showed low responsiveness to queries due to a limited database size.

In [11], titled "A Novel Technique in the Development of Yorùbá Alphabets Recognition System Through the Use of Deep Learning Using Convolutional Neural Network," the objective was to create a deep convolutional network named YORÙBÁNET for detecting Yorùbá alphabets. The model was implemented in the Matlab R2018a environment using a custom framework. The training dataset consisted of 10,500 samples, with 2,100 samples reserved for testing. Training the model involved 30 epochs, with 164 iterations per epoch, totaling 4,920 iterations. The entire training process took approximately 11,296 minutes and 41 seconds. This research focused solely on Yorùbá alphabet recognition and did not involve sentence translations.

In [12], titled "Sentence Augmentation for Language Translation Using GPT-2," the research aimed to explore the use of GPT-2 for generating monolingual data to enhance machine translation. The study utilized a sentence generator based on GPT-2 to produce additional data for neural machine translation (NMT), aiming to maintain characteristics similar to the original text. However, the research did not assess the effectiveness of the more recent GPT-3 model in augmenting sentences for language translation.

In [13], titled "Linguistically-Motivated Yorùbá-English Machine Translation," the research aims to analyze and provide linguistic descriptions of errors made by different models. Three sentence-level models (SMT, BiLSTM, and Transformer) were trained to translate from Yorùbá to English. However, these models exhibited shortcomings such as missing words, incorrect words or spellings, semantic and syntactic errors, as well as issues with grammar and word order.

In [14], titled "Convolutional Recurrent Neural Network (CRNN) for the Recognition of Yorùbá Handwritten Characters," the collected data underwent preprocessing steps including grayscale conversion, binarization, and normalization to eliminate distortions from the digitization process. The convolutional recurrent neural network model was then trained using these preprocessed images. However, the model exhibited low recognition accuracy for characters containing underdots and diacritic signs.

III. METHODOLOGY

This research proposes to translate English language text into Yoruba language text using a transformer model. Figure 1 shows the architecture of the transformer model.

Fig 1: The Transformer Model Architecture

The Transformer model is divided into two macro blocks: the Encoder and the Decoder.

• Encoder
The encoder is responsible for language learning and identification. It encodes the input sequence and passes it to the decoder. The encoder block is composed of a stack of N = 6 identical layers, each with two sub-layers: the first is a multi-head self-attention mechanism, and the second is a fully connected feed-forward network. The research will make use of Bidirectional Encoder Representations from Transformers (BERT) for the encoder. BERT is a language representation model designed to pre-train deep bidirectional representations, with the goal of extracting context-sensitive features from the input text; it is a neural network-based technique for language-processing pre-training.

• Dataset Selection
The selection of datasets was a critical step in the methodology. The system will make use of a parallel corpus obtained from MENYO-20k and odunola/yoruba-english-pairs (https://huggingface.co/datasets/odunola/yoruba-english-pairs) to form a robust Yoruba corpus. The source language is English text (strings), which the model is expected to translate.
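For illustration, the snippet below is a minimal sketch of how such a parallel corpus could be assembled with the Hugging Face datasets library. The menyo20k_mt hub identifier, the split names, and the column names are assumptions to be checked against the respective dataset cards; this is not the project's actual pipeline.

```python
# Minimal sketch: assembling an English-Yoruba parallel corpus with the
# Hugging Face `datasets` library. Dataset identifiers, splits and column
# names below are assumptions to verify against the dataset cards.
from datasets import load_dataset

pairs = load_dataset("odunola/yoruba-english-pairs", split="train")
menyo = load_dataset("menyo20k_mt", split="train")  # assumed hub id for MENYO-20k

def to_pair(example):
    # Normalize both sources into {"en": ..., "yo": ...} records.
    if "translation" in example:  # MENYO-20k style translation dict (assumed)
        return {"en": example["translation"]["en"], "yo": example["translation"]["yo"]}
    return {"en": example.get("english", ""), "yo": example.get("yoruba", "")}

corpus = [to_pair(ex) for ex in pairs] + [to_pair(ex) for ex in menyo]
print(len(corpus), "sentence pairs; e.g.", corpus[0])
```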


• Text Representation and Modeling Techniques
This section describes the text representation and modeling techniques to be employed in this research.

The input (raw text) is transformed into embedding vectors that can easily be understood by the BERT algorithm. The input string is converted into a set of tokens; each token goes through token embedding, where it is vectorized into a representation that carries contextual meaning.

Consider the embedding of the i-th word in the sequence, denoted by

v_i = W w_i (1)

where w_i is the i-th word in the sequence, W is the transformation matrix and v_i is the transformed (embedding) vector.

For example, consider the two sentences. Sentence 1: "He is a good boy." Sentence 2: "Is he a good boy?"

Step 1: Apply WordPiece tokenization, adding CLS and SEP tokens. CLS is added to the start of the input sentence and SEP to the end.

Step 2: Token embedding (TE): look up the 768-dimensional vector for each token. This is the input identification, of shape (1, 5, 768), where 1 denotes the first sentence, 5 the number of words in sentence 1 and 768 the BERT embedding dimension.

Step 3: Segment embedding (SE): this is important when there are multiple sentences in the input sequence. Sentence 1 = 0 and Sentence 2 = 1.

Step 4: Positional embedding (PE): this encodes the index position of each token in the input sequence. We want the model to treat words that appear close to each other as "close" and words that are distant as "distant". Once the positional encoding is computed, it is reused for every sentence during training and inference.

The positional embedding is calculated for the even and odd dimensions respectively in equations 2 and 3:

PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) (2)

PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)) (3)

where PE is the positional encoding, pos is the position of the token in the sequence, i is the index of the embedding dimension, and d_model is the dimension of the vector.

The sum of the positional, segment and token embeddings forms the input for the BERT model:

X = PE + TE + SE (4)

where PE is the positional encoding, TE is the token encoding and SE is the segment encoding, and X is a sequence of length n in which each element has embedding size d_model.

From Figure 1, it is observed that the Add & Norm block takes the output of the multi-head attention together with a residual connection from the positional encoding. The residual connection ensures that a strong information signal flows through the deep network. This is required because gradients can vanish during back-propagation; to prevent this, a strong signal from the input is injected into different parts of the network.

Residual connections carry over the previous embeddings to the subsequent layers. As such, the encoder blocks enrich the embedding vectors with additional information obtained from the multi-head self-attention calculations and the position-wise feed-forward networks.

After the multi-head attention, the result is added to the embedded input using the residual connection:

Z = X + MultiHead(X) (8)

where X is the embedded input and MultiHead(X) is the result of the multi-head attention process.

The next step is to perform layer normalization, calculated by:

LayerNorm(z) = γ (z − μ)/σ + β (9)

where μ is the mean, σ is the standard deviation, γ is a multiplicative parameter and β an additive parameter, initially set to 1 and 0 respectively.

The Transformer model utilizes Add & Norm blocks to facilitate efficient training. These blocks incorporate two essential components: a residual connection and a layer normalization layer. The residual connection establishes a direct path for the gradient, ensuring that vectors are updated rather than entirely replaced by the attention layers; this helps gradient flow during training. The layer normalization layer, in turn, keeps the outputs at a reasonable scale, enhancing the stability and performance of the model. Layer normalization aims to reduce the effect of covariate shift: it prevents the mean and standard deviation of the embedding-vector elements from drifting, which would make training unstable and slow (i.e., the learning rate could not be made large enough). Unlike batch normalization, layer normalization operates on each embedding vector individually rather than at the batch level.
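To make these steps concrete, the fragment below is a minimal sketch of equations (2)-(4) and (8)-(9): sinusoidal positional encodings, token and segment embeddings summed into X, then one residual Add & Norm step. PyTorch and its built-in MultiheadAttention module are assumed as the implementation vehicle here (the encoder actually proposed is BERT), and the toy token ids are illustrative only.

```python
# Sketch of X = PE + TE + SE (eq. 4) and LayerNorm(X + MultiHead(X))
# (eqs. 8-9). PyTorch assumed; sizes follow the BERT-base setup (d = 768).
import torch
import torch.nn as nn

d_model, vocab_size, n_segments = 768, 30522, 2

def sinusoidal_pe(seq_len, d_model):
    """Equations (2)-(3): sine on even dimensions, cosine on odd ones."""
    pos = torch.arange(seq_len, dtype=torch.float).unsqueeze(1)  # (seq_len, 1)
    i = torch.arange(0, d_model, 2, dtype=torch.float)           # even dims
    angle = pos / (10000 ** (i / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(angle)   # PE(pos, 2i)
    pe[:, 1::2] = torch.cos(angle)   # PE(pos, 2i+1)
    return pe

token_emb = nn.Embedding(vocab_size, d_model)    # TE: embedding lookup, eq. (1)
segment_emb = nn.Embedding(n_segments, d_model)  # SE: sentence 1 -> 0, sentence 2 -> 1

tokens = torch.tensor([[101, 2002, 2003, 1037, 2204, 2879, 102]])  # toy [CLS] ... [SEP] ids
segments = torch.zeros_like(tokens)

# Equation (4): X = PE + TE + SE, shape (batch, seq_len, d_model).
X = sinusoidal_pe(tokens.size(1), d_model) + token_emb(tokens) + segment_emb(segments)

# Equations (8)-(9): residual connection, then layer normalization.
mha = nn.MultiheadAttention(d_model, num_heads=12, batch_first=True)
norm = nn.LayerNorm(d_model)                     # gamma = 1, beta = 0 initially
Z, _ = mha(X, X, X)
out = norm(X + Z)                                # Add & Norm
print(out.shape)                                 # torch.Size([1, 7, 768])
```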


Normalization converts features to a similar scale, which improves the performance of the system.

Next is the feed-forward network (FFN), which comprises two dense layers applied individually and uniformly to every position. The feed-forward layer is primarily used to transform the representation of the input sequence into a form more suitable for the task at hand. This is achieved by applying a linear transformation followed by a non-linear activation function. The output of the feed-forward layer has the same shape as the input, and is then added to the original input.

The FFN takes a vector (the hidden representation at a particular position in the sequence) and passes it through two learned linear transformations, represented by the matrices W1 and W2 and the bias vectors b1 and b2, with a rectified linear (ReLU) activation applied between the two:

FFN(x) = max(0, x W1 + b1) W2 + b2 (11)

where W1 and W2 are weight matrices and b1 and b2 are bias vectors.

After the multi-head attention, the result is added to the embedded input using the residual connection:

Z = X + MultiHead(X) (12)

where X is the embedded input and MultiHead(X) is the result of the multi-head attention process.

The next step is layer normalization, calculated as in equation (9):

LayerNorm(z) = γ (z − μ)/σ + β (13)

where μ is the mean, σ the standard deviation, γ the multiplicative parameter and β the additive parameter, initially set to 1 and 0 respectively.

The encoder outputs, for each word, a vector that captures not only its meaning (the embedding) and its position, but also its interaction with the other words through the multi-head attention. The output of the encoder is another matrix with the same dimensions as the input matrix of embedded sequences. This output is passed to the next encoder and to the adjacent decoder.

• Decoder
The goal of the decoder is to translate the English language into the Yoruba language. The output of the encoder (value and key) is joined to the output (query) of the masked multi-head attention in the decoder phase. The output (shifted right) is the target language, which starts with the start-of-sentence (SOS) token.

The research will make use of T5-Base, the standard or baseline version of the Text-to-Text Transfer Transformer (T5), on the decoder side of the transformer. T5-Base strikes a balance between model complexity and efficiency, offering a good trade-off between computational cost and task performance; this makes it a widely used variant and a solid starting point for most NLP applications.

The output will be embedded as in the encoding stage:

X = PE + TE + SE (14)

where PE is the positional encoding, TE the token encoding and SE the segment encoding, with X a sequence of length n whose elements have embedding size d_model.

The embedded output is divided into four copies: one is sent to the Add & Norm phase and three are sent to the multi-head attention as the Query (Q), Key (K) and Value (V).

The attention between Q, K and V is calculated by:

Attention(Q, K, V) = softmax(Q K^T / √d_k) V (15)

The goal here is to make the model causal: the output at a certain position may depend only on the words at previous positions, so the model must not be able to see future words. Each head is computed as:

head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V) (16)

With the heads computed, the next step is to concatenate them:

MultiHead(Q, K, V) = Concat(head_1, …, head_h) W^O (17)

where Q = Query, K = Key and V = Value.

After the multi-head attention, the result is added to the embedded output using the residual connection:

Z = X + MultiHead(X) (18)

where X is the embedded output and MultiHead(X) is the result of the multi-head attention process.
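The computations in equations (11) and (15)-(17) can be written out directly. The sketch below (again PyTorch, with illustrative sizes rather than the paper's) implements the position-wise FFN, scaled dot-product attention with the causal mask that hides future positions, and the multi-head concatenation.

```python
# Sketch of equations (11) and (15)-(17): position-wise FFN and causal
# multi-head attention. PyTorch assumed; sizes are illustrative only.
import math
import torch
import torch.nn.functional as F

def ffn(x, W1, b1, W2, b2):
    """Equation (11): FFN(x) = max(0, x W1 + b1) W2 + b2."""
    return F.relu(x @ W1 + b1) @ W2 + b2

def attention(Q, K, V, causal=False):
    """Equation (15): softmax(Q K^T / sqrt(d_k)) V, optionally causal."""
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)
    if causal:  # mask the strictly-upper triangle so position t sees only <= t
        mask = torch.triu(torch.ones_like(scores, dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
    return scores.softmax(dim=-1) @ V

def multi_head(x_q, x_kv, Wq, Wk, Wv, Wo, causal=False):
    """Equations (16)-(17): per-head projections, attention, concat, output projection."""
    heads = [attention(x_q @ wq, x_kv @ wk, x_kv @ wv, causal)
             for wq, wk, wv in zip(Wq, Wk, Wv)]
    return torch.cat(heads, dim=-1) @ Wo

# Toy sizes: 5 positions, model width 8, 2 heads of width 4.
h, d_model, d_head = 2, 8, 4
x = torch.randn(5, d_model)
Wq = [torch.randn(d_model, d_head) for _ in range(h)]
Wk = [torch.randn(d_model, d_head) for _ in range(h)]
Wv = [torch.randn(d_model, d_head) for _ in range(h)]
Wo = torch.randn(h * d_head, d_model)

z = multi_head(x, x, x, Wq, Wk, Wv, Wo, causal=True)  # masked self-attention
y = ffn(z, torch.randn(d_model, 32), torch.zeros(32),
        torch.randn(32, d_model), torch.zeros(d_model))
print(z.shape, y.shape)  # torch.Size([5, 8]) torch.Size([5, 8])
```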


The next step is again layer normalization:

LayerNorm(z) = γ (z − μ)/σ + β (19)

where μ is the mean, σ the standard deviation, γ the multiplicative parameter and β the additive parameter, initially set to 1 and 0 respectively.

Next is the multi-head attention, which here can be called cross-attention, because the output of the encoder (key and value) is joined with the output of the masked multi-head attention, which serves as the query. With the encoder output denoted E and the query (the output of the masked multi-head attention) denoted Q, the attention between E and Q is calculated per head by:

head_h = Attention(Q W_h^Q, E W_h^K, E W_h^V) (20)

where E is the output of the encoder, Q is the query, and h indexes the head.

A linear layer is then used to map each output embedding into a vector of vocabulary size, indicating for every embedding the position of the predicted word in the vocabulary.

A softmax over the (sequence, vocabulary size) logits is used to calculate the loss function, namely the cross-entropy loss. At inference, the research will select the token from the vocabulary corresponding to the position with the maximum value, and will use a beam search strategy: at each step the top B words are selected, all possible next words are evaluated, and the top B most probable sequences are kept.
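The sketch below illustrates this decoding strategy with a toy beam search over per-step log-probabilities. The stand-in scoring function is a placeholder for a trained decoder and is not the project's actual decoding code.

```python
# Toy beam search (illustrative only): keeps the top-B partial sequences
# at each step, scored by summed log-probabilities from a decoder.
import torch

def beam_search(step_logprobs_fn, sos_id, eos_id, beam_size=3, max_len=20):
    beams = [(0.0, [sos_id])]  # each entry: (log-probability, token id list)
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == eos_id:             # finished beams carry over unchanged
                candidates.append((score, seq))
                continue
            logprobs = step_logprobs_fn(seq)  # (vocab,) log-probs for the next token
            top = torch.topk(logprobs, beam_size)
            for lp, tok in zip(top.values.tolist(), top.indices.tolist()):
                candidates.append((score + lp, seq + [tok]))
        # Keep the B most probable sequences.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_size]
        if all(seq[-1] == eos_id for _, seq in beams):
            break
    return beams[0][1]

# Usage with a stand-in "decoder" that returns random log-probabilities:
vocab = 10
dummy = lambda seq: torch.log_softmax(torch.randn(vocab), dim=-1)
print(beam_search(dummy, sos_id=1, eos_id=2))
```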

IV. EXPERIMENTS

The system will be evaluated using standard metrics such as the BLEU score, precision, recall, F1 score, mean opinion score and comparative mean opinion score.

V. CONCLUSION

The research work is expected to deliver an efficient, effective and more accurate translation system from the English language to the Yoruba language, to enhance the learning of the Yoruba language, raise the educational standard of the Yoruba people and promote Yoruba culture.

REFERENCES

[1]. Eludiora, S. I., & Odejobi, O. A. (2016). Development of an English to Yorùbá Machine Translator. International Journal of Modern Education and Computer Science, 8(11), 8.
[2]. Akintola, A., & Ibiyemi, T. (2017). Machine to Man Communication in Yorùbá Language. Annal. Comput. Sci. Ser., 15(2).
[3]. Iyanda, A. R., & Ninan, O. D. (2017). Development of a Yorùbá Text-to-Speech System Using Festival. Innovative Systems Design and Engineering (ISDE), 8(5).
[4]. Adewole, L. B., Adetunmbi, A. O., Alese, B. K., & Oluwadare, S. A. (2017). Token Validation in Automatic Corpus Gathering for Yoruba Language. FUOYE Journal of Engineering and Technology, 2(1), 4.
[5]. Ayogu, I. I., Adetunmbi, A. O., & Ojokoh, B. A. (2018). Developing Statistical Machine Translation System for English and Nigerian Languages. Asian Journal of Research in Computer Science, 1(4), 1-8.
[6]. Greenstein, E., & Penner, D. (2015). Japanese-to-English Machine Translation Using Recurrent Neural Networks. Retrieved Aug 19, 2019.
[7]. Bensalah, N., Ayad, H., Adib, A., & Ibn El Farouk, A. (2017). Arabic Machine Translation Using Bidirectional LSTM Encoder-Decoder.
[8]. Gogoulou, E. (2019). Using Bidirectional Encoder Representations from Transformers for Conversational Machine Comprehension.
[9]. Esan, A., Oladosu, J., Oyeleye, C., Adeyanju, I., Olaniyan, O., Okomba, N., ... & Adanigbo, O. (2020). Development of a Recurrent Neural Network Model for English to Yorùbá Machine Translation. Development, 11(5).
[10]. Ajibade, B., & Eludiora, S. (2021). Design and Implementation of English to Yorùbá Verb Phrase Machine Translation System. arXiv preprint arXiv:2104.04125.
[11]. Oyeniran, O. A., & Oyebode, E. O. (2021). YORÙBÁNET: A Deep Convolutional Neural Network Design for Yorùbá Alphabets Recognition. International Journal of Engineering Applied Sciences and Technology, 5(11), 57-61.
[12]. Sawai, R., Paik, I., & Kuwana, A. (2021). Sentence Augmentation for Language Translation Using GPT-2. Electronics, 10(24), 3082.
[13]. Adebara, I., Abdul-Mageed, M., & Silfverberg, M. (2022, October). Linguistically-Motivated Yorùbá-English Machine Translation. In Proceedings of the 29th International Conference on Computational Linguistics (pp. 5066-5075).
[14]. Ajao, J., Yusuff, S., & Ajao, A. (2022). Yorùbá Character Recognition System Using Convolutional Recurrent Neural Network. Black Sea Journal of Engineering and Science, 5(4), 151-157.

[15]. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention Is All You Need. Advances in Neural Information Processing Systems (NIPS). arXiv preprint arXiv:1706.03762.
[16]. Xiao, T., & Zhu, J. (2023). Introduction to Transformers: An NLP Perspective. arXiv preprint arXiv:2311.17633.
[17]. Magueresse, A., Carles, V., & Heetderks, E. (2020). Low-Resource Languages: A Review of Past Work and Future Challenges. arXiv preprint arXiv:2006.07264.
[18]. Ajepe, I., & Ademowo, A. J. (2016). English Language Dominance and the Fate of Indigenous Languages in Nigeria. International Journal of History and Cultural Studies, 2(4), 10-17.
[19]. Fadoro, J. O. (2010). Revisiting the Mother-Tongue Medium Controversy. Montem Paperbacks, Akure.
[20]. Mishina, U. L., & Iskandar, I. (2019). The Role of English Language in Nigerian Development. GNOSI: An Interdisciplinary Journal of Human Theory and Praxis, 2(2), 47-54.
[21]. Bibi, N., Rana, T., Maqbool, A., Alkhalifah, T., Khan, W. Z., Bashir, A. K., & Zikria, Y. B. (2023). Reusable Component Retrieval: A Semantic Search Approach for Low-Resource Languages. ACM Transactions on Asian and Low-Resource Language Information Processing, 22(5), 1-31.
[22]. Omoniyi, A. M. (2012). Socio-Political Problems of Language Teaching in Nigeria. Advisory Editorial Board, 152.
[23]. Khurana, D., Koli, A., Khatter, K., & Singh, S. (2023). Natural Language Processing: State of the Art, Current Trends and Challenges. Multimedia Tools and Applications, 82(3), 3713-3744.
[24]. Mishina, U. L., & Iskandar, I. (2019). The Role of English Language in Nigerian Development. GNOSI: An Interdisciplinary Journal of Human Theory and Praxis, 2(2), 47-54.
[25]. Bibi, N., Rana, T., Maqbool, A., Alkhalifah, T., Khan, W. Z., Bashir, A. K., & Zikria, Y. B. (2023). Reusable Component Retrieval: A Semantic Search Approach for Low-Resource Languages. ACM Transactions on Asian and Low-Resource Language Information Processing, 22(5), 1-31.
[26]. Omoniyi, A. M. (2012). Socio-Political Problems of Language Teaching in Nigeria. Advisory Editorial Board, 152.
