
On the Vietnamese Name Entity Recognition: A Deep Learning Method Approach


Ngoc C. Lê
School of Applied Mathematics and Informatics, Hanoi University of Science and Technology
Institute of Mathematics, Vietnam Academy of Science and Technology
[email protected]

Ngoc-Yen Nguyen
School of Applied Mathematics and Informatics, Hanoi University of Science and Technology
[email protected]

Anh-Duong Trinh
iCOMM Media & Tech, Jsc
[email protected]

arXiv:1912.01109v1 [cs.CL] 18 Nov 2019

Abstract—Named entity recognition (NER) plays an important role in text-based information retrieval. In this paper, we combine Bidirectional Long Short-Term Memory (Bi-LSTM) [7], [27] with Conditional Random Fields (CRF) [9] to create a novel deep learning model for the NER problem. Each word given as input to the deep learning model is represented by a Word2vec-trained vector. The word embedding set was trained on about one million articles collected in 2018 from a Vietnamese news portal (baomoi.com). In addition, we concatenate the Word2Vec [18]-trained vector with a semantic feature vector (Part-Of-Speech (POS) tag, chunk tag) and a hidden syntactic feature vector (extracted by a Bi-LSTM network) to achieve the best result so far for a Vietnamese NER system. The evaluation was conducted on the data set of the VLSP 2016 (Vietnamese Language and Speech Processing 2016 [29]) competition.

Index Terms—Vietnamese, Named Entity Recognition, Long Short-Term Memory, Conditional Random Field, Word Embedding

I. INTRODUCTION

Named-entity recognition (NER) (also known as entity identification, entity chunking and entity extraction) is a subtask of information extraction that seeks to locate and classify named entity mentions in unstructured text into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, percentages, etc. It is a fundamental NLP research problem that has been studied for years. It is also considered one of the most basic and important tasks within larger problems such as information extraction, question answering, entity linking, or machine translation. Recently, there have been many novel ideas for the NER task, such as Cross-View Training (CVT) [4], a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data, deep contextualized word representations [24], and contextual string embeddings, a recent type of contextualized word embedding that was shown to yield state-of-the-art results [1], [2]. These studies have established new state-of-the-art F1 scores on the NER task.

For the Vietnamese language, NER systems at VLSP 2016 adopted conventional feature-based sequence labeling models such as recurrent neural networks (RNNs), Bidirectional Long Short-Term Memory (Bi-LSTM) [25], Maximum-Entropy Markov Models (MEMMs) [14], [21], and Conditional Random Fields (CRFs) [11], [22]. For the VLSP 2016 data set, the first Vietnamese NER system applied MEMMs with specific features [25]. However, these systems did not achieve accuracy far beyond that of classical machine learning methods. Most of the above models depend heavily on specific resources and hand-crafted features, which makes it difficult to apply them to new domains and other tasks.

In [19], [20], the authors used word identity, word shapes, part-of-speech tags, and chunking tags as hand-crafted features for a CRF to label entity tags [23]. Over the past few years, many deep learning models have been proposed to overcome these limitations. Some NER models have used LSTM and CRF layers to predict named entities [8], [12]. In addition, the benefits of combining word-level and character-level representations with CNN and CRF are presented in [17], [28].

In this study, we introduce a deep neural network for Vietnamese NER that automatically extracts morphological features through a character-level Bi-LSTM network, combined with POS-tag and chunk-tag features. The model includes two Bi-LSTM hidden layers and a CRF output layer. For the Vietnamese language, we use the data set from the 2016 VLSP contest. The results show that our model outperforms the best previous systems for Vietnamese NER [23], with an F1 score of 95.61% on the test set.

The remainder of this paper is structured as follows. Section II reviews related work on named entity recognition. Section III describes the implementation method. Section IV gives experimental results and discussions. Finally, the conclusion is given in Section V.

II. RELATED WORK

The approaches to the NER task can be divided into two lines: (1) statistical learning approaches and (2) deep learning methods.

In the first line, the authors used traditional labeling models such as CRFs, hidden Markov models, support vector machines, and maximum entropy models, which are heavily dependent on hand-crafted features. Sentences are expressed in the form of a set of features such as word, POS, and chunk; these features are then put into a linear model for labeling. Some examples following this line are [6], [13], [15], [16]. These models were shown to work quite well for low-resource languages such as Vietnamese. However, such NER systems rely heavily on the chosen feature set and on hand-crafted features, which are expensive to construct and difficult to reuse [23].

In the second line, the appearance of deep learning models with superior computational performance has improved the accuracy of the NER task. The performance of deep learning models has also been shown to be much better than that of statistics-based methods. In particular, convolutional neural networks (CNNs) [30], recurrent neural networks (RNNs), and LSTM networks are in popular use; they can exploit syntactic features through character embeddings combined with word embeddings [26], [28]. Other information such as POS tags and chunk tags is also used to provide more semantic information [3], [20], [25]. The word vectors are combined in different ways and then fed into a Bi-LSTM network with a CRF at the output. For Vietnamese, there are many NER systems using LSTM networks. In [25], the authors introduced a model that uses two Bi-LSTM layers with softmax layers at the output and syntax-specific input vectors, reaching an F1 score of 92.05%. A model using a single Bi-LSTM layer combined with a CRF at the output, achieving an F1 score of 83.25%, was given in [22]. A number of high-precision models were introduced in [3], with a Bi-LSTM-CRF model whose input is a vector extracted from word characters reaching an F1 score of 94.88%. Most recently, a combination of a Bi-LSTM, an attention layer, and a CRF with an F1 score of 95.33% was given in [23].

III. METHODOLOGY

A. Feature engineering

Word embedding. To build the word embedding set, we use the skip-gram neural network model, trained on one million articles collected in 2018 from a Vietnamese news portal (baomoi.com). Skip-gram is used to predict the context words for a given target word. We choose a sliding window of size 2; therefore, there are normally four context words corresponding to a target word. For words that were not seen during training, a vector called the unknown (UNK) embedding is used instead. The UNK embedding is created as a random vector whose components are sampled uniformly from the range [−√(3/dim), +√(3/dim)], where the dimension (dim) is the dimension of the word embeddings [3].

To improve the performance of our system, we also use semantic features to vectorize words for the model (part-of-speech tags and chunk tags). The POS-tag vectors and chunk-tag vectors are represented as one-hot vectors.
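As a concrete illustration of this lookup step, the following sketch (illustrative only; the variable names and the toy vocabulary are ours, not the authors' code) maps tokens to pre-trained vectors and falls back to a shared UNK vector drawn uniformly from [−√(3/dim), +√(3/dim)]:

```python
import numpy as np

def build_lookup(word2vec, dim=300, seed=0):
    """Map tokens to embeddings; unseen tokens share one random UNK vector.

    word2vec : dict of word -> 1-D numpy array of length dim (pre-trained).
    The UNK vector is sampled uniformly from [-sqrt(3/dim), +sqrt(3/dim)].
    """
    rng = np.random.default_rng(seed)
    bound = np.sqrt(3.0 / dim)
    unk = rng.uniform(-bound, bound, size=dim)
    return lambda token: word2vec.get(token, unk)

# Toy usage with a two-word vocabulary (real vectors would come from word2vec).
w2v = {"Hà_Nội": np.ones(300), "Việt_Nam": np.zeros(300)}
embed = build_lookup(w2v)
sentence = ["Hà_Nội", "là", "thủ_đô"]                # the last two tokens map to UNK
matrix = np.stack([embed(tok) for tok in sentence])   # shape (3, 300)
```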
1) Character Embedding: Recently, the automatic extraction of hidden features with neural networks (LSTM, CNN) has been used in many works and has proved effective for the NER task [10]. In this research, we use a Bi-LSTM network to extract hidden patterns that characterize the syntax of words, as shown in Fig. 1.

Fig. 1. Character-level Embedding
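To make the character-level input concrete, the sketch below (a simplified assumption about preprocessing; the character inventory and padding length are hypothetical and not taken from the paper) turns each word into a fixed-length sequence of character ids that a character Bi-LSTM can consume:

```python
import numpy as np

# Hypothetical character inventory; in practice it would be collected from the corpus.
CHARS = list("aáàăâbcdđeéèêghiíìklmnoóòôơpqrstuúùưvxy")
CHAR2ID = {c: i + 1 for i, c in enumerate(CHARS)}   # id 0 is reserved for padding
UNK_CHAR = len(CHAR2ID) + 1                          # id for characters outside the inventory

def encode_word(word, max_len=16):
    """Return a fixed-length (zero-padded) vector of character ids for one word."""
    ids = [CHAR2ID.get(c, UNK_CHAR) for c in word.lower()[:max_len]]
    return np.array(ids + [0] * (max_len - len(ids)), dtype=np.int64)

print(encode_word("Việt"))   # 16 ids: the word's characters followed by padding zeros
```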
B. LSTM

LSTM networks are a type of recurrent neural network (RNN) that uses special units in addition to standard units. LSTM units contain a memory cell that can maintain information in memory for controlled periods of time. A cell in the LSTM network consists of three control gates: the forget gate (determining which information is ignored and which is retained), the update gate (deciding how much of the memorized information is added to the current state) and the output gate (deciding which part of the current cell makes it to the output). At time t, the cell updates are given as follows:

f_t = σ(W_f h_{t-1} + U_f x_t + b_f)       (1)
i_t = σ(W_i h_{t-1} + U_i x_t + b_i)       (2)
o_t = σ(W_o h_{t-1} + U_o x_t + b_o)       (3)
c̃_t = tanh(W_c h_{t-1} + U_c x_t + b_c)    (4)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t            (5)
h_t = o_t ⊙ tanh(c_t),                      (6)

where σ is the sigmoid function and ⊙ is a pointwise operator (elementwise multiplication, addition, or the tanh function), x_t is the input vector at time t, and h_t is the hidden state vector that holds information from the beginning up to the present time. The gates f, i, o and the vector c̃ are the forget gate, input gate, output gate and cell candidate vector, respectively. The matrices U_f, U_i, U_o, U_c are weight matrices that connect the input and the gates; the matrices W_f, W_i, W_o, W_c are weight matrices that connect the gates and the hidden state.

C. Bi-LSTM

In a sequence labeling task, the context of a word is represented more effectively by its companion words, i.e. the words to its left and right in the sentence. To capture this information, a candidate model is the bidirectional LSTM network, as shown in Fig. 2. The output of the Bi-LSTM network is obtained by concatenating its left and right context representations.

Fig. 2. Bidirectional - LSTM
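The cell update in Eqs. (1)-(6), with ⊙ read as elementwise multiplication, and the bidirectional reading used for the character feature can be sketched in plain numpy as follows (weight shapes and the random initialization are illustrative, not the trained parameters):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM cell update following Eqs. (1)-(6)."""
    f = sigmoid(W["f"] @ h_prev + U["f"] @ x_t + b["f"])        # forget gate, Eq. (1)
    i = sigmoid(W["i"] @ h_prev + U["i"] @ x_t + b["i"])        # input gate,  Eq. (2)
    o = sigmoid(W["o"] @ h_prev + U["o"] @ x_t + b["o"])        # output gate, Eq. (3)
    c_tilde = np.tanh(W["c"] @ h_prev + U["c"] @ x_t + b["c"])  # candidate,   Eq. (4)
    c_t = f * c_prev + i * c_tilde                               # cell state,  Eq. (5)
    h_t = o * np.tanh(c_t)                                       # hidden state, Eq. (6)
    return h_t, c_t

def init_params(input_dim, hidden, rng):
    """Random illustrative parameters for one LSTM direction."""
    W = {g: rng.normal(0.0, 0.1, (hidden, hidden)) for g in "fioc"}
    U = {g: rng.normal(0.0, 0.1, (hidden, input_dim)) for g in "fioc"}
    b = {g: np.zeros(hidden) for g in "fioc"}
    return W, U, b

def run_lstm(xs, hidden, params):
    """Run the cell over a sequence and return the final hidden state."""
    W, U, b = params
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, W, U, b)
    return h

def bi_lstm_feature(xs, hidden, fw_params, bw_params):
    """Concatenate final forward and backward states (e.g. 30 + 30 = 60 dims)."""
    return np.concatenate([run_lstm(xs, hidden, fw_params),
                           run_lstm(xs[::-1], hidden, bw_params)])

rng = np.random.default_rng(0)
char_vecs = [rng.normal(size=25) for _ in range(6)]     # 6 characters, 25-dim embeddings
fw, bw = init_params(25, 30, rng), init_params(25, 30, rng)
print(bi_lstm_feature(char_vecs, 30, fw, bw).shape)      # (60,)
```

A framework implementation would of course vectorize over batches and sentences; this sketch is only meant to mirror the equations.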



D. Conditional Random Fields (CRFs)

For many sequence labeling tasks, a simple but effective approach is to consider the correlation between neighboring labels and to decode the best label sequence jointly. The conditional random field is essentially a probabilistic model which can predict labels with predefined structures. Instead of decoding labels independently, CRFs learn label sequences from the training data and decode the output labels at the same time. In a CRF, given a word sequence x = (x_1, x_2, ..., x_m), the conditional probability of a tag sequence y = (y_1, y_2, ..., y_m) is defined as in [19]:

P(y|x) = exp(w · F(y, x)) / Σ_{y'∈Y} exp(w · F(y', x)),      (7)

where w is the parameter vector estimated from the training data. The feature function F(y, x) ∈ ℝ^d is defined globally on an entire input sequence and an entire tag sequence. The space Y is the space of all possible tag sequences. The feature function F(y, x) is calculated by summing local feature functions:

F_j(y, x) = Σ_{i=1}^{n} f_j(y_{i-1}, y_i, x, i)      (8)
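For small examples, Eqs. (7)-(8) can be evaluated directly by enumerating every tag sequence; the sketch below does exactly that (brute force is used only for illustration, since real CRF implementations rely on dynamic programming such as the forward algorithm; the local feature function is supplied by the caller):

```python
import itertools
import numpy as np

def crf_log_prob(w, f_local, x, y, tags):
    """Log P(y | x) from Eqs. (7)-(8) by brute-force enumeration of Y.

    w       : parameter vector of shape (d,)
    f_local : callable f_local(y_prev, y_i, x, i) -> numpy array of shape (d,)
    x       : observed word sequence (list of tokens)
    y       : candidate tag sequence (same length as x)
    tags    : the tag inventory, e.g. ["B-PER", "I-PER", "B-LOC", "O"]
    """
    def score(seq):
        # w . F(seq, x), with F summed over positions as in Eq. (8)
        F = sum(f_local(seq[i - 1] if i > 0 else "<s>", seq[i], x, i)
                for i in range(len(x)))
        return float(w @ F)

    log_numerator = score(y)
    all_scores = [score(cand) for cand in itertools.product(tags, repeat=len(x))]
    log_partition = np.log(np.sum(np.exp(all_scores)))
    return log_numerator - log_partition              # log of Eq. (7)

# Tiny usage: two indicator features (current tag is "O"; previous tag equals current tag).
def f_local(prev, cur, x, i):
    return np.array([1.0 if cur == "O" else 0.0, 1.0 if prev == cur else 0.0])

w = np.array([0.5, 1.0])
print(np.exp(crf_log_prob(w, f_local, ["Hà_Nội", "là"], ["B-LOC", "O"], ["B-LOC", "O"])))
```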
E. Our Deep Learning Model

For the Vietnamese NER labeling task, we use multiple Bi-LSTM layers with a CRF layer at the top to detect the named entities in the sequence [3], as shown in Fig. 3. The architecture operates in the following sequence:

• The input of our neural network is a sequence of word representations.
• Each word representation is encoded through two Bi-LSTM layers.
• The CRF layer at the top decodes the hidden feature vectors from the previous (Bi-LSTM) layer.

Fig. 3. Our Deep Learning model
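The per-token input used by this architecture (a 300-dimensional word2vec vector, one-hot POS and chunk vectors, and a 60-dimensional character vector) can be assembled as in the following sketch; the POS and chunk inventories here are toy stand-ins, and the downstream Bi-LSTM and CRF layers are only indicated in comments:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy tag inventories; the real ones come from the VLSP 2016 annotations.
POS_TAGS = ["N", "V", "A", "P", "E"]
CHUNK_TAGS = ["B-NP", "I-NP", "B-VP", "O"]

def one_hot(tag, inventory):
    v = np.zeros(len(inventory))
    v[inventory.index(tag)] = 1.0
    return v

def token_representation(word_vec, pos, chunk, char_vec):
    """Concatenate all features of one token into a single input vector."""
    return np.concatenate([word_vec,
                           one_hot(pos, POS_TAGS),
                           one_hot(chunk, CHUNK_TAGS),
                           char_vec])

# One token: 300 + 5 + 4 + 60 = 369 dimensions with these toy inventories.
x = token_representation(rng.normal(size=300), "N", "B-NP", rng.normal(size=60))
print(x.shape)

# A sentence would be stacked into a (n_tokens, 369) matrix and passed through
# two Bi-LSTM layers and a CRF output layer, as in Fig. 3.
```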
IV. EXPERIMENTS

A. Datasets

To evaluate the model, we use the dataset from the VLSP 2016 NER task [29], with four types of entities: person (PER), location (LOC), organization (ORG) and miscellaneous (MISC). In addition, the VLSP 2016 dataset provides information about word segmentation, part-of-speech tags, and chunking tags. The number of sentences in the train set, validation set and test set is shown in Table I.

TABLE I
NUMBER OF SENTENCES

Dataset       Number of sentences
Train         14861
Validation    2000
Test          2831

The number of entities included in the train set and test set is shown in the following table:

TABLE II
NUMBER OF LABELS IN DATASET

Type                   Train    Test
Location               6245     1379
Organization           1213     274
Person                 7480     1294
Miscellaneous names    282      49
Total                  15220    2996

B. Hyper-parameters

Table III summarizes the hyper-parameters that we have chosen for our NER model. In order to have a more efficient training process, the parameters are optimized using the Nesterov-accelerated Adaptive Moment Estimation (Nadam) optimizer [5] with a batch size of 64. The word representation is the concatenation of a 300-dimensional word2vec vector (pre-trained from baomoi.com), two one-hot vectors representing the POS tag and the chunk tag, respectively, and a 60-dimensional character vector (generated by a Bi-LSTM network with a dropout rate of 0.3, as shown in Fig. 1). To prevent overfitting, we fix the dropout rate to 0.5 for both Bi-LSTM layers (as shown in Fig. 3). The NER model is trained for 40 epochs. For the first 20 epochs, the initial learning rate is set to 0.004; in the remaining epochs, it is fixed to 0.0004. The best model is obtained when the value of the loss function on the validation set is minimal.
TABLE III
THE MODEL HYPER-PARAMETERS

Hyper-parameter                     Value
Character dimension                 60
Word dimension                      300
Hidden size (char)                  30
Hidden size (word)                  64
Update function                     Nadam
Learning rate (first 20 epochs)     0.004
Learning rate (last 20 epochs)      0.0004
Dropout (character embedding)       0.3
Dropout (two Bi-LSTM layers)        0.5
Batch size                          64
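Read as a training configuration, Table III corresponds roughly to the following schedule (a schematic only; the training step itself is a placeholder, since the paper does not publish its implementation):

```python
CONFIG = {
    "char_dim": 60, "word_dim": 300,
    "char_hidden": 30, "word_hidden": 64,
    "optimizer": "Nadam", "batch_size": 64,
    "dropout_char": 0.3, "dropout_bilstm": 0.5,
    "epochs": 40,
}

def learning_rate(epoch):
    """0.004 for the first 20 epochs, 0.0004 for the remaining 20."""
    return 0.004 if epoch < 20 else 0.0004

best_val_loss = float("inf")
best_epoch = None
for epoch in range(CONFIG["epochs"]):
    lr = learning_rate(epoch)
    # ... one pass over the training batches with the Nadam optimizer would go here ...
    val_loss = 1.0 / (epoch + 1)          # placeholder validation loss
    if val_loss < best_val_loss:          # keep the checkpoint with minimal validation loss
        best_val_loss, best_epoch = val_loss, epoch

print(best_epoch, best_val_loss)
```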
C. Experimental Results

The experiment is conducted by combining all input features, including the POS-tag feature, the chunk feature and the character feature. The results are shown in Table IV.

TABLE IV
RESULTS ON VLSP 2016 TEST-SET

            Precision   Recall   F1-Score
LOC         95.43       96.95    96.18
PER         95.53       97.53    96.52
ORG         87.32       90.51    88.89
MISC        100.0       87.76    93.48
Avg/total   95.32       95.93    95.61

With the VLSP 2016 dataset, the experiment achieved state-of-the-art performance on the Vietnamese NER task with a 95.61% F1 score. Table V shows the performance of our deep learning model and of several published systems on the NER task.

TABLE V
PERFORMANCES ON VLSP 2016 DATASET

Model                       F1-Score
VNER [12]                   95.33
Feature-based CRF [10]      93.93
NNVLP [9]                   92.91
Nguyen et al. 2018 [21]     94.88
Our NER model               95.61

The general difference from the other systems in Table V is that we trained a new word embedding set with the word2vec model, as described in Section III-A. Moreover, we use two Bi-LSTM layers in order to encode word representations.

V. CONCLUSIONS

In this paper, we presented a neural network model for the Vietnamese named entity recognition task which obtains state-of-the-art performance. Experiments on recognizing Vietnamese entities in a sequence labeling setting showed the effectiveness of training a new word embedding set and of using two Bi-LSTM layers to extract hidden features from word representations. Our results outperform those of the best previous systems for Vietnamese named entity recognition.

ACKNOWLEDGMENT

The first author has also received support from the Institute of Mathematics, Vietnam Academy of Science and Technology, in 2019. This work is also supported by iCOMM Media & Tech, Jsc. We would like to thank the iCOMM R&D team for the resources and text data that we used during the training and evaluation of our model.

REFERENCES

[1] A. Akbik, D. Blythe, and R. Vollgraf, Contextual String Embeddings for Sequence Labeling, COLING 2018, 27th International Conference on Computational Linguistics, pp. 1638–1649, 2018.
[2] A. Akbik, T. Bergmann, and R. Vollgraf, Pooled Contextualized Embeddings for Named Entity Recognition, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1, pp. 724–728, June 2019.
[3] D. N. Anh, H. N. Kiem, and V. N. Van, Neural sequence labeling for Vietnamese POS Tagging and NER, 2019 IEEE-RIVF International Conference on Computing and Communication Technologies (RIVF), March 2019.
[4] K. Clark, M-T. Luong, C. D. Manning, and Q. V. Le, Semi-Supervised Sequence Modeling with Cross-View Training, ICLR 2018, Feb. 16th 2018.
[5] T. Dozat, Incorporating Nesterov Momentum into Adam, International Conference on Learning Representations 2016 (ICLR 2016).
[6] R. Florian, A. Ittycheriah, H. Jing, and T. Zhang, Named entity recognition through classifier combination, In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 - Volume 4, pp. 168–171, 2003.
[7] S. Hochreiter and J. Schmidhuber, "Long short-term memory", Neural Computation, Vol. 9:8, pp. 1735–1780, 1997.
[8] Z. Huang, W. Xu, and K. Yu, Bidirectional LSTM-CRF Models for Sequence Tagging, arXiv:1508.01991, Aug. 2015.
[9] J. Lafferty, A. McCallum, and F. Pereira, "Conditional random fields: Probabilistic models for segmenting and labeling sequence data", In Proceedings of the 18th International Conference on Machine Learning, pp. 282–289, 2001.
[10] G. Lample, M. Ballesteros, S. Subramanian, K. Kawakami, and C. Dyer, Neural Architectures for Named Entity Recognition, In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 260–270, June 2016.
[11] T. H. Le, T. T. T. Nguyen, T. H. Do, and X. T. Nguyen, Named Entity Recognition in Vietnamese Text, The Fourth International Workshop on Vietnamese Language and Speech Processing (VLSP 2016), 2016.
[12] T. A. Le, M. Y. Arkhipov, and M. S. Burtsev, "Application of a Hybrid Bi-LSTM-CRF Model to the Task of Russian Named Entity Recognition", In: Artificial Intelligence and Natural Language, AINL 2017, Communications in Computer and Information Science, vol. 789, pp. 91–103, 2017.
[13] P. Le-Hong, A. Roussanaly, T. M. H. Nguyen, and M. Rossignol, An empirical study of maximum entropy approach for part-of-speech tagging of Vietnamese texts, Traitement Automatique des Langues Naturelles - TALN 2010, ATALA (Association pour le Traitement Automatique des Langues), Montréal, Canada, pp. 12–23, Jul 2010.
[14] P. Le-Hong, Vietnamese Named Entity Recognition using Token Regular Expressions and Bidirectional Inference, In The Fourth International Workshop on Vietnamese Language and Speech Processing (VLSP 2016), 2016.
[15] D. Lin and X. Wu, Phrase clustering for discriminative learning, ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2, pp. 1030–1038, 2009.
[16] G. Luo, X. Huang, C-Y. Lin, and Z. Nie, Joint Named Entity Recognition and Disambiguation, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 879–888, September 2015.
[17] X. Ma and E. Hovy, End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp. 1064–1074, March 2016.
[18] T. Mikolov, K. Chen, G. Corrado, and J. D. Tomas, ”Efficient Estimation
of Word Representations in Vector Space”, arXiv:1301.3781, 2013.
[19] P. Q. N. Minh, A Feature-Rich Vietnamese Named-Entity Recognition
Model, arXiv:1803.04375, 12 Mar 2018.
[20] P. Q. N. Minh, A Feature-Based Model for Nested Named Entity
Recognition at VLSP-2018 NER Evaluation Campaign, In Proceedings
of Vietnamese Speech and Language Processing (VLSP), 2018.
[21] T. C. V. Nguyen, T. S. Pham, T. H. Vuong, N. V. Nguyen, and M. V.
Tran, Dsktlab-ner: Nested named entity recognition in vietnamese text,
In The Fourth International Workshop on Vietnamese Language and
Speech Processing (VLSP 2016), 2016.
[22] T. S. Nguyen, L. M. Nguyen, and X. C. Tran, Vietnamese named entity
recognition at vlsp 2016 evaluation campaign, In Proceedings of The
Fourth International Workshop on Vietnamese Language and Speech
Processing, 2016.
[23] K. A. Nguyen, N. Dong, and C-T. Nguyen, Attentive Neural Network
for Named Entity Recognition in Vietnamese, 2019 IEEE-RIVF Inter-
national Conference on Computing and Communication Technologies
(RIVF), March 2019.
[24] M. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee,
and Luke Zettlemoyer, Deep contextualized word representations, Pro-
ceedings of the 2018 Conference of the North American Chapter
of the Association for Computational Linguistics: Human Language
Technologies, Volume 1, June 2018.
[25] T-H. Pham and P. Le-Hong, The Importance of Automatic Syntactic
Features in Vietnamese Named Entity Recognition, The 31st Pacific
Asia Conference on Language, Information and Computation, November
2017.
[26] T-H. Pham, X-K. Pham, T-A. Nguyen, and P. Le-Hong, NNVLP: A
Neural Network-Based Vietnamese Language Processing Toolkit, In
The Companion Volume of the IJCNLP 2017 Proceedings: System
Demonstrations, pp. 37-40, 2017.
[27] M. Schuster and K. K. Paliwal. ”Bidirectional recurrent neural net-
works”, Signal Processing, IEEE Transactions, Vol. 45.11, pp. 2673–
2681, 1997.
[28] F. Wu, J. Liu, C. Wu, Y. Huang, and X. Xie, Neural Chinese Named
Entity Recognition via CNN-LSTM-CRF and Joint Training with Word
Segmentation, The World Wide Web Conference, pp. 3342–3348, Apr.
2019.
[29] http://vlsp.org.vn/resources-vlsp2016
[30] W. Zhang. ”Shift-invariant pattern recognition neural network and its
optical architecture”, Proceedings of Annual Conference of the Japan
Society of Applied Physics, pp. 734, Sept. 1988.
