Abstract— In the age of knowledge, Natural Language Processing (NLP) expresses its demand through a huge range of applications. Previously, NLP dealt with statistical data; today it performs considerably well with corpora, lexicon databases, and pattern recognition. Since Deep Learning (DL) methods allow artificial Neural Networks (NN) to model nonlinear processes, NLP tools have become increasingly accurate and efficient, sparking a revolution. Multi-Layer Neural Networks are gaining importance in NLP for their capability of decent speed and reliable output. Hierarchical representations of data are learned through successive processing layers, and with this arrangement DL methods have come to dominate many practices. This paper strives to provide a review of the tools and the necessary methodology, to present a clear understanding of the association between NLP and DL. Efficiency and execution are both improved in NLP by Part-of-Speech Tagging (POST), Morphological Analysis, Named Entity Recognition (NER), Semantic Role Labeling (SRL), Syntactic Parsing, and Coreference Resolution. Artificial Neural Networks (ANN), Time Delay Neural Networks (TDNN), Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM) operate with Dense Vectors (DV), the Window Approach (WA), and Multitask Learning (MTL) as characteristics of Deep Learning. After statistical methods, as DL extends its influence over NLP, the collaboration between the individual stages of the NLP process and DL methods has established a fundamental connection.

Keywords—Deep Learning; Natural Language Processing; Deep Neural Network; Multitask Learning

I. INTRODUCTION

In the age of information, Natural Language Processing (NLP) has created demand across a comprehensive area of application. NLP has been pursued as a working field since the 1950s, with the aim of presenting significant knowledge from computer systems to non-programmers. Through improvements in NLP, non-subject-matter experts can obtain answers to simple queries. Previously, NLP dealt with statistical data; in recent years it has done well with corpora, lexicon databases, and Neural Networks. Since Deep Learning (DL) methods allow artificial Neural Networks (NN) to model nonlinear processes, Natural Language Processing tools have become more accurate and valuable, making a revolution. Multi-Layer Neural Networks have expanded the influence of natural language processing through their capability of decent speed and reliable output.

In the field of NLP, DL has completely taken over Text Classification and Categorization, Named Entity Recognition (NER), Part-of-Speech Tagging (POST), Semantic Parsing and Question Answering, Paraphrase Detection, Language Generation and Multi-document Summarization, Machine Translation, Speech Recognition, Character Recognition, Spell Checking, etc. Hierarchical representations of data are learned through complicated processing layers, and by this design Deep Learning methods dominate many realms. These tools are powered by DL and applied to natural language to complete the NLP process and achieve its goal. Text Summarization, Caption Generation, Question Answering, Text Classification, Machine Translation, Language Modeling, Speech Recognition: all of these NLP tools work with DL to obtain the desired accurate results. Several methods have been proposed, systems have been implemented for NLP with DL, and they are doing well.

NLP methods are modified when DL is associated with them. NLP processes such as POS tagging, NER, Morphology, Syntactic Parsing, and Coreference Resolution are discussed in the subsequent sections. Different types of Neural Networks, such as ANN, TDNN, RNN, and CNN, are discussed in this paper in their relation to the NLP process. Different DL techniques and tools, such as LSTM, MTL, DV, CBOW, VL, WA, SRL, and Non-Linear Functions, are discussed in their possible relation to NLP. This paper attempts a review of these tools and the basic methodology. After statistical methods, as DL takes over control of NLP, a new form of collaboration between the NLP process and the DL process is presented along with their basic relation.
II. NATURAL LANGUAGE PROCESSING

Once a document is received for processing, it cannot proceed directly to execution; it requires a processing background for authentic execution. There are some obligatory steps in contemporary Natural Language Processing (NLP). Deep Learning (DL) is a superior form of Neural Network (NN), and it deals with preprocessed data: before a NN is applied in NLP, the document must be processed accordingly, and DL likewise operates on prepared document files. That is why the preliminary steps of processing textual documents are extremely valuable. There are six compulsory steps that should be followed to achieve a more sustainable and accurate result when implementing DL. These are: Splitting and Tokenization, Part-of-Speech Tagging, Morphological Analysis, Named Entity Recognition, Syntactic Parsing, and Coreference Resolution [7,8].

Fig. 1. Basic steps of Natural Language Processing for the Deep Learning technique.
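As an illustration of how these steps line up in practice, the following is a minimal sketch using spaCy; the library, the en_core_web_sm model, and the sample sentence are our illustrative assumptions rather than tools prescribed by this paper (note that coreference resolution is not part of this default pipeline).

    import spacy

    # Load a small English pipeline: tokenizer, POS tagger, parser, and NER.
    nlp = spacy.load("en_core_web_sm")
    doc = nlp("Alice moved to Paris in 2017, and she loves the city.")

    for token in doc:
        # text = tokenization, pos_ = POS tag, lemma_ = morphological analysis,
        # dep_ = syntactic (dependency) parse relation
        print(token.text, token.pos_, token.lemma_, token.dep_)

    for ent in doc.ents:
        # Named Entity Recognition: e.g. ('Alice', 'PERSON'), ('Paris', 'GPE')
        print(ent.text, ent.label_)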
Splitting and Tokenization clean the document of unwanted tags and split the document into tokens. An input file may contain tags that format the text, but NLP considers only the real, clean text for processing, so the tags require cleaning before NLP can deliver a better and more correct result. Tokenization is the approach of separating a stream of text into words, phrases, symbols, or other principal elements specified as tokens. How to split, and the character of the tokens, should be defined according to the demands on the output. After cleaning the text of unwanted content and splitting the text into tokens, NLP proceeds on the tokens [3].
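As a sketch of this step, the snippet below strips formatting tags with a regular expression and tokenizes the remainder with NLTK; the sample markup and the choice of NLTK are illustrative assumptions (the tokenizer data must be downloaded once, under a name that differs across NLTK versions).

    import re
    import nltk
    nltk.download("punkt", quiet=True)
    nltk.download("punkt_tab", quiet=True)  # resource name in newer NLTK
    from nltk.tokenize import word_tokenize

    raw = "<p>Deep learning <b>improves</b> NLP tools.</p>"
    clean = re.sub(r"<[^>]+>", " ", raw)   # remove formatting tags
    tokens = word_tokenize(clean)          # split the stream into tokens
    print(tokens)  # ['Deep', 'learning', 'improves', 'NLP', 'tools', '.']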
Part-of-Speech Tagging (POST, or POS tagging) is a mechanism of NLP that plays an extremely significant role in phrase analysis, syntax, translation, and semantic analysis. Rule-Based tagging performs POS tagging by matching everything against a lexical dictionary and then matching the remaining words individually against part-of-speech rules. In Stochastic POS tagging, tagging is accomplished by applying probabilities derived from a tagged corpus. Rule-Based and Stochastic POS tagging are blended together in Transformation-Based POS tagging [2].
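A minimal sketch of the stochastic approach described above, using NLTK's default corpus-trained tagger (an illustrative choice; its model data must be downloaded once, and the resource name differs across NLTK versions).

    import nltk
    nltk.download("averaged_perceptron_tagger", quiet=True)
    nltk.download("averaged_perceptron_tagger_eng", quiet=True)  # newer NLTK
    from nltk import pos_tag

    tokens = ["Deep", "learning", "improves", "NLP", "tools"]
    # The tagger was trained on a tagged corpus, i.e. a stochastic approach.
    print(pos_tag(tokens))
    # e.g. [('Deep', 'JJ'), ('learning', 'NN'), ('improves', 'VBZ'), ...]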
Morphology deals with the connection between the structure and function of words. Morphological Analysis concerns the smallest semantic units and provides a lexical or a logical definition of words. Morphological analyzers and lemmatizers typically demand training data. The latter are trained on character-aligned combinations of stems and lemmas, from which stems are extracted and mapped into lemmas [9].
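A minimal sketch of lemmatization with NLTK's WordNet lemmatizer (an illustrative choice; the wordnet data must be downloaded once).

    import nltk
    nltk.download("wordnet", quiet=True)
    from nltk.stem import WordNetLemmatizer

    lemmatizer = WordNetLemmatizer()
    # Map inflected forms back to lemmas.
    print(lemmatizer.lemmatize("studies", pos="v"))  # 'study'
    print(lemmatizer.lemmatize("better", pos="a"))   # 'good'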
Named Entity Recognition (NER) is the policy that recognizes names and numbers and extracts them from the tokens that are passed on for additional processing. Hand-made NER concentrates on obtaining names and numbers by applying human-made rules. Those rules utilize grammatical, syntactic, and orthographic features in combination with dictionaries. Machine Learning-based NER methods apply a classification-based analytical pattern, converting the recognition query into a classification problem. As with POS tagging, a hybrid scheme has been developed which is a combination of human-made and machine-made rules [6].
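A minimal sketch of the hand-made approach: a small dictionary (gazetteer) combined with orthographic rules; the entries and patterns here are illustrative assumptions.

    import re

    GAZETTEER = {"Alice": "PERSON", "Paris": "LOCATION"}  # human-made dictionary

    def rule_based_ner(tokens):
        entities = []
        for tok in tokens:
            if tok in GAZETTEER:                     # dictionary rule
                entities.append((tok, GAZETTEER[tok]))
            elif re.fullmatch(r"\d+(\.\d+)?", tok):  # orthographic rule: numbers
                entities.append((tok, "NUMBER"))
            elif tok.istitle():                      # capitalization heuristic
                entities.append((tok, "NAME"))
        return entities

    print(rule_based_ner(["Alice", "paid", "30", "euros", "in", "Paris"]))
    # [('Alice', 'PERSON'), ('30', 'NUMBER'), ('Paris', 'LOCATION')]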
Syntactic Parsing produces a comprehensive syntactic analysis, and sentiment analysis, with both constituency and dependency representations, using a compositional model over trees built with deep learning. WordNet® is an immense online lexical database of English: nouns, verbs, adjectives, and adverbs are organized into collections of cognitive synonyms (synsets), each expressing a particular concept, and synsets are interlinked by means of conceptual-semantic and lexical relations [6].
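A minimal sketch of querying WordNet synsets and their conceptual-semantic links through NLTK (an illustrative interface; the wordnet data must be downloaded once).

    import nltk
    nltk.download("wordnet", quiet=True)
    from nltk.corpus import wordnet as wn

    for synset in wn.synsets("car")[:2]:
        print(synset.name(), "-", synset.definition())
        print("  lemmas:", [lemma.name() for lemma in synset.lemmas()])

    # Conceptual-semantic link: a synset's hypernym is its broader concept.
    print(wn.synset("car.n.01").hypernyms())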
Coreference Resolution is the responsibility of determining the linguistic expressions that refer to the same real-world entity in natural language. Coreference consists of two linguistic expressions: the antecedent and the anaphor. The anaphor is the expression whose interpretation (i.e., its mapping to both the surface form and the underlying real-world entity) depends on that of the other expression; the antecedent is the expression on which an anaphor depends. Coreference resolution is completed by identifying both nominal and pronominal mentions among the tokens and resolving or discarding them [4,5].

III. DEEP LEARNING TECHNIQUE ON NATURAL LANGUAGE PROCESSING

A. Dense Vector

Artificial Neural Networks (ANN) take textual data as an input vector x of in-dimensions and produce a product of out-dimensions. Because DL works with raw words (tokens) and not engineered features, the first layer has to map words into real-valued vectors for processing by subsequent layers of the NN. Instead of forming a unique dimension for each and every feature, every feature is embedded into a D-dimensional space and described as a Dense Vector (DV) in that space [1]. If there exists a learnable correlation distribution, it can be captured by the DV; this is the principal strength of the DV representation. DV training causes comparable features to have similar vectors, so that information is shared between them [7].
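A minimal sketch of such an embedding layer in PyTorch (our illustrative framework choice): token ids are mapped into a D-dimensional dense space rather than one one-hot dimension per feature.

    import torch
    import torch.nn as nn

    vocab = {"deep": 0, "learning": 1, "improves": 2, "nlp": 3}
    D = 8  # dimensionality of the dense space
    embed = nn.Embedding(num_embeddings=len(vocab), embedding_dim=D)

    token_ids = torch.tensor([vocab["deep"], vocab["learning"]])
    dense_vectors = embed(token_ids)  # real-valued vectors, shape (2, 8)
    print(dense_vectors.shape)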
B. Feature Extraction

To derive features from tokens, the Distance and Dimensionality feature, the Continuous Bag of Words (CBOW), and Variable Length (VL) representations are implemented. Distance can be measured by subtracting the feature vectors of the corresponding tokens; distance is invariably positive, whichever vector is subtracted from which. This measurement can be used to train the NN and make DL more specific. Distance, the weight of a token, and its synset (synonyms) are all features of the embedded token, and all of these dimensions are properties associated with processing speed and accuracy. The feature dimension to be obtained is an important property of a token. CBOW is a feature based on dense-vector feature extraction: the NN is trained on each input token individually with a unique feature, and the weight of the word makes this feature all the more important. A number of unique words are offered a vector representation (a one-hot vector illustration), and this is a significant part of training DL [1]. Averaging embedded features, and weight-averaging an embedded feature, essentially serve to spread the weight. The DV keeps the primary tokens within the sequence, but DL is inadequate at handling variable length. Given an established window size, the Window Approach (WA) can be a simple resolution of this limitation. WA collaborates strongly with POS tagging, but it is not suited to Semantic Role Labeling (SRL) [7].
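A minimal sketch of CBOW feature extraction over a fixed context window, using gensim (an illustrative choice; sg=0 selects CBOW, and window fixes the context size as in the Window Approach). The toy corpus is an assumption.

    from gensim.models import Word2Vec

    sentences = [
        ["deep", "learning", "improves", "nlp", "tools"],
        ["neural", "networks", "learn", "dense", "vectors"],
    ]
    # sg=0 -> CBOW; window=2 -> fixed-size context window
    model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)

    print(model.wv["nlp"][:5])            # first values of a 50-dim dense vector
    print(model.wv.most_similar("deep"))  # neighbors by vector similarity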
C. Deep Neural Network

The choice of basis for a deep learning Neural Network depends on the purpose it is meant to accomplish. Non-Linear Functions, Output Transformation Functions, and Loss Functions are the ones most available to NLP. A NN is an approximator, and it can approximate any non-linear function. An unbounded range of positive and negative inputs with a bounded output range is the property of a Non-Linear Function. In an Output Transformation Function, the outermost function of the NN is used as a transformation function; to represent a multi-class distribution, an Output Transformation (such as a softmax) is much recommended and used extensively. How far the network departs from the genuine output is indicated by the Loss Function; it depends on the specification, character, and extent of tolerance, and ranking loss, categorical cross-entropy loss, and log loss are the current choices of Loss Function [11]. Time Delay Neural Networks (TDNN) can be the best alternative when modeling long-distance dependencies: a TDNN accumulates local features in the deeper layers and less local (more global) features subsequently. A TDNN performs a linear operation over the tokens, and it performs well on POS tagging, NER, and further complicated steps of NLP such as SRL [10]. The Recurrent Neural Network (RNN) is technologically grounded in handwriting recognition; it exhibits dynamic temporal behavior by feeding back the output of the primary elements of the network, and its connections form a directed series. The RNN further deals with variable length and represents context through timestamps [10]. On the other hand, Convolutional Neural Networks (CNN) take tokens, extract features by applying the convolution method, and build an artificial network: the CNN produces output from the convolutions as D-dimensional vectors and transfers them to a pooling stage, which chooses the most appropriate of the presented features. Deciding the best feature by pooling developed the ANN for text classification; the CNN is good for clustering but starves on learning sequential knowledge. When an RNN trains by the backpropagation algorithm, the vanishing gradient problem occurs; to overcome this problem, Long Short-Term Memory (LSTM) is implemented. LSTM controls log-logistic (sigmoid) gating functions that select which parameters to keep [1].
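A minimal sketch of an LSTM text classifier in PyTorch (an illustrative framework choice), showing the dense embedding, the recurrent layer, the output transformation into class logits, and a cross-entropy loss; all sizes are assumptions.

    import torch
    import torch.nn as nn

    class LSTMClassifier(nn.Module):
        def __init__(self, vocab_size=1000, embed_dim=50, hidden_dim=64,
                     num_classes=2):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)  # dense vectors
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            self.out = nn.Linear(hidden_dim, num_classes)     # output transformation

        def forward(self, token_ids):
            x = self.embed(token_ids)   # (batch, seq, embed_dim)
            _, (h_n, _) = self.lstm(x)  # final hidden state per sequence
            return self.out(h_n[-1])    # class logits

    model = LSTMClassifier()
    logits = model(torch.randint(0, 1000, (4, 12)))  # batch of 4 token sequences
    loss = nn.CrossEntropyLoss()(logits, torch.tensor([0, 1, 1, 0]))
    print(logits.shape, loss.item())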
D. Multi-Tasking Learning

Multitask Learning (MTL) in a Deep Neural Network is the method whereby the NN performs several learning processes at the same time to gain a complementary advantage. In a NN, a feature particular to one task is often essential to another. In NLP, the feature activity of POS prediction also serves SRL and NER, and an adjustment or upgrade to the POS task also generalizes to SRL and NER. The deep layers of the architected NN learn this automatically: the NN is trained on related tasks through the shared deep layers. Most of the time, the last layer in the network is task-specific, and performance is enhanced according to that layer. In MTL tasks, cascading features is the most dynamic way to accomplish the desired output; using one task's features for another, the obvious example is to use an SRL and a POS classifier and feed their results as features to train a parser [11]. When several tasks are labeled in one dataset, a shallow manner can be applied to the tasks jointly, predicting all task labels at the same time with a single model. Shallow joint training can improve training on the POS tagging and noun-phrase chunking tasks, and relation extraction, parsing, and NER can be jointly trained in a statistical parsing model to improve the achievement.
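A minimal sketch of hard parameter sharing for MTL in PyTorch (an illustrative choice): a shared encoder with task-specific POS and NER output layers; the vocabulary and tag-set sizes are assumptions.

    import torch
    import torch.nn as nn

    class MultiTaskTagger(nn.Module):
        def __init__(self, vocab_size=1000, embed_dim=50, hidden_dim=64,
                     n_pos_tags=17, n_ner_tags=9):
            super().__init__()
            # Shared deep layers, trained on both tasks.
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            # Task-specific last layers.
            self.pos_head = nn.Linear(hidden_dim, n_pos_tags)
            self.ner_head = nn.Linear(hidden_dim, n_ner_tags)

        def forward(self, token_ids):
            h, _ = self.encoder(self.embed(token_ids))  # (batch, seq, hidden)
            return self.pos_head(h), self.ner_head(h)   # per-token logits per task

    model = MultiTaskTagger()
    pos_logits, ner_logits = model(torch.randint(0, 1000, (2, 10)))
    print(pos_logits.shape, ner_logits.shape)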
IV. CONCLUSION

Natural Language Processing (NLP) formulates its demand together with the Deep Learning (DL) method. With artificial Neural Networks (NN) and non-linear processing, Natural Language Processing tools have become more accurate and efficient, and the collaboration with DL methods was introduced along with the primary relationships. In the field of NLP, DL has completely taken over Text Classification and Categorization, Named Entity Recognition (NER), Part-of-Speech Tagging (POST), Semantic Parsing and Question Answering, Paraphrase Detection, Language Generation and Multi-document Summarization, Machine Translation, Speech Recognition, Character Recognition, Spell Checking, etc. NLP methods are modified when DL is associated with them. NLP processes such as POS tagging, NER, Morphology, Syntactic Parsing, and Coreference Resolution were discussed in their association with Neural Networks such as ANN, TDNN, RNN, and CNN, and DL techniques and tools such as LSTM, MTL, DV, CBOW, VL, WA, SRL, and Non-Linear Functions were discussed in their relationship with the NLP process, collaborating through these basic relations.

REFERENCES

[1] J. Schmidhuber, "Deep learning in neural networks: An overview", Neural Networks, vol. 61, pp. 85-117, 2015.
[2] M. Hjouj, A. Alarabeyyat and I. Olab, "Rule Based Approach for Arabic Part of Speech Tagging and Name Entity Recognition", International Journal of Advanced Computer Science and Applications, vol. 7, no. 6, 2016.
[3] S. Fahad, "Design and Develop Semantic Textual Document Clustering Model", Journal of Computer Science and Information Technology, vol. 5, no. 2, 2017.
[4] J. Zheng, W. Chapman, R. Crowley and G. Savova, "Coreference resolution: A review of general methodologies and applications in the clinical domain", Journal of Biomedical Informatics, vol. 44, no. 6, pp. 1113-1122, 2011.
[5] A. Kaczmarek and M. Marcińczuk, "A preliminary study in zero anaphora coreference resolution for Polish", Cognitive Studies | Études cognitives, no. 17, 2017.
[6] W. Yafooz, S. Abidin, N. Omar and R. Halim, "Dynamic semantic textual document clustering using frequent terms and named entity", 2013 IEEE 3rd International Conference on System Engineering and Technology, 2013.
[7] H. Li, "Deep learning for natural language processing: advantages and challenges", National Science Review, vol. 5, no. 1, pp. 24-26, 2017.
[8] S. Fahad and W. Yafooz, "Review on Semantic Document Clustering", International Journal of Contemporary Computer Research, vol. 1, no. 1, pp. 14-30, 2017.
[9] O. Bonami and B. Sagot, "Computational methods for descriptive and theoretical morphology: a brief introduction", Morphology, vol. 27, no. 4, pp. 423-429, 2017.
[10] G. Goth, "Deep or shallow, NLP is breaking out", Communications of the ACM, vol. 59, no. 3, pp. 13-16, 2016.
[11] R. Gunderman, "Deep Questioning and Deep Learning", Academic Radiology, vol. 19, no. 4, pp. 489-490, 2012.