International Journal of Artificial Intelligence and Applications (IJAIA), Vol.8, No.2, March 2017
DOI: 10.5121/ijaia.2017.8201
AN ENTITY-DRIVEN RECURSIVE NEURAL
NETWORK MODEL FOR CHINESE DISCOURSE
COHERENCE MODELING
Fan Xu, Shujing Du, Maoxi Li and Mingwen Wang
School of Computer Information Engineering, Jiangxi Normal University
Nanchang 330022, China
ABSTRACT
Chinese discourse coherence modeling remains a challenging task in the Natural Language Processing field. Existing approaches mostly rely on feature engineering, adopting sophisticated features to capture the logical, syntactic, or semantic relationships across sentences within a text. In this paper, we present an entity-driven recursive deep model for Chinese discourse coherence evaluation, built on a current English discourse coherence neural network model. Specifically, to overcome that model's inability to identify entity (noun) overlap across sentences, our combined model integrates entity information into the recursive neural network framework. Evaluation results on both a sentence ordering and a machine translation coherence rating task show the effectiveness of the proposed model, which significantly outperforms a strong existing baseline.
KEYWORDS
Entity, Recursive Neural Network, Chinese Discourse, Coherence
1. INTRODUCTION
Discourse Coherence Modeling (DCM) aims to evaluate the degree of coherence among the sentences of a discourse or text. It is considered one of the key problems in Natural Language Processing (NLP) due to its wide use in many NLP applications, such as statistical machine translation[1], discourse generation[2][3][4], automatic text summarization[3][5][6], and student essay scoring[7][8][9].
In general, a coherent discourse has many similar components (lexical overlap or coreference) across its sentences, while an incoherent discourse does not. Therefore, traditional Centering[10]-driven, entity-based models[11][12][13][14] were proposed to capture the syntactic or semantic distribution of discourse entities (nouns) between two adjacent sentences in a text. Thereafter, many extensions were presented, such as Feng and Hirst[15]'s multiple ranking model, Lin et al.[16]'s discourse relation-based approach, and Louis and Nenkova[17]'s syntactic patterns-based model. However, all of these traditional coherence models require feature engineering, which is time-consuming.
To overcome the limitation of feature engineering, recent research uses neural networks to extract the syntactic or semantic representation of a sentence automatically. Li et al.[18] proposed a deep neural model for English discourse coherence evaluation. However, their discourse coherence model only focuses on the distributed representation of
sentences, and does not consider the entity (noun) distribution across sentences. In fact, entities can overlap between two adjacent sentences, and this overlap is a good signal of the coherence between adjacent sentences, as the traditional entity-based methods show. We therefore integrate this kind of information into the recursive neural network framework. Evaluation results on both a sentence ordering and a machine translation coherence rating task show the effectiveness of the proposed model, which significantly outperforms a strong existing baseline.
Specifically, this paper tries to answer the following three questions:
(1) Can current English discourse coherence models (traditional or neural) work for the Chinese discourse coherence evaluation task?
(2) Can the traditional entity-based model be integrated into a current deep model?
(3) Which kind of word embedding works better for Chinese discourse coherence evaluation?
The rest of this paper is organized as follows. Section 2 reviews related work on discourse
coherence modeling. Section 3 introduces the framework of our entity-driven recursive neural
network-based Chinese discourse coherence model. Section 4 describes the experimental results and a detailed analysis. Finally, conclusions are drawn in Section 5.
2. RELATED WORK
In this section, we describe related work on discourse coherence modeling, covering traditional and neural network models respectively.
2.1. TRADITIONAL COHERENCE MODEL
The task of DCM was first introduced by Foltz et al.[19]. They formulated discourse coherence as a function of the semantic relatedness between adjacent sentences within a text, and employed a vector-based representation of lexical meaning to compute this relatedness. Since then, many supervised approaches to DCM have been proposed in the literature, such as the entity-based model[11][12][13][15], the discourse relation-based model[16], the syntactic patterns-based model[17], the coreference resolution-based model[20][21], the content-based model via Hidden Markov Models (HMM)[3][22], and the cohesion-driven model[23].
To be more specific, Barzilay and Lapata[11][12] presented an entity-based model to capture the distribution of discourse entities between two adjacent sentences within a text. Extending the entity-based approach, Lin et al.[16] explored discourse relations to refine the entities and to capture the behavior of discourse relation transitions among sentences. In addition, Feng and Hirst[15] showed that multiple ranking, instead of pairwise ranking, was effective for DCM.
Differently, Louis and Nenkova[17] explored the role of syntactic structure in DCM. Besides, Iida et al.[20] and Elsner et al.[21] demonstrated the value of coreference resolution. In addition, Barzilay et al.[3] and Elsner et al.[22] showed that a Hidden Markov Model (HMM)-based content model can capture topic transitions from the first sentence to the last sentence of a text, where topics are modeled as hidden states and sentences as observations. Still, a potential issue of the HMM model is its domain dependence. Also, Xu et al.[23] explored the impact of Halliday[24]'s Theme Structure Theory (TST) on English discourse coherence modeling. Their model shows the importance of theme structure, a cohesion theory from Halliday's systemic-functional grammar, for DCM, and the appropriateness of a theme- and coreference-based filtering mechanism.
Figure 1: The framework of the entity-driven recursive model for Chinese discourse coherence modeling.
2.2. NEURAL COHERENCE MODEL
Recently, Li et al.[18] presented a deep neural model for English discourse coherence modeling. They demonstrated the effectiveness of both recurrent and recursive neural network (RNN) models for English.
However, as mentioned in Section 1, their model does not consider the entity (noun) distribution or entity overlap across sentences within a text. In fact, entity overlap between two adjacent sentences indicates the logical or semantic coherence of a text. We therefore integrate this information into their model.
3. ENTITY-DRIVEN RNN COHERENCE MODEL
In this section, we describe our entity-driven RNN Chinese discourse coherence model.
3.1. FRAMEWORK
Figure 1 shows the entity-driven recursive deep model for Chinese discourse coherence modeling. Our deep model is based on Li et al.[18]'s English discourse coherence framework. In comparison, their model does not exploit the entities across the sentences of a text; we therefore integrate the entities into the recursive neural network model.
3.2. SENTENCE REPRESENTATION
For the word-level representation, each word in a sentence is represented by a vector (a word embedding) that captures its semantic meaning, trained with a toolkit such as word2vec¹ or GloVe². More specifically, a word of a sentence is represented by a vector e_w = {e_w^1, e_w^2, …, e_w^K}, where K denotes the dimension of the word embedding.
¹ http://code.google.com/p/word2vec/
² http://nlp.stanford.edu/projects/glove/
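To make the word-level representation concrete, a minimal Python sketch of the embedding lookup follows; the vocabulary, the random vectors, and the `embed` helper are hypothetical stand-ins for embeddings actually trained with word2vec or GloVe.

```python
import numpy as np

K = 50  # embedding dimension (the paper uses K = 50 or 100)

# Hypothetical embedding table: in practice these vectors would be
# trained with word2vec or GloVe on a Chinese corpus.
rng = np.random.default_rng(0)
embeddings = {w: rng.standard_normal(K) for w in ["经济", "发展", "中国"]}

def embed(word):
    """Return the K-dim vector e_w for a word; zeros for out-of-vocabulary words."""
    return embeddings.get(word, np.zeros(K))
```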
For the sentence-level representation, as shown in Figure 1, the vector for the whole sentence is computed by recursively computing a representation for each parent node from its immediate children, in a bottom-up fashion, until the root of the tree is reached. Concretely, for a given parent p in the tree and its two children c1 (with vector representation h_c1) and c2 (with vector representation h_c2), a standard recursive network computes h_p for p as follows:

h_p = f(W_Recursive · [h_c1, h_c2] + b_Recursive)    (1)

where [h_c1, h_c2] is the concatenation of the child vectors h_c1 and h_c2; W_Recursive is a K×2K matrix and b_Recursive is a K×1 bias vector; f(·) is the tanh function.
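A minimal numpy sketch of Equation (1), assuming a binarized parse tree represented as nested (left, right) pairs with K-dimensional word embeddings at the leaves; the tree encoding and the uniform initialization scale are our assumptions, not details given in the paper.

```python
import numpy as np

K = 50
rng = np.random.default_rng(0)
W_rec = rng.uniform(-0.1, 0.1, size=(K, 2 * K))  # W_Recursive: K x 2K
b_rec = np.zeros(K)                              # b_Recursive: K x 1

def compose(h_c1, h_c2):
    """Equation (1): parent vector from the concatenated child vectors."""
    return np.tanh(W_rec @ np.concatenate([h_c1, h_c2]) + b_rec)

def sentence_vector(tree):
    """Bottom-up composition until the root: a leaf is a K-dim embedding,
    an internal node is a (left, right) pair of subtrees."""
    if isinstance(tree, np.ndarray):
        return tree
    left, right = tree
    return compose(sentence_vector(left), sentence_vector(right))
```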
3.3. ENTITY-DRIVEN SENTENCE CONVOLUTION
The framework treats a window of sentences as a clique C (a sliding window of L sentences) and associates each clique with a tag y_c that takes the value 1 if the clique is coherent, and 0 otherwise. As shown in Figure 1, each clique C takes as input an (L·K)×1 vector h_c formed by concatenating the embeddings of all its sentences. The hidden layer takes h_c as input and performs the convolution using a non-linear tanh function. The output vector of the hidden layer, defined as q_c, is computed as:

q_c = f(W_sen · (h_c ∘ h_entity) + b_sen)    (2)

where ∘ denotes element-wise multiplication (see Section 3.3.1), W_sen is an H×(L·K) matrix, and b_sen is an H×1 bias vector; H refers to the number of neurons in the hidden layer.
3.3.1. ENTITY-DRIVEN MECHANISM
First, we sum the word embeddings of the nouns to generate h_entity:

h_entity = e_w^NN1 ⊕ e_w^NN2 ⊕ … ⊕ e_w^NNk    (3)

where ⊕ denotes vector summation over the embeddings of the nouns NN1, …, NNk. Then, we perform element-wise multiplication between h_c and h_entity.
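A sketch of Equation (3) follows. The paper does not spell out how h_entity is aligned with the (L·K)-dimensional h_c, so building one K-dimensional entity vector per sentence and concatenating them across the clique is our assumption.

```python
import numpy as np

def entity_vector(noun_vecs, dim):
    """Equation (3): sum (⊕) the embeddings of the nouns of one sentence."""
    h = np.zeros(dim)
    for e in noun_vecs:
        h = h + e
    return h

def clique_entity_vector(per_sentence_noun_vecs, dim):
    """One entity vector per sentence, concatenated to match h_c's shape
    (an assumption; the paper leaves the alignment implicit)."""
    return np.concatenate([entity_vector(v, dim) for v in per_sentence_noun_vecs])
```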
The value of the output layer is computed as:

P(y_c = 1) = sigmoid(U^T q_c + b)    (4)

where U is an H×1 vector and b denotes the bias; y_c = 1 means the text is coherent, and 0 otherwise.
Therefore, the total coherence score of a given document is the probability that all cliques within the document are coherent:

S_d = ∏_{C∈d} p(y_c = 1)    (5)

Finally, we can determine whether a text is coherent according to the value of its coherence score.
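Putting Equations (2), (4), and (5) together, a forward pass over one document can be sketched as follows; the parameter initialization and the per-sentence entity alignment are assumptions carried over from the sketches above.

```python
import numpy as np

K, L_win, H = 50, 3, 100   # embedding size, window size L, hidden size H
rng = np.random.default_rng(0)
W_sen = rng.uniform(-0.1, 0.1, size=(H, L_win * K))
b_sen = np.zeros(H)
U = rng.uniform(-0.1, 0.1, size=H)
b_out = 0.0

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def clique_prob(sentence_vecs, h_entity):
    """Equations (2) and (4): coherence probability of one clique."""
    h_c = np.concatenate(sentence_vecs)              # (L*K,) clique input vector
    q_c = np.tanh(W_sen @ (h_c * h_entity) + b_sen)  # Eq. (2), element-wise gate
    return sigmoid(U @ q_c + b_out)                  # Eq. (4)

def document_score(sent_vecs, entity_vecs):
    """Equation (5): product of clique probabilities over sliding windows."""
    score = 1.0
    for i in range(len(sent_vecs) - L_win + 1):
        score *= clique_prob(sent_vecs[i:i + L_win],
                             np.concatenate(entity_vecs[i:i + L_win]))
    return score
```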
3.4. TRAINING AND OPTIMIZATION
The cost function for the model is:

J(Θ) = (1/M) ∑_{C∈trainset} H_0 + (Q/2M) ∑_{θ∈Θ} θ²    (6)

H_0 = −y_c log[p(y_c = 1)] − (1 − y_c) log[1 − p(y_c = 1)]    (7)

where Θ = [W_Recursive, W_sen, U_sen]; M denotes the number of training samples and Q is the regularization weight.
We adopt the widely applied diagonal variant of AdaGrad (Duchi et al.[25]) to optimize the loss function.
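A minimal sketch of the diagonal AdaGrad update used to minimize Equation (6); the dict-of-arrays parameter layout and the epsilon smoothing term are implementation assumptions. The learning rate of 0.01 matches the setting reported in Section 4.2.

```python
import numpy as np

def adagrad_step(params, grads, cache, lr=0.01, eps=1e-8):
    """One diagonal-AdaGrad update (Duchi et al., 2011): per-coordinate
    learning rates scaled by the accumulated squared gradients."""
    for name, g in grads.items():
        cache[name] += g ** 2
        params[name] -= lr * g / (np.sqrt(cache[name]) + eps)
```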
4. EXPERIMENTS
In this section, we demonstrate the effectiveness of our discourse coherence model on both the sentence ordering and the machine translation coherence rating tasks. The former aims to discern an original text from a permuted ordering of its sentences, while the latter aims to discern a human (reference) translation from an automatically generated machine translation.
4.1. DATASET
Sentence Ordering Dataset: We select documents from Chinese Treebank 6.0, distributed by the Linguistic Data Consortium (LDC) with catalog number LDC2007T36 and ISBN 1-58563-450-6. We use the 100 documents from chtb_2946 to chtb_3045 as our training set, and the 100 documents from chtb_3046 to chtb_3145 as our test set. The sentences of each source file are permuted at most 20 times, yielding 1027 test texts in total. The average number of sentences per document is 10.33 for the training set and 13.56 for the test set. In the evaluation, we treat the original texts as more coherent (positive instances) than the permuted ones (negative instances).
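The construction of the negative instances can be sketched as follows; the rejection sampling and the attempt cap are our choices, since the paper only states that each source file is permuted at most 20 times.

```python
import math
import random

def permuted_documents(sentences, max_perms=20, seed=0):
    """Up to max_perms distinct sentence orders of one document,
    excluding the original order (the negative instances)."""
    rng = random.Random(seed)
    n = len(sentences)
    target = min(max_perms, math.factorial(n) - 1)  # at most n! - 1 reorderings exist
    perms, attempts = set(), 0
    while len(perms) < target and attempts < 100 * max_perms:
        order = tuple(rng.sample(range(n), n))
        if order != tuple(range(n)):
            perms.add(order)
        attempts += 1
    return [[sentences[i] for i in order] for order in perms]
```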
Machine Translation Dataset: Similarly, we extract documents from the NIST Open Machine Translation 2008 Evaluation (MT08) Selected Reference and System Translations, distributed by the LDC with catalog number LDC2010T01 and ISBN 1-58563-533-2. The English-to-Chinese portion contains 127 documents with 1,830 segments, output by 11 machine translation systems. The average number of sentences per document is 13.38 for the training set and 13.39 for the test set. In the evaluation, we treat the human (reference) translations as more coherent than the machine-generated ones.
4.2. EXPERIMENTAL SETTINGS
Initialization: Following Li et al.[18], the parameters W_sen, W_Recursive, and h_0 are initialized by drawing randomly from a uniform distribution. The number of hidden-layer neurons H is set to 100, the learning rate in the optimization process is set to 0.01, and the batch size is set to 20. Differently, the word embeddings {e} for Chinese are trained with word2vec and GloVe, respectively; the embedding dimension is 50 or 100, and the window size L is 3 or 5.
Evaluation Metric: We report system performance using accuracy, the number of correctly identified original texts or reference translations divided by the total number of texts or translation documents.
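Concretely, for the sentence ordering task this amounts to pairwise comparisons of Equation (5) scores; pairing each original document with its permutations is our reading of the setup.

```python
def pairwise_accuracy(pairs, score):
    """Fraction of (original, permuted) pairs where the original document
    receives the strictly higher coherence score."""
    correct = sum(1 for original, permuted in pairs
                  if score(original) > score(permuted))
    return correct / len(pairs)
```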
Baseline System 1: The entity graph-based model[14], which has been demonstrated to be a simple but effective implementation of the entity-based coherence model. We re-implement their method based on publicly available code³.
Baseline System 2: Li et al.[18]'s recursive neural model, which does not consider entity transition information. We port their English discourse coherence framework to Chinese. Furthermore, we integrate the entity information into their deep model.
In addition, we employ the Stanford parser⁴ to generate sentence-level constituent parse trees and part-of-speech tags, from which we obtain the entities (nouns) occurring in each sentence, and we use the ICTCLAS⁵ toolkit for Chinese word segmentation.
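For illustration, once each sentence is segmented and POS-tagged, the entity (noun) extraction reduces to a tag filter; which Chinese Treebank tags to treat as nominal (NN, NR, NT) is our assumption.

```python
def extract_nouns(tagged_sentence):
    """Keep the noun tokens of one POS-tagged sentence, given as
    (token, tag) pairs, e.g. [("中国", "NR"), ("经济", "NN"), ("增长", "VV")]."""
    return [tok for tok, tag in tagged_sentence
            if tag.startswith("NN") or tag in ("NR", "NT")]
```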
4.3. EXPERIMENT RESULTS
In this section, we report the experimental results for Chinese discourse coherence modeling on both the sentence ordering and machine translation coherence rating tasks.
4.3.1. RESULTS ON SENTENCE ORDERING
Table 1 shows the performance of our entity-driven deep model under different window sizes, embedding dimensions, and types of word embedding.

Table 1: The performance under different settings on the sentence ordering task.

Table 1 shows that:

(1) Dimension

Generally speaking, performance increases with the embedding dimension: the larger the dimension, the more representational power the model has.
(2) Window size

Performance decreases as the window size increases, with the best performance at a window size of 3. This is mostly explained by the local entity distribution characteristic demonstrated by Barzilay and Lapata[11][12] and Guinaudeau and Strube[14]: as the window size increases, entity co-occurrence decreases accordingly.
³ http://github.com/karins/CoherenceFramework
⁴ http://nlp.stanford.edu/software/lex-parser.shtml
⁵ http://ictclas.nlpir.org/downloads
Table 2, below, lists the performance comparison between our model and the current baseline models (traditional and neural).

Table 2: Performance comparison among different coherence models on the sentence ordering task; performance that is significantly superior to the baseline systems (p<0.05, paired t-test) is denoted by *.

The results show that our combined model significantly outperforms the current deep model for Chinese discourse coherence modeling, which demonstrates the effectiveness and importance of the entity distribution across sentences. Interestingly, the traditional entity-based model also works for Chinese discourse coherence evaluation, whereas it does not work as well for English. This is most likely because the entity distribution is more pronounced in Chinese discourse than in English text.
4.3.2. RESULTS ON MACHINE TRANSLATION COHERENCE RATING
Table 3, below, lists the performance of our model and the baseline models.

Table 3: Performance comparison among different coherence models on the machine translation coherence rating task with dimension equal to 100; performance that is significantly superior to the baseline systems (p<0.05, paired t-test) is denoted by *.

In fact, discourse coherence evaluation is more natural on the machine translation task than on the sentence ordering task. As the results in Table 3 show, our model again significantly outperforms the current deep model, and it also significantly outperforms the traditional entity-based model. This is most likely because the entity distribution is not obvious in machine-generated text, yet the entity (noun) information can still be usefully integrated into the recursive neural network model.
5. CONCLUSIONS AND FUTURE WORK
In this paper, we present an entity-driven recursive deep model for Chinese discourse coherence modeling. We integrate the entities across sentences into a recursive neural framework. Evaluation results on both a sentence ordering and a machine translation coherence rating task show the effectiveness of the proposed model. In future work, we will integrate a coreference mechanism into the combined recursive neural network model and explore other coherence evaluation tasks.
ACKNOWLEDGEMENTS
The authors would like to thank the anonymous reviewers for their comments on this paper. This research was supported by the National Natural Science Foundation of China under Grants No. 61402208, No. 61462045, No. 61462044, and No. 61662030, the Natural Science Foundation and Education Department of Jiangxi Province under Grants No. 20151BAB207027 and GJJ150351, and the Research Project of the State Language Commission under Grant No. YB125-99.
REFERENCES
[1] Heidi J. Fox, (2002), "Phrasal Cohesion and Statistical Machine Translation", In Proceedings of EMNLP, pages 304-311.
[2] Radu Soricut and Daniel Marcu, (2006), "Discourse Generation Using Utility-Trained Coherence Models", In Proceedings of COLING-ACL, pages 803-810.
[3] Regina Barzilay and Lillian Lee, (2004), "Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization", In Proceedings of NAACL-HLT, pages 113-120.
[4] Jiwei Li, Minh-Thang Luong and Dan Jurafsky, (2015), "A Hierarchical Neural Autoencoder for Paragraphs and Documents", In Proceedings of ACL, pages 1106-1115.
[5] Zi-Heng Lin, Hwee Tou Ng and Min-Yen Kan, (2012), "Combining Coherence Models and Machine Translation Evaluation Metrics for Summarization Evaluation", In Proceedings of ACL, pages 1006-1014.
[6] Danushka Bollegala, Naoaki Okazaki and Mitsuru Ishizuka, (2006), "A Bottom-Up Approach to Sentence Ordering for Multi-Document Summarization", In Proceedings of ICCL-ACL, pages 385-392.
[7] Helen Yannakoudakis and Ted Briscoe, (2012), "Modeling coherence in ESOL learner texts", In Proceedings of ACL, pages 33-43.
[8] Jill Burstein, Joel Tetreault and Slava Andreyev, (2010), "Using Entity-Based Features to Model Coherence in Student Essays", In Proceedings of NAACL-HLT, pages 681-684.
[9] Derrick Higgins, Jill Burstein, Daniel Marcu and Claudia Gentile, (2004), "Evaluating Multiple Aspects of Coherence in Student Essays", In Proceedings of NAACL-HLT, pages 185-192.
[10] Barbara J. Grosz, Scott Weinstein and Aravind K. Joshi, (1995), "Centering: A Framework for Modeling the Local Coherence of Discourse", Computational Linguistics, 21(2):203-225.
[11] Regina Barzilay and Mirella Lapata, (2005), "Modeling Local Coherence: An Entity-Based Approach", In Proceedings of ACL, pages 141-148.
[12] Regina Barzilay and Mirella Lapata, (2008), "Modeling Local Coherence: An Entity-Based Approach", Computational Linguistics, 34(1):1-34.
[13] Mirella Lapata and Regina Barzilay, (2005), "Automatic Evaluation of Text Coherence: Models and Representations", In Proceedings of IJCAI, pages 1085-1090.
[14] Camille Guinaudeau and Michael Strube, (2013), "Graph-based local coherence modeling", In Proceedings of ACL, pages 93-103.
[15] Vanessa Wei Feng and Graeme Hirst, (2012), "Extending the Entity-based Coherence Model with Multiple Ranks", In Proceedings of EACL, pages 315-324.
[16] Zi-Heng Lin, Hwee Tou Ng and Min-Yen Kan, (2011), "Automatically Evaluating Text Coherence Using Discourse Relations", In Proceedings of ACL, pages 997-1006.
[17] Annie Louis and Ani Nenkova, (2012), "A coherence model based on syntactic patterns", In Proceedings of EMNLP-CoNLL, pages 1157-1168.
[18] Jiwei Li and Eduard Hovy, (2014), "A Model of Coherence Based on Distributed Sentence Representation", In Proceedings of EMNLP, pages 2039-2048.
[19] Peter W. Foltz, Walter Kintsch and Thomas K. Landauer, (1998), "The measurement of textual coherence with latent semantic analysis", Discourse Processes, 25(2&3):285-307.
[20] Ryu Iida and Takenobu Tokunaga, (2012), "A Metric for Evaluating Discourse Coherence based on Coreference Resolution", In Proceedings of COLING, Posters, pages 483-494.
[21] Micha Elsner and Eugene Charniak, (2008), "Coreference-inspired Coherence Modeling", In Proceedings of ACL 2008, Short Papers, pages 41-44.
[22] Micha Elsner, Joseph Austerweil and Eugene Charniak, (2007), "A Unified Local and Global Model for Discourse Coherence", In Proceedings of NAACL, pages 436-443.
[23] Fan Xu, Qiaoming Zhu, Guodong Zhou and Mingwen Wang, (2014), "Cohesion-driven Discourse Coherence Modeling", Journal of Chinese Information Processing, 28(3):11-21.
[24] M. A. K. Halliday, (1994), "An Introduction to Functional Grammar", Hodder Education Press, London, United Kingdom.
[25] John Duchi, Elad Hazan and Yoram Singer, (2011), "Adaptive subgradient methods for online learning and stochastic optimization", The Journal of Machine Learning Research, 12:2121-2159.
AUTHORS
Fan Xu holds a Doctoral Degree (Ph.D.) in Computer Science from Soochow University, China. His research interests include Natural Language Processing, Chinese Information Processing, Discourse Analysis, and Speech Recognition. At present he is working as a Lecturer at the School of Computer Information Engineering, Jiangxi Normal University, China. He is a member of various professional bodies, including ACL, IEEE, and ACIS.

Shujing Du is a Master's student in Computer Science at Jiangxi Normal University, China. Her research interests include Natural Language Processing, Chinese Information Processing, and Discourse Analysis.

Maoxi Li holds a Doctoral Degree (Ph.D.) in Computer Science from the Chinese Academy of Sciences. His research interests include Machine Translation and Natural Language Processing. At present he is working as an Associate Professor at the School of Computer Information Engineering, Jiangxi Normal University, China.

Mingwen Wang holds a Doctoral Degree (Ph.D.) in Computer Science from Shanghai Jiaotong University, China. His research interests include Machine Learning, Information Retrieval, Natural Language Processing, Image Processing, and Chinese Information Processing. At present he is working as a Professor at the School of Computer Information Engineering, Jiangxi Normal University, China. He is a member of various professional bodies, including ACL, IEEE, CCF, and ACIS.
More Related Content

What's hot (20)

Sentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic RelationsSentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic Relations
Editor IJCATR
 
Text smilarity02 corpus_based
Text smilarity02 corpus_basedText smilarity02 corpus_based
Text smilarity02 corpus_based
cyan1d3
 
An introduction to compositional models in distributional semantics
An introduction to compositional models in distributional semanticsAn introduction to compositional models in distributional semantics
An introduction to compositional models in distributional semantics
Andre Freitas
 
Introduction to Distributional Semantics
Introduction to Distributional SemanticsIntroduction to Distributional Semantics
Introduction to Distributional Semantics
Andre Freitas
 
New word analogy corpus
New word analogy corpusNew word analogy corpus
New word analogy corpus
Lukáš Svoboda
 
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
cscpconf
 
[Paper Reading] Supervised Learning of Universal Sentence Representations fro...
[Paper Reading] Supervised Learning of Universal Sentence Representations fro...[Paper Reading] Supervised Learning of Universal Sentence Representations fro...
[Paper Reading] Supervised Learning of Universal Sentence Representations fro...
Hiroki Shimanaka
 
Barzilay & Lapata 2008 presentation
Barzilay & Lapata 2008 presentationBarzilay & Lapata 2008 presentation
Barzilay & Lapata 2008 presentation
Richard Littauer
 
Topicmodels
TopicmodelsTopicmodels
Topicmodels
Ajay Ohri
 
Report
ReportReport
Report
butest
 
Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...
ijnlc
 
Compositional Distributional Models of Meaning
Compositional Distributional Models of MeaningCompositional Distributional Models of Meaning
Compositional Distributional Models of Meaning
Dimitrios Kartsaklis
 
Ihi2012 semantic-similarity-tutorial-part1
Ihi2012 semantic-similarity-tutorial-part1Ihi2012 semantic-similarity-tutorial-part1
Ihi2012 semantic-similarity-tutorial-part1
University of Minnesota, Duluth
 
Canini09a
Canini09aCanini09a
Canini09a
Ajay Ohri
 
Topic models
Topic modelsTopic models
Topic models
Ajay Ohri
 
A supervised word sense disambiguation method using ontology and context know...
A supervised word sense disambiguation method using ontology and context know...A supervised word sense disambiguation method using ontology and context know...
A supervised word sense disambiguation method using ontology and context know...
Alexander Decker
 
IRJET- An Analysis of Recent Advancements on the Dependency Parser
IRJET- An Analysis of Recent Advancements on the Dependency ParserIRJET- An Analysis of Recent Advancements on the Dependency Parser
IRJET- An Analysis of Recent Advancements on the Dependency Parser
IRJET Journal
 
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANSCONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
ijseajournal
 
Hyponymy extraction of domain ontology
Hyponymy extraction of domain ontologyHyponymy extraction of domain ontology
Hyponymy extraction of domain ontology
IJwest
 
AMBIGUITY-AWARE DOCUMENT SIMILARITY
AMBIGUITY-AWARE DOCUMENT SIMILARITYAMBIGUITY-AWARE DOCUMENT SIMILARITY
AMBIGUITY-AWARE DOCUMENT SIMILARITY
ijnlc
 
Sentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic RelationsSentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic Relations
Editor IJCATR
 
Text smilarity02 corpus_based
Text smilarity02 corpus_basedText smilarity02 corpus_based
Text smilarity02 corpus_based
cyan1d3
 
An introduction to compositional models in distributional semantics
An introduction to compositional models in distributional semanticsAn introduction to compositional models in distributional semantics
An introduction to compositional models in distributional semantics
Andre Freitas
 
Introduction to Distributional Semantics
Introduction to Distributional SemanticsIntroduction to Distributional Semantics
Introduction to Distributional Semantics
Andre Freitas
 
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
cscpconf
 
[Paper Reading] Supervised Learning of Universal Sentence Representations fro...
[Paper Reading] Supervised Learning of Universal Sentence Representations fro...[Paper Reading] Supervised Learning of Universal Sentence Representations fro...
[Paper Reading] Supervised Learning of Universal Sentence Representations fro...
Hiroki Shimanaka
 
Barzilay & Lapata 2008 presentation
Barzilay & Lapata 2008 presentationBarzilay & Lapata 2008 presentation
Barzilay & Lapata 2008 presentation
Richard Littauer
 
Report
ReportReport
Report
butest
 
Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...
ijnlc
 
Compositional Distributional Models of Meaning
Compositional Distributional Models of MeaningCompositional Distributional Models of Meaning
Compositional Distributional Models of Meaning
Dimitrios Kartsaklis
 
Topic models
Topic modelsTopic models
Topic models
Ajay Ohri
 
A supervised word sense disambiguation method using ontology and context know...
A supervised word sense disambiguation method using ontology and context know...A supervised word sense disambiguation method using ontology and context know...
A supervised word sense disambiguation method using ontology and context know...
Alexander Decker
 
IRJET- An Analysis of Recent Advancements on the Dependency Parser
IRJET- An Analysis of Recent Advancements on the Dependency ParserIRJET- An Analysis of Recent Advancements on the Dependency Parser
IRJET- An Analysis of Recent Advancements on the Dependency Parser
IRJET Journal
 
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANSCONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
ijseajournal
 
Hyponymy extraction of domain ontology
Hyponymy extraction of domain ontologyHyponymy extraction of domain ontology
Hyponymy extraction of domain ontology
IJwest
 
AMBIGUITY-AWARE DOCUMENT SIMILARITY
AMBIGUITY-AWARE DOCUMENT SIMILARITYAMBIGUITY-AWARE DOCUMENT SIMILARITY
AMBIGUITY-AWARE DOCUMENT SIMILARITY
ijnlc
 

Similar to An Entity-Driven Recursive Neural Network Model for Chinese Discourse Coherence Modeling Full Text (20)

Effect of word embedding vector dimensionality on sentiment analysis through ...
Effect of word embedding vector dimensionality on sentiment analysis through ...Effect of word embedding vector dimensionality on sentiment analysis through ...
Effect of word embedding vector dimensionality on sentiment analysis through ...
IAESIJAI
 
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESTHE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
kevig
 
Concurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsConcurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector Representations
Parang Saraf
 
Cooperating Techniques for Extracting Conceptual Taxonomies from Text
Cooperating Techniques for Extracting Conceptual Taxonomies from TextCooperating Techniques for Extracting Conceptual Taxonomies from Text
Cooperating Techniques for Extracting Conceptual Taxonomies from Text
University of Bari (Italy)
 
Cooperating Techniques for Extracting Conceptual Taxonomies from Text
Cooperating Techniques for Extracting Conceptual Taxonomies from TextCooperating Techniques for Extracting Conceptual Taxonomies from Text
Cooperating Techniques for Extracting Conceptual Taxonomies from Text
Fulvio Rotella
 
Artificial Intelligence of the Web through Domain Ontologies
Artificial Intelligence of the Web through Domain OntologiesArtificial Intelligence of the Web through Domain Ontologies
Artificial Intelligence of the Web through Domain Ontologies
International Journal of Science and Research (IJSR)
 
Deep Neural Methods for Retrieval
Deep Neural Methods for RetrievalDeep Neural Methods for Retrieval
Deep Neural Methods for Retrieval
Bhaskar Mitra
 
Learning a Recurrent Visual Representation for Image Caption G
Learning a Recurrent Visual Representation for Image Caption GLearning a Recurrent Visual Representation for Image Caption G
Learning a Recurrent Visual Representation for Image Caption G
JospehStull43
 
O NTOLOGY B ASED D OCUMENT C LUSTERING U SING M AP R EDUCE
O NTOLOGY B ASED D OCUMENT C LUSTERING U SING M AP R EDUCE O NTOLOGY B ASED D OCUMENT C LUSTERING U SING M AP R EDUCE
O NTOLOGY B ASED D OCUMENT C LUSTERING U SING M AP R EDUCE
IJDMS
 
French machine reading for question answering
French machine reading for question answeringFrench machine reading for question answering
French machine reading for question answering
Ali Kabbadj
 
Text Mining: (Asynchronous Sequences)
Text Mining: (Asynchronous Sequences)Text Mining: (Asynchronous Sequences)
Text Mining: (Asynchronous Sequences)
IJERA Editor
 
Learning a Recurrent Visual Representation for Image Caption G.docx
Learning a Recurrent Visual Representation for Image Caption G.docxLearning a Recurrent Visual Representation for Image Caption G.docx
Learning a Recurrent Visual Representation for Image Caption G.docx
croysierkathey
 
[Emnlp] what is glo ve part ii - towards data science
[Emnlp] what is glo ve  part ii - towards data science[Emnlp] what is glo ve  part ii - towards data science
[Emnlp] what is glo ve part ii - towards data science
Nikhil Jaiswal
 
A semantic framework and software design to enable the transparent integratio...
A semantic framework and software design to enable the transparent integratio...A semantic framework and software design to enable the transparent integratio...
A semantic framework and software design to enable the transparent integratio...
Patricia Tavares Boralli
 
EXPLOITING RHETORICAL RELATIONS TO MULTIPLE DOCUMENTS TEXT SUMMARIZATION
EXPLOITING RHETORICAL RELATIONS TO MULTIPLE DOCUMENTS TEXT SUMMARIZATIONEXPLOITING RHETORICAL RELATIONS TO MULTIPLE DOCUMENTS TEXT SUMMARIZATION
EXPLOITING RHETORICAL RELATIONS TO MULTIPLE DOCUMENTS TEXT SUMMARIZATION
IJNSA Journal
 
Exploiting rhetorical relations to
Exploiting rhetorical relations toExploiting rhetorical relations to
Exploiting rhetorical relations to
IJNSA Journal
 
A neural probabilistic language model
A neural probabilistic language modelA neural probabilistic language model
A neural probabilistic language model
c sharada
 
P13 corley
P13 corleyP13 corley
P13 corley
UKM university
 
TEXTS CLASSIFICATION WITH THE USAGE OF NEURAL NETWORK BASED ON THE WORD2VEC’S...
TEXTS CLASSIFICATION WITH THE USAGE OF NEURAL NETWORK BASED ON THE WORD2VEC’S...TEXTS CLASSIFICATION WITH THE USAGE OF NEURAL NETWORK BASED ON THE WORD2VEC’S...
TEXTS CLASSIFICATION WITH THE USAGE OF NEURAL NETWORK BASED ON THE WORD2VEC’S...
ijsc
 
Texts Classification with the usage of Neural Network based on the Word2vec’s...
Texts Classification with the usage of Neural Network based on the Word2vec’s...Texts Classification with the usage of Neural Network based on the Word2vec’s...
Texts Classification with the usage of Neural Network based on the Word2vec’s...
ijsc
 
Effect of word embedding vector dimensionality on sentiment analysis through ...
Effect of word embedding vector dimensionality on sentiment analysis through ...Effect of word embedding vector dimensionality on sentiment analysis through ...
Effect of word embedding vector dimensionality on sentiment analysis through ...
IAESIJAI
 
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESTHE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
kevig
 
Concurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsConcurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector Representations
Parang Saraf
 
Cooperating Techniques for Extracting Conceptual Taxonomies from Text
Cooperating Techniques for Extracting Conceptual Taxonomies from TextCooperating Techniques for Extracting Conceptual Taxonomies from Text
Cooperating Techniques for Extracting Conceptual Taxonomies from Text
University of Bari (Italy)
 
Cooperating Techniques for Extracting Conceptual Taxonomies from Text
Cooperating Techniques for Extracting Conceptual Taxonomies from TextCooperating Techniques for Extracting Conceptual Taxonomies from Text
Cooperating Techniques for Extracting Conceptual Taxonomies from Text
Fulvio Rotella
 
Deep Neural Methods for Retrieval
Deep Neural Methods for RetrievalDeep Neural Methods for Retrieval
Deep Neural Methods for Retrieval
Bhaskar Mitra
 
Learning a Recurrent Visual Representation for Image Caption G
Learning a Recurrent Visual Representation for Image Caption GLearning a Recurrent Visual Representation for Image Caption G
Learning a Recurrent Visual Representation for Image Caption G
JospehStull43
 
O NTOLOGY B ASED D OCUMENT C LUSTERING U SING M AP R EDUCE
O NTOLOGY B ASED D OCUMENT C LUSTERING U SING M AP R EDUCE O NTOLOGY B ASED D OCUMENT C LUSTERING U SING M AP R EDUCE
O NTOLOGY B ASED D OCUMENT C LUSTERING U SING M AP R EDUCE
IJDMS
 
French machine reading for question answering
French machine reading for question answeringFrench machine reading for question answering
French machine reading for question answering
Ali Kabbadj
 
Text Mining: (Asynchronous Sequences)
Text Mining: (Asynchronous Sequences)Text Mining: (Asynchronous Sequences)
Text Mining: (Asynchronous Sequences)
IJERA Editor
 
Learning a Recurrent Visual Representation for Image Caption G.docx
Learning a Recurrent Visual Representation for Image Caption G.docxLearning a Recurrent Visual Representation for Image Caption G.docx
Learning a Recurrent Visual Representation for Image Caption G.docx
croysierkathey
 
[Emnlp] what is glo ve part ii - towards data science
[Emnlp] what is glo ve  part ii - towards data science[Emnlp] what is glo ve  part ii - towards data science
[Emnlp] what is glo ve part ii - towards data science
Nikhil Jaiswal
 
A semantic framework and software design to enable the transparent integratio...
A semantic framework and software design to enable the transparent integratio...A semantic framework and software design to enable the transparent integratio...
A semantic framework and software design to enable the transparent integratio...
Patricia Tavares Boralli
 
EXPLOITING RHETORICAL RELATIONS TO MULTIPLE DOCUMENTS TEXT SUMMARIZATION
EXPLOITING RHETORICAL RELATIONS TO MULTIPLE DOCUMENTS TEXT SUMMARIZATIONEXPLOITING RHETORICAL RELATIONS TO MULTIPLE DOCUMENTS TEXT SUMMARIZATION
EXPLOITING RHETORICAL RELATIONS TO MULTIPLE DOCUMENTS TEXT SUMMARIZATION
IJNSA Journal
 
Exploiting rhetorical relations to
Exploiting rhetorical relations toExploiting rhetorical relations to
Exploiting rhetorical relations to
IJNSA Journal
 
A neural probabilistic language model
A neural probabilistic language modelA neural probabilistic language model
A neural probabilistic language model
c sharada
 
TEXTS CLASSIFICATION WITH THE USAGE OF NEURAL NETWORK BASED ON THE WORD2VEC’S...
TEXTS CLASSIFICATION WITH THE USAGE OF NEURAL NETWORK BASED ON THE WORD2VEC’S...TEXTS CLASSIFICATION WITH THE USAGE OF NEURAL NETWORK BASED ON THE WORD2VEC’S...
TEXTS CLASSIFICATION WITH THE USAGE OF NEURAL NETWORK BASED ON THE WORD2VEC’S...
ijsc
 
Texts Classification with the usage of Neural Network based on the Word2vec’s...
Texts Classification with the usage of Neural Network based on the Word2vec’s...Texts Classification with the usage of Neural Network based on the Word2vec’s...
Texts Classification with the usage of Neural Network based on the Word2vec’s...
ijsc
 

Recently uploaded (20)

Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
Are Cloud PBX Providers in India Reliable for Small Businesses (1).pdf
Are Cloud PBX Providers in India Reliable for Small Businesses (1).pdfAre Cloud PBX Providers in India Reliable for Small Businesses (1).pdf
Are Cloud PBX Providers in India Reliable for Small Businesses (1).pdf
Telecoms Supermarket
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Unlocking the Power of IVR: A Comprehensive Guide
Unlocking the Power of IVR: A Comprehensive GuideUnlocking the Power of IVR: A Comprehensive Guide
Unlocking the Power of IVR: A Comprehensive Guide
vikasascentbpo
 
Mastering Advance Window Functions in SQL.pdf
Mastering Advance Window Functions in SQL.pdfMastering Advance Window Functions in SQL.pdf
Mastering Advance Window Functions in SQL.pdf
Spiral Mantra
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
TrsLabs Consultants - DeFi, WEb3, Token Listing
TrsLabs Consultants - DeFi, WEb3, Token ListingTrsLabs Consultants - DeFi, WEb3, Token Listing
TrsLabs Consultants - DeFi, WEb3, Token Listing
Trs Labs
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
MINDCTI revenue release Quarter 1 2025 PR
MINDCTI revenue release Quarter 1 2025 PRMINDCTI revenue release Quarter 1 2025 PR
MINDCTI revenue release Quarter 1 2025 PR
MIND CTI
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Social Media App Development Company-EmizenTech
Social Media App Development Company-EmizenTechSocial Media App Development Company-EmizenTech
Social Media App Development Company-EmizenTech
Steve Jonas
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Generative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in BusinessGenerative Artificial Intelligence (GenAI) in Business
Generative Artificial Intelligence (GenAI) in Business
Dr. Tathagat Varma
 
Are Cloud PBX Providers in India Reliable for Small Businesses (1).pdf
Are Cloud PBX Providers in India Reliable for Small Businesses (1).pdfAre Cloud PBX Providers in India Reliable for Small Businesses (1).pdf
Are Cloud PBX Providers in India Reliable for Small Businesses (1).pdf
Telecoms Supermarket
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Unlocking the Power of IVR: A Comprehensive Guide
Unlocking the Power of IVR: A Comprehensive GuideUnlocking the Power of IVR: A Comprehensive Guide
Unlocking the Power of IVR: A Comprehensive Guide
vikasascentbpo
 
Mastering Advance Window Functions in SQL.pdf
Mastering Advance Window Functions in SQL.pdfMastering Advance Window Functions in SQL.pdf
Mastering Advance Window Functions in SQL.pdf
Spiral Mantra
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
TrsLabs Consultants - DeFi, WEb3, Token Listing
TrsLabs Consultants - DeFi, WEb3, Token ListingTrsLabs Consultants - DeFi, WEb3, Token Listing
TrsLabs Consultants - DeFi, WEb3, Token Listing
Trs Labs
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
MINDCTI revenue release Quarter 1 2025 PR
MINDCTI revenue release Quarter 1 2025 PRMINDCTI revenue release Quarter 1 2025 PR
MINDCTI revenue release Quarter 1 2025 PR
MIND CTI
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Social Media App Development Company-EmizenTech
Social Media App Development Company-EmizenTechSocial Media App Development Company-EmizenTech
Social Media App Development Company-EmizenTech
Steve Jonas
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 

An Entity-Driven Recursive Neural Network Model for Chinese Discourse Coherence Modeling Full Text

  • 1. International Journal of Artificial Intelligence and Applications (IJAIA), Vol.8, No.2, March 2017 DOI : 10.5121/ijaia.2017.8201 1 AN ENTITY-DRIVEN RECURSIVE NEURAL NETWORK MODEL FOR CHINESE DISCOURSE COHERENCE MODELING Fan Xu, Shujing Du, Maoxi Li and Mingwen Wang School of Computer Information Engineering, Jiangxi Normal University Nanchang 330022, China ABSTRACT Chinese discourse coherence modeling remains a challenge taskin Natural Language Processing field.Existing approaches mostlyfocus on the need for feature engineering, whichadoptthe sophisticated features to capture the logic or syntactic or semantic relationships acrosssentences within a text.In this paper, we present an entity-drivenrecursive deep modelfor the Chinese discourse coherence evaluation based on current English discourse coherenceneural network model. Specifically, to overcome the shortage of identifying the entity(nouns) overlap across sentences in the currentmodel, Our combined modelsuccessfully investigatesthe entities information into the recursive neural network freamework.Evaluation results on both sentence ordering and machine translation coherence rating task show the effectiveness of the proposed model, which significantly outperforms the existing strong baseline. KEYWORDS Entity, Recursive Neural Network, Chinese Discourse, Coherence 1. INTRODUCTION Discourse Coherence Modeling (DCM) aims to evaluate a degree of coherence among sentences within a discourse or text. It is considered one of the key problems in Natural Language Processing (NLP) due to its wide usage in many NLP applications, such as statistical machine translation[1] , discourse generation[2][3][4] , text automation summarization[3][5][6] , student essay scoring[7][8][9] . In general, a coherent discourse generally has many similar components (lexical overlap or coreference) across sentences within a text, while incoherent discourse is the other one. Therefore, the traditional cohesion theory of Centering[10] driven and entity-based model[11][12][13][14] was proposed to capture the syntactic or semantic distribution of discourse entities (nouns) between two adjacent sentences in a text. Thereafter, many extension works were presented such as Feng and Hirst[15] ’s multiple ranking model, Lin et al.[16] ’s discourse relation- based approach, Louis and Nenkova[17] ’s syntactic patterns-based model. However, the potential issue of the existing traditional coherence models need feature engineering, which is a time- consuming job. In order to overcome the limitation of feature engineering issue, modern research tries to use neural network to extract the syntactic or semantic representation of a sentence automatically. Li et al.[18] proposed neural deep model to deal with English discourse coherence evaluation. However, their discourse coherence model only focuses on the distributed representation for
  • 2. International Journal of Artificial Intelligence and Applications (IJAIA), Vol.8, No.2, March 2017 2 sentences, and did not consider the entity (nouns) distribution across sentences. In fact, the entities can be overlapped between two adjacent sentences, and are good insight to capture the coherence between adjacent two sentences as mentioned in traditional entity-based method. Therefore, we successfully integrate this kind of information into current recursive neural network framework. Evaluation results on both sentence ordering and machine translation coherence rating task show the effectiveness of the proposed model, which significantly outperforms the existing strong baseline. Therefore, this paper tries to answer the following three questions: (1) Can the current English discourse coherence models (traditional or neural method) work for Chinese discourse coherence evaluation task? (2) Can the traditional entity based model be integrated into current deep model? (3) Which kind of word embedding works better for Chinese discourse coherence evaluation? The rest of this paper is organized as follows. Section 2 reviews related work on discourse coherence modeling. Section 3 introduces the framework of our entity-driven recursive neural network based Chinese discourse coherence model. Section 4 describes the experiment results and detailed analysis. Finally, some conclusions are drawn in Section 5. 2. RELATED WORK In this section, we describe the related work for discourse coherence modeling from traditional and neural network modes, respectively. 2.1. TRADITIONAL COHERENCE MODEL The task of DCM was first introduced by Foltz et al.[19] . They formulated the discourse coherence as a function of semantic relatedness between two adjacent sentences within a text, and employed a vector-based representation of lexical meaning to compute the semantic relatedness. Since then, many supervised approaches to DCM, such as the entity-based model[11][12][13][15] , discourse relation-based model[16] , syntactic patterns-based model[17] , co reference resolution-based model[20][21] , content-based model via Hidden Markov Model (HMM)[3][22] and cohesion-driven based model[23] have been proposed in literature. To be more specific, Barzilay and Lapata[11][12] presented an entity-based model to capture the distribution of discourse entities between two adjacent sentences within a text. As an extensive work of entity-based approach, Lin et al.[16] explored the function of discourse relations to revise the entity and to catch the behavior of discourse relation transfer among sentences. In addition, Feng and Hirst[15] showed that multiple ranking instead of pair wise ranking was effective for the DCM. Differently, Louis and Nenkova[17] explored the function of syntactic structure in the DCM. Besides, Iida et al.[20] and Elsner et al.[21] demonstrated the importance of the usage of co reference resolution. In addition, Barzilay et al.[3] and Elsner et al.[22] showed that an Hidden Markov Model (HMM)-based content model can be used to capture the topic’s transfer from the first sentence to the end sentence of a text, where topics were formulated as hidden states and sentences were treated as observations. Still, a potential issue of the HMM model is its domain- dependent mechanism. Also, Xu et al.[23] explored the impact of Halliday[24] ’s Theme Structure Theory (TST) in English discourse coherence modeling. 
Their model shows the importance of theme structure, a cohesion theory from Halliday's systemic-functional grammar, to DCM, and the appropriateness of a theme- and coreference-based filtering mechanism. A minimal sketch of the entity-grid intuition underlying several of the models above follows below.
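To make the entity-based intuition concrete, the following sketch counts entity transitions between adjacent sentences in the spirit of Barzilay and Lapata[11][12]. It is a simplification for illustration only: entities are tracked as present/absent per sentence, whereas the full entity-grid model also distinguishes syntactic roles (subject, object, other); the function names and the toy data are our own.

```python
# Simplified entity-grid sketch: binary presence/absence per sentence,
# not the full syntactic-role grid of Barzilay and Lapata [11][12].
from collections import Counter
from itertools import product

def entity_grid(sentences_with_nouns):
    """Build a binary grid: rows = sentences, columns = entities (nouns)."""
    entities = sorted({n for nouns in sentences_with_nouns for n in nouns})
    grid = [[1 if e in nouns else 0 for e in entities]
            for nouns in sentences_with_nouns]
    return grid, entities

def transition_distribution(grid):
    """Count entity transitions between adjacent sentences; (1,1) = overlap."""
    counts = Counter()
    for prev_row, next_row in zip(grid, grid[1:]):
        for prev, nxt in zip(prev_row, next_row):
            counts[(prev, nxt)] += 1
    total = sum(counts.values()) or 1
    return {t: counts[t] / total for t in product([0, 1], repeat=2)}

# Toy usage: a coherent sentence pair shares the entity "market".
grid, ents = entity_grid([{"market", "investor"}, {"market", "price"}])
print(transition_distribution(grid))  # mass on (1,1) signals entity overlap
```

A high proportion of (1,1) transitions indicates recurring entities across adjacent sentences, which is exactly the coherence signal these traditional models exploit.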
Figure 1: The framework of the entity-driven recursive model for Chinese discourse coherence modeling. [figure not reproduced]

2.2. NEURAL COHERENCE MODEL

Recently, Li et al.[18] presented a neural deep model for English discourse coherence modeling. They demonstrated the effectiveness of both recurrent and recursive neural network (RNN) models for English. However, as mentioned in Section 1, their model does not consider the entity (noun) distribution or entity overlap across sentences within a text. In fact, the entity overlap between two adjacent sentences indicates logical or semantic coherence of the text. Therefore, we integrate this information into their model.

3. ENTITY-DRIVEN RNN COHERENCE MODEL

In this section, we describe our entity-driven RNN Chinese discourse coherence model.

3.1. FRAMEWORK

Figure 1 shows the entity-driven recursive deep model for Chinese discourse coherence modeling. Our deep model is based on Li et al.[18]'s English discourse coherence framework. By comparison, their model does not exploit the entities shared across the sentences of a text. Therefore, we integrate the entities into the recursive neural network model.

3.2. SENTENCE REPRESENTATION

For the word-level representation, each word in a sentence is represented by a vector (word embedding) that captures its semantic meaning, obtained through toolkits such as word2vec (http://code.google.com/p/word2vec/) or GloVe (http://nlp.stanford.edu/projects/glove/). More specifically, each word of a sentence is represented by a vector embedding $e_w = \{e_w^1, e_w^2, \ldots, e_w^K\}$, where K denotes the dimension of the word embedding.
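As a minimal illustration of this word-level representation, the following sketch maps a segmented sentence to a sequence of K-dimensional vectors. The vocabulary and the random matrix E are illustrative stand-ins for a lookup table trained offline with word2vec or GloVe; none of these names come from the authors' implementation.

```python
# Word-level representation sketch: a toy embedding lookup table.
import numpy as np

K = 50  # embedding dimension, as in the 50-dimensional setting of Section 4.2
rng = np.random.default_rng(0)
vocab = {"市场": 0, "投资": 1, "价格": 2}         # toy Chinese vocabulary
E = rng.uniform(-0.1, 0.1, size=(len(vocab), K))  # stand-in for trained embeddings

def embed_sentence(words):
    """Map a segmented sentence to its sequence of word vectors e_w."""
    return np.stack([E[vocab[w]] for w in words if w in vocab])

print(embed_sentence(["市场", "价格"]).shape)  # (2, 50)
```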
For the sentence-level representation, as shown in Figure 1, the vector representation for the whole sentence is computed recursively, with each parent node represented in terms of its immediate children in a bottom-up fashion until reaching the root of the tree. Concretely, for a given parent p in the tree and its two children c1 (with vector representation $h_{c_1}$) and c2 (with vector representation $h_{c_2}$), the standard recursive network calculates $h_p$ for p as follows:

$$h_p = f(W_{\text{Recursive}} \cdot [h_{c_1}, h_{c_2}] + b_{\text{Recursive}}) \quad (1)$$

where $[h_{c_1}, h_{c_2}]$ is the concatenation of the children vectors $h_{c_1}$ and $h_{c_2}$; $W_{\text{Recursive}}$ is a $K \times 2K$ matrix and $b_{\text{Recursive}}$ is a $K \times 1$ bias vector; $f(\cdot)$ is the tanh function.

3.3. ENTITY-DRIVEN SENTENCE CONVOLUTION

The framework treats a window of sentences as a clique C (a sliding window of L sentences) and associates each clique with a tag $y_c$ that takes the value 1 if the clique is coherent, and 0 otherwise. As shown in Figure 1, each clique C takes as input an $(L \cdot K) \times 1$ vector $h_c$ formed by concatenating the embeddings of all its contained sentences. The hidden layer takes $h_c$ as input and performs the convolution using a non-linear tanh function. The output vector of the hidden layer, defined as $q_c$, can therefore be written as:

$$q_c = f(W_{\text{sen}} (h_c \odot h_{\text{entity}}) + b_{\text{sen}}) \quad (2)$$

where $W_{\text{sen}}$ is an $H \times (L \cdot K)$ matrix and $b_{\text{sen}}$ is an $H \times 1$ bias vector; H refers to the number of neurons in the hidden layer; $\odot$ denotes element-wise multiplication with the entity vector $h_{\text{entity}}$ defined below.

3.3.1. ENTITY-DRIVEN MECHANISM

First, we conduct a vector summation over the word embeddings of the nouns to generate $h_{\text{entity}}$:

$$h_{\text{entity}} = e_w^{NN_1} \oplus e_w^{NN_2} \oplus \cdots \oplus e_w^{NN_k} \quad (3)$$

where $\oplus$ denotes vector summation over the k nouns. Then, we conduct the element-wise multiplication between $h_c$ and $h_{\text{entity}}$, as in Equation (2). The value of the output layer can be formulated as:

$$P(y_c = 1) = \text{sigmoid}(U^{T} q_c + b) \quad (4)$$

where U is an $H \times 1$ vector and b denotes the bias; $y_c$ with value 1 means the text is coherent, and 0 otherwise. Therefore, the total coherence score for a given document is the probability that all cliques within the document are coherent:

$$S_d = \prod_{C \in d} p(y_c = 1) \quad (5)$$

Finally, we can determine whether a text is coherent according to the value of its coherence score.
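The following sketch ties Equations (1)-(5) together as a forward pass. It uses random toy parameters and represents parse trees as nested tuples of word vectors; since the paper does not specify how the $K$-dimensional $h_{\text{entity}}$ is matched against the $(L \cdot K)$-dimensional $h_c$ for the element-wise product, the tiling step below is our assumption, not the authors' implementation.

```python
# Hedged forward-pass sketch for Eqs. (1)-(5); toy shapes and random weights.
import numpy as np

K, L, H = 50, 3, 100
rng = np.random.default_rng(1)
W_rec = rng.uniform(-0.1, 0.1, (K, 2 * K)); b_rec = np.zeros(K)  # Eq. (1)
W_sen = rng.uniform(-0.1, 0.1, (H, L * K)); b_sen = np.zeros(H)  # Eq. (2)
U = rng.uniform(-0.1, 0.1, H); b_out = 0.0                       # Eq. (4)

def compose(node):
    """Eq. (1): bottom-up recursive composition over a binary parse tree."""
    if isinstance(node, np.ndarray):          # leaf: a word embedding
        return node
    h1, h2 = compose(node[0]), compose(node[1])
    return np.tanh(W_rec @ np.concatenate([h1, h2]) + b_rec)

def clique_prob(sent_vecs, noun_vecs):
    """Eqs. (2)-(4): entity-driven convolution over a clique of L sentences."""
    h_c = np.concatenate(sent_vecs)               # (L*K,) clique input
    h_entity = np.sum(noun_vecs, axis=0)          # Eq. (3): sum of noun embeddings
    gated = h_c * np.tile(h_entity, L)            # ASSUMPTION: tile h_entity to L*K
    q_c = np.tanh(W_sen @ gated + b_sen)          # Eq. (2)
    return 1.0 / (1.0 + np.exp(-(U @ q_c + b_out)))  # Eq. (4): sigmoid

def document_score(sentence_trees, noun_vecs_per_clique):
    """Eq. (5): product of clique probabilities over the sliding window."""
    sents = [compose(t) for t in sentence_trees]
    score = 1.0
    for i in range(len(sents) - L + 1):
        score *= clique_prob(sents[i:i + L], noun_vecs_per_clique[i])
    return score
```

Any scheme that broadcasts the entity vector over the clique (tiling, as here, or gating each sentence vector separately) preserves the intended effect: sentence dimensions aligned with shared nouns are amplified before the convolution.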
3.4. TRAINING AND OPTIMIZATION

The cost function for the model is given by:

$$J(\Theta) = \frac{1}{M} \sum_{C \in \text{trainset}} H_0 + \frac{Q}{2M} \sum_{\theta \in \Theta} \theta^2 \quad (6)$$

$$H_0 = -y_c \log[p(y_c = 1)] - (1 - y_c) \log[1 - p(y_c = 1)] \quad (7)$$

where $\Theta = [W_{\text{Recursive}}, W_{\text{sen}}, U_{\text{sen}}]$ and M denotes the number of training samples. We adopt the widely used diagonal variant of AdaGrad (Duchi et al.[25]) to optimize the loss function. A minimal sketch illustrating this objective and update is given after the dataset description below.

4. EXPERIMENTS

In this section, we demonstrate the effectiveness of our discourse coherence model through both the sentence ordering and the machine translation coherence rating tasks. The former aims to discern an original text from a permuted ordering of its sentences, while the latter aims to discern a human (reference) translation from an automatically machine-generated translation.

4.1. DATASET

Sentence Ordering Dataset: We select documents from Chinese Treebank 6.0, distributed by the Linguistic Data Consortium (LDC) with catalog number LDC2007T36 and ISBN 1-58563-450-6. We use the 100 documents from chtb_2946 to chtb_3045 as our training set, and the 100 documents from chtb_3046 to chtb_3145 as our testing set. The sentences in each source file are permuted at most 20 times, giving 1027 testing texts in total. The average number of sentences is 10.33 and 13.56 for the training and testing sets, respectively. In the evaluation, we consider the original texts to be more coherent (positive instances) than the permuted ones (negative instances).

Machine Translation Dataset: Similarly, we extract documents from the NIST Open Machine Translation 2008 Evaluation (MT08) Selected Reference and System Translations, distributed by the LDC with catalog number LDC2010T01 and ISBN 1-58563-533-2. The English-to-Chinese language pair has 127 documents with 1830 segments, output by 11 machine translation systems. The average number of sentences is 13.38 and 13.39 for the training and testing sets, respectively. In the evaluation, we consider the human (reference) translation texts to be more coherent than the machine-generated ones.
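As noted in Section 3.4, the following sketch makes the objective (6)-(7) and the diagonal AdaGrad update concrete. Backpropagation of the gradients is omitted, and the regularization weight Q is a hypothetical setting; this is an illustration under our own naming, not the authors' training code.

```python
# Sketch of the objective (6)-(7) and a generic diagonal AdaGrad step [25].
import numpy as np

def clique_loss(p, y):
    """Eq. (7): binary cross-entropy for one clique, p = p(y_c = 1)."""
    eps = 1e-12  # guard against log(0)
    return -y * np.log(p + eps) - (1 - y) * np.log(1 - p + eps)

def objective(preds, labels, params, Q, M):
    """Eq. (6): averaged clique loss plus L2 penalty over all parameters Theta."""
    data_term = sum(clique_loss(p, y) for p, y in zip(preds, labels)) / M
    reg_term = (Q / (2 * M)) * sum(np.sum(th ** 2) for th in params)
    return data_term + reg_term

def adagrad_step(param, grad, cache, lr=0.01, eps=1e-8):
    """Diagonal AdaGrad: per-coordinate rates from accumulated squared grads."""
    cache += grad ** 2
    param -= lr * grad / (np.sqrt(cache) + eps)
    return param, cache
```

The learning rate of 0.01 matches the setting reported in Section 4.2; the per-coordinate scaling is what makes the diagonal variant suitable for the sparse gradients that entity-driven features produce.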
4.2. EXPERIMENTAL SETTINGS

Initialization: Following Li et al.[18], the parameters $W_{\text{sen}}$, $W_{\text{Recursive}}$ and $h_0$ are initialized by randomly drawing from a uniform distribution. The number of hidden-layer neurons H is set to 100. The learning rate in the optimization process is set to 0.01, and the batch size is set to 20. Unlike Li et al.[18], the word embeddings {e} for Chinese are trained using word2vec and GloVe, respectively. The dimension of the word embeddings is 50 or 100, and the window size L is 3 or 5.

Evaluation Metric: We report system performance using accuracy, which is the ratio of the number of correctly selected original texts (or reference translations) to the total number of texts (or translation documents).

Baseline System 1: The entity graph based model[14], which has been demonstrated to be a simple but effective implementation of the entity-based coherence model. We re-implement their method based on the publicly available code (http://github.com/karins/CoherenceFramework).

Baseline System 2: Li et al.[18]'s recursive neural model, which does not consider entity transition information. We transplant their English discourse coherence framework to the Chinese setting, and furthermore integrate the entity information into their deep model.

In addition, we employ the Stanford parser (http://nlp.stanford.edu/software/lex-parser.shtml) to generate sentence-level constituent parse trees and part-of-speech tags, from which we obtain the entities (nouns) occurring in each sentence, and we use the ICTCLAS toolkit (http://ictclas.nlpir.org/downloads) to conduct Chinese word segmentation.

4.3. EXPERIMENT RESULTS

In this section, we report the experimental results for Chinese discourse coherence modeling on both the sentence ordering and the machine translation coherence rating tasks.

4.3.1. RESULTS ON SENTENCE ORDERING

Table 1 shows the performance of our entity-driven deep model using different window sizes, different dimensions, and different types of word embedding.

Table 1: The performance under different settings on the sentence ordering task. [table not reproduced]

As shown in Table 1:

(1) Dimension: Generally speaking, the performance increases as the dimension increases. In fact, the larger the dimension, the greater the representational capacity.

(2) Window size: The performance decreases as the window size increases, and the best performance is obtained with a window size of 3. This is mostly due to the local entity distribution characteristic demonstrated by Barzilay and Lapata[11][12] and Guinaudeau and Strube[14]: as the window size increases, entity co-occurrence decreases accordingly.
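The evaluation protocol described in Sections 4.1-4.2 can be summarized in the sketch below: a comparison counts as correct when the model scores the original ordering above a permuted one. The function `coherence_score` is a hypothetical stand-in for the document score of Equation (5), and the per-pair counting is our simplification of the accuracy definition above.

```python
# Sketch of the sentence-ordering evaluation: original vs. permuted texts.
import random

def evaluate_accuracy(documents, coherence_score, n_perms=20, seed=0):
    """Accuracy over (original, permuted) pairs; at most 20 permutations each."""
    rng = random.Random(seed)
    correct = total = 0
    for sentences in documents:            # each document: a list of sentences
        for _ in range(n_perms):
            permuted = sentences[:]
            rng.shuffle(permuted)
            if permuted == sentences:      # skip identity permutations
                continue
            total += 1
            if coherence_score(sentences) > coherence_score(permuted):
                correct += 1
    return correct / total if total else 0.0
```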
Table 2, below, lists the performance comparison between our model and the baseline models (traditional and neural).

Table 2: Performance comparison among different coherence models on the sentence ordering task; performance that is significantly superior to the baseline systems (p<0.05, paired t-test) is denoted by *. [table not reproduced]

The results show that our combined model significantly outperforms the current deep model for Chinese discourse coherence modeling, which demonstrates the effectiveness and importance of the entity distribution across sentences. Interestingly, the traditional entity-based model also works for Chinese discourse coherence evaluation, whereas it does not work as well for English. This is mostly because the entity distribution is more salient in Chinese discourse than in English text.

4.3.2. RESULTS ON MACHINE TRANSLATION COHERENCE RATING

Table 3, below, lists the performance of our model and the baseline models.

Table 3: Performance comparison among different coherence models on the machine translation coherence rating task with dimension equal to 100; performance that is significantly superior to the baseline systems (p<0.05, paired t-test) is denoted by *. [table not reproduced]

In fact, discourse coherence evaluation for machine translation is closer to a real application scenario than the sentence ordering evaluation. As the results in Table 3 show, our model again significantly outperforms the current deep model. Our model also significantly outperforms the traditional entity-based model. This is mostly because the entity distribution is less salient in machine-generated text, yet the entity (noun) information can still be usefully integrated into the recursive neural network model.

5. CONCLUSIONS AND FUTURE WORK

In this paper, we present an entity-driven recursive deep model for Chinese discourse coherence modeling. We integrate the entities shared across sentences into a recursive neural framework. Evaluation results on both the sentence ordering and machine translation coherence rating tasks show the effectiveness of the proposed model. In future work, we plan to integrate a coreference mechanism into the combined recursive neural network model, together with other coherence evaluation tasks.
ACKNOWLEDGEMENTS

The authors would like to thank the anonymous reviewers for their comments on this paper. This research was supported by the National Natural Science Foundation of China under Grants No. 61402208, No. 61462045, No. 61462044 and No. 61662030, the Natural Science Foundation and Education Department of Jiangxi Province under Grants No. 20151BAB207027 and GJJ150351, and the Research Project of the State Language Commission under Grant No. YB125-99.

REFERENCES

[1] Heidi J. Fox, (2002), "Phrasal Cohesion and Statistical Machine Translation", In Proceedings of EMNLP, pages 304-311.
[2] Radu Soricut and Daniel Marcu, (2006), "Discourse Generation Using Utility-Trained Coherence Models", In Proceedings of COLING-ACL, pages 803-810.
[3] Regina Barzilay and Lillian Lee, (2004), "Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization", In Proceedings of NAACL-HLT, pages 113-120.
[4] Jiwei Li, Minh-Thang Luong and Dan Jurafsky, (2015), "A Hierarchical Neural Autoencoder for Paragraphs and Documents", In Proceedings of ACL, pages 1106-1115.
[5] Zi-Heng Lin, Hwee Tou Ng and Min-Yen Kan, (2012), "Combining Coherence Models and Machine Translation Evaluation Metrics for Summarization Evaluation", In Proceedings of ACL, pages 1006-1014.
[6] Danushka Bollegala, Naoaki Okazaki and Mitsuru Ishizuka, (2006), "A Bottom-Up Approach to Sentence Ordering for Multi-Document Summarization", In Proceedings of ICCL-ACL, pages 385-392.
[7] Helen Yannakoudakis and Ted Briscoe, (2012), "Modeling Coherence in ESOL Learner Texts", In Proceedings of ACL, pages 33-43.
[8] Jill Burstein, Joel Tetreault and Slava Andreyev, (2010), "Using Entity-Based Features to Model Coherence in Student Essays", In Proceedings of NAACL-HLT, pages 681-684.
[9] Derrick Higgins, Jill Burstein, Daniel Marcu and Claudia Gentile, (2004), "Evaluating Multiple Aspects of Coherence in Student Essays", In Proceedings of NAACL-HLT, pages 185-192.
[10] Barbara J. Grosz, Scott Weinstein and Aravind K. Joshi, (1995), "Centering: A Framework for Modeling the Local Coherence of Discourse", Computational Linguistics, 21(2):203-225.
[11] Regina Barzilay and Mirella Lapata, (2005), "Modeling Local Coherence: An Entity-Based Approach", In Proceedings of ACL, pages 141-148.
[12] Regina Barzilay and Mirella Lapata, (2008), "Modeling Local Coherence: An Entity-Based Approach", Computational Linguistics, 34(1):1-34.
[13] Mirella Lapata and Regina Barzilay, (2005), "Automatic Evaluation of Text Coherence: Models and Representations", In Proceedings of IJCAI, pages 1085-1090.
[14] Camille Guinaudeau and Michael Strube, (2013), "Graph-based Local Coherence Modeling", In Proceedings of ACL, pages 93-103.
[15] Vanessa Wei Feng and Graeme Hirst, (2012), "Extending the Entity-based Coherence Model with Multiple Ranks", In Proceedings of EACL, pages 315-324.
[16] Zi-Heng Lin, Hwee Tou Ng and Min-Yen Kan, (2011), "Automatically Evaluating Text Coherence Using Discourse Relations", In Proceedings of ACL, pages 997-1006.
[17] Annie Louis and Ani Nenkova, (2012), "A Coherence Model Based on Syntactic Patterns", In Proceedings of EMNLP-CoNLL, pages 1157-1168.
[18] Jiwei Li and Eduard Hovy, (2014), "A Model of Coherence Based on Distributed Sentence Representation", In Proceedings of EMNLP, pages 2039-2048.
[19] Peter W. Foltz, Walter Kintsch and Thomas K. Landauer, (1998), "The Measurement of Textual Coherence with Latent Semantic Analysis", Discourse Processes, 25(2&3):285-307.
[20] Ryu Iida and Takenobu Tokunaga, (2012), "A Metric for Evaluating Discourse Coherence based on Coreference Resolution", In Proceedings of COLING, Posters, pages 483-494.
[21] Micha Elsner and Eugene Charniak, (2008), "Coreference-inspired Coherence Modeling", In Proceedings of ACL 2008, Short Papers, pages 41-44.
[22] Micha Elsner, Joseph Austerweil and Eugene Charniak, (2007), "A Unified Local and Global Model for Discourse Coherence", In Proceedings of NAACL, pages 436-443.
[23] Fan Xu, Qiaoming Zhu, Guodong Zhou and Mingwen Wang, (2014), "Cohesion-driven Discourse Coherence Modeling", Journal of Chinese Information Processing, 28(3):11-21.
[24] M. A. K. Halliday, (1994), "An Introduction to Functional Grammar", Hodder Education Press, London, United Kingdom.
[25] John Duchi, Elad Hazan and Yoram Singer, (2011), "Adaptive Subgradient Methods for Online Learning and Stochastic Optimization", The Journal of Machine Learning Research, 12:2121-2159.

AUTHORS

Fan Xu holds a Doctoral Degree (Ph.D.) in Computer Science from Soochow University, China. His research interests include Natural Language Processing, Chinese Information Processing, Discourse Analysis, and Speech Recognition. At present he is working as a Lecturer in the School of Computer Information Engineering, Jiangxi Normal University, China. He is a member of various professional bodies, including ACL, IEEE, and ACIS.

Shujing Du is a Master's student in Computer Science at Jiangxi Normal University, China. Her research interests include Natural Language Processing, Chinese Information Processing, and Discourse Analysis.

Maoxi Li holds a Doctoral Degree (Ph.D.) in Computer Science from the Chinese Academy of Sciences. His research interests include Machine Translation and Natural Language Processing. At present he is working as an Associate Professor in the School of Computer Information Engineering, Jiangxi Normal University, China.

Mingwen Wang holds a Doctoral Degree (Ph.D.) in Computer Science from Shanghai Jiaotong University, China. His research interests include Machine Learning, Information Retrieval, Natural Language Processing, Image Processing, and Chinese Information Processing. At present he is working as a Professor in the School of Computer Information Engineering, Jiangxi Normal University, China. He is a member of various professional bodies, including ACL, IEEE, CCF, and ACIS.