
Revisiting Document-Level Relation Extraction with Context-Guided Link Prediction

Monika Jain¹, Raghava Mutharaju¹, Ramakanth Kavuluru², Kuldeep Singh³
¹ Indraprastha Institute of Information Technology, Delhi, India
² University of Kentucky, Lexington, Kentucky, United States
³ Cerence GmbH and Zerotha Research, Germany
{monikaja, [email protected]}, [email protected], [email protected]

arXiv:2401.11800v1 [cs.IR] 22 Jan 2024

Abstract

Document-level relation extraction (DocRE) poses the challenge of identifying relationships between entities within a document, as opposed to the traditional RE setting where a single sentence is the input. Existing approaches rely on logical reasoning or contextual cues from entities. This paper reframes document-level RE as link prediction over a knowledge graph, with distinct benefits: 1) Our approach combines entity context with document-derived logical reasoning, enhancing link prediction quality. 2) Predicted links between entities offer interpretability, elucidating the employed reasoning. We evaluate our approach on three benchmark datasets: DocRED, ReDocRED, and DWIE. The results indicate that our proposed method outperforms the state-of-the-art models and suggest that incorporating context-based link prediction techniques can enhance the performance of document-level relation extraction models.

Introduction

Relation extraction is the task of extracting semantic links or connections between entities from an input text (Baldini Soares et al. 2019). In recent years, the document-level relation extraction problem (DocRE) evolved as a new subtopic due to the widespread use of relational knowledge in knowledge graphs (Yu et al. 2017) and the inherent manifestation of cross-sentence relations involving multi-hop reasoning. Thus, compared to traditional RE, DocRE has two major challenges: subject and object entities in a given triple might be dispersed across distinct sentences, and certain entities may have aliases in the form of distinct entity mentions. Consequently, the signal (hints) needed for DocRE is not confined to a single sentence. A common approach to solve this problem is to take the input sentences and construct a structured graph based on syntactic trees, co-references, or heuristics to represent relation information between all entity pairs (Nan et al. 2020). A graph neural network model is applied to the constructed graph, which performs multi-hop graph convolutions to derive features of the involved entities. A classifier uses these features to make predictions (Zhang et al. 2020). Another approach (Xu, Chen, and Zhao 2021a, 2023) explicitly models the reasoning process of different reasoning skills (e.g., multi-hop, coreference-mediated). However, even after considering features between the entity pairs and executing the reasoning process, DocRE is still hard due to latent and unspecific/imprecise contexts.

Consider the sentence in Figure 1 and its labeled relation applies to jurisdiction (Congress, US) from the DocRED dataset (Yao et al. 2019). Even with the inclusion of multi-hop and co-reference reasoning, inferring the correct relation becomes challenging because the relation depends on multiple sentences and cannot be identified based on the language used in the sentence.

Figure 1: A partial document and labeled relation from DocRED. Blue color represents concerned entities, pink color represents other mentioned entities, and yellow color denotes the sentence number.

For these kinds of sentences, external context (knowledge) can play a vital role in helping the model capture more about the involved entities. For the above example, using the Wikidata knowledge base (Vrandečić and Krötzsch 2014) and WordNet (Miller 1995), we can get details such as the entity types, synonyms, and other direct and indirect relations between entities (if they exist) on the Web.

Previous research in this domain has underscored the potential of external context to enhance performance in relation extraction, co-reference resolution, and named entity recognition (Shimorina, Heinecke, and Herledan 2022). The distinctive innovation of our work lies in the fusion of context extracted from Wikidata and WordNet with a reasoning framework, enabling the prediction of entity relationships based on input document observations. Given the wide availability of external context, Knowledge Graph (KG) triples can augment training data, thereby casting the DocRE task as a knowledge-graph-based link prediction challenge. In other words, given head and tail entities, we address the question of determining the appropriate relation.

We demonstrate that framing DocRE as a link prediction problem, combined with contextual knowledge and reasoning, yields enhanced accuracy in predicting the relation between entities. We furnish traversal paths as compelling justifications for relation predictions, thereby shedding light on why a particular relation is favored over others. Notably, this marks the first instance of presenting a traversal path between entities for each prediction in the context of DocRE. Our contributions in this work are as follows.

• We introduce an innovative approach named DocRE-CLiP (Document-level Relation Extraction with Context-guided Link Prediction), which amalgamates external entity context with reasoning via a link prediction algorithm.

• Our empirical analyses encompass three widely used public document-level relation extraction datasets, showcasing our model's improvement over recent state-of-the-art methods.

• Significantly, for every prediction, our approach is the first in the DocRE literature to supply a traversal path as corroborative evidence, bolstering the model's interpretability.

Copyright © 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Related work

DocRE. Previous efforts (Zhang, Qi, and Manning 2018; Jain, Singh, and Mutharaju 2023) in relation extraction have focused on predicting relationships within a single sentence. Recent years have seen growing interest in relation extraction beyond the single sentence (Yao et al. 2019).

Transformer-Based DocRE is another interesting approach to tackle the document-level relation extraction problem (Zeng et al. 2020). One primary focus revolves around maximizing the effective utilization of long-distance token dependencies using a transformer. Earlier research treats DocRE as a semantic segmentation task over the entity matrix and utilizes a U-Net to capture and model it (Zhang et al. 2021). In a separate study, localized contextual pooling was introduced to focus on tokens relevant to individual entity pairs (Zhou et al. 2021). The DocRE challenge has also been addressed by incorporating explicit supervision for token dependencies, achieved by leveraging evidential information (Ma, Wang, and Okazaki 2023).

Graph-Based DocRE is based on a graph constructed with mentions, entities, sentences, or documents. The relations between these nodes are then deduced through reasoning on this constructed graph. Earlier research in this line of work solves DocRE with multi-hop reasoning on a mention-level graph for inter-sentential entity pairs (Zeng et al. 2020). A discriminative reasoning framework (DRN) is introduced in a different study. This framework models the pathways of reasoning skills connecting various pairs of entities (Xu, Chen, and Zhao 2021a). DRN estimates the relation probability distribution of different reasoning paths based on the constructed graph and vectorized document contexts for each entity pair, thereby recognizing their relation. We have used DRN as a base model for implementing reasoning skills.

Context Knowledge Based RE studies context integration, primarily through a knowledge base (KB). Past work in this line uses entity types and entity aliases to predict the relation (Vashishth et al. 2018; Fernàndez-Cañellas et al. 2020). RECON (Bastos et al. 2021) encoded attribute and relation triples in the knowledge graph and combined the embeddings with their corresponding sentence embeddings. KB-Both (Verlinden et al. 2021) uses entity details from hyperlinked text documents from Wikipedia and a Knowledge Graph (KG) from Wikidata to enhance performance. In distinction to that, (Wang et al. 2022) integrate knowledge, including co-references, attributes, and relations, with different injection methods to improve the state of the art. In contrast to these approaches, we consider the context of entities and the external relation paths between them. Furthermore, we employ a reasoning model that effectively addresses DocRE and enhances the robustness of the proposed approach.

Methodology

Problem Formulation

An unstructured document D consisting of K sentences is represented by {S_i}_{i=1}^{K}, where each sentence is a sequence of words, and the entities are E = {e_i}_{i=1}^{P} (P is the total number of entities). An entity e_i has multiple mentions m_i^{s_k}, scattered across the document D. Entity aliases are represented as {m_i^{s_k}}_{k=1}^{Q}. Our objective is to extract the relation between two entities in E, namely P(r | e_i, e_j), where e_i, e_j ∈ E and r ∈ R; here R is the total labeled relation set. The context (background knowledge) of an entity e_i is represented by C_{e_i}, and a context path, i.e., a sequence of connected entities and edges from the head entity (e_i) to the tail entity (e_j), is represented by CP_{e_i,e_j}.

Approach

Our proposed framework, DocRE-CLiP, integrates document-derived reasoning with context knowledge using link prediction. In the first step, we extract triples from the sentences of the given document. In the second step, we extract two types of context: 1) entity context, such as its aliases, and 2) context paths from an external KB (Wikidata, in this case). Using the triples and extracted contexts, we create a context graph to calculate a link prediction score.
Then, in the third step, we use several reasoning mechanisms, such as logical reasoning, intra-sentence reasoning, and co-reference reasoning, to calculate relation scores for pairs of entities. In the final step, the aggregation module combines the relation scores from the second and third steps. We have also implemented a path-based beam search in the framework to explain the predicted relation by providing traversal paths based on scores (refer to Figure 3). We now detail the architecture of our proposed framework.

Triplet Extraction Module. Document-level relation extraction (DocRE) datasets often contain labeled triplets; however, popular datasets (Yao et al. 2019) have about 64.6% missing triples, yielding an incomplete graph (Tan et al. 2022b). To extract all the triples from the document, we utilize an open-source state-of-the-art method (Huguet Cabot and Navigli 2021) for triplet generation. This model is chosen based on source code availability and run time. This module takes a document D as input and produces triples (s, p, o), where s is the head entity, p is the relation, and o is the tail entity. The extracted triples T follow the equation below, where n is the total count of triples extracted from document D.

T(D) = {(s_i, p_i, o_i)}_{i=1}^{n}    (1)

Context Module. Our goal involves extracting two types of contexts: entity context and the contextual path between entity pairs. For entity e_i, we generate entity context using the entity type and synonyms, which are derived using WordNet (Miller 1995). We incorporate the entity context C_{e_i} into the triples. Here, S denotes the total number of extracted synonyms. Refer to Equations 2 and 3 for more details.

C_{e_i} = {(e_i, hasSynonym, Synonym_k)}_{k=1}^{S}    (2)
C_{e_i} = {(e_i, hasEntityType, EntityType)}    (3)

Let us consider the entity "USA" as an example. Using WordNet, we discover that its synonyms include "US", "America", and "United States". Additionally, the entity is categorized as the type "Country". Consequently, the following set of triples is generated through this process:

{USA, hasSynonym, US}
{USA, hasSynonym, America}
{USA, hasSynonym, United States}
{USA, hasEntityType, Country}

The second source of contextual information pertains to entity paths. Predicting relations between entity pairs poses challenges stemming from inherent document deficiencies (Tan et al. 2022b). To address these issues, we introduce external context by harnessing insights from Wikidata. The procedure involves extracting paths (direct and indirect) between entity-entity, mention-entity, and entity-mention pairs from Wikidata, provided they exist. A contextual path pertains to an entity pair e_i, e_j. We consider context paths spanning an N-hop distance (N being chosen based on experimental findings) between the entity pair. Subsequently, the extracted path is transformed into triples and forwarded to the link prediction model.

Illustrated in Figure 2 is the triple generation process employing a contextual path. The entity Canadian is two hops away from the entity Ontario. Canada is an intermediary entity, while country and ethnic group are intermediary properties. The contextual path (CP_{e_i,e_j}) is a set of triples formed using the intermediary entities and properties, as shown in Figure 2.

Figure 2: Triples constructed using an N-hop path extracted from Wikidata. The head and tail entities are blue in color. Intermediate entities are in peach color.

Link Prediction Module. Link prediction is the task of predicting absent or potential connections among nodes within a network (Liben-Nowell and Kleinberg 2003). Given that the DocRE task involves constructing a graph that interlinks entities, and considering that our context, formulated as triples, can be conceptualized as a Knowledge Graph (KG), we approach the DocRE challenge as a link prediction problem. This approach encompasses both an encoder and a decoder. The encoder maps each entity e_i ∈ E to a real-valued vector v_i ∈ R^d of dimension d. The decoder reconstructs graph edges by leveraging vertex representations, essentially scoring (subject, relation, object) triples using a function R^d × R × R^d → R.

While prior methods often employ a solitary real-valued vector per entity, our approach computes representations using an R-GCN (Schlichtkrull et al. 2018) encoder, where h_i^(l) is the hidden state of node e_i in the l-th layer of the neural network. To compute the forward pass for an entity e_i in a relational multi-graph, the propagation model at layer l + 1 is computed as follows.

h_i^(l+1) = σ( Σ_{r∈R} Σ_{j∈N_i^r} (1/c_{i,r}) W_r^(l) h_j^(l) + W_0^(l) h_i^(l) )    (4)

Here, h_i^(l) ∈ R^{d^(l)}, with d^(l) being the dimensionality of layer l. W_0^(l) and W_r^(l) represent the block-diagonal weight matrices of the neural network, and σ represents the activation function. N_i^r signifies the set of neighboring indices of node i under relation r ∈ R, and c_{i,r} is a normalization constant.

For training the link prediction model, our dataset comprises a) core training triples from the dataset, b) triplets obtained through the triplet extraction module using Equation 1, c) triplets formulated using the context module guided by Equations 2 and 3, and d) triplets constructed using context paths connecting entity pairs.
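A minimal sketch of one propagation step of the R-GCN encoder (Equation 4), with toy dimensions and random, untrained weights. This is an illustrative assumption of the rule's mechanics, not the paper's implementation: the real encoder is trained end-to-end and uses block-diagonal weight matrices, whereas dense matrices and in-degree normalization (one common choice for c_{i,r}) are used here.

```python
import numpy as np

def rgcn_layer(h, edges, W_rel, W_self):
    """One R-GCN layer: h_i' = ReLU(sum_r sum_{j in N_i^r} (1/c_ir) W_r h_j + W_0 h_i).

    h: (num_nodes, d) node states; edges: {relation: [(src, dst), ...]}.
    """
    out = h @ W_self.T  # self-loop term W_0 h_i for every node
    for rel, pairs in edges.items():
        c = np.zeros(len(h))  # c_{i,r}: in-degree under this relation
        for _, dst in pairs:
            c[dst] += 1
        for src, dst in pairs:
            out[dst] += (W_rel[rel] @ h[src]) / c[dst]
    return np.maximum(out, 0.0)  # sigma chosen as ReLU

rng = np.random.default_rng(0)
h = rng.normal(size=(3, 4))            # 3 toy entities, 4-d states
edges = {"country": [(0, 1), (2, 1)]}  # two edges into node 1
W_rel = {"country": rng.normal(size=(4, 4))}
W_self = rng.normal(size=(4, 4))
h_next = rgcn_layer(h, edges, W_rel, W_self)
```

Stacking such layers gives each entity a representation informed by its multi-relational neighborhood before decoding.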
We use DistMult (Yang et al. 2015) as the decoder; it performs well on the standard link prediction benchmarks. Every relation r in a triple is scored using Equation 5.

P(r | i, j) = P(e_i^T × R_r × e_j)    (5)

Reasoning Module. We consider three types of reasoning in our approach.
1) Intra-sentence reasoning, which is a combination of pattern recognition and common-sense reasoning. The intra-sentence reasoning path is defined as PI_ij = m_i^{s1} ∘ s1 ∘ m_j^{s1} for an entity pair {e_i, e_j} inside the sentence s1 in document D. Here, m_i^{s1} and m_j^{s1} are mentions, and "∘" denotes a reasoning step on the reasoning path from e_i to e_j.
2) Logical reasoning, where a bridge entity indirectly establishes the relation between two entities. The logical reasoning path is formally denoted as PL_ij = m_i^{s1} ∘ s1 ∘ m_l^{s1} ∘ m_l^{s2} ∘ s2 ∘ m_j^{s2} for an entity pair {e_i, e_j} whose relation across sentences s1 and s2 is established by the bridge entity e_l.
3) Co-reference reasoning, which is essentially co-reference resolution. The co-reference reasoning path is defined as PC_ij = m_i^{s1} ∘ s1 ∘ s2 ∘ m_j^{s2} between two entities e_i and e_j that occur in the same sentence as another entity. Our implementation of these reasoning skills is inspired by (Xu, Chen, and Zhao 2021a).

Consider an entity pair {e_i, e_j} and its intra-sentence reasoning path (PI_ij), logical reasoning path (PL_ij), and co-reference reasoning path (PC_ij) in the sentence. The various kinds of reasoning are modeled to recognize the entity pair as intra-sentence reasoning R_PI(r) = P(r | e_i, e_j, PI_ij, D), logical reasoning R_PL(r) = P(r | e_i, e_j, PL_ij, D), and co-reference reasoning R_PC(r) = P(r | e_i, e_j, PC_ij, D). The reasoning type with the maximum probability is selected to recognize the relation between each entity pair using the equation:

P(r | e_i, e_j, D) = max[R_PI(r), R_PL(r), R_PC(r)]    (6)

For discerning relations between two entities, we employ two categories of context representation to model diverse reasoning paths (Zhou et al. 2021): heterogeneous graph context representation (HGC) and document-level context representation (DLC). In HGC, a word is portrayed as the concatenated embedding of its word (W_e), entity type (W_t), and co-reference (W_c) embeddings. This composite embedding is then input into a BiLSTM to convert the document D into a vectorized form: BiLSTM([W_e : W_t : W_c]). Following the methodology of (Zeng et al. 2015), a heterogeneous graph is constructed based on sentence and mention nodes. For document-level context representation, following (Eisenbach et al. 2023), a self-attention mechanism is employed to learn the document-level context (DLC) for a specific mention based on the vectorized input document D.

To model the intra-sentence reasoning path (α_ij), the logical reasoning path (β_ij), and the co-reference reasoning path (γ_ij), the HGC and DLC representations are combined (Xu, Chen, and Zhao 2021b). These reasoning representations are the input to the classifier, which computes the probabilities of the relation between the entities e_i and e_j with a multi-layer perceptron (MLP) for each path, respectively (Equation 7).

P(r | e_i, e_j) = max[ sigmoid(MLP_r(α_ij)), sigmoid(MLP_r(β_ij)), sigmoid(MLP_r(γ_ij)) ]    (7)

By the end of this step, we get the score of each relation r for a given {e_i, e_j}.

Aggregation Module. In this module, we aggregate the probability score from the reasoning module (Equation 7) and the link prediction probability score (Equation 5). Further, binary cross-entropy is used as the training objective function (Yao et al. 2019) for predicting the final relation.

Path-Based Beam Search

An essential component of our approach is that it can explain the predicted relation by providing the most relevant path in the graph between the given entity pair. This represents a notable advancement, as contemporary state-of-the-art models often cannot furnish explanations alongside their predictions. Unlike greedy search, where each position is assessed in isolation and the best choice is selected without considering preceding positions, we use beam search. This strategy selects the top "N" sequences so far and factors in probabilities involving the concatenation of all previous paths and the path at the current position.

Inspired by (Rossi et al. 2022), we use beam search to derive plausible paths leading to the target entity within a graph. This graph G is constructed from the triples used to train the link prediction module, augmented by test result triplets from the model's predictions. Our objective is to create a comprehensive graph encompassing the maximum available details, to generate substantial explanations for the predictions. Formally, we conceptualize the path-based beam search challenge as follows. Given a structured relational query (e_i, r, ?), where e_i serves as the head entity, r signifies the query relation, and (e_i, r, e_j) ∈ G, our objective is to identify a collection of plausible answer entities e_j by navigating paths through the existing entities and relations within G, leading to tail entities. We compile a list of distinct entities reached during the final search step and assign each entity the highest score attained among all paths leading to it. Subsequently, we present the top-ranked unique entities. This approach surpasses the direct output of entities ranked at the beam's apex, which often includes duplicates. Upon completing this step, we obtain actual paths (sequences of nodes and edges) for enhanced interpretability.

Experimental Setup

We conduct our evaluation in response to the following research questions. RQ1: What is the effectiveness of DocRE-CLiP, which combines context knowledge with reasoning, in solving document-level relation extraction tasks? RQ2: How does knowledge encoded from external sources impact the performance of DocRE-CLiP? RQ3: Does the explanation generated by our approach provide sufficient grounds to support the inferred relation?
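Before turning to the experiments, the DistMult decoder of Equation 5 and the ranking metrics used later to compare link predictors (Hits@k, MRR) reduce to a few lines. The embeddings and gold ranks below are toy values, not trained ones; this is a sketch of the scoring and evaluation mechanics only.

```python
import numpy as np

def distmult_score(e_head, r_diag, e_tail):
    """DistMult trilinear score e_h^T diag(r) e_t (the core of Equation 5)."""
    return float(np.sum(e_head * r_diag * e_tail))

def hits_at_k_and_mrr(gold_ranks, k):
    """gold_ranks: 1-based rank of the correct tail for each query."""
    ranks = np.asarray(gold_ranks, dtype=float)
    return float((ranks <= k).mean()), float((1.0 / ranks).mean())

rng = np.random.default_rng(0)
entities = rng.normal(size=(4, 8))  # 4 toy entities, 8-d embeddings
relation = rng.normal(size=8)       # diagonal relation vector R_r
# score all candidate tails for head entity 0 under this relation
scores = [distmult_score(entities[0], relation, e) for e in entities[1:]]
hits1, mrr = hits_at_k_and_mrr([1, 3, 2], k=1)  # toy gold ranks
```

Ranking every candidate tail by such scores and locating the gold tail's rank is exactly how the Hits@k and MRR numbers reported in the link-prediction ablation are produced.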
Figure 3: Illustration of proposed framework DocRE-CLiP and its various modules.
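The path-based beam search described above can be sketched as follows: expand the top-N scored paths from the head entity, then keep the best-scoring path per distinct tail entity. The toy graph and edge scores are hypothetical, echoing the Piraeus example from the explanation study; the real system searches the full training-plus-prediction graph G.

```python
def beam_search_paths(graph, head, beam_size=2, max_hops=2):
    """graph: {entity: [(relation, tail, score), ...]}; best path per tail."""
    beams = [([head], 1.0)]  # (path, cumulative score)
    for _ in range(max_hops):
        candidates = []
        for path, score in beams:
            for rel, tail, s in graph.get(path[-1], []):
                candidates.append((path + [rel, tail], score * s))
        if not candidates:
            break
        # keep only the top-N scored partial paths, as in beam search
        beams = sorted(candidates, key=lambda x: -x[1])[:beam_size]
    best = {}  # highest-scoring path per distinct final entity
    for path, score in beams:
        tail = path[-1]
        if tail not in best or score > best[tail][1]:
            best[tail] = (path, score)
    return best

graph = {
    "Piraeus": [("located in the administrative territorial entity", "Kiato", 0.9)],
    "Kiato": [("country", "Greece", 0.8)],
}
answers = beam_search_paths(graph, "Piraeus")
```

Deduplicating by final entity, rather than emitting the raw beam, is what yields the unique ranked answers and their explanation paths.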

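The F1 and Ign F1 metrics used in the experiments below can be illustrated with a minimal sketch (toy triples; not the official DocRED scorer): Ign F1 simply drops facts already present in the training annotations before computing micro F1.

```python
def micro_f1(predicted, gold):
    """Micro F1 over sets of (head, relation, tail) facts."""
    tp = len(predicted & gold)
    if not predicted or not gold:
        return 0.0
    precision, recall = tp / len(predicted), tp / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def ign_f1(predicted, gold, train_facts):
    """F1 after removing facts that also occur in the training annotations."""
    return micro_f1(predicted - train_facts, gold - train_facts)

gold = {("Congress", "applies to jurisdiction", "US"),
        ("Piraeus", "country", "Greece")}
predicted = {("Congress", "applies to jurisdiction", "US"),
             ("Congress", "country", "US")}
train_facts = {("Congress", "applies to jurisdiction", "US")}
f1 = micro_f1(predicted, gold)
f1_ign = ign_f1(predicted, gold, train_facts)
```

In this toy case the only correct prediction is a fact seen in training, so F1 is positive while Ign F1 drops to zero, illustrating why the two numbers diverge.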
Datasets Hyper-Parameters and Metrics


For the reasoning module, we follow the settings of (Xu,
The proposed model is evaluated on three widely-used Chen, and Zhao 2021a). We use the word embedding from
public datasets 1) DocRED (Yao et al. 2019) 2) ReDo- GloVe (100d) and apply a Bidirectional LSTM (128d) to
cRED (Ma, Wang, and Okazaki 2023) and 3) DWIE (Za- a word representation for encoding. We employ uncased
porojets et al. 2021). ReDocRED is a revised version of BERT-Based model (768d) as an encoder with a learning
handling DocRED issues such as false negatives and incom- rate 1e-3. We used AdamW as an optimizer, and the learn-
pleteness (Tan et al. 2022b). Dataset details are in Table 1. ing rate is 1e − 3. R-GCN is used as an encoder with a sin-
gle encoding layer (200d) embeddings for the link predic-
tion model. We regularize the encoder through edge dropout
Dataset #Triples #Rel #Entities #Entity Types #Doc applied before normalization, with a dropout rate of 0.2 for
DocRED 50,503 96 30554 7 5053
ReDocRED 120,664 96 38239 10 5053 self-loops and 0.4 for other edges. We apply l2 regulariza-
DWIE 19465 66 6644 10 777 tion to the decoder with a penalty of 0.01. Adam (Kingma
and Ba 2015) is used as an optimizer, and the model is
Table 1: Dataset statistics trained with 100 epochs using a learning rate of 0.01. For
extracting the context paths, we use SPARQL queries to re-
trieve paths between entities. If multiple paths exist between
entities, we consider the path with the highest page rank.
The N-hop path length of the context varies from 1 to 4. The
Baseline Models for Comparison rationale behind this range is that we found no pertinent in-
formation for the context beyond four hops. We have used
We used several competitive baselines and a recent state-of-
beam size of 128 for beam search for all three datasets.
the-art dataset for comparison. For DocRED, we compared
our approach with BERT based models such as SIRE (Zeng, We use the evaluation metrics of DocRED (Yao et al.
Wu, and Chang 2021), HeterGSAN-Rec (Xu, Chen, and 2019), i.e., F1 and Ign F1 for DocRE-CLiP. Ign F1 is mea-
Zhao 2021b), ATLOP (Zhou et al. 2021), DRN (Xu, sured by removing relations in the annotated training set
Chen, and Zhao 2021a), and RoBERTa based model such from the development and test sets.
as DREEAM (Ma, Wang, and Okazaki 2023). and AT-
LOP (Zhou et al. 2021), KD-DocRE (Tan et al. 2022a), Results
DocuNet (Zhang et al. 2021), EIDER (Xie et al. 2022), We have compared DocRE-CLiP with various baseline mod-
SAIS (Xiao et al. 2022), (Zhou et al. 2021) are also com- els on DocRED, ReDocRED, and DWIE datasets given in
pared with the proposed DocRE-CLiP. Table 2. The results effectively address our primary research
Similarly, for the ReDocRED dataset, we used AT- question (RQ1). To delve into the specifics of (RQ1), we ob-
LOP (Zhou et al. 2021), DRN (Xu, Chen, and Zhao 2021a), serve that incorporating context information from Wikidata
DocuNet (Zhang et al. 2021), KD-DocRE (Tan et al. 2022a) and WordNet improves the performance compared to the
and the best baseline DREEAMinf erence (Ma, Wang, and baseline models. Notably, DocRE-CLiP surpasses all graph-
Okazaki 2023). For the DWIE dataset, we considered the based, reasoning-oriented, and transformer-based models by
state-of-the-art model DRN (Xu, Chen, and Zhao 2021a). incorporating contextual information.
Other than baseline model, we evaluated our model with Examining the results for the DocRED dataset, our
context-based models such as KIRE (Wang et al. 2022), RE- DocRE-CLiP model showcases an improvement of approx-
SIDE (Vashishth et al. 2018), RECON (Bastos et al. 2021) imately 1% compared to top-performing models like KD-
and KB-graph (Verlinden et al. 2021). DocRE and DREEAM. For the ReDocRED dataset, DocRE-
Baseline PLM/GNN Dev Test
F1 Ign F1 F1 Ign F1
Dataset-DocRED
SIRE BERT 61.6 59.82 62.05 60.18
HeterGSAN-Rec BERT 60.18 58.13 59.45 57.12
ATLOP BERT 61.09 59.22 61.30 59.31
DRN BERT 61.39 59.33 61.37 59.15
DocuNet RoBERTa 64.12 62.23 64.55 62.39
ATLOP RoBERTa 63.18 61.32 63.40 61.39
KD-DocRE RoBERTa 67.12 65.27 67.28 65.24
SAIS RoBERTa 65.17 62.23 65.11 63.44
DREEAM RoBERTa 67.41 65.52 67.53 65.47
EIDER RoBERTa 64.27 62.34 64.79 62.85
KIRE - 52.65 50.46 51.98 49.69
RESIDE GNN 51.59 49.64 50.71 48.62
RECON GNN 52.89 50.78 52.27 49.97
KB-Graph - 52.81 50.69 52.19 49.88
DocRE-CLiP BERT 68.13±0.15 66.43±0.17 68.51 66.31
Dataset-DWIE
DRN*GloV e BERT - - 56.04 54.22
RESIDE GNN 65.11 55.74 66.78 57.64
RECON GNN 65.48 56.12 66.94 58.02
KB-Graph - 65.39 56.03 66.89 57.94
DocRE-CLiP BERT 66.12±0.12 57.11±0.16 67.10±0.11 58.87±0.17
Dataset-ReDocRED
ATLOP BERT - - 77.56 76.82
DRN* BERT - - 75.6 74.3
KD-DocRE BERT - - 81.04 80.32
DocuNet RoBERTa - - 79.46 78.52
DREEAM RoBERTa - - 81.44 80.39
DocRE-CLiP BERT - - 81.55±0.14 80.57±0.22

Table 2: Results on DocRED, ReDocRED, and DWIE


datasets, including the baseline models. The precision col-
umn is blank (-) for baselines that do not report it. * denotes
results obtained after modifying their code as the dataset ne-
cessitates. The mean and standard deviation of F1 and IgnF1
on the dev set are reported for three training runs. We report
the official test score for DocRED on the best checkpoint on
the dev set.

Figure 4: Performance of DocRE-CLiP across various con-


CLiP outperforms baseline models, including DREEAM, texts using the DocRED, ReDocRED, and DWIE datasets
KD-DocRE, and DocuNet. Furthermore, in the case of
DWIE dataset, our model outperforms all the baseline
models, such as DRN, GAIN, and ATLOP, KIRE. No-
formance. Furthermore, we scrutinized the effect of con-
tably, the ReDocRED dataset exhibits only slight improve-
text path details on DocRE-CLiP by considering only those
ment over the recent state-of-the-art DREEAM. This could
paths that notably enhanced performance. Our analysis es-
be attributed to ReDocRED being an enhanced version
tablishes that the incorporation of context enhances perfor-
of DocRED, which already tackled the issue of dataset
mance across all datasets. Thus, we effectively address our
incompleteness. On the other hand, both DocRED and
second research question, (RQ2). This study’s findings lead
DWIE demonstrate significant improvements by incorporat-
us to conclude that DocRE-CLiP benefits the most from the
ing REBEL triplets, setting them apart from ReDocRED in
context path compared to the other contexts.
this regard. The unique aspect of DocRE-CLiP lies in its in-
corporation of contextual paths between entities in addition Effectiveness of link prediction model with context.
to entity context, contributing to its superior performance Our focus has been on investigating link prediction mod-
compared to context-aware approaches like KIRE, KB-both, els utilizing individual triples from the dataset. Throughout
and RESIDE, which leverage knowledge bases. This ad- our analysis, we evaluate the performance of different link
vantage is rooted in the reasoning framework that underlies prediction models, specifically DistMult (Yang et al., 2015),
DocRE-CLiP. Complex (Trouillon et al., 2016), R-GCN (Schlichtkrull et
al., 2017), and KGE-HAKE (Zhang et al., 2022). Subse-
Ablation Study quently, we explored the influence of context on their perfor-
mance. Notably, in each instance, upon the incorporation of
Effectiveness of different contexts on DocRE-CLiP. context, the performance of the link prediction models im-
Figure 4 offers an overview of our findings into DocRE- proves. Table 4 summarizes the results. Considering these
CLiP’s performance under varying context conditions. Ini- findings, RGCN has exhibited superior performance across
tially, we gauged its performance without context and doc- metrics such as hits@1, hits@2, hits@10, and MRR. As a
umented the outcomes. Subsequently, we introduced docu- result, we have opted to select RGCN for link prediction.
ment triplets along with the dataset labels and determined
the corresponding F1 scores. Additionally, we measured the Study of the path for an explanation on DocRE-CLiP.
impact of entity context and documented the ensuing per- Since explanation is an important part of our model, we an-
Case 1:
Sentence 0: The Château de Pirou is a castle in the commune of Pirou in the département of Manche (Normandy), France.
Correct answer: contains administrative territorial entity
Baseline: country
DocRE-CLiP: contains administrative territorial entity

Case 2:
Sentence 0: The Wigram Baronetcy of Walthamstow House in the County of Essex is a title in the Baronetage of the United Kingdom. Sentence 3: The second Baronet also represented Wexford Borough in Parliament. Sentence 5: The fourth Baronet was a Lieutenant-General in the army and sat as a Conservative Member of Parliament for South Hampshire and Fareham.
Correct answer: legislative body
Baseline: has part
DocRE-CLiP: legislative body

Case 3:
Sentence 1: Taking a more electronic music sound than his previous releases, TY.O was released in December 2011 by Universal Island Records, but for reasons unknown to Cruz its British and American release were held off. Sentence 3: TY.O features a range of top-twenty and top-thirty singles, including "Hangover" (featuring Flo Rida), "Troublemaker", "There She Goes" (sometimes featuring Pitbull), the limited release "World in Our Hands", and "Fast Car", which features on the Special Edition and Fast Hits versions of the album.
Correct answer: Inception
Baseline: publication date
DocRE-CLiP: publication date

Table 3: Case study with DocRE-CLiP predictions. Underlined text represents entities in the sentence, and purple represents the DocRE-CLiP prediction.

Model                Metric    DocRED   ReDocRED   DWIE
DistMult             Hits@1    0.092    0.061      0.293
                     Hits@3    0.104    0.088      0.307
                     Hits@10   0.127    0.111      0.334
                     MRR       0.105    0.080      0.307
DistMult + context   Hits@1    0.10     0.071      0.297
                     Hits@3    0.113    0.093      0.324
                     Hits@10   0.112    0.12       0.337
                     MRR       0.11     0.12       0.43
ComplEx              Hits@1    0.092    0.076      0.286
                     Hits@3    0.097    0.096      0.296
                     Hits@10   0.110    0.104      0.317
                     MRR       0.099    0.087      0.297
ComplEx + context    Hits@1    0.101    0.09       0.31
                     Hits@3    0.12     0.10       0.33
                     Hits@10   0.15     0.13       0.36
                     MRR       0.1      0.98       0.30
R-GCN                Hits@1    0.06     0.033      0.38
                     Hits@3    0.09     0.06       0.43
                     Hits@10   0.11     0.091      0.45
                     MRR       0.0827   0.0532     0.40
R-GCN + context      Hits@1    0.11     0.51       0.67
                     Hits@3    0.14     0.61       0.91
                     Hits@10   0.23     0.13       0.98
                     MRR       1.34     1.13       1.32
KGE-HAKE             Hits@1    0.07     0.05       0.44
                     Hits@3    0.103    0.11       0.45
                     Hits@10   0.123    0.112      0.47
                     MRR       0.09     0.13       0.45
KGE-HAKE + context   Hits@1    0.10     0.08       0.48
                     Hits@3    0.156    0.14       0.50
                     Hits@10   0.18     0.144      0.51
                     MRR       0.136    0.153      0.52

Table 4: Performance of link prediction models.
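As a concrete illustration of the scoring functions compared in Table 4, the sketch below implements DistMult (Yang et al. 2015), the simplest model in the comparison: a triple (h, r, t) is scored by the trilinear product of the head embedding, a diagonal relation embedding, and the tail embedding. The toy entities, the dimensionality, and the random (untrained) embeddings are assumptions for demonstration only.

```python
import numpy as np

# Illustrative DistMult scorer: score(h, r, t) = sum_i E[h][i] * W[r][i] * E[t][i].
# Entities, relation names, and embeddings below are toy assumptions.

rng = np.random.default_rng(0)
dim = 8
entities = {"France": 0, "Normandy": 1, "Manche": 2}
relations = {"contains administrative territorial entity": 0}

E = rng.normal(size=(len(entities), dim))   # entity embeddings
W = rng.normal(size=(len(relations), dim))  # relation embeddings (diagonal)

def score(h, r, t):
    """DistMult score of the triple (h, r, t)."""
    return float(np.sum(E[entities[h]] * W[relations[r]] * E[entities[t]]))

def rank_tails(h, r):
    """Order all entities as candidate tails for the query (h, r, ?x)."""
    return sorted(entities, key=lambda t: score(h, r, t), reverse=True)

rel = "contains administrative territorial entity"
print(rank_tails("France", rel))
```

Because the score is an elementwise product, DistMult is symmetric in head and tail, which is one reason relation-direction-aware models such as R-GCN can fare better on asymmetric relations.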
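Multi-hop explanation paths of the kind shown in Table 5 can, in principle, be recovered by a breadth-first search over the triples associated with a document. The sketch below is a simplified stand-in for DocRE-CLiP's traversal; the triple list and the search routine are illustrative assumptions, not the system's actual code.

```python
from collections import deque

# Toy triple store holding context facts like those in Table 5.
triples = [
    ("Quincy", "country", "American"),
    ("American", "country of citizenship", "America"),
    ("America", "synonym", "United States"),
    ("Piraeus", "located in the administrative territorial entity", "Kiato"),
    ("Kiato", "country", "Greece"),
]

def find_path(source, target, max_hops=3):
    """Return a list of (head, relation, tail) hops linking source to target,
    or None if no path exists within max_hops."""
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        if len(path) >= max_hops:
            continue
        for h, r, t in triples:
            if h == node and t not in visited:
                visited.add(t)
                queue.append((t, path + [(h, r, t)]))
    return None

print(find_path("Quincy", "United States"))  # three-hop path via American, America
```

Each returned hop is itself a triple, so the path doubles as the human-readable explanation reported alongside the prediction.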


Study of the path for an explanation on DocRE-CLiP. Since explanation is an important part of our model, we analyze the performance of the traversal path. Table 5 provides a few examples of explanations with respect to the number of hops for document-level relation extraction. This answers RQ3.

Query: (IBM research Brazil, parent organization, ?x)
Answer: IBM research
Explanation: {IBM research Brazil, part of, IBM research}

Query: (Piraeus, country, ?x)
Answer: Greece
Explanation: {Piraeus, located in the administrative territorial entity, Kiato} {Kiato, country, Greece}

Query: (Quincy, country, ?x)
Answer: United States
Explanation: {Quincy, country, American} {American, country of citizenship, America} {America, synonym, United States}

Table 5: Example queries and results on the DocRED dataset.

Case Study
We discuss two successful cases and one failed case of DocRE-CLiP and compare them with the baseline model, DRN (Table 3). Case 1: To identify the relation between Manche and France in sentence 0, we use external knowledge about France and Manche. DocRE-CLiP retrieves the connecting context path {France, contains administrative territorial entity, Normandy} {Normandy, contains administrative territorial entity, Manche}. Following this context path directly leads to the relation contains administrative territorial entity between France and Manche. Case 2: With the aid of the context path between entities, {United Kingdom, legislative body, Parliament of the United Kingdom} {Parliament of the United Kingdom, instance of, Parliament}, DocRE-CLiP successfully identifies the correct relation legislative body. Case 3: Using pattern recognition, DocRE-CLiP identifies publication date as the relation. However, the inclusion of the entity context, such as "Troublemaker", "description", and "song", does not contribute significantly to the accuracy of the prediction. Consequently, DocRE-CLiP fails to predict the correct relation (inception).

Conclusion
This paper introduces DocRE-CLiP, a context-driven approach for document-level relation extraction (DocRE). Our results suggest that integrating diverse context types into the link prediction module enriches relation prediction within the DocRE framework while providing interpretability. As future work, researchers can extend our efforts by crafting a versatile model capable of traversing diverse document types, thereby significantly amplifying its aptitude for assimilating knowledge. Furthermore, with the conclusive evidence provided in our work as a first step, document RE and KG link prediction research findings will benefit each other. The code and the models are available at https://github.com/kracr/document-level-relation-extraction.

Acknowledgements
We express our sincere gratitude to the Infosys Centre for Artificial Intelligence (CAI) at IIIT-Delhi for their support. RK's effort has been supported by the U.S. National Library of Medicine (through grant R01LM013240).
References

Baldini Soares, L.; FitzGerald, N.; Ling, J.; and Kwiatkowski, T. 2019. Matching the Blanks: Distributional Similarity for Relation Learning. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2895–2905. Florence, Italy: Association for Computational Linguistics.

Bastos, A.; Nadgeri, A.; Singh, K.; Mulang', I. O.; Shekarpour, S.; Hoffart, J.; and Kaul, M. 2021. RECON: Relation Extraction using Knowledge Graph Context in a Graph Neural Network. In Leskovec, J.; Grobelnik, M.; Najork, M.; Tang, J.; and Zia, L., eds., WWW '21: The Web Conference 2021, Virtual Event / Ljubljana, Slovenia, April 19-23, 2021, 1673–1685. ACM / IW3C2.

Eisenbach, M.; Lübberstedt, J.; Aganian, D.; and Gross, H. 2023. A Little Bit Attention Is All You Need for Person Re-Identification. In IEEE International Conference on Robotics and Automation, ICRA 2023, London, UK, May 29 - June 2, 2023, 7598–7605. IEEE.

Fernàndez-Cañellas, D.; Marco Rimmek, J.; Espadaler, J.; Garolera, B.; Barja, A.; Codina, M.; Sastre, M.; Giro-i Nieto, X.; Riveiro, J. C.; and Bou-Balust, E. 2020. Enhancing Online Knowledge Graph Population with Semantic Knowledge. In Pan, J. Z.; Tamma, V.; d'Amato, C.; Janowicz, K.; Fu, B.; Polleres, A.; Seneviratne, O.; and Kagal, L., eds., The Semantic Web – ISWC 2020, 183–200. Cham: Springer International Publishing. ISBN 978-3-030-62419-4.

Huguet Cabot, P.-L.; and Navigli, R. 2021. REBEL: Relation Extraction By End-to-end Language generation. In Findings of the Association for Computational Linguistics: EMNLP 2021, 2370–2381. Punta Cana, Dominican Republic: Association for Computational Linguistics.

Jain, M.; Singh, K.; and Mutharaju, R. 2023. ReOnto: A Neuro-Symbolic Approach for Biomedical Relation Extraction. In Machine Learning and Knowledge Discovery in Databases: Research Track: European Conference, ECML PKDD 2023, Turin, Italy, September 18–22, 2023, Proceedings, Part IV, 230–247. Berlin, Heidelberg: Springer-Verlag. ISBN 978-3-031-43420-4.

Kingma, D. P.; and Ba, J. 2015. Adam: A Method for Stochastic Optimization. In Bengio, Y.; and LeCun, Y., eds., 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.

Liben-Nowell, D.; and Kleinberg, J. 2003. The Link Prediction Problem for Social Networks. In Proceedings of the Twelfth International Conference on Information and Knowledge Management, CIKM '03, 556–559. New York, NY, USA: Association for Computing Machinery. ISBN 1581137230.

Ma, Y.; Wang, A.; and Okazaki, N. 2023. DREEAM: Guiding Attention with Evidence for Improving Document-Level Relation Extraction. In Vlachos, A.; and Augenstein, I., eds., Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023, Dubrovnik, Croatia, May 2-6, 2023, 1963–1975. Association for Computational Linguistics.

Miller, G. A. 1995. WordNet: A Lexical Database for English. Commun. ACM, 38(11): 39–41.

Nan, G.; Guo, Z.; Sekulic, I.; and Lu, W. 2020. Reasoning with Latent Structure Refinement for Document-Level Relation Extraction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 1546–1557. Online: Association for Computational Linguistics.

Rossi, A.; Firmani, D.; Merialdo, P.; and Teofili, T. 2022. Explaining Link Prediction Systems Based on Knowledge Graph Embeddings. In Proceedings of the 2022 International Conference on Management of Data, SIGMOD '22, 2062–2075. New York, NY, USA: Association for Computing Machinery. ISBN 9781450392495.

Schlichtkrull, M. S.; Kipf, T. N.; Bloem, P.; van den Berg, R.; Titov, I.; and Welling, M. 2018. Modeling Relational Data with Graph Convolutional Networks. In Gangemi, A.; Navigli, R.; Vidal, M.; Hitzler, P.; Troncy, R.; Hollink, L.; Tordai, A.; and Alam, M., eds., The Semantic Web - 15th International Conference, ESWC 2018, Heraklion, Crete, Greece, June 3-7, 2018, Proceedings, volume 10843 of Lecture Notes in Computer Science, 593–607. Springer.

Shimorina, A.; Heinecke, J.; and Herledan, F. 2022. Knowledge Extraction From Texts Based on Wikidata. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track, 297–304. Hybrid: Seattle, Washington + Online: Association for Computational Linguistics.

Tan, Q.; He, R.; Bing, L.; and Ng, H. T. 2022a. Document-Level Relation Extraction with Adaptive Focal Loss and Knowledge Distillation. In Findings of the Association for Computational Linguistics: ACL 2022, 1672–1681. Dublin, Ireland: Association for Computational Linguistics.

Tan, Q.; Xu, L.; Bing, L.; Ng, H. T.; and Aljunied, S. M. 2022b. Revisiting DocRED - Addressing the False Negative Problem in Relation Extraction. In Goldberg, Y.; Kozareva, Z.; and Zhang, Y., eds., Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi, United Arab Emirates, December 7-11, 2022, 8472–8487. Association for Computational Linguistics.

Vashishth, S.; Joshi, R.; Prayaga, S. S.; Bhattacharyya, C.; and Talukdar, P. 2018. RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 1257–1266. Brussels, Belgium: Association for Computational Linguistics.

Verlinden, S.; Zaporojets, K.; Deleu, J.; Demeester, T.; and Develder, C. 2021. Injecting Knowledge Base Information into End-to-End Joint Entity and Relation Extraction and Coreference Resolution. In Zong, C.; Xia, F.; Li, W.; and Navigli, R., eds., Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, Online Event, August 1-6, 2021, volume ACL/IJCNLP 2021 of Findings of ACL, 1952–1957. Association for Computational Linguistics.

Vrandečić, D.; and Krötzsch, M. 2014. Wikidata: A Free Collaborative Knowledgebase. Commun. ACM, 57(10): 78–85.

Wang, X.; Wang, Z.; Sun, W.; and Hu, W. 2022. Enhancing Document-Level Relation Extraction by Entity Knowledge Injection. In Sattler, U.; Hogan, A.; Keet, C. M.; Presutti, V.; Almeida, J. P. A.; Takeda, H.; Monnin, P.; Pirrò, G.; and d'Amato, C., eds., The Semantic Web - ISWC 2022 - 21st International Semantic Web Conference, Virtual Event, October 23-27, 2022, Proceedings, volume 13489 of Lecture Notes in Computer Science, 39–56. Springer.

Xiao, Y.; Zhang, Z.; Mao, Y.; Yang, C.; and Han, J. 2022. SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2395–2409. Seattle, United States: Association for Computational Linguistics.

Xie, Y.; Shen, J.; Li, S.; Mao, Y.; and Han, J. 2022. Eider: Empowering Document-level Relation Extraction with Efficient Evidence Extraction and Inference-stage Fusion. In Findings of the Association for Computational Linguistics: ACL 2022, 257–268. Dublin, Ireland: Association for Computational Linguistics.

Xu, W.; Chen, K.; and Zhao, T. 2021a. Discriminative Reasoning for Document-level Relation Extraction. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 1653–1663. Online: Association for Computational Linguistics.

Xu, W.; Chen, K.; and Zhao, T. 2021b. Document-Level Relation Extraction with Reconstruction. In Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI 2021, Virtual Event, February 2-9, 2021, 14167–14175. AAAI Press.

Xu, W.; Chen, K.; and Zhao, T. 2023. Document-Level Relation Extraction with Path Reasoning. ACM Trans. Asian Low-Resour. Lang. Inf. Process., 22(4).

Yang, B.; Yih, W.; He, X.; Gao, J.; and Deng, L. 2015. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. In Bengio, Y.; and LeCun, Y., eds., 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.

Yao, Y.; Ye, D.; Li, P.; Han, X.; Lin, Y.; Liu, Z.; Liu, Z.; Huang, L.; Zhou, J.; and Sun, M. 2019. DocRED: A Large-Scale Document-Level Relation Extraction Dataset. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 764–777. Florence, Italy: Association for Computational Linguistics.

Yu, M.; Yin, W.; Hasan, K. S.; dos Santos, C.; Xiang, B.; and Zhou, B. 2017. Improved Neural Relation Detection for Knowledge Base Question Answering. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 571–581. Vancouver, Canada: Association for Computational Linguistics.

Zaporojets, K.; Deleu, J.; Develder, C.; and Demeester, T. 2021. DWIE: An entity-centric dataset for multi-task document-level information extraction. Inf. Process. Manag., 58(4): 102563.

Zeng, D.; Liu, K.; Chen, Y.; and Zhao, J. 2015. Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 1753–1762. Lisbon, Portugal: Association for Computational Linguistics.

Zeng, S.; Wu, Y.; and Chang, B. 2021. SIRE: Separate Intra- and Inter-sentential Reasoning for Document-level Relation Extraction. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 524–534. Online: Association for Computational Linguistics.

Zeng, S.; Xu, R.; Chang, B.; and Li, L. 2020. Double Graph Based Reasoning for Document-level Relation Extraction. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1630–1640. Online: Association for Computational Linguistics.

Zhang, N.; Chen, X.; Xie, X.; Deng, S.; Tan, C.; Chen, M.; Huang, F.; Si, L.; and Chen, H. 2021. Document-level Relation Extraction as Semantic Segmentation. In Zhou, Z.-H., ed., Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, 3999–4006. International Joint Conferences on Artificial Intelligence Organization. Main Track.

Zhang, Y.; Qi, P.; and Manning, C. D. 2018. Graph Convolution over Pruned Dependency Trees Improves Relation Extraction. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2205–2215. Brussels, Belgium: Association for Computational Linguistics.

Zhang, Z.; Cai, J.; Zhang, Y.; and Wang, J. 2020. Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7-12, 2020, 3065–3072. AAAI Press.

Zhou, W.; Huang, K.; Ma, T.; and Huang, J. 2021. Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling. In Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI 2021, Virtual Event, February 2-9, 2021, 14612–14620. AAAI Press.
