
Building Knowledge Base through Deep Learning Relation Extraction and Wikidata
Pero Subasic, Hongfeng Yin and Xiao Lin
AI Agents Group, DOCOMO Innovations Inc, Palo Alto, CA, USA
{psubasic, hyin, xlin}@docomoinnovations.com

Abstract

Many AI agent tasks require a domain specific knowledge graph (KG) that is compact and complete. We present a methodology to build a domain specific KG by merging output from deep learning-based relation extraction from free text with an existing knowledge database such as Wikidata. We first form a static KG by traversing the knowledge database constrained by domain keywords. A very large high-quality training data set is then generated automatically by matching Common Crawl data with relation keywords extracted from the knowledge database. We describe the training data generation process in detail and subsequent experiments with deep learning approaches to relation extraction. The resulting model is used to generate new triples from a free text corpus and create a dynamic KG. The static and dynamic KGs are then merged into a new KB satisfying the requirements of specific knowledge-oriented AI tasks such as question answering, chatting, or intelligent retrieval. The proposed methodology can be easily transferred to other domains or languages.

Introduction

Knowledge graph (KG) plays an important role in closed domain question-answering (QA) systems. There are many large-scale KGs available (Bollacker et al. 2008; Lehmann et al. 2012; Lenat 1995; Mitchell et al. 2018; Vrandecic and Krotzsch 2014). To answer user queries, a KG should be compact (pertain to a particular topic), or the QA engine may provide wrong answers because the knowledge graph has too many extraneous facts and relations. The knowledge graph should also be complete, holding as many facts as possible about the topic of interest, or the QA engine may be unable to answer the user's query. The needs for compactness and completeness are plainly at odds with each other, such that existing KG generation techniques fail to satisfy both objectives properly. Accordingly, there is a need for an improved knowledge graph generation technique that satisfies the conflicting needs for completeness and compactness. We also aim to build a methodology to support easier knowledge base construction in multiple languages and domains.

We thus propose a methodology to build a domain specific KG. Figure 1 depicts the process of domain specific KG generation through deep learning-based relation extraction and a knowledge database. We choose Wikidata as the initial knowledge database. After language filtering, the database is transformed and stored into MongoDB so that a hierarchical traversal starting from a set of seed keywords can be performed efficiently. This set of seed keywords can be given for a specific application, so the approach can be applied to an arbitrary domain; it is also possible to extract the keywords automatically from some given text corpora. The resulting subject-relation-object triples from this step are used to form a so-called static KG, and are also used to match sentences from Common Crawl free text to create a large dataset to train our relation extraction model. The trained model is then applied to infer new triples from free text corpora, which form a dynamic KG to satisfy the requirement of completeness. The static and dynamic KGs are then aggregated into a new KG that can be exported into various formats such as RDF or property graph, and used by a domain specific knowledge-based AI agent.

The paper first reviews related work on knowledge graph generation and relation extraction. It then describes our label dataset preparation, relation extraction model and KG generation in detail, followed by results of experiments benchmarking relation extraction models and an application of the proposed approach to the soccer domain.

Copyright held by the author(s). In A. Martin, K. Hinkelmann, A. Gerber, D. Lenat, F. van Harmelen, P. Clark (Eds.), Proceedings of the AAAI 2019 Spring Symposium on Combining Machine Learning with Knowledge Engineering (AAAI-MAKE 2019). Stanford University, Palo Alto, California, USA, March 25-27, 2019.
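The seed-keyword traversal described above can be sketched as follows. This is a minimal in-memory sketch, not the authors' implementation: the paper stores the Wikidata dump in MongoDB and traverses both child and parent links, while here a small made-up triple list stands in for the dump, and the hop limit is an illustrative assumption.

```python
from collections import deque

def build_static_kg(triples, seed_items, max_hops=2):
    """Collect the subgraph reachable from the seed items within max_hops,
    following links in both directions (descendants and ancestors), in the
    spirit of the hierarchical traversal described above."""
    # Index (subject, relation, object) triples in both directions.
    by_subject, by_object = {}, {}
    for s, r, o in triples:
        by_subject.setdefault(s, []).append((s, r, o))
        by_object.setdefault(o, []).append((s, r, o))

    kept, seen = set(), set(seed_items)
    queue = deque((item, 0) for item in seed_items)
    while queue:
        item, hops = queue.popleft()
        if hops == max_hops:
            continue
        # Child links (item as subject) and parent links (item as object).
        for s, r, o in by_subject.get(item, []) + by_object.get(item, []):
            kept.add((s, r, o))
            for nxt in (s, o):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, hops + 1))
    return kept

# Toy graph with illustrative, made-up items; only triples connected to the
# seed within the hop limit survive, so the unrelated triple is filtered out.
triples = [
    ("Lionel_Messi", "member_of_sports_team", "FC_Barcelona"),
    ("FC_Barcelona", "league", "La_Liga"),
    ("La_Liga", "country", "Spain"),
    ("Mona_Lisa", "creator", "Leonardo_da_Vinci"),  # off-topic triple
]
kg = build_static_kg(triples, seed_items={"Lionel_Messi"}, max_hops=2)
```

Because the traversal is constrained by the seed keywords, triples unreachable from the domain seeds never enter the static KG, which is how the approach obtains compactness.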
Figure 1. Flow Diagram for Construction of Domain Specific Knowledge Graph.

Related Work

A knowledge graph can be constructed in a collaborative way by collecting entities and links (Clark et al. 2014), or by automatic natural language processing that obtains subject-relation-object triples, for example through transformation of embedding representations (Lin et al. 2015; Socher et al. 2012; Wang et al. 2014), deep neural network extraction approaches (Santos, Xiang and Zhou 2015; Zeng 2014; Zeng et al. 2015; Zhang and Wang 2015; Zhou et al. 2016) and inference over graph paths (Guu, Miller and Liang 2015). Researchers in recent years have also proposed end-to-end systems (Kertkeidkachorn and Ichise 2017; Shang et al. 2019) and deep reinforcement learning methods (Feng et al. 2018; Yang, Yang and Cohen 2017) to get better results.

As one of the major approaches to expanding a KG, relation extraction (RE) aims to extract relational facts between entities from plain text. The supervised learning approach is effective, but the preparation of high-quality labeled data is a major bottleneck in practice. One technique to avoid this difficulty is distant supervision (Mintz et al. 2009), which assumes that if two entities have a relationship in a known knowledge base, then all sentences that mention these two entities express that relationship in some way; all sentences containing the two entities are selected as training instances. Distant supervision is an effective method of automatically labeling training data. However, it has a major shortcoming: the assumption is too strong and causes the wrong-label problem. A sentence that mentions two entities does not necessarily express their relation in the knowledge base; the two entities may simply appear in a sentence without the specific relation. The noisy training data fundamentally limit the performance of any trained model (Luo et al. 2017). Most RE research focuses on incremental improvements over the noisy training data, and these RE results fall short of the requirements of practical applications. The biggest challenge of RE is to automatically generate massive high-quality training data. We solve this problem by matching Common Crawl data with a structured knowledge base like Wikidata.

Our approach is thus unique in that it utilizes a structured database to form a static KG through hierarchical traversal of links connected with domain keywords, for compactness. This KG is used to generate triples to train a sequence tagging relation extraction model that infers new triples from a free text corpus and generates a dynamic KG, for completeness. The major contribution of our study is that we generate a large dataset for relation extraction model training. Furthermore, the approach is easily transferable to other domains and languages as long as text data is available. Specifically, to transfer to a new domain, we need a new set of keywords or documents representing the domain. To transfer to a new language, we need entity extractors, a static knowledge graph in that language (Wikidata satisfies this requirement), and a large text corpus in the target language (Common Crawl satisfies this requirement, but other sources can be used).

Relation Extraction

Label Data Generation

The datasets used in distant supervision are usually developed by aligning a structured knowledge base like Freebase with free text like Wikipedia or news. One example is Riedel, Yao, and McCallum (2010), who match Freebase relations with the New York Times (NYT) corpus. Usually, two entities with a relation in a sentence are associated with a keyword in the sentence that represents the relation in the knowledge base. Therefore, matching two entities and a keyword in a sentence is required to generate a positive relation. This largely reduces noise in generating positive samples; however, the total number of positive samples is also largely reduced. The problem can be solved by using very large free text corpora: the billions of web pages available in Common Crawl web data.

The Common Crawl corpus contains petabytes of data collected over 8 years of web crawling: raw web page data, metadata extracts and text extracts. We use one year of Common Crawl text data. After language filtering, cleaning and deduplication there are about 6 billion English web pages. The training data generation is shown in Figure 2; an in-house entity extraction system is used to label entities in Common Crawl text.

A Wikidata relation category has an id (P-number), a relation name and several mapped relation keywords, for example:
• P-number: P19
• Name: place of birth
• Mapped relation keywords: birth city, birth location, birth place, birthplace, born at, born in, location born, location of birth, POB

The Wikidata dump used in our task consists of:
• 48,756,678 triples
• 783 relation categories
• 2,384 relation keywords

Figure 2. Flow Chart of Training Data Generation

First, Wikidata relation category triples are mapped to Wikidata relation keyword triples. Then, Wikidata keyword triples are matched with Common Crawl entity-labeled sentences. This yields:
• 386 million matched sentences
• 65 million unique sentences
• 688 relation keywords with more than 1,000 matched sentences
• Example:
  o Wikidata keyword triple: [[Martín_Sastre]] born in [[Montevideo]]
  o Matched Common Crawl sentence: [[Martín_Sastre]] was born in [[Montevideo]] in 1976 and lives in [[Madrid]]

Matched unique sentences for the top relation keywords:
• state 4,336,046
• city 4,251,983
• capital 2,797,477
• starring 2,032,749
• borders 1,874,461
• town 1,737,493
• wife 1,730,569
• founder 1,337,416
• is located in 1,136,473
• husband 1,016,505
• actor 1,014,708
• capital of 957,203
• son 954,848
• directed by 890,268
• married 843,009
• born in 796,941
• coach 736,866

Therefore, massive high-quality labeled sentences are generated automatically for training supervised machine learning models. With the labeled sentences, we can build RE models for specific domains, for specific relations or for the open domain.

Relation Extraction Models for Soccer

As a specific domain example, we use the labeled sentences to build RE models for soccer. First, we extract from Wikidata 17,950 soccer entities and 722,528 triples with at least one soccer entity, covering 78 relation categories with 640 relation keywords.

Training data generation:
• Positive sample generation:
  1. Select two entities (e1, e2) and a relation keyword (r_kw with relation category r_cat) in a matched sentence s
  2. If (e1, r_kw, e2) is in the relation keyword triples
  3. Set "e1, e2, r_kw, r_cat, s" as a positive sample
• Negative sample generation:
  1. Select two entities (e1, e2) in a sentence s
  2. One entity must be a soccer entity
  3. Both entities are in the entity list generated from Wikidata relation triples
  4. Set "e1, e2, NONE, NA, s" as a negative sample. Select randomly with some probability to obtain a sufficient number of negative samples.
  5. Remove duplicated samples
• Total generated training data:
  o 2,121,640 samples
  o 335,734 positive relation sentences
  o 1,785,906 negative relation sentences

Building the Models
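Before any model is trained, each matched sentence must be converted into token-level tags for the sequence tagging formulation. A minimal sketch of that labeling step is below; the B-Re/I-Re tag names and the `[[ ... ]]` entity brackets follow the paper's tagging example, but the helper itself is our illustration (the paper's example also carries entity-type markers, omitted here for brevity), not the authors' code.

```python
def iob_tag(tokens, keyword_tokens):
    """Tag the relation-keyword span with B-Re/I-Re; everything else is O.
    Entity bracket tokens ('[[', ']]') and entity words are tagged O: the
    sequence model's job is to spot the relation phrase between entities."""
    tags = ["O"] * len(tokens)
    k = len(keyword_tokens)
    for i in range(len(tokens) - k + 1):
        if tokens[i:i + k] == keyword_tokens:
            tags[i] = "B-Re"                    # beginning of relation phrase
            for j in range(i + 1, i + k):
                tags[j] = "I-Re"                # inside of relation phrase
            break
    return tags

tokens = "[[ John ]] lives in [[ New York ]]".split()
tags = iob_tag(tokens, ["lives", "in"])
# tokens: [[  John  ]]  lives  in    [[  New  York  ]]
# tags:   O   O     O   B-Re   I-Re  O   O    O     O
```

A tagged sentence in this form is what the sequence tagging model consumes at training time; at inference time the predicted B-Re/I-Re span recovers the relation keyword between the two bracketed entities.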
The PCNN model (Zeng et al. 2015), the LSTM with Attention model (Zhou et al. 2016) and the LSTM classification model (Zhang and Wang 2015) are trained with 90% of the data for training and 10% for testing. The sequence tagging model (Lample et al. 2016) is trained with 80% of the data for training, 10% for testing during training and 10% for testing after training.

A positive sentence is tagged as follows:

[[ John ]] Entity  lives  in    [[ New  York ]] Entity
O  O    O  O       B-Re   I-Re  O  O    O    O  O

Figure 3. Flow Chart of Model Training Comparison (compare Figure 3 in Lin et al. 2016)

Figure 4. Performance Comparison: (a) Precision vs Recall; (b) & (c) Precision vs Epoch

Model                F1      Precision  Recall
Sequence Tagging     98.25%  97.61%     98.89%
PCNN                 82.89%  86.00%     80.00%
LSTM Classification  91.28%  90.10%     92.50%
LSTM + Attention     95.40%  94.80%     96.00%

Table 1. Performance Comparison of Different Models

Table 1 shows the performance of each model. Ranked by F1 score: Sequence Tagging > LSTM + Attention > LSTM Classification > PCNN. In comparison with distant supervision datasets, our datasets can train much higher-quality models.

To validate the winning sequence tagging approach, we create a validation data set from Common Crawl outside the training data, using a different time period. We also validated on data from another source, different from Common Crawl, with a similar outcome. The validation results are as follows:
o F1: 93.15%
o Precision: 90.43%
o Recall: 96.02%

Although the models perform well on the training data, we found many false positives when the models are applied to arbitrary free text. This issue can be improved with the new negative sample generation described here.
• Improved negative sample generation: a sentence s should have a keyword among the 640 relation keywords
  o For each pair of entities (e1, e2) in s
    § e1 or e2 is a soccer entity
    § e1 and e2 are in the entity list generated from Wikidata relation triples
    § If s is a matched sentence, and e1 and e2 are not in the relation triple of s
      • Set "e1, e2, NONE, NA, s" as a negative sample with probability 0.5
    § If s is not a matched sentence
      • Set "e1, e2, NONE, NA, s" as a negative sample with probability 0.15 (15 out of 100 samples are selected as negative)
  o Remove duplicated samples
• The new training data with improved negative sample generation:
  o 1,702,924 samples
  o 363,458 positive samples
  o 1,339,466 negative samples
With this training data, the performance of the sequence tagging model on unseen data drops only slightly:
o F1: 92.38%
o Precision: 89.42%
o Recall: 95.53%

Apply the model to Common Crawl data

Figure 5 shows the flowchart of soccer RE. For each sentence in the Common Crawl entity texts, if the sentence contains at least one soccer entity and two entities in the entity list generated from Wikidata relation triples, the sentence is a soccer sentence. Then, duplicated soccer sentences are removed and sentences without the relation keywords are filtered out. The remaining sentences are tagged with IOB tags. Finally, the RE models are applied to the sentences to extract the relations. The results are:
o Total soccer sentences with two labeled entities: 64,085,913
o Total relations extracted: 600,964
o Aggregate unique relations: 147,486

Figure 5. Soccer RE for Common Crawl data

Construction of Knowledge Graph

As illustrated in Figure 1, the goal of the proposed approach is to build a knowledge graph from a static KG built from a knowledge database and a dynamic KG generated by deep learning relation extraction. To form the static knowledge base, a suitable knowledge database (in this example, Wikidata) is language filtered (English, Japanese, and so on) and the resulting knowledge graph is stored in a suitable database platform such as MongoDB. To build the static knowledge graph, the database is searched for seed keywords that act as seed vertices of the resulting knowledge graph. These seed vertices are then expanded by hierarchical traversal. In particular, the hierarchical traversal proceeds by finding all descendant vertices of the seed vertex; the algorithm then recursively iterates across these descendant (child) vertices. In addition, all ancestor vertices that have links to the seed vertex are identified by adding parent Wikidata items and recursively iterating across the parents of the seed vertex. Since the seed keywords are directed to the topic of interest (e.g., soccer), the hierarchical traversal of the resulting knowledge graph also performs domain filtering to the topic of interest. The relation triples from the static knowledge graph may then be extracted and expanded to assist in the labeling of positive and negative sentences from a training corpus to train a deep learning relation extraction model. The deep learning model, applied to free text such as news articles, blogs, and similar up-to-date sources, generates the dynamic knowledge graph. The static and dynamic knowledge graphs are then merged to form a combined knowledge graph. The two approaches ensure that we have slowly-changing (therefore 'static') knowledge as well as fast-changing (therefore 'dynamic') knowledge in the resulting knowledge graph. When merging the static KG and dynamic KG, several subjective rules are enforced: (1) if the relation of a triple in the dynamic KG is not defined in the Wikidata property list, the triple is ignored; (2) if either entity of a triple in the dynamic KG is not defined in the Wikidata item list, a pseudo item is created with a unique Q-number and the triple is added into the knowledge base as a valid link; (3) a relation defined in the static KG has higher precedence – if a relation in the dynamic KG conflicts with a relation in the static KG, the one in the dynamic KG is ignored and the relation in the static KG is kept in the merged knowledge base. The merged KG only exists in the final Neo4j database.

In our experiment, we assumed a single fact knowledge-based question answering system in the soccer domain to demonstrate the proposed approach. To assure that the QA system can answer user queries correctly, we make the KG contain facts represented by triples related with soccer as
closely as possible. At the same time, we included as many soccer related triples as possible. Wikidata is adopted as the structured database to generate the static KG. The relation extraction approach described in Section 2 is used to extract soccer related triples, which are then merged into the dynamic KG. Table 2 lists the top three relations in the new KG. It demonstrates that the triple facts in the aggregated KG are correctly condensed into the specific domain of soccer. P641 (sport, in which the subject participates or belongs to) is not used in sentence labeling because its coverage is too broad. An example of a triple with P641: [Lionel Messi] => [P641 (sport)] => [association football].

P-number  Label                  Occurrence
P641      sport                  404,761
P54       member of sports team  128,632
P1344     participant            30,307

Table 2. Top 3 Relations in Static Soccer KG

The statistics of the static KG and dynamic KG are listed in Table 3. As it shows, the static KG contains only 0.81% of the entities and 0.29% of the links of the original Wikidata. Queries performed on the static KG will thus be significantly more efficient than on the original database, lowering the requirements for computation power and memory usage. This is especially important for AI agent edge devices where hardware resources are limited. At the same time, the links in the domain KG increased by 15.6%, resulting in a large increase of coverage. This number depends on the size of the corpus used to extract relations: a larger corpus yields a larger link increase and more knowledge coverage. For example, in the Wikidata database there are 67 links starting with Q170645 (2018 FIFA World Cup); in the merged KG, this number increases to 472.

                                             Static KG  Merged KG
Number of Entities                           405,639    425,224
Number of Predicates                         676,500    807,718
% of Wikidata Entities                       0.81%      N/A
% of Wikidata Predicates                     0.29%      N/A
Predicates Increased Comparing to Static KG  N/A        15.6%

Table 3. Triple Statistics of Aggregated Knowledge Graph

Since there is no real question-answering system based on the knowledge graphs created in this study, the improvement in question-answering performance from the merged KG over the static or dynamic KG alone cannot be evaluated quantitatively. Neo4j is used in the demonstration to simulate the QA system – instead of a natural language question, a database query is issued to get a response (in a real system this is usually accomplished by an appropriate AIML mapping). Table 4 shows some query examples. As expected, some questions can be answered when the merged KG is used because the corresponding facts are added from the relation extraction results.

Q: who is Louis Giskus?
A: [Louis Giskus] => [chairperson] => [Surinamese Football Association]

Q: how is Antonio Conte related with Juventus F.C.?
A: [head coach]

Q: who is the manager of Manchester City F.C.?
A: [Manchester City F.C.] => [represented by] => [Pep Guardiola]

Table 4. QA Examples using Dynamic KG

In Wikidata, defined items have labels in different languages. By incorporating the corresponding language labels into the Neo4j database, the resulting KB can easily accommodate visualizing or querying in languages other than English. As a demonstration, Figure 6 shows a query of the KB using Japanese.

Figure 6. KB Query and Visualization in Japanese

Summary

This paper presents a methodology to build a knowledge graph for domain specific AI applications where the KG is required to be compact and complete. The KG is constructed by aggregating a static knowledge database such as Wikidata with a dynamic knowledge database formed by subject-relation-object triples extracted from free text corpora through a deep learning relation extraction model. In this study, a large high-quality dataset for training the relation extraction model is developed by matching Common Crawl data with the knowledge database. This dataset was used to train our own sequence tagging based relation extraction model, which achieved state-of-the-art performance. Another important contribution is the multi-language and multi-domain applicability of the approach.

It is inevitable that some wrong "facts" are inferred from test corpora by the relation extraction model. It would be interesting but challenging future work to evaluate the validity of predicted triples and delete these wrong "facts" so that they are not integrated into the knowledge base and become "truth". Inferring new links directly from the knowledge database to further expand the knowledge base could be another interesting topic, as could integrating joint named entity recognition and relation extraction into our flow (Bekoulis et al. 2018).

Acknowledgments

We thank Yinrui Li for conducting the benchmark study of deep learning algorithms for relation extraction and contributing to the data of Figure 4. We also thank the anonymous reviewers for their helpful comments.

References

Bekoulis, G. et al. 2018. Joint Entity Recognition and Relation Extraction as a Multi-head Selection Problem. Expert Systems with Applications, vol 114, 34-45.

Bollacker, K. et al. 2008. Freebase: A Collaboratively Created Graph Database for Structuring Human Knowledge. In Proceedings of SIGMOD'08, 1247-1249, ACM.

Clark, P. et al. 2014. Automatic Construction of Inference-Supporting Knowledge Bases. In Proceedings of the 4th Workshop on Automated Knowledge Base Construction (AKBC 2014).

Feng, J. et al. 2018. Reinforcement Learning for Relation Classification from Noisy Data. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, 5779-5786.

Guu, K., Miller, J. and Liang, P. 2015. Traversing Knowledge Graphs in Vector Space. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 318-327, Lisbon, Portugal.

Kertkeidkachorn, N. and Ichise, R. 2017. T2KG: An End-to-End System for Creating Knowledge Graph from Unstructured Text. In Proceedings of the AAAI-17 Workshop on Knowledge-Based Techniques for Problem Solving and Reasoning, 743-749.

Lample, G. et al. 2016. Neural Architectures for Named Entity Recognition. In Proceedings of NAACL-HLT 2016, 260-270, San Diego, California.

Lehmann, J. et al. 2012. DBpedia – a Large-scale, Multilingual Knowledge Base Extracted from Wikipedia. Semantic Web 1(2012):1-5.

Lenat, D. 1995. CYC: A Large-Scale Investment in Knowledge Infrastructure. Communications of the ACM 38(11):33-38.

Lin, Y. et al. 2015. Learning Entity and Relation Embeddings for Knowledge Graph Completion. In Proceedings of the 29th AAAI Conference on Artificial Intelligence, 2181-2187.

Lin, Y. et al. 2016. Neural Relation Extraction with Selective Attention over Instances. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2124-2133, Berlin, Germany.

Luo, B. et al. 2017. Learning with Noise: Enhance Distantly Supervised Relation Extraction with Dynamic Transition Matrix. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 430-439, Vancouver, Canada.

Mintz, M. et al. 2009. Distant Supervision for Relation Extraction without Labeled Data. In Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP of the AFNLP, 1003-1011, Suntec, Singapore.

Mitchell, T. et al. 2018. Never Ending Learning. Communications of the ACM 61(5):103-115.

Riedel, S., Yao, L. and McCallum, A. 2010. Modeling Relations and Their Mentions without Labeled Text. In Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds.), Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2010), Lecture Notes in Computer Science, vol 6323, Springer, Berlin, Heidelberg.

Santos, C., Xiang, B. and Zhou, B. 2015. Classifying Relations by Ranking with Convolutional Neural Networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, 626-634, Beijing, China.

Shang, C. et al. 2019. End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion. arXiv:1811.04441, accepted for Proceedings of AAAI 2019.

Socher, R. et al. 2012. Semantic Compositionality through Recursive Matrix-Vector Spaces. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 1201-1211, Jeju Island, Korea.

Vrandecic, D. and Krotzsch, M. 2014. Wikidata: A Free Collaborative Knowledgebase. Communications of the ACM 57(10):78-85.

Wang, Z. et al. 2014. Knowledge Graph Embedding by Translating on Hyperplanes. In Proceedings of the 28th AAAI Conference on Artificial Intelligence, 1112-1119.

Xie, Q. et al. 2017. An Interpretable Knowledge Transfer Model for Knowledge Base Completion. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 950-962, Vancouver, Canada.

Yang, F., Yang, Z. and Cohen, W. 2017. Differentiable Learning of Logical Rules for Knowledge Base Reasoning. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.

Zeng, D. 2014. Relation Classification via Convolutional Deep Neural Network. In Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014), 2335-2344, Dublin, Ireland.

Zeng, D., Liu, K., Chen, Y. and Zhao, J. 2015. Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 1753-1762, Lisbon, Portugal.

Zhang, D. and Wang, D. 2015. Relation Classification via Recurrent Neural Network. arXiv:1508.01006.

Zhou, P. et al. 2016. Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 207-212, Berlin, Germany.
