Building Knowledge Base Through Deep Learning Relation Extraction
and Wikidata
Pero Subasic, Hongfeng Yin and Xiao Lin
AI Agents Group, DOCOMO Innovations Inc, Palo Alto, CA, USA
{psubasic, hyin, xlin}@docomoinnovations.com
[Figure: flow chart. Entity texts and relation triples with Wikidata relation categories feed, via a Wikidata category-to-keyword table, into relation triples with relation categories and keywords; entity matching and relation keyword matching produce positive sentences, negative sentence generation produces negative sentences, and together they yield labeled sentences.]

Figure 2. Flow Chart of Training Data Generation

First, Wikidata relation category triples are mapped to Wikidata relation keyword triples. Then, Wikidata keyword triples are matched with Common Crawl entity-labeled sentences. This yields:
• 386 million matched sentences
• 65 million unique sentences
• 688 relation keywords with more than 1,000 matched sentences each
• Example:
  • Wikidata keyword triple:
    o [[Martín_Sastre]] born in [[Montevideo]]
  • Matched Common Crawl sentence:
    o [[Martín_Sastre]] was born in [[Montevideo]] in 1976 and lives in [[Madrid]]

Matched unique sentences for top relation keywords:
• state 4,336,046
• city 4,251,983
• capital 2,797,477
• starring 2,032,749
• borders 1,874,461
• town 1,737,493
• wife 1,730,569
• founder 1,337,416
• is located in 1,136,473
• husband 1,016,505

Relation Extraction Models for Soccer

In a specific domain example, we use labeled sentences to build RE models for soccer. First, we extract from Wikidata 17,950 soccer entities and 722,528 triples with at least one soccer entity, along with 78 relation categories and 640 relation keywords.

Training data generation:
• Positive sample generation:
  1. Select two entities (e1, e2) and a relation keyword r_kw (with relation category r_cat) in a matched sentence s
  2. Check that (e1, r_kw, e2) is in the relation keyword triples
  3. Set "e1, e2, r_kw, r_cat, s" as a positive sample
• Negative sample generation:
  1. Select two entities (e1, e2) in a sentence s
  2. One entity must be a soccer entity
  3. Both entities must be in the entity list generated from Wikidata relation triples
  4. Set "e1, e2, NONE, NA, s" as a negative sample; select randomly with some probability to obtain a sufficient number of negative samples
  5. Remove duplicated samples
• Total generated training data:
  o 2,121,640 samples
  o 335,734 positive relation sentences
  o 1,785,906 negative relation sentences
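The sample-generation steps above can be sketched in Python. This is a minimal illustration of the distant-supervision labeling rule, not the authors' actual pipeline; the function name, data layout, and toy soccer entities below are assumptions for the example.

```python
import itertools
import random

def generate_samples(sentences, keyword_triples, entity_list, soccer_entities,
                     keyword_to_category, neg_prob=0.5, seed=0):
    """Distant-supervision labeling: an ordered entity pair (e1, e2) in a
    sentence with relation keyword r_kw becomes a positive sample when
    (e1, r_kw, e2) is a known Wikidata keyword triple; unmatched pairs that
    include a soccer entity are sampled as negatives with probability
    neg_prob. Each sentence is (entities, keywords, text)."""
    rng = random.Random(seed)
    positives, negatives = [], []
    for entities, keywords, text in sentences:
        for e1, e2 in itertools.permutations(entities, 2):
            matched = False
            for r_kw in keywords:
                if (e1, r_kw, e2) in keyword_triples:
                    positives.append(
                        (e1, e2, r_kw, keyword_to_category[r_kw], text))
                    matched = True
            if (not matched
                    and (e1 in soccer_entities or e2 in soccer_entities)
                    and e1 in entity_list and e2 in entity_list
                    and rng.random() < neg_prob):
                negatives.append((e1, e2, "NONE", "NA", text))
    # Step 5: remove duplicated samples while preserving order.
    return list(dict.fromkeys(positives)), list(dict.fromkeys(negatives))
```

Applied per matched Common Crawl sentence, the same logic yields the positive/negative split reported above; neg_prob controls how many unmatched pairs are kept as negatives.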
Building the Models

[Table: model comparison by F1, Precision, and Recall; values not recovered in this extraction]

                                     Static KG    Merged KG
Number of Entities                   405,639      425,224
Number of Predicates                 676,500      807,718
% of Entity                          N/A          0.81%
% of Wikidata Predicate                           0.29%
% Increased Comparing to Static KG                15.6%

Table 3. Triple Statistics of Aggregated Knowledge Graph

Since there is no real question-answering system based on the knowledge graphs created in this study, the improvement in question-answering performance from the merged KG over the static KG or the dynamic KG alone cannot be evaluated quantitatively. Neo4j is used in the demonstration to simulate a QA system: instead of a natural language question, a database query is issued to get the response (in a real system this is usually accomplished by an appropriate AIML mapping). Table 4 shows some query examples. As expected, some questions can be answered

It is inevitable that some wrong "facts" will be inferred from the test corpora by the relation extraction model. It would be interesting but challenging future work to evaluate the validity of predicted triples and delete these wrong "facts" so that they are not integrated into the knowledge base and become "truth". Inferring new links directly from the knowledge database to further expand the knowledge base could be another interesting topic. Another topic worth pursuing is whether joint named entity recognition and relation extraction could be integrated into our flow (Bekoulis et al. 2018).

Summary

This paper presents a methodology for building a knowledge graph for domain-specific AI applications where the KG is required to be compact and complete. The KG is constructed by aggregating a static knowledge database, such as Wikidata, with a dynamic knowledge database formed from subject-relation-object triples extracted from free-text corpora by a deep learning relation extraction model. In this study, a large, high-quality dataset for training the relation extraction model is developed by matching Common Crawl data with the knowledge database. This dataset was used to train our own sequence-tagging-based relation extraction model, which achieved state-of-the-art performance. Another important contribution is the multi-language and multi-domain applicability of the approach.

Acknowledgments

We thank Yinrui Li for conducting the benchmark study of deep learning algorithms for relation extraction and for contributing to the data of Figure 4. We also thank the anonymous reviewers for their helpful comments.

References

Bekoulis, G. et al. 2018. Joint Entity Recognition and Relation Extraction as a Multi-head Selection Problem. Expert Systems with Applications, vol 114, 34-45.

Bollacker, K. et al. 2008. Freebase: A Collaboratively Created Graph Database for Structuring Human Knowledge. In Proceedings of SIGMOD'08, 1247-1249, ACM.

Clark, P. et al. 2014. Automatic Construction of Inference-Supporting Knowledge Bases. In Proceedings of the 4th Workshop on Automated Knowledge Base Construction (AKBC 2014).

Feng, J. et al. 2018. Reinforcement Learning for Relation Classification from Noisy Data. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, 5779-5786.

Guu, K., Miller, J. and Liang, P. 2015. Traversing Knowledge Graphs in Vector Space. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 318-327, Lisbon, Portugal.

Kertkeidkachorn, N. and Ichise, R. 2017. T2KG: An End-to-End System for Creating Knowledge Graph from Unstructured Text. In Proceedings of the AAAI-17 Workshop on Knowledge-Based Techniques for Problem Solving and Reasoning, 743-749.

Lample, G. et al. 2016. Neural Architectures for Named Entity Recognition. In Proceedings of NAACL-HLT 2016, 260-270, San Diego, California.

Lehmann, J. et al. 2012. DBpedia – a Large-scale, Multilingual Knowledge Base Extracted from Wikipedia. Semantic Web 1(2012):1-5.

Lenat, D. 1995. CYC: A Large-Scale Investment in Knowledge Infrastructure. Communications of the ACM 38(11):33-38.

Lin, Y. et al. 2015. Learning Entity and Relation Embeddings for Knowledge Graph Completion. In Proceedings of the 29th AAAI Conference on Artificial Intelligence, 2181-2187.

Lin, Y. et al. 2016. Neural Relation Extraction with Selective Attention over Instances. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2124-2133, Berlin, Germany.

Luo, B. et al. 2017. Learning with Noise: Enhance Distantly Supervised Relation Extraction with Dynamic Transition Matrix. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 430-439, Vancouver, Canada.

Mintz, M. 2009. Distant Supervision for Relation Extraction without Labeled Data. In Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP of the AFNLP, 1003-1011, Suntec, Singapore.

Mitchell, T. et al. 2018. Never Ending Learning. Communications of the ACM 61(5):103-115.

Riedel, S., Yao, L. and McCallum, A. 2010. Modeling Relations and Their Mentions without Labeled Text. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds.) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2010. Lecture Notes in Computer Science, vol 6323. Springer, Berlin, Heidelberg.

Santos, C., Xiang, B. and Zhou, B. 2015. Classifying Relations by Ranking with Convolutional Neural Networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, 626-634, Beijing, China.

Shang, C. et al. 2019. End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion. arXiv:1811.04441, accepted for Proceedings of AAAI 2019.

Socher, R. et al. 2012. Semantic Compositionality through Recursive Matrix-Vector Spaces. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 1201-1211, Jeju Island, Korea.

Vrandecic, D. and Krötzsch, M. 2014. Wikidata: A Free Collaborative Knowledgebase. Communications of the ACM 57(10):78-85.

Wang, Z. et al. 2014. Knowledge Graph Embedding by Translating on Hyperplanes. In Proceedings of the 28th AAAI Conference on Artificial Intelligence, 1112-1119.

Xie, Q. et al. 2017. An Interpretable Knowledge Transfer Model for Knowledge Base Completion. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 950-962, Vancouver, Canada, ACL.

Yang, F., Yang, Z. and Cohen, W. 2017. Differentiable Learning of Logical Rules for Knowledge Base Reasoning. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.

Zeng, D. 2014. Relation Classification via Convolutional Deep Neural Network. In Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014), 2335-2344, Dublin, Ireland.

Zeng, D., Liu, K., Chen, Y. and Zhao, J. 2015. Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 1753-1762, Lisbon, Portugal, ACL.

Zhang, D. and Wang, D. 2015. Relation Classification via Recurrent Neural Network. arXiv:1508.01006.

Zhou, P. et al. 2016. Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 207-212, Berlin, Germany, ACL.